Search (42 results, page 1 of 3)

  • theme_ss:"Retrievalalgorithmen"
  • type_ss:"a"
  • year_i:[2000 TO 2010}
  1. Losada, D.E.; Barreiro, A.: Embedding term similarity and inverse document frequency into a logical model of information retrieval (2003) 0.05
    0.050061855 = product of:
      0.10012371 = sum of:
        0.10012371 = sum of:
          0.043561947 = weight(_text_:systems in 1422) [ClassicSimilarity], result of:
            0.043561947 = score(doc=1422,freq=2.0), product of:
              0.16037072 = queryWeight, product of:
                3.0731742 = idf(docFreq=5561, maxDocs=44218)
                0.052184064 = queryNorm
              0.2716328 = fieldWeight in 1422, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.0731742 = idf(docFreq=5561, maxDocs=44218)
                0.0625 = fieldNorm(doc=1422)
          0.056561764 = weight(_text_:22 in 1422) [ClassicSimilarity], result of:
            0.056561764 = score(doc=1422,freq=2.0), product of:
              0.1827397 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052184064 = queryNorm
              0.30952093 = fieldWeight in 1422, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=1422)
      0.5 = coord(1/2)
    
    Abstract
    We propose a novel approach to incorporate term similarity and inverse document frequency into a logical model of information retrieval. The ability of the logic to handle expressive representations along with the use of such classical notions are promising characteristics for IR systems. The approach proposed here has been efficiently implemented and experiments against test collections are presented.
    Date
    22. 3.2003 19:27:23
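    The score breakdown above is Lucene "explain" output for ClassicSimilarity, a TF-IDF scheme: each matching term contributes queryWeight × fieldWeight, where queryWeight = idf × queryNorm and fieldWeight = √freq × idf × fieldNorm, and coord(1/2) halves the total because only one of the two query clauses matched. A minimal Python sketch, using the constants printed above rather than the Lucene API, reproduces the document score:

      import math

      def term_score(freq, idf, query_norm, field_norm):
          # ClassicSimilarity: tf(freq) = sqrt(freq); idf is Lucene's
          # classic 1 + ln(maxDocs / (docFreq + 1)), which matches the
          # printed 3.0731742 for docFreq=5561, maxDocs=44218.
          tf = math.sqrt(freq)                  # 1.4142135 for freq=2.0
          query_weight = idf * query_norm       # 0.16037072 ("systems")
          field_weight = tf * idf * field_norm  # 0.2716328
          return query_weight * field_weight    # 0.043561947

      # Constants copied verbatim from the explanation for doc 1422:
      systems = term_score(2.0, 3.0731742, 0.052184064, 0.0625)
      term_22 = term_score(2.0, 3.5018296, 0.052184064, 0.0625)
      print(0.5 * (systems + term_22))  # coord(1/2) -> ~0.050061855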
  2. Furner, J.: ¬A unifying model of document relatedness for hybrid search engines (2003) 0.04
    0.044312872 = product of:
      0.088625744 = sum of:
        0.088625744 = sum of:
          0.04620442 = weight(_text_:systems in 2717) [ClassicSimilarity], result of:
            0.04620442 = score(doc=2717,freq=4.0), product of:
              0.16037072 = queryWeight, product of:
                3.0731742 = idf(docFreq=5561, maxDocs=44218)
                0.052184064 = queryNorm
              0.28811008 = fieldWeight in 2717, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.0731742 = idf(docFreq=5561, maxDocs=44218)
                0.046875 = fieldNorm(doc=2717)
          0.042421322 = weight(_text_:22 in 2717) [ClassicSimilarity], result of:
            0.042421322 = score(doc=2717,freq=2.0), product of:
              0.1827397 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052184064 = queryNorm
              0.23214069 = fieldWeight in 2717, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2717)
      0.5 = coord(1/2)
    
    Abstract
    Previous work on search-engine design has indicated that information-seekers may benefit from being given the opportunity to exploit multiple sources of evidence of document relatedness. Few existing systems, however, give users more than minimal control over the selections that may be made among methods of exploitation. By applying the methods of "document network analysis" (DNA), a unifying, graph-theoretic model of content-, collaboration-, and context-based systems (CCC) may be developed in which the nature of the similarities between types of document relatedness and document ranking are clarified. The usefulness of the approach to system design suggested by this model may be tested by constructing and evaluating a prototype system (UCXtra) that allows searchers to maintain control over the multiple ways in which document collections may be ranked and re-ranked.
    Date
    11. 9.2004 17:32:22
  3. Kanaeva, Z.: Ranking: Google und CiteSeer (2005) 0.04
    0.043804124 = product of:
      0.08760825 = sum of:
        0.08760825 = sum of:
          0.038116705 = weight(_text_:systems in 3276) [ClassicSimilarity], result of:
            0.038116705 = score(doc=3276,freq=2.0), product of:
              0.16037072 = queryWeight, product of:
                3.0731742 = idf(docFreq=5561, maxDocs=44218)
                0.052184064 = queryNorm
              0.23767869 = fieldWeight in 3276, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.0731742 = idf(docFreq=5561, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3276)
          0.049491543 = weight(_text_:22 in 3276) [ClassicSimilarity], result of:
            0.049491543 = score(doc=3276,freq=2.0), product of:
              0.1827397 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052184064 = queryNorm
              0.2708308 = fieldWeight in 3276, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3276)
      0.5 = coord(1/2)
    
    Abstract
    Within classical information retrieval, various methods have been developed for ranking and for searching a homogeneous, unstructured collection of documents. The success of the Google search engine has shown that searching an inhomogeneous but interlinked document collection such as the Internet can be very effective when the links between documents are taken into account. Among the concepts realized by the Google search engine is a method for ranking search results (PageRank), which is briefly explained in this article. The article then turns to the concepts of a system called CiteSeer, which automatically indexes bibliographic references (Autonomous Citation Indexing, ACI). The latter turns a set of unconnected scientific documents into an interlinked collection and so enables the use of ranking methods based on those employed by Google.
    Date
    20. 3.2005 16:23:22
  4. Song, D.; Bruza, P.D.: Towards context sensitive information inference (2003) 0.03
    0.03128866 = product of:
      0.06257732 = sum of:
        0.06257732 = sum of:
          0.027226217 = weight(_text_:systems in 1428) [ClassicSimilarity], result of:
            0.027226217 = score(doc=1428,freq=2.0), product of:
              0.16037072 = queryWeight, product of:
                3.0731742 = idf(docFreq=5561, maxDocs=44218)
                0.052184064 = queryNorm
              0.1697705 = fieldWeight in 1428, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.0731742 = idf(docFreq=5561, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1428)
          0.0353511 = weight(_text_:22 in 1428) [ClassicSimilarity], result of:
            0.0353511 = score(doc=1428,freq=2.0), product of:
              0.1827397 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052184064 = queryNorm
              0.19345059 = fieldWeight in 1428, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1428)
      0.5 = coord(1/2)
    
    Abstract
    Humans can make hasty, but generally robust judgements about what a text fragment is, or is not, about. Such judgements are termed information inference. This article furnishes an account of information inference from a psychologistic stance. By drawing on theories from nonclassical logic and applied cognition, an information inference mechanism is proposed that makes inferences via computations of information flow through an approximation of a conceptual space. Within a conceptual space information is represented geometrically. In this article, geometric representations of words are realized as vectors in a high dimensional semantic space, which is automatically constructed from a text corpus. Two approaches are presented for priming vector representations according to context. The first approach uses a concept combination heuristic to adjust the vector representation of a concept in the light of the representation of another concept. The second approach computes a prototypical concept on the basis of exemplar trace texts and moves it in the dimensional space according to the context. Information inference is evaluated by measuring the effectiveness of query models derived by information flow computations. Results show that information flow contributes significantly to query model effectiveness, particularly with respect to precision. Moreover, retrieval effectiveness compares favorably with two probabilistic query models, and another based on semantic association. More generally, this article can be seen as a contribution towards realizing operational systems that mimic text-based human reasoning.
    Date
    22. 3.2003 19:35:46
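    A loose sketch of the degree-of-inclusion style of computation the abstract alludes to, with invented dimensions and weights standing in for a corpus-built semantic space (the authors' actual combination heuristics and flow measure are more involved):

      def information_flow(source, target):
          # Toy inclusion measure: how strongly the dominant dimensions
          # of `source` are also present in `target`.
          dominant = {d for d, w in source.items()
                      if w >= max(source.values()) / 2}
          overlap = sum(target.get(d, 0.0) for d in dominant)
          return overlap / sum(source[d] for d in dominant)

      # Invented concept vectors over named dimensions:
      penguin = {"bird": 0.9, "swim": 0.8, "fly": 0.1}
      bird = {"bird": 1.0, "fly": 0.9, "wing": 0.7}
      print(information_flow(penguin, bird))  # flow penguin -> bird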
  5. Khoo, C.S.G.; Wan, K.-W.: ¬A simple relevancy-ranking strategy for an interface to Boolean OPACs (2004) 0.03
    0.025849177 = product of:
      0.051698353 = sum of:
        0.051698353 = sum of:
          0.026952581 = weight(_text_:systems in 2509) [ClassicSimilarity], result of:
            0.026952581 = score(doc=2509,freq=4.0), product of:
              0.16037072 = queryWeight, product of:
                3.0731742 = idf(docFreq=5561, maxDocs=44218)
                0.052184064 = queryNorm
              0.16806422 = fieldWeight in 2509, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.0731742 = idf(docFreq=5561, maxDocs=44218)
                0.02734375 = fieldNorm(doc=2509)
          0.024745772 = weight(_text_:22 in 2509) [ClassicSimilarity], result of:
            0.024745772 = score(doc=2509,freq=2.0), product of:
              0.1827397 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052184064 = queryNorm
              0.1354154 = fieldWeight in 2509, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.02734375 = fieldNorm(doc=2509)
      0.5 = coord(1/2)
    
    Content
    "Most Web search engines accept natural language queries, perform some kind of fuzzy matching and produce ranked output, displaying first the documents that are most likely to be relevant. On the other hand, most library online public access catalogs (OPACs) an the Web are still Boolean retrieval systems that perform exact matching, and require users to express their search requests precisely in a Boolean search language and to refine their search statements to improve the search results. It is well-documented that users have difficulty searching Boolean OPACs effectively (e.g. Borgman, 1996; Ensor, 1992; Wallace, 1993). One approach to making OPACs easier to use is to develop a natural language search interface that acts as a middleware between the user's Web browser and the OPAC system. The search interface can accept a natural language query from the user and reformulate it as a series of Boolean search statements that are then submitted to the OPAC. The records retrieved by the OPAC are ranked by the search interface before forwarding them to the user's Web browser. The user, then, does not need to interact directly with the Boolean OPAC but with the natural language search interface or search intermediary. The search interface interacts with the OPAC system an the user's behalf. The advantage of this approach is that no modification to the OPAC or library system is required. Furthermore, the search interface can access multiple OPACs, acting as a meta search engine, and integrate search results from various OPACs before sending them to the user. The search interface needs to incorporate a method for converting the user's natural language query into a series of Boolean search statements, and for ranking the OPAC records retrieved. The purpose of this study was to develop a relevancyranking algorithm for a search interface to Boolean OPAC systems. This is part of an on-going effort to develop a knowledge-based search interface to OPACs called the E-Referencer (Khoo et al., 1998, 1999; Poo et al., 2000). E-Referencer v. 2 that has been implemented applies a repertoire of initial search strategies and reformulation strategies to retrieve records from OPACs using the Z39.50 protocol, and also assists users in mapping query keywords to the Library of Congress subject headings."
    Source
    Electronic library. 22(2004) no.2, S.112-120
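    A minimal sketch of the reformulation step the excerpt above describes, under simple assumptions (naive stopword removal; strictest-first AND statements, then broader ones; the E-Referencer's actual search strategies are more elaborate):

      STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}

      def to_boolean_statements(query):
          # Reformulate a natural language query as a series of Boolean
          # search statements, from strict (all terms) to broad (any term).
          terms = [t for t in query.lower().split() if t not in STOPWORDS]
          statements = [" AND ".join(terms)]
          statements += [" AND ".join(terms[:i] + terms[i + 1:])
                         for i in range(len(terms)) if len(terms) > 2]
          statements.append(" OR ".join(terms))
          return statements

      for s in to_boolean_statements("ranking algorithms for library OPACs"):
          print(s)  # submitted to the OPAC in order until enough records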
  6. Back, J.: ¬An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.02
    0.024745772 = product of:
      0.049491543 = sum of:
        0.049491543 = product of:
          0.09898309 = sum of:
            0.09898309 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
              0.09898309 = score(doc=3445,freq=2.0), product of:
                0.1827397 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.052184064 = queryNorm
                0.5416616 = fieldWeight in 3445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3445)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    25. 8.2005 17:42:22
  7. Figuerola, C.G.; Zazo, A.F.; Berrocal, J.L.A.: ¬La interacción con el usuario en los sistemas de recuperación de información : realimentación por relevancia (2002) 0.02
    0.01633573 = product of:
      0.03267146 = sum of:
        0.03267146 = product of:
          0.06534292 = sum of:
            0.06534292 = weight(_text_:systems in 2875) [ClassicSimilarity], result of:
              0.06534292 = score(doc=2875,freq=2.0), product of:
                0.16037072 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.052184064 = queryNorm
                0.4074492 = fieldWeight in 2875, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.09375 = fieldNorm(doc=2875)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Footnote
    Translation of the title: User interaction in information retrieval systems through relevance feedback
  8. Kekäläinen, J.: Binary and graded relevance in IR evaluations : comparison of the effects on ranking of IR systems (2005) 0.02
    0.01633573 = product of:
      0.03267146 = sum of:
        0.03267146 = product of:
          0.06534292 = sum of:
            0.06534292 = weight(_text_:systems in 1036) [ClassicSimilarity], result of:
              0.06534292 = score(doc=1036,freq=8.0), product of:
                0.16037072 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.052184064 = queryNorm
                0.4074492 = fieldWeight in 1036, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1036)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this study the rankings of IR systems based on binary and graded relevance in TREC 7 and 8 data are compared. Relevance of a sample of TREC results is reassessed using a relevance scale with four levels: non-relevant, marginally relevant, fairly relevant, highly relevant. Twenty-one topics and 90 systems from TREC 7 and 20 topics and 121 systems from TREC 8 form the data. Binary precision, and cumulated gain, discounted cumulated gain and normalised discounted cumulated gain are the measures compared. Different weighting schemes for relevance levels are tested with cumulated gain measures. Kendall's rank correlations are computed to determine to what extent the rankings produced by different measures are similar. Weighting schemes from binary to emphasising highly relevant documents form a continuum, where the measures correlate strongly at the binary end, and less at the heavily weighted end. The results show the different character of the measures.
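    A sketch of the measures being compared, following the cumulated-gain definitions of Järvelin and Kekäläinen; the 0-3 gain values for the four relevance levels are one illustrative weighting scheme, not the paper's full set:

      import math
      from itertools import combinations

      def dcg(gains, b=2):
          # Discounted cumulated gain: ranks past b are discounted
          # by log_b(rank); plain cumulated gain has no discount.
          return sum(g if i < b else g / math.log(i, b)
                     for i, g in enumerate(gains, start=1))

      def ndcg(gains):
          # Normalised against the ideal (descending) gain vector.
          ideal = dcg(sorted(gains, reverse=True))
          return dcg(gains) / ideal if ideal else 0.0

      def kendall_tau(x, y):
          # Rank correlation between two system orderings (ties ignored).
          pairs = list(combinations(range(len(x)), 2))
          c = sum((x[i] - x[j]) * (y[i] - y[j]) > 0 for i, j in pairs)
          d = sum((x[i] - x[j]) * (y[i] - y[j]) < 0 for i, j in pairs)
          return (c - d) / len(pairs)

      # Relevance levels mapped to gains 0..3 for one ranked run:
      print(ndcg([3, 0, 2, 1, 0, 3]))
      print(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]))  # ordering similarity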
  9. MacFarlane, A.; Robertson, S.E.; McCann, J.A.: Parallel computing for passage retrieval (2004) 0.01
    0.014140441 = product of:
      0.028280882 = sum of:
        0.028280882 = product of:
          0.056561764 = sum of:
            0.056561764 = weight(_text_:22 in 5108) [ClassicSimilarity], result of:
              0.056561764 = score(doc=5108,freq=2.0), product of:
                0.1827397 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.052184064 = queryNorm
                0.30952093 = fieldWeight in 5108, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5108)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    20. 1.2007 18:30:22
  10. Losee, R.M.; Church Jr., L.: Are two document clusters better than one? : the cluster performance question for information retrieval (2005) 0.01
    0.013476291 = product of:
      0.026952581 = sum of:
        0.026952581 = product of:
          0.053905163 = sum of:
            0.053905163 = weight(_text_:systems in 3270) [ClassicSimilarity], result of:
              0.053905163 = score(doc=3270,freq=4.0), product of:
                0.16037072 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.052184064 = queryNorm
                0.33612844 = fieldWeight in 3270, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3270)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    When do information retrieval systems using two document clusters provide better retrieval performance than systems using no clustering? We answer this question for one set of assumptions and suggest how this may be studied with other assumptions. The "Cluster Hypothesis" asks an empirical question about the relationships between documents and user-supplied relevance judgments, while the "Cluster Performance Question" proposed here focuses on the when and why of information retrieval or digital library performance for clustered and unclustered text databases. This may be generalized to study the relative performance of m versus n clusters.
  11. Dominich, S.; Skrop, A.: PageRank and interaction information retrieval (2005) 0.01
    0.011551105 = product of:
      0.02310221 = sum of:
        0.02310221 = product of:
          0.04620442 = sum of:
            0.04620442 = weight(_text_:systems in 3268) [ClassicSimilarity], result of:
              0.04620442 = score(doc=3268,freq=4.0), product of:
                0.16037072 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.052184064 = queryNorm
                0.28811008 = fieldWeight in 3268, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3268)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The PageRank method is used by the Google Web search engine to compute the importance of Web pages. Two different views have been developed for the interpretation of the PageRank method and values: (a) stochastic (random surfer): the PageRank values can be conceived as the steady-state distribution of a Markov chain, and (b) algebraic: the PageRank values form the eigenvector corresponding to eigenvalue 1 of the Web link matrix. The Interaction Information Retrieval (I²R) method is a nonclassical information retrieval paradigm, which represents a connectionist approach based on dynamic systems. In the present paper, a different interpretation of PageRank is proposed, namely, a dynamic systems viewpoint, by showing that the PageRank method can be formally interpreted as a particular case of the Interaction Information Retrieval method; and thus, the PageRank values may be interpreted as neutral equilibrium points of the Web.
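    Both interpretations recalled in the abstract can be seen in the standard power-iteration computation: repeatedly applying the damped link matrix converges to its dominant eigenvector, which is also the steady state of the random-surfer Markov chain. A minimal sketch with an illustrative three-page graph and the conventional damping factor 0.85:

      def pagerank(links, d=0.85, iters=100):
          # Power iteration: ranks converge to the steady-state
          # distribution of the random-surfer chain (the eigenvalue-1
          # eigenvector of the damped link matrix).
          n = len(links)
          ranks = {p: 1.0 / n for p in links}
          for _ in range(iters):
              new = {p: (1.0 - d) / n for p in links}
              for page, outlinks in links.items():
                  for target in outlinks:
                      new[target] += d * ranks[page] / len(outlinks)
              ranks = new
          return ranks

      # Toy Web graph, page -> outlinks (illustrative only):
      print(pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]}))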
  12. Lin, J.; Katz, B.: Building a reusable test collection for question answering (2006) 0.01
    0.011551105 = product of:
      0.02310221 = sum of:
        0.02310221 = product of:
          0.04620442 = sum of:
            0.04620442 = weight(_text_:systems in 5045) [ClassicSimilarity], result of:
              0.04620442 = score(doc=5045,freq=4.0), product of:
                0.16037072 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.052184064 = queryNorm
                0.28811008 = fieldWeight in 5045, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5045)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In contrast to traditional information retrieval systems, which return ranked lists of documents that users must manually browse through, a question answering system attempts to directly answer natural language questions posed by the user. Although such systems possess language-processing capabilities, they still rely on traditional document retrieval techniques to generate an initial candidate set of documents. In this article, the authors argue that document retrieval for question answering represents a task different from retrieving documents in response to more general retrospective information needs. Thus, to guide future system development, specialized question answering test collections must be constructed. They show that the current evaluation resources have major shortcomings; to remedy the situation, they have manually created a small, reusable question answering test collection for research purposes. In this article they describe their methodology for building this test collection and discuss issues they encountered regarding the notion of "answer correctness."
  13. Guerrero-Bote, V.P.; Moya Anegón, F. de; Herrero Solana, V.: Document organization using Kohonen's algorithm (2002) 0.01
    0.010890487 = product of:
      0.021780973 = sum of:
        0.021780973 = product of:
          0.043561947 = sum of:
            0.043561947 = weight(_text_:systems in 2564) [ClassicSimilarity], result of:
              0.043561947 = score(doc=2564,freq=2.0), product of:
                0.16037072 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.052184064 = queryNorm
                0.2716328 = fieldWeight in 2564, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2564)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The classification of documents from a bibliographic database is a task that is linked to processes of information retrieval based on partial matching. A method is described of vectorizing reference documents from LISA which permits their topological organization using Kohonen's algorithm. As an example a map is generated of 202 documents from LISA, and an analysis is made of the possibilities of this type of neural network with respect to the development of information retrieval systems based on graphical browsing.
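    A minimal sketch of the Kohonen training loop behind such maps: each presented document vector pulls its best-matching unit and that unit's neighbours toward itself, so nearby map units come to represent similar documents. Grid size, rates, and the random stand-in data are assumptions:

      import numpy as np

      def train_som(docs, grid=(10, 10), iters=1000, lr=0.5, radius=3.0):
          # Kohonen self-organizing map over document vectors.
          rng = np.random.default_rng(0)
          h, w = grid
          weights = rng.random((h, w, docs.shape[1]))
          coords = np.stack(np.mgrid[0:h, 0:w], axis=-1)  # unit positions
          for t in range(iters):
              x = docs[rng.integers(len(docs))]
              # Best-matching unit: closest weight vector to the input.
              bmu = np.unravel_index(
                  np.argmin(((weights - x) ** 2).sum(axis=2)), (h, w))
              decay = np.exp(-t / iters)
              dist2 = ((coords - np.array(bmu)) ** 2).sum(axis=-1)
              pull = np.exp(-dist2 / (2 * (radius * decay) ** 2))
              weights += lr * decay * pull[..., None] * (x - weights)
          return weights

      # e.g. 202 LISA documents vectorized to 50 dims (random stand-ins):
      som = train_som(np.random.default_rng(1).random((202, 50)))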
  14. Aizawa, A.: ¬An information-theoretic perspective of tf-idf measures (2003) 0.01
    0.010890487 = product of:
      0.021780973 = sum of:
        0.021780973 = product of:
          0.043561947 = sum of:
            0.043561947 = weight(_text_:systems in 4155) [ClassicSimilarity], result of:
              0.043561947 = score(doc=4155,freq=2.0), product of:
                0.16037072 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.052184064 = queryNorm
                0.2716328 = fieldWeight in 4155, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4155)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper presents a mathematical definition of the "probability-weighted amount of information" (PWI), a measure of specificity of terms in documents that is based on an information-theoretic view of retrieval events. The proposed PWI is expressed as a product of the occurrence probabilities of terms and their amounts of information, and corresponds well with the conventional term frequency - inverse document frequency measures that are commonly used in today's information retrieval systems. The mathematical definition of the PWI is shown, together with some illustrative examples of the calculation.
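    One hedged reading of the correspondence the abstract states, assuming the occurrence probability is estimated from term frequency and the amount of information from document frequency (Aizawa's own derivation is more careful about the probability estimates):

      \mathrm{PWI}(t,d) \;=\; P(t,d)\, I(t)
                        \;=\; P(t,d)\,\bigl(-\log P(t)\bigr)
        \;\approx\; \underbrace{\mathrm{tf}(t,d)}_{\text{occurrence}}
          \cdot \underbrace{\log \frac{N}{\mathrm{df}(t)}}_{\text{idf}}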
  15. Stock, W.G.: On relevance distributions (2006) 0.01
    0.010890487 = product of:
      0.021780973 = sum of:
        0.021780973 = product of:
          0.043561947 = sum of:
            0.043561947 = weight(_text_:systems in 5116) [ClassicSimilarity], result of:
              0.043561947 = score(doc=5116,freq=2.0), product of:
                0.16037072 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.052184064 = queryNorm
                0.2716328 = fieldWeight in 5116, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5116)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    There are at least three possible ways that documents are distributed by relevance: informetric (power law), inverse logistic, and dichotomous. The nature of the type of distribution has implications for the construction of relevance ranking algorithms for search engines, for automated (blind) relevance feedback, for user behavior when using Web search engines, for combining of outputs of search engines for metasearch, for topic detection and tracking, and for the methodology of evaluation of information retrieval systems.
  16. Vechtomova, O.; Karamuftuoglu, M.: Lexical cohesion and term proximity in document ranking (2008) 0.01
    0.010890487 = product of:
      0.021780973 = sum of:
        0.021780973 = product of:
          0.043561947 = sum of:
            0.043561947 = weight(_text_:systems in 2101) [ClassicSimilarity], result of:
              0.043561947 = score(doc=2101,freq=2.0), product of:
                0.16037072 = queryWeight, product of:
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.052184064 = queryNorm
                0.2716328 = fieldWeight in 2101, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0731742 = idf(docFreq=5561, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2101)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    We demonstrate effective new methods of document ranking based on lexical cohesive relationships between query terms. The proposed methods rely solely on the lexical relationships between original query terms, and do not involve query expansion or relevance feedback. Two types of lexical cohesive relationship information between query terms are used in document ranking: short-distance collocation relationship between query terms, and long-distance relationship, determined by the collocation of query terms with other words. The methods are evaluated on TREC corpora, and show improvements over baseline systems.
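    A crude illustration of the short-distance half of this idea, under stated assumptions: reward documents in which the query terms collocate within a small window (the authors' cohesion measures are considerably richer):

      def collocation_hits(doc_tokens, query_terms, window=10):
          # Count occurrences of the first query term that have every
          # other query term within `window` tokens.
          where = {t: [i for i, tok in enumerate(doc_tokens) if tok == t]
                   for t in query_terms}
          if any(not p for p in where.values()):
              return 0
          return sum(
              all(any(abs(i - j) <= window for j in where[t])
                  for t in query_terms[1:])
              for i in where[query_terms[0]])

      doc = "term proximity affects ranking when query terms are close".split()
      print(collocation_hits(doc, ["query", "terms"], window=5))  # -> 1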
  17. Crestani, F.; Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue (2003) 0.01
    0.010605331 = product of:
      0.021210661 = sum of:
        0.021210661 = product of:
          0.042421322 = sum of:
            0.042421322 = weight(_text_:22 in 1451) [ClassicSimilarity], result of:
              0.042421322 = score(doc=1451,freq=2.0), product of:
                0.1827397 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.052184064 = queryNorm
                0.23214069 = fieldWeight in 1451, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1451)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2003 19:27:36
  18. Fan, W.; Fox, E.A.; Pathak, P.; Wu, H.: ¬The effects of fitness functions on genetic programming-based ranking discovery for Web search (2004) 0.01
    0.010605331 = product of:
      0.021210661 = sum of:
        0.021210661 = product of:
          0.042421322 = sum of:
            0.042421322 = weight(_text_:22 in 2239) [ClassicSimilarity], result of:
              0.042421322 = score(doc=2239,freq=2.0), product of:
                0.1827397 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.052184064 = queryNorm
                0.23214069 = fieldWeight in 2239, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2239)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 5.2004 19:22:06
  19. Witschel, H.F.: Global term weights in distributed environments (2008) 0.01
    0.010605331 = product of:
      0.021210661 = sum of:
        0.021210661 = product of:
          0.042421322 = sum of:
            0.042421322 = weight(_text_:22 in 2096) [ClassicSimilarity], result of:
              0.042421322 = score(doc=2096,freq=2.0), product of:
                0.1827397 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.052184064 = queryNorm
                0.23214069 = fieldWeight in 2096, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2096)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 8.2008 9:44:22
  20. Klas, C.-P.; Fuhr, N.; Schaefer, A.: Evaluating strategic support for information access in the DAFFODIL system (2004) 0.01
    0.010605331 = product of:
      0.021210661 = sum of:
        0.021210661 = product of:
          0.042421322 = sum of:
            0.042421322 = weight(_text_:22 in 2419) [ClassicSimilarity], result of:
              0.042421322 = score(doc=2419,freq=2.0), product of:
                0.1827397 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.052184064 = queryNorm
                0.23214069 = fieldWeight in 2419, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2419)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    16.11.2008 16:22:48

Languages

  • e 40
  • d 1
  • sp 1