Search (354 results, page 1 of 18)

  • × theme_ss:"Retrievalalgorithmen"
  1. Campos, L.M. de; Fernández-Luna, J.M.; Huete, J.F.: Implementing relevance feedback in the Bayesian network retrieval model (2003) 0.07
    0.0672971 = product of:
      0.100945644 = sum of:
        0.00890397 = weight(_text_:a in 825) [ClassicSimilarity], result of:
          0.00890397 = score(doc=825,freq=10.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.1709182 = fieldWeight in 825, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=825)
        0.09204167 = sum of:
          0.055313893 = weight(_text_:de in 825) [ClassicSimilarity], result of:
            0.055313893 = score(doc=825,freq=2.0), product of:
              0.19416152 = queryWeight, product of:
                4.297489 = idf(docFreq=1634, maxDocs=44218)
                0.045180224 = queryNorm
              0.28488597 = fieldWeight in 825, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.297489 = idf(docFreq=1634, maxDocs=44218)
                0.046875 = fieldNorm(doc=825)
          0.03672778 = weight(_text_:22 in 825) [ClassicSimilarity], result of:
            0.03672778 = score(doc=825,freq=2.0), product of:
              0.15821345 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045180224 = queryNorm
              0.23214069 = fieldWeight in 825, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=825)
      0.6666667 = coord(2/3)
    
    Abstract
    Relevance Feedback consists in automatically formulating a new query according to the relevance judgments provided by the user after evaluating a set of retrieved documents. In this article, we introduce several relevance feedback methods for the Bayesian Network Retrieval ModeL The theoretical frame an which our methods are based uses the concept of partial evidences, which summarize the new pieces of information gathered after evaluating the results obtained by the original query. These partial evidences are inserted into the underlying Bayesian network and a new inference process (probabilities propagation) is run to compute the posterior relevance probabilities of the documents in the collection given the new query. The quality of the proposed methods is tested using a preliminary experimentation with different standard document collections.
    Date
    22. 3.2003 19:30:19
    Type
    a
  2. Courtois, M.P.; Berry, M.W.: Results ranking in Web search engines (1999) 0.07
    0.06588431 = product of:
      0.09882645 = sum of:
        0.0066366266 = weight(_text_:a in 3726) [ClassicSimilarity], result of:
          0.0066366266 = score(doc=3726,freq=2.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.12739488 = fieldWeight in 3726, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=3726)
        0.092189826 = product of:
          0.18437965 = sum of:
            0.18437965 = weight(_text_:de in 3726) [ClassicSimilarity], result of:
              0.18437965 = score(doc=3726,freq=8.0), product of:
                0.19416152 = queryWeight, product of:
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.045180224 = queryNorm
                0.94961995 = fieldWeight in 3726, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3726)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Comparaison des méthodes de classement de 5 moteurs de recherche (AltaVista, HotBot, Excie, Infoseek et Lycos). Sont testées la présence de tous les mots, la proximité et la localisation
    Type
    a
  3. Figuerola, C.G.; Zazo, A.F.; Berrocal, J.L.A.: ¬La interaccion con el usuario en los sistemas de cuperacion de informacion realimentacion por relecvancia (2002) 0.06
    0.05745974 = product of:
      0.08618961 = sum of:
        0.007963953 = weight(_text_:a in 2875) [ClassicSimilarity], result of:
          0.007963953 = score(doc=2875,freq=2.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.15287387 = fieldWeight in 2875, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=2875)
        0.07822566 = product of:
          0.15645131 = sum of:
            0.15645131 = weight(_text_:de in 2875) [ClassicSimilarity], result of:
              0.15645131 = score(doc=2875,freq=4.0), product of:
                0.19416152 = queryWeight, product of:
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.045180224 = queryNorm
                0.8057792 = fieldWeight in 2875, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.09375 = fieldNorm(doc=2875)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Type
    a
  4. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.04
    0.039725985 = product of:
      0.059588976 = sum of:
        0.010618603 = weight(_text_:a in 402) [ClassicSimilarity], result of:
          0.010618603 = score(doc=402,freq=2.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.20383182 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
        0.048970375 = product of:
          0.09794075 = sum of:
            0.09794075 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.09794075 = score(doc=402,freq=2.0), product of:
                0.15821345 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045180224 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
    Type
    a
  5. Smeaton, A.F.; Rijsbergen, C.J. van: ¬The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.04
    0.037325956 = product of:
      0.05598893 = sum of:
        0.013139851 = weight(_text_:a in 2134) [ClassicSimilarity], result of:
          0.013139851 = score(doc=2134,freq=4.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.25222903 = fieldWeight in 2134, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.109375 = fieldNorm(doc=2134)
        0.04284908 = product of:
          0.08569816 = sum of:
            0.08569816 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
              0.08569816 = score(doc=2134,freq=2.0), product of:
                0.15821345 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045180224 = queryNorm
                0.5416616 = fieldWeight in 2134, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2134)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Date
    30. 3.2001 13:32:22
    Type
    a
  6. Back, J.: ¬An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.03
    0.034760237 = product of:
      0.052140355 = sum of:
        0.009291277 = weight(_text_:a in 3445) [ClassicSimilarity], result of:
          0.009291277 = score(doc=3445,freq=2.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.17835285 = fieldWeight in 3445, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.109375 = fieldNorm(doc=3445)
        0.04284908 = product of:
          0.08569816 = sum of:
            0.08569816 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
              0.08569816 = score(doc=3445,freq=2.0), product of:
                0.15821345 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045180224 = queryNorm
                0.5416616 = fieldWeight in 3445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3445)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Date
    25. 8.2005 17:42:22
    Type
    a
  7. Al-Hawamdeh, S.; Smith, G.; Willett, P.; Vere, R. de: Using nearest-neighbour searching techniques to access full-text documents (1991) 0.03
    0.032498594 = product of:
      0.04874789 = sum of:
        0.01187196 = weight(_text_:a in 2300) [ClassicSimilarity], result of:
          0.01187196 = score(doc=2300,freq=10.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.22789092 = fieldWeight in 2300, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=2300)
        0.03687593 = product of:
          0.07375186 = sum of:
            0.07375186 = weight(_text_:de in 2300) [ClassicSimilarity], result of:
              0.07375186 = score(doc=2300,freq=2.0), product of:
                0.19416152 = queryWeight, product of:
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.045180224 = queryNorm
                0.37984797 = fieldWeight in 2300, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2300)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Summarises the results to date of a continuing programme of research at Sheffield Univ. to investigate the use of nearest-neighbour retrieval algorithms for full text searching. Given a natural language query statement, the research methods result in a ranking of the paragraphs comprising a full text document in order of decreasing similarity with the query, where the similarity for each paragraph is determined by the number of keyword stems that it has in common with the query
    Type
    a
  8. Guerrero-Bote, V.P.; Moya Anegón, F. de; Herrero Solana, V.: Document organization using Kohonen's algorithm (2002) 0.03
    0.032498594 = product of:
      0.04874789 = sum of:
        0.01187196 = weight(_text_:a in 2564) [ClassicSimilarity], result of:
          0.01187196 = score(doc=2564,freq=10.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.22789092 = fieldWeight in 2564, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=2564)
        0.03687593 = product of:
          0.07375186 = sum of:
            0.07375186 = weight(_text_:de in 2564) [ClassicSimilarity], result of:
              0.07375186 = score(doc=2564,freq=2.0), product of:
                0.19416152 = queryWeight, product of:
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.045180224 = queryNorm
                0.37984797 = fieldWeight in 2564, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2564)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The classification of documents from a bibliographic database is a task that is linked to processes of information retrieval based on partial matching. A method is described of vectorizing reference documents from LISA which permits their topological organization using Kohonen's algorithm. As an example a map is generated of 202 documents from LISA, and an analysis is made of the possibilities of this type of neural network with respect to the development of information retrieval systems based on graphical browsing.
    Type
    a
  9. Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.03
    0.029794488 = product of:
      0.04469173 = sum of:
        0.007963953 = weight(_text_:a in 58) [ClassicSimilarity], result of:
          0.007963953 = score(doc=58,freq=2.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.15287387 = fieldWeight in 58, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=58)
        0.03672778 = product of:
          0.07345556 = sum of:
            0.07345556 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
              0.07345556 = score(doc=58,freq=2.0), product of:
                0.15821345 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045180224 = queryNorm
                0.46428138 = fieldWeight in 58, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=58)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Date
    14. 6.2015 22:12:44
    Type
    a
  10. Fuhr, N.: Rankingexperimente mit gewichteter Indexierung (1986) 0.03
    0.029794488 = product of:
      0.04469173 = sum of:
        0.007963953 = weight(_text_:a in 2051) [ClassicSimilarity], result of:
          0.007963953 = score(doc=2051,freq=2.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.15287387 = fieldWeight in 2051, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=2051)
        0.03672778 = product of:
          0.07345556 = sum of:
            0.07345556 = weight(_text_:22 in 2051) [ClassicSimilarity], result of:
              0.07345556 = score(doc=2051,freq=2.0), product of:
                0.15821345 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045180224 = queryNorm
                0.46428138 = fieldWeight in 2051, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=2051)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Date
    14. 6.2015 22:12:56
    Type
    a
  11. López-Pujalte, C.; Guerrero-Bote, V.P.; Moya-Anegón, F. de: Genetic algorithms in relevance feedback : a second test and new contributions (2003) 0.03
    0.02589091 = product of:
      0.038836364 = sum of:
        0.0065699257 = weight(_text_:a in 1076) [ClassicSimilarity], result of:
          0.0065699257 = score(doc=1076,freq=4.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.12611452 = fieldWeight in 1076, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1076)
        0.032266438 = product of:
          0.064532876 = sum of:
            0.064532876 = weight(_text_:de in 1076) [ClassicSimilarity], result of:
              0.064532876 = score(doc=1076,freq=2.0), product of:
                0.19416152 = queryWeight, product of:
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.045180224 = queryNorm
                0.33236697 = fieldWeight in 1076, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1076)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Type
    a
  12. Sánchez-de-Madariaga, R.; Fernández-del-Castillo, J.R.: ¬The bootstrapping of the Yarowsky algorithm in real corpora (2009) 0.03
    0.02589091 = product of:
      0.038836364 = sum of:
        0.0065699257 = weight(_text_:a in 2451) [ClassicSimilarity], result of:
          0.0065699257 = score(doc=2451,freq=4.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.12611452 = fieldWeight in 2451, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2451)
        0.032266438 = product of:
          0.064532876 = sum of:
            0.064532876 = weight(_text_:de in 2451) [ClassicSimilarity], result of:
              0.064532876 = score(doc=2451,freq=2.0), product of:
                0.19416152 = queryWeight, product of:
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.045180224 = queryNorm
                0.33236697 = fieldWeight in 2451, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2451)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The Yarowsky bootstrapping algorithm resolves the homograph-level word sense disambiguation (WSD) problem, which is the sense granularity level required for real natural language processing (NLP) applications. At the same time it resolves the knowledge acquisition bottleneck problem affecting most WSD algorithms and can be easily applied to foreign language corpora. However, this paper shows that the Yarowsky algorithm is significantly less accurate when applied to domain fluctuating, real corpora. This paper also introduces a new bootstrapping methodology that performs much better when applied to these corpora. The accuracy achieved in non-domain fluctuating corpora is not reached due to inherent domain fluctuation ambiguities.
    Type
    a
  13. Moura, E.S. de; Fernandes, D.; Ribeiro-Neto, B.; Silva, A.S. da; Gonçalves, M.A.: Using structural information to improve search in Web collections (2010) 0.02
    0.024373945 = product of:
      0.036560915 = sum of:
        0.00890397 = weight(_text_:a in 4119) [ClassicSimilarity], result of:
          0.00890397 = score(doc=4119,freq=10.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.1709182 = fieldWeight in 4119, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=4119)
        0.027656946 = product of:
          0.055313893 = sum of:
            0.055313893 = weight(_text_:de in 4119) [ClassicSimilarity], result of:
              0.055313893 = score(doc=4119,freq=2.0), product of:
                0.19416152 = queryWeight, product of:
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.045180224 = queryNorm
                0.28488597 = fieldWeight in 4119, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4119)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    In this work, we investigate the problem of using the block structure of Web pages to improve ranking results. Starting with basic intuitions provided by the concepts of term frequency (TF) and inverse document frequency (IDF), we propose nine block-weight functions to distinguish the impact of term occurrences inside page blocks, instead of inside whole pages. These are then used to compute a modified BM25 ranking function. Using four distinct Web collections, we ran extensive experiments to compare our block-weight ranking formulas with two other baselines: (a) a BM25 ranking applied to full pages, and (b) a BM25 ranking that takes into account best blocks. Our methods suggest that our block-weighting ranking method is superior to all baselines across all collections we used and that average gain in precision figures from 5 to 20% are generated.
    Type
    a
  14. Losada, D.E.; Barreiro, A.: Emebedding term similarity and inverse document frequency into a logical model of information retrieval (2003) 0.02
    0.0242381 = product of:
      0.03635715 = sum of:
        0.01187196 = weight(_text_:a in 1422) [ClassicSimilarity], result of:
          0.01187196 = score(doc=1422,freq=10.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.22789092 = fieldWeight in 1422, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=1422)
        0.024485188 = product of:
          0.048970375 = sum of:
            0.048970375 = weight(_text_:22 in 1422) [ClassicSimilarity], result of:
              0.048970375 = score(doc=1422,freq=2.0), product of:
                0.15821345 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045180224 = queryNorm
                0.30952093 = fieldWeight in 1422, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1422)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    We propose a novel approach to incorporate term similarity and inverse document frequency into a logical model of information retrieval. The ability of the logic to handle expressive representations along with the use of such classical notions are promising characteristics for IR systems. The approach proposed here has been efficiently implemented and experiments against test collections are presented.
    Date
    22. 3.2003 19:27:23
    Type
    a
  15. Faloutsos, C.: Signature files (1992) 0.02
    0.023402527 = product of:
      0.03510379 = sum of:
        0.010618603 = weight(_text_:a in 3499) [ClassicSimilarity], result of:
          0.010618603 = score(doc=3499,freq=8.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.20383182 = fieldWeight in 3499, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
        0.024485188 = product of:
          0.048970375 = sum of:
            0.048970375 = weight(_text_:22 in 3499) [ClassicSimilarity], result of:
              0.048970375 = score(doc=3499,freq=2.0), product of:
                0.15821345 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045180224 = queryNorm
                0.30952093 = fieldWeight in 3499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3499)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Presents a survey and discussion on signature-based text retrieval methods. It describes the main idea behind the signature approach and its advantages over other text retrieval methods, it provides a classification of the signature methods that have appeared in the literature, it describes the main representatives of each class, together with the relative advantages and drawbacks, and it gives a list of applications as well as commercial or university prototypes that use the signature approach
    Date
    7. 5.1999 15:22:48
    Type
    a
  16. Bornmann, L.; Mutz, R.: From P100 to P100' : a new citation-rank approach (2014) 0.02
    0.023402527 = product of:
      0.03510379 = sum of:
        0.010618603 = weight(_text_:a in 1431) [ClassicSimilarity], result of:
          0.010618603 = score(doc=1431,freq=8.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.20383182 = fieldWeight in 1431, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=1431)
        0.024485188 = product of:
          0.048970375 = sum of:
            0.048970375 = weight(_text_:22 in 1431) [ClassicSimilarity], result of:
              0.048970375 = score(doc=1431,freq=2.0), product of:
                0.15821345 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045180224 = queryNorm
                0.30952093 = fieldWeight in 1431, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1431)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Properties of a percentile-based rating scale needed in bibliometrics are formulated. Based on these properties, P100 was recently introduced as a new citation-rank approach (Bornmann, Leydesdorff, & Wang, 2013). In this paper, we conceptualize P100 and propose an improvement which we call P100'. Advantages and disadvantages of citation-rank indicators are noted.
    Date
    22. 8.2014 17:05:18
    Type
    a
  17. He, J.; Meij, E.; Rijke, M. de: Result diversification based on query-specific cluster ranking (2011) 0.02
    0.02302829 = product of:
      0.034542434 = sum of:
        0.011494976 = weight(_text_:a in 4355) [ClassicSimilarity], result of:
          0.011494976 = score(doc=4355,freq=24.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.22065444 = fieldWeight in 4355, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4355)
        0.023047457 = product of:
          0.046094913 = sum of:
            0.046094913 = weight(_text_:de in 4355) [ClassicSimilarity], result of:
              0.046094913 = score(doc=4355,freq=2.0), product of:
                0.19416152 = queryWeight, product of:
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.045180224 = queryNorm
                0.23740499 = fieldWeight in 4355, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4355)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Result diversification is a retrieval strategy for dealing with ambiguous or multi-faceted queries by providing documents that cover as many facets of the query as possible. We propose a result diversification framework based on query-specific clustering and cluster ranking, in which diversification is restricted to documents belonging to clusters that potentially contain a high percentage of relevant documents. Empirical results show that the proposed framework improves the performance of several existing diversification methods. The framework also gives rise to a simple yet effective cluster-based approach to result diversification that selects documents from different clusters to be included in a ranked list in a round robin fashion. We describe a set of experiments aimed at thoroughly analyzing the behavior of the two main components of the proposed diversification framework, ranking and selecting clusters for diversification. Both components have a crucial impact on the overall performance of our framework, but ranking clusters plays a more important role than selecting clusters. We also examine properties that clusters should have in order for our diversification framework to be effective. Most relevant documents should be contained in a small number of high-quality clusters, while there should be no dominantly large clusters. Also, documents from these high-quality clusters should have a diverse content. These properties are strongly correlated with the overall performance of the proposed diversification framework.
    Type
    a
  18. Costa Carvalho, A. da; Rossi, C.; Moura, E.S. de; Silva, A.S. da; Fernandes, D.: LePrEF: Learn to precompute evidence fusion for efficient query evaluation (2012) 0.02
    0.022360591 = product of:
      0.033540886 = sum of:
        0.010493428 = weight(_text_:a in 278) [ClassicSimilarity], result of:
          0.010493428 = score(doc=278,freq=20.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.20142901 = fieldWeight in 278, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=278)
        0.023047457 = product of:
          0.046094913 = sum of:
            0.046094913 = weight(_text_:de in 278) [ClassicSimilarity], result of:
              0.046094913 = score(doc=278,freq=2.0), product of:
                0.19416152 = queryWeight, product of:
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.045180224 = queryNorm
                0.23740499 = fieldWeight in 278, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=278)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    State-of-the-art search engine ranking methods combine several distinct sources of relevance evidence to produce a high-quality ranking of results for each query. The fusion of information is currently done at query-processing time, which has a direct effect on the response time of search systems. Previous research also shows that an alternative to improve search efficiency in textual databases is to precompute term impacts at indexing time. In this article, we propose a novel alternative to precompute term impacts, providing a generic framework for combining any distinct set of sources of evidence by using a machine-learning technique. This method retains the advantages of producing high-quality results, but avoids the costs of combining evidence at query-processing time. Our method, called Learn to Precompute Evidence Fusion (LePrEF), uses genetic programming to compute a unified precomputed impact value for each term found in each document prior to query processing, at indexing time. Compared with previous research on precomputing term impacts, our method offers the advantage of providing a generic framework to precompute impact using any set of relevance evidence at any text collection, whereas previous research articles do not. The precomputed impact values are indexed and used later for computing document ranking at query-processing time. By doing so, our method effectively reduces the query processing to simple additions of such impacts. We show that this approach, while leading to results comparable to state-of-the-art ranking methods, also can lead to a significant decrease in computational costs during query processing.
    Type
    a
  19. López-Pujalte, C.; Guerrero-Bote, V.P.; Moya-Anegón, F. de: Order-based fitness functions for genetic algorithms applied to relevance feedback (2003) 0.02
    0.0220016 = product of:
      0.0330024 = sum of:
        0.0099549405 = weight(_text_:a in 5154) [ClassicSimilarity], result of:
          0.0099549405 = score(doc=5154,freq=18.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.19109234 = fieldWeight in 5154, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5154)
        0.023047457 = product of:
          0.046094913 = sum of:
            0.046094913 = weight(_text_:de in 5154) [ClassicSimilarity], result of:
              0.046094913 = score(doc=5154,freq=2.0), product of:
                0.19416152 = queryWeight, product of:
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.045180224 = queryNorm
                0.23740499 = fieldWeight in 5154, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.297489 = idf(docFreq=1634, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5154)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Lopez-Pujalte and Guerrero-Bote test a relevance feedback genetic algorithm while varying its order based fitness functions and generating a function based upon the Ide dec-hi method as a base line. Using the non-zero weighted term types assigned to the query, and to the initially retrieved set of documents, as genes, a chromosome of equal length is created for each. The algorithm is provided with the chromosomes for judged relevant documents, for judged irrelevant documents, and for the irrelevant documents with their terms negated. The algorithm uses random selection of all possible genes, but gives greater likelihood to those with higher fitness values. When the fittest chromosome of a previous population is eliminated it is restored while the least fittest of the new population is eliminated in its stead. A crossover probability of .8 and a mutation probability of .2 were used with 20 generations. Three fitness functions were utilized; the Horng and Yeh function which takes into account the position of relevant documents, and two new functions, one based on accumulating the cosine similarity for retrieved documents, the other on stored fixed-recall-interval precessions. The Cranfield collection was used with the first 15 documents retrieved from 33 queries chosen to have at least 3 relevant documents in the first 15 and at least 5 relevant documents not initially retrieved. Precision was calculated at fixed recall levels using the residual collection method which removes viewed documents. One of the three functions improved the original retrieval by127 percent, while the Ide dec-hi method provided a 120 percent improvement.
    Type
    a
  20. MacFarlane, A.; Robertson, S.E.; McCann, J.A.: Parallel computing for passage retrieval (2004) 0.02
    0.021329116 = product of:
      0.031993672 = sum of:
        0.0075084865 = weight(_text_:a in 5108) [ClassicSimilarity], result of:
          0.0075084865 = score(doc=5108,freq=4.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.14413087 = fieldWeight in 5108, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=5108)
        0.024485188 = product of:
          0.048970375 = sum of:
            0.048970375 = weight(_text_:22 in 5108) [ClassicSimilarity], result of:
              0.048970375 = score(doc=5108,freq=2.0), product of:
                0.15821345 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045180224 = queryNorm
                0.30952093 = fieldWeight in 5108, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5108)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Date
    20. 1.2007 18:30:22
    Type
    a

Years

Languages

Types

  • a 337
  • el 8
  • m 7
  • s 3
  • p 2
  • r 2
  • More… Less…