Document (#40479)

Author
Layfield, C.
Azzopardi, J.
Staff, C.
Title
Experiments with document retrieval from small text collections using Latent Semantic Analysis or term similarity with query coordination and automatic relevance feedback
Source
Semantic keyword-based search on structured data sources: COST Action IC1302. Second International KEYSTONE Conference, IKC 2016, Cluj-Napoca, Romania, September 8-9, 2016, Revised Selected Papers. Eds.: A. Calì et al.
Imprint
Springer International Publishing
Year
2017
Pages
pp. 25-36
Series
Information Systems and Applications, incl. Internet/Web, and HCI; 10151
Abstract
One of the problems faced by users of databases containing textual documents is the difficulty in retrieving relevant results due to the diverse vocabulary used in queries and contained in relevant documents, especially when there are only a small number of relevant documents. This problem is known as the Vocabulary Gap. The PIKES team have constructed a small test collection of 331 articles extracted from a blog and a Gold Standard for 35 queries selected from the blog's search log so the results of different approaches to semantic search can be compared. So far, prior approaches have included recognising Named Entities and relations (including temporal relations) in documents and queries, and representing them as 'semantic layers' in a retrieval system index. In this work, we take two different approaches that do not involve Named Entity Recognition. In the first approach, we process an unannotated version of the PIKES document collection using Latent Semantic Analysis and use a combination of query coordination and automatic relevance feedback, with which we outperform prior work. However, this approach is highly dependent on the underlying collection, and is not necessarily scalable to massive collections. In our second approach, we use an LSA Model generated by SEMILAR from a Wikipedia dump to generate a Term Similarity Matrix (TSM). We automatically expand the queries in the PIKES test collection with related terms from the TSM and submit them to a term-by-document matrix derived by indexing the PIKES collection using the Vector Space Model. Coupled with a combination of query coordination and automatic relevance feedback, this approach also outperforms prior work. The advantage of the second approach is that it is independent of the underlying document collection.
Content
See also: http://www.keystone-cost.eu/ikc2016/program.php.
Theme
Semantic context in indexing and retrieval
Object
Latent Semantic Analysis

Similar documents (author)

  1. Baillie, M.; Azzopardi, L.; Ruthven, I.: Evaluating epistemic uncertainty under incomplete assessments (2008) 3.66
    3.6566167 = sum of:
      3.6566167 = weight(author_txt:azzopardi in 2065) [ClassicSimilarity], result of:
        3.6566167 = fieldWeight in 2065, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.375 = fieldNorm(doc=2065)
    
  2. Balog, K.; Azzopardi, L.; Rijke, M. de: A language modeling framework for expert finding (2009) 3.66
    3.6566167 = sum of:
      3.6566167 = weight(author_txt:azzopardi in 2447) [ClassicSimilarity], result of:
        3.6566167 = fieldWeight in 2447, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.375 = fieldNorm(doc=2447)
    
  3. Russell-Rose, T.; Chamberlain, J.; Azzopardi, L.: Information retrieval in the workplace : a comparison of professional search practices (2018) 3.66
    3.6566167 = sum of:
      3.6566167 = weight(author_txt:azzopardi in 5048) [ClassicSimilarity], result of:
        3.6566167 = fieldWeight in 5048, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.375 = fieldNorm(doc=5048)
    
  4. Azzopardi, J.; Benedetti, F.; Guerra, F.; Lupu, M.: Back to the sketch-board : integrating keyword search, semantics, and information retrieval (2017) 3.05
    3.0471804 = sum of:
      3.0471804 = weight(author_txt:azzopardi in 3484) [ClassicSimilarity], result of:
        3.0471804 = fieldWeight in 3484, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.3125 = fieldNorm(doc=3484)
    
  5. Ruthven, I.; Baillie, M.; Azzopardi, L.; Bierig, R.; Nicol, E.; Sweeney, S.; Yaciki, M.: Contextual factors affecting the utility of surrogates within exploratory search (2008) 2.44
    2.4377444 = sum of:
      2.4377444 = weight(author_txt:azzopardi in 2042) [ClassicSimilarity], result of:
        2.4377444 = fieldWeight in 2042, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.7509775 = idf(docFreq=6, maxDocs=44218)
          0.25 = fieldNorm(doc=2042)
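
The breakdowns in this record all follow the same arithmetic: a field weight is tf x idf x fieldNorm, and the content-based scores additionally multiply by a query weight (boost x idf x queryNorm) and scale the sum by a coord factor. The following is a minimal Python sketch of that arithmetic, not Lucene's own code. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the figures printed in this record; the function names are illustrative, and queryNorm and boost are simply copied from the listing rather than derived.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency (assumed)."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)) (assumed)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int,
                 field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the 'product of' lines."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


def term_score(freq: float, doc_freq: int, max_docs: int, field_norm: float,
               boost: float, query_norm: float) -> float:
    """Per-term score = queryWeight * fieldWeight,
    where queryWeight = boost * idf * queryNorm."""
    query_weight = boost * classic_idf(doc_freq, max_docs) * query_norm
    return query_weight * field_weight(freq, doc_freq, max_docs, field_norm)


def coord(matched_terms: int, query_terms: int) -> float:
    """Fraction of query terms that matched the document."""
    return matched_terms / query_terms


if __name__ == "__main__":
    # Author match, first entry of the author list: freq=1, docFreq=6,
    # maxDocs=44218, fieldNorm=0.375. The record reports 3.6566167.
    print(field_weight(1.0, 6, 44218, 0.375))

    # Term "combination" in doc 2399 of the content list: boost=1.0166018,
    # queryNorm=0.02597015, fieldNorm=0.078125. The record reports 0.069053516.
    print(term_score(1.0, 368, 44218, 0.078125, 1.0166018, 0.02597015))

    # Final score of that entry: sum of term scores (0.92453235, copied
    # from the listing) times coord(12/25). The record reports 0.44377553.
    print(0.92453235 * coord(12, 25))
```

The printed values should agree with the listing to within single-precision rounding, since the record shows 32-bit float output while Python computes in double precision.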
    

Similar documents (content)

  1. Deerwester, S.C.; Dumais, S.T.; Landauer, T.K.; Furnas, G.W.; Harshman, R.A.: Indexing by latent semantic analysis (1990) 0.44
    0.44377553 = sum of:
      0.44377553 = product of:
        0.92453235 = sum of:
          0.069053516 = weight(abstract_txt:combination in 2399) [ClassicSimilarity], result of:
            0.069053516 = score(doc=2399,freq=1.0), product of:
              0.15276031 = queryWeight, product of:
                1.0166018 = boost
                5.7860904 = idf(docFreq=368, maxDocs=44218)
                0.02597015 = queryNorm
              0.45203832 = fieldWeight in 2399, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7860904 = idf(docFreq=368, maxDocs=44218)
                0.078125 = fieldNorm(doc=2399)
          0.16222969 = weight(abstract_txt:matrix in 2399) [ClassicSimilarity], result of:
            0.16222969 = score(doc=2399,freq=2.0), product of:
              0.2142711 = queryWeight, product of:
                1.2040025 = boost
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.02597015 = queryNorm
              0.75712353 = fieldWeight in 2399, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.078125 = fieldNorm(doc=2399)
          0.026609942 = weight(abstract_txt:from in 2399) [ClassicSimilarity], result of:
            0.026609942 = score(doc=2399,freq=2.0), product of:
              0.08714035 = queryWeight, product of:
                1.2140183 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.02597015 = queryNorm
              0.30536878 = fieldWeight in 2399, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.078125 = fieldNorm(doc=2399)
          0.053263046 = weight(abstract_txt:relevant in 2399) [ClassicSimilarity], result of:
            0.053263046 = score(doc=2399,freq=1.0), product of:
              0.14707349 = queryWeight, product of:
                1.2216827 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.02597015 = queryNorm
              0.36215258 = fieldWeight in 2399, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.078125 = fieldNorm(doc=2399)
          0.059179597 = weight(abstract_txt:term in 2399) [ClassicSimilarity], result of:
            0.059179597 = score(doc=2399,freq=1.0), product of:
              0.15777266 = queryWeight, product of:
                1.2653396 = boost
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.02597015 = queryNorm
              0.37509412 = fieldWeight in 2399, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.078125 = fieldNorm(doc=2399)
          0.023623677 = weight(abstract_txt:with in 2399) [ClassicSimilarity], result of:
            0.023623677 = score(doc=2399,freq=2.0), product of:
              0.085535966 = queryWeight, product of:
                1.317591 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.02597015 = queryNorm
              0.27618414 = fieldWeight in 2399, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.078125 = fieldNorm(doc=2399)
          0.10605787 = weight(abstract_txt:automatic in 2399) [ClassicSimilarity], result of:
            0.10605787 = score(doc=2399,freq=2.0), product of:
              0.18475762 = queryWeight, product of:
                1.3692805 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.02597015 = queryNorm
              0.57403785 = fieldWeight in 2399, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.078125 = fieldNorm(doc=2399)
          0.099814504 = weight(abstract_txt:documents in 2399) [ClassicSimilarity], result of:
            0.099814504 = score(doc=2399,freq=4.0), product of:
              0.15500264 = queryWeight, product of:
                1.4482054 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.02597015 = queryNorm
              0.64395356 = fieldWeight in 2399, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.078125 = fieldNorm(doc=2399)
          0.07975151 = weight(abstract_txt:document in 2399) [ClassicSimilarity], result of:
            0.07975151 = score(doc=2399,freq=2.0), product of:
              0.16815609 = queryWeight, product of:
                1.5084013 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.02597015 = queryNorm
              0.4742707 = fieldWeight in 2399, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.078125 = fieldNorm(doc=2399)
          0.06386187 = weight(abstract_txt:semantic in 2399) [ClassicSimilarity], result of:
            0.06386187 = score(doc=2399,freq=1.0), product of:
              0.18269406 = queryWeight, product of:
                1.5722544 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.02597015 = queryNorm
              0.34955636 = fieldWeight in 2399, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.078125 = fieldNorm(doc=2399)
          0.04682089 = weight(abstract_txt:approach in 2399) [ClassicSimilarity], result of:
            0.04682089 = score(doc=2399,freq=1.0), product of:
              0.16001467 = queryWeight, product of:
                1.645112 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.02597015 = queryNorm
              0.29260373 = fieldWeight in 2399, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.078125 = fieldNorm(doc=2399)
          0.13426632 = weight(abstract_txt:queries in 2399) [ClassicSimilarity], result of:
            0.13426632 = score(doc=2399,freq=2.0), product of:
              0.23797503 = queryWeight, product of:
                1.7944291 = boost
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.02597015 = queryNorm
              0.5642034 = fieldWeight in 2399, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.078125 = fieldNorm(doc=2399)
        0.48 = coord(12/25)
    
  2. Dumais, S.T.: Latent semantic analysis (2003) 0.40
    0.3974006 = sum of:
      0.3974006 = product of:
        0.58441263 = sum of:
          0.011684514 = weight(abstract_txt:work in 2462) [ClassicSimilarity], result of:
            0.011684514 = score(doc=2462,freq=1.0), product of:
              0.09854113 = queryWeight, product of:
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.02597015 = queryNorm
              0.11857499 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.027621405 = weight(abstract_txt:combination in 2462) [ClassicSimilarity], result of:
            0.027621405 = score(doc=2462,freq=1.0), product of:
              0.15276031 = queryWeight, product of:
                1.0166018 = boost
                5.7860904 = idf(docFreq=368, maxDocs=44218)
                0.02597015 = queryNorm
              0.18081532 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7860904 = idf(docFreq=368, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.028097592 = weight(abstract_txt:similarity in 2462) [ClassicSimilarity], result of:
            0.028097592 = score(doc=2462,freq=1.0), product of:
              0.154511 = queryWeight, product of:
                1.0224105 = boost
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.02597015 = queryNorm
              0.18184848 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.04588549 = weight(abstract_txt:matrix in 2462) [ClassicSimilarity], result of:
            0.04588549 = score(doc=2462,freq=1.0), product of:
              0.2142711 = queryWeight, product of:
                1.2040025 = boost
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.02597015 = queryNorm
              0.21414688 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.0075264284 = weight(abstract_txt:from in 2462) [ClassicSimilarity], result of:
            0.0075264284 = score(doc=2462,freq=1.0), product of:
              0.08714035 = queryWeight, product of:
                1.2140183 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.02597015 = queryNorm
              0.08637133 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.036258906 = weight(abstract_txt:approaches in 2462) [ClassicSimilarity], result of:
            0.036258906 = score(doc=2462,freq=3.0), product of:
              0.14536053 = queryWeight, product of:
                1.2145474 = boost
                4.6084785 = idf(docFreq=1197, maxDocs=44218)
                0.02597015 = queryNorm
              0.2494412 = fieldWeight in 2462, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.6084785 = idf(docFreq=1197, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.036901716 = weight(abstract_txt:relevant in 2462) [ClassicSimilarity], result of:
            0.036901716 = score(doc=2462,freq=3.0), product of:
              0.14707349 = queryWeight, product of:
                1.2216827 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.02597015 = queryNorm
              0.25090665 = fieldWeight in 2462, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.048833214 = weight(abstract_txt:latent in 2462) [ClassicSimilarity], result of:
            0.048833214 = score(doc=2462,freq=1.0), product of:
              0.2233522 = queryWeight, product of:
                1.2292514 = boost
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.02597015 = queryNorm
              0.21863772 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.051378258 = weight(abstract_txt:query in 2462) [ClassicSimilarity], result of:
            0.051378258 = score(doc=2462,freq=5.0), product of:
              0.15467021 = queryWeight, product of:
                1.252837 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.02597015 = queryNorm
              0.3321794 = fieldWeight in 2462, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.02367184 = weight(abstract_txt:term in 2462) [ClassicSimilarity], result of:
            0.02367184 = score(doc=2462,freq=1.0), product of:
              0.15777266 = queryWeight, product of:
                1.2653396 = boost
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.02597015 = queryNorm
              0.15003765 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.009449471 = weight(abstract_txt:with in 2462) [ClassicSimilarity], result of:
            0.009449471 = score(doc=2462,freq=2.0), product of:
              0.085535966 = queryWeight, product of:
                1.317591 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.02597015 = queryNorm
              0.110473655 = fieldWeight in 2462, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.042423148 = weight(abstract_txt:automatic in 2462) [ClassicSimilarity], result of:
            0.042423148 = score(doc=2462,freq=2.0), product of:
              0.18475762 = queryWeight, product of:
                1.3692805 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.02597015 = queryNorm
              0.22961514 = fieldWeight in 2462, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.06312823 = weight(abstract_txt:documents in 2462) [ClassicSimilarity], result of:
            0.06312823 = score(doc=2462,freq=10.0), product of:
              0.15500264 = queryWeight, product of:
                1.4482054 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.02597015 = queryNorm
              0.40727198 = fieldWeight in 2462, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.031900603 = weight(abstract_txt:document in 2462) [ClassicSimilarity], result of:
            0.031900603 = score(doc=2462,freq=2.0), product of:
              0.16815609 = queryWeight, product of:
                1.5084013 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.02597015 = queryNorm
              0.18970828 = fieldWeight in 2462, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.0442448 = weight(abstract_txt:semantic in 2462) [ClassicSimilarity], result of:
            0.0442448 = score(doc=2462,freq=3.0), product of:
              0.18269406 = queryWeight, product of:
                1.5722544 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.02597015 = queryNorm
              0.24217974 = fieldWeight in 2462, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.032438464 = weight(abstract_txt:approach in 2462) [ClassicSimilarity], result of:
            0.032438464 = score(doc=2462,freq=3.0), product of:
              0.16001467 = queryWeight, product of:
                1.645112 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.02597015 = queryNorm
              0.20272182 = fieldWeight in 2462, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.042968497 = weight(abstract_txt:collection in 2462) [ClassicSimilarity], result of:
            0.042968497 = score(doc=2462,freq=1.0), product of:
              0.29579255 = queryWeight, product of:
                2.4501903 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.02597015 = queryNorm
              0.14526565 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
        0.68 = coord(17/25)
    
  3. Crouch, C.J.; Crouch, D.B.; Chen, Q.; Holtz, S.J.: Improving the retrieval effectiveness of very short queries (2002) 0.38
    0.37978002 = sum of:
      0.37978002 = product of:
        0.7912084 = sum of:
          0.023369027 = weight(abstract_txt:work in 2572) [ClassicSimilarity], result of:
            0.023369027 = score(doc=2572,freq=1.0), product of:
              0.09854113 = queryWeight, product of:
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.02597015 = queryNorm
              0.23714998 = fieldWeight in 2572, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.0625 = fieldNorm(doc=2572)
          0.021287953 = weight(abstract_txt:from in 2572) [ClassicSimilarity], result of:
            0.021287953 = score(doc=2572,freq=2.0), product of:
              0.08714035 = queryWeight, product of:
                1.2140183 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.02597015 = queryNorm
              0.24429502 = fieldWeight in 2572, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=2572)
          0.07380343 = weight(abstract_txt:relevant in 2572) [ClassicSimilarity], result of:
            0.07380343 = score(doc=2572,freq=3.0), product of:
              0.14707349 = queryWeight, product of:
                1.2216827 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.02597015 = queryNorm
              0.5018133 = fieldWeight in 2572, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.0625 = fieldNorm(doc=2572)
          0.064988926 = weight(abstract_txt:query in 2572) [ClassicSimilarity], result of:
            0.064988926 = score(doc=2572,freq=2.0), product of:
              0.15467021 = queryWeight, product of:
                1.252837 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.02597015 = queryNorm
              0.4201774 = fieldWeight in 2572, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0625 = fieldNorm(doc=2572)
          0.01336357 = weight(abstract_txt:with in 2572) [ClassicSimilarity], result of:
            0.01336357 = score(doc=2572,freq=1.0), product of:
              0.085535966 = queryWeight, product of:
                1.317591 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.02597015 = queryNorm
              0.15623334 = fieldWeight in 2572, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=2572)
          0.05999539 = weight(abstract_txt:automatic in 2572) [ClassicSimilarity], result of:
            0.05999539 = score(doc=2572,freq=1.0), product of:
              0.18475762 = queryWeight, product of:
                1.3692805 = boost
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.02597015 = queryNorm
              0.32472485 = fieldWeight in 2572, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1955976 = idf(docFreq=665, maxDocs=44218)
                0.0625 = fieldNorm(doc=2572)
          0.0798516 = weight(abstract_txt:documents in 2572) [ClassicSimilarity], result of:
            0.0798516 = score(doc=2572,freq=4.0), product of:
              0.15500264 = queryWeight, product of:
                1.4482054 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.02597015 = queryNorm
              0.5151628 = fieldWeight in 2572, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=2572)
          0.045114264 = weight(abstract_txt:document in 2572) [ClassicSimilarity], result of:
            0.045114264 = score(doc=2572,freq=1.0), product of:
              0.16815609 = queryWeight, product of:
                1.5084013 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.02597015 = queryNorm
              0.26828802 = fieldWeight in 2572, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=2572)
          0.12706669 = weight(abstract_txt:feedback in 2572) [ClassicSimilarity], result of:
            0.12706669 = score(doc=2572,freq=2.0), product of:
              0.24184377 = queryWeight, product of:
                1.566602 = boost
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.02597015 = queryNorm
              0.52540815 = fieldWeight in 2572, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.0625 = fieldNorm(doc=2572)
          0.06487693 = weight(abstract_txt:approach in 2572) [ClassicSimilarity], result of:
            0.06487693 = score(doc=2572,freq=3.0), product of:
              0.16001467 = queryWeight, product of:
                1.645112 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.02597015 = queryNorm
              0.40544364 = fieldWeight in 2572, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=2572)
          0.13155358 = weight(abstract_txt:queries in 2572) [ClassicSimilarity], result of:
            0.13155358 = score(doc=2572,freq=3.0), product of:
              0.23797503 = queryWeight, product of:
                1.7944291 = boost
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.02597015 = queryNorm
              0.5528041 = fieldWeight in 2572, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.0625 = fieldNorm(doc=2572)
          0.08593699 = weight(abstract_txt:collection in 2572) [ClassicSimilarity], result of:
            0.08593699 = score(doc=2572,freq=1.0), product of:
              0.29579255 = queryWeight, product of:
                2.4501903 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.02597015 = queryNorm
              0.2905313 = fieldWeight in 2572, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.0625 = fieldNorm(doc=2572)
        0.48 = coord(12/25)
    
  4. Cai, F.; Wang, S.; Rijke, M. de: Behavior-based personalization in web search (2017) 0.37
    0.36800694 = sum of:
      0.36800694 = product of:
        0.76668113 = sum of:
          0.023369027 = weight(abstract_txt:work in 3527) [ClassicSimilarity], result of:
            0.023369027 = score(doc=3527,freq=1.0), product of:
              0.09854113 = queryWeight, product of:
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.02597015 = queryNorm
              0.23714998 = fieldWeight in 3527, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.0625 = fieldNorm(doc=3527)
          0.05524281 = weight(abstract_txt:combination in 3527) [ClassicSimilarity], result of:
            0.05524281 = score(doc=3527,freq=1.0), product of:
              0.15276031 = queryWeight, product of:
                1.0166018 = boost
                5.7860904 = idf(docFreq=368, maxDocs=44218)
                0.02597015 = queryNorm
              0.36163065 = fieldWeight in 3527, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7860904 = idf(docFreq=368, maxDocs=44218)
                0.0625 = fieldNorm(doc=3527)
          0.09177098 = weight(abstract_txt:matrix in 3527) [ClassicSimilarity], result of:
            0.09177098 = score(doc=3527,freq=1.0), product of:
              0.2142711 = queryWeight, product of:
                1.2040025 = boost
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.02597015 = queryNorm
              0.42829376 = fieldWeight in 3527, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.0625 = fieldNorm(doc=3527)
          0.026072312 = weight(abstract_txt:from in 3527) [ClassicSimilarity], result of:
            0.026072312 = score(doc=3527,freq=3.0), product of:
              0.08714035 = queryWeight, product of:
                1.2140183 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.02597015 = queryNorm
              0.29919907 = fieldWeight in 3527, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=3527)
          0.04186818 = weight(abstract_txt:approaches in 3527) [ClassicSimilarity], result of:
            0.04186818 = score(doc=3527,freq=1.0), product of:
              0.14536053 = queryWeight, product of:
                1.2145474 = boost
                4.6084785 = idf(docFreq=1197, maxDocs=44218)
                0.02597015 = queryNorm
              0.2880299 = fieldWeight in 3527, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6084785 = idf(docFreq=1197, maxDocs=44218)
                0.0625 = fieldNorm(doc=3527)
          0.042610433 = weight(abstract_txt:relevant in 3527) [ClassicSimilarity], result of:
            0.042610433 = score(doc=3527,freq=1.0), product of:
              0.14707349 = queryWeight, product of:
                1.2216827 = boost
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.02597015 = queryNorm
              0.28972206 = fieldWeight in 3527, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.635553 = idf(docFreq=1165, maxDocs=44218)
                0.0625 = fieldNorm(doc=3527)
          0.11256413 = weight(abstract_txt:query in 3527) [ClassicSimilarity], result of:
            0.11256413 = score(doc=3527,freq=6.0), product of:
              0.15467021 = queryWeight, product of:
                1.252837 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.02597015 = queryNorm
              0.72776866 = fieldWeight in 3527, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0625 = fieldNorm(doc=3527)
          0.08887939 = weight(abstract_txt:relevance in 3527) [ClassicSimilarity], result of:
            0.08887939 = score(doc=3527,freq=3.0), product of:
              0.16647565 = queryWeight, product of:
                1.2997702 = boost
                4.931848 = idf(docFreq=866, maxDocs=44218)
                0.02597015 = queryNorm
              0.5338882 = fieldWeight in 3527, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.931848 = idf(docFreq=866, maxDocs=44218)
                0.0625 = fieldNorm(doc=3527)
          0.018898942 = weight(abstract_txt:with in 3527) [ClassicSimilarity], result of:
            0.018898942 = score(doc=3527,freq=2.0), product of:
              0.085535966 = queryWeight, product of:
                1.317591 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.02597015 = queryNorm
              0.22094731 = fieldWeight in 3527, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=3527)
          0.0798516 = weight(abstract_txt:documents in 3527) [ClassicSimilarity], result of:
            0.0798516 = score(doc=3527,freq=4.0), product of:
              0.15500264 = queryWeight, product of:
                1.4482054 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.02597015 = queryNorm
              0.5151628 = fieldWeight in 3527, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=3527)
          0.0781402 = weight(abstract_txt:document in 3527) [ClassicSimilarity], result of:
            0.0781402 = score(doc=3527,freq=3.0), product of:
              0.16815609 = queryWeight, product of:
                1.5084013 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.02597015 = queryNorm
              0.46468848 = fieldWeight in 3527, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=3527)
          0.10741305 = weight(abstract_txt:queries in 3527) [ClassicSimilarity], result of:
            0.10741305 = score(doc=3527,freq=2.0), product of:
              0.23797503 = queryWeight, product of:
                1.7944291 = boost
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.02597015 = queryNorm
              0.4513627 = fieldWeight in 3527, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.0625 = fieldNorm(doc=3527)
        0.48 = coord(12/25)
    
  5. Talvensaari, T.; Laurikkala, J.; Järvelin, K.; Juhola, M.: A study on automatic creation of a comparable document collection in cross-language information retrieval (2006) 0.36
    0.3560825 = sum of:
      0.3560825 = product of:
        0.7418386 = sum of:
          0.056195185 = weight(abstract_txt:similarity in 5601) [ClassicSimilarity], result of:
            0.056195185 = score(doc=5601,freq=1.0), product of:
              0.154511 = queryWeight, product of:
                1.0224105 = boost
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.02597015 = queryNorm
              0.36369696 = fieldWeight in 5601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.0625 = fieldNorm(doc=5601)
          0.021287953 = weight(abstract_txt:from in 5601) [ClassicSimilarity], result of:
            0.021287953 = score(doc=5601,freq=2.0), product of:
              0.08714035 = queryWeight, product of:
                1.2140183 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.02597015 = queryNorm
              0.24429502 = fieldWeight in 5601, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=5601)
          0.064988926 = weight(abstract_txt:query in 5601) [ClassicSimilarity], result of:
            0.064988926 = score(doc=5601,freq=2.0), product of:
              0.15467021 = queryWeight, product of:
                1.252837 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.02597015 = queryNorm
              0.4201774 = fieldWeight in 5601, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0625 = fieldNorm(doc=5601)
          0.04734368 = weight(abstract_txt:term in 5601) [ClassicSimilarity], result of:
            0.04734368 = score(doc=5601,freq=1.0), product of:
              0.15777266 = queryWeight, product of:
                1.2653396 = boost
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.02597015 = queryNorm
              0.3000753 = fieldWeight in 5601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.0625 = fieldNorm(doc=5601)
          0.05131454 = weight(abstract_txt:relevance in 5601) [ClassicSimilarity], result of:
            0.05131454 = score(doc=5601,freq=1.0), product of:
              0.16647565 = queryWeight, product of:
                1.2997702 = boost
                4.931848 = idf(docFreq=866, maxDocs=44218)
                0.02597015 = queryNorm
              0.3082405 = fieldWeight in 5601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.931848 = idf(docFreq=866, maxDocs=44218)
                0.0625 = fieldNorm(doc=5601)
          0.03273393 = weight(abstract_txt:with in 5601) [ClassicSimilarity], result of:
            0.03273393 = score(doc=5601,freq=6.0), product of:
              0.085535966 = queryWeight, product of:
                1.317591 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.02597015 = queryNorm
              0.38269198 = fieldWeight in 5601, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=5601)
          0.06491364 = weight(abstract_txt:small in 5601) [ClassicSimilarity], result of:
            0.06491364 = score(doc=5601,freq=1.0), product of:
              0.19472173 = queryWeight, product of:
                1.4057188 = boost
                5.333859 = idf(docFreq=579, maxDocs=44218)
                0.02597015 = queryNorm
              0.3333662 = fieldWeight in 5601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.333859 = idf(docFreq=579, maxDocs=44218)
                0.0625 = fieldNorm(doc=5601)
          0.0399258 = weight(abstract_txt:documents in 5601) [ClassicSimilarity], result of:
            0.0399258 = score(doc=5601,freq=1.0), product of:
              0.15500264 = queryWeight, product of:
                1.4482054 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.02597015 = queryNorm
              0.2575814 = fieldWeight in 5601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=5601)
          0.10087856 = weight(abstract_txt:document in 5601) [ClassicSimilarity], result of:
            0.10087856 = score(doc=5601,freq=5.0), product of:
              0.16815609 = queryWeight, product of:
                1.5084013 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.02597015 = queryNorm
              0.59991026 = fieldWeight in 5601, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=5601)
          0.037456714 = weight(abstract_txt:approach in 5601) [ClassicSimilarity], result of:
            0.037456714 = score(doc=5601,freq=1.0), product of:
              0.16001467 = queryWeight, product of:
                1.645112 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.02597015 = queryNorm
              0.234083 = fieldWeight in 5601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=5601)
          0.0759525 = weight(abstract_txt:queries in 5601) [ClassicSimilarity], result of:
            0.0759525 = score(doc=5601,freq=1.0), product of:
              0.23797503 = queryWeight, product of:
                1.7944291 = boost
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.02597015 = queryNorm
              0.31916162 = fieldWeight in 5601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.0625 = fieldNorm(doc=5601)
          0.14884724 = weight(abstract_txt:collection in 5601) [ClassicSimilarity], result of:
            0.14884724 = score(doc=5601,freq=3.0), product of:
              0.29579255 = queryWeight, product of:
                2.4501903 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.02597015 = queryNorm
              0.50321496 = fieldWeight in 5601, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.0625 = fieldNorm(doc=5601)
        0.48 = coord(12/25)
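
The score trees above can be reproduced by hand. The following is a minimal sketch, not the catalog's actual code, of how the figures for entry 5 (doc=5601) appear to fit together. It assumes the classic TF-IDF formulas that the listed values are consistent with: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final multiplication by coord (matched query terms divided by all query terms). The function names and the TERMS_5601 table are illustrative; the numbers in the table are copied from the tree for entry 5.

```python
import math

# Constants copied from the explain output above.
MAX_DOCS = 44218
QUERY_NORM = 0.02597015
FIELD_NORM = 0.0625  # fieldNorm(doc=5601)


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency; assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    """Term frequency factor; assumed form: square root of the raw count."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, boost=1.0):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = tf(freq) * term_idf * FIELD_NORM
    return query_weight * field_weight


# (term, termFreq, docFreq, boost) as listed for doc 5601.
TERMS_5601 = [
    ("similarity", 1.0, 356, 1.0224105),
    ("from", 2.0, 7577, 1.2140183),
    ("query", 2.0, 1035, 1.252837),
    ("term", 1.0, 987, 1.2653396),
    ("relevance", 1.0, 866, 1.2997702),
    ("with", 6.0, 9868, 1.317591),
    ("small", 1.0, 579, 1.4057188),
    ("documents", 1.0, 1949, 1.4482054),
    ("document", 5.0, 1642, 1.5084013),
    ("approach", 1.0, 2839, 1.645112),
    ("queries", 1.0, 727, 1.7944291),
    ("collection", 3.0, 1150, 2.4501903),
]

# Sum of the per-term scores, then scale by the coordination factor:
# 12 of the 25 query terms occur in this abstract.
summed = sum(term_score(freq, df, boost) for _, freq, df, boost in TERMS_5601)
coord = 12 / 25
total = coord * summed

for term, freq, df, boost in TERMS_5601:
    print(f"{term:12s} {term_score(freq, df, boost):.6f}")
print(f"sum   = {summed:.6f}")
print(f"coord = {coord:.2f}")
print(f"total = {total:.6f}")
```

Under these assumptions, the computed sum and total should agree closely with the 0.7418386 and 0.3560825 shown for entry 5. Small differences in the last digits are expected, because the explain output was produced with single-precision arithmetic and this sketch uses double precision.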