Document (#9897)

Author
Cox, K.
Title
¬An experiment to test the utility of repeating phrases for information retrieval systems
Source
Journal of information science. 20(1994) no.5, p.348-355
Year
1994
Abstract
Describes a method of evaluating the utility of repeating phrases for information retrieval systems and of calculating recall and precision. The method compares 2 different techniques by asking people to perform tasks and then examining the outcomes of those tasks. Shows that people found those phrases automatically generated by finding repeating content words in documents to be good content indicators and discriminators, and better than using words alone or phrases generated randomly from content words.

Similar documents (content)

  1. Fagan, J.L.: ¬The effectiveness of a nonsyntactic approach to automatic phrase indexing for document retrieval (1989) 0.21
    0.2114522 = sum of:
      0.2114522 = product of:
        0.75518644 = sum of:
          0.048274048 = weight(abstract_txt:good in 3846) [ClassicSimilarity], result of:
            0.048274048 = score(doc=3846,freq=2.0), product of:
              0.09814898 = queryWeight, product of:
                5.5645866 = idf(docFreq=436, maxDocs=41962)
                0.017638143 = queryNorm
              0.49184462 = fieldWeight in 3846, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5645866 = idf(docFreq=436, maxDocs=41962)
                0.0625 = fieldNorm(doc=3846)
          0.044441715 = weight(abstract_txt:indicators in 3846) [ClassicSimilarity], result of:
            0.044441715 = score(doc=3846,freq=1.0), product of:
              0.1170255 = queryWeight, product of:
                1.0919365 = boost
                6.076175 = idf(docFreq=261, maxDocs=41962)
                0.017638143 = queryNorm
              0.37976095 = fieldWeight in 3846, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.076175 = idf(docFreq=261, maxDocs=41962)
                0.0625 = fieldNorm(doc=3846)
          0.022421205 = weight(abstract_txt:systems in 3846) [ClassicSimilarity], result of:
            0.022421205 = score(doc=3846,freq=2.0), product of:
              0.07416391 = queryWeight, product of:
                1.2293298 = boost
                3.4203563 = idf(docFreq=3729, maxDocs=41962)
                0.017638143 = queryNorm
              0.30231965 = fieldWeight in 3846, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4203563 = idf(docFreq=3729, maxDocs=41962)
                0.0625 = fieldNorm(doc=3846)
          0.028460767 = weight(abstract_txt:retrieval in 3846) [ClassicSimilarity], result of:
            0.028460767 = score(doc=3846,freq=3.0), product of:
              0.07595458 = queryWeight, product of:
                1.2440822 = boost
                3.4614017 = idf(docFreq=3579, maxDocs=41962)
                0.017638143 = queryNorm
              0.37470773 = fieldWeight in 3846, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4614017 = idf(docFreq=3579, maxDocs=41962)
                0.0625 = fieldNorm(doc=3846)
          0.076955765 = weight(abstract_txt:content in 3846) [ClassicSimilarity], result of:
            0.076955765 = score(doc=3846,freq=3.0), product of:
              0.16875142 = queryWeight, product of:
                2.2711272 = boost
                4.212628 = idf(docFreq=1688, maxDocs=41962)
                0.017638143 = queryNorm
              0.45603034 = fieldWeight in 3846, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.212628 = idf(docFreq=1688, maxDocs=41962)
                0.0625 = fieldNorm(doc=3846)
          0.09211758 = weight(abstract_txt:words in 3846) [ClassicSimilarity], result of:
            0.09211758 = score(doc=3846,freq=1.0), product of:
              0.27438185 = queryWeight, product of:
                2.8959792 = boost
                5.3716426 = idf(docFreq=529, maxDocs=41962)
                0.017638143 = queryNorm
              0.33572766 = fieldWeight in 3846, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3716426 = idf(docFreq=529, maxDocs=41962)
                0.0625 = fieldNorm(doc=3846)
          0.44251537 = weight(abstract_txt:phrases in 3846) [ClassicSimilarity], result of:
            0.44251537 = score(doc=3846,freq=3.0), product of:
              0.59614486 = queryWeight, product of:
                4.9290476 = boost
                6.857028 = idf(docFreq=119, maxDocs=41962)
                0.017638143 = queryNorm
              0.742295 = fieldWeight in 3846, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.857028 = idf(docFreq=119, maxDocs=41962)
                0.0625 = fieldNorm(doc=3846)
        0.28 = coord(7/25)
    
  2. Kim, W.; Wilbur, W.J.: Corpus-based statistical screening for content-bearing terms (2001) 0.15
    0.1506509 = sum of:
      0.1506509 = product of:
        0.75325453 = sum of:
          0.027888821 = weight(abstract_txt:recall in 189) [ClassicSimilarity], result of:
            0.027888821 = score(doc=189,freq=1.0), product of:
              0.10391205 = queryWeight, product of:
                1.0289401 = boost
                5.725626 = idf(docFreq=371, maxDocs=41962)
                0.017638143 = queryNorm
              0.26838872 = fieldWeight in 189, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.725626 = idf(docFreq=371, maxDocs=41962)
                0.046875 = fieldNorm(doc=189)
          0.02422435 = weight(abstract_txt:those in 189) [ClassicSimilarity], result of:
            0.02422435 = score(doc=189,freq=1.0), product of:
              0.119185634 = queryWeight, product of:
                1.5584184 = boost
                4.335977 = idf(docFreq=1492, maxDocs=41962)
                0.017638143 = queryNorm
              0.20324892 = fieldWeight in 189, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.335977 = idf(docFreq=1492, maxDocs=41962)
                0.046875 = fieldNorm(doc=189)
          0.0745121 = weight(abstract_txt:content in 189) [ClassicSimilarity], result of:
            0.0745121 = score(doc=189,freq=5.0), product of:
              0.16875142 = queryWeight, product of:
                2.2711272 = boost
                4.212628 = idf(docFreq=1688, maxDocs=41962)
                0.017638143 = queryNorm
              0.44154948 = fieldWeight in 189, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.212628 = idf(docFreq=1688, maxDocs=41962)
                0.046875 = fieldNorm(doc=189)
          0.11966424 = weight(abstract_txt:words in 189) [ClassicSimilarity], result of:
            0.11966424 = score(doc=189,freq=3.0), product of:
              0.27438185 = queryWeight, product of:
                2.8959792 = boost
                5.3716426 = idf(docFreq=529, maxDocs=41962)
                0.017638143 = queryNorm
              0.436123 = fieldWeight in 189, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.3716426 = idf(docFreq=529, maxDocs=41962)
                0.046875 = fieldNorm(doc=189)
          0.50696504 = weight(abstract_txt:phrases in 189) [ClassicSimilarity], result of:
            0.50696504 = score(doc=189,freq=7.0), product of:
              0.59614486 = queryWeight, product of:
                4.9290476 = boost
                6.857028 = idf(docFreq=119, maxDocs=41962)
                0.017638143 = queryNorm
              0.8504058 = fieldWeight in 189, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.857028 = idf(docFreq=119, maxDocs=41962)
                0.046875 = fieldNorm(doc=189)
        0.2 = coord(5/25)
    
  3. Lin, X.: Searching and browsing on map displays (1995) 0.15
    0.15055895 = sum of:
      0.15055895 = product of:
        0.53771055 = sum of:
          0.059077393 = weight(abstract_txt:compares in 3921) [ClassicSimilarity], result of:
            0.059077393 = score(doc=3921,freq=1.0), product of:
              0.107970886 = queryWeight, product of:
                1.0488429 = boost
                5.836377 = idf(docFreq=332, maxDocs=41962)
                0.017638143 = queryNorm
              0.5471604 = fieldWeight in 3921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.836377 = idf(docFreq=332, maxDocs=41962)
                0.09375 = fieldNorm(doc=3921)
          0.07045219 = weight(abstract_txt:perform in 3921) [ClassicSimilarity], result of:
            0.07045219 = score(doc=3921,freq=1.0), product of:
              0.12141959 = queryWeight, product of:
                1.1122477 = boost
                6.1891985 = idf(docFreq=233, maxDocs=41962)
                0.017638143 = queryNorm
              0.5802374 = fieldWeight in 3921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1891985 = idf(docFreq=233, maxDocs=41962)
                0.09375 = fieldNorm(doc=3921)
          0.024647746 = weight(abstract_txt:retrieval in 3921) [ClassicSimilarity], result of:
            0.024647746 = score(doc=3921,freq=1.0), product of:
              0.07595458 = queryWeight, product of:
                1.2440822 = boost
                3.4614017 = idf(docFreq=3579, maxDocs=41962)
                0.017638143 = queryNorm
              0.3245064 = fieldWeight in 3921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4614017 = idf(docFreq=3579, maxDocs=41962)
                0.09375 = fieldNorm(doc=3921)
          0.12092567 = weight(abstract_txt:randomly in 3921) [ClassicSimilarity], result of:
            0.12092567 = score(doc=3921,freq=1.0), product of:
              0.17406234 = queryWeight, product of:
                1.3317096 = boost
                7.4104133 = idf(docFreq=68, maxDocs=41962)
                0.017638143 = queryNorm
              0.6947262 = fieldWeight in 3921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4104133 = idf(docFreq=68, maxDocs=41962)
                0.09375 = fieldNorm(doc=3921)
          0.07438079 = weight(abstract_txt:people in 3921) [ClassicSimilarity], result of:
            0.07438079 = score(doc=3921,freq=1.0), product of:
              0.15861455 = queryWeight, product of:
                1.7978092 = boost
                5.0020328 = idf(docFreq=766, maxDocs=41962)
                0.017638143 = queryNorm
              0.46894056 = fieldWeight in 3921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0020328 = idf(docFreq=766, maxDocs=41962)
                0.09375 = fieldNorm(doc=3921)
          0.08531353 = weight(abstract_txt:tasks in 3921) [ClassicSimilarity], result of:
            0.08531353 = score(doc=3921,freq=1.0), product of:
              0.17379919 = queryWeight, product of:
                1.8818976 = boost
                5.235991 = idf(docFreq=606, maxDocs=41962)
                0.017638143 = queryNorm
              0.49087417 = fieldWeight in 3921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.235991 = idf(docFreq=606, maxDocs=41962)
                0.09375 = fieldNorm(doc=3921)
          0.10291322 = weight(abstract_txt:generated in 3921) [ClassicSimilarity], result of:
            0.10291322 = score(doc=3921,freq=1.0), product of:
              0.19694725 = queryWeight, product of:
                2.003305 = boost
                5.573782 = idf(docFreq=432, maxDocs=41962)
                0.017638143 = queryNorm
              0.52254206 = fieldWeight in 3921, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.573782 = idf(docFreq=432, maxDocs=41962)
                0.09375 = fieldNorm(doc=3921)
        0.28 = coord(7/25)
    
  4. Dumais, S.T.: Latent semantic analysis (2003) 0.13
    0.13161619 = sum of:
      0.13161619 = product of:
        0.36560053 = sum of:
          0.017067453 = weight(abstract_txt:good in 4463) [ClassicSimilarity], result of:
            0.017067453 = score(doc=4463,freq=1.0), product of:
              0.09814898 = queryWeight, product of:
                5.5645866 = idf(docFreq=436, maxDocs=41962)
                0.017638143 = queryNorm
              0.17389333 = fieldWeight in 4463, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5645866 = idf(docFreq=436, maxDocs=41962)
                0.03125 = fieldNorm(doc=4463)
          0.029894594 = weight(abstract_txt:technique in 4463) [ClassicSimilarity], result of:
            0.029894594 = score(doc=4463,freq=3.0), product of:
              0.09888445 = queryWeight, product of:
                1.0037397 = boost
                5.585397 = idf(docFreq=427, maxDocs=41962)
                0.017638143 = queryNorm
              0.30231845 = fieldWeight in 4463, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.585397 = idf(docFreq=427, maxDocs=41962)
                0.03125 = fieldNorm(doc=4463)
          0.029830039 = weight(abstract_txt:alone in 4463) [ClassicSimilarity], result of:
            0.029830039 = score(doc=4463,freq=1.0), product of:
              0.14241067 = queryWeight, product of:
                1.2045598 = boost
                6.7028775 = idf(docFreq=139, maxDocs=41962)
                0.017638143 = queryNorm
              0.20946492 = fieldWeight in 4463, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7028775 = idf(docFreq=139, maxDocs=41962)
                0.03125 = fieldNorm(doc=4463)
          0.011210603 = weight(abstract_txt:systems in 4463) [ClassicSimilarity], result of:
            0.011210603 = score(doc=4463,freq=2.0), product of:
              0.07416391 = queryWeight, product of:
                1.2293298 = boost
                3.4203563 = idf(docFreq=3729, maxDocs=41962)
                0.017638143 = queryNorm
              0.15115982 = fieldWeight in 4463, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4203563 = idf(docFreq=3729, maxDocs=41962)
                0.03125 = fieldNorm(doc=4463)
          0.024647746 = weight(abstract_txt:retrieval in 4463) [ClassicSimilarity], result of:
            0.024647746 = score(doc=4463,freq=9.0), product of:
              0.07595458 = queryWeight, product of:
                1.2440822 = boost
                3.4614017 = idf(docFreq=3579, maxDocs=41962)
                0.017638143 = queryNorm
              0.3245064 = fieldWeight in 4463, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                3.4614017 = idf(docFreq=3579, maxDocs=41962)
                0.03125 = fieldNorm(doc=4463)
          0.016149567 = weight(abstract_txt:those in 4463) [ClassicSimilarity], result of:
            0.016149567 = score(doc=4463,freq=1.0), product of:
              0.119185634 = queryWeight, product of:
                1.5584184 = boost
                4.335977 = idf(docFreq=1492, maxDocs=41962)
                0.017638143 = queryNorm
              0.13549928 = fieldWeight in 4463, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.335977 = idf(docFreq=1492, maxDocs=41962)
                0.03125 = fieldNorm(doc=4463)
          0.042943772 = weight(abstract_txt:people in 4463) [ClassicSimilarity], result of:
            0.042943772 = score(doc=4463,freq=3.0), product of:
              0.15861455 = queryWeight, product of:
                1.7978092 = boost
                5.0020328 = idf(docFreq=766, maxDocs=41962)
                0.017638143 = queryNorm
              0.27074295 = fieldWeight in 4463, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.0020328 = idf(docFreq=766, maxDocs=41962)
                0.03125 = fieldNorm(doc=4463)
          0.034304406 = weight(abstract_txt:generated in 4463) [ClassicSimilarity], result of:
            0.034304406 = score(doc=4463,freq=1.0), product of:
              0.19694725 = queryWeight, product of:
                2.003305 = boost
                5.573782 = idf(docFreq=432, maxDocs=41962)
                0.017638143 = queryNorm
              0.17418069 = fieldWeight in 4463, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.573782 = idf(docFreq=432, maxDocs=41962)
                0.03125 = fieldNorm(doc=4463)
          0.15955232 = weight(abstract_txt:words in 4463) [ClassicSimilarity], result of:
            0.15955232 = score(doc=4463,freq=12.0), product of:
              0.27438185 = queryWeight, product of:
                2.8959792 = boost
                5.3716426 = idf(docFreq=529, maxDocs=41962)
                0.017638143 = queryNorm
              0.5814974 = fieldWeight in 4463, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                5.3716426 = idf(docFreq=529, maxDocs=41962)
                0.03125 = fieldNorm(doc=4463)
        0.36 = coord(9/25)
    
  5. Sanderson, M.; Lawrie, D.: Building, testing, and applying concept hierarchies (2000) 0.12
    0.12431586 = sum of:
      0.12431586 = product of:
        0.6215793 = sum of:
          0.045775253 = weight(abstract_txt:experiment in 1038) [ClassicSimilarity], result of:
            0.045775253 = score(doc=1038,freq=1.0), product of:
              0.10285699 = queryWeight, product of:
                1.0237031 = boost
                5.6964846 = idf(docFreq=382, maxDocs=41962)
                0.017638143 = queryNorm
              0.44503784 = fieldWeight in 1038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6964846 = idf(docFreq=382, maxDocs=41962)
                0.078125 = fieldNorm(doc=1038)
          0.08576102 = weight(abstract_txt:generated in 1038) [ClassicSimilarity], result of:
            0.08576102 = score(doc=1038,freq=1.0), product of:
              0.19694725 = queryWeight, product of:
                2.003305 = boost
                5.573782 = idf(docFreq=432, maxDocs=41962)
                0.017638143 = queryNorm
              0.43545172 = fieldWeight in 1038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.573782 = idf(docFreq=432, maxDocs=41962)
                0.078125 = fieldNorm(doc=1038)
          0.05553804 = weight(abstract_txt:content in 1038) [ClassicSimilarity], result of:
            0.05553804 = score(doc=1038,freq=1.0), product of:
              0.16875142 = queryWeight, product of:
                2.2711272 = boost
                4.212628 = idf(docFreq=1688, maxDocs=41962)
                0.017638143 = queryNorm
              0.32911155 = fieldWeight in 1038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.212628 = idf(docFreq=1688, maxDocs=41962)
                0.078125 = fieldNorm(doc=1038)
          0.11514697 = weight(abstract_txt:words in 1038) [ClassicSimilarity], result of:
            0.11514697 = score(doc=1038,freq=1.0), product of:
              0.27438185 = queryWeight, product of:
                2.8959792 = boost
                5.3716426 = idf(docFreq=529, maxDocs=41962)
                0.017638143 = queryNorm
              0.41965958 = fieldWeight in 1038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3716426 = idf(docFreq=529, maxDocs=41962)
                0.078125 = fieldNorm(doc=1038)
          0.31935796 = weight(abstract_txt:phrases in 1038) [ClassicSimilarity], result of:
            0.31935796 = score(doc=1038,freq=1.0), product of:
              0.59614486 = queryWeight, product of:
                4.9290476 = boost
                6.857028 = idf(docFreq=119, maxDocs=41962)
                0.017638143 = queryNorm
              0.5357053 = fieldWeight in 1038, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.857028 = idf(docFreq=119, maxDocs=41962)
                0.078125 = fieldNorm(doc=1038)
        0.2 = coord(5/25)
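
The score trees above all follow the same arithmetic, which can be sketched in a few lines. This is a minimal sketch, not the catalog's actual code: it assumes the usual ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), and the function names `idf` and `term_score` are made up for illustration. The input numbers are copied from entry 5 (doc 1038). Lucene computes in 32-bit floats, so the result should agree with the listed 0.12431586 only to several decimal places.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    # score = queryWeight * fieldWeight, as in each "weight(...)" node
    tf = math.sqrt(freq)                          # tf(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm  # queryWeight
    field_weight = tf * term_idf * field_norm     # fieldWeight
    return query_weight * field_weight

MAX_DOCS = 41962
QUERY_NORM = 0.017638143
FIELD_NORM_1038 = 0.078125

# (term, termFreq, docFreq, boost) for the terms matched in doc 1038
TERMS_1038 = [
    ("experiment", 1.0, 382, 1.0237031),
    ("generated", 1.0, 432, 2.003305),
    ("content", 1.0, 1688, 2.2711272),
    ("words", 1.0, 529, 2.8959792),
    ("phrases", 1.0, 119, 4.9290476),
]

term_sum = sum(
    term_score(freq, doc_freq, MAX_DOCS, QUERY_NORM, FIELD_NORM_1038, boost)
    for _, freq, doc_freq, boost in TERMS_1038
)
coord = 5 / 25  # matched query terms / total query terms
score_1038 = term_sum * coord

print(f"sum of term scores: {term_sum:.7f}")
print(f"final score:        {score_1038:.7f}")
```

The `coord` factor explains why entry 4, which matches 9 of 25 query terms, can outrank entry 5 even though its sum of term scores is lower.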