Document (#11013)

Author
Story, R.E.
Title
¬An explanation of the effectiveness of latent semantic indexing by means of a Bayesian regression model
Source
Information processing and management. 32(1996) no.3, pp.329-344
Year
1996
Abstract
Latent Semantic Indexing (LSI) is an effective automated method for determining if a document is relevant to a reader based on a few words or an abstract describing the reader's needs. A particular feature of LSI is its ability to deal automatically with synonyms. Compares LSI to statistical regression and Bayesian methods. The relationships found can be useful in explaining the performance of LSI and in suggesting variations on the LSI approach.
Object
Latent Semantic Indexing

Similar documents (content)

  1. Zhu, W.Z.; Allen, R.B.: Document clustering using the LSI subspace signature model (2013) 0.22
    0.21696866 = sum of:
      0.21696866 = product of:
        0.77488804 = sum of:
          0.068904854 = weight(abstract_txt:means in 690) [ClassicSimilarity], result of:
            0.068904854 = score(doc=690,freq=3.0), product of:
              0.10154239 = queryWeight, product of:
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.02024863 = queryNorm
              0.67858213 = fieldWeight in 690, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.078125 = fieldNorm(doc=690)
          0.04180517 = weight(abstract_txt:effectiveness in 690) [ClassicSimilarity], result of:
            0.04180517 = score(doc=690,freq=1.0), product of:
              0.10495616 = queryWeight, product of:
                1.0166706 = boost
                5.098378 = idf(docFreq=733, maxDocs=44218)
                0.02024863 = queryNorm
              0.39831078 = fieldWeight in 690, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.098378 = idf(docFreq=733, maxDocs=44218)
                0.078125 = fieldNorm(doc=690)
          0.053819586 = weight(abstract_txt:statistical in 690) [ClassicSimilarity], result of:
            0.053819586 = score(doc=690,freq=1.0), product of:
              0.1242076 = queryWeight, product of:
                1.1059879 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.02024863 = queryNorm
              0.43330348 = fieldWeight in 690, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.078125 = fieldNorm(doc=690)
          0.064814635 = weight(abstract_txt:feature in 690) [ClassicSimilarity], result of:
            0.064814635 = score(doc=690,freq=1.0), product of:
              0.14059503 = queryWeight, product of:
                1.176688 = boost
                5.9008293 = idf(docFreq=328, maxDocs=44218)
                0.02024863 = queryNorm
              0.4610023 = fieldWeight in 690, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9008293 = idf(docFreq=328, maxDocs=44218)
                0.078125 = fieldNorm(doc=690)
          0.07342147 = weight(abstract_txt:indexing in 690) [ClassicSimilarity], result of:
            0.07342147 = score(doc=690,freq=2.0), product of:
              0.15278123 = queryWeight, product of:
                1.7347077 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.02024863 = queryNorm
              0.48056605 = fieldWeight in 690, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.078125 = fieldNorm(doc=690)
          0.09788271 = weight(abstract_txt:semantic in 690) [ClassicSimilarity], result of:
            0.09788271 = score(doc=690,freq=3.0), product of:
              0.16166954 = queryWeight, product of:
                1.7844542 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.02024863 = queryNorm
              0.6054493 = fieldWeight in 690, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.078125 = fieldNorm(doc=690)
          0.3742396 = weight(abstract_txt:latent in 690) [ClassicSimilarity], result of:
            0.3742396 = score(doc=690,freq=3.0), product of:
              0.39529744 = queryWeight, product of:
                2.7903154 = boost
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.02024863 = queryNorm
              0.9467291 = fieldWeight in 690, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.078125 = fieldNorm(doc=690)
        0.28 = coord(7/25)
    
  2. He, X.; Cai, D.; Liu, H.; Ma, W.Y.: Locality preserving indexing for document representation (2004) 0.17
    0.16608068 = sum of:
      0.16608068 = product of:
        1.3840057 = sum of:
          0.29368588 = weight(abstract_txt:indexing in 4079) [ClassicSimilarity], result of:
            0.29368588 = score(doc=4079,freq=2.0), product of:
              0.15278123 = queryWeight, product of:
                1.7347077 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.02024863 = queryNorm
              1.9222642 = fieldWeight in 4079, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.3125 = fieldNorm(doc=4079)
          0.22605045 = weight(abstract_txt:semantic in 4079) [ClassicSimilarity], result of:
            0.22605045 = score(doc=4079,freq=1.0), product of:
              0.16166954 = queryWeight, product of:
                1.7844542 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.02024863 = queryNorm
              1.3982254 = fieldWeight in 4079, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.3125 = fieldNorm(doc=4079)
          0.8642693 = weight(abstract_txt:latent in 4079) [ClassicSimilarity], result of:
            0.8642693 = score(doc=4079,freq=1.0), product of:
              0.39529744 = queryWeight, product of:
                2.7903154 = boost
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.02024863 = queryNorm
              2.1863773 = fieldWeight in 4079, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.3125 = fieldNorm(doc=4079)
        0.12 = coord(3/25)
    
  3. Dumais, S.T.: Latent semantic analysis (2003) 0.13
    0.1326285 = sum of:
      0.1326285 = product of:
        0.3684125 = sum of:
          0.015912894 = weight(abstract_txt:means in 2462) [ClassicSimilarity], result of:
            0.015912894 = score(doc=2462,freq=1.0), product of:
              0.10154239 = queryWeight, product of:
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.02024863 = queryNorm
              0.15671183 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.016722068 = weight(abstract_txt:effectiveness in 2462) [ClassicSimilarity], result of:
            0.016722068 = score(doc=2462,freq=1.0), product of:
              0.10495616 = queryWeight, product of:
                1.0166706 = boost
                5.098378 = idf(docFreq=733, maxDocs=44218)
                0.02024863 = queryNorm
              0.15932432 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.098378 = idf(docFreq=733, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.06704678 = weight(abstract_txt:words in 2462) [ClassicSimilarity], result of:
            0.06704678 = score(doc=2462,freq=12.0), product of:
              0.11570163 = queryWeight, product of:
                1.0674464 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.02024863 = queryNorm
              0.57948 = fieldWeight in 2462, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.030444955 = weight(abstract_txt:statistical in 2462) [ClassicSimilarity], result of:
            0.030444955 = score(doc=2462,freq=2.0), product of:
              0.1242076 = queryWeight, product of:
                1.1059879 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.02024863 = queryNorm
              0.24511346 = fieldWeight in 2462, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.026761321 = weight(abstract_txt:deal in 2462) [ClassicSimilarity], result of:
            0.026761321 = score(doc=2462,freq=1.0), product of:
              0.14359951 = queryWeight, product of:
                1.1891942 = boost
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.02024863 = queryNorm
              0.1863608 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.05657588 = weight(abstract_txt:synonyms in 2462) [ClassicSimilarity], result of:
            0.05657588 = score(doc=2462,freq=1.0), product of:
              0.23653866 = queryWeight, product of:
                1.526256 = boost
                7.653836 = idf(docFreq=56, maxDocs=44218)
                0.02024863 = queryNorm
              0.23918237 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.653836 = idf(docFreq=56, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.02936859 = weight(abstract_txt:indexing in 2462) [ClassicSimilarity], result of:
            0.02936859 = score(doc=2462,freq=2.0), product of:
              0.15278123 = queryWeight, product of:
                1.7347077 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.02024863 = queryNorm
              0.19222642 = fieldWeight in 2462, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.039153084 = weight(abstract_txt:semantic in 2462) [ClassicSimilarity], result of:
            0.039153084 = score(doc=2462,freq=3.0), product of:
              0.16166954 = queryWeight, product of:
                1.7844542 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.02024863 = queryNorm
              0.24217974 = fieldWeight in 2462, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
          0.08642693 = weight(abstract_txt:latent in 2462) [ClassicSimilarity], result of:
            0.08642693 = score(doc=2462,freq=1.0), product of:
              0.39529744 = queryWeight, product of:
                2.7903154 = boost
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.02024863 = queryNorm
              0.21863772 = fieldWeight in 2462, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.03125 = fieldNorm(doc=2462)
        0.36 = coord(9/25)
    
  4. Gordon, M.D.; Dumais, S.: Using latent semantic indexing for literature based discovery (1998) 0.13
    0.13208258 = sum of:
      0.13208258 = product of:
        0.6604129 = sum of:
          0.070945725 = weight(abstract_txt:effectiveness in 4892) [ClassicSimilarity], result of:
            0.070945725 = score(doc=4892,freq=2.0), product of:
              0.10495616 = queryWeight, product of:
                1.0166706 = boost
                5.098378 = idf(docFreq=733, maxDocs=44218)
                0.02024863 = queryNorm
              0.67595583 = fieldWeight in 4892, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.098378 = idf(docFreq=733, maxDocs=44218)
                0.09375 = fieldNorm(doc=4892)
          0.06458351 = weight(abstract_txt:statistical in 4892) [ClassicSimilarity], result of:
            0.06458351 = score(doc=4892,freq=1.0), product of:
              0.1242076 = queryWeight, product of:
                1.1059879 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.02024863 = queryNorm
              0.5199642 = fieldWeight in 4892, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.09375 = fieldNorm(doc=4892)
          0.062300187 = weight(abstract_txt:indexing in 4892) [ClassicSimilarity], result of:
            0.062300187 = score(doc=4892,freq=1.0), product of:
              0.15278123 = queryWeight, product of:
                1.7347077 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.02024863 = queryNorm
              0.40777382 = fieldWeight in 4892, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.09375 = fieldNorm(doc=4892)
          0.09590508 = weight(abstract_txt:semantic in 4892) [ClassicSimilarity], result of:
            0.09590508 = score(doc=4892,freq=2.0), product of:
              0.16166954 = queryWeight, product of:
                1.7844542 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.02024863 = queryNorm
              0.5932168 = fieldWeight in 4892, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.09375 = fieldNorm(doc=4892)
          0.36667845 = weight(abstract_txt:latent in 4892) [ClassicSimilarity], result of:
            0.36667845 = score(doc=4892,freq=2.0), product of:
              0.39529744 = queryWeight, product of:
                2.7903154 = boost
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.02024863 = queryNorm
              0.92760134 = fieldWeight in 4892, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.09375 = fieldNorm(doc=4892)
        0.2 = coord(5/25)
    
  5. Ding, C.H.Q.: ¬A probabilistic model for Latent Semantic Indexing (2005) 0.13
    0.12567799 = sum of:
      0.12567799 = product of:
        0.62838995 = sum of:
          0.068429336 = weight(abstract_txt:words in 3459) [ClassicSimilarity], result of:
            0.068429336 = score(doc=3459,freq=2.0), product of:
              0.11570163 = queryWeight, product of:
                1.0674464 = boost
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.02024863 = queryNorm
              0.5914293 = fieldWeight in 3459, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.353007 = idf(docFreq=568, maxDocs=44218)
                0.078125 = fieldNorm(doc=3459)
          0.07611239 = weight(abstract_txt:statistical in 3459) [ClassicSimilarity], result of:
            0.07611239 = score(doc=3459,freq=2.0), product of:
              0.1242076 = queryWeight, product of:
                1.1059879 = boost
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.02024863 = queryNorm
              0.6127837 = fieldWeight in 3459, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5462847 = idf(docFreq=468, maxDocs=44218)
                0.078125 = fieldNorm(doc=3459)
          0.05191682 = weight(abstract_txt:indexing in 3459) [ClassicSimilarity], result of:
            0.05191682 = score(doc=3459,freq=1.0), product of:
              0.15278123 = queryWeight, product of:
                1.7347077 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.02024863 = queryNorm
              0.3398115 = fieldWeight in 3459, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.078125 = fieldNorm(doc=3459)
          0.12636605 = weight(abstract_txt:semantic in 3459) [ClassicSimilarity], result of:
            0.12636605 = score(doc=3459,freq=5.0), product of:
              0.16166954 = queryWeight, product of:
                1.7844542 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.02024863 = queryNorm
              0.78163177 = fieldWeight in 3459, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.078125 = fieldNorm(doc=3459)
          0.30556536 = weight(abstract_txt:latent in 3459) [ClassicSimilarity], result of:
            0.30556536 = score(doc=3459,freq=2.0), product of:
              0.39529744 = queryWeight, product of:
                2.7903154 = boost
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.02024863 = queryNorm
              0.7730011 = fieldWeight in 3459, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.078125 = fieldNorm(doc=3459)
        0.2 = coord(5/25)
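
The score trees above can be checked by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity (TF-IDF) definitions: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and coord = matched terms / query terms. The function names (`idf`, `term_score`, `document_score`) are hypothetical and chosen for illustration; `queryNorm`, `boost`, and `fieldNorm` are copied from the listing rather than derived. The input values are those shown for entry 2 (doc 4079).

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as assumed here: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                         # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight

def document_score(term_scores, matched, total_terms):
    """Sum of the term scores, scaled by coord(matched/total_terms)."""
    return sum(term_scores) * (matched / total_terms)

# Constants copied from the listing (not derived in this sketch).
MAX_DOCS = 44218
QUERY_NORM = 0.02024863
FIELD_NORM_4079 = 0.3125

# (freq, docFreq, boost) for the three terms matched in doc 4079.
terms_4079 = [
    (2.0, 1551, 1.7347077),  # indexing
    (1.0, 1369, 1.7844542),  # semantic
    (1.0, 109, 2.7903154),   # latent
]

scores = [
    term_score(f, df, MAX_DOCS, FIELD_NORM_4079, QUERY_NORM, b)
    for f, df, b in terms_4079
]
total = document_score(scores, matched=3, total_terms=25)
print(f"{total:.4f}")
```

Under these assumptions the result agrees with the listed 0.16608068 up to floating-point rounding (the listing appears to use single precision, the sketch uses doubles). The same arithmetic shows why doc 4079 ranks second despite matching only 3 of 25 terms: its large fieldNorm (0.3125, which typically indicates a short field) inflates each term's fieldWeight, while the low coord factor (0.12) pulls the total back down.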