Search (2 results, page 1 of 1)

Berry, M.W.; Dumais, S.T.; O'Brien, G.W.: Using linear algebra for intelligent information retrieval (1995) 0.00
```
0.002269176 = product of:
  0.004538352 = sum of:
    0.004538352 = product of:
      0.009076704 = sum of:
        0.009076704 = weight(_text_:a in 2206) [ClassicSimilarity], result of:
          0.009076704 = score(doc=2206,freq=10.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.1709182 = fieldWeight in 2206, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=2206)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Currently, most approaches to retrieving textual materials from scientific databases depend on a lexical match between words in users' requests and those in or assigned to documents in a database. Because of the tremendous diversity in the words people use to describe the same document, lexical methods are necessarily incomplete and imprecise. Using the singular value decomposition (SVD), one can take advantage of the implicit higher-order structure in the association of terms with documents by determining the SVD of large sparse term by document matrices. Terms and documents represented by 200-300 of the largest singular vectors are then matched against user queries. We call this retrieval method Latent Semantic Indexing (LSI) because the subspace represents important associative relationships between terms and documents that are not evident in individual documents. LSI is a completely automatic yet intelligent indexing method, widely applicable, and a promising way to improve users...

Type

a
Deerwester, S.C.; Dumais, S.T.; Landauer, T.K.; Furnas, G.W.; Harshman, R.A.: Indexing by latent semantic analysis (1990) 0.00
```
0.0020296127 = product of:
  0.0040592253 = sum of:
    0.0040592253 = product of:
      0.008118451 = sum of:
        0.008118451 = weight(_text_:a in 2399) [ClassicSimilarity], result of:
          0.008118451 = score(doc=2399,freq=8.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.15287387 = fieldWeight in 2399, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=2399)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents ("semantic structure") in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term by document matrix is decomposed into a set of ca 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca 100 item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising.

Type

a

Search (2 results, page 1 of 1)

Authors

Themes