Search (1 result, page 1 of 1)

  • author_ss:"Dumais, S.T."
  • theme_ss:"Literaturübersicht"
  1. Dumais, S.T.: Latent semantic analysis (2003) 0.01
    0.00693251 = product of:
      0.02079753 = sum of:
        0.02079753 = product of:
          0.04159506 = sum of:
            0.04159506 = weight(_text_:2002 in 2462) [ClassicSimilarity], result of:
              0.04159506 = score(doc=2462,freq=4.0), product of:
                0.20701107 = queryWeight, product of:
                  4.28654 = idf(docFreq=1652, maxDocs=44218)
                  0.048293278 = queryNorm
                0.20093156 = fieldWeight in 2462, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.28654 = idf(docFreq=1652, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=2462)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
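
The arithmetic in the score explanation above can be retraced by hand. The following is a minimal sketch, assuming the default ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function and variable names are illustrative only; queryNorm, fieldNorm, and the two coord factors are copied from the listing rather than derived.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def classic_tf(freq):
    # Assumed ClassicSimilarity tf: square root of the raw term frequency.
    return math.sqrt(freq)


# Values copied from the explanation for doc 2462.
idf = classic_idf(doc_freq=1652, max_docs=44218)
query_norm = 0.048293278
field_norm = 0.0234375

query_weight = idf * query_norm                      # queryWeight
field_weight = classic_tf(4.0) * idf * field_norm    # fieldWeight
term_score = query_weight * field_weight             # weight(_text_:2002)

# Apply the two coordination factors shown in the listing.
final_score = term_score * 0.5 * (1.0 / 3.0)

print(f"idf={idf:.5f} term_score={term_score:.8f} final={final_score:.8f}")
```

Under these assumptions the result agrees with the listed score of 0.00693251 up to single-precision rounding.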
    
    Abstract
    With the advent of large-scale collections of full text, statistical approaches are being used more and more to analyze the relationships among terms and documents. LSA takes this approach. LSA induces knowledge about the meanings of documents and words by analyzing large collections of texts. The approach simultaneously models the relationships among documents based on their constituent words, and the relationships between words based on their occurrence in documents. By using fewer dimensions for representation than there are unique words, LSA induces similarities among terms that are useful in solving the information retrieval problems described earlier. LSA is a fully automatic statistical approach to extracting relations among words by means of their contexts of use in documents, passages, or sentences. It makes no use of natural language processing techniques for analyzing morphological, syntactic, or semantic relations. Nor does it use humanly constructed resources like dictionaries, thesauri, lexical reference systems (e.g., WordNet), semantic networks, or other knowledge representations. Its only input is large amounts of text. LSA is an unsupervised learning technique. It starts with a large collection of texts, builds a term-document matrix, and tries to uncover some similarity structures that are useful for information retrieval and related text-analysis problems. Several recent ARIST chapters have focused on text mining and discovery (Benoit, 2002; Solomon, 2002; Trybula, 2000). These chapters provide complementary coverage of the field of text analysis.
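
    The pipeline the abstract describes (build a term-document matrix, then represent it in fewer dimensions than there are unique words) can be sketched with a truncated singular value decomposition. This is a minimal illustration, not the chapter's own implementation: the four-document corpus, the raw count weighting, the choice k = 2, and all variable names are assumptions made for the example, and it relies on NumPy.

```python
import numpy as np

# Toy corpus (hypothetical, for illustration only).
docs = [
    "human computer interaction",
    "user interface computer system",
    "graph trees minors",
    "graph minors survey",
]

# Build the term-document count matrix A (rows = terms, columns = documents).
vocab = sorted({word for doc in docs for word in doc.split()})
row = {word: i for i, word in enumerate(vocab)}
A = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for word in doc.split():
        A[row[word], j] += 1.0

# Truncated SVD: keep only k dimensions, fewer than the number of unique words.
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
doc_vecs = (s[:k, None] * Vt[:k, :]).T  # one k-dimensional row per document


def cosine(u, v):
    # Cosine similarity between two vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Documents 0 and 1 share only the word "computer".
raw_sim = cosine(A[:, 0], A[:, 1])            # similarity on raw word counts
lsa_sim = cosine(doc_vecs[0], doc_vecs[1])    # similarity in the reduced space
cross_sim = cosine(doc_vecs[0], doc_vecs[2])  # documents from unrelated topics

print(f"raw={raw_sim:.3f} lsa={lsa_sim:.3f} cross={cross_sim:.3f}")
```

    In this toy setting the two computer-related documents, which overlap in a single word, end up much closer in the reduced space than in the raw count space, while documents from the unrelated graph topic stay apart. Practical systems typically apply a term weighting such as log-entropy or tf-idf before the decomposition and choose k empirically.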