Search (4 results, page 1 of 1)

  • theme_ss:"Computerlinguistik"
  • theme_ss:"Konzeption und Anwendung des Prinzips Thesaurus"
  1. Byrne, C.C.; McCracken, S.A.: ¬An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval (1999) 0.04
    0.03694487 = product of:
      0.07388974 = sum of:
        0.07388974 = sum of:
          0.010653937 = weight(_text_:e in 4483) [ClassicSimilarity], result of:
            0.010653937 = score(doc=4483,freq=2.0), product of:
              0.055905603 = queryWeight, product of:
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.038894374 = queryNorm
              0.19057012 = fieldWeight in 4483, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.09375 = fieldNorm(doc=4483)
          0.063235804 = weight(_text_:22 in 4483) [ClassicSimilarity], result of:
            0.063235804 = score(doc=4483,freq=2.0), product of:
              0.13620147 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.038894374 = queryNorm
              0.46428138 = fieldWeight in 4483, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.09375 = fieldNorm(doc=4483)
      0.5 = coord(1/2)
    
    Date
    15. 3.2000 10:22:37
    Language
    e
  2. Schneider, J.W.; Borlund, P.: ¬A bibliometric-based semiautomatic approach to identification of candidate thesaurus terms : parsing and filtering of noun phrases from citation contexts (2005) 0.02
    0.021551177 = product of:
      0.043102354 = sum of:
        0.043102354 = sum of:
          0.006214797 = weight(_text_:e in 156) [ClassicSimilarity], result of:
            0.006214797 = score(doc=156,freq=2.0), product of:
              0.055905603 = queryWeight, product of:
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.038894374 = queryNorm
              0.1111659 = fieldWeight in 156, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.0546875 = fieldNorm(doc=156)
          0.036887556 = weight(_text_:22 in 156) [ClassicSimilarity], result of:
            0.036887556 = score(doc=156,freq=2.0), product of:
              0.13620147 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.038894374 = queryNorm
              0.2708308 = fieldWeight in 156, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=156)
      0.5 = coord(1/2)
    
    Date
    8. 3.2007 19:55:22
    Language
    e
  3. Tseng, Y.-H.: Automatic thesaurus generation for Chinese documents (2002) 0.01
    0.012208363 = sum of:
      0.009988792 = product of:
        0.07991034 = sum of:
          0.07991034 = weight(_text_:asked in 5226) [ClassicSimilarity], result of:
            0.07991034 = score(doc=5226,freq=2.0), product of:
              0.237196 = queryWeight, product of:
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.038894374 = queryNorm
              0.33689582 = fieldWeight in 5226, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5226)
        0.125 = coord(1/8)
      0.0022195703 = product of:
        0.0044391407 = sum of:
          0.0044391407 = weight(_text_:e in 5226) [ClassicSimilarity], result of:
            0.0044391407 = score(doc=5226,freq=2.0), product of:
              0.055905603 = queryWeight, product of:
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.038894374 = queryNorm
              0.07940422 = fieldWeight in 5226, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.43737 = idf(docFreq=28552, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5226)
        0.5 = coord(1/2)
    
    Abstract
    Tseng constructs a thesaurus based on word co-occurrence by automatically analysing Chinese text. Words are identified by a longest dictionary match, supplemented by a keyword extraction algorithm that merges back nearby tokens and accepts shorter strings of characters if they occur more often than the longest string. Single-character auxiliary words are a major source of error, but this can be greatly reduced with a 70-character, 2,680-word stop list. Extracted terms, with their associated document weights, are sorted by decreasing frequency, and the terms at the top of this list are associated using a Dice coefficient, modified to account for longer documents, on the weights of term pairs. Co-occurrence is counted not over the document as a whole but within paragraph- or sentence-sized sections, in order to reduce computation time. A window of 29 characters or 11 words was found to be sufficient. A thesaurus was produced from 25,230 Chinese news articles, and judges were asked to review the top 50 terms associated with each of 30 single-word query terms. They determined 69% to be relevant.
    Language
    e
  4. Rahmstorf, G.: Information retrieval using conceptual representations of phrases (1994) 0.00
    0.0013317422 = product of:
      0.0026634843 = sum of:
        0.0026634843 = product of:
          0.0053269686 = sum of:
            0.0053269686 = weight(_text_:e in 7862) [ClassicSimilarity], result of:
              0.0053269686 = score(doc=7862,freq=2.0), product of:
                0.055905603 = queryWeight, product of:
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.038894374 = queryNorm
                0.09528506 = fieldWeight in 7862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.43737 = idf(docFreq=28552, maxDocs=44218)
                  0.046875 = fieldNorm(doc=7862)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Language
    e
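
The score breakdowns above can be checked by hand. The following is a minimal sketch, not the search engine's own code. It assumes the classic TF-IDF formulas that the listed values appear to follow: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are hypothetical and chosen for illustration. All numbers are copied from the explanation of result 1 (doc 4483).

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed formula; it reproduces the listed idf values (e.g. 1.43737).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    # score = queryWeight * fieldWeight, as in the explain tree.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm       # idf * queryNorm
    field_weight = tf * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

MAX_DOCS = 44218
QUERY_NORM = 0.038894374

# Result 1 (doc 4483): terms "e" and "22", fieldNorm 0.09375 for both.
score_e = term_score(2.0, 28552, MAX_DOCS, QUERY_NORM, 0.09375)
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, 0.09375)

# The two term scores are summed, then scaled by coord(1/2) = 0.5.
total = (score_e + score_22) * 0.5

print(round(score_e, 6), round(score_22, 6), round(total, 6))
```

Small differences in the last digits are expected, since the listed values look like single-precision floats while Python uses double precision.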