Search (3 results, page 1 of 1)

  • theme_ss:"Computerlinguistik"
  • theme_ss:"Konzeption und Anwendung des Prinzips Thesaurus"
  • year_i:[2000 TO 2010}
  1. Schneider, J.W.; Borlund, P.: A bibliometric-based semiautomatic approach to identification of candidate thesaurus terms : parsing and filtering of noun phrases from citation contexts (2005) 0.03
    0.025314135 = product of:
      0.063285336 = sum of:
        0.008258085 = weight(_text_:a in 156) [ClassicSimilarity], result of:
          0.008258085 = score(doc=156,freq=6.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.1544581 = fieldWeight in 156, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=156)
        0.05502725 = sum of:
          0.011051352 = weight(_text_:information in 156) [ClassicSimilarity], result of:
            0.011051352 = score(doc=156,freq=2.0), product of:
              0.08139861 = queryWeight, product of:
                1.7554779 = idf(docFreq=20772, maxDocs=44218)
                0.046368346 = queryNorm
              0.13576832 = fieldWeight in 156, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.7554779 = idf(docFreq=20772, maxDocs=44218)
                0.0546875 = fieldNorm(doc=156)
          0.043975897 = weight(_text_:22 in 156) [ClassicSimilarity], result of:
            0.043975897 = score(doc=156,freq=2.0), product of:
              0.16237405 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046368346 = queryNorm
              0.2708308 = fieldWeight in 156, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=156)
      0.4 = coord(2/5)
    
    Abstract
    The present study investigates the ability of a bibliometric-based semi-automatic method to select candidate thesaurus terms from citation contexts. The method consists of document co-citation analysis, citation context analysis, and noun phrase parsing. The investigation is carried out within the specialty area of periodontology. The results clearly demonstrate that the method is able to select important candidate thesaurus terms within the chosen specialty area.
    Date
    8.3.2007 19:55:22
    Source
    Context: nature, impact and role. 5th International Conference on Conceptions of Library and Information Sciences, CoLIS 2005, Glasgow, UK, June 2005. Ed. by F. Crestani and I. Ruthven
    Type
    a
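  The explain trees shown with each record are standard Lucene ClassicSimilarity (TF-IDF) output: every matching term contributes queryWeight (idf x queryNorm) times fieldWeight (tf x idf x fieldNorm), the contributions are summed, and the sum is scaled by the coordination factor. As a rough illustration only, the following Python sketch reproduces the arithmetic of record 1 from the numbers displayed above; the helper names are ours, not Lucene's.

    import math

    def classic_tf(freq):
        # ClassicSimilarity term frequency factor: sqrt(freq)
        return math.sqrt(freq)

    def term_score(freq, idf, query_norm, field_norm):
        # queryWeight = idf * queryNorm; fieldWeight = tf * idf * fieldNorm
        query_weight = idf * query_norm
        field_weight = classic_tf(freq) * idf * field_norm
        return query_weight * field_weight

    QUERY_NORM = 0.046368346   # queryNorm taken from the explain output above
    FIELD_NORM = 0.0546875     # fieldNorm (length norm) of doc 156

    w_a    = term_score(freq=6.0, idf=1.153047,  query_norm=QUERY_NORM, field_norm=FIELD_NORM)
    w_info = term_score(freq=2.0, idf=1.7554779, query_norm=QUERY_NORM, field_norm=FIELD_NORM)
    w_22   = term_score(freq=2.0, idf=3.5018296, query_norm=QUERY_NORM, field_norm=FIELD_NORM)

    coord = 2 / 5              # coord(2/5): two of five query clauses matched
    score = coord * (w_a + (w_info + w_22))
    print(round(score, 9))     # ~0.025314135, the score displayed for record 1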
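  The abstract of record 1 rests on parsing noun phrases out of citation contexts and filtering the recurring ones as candidate thesaurus terms. Purely as an illustration of that step, and not the authors' actual pipeline, here is a minimal pattern-based chunker over already POS-tagged text; the sample sentence, the tag patterns, and the frequency threshold are all assumptions.

    from collections import Counter

    # One citation context as (token, POS tag) pairs, Penn Treebank style tags.
    context = [("the", "DT"), ("chronic", "JJ"), ("periodontitis", "NN"),
               ("patients", "NNS"), ("showed", "VBD"), ("attachment", "NN"),
               ("loss", "NN"), (".", ".")]

    NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}
    CHUNK_TAGS = {"JJ"} | NOUN_TAGS

    def noun_phrases(tagged):
        # Greedy chunking: runs of adjectives/nouns that end in a noun.
        phrases, current = [], []
        for token, tag in tagged + [("", "")]:   # sentinel flushes the last run
            if tag in CHUNK_TAGS:
                current.append((token, tag))
            else:
                if current and current[-1][1] in NOUN_TAGS:
                    phrases.append(" ".join(t for t, _ in current))
                current = []
        return phrases

    # Filtering: keep phrases that recur across the citation contexts of a cluster.
    counts = Counter()
    for ctx in [context]:            # in practice: all contexts of a co-citation cluster
        counts.update(noun_phrases(ctx))
    candidates = [p for p, n in counts.most_common() if n >= 1]   # threshold is an assumption
    print(candidates)                # ['chronic periodontitis patients', 'attachment loss']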
  2. Tseng, Y.-H.: Automatic thesaurus generation for Chinese documents (2002) 0.01
    0.005886516 = product of:
      0.01471629 = sum of:
        0.010769378 = weight(_text_:a in 5226) [ClassicSimilarity], result of:
          0.010769378 = score(doc=5226,freq=20.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.20142901 = fieldWeight in 5226, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5226)
        0.003946911 = product of:
          0.007893822 = sum of:
            0.007893822 = weight(_text_:information in 5226) [ClassicSimilarity], result of:
              0.007893822 = score(doc=5226,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.09697737 = fieldWeight in 5226, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5226)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Tseng constructs a word co-occurrence-based thesaurus by means of the automatic analysis of Chinese text. Words are identified by a longest dictionary match supplemented by a keyword extraction algorithm that merges back nearby tokens and accepts shorter strings of characters if they occur more often than the longest string. Single-character auxiliary words are a major source of error, but this can be greatly reduced with the use of a 70-character, 2,680-word stop list. Extracted terms with their associated document weights are sorted by decreasing frequency, and the top of this list is associated using a Dice coefficient modified to account for the effect of longer documents on the weights of term pairs. Co-occurrence is counted not in the document as a whole but in paragraph- or sentence-sized sections in order to reduce computation time. A window of 29 characters or 11 words was found to be sufficient. A thesaurus was produced from 25,230 Chinese news articles, and judges were asked to review the top 50 terms associated with each of 30 single-word query terms. They determined 69% to be relevant.
    Source
    Journal of the American Society for Information Science and Technology. 53(2002) no.13, S.1130-1138
    Type
    a
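  The association step described above pairs frequent terms by their co-occurrence inside small text sections and scores each pair with a length-adjusted Dice coefficient. The sketch below shows only the plain, unmodified Dice computation over sentence-sized sections; the toy data and the omission of Tseng's document-length adjustment and window handling are simplifying assumptions.

    from collections import defaultdict
    from itertools import combinations

    # Toy corpus: each inner list is one sentence/paragraph-sized section,
    # already segmented into terms (the segmentation step is assumed, not shown).
    sections = [
        ["stock", "market", "index"],
        ["stock", "market", "rally"],
        ["market", "index", "futures"],
    ]

    # Per-term and pairwise section-level co-occurrence frequencies.
    term_freq = defaultdict(int)
    pair_freq = defaultdict(int)
    for section in sections:
        terms = set(section)
        for t in terms:
            term_freq[t] += 1
        for a, b in combinations(sorted(terms), 2):
            pair_freq[(a, b)] += 1

    def dice(a, b):
        # Plain Dice coefficient: 2 * co-occurrences / (freq(a) + freq(b))
        return 2 * pair_freq[tuple(sorted((a, b)))] / (term_freq[a] + term_freq[b])

    # Rank the associations of one query term, as in the judged top-50 lists.
    query = "market"
    ranked = sorted(((dice(query, t), t) for t in term_freq if t != query), reverse=True)
    for score, term in ranked:
        print(f"{query} ~ {term}: {score:.2f}")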
  3. Pimenov, E.N.: Normativnost' i nekotorye problem razrabotki tezauruzov i drugikh lingvistiicheskikh sredstv IPS (2000) 0.01
    0.00588199 = product of:
      0.014704974 = sum of:
        0.0068111527 = weight(_text_:a in 3281) [ClassicSimilarity], result of:
          0.0068111527 = score(doc=3281,freq=2.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.12739488 = fieldWeight in 3281, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=3281)
        0.007893822 = product of:
          0.015787644 = sum of:
            0.015787644 = weight(_text_:information in 3281) [ClassicSimilarity], result of:
              0.015787644 = score(doc=3281,freq=2.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.19395474 = fieldWeight in 3281, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3281)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Footnote
    Translation of the title: Standardisation and some other issues connected with the development of thesauri and other linguistic information retrieval tools
    Type
    a
