-
Byrne, C.C.; McCracken, S.A.: An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval (1999)
0.00
0.004300794 = product of:
  0.03655675 = sum of:
    0.01626061 = weight(_text_:und in 4483) [ClassicSimilarity], result of:
      0.01626061 = score(doc=4483,freq=2.0), product of:
        0.055336144 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.024967048 = queryNorm
        0.29385152 = fieldWeight in 4483, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.09375 = fieldNorm(doc=4483)
    0.020296142 = product of:
      0.040592283 = sum of:
        0.040592283 = weight(_text_:22 in 4483) [ClassicSimilarity], result of:
          0.040592283 = score(doc=4483,freq=2.0), product of:
            0.08743035 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.024967048 = queryNorm
            0.46428138 = fieldWeight in 4483, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=4483)
      0.5 = coord(1/2)
  0.11764706 = coord(2/17)
- Date
- 15. 3.2000 10:22:37
- Theme
- Konzeption und Anwendung des Prinzips Thesaurus
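The score explanation for this record can be checked by hand. The following is a minimal sketch, not the search engine's own code: it assumes the classic TF-IDF relations that the explanation lines themselves spell out (tf as the square root of the term frequency, queryWeight = idf × queryNorm, fieldWeight = tf × idf × fieldNorm), and the idf formula `1 + ln(maxDocs / (docFreq + 1))` is inferred from the figures shown rather than stated in the listing. The function names are illustrative.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed formula; it reproduces the idf values printed in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    tf = math.sqrt(freq)                       # tf(freq=2.0) = 1.4142135
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm       # "queryWeight, product of:"
    field_weight = tf * term_idf * field_norm  # "fieldWeight ..., product of:"
    return query_weight * field_weight


# Constants copied from the explanation for doc 4483.
MAX_DOCS = 44218
QUERY_NORM = 0.024967048
FIELD_NORM = 0.09375

score_und = term_score(2.0, 13101, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "22" clause sits in a nested query and is scaled by coord(1/2).
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * (1 / 2)
# The outer coord(2/17) reflects 2 matching clauses out of 17.
total = (score_und + score_22) * (2 / 17)

print(f"und={score_und:.8f} 22={score_22:.8f} total={total:.9f}")
```

Under these assumptions the result agrees with the reported 0.004300794 up to floating-point rounding; the engine works in single precision, so the last digits may differ.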
-
Schneider, J.W.; Borlund, P.: A bibliometric-based semiautomatic approach to identification of candidate thesaurus terms : parsing and filtering of noun phrases from citation contexts (2005)
0.00
0.0025087968 = product of:
  0.021324772 = sum of:
    0.0094853565 = weight(_text_:und in 156) [ClassicSimilarity], result of:
      0.0094853565 = score(doc=156,freq=2.0), product of:
        0.055336144 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.024967048 = queryNorm
        0.17141339 = fieldWeight in 156, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.0546875 = fieldNorm(doc=156)
    0.011839416 = product of:
      0.023678832 = sum of:
        0.023678832 = weight(_text_:22 in 156) [ClassicSimilarity], result of:
          0.023678832 = score(doc=156,freq=2.0), product of:
            0.08743035 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.024967048 = queryNorm
            0.2708308 = fieldWeight in 156, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=156)
      0.5 = coord(1/2)
  0.11764706 = coord(2/17)
- Date
- 8. 3.2007 19:55:22
- Theme
- Konzeption und Anwendung des Prinzips Thesaurus
-
Tseng, Y.-H.: Automatic thesaurus generation for Chinese documents (2002)
0.00
0.0013171139 = product of:
  0.011195468 = sum of:
    0.0067752544 = weight(_text_:und in 5226) [ClassicSimilarity], result of:
      0.0067752544 = score(doc=5226,freq=2.0), product of:
        0.055336144 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.024967048 = queryNorm
        0.12243814 = fieldWeight in 5226, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5226)
    0.004420214 = weight(_text_:in in 5226) [ClassicSimilarity], result of:
      0.004420214 = score(doc=5226,freq=6.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.1301535 = fieldWeight in 5226, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5226)
  0.11764706 = coord(2/17)
- Abstract
- Tseng constructs a word co-occurrence-based thesaurus through automatic analysis of Chinese text. Words are identified by a longest dictionary match, supplemented by a keyword extraction algorithm that merges nearby tokens back together and accepts shorter strings of characters if they occur more often than the longest string. Single-character auxiliary words are a major source of error, but this error can be greatly reduced by using a 70-character, 2680-word stop list. Extracted terms with their associated document weights are sorted by decreasing frequency, and the terms at the top of this list are associated using a Dice coefficient, modified to account for longer documents, applied to the weights of term pairs. Co-occurrence is measured not over the document as a whole but over paragraph- or sentence-sized sections, in order to reduce computation time. A window of 29 characters or 11 words was found to be sufficient. A thesaurus was produced from 25,230 Chinese news articles, and judges were asked to review the top 50 terms associated with each of 30 single-word query terms. They determined 69% to be relevant.
- Theme
- Konzeption und Anwendung des Prinzips Thesaurus
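The abstract above describes associating terms by a Dice coefficient computed over sentence- or paragraph-sized sections. As a rough illustration only, the sketch below uses the plain, unweighted Dice coefficient over the sets of segments in which each term occurs. It does not reproduce Tseng's modification for longer documents or the term-pair weighting, which the abstract does not specify, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def segment_index(segments):
    """Map each term to the set of segment numbers in which it occurs."""
    index = defaultdict(set)
    for number, tokens in enumerate(segments):
        for token in set(tokens):
            index[token].add(number)
    return index


def dice(index, term_a, term_b):
    """Plain Dice coefficient: 2 * |A & B| / (|A| + |B|)."""
    a = index.get(term_a, set())
    b = index.get(term_b, set())
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


# Toy data: each inner list stands for one tokenized sentence-sized segment.
segments = [
    ["thesaurus", "term"],
    ["thesaurus", "term", "weight"],
    ["thesaurus", "weight"],
    ["term", "document"],
]
index = segment_index(segments)
similarity = dice(index, "thesaurus", "term")
print(similarity)
```

Restricting co-occurrence to short segments, as the abstract notes, keeps the sets small and so reduces the work of comparing term pairs.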
-
Rahmstorf, G.: Information retrieval using conceptual representations of phrases (1994)
0.00
0.0013167905 = product of:
  0.0111927185 = sum of:
    0.008130305 = weight(_text_:und in 7862) [ClassicSimilarity], result of:
      0.008130305 = score(doc=7862,freq=2.0), product of:
        0.055336144 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.024967048 = queryNorm
        0.14692576 = fieldWeight in 7862, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.046875 = fieldNorm(doc=7862)
    0.0030624135 = weight(_text_:in in 7862) [ClassicSimilarity], result of:
      0.0030624135 = score(doc=7862,freq=2.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.09017298 = fieldWeight in 7862, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=7862)
  0.11764706 = coord(2/17)
- Series
- Studies in classification, data analysis, and knowledge organization
- Theme
- Konzeption und Anwendung des Prinzips Thesaurus
-
Pimenov, E.N.: Normativnost' i nekotorye problem razrabotki tezauruzov i drugikh lingvistiicheskikh sredstv IPS (2000)
0.00
7.970888E-4 = product of:
  0.013550509 = sum of:
    0.013550509 = weight(_text_:und in 3281) [ClassicSimilarity], result of:
      0.013550509 = score(doc=3281,freq=2.0), product of:
        0.055336144 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.024967048 = queryNorm
        0.24487628 = fieldWeight in 3281, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.078125 = fieldNorm(doc=3281)
  0.05882353 = coord(1/17)
- Theme
- Konzeption und Anwendung des Prinzips Thesaurus