Schwarz, C.: THESYS: Thesaurus Syntax System : a fully automatic thesaurus building aid (1988)
- Abstract
- THESYS is based on natural language processing of free-text databases. It yields statistically evaluated correlations between words in the database, and these correlations correspond to traditional thesaurus relations. The proposals made by THESYS thus assist the person building a thesaurus. THESYS is being tested on commercial databases under real-world conditions. It is part of a text processing project at Siemens called TINA (Text-Inhalts-Analyse). Software from TINA is currently being applied and evaluated by the US Department of Commerce for patent search and indexing (REALIST: REtrieval Aids by Linguistics and STatistics).
- Date
- 6.1.1999 10:22:07
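
The abstract mentions "statistically evaluated correlations between words" without naming the statistic. As a rough illustration only, the sketch below scores word pairs by pointwise mutual information (PMI) over document-level co-occurrence. PMI, the whitespace tokenizer, the function name `cooccurrence_scores`, and the `min_pair_count` threshold are all assumptions made for this example; they are not taken from THESYS, which relies on linguistic (syntactic) analysis rather than plain token splitting.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_scores(docs, min_pair_count=2):
    """Rank word pairs by PMI over document-level co-occurrence.

    Hypothetical helper for illustration; not the THESYS algorithm.
    """
    n_docs = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for text in docs:
        # Naive tokenization (assumption): lowercase and split on whitespace.
        words = sorted(set(text.lower().split()))
        word_df.update(words)
        pair_df.update(combinations(words, 2))

    scores = {}
    for (a, b), n_ab in pair_df.items():
        if n_ab < min_pair_count:
            continue  # skip pairs seen together too rarely to be reliable
        # PMI = log( P(a, b) / (P(a) * P(b)) ), estimated from document counts.
        scores[(a, b)] = math.log(n_ab * n_docs / (word_df[a] * word_df[b]))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

High-scoring pairs would be shown to the thesaurus builder as candidate relations to accept or reject, which matches the assisting role the abstract describes.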