Aizawa, A.: An information-theoretic perspective of tf-idf measures (2003)
0.00
6.6015165E-4 = product of:
  0.009902274 = sum of:
    0.009902274 = product of:
      0.019804548 = sum of:
        0.019804548 = weight(_text_:information in 4155) [ClassicSimilarity], result of:
          0.019804548 = score(doc=4155,freq=12.0), product of:
            0.052107345 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029682713 = queryNorm
            0.38007212 = fieldWeight in 4155, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=4155)
      0.5 = coord(1/2)
  0.06666667 = coord(1/15)
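
The score breakdown above can be checked by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the listed values. queryNorm and fieldNorm are copied from the listing rather than derived, since they depend on the full query and on the indexed field length; all variable and function names are illustrative.

```python
import math

def classic_tf(freq: float) -> float:
    # Assumed ClassicSimilarity term-frequency factor: square root of the raw count.
    return math.sqrt(freq)

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

# Values taken from the listing for the term "information" in doc 4155.
freq = 12.0
doc_freq = 20772
max_docs = 44218
query_norm = 0.029682713  # copied from the listing, not recomputed
field_norm = 0.0625       # copied from the listing, not recomputed

tf = classic_tf(freq)
idf = classic_idf(doc_freq, max_docs)

query_weight = idf * query_norm        # "queryWeight" node
field_weight = tf * idf * field_norm   # "fieldWeight" node
term_score = query_weight * field_weight

# Two coordination factors from the listing: coord(1/2) and coord(1/15).
final_score = term_score * (1 / 2) * (1 / 15)

print(f"tf={tf:.7f} idf={idf:.7f}")
print(f"queryWeight={query_weight:.9f} fieldWeight={field_weight:.8f}")
print(f"term score={term_score:.9f} final score={final_score:.6e}")
```

The printed values should agree with the listing up to single-precision rounding, which also explains why the record displays the overall score as 0.00.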
- Abstract
- This paper presents a mathematical definition of the "probability-weighted amount of information" (PWI), a measure of specificity of terms in documents that is based on an information-theoretic view of retrieval events. The proposed PWI is expressed as a product of the occurrence probabilities of terms and their amounts of information, and corresponds well with the conventional term frequency - inverse document frequency measures that are commonly used in today's information retrieval systems. The mathematical definition of the PWI is shown, together with some illustrative examples of the calculation.
- Source
- Information processing and management. 39(2003) no.1, pp.45-65
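
The abstract describes the PWI only verbally, as a product of a term's occurrence probability and its amount of information. The sketch below illustrates that general shape on made-up counts; it is not the paper's definition. It assumes the occurrence probability is a relative term frequency and the amount of information is the self-information -log2(p) of the term's document-level probability. All names and numbers are hypothetical.

```python
import math

def occurrence_probability(term_count: int, total_count: int) -> float:
    # Assumption: relative frequency of the term among all term occurrences.
    return term_count / total_count

def amount_of_information(doc_freq: int, num_docs: int) -> float:
    # Assumption: self-information, in bits, of drawing a document that
    # contains the term. A rarer term carries more information.
    return -math.log2(doc_freq / num_docs)

def weighted_information_sketch(term_count: int, total_count: int,
                                doc_freq: int, num_docs: int) -> float:
    # Probability times information, mirroring the tf x idf product form.
    return (occurrence_probability(term_count, total_count)
            * amount_of_information(doc_freq, num_docs))

# Hypothetical toy numbers: the term occurs 2 times among 8 tokens
# and appears in 1 of 4 documents.
weight = weighted_information_sketch(2, 8, 1, 4)
print(weight)
```

Under these assumptions the weight has the same structure as a tf-idf score with a normalized term frequency, which is the correspondence the abstract points to.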