Sun, Q.; Shaw, D.; Davis, C.H.: A model for estimating the occurrence of same-frequency words and the boundary between high- and low-frequency words in texts (1999)
- Abstract
- A simple model is proposed for estimating the frequency of any set of same-frequency words and for identifying the boundary between high-frequency and low-frequency words in a text. The model, based on a 'maximum-ranking method', assigns ranks to the words and estimates word frequency by a formula. The boundary value between high-frequency and low-frequency words is obtained by taking the square root of the number of different words in the text. This straightforward model was used successfully with both English and Chinese texts.
- Type
- a
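The boundary rule stated in the abstract is concrete enough to sketch: the cutoff between high- and low-frequency words is the square root of the number of distinct words in the text. The following Python sketch applies that rule to a raw word-frequency count; the function names and the whitespace tokenization are illustrative assumptions, and the paper's ranking formula for estimating frequencies is not reproduced here since the abstract does not give it.

```python
import math
from collections import Counter

def split_by_boundary(text: str):
    """Partition words into high- and low-frequency sets using the
    sqrt(number of distinct words) boundary described in the abstract.
    Tokenization by lowercase whitespace split is a simplifying assumption."""
    words = text.lower().split()
    freqs = Counter(words)
    # Boundary value: square root of the number of different words.
    boundary = math.sqrt(len(freqs))
    high = {w for w, f in freqs.items() if f > boundary}
    low = {w for w, f in freqs.items() if f <= boundary}
    return high, low, boundary
```

For a text with 5 distinct words the boundary is √5 ≈ 2.24, so any word occurring 3 or more times falls in the high-frequency set.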