Sun, Q.; Shaw, D.; Davis, C.H.: A model for estimating the occurrence of same-frequency words and the boundary between high- and low-frequency words in texts (1999)
- Abstract
- A simpler model is proposed for estimating the frequency of any same-frequency words and identifying the boundary point between high-frequency and low-frequency words in a text. The model, based on a 'maximum-ranking method', assigns ranks to the words and estimates word frequency by a formula. The boundary value between high-frequency and low-frequency words is obtained by taking the square root of the number of different words in the text. This straightforward model was used successfully with both English and Chinese texts.
- Type
- a
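The abstract gives one concrete rule: place the boundary between high- and low-frequency words at the square root of the number of different words in the text. The sketch below illustrates only that rule under stated assumptions; the tokenizer, the function name `high_low_boundary`, and the tie-breaking by frequency rank are illustrative choices, not the authors' maximum-ranking method, whose frequency-estimation formula is not given in the abstract.

```python
import math
import re
from collections import Counter

def high_low_boundary(text: str):
    """Split the vocabulary of `text` into high- and low-frequency words.

    Boundary rule (per the abstract): the number of high-frequency words
    equals sqrt(V), where V is the number of different words in the text.
    Tokenization and rounding are assumptions made for this sketch.
    """
    words = re.findall(r"[a-z']+", text.lower())   # crude tokenizer (assumption)
    counts = Counter(words)                        # frequency of each distinct word
    v = len(counts)                                # V: number of different words
    boundary = int(round(math.sqrt(v)))            # boundary value = sqrt(V)
    ranked = [w for w, _ in counts.most_common()]  # words ranked by descending frequency
    return boundary, ranked[:boundary], ranked[boundary:]

if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat on the log"
    b, high, low = high_low_boundary(sample)
    print(f"boundary rank: {b}")
    print(f"high-frequency words: {high}")
    print(f"low-frequency words: {low}")
```

For the short sample above, V = 9 distinct words, so the boundary rank rounds to 3 and the three most frequent words ("the", "sat", "on") fall on the high-frequency side. The same computation applies unchanged to Chinese text once a suitable tokenizer is substituted.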