Sun, Q.; Shaw, D.; Davis, C.H.: A model for estimating the occurrence of same-frequency words and the boundary between high- and low-frequency words in texts (1999)
- Abstract
- A simpler model is proposed for estimating the frequency of any same-frequency words and identifying the boundary point between high-frequency and low-frequency words in a text. The model, based on a 'maximum-ranking method', assigns ranks to the words and estimates word frequency by a formula. The boundary value between high-frequency and low-frequency words is obtained by taking the square root of the number of different words in the text. This straightforward model was used successfully with both English and Chinese texts.
- Type
- a
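The abstract gives one concrete rule: place the boundary between high- and low-frequency words at the square root of the number of different words in the text. The sketch below illustrates only that rule under stated assumptions; the tokenizer, the function name `high_low_boundary`, and the tie-breaking by frequency rank are illustrative choices, not the authors' maximum-ranking method, whose frequency-estimation formula is not given in the abstract.

```python
import math
import re
from collections import Counter

def high_low_boundary(text: str):
    """Split the vocabulary of `text` into high- and low-frequency words.

    Boundary rule (per the abstract): the number of high-frequency words
    equals sqrt(V), where V is the number of different words in the text.
    Tokenization and rounding are assumptions made for this sketch.
    """
    words = re.findall(r"[a-z']+", text.lower())   # crude tokenizer (assumption)
    counts = Counter(words)                        # frequency of each distinct word
    v = len(counts)                                # V: number of different words
    boundary = int(round(math.sqrt(v)))            # boundary value = sqrt(V)
    ranked = [w for w, _ in counts.most_common()]  # words ranked by descending frequency
    return boundary, ranked[:boundary], ranked[boundary:]

if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat on the log"
    b, high, low = high_low_boundary(sample)
    print(f"boundary rank: {b}")
    print(f"high-frequency words: {high}")
    print(f"low-frequency words: {low}")
```

For the short sample above, V = 9 distinct words, so the boundary rank rounds to 3 and the three most frequent words ("the", "sat", "on") fall on the high-frequency side. The same computation applies unchanged to Chinese text once a suitable tokenizer is substituted.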