-
Manning, C.D.; Schütze, H.: Foundations of statistical natural language processing (2000)
-
Toutanova, K.; Klein, D.; Manning, C.D.; Singer, Y.: Feature-rich Part-of-Speech Tagging with a cyclic dependency network (2003)
- Abstract
- We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective use of priors in conditional log-linear models, and (iv) fine-grained modeling of unknown word features. Using these ideas together, the resulting tagger gives a 97.24% accuracy on the Penn Treebank WSJ, an error reduction of 4.4% on the best previous single automatically learned tagging result.
- Type
- a
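The feature ideas listed in the abstract above — bidirectional tag context, joint lexical features, and unknown-word features — can be illustrated with a minimal sketch. This is a hypothetical illustration, not the authors' implementation; the function name and feature-string formats are assumptions:

```python
# Illustrative sketch of feature extraction for one token position, in the
# spirit of the abstract's ideas (i), (ii), and (iv). Not the paper's code.
def extract_features(words, tags, i):
    feats = []
    w = words[i]
    # (i) bidirectional tag context: condition on tags to the left AND right
    feats.append("prev_tag=" + (tags[i - 1] if i > 0 else "<S>"))
    feats.append("next_tag=" + (tags[i + 1] if i + 1 < len(tags) else "</S>"))
    # (ii) lexical features, including jointly conditioning on adjacent words
    feats.append("word=" + w)
    if i > 0:
        feats.append("prev_word+word=" + words[i - 1] + "+" + w)
    # (iv) unknown-word features: suffix and capitalization cues
    feats.append("suffix3=" + w[-3:])
    feats.append("is_cap=" + str(w[0].isupper()))
    return feats
```

In a log-linear tagger (idea iii), each such feature string would be paired with a learned weight and the candidate tag's score would be the sum of the weights of its active features, normalized with a softmax.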
-
Toutanova, K.; Manning, C.D.: Enriching the knowledge sources used in a maximum entropy Part-of-Speech Tagger (2000)
- Abstract
- This paper presents results for a maximum-entropy-based part-of-speech tagger, which achieves superior performance principally by enriching the information sources used for tagging. In particular, we get improved results by incorporating these features: (i) more extensive treatment of capitalization for unknown words; (ii) features for the disambiguation of the tense forms of verbs; (iii) features for disambiguating particles from prepositions and adverbs. The best resulting accuracy for the tagger on the Penn Treebank is 96.86% overall, and 86.91% on previously unseen words.
- Type
- a
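The abstract's first enrichment — a more extensive treatment of capitalization for unknown words — can be sketched as a small feature template. This is a hedged illustration under assumed names, not the paper's actual feature set; one common refinement it gestures at is distinguishing sentence-initial capitals (uninformative) from mid-sentence capitals (often proper nouns):

```python
# Hypothetical unknown-word feature template in the spirit of the abstract's
# idea (i): richer capitalization treatment, plus common shape cues.
def unknown_word_features(word, position):
    feats = []
    if word[0].isupper():
        # sentence-initial capitalization is expected; mid-sentence is a cue
        feats.append("cap-initial" if position == 0 else "cap-midsentence")
    if any(ch.isdigit() for ch in word):
        feats.append("has-digit")
    if "-" in word:
        feats.append("has-hyphen")
    feats.append("suffix2=" + word[-2:])
    return feats
```

A maximum entropy model would learn, for example, that `cap-midsentence` raises the probability of NNP far more than `cap-initial` does.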