Search (2 results, page 1 of 1)

  • × author_ss:"Toutanova, K."
  • × theme_ss:"Computerlinguistik"
  1. Toutanova, K.; Klein, D.; Manning, C.D.; Singer, Y.: Feature-rich Part-of-Speech Tagging with a cyclic dependency network (2003) 0.05
    0.048099514 = product of:
      0.09619903 = sum of:
        0.09619903 = product of:
          0.19239806 = sum of:
            0.19239806 = weight(_text_:tagging in 1059) [ClassicSimilarity], result of:
              0.19239806 = score(doc=1059,freq=4.0), product of:
                0.2979515 = queryWeight, product of:
                  5.9038734 = idf(docFreq=327, maxDocs=44218)
                  0.05046712 = queryNorm
                0.64573616 = fieldWeight in 1059, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.9038734 = idf(docFreq=327, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1059)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective use of priors in conditional loglinear models, and (iv) fine-grained modeling of unknown word features. Using these ideas together, the resulting tagger gives a 97.24% accuracy on the Penn Treebank WSJ, an error reduction of 4.4% on the best previous single automatically learned tagging result.
  2. Toutanova, K.; Manning, C.D.: Enriching the knowledge sources used in a maximum entropy Part-of-Speech Tagger (2000) 0.03
    0.03401149 = product of:
      0.06802298 = sum of:
        0.06802298 = product of:
          0.13604596 = sum of:
            0.13604596 = weight(_text_:tagging in 1060) [ClassicSimilarity], result of:
              0.13604596 = score(doc=1060,freq=2.0), product of:
                0.2979515 = queryWeight, product of:
                  5.9038734 = idf(docFreq=327, maxDocs=44218)
                  0.05046712 = queryNorm
                0.4566044 = fieldWeight in 1060, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.9038734 = idf(docFreq=327, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1060)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper presents results for a maximum entropy-based part-of-speech tagger, which achieves superior performance principally by enriching the information sources used for tagging. In particular, we get improved results by incorporating these features: (i) more extensive treatment of capitalization for unknown words; (ii) features for the disambiguation of the tense forms of verbs; (iii) features for disambiguating particles from prepositions and adverbs. The best resulting accuracy for the tagger on the Penn Treebank is 96.86% overall, and 86.91% on previously unseen words.
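
The score breakdowns listed under each result can be reproduced by hand. The sketch below assumes the standard Lucene ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and takes queryNorm, fieldNorm, and the two coord(1/2) factors directly from the explanations above rather than deriving them. The function and variable names are illustrative and are not part of any Lucene API.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Inverse document frequency as ClassicSimilarity is assumed to define it.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=(0.5, 0.5)):
    # Rebuild one explanation tree bottom-up for a single matching term.
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                  # tf(freq)
    query_weight = idf * query_norm       # queryWeight
    field_weight = tf * idf * field_norm  # fieldWeight
    score = query_weight * field_weight   # weight(_text_:tagging)
    for c in coords:                      # the two nested coord(1/2) factors
        score *= c
    return score


# Values copied from the explanations for documents 1059 and 1060.
shared = dict(doc_freq=327, max_docs=44218, query_norm=0.05046712, field_norm=0.0546875)
score_1059 = classic_score(freq=4.0, **shared)
score_1060 = classic_score(freq=2.0, **shared)

print(f"{score_1059:.4f} {score_1060:.4f}")  # 0.0481 0.0340
```

The two documents share every factor except the term frequency of "tagging" (4 versus 2), so the ratio of their scores is sqrt(4) / sqrt(2), roughly 1.41.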