Document (#38060)

Author
Toutanova, K.
Klein, D.
Manning, C.D.
Singer, Y.
Title
Feature-rich Part-of-Speech Tagging with a cyclic dependency network
Source
Proceedings of HLT-NAACL, 2003
Year
2003
Pages
pp. 252-259
Abstract
We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective use of priors in conditional loglinear models, and (iv) fine-grained modeling of unknown word features. Using these ideas together, the resulting tagger gives a 97.24% accuracy on the Penn Treebank WSJ, an error reduction of 4.4% on the best previous single automatically learned tagging result.
Content
Cf.: http://nlp.stanford.edu/software/tagger.shtml.
Theme
Computational linguistics
Object
Stanford POS Tagger

Similar documents (author)

  1. Manning, R.W.: The Anglo-American Cataloguing Rules and their future (1999) 1.25
    1.2477846 = sum of:
      1.2477846 = product of:
        3.7433536 = sum of:
          3.7433536 = weight(author_txt:manning in 6809) [ClassicSimilarity], result of:
            3.7433536 = score(doc=6809,freq=1.0), product of:
              0.6501713 = queryWeight, product of:
                1.1608297 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.060800374 = queryNorm
              5.7574883 = fieldWeight in 6809, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.625 = fieldNorm(doc=6809)
        0.33333334 = coord(1/3)
    
  2. Manning, R.W.: The Anglo American Cataloguing Rules and their future (2000) 1.25
    1.2477846 = sum of:
      1.2477846 = product of:
        3.7433536 = sum of:
          3.7433536 = weight(author_txt:manning in 189) [ClassicSimilarity], result of:
            3.7433536 = score(doc=189,freq=1.0), product of:
              0.6501713 = queryWeight, product of:
                1.1608297 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.060800374 = queryNorm
              5.7574883 = fieldWeight in 189, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.625 = fieldNorm(doc=189)
        0.33333334 = coord(1/3)
    
  3. Manning, C.D.: Part-of-Speech Tagging from 97% to 100% : is it time for some linguistics? (2011) 1.25
    1.2477846 = sum of:
      1.2477846 = product of:
        3.7433536 = sum of:
          3.7433536 = weight(author_txt:manning in 1121) [ClassicSimilarity], result of:
            3.7433536 = score(doc=1121,freq=1.0), product of:
              0.6501713 = queryWeight, product of:
                1.1608297 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.060800374 = queryNorm
              5.7574883 = fieldWeight in 1121, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.625 = fieldNorm(doc=1121)
        0.33333334 = coord(1/3)
    
  4. Singer, W.: Hirnentwicklung oder die Suche nach Kohärenz (1994) 1.07
    1.0702103 = sum of:
      1.0702103 = product of:
        3.210631 = sum of:
          3.210631 = weight(author_txt:singer in 5573) [ClassicSimilarity], result of:
            3.210631 = score(doc=5573,freq=1.0), product of:
              0.5869226 = queryWeight, product of:
                1.1029226 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.060800374 = queryNorm
              5.47028 = fieldWeight in 5573, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.625 = fieldNorm(doc=5573)
        0.33333334 = coord(1/3)
    
  5. Singer, L.: Mammals: a multimedia encyclopedia (1992) 1.07
    1.0702103 = sum of:
      1.0702103 = product of:
        3.210631 = sum of:
          3.210631 = weight(author_txt:singer in 6395) [ClassicSimilarity], result of:
            3.210631 = score(doc=6395,freq=1.0), product of:
              0.5869226 = queryWeight, product of:
                1.1029226 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.060800374 = queryNorm
              5.47028 = fieldWeight in 6395, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.625 = fieldNorm(doc=6395)
        0.33333334 = coord(1/3)
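
The score trees above can be reproduced by hand. The following is a minimal sketch, assuming the listing follows Lucene's ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are hypothetical and chosen only for illustration; the numeric inputs are copied from the first entry (author_txt:manning in 6809).

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by Lucene's ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one matching term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm               # "queryWeight" line
    field_weight = classic_tf(freq) * idf * field_norm    # "fieldWeight" line
    return query_weight * field_weight


# Values taken from the tree for doc 6809.
raw = term_score(
    freq=1.0,
    doc_freq=11,
    max_docs=44218,
    boost=1.1608297,
    query_norm=0.060800374,
    field_norm=0.625,
)

# coord(1/3): one of the three author query terms matched this document.
score = raw * (1 / 3)
print(score)
```

The result should agree with the 1.2477846 shown in the listing up to floating-point rounding, since Lucene computes these values in single precision.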
    

Similar documents (content)

  1. Toutanova, K.; Manning, C.D.: Enriching the knowledge sources used in a maximum entropy Part-of-Speech Tagger (2000) 0.35
    0.345816 = sum of:
      0.345816 = product of:
        1.2350571 = sum of:
          0.09957489 = weight(abstract_txt:unknown in 1060) [ClassicSimilarity], result of:
            0.09957489 = score(doc=1060,freq=1.0), product of:
              0.14711888 = queryWeight, product of:
                1.1971806 = boost
                7.2195506 = idf(docFreq=87, maxDocs=44218)
                0.017021528 = queryNorm
              0.67683285 = fieldWeight in 1060, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2195506 = idf(docFreq=87, maxDocs=44218)
                0.09375 = fieldNorm(doc=1060)
          0.08573024 = weight(abstract_txt:features in 1060) [ClassicSimilarity], result of:
            0.08573024 = score(doc=1060,freq=3.0), product of:
              0.11631279 = queryWeight, product of:
                1.505408 = boost
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.017021528 = queryNorm
              0.7370663 = fieldWeight in 1060, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.09375 = fieldNorm(doc=1060)
          0.20151539 = weight(abstract_txt:penn in 1060) [ClassicSimilarity], result of:
            0.20151539 = score(doc=1060,freq=1.0), product of:
              0.23538241 = queryWeight, product of:
                1.514302 = boost
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.017021528 = queryNorm
              0.85611916 = fieldWeight in 1060, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.09375 = fieldNorm(doc=1060)
          0.050861128 = weight(abstract_txt:part in 1060) [ClassicSimilarity], result of:
            0.050861128 = score(doc=1060,freq=1.0), product of:
              0.11844111 = queryWeight, product of:
                1.5191188 = boost
                4.580493 = idf(docFreq=1231, maxDocs=44218)
                0.017021528 = queryNorm
              0.42942122 = fieldWeight in 1060, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.580493 = idf(docFreq=1231, maxDocs=44218)
                0.09375 = fieldNorm(doc=1060)
          0.13282698 = weight(abstract_txt:tagging in 1060) [ClassicSimilarity], result of:
            0.13282698 = score(doc=1060,freq=1.0), product of:
              0.22461359 = queryWeight, product of:
                2.0919847 = boost
                6.3078156 = idf(docFreq=218, maxDocs=44218)
                0.017021528 = queryNorm
              0.5913577 = fieldWeight in 1060, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3078156 = idf(docFreq=218, maxDocs=44218)
                0.09375 = fieldNorm(doc=1060)
          0.17149404 = weight(abstract_txt:speech in 1060) [ClassicSimilarity], result of:
            0.17149404 = score(doc=1060,freq=1.0), product of:
              0.26632455 = queryWeight, product of:
                2.27796 = boost
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.017021528 = queryNorm
              0.64392877 = fieldWeight in 1060, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.09375 = fieldNorm(doc=1060)
          0.49305457 = weight(abstract_txt:tagger in 1060) [ClassicSimilarity], result of:
            0.49305457 = score(doc=1060,freq=2.0), product of:
              0.42739734 = queryWeight, product of:
                2.8857348 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.017021528 = queryNorm
              1.1536211 = fieldWeight in 1060, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.09375 = fieldNorm(doc=1060)
        0.28 = coord(7/25)
    
  2. Manning, C.D.: Part-of-Speech Tagging from 97% to 100% : is it time for some linguistics? (2011) 0.16
    0.16445012 = sum of:
      0.16445012 = product of:
        0.68520886 = sum of:
          0.056769267 = weight(abstract_txt:error in 1121) [ClassicSimilarity], result of:
            0.056769267 = score(doc=1121,freq=1.0), product of:
              0.1325475 = queryWeight, product of:
                1.1363478 = boost
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.017021528 = queryNorm
              0.42829376 = fieldWeight in 1121, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.0625 = fieldNorm(doc=1121)
          0.032997586 = weight(abstract_txt:features in 1121) [ClassicSimilarity], result of:
            0.032997586 = score(doc=1121,freq=1.0), product of:
              0.11631279 = queryWeight, product of:
                1.505408 = boost
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.017021528 = queryNorm
              0.28369698 = fieldWeight in 1121, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.0625 = fieldNorm(doc=1121)
          0.047952328 = weight(abstract_txt:part in 1121) [ClassicSimilarity], result of:
            0.047952328 = score(doc=1121,freq=2.0), product of:
              0.11844111 = queryWeight, product of:
                1.5191188 = boost
                4.580493 = idf(docFreq=1231, maxDocs=44218)
                0.017021528 = queryNorm
              0.4048622 = fieldWeight in 1121, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.580493 = idf(docFreq=1231, maxDocs=44218)
                0.0625 = fieldNorm(doc=1121)
          0.15337539 = weight(abstract_txt:tagging in 1121) [ClassicSimilarity], result of:
            0.15337539 = score(doc=1121,freq=3.0), product of:
              0.22461359 = queryWeight, product of:
                2.0919847 = boost
                6.3078156 = idf(docFreq=218, maxDocs=44218)
                0.017021528 = queryNorm
              0.68284106 = fieldWeight in 1121, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.3078156 = idf(docFreq=218, maxDocs=44218)
                0.0625 = fieldNorm(doc=1121)
          0.16168612 = weight(abstract_txt:speech in 1121) [ClassicSimilarity], result of:
            0.16168612 = score(doc=1121,freq=2.0), product of:
              0.26632455 = queryWeight, product of:
                2.27796 = boost
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.017021528 = queryNorm
              0.60710186 = fieldWeight in 1121, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.0625 = fieldNorm(doc=1121)
          0.23242815 = weight(abstract_txt:tagger in 1121) [ClassicSimilarity], result of:
            0.23242815 = score(doc=1121,freq=1.0), product of:
              0.42739734 = queryWeight, product of:
                2.8857348 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.017021528 = queryNorm
              0.54382217 = fieldWeight in 1121, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.0625 = fieldNorm(doc=1121)
        0.24 = coord(6/25)
    
  3. L'Homme, D.; L'Homme, M.-C.; Lemay, C.: Benchmarking the performance of two Part-of-Speech (POS) taggers for terminological purposes (2002) 0.16
    0.1579686 = sum of:
      0.1579686 = product of:
        0.78984296 = sum of:
          0.082979076 = weight(abstract_txt:unknown in 1855) [ClassicSimilarity], result of:
            0.082979076 = score(doc=1855,freq=1.0), product of:
              0.14711888 = queryWeight, product of:
                1.1971806 = boost
                7.2195506 = idf(docFreq=87, maxDocs=44218)
                0.017021528 = queryNorm
              0.56402737 = fieldWeight in 1855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2195506 = idf(docFreq=87, maxDocs=44218)
                0.078125 = fieldNorm(doc=1855)
          0.042384274 = weight(abstract_txt:part in 1855) [ClassicSimilarity], result of:
            0.042384274 = score(doc=1855,freq=1.0), product of:
              0.11844111 = queryWeight, product of:
                1.5191188 = boost
                4.580493 = idf(docFreq=1231, maxDocs=44218)
                0.017021528 = queryNorm
              0.35785103 = fieldWeight in 1855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.580493 = idf(docFreq=1231, maxDocs=44218)
                0.078125 = fieldNorm(doc=1855)
          0.11068915 = weight(abstract_txt:tagging in 1855) [ClassicSimilarity], result of:
            0.11068915 = score(doc=1855,freq=1.0), product of:
              0.22461359 = queryWeight, product of:
                2.0919847 = boost
                6.3078156 = idf(docFreq=218, maxDocs=44218)
                0.017021528 = queryNorm
              0.4927981 = fieldWeight in 1855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3078156 = idf(docFreq=218, maxDocs=44218)
                0.078125 = fieldNorm(doc=1855)
          0.1429117 = weight(abstract_txt:speech in 1855) [ClassicSimilarity], result of:
            0.1429117 = score(doc=1855,freq=1.0), product of:
              0.26632455 = queryWeight, product of:
                2.27796 = boost
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.017021528 = queryNorm
              0.5366073 = fieldWeight in 1855, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.078125 = fieldNorm(doc=1855)
          0.4108788 = weight(abstract_txt:tagger in 1855) [ClassicSimilarity], result of:
            0.4108788 = score(doc=1855,freq=2.0), product of:
              0.42739734 = queryWeight, product of:
                2.8857348 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.017021528 = queryNorm
              0.96135086 = fieldWeight in 1855, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.078125 = fieldNorm(doc=1855)
        0.2 = coord(5/25)
    
  4. Li, C.; Sun, A.: Extracting fine-grained location with temporal awareness in tweets : a two-stage approach (2017) 0.12
    0.12201779 = sum of:
      0.12201779 = product of:
        0.61008894 = sum of:
          0.08100611 = weight(abstract_txt:fine in 3686) [ClassicSimilarity], result of:
            0.08100611 = score(doc=3686,freq=2.0), product of:
              0.14575578 = queryWeight, product of:
                1.1916217 = boost
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.017021528 = queryNorm
              0.555766 = fieldWeight in 3686, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3686)
          0.10633711 = weight(abstract_txt:grained in 3686) [ClassicSimilarity], result of:
            0.10633711 = score(doc=3686,freq=2.0), product of:
              0.17474465 = queryWeight, product of:
                1.3047504 = boost
                7.8682456 = idf(docFreq=45, maxDocs=44218)
                0.017021528 = queryNorm
              0.60852855 = fieldWeight in 3686, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.8682456 = idf(docFreq=45, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3686)
          0.085121244 = weight(abstract_txt:conditional in 3686) [ClassicSimilarity], result of:
            0.085121244 = score(doc=3686,freq=1.0), product of:
              0.18980862 = queryWeight, product of:
                1.3598264 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.017021528 = queryNorm
              0.44845825 = fieldWeight in 3686, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3686)
          0.050009307 = weight(abstract_txt:features in 3686) [ClassicSimilarity], result of:
            0.050009307 = score(doc=3686,freq=3.0), product of:
              0.11631279 = queryWeight, product of:
                1.505408 = boost
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.017021528 = queryNorm
              0.42995536 = fieldWeight in 3686, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3686)
          0.28761518 = weight(abstract_txt:tagger in 3686) [ClassicSimilarity], result of:
            0.28761518 = score(doc=3686,freq=2.0), product of:
              0.42739734 = queryWeight, product of:
                2.8857348 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.017021528 = queryNorm
              0.6729456 = fieldWeight in 3686, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3686)
        0.2 = coord(5/25)
    
  5. Chowdhury, A.; McCabe, M.C.: Improving information retrieval systems using part of speech tagging (1993) 0.09
    0.085593574 = sum of:
      0.085593574 = product of:
        0.53495985 = sum of:
          0.11679998 = weight(abstract_txt:reduction in 1061) [ClassicSimilarity], result of:
            0.11679998 = score(doc=1061,freq=2.0), product of:
              0.1466587 = queryWeight, product of:
                1.1953069 = boost
                7.208251 = idf(docFreq=88, maxDocs=44218)
                0.017021528 = queryNorm
              0.79640675 = fieldWeight in 1061, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.208251 = idf(docFreq=88, maxDocs=44218)
                0.078125 = fieldNorm(doc=1061)
          0.059940413 = weight(abstract_txt:part in 1061) [ClassicSimilarity], result of:
            0.059940413 = score(doc=1061,freq=2.0), product of:
              0.11844111 = queryWeight, product of:
                1.5191188 = boost
                4.580493 = idf(docFreq=1231, maxDocs=44218)
                0.017021528 = queryNorm
              0.50607777 = fieldWeight in 1061, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.580493 = idf(docFreq=1231, maxDocs=44218)
                0.078125 = fieldNorm(doc=1061)
          0.11068915 = weight(abstract_txt:tagging in 1061) [ClassicSimilarity], result of:
            0.11068915 = score(doc=1061,freq=1.0), product of:
              0.22461359 = queryWeight, product of:
                2.0919847 = boost
                6.3078156 = idf(docFreq=218, maxDocs=44218)
                0.017021528 = queryNorm
              0.4927981 = fieldWeight in 1061, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3078156 = idf(docFreq=218, maxDocs=44218)
                0.078125 = fieldNorm(doc=1061)
          0.24753031 = weight(abstract_txt:speech in 1061) [ClassicSimilarity], result of:
            0.24753031 = score(doc=1061,freq=3.0), product of:
              0.26632455 = queryWeight, product of:
                2.27796 = boost
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.017021528 = queryNorm
              0.9294311 = fieldWeight in 1061, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8685737 = idf(docFreq=124, maxDocs=44218)
                0.078125 = fieldNorm(doc=1061)
        0.16 = coord(4/25)
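
For the content-based list, each document score is the sum of the per-term scores multiplied by a coordination factor (matched terms / total query terms). The sketch below works through the first entry (doc 1060), again assuming Lucene's ClassicSimilarity formulas. The helper name and the tuple layout are assumptions made for illustration; the frequencies, document frequencies, boosts and norms are copied from the tree.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.017021528   # queryNorm reported for the abstract query
FIELD_NORM = 0.09375       # fieldNorm(doc=1060)
TOTAL_QUERY_TERMS = 25     # denominator of coord(7/25)

# (term, freq in doc 1060, docFreq, boost), as listed in the tree.
matched_terms = [
    ("unknown", 1.0, 87, 1.1971806),
    ("features", 3.0, 1283, 1.505408),
    ("penn", 1.0, 12, 1.514302),
    ("part", 1.0, 1231, 1.5191188),
    ("tagging", 1.0, 218, 2.0919847),
    ("speech", 1.0, 124, 2.27796),
    ("tagger", 2.0, 19, 2.8857348),
]


def term_score(freq: float, doc_freq: int, boost: float) -> float:
    """queryWeight * fieldWeight for a single matching term."""
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight


# Sum the contributions of all matching terms.
term_sum = sum(term_score(f, df, b) for _, f, df, b in matched_terms)

# Scale by the fraction of query terms that matched.
coord = len(matched_terms) / TOTAL_QUERY_TERMS
doc_score = term_sum * coord
print(doc_score)
```

The listing reports 1.2350571 for the sum and 0.345816 for the final score of this entry; the sketch should match both up to rounding. The coordination factor explains why documents matching fewer of the 25 query terms rank lower even when individual term weights are high.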