Search (41 results, page 1 of 3)

  • Active filter: year_i:[2000 TO 2010}
  • Active filter: theme_ss:"Computerlinguistik"
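The mixed brackets in the year filter are deliberate Lucene/Solr range syntax: [ is inclusive and } is exclusive, so year_i:[2000 TO 2010} matches 2000 ≤ year < 2010. As a minimal sketch of how these two filters could be issued programmatically, assuming a Solr backend reachable via the pysolr client (the core name and URL are illustrative assumptions, not taken from this page):

```python
import pysolr

# Hypothetical endpoint; the catalogue's real core name and URL are not shown on this page.
solr = pysolr.Solr("http://localhost:8983/solr/catalogue")

# fq reproduces the two active filters above;
# [ is inclusive, } is exclusive: 2000 <= year_i < 2010.
results = solr.search(
    "*:*",
    fq=['year_i:[2000 TO 2010}', 'theme_ss:"Computerlinguistik"'],
    rows=20,
)
for doc in results:
    print(doc.get("title"))
```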
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.07
    Score breakdown (Lucene ClassicSimilarity): 0.07259655 = 0.054127328 (term "3a": weight 0.21650931 × coord 1/4) + 0.018469224 (term "22": weight 0.036938448 × coord 1/2)
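    Every score in this list is Lucene ClassicSimilarity (TF-IDF) output: each matching term contributes queryWeight × fieldWeight, where queryWeight = idf · queryNorm and fieldWeight = √freq · idf · fieldNorm, and a coord factor scales for the fraction of query clauses matched. Reconstructing the dominant "3a" term of this first result from the values reported above:

```latex
\begin{align*}
\text{queryWeight} &= \text{idf}\cdot\text{queryNorm} = 8.478011 \times 0.045439374 = 0.38523552\\
\text{fieldWeight} &= \sqrt{\text{freq}}\cdot\text{idf}\cdot\text{fieldNorm} = \sqrt{2} \times 8.478011 \times 0.046875 = 0.56201804\\
\text{weight} &= \text{queryWeight}\times\text{fieldWeight} = 0.38523552 \times 0.56201804 = 0.21650931\\
\text{score} &= 0.21650931 \times \tfrac{1}{4} + 0.036938448 \times \tfrac{1}{2} = 0.07259655
\end{align*}
```

    The remaining scores below decompose identically; only the idf, tf, and fieldNorm values differ per term and document.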
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8.1.2013 10:22:32
  2. Schneider, J.W.; Borlund, P.: A bibliometric-based semiautomatic approach to identification of candidate thesaurus terms : parsing and filtering of noun phrases from citation contexts (2005) 0.05
    Date
    8.3.2007 19:55:22
    Source
    Context: nature, impact and role. 5th International Conference on Conceptions of Library and Information Sciences, CoLIS 2005, Glasgow, UK, June 2005. Ed. by F. Crestani and I. Ruthven
  3. Computational linguistics for the new millennium : divergence or synergy? Proceedings of the International Symposium held at the Ruprecht-Karls Universität Heidelberg, 21-22 July 2000. Festschrift in honour of Peter Hellwig on the occasion of his 60th birthday (2002) 0.04
    Content
    Contents: Manfred Klenner / Henriette Visser: Introduction - Khurshid Ahmad: Writing Linguistics: When I use a word it means what I choose it to mean - Jürgen Handke: 2000 and Beyond: The Potential of New Technologies in Linguistics - Jurij Apresjan / Igor Boguslavsky / Leonid Iomdin / Leonid Tsinman: Lexical Functions in NU: Possible Uses - Hubert Lehmann: Practical Machine Translation and Linguistic Theory - Karin Haenelt: A Context-based Approach towards Content Processing of Electronic Documents - Petr Sgall / Eva Hajicová: Are Linguistic Frameworks Comparable? - Wolfgang Menzel: Theory and Applications in Computational Linguistics - Is there Common Ground? - Robert Porzel / Michael Strube: Towards Context-adaptive Natural Language Processing Systems - Nicoletta Calzolari: Language Resources in a Multilingual Setting: The European Perspective - Piek Vossen: Computational Linguistics for Theory and Practice.
  4. Pimenov, E.N.: Normativnost' i nekotorye problem razrabotki tezauruzov i drugikh lingvistiicheskikh sredstv IPS (2000) 0.03
  5. Feldman, S.: Find what I mean, not what I say : meaning-based search tools (2000) 0.03
  6. Kunze, C.; Wagner, A.: Anwendungsperspektive des GermaNet, eines lexikalisch-semantischen Netzes für das Deutsche (2001) 0.02
    Source
    Chancen und Perspektiven computergestützter Lexikographie. Ed. by I. Lemberg et al.
  7. Ruchimskaya, E.M.: Yavlenie variativnosti estestevennogo yazyka i sposoby ee ustraneniya v verbal'nykh IPYA (2000) 0.02
  8. Belonogov, G.G.: Sistemy frazeologicheskogo machinnogo perevoda RETRANS i ERTRANS v seti Internet (2000) 0.02
  9. Godby, C.J.: ¬Two Techniques for the Identification of Phrases in Full Text (2001) 0.02
    Footnote
    Part of a special issue: OCLC and the Internet: An Historical Overview of Research Activities, 1990-1999 - Part I
  10. Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009) 0.02
    Date
    1.3.2013 14:56:22
  11. Monnerjahn, P.: Vorsprung ohne Technik : Übersetzen: Computer und Qualität (2000) 0.02
    Source
    c't. 2000, H.22, S.230-231
  12. Sokirko, A.V.: Obzor zarubezhnykh sistem avtomaticheskoi obrabotki teksta, ispol'zuyushchikh poverkhnosto-semanticheskoe predstavlenie, i mashinnykh sematicheskikh slovarei (2000) 0.02
  13. Egger, W.: Helferlein für jedermann : Elektronische Wörterbücher (2004) 0.02
    Object
    I-Finger
  14. Kuhlmann, U.; Monnerjahn, P.: Sprache auf Knopfdruck : Sieben automatische Übersetzungsprogramme im Test (2000) 0.02
    Source
    c't. 2000, H.22, S.220-229
  15. He, Q.: A study of the strength indexes in co-word analysis (2000) 0.02
    Abstract
    Co-word analysis is a technique for detecting the knowledge structure of scientific literature and mapping the dynamics in a research field. It is used to count the co-occurrences of term pairs, compute the strength between term pairs, and map the research field by inserting terms and their linkages into a graphical structure according to the strength values. In previous co-word studies, two indexes have been used to measure the strength between term pairs in order to identify the major areas in a research field: the inclusion index (I) and the equivalence index (E). This study conducts two co-word analysis experiments, one with each index, and compares the results. The results show that, owing to the difference in their computation, index I is more likely to identify general subject areas in a research field, while index E is more likely to identify subject areas at more specific levels.
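    The abstract does not spell out how the two indexes are computed; a minimal sketch, assuming the standard co-word definitions I(i,j) = c_ij / min(c_i, c_j) and E(i,j) = c_ij² / (c_i · c_j), where c_i and c_j are the occurrence counts of the two terms and c_ij their co-occurrence count:

```python
from collections import Counter
from itertools import combinations

def coword_indexes(documents):
    """Inclusion (I) and equivalence (E) indexes for all co-occurring term pairs.

    `documents` is an iterable of per-document term collections. Assumes the
    standard co-word definitions:
        I(i, j) = c_ij / min(c_i, c_j)
        E(i, j) = c_ij**2 / (c_i * c_j)
    """
    term_freq = Counter()
    pair_freq = Counter()
    for terms in documents:
        terms = set(terms)
        term_freq.update(terms)
        pair_freq.update(frozenset(p) for p in combinations(sorted(terms), 2))

    indexes = {}
    for pair, c_ij in pair_freq.items():
        i, j = tuple(pair)
        c_i, c_j = term_freq[i], term_freq[j]
        indexes[(i, j)] = {"I": c_ij / min(c_i, c_j),
                           "E": c_ij ** 2 / (c_i * c_j)}
    return indexes

# Toy corpus: I flags the fully nested pair (thesaurus always co-occurs
# with indexing, so I = 1.0), while E stays lower (E = 4/6).
docs = [{"thesaurus", "indexing"},
        {"thesaurus", "indexing", "retrieval"},
        {"retrieval", "indexing"}]
print(coword_indexes(docs))
```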
  16. Diaz, I.; Morato, J.; Lioréns, J.: An algorithm for term conflation based on tree structures (2002) 0.01
  17. Blair, D.C.: Information retrieval and the philosophy of language (2002) 0.01
    Abstract
    Information retrieval - the retrieval, primarily, of documents or textual material - is fundamentally a linguistic process. At the very least we must describe what we want and match that description with descriptions of the information that is available to us. Furthermore, when we describe what we want, we must mean something by that description. This is a deceptively simple act, but such linguistic events have been the grist for philosophical analysis since Aristotle. Although there are complexities involved in referring to authors, document types, or other categories of information retrieval context, here I wish to focus on one of the most problematic activities in information retrieval: the description of the intellectual content of information items. And even though I take information retrieval to involve the description and retrieval of written text, what I say here is applicable to any information item whose intellectual content can be described for retrieval - books, documents, images, audio clips, video clips, scientific specimens, engineering schematics, and so forth. For convenience, though, I will refer only to the description and retrieval of documents. The description of intellectual content can go wrong in many obvious ways. We may describe what we want incorrectly; we may describe it correctly but in such general terms that its description is useless for retrieval; or we may describe what we want correctly, but misinterpret the descriptions of available information, and thereby match our description of what we want incorrectly. From a linguistic point of view, we can be misunderstood in the process of retrieval in many ways. Because the philosophy of language deals specifically with how we are understood and misunderstood, it should have some use for understanding the process of description in information retrieval. First, however, let us examine more closely the kinds of misunderstandings that can occur in information retrieval. We use language in searching for information in two principal ways. We use it to describe what we want and to discriminate what we want from other information that is available to us but that we do not want. Description and discrimination together articulate the goals of the information search process; they also delineate the two principal ways in which language can fail us in this process. Van Rijsbergen (1979) was the first to make this distinction, calling them "representation" and "discrimination."
  18. Koppel, M.; Akiva, N.; Dagan, I.: Feature instability as a criterion for selecting potential style markers (2006) 0.01
  19. Toutanova, K.; Klein, D.; Manning, C.D.; Singer, Y.: Feature-rich Part-of-Speech Tagging with a cyclic dependency network (2003) 0.01
    Abstract
    We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective use of priors in conditional loglinear models, and (iv) fine-grained modeling of unknown word features. Using these ideas together, the resulting tagger gives a 97.24% accuracy on the Penn Treebank WSJ, an error reduction of 4.4% on the best previous single automatically learned tagging result.
  20. Toutanova, K.; Manning, C.D.: Enriching the knowledge sources used in a maximum entropy Part-of-Speech Tagger (2000) 0.01
    Abstract
    This paper presents results for a maximum-entropy-based part-of-speech tagger, which achieves superior performance principally by enriching the information sources used for tagging. In particular, we get improved results by incorporating these features: (i) more extensive treatment of capitalization for unknown words; (ii) features for the disambiguation of the tense forms of verbs; (iii) features for disambiguating particles from prepositions and adverbs. The best resulting accuracy for the tagger on the Penn Treebank is 96.86% overall, and 86.91% on previously unseen words.
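    As an illustration of the kinds of feature templates the two Toutanova et al. abstracts above describe (capitalization treatment for unknown words, preceding and following word context, crude morphological cues), here is a hypothetical, much-simplified extractor; the feature names and templates are illustrative, not the papers' actual ones:

```python
def token_features(words, i):
    """Toy per-token feature dictionary in the spirit of feature-rich
    maximum-entropy tagging (illustrative simplification)."""
    w = words[i]
    return {
        "word": w.lower(),
        "is_capitalized": w[:1].isupper(),  # capitalization cue for unknown words
        "at_sentence_start": i == 0,        # separates sentence-initial capitals
        "suffix3": w[-3:],                  # crude morphology, e.g. 'ing' or 'ed' tense cues
        "prev_word": words[i - 1].lower() if i > 0 else "<s>",
        "next_word": words[i + 1].lower() if i + 1 < len(words) else "</s>",
    }

print(token_features("The tagger handles Unseen tokens".split(), 3))
```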

Types

  • a 35
  • m 4
  • s 3
  • el 1
  • x 1