  1. L'Homme, D.; L'Homme, M.-C.; Lemay, C.: Benchmarking the performance of two Part-of-Speech (POS) taggers for terminological purposes (2002) 0.03
    0.029152704 = product of:
      0.05830541 = sum of:
        0.05830541 = product of:
          0.11661082 = sum of:
            0.11661082 = weight(_text_:tagging in 1855) [ClassicSimilarity], result of:
              0.11661082 = score(doc=1855,freq=2.0), product of:
                0.2979515 = queryWeight, product of:
                  5.9038734 = idf(docFreq=327, maxDocs=44218)
                  0.05046712 = queryNorm
                0.39137518 = fieldWeight in 1855, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.9038734 = idf(docFreq=327, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1855)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
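
The arithmetic in the explanation tree above can be reproduced with a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative and are not part of any library. The values for queryNorm, fieldNorm, and the two coord factors are copied from the tree rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as defined by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq: float, doc_freq: int, max_docs: int,
                       query_norm: float, field_norm: float) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                    # 1.4142135 for freq=2.0
    idf = classic_idf(doc_freq, max_docs)   # about 5.9038734
    query_weight = idf * query_norm         # about 0.2979515
    field_weight = tf * idf * field_norm    # about 0.39137518
    return query_weight * field_weight


# Values taken from the explanation for doc 1855, term "tagging".
term_score = classic_term_score(
    freq=2.0, doc_freq=327, max_docs=44218,
    query_norm=0.05046712, field_norm=0.046875,
)

# Two nested coord(1/2) factors halve the score twice.
final_score = term_score * 0.5 * 0.5

print(f"term score:  {term_score:.6f}")
print(f"final score: {final_score:.6f}")
```

Lucene computes these values in 32-bit floats, so the last digits may differ slightly from the figures in the tree.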
    
    Abstract
Part-of-Speech (POS) taggers are used in an increasing number of terminology applications. However, terminologists do not know exactly how they perform on specialized texts, since most POS taggers have been trained on "general" corpora, that is, corpora containing all sorts of undifferentiated texts. In this article, we evaluate the performance of two POS taggers on French and English medical texts. The taggers are TnT (a statistical tagger developed at Saarland University (Brants 2000)) and WinBrill (the Windows version of the tagger initially developed by Eric Brill (1992)). Ten extracts from medical texts were submitted to the taggers and the outputs scanned manually. Results pertain to the accuracy of tagging in terms of correctly and incorrectly tagged words. We also study the handling of unknown words from different viewpoints.