Document (#36639)

Author
Perera, P.
Witte, R.
Title
A self-learning context-aware lemmatizer for German
Source
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP 2005), October 6-8, 2005
Imprint
Vancouver : Association for Computational Linguistics
Year
2005
Pages
pp. 636-643
Abstract
Accurate lemmatization of German nouns mandates the use of a lexicon. Comprehensive lexicons, however, are expensive to build and maintain. We present a self-learning lemmatizer capable of automatically creating a full-form lexicon by processing German documents.
Content
See: http://acl.ldc.upenn.edu//H/H05/H05-1080.pdf.
Theme
Computational linguistics

Similar documents (author)

  1. Witte, L.: Sehnsucht nach Unsterblichkeit (2014) 6.01
    6.010904 = sum of:
      6.010904 = weight(author_txt:witte in 2076) [ClassicSimilarity], result of:
        6.010904 = fieldWeight in 2076, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.617446 = idf(docFreq=7, maxDocs=44218)
          0.625 = fieldNorm(doc=2076)
    
  2. Witte-Petit, K.: Mal schnell die Weilt retten (2021) 4.81
    4.808723 = sum of:
      4.808723 = weight(author_txt:witte in 241) [ClassicSimilarity], result of:
        4.808723 = fieldWeight in 241, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.617446 = idf(docFreq=7, maxDocs=44218)
          0.5 = fieldNorm(doc=241)
    
  3. Witte, R.; Gitzinger, T.: Semantic assistants : user-centric Natural Language Processing services for desktop clients (2009) 4.81
    4.808723 = sum of:
      4.808723 = weight(author_txt:witte in 4652) [ClassicSimilarity], result of:
        4.808723 = fieldWeight in 4652, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.617446 = idf(docFreq=7, maxDocs=44218)
          0.5 = fieldNorm(doc=4652)
    
  4. Witte-Petit, K.: Digitaler Handschlag : Corona-App (2019) 4.81
    4.808723 = sum of:
      4.808723 = weight(author_txt:witte in 5957) [ClassicSimilarity], result of:
        4.808723 = fieldWeight in 5957, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.617446 = idf(docFreq=7, maxDocs=44218)
          0.5 = fieldNorm(doc=5957)
    
  5. Witte-Petit, K.: Der menschliche Kurswert (2017) 4.81
    4.808723 = sum of:
      4.808723 = weight(author_txt:witte in 225) [ClassicSimilarity], result of:
        4.808723 = fieldWeight in 225, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.617446 = idf(docFreq=7, maxDocs=44218)
          0.5 = fieldNorm(doc=225)
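
The author scores above are each a single fieldWeight, the product of tf, idf and fieldNorm. The following is a minimal sketch of that arithmetic, not the catalog's actual code. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the figures printed in the listing; other Lucene versions may use a slightly different idf. The function names are hypothetical.

```python
import math


def classic_tf(freq: float) -> float:
    # Assumption: term-frequency factor is the square root of the raw count.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumption: this idf variant matches the values shown in the listing,
    # e.g. docFreq=7, maxDocs=44218 gives roughly 9.6174.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explain trees.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the author listing: the term "witte" occurs in 7 of 44218 records.
top_hit = field_weight(freq=1.0, doc_freq=7, max_docs=44218, field_norm=0.625)
other_hit = field_weight(freq=1.0, doc_freq=7, max_docs=44218, field_norm=0.5)
print(top_hit, other_hit)
```

Only fieldNorm differs between the hits, so the first record ranks higher solely because its author field is shorter (fieldNorm 0.625 versus 0.5).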
    

Similar documents (content)

  1. Stede, M.: Lexicalization in natural language generation (2002) 0.14
    0.14083749 = sum of:
      0.14083749 = product of:
        0.56334996 = sum of:
          0.02083667 = weight(abstract_txt:however in 4245) [ClassicSimilarity], result of:
            0.02083667 = score(doc=4245,freq=2.0), product of:
              0.06389655 = queryWeight, product of:
                1.0230888 = boost
                4.216459 = idf(docFreq=1772, maxDocs=44218)
                0.014812087 = queryNorm
              0.32610008 = fieldWeight in 4245, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.216459 = idf(docFreq=1772, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4245)
          0.023577563 = weight(abstract_txt:processing in 4245) [ClassicSimilarity], result of:
            0.023577563 = score(doc=4245,freq=1.0), product of:
              0.08741806 = queryWeight, product of:
                1.1966722 = boost
                4.931848 = idf(docFreq=866, maxDocs=44218)
                0.014812087 = queryNorm
              0.26971045 = fieldWeight in 4245, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.931848 = idf(docFreq=866, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4245)
          0.033002604 = weight(abstract_txt:automatically in 4245) [ClassicSimilarity], result of:
            0.033002604 = score(doc=4245,freq=1.0), product of:
              0.109387405 = queryWeight, product of:
                1.338623 = boost
                5.5168705 = idf(docFreq=482, maxDocs=44218)
                0.014812087 = queryNorm
              0.30170387 = fieldWeight in 4245, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5168705 = idf(docFreq=482, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4245)
          0.03965289 = weight(abstract_txt:build in 4245) [ClassicSimilarity], result of:
            0.03965289 = score(doc=4245,freq=1.0), product of:
              0.12362845 = queryWeight, product of:
                1.4230949 = boost
                5.8650045 = idf(docFreq=340, maxDocs=44218)
                0.014812087 = queryNorm
              0.32074243 = fieldWeight in 4245, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8650045 = idf(docFreq=340, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4245)
          0.1898441 = weight(abstract_txt:lexicons in 4245) [ClassicSimilarity], result of:
            0.1898441 = score(doc=4245,freq=2.0), product of:
              0.278734 = queryWeight, product of:
                2.1368282 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.014812087 = queryNorm
              0.68109417 = fieldWeight in 4245, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4245)
          0.25643617 = weight(abstract_txt:lexicon in 4245) [ClassicSimilarity], result of:
            0.25643617 = score(doc=4245,freq=2.0), product of:
              0.4291292 = queryWeight, product of:
                3.749589 = boost
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.014812087 = queryNorm
              0.59757334 = fieldWeight in 4245, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4245)
        0.25 = coord(6/24)
    
  2. Hubain, R.; Wilde, M. De; Hooland, S. van: Automated SKOS vocabulary design for the biopharmaceutical industry (2016) 0.12
    0.11961113 = sum of:
      0.11961113 = product of:
        0.4100953 = sum of:
          0.040852357 = weight(abstract_txt:documents in 5132) [ClassicSimilarity], result of:
            0.040852357 = score(doc=5132,freq=3.0), product of:
              0.06104509 = queryWeight, product of:
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.014812087 = queryNorm
              0.6692161 = fieldWeight in 5132, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.09375 = fieldNorm(doc=5132)
          0.02525786 = weight(abstract_txt:however in 5132) [ClassicSimilarity], result of:
            0.02525786 = score(doc=5132,freq=1.0), product of:
              0.06389655 = queryWeight, product of:
                1.0230888 = boost
                4.216459 = idf(docFreq=1772, maxDocs=44218)
                0.014812087 = queryNorm
              0.395293 = fieldWeight in 5132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.216459 = idf(docFreq=1772, maxDocs=44218)
                0.09375 = fieldNorm(doc=5132)
          0.040193282 = weight(abstract_txt:full in 5132) [ClassicSimilarity], result of:
            0.040193282 = score(doc=5132,freq=1.0), product of:
              0.08709276 = queryWeight, product of:
                1.1944436 = boost
                4.922663 = idf(docFreq=874, maxDocs=44218)
                0.014812087 = queryNorm
              0.4614997 = fieldWeight in 5132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.922663 = idf(docFreq=874, maxDocs=44218)
                0.09375 = fieldNorm(doc=5132)
          0.040418677 = weight(abstract_txt:processing in 5132) [ClassicSimilarity], result of:
            0.040418677 = score(doc=5132,freq=1.0), product of:
              0.08741806 = queryWeight, product of:
                1.1966722 = boost
                4.931848 = idf(docFreq=866, maxDocs=44218)
                0.014812087 = queryNorm
              0.46236074 = fieldWeight in 5132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.931848 = idf(docFreq=866, maxDocs=44218)
                0.09375 = fieldNorm(doc=5132)
          0.05715654 = weight(abstract_txt:creating in 5132) [ClassicSimilarity], result of:
            0.05715654 = score(doc=5132,freq=1.0), product of:
              0.11013457 = queryWeight, product of:
                1.343187 = boost
                5.53568 = idf(docFreq=473, maxDocs=44218)
                0.014812087 = queryNorm
              0.51897 = fieldWeight in 5132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.53568 = idf(docFreq=473, maxDocs=44218)
                0.09375 = fieldNorm(doc=5132)
          0.1339557 = weight(abstract_txt:expensive in 5132) [ClassicSimilarity], result of:
            0.1339557 = score(doc=5132,freq=1.0), product of:
              0.19432135 = queryWeight, product of:
                1.7841644 = boost
                7.3530817 = idf(docFreq=76, maxDocs=44218)
                0.014812087 = queryNorm
              0.68935144 = fieldWeight in 5132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3530817 = idf(docFreq=76, maxDocs=44218)
                0.09375 = fieldNorm(doc=5132)
          0.07226089 = weight(abstract_txt:learning in 5132) [ClassicSimilarity], result of:
            0.07226089 = score(doc=5132,freq=1.0), product of:
              0.16224024 = queryWeight, product of:
                2.30552 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.014812087 = queryNorm
              0.44539434 = fieldWeight in 5132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.09375 = fieldNorm(doc=5132)
        0.29166666 = coord(7/24)
    
  3. Xing, F.Z.; Pallucchini, F.; Cambria, E.: Cognitive-inspired domain adaptation of sentiment lexicons (2019) 0.12
    0.1188917 = sum of:
      0.1188917 = product of:
        0.7133502 = sum of:
          0.04531759 = weight(abstract_txt:build in 5104) [ClassicSimilarity], result of:
            0.04531759 = score(doc=5104,freq=1.0), product of:
              0.12362845 = queryWeight, product of:
                1.4230949 = boost
                5.8650045 = idf(docFreq=340, maxDocs=44218)
                0.014812087 = queryNorm
              0.36656278 = fieldWeight in 5104, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8650045 = idf(docFreq=340, maxDocs=44218)
                0.0625 = fieldNorm(doc=5104)
          0.30683443 = weight(abstract_txt:lexicons in 5104) [ClassicSimilarity], result of:
            0.30683443 = score(doc=5104,freq=4.0), product of:
              0.278734 = queryWeight, product of:
                2.1368282 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.014812087 = queryNorm
              1.1008145 = fieldWeight in 5104, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.0625 = fieldNorm(doc=5104)
          0.06812821 = weight(abstract_txt:learning in 5104) [ClassicSimilarity], result of:
            0.06812821 = score(doc=5104,freq=2.0), product of:
              0.16224024 = queryWeight, product of:
                2.30552 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.014812087 = queryNorm
              0.41992182 = fieldWeight in 5104, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=5104)
          0.29306993 = weight(abstract_txt:lexicon in 5104) [ClassicSimilarity], result of:
            0.29306993 = score(doc=5104,freq=2.0), product of:
              0.4291292 = queryWeight, product of:
                3.749589 = boost
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.014812087 = queryNorm
              0.68294096 = fieldWeight in 5104, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.7265954 = idf(docFreq=52, maxDocs=44218)
                0.0625 = fieldNorm(doc=5104)
        0.16666667 = coord(4/24)
    
  4. Witten, I.H.; Bainbridge, M.; Nichols, D.M.: How to build a digital library (2010) 0.10
    0.104764275 = sum of:
      0.104764275 = product of:
        0.31429282 = sum of:
          0.016166687 = weight(abstract_txt:present in 4027) [ClassicSimilarity], result of:
            0.016166687 = score(doc=4027,freq=1.0), product of:
              0.067975 = queryWeight, product of:
                1.0552351 = boost
                4.348943 = idf(docFreq=1552, maxDocs=44218)
                0.014812087 = queryNorm
              0.23783283 = fieldWeight in 4027, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.348943 = idf(docFreq=1552, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4027)
          0.033157766 = weight(abstract_txt:full in 4027) [ClassicSimilarity], result of:
            0.033157766 = score(doc=4027,freq=2.0), product of:
              0.08709276 = queryWeight, product of:
                1.1944436 = boost
                4.922663 = idf(docFreq=874, maxDocs=44218)
                0.014812087 = queryNorm
              0.3807178 = fieldWeight in 4027, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.922663 = idf(docFreq=874, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4027)
          0.032781534 = weight(abstract_txt:comprehensive in 4027) [ClassicSimilarity], result of:
            0.032781534 = score(doc=4027,freq=1.0), product of:
              0.10889837 = queryWeight, product of:
                1.3356274 = boost
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.014812087 = queryNorm
              0.3010287 = fieldWeight in 4027, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5045247 = idf(docFreq=488, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4027)
          0.033002604 = weight(abstract_txt:automatically in 4027) [ClassicSimilarity], result of:
            0.033002604 = score(doc=4027,freq=1.0), product of:
              0.109387405 = queryWeight, product of:
                1.338623 = boost
                5.5168705 = idf(docFreq=482, maxDocs=44218)
                0.014812087 = queryNorm
              0.30170387 = fieldWeight in 4027, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5168705 = idf(docFreq=482, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4027)
          0.033341315 = weight(abstract_txt:creating in 4027) [ClassicSimilarity], result of:
            0.033341315 = score(doc=4027,freq=1.0), product of:
              0.11013457 = queryWeight, product of:
                1.343187 = boost
                5.53568 = idf(docFreq=473, maxDocs=44218)
                0.014812087 = queryNorm
              0.3027325 = fieldWeight in 4027, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.53568 = idf(docFreq=473, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4027)
          0.03965289 = weight(abstract_txt:build in 4027) [ClassicSimilarity], result of:
            0.03965289 = score(doc=4027,freq=1.0), product of:
              0.12362845 = queryWeight, product of:
                1.4230949 = boost
                5.8650045 = idf(docFreq=340, maxDocs=44218)
                0.014812087 = queryNorm
              0.32074243 = fieldWeight in 4027, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8650045 = idf(docFreq=340, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4027)
          0.058576398 = weight(abstract_txt:maintain in 4027) [ClassicSimilarity], result of:
            0.058576398 = score(doc=4027,freq=1.0), product of:
              0.16035542 = queryWeight, product of:
                1.6207515 = boost
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.014812087 = queryNorm
              0.36529103 = fieldWeight in 4027, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4027)
          0.067613594 = weight(abstract_txt:self in 4027) [ClassicSimilarity], result of:
            0.067613594 = score(doc=4027,freq=1.0), product of:
              0.22231454 = queryWeight, product of:
                2.6988177 = boost
                5.561322 = idf(docFreq=461, maxDocs=44218)
                0.014812087 = queryNorm
              0.30413482 = fieldWeight in 4027, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.561322 = idf(docFreq=461, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4027)
        0.33333334 = coord(8/24)
    
  5. Ko, Y.; Seo, J.: Text classification from unlabeled documents with bootstrapping and feature projection techniques (2009) 0.10
    0.09604633 = sum of:
      0.09604633 = product of:
        0.3841853 = sum of:
          0.031448163 = weight(abstract_txt:documents in 2452) [ClassicSimilarity], result of:
            0.031448163 = score(doc=2452,freq=4.0), product of:
              0.06104509 = queryWeight, product of:
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.014812087 = queryNorm
              0.5151628 = fieldWeight in 2452, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=2452)
          0.016838573 = weight(abstract_txt:however in 2452) [ClassicSimilarity], result of:
            0.016838573 = score(doc=2452,freq=1.0), product of:
              0.06389655 = queryWeight, product of:
                1.0230888 = boost
                4.216459 = idf(docFreq=1772, maxDocs=44218)
                0.014812087 = queryNorm
              0.26352867 = fieldWeight in 2452, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.216459 = idf(docFreq=1772, maxDocs=44218)
                0.0625 = fieldNorm(doc=2452)
          0.05334026 = weight(abstract_txt:automatically in 2452) [ClassicSimilarity], result of:
            0.05334026 = score(doc=2452,freq=2.0), product of:
              0.109387405 = queryWeight, product of:
                1.338623 = boost
                5.5168705 = idf(docFreq=482, maxDocs=44218)
                0.014812087 = queryNorm
              0.48762706 = fieldWeight in 2452, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.5168705 = idf(docFreq=482, maxDocs=44218)
                0.0625 = fieldNorm(doc=2452)
          0.056998104 = weight(abstract_txt:accurate in 2452) [ClassicSimilarity], result of:
            0.056998104 = score(doc=2452,freq=1.0), product of:
              0.1440503 = queryWeight, product of:
                1.5361433 = boost
                6.330911 = idf(docFreq=213, maxDocs=44218)
                0.014812087 = queryNorm
              0.39568195 = fieldWeight in 2452, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.330911 = idf(docFreq=213, maxDocs=44218)
                0.0625 = fieldNorm(doc=2452)
          0.0893038 = weight(abstract_txt:expensive in 2452) [ClassicSimilarity], result of:
            0.0893038 = score(doc=2452,freq=1.0), product of:
              0.19432135 = queryWeight, product of:
                1.7841644 = boost
                7.3530817 = idf(docFreq=76, maxDocs=44218)
                0.014812087 = queryNorm
              0.4595676 = fieldWeight in 2452, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3530817 = idf(docFreq=76, maxDocs=44218)
                0.0625 = fieldNorm(doc=2452)
          0.13625643 = weight(abstract_txt:learning in 2452) [ClassicSimilarity], result of:
            0.13625643 = score(doc=2452,freq=8.0), product of:
              0.16224024 = queryWeight, product of:
                2.30552 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.014812087 = queryNorm
              0.83984363 = fieldWeight in 2452, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=2452)
        0.25 = coord(6/24)
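
The content scores combine a per-term product of queryWeight and fieldWeight with a coord factor for the share of query terms matched. The sketch below recomputes the first content hit (Stede, doc 4245) from the boosts, frequencies, queryNorm and fieldNorm printed in its explain tree. It is an illustration under the same assumed tf and idf formulas as the listing's figures suggest, not the search engine's implementation; all function and variable names are hypothetical.

```python
import math

# Constants copied from the explain output.
QUERY_NORM = 0.014812087
MAX_DOCS = 44218


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # Assumption: idf = 1 + ln(maxDocs / (docFreq + 1)), which matches the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, freq: float, doc_freq: int, field_norm: float) -> float:
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM            # queryWeight
    field_weight = math.sqrt(freq) * term_idf * field_norm  # fieldWeight
    return query_weight * field_weight


def document_score(terms, field_norm: float, total_clauses: int) -> float:
    # coord = matched query terms / all query terms (6/24 for doc 4245).
    coord = len(terms) / total_clauses
    return coord * sum(term_score(b, f, df, field_norm) for b, f, df in terms)


# (boost, termFreq, docFreq) for the six terms matched in doc 4245.
stede_terms = [
    (1.0230888, 2.0, 1772),  # however
    (1.1966722, 1.0, 866),   # processing
    (1.338623, 1.0, 482),    # automatically
    (1.4230949, 1.0, 340),   # build
    (2.1368282, 2.0, 17),    # lexicons
    (3.749589, 2.0, 52),     # lexicon
]
stede_score = document_score(stede_terms, field_norm=0.0546875, total_clauses=24)
print(stede_score)
```

The result should agree with the listed 0.14 up to floating-point rounding. Terms without a boost line in the listing, such as "documents", can be entered with a boost of 1.0.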