Search (10 results, page 1 of 1)

  • year_i:[2010 TO 2020}
  • theme_ss:"Computerlinguistik"
  1. Lezius, W.: Morphy - Morphologie und Tagging für das Deutsche (2013) 0.11
    0.105090946 = product of:
      0.21018189 = sum of:
        0.21018189 = sum of:
          0.1554811 = weight(_text_:tagging in 1490) [ClassicSimilarity], result of:
            0.1554811 = score(doc=1490,freq=2.0), product of:
              0.2979515 = queryWeight, product of:
                5.9038734 = idf(docFreq=327, maxDocs=44218)
                0.05046712 = queryNorm
              0.5218336 = fieldWeight in 1490, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9038734 = idf(docFreq=327, maxDocs=44218)
                0.0625 = fieldNorm(doc=1490)
          0.054700784 = weight(_text_:22 in 1490) [ClassicSimilarity], result of:
            0.054700784 = score(doc=1490,freq=2.0), product of:
              0.17672725 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05046712 = queryNorm
              0.30952093 = fieldWeight in 1490, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=1490)
      0.5 = coord(1/2)
    
    Date
    22. 3.2015 9:30:24
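
    The score breakdown shown for each result is Lucene's ClassicSimilarity explain output: per matching term, queryWeight = idf × queryNorm and fieldWeight = tf × idf × fieldNorm are multiplied together, the per-term products are summed, and a coord factor down-weights queries whose clauses do not all match. As a standalone check (not part of the search system; every constant is copied from the tree above), the following sketch re-derives the 0.105 score of result 1:

      import math

      # idf and tf as used by ClassicSimilarity:
      #   idf = 1 + ln(maxDocs / (docFreq + 1)),  tf = sqrt(freq)
      def idf(doc_freq, max_docs):
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
          i = idf(doc_freq, max_docs)                      # 5.9038734 for "tagging"
          query_weight = i * query_norm                    # idf * queryNorm
          field_weight = math.sqrt(freq) * i * field_norm  # tf * idf * fieldNorm
          return query_weight * field_weight

      QUERY_NORM = 0.05046712
      s_tagging = term_score(2.0, 327, 44218, QUERY_NORM, 0.0625)   # 0.1554811
      s_22      = term_score(2.0, 3622, 44218, QUERY_NORM, 0.0625)  # 0.0547008
      print(0.5 * (s_tagging + s_22))   # coord(1/2) applied -> 0.105090946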
  2. Manning, C.D.: Part-of-Speech Tagging from 97% to 100% : is it time for some linguistics? (2011) 0.05
    0.048587844 = product of:
      0.09717569 = sum of:
        0.09717569 = product of:
          0.19435138 = sum of:
            0.19435138 = weight(_text_:tagging in 1121) [ClassicSimilarity], result of:
              0.19435138 = score(doc=1121,freq=8.0), product of:
                0.2979515 = queryWeight, product of:
                  5.9038734 = idf(docFreq=327, maxDocs=44218)
                  0.05046712 = queryNorm
                0.652292 = fieldWeight in 1121, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.9038734 = idf(docFreq=327, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1121)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    I examine what would be necessary to move part-of-speech tagging performance from its current level of about 97.3% token accuracy (56% sentence accuracy) to close to 100% accuracy. I suggest that it must still be possible to greatly increase tagging performance and examine some useful improvements that have recently been made to the Stanford Part-of-Speech Tagger. However, an error analysis of some of the remaining errors suggests that there is limited further mileage to be had either from better machine learning or better features in a discriminative sequence classifier. The prospects for further gains from semisupervised learning also seem quite limited. Rather, I suggest and begin to demonstrate that the largest opportunity for further progress comes from improving the taxonomic basis of the linguistic resources from which taggers are trained. That is, from improved descriptive linguistics. However, I conclude by suggesting that there are also limits to this process. The status of some words may not be able to be adequately captured by assigning them to one of a small number of categories. While conventions can be used in such cases to improve tagging consistency, they lack a strong linguistic basis.
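    As a side check (my arithmetic, not the paper's): if token errors were independent, 97.3% token accuracy over a typical sentence of about 21 tokens would give 0.973^21 ≈ 0.56, consistent with the 56% sentence accuracy quoted above.

      print(0.973 ** 21)  # ~0.563, assuming ~21 tokens/sentence and independent errors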
  3. Schöneberg, U.; Sperber, W.: POS tagging and its applications for mathematics (2014) 0.03
    0.029152704 = product of:
      0.05830541 = sum of:
        0.05830541 = product of:
          0.11661082 = sum of:
            0.11661082 = weight(_text_:tagging in 1748) [ClassicSimilarity], result of:
              0.11661082 = score(doc=1748,freq=2.0), product of:
                0.2979515 = queryWeight, product of:
                  5.9038734 = idf(docFreq=327, maxDocs=44218)
                  0.05046712 = queryNorm
                0.39137518 = fieldWeight in 1748, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.9038734 = idf(docFreq=327, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1748)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  4. Bredack, J.: Automatische Extraktion fachterminologischer Mehrwortbegriffe : ein Verfahrensvergleich (2016) 0.02
    0.024293922 = product of:
      0.048587844 = sum of:
        0.048587844 = product of:
          0.09717569 = sum of:
            0.09717569 = weight(_text_:tagging in 3194) [ClassicSimilarity], result of:
              0.09717569 = score(doc=3194,freq=2.0), product of:
                0.2979515 = queryWeight, product of:
                  5.9038734 = idf(docFreq=327, maxDocs=44218)
                  0.05046712 = queryNorm
                0.326146 = fieldWeight in 3194, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.9038734 = idf(docFreq=327, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3194)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The extraction systems used were the TreeTagger and the indexing software Lingo. The TreeTagger is based on a statistical tagging and chunking algorithm that automatically identifies and extracts noun phrases (NPs). It can be used in a range of natural language processing scenarios, above all as a POS tagger for different languages. The indexing system Lingo, in contrast to the TreeTagger, works with electronic dictionaries and pattern-based matching. Lingo is a system geared towards automatic indexing that ships with a large number of modules, which can be individually adapted to a given task and tuned to one another. The different processing approaches showed up clearly in the result sets of the two systems. The low overlap between the two result sets reflects their divergent modes of operation, which a qualitative analysis illustrated with examples. The present work cannot conclusively determine which of the two systems should be preferred for generating index terms.
  5. Chen, L.; Fang, H.: ¬An automatic method for extracting innovative ideas based on the Scopus® database (2019) 0.02
    0.024293922 = product of:
      0.048587844 = sum of:
        0.048587844 = product of:
          0.09717569 = sum of:
            0.09717569 = weight(_text_:tagging in 5310) [ClassicSimilarity], result of:
              0.09717569 = score(doc=5310,freq=2.0), product of:
                0.2979515 = queryWeight, product of:
                  5.9038734 = idf(docFreq=327, maxDocs=44218)
                  0.05046712 = queryNorm
                0.326146 = fieldWeight in 5310, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.9038734 = idf(docFreq=327, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5310)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The novelty of knowledge claims in a research paper can be considered an evaluation criterion for papers to supplement citations. To provide a foundation for research evaluation from the perspective of innovativeness, we propose an automatic approach for extracting innovative ideas from the abstracts of technology and engineering papers. The approach extracts N-grams as candidates based on part-of-speech tagging and determines whether they are novel by checking the Scopus® database to determine whether they had ever been presented previously. Moreover, we discussed the distributions of innovative ideas in different abstract structures. To improve the performance by excluding noisy N-grams, a list of stopwords and a list of research description characteristics were developed. We selected abstracts of articles published from 2011 to 2017 with the topic of semantic analysis as the experimental texts. Excluding noisy N-grams, considering the distribution of innovative ideas in abstracts, and suitably combining N-grams can effectively improve the performance of automatic innovative idea extraction. Unlike co-word and co-citation analysis, innovative-idea extraction aims to identify the differences in a paper from all previously published papers.
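    The candidate step described above (POS tagging, then N-gram collection, then stoplist filtering) can be sketched as follows. This is an illustrative reconstruction under assumptions of my own, not the authors' code: it uses NLTK's off-the-shelf English tagger, a toy stoplist, and omits the Scopus® novelty check entirely.

      from nltk import pos_tag, word_tokenize  # assumes NLTK plus its tokenizer/tagger models

      STOPWORDS = {"paper", "approach", "method", "results"}  # toy placeholder list

      def candidate_ngrams(text, n_min=2, n_max=4):
          # Keep runs of adjectives/nouns (JJ*/NN* tags), drop stopwords,
          # then emit every contiguous N-gram inside each run.
          tagged = pos_tag(word_tokenize(text))
          runs, run = [], []
          for word, tag in tagged:
              if tag.startswith(("JJ", "NN")) and word.lower() not in STOPWORDS:
                  run.append(word.lower())
              elif run:
                  runs.append(run)
                  run = []
          if run:
              runs.append(run)
          return {" ".join(r[i:i + n])
                  for r in runs
                  for n in range(n_min, n_max + 1)
                  for i in range(len(r) - n + 1)}

      print(candidate_ngrams("We propose an automatic innovative idea extraction method."))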
  6. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.01
    0.010256397 = product of:
      0.020512793 = sum of:
        0.020512793 = product of:
          0.041025586 = sum of:
            0.041025586 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
              0.041025586 = score(doc=563,freq=2.0), product of:
                0.17672725 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05046712 = queryNorm
                0.23214069 = fieldWeight in 563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=563)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    10. 1.2013 19:22:47
  7. Lawrie, D.; Mayfield, J.; McNamee, P.; Oard, D.W.: Cross-language person-entity linking from 20 languages (2015) 0.01
    0.010256397 = product of:
      0.020512793 = sum of:
        0.020512793 = product of:
          0.041025586 = sum of:
            0.041025586 = weight(_text_:22 in 1848) [ClassicSimilarity], result of:
              0.041025586 = score(doc=1848,freq=2.0), product of:
                0.17672725 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05046712 = queryNorm
                0.23214069 = fieldWeight in 1848, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1848)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The goal of entity linking is to associate references to an entity that is found in unstructured natural language content to an authoritative inventory of known entities. This article describes the construction of 6 test collections for cross-language person-entity linking that together span 22 languages. Fully automated components were used together with 2 crowdsourced validation stages to affordably generate ground-truth annotations with an accuracy comparable to that of a completely manual process. The resulting test collections each contain between 642 (Arabic) and 2,361 (Romanian) person references in non-English texts for which the correct resolution in English Wikipedia is known, plus a similar number of references for which no correct resolution into English Wikipedia is believed to exist. Fully automated cross-language person-name linking experiments with 20 non-English languages yielded a resolution accuracy of between 0.84 (Serbian) and 0.98 (Romanian), which compares favorably with previously reported cross-language entity linking results for Spanish.
  8. Fóris, A.: Network theory and terminology (2013) 0.01
    0.008546998 = product of:
      0.017093996 = sum of:
        0.017093996 = product of:
          0.03418799 = sum of:
            0.03418799 = weight(_text_:22 in 1365) [ClassicSimilarity], result of:
              0.03418799 = score(doc=1365,freq=2.0), product of:
                0.17672725 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05046712 = queryNorm
                0.19345059 = fieldWeight in 1365, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1365)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    2. 9.2014 21:22:48
  9. Rötzer, F.: KI-Programm besser als Menschen im Verständnis natürlicher Sprache (2018) 0.01
    0.006837598 = product of:
      0.013675196 = sum of:
        0.013675196 = product of:
          0.027350392 = sum of:
            0.027350392 = weight(_text_:22 in 4217) [ClassicSimilarity], result of:
              0.027350392 = score(doc=4217,freq=2.0), product of:
                0.17672725 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05046712 = queryNorm
                0.15476047 = fieldWeight in 4217, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4217)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 1.2018 11:32:44
  10. Deventer, J.P. van; Kruger, C.J.; Johnson, R.D.: Delineating knowledge management through lexical analysis : a retrospective (2015) 0.01
    0.005982898 = product of:
      0.011965796 = sum of:
        0.011965796 = product of:
          0.023931593 = sum of:
            0.023931593 = weight(_text_:22 in 3807) [ClassicSimilarity], result of:
              0.023931593 = score(doc=3807,freq=2.0), product of:
                0.17672725 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05046712 = queryNorm
                0.1354154 = fieldWeight in 3807, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=3807)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    20. 1.2015 18:30:22