Search (27 results, page 1 of 2)

  • theme_ss:"Computerlinguistik"
  • year_i:[2010 TO 2020}
  1. Lawrie, D.; Mayfield, J.; McNamee, P.; Oard, P.W.: Cross-language person-entity linking from 20 languages (2015) 0.03
    0.03157666 = product of:
      0.06315332 = sum of:
        0.06315332 = sum of:
          0.03241012 = weight(_text_:p in 1848) [ClassicSimilarity], result of:
            0.03241012 = score(doc=1848,freq=2.0), product of:
              0.1359764 = queryWeight, product of:
                3.5955126 = idf(docFreq=3298, maxDocs=44218)
                0.037818365 = queryNorm
              0.23835106 = fieldWeight in 1848, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5955126 = idf(docFreq=3298, maxDocs=44218)
                0.046875 = fieldNorm(doc=1848)
          0.030743198 = weight(_text_:22 in 1848) [ClassicSimilarity], result of:
            0.030743198 = score(doc=1848,freq=2.0), product of:
              0.13243347 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.037818365 = queryNorm
              0.23214069 = fieldWeight in 1848, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=1848)
      0.5 = coord(1/2)
    
    Abstract
    The goal of entity linking is to associate references to an entity that is found in unstructured natural language content to an authoritative inventory of known entities. This article describes the construction of 6 test collections for cross-language person-entity linking that together span 22 languages. Fully automated components were used together with 2 crowdsourced validation stages to affordably generate ground-truth annotations with an accuracy comparable to that of a completely manual process. The resulting test collections each contain between 642 (Arabic) and 2,361 (Romanian) person references in non-English texts for which the correct resolution in English Wikipedia is known, plus a similar number of references for which no correct resolution into English Wikipedia is believed to exist. Fully automated cross-language person-name linking experiments with 20 non-English languages yielded a resolution accuracy of between 0.84 (Serbian) and 0.98 (Romanian), which compares favorably with previously reported cross-language entity linking results for Spanish.
  2. Levin, M.; Krawczyk, S.; Bethard, S.; Jurafsky, D.: Citation-based bootstrapping for large-scale author disambiguation (2012) 0.02
    0.021080323 = product of:
      0.042160645 = sum of:
        0.042160645 = product of:
          0.25296387 = sum of:
            0.25296387 = weight(_text_:b3 in 246) [ClassicSimilarity], result of:
              0.25296387 = score(doc=246,freq=2.0), product of:
                0.41614348 = queryWeight, product of:
                  11.00374 = idf(docFreq=1, maxDocs=44218)
                  0.037818365 = queryNorm
                0.60787654 = fieldWeight in 246, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  11.00374 = idf(docFreq=1, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=246)
          0.16666667 = coord(1/6)
      0.5 = coord(1/2)
    
    Abstract
    We present a new, two-stage, self-supervised algorithm for author disambiguation in large bibliographic databases. In the first "bootstrap" stage, a collection of high-precision features is used to bootstrap a training set with positive and negative examples of coreferring authors. A supervised feature-based classifier is then trained on the bootstrap clusters and used to cluster the authors in a larger unlabeled dataset. Our self-supervised approach shares the advantages of unsupervised approaches (no need for expensive hand labels) as well as supervised approaches (a rich set of features that can be discriminatively trained). The algorithm disambiguates 54,000,000 author instances in Thomson Reuters' Web of Knowledge with B3 F1 of .807. We analyze parameters and features, particularly those from citation networks, which have not been deeply investigated in author disambiguation. The most important citation feature is self-citation, which can be approximated without expensive extraction of the full network. For the supervised stage, the minor improvement due to other citation features (increasing F1 from .748 to .767) suggests they may not be worth the trouble of extracting from databases that don't already have them. A lean feature set without expensive abstract and title features performs 130 times faster with about equal F1.
  3. Snajder, J.: Distributional semantics of multi-word expressions (2013) 0.01
    0.013504217 = product of:
      0.027008435 = sum of:
        0.027008435 = product of:
          0.05401687 = sum of:
            0.05401687 = weight(_text_:p in 2868) [ClassicSimilarity], result of:
              0.05401687 = score(doc=2868,freq=2.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.39725178 = fieldWeight in 2868, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2868)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Content
    Slides from a presentation at the COST Action IC1207 PARSEME Meeting, Warsaw, September 16, 2013. Cf. the article: Snajder, J., P. Almic: Modeling semantic compositionality of Croatian multiword expressions. In: Informatica. 39(2015) no.3, pp.301-309.
  4. Malo, P.; Sinha, A.; Korhonen, P.; Wallenius, J.; Takala, P.: Good debt or bad debt : detecting semantic orientations in economic texts (2014) 0.01
    0.011694996 = product of:
      0.023389991 = sum of:
        0.023389991 = product of:
          0.046779983 = sum of:
            0.046779983 = weight(_text_:p in 1226) [ClassicSimilarity], result of:
              0.046779983 = score(doc=1226,freq=6.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.34403014 = fieldWeight in 1226, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1226)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  5. Panicheva, P.; Cardiff, J.; Rosso, P.: Identifying subjective statements in news titles using a personal sense annotation framework (2013) 0.01
    0.011458708 = product of:
      0.022917416 = sum of:
        0.022917416 = product of:
          0.045834832 = sum of:
            0.045834832 = weight(_text_:p in 968) [ClassicSimilarity], result of:
              0.045834832 = score(doc=968,freq=4.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.33707932 = fieldWeight in 968, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.046875 = fieldNorm(doc=968)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  6. Baierer, K.; Zumstein, P.: Verbesserung der OCR in digitalen Sammlungen von Bibliotheken (2016) 0.01
    0.0108033735 = product of:
      0.021606747 = sum of:
        0.021606747 = product of:
          0.043213494 = sum of:
            0.043213494 = weight(_text_:p in 2818) [ClassicSimilarity], result of:
              0.043213494 = score(doc=2818,freq=2.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.31780142 = fieldWeight in 2818, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2818)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  7. Lezius, W.: Morphy - Morphologie und Tagging für das Deutsche (2013) 0.01
    0.010247733 = product of:
      0.020495467 = sum of:
        0.020495467 = product of:
          0.040990934 = sum of:
            0.040990934 = weight(_text_:22 in 1490) [ClassicSimilarity], result of:
              0.040990934 = score(doc=1490,freq=2.0), product of:
                0.13243347 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.037818365 = queryNorm
                0.30952093 = fieldWeight in 1490, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1490)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2015 9:30:24
  8. Helbig, H.: Knowledge representation and the semantics of natural language (2014) 0.01
    0.009548924 = product of:
      0.019097848 = sum of:
        0.019097848 = product of:
          0.038195696 = sum of:
            0.038195696 = weight(_text_:p in 2396) [ClassicSimilarity], result of:
              0.038195696 = score(doc=2396,freq=4.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.28089944 = fieldWeight in 2396, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2396)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Classification
    P
    LCC
    P
  9. Szpakowicz, S.; Bond, F.; Nakov, P.; Kim, S.N.: On the semantics of noun compounds (2013) 0.01
    0.009452951 = product of:
      0.018905902 = sum of:
        0.018905902 = product of:
          0.037811805 = sum of:
            0.037811805 = weight(_text_:p in 120) [ClassicSimilarity], result of:
              0.037811805 = score(doc=120,freq=2.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.27807623 = fieldWeight in 120, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=120)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  10. Colace, F.; Santo, M. De; Greco, L.; Napoletano, P.: Weighted word pairs for query expansion (2015) 0.01
    0.009452951 = product of:
      0.018905902 = sum of:
        0.018905902 = product of:
          0.037811805 = sum of:
            0.037811805 = weight(_text_:p in 2687) [ClassicSimilarity], result of:
              0.037811805 = score(doc=2687,freq=2.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.27807623 = fieldWeight in 2687, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2687)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  11. Radev, D.R.; Joseph, M.T.; Gibson, B.; Muthukrishnan, P.: ¬A bibliometric and network analysis of the field of computational linguistics (2016) 0.01
    0.009452951 = product of:
      0.018905902 = sum of:
        0.018905902 = product of:
          0.037811805 = sum of:
            0.037811805 = weight(_text_:p in 2764) [ClassicSimilarity], result of:
              0.037811805 = score(doc=2764,freq=2.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.27807623 = fieldWeight in 2764, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2764)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  12. Rayson, P.; Piao, S.; Sharoff, S.; Evert, S.; Moiron, B.V.: Multiword expressions : hard going or plain sailing? (2015) 0.01
    0.009452951 = product of:
      0.018905902 = sum of:
        0.018905902 = product of:
          0.037811805 = sum of:
            0.037811805 = weight(_text_:p in 2918) [ClassicSimilarity], result of:
              0.037811805 = score(doc=2918,freq=2.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.27807623 = fieldWeight in 2918, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2918)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  13. Budin, G.: Zum Entwicklungsstand der Terminologiewissenschaft (2019) 0.01
    0.009452951 = product of:
      0.018905902 = sum of:
        0.018905902 = product of:
          0.037811805 = sum of:
            0.037811805 = weight(_text_:p in 5604) [ClassicSimilarity], result of:
              0.037811805 = score(doc=5604,freq=2.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.27807623 = fieldWeight in 5604, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5604)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Terminologie : Epochen - Schwerpunkte - Umsetzungen. Zum 25-jährigen Bestehen des Rats für Deutschsprachige Terminologie. Ed. by P. Drewer and D. Pulitano
  14. Spitkovsky, V.I.; Chang, A.X.: ¬A cross-lingual dictionary for English Wikipedia concepts (2012) 0.01
    0.00810253 = product of:
      0.01620506 = sum of:
        0.01620506 = product of:
          0.03241012 = sum of:
            0.03241012 = weight(_text_:p in 336) [ClassicSimilarity], result of:
              0.03241012 = score(doc=336,freq=2.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.23835106 = fieldWeight in 336, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.046875 = fieldNorm(doc=336)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Content
    See also: Spitkovsky, V., P. Norvig: From words to concepts and back: dictionaries for linking text, entities and ideas. In: http://googleresearch.blogspot.de/2012/05/from-words-to-concepts-and-back.html. For the data set, see: nlp.stanford.edu/pubs/corsswikis-data.tar.bz2.
  15. Snajder, J.; Almic, P.: Modeling semantic compositionality of Croatian multiword expressions (2015) 0.01
    0.00810253 = product of:
      0.01620506 = sum of:
        0.01620506 = product of:
          0.03241012 = sum of:
            0.03241012 = weight(_text_:p in 2920) [ClassicSimilarity], result of:
              0.03241012 = score(doc=2920,freq=2.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.23835106 = fieldWeight in 2920, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2920)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  16. Ghazzawi, N.; Robichaud, B.; Drouin, P.; Sadat, F.: Automatic extraction of specialized verbal units (2018) 0.01
    0.00810253 = product of:
      0.01620506 = sum of:
        0.01620506 = product of:
          0.03241012 = sum of:
            0.03241012 = weight(_text_:p in 4094) [ClassicSimilarity], result of:
              0.03241012 = score(doc=4094,freq=2.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.23835106 = fieldWeight in 4094, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4094)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  17. Mengel, T.: Wie viel Terminologiearbeit steckt in der Übersetzung der Dewey-Dezimalklassifikation? (2019) 0.01
    0.00810253 = product of:
      0.01620506 = sum of:
        0.01620506 = product of:
          0.03241012 = sum of:
            0.03241012 = weight(_text_:p in 5603) [ClassicSimilarity], result of:
              0.03241012 = score(doc=5603,freq=2.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.23835106 = fieldWeight in 5603, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5603)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Terminologie : Epochen - Schwerpunkte - Umsetzungen. Zum 25-jährigen Bestehen des Rats für Deutschsprachige Terminologie. Ed. by P. Drewer and D. Pulitano
  18. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.01
    0.0076857996 = product of:
      0.015371599 = sum of:
        0.015371599 = product of:
          0.030743198 = sum of:
            0.030743198 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
              0.030743198 = score(doc=563,freq=2.0), product of:
                0.13243347 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.037818365 = queryNorm
                0.23214069 = fieldWeight in 563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=563)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    10. 1.2013 19:22:47
  19. Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.01
    0.0067521087 = product of:
      0.013504217 = sum of:
        0.013504217 = product of:
          0.027008435 = sum of:
            0.027008435 = weight(_text_:p in 1338) [ClassicSimilarity], result of:
              0.027008435 = score(doc=1338,freq=2.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.19862589 = fieldWeight in 1338, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1338)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  20. Hoenkamp, E.; Bruza, P.: How everyday language can and will boost effective information retrieval (2015) 0.01
    0.0067521087 = product of:
      0.013504217 = sum of:
        0.013504217 = product of:
          0.027008435 = sum of:
            0.027008435 = weight(_text_:p in 2123) [ClassicSimilarity], result of:
              0.027008435 = score(doc=2123,freq=2.0), product of:
                0.1359764 = queryWeight, product of:
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.037818365 = queryNorm
                0.19862589 = fieldWeight in 2123, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5955126 = idf(docFreq=3298, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2123)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
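
    Note on the score explanations
    The figures in the explanation trees above appear consistent with a classic TF-IDF scheme in which tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), and each term score is queryWeight * fieldWeight, scaled by the coord factors. The following is a minimal Python sketch under that assumption; the function names are illustrative and are not Lucene API calls, and the inputs are copied from results 1 and 2. Small differences in the last digits are to be expected, since the listing shows 32-bit float values.

```python
import math

def tf(freq):
    # Term-frequency factor as shown in the listing: sqrt(2.0) = 1.4142135
    return math.sqrt(freq)

def idf(doc_freq, max_docs):
    # Assumed formula; reproduces 3.5955126 for docFreq=3298, maxDocs=44218
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # queryWeight = idf * queryNorm; fieldWeight = tf * idf * fieldNorm
    i = idf(doc_freq, max_docs)
    query_weight = i * query_norm
    field_weight = tf(freq) * i * field_norm
    return query_weight * field_weight

MAX_DOCS = 44218
QUERY_NORM = 0.037818365

# Result 1 (doc 1848): terms "p" and "22" are summed, then scaled by coord(1/2).
score_p = term_score(2.0, 3298, MAX_DOCS, QUERY_NORM, 0.046875)
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, 0.046875)
result_1 = (score_p + score_22) * 0.5

# Result 2 (doc 246): single term "b3", scaled by coord(1/6) and coord(1/2).
score_b3 = term_score(2.0, 1, MAX_DOCS, QUERY_NORM, 0.0390625)
result_2 = score_b3 * (1.0 / 6.0) * 0.5

print(result_1, result_2)
```

    The rare term "b3" (docFreq=1) receives an idf of about 11.0, which is why result 2 ranks second despite its two coord penalties.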
    

Languages

  • e 21
  • d 6

Types