Search (7 results, page 1 of 1)

  • year_i:[2010 TO 2020}
  • theme_ss:"Computerlinguistik"
  1. Spitkovsky, V.; Norvig, P.: From words to concepts and back : dictionaries for linking text, entities and ideas (2012) 0.02
    0.01592519 = product of:
      0.03185038 = sum of:
        0.03185038 = product of:
          0.06370076 = sum of:
            0.06370076 = weight(_text_:encyclopedia in 337) [ClassicSimilarity], result of:
              0.06370076 = score(doc=337,freq=2.0), product of:
                0.270842 = queryWeight, product of:
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.05089233 = queryNorm
                0.2351953 = fieldWeight in 337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.03125 = fieldNorm(doc=337)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
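    Each score breakdown above and below is standard Lucene ClassicSimilarity (TF-IDF) explain output, and the same arithmetic recurs in every entry. A minimal sketch that reproduces the numbers of this first tree, assuming Lucene's documented formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs/(docFreq+1)), and taking queryNorm as given from the output:

      import math

      # Values copied from the explain tree for doc 337 / term "encyclopedia".
      freq, doc_freq, max_docs = 2.0, 586, 44218
      query_norm, field_norm = 0.05089233, 0.03125

      tf = math.sqrt(freq)                           # 1.4142135 = tf(freq=2.0)
      idf = 1 + math.log(max_docs / (doc_freq + 1))  # 5.321862
      query_weight = idf * query_norm                # 0.270842  = queryWeight
      field_weight = tf * idf * field_norm           # 0.2351953 = fieldWeight
      raw_score = query_weight * field_weight        # 0.06370076

      # Each coord(1/2) halves the score: one of two query clauses matched.
      print(raw_score * 0.5 * 0.5)                   # 0.01592519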
    
    Abstract
    Human language is both rich and ambiguous. When we hear or read words, we resolve meanings to mental representations, for example recognizing and linking names to the intended persons, locations or organizations. Bridging words and meaning - from turning search queries into relevant results to suggesting targeted keywords for advertisers - is also Google's core competency, and important for many other tasks in information retrieval and natural language processing. We are happy to release a resource, spanning 7,560,141 concepts and 175,100,788 unique text strings, that we hope will help everyone working in these areas.
    How do we represent concepts? Our approach piggybacks on the unique titles of entries from an encyclopedia, which are mostly proper and common noun phrases. We consider each individual Wikipedia article as representing a concept (an entity or an idea), identified by its URL. Text strings that refer to concepts were collected using the publicly available hypertext of anchors (the text you click on in a web link) that point to each Wikipedia page, thus drawing on the vast link structure of the web. For every English article we harvested the strings associated with its incoming hyperlinks from the rest of Wikipedia, the greater web, and also anchors of parallel, non-English Wikipedia pages. Our dictionaries are cross-lingual, and any concept deemed too fine can be broadened to a desired level of generality using Wikipedia's groupings of articles into hierarchical categories.
    The data set contains triples, each consisting of (i) text, a short, raw natural language string; (ii) url, a related concept, represented by an English Wikipedia article's canonical location; and (iii) count, an integer indicating the number of times text has been observed connected with the concept's url. Our database thus includes weights that measure degrees of association. For example, the top two entries for football indicate that it is an ambiguous term, which is almost twice as likely to refer to what we in the US call soccer. See also: Spitkovsky, V.I., A.X. Chang: A cross-lingual dictionary for English Wikipedia concepts. In: http://nlp.stanford.edu/pubs/crosswikis.pdf.
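    Because a count accompanies each (text, url) pair, the dictionary can be read off as the probability that a string refers to a given concept. A minimal sketch of that shape, with made-up counts mirroring the football example above (the actual file layout of the released data set may differ):

      # Illustrative triples in the (text, url, count) shape described above;
      # the counts are invented to reflect "almost twice as likely".
      triples = [
          ("football", "https://en.wikipedia.org/wiki/Association_football", 2000),
          ("football", "https://en.wikipedia.org/wiki/American_football", 1100),
      ]

      total = sum(count for _, _, count in triples)
      for text, url, count in triples:
          print(f"{text} -> {url}: {count / total:.2f}")  # degree of association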
  2. Lezius, W.: Morphy - Morphologie und Tagging für das Deutsche [Morphy: morphology and tagging for German] (2013) 0.01
    0.013790417 = product of:
      0.027580835 = sum of:
        0.027580835 = product of:
          0.05516167 = sum of:
            0.05516167 = weight(_text_:22 in 1490) [ClassicSimilarity], result of:
              0.05516167 = score(doc=1490,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.30952093 = fieldWeight in 1490, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1490)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2015 9:30:24
  3. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.01
    0.010342812 = product of:
      0.020685624 = sum of:
        0.020685624 = product of:
          0.04137125 = sum of:
            0.04137125 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
              0.04137125 = score(doc=563,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.23214069 = fieldWeight in 563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=563)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    10. 1.2013 19:22:47
  4. Lawrie, D.; Mayfield, J.; McNamee, P.; Oard, D.W.: Cross-language person-entity linking from 20 languages (2015) 0.01
    0.010342812 = product of:
      0.020685624 = sum of:
        0.020685624 = product of:
          0.04137125 = sum of:
            0.04137125 = weight(_text_:22 in 1848) [ClassicSimilarity], result of:
              0.04137125 = score(doc=1848,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.23214069 = fieldWeight in 1848, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1848)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The goal of entity linking is to associate references to an entity that is found in unstructured natural language content with an authoritative inventory of known entities. This article describes the construction of 6 test collections for cross-language person-entity linking that together span 22 languages. Fully automated components were used together with 2 crowdsourced validation stages to affordably generate ground-truth annotations with an accuracy comparable to that of a completely manual process. The resulting test collections each contain between 642 (Arabic) and 2,361 (Romanian) person references in non-English texts for which the correct resolution in English Wikipedia is known, plus a similar number of references for which no correct resolution into English Wikipedia is believed to exist. Fully automated cross-language person-name linking experiments with 20 non-English languages yielded a resolution accuracy of between 0.84 (Serbian) and 0.98 (Romanian), which compares favorably with previously reported cross-language entity linking results for Spanish.
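    The reported resolution accuracy can presumably be read as the fraction of references resolved to the correct English Wikipedia page, with NIL (no correct resolution exists) counting as a valid answer. A toy sketch under that assumption, with hypothetical mention ids and pages:

      # gold maps a mention id to its English Wikipedia page, or None for a
      # NIL mention; predictions use the same convention. Data is invented.
      gold = {"m1": "Nikola_Tesla", "m2": None, "m3": "Belgrade"}
      predictions = {"m1": "Nikola_Tesla", "m2": None, "m3": "Novi_Sad"}

      correct = sum(predictions[m] == gold[m] for m in gold)
      print(f"resolution accuracy: {correct / len(gold):.2f}")  # 0.67 here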
  5. Fóris, A.: Network theory and terminology (2013) 0.01
    0.00861901 = product of:
      0.01723802 = sum of:
        0.01723802 = product of:
          0.03447604 = sum of:
            0.03447604 = weight(_text_:22 in 1365) [ClassicSimilarity], result of:
              0.03447604 = score(doc=1365,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.19345059 = fieldWeight in 1365, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1365)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    2. 9.2014 21:22:48
  6. Rötzer, F.: KI-Programm besser als Menschen im Verständnis natürlicher Sprache [AI program better than humans at understanding natural language] (2018) 0.01
    0.0068952087 = product of:
      0.013790417 = sum of:
        0.013790417 = product of:
          0.027580835 = sum of:
            0.027580835 = weight(_text_:22 in 4217) [ClassicSimilarity], result of:
              0.027580835 = score(doc=4217,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.15476047 = fieldWeight in 4217, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4217)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 1.2018 11:32:44
  7. Deventer, J.P. van; Kruger, C.J.; Johnson, R.D.: Delineating knowledge management through lexical analysis : a retrospective (2015) 0.01
    0.0060333074 = product of:
      0.012066615 = sum of:
        0.012066615 = product of:
          0.02413323 = sum of:
            0.02413323 = weight(_text_:22 in 3807) [ClassicSimilarity], result of:
              0.02413323 = score(doc=3807,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.1354154 = fieldWeight in 3807, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=3807)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    20. 1.2015 18:30:22