Search (44 results, page 1 of 3)

  • theme_ss:"Computerlinguistik"
  • language_ss:"e"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.10
    0.10151614 = sum of:
      0.080830514 = product of:
        0.24249153 = sum of:
          0.24249153 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
            0.24249153 = score(doc=562,freq=2.0), product of:
              0.43146574 = queryWeight, product of:
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.05089233 = queryNorm
              0.56201804 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.33333334 = coord(1/3)
      0.020685624 = product of:
        0.04137125 = sum of:
          0.04137125 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
            0.04137125 = score(doc=562,freq=2.0), product of:
              0.17821628 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05089233 = queryNorm
              0.23214069 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.5 = coord(1/2)
    
    Content
Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
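The explain trace above is ordinary Lucene ClassicSimilarity (classic TF-IDF) arithmetic and can be reproduced by hand. A minimal Python sketch, plugging in the freq, docFreq, maxDocs, queryNorm and fieldNorm values copied from the trace for doc 562 and term "_text_:3a" (the formulas are standard Lucene, nothing specific to this catalog):

```python
import math

# Values copied from the explain trace (doc 562, term "_text_:3a").
freq, doc_freq, max_docs = 2.0, 24, 44218
query_norm, field_norm = 0.05089233, 0.046875

# ClassicSimilarity:
#   tf          = sqrt(freq)
#   idf         = ln(maxDocs / (docFreq + 1)) + 1
#   queryWeight = idf * queryNorm
#   fieldWeight = tf * idf * fieldNorm
#   score       = queryWeight * fieldWeight
tf = math.sqrt(freq)                           # 1.4142135 in the trace
idf = math.log(max_docs / (doc_freq + 1)) + 1  # 8.478011 in the trace
query_weight = idf * query_norm                # 0.43146574 in the trace
field_weight = tf * idf * field_norm           # 0.56201804 in the trace
score = query_weight * field_weight            # 0.24249153 in the trace

print(score)
```

The final document score then follows the trace exactly: this term score is multiplied by coord(1/3), and the "_text_:22" clause (same formulas, docFreq=3622) is multiplied by coord(1/2) before the two products are summed.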
  2. Paolillo, J.C.: Linguistics and the information sciences (2009) 0.08
    0.079871394 = product of:
      0.15974279 = sum of:
        0.15974279 = sum of:
          0.11147633 = weight(_text_:encyclopedia in 3840) [ClassicSimilarity], result of:
            0.11147633 = score(doc=3840,freq=2.0), product of:
              0.270842 = queryWeight, product of:
                5.321862 = idf(docFreq=586, maxDocs=44218)
                0.05089233 = queryNorm
              0.41159177 = fieldWeight in 3840, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.321862 = idf(docFreq=586, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3840)
          0.04826646 = weight(_text_:22 in 3840) [ClassicSimilarity], result of:
            0.04826646 = score(doc=3840,freq=2.0), product of:
              0.17821628 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05089233 = queryNorm
              0.2708308 = fieldWeight in 3840, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3840)
      0.5 = coord(1/2)
    
    Date
    27. 8.2011 14:22:33
    Source
    Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates
  3. Smeaton, A.F.: Natural language processing used in information retrieval tasks : an overview of achievements to date (1995) 0.06
    0.055738166 = product of:
      0.11147633 = sum of:
        0.11147633 = product of:
          0.22295266 = sum of:
            0.22295266 = weight(_text_:encyclopedia in 1265) [ClassicSimilarity], result of:
              0.22295266 = score(doc=1265,freq=2.0), product of:
                0.270842 = queryWeight, product of:
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.05089233 = queryNorm
                0.82318354 = fieldWeight in 1265, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.109375 = fieldNorm(doc=1265)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Encyclopedia of library and information science. Vol.55, [=Suppl.18]
  4. Sicilia-Garcia, E.I.; Smith, F.J.: Statistical language modeling (2002) 0.06
    0.055738166 = product of:
      0.11147633 = sum of:
        0.11147633 = product of:
          0.22295266 = sum of:
            0.22295266 = weight(_text_:encyclopedia in 4261) [ClassicSimilarity], result of:
              0.22295266 = score(doc=4261,freq=2.0), product of:
                0.270842 = queryWeight, product of:
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.05089233 = queryNorm
                0.82318354 = fieldWeight in 4261, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4261)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Encyclopedia of library and information science. Vol.71, [=Suppl.34]
  5. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.04
    0.040415257 = product of:
      0.080830514 = sum of:
        0.080830514 = product of:
          0.24249153 = sum of:
            0.24249153 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.24249153 = score(doc=862,freq=2.0), product of:
                0.43146574 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.05089233 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Source
https://arxiv.org/abs/2212.06721
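Several source fields in this export arrived only as percent-encoded remnants of Google redirect URLs (e.g. "https%3A%2F%2Farxiv.org%2Fabs%2F2212.06721"). Assuming Python is available, the standard library recovers the plain form:

```python
from urllib.parse import unquote

# Percent-encoded fragment as it appeared in the export,
# with the trailing "&usg=..." tracking residue already dropped.
encoded = "https%3A%2F%2Farxiv.org%2Fabs%2F2212.06721"
decoded = unquote(encoded)
print(decoded)  # https://arxiv.org/abs/2212.06721
```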
  6. Moisl, H.: Artificial neural networks and Natural Language Processing (2009) 0.03
    0.03185038 = product of:
      0.06370076 = sum of:
        0.06370076 = product of:
          0.12740152 = sum of:
            0.12740152 = weight(_text_:encyclopedia in 3138) [ClassicSimilarity], result of:
              0.12740152 = score(doc=3138,freq=2.0), product of:
                0.270842 = queryWeight, product of:
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.05089233 = queryNorm
                0.4703906 = fieldWeight in 3138, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3138)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates
  7. Liddy, E.D.: Natural language processing for information retrieval (2009) 0.03
    0.03185038 = product of:
      0.06370076 = sum of:
        0.06370076 = product of:
          0.12740152 = sum of:
            0.12740152 = weight(_text_:encyclopedia in 3854) [ClassicSimilarity], result of:
              0.12740152 = score(doc=3854,freq=2.0), product of:
                0.270842 = queryWeight, product of:
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.05089233 = queryNorm
                0.4703906 = fieldWeight in 3854, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3854)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates
  8. Warner, A.J.: Natural language processing (1987) 0.03
    0.027580835 = product of:
      0.05516167 = sum of:
        0.05516167 = product of:
          0.11032334 = sum of:
            0.11032334 = weight(_text_:22 in 337) [ClassicSimilarity], result of:
              0.11032334 = score(doc=337,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.61904186 = fieldWeight in 337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=337)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Annual review of information science and technology. 22(1987), S.79-108
  9. McMahon, J.G.; Smith, F.J.: Improved statistical language model performance with automatically generated word hierarchies (1996) 0.02
    0.02413323 = product of:
      0.04826646 = sum of:
        0.04826646 = product of:
          0.09653292 = sum of:
            0.09653292 = weight(_text_:22 in 3164) [ClassicSimilarity], result of:
              0.09653292 = score(doc=3164,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.5416616 = fieldWeight in 3164, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3164)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Computational linguistics. 22(1996) no.2, S.217-248
  10. Ruge, G.: ¬A spreading activation network for automatic generation of thesaurus relationships (1991) 0.02
    0.02413323 = product of:
      0.04826646 = sum of:
        0.04826646 = product of:
          0.09653292 = sum of:
            0.09653292 = weight(_text_:22 in 4506) [ClassicSimilarity], result of:
              0.09653292 = score(doc=4506,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.5416616 = fieldWeight in 4506, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4506)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    8.10.2000 11:52:22
  11. Somers, H.: Example-based machine translation : Review article (1999) 0.02
    0.02413323 = product of:
      0.04826646 = sum of:
        0.04826646 = product of:
          0.09653292 = sum of:
            0.09653292 = weight(_text_:22 in 6672) [ClassicSimilarity], result of:
              0.09653292 = score(doc=6672,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.5416616 = fieldWeight in 6672, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6672)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 7.1996 9:22:19
  12. New tools for human translators (1997) 0.02
    0.02413323 = product of:
      0.04826646 = sum of:
        0.04826646 = product of:
          0.09653292 = sum of:
            0.09653292 = weight(_text_:22 in 1179) [ClassicSimilarity], result of:
              0.09653292 = score(doc=1179,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.5416616 = fieldWeight in 1179, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=1179)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 7.1996 9:22:19
  13. Baayen, R.H.; Lieber, H.: Word frequency distributions and lexical semantics (1997) 0.02
    0.02413323 = product of:
      0.04826646 = sum of:
        0.04826646 = product of:
          0.09653292 = sum of:
            0.09653292 = weight(_text_:22 in 3117) [ClassicSimilarity], result of:
              0.09653292 = score(doc=3117,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.5416616 = fieldWeight in 3117, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3117)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    28. 2.1999 10:48:22
  14. Bookstein, A.; Kulyukin, V.; Raita, T.; Nicholson, J.: Adapting measures of clumping strength to assess term-term similarity (2003) 0.02
    0.023887785 = product of:
      0.04777557 = sum of:
        0.04777557 = product of:
          0.09555114 = sum of:
            0.09555114 = weight(_text_:encyclopedia in 1609) [ClassicSimilarity], result of:
              0.09555114 = score(doc=1609,freq=2.0), product of:
                0.270842 = queryWeight, product of:
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.05089233 = queryNorm
                0.35279295 = fieldWeight in 1609, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1609)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Automated information retrieval relies heavily on statistical regularities that emerge as terms are deposited to produce text. This paper examines statistical patterns expected of a pair of terms that are semantically related to each other. Guided by a conceptualization of the text generation process, we derive measures of how tightly two terms are semantically associated. Our main objective is to probe whether such measures yield reasonable results. Specifically, we examine how the tendency of a content-bearing term to clump, as quantified by previously developed measures of term clumping, is influenced by the presence of other terms. This approach allows us to present a toolkit from which a range of measures can be constructed. As an illustration, one of several suggested measures is evaluated on a large text corpus built from an on-line encyclopedia.
  15. Byrne, C.C.; McCracken, S.A.: ¬An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval (1999) 0.02
    0.020685624 = product of:
      0.04137125 = sum of:
        0.04137125 = product of:
          0.0827425 = sum of:
            0.0827425 = weight(_text_:22 in 4483) [ClassicSimilarity], result of:
              0.0827425 = score(doc=4483,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.46428138 = fieldWeight in 4483, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4483)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    15. 3.2000 10:22:37
  16. Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009) 0.02
    0.020685624 = product of:
      0.04137125 = sum of:
        0.04137125 = product of:
          0.0827425 = sum of:
            0.0827425 = weight(_text_:22 in 4888) [ClassicSimilarity], result of:
              0.0827425 = score(doc=4888,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.46428138 = fieldWeight in 4888, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4888)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 3.2013 14:56:22
  17. Stede, M.: Lexicalization in natural language generation (2002) 0.02
    0.01990649 = product of:
      0.03981298 = sum of:
        0.03981298 = product of:
          0.07962596 = sum of:
            0.07962596 = weight(_text_:encyclopedia in 4245) [ClassicSimilarity], result of:
              0.07962596 = score(doc=4245,freq=2.0), product of:
                0.270842 = queryWeight, product of:
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.05089233 = queryNorm
                0.29399413 = fieldWeight in 4245, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4245)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Encyclopedia of library and information science. Vol.70, [=Suppl.33]
  18. Chandrasekar, R.; Bangalore, S.: Glean : using syntactic information in document filtering (2002) 0.02
    0.01990649 = product of:
      0.03981298 = sum of:
        0.03981298 = product of:
          0.07962596 = sum of:
            0.07962596 = weight(_text_:encyclopedia in 4257) [ClassicSimilarity], result of:
              0.07962596 = score(doc=4257,freq=2.0), product of:
                0.270842 = queryWeight, product of:
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.05089233 = queryNorm
                0.29399413 = fieldWeight in 4257, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4257)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Encyclopedia of library and information science. Vol.71, [=Suppl.34]
  19. Hutchins, J.: From first conception to first demonstration : the nascent years of machine translation, 1947-1954. A chronology (1997) 0.02
    0.01723802 = product of:
      0.03447604 = sum of:
        0.03447604 = product of:
          0.06895208 = sum of:
            0.06895208 = weight(_text_:22 in 1463) [ClassicSimilarity], result of:
              0.06895208 = score(doc=1463,freq=2.0), product of:
                0.17821628 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05089233 = queryNorm
                0.38690117 = fieldWeight in 1463, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1463)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 7.1996 9:22:19
  20. Spitkovsky, V.; Norvig, P.: From words to concepts and back : dictionaries for linking text, entities and ideas (2012) 0.02
    0.01592519 = product of:
      0.03185038 = sum of:
        0.03185038 = product of:
          0.06370076 = sum of:
            0.06370076 = weight(_text_:encyclopedia in 337) [ClassicSimilarity], result of:
              0.06370076 = score(doc=337,freq=2.0), product of:
                0.270842 = queryWeight, product of:
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.05089233 = queryNorm
                0.2351953 = fieldWeight in 337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.321862 = idf(docFreq=586, maxDocs=44218)
                  0.03125 = fieldNorm(doc=337)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Human language is both rich and ambiguous. When we hear or read words, we resolve meanings to mental representations, for example recognizing and linking names to the intended persons, locations or organizations. Bridging words and meaning - from turning search queries into relevant results to suggesting targeted keywords for advertisers - is also Google's core competency, and important for many other tasks in information retrieval and natural language processing. We are happy to release a resource, spanning 7,560,141 concepts and 175,100,788 unique text strings, that we hope will help everyone working in these areas. How do we represent concepts? Our approach piggybacks on the unique titles of entries from an encyclopedia, which are mostly proper and common noun phrases. We consider each individual Wikipedia article as representing a concept (an entity or an idea), identified by its URL. Text strings that refer to concepts were collected using the publicly available hypertext of anchors (the text you click on in a web link) that point to each Wikipedia page, thus drawing on the vast link structure of the web. For every English article we harvested the strings associated with its incoming hyperlinks from the rest of Wikipedia, the greater web, and also anchors of parallel, non-English Wikipedia pages. Our dictionaries are cross-lingual, and any concept deemed too fine can be broadened to a desired level of generality using Wikipedia's groupings of articles into hierarchical categories. The data set contains triples, each consisting of (i) text, a short, raw natural language string; (ii) url, a related concept, represented by an English Wikipedia article's canonical location; and (iii) count, an integer indicating the number of times text has been observed connected with the concept's url. Our database thus includes weights that measure degrees of association. 
For example, the top two entries for football indicate that it is an ambiguous term, which is almost twice as likely to refer to what we in the US call soccer. See also: Spitkovsky, V.I., A.X. Chang: A cross-lingual dictionary for English Wikipedia concepts. In: http://nlp.stanford.edu/pubs/crosswikis.pdf.
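The (text, url, count) triples described in the abstract support exactly this kind of lookup: for a given string, the concept with the largest count is the most likely referent, and count/total gives its degree of association. A minimal sketch with invented counts (the URLs and numbers below are illustrative only, not taken from the released data set):

```python
from collections import defaultdict

# Illustrative triples: (anchor text, concept URL, observation count).
# Counts are invented for demonstration; the real resource spans
# 7,560,141 concepts and 175,100,788 unique text strings.
triples = [
    ("football", "en.wikipedia.org/wiki/Association_football", 2000),
    ("football", "en.wikipedia.org/wiki/American_football", 1100),
    ("football", "en.wikipedia.org/wiki/Football", 300),
]

by_text = defaultdict(list)
for text, url, count in triples:
    by_text[text].append((url, count))

def best_concept(text):
    """Return (most likely concept URL, its share of all observations)."""
    candidates = by_text.get(text, [])
    if not candidates:
        return None, 0.0
    total = sum(c for _, c in candidates)
    url, count = max(candidates, key=lambda uc: uc[1])
    return url, count / total

url, p = best_concept("football")
print(url, p)
```

With these made-up counts the "soccer" sense wins with roughly twice the weight of American football, mirroring the ambiguity the abstract describes.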