Search (46 results, page 1 of 3)

  • language_ss:"e"
  • theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.10
    0.10140387 = sum of:
      0.08074112 = product of:
        0.24222337 = sum of:
          0.24222337 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
            0.24222337 = score(doc=562,freq=2.0), product of:
              0.4309886 = queryWeight, product of:
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.050836053 = queryNorm
              0.56201804 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.33333334 = coord(1/3)
      0.02066275 = product of:
        0.0413255 = sum of:
          0.0413255 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
            0.0413255 = score(doc=562,freq=2.0), product of:
              0.1780192 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.050836053 = queryNorm
              0.23214069 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.5 = coord(1/2)
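The explain tree above is Lucene's ClassicSimilarity breakdown for the top result. A minimal sketch recomputing that score from the reported components (tf, idf, queryNorm, fieldNorm, coord, all taken directly from the tree for doc 562):

```python
# Recompute the Lucene ClassicSimilarity contribution shown in the explain
# tree above. All constants are the values Lucene reports for the
# "_text_:3a" clause in doc 562; only the arithmetic is ours.
import math

freq = 2.0
tf = math.sqrt(freq)                  # 1.4142135, tf(freq=2.0)
idf = 8.478011                        # idf(docFreq=24, maxDocs=44218), as reported
query_norm = 0.050836053
field_norm = 0.046875

query_weight = idf * query_norm       # queryWeight ~ 0.4309886
field_weight = tf * idf * field_norm  # fieldWeight ~ 0.56201804
score = query_weight * field_weight   # weight(_text_:3a) ~ 0.24222337
contribution = score * (1 / 3)        # coord(1/3) -> ~ 0.08074112

print(score, contribution)
```

The same arithmetic (queryWeight × fieldWeight, scaled by coord) reproduces every other explain tree on this page; only the per-term constants differ.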
    
    Content
Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
  2. Godby, C.J.; Reighart, R.R.: ¬The WordSmith Toolkit (2001) 0.05
    Footnote
Part of a special issue: OCLC and the Internet: An Historical Overview of Research Activities, 1990-1999 - Part II
  3. Normore, L.F.: Using Visualization to Understand Phrase Structure (2001) 0.05
    Footnote
Part of a special issue: OCLC and the Internet: An Historical Overview of Research Activities, 1990-1999 - Part II
  4. Godby, C.J.; Reighart, R.R.: ¬The WordSmith Indexing System (2001) 0.05
    Footnote
Part of a special issue: OCLC and the Internet: An Historical Overview of Research Activities, 1990-1999 - Part II
  5. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.04
    Source
https://arxiv.org/abs/2212.06721
  6. Toutanova, K.; Klein, D.; Manning, C.D.; Singer, Y.: Feature-rich Part-of-Speech Tagging with a cyclic dependency network (2003) 0.03
    Abstract
We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective use of priors in conditional loglinear models, and (iv) fine-grained modeling of unknown word features. Using these ideas together, the resulting tagger gives a 97.24% accuracy on the Penn Treebank WSJ, an error reduction of 4.4% on the best previous single automatically learned tagging result.
  7. Toutanova, K.; Manning, C.D.: Enriching the knowledge sources used in a maximum entropy Part-of-Speech Tagger (2000) 0.03
    Abstract
This paper presents results for a maximum-entropy-based part of speech tagger, which achieves superior performance principally by enriching the information sources used for tagging. In particular, we get improved results by incorporating these features: (i) more extensive treatment of capitalization for unknown words; (ii) features for the disambiguation of the tense forms of verbs; (iii) features for disambiguating particles from prepositions and adverbs. The best resulting accuracy for the tagger on the Penn Treebank is 96.86% overall, and 86.91% on previously unseen words.
  8. Warner, A.J.: Natural language processing (1987) 0.03
    Source
    Annual review of information science and technology. 22(1987), S.79-108
  9. Nait-Baha, L.; Jackiewicz, A.; Djioua, B.; Laublet, P.: Query reformulation for information retrieval on the Web using the point of view methodology : preliminary results (2001) 0.02
    Abstract
The work we are presenting is devoted to the information collected on the WWW. By the term collected we mean the whole process of retrieving, extracting and presenting results to the user. This research is part of the RAP (Research, Analyze, Propose) project, in which we propose to combine two methods: (i) query reformulation using linguistic markers according to a given point of view; and (ii) text semantic analysis by means of contextual exploration results (Descles, 1991). The general project architecture describing the interactions between the users, the RAP system and the WWW search engines is presented in Nait-Baha et al. (1998). We will focus this paper on showing how we use linguistic markers to reformulate the queries according to a given point of view.
  10. McMahon, J.G.; Smith, F.J.: Improved statistical language model performance with automatic generated word hierarchies (1996) 0.02
    Source
    Computational linguistics. 22(1996) no.2, S.217-248
  11. Ruge, G.: ¬A spreading activation network for automatic generation of thesaurus relationships (1991) 0.02
    Date
    8.10.2000 11:52:22
  12. Somers, H.: Example-based machine translation : Review article (1999) 0.02
    Date
    31. 7.1996 9:22:19
  13. New tools for human translators (1997) 0.02
    Date
    31. 7.1996 9:22:19
  14. Baayen, R.H.; Lieber, H.: Word frequency distributions and lexical semantics (1997) 0.02
    Date
    28. 2.1999 10:48:22
  15. Byrne, C.C.; McCracken, S.A.: ¬An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval (1999) 0.02
    Date
    15. 3.2000 10:22:37
  16. Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009) 0.02
    Date
    1. 3.2013 14:56:22
  17. Niemi, T.; Jämsen, J.: ¬A query language for discovering semantic associations, part II : sample queries and query evaluation (2007) 0.02
  18. Niemi, T.; Jämsen, J.: ¬A query language for discovering semantic associations, part I : approach and formal definition of query primitives (2007) 0.02
    Content
    Part II: Journal of the American Society for Information Science and Technology. 58(2007) no.11, S.1686-1700.
  19. Hutchins, J.: From first conception to first demonstration : the nascent years of machine translation, 1947-1954. A chronology (1997) 0.02
    Date
    31. 7.1996 9:22:19
  20. Spitkovsky, V.; Norvig, P.: From words to concepts and back : dictionaries for linking text, entities and ideas (2012) 0.02
    Abstract
    Human language is both rich and ambiguous. When we hear or read words, we resolve meanings to mental representations, for example recognizing and linking names to the intended persons, locations or organizations. Bridging words and meaning - from turning search queries into relevant results to suggesting targeted keywords for advertisers - is also Google's core competency, and important for many other tasks in information retrieval and natural language processing. We are happy to release a resource, spanning 7,560,141 concepts and 175,100,788 unique text strings, that we hope will help everyone working in these areas. How do we represent concepts? Our approach piggybacks on the unique titles of entries from an encyclopedia, which are mostly proper and common noun phrases. We consider each individual Wikipedia article as representing a concept (an entity or an idea), identified by its URL. Text strings that refer to concepts were collected using the publicly available hypertext of anchors (the text you click on in a web link) that point to each Wikipedia page, thus drawing on the vast link structure of the web. For every English article we harvested the strings associated with its incoming hyperlinks from the rest of Wikipedia, the greater web, and also anchors of parallel, non-English Wikipedia pages. Our dictionaries are cross-lingual, and any concept deemed too fine can be broadened to a desired level of generality using Wikipedia's groupings of articles into hierarchical categories. The data set contains triples, each consisting of (i) text, a short, raw natural language string; (ii) url, a related concept, represented by an English Wikipedia article's canonical location; and (iii) count, an integer indicating the number of times text has been observed connected with the concept's url. Our database thus includes weights that measure degrees of association. 
For example, the top two entries for football indicate that it is an ambiguous term, which is almost twice as likely to refer to what we in the US call soccer. See also: Spitkovsky, V.I., A.X. Chang: A cross-lingual dictionary for English Wikipedia concepts. In: http://nlp.stanford.edu/pubs/crosswikis.pdf.
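The abstract describes the released resource concretely: (text, url, count) triples whose counts act as association weights, so a surface string can be disambiguated by a simple conditional probability over its linked concepts. A minimal sketch of that idea, using invented illustrative counts (not values from the actual release):

```python
# Hypothetical (text, url, count) triples in the shape the abstract
# describes. The counts below are illustrative only; the real resource
# spans 7,560,141 concepts and 175,100,788 strings.
from collections import defaultdict

triples = [
    ("football", "https://en.wikipedia.org/wiki/Association_football", 2000),
    ("football", "https://en.wikipedia.org/wiki/American_football", 1100),
    ("football", "https://en.wikipedia.org/wiki/Football_(ball)", 150),
]

# Index counts by surface string.
by_text = defaultdict(dict)
for text, url, count in triples:
    by_text[text][url] = count

def concept_probabilities(text):
    """Estimate P(concept | text) from raw co-occurrence counts."""
    counts = by_text[text]
    total = sum(counts.values())
    return {url: c / total for url, c in counts.items()}

probs = concept_probabilities("football")
best = max(probs, key=probs.get)  # most likely concept for the string
```

With counts in roughly the ratio the abstract suggests, the top concept for "football" comes out as the soccer article, mirroring the ambiguity example above.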

Types

  • a 37
  • m 3
  • s 3
  • el 2
  • p 2
  • x 1