Search (104 results, page 2 of 6)

  • Filter: theme_ss:"Computerlinguistik"
  1. Belonogov, G.G.: Sistemy frazeologicheskogo machinnogo perevoda RETRANS i ERTRANS v seti Internet [The RETRANS and ERTRANS phraseological machine translation systems on the Internet] (2000) 0.02
    0.021425933 = product of:
      0.042851865 = sum of:
        0.042851865 = product of:
          0.08570373 = sum of:
            0.08570373 = weight(_text_:i in 183) [ClassicSimilarity], result of:
              0.08570373 = score(doc=183,freq=2.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.50006545 = fieldWeight in 183, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.09375 = fieldNorm(doc=183)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
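
    A note on the relevance breakdowns: each hit on this page carries a Lucene "explain" tree for its score under ClassicSimilarity (TF-IDF). As a worked check of the breakdown above for result 1, the short Python sketch below recomputes the score from the listed factors; the idf formula 1 + ln(maxDocs / (docFreq + 1)) is the standard ClassicSimilarity definition and is assumed here, since the tree only reports its value.

      import math

      # Inputs copied from the explain tree for result 1 (term "i", doc 183).
      freq, doc_freq, max_docs = 2.0, 2765, 44218
      query_norm, field_norm = 0.045439374, 0.09375

      tf = math.sqrt(freq)                             # 1.4142135 = tf(freq=2.0)
      idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 3.7717297 = idf(docFreq=2765, maxDocs=44218)
      query_weight = idf * query_norm                  # 0.17138503 = queryWeight
      field_weight = tf * idf * field_norm             # 0.50006545 = fieldWeight
      term_score = query_weight * field_weight         # 0.08570373
      score = term_score * 0.5 * 0.5                   # two coord(1/2) factors
      print(score)                                     # ~0.021425933, the value reported above

    The same arithmetic reproduces every other breakdown on this page; only freq, docFreq and fieldNorm change from hit to hit.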
    
  2. Subbotin, M.M.: Intellektual'nye tekhnologii poiska i obrabotki tekstovoi informatsii kak instrument podderzhki analiticheskoi deyatel'nosti [Intelligent technologies for the search and processing of textual information as a tool for supporting analytical work] (1999) 0.02
    0.021425933 = product of:
      0.042851865 = sum of:
        0.042851865 = product of:
          0.08570373 = sum of:
            0.08570373 = weight(_text_:i in 415) [ClassicSimilarity], result of:
              0.08570373 = score(doc=415,freq=2.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.50006545 = fieldWeight in 415, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.09375 = fieldNorm(doc=415)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  3. Godby, C.J.: Two Techniques for the Identification of Phrases in Full Text (2001) 0.02
    0.021425933 = product of:
      0.042851865 = sum of:
        0.042851865 = product of:
          0.08570373 = sum of:
            0.08570373 = weight(_text_:i in 1000) [ClassicSimilarity], result of:
              0.08570373 = score(doc=1000,freq=2.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.50006545 = fieldWeight in 1000, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1000)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Footnote
    Part of a special issue: OCLC and the Internet: An Historical Overview of Research Activities, 1990-1999 - Part I
  4. Manning, C.D.: Part-of-Speech Tagging from 97% to 100% : is it time for some linguistics? (2011) 0.02
    0.019962436 = product of:
      0.03992487 = sum of:
        0.03992487 = product of:
          0.07984974 = sum of:
            0.07984974 = weight(_text_:i in 1121) [ClassicSimilarity], result of:
              0.07984974 = score(doc=1121,freq=10.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.46590847 = fieldWeight in 1121, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1121)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    I examine what would be necessary to move part-of-speech tagging performance from its current level of about 97.3% token accuracy (56% sentence accuracy) to close to 100% accuracy. I suggest that it must still be possible to greatly increase tagging performance and examine some useful improvements that have recently been made to the Stanford Part-of-Speech Tagger. However, an error analysis of some of the remaining errors suggests that there is limited further mileage to be had either from better machine learning or better features in a discriminative sequence classifier. The prospects for further gains from semisupervised learning also seem quite limited. Rather, I suggest and begin to demonstrate that the largest opportunity for further progress comes from improving the taxonomic basis of the linguistic resources from which taggers are trained. That is, from improved descriptive linguistics. However, I conclude by suggesting that there are also limits to this process. The status of some words may not be able to be adequately captured by assigning them to one of a small number of categories. While conventions can be used in such cases to improve tagging consistency, they lack a strong linguistic basis.
    Source
    Computational Linguistics and Intelligent Text Processing, 12th International Conference, CICLing 2011, Proceedings, Part I. Ed.: Alexander Gelbukh
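    The 97.3% token accuracy versus 56% sentence accuracy quoted in the abstract is consistent with simply compounding per-token errors over a sentence. A rough check, assuming independent errors and an average sentence length of about 23 tokens (a typical WSJ figure; both assumptions are mine, not the paper's):

      token_accuracy = 0.973
      avg_sentence_length = 23                       # assumed; not stated in the abstract
      print(token_accuracy ** avg_sentence_length)   # ~0.53, in the neighbourhood of the reported 56%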
  5. Byrne, C.C.; McCracken, S.A.: An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval (1999) 0.02
    0.018469224 = product of:
      0.036938448 = sum of:
        0.036938448 = product of:
          0.073876895 = sum of:
            0.073876895 = weight(_text_:22 in 4483) [ClassicSimilarity], result of:
              0.073876895 = score(doc=4483,freq=2.0), product of:
                0.15912095 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045439374 = queryNorm
                0.46428138 = fieldWeight in 4483, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4483)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    15. 3.2000 10:22:37
  6. Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009) 0.02
    0.018469224 = product of:
      0.036938448 = sum of:
        0.036938448 = product of:
          0.073876895 = sum of:
            0.073876895 = weight(_text_:22 in 4888) [ClassicSimilarity], result of:
              0.073876895 = score(doc=4888,freq=2.0), product of:
                0.15912095 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045439374 = queryNorm
                0.46428138 = fieldWeight in 4888, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4888)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 3.2013 14:56:22
  7. Monnerjahn, P.: Vorsprung ohne Technik : Übersetzen: Computer und Qualität [Getting ahead without technology: translation, computers and quality] (2000) 0.02
    0.018469224 = product of:
      0.036938448 = sum of:
        0.036938448 = product of:
          0.073876895 = sum of:
            0.073876895 = weight(_text_:22 in 5429) [ClassicSimilarity], result of:
              0.073876895 = score(doc=5429,freq=2.0), product of:
                0.15912095 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045439374 = queryNorm
                0.46428138 = fieldWeight in 5429, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5429)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    c't. 2000, H.22, S.230-231
  8. Sokirko, A.V.: Obzor zarubezhnykh sistem avtomaticheskoi obrabotki teksta, ispol'zuyushchikh poverkhnosto-semanticheskoe predstavlenie, i mashinnykh sematicheskikh slovarei [A survey of foreign automatic text-processing systems that use surface-semantic representations, and of machine semantic dictionaries] (2000) 0.02
    0.017854942 = product of:
      0.035709884 = sum of:
        0.035709884 = product of:
          0.07141977 = sum of:
            0.07141977 = weight(_text_:i in 8870) [ClassicSimilarity], result of:
              0.07141977 = score(doc=8870,freq=2.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.41672117 = fieldWeight in 8870, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.078125 = fieldNorm(doc=8870)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  9. Sheremet'eva, S.O.: Teoreticheskie i metodologicheskie problemy inzhenernoi lingvistiki [Theoretical and methodological problems of engineering linguistics] (1998) 0.02
    0.017854942 = product of:
      0.035709884 = sum of:
        0.035709884 = product of:
          0.07141977 = sum of:
            0.07141977 = weight(_text_:i in 6316) [ClassicSimilarity], result of:
              0.07141977 = score(doc=6316,freq=2.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.41672117 = fieldWeight in 6316, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.078125 = fieldNorm(doc=6316)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  10. Egger, W.: Helferlein für jedermann : Elektronische Wörterbücher [Little helpers for everyone: electronic dictionaries] (2004) 0.02
    0.017854942 = product of:
      0.035709884 = sum of:
        0.035709884 = product of:
          0.07141977 = sum of:
            0.07141977 = weight(_text_:i in 1501) [ClassicSimilarity], result of:
              0.07141977 = score(doc=1501,freq=2.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.41672117 = fieldWeight in 1501, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1501)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Object
    I-Finger
  11. Hutchins, J.: From first conception to first demonstration : the nascent years of machine translation, 1947-1954. A chronology (1997) 0.02
    0.01539102 = product of:
      0.03078204 = sum of:
        0.03078204 = product of:
          0.06156408 = sum of:
            0.06156408 = weight(_text_:22 in 1463) [ClassicSimilarity], result of:
              0.06156408 = score(doc=1463,freq=2.0), product of:
                0.15912095 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045439374 = queryNorm
                0.38690117 = fieldWeight in 1463, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1463)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 7.1996 9:22:19
  12. Kuhlmann, U.; Monnerjahn, P.: Sprache auf Knopfdruck : Sieben automatische Übersetzungsprogramme im Test [Language at the push of a button: seven automatic translation programs put to the test] (2000) 0.02
    0.01539102 = product of:
      0.03078204 = sum of:
        0.03078204 = product of:
          0.06156408 = sum of:
            0.06156408 = weight(_text_:22 in 5428) [ClassicSimilarity], result of:
              0.06156408 = score(doc=5428,freq=2.0), product of:
                0.15912095 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045439374 = queryNorm
                0.38690117 = fieldWeight in 5428, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=5428)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    c't. 2000, H.22, S.220-229
  13. Lezius, W.; Rapp, R.; Wettler, M.: A morphology-system and part-of-speech tagger for German (1996) 0.02
    0.01539102 = product of:
      0.03078204 = sum of:
        0.03078204 = product of:
          0.06156408 = sum of:
            0.06156408 = weight(_text_:22 in 1693) [ClassicSimilarity], result of:
              0.06156408 = score(doc=1693,freq=2.0), product of:
                0.15912095 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045439374 = queryNorm
                0.38690117 = fieldWeight in 1693, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1693)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2015 9:37:18
  14. He, Q.: A study of the strength indexes in co-word analysis (2000) 0.02
    0.015150423 = product of:
      0.030300846 = sum of:
        0.030300846 = product of:
          0.060601693 = sum of:
            0.060601693 = weight(_text_:i in 111) [ClassicSimilarity], result of:
              0.060601693 = score(doc=111,freq=4.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.35359967 = fieldWeight in 111, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.046875 = fieldNorm(doc=111)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Co-word analysis is a technique for detecting the knowledge structure of scientific literature and mapping the dynamics in a research field. It is used to count the co-occurrences of term pairs, compute the strength between term pairs, and map the research field by inserting terms and their linkages into a graphical structure according to the strength values. In previous co-word studies, there are two indexes used to measure the strength between term pairs in order to identify the major areas in a research field - the inclusion index (I) and the equivalence index (E). This study will conduct two co-word analysis experiments using the two indexes, respectively, and compare the results from the two experiments. The results show that, due to the difference in their computation, index I is more likely to identify general subject areas in a research field, while index E is more likely to identify subject areas at more specific levels.
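    The abstract names the two strength measures but does not give their formulas. As an illustrative sketch, the snippet below uses the definitions standard in the co-word literature: the inclusion index I = c_ij / min(c_i, c_j) and the equivalence index E = c_ij^2 / (c_i * c_j), where c_i and c_j are the occurrence counts of the two terms and c_ij their co-occurrence count; these definitions are assumed, not quoted from the paper.

      def inclusion_index(c_ij: int, c_i: int, c_j: int) -> float:
          # Strength relative to the rarer term of the pair.
          return c_ij / min(c_i, c_j)

      def equivalence_index(c_ij: int, c_i: int, c_j: int) -> float:
          # Strength normalised by both term frequencies.
          return (c_ij * c_ij) / (c_i * c_j)

      # Example: terms occurring 40 and 10 times, co-occurring 8 times.
      print(inclusion_index(8, 40, 10))    # 0.8  -> tends to surface general, inclusive links
      print(equivalence_index(8, 40, 10))  # 0.16 -> tends to surface specific, balanced links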
  15. Sumbatyan, M.A.; Khazagerov, G.G.: Tipy ruskikh omoform i ikx avtomaticheskoe razvedenie [Types of Russian homoforms and their automatic disambiguation] (1997) 0.01
    0.014283955 = product of:
      0.02856791 = sum of:
        0.02856791 = product of:
          0.05713582 = sum of:
            0.05713582 = weight(_text_:i in 2259) [ClassicSimilarity], result of:
              0.05713582 = score(doc=2259,freq=2.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.33337694 = fieldWeight in 2259, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2259)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  16. Diaz, I.; Morato, J.; Lloréns, J.: An algorithm for term conflation based on tree structures (2002) 0.01
    0.014283955 = product of:
      0.02856791 = sum of:
        0.02856791 = product of:
          0.05713582 = sum of:
            0.05713582 = weight(_text_:i in 246) [ClassicSimilarity], result of:
              0.05713582 = score(doc=246,freq=2.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.33337694 = fieldWeight in 246, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.0625 = fieldNorm(doc=246)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  17. Blair, D.C.: Information retrieval and the philosophy of language (2002) 0.01
    0.014283955 = product of:
      0.02856791 = sum of:
        0.02856791 = product of:
          0.05713582 = sum of:
            0.05713582 = weight(_text_:i in 4283) [ClassicSimilarity], result of:
              0.05713582 = score(doc=4283,freq=8.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.33337694 = fieldWeight in 4283, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4283)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Information retrieval - the retrieval, primarily, of documents or textual material - is fundamentally a linguistic process. At the very least we must describe what we want and match that description with descriptions of the information that is available to us. Furthermore, when we describe what we want, we must mean something by that description. This is a deceptively simple act, but such linguistic events have been the grist for philosophical analysis since Aristotle. Although there are complexities involved in referring to authors, document types, or other categories of information retrieval context, here I wish to focus on one of the most problematic activities in information retrieval: the description of the intellectual content of information items. And even though I take information retrieval to involve the description and retrieval of written text, what I say here is applicable to any information item whose intellectual content can be described for retrieval - books, documents, images, audio clips, video clips, scientific specimens, engineering schematics, and so forth. For convenience, though, I will refer only to the description and retrieval of documents. The description of intellectual content can go wrong in many obvious ways. We may describe what we want incorrectly; we may describe it correctly but in such general terms that its description is useless for retrieval; or we may describe what we want correctly, but misinterpret the descriptions of available information, and thereby match our description of what we want incorrectly. From a linguistic point of view, we can be misunderstood in the process of retrieval in many ways. Because the philosophy of language deals specifically with how we are understood and misunderstood, it should have some use for understanding the process of description in information retrieval. First, however, let us examine more closely the kinds of misunderstandings that can occur in information retrieval. We use language in searching for information in two principal ways. We use it to describe what we want and to discriminate what we want from other information that is available to us but that we do not want. Description and discrimination together articulate the goals of the information search process; they also delineate the two principal ways in which language can fail us in this process. Van Rijsbergen (1979) was the first to make this distinction, calling them "representation" and "discrimination."
  18. Koppel, M.; Akiva, N.; Dagan, I.: Feature instability as a criterion for selecting potential style markers (2006) 0.01
    0.014283955 = product of:
      0.02856791 = sum of:
        0.02856791 = product of:
          0.05713582 = sum of:
            0.05713582 = weight(_text_:i in 6092) [ClassicSimilarity], result of:
              0.05713582 = score(doc=6092,freq=2.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.33337694 = fieldWeight in 6092, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6092)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  19. Toutanova, K.; Klein, D.; Manning, C.D.; Singer, Y.: Feature-rich Part-of-Speech Tagging with a cyclic dependency network (2003) 0.01
    0.012498461 = product of:
      0.024996921 = sum of:
        0.024996921 = product of:
          0.049993843 = sum of:
            0.049993843 = weight(_text_:i in 1059) [ClassicSimilarity], result of:
              0.049993843 = score(doc=1059,freq=2.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.29170483 = fieldWeight in 1059, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1059)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective use of priors in conditional loglinear models, and (iv) fine-grained modeling of unknown word features. Using these ideas together, the resulting tagger gives 97.24% accuracy on the Penn Treebank WSJ, an error reduction of 4.4% on the best previous single automatically learned tagging result.
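    To make ideas (i) and (ii) concrete, here is a minimal, illustrative sketch of the kind of feature extraction such a tagger relies on; the helper function and feature names are hypothetical and are not taken from the Stanford tagger's code.

      def tag_features(words, tags, i):
          # Features for position i, conditioning on both the previous and the
          # following tag (idea i) plus lexical context (idea ii). During training
          # both neighbouring gold tags are available; at test time the dependency
          # network scores joint tag assignments instead. Illustrative only.
          w = words[i]
          prev_w = words[i - 1] if i > 0 else "<s>"
          next_w = words[i + 1] if i + 1 < len(words) else "</s>"
          prev_t = tags[i - 1] if i > 0 else "<s>"
          next_t = tags[i + 1] if i + 1 < len(tags) else "</s>"
          return {
              "word=" + w.lower(): 1,
              "prev_word=" + prev_w.lower(): 1,
              "next_word=" + next_w.lower(): 1,
              "prev_tag=" + prev_t: 1,
              "next_tag=" + next_t: 1,
              "prev_tag+word=" + prev_t + "|" + w.lower(): 1,
              "suffix3=" + w[-3:].lower(): 1,
          }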
  20. Toutanova, K.; Manning, C.D.: Enriching the knowledge sources used in a maximum entropy Part-of-Speech Tagger (2000) 0.01
    0.012498461 = product of:
      0.024996921 = sum of:
        0.024996921 = product of:
          0.049993843 = sum of:
            0.049993843 = weight(_text_:i in 1060) [ClassicSimilarity], result of:
              0.049993843 = score(doc=1060,freq=2.0), product of:
                0.17138503 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.045439374 = queryNorm
                0.29170483 = fieldWeight in 1060, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1060)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This paper presents results for a maximum entropy-based part-of-speech tagger, which achieves superior performance principally by enriching the information sources used for tagging. In particular, we get improved results by incorporating these features: (i) more extensive treatment of capitalization for unknown words; (ii) features for the disambiguation of the tense forms of verbs; (iii) features for disambiguating particles from prepositions and adverbs. The best resulting accuracy for the tagger on the Penn Treebank is 96.86% overall, and 86.91% on previously unseen words.
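    As a hedged illustration of feature type (i), the capitalization treatment for unknown words, a small sketch follows; the feature set is my own guess at the flavour of such features, not the paper's exact list.

      def unknown_word_features(word, sentence_initial):
          # Capitalization and shape cues for words unseen in training
          # (feature type (i) above). Illustrative, not the paper's feature set.
          return {
              "initial_cap": word[:1].isupper(),
              "cap_not_sentence_initial": word[:1].isupper() and not sentence_initial,
              "all_caps": word.isupper(),
              "contains_digit": any(c.isdigit() for c in word),
              "contains_hyphen": "-" in word,
              "suffix2": word[-2:].lower(),
          }

      print(unknown_word_features("Vinken", sentence_initial=False))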

Types

  • a 82
  • el 13
  • m 9
  • s 7
  • p 3
  • x 3
  • d 1