Search (43 results, page 1 of 3)

Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.10

0.100861 = sum of:
  0.08030887 = product of:
    0.24092661 = sum of:
      0.24092661 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
        0.24092661 = score(doc=562,freq=2.0), product of:
          0.42868128 = queryWeight, product of:
            8.478011 = idf(docFreq=24, maxDocs=44218)
            0.050563898 = queryNorm
          0.56201804 = fieldWeight in 562, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            8.478011 = idf(docFreq=24, maxDocs=44218)
            0.046875 = fieldNorm(doc=562)
    0.33333334 = coord(1/3)
  0.02055213 = product of:
    0.04110426 = sum of:
      0.04110426 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
        0.04110426 = score(doc=562,freq=2.0), product of:
          0.17706616 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.050563898 = queryNorm
          0.23214069 = fieldWeight in 562, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=562)
    0.5 = coord(1/2)
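
The explain tree above can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The helper names `idf` and `term_score` are illustrative and not part of any library. The `queryNorm` and `fieldNorm` values are copied from the listing rather than derived.

```python
import math

MAX_DOCS = 44218          # maxDocs, as shown in the explain output
QUERY_NORM = 0.050563898  # queryNorm, copied from the explain output


def idf(doc_freq: int, max_docs: int) -> float:
    """ClassicSimilarity inverse document frequency."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                        # tf(freq=2.0) -> 1.4142135
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM        # idf * queryNorm
    field_weight = tf * term_idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


# Term "3a" (docFreq=24) and term "22" (docFreq=3622) in doc 562.
score_3a = term_score(freq=2.0, doc_freq=24, field_norm=0.046875)
score_22 = term_score(freq=2.0, doc_freq=3622, field_norm=0.046875)

# Each clause is scaled by its coord factor before the final sum.
total = score_3a * (1 / 3) + score_22 * (1 / 2)

print(f"3a: {score_3a:.6f}  22: {score_22:.6f}  total: {total:.6f}")
```

The result should agree with the listed 0.100861 up to the single-precision rounding Lucene applies internally. The other explain trees on this page follow the same pattern with different `freq`, `docFreq`, `fieldNorm` and `coord` values.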

Content: Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
Date: 8. 1.2013 10:22:32

Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.04

0.040154435 = product of:
  0.08030887 = sum of:
    0.08030887 = product of:
      0.24092661 = sum of:
        0.24092661 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
          0.24092661 = score(doc=862,freq=2.0), product of:
            0.42868128 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.050563898 = queryNorm
            0.56201804 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=862)
      0.33333334 = coord(1/3)
  0.5 = coord(1/2)

Source: https://arxiv.org/abs/2212.06721

Savoy, J.: Searching strategies for the Hungarian language (2008) 0.04
```
0.039527763 = product of:
  0.079055525 = sum of:
    0.079055525 = product of:
      0.15811105 = sum of:
        0.15811105 = weight(_text_:light in 2037) [ClassicSimilarity], result of:
          0.15811105 = score(doc=2037,freq=4.0), product of:
            0.2920221 = queryWeight, product of:
              5.7753086 = idf(docFreq=372, maxDocs=44218)
              0.050563898 = queryNorm
            0.5414352 = fieldWeight in 2037, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.7753086 = idf(docFreq=372, maxDocs=44218)
              0.046875 = fieldNorm(doc=2037)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This paper reports on the underlying IR problems encountered when dealing with the complex morphology and compound constructions found in the Hungarian language. It describes evaluations carried out on two general stemming strategies for this language, and also demonstrates that a light stemming approach could be quite effective. Based on searches done on the CLEF test collection, we find that a more aggressive suffix-stripping approach may produce better MAP. When compared to an IR scheme without stemming or one based on only a light stemmer, we find the differences to be statistically significant. When compared with probabilistic, vector-space and language models, we find that the Okapi model results in the best retrieval effectiveness. The resulting MAP is found to be about 35% better than the classical tf-idf approach, particularly for very short requests. Finally, we demonstrate that applying an automatic decompounding procedure for both queries and documents significantly improves IR performance (+10%), compared to word-based indexing strategies.

Rayson, P.; Piao, S.; Sharoff, S.; Evert, S.; Moiron, B.V.: Multiword expressions : hard going or plain sailing? (2015) 0.03
```
0.03260874 = product of:
  0.06521748 = sum of:
    0.06521748 = product of:
      0.13043496 = sum of:
        0.13043496 = weight(_text_:light in 2918) [ClassicSimilarity], result of:
          0.13043496 = score(doc=2918,freq=2.0), product of:
            0.2920221 = queryWeight, product of:
              5.7753086 = idf(docFreq=372, maxDocs=44218)
              0.050563898 = queryNorm
            0.44666123 = fieldWeight in 2918, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.7753086 = idf(docFreq=372, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2918)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Over the past two decades or so, Multi-Word Expressions (MWEs; also called Multi-word Units) have been an increasingly important concern for Computational Linguistics and Natural Language Processing (NLP). The term MWE has been used to refer to various types of linguistic units and expressions, including idioms, noun compounds, phrasal verbs, light verbs and other habitual collocations. However, while there is no universally agreed definition for MWE as yet, most researchers use the term to refer to those frequently occurring phrasal units which are subject to a certain level of semantic opaqueness, or non-compositionality. Non-compositional MWEs pose tough challenges for automatic analysis because their interpretation cannot be achieved by directly combining the semantics of their constituents, thereby causing the "pain in the neck of NLP".

Chou, C.; Chu, T.: ¬An analysis of BERT (NLP) for assisted subject indexing for Project Gutenberg (2022) 0.03
```
0.03260874 = product of:
  0.06521748 = sum of:
    0.06521748 = product of:
      0.13043496 = sum of:
        0.13043496 = weight(_text_:light in 1139) [ClassicSimilarity], result of:
          0.13043496 = score(doc=1139,freq=2.0), product of:
            0.2920221 = queryWeight, product of:
              5.7753086 = idf(docFreq=372, maxDocs=44218)
              0.050563898 = queryNorm
            0.44666123 = fieldWeight in 1139, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.7753086 = idf(docFreq=372, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1139)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

In light of AI (Artificial Intelligence) and NLP (Natural language processing) technologies, this article examines the feasibility of using AI/NLP models to enhance the subject indexing of digital resources. While BERT (Bidirectional Encoder Representations from Transformers) models are widely used in scholarly communities, the authors assess whether BERT models can be used in machine-assisted indexing in the Project Gutenberg collection, through suggesting Library of Congress subject headings filtered by certain Library of Congress Classification subclass labels. The findings of this study are informative for further research on BERT models to assist with automatic subject indexing for digital library collections.

Warner, A.J.: Natural language processing (1987) 0.03

0.027402842 = product of:
  0.054805685 = sum of:
    0.054805685 = product of:
      0.10961137 = sum of:
        0.10961137 = weight(_text_:22 in 337) [ClassicSimilarity], result of:
          0.10961137 = score(doc=337,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.61904186 = fieldWeight in 337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=337)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Annual review of information science and technology. 22(1987), S.79-108

McMahon, J.G.; Smith, F.J.: Improved statistical language model performance with automatic generated word hierarchies (1996) 0.02

0.023977486 = product of:
  0.047954973 = sum of:
    0.047954973 = product of:
      0.095909946 = sum of:
        0.095909946 = weight(_text_:22 in 3164) [ClassicSimilarity], result of:
          0.095909946 = score(doc=3164,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.5416616 = fieldWeight in 3164, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=3164)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Computational linguistics. 22(1996) no.2, S.217-248

Ruge, G.: ¬A spreading activation network for automatic generation of thesaurus relationships (1991) 0.02

0.023977486 = product of:
  0.047954973 = sum of:
    0.047954973 = product of:
      0.095909946 = sum of:
        0.095909946 = weight(_text_:22 in 4506) [ClassicSimilarity], result of:
          0.095909946 = score(doc=4506,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.5416616 = fieldWeight in 4506, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=4506)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 8.10.2000 11:52:22

Somers, H.: Example-based machine translation : Review article (1999) 0.02

0.023977486 = product of:
  0.047954973 = sum of:
    0.047954973 = product of:
      0.095909946 = sum of:
        0.095909946 = weight(_text_:22 in 6672) [ClassicSimilarity], result of:
          0.095909946 = score(doc=6672,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.5416616 = fieldWeight in 6672, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6672)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 31. 7.1996 9:22:19

Baayen, R.H.; Lieber, H.: Word frequency distributions and lexical semantics (1997) 0.02

0.023977486 = product of:
  0.047954973 = sum of:
    0.047954973 = product of:
      0.095909946 = sum of:
        0.095909946 = weight(_text_:22 in 3117) [ClassicSimilarity], result of:
          0.095909946 = score(doc=3117,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.5416616 = fieldWeight in 3117, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=3117)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 28. 2.1999 10:48:22

¬Der Student aus dem Computer (2023) 0.02

0.023977486 = product of:
  0.047954973 = sum of:
    0.047954973 = product of:
      0.095909946 = sum of:
        0.095909946 = weight(_text_:22 in 1079) [ClassicSimilarity], result of:
          0.095909946 = score(doc=1079,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.5416616 = fieldWeight in 1079, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=1079)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 27. 1.2023 16:22:55

Byrne, C.C.; McCracken, S.A.: ¬An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval (1999) 0.02

0.02055213 = product of:
  0.04110426 = sum of:
    0.04110426 = product of:
      0.08220852 = sum of:
        0.08220852 = weight(_text_:22 in 4483) [ClassicSimilarity], result of:
          0.08220852 = score(doc=4483,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.46428138 = fieldWeight in 4483, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=4483)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 15. 3.2000 10:22:37

Monnerjahn, P.: Vorsprung ohne Technik : Übersetzen: Computer und Qualität (2000) 0.02

0.02055213 = product of:
  0.04110426 = sum of:
    0.04110426 = product of:
      0.08220852 = sum of:
        0.08220852 = weight(_text_:22 in 5429) [ClassicSimilarity], result of:
          0.08220852 = score(doc=5429,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.46428138 = fieldWeight in 5429, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=5429)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: c't. 2000, H.22, S.230-231

Hutchins, J.: From first conception to first demonstration : the nascent years of machine translation, 1947-1954. A chronology (1997) 0.02

0.017126776 = product of:
  0.034253553 = sum of:
    0.034253553 = product of:
      0.068507105 = sum of:
        0.068507105 = weight(_text_:22 in 1463) [ClassicSimilarity], result of:
          0.068507105 = score(doc=1463,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.38690117 = fieldWeight in 1463, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=1463)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 31. 7.1996 9:22:19

Kuhlmann, U.; Monnerjahn, P.: Sprache auf Knopfdruck : Sieben automatische Übersetzungsprogramme im Test (2000) 0.02

0.017126776 = product of:
  0.034253553 = sum of:
    0.034253553 = product of:
      0.068507105 = sum of:
        0.068507105 = weight(_text_:22 in 5428) [ClassicSimilarity], result of:
          0.068507105 = score(doc=5428,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.38690117 = fieldWeight in 5428, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=5428)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: c't. 2000, H.22, S.220-229

Lezius, W.; Rapp, R.; Wettler, M.: ¬A morphology-system and part-of-speech tagger for German (1996) 0.02

0.017126776 = product of:
  0.034253553 = sum of:
    0.034253553 = product of:
      0.068507105 = sum of:
        0.068507105 = weight(_text_:22 in 1693) [ClassicSimilarity], result of:
          0.068507105 = score(doc=1693,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.38690117 = fieldWeight in 1693, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=1693)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 22. 3.2015 9:37:18

Wanner, L.: Lexical choice in text generation and machine translation (1996) 0.01

0.013701421 = product of:
  0.027402842 = sum of:
    0.027402842 = product of:
      0.054805685 = sum of:
        0.054805685 = weight(_text_:22 in 8521) [ClassicSimilarity], result of:
          0.054805685 = score(doc=8521,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.30952093 = fieldWeight in 8521, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=8521)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 31. 7.1996 9:22:19

Riloff, E.: ¬An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.01

0.013701421 = product of:
  0.027402842 = sum of:
    0.027402842 = product of:
      0.054805685 = sum of:
        0.054805685 = weight(_text_:22 in 6752) [ClassicSimilarity], result of:
          0.054805685 = score(doc=6752,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.30952093 = fieldWeight in 6752, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6752)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 6. 3.1997 16:22:15

Basili, R.; Pazienza, M.T.; Velardi, P.: ¬An empirical symbolic approach to natural language processing (1996) 0.01

0.013701421 = product of:
  0.027402842 = sum of:
    0.027402842 = product of:
      0.054805685 = sum of:
        0.054805685 = weight(_text_:22 in 6753) [ClassicSimilarity], result of:
          0.054805685 = score(doc=6753,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.30952093 = fieldWeight in 6753, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=6753)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 6. 3.1997 16:22:15

Haas, S.W.: Natural language processing : toward large-scale, robust systems (1996) 0.01
```
0.013701421 = product of:
  0.027402842 = sum of:
    0.027402842 = product of:
      0.054805685 = sum of:
        0.054805685 = weight(_text_:22 in 7415) [ClassicSimilarity], result of:
          0.054805685 = score(doc=7415,freq=2.0), product of:
            0.17706616 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050563898 = queryNorm
            0.30952093 = fieldWeight in 7415, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=7415)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

State-of-the-art review of natural language processing, updating an earlier review published in ARIST 22(1987). Discusses important developments that have allowed for significant advances in the field of natural language processing: materials and resources; knowledge-based systems and statistical approaches; and a strong emphasis on evaluation. Reviews some natural language processing applications and common problems still awaiting solution. Considers closely related applications such as language generation and the generation phase of machine translation, which face the same problems as natural language processing. Covers natural language methodologies for information retrieval only briefly.
