Search (56 results, page 1 of 3)

  • theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.11
    0.113088265 = product of:
      0.28272066 = sum of:
        0.24151587 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
          0.24151587 = score(doc=562,freq=2.0), product of:
            0.42972976 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.050687566 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.04120479 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
          0.04120479 = score(doc=562,freq=2.0), product of:
            0.17749922 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050687566 = queryNorm
            0.23214069 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.4 = coord(2/5)
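The score breakdown above is Lucene "explain" output under the classic TF-IDF similarity. As a minimal sketch, assuming Lucene's ClassicSimilarity formulas (the helper names below are mine, not from the page or any API), the displayed value 0.113088265 can be reproduced from the numbers in the tree:

```python
import math

# Minimal sketch of Lucene ClassicSimilarity (TF-IDF) scoring, fed with the
# numbers shown in the explain tree for result 1 (doc 562). Helper names are
# illustrative only.

def idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    tf = math.sqrt(freq)                 # tf = sqrt(termFreq)
    i = idf(doc_freq, max_docs)
    query_weight = i * query_norm        # query-side weight
    field_weight = tf * i * field_norm   # document-side weight
    return query_weight * field_weight

query_norm = 0.050687566
score_3a = term_score(2.0, 24, 44218, query_norm, 0.046875)    # _text_:3a
score_22 = term_score(2.0, 3622, 44218, query_norm, 0.046875)  # _text_:22
total = (score_3a + score_22) * (2 / 5)  # coord(2/5): 2 of 5 query terms match
print(total)
```

Each per-term line in the tree is queryWeight × fieldWeight; the final 0.4 factor is the coordination factor for matching 2 of 5 query terms.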
    
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
  2. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.05
    Abstract
    The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language System (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
  3. Computational linguistics for the new millennium : divergence or synergy? Proceedings of the International Symposium held at the Ruprecht-Karls Universität Heidelberg, 21-22 July 2000. Festschrift in honour of Peter Hellwig on the occasion of his 60th birthday (2002) 0.05
    Abstract
    The two seemingly conflicting tendencies, synergy and divergence, are both fundamental to the advancement of any science. Their interplay defines the demarcation line between application-oriented and theoretical research. The papers in this festschrift in honour of Peter Hellwig are geared to answer the questions that arise from this insight: where does the discipline of Computational Linguistics currently stand, what has been achieved so far, and what should be done next? Given the complexity of such questions, no simple answers can be expected. However, each of the practitioners and researchers contributes, from their very own perspective, a piece of insight into the overall picture of today's and tomorrow's computational linguistics.
  4. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.05
    Source
    https://arxiv.org/abs/2212.06721
  5. Warner, A.J.: Natural language processing (1987) 0.02
    Source
    Annual review of information science and technology. 22(1987), S.79-108
  6. Wacholder, N.; Byrd, R.J.: Retrieving information from full text using linguistic knowledge (1994) 0.02
    Abstract
    Examines how techniques in the field of natural language processing can be applied to the analysis of text in information retrieval. State-of-the-art text searching programs cannot distinguish, for example, between occurrences of the sickness AIDS and aids as a tool, or between library school and school library, nor equate such terms as online and on-line, which are variants of the same form. To make these distinctions, systems must incorporate knowledge about the meaning of words in context. Research in natural language processing has concentrated on the automatic 'understanding' of language; how to analyze the grammatical structure and meaning of text. Although many aspects of this research remain experimental, describes how these techniques can be used to recognize spelling variants, names, acronyms, and abbreviations.
  7. Bookstein, A.; Kulyukin, V.; Raita, T.; Nicholson, J.: Adapting measures of clumping strength to assess term-term similarity (2003) 0.02
    Abstract
    Automated information retrieval relies heavily on statistical regularities that emerge as terms are deposited to produce text. This paper examines statistical patterns expected of a pair of terms that are semantically related to each other. Guided by a conceptualization of the text generation process, we derive measures of how tightly two terms are semantically associated. Our main objective is to probe whether such measures yield reasonable results. Specifically, we examine how the tendency of a content-bearing term to clump, as quantified by previously developed measures of term clumping, is influenced by the presence of other terms. This approach allows us to present a toolkit from which a range of measures can be constructed. As an illustration, one of several suggested measures is evaluated on a large text corpus built from an on-line encyclopedia.
  8. Sebastiani, F.: ¬A tutorial on automated text categorisation (1999) 0.02
    Abstract
    The automated categorisation (or classification) of texts into topical categories has a long history, dating back at least to 1960. Until the late '80s, the dominant approach to the problem involved knowledge-engineering automatic categorisers, i.e. manually building a set of rules encoding expert knowledge on how to classify documents. In the '90s, with the booming production and availability of on-line documents, automated text categorisation has witnessed an increased and renewed interest. A newer paradigm based on machine learning has superseded the previous approach. Within this paradigm, a general inductive process automatically builds a classifier by "learning", from a set of previously classified documents, the characteristics of one or more categories; the advantages are very good effectiveness, considerable savings in terms of expert manpower, and domain independence. In this tutorial we look at the main approaches that have been taken towards automatic text categorisation within the general machine learning paradigm. Issues of document indexing, classifier construction, and classifier evaluation will be touched upon.
  9. Spitkovsky, V.I.; Chang, A.X.: ¬A cross-lingual dictionary for english Wikipedia concepts (2012) 0.02
    Abstract
    We present a resource for automatically associating strings of text with English Wikipedia concepts. Our machinery is bi-directional, in the sense that it uses the same fundamental probabilistic methods to map strings to empirical distributions over Wikipedia articles as it does to map article URLs to distributions over short, language-independent strings of natural language text. For maximal interoperability, we release our resource as a set of flat line-based text files, lexicographically sorted and encoded with UTF-8. These files capture joint probability distributions underlying concepts (we use the terms article, concept and Wikipedia URL interchangeably) and associated snippets of text, as well as other features that can come in handy when working with Wikipedia articles and related information.
  10. McMahon, J.G.; Smith, F.J.: Improved statistical language model performance with automatic generated word hierarchies (1996) 0.02
    Source
    Computational linguistics. 22(1996) no.2, S.217-248
  11. Ruge, G.: ¬A spreading activation network for automatic generation of thesaurus relationships (1991) 0.02
    Date
    8.10.2000 11:52:22
  12. Somers, H.: Example-based machine translation : Review article (1999) 0.02
    Date
    31. 7.1996 9:22:19
  13. New tools for human translators (1997) 0.02
    Date
    31. 7.1996 9:22:19
  14. Baayen, R.H.; Lieber, H.: Word frequency distributions and lexical semantics (1997) 0.02
    Date
    28. 2.1999 10:48:22
  15. ¬Der Student aus dem Computer (2023) 0.02
    Date
    27. 1.2023 16:22:55
  16. Byrne, C.C.; McCracken, S.A.: ¬An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval (1999) 0.02
    Date
    15. 3.2000 10:22:37
  17. Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009) 0.02
    Date
    1. 3.2013 14:56:22
  18. Monnerjahn, P.: Vorsprung ohne Technik : Übersetzen: Computer und Qualität (2000) 0.02
    Source
    c't. 2000, H.22, S.230-231
  19. Hutchins, J.: From first conception to first demonstration : the nascent years of machine translation, 1947-1954. A chronology (1997) 0.01
    Date
    31. 7.1996 9:22:19
  20. Kuhlmann, U.; Monnerjahn, P.: Sprache auf Knopfdruck : Sieben automatische Übersetzungsprogramme im Test (2000) 0.01
    Source
    c't. 2000, H.22, S.220-229

Languages

  • e 40
  • d 16

Types

  • a 42
  • el 7
  • m 5
  • s 3
  • p 2
  • x 2
  • d 1