Search (201 results, page 1 of 11)

  • Active filter: theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.31
    0.30909637 = product of:
      0.37091565 = sum of:
        0.07054476 = product of:
          0.21163426 = sum of:
            0.21163426 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.21163426 = score(doc=562,freq=2.0), product of:
                0.37656134 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.044416238 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.031359423 = weight(_text_:web in 562) [ClassicSimilarity], result of:
          0.031359423 = score(doc=562,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.21634221 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.21163426 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.21163426 = score(doc=562,freq=2.0), product of:
            0.37656134 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.044416238 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.039323866 = weight(_text_:computer in 562) [ClassicSimilarity], result of:
          0.039323866 = score(doc=562,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.24226204 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.01805336 = product of:
          0.03610672 = sum of:
            0.03610672 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.03610672 = score(doc=562,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.5 = coord(1/2)
      0.8333333 = coord(5/6)
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
    Imprint
    Washington, DC : IEEE Computer Society
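Editor's note: the indented trees under each hit are Lucene "explain" output for the ClassicSimilarity (TF-IDF) ranking. Each leaf clause scores queryWeight × fieldWeight, with idf = 1 + ln(maxDocs/(docFreq+1)) and tf = sqrt(freq); the clause sum is then scaled by a coordination factor. As a minimal sketch, the first clause and the final score of result 1 can be reproduced from the numbers printed above (queryNorm and fieldNorm are copied verbatim, since they derive from index statistics not shown here):

```python
import math

def clause_score(doc_freq, freq, query_norm, field_norm, max_docs=44218):
    """One leaf clause of Lucene ClassicSimilarity: queryWeight * fieldWeight."""
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))        # 8.478011 for docFreq=24
    query_weight = idf * query_norm                        # idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm      # tf * idf * fieldNorm
    return query_weight * field_weight

s = clause_score(doc_freq=24, freq=2.0,
                 query_norm=0.044416238, field_norm=0.046875)
assert abs(s - 0.21163426) < 1e-5   # the "_text_:3a" clause above

# The top-level score sums the five printed clause values (two of which
# already include a nested coord factor) and applies coord(5/6):
clauses = [0.07054476, 0.031359423, 0.21163426, 0.039323866, 0.01805336]
total = sum(clauses) * (5 / 6)
assert abs(total - 0.30909637) < 1e-6
print(f"clause={s:.8f}  total={total:.8f}")
```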
  2. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.22
    0.22115356 = product of:
      0.33173034 = sum of:
        0.062718846 = weight(_text_:web in 563) [ClassicSimilarity], result of:
          0.062718846 = score(doc=563,freq=8.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.43268442 = fieldWeight in 563, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.21163426 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
          0.21163426 = score(doc=563,freq=2.0), product of:
            0.37656134 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.044416238 = queryNorm
            0.56201804 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.039323866 = weight(_text_:computer in 563) [ClassicSimilarity], result of:
          0.039323866 = score(doc=563,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.24226204 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.01805336 = product of:
          0.03610672 = sum of:
            0.03610672 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
              0.03610672 = score(doc=563,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.23214069 = fieldWeight in 563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=563)
          0.5 = coord(1/2)
      0.6666667 = coord(4/6)
    
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with the LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language- and domain-independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human-written summaries in a large collection of web pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the alignment process from a training set and focuses on selecting high-quality multi-word terms from human-written summaries to generate suitable results for web-page summarization.
    Content
    A thesis presented to The University of Guelph in partial fulfilment of the requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
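Editor's note: the record above names the LocalMaxs algorithm but not the thesis's three association measures. As an illustrative sketch only, the following shows the general shape of glue-based multi-word term extraction in the LocalMaxs style: score each n-gram with a cohesion ("glue") value and keep those whose glue is a local maximum relative to the n-grams they contain or that contain them. The Dice-style glue function here is an assumption for illustration, not the thesis's measures.

```python
from collections import Counter

def ngram_counts(tokens, max_n=4):
    c = Counter()
    for n in range(2, max_n + 1):
        for i in range(len(tokens) - n + 1):
            c[tuple(tokens[i:i + n])] += 1
    return c

def glue(g, counts, unigrams):
    # Dice-style cohesion: n * f(g) / sum of member-word frequencies.
    denom = sum(unigrams[w] for w in g)
    return len(g) * counts[g] / denom if denom else 0.0

def extract_terms(tokens, max_n=4):
    unigrams = Counter(tokens)
    counts = ngram_counts(tokens, max_n)
    terms = []
    for g in counts:
        n = len(g)
        subs = [g[:-1], g[1:]] if n > 2 else []           # contained (n-1)-grams
        supers = [s for s in counts                       # containing (n+1)-grams
                  if len(s) == n + 1 and (s[:-1] == g or s[1:] == g)]
        gl = glue(g, counts, unigrams)
        if all(gl >= glue(s, counts, unigrams) for s in subs) and \
           all(gl > glue(s, counts, unigrams) for s in supers):
            terms.append((round(gl, 3), " ".join(g)))
    return sorted(terms, reverse=True)

text = ("multi word term extraction finds multi word terms ; "
        "multi word term extraction needs no training data").split()
print(extract_terms(text))
```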
  3. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.11
    0.10830758 = product of:
      0.21661516 = sum of:
        0.045263432 = weight(_text_:web in 2541) [ClassicSimilarity], result of:
          0.045263432 = score(doc=2541,freq=6.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.3122631 = fieldWeight in 2541, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2541)
        0.04634362 = weight(_text_:computer in 2541) [ClassicSimilarity], result of:
          0.04634362 = score(doc=2541,freq=4.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.28550854 = fieldWeight in 2541, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2541)
        0.1250081 = sum of:
          0.08245592 = weight(_text_:programs in 2541) [ClassicSimilarity], result of:
            0.08245592 = score(doc=2541,freq=2.0), product of:
              0.25748047 = queryWeight, product of:
                5.79699 = idf(docFreq=364, maxDocs=44218)
                0.044416238 = queryNorm
              0.32024145 = fieldWeight in 2541, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.79699 = idf(docFreq=364, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2541)
          0.04255218 = weight(_text_:22 in 2541) [ClassicSimilarity], result of:
            0.04255218 = score(doc=2541,freq=4.0), product of:
              0.1555381 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.044416238 = queryNorm
              0.27358043 = fieldWeight in 2541, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2541)
      0.5 = coord(3/6)
    
    Abstract
    The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response times, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language System (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
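Editor's note: the record does not spell out ChemSpell's similarity computation, so here is a generic sketch of the standard approach such suggesters build on: rank dictionary entries by Levenshtein edit distance to the query term. The function names, the cutoff, and the tiny vocabulary below are illustrative assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def suggest(word, vocabulary, max_dist=2, k=5):
    """Return up to k vocabulary entries within max_dist edits of word."""
    scored = sorted((edit_distance(word, v), v) for v in vocabulary)
    return [v for d, v in scored if d <= max_dist][:k]

print(suggest("acetaminophin", ["acetaminophen", "acetone", "aspirin"]))
```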
  4. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.09
    0.094059676 = product of:
      0.28217903 = sum of:
        0.07054476 = product of:
          0.21163426 = sum of:
            0.21163426 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.21163426 = score(doc=862,freq=2.0), product of:
                0.37656134 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.044416238 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
        0.21163426 = weight(_text_:2f in 862) [ClassicSimilarity], result of:
          0.21163426 = score(doc=862,freq=2.0), product of:
            0.37656134 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.044416238 = queryNorm
            0.56201804 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=862)
      0.33333334 = coord(2/6)
    
    Source
    https://arxiv.org/abs/2212.06721
  5. Yang, C.C.; Luk, J.: Automatic generation of English/Chinese thesaurus based on a parallel corpus in laws (2003) 0.08
    0.084475785 = product of:
      0.12671368 = sum of:
        0.033718713 = weight(_text_:wide in 1616) [ClassicSimilarity], result of:
          0.033718713 = score(doc=1616,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.171337 = fieldWeight in 1616, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1616)
        0.036585998 = weight(_text_:web in 1616) [ClassicSimilarity], result of:
          0.036585998 = score(doc=1616,freq=8.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.25239927 = fieldWeight in 1616, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1616)
        0.04587784 = weight(_text_:computer in 1616) [ClassicSimilarity], result of:
          0.04587784 = score(doc=1616,freq=8.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.28263903 = fieldWeight in 1616, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1616)
        0.010531127 = product of:
          0.021062255 = sum of:
            0.021062255 = weight(_text_:22 in 1616) [ClassicSimilarity], result of:
              0.021062255 = score(doc=1616,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.1354154 = fieldWeight in 1616, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1616)
          0.5 = coord(1/2)
      0.6666667 = coord(4/6)
    
    Abstract
    The information available in languages other than English on the World Wide Web is increasing significantly. According to a report from Computer Economics in 1999, 54% of Internet users are English speakers ("English Will Dominate Web for Only Three More Years," Computer Economics, July 9, 1999, http://www.computereconomics.com/new4/pr/pr990610.html). However, it is predicted that there will be only a 60% increase in Internet users among English speakers versus a 150% growth among non-English speakers over the next five years. By 2005, 57% of Internet users will be non-English speakers. A report by CNN.com in 2000 showed that the number of Internet users in China had increased from 8.9 million to 16.9 million from January to June 2000 ("Report: China Internet users double to 17 million," CNN.com, July, 2000, http://cnn.org/2000/TECH/computing/07/27/china.internet.reut/index.html). According to Nielsen/NetRatings, there was a dramatic leap from 22.5 million to 56.6 million Internet users from 2001 to 2002. China had become the second largest global at-home Internet population in 2002 (the US Internet population was 166 million) (Robyn Greenspan, "China Pulls Ahead of Japan," Internet.com, April 22, 2002, http://cyberatias.internet.com/big-picture/geographics/article/0,,5911_1013841,00.html). All of this evidence reveals the importance of cross-lingual research to satisfy needs in the near future. Digital library research has focused on structural and semantic interoperability in the past. Searching and retrieving objects across variations in protocols, formats and disciplines have been widely explored (Schatz, B., & Chen, H. (1999). Digital libraries: technological advances and social impacts. IEEE Computer, Special Issue on Digital Libraries, February, 32(2), 45-50.; Chen, H., Yen, J., & Yang, C.C. (1999). International activities: development of Asian digital libraries. IEEE Computer, Special Issue on Digital Libraries, 32(2), 48-49.). However, research on crossing language boundaries, especially between European and Oriental languages, is still in its initial stage. In this proposal, we put our focus on cross-lingual semantic interoperability by developing automatic generation of a cross-lingual thesaurus based on an English/Chinese parallel corpus. When searchers encounter retrieval problems, professional librarians usually consult the thesaurus to identify other relevant vocabularies. For the problem of searching across language boundaries, a cross-lingual thesaurus, generated by co-occurrence analysis and a Hopfield network, can be used to generate additional semantically relevant terms that cannot be obtained from a dictionary. In particular, the automatically generated cross-lingual thesaurus is able to capture unknown words that do not exist in a dictionary, such as names of persons, organizations, and events. Due to Hong Kong's unique historical background, both English and Chinese are used as official languages in all legal documents. Therefore, English/Chinese cross-lingual information retrieval is critical for applications in the courts and the government. In this paper, we develop an automatic thesaurus by means of a Hopfield network based on a parallel corpus collected from the Web site of the Department of Justice of the Hong Kong Special Administrative Region (HKSAR) Government. Experiments are conducted to measure the precision and recall of the automatically generated English/Chinese thesaurus. The results show that such a thesaurus is a promising tool for retrieving relevant terms, especially in a language different from that of the input term; the direct translation of the input term can also be retrieved in most cases.
    Footnote
    Part of a special issue: "Web retrieval and mining: A machine learning perspective"
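Editor's note: the co-occurrence step the abstract describes can be sketched as follows: from a document-aligned English/Chinese corpus, estimate an asymmetric association weight between term pairs; in the paper such weights feed a Hopfield network that spreads activation until convergence. The simple conditional-probability weighting below is an assumption for illustration, not the authors' exact formula.

```python
from collections import Counter
from itertools import product

def cooccurrence_weights(aligned_docs):
    """aligned_docs: list of (english_terms, chinese_terms) pairs taken
    from a document-aligned parallel corpus."""
    f, fj = Counter(), Counter()
    for en, zh in aligned_docs:
        en, zh = set(en), set(zh)
        for t in en | zh:
            f[t] += 1                       # document frequency per term
        for pair in product(en, zh):
            fj[pair] += 1                   # joint document frequency
    # Asymmetric association weight: P(chinese_term | english_term).
    return {(a, b): fj[(a, b)] / f[a] for (a, b) in fj}

docs = [({"ordinance", "court"}, {"条例", "法院"}),
        ({"ordinance"}, {"条例"})]
w = cooccurrence_weights(docs)
print(w[("ordinance", "条例")])   # 1.0: the pair always co-occurs
```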
  6. Chowdhury, G.G.: Natural language processing (2002) 0.07
    0.072387636 = product of:
      0.14477527 = sum of:
        0.057803504 = weight(_text_:wide in 4284) [ClassicSimilarity], result of:
          0.057803504 = score(doc=4284,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.29372054 = fieldWeight in 4284, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=4284)
        0.031359423 = weight(_text_:web in 4284) [ClassicSimilarity], result of:
          0.031359423 = score(doc=4284,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.21634221 = fieldWeight in 4284, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4284)
        0.05561234 = weight(_text_:computer in 4284) [ClassicSimilarity], result of:
          0.05561234 = score(doc=4284,freq=4.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.34261024 = fieldWeight in 4284, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.046875 = fieldNorm(doc=4284)
      0.5 = coord(3/6)
    
    Abstract
    Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things. NLP researchers aim to gather knowledge on how human beings understand and use language so that appropriate tools and techniques can be developed to make computer systems understand and manipulate natural languages to perform desired tasks. The foundations of NLP lie in a number of disciplines, namely computer and information sciences, linguistics, mathematics, electrical and electronic engineering, artificial intelligence and robotics, and psychology. Applications of NLP include a number of fields of study, such as machine translation, natural language text processing and summarization, user interfaces, multilingual and cross-language information retrieval (CLIR), speech recognition, artificial intelligence, and expert systems. One important application area that is relatively new and has not been covered in previous ARIST chapters on NLP relates to the proliferation of the World Wide Web and digital libraries.
  7. Kokol, P.; Podgorelec, V.; Zorman, M.; Kokol, T.; Njivar, T.: Computer and natural language texts : a comparison based on long-range correlations (1999) 0.05
    0.053696655 = product of:
      0.16108996 = sum of:
        0.07946275 = weight(_text_:computer in 4299) [ClassicSimilarity], result of:
          0.07946275 = score(doc=4299,freq=6.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.48954517 = fieldWeight in 4299, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4299)
        0.081627205 = product of:
          0.16325441 = sum of:
            0.16325441 = weight(_text_:programs in 4299) [ClassicSimilarity], result of:
              0.16325441 = score(doc=4299,freq=4.0), product of:
                0.25748047 = queryWeight, product of:
                  5.79699 = idf(docFreq=364, maxDocs=44218)
                  0.044416238 = queryNorm
                0.6340458 = fieldWeight in 4299, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.79699 = idf(docFreq=364, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4299)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    'Long-range power law correlation' (LRC) is defined as the maximal propagation distance of the effect of a disturbance within a system; it is found in many systems that can be represented as strings of symbols. LRC between characters has also been identified in natural language texts. The aim of this article is to show that long-range power law correlations can also be found in computer programs, meaning that some common laws hold for both natural language texts and computer programs. This fact makes it possible to draw parallels between these two different types of human writing, and also to measure the differences between them.
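Editor's note: the article's own measurement procedure is not given in this record. As a toy sketch of how such correlations are commonly estimated: map characters to a ±1 random walk and fit the exponent alpha in F(l) ~ l^alpha, where F(l) is the walk's fluctuation over windows of length l; alpha ≈ 0.5 indicates no long-range correlation, larger values indicate LRC. The vowel/consonant coding below is an arbitrary assumption.

```python
import math

def fluctuation_exponent(text, windows=(4, 8, 16, 32, 64, 128)):
    """Estimate alpha in F(l) ~ l^alpha for a +/-1 walk over the text.
    Assumes a text of at least a few hundred characters."""
    steps = [1 if ch.lower() in "aeiou" else -1 for ch in text]
    walk = [0]
    for s in steps:
        walk.append(walk[-1] + s)
    pts = []
    for l in windows:
        if 2 * l > len(walk):
            break
        disp = [(walk[i + l] - walk[i]) ** 2
                for i in range(0, len(walk) - l, l)]
        # F(l) = sqrt(mean squared displacement); keep logs for the fit
        pts.append((math.log(l), 0.5 * math.log(sum(disp) / len(disp))))
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    return (sum((x - mx) * (y - my) for x, y in pts)
            / sum((x - mx) ** 2 for x, _ in pts))

print(round(fluctuation_exponent("natural language text " * 200), 2))
```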
  8. Monnerjahn, P.: Vorsprung ohne Technik : Übersetzen: Computer und Qualität (2000) 0.05
    0.04911047 = product of:
      0.1473314 = sum of:
        0.11122468 = weight(_text_:computer in 5429) [ClassicSimilarity], result of:
          0.11122468 = score(doc=5429,freq=4.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.6852205 = fieldWeight in 5429, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.09375 = fieldNorm(doc=5429)
        0.03610672 = product of:
          0.07221344 = sum of:
            0.07221344 = weight(_text_:22 in 5429) [ClassicSimilarity], result of:
              0.07221344 = score(doc=5429,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.46428138 = fieldWeight in 5429, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=5429)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    The human translator is still superior to the computer where language is concerned. Translation software has improved, but the problems inherent in such systems remain.
    Source
    c't. 2000, H.22, S.230-231
  9. Sikkel, K.: Parsing schemata : a framework for specification and analysis of parsing algorithms (1996) 0.05
    0.048598986 = product of:
      0.14579695 = sum of:
        0.09632341 = weight(_text_:computer in 685) [ClassicSimilarity], result of:
          0.09632341 = score(doc=685,freq=12.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.59341836 = fieldWeight in 685, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.046875 = fieldNorm(doc=685)
        0.049473554 = product of:
          0.09894711 = sum of:
            0.09894711 = weight(_text_:programs in 685) [ClassicSimilarity], result of:
              0.09894711 = score(doc=685,freq=2.0), product of:
                0.25748047 = queryWeight, product of:
                  5.79699 = idf(docFreq=364, maxDocs=44218)
                  0.044416238 = queryNorm
                0.38428974 = fieldWeight in 685, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.79699 = idf(docFreq=364, maxDocs=44218)
                  0.046875 = fieldNorm(doc=685)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    Parsing, the syntactic analysis of language, has been studied extensively in computer science and computational linguistics. Computer programs and natural languages share an underlying theory of formal languages and require efficient parsing algorithms. This introduction reviews the theory of parsing from a novel perspective: it provides a formalism to capture the essential traits of a parser that abstracts from fine detail and allows a uniform description and comparison of a variety of parsers, including Earley, Tomita, LR, Left-Corner, and Head-Corner parsers. The emphasis is on context-free phrase structure grammar and how these parsers can be extended to unification formalisms. The book combines mathematical rigor with high readability and is suitable as a graduate course text.
    LCSH
    Computer algorithms
    Parsing (Computer grammar)
    Subject
    Computer algorithms
    Parsing (Computer grammar)
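Editor's note: to make the chart-parsing setting concrete, here is a toy Earley recognizer; it illustrates the predict/scan/complete steps that parsing schemata abstract over, and is not Sikkel's formalism itself. The item representation and the demo grammar are assumptions.

```python
def earley(tokens, grammar, start="S"):
    """Minimal Earley recognizer; grammar maps a nonterminal to a list
    of right-hand sides (tuples of symbols). An item is the tuple
    (head, rhs, dot, origin)."""
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))
    for i in range(len(tokens) + 1):
        added = True
        while added:                                  # closure at position i
            added = False
            for head, rhs, dot, origin in list(chart[i]):
                if dot < len(rhs) and rhs[dot] in grammar:        # predict
                    for prod in grammar[rhs[dot]]:
                        if (rhs[dot], prod, 0, i) not in chart[i]:
                            chart[i].add((rhs[dot], prod, 0, i)); added = True
                elif dot == len(rhs):                             # complete
                    for h2, r2, d2, o2 in list(chart[origin]):
                        if d2 < len(r2) and r2[d2] == head and \
                           (h2, r2, d2 + 1, o2) not in chart[i]:
                            chart[i].add((h2, r2, d2 + 1, o2)); added = True
        if i < len(tokens):                                       # scan
            for head, rhs, dot, origin in chart[i]:
                if dot < len(rhs) and rhs[dot] == tokens[i]:
                    chart[i + 1].add((head, rhs, dot + 1, origin))
    return any(h == start and d == len(r) and o == 0
               for h, r, d, o in chart[len(tokens)])

g = {"S": [("NP", "VP")], "NP": [("she",)], "VP": [("runs",)]}
print(earley(["she", "runs"], g))  # True
```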
  10. ¬Der Student aus dem Computer (2023) 0.04
    0.044626735 = product of:
      0.1338802 = sum of:
        0.09175568 = weight(_text_:computer in 1079) [ClassicSimilarity], result of:
          0.09175568 = score(doc=1079,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.56527805 = fieldWeight in 1079, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.109375 = fieldNorm(doc=1079)
        0.04212451 = product of:
          0.08424902 = sum of:
            0.08424902 = weight(_text_:22 in 1079) [ClassicSimilarity], result of:
              0.08424902 = score(doc=1079,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.5416616 = fieldWeight in 1079, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=1079)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Date
    27. 1.2023 16:22:55
  11. Schneider, R.: Web 3.0 ante portas? : Integration von Social Web und Semantic Web (2008) 0.04
    0.03928657 = product of:
      0.11785971 = sum of:
        0.09679745 = weight(_text_:web in 4184) [ClassicSimilarity], result of:
          0.09679745 = score(doc=4184,freq=14.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.6677857 = fieldWeight in 4184, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4184)
        0.021062255 = product of:
          0.04212451 = sum of:
            0.04212451 = weight(_text_:22 in 4184) [ClassicSimilarity], result of:
              0.04212451 = score(doc=4184,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.2708308 = fieldWeight in 4184, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4184)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    The Internet is changing as a medium, and with it the conditions under which content is published and received. What opportunities do the two currently and concurrently debated visions of its future, the Social Web and the Semantic Web, offer? To answer this question, the article examines the foundations of both models in terms of application and technology, but also highlights their shortcomings as well as the added value of combining them in a way suited to the medium. Using the grammatical online information system grammis as an example, a strategy for integratively exploiting the strengths of each is sketched.
    Date
    22. 1.2011 10:38:28
    Source
    Kommunikation, Partizipation und Wirkungen im Social Web, Band 1. Hrsg.: A. Zerfaß u.a
    Theme
    Semantic Web
  12. Ruge, G.: Sprache und Computer : Wortbedeutung und Termassoziation. Methoden zur automatischen semantischen Klassifikation (1995) 0.04
    0.038295243 = product of:
      0.114885725 = sum of:
        0.090814576 = weight(_text_:computer in 1534) [ClassicSimilarity], result of:
          0.090814576 = score(doc=1534,freq=6.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.5594802 = fieldWeight in 1534, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.0625 = fieldNorm(doc=1534)
        0.024071148 = product of:
          0.048142295 = sum of:
            0.048142295 = weight(_text_:22 in 1534) [ClassicSimilarity], result of:
              0.048142295 = score(doc=1534,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.30952093 = fieldWeight in 1534, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1534)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Content
    Contains the following chapters: (1) Motivation; (2) Language philosophical foundations; (3) Structural comparison of extensions; (4) Earlier approaches towards term association; (5) Experiments; (6) Spreading-activation networks or memory models; (7) Perspective. Appendices: Heads and modifiers of 'car'. Glossary. Index. Language and computer. Word semantics and term association. Methods towards an automatic semantic classification.
    Footnote
    Rev. in: Knowledge organization 22(1995) no.3/4, S.182-184 (M.T. Rolland)
    Series
    Sprache und Computer; Bd.14
  13. Working with conceptual structures : contributions to ICCS 2000. 8th International Conference on Conceptual Structures: Logical, Linguistic, and Computational Issues. Darmstadt, August 14-18, 2000 (2000) 0.04
    0.037475318 = product of:
      0.074950635 = sum of:
        0.033718713 = weight(_text_:wide in 5089) [ClassicSimilarity], result of:
          0.033718713 = score(doc=5089,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.171337 = fieldWeight in 5089, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.02734375 = fieldNorm(doc=5089)
        0.018292999 = weight(_text_:web in 5089) [ClassicSimilarity], result of:
          0.018292999 = score(doc=5089,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.12619963 = fieldWeight in 5089, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=5089)
        0.02293892 = weight(_text_:computer in 5089) [ClassicSimilarity], result of:
          0.02293892 = score(doc=5089,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.14131951 = fieldWeight in 5089, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.02734375 = fieldNorm(doc=5089)
      0.5 = coord(3/6)
    
    Abstract
    The 8th International Conference on Conceptual Structures - Logical, Linguistic, and Computational Issues (ICCS 2000) brings together a wide range of researchers and practitioners working with conceptual structures. During the last few years, the ICCS conference series has considerably widened its scope on different kinds of conceptual structures, stimulating research across domain boundaries. We hope that this stimulation is further enhanced by ICCS 2000 joining the long tradition of conferences in Darmstadt with extensive, lively discussions. This volume consists of contributions presented at ICCS 2000, complementing the volume "Conceptual Structures: Logical, Linguistic, and Computational Issues" (B. Ganter, G.W. Mineau (Eds.), LNAI 1867, Springer, Berlin-Heidelberg 2000). It contains submissions reviewed by the program committee, and position papers. We wish to express our appreciation to all the authors of submitted papers, to the general chair, the program chair, the editorial board, the program committee, and to the additional reviewers for making ICCS 2000 a valuable contribution in the knowledge processing research field. Special thanks go to the local organizers for making the conference an enjoyable and inspiring event. We are grateful to Darmstadt University of Technology, the Ernst Schröder Center for Conceptual Knowledge Processing, the Center for Interdisciplinary Studies in Technology, the Deutsche Forschungsgemeinschaft, Land Hessen, and NaviCon GmbH for their generous support
    Content
    Concepts & Language: Knowledge organization by procedures of natural language processing. A case study using the method GABEK (J. Zelger, J. Gadner) - Computer aided narrative analysis using conceptual graphs (H. Schärfe, P. Øhrstrøm) - Pragmatic representation of argumentative text: a challenge for the conceptual graph approach (H. Irandoust, B. Moulin) - Conceptual graphs as a knowledge representation core in a complex language learning environment (G. Angelova, A. Nenkova, S. Boycheva, T. Nikolov) - Conceptual Modeling and Ontologies: Relationships and actions in conceptual categories (Ch. Landauer, K.L. Bellman) - Concept approximations for formal concept analysis (J. Saquer, J.S. Deogun) - Faceted information representation (U. Priß) - Simple concept graphs with universal quantifiers (J. Tappe) - A framework for comparing methods for using or reusing multiple ontologies in an application (J. van Zyl, D. Corbett) - Designing task/method knowledge-based systems with conceptual graphs (M. Leclère, F. Trichet, Ch. Choquet) - A logical ontology (J. Farkas, J. Sarbo) - Algorithms and Tools: Fast concept analysis (Ch. Lindig) - A framework for conceptual graph unification (D. Corbett) - Visual CP representation of knowledge (H.D. Pfeiffer, R.T. Hartley) - Maximal isojoin for representing software textual specifications and detecting semantic anomalies (Th. Charnois) - Troika: using grids, lattices and graphs in knowledge acquisition (H.S. Delugach, B.E. Lampkin) - Open world theorem prover for conceptual graphs (J.E. Heaton, P. Kocura) - NetCare: a practical conceptual graphs software tool (S. Polovina, D. Strang) - CGWorld - a web based workbench for conceptual graphs management and applications (P. Dobrev, K. Toutanova) - Position papers: The edition project: Peirce's existential graphs (R. Müller) - Mining association rules using formal concept analysis (N. Pasquier) - Contextual logic summary (R. Wille) - Information channels and conceptual scaling (K.E. Wolff) - Spatial concepts - a rule exploration (S. Rudolph) - The TEXT-TO-ONTO learning environment (A. Mädche, St. Staab) - Controlling the semantics of metadata on audio-visual documents using ontologies (Th. Dechilly, B. Bachimont) - Building the ontological foundations of a terminology from natural language to conceptual graphs with Ribosome, a knowledge extraction system (Ch. Jacquelinet, A. Burgun) - CharGer: some lessons learned and new directions (H.S. Delugach) - Knowledge management using conceptual graphs (W.K. Pun)
  14. Thelwall, M.; Price, L.: Language evolution and the spread of ideas on the Web : a procedure for identifying emergent hybrid word (2006) 0.04
    0.03737321 = product of:
      0.11211963 = sum of:
        0.057803504 = weight(_text_:wide in 5896) [ClassicSimilarity], result of:
          0.057803504 = score(doc=5896,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.29372054 = fieldWeight in 5896, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=5896)
        0.054316122 = weight(_text_:web in 5896) [ClassicSimilarity], result of:
          0.054316122 = score(doc=5896,freq=6.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.37471575 = fieldWeight in 5896, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5896)
      0.33333334 = coord(2/6)
    
    Abstract
    Word usage is of interest to linguists for its own sake as well as to social scientists and others who seek to track the spread of ideas, for example, in public debates over political decisions. The historical evolution of language can be analyzed with the tools of corpus linguistics through evolving corpora and the Web. But word usage statistics can only be gathered for known words. In this article, techniques are described and tested for identifying new words from the Web, focusing on the case when the words are related to a topic and have a hybrid form with a common sequence of letters. The results highlight the need to employ a combination of search techniques and show the wide potential of hybrid word family investigations in linguistics and social science.
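Editor's note: a minimal sketch of the kind of procedure described above: scan text for tokens that contain a topical letter sequence but are not already known words. The seed stem, the known-word set, and the function name are illustrative assumptions, not the authors' pipeline.

```python
import re

def hybrid_candidates(text, stem, known):
    """Find candidate hybrid words: tokens that contain a given letter
    sequence (e.g. a topical stem) but are not in the known vocabulary."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(t for t in tokens if stem in t and t not in known)

print(hybrid_candidates("blogosphere and blogging beat the blog",
                        "blog", {"blog"}))
# ['blogging', 'blogosphere']
```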
  15. Computerlinguistik und Sprachtechnologie : Eine Einführung (2010) 0.04
    0.03695324 = product of:
      0.110859714 = sum of:
        0.0642156 = weight(_text_:computer in 1735) [ClassicSimilarity], result of:
          0.0642156 = score(doc=1735,freq=12.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.39561224 = fieldWeight in 1735, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.03125 = fieldNorm(doc=1735)
        0.046644114 = product of:
          0.09328823 = sum of:
            0.09328823 = weight(_text_:programs in 1735) [ClassicSimilarity], result of:
              0.09328823 = score(doc=1735,freq=4.0), product of:
                0.25748047 = queryWeight, product of:
                  5.79699 = idf(docFreq=364, maxDocs=44218)
                  0.044416238 = queryNorm
                0.36231187 = fieldWeight in 1735, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.79699 = idf(docFreq=364, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1735)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    LCSH
    Computer science
    Translators (Computer programs)
    Computer science
    Subject
    Computer science
    Translators (Computer programs)
    Computer science
  16. Kreymer, O.: ¬An evaluation of help mechanisms in natural language information retrieval systems (2002) 0.03
    0.03237579 = product of:
      0.09712737 = sum of:
        0.057803504 = weight(_text_:wide in 2557) [ClassicSimilarity], result of:
          0.057803504 = score(doc=2557,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.29372054 = fieldWeight in 2557, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=2557)
        0.039323866 = weight(_text_:computer in 2557) [ClassicSimilarity], result of:
          0.039323866 = score(doc=2557,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.24226204 = fieldWeight in 2557, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.046875 = fieldNorm(doc=2557)
      0.33333334 = coord(2/6)
    
    Abstract
    The field of natural language processing (NLP) demonstrates rapid changes in the design of information retrieval systems and human-computer interaction. While natural language is regarded as the most effective tool for information retrieval in the contemporary information environment, the systems using it are only beginning to emerge. This study attempts to evaluate the current state of NLP information retrieval systems from the user's point of view: what techniques are used by these systems to guide their users through the search process? The analysis focused on the structure and components of the systems' help mechanisms. Results of the study demonstrated that systems which claimed to be using natural language searching in fact used a wide range of information retrieval techniques, from real natural language processing to Boolean searching. As a result, the user assistance mechanisms of these systems also varied. While pseudo-NLP systems would suit a more traditional method of instruction, real NLP systems primarily utilised the methods of explanation and user-system dialogue.
  17. Perovsek, M.; Kranjca, J.; Erjaveca, T.; Cestnika, B.; Lavraca, N.: TextFlows : a visual programming platform for text mining and natural language processing (2016) 0.03
    0.027890932 = product of:
      0.08367279 = sum of:
        0.04434892 = weight(_text_:web in 2697) [ClassicSimilarity], result of:
          0.04434892 = score(doc=2697,freq=4.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.3059541 = fieldWeight in 2697, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=2697)
        0.039323866 = weight(_text_:computer in 2697) [ClassicSimilarity], result of:
          0.039323866 = score(doc=2697,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.24226204 = fieldWeight in 2697, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.046875 = fieldNorm(doc=2697)
      0.33333334 = coord(2/6)
    
    Abstract
    Text mining and natural language processing are fast growing areas of research, with numerous applications in business, science and creative industries. This paper presents TextFlows, a web-based text mining and natural language processing platform supporting workflow construction, sharing and execution. The platform enables visual construction of text mining workflows through a web browser, and the execution of the constructed workflows on a processing cloud. This makes TextFlows an adaptable infrastructure for the construction and sharing of text processing workflows, which can be reused in various applications. The paper presents the implemented text mining and language processing modules, and describes some precomposed workflows. Their features are demonstrated on three use cases: comparison of document classifiers and of different part-of-speech taggers on a text categorization problem, and outlier detection in document corpora.
    Source
    Science of computer programming. In Press, 2016
  18. Hmeidi, I.I.; Al-Shalabi, R.F.; Al-Taani, A.T.; Najadat, H.; Al-Hazaimeh, S.A.: ¬A novel approach to the extraction of roots from Arabic words using bigrams (2010) 0.03
    0.026979826 = product of:
      0.08093948 = sum of:
        0.04816959 = weight(_text_:wide in 3426) [ClassicSimilarity], result of:
          0.04816959 = score(doc=3426,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.24476713 = fieldWeight in 3426, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3426)
        0.03276989 = weight(_text_:computer in 3426) [ClassicSimilarity], result of:
          0.03276989 = score(doc=3426,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.20188503 = fieldWeight in 3426, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3426)
      0.33333334 = coord(2/6)
    
    Abstract
    Root extraction is one of the most important topics in information retrieval (IR), natural language processing (NLP), text summarization, and many other important fields. In the last two decades, several algorithms have been proposed to extract Arabic roots. Most of these algorithms dealt with triliteral roots only, and some with fixed-length words only. In this study, a novel approach to the extraction of roots from Arabic words using bigrams is proposed. Two similarity measures are used: the dissimilarity measure called the Manhattan distance, and Dice's measure of similarity. The proposed algorithm is tested on the Holy Qur'an and on a corpus of 242 abstracts from the Proceedings of the Saudi Arabian National Computer Conferences. The two files used contain a wide range of data: the Holy Qur'an contains most of the ancient Arabic words, while the other file contains some modern Arabic words and some words borrowed from foreign languages in addition to the original Arabic words. The results of this study showed that combining N-grams with the Dice measure gives better results than using the Manhattan distance measure.
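Editor's note: a toy sketch of the two similarity measures named in the abstract, applied to character bigrams; the paper's actual root-candidate generation is more involved, and the demo words are illustrative.

```python
from collections import Counter

def bigrams(word):
    return [word[i:i + 2] for i in range(len(word) - 1)]

def dice(a, b):
    """Dice similarity over character bigrams: 2|A∩B| / (|A| + |B|)."""
    ca, cb = Counter(bigrams(a)), Counter(bigrams(b))
    common = sum((ca & cb).values())
    return 2 * common / (sum(ca.values()) + sum(cb.values()))

def manhattan(a, b):
    """Manhattan distance between bigram frequency vectors (dissimilarity)."""
    ca, cb = Counter(bigrams(a)), Counter(bigrams(b))
    return sum(abs(ca[g] - cb[g]) for g in set(ca) | set(cb))

# Rank candidate roots for an inflected word (higher Dice = closer).
word, roots = "كاتبون", ["كتب", "قرأ"]
print(max(roots, key=lambda r: dice(word, r)))
```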
  19. Rieger, F.: Lügende Computer (2023) 0.03
    0.02550099 = product of:
      0.07650297 = sum of:
        0.05243182 = weight(_text_:computer in 912) [ClassicSimilarity], result of:
          0.05243182 = score(doc=912,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.32301605 = fieldWeight in 912, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.0625 = fieldNorm(doc=912)
        0.024071148 = product of:
          0.048142295 = sum of:
            0.048142295 = weight(_text_:22 in 912) [ClassicSimilarity], result of:
              0.048142295 = score(doc=912,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.30952093 = fieldWeight in 912, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=912)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Date
    16. 3.2023 19:22:55
  20. Rötzer, F.: Computer ergooglen die Bedeutung von Worten (2005) 0.03
    0.02510659 = product of:
      0.07531977 = sum of:
        0.027158061 = weight(_text_:web in 3385) [ClassicSimilarity], result of:
          0.027158061 = score(doc=3385,freq=6.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.18735787 = fieldWeight in 3385, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0234375 = fieldNorm(doc=3385)
        0.048161704 = weight(_text_:computer in 3385) [ClassicSimilarity], result of:
          0.048161704 = score(doc=3385,freq=12.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.29670918 = fieldWeight in 3385, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.0234375 = fieldNorm(doc=3385)
      0.33333334 = coord(2/6)
    
    Content
    "Wie könnten Computer Sprache lernen und dabei auch die Bedeutung von Worten sowie die Beziehungen zwischen ihnen verstehen? Dieses Problem der Semantik stellt eine gewaltige, bislang nur ansatzweise bewältigte Aufgabe dar, da Worte und Wortverbindungen oft mehrere oder auch viele Bedeutungen haben, die zudem vom außersprachlichen Kontext abhängen. Die beiden holländischen (Ein künstliches Bewusstsein aus einfachen Aussagen (1)). Paul Vitanyi (2) und Rudi Cilibrasi vom Nationalen Institut für Mathematik und Informatik (3) in Amsterdam schlagen eine elegante Lösung vor: zum Nachschlagen im Internet, der größten Datenbank, die es gibt, wird einfach Google benutzt. Objekte wie eine Maus können mit ihren Namen "Maus" benannt werden, die Bedeutung allgemeiner Begriffe muss aus ihrem Kontext gelernt werden. Ein semantisches Web zur Repräsentation von Wissen besteht aus den möglichen Verbindungen, die Objekte und ihre Namen eingehen können. Natürlich können in der Wirklichkeit neue Namen, aber auch neue Bedeutungen und damit neue Verknüpfungen geschaffen werden. Sprache ist lebendig und flexibel. Um einer Künstlichen Intelligenz alle Wortbedeutungen beizubringen, müsste mit der Hilfe von menschlichen Experten oder auch vielen Mitarbeitern eine riesige Datenbank mit den möglichen semantischen Netzen aufgebaut und dazu noch ständig aktualisiert werden. Das aber müsste gar nicht notwendig sein, denn mit dem Web gibt es nicht nur die größte und weitgehend kostenlos benutzbare semantische Datenbank, sie wird auch ständig von zahllosen Internetnutzern aktualisiert. Zudem gibt es Suchmaschinen wie Google, die Verbindungen zwischen Worten und damit deren Bedeutungskontext in der Praxis in ihrer Wahrscheinlichkeit quantitativ mit der Angabe der Webseiten, auf denen sie gefunden wurden, messen.
    Mit einem bereits zuvor von Paul Vitanyi und anderen entwickeltem Verfahren, das den Zusammenhang von Objekten misst (normalized information distance - NID ), kann die Nähe zwischen bestimmten Objekten (Bilder, Worte, Muster, Intervalle, Genome, Programme etc.) anhand aller Eigenschaften analysiert und aufgrund der dominanten gemeinsamen Eigenschaft bestimmt werden. Ähnlich können auch die allgemein verwendeten, nicht unbedingt "wahren" Bedeutungen von Namen mit der Google-Suche erschlossen werden. 'At this moment one database stands out as the pinnacle of computer-accessible human knowledge and the most inclusive summary of statistical information: the Google search engine. There can be no doubt that Google has already enabled science to accelerate tremendously and revolutionized the research process. It has dominated the attention of internet users for years, and has recently attracted substantial attention of many Wall Street investors, even reshaping their ideas of company financing.' (Paul Vitanyi und Rudi Cilibrasi) Gibt man ein Wort ein wie beispielsweise "Pferd", erhält man bei Google 4.310.000 indexierte Seiten. Für "Reiter" sind es 3.400.000 Seiten. Kombiniert man beide Begriffe, werden noch 315.000 Seiten erfasst. Für das gemeinsame Auftreten beispielsweise von "Pferd" und "Bart" werden zwar noch immer erstaunliche 67.100 Seiten aufgeführt, aber man sieht schon, dass "Pferd" und "Reiter" enger zusammen hängen. Daraus ergibt sich eine bestimmte Wahrscheinlichkeit für das gemeinsame Auftreten von Begriffen. Aus dieser Häufigkeit, die sich im Vergleich mit der maximalen Menge (5.000.000.000) an indexierten Seiten ergibt, haben die beiden Wissenschaftler eine statistische Größe entwickelt, die sie "normalised Google distance" (NGD) nennen und die normalerweise zwischen 0 und 1 liegt. Je geringer NGD ist, desto enger hängen zwei Begriffe zusammen. "Das ist eine automatische Bedeutungsgenerierung", sagt Vitanyi gegenüber dern New Scientist (4). "Das könnte gut eine Möglichkeit darstellen, einen Computer Dinge verstehen und halbintelligent handeln zu lassen." Werden solche Suchen immer wieder durchgeführt, lässt sich eine Karte für die Verbindungen von Worten erstellen. Und aus dieser Karte wiederum kann ein Computer, so die Hoffnung, auch die Bedeutung der einzelnen Worte in unterschiedlichen natürlichen Sprachen und Kontexten erfassen. So habe man über einige Suchen realisiert, dass ein Computer zwischen Farben und Zahlen unterscheiden, holländische Maler aus dem 17. Jahrhundert und Notfälle sowie Fast-Notfälle auseinander halten oder elektrische oder religiöse Begriffe verstehen könne. Überdies habe eine einfache automatische Übersetzung Englisch-Spanisch bewerkstelligt werden können. Auf diese Weise ließe sich auch, so hoffen die Wissenschaftler, die Bedeutung von Worten erlernen, könne man Spracherkennung verbessern oder ein semantisches Web erstellen und natürlich endlich eine bessere automatische Übersetzung von einer Sprache in die andere realisieren.
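Editor's note: the NGD described above has a published closed form (Cilibrasi & Vitányi): NGD(x,y) = (max(log f(x), log f(y)) − log f(x,y)) / (log N − min(log f(x), log f(y))), where f are page-hit counts and N is the number of indexed pages. A minimal sketch using the counts quoted in the text:

```python
import math

def ngd(fx, fy, fxy, n_pages):
    """Normalized Google distance from page-hit counts:
    0 = terms always co-occur; larger values = less related."""
    lx, ly, lxy, ln = (math.log(v) for v in (fx, fy, fxy, n_pages))
    return (max(lx, ly) - lxy) / (ln - min(lx, ly))

# Counts quoted above: "Pferd" (horse), "Reiter" (rider), their
# co-occurrence, and ~5 billion indexed pages.
print(round(ngd(4_310_000, 3_400_000, 315_000, 5_000_000_000), 3))  # ~0.359
```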

Languages

  • e 146
  • d 51
  • m 3
  • f 2
  • ru 1

Types

  • a 136
  • m 44
  • el 18
  • s 17
  • x 4
  • p 3
  • d 2
  • pat 1
