Search (426 results, page 1 of 22)

  • Filter: theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.06
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
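    The relevance figure beside each record is a Lucene ClassicSimilarity score. As a worked sketch using the values Lucene reports for the first record above (two matching terms, each with term frequency sqrt(2) = 1.414, idf values 8.478 and 3.502, queryNorm 0.0455, fieldNorm 0.046875, and coordination factors 1/3, 1/2 and 2/3 for partial matches):
    \[
    \mathrm{score} \;=\; \tfrac{2}{3}\Bigl[\tfrac{1}{3}\,\underbrace{(0.0455 \cdot 8.478)}_{\text{queryWeight}}\,\underbrace{(1.414 \cdot 8.478 \cdot 0.046875)}_{\text{fieldWeight}} \;+\; \tfrac{1}{2}\,(0.0455 \cdot 3.502)\,(1.414 \cdot 3.502 \cdot 0.046875)\Bigr] \;\approx\; 0.0606
    \]
    The same decomposition, per-term queryWeight times fieldWeight scaled by coordination factors, underlies every score in this list.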
  2. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.06
    Abstract
    This research revisits the classic Turing test and compares recent large language models such as ChatGPT for their abilities to reproduce human-level comprehension and compelling text generation. Two task challenges - summary and question answering - prompt ChatGPT to produce original content (98-99%) from a single text entry and sequential questions initially posed by Turing in 1950. We score the original and generated content against the OpenAI GPT-2 Output Detector from 2019, and establish multiple cases where the generated content proves original and undetectable (98%). The question of a machine fooling a human judge recedes in this work relative to the question of "how would one prove it?" The original contribution of the work presents a metric and simple grammatical set for understanding the writing mechanics of chatbots in evaluating their readability and statistical clarity, engagement, delivery, overall quality, and plagiarism risks. While Turing's original prose scores at least 14% below the machine-generated output, whether an algorithm displays hints of Turing's true initial thoughts (the "Lovelace 2.0" test) remains unanswerable.
    Source
    https://arxiv.org/abs/2212.06721
  3. Ruge, G.: ¬A spreading activation network for automatic generation of thesaurus relationships (1991) 0.04
    Date
    8.10.2000 11:52:22
    Source
    Library science with a slant to documentation. 28(1991) no.4, S.125-130
  4. New tools for human translators (1997) 0.04
    Abstract
    A special issue devoted to the theme of new tools for human translators
    Date
    31. 7.1996 9:22:19
  5. Hutchins, J.: From first conception to first demonstration : the nascent years of machine translation, 1947-1954. A chronology (1997) 0.04
    Abstract
    Chronicles the early history of applying electronic computers to the task of translating natural languages, from the 1st suggestions by Warren Weaver in Mar 1947 to the 1st demonstration of a working, if limited, program in Jan 1954
    Date
    31. 7.1996 9:22:19
  6. Morris, V.: Automated language identification of bibliographic resources (2020) 0.04
    Abstract
    This article describes experiments in the use of machine learning techniques at the British Library to assign language codes to catalog records, in order to provide information about the language of content of the resources described. In the first phase of the project, language codes were assigned to 1.15 million records with 99.7% confidence. The automated language identification tools developed will be used to contribute to future enhancement of over 4 million legacy records.
    Date
    2. 3.2020 19:04:22
  7. Kay, M.: ¬The proper place of men and machines in language translation (1997) 0.03
    Abstract
    Machine translation stands no chance of filling actual needs for translation because, although there has been progress in relevant areas of computer science, advances in linguistics have not touched the core problems. Cooperative man-machine systems need to be developed. Proposes a translator's amanuensis, incorporating into a word processor some simple facilities peculiar to translation. Gradual enhancements of such a system could lead to the original goal of machine translation
    Date
    31. 7.1996 9:22:19
    Footnote
    Contribution to a special issue devoted to the theme of new tools for human translators
  8. Lawrie, D.; Mayfield, J.; McNamee, P.; Oard, D.W.: Cross-language person-entity linking from 20 languages (2015) 0.03
    Abstract
    The goal of entity linking is to associate references to an entity that is found in unstructured natural language content to an authoritative inventory of known entities. This article describes the construction of 6 test collections for cross-language person-entity linking that together span 22 languages. Fully automated components were used together with 2 crowdsourced validation stages to affordably generate ground-truth annotations with an accuracy comparable to that of a completely manual process. The resulting test collections each contain between 642 (Arabic) and 2,361 (Romanian) person references in non-English texts for which the correct resolution in English Wikipedia is known, plus a similar number of references for which no correct resolution into English Wikipedia is believed to exist. Fully automated cross-language person-name linking experiments with 20 non-English languages yielded a resolution accuracy of between 0.84 (Serbian) and 0.98 (Romanian), which compares favorably with previously reported cross-language entity linking results for Spanish.
  9. Melby, A.: Some notes on 'The proper place of men and machines in language translation' (1997) 0.03
    Abstract
    Responds to Kay, M.: The proper place of men and machines in language translation. Examines the appropriateness of machine translation (MT) under the following special circumstances: controlled domain-specific text and high-quality output; controlled domain-specific text and indicative output; dynamic general text and indicative output; and dynamic general text and high-quality output. MT is appropriate in the 1st 3 cases but the 4th case requires human translation. Examines how MT research could be more useful for aiding human translation
    Date
    31. 7.1996 9:22:19
    Footnote
    Contribution to a special issue devoted to the theme of new tools for human translators
  10. Schneider, J.W.; Borlund, P.: ¬A bibliometric-based semiautomatic approach to identification of candidate thesaurus terms : parsing and filtering of noun phrases from citation contexts (2005) 0.03
    Abstract
    The present study investigates the ability of a bibliometric-based semi-automatic method to select candidate thesaurus terms from citation contexts. The method consists of document co-citation analysis, citation context analysis, and noun phrase parsing. The investigation is carried out within the specialty area of periodontology. The results clearly demonstrate that the method is able to select important candidate thesaurus terms within the chosen specialty area.
    Date
    8. 3.2007 19:55:22
  11. Paolillo, J.C.: Linguistics and the information sciences (2009) 0.03
    Abstract
    Linguistics is the scientific study of language which emphasizes language spoken in everyday settings by human beings. It has a long history of interdisciplinarity, both internally and in contribution to other fields, including information science. A linguistic perspective is beneficial in many ways in information science, since it examines the relationship between the forms of meaningful expressions and their social, cognitive, institutional, and communicative context, these being two perspectives on information that are actively studied, to different degrees, in information science. Examples of issues relevant to information science are presented for which the approach taken under a linguistic perspective is illustrated.
    Date
    27. 8.2011 14:22:33
  12. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.03
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language and domain independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human written summaries in a large collection of web-pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the aligning process from a training set and focuses on selecting high quality multi-word terms from human written summaries to generate suitable results for web-page summarization.
    Content
    A Thesis presented to The University of Guelph in partial fulfilment of requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
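    The thesis above couples word-association ("glue") measures with the LocalMaxs algorithm, which keeps an n-gram as a candidate term only when its glue is a local maximum: at least as high as the glue of the (n-1)-grams it contains and strictly higher than that of the (n+1)-grams containing it. A toy sketch under stated assumptions - plain SCP as the glue function and a frequency cutoff, not the thesis's own association measures; all names are illustrative:

    from collections import Counter

    def ngrams(tokens, n):
        return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    def scp(gram, counts):
        # symmetric conditional probability with "fair dispersion": f(gram)^2 over
        # the average frequency product across every prefix/suffix split
        f = counts[gram]
        splits = [counts[gram[:i]] * counts[gram[i:]] for i in range(1, len(gram))]
        denom = sum(splits) / len(splits)
        return f * f / denom if denom else 0.0

    def local_maxs(tokens, max_n=4, min_freq=2):
        counts = Counter()
        for n in range(1, max_n + 2):      # count up to (max_n+1)-grams for supergram checks
            counts.update(ngrams(tokens, n))
        glue = {g: scp(g, counts)
                for g, c in counts.items() if len(g) >= 2 and c >= min_freq}
        terms = []
        for g, gl in glue.items():
            n = len(g)
            if n > max_n:
                continue
            subs = [g[:-1], g[1:]] if n > 2 else []            # contained (n-1)-grams
            supers = [s for s in glue                          # containing (n+1)-grams
                      if len(s) == n + 1 and (s[:-1] == g or s[1:] == g)]
            if all(gl >= glue.get(x, 0.0) for x in subs) and all(gl > glue[s] for s in supers):
                terms.append((" ".join(g), gl))
        return sorted(terms, key=lambda t: -t[1])

    tokens = "the new york times reported that new york city grew".split()
    print(local_maxs(tokens, max_n=3))     # -> [('new york', 1.0)]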
  13. Schwarz, C.: THESYS: Thesaurus Syntax System : a fully automatic thesaurus building aid (1988) 0.03
    Abstract
    THESYS is based on the natural language processing of free-text databases. It yields statistically evaluated correlations between words of the database. These correlations correspond to traditional thesaurus relations. The person who has to build a thesaurus is thus assisted by the proposals made by THESYS. THESYS is being tested on commercial databases under real world conditions. It is part of a text processing project at Siemens, called TINA (Text-Inhalts-Analyse). Software from TINA is actually being applied and evaluated by the US Department of Commerce for patent search and indexing (REALIST: REtrieval Aids by Linguistics and STatistics)
    Date
    6. 1.1999 10:22:07
  14. Godby, J.: WordSmith research project bridges gap between tokens and indexes (1998) 0.03
    Abstract
    Reports on an OCLC natural language processing research project to develop methods for identifying terminology in unstructured electronic text, especially material associated with new cultural trends and emerging subjects. Current OCLC production software can only identify single words as indexable terms in full text documents; thus a major goal of the WordSmith project is to develop software that can automatically identify and intelligently organize phrases for use in database indexes. By analyzing user terminology from local newspapers in the USA, the latest cultural trends and technical developments as well as personal and geographic names have been drawn out. Notes that this new vocabulary can also be mapped into reference works
    Source
    OCLC newsletter. 1998, no.234, Jul/Aug, S.22-24
  15. Wanner, L.: Lexical choice in text generation and machine translation (1996) 0.03
    Abstract
    Presents the state of the art in lexical choice research in text generation and machine translation. Discusses the existing implementations with respect to: the place of lexical choice in the overall generation process; the information flow within the generation process and the consequences thereof for lexical choice; the internal organization of the lexical choice process; and the phenomena covered by lexical choice. Identifies possible future directions in lexical choice research
    Date
    31. 7.1996 9:22:19
  16. Basili, R.; Pazienza, M.T.; Velardi, P.: ¬An empirical symbolic approach to natural language processing (1996) 0.03
    Date
    6. 3.1997 16:22:15
  17. Way, E.C.: Knowledge representation and metaphor (or: meaning) (1994) 0.03
    Content
    Contains the following 9 chapters: The literal and the metaphoric; Views of metaphor; Knowledge representation; Representation schemes and conceptual graphs; The dynamic type hierarchy theory of metaphor; Computational approaches to metaphor; The nature and structure of semantic hierarchies; Language games, open texture and family resemblance; Programming the dynamic type hierarchy; Subject index
    Footnote
    Published by Kluwer as early as 1991 // Review in: Knowledge organization 22(1995) no.1, S.48-49 (O. Sechser)
  18. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.02
    Abstract
    The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language System (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
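    The abstract above names AZdict's components without detailing them. As an illustrative sketch (not NLM's implementation) of the kind of similarity computation a ChemSpell-like spelling-suggestion stage performs, here plain Levenshtein distance ranked against a vocabulary list:

    def levenshtein(a: str, b: str) -> int:
        # classic dynamic-programming edit distance, kept to one row at a time
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                    # deletion
                               cur[j - 1] + 1,                 # insertion
                               prev[j - 1] + (ca != cb)))      # substitution
            prev = cur
        return prev[-1]

    def suggest(word, vocabulary, max_distance=2, limit=5):
        # rank vocabulary entries by edit distance to the (possibly misspelled) query
        scored = sorted((levenshtein(word.lower(), v.lower()), v) for v in vocabulary)
        return [v for d, v in scored if d <= max_distance][:limit]

    vocab = ["toluene", "toxicology", "benzene", "phenol"]
    print(suggest("touluene", vocab))   # -> ['toluene']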
  19. Deventer, J.P. van; Kruger, C.J.; Johnson, R.D.: Delineating knowledge management through lexical analysis : a retrospective (2015) 0.02
    Abstract
    Purpose: Academic authors tend to define terms that meet their own needs. Knowledge Management (KM) is a term that comes to mind and is examined in this study. Lexicographical research identified KM terms used by authors from 1996 to 2006 in academic outlets to define KM. Data were collected based on strict criteria, which included that definitions should be unique instances. From 2006 onwards, no new unique definition instances could be identified; definitions were used repetitively. Analysis revealed that KM is directly defined by People (Person and Organisation), Processes (Codify, Share, Leverage, and Process) and Contextualised Content (Information). The paper aims to discuss these issues.
    Design/methodology/approach: The aim of this paper is to add to the body of knowledge in the KM discipline and supply KM practitioners and scholars with insight into what is commonly regarded to be KM, so as to reignite the debate on what one could consider as KM. The lexicon used by KM scholars was evaluated through the application of lexicographical research methods as extended through Knowledge Discovery and Text Analysis methods.
    Findings: By simplifying term relationships through the application of lexicographical research methods, as extended through Knowledge Discovery and Text Analysis methods, it was found that KM is directly defined by People (Person and Organisation), Processes (Codify, Share, Leverage, Process) and Contextualised Content (Information). One would therefore be able to indicate that KM, from an academic point of view, refers to people processing contextualised content.
    Research limitations/implications: In total, 42 definitions were identified, spanning a period of 11 years. This represented the first use of KM through the estimated apex of terms used. From 2006 onwards definitions were used in repetition, and all definitions that were considered to repeat were therefore subsequently excluded as not being unique instances. All definitions listed are by no means complete and exhaustive. The definitions are viewed outside the scope and context in which they were originally formulated and then used to review the key concepts in the definitions themselves.
    Social implications: The aforementioned discussion of KM content, as well as the presentation of the method followed in this paper, may have a few implications for future research in KM. First, the research validates ideas presented by the OECD in 2005 pertaining to KM. It also validates that, through the evolution of KM, the authors ended with a description of KM that may be seen as a standardised description. If academics and practitioners, for example, refer to KM as the same construct and/or idea, it has the potential, speculatively, to distinguish between what KM may or may not be.
    Originality/value: By simplifying the term used to define KM, and by focusing on the most common definitions, the paper assists in refocusing KM by reconsidering the dimensions that are the most common in how it has been defined over time. This would hopefully assist in reigniting discussions about KM and how it may be used to the benefit of an organisation.
    Date
    20. 1.2015 18:30:22
  20. Luo, L.; Ju, J.; Li, Y.-F.; Haffari, G.; Xiong, B.; Pan, S.: ChatRule: mining logical rules with large language models for knowledge graph reasoning (2023) 0.02
    Abstract
    Logical rules are essential for uncovering the logical connections between relations, which could improve the reasoning performance and provide interpretable results on knowledge graphs (KGs). Although there have been many efforts to mine meaningful logical rules over KGs, existing methods suffer from the computationally intensive searches over the rule space and a lack of scalability for large-scale KGs. Besides, they often ignore the semantics of relations which is crucial for uncovering logical connections. Recently, large language models (LLMs) have shown impressive performance in the field of natural language processing and various applications, owing to their emergent ability and generalizability. In this paper, we propose a novel framework, ChatRule, unleashing the power of large language models for mining logical rules over knowledge graphs. Specifically, the framework is initiated with an LLM-based rule generator, leveraging both the semantic and structural information of KGs to prompt LLMs to generate logical rules. To refine the generated rules, a rule ranking module estimates the rule quality by incorporating facts from existing KGs. Last, a rule validator harnesses the reasoning ability of LLMs to validate the logical correctness of ranked rules through chain-of-thought reasoning. ChatRule is evaluated on four large-scale KGs, w.r.t. different rule quality metrics and downstream tasks, showing the effectiveness and scalability of our method.
    Date
    23.11.2023 19:07:22
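    A minimal sketch of the three-stage pipeline this abstract describes: an LLM-based generator proposes rules, a ranking module scores them against facts in the existing KG, and an LLM-backed validator vets the survivors. All scaffolding here is hypothetical (including the toy support/confidence ranking), not the authors' code; llm stands for any prompt-to-text callable:

    from collections import defaultdict
    from typing import Callable, List, Tuple

    Triple = Tuple[str, str, str]   # (head_entity, relation, tail_entity)
    Rule = Tuple[str, str]          # length-1 rule: body relation -> head relation

    def generate_rules(llm: Callable[[str], str], relations: List[str]) -> List[Rule]:
        # Stage 1: rule generator. A real system parses rules out of the LLM's
        # reply; this sketch just enumerates every ordered relation pair.
        llm("Propose implications between these relations: " + ", ".join(relations))
        return [(b, h) for b in relations for h in relations if b != h]

    def rank_rules(rules: List[Rule], triples: List[Triple], min_support: int = 1):
        # Stage 2: rule ranking, estimating quality from facts already in the KG.
        pairs_by_rel = defaultdict(set)
        for h, r, t in triples:
            pairs_by_rel[r].add((h, t))
        scored = []
        for body, head in rules:
            support = len(pairs_by_rel[body] & pairs_by_rel[head])
            if support >= min_support:
                confidence = support / len(pairs_by_rel[body])
                scored.append(((body, head), confidence))
        return sorted(scored, key=lambda x: -x[1])

    def validate_rules(llm, scored_rules, threshold: float = 0.5):
        # Stage 3: rule validator; the paper uses chain-of-thought prompting here.
        kept = []
        for (body, head), conf in scored_rules:
            if conf < threshold:
                continue
            reply = llm(f"Is the rule {body}(x,y) -> {head}(x,y) logically sound?")
            if "yes" in reply.lower():
                kept.append(((body, head), conf))
        return kept

    triples = [("alice", "mother_of", "bob"), ("alice", "parent_of", "bob"),
               ("carol", "mother_of", "dan"), ("carol", "parent_of", "dan")]
    toy_llm = lambda prompt: "yes"
    rules = generate_rules(toy_llm, ["mother_of", "parent_of"])
    print(validate_rules(toy_llm, rank_rules(rules, triples)))
    # -> [(('mother_of', 'parent_of'), 1.0), (('parent_of', 'mother_of'), 1.0)]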

Types

  • a 361
  • el 49
  • m 30
  • s 16
  • p 7
  • x 6
  • b 1
  • d 1
  • pat 1
  • r 1
