Search (63 results, page 1 of 4)

Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.23

0.22995225 = product of:
  0.30660298 = sum of:
    0.07204164 = product of:
      0.2161249 = sum of:
        0.2161249 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
          0.2161249 = score(doc=562,freq=2.0), product of:
            0.38455155 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0453587 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.33333334 = coord(1/3)
    0.2161249 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
      0.2161249 = score(doc=562,freq=2.0), product of:
        0.38455155 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.0453587 = queryNorm
        0.56201804 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.018436432 = product of:
      0.036872864 = sum of:
        0.036872864 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
          0.036872864 = score(doc=562,freq=2.0), product of:
            0.15883844 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0453587 = queryNorm
            0.23214069 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.5 = coord(1/2)
  0.75 = coord(3/4)

Content: Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
Date: 8. 1.2013 10:22:32
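
The score tree for doc 562 can be reproduced by hand. The following is a minimal sketch, assuming the ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are hypothetical, and queryNorm, fieldNorm and the coord factors are copied from the tree rather than derived. Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math

QUERY_NORM = 0.0453587   # queryNorm, copied from the explain output
MAX_DOCS = 44218         # maxDocs, copied from the explain output


def idf(doc_freq, max_docs=MAX_DOCS):
    """ClassicSimilarity inverse document frequency."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm):
    """Single-term score = queryWeight * fieldWeight."""
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


# Doc 562: terms "3a" and "2f" (docFreq=24) and "22" (docFreq=3622),
# each with freq=2.0 and fieldNorm=0.046875.
score_3a = term_score(2.0, 24, 0.046875) * (1 / 3)    # inner coord(1/3)
score_2f = term_score(2.0, 24, 0.046875)
score_22 = term_score(2.0, 3622, 0.046875) * (1 / 2)  # inner coord(1/2)

# Sum the clause scores, then apply the outer coord(3/4).
doc_562_score = (score_3a + score_2f + score_22) * (3 / 4)
print(doc_562_score)
```

The printed value should agree with the 0.22995225 reported for doc 562 to within float rounding. The same two functions cover the other entries on this page, since only freq, docFreq, fieldNorm and the coord factors change.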

Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.14

0.14408328 = product of:
  0.28816655 = sum of:
    0.07204164 = product of:
      0.2161249 = sum of:
        0.2161249 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
          0.2161249 = score(doc=862,freq=2.0), product of:
            0.38455155 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0453587 = queryNorm
            0.56201804 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=862)
      0.33333334 = coord(1/3)
    0.2161249 = weight(_text_:2f in 862) [ClassicSimilarity], result of:
      0.2161249 = score(doc=862,freq=2.0), product of:
        0.38455155 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.0453587 = queryNorm
        0.56201804 = fieldWeight in 862, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=862)
  0.5 = coord(2/4)

Source: https%3A%2F%2Farxiv.org%2Fabs%2F2212.06721&usg=AOvVaw3i_9pZm9y_dQWoHi6uv0EN

Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.12

0.11728067 = product of:
  0.23456134 = sum of:
    0.2161249 = weight(_text_:2f in 563) [ClassicSimilarity], result of:
      0.2161249 = score(doc=563,freq=2.0), product of:
        0.38455155 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.0453587 = queryNorm
        0.56201804 = fieldWeight in 563, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=563)
    0.018436432 = product of:
      0.036872864 = sum of:
        0.036872864 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
          0.036872864 = score(doc=563,freq=2.0), product of:
            0.15883844 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0453587 = queryNorm
            0.23214069 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
      0.5 = coord(1/2)
  0.5 = coord(2/4)

Content: A thesis presented to the University of Guelph in partial fulfilment of requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br%2F~ceramisch%2Fdownload_files%2Fpublications%2F2009%2Fp01.pdf.
Date: 10. 1.2013 19:22:47

Schwarz, C.: THESYS: Thesaurus Syntax System : a fully automatic thesaurus building aid (1988) 0.09

0.090796195 = product of:
  0.18159239 = sum of:
    0.16008322 = weight(_text_:assisted in 1361) [ClassicSimilarity], result of:
      0.16008322 = score(doc=1361,freq=2.0), product of:
        0.30640912 = queryWeight, product of:
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0453587 = queryNorm
        0.52244925 = fieldWeight in 1361, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1361)
    0.02150917 = product of:
      0.04301834 = sum of:
        0.04301834 = weight(_text_:22 in 1361) [ClassicSimilarity], result of:
          0.04301834 = score(doc=1361,freq=2.0), product of:
            0.15883844 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0453587 = queryNorm
            0.2708308 = fieldWeight in 1361, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1361)
      0.5 = coord(1/2)
  0.5 = coord(2/4)

Abstract: THESYS is based on the natural language processing of free-text databases. It yields statistically evaluated correlations between words of the database. These correlations correspond to traditional thesaurus relations. The person who has to build a thesaurus is thus assisted by the proposals made by THESYS. THESYS is being tested on commercial databases under real world conditions. It is part of a text processing project at Siemens, called TINA (Text-Inhalts-Analyse). Software from TINA is currently being applied and evaluated by the US Department of Commerce for patent search and indexing (REALIST: REtrieval Aids by Linguistics and STatistics).
Date: 6. 1.1999 10:22:07

Chou, C.; Chu, T.: ¬An analysis of BERT (NLP) for assisted subject indexing for Project Gutenberg (2022) 0.06
```
0.05659797 = product of:
  0.22639188 = sum of:
    0.22639188 = weight(_text_:assisted in 1139) [ClassicSimilarity], result of:
      0.22639188 = score(doc=1139,freq=4.0), product of:
        0.30640912 = queryWeight, product of:
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0453587 = queryNorm
        0.7388549 = fieldWeight in 1139, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1139)
  0.25 = coord(1/4)
```
Abstract

In light of AI (Artificial Intelligence) and NLP (Natural language processing) technologies, this article examines the feasibility of using AI/NLP models to enhance the subject indexing of digital resources. While BERT (Bidirectional Encoder Representations from Transformers) models are widely used in scholarly communities, the authors assess whether BERT models can be used in machine-assisted indexing in the Project Gutenberg collection, through suggesting Library of Congress subject headings filtered by certain Library of Congress Classification subclass labels. The findings of this study are informative for further research on BERT models to assist with automatic subject indexing for digital library collections.

Oard, D.W.; He, D.; Wang, J.: User-assisted query translation for interactive cross-language information retrieval (2008) 0.05
```
0.048512544 = product of:
  0.19405018 = sum of:
    0.19405018 = weight(_text_:assisted in 2030) [ClassicSimilarity], result of:
      0.19405018 = score(doc=2030,freq=4.0), product of:
        0.30640912 = queryWeight, product of:
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0453587 = queryNorm
        0.6333042 = fieldWeight in 2030, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.046875 = fieldNorm(doc=2030)
  0.25 = coord(1/4)
```
Abstract

Interactive Cross-Language Information Retrieval (CLIR), a process in which searcher and system collaborate to find documents that satisfy an information need regardless of the language in which those documents are written, calls for designs in which synergies between searcher and system can be leveraged so that the strengths of one can cover weaknesses of the other. This paper describes an approach that employs user-assisted query translation to help searchers better understand the system's operation. Supporting interaction and interface designs are introduced, and results from three user studies are presented. The results indicate that experienced searchers presented with this new system evolve new search strategies that make effective use of the new capabilities, that they achieve retrieval effectiveness comparable to results obtained using fully automatic techniques, and that reported satisfaction with support for cross-language searching increased. The paper concludes with a description of a freely available interactive CLIR system that incorporates lessons learned from this research.

Polity, Y.: Vers une ergonomie linguistique (1994) 0.05

0.045738064 = product of:
  0.18295226 = sum of:
    0.18295226 = weight(_text_:assisted in 36) [ClassicSimilarity], result of:
      0.18295226 = score(doc=36,freq=2.0), product of:
        0.30640912 = queryWeight, product of:
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0453587 = queryNorm
        0.5970849 = fieldWeight in 36, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0625 = fieldNorm(doc=36)
  0.25 = coord(1/4)

Abstract: Analyzes a special type of man-machine interaction, that of searching an information system with natural language. Proposes a model of full text processing for information retrieval that considers the system's users and how they employ information. Describes how INIST (the National Institute for Scientific and Technical Information) is developing computer assisted indexing as an aid to improving relevance when retrieving information from bibliographic data banks.

Gillaspie, L.: ¬The role of linguistic phenomena in retrieval performance (1995) 0.05

0.045738064 = product of:
  0.18295226 = sum of:
    0.18295226 = weight(_text_:assisted in 3861) [ClassicSimilarity], result of:
      0.18295226 = score(doc=3861,freq=2.0), product of:
        0.30640912 = queryWeight, product of:
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0453587 = queryNorm
        0.5970849 = fieldWeight in 3861, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0625 = fieldNorm(doc=3861)
  0.25 = coord(1/4)

Abstract: This progress report presents findings from a failure analysis of 2 commercial full text computer assisted legal research (CALR) systems. Linguistic analyses of unretrieved documents and false drops reveal a number of potential causes for performance problems in these databases, ranging from synonymy and homography to discourse level cohesive relations. Examines and discusses examples of natural language phenomena that affect Boolean retrieval system performance.

Armstrong, G.: Computer-assisted literary analysis using the TACT text-retrieval program (1996) 0.05

0.045738064 = product of:
  0.18295226 = sum of:
    0.18295226 = weight(_text_:assisted in 5690) [ClassicSimilarity], result of:
      0.18295226 = score(doc=5690,freq=2.0), product of:
        0.30640912 = queryWeight, product of:
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0453587 = queryNorm
        0.5970849 = fieldWeight in 5690, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0625 = fieldNorm(doc=5690)
  0.25 = coord(1/4)

Jaaranen, K.; Lehtola, A.; Tenni, J.; Bounsaythip, C.: Webtran tools for in-company language support (2000) 0.03
```
0.034303546 = product of:
  0.13721418 = sum of:
    0.13721418 = weight(_text_:assisted in 5553) [ClassicSimilarity], result of:
      0.13721418 = score(doc=5553,freq=2.0), product of:
        0.30640912 = queryWeight, product of:
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0453587 = queryNorm
        0.44781366 = fieldWeight in 5553, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.046875 = fieldNorm(doc=5553)
  0.25 = coord(1/4)
```
Abstract

Webtran tools for authoring and translating domain specific texts can make the multilingual text production in a company more efficient and less expensive. The tools have been in production use since spring 2000 for checking and translating product article texts of a specific domain, namely an in-company language in sales catalogues of a mail-order company. Webtran tools have been developed by VTT Information Technology. Use experiences have shown that an automatic translation process is faster than phrase-lexicon assisted manual translation, if an in-company language model is created to control and support the language used within the company.

Anguiano Peña, G.; Naumis Peña, C.: Method for selecting specialized terms from a general language corpus (2015) 0.03
```
0.034303546 = product of:
  0.13721418 = sum of:
    0.13721418 = weight(_text_:assisted in 2196) [ClassicSimilarity], result of:
      0.13721418 = score(doc=2196,freq=2.0), product of:
        0.30640912 = queryWeight, product of:
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0453587 = queryNorm
        0.44781366 = fieldWeight in 2196, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.046875 = fieldNorm(doc=2196)
  0.25 = coord(1/4)
```
Abstract

Among the many aspects studied by library and information science are linguistic phenomena associated with document content analysis, for purposes of both information organization and retrieval. To this end, terms used in scientific and technical language must be recovered and their area of domain and behavior studied. Through language, society controls the knowledge available to people. Document content analysis, in this case of scientific texts, facilitates gathering knowledge of lexical units and their major applications and separating such specialized terms from the general language, to create indexing languages. The model presented here or other lexicographic resources with similar characteristics may be useful in the near future, in computer-assisted indexing or as corpora monitors, with respect to new text analyses or specialized corpora. Thus, using techniques for document content analysis of a lexicographically labeled general language corpus proposed herein, components which enable the extraction of lexical units from specialized language may be obtained and characterized.

From information to knowledge : conceptual and content analysis by computer (1995) 0.03
```
0.02858629 = product of:
  0.11434516 = sum of:
    0.11434516 = weight(_text_:assisted in 5392) [ClassicSimilarity], result of:
      0.11434516 = score(doc=5392,freq=2.0), product of:
        0.30640912 = queryWeight, product of:
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0453587 = queryNorm
        0.37317806 = fieldWeight in 5392, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5392)
  0.25 = coord(1/4)
```
Content

SCHMIDT, K.M.: Concepts - content - meaning: an introduction; DUCHASTEL, J. et al.: The SACAO project: using computation toward textual data analysis; PAQUIN, L.-C. and L. DUPUY: An approach to expertise transfer: computer-assisted text analysis; HOGENRAAD, R., Y. BESTGEN and J.-L. NYSTEN: Terrorist rhetoric: texture and architecture; MOHLER, P.P.: On the interaction between reading and computing: an interpretative approach to content analysis; LANCASHIRE, I.: Computer tools for cognitive stylistics; MERGENTHALER, E.: An outline of knowledge based text analysis; NAMENWIRTH, J.Z.: Ideography in computer-aided content analysis; WEBER, R.P. and J.Z. NAMENWIRTH: Content-analytic indicators: a self-critique; McKINNON, A.: Optimizing the aberrant frequency word technique; ROSATI, R.: Factor analysis in classical archaeology: export patterns of Attic pottery trade; PETRILLO, P.S.: Old and new worlds: ancient coinage and modern technology; DARANYI, S., S. MARJAI et al.: Caryatids and the measurement of semiosis in architecture; ZARRI, G.P.: Intelligent information retrieval: an application in the field of historical biographical data; BOUCHARD, G., R. ROY et al.: Computers and genealogy: from family reconstitution to population reconstruction; DEMÉLAS-BOHY, M.-D. and M. RENAUD: Instability, networks and political parties: a political history expert system prototype; DARANYI, S., A. ABRANYI and G. KOVACS: Knowledge extraction from ethnopoetic texts by multivariate statistical methods; FRAUTSCHI, R.L.: Measures of narrative voice in French prose fiction applied to textual samples from the enlightenment to the twentieth century; DANNENBERG, R. et al.: A project in computer music: the musician's workbench

Ramisch, C.: Multiword expressions acquisition : a generic and open framework (2015) 0.02
```
0.022869032 = product of:
  0.09147613 = sum of:
    0.09147613 = weight(_text_:assisted in 1649) [ClassicSimilarity], result of:
      0.09147613 = score(doc=1649,freq=2.0), product of:
        0.30640912 = queryWeight, product of:
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.0453587 = queryNorm
        0.29854244 = fieldWeight in 1649, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.7552447 = idf(docFreq=139, maxDocs=44218)
          0.03125 = fieldNorm(doc=1649)
  0.25 = coord(1/4)
```
Abstract

This book is an excellent introduction to multiword expressions. It provides a unique, comprehensive and up-to-date overview of this exciting topic in computational linguistics. The first part describes the diversity and richness of multiword expressions, including many examples in several languages. These constructions are not only complex and arbitrary, but also much more frequent than one would guess, making them a real nightmare for natural language processing applications. The second part introduces a new generic framework for automatic acquisition of multiword expressions from texts. Furthermore, it describes the accompanying free software tool, the mwetoolkit, which comes in handy when looking for expressions in texts (regardless of the language). Evaluation is greatly emphasized, underlining the fact that results depend on parameters like corpus size, language, MWE type, etc. The last part contains solid experimental results and evaluates the mwetoolkit, demonstrating its usefulness for computer-assisted lexicography and machine translation. This is the first book to cover the whole pipeline of multiword expression acquisition in a single volume. It addresses the needs of students and researchers in computational and theoretical linguistics, cognitive sciences, artificial intelligence and computer science. Its good balance between computational and linguistic views makes it the perfect starting point for anyone interested in multiword expressions, language and text processing in general.

Brown, T.B.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; Agarwal, S.; Herbert-Voss, A.; Krueger, G.; Henighan, T.; Child, R.; Ramesh, A.; Ziegler, D.M.; Wu, J.; Winter, C.; Hesse, C.; Chen, M.; Sigler, E.; Litwin, M.; Gray, S.; Chess, B.; Clark, J.; Berner, C.; McCandlish, S.; Radford, A.; Sutskever, I.; Amodei, D.: Language models are few-shot learners (2020) 0.01
```
0.012395732 = product of:
  0.04958293 = sum of:
    0.04958293 = product of:
      0.09916586 = sum of:
        0.09916586 = weight(_text_:instructions in 872) [ClassicSimilarity], result of:
          0.09916586 = score(doc=872,freq=2.0), product of:
            0.31902805 = queryWeight, product of:
              7.033448 = idf(docFreq=105, maxDocs=44218)
              0.0453587 = queryNorm
            0.31083742 = fieldWeight in 872, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.033448 = idf(docFreq=105, maxDocs=44218)
              0.03125 = fieldNorm(doc=872)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```
Abstract

Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.

Warner, A.J.: Natural language processing (1987) 0.01

0.0122909555 = product of:
  0.049163822 = sum of:
    0.049163822 = product of:
      0.098327644 = sum of:
        0.098327644 = weight(_text_:22 in 337) [ClassicSimilarity], result of:
          0.098327644 = score(doc=337,freq=2.0), product of:
            0.15883844 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0453587 = queryNorm
            0.61904186 = fieldWeight in 337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=337)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Source: Annual review of information science and technology. 22(1987), pp. 79-108

Jha, A.: Why GPT-4 isn't all it's cracked up to be (2023) 0.01
```
0.010846265 = product of:
  0.04338506 = sum of:
    0.04338506 = product of:
      0.08677012 = sum of:
        0.08677012 = weight(_text_:instructions in 923) [ClassicSimilarity], result of:
          0.08677012 = score(doc=923,freq=2.0), product of:
            0.31902805 = queryWeight, product of:
              7.033448 = idf(docFreq=105, maxDocs=44218)
              0.0453587 = queryNorm
            0.27198273 = fieldWeight in 923, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.033448 = idf(docFreq=105, maxDocs=44218)
              0.02734375 = fieldNorm(doc=923)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```
Abstract

They might appear intelligent, but LLMs are nothing of the sort. They don't understand the meanings of the words they are using, nor the concepts expressed within the sentences they create. When asked how to bring a cow back to life, earlier versions of ChatGPT, for example, which ran on a souped-up version of GPT-3, would confidently provide a list of instructions. So-called hallucinations like this happen because language models have no concept of what a "cow" is or that "death" is a non-reversible state of being. LLMs do not have minds that can think about objects in the world and how they relate to each other. All they "know" is how likely it is that some sets of words will follow other sets of words, having calculated those probabilities from their training data. To make sense of all this, I spoke with Gary Marcus, an emeritus professor of psychology and neural science at New York University, for "Babbage", our science and technology podcast. Last year, as the world was transfixed by the sudden appearance of ChatGPT, he made some fascinating predictions about GPT-4.

McMahon, J.G.; Smith, F.J.: Improved statistical language model performance with automatic generated word hierarchies (1996) 0.01

0.010754585 = product of:
  0.04301834 = sum of:
    0.04301834 = product of:
      0.08603668 = sum of:
        0.08603668 = weight(_text_:22 in 3164) [ClassicSimilarity], result of:
          0.08603668 = score(doc=3164,freq=2.0), product of:
            0.15883844 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0453587 = queryNorm
            0.5416616 = fieldWeight in 3164, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=3164)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Source: Computational linguistics. 22(1996) no.2, pp. 217-248

Ruge, G.: ¬A spreading activation network for automatic generation of thesaurus relationships (1991) 0.01

0.010754585 = product of:
  0.04301834 = sum of:
    0.04301834 = product of:
      0.08603668 = sum of:
        0.08603668 = weight(_text_:22 in 4506) [ClassicSimilarity], result of:
          0.08603668 = score(doc=4506,freq=2.0), product of:
            0.15883844 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0453587 = queryNorm
            0.5416616 = fieldWeight in 4506, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=4506)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 8.10.2000 11:52:22

Somers, H.: Example-based machine translation : Review article (1999) 0.01

0.010754585 = product of:
  0.04301834 = sum of:
    0.04301834 = product of:
      0.08603668 = sum of:
        0.08603668 = weight(_text_:22 in 6672) [ClassicSimilarity], result of:
          0.08603668 = score(doc=6672,freq=2.0), product of:
            0.15883844 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0453587 = queryNorm
            0.5416616 = fieldWeight in 6672, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6672)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 31. 7.1996 9:22:19

New tools for human translators (1997) 0.01

0.010754585 = product of:
  0.04301834 = sum of:
    0.04301834 = product of:
      0.08603668 = sum of:
        0.08603668 = weight(_text_:22 in 1179) [ClassicSimilarity], result of:
          0.08603668 = score(doc=1179,freq=2.0), product of:
            0.15883844 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0453587 = queryNorm
            0.5416616 = fieldWeight in 1179, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=1179)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 31. 7.1996 9:22:19
