Search (18 results, page 1 of 1)

  • × type_ss:"el"
  • × theme_ss:"Computerlinguistik"
  1. Bird, S.; Dale, R.; Dorr, B.; Gibson, B.; Joseph, M.; Kan, M.-Y.; Lee, D.; Powley, B.; Radev, D.; Tan, Y.F.: ¬The ACL Anthology Reference Corpus : a reference dataset for bibliographic research in computational linguistics (2008) 0.02
    0.01608469 = product of:
      0.07506188 = sum of:
        0.023310447 = weight(_text_:classification in 2804) [ClassicSimilarity], result of:
          0.023310447 = score(doc=2804,freq=6.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.24377833 = fieldWeight in 2804, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03125 = fieldNorm(doc=2804)
        0.028440988 = weight(_text_:bibliographic in 2804) [ClassicSimilarity], result of:
          0.028440988 = score(doc=2804,freq=4.0), product of:
            0.11688946 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03002521 = queryNorm
            0.24331525 = fieldWeight in 2804, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03125 = fieldNorm(doc=2804)
        0.023310447 = weight(_text_:classification in 2804) [ClassicSimilarity], result of:
          0.023310447 = score(doc=2804,freq=6.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.24377833 = fieldWeight in 2804, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03125 = fieldNorm(doc=2804)
      0.21428572 = coord(3/14)
    
    Abstract
    The ACL Anthology is a digital archive of conference and journal papers in natural language processing and computational linguistics. Its primary purpose is to serve as a reference repository of research results, but we believe that it can also be an object of study and a platform for research in its own right. We describe an enriched and standardized reference corpus derived from the ACL Anthology that can be used for research in scholarly document processing. This corpus, which we call the ACL Anthology Reference Corpus (ACL ARC), brings together the recent activities of a number of research groups around the world. Our goal is to make the corpus widely available, and to encourage other researchers to use it as a standard testbed for experiments in both bibliographic and bibliometric research.
    Content
    Vgl. auch: Automatic Term Recognition (ATR) is a research task that deals with the identification of domain-specific terms. Terms, in simple words, are textual realization of significant concepts in an expertise domain. Additionally, domain-specific terms may be classified into a number of categories, in which each category represents a significant concept. A term classification task is often defined on top of an ATR procedure to perform such categorization. For instance, in the biomedical domain, terms can be classified as drugs, proteins, and genes. This is a reference dataset for terminology extraction and classification research in computational linguistics. It is a set of manually annotated terms in English language that are extracted from the ACL Anthology Reference Corpus (ACL ARC). The ACL ARC is a canonicalised and frozen subset of scientific publications in the domain of Human Language Technologies (HLT). It consists of 10,921 articles from 1965 to 2006. The dataset, called ACL RD-TEC, is comprised of more than 69,000 candidate terms that are manually annotated as valid and invalid terms. Furthermore, valid terms are classified as technology and non-technology terms. Technology terms refer to a method, process, or in general a technological concept in the domain of HLT, e.g. machine translation, word sense disambiguation, and language modelling. On the other hand, non-technology terms refer to important concepts other than technological; examples of such terms in the domain of HLT are multilingual lexicon, corpora, word sense, and language model. The dataset is created to serve as a gold standard for the comparison of the algorithms of term recognition and classification. [http://catalog.elra.info/product_info.php?products_id=1236].
  2. Sebastiani, F.: ¬A tutorial an automated text categorisation (1999) 0.02
    0.015061316 = product of:
      0.07028614 = sum of:
        0.02018744 = weight(_text_:classification in 3390) [ClassicSimilarity], result of:
          0.02018744 = score(doc=3390,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 3390, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=3390)
        0.02018744 = weight(_text_:classification in 3390) [ClassicSimilarity], result of:
          0.02018744 = score(doc=3390,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 3390, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=3390)
        0.029911257 = product of:
          0.059822515 = sum of:
            0.059822515 = weight(_text_:texts in 3390) [ClassicSimilarity], result of:
              0.059822515 = score(doc=3390,freq=2.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.36342722 = fieldWeight in 3390, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3390)
          0.5 = coord(1/2)
      0.21428572 = coord(3/14)
    
    Abstract
    The automated categorisation (or classification) of texts into topical categories has a long history, dating back at least to 1960. Until the late '80s, the dominant approach to the problem involved knowledge-engineering automatic categorisers, i.e. manually building a set of rules encoding expert knowledge an how to classify documents. In the '90s, with the booming production and availability of on-line documents, automated text categorisation has witnessed an increased and renewed interest. A newer paradigm based an machine learning has superseded the previous approach. Within this paradigm, a general inductive process automatically builds a classifier by "learning", from a set of previously classified documents, the characteristics of one or more categories; the advantages are a very good effectiveness, a considerable savings in terms of expert manpower, and domain independence. In this tutorial we look at the main approaches that have been taken towards automatic text categorisation within the general machine learning paradigm. Issues of document indexing, classifier construction, and classifier evaluation, will be touched upon.
  3. Zadeh, B.Q.; Handschuh, S.: ¬The ACL RD-TEC : a dataset for benchmarking terminology extraction and classification in computational linguistics (2014) 0.01
    0.008156957 = product of:
      0.057098698 = sum of:
        0.028549349 = weight(_text_:classification in 2803) [ClassicSimilarity], result of:
          0.028549349 = score(doc=2803,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.29856625 = fieldWeight in 2803, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=2803)
        0.028549349 = weight(_text_:classification in 2803) [ClassicSimilarity], result of:
          0.028549349 = score(doc=2803,freq=4.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.29856625 = fieldWeight in 2803, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=2803)
      0.14285715 = coord(2/14)
    
    Abstract
    This paper introduces ACL RD-TEC: a dataset for evaluating the extraction and classification of terms from literature in the domain of computational linguistics. The dataset is derived from the Association for Computational Linguistics anthology reference corpus (ACL ARC). In its first release, the ACL RD-TEC consists of automatically segmented, part-of-speech-tagged ACL ARC documents, three lists of candidate terms, and more than 82,000 manually annotated terms. The annotated terms are marked as either valid or invalid, and valid terms are further classified as technology and non-technology terms. Technology terms signify methods, algorithms, and solutions in computational linguistics. The paper describes the dataset and reports the relevant statistics. We hope the step described in this paper encourages a collaborative effort towards building a full-fledged annotated corpus from the computational linguistics literature.
  4. Snajder, J.; Almic, P.: Modeling semantic compositionality of Croatian multiword expressions (2015) 0.01
    0.00576784 = product of:
      0.04037488 = sum of:
        0.02018744 = weight(_text_:classification in 2920) [ClassicSimilarity], result of:
          0.02018744 = score(doc=2920,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 2920, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=2920)
        0.02018744 = weight(_text_:classification in 2920) [ClassicSimilarity], result of:
          0.02018744 = score(doc=2920,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 2920, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=2920)
      0.14285715 = coord(2/14)
    
    Abstract
    A distinguishing feature of many multiword expressions (MWEs) is their semantic non-compositionality. Determining the semantic compositionality of MWEs is important for many natural language processing tasks. We address the task of modeling semantic compositionality of Croatian MWEs. We adopt a composition-based approach within the distributional semantics framework. We build and evaluate models based on Latent Semantic Analysis and the recently proposed neural network-based Skip-gram model, and experiment with different composition functions. We show that the compositionality scores predicted by the Skip-gram additive models correlate well with human judgments (=0.50). When framed as a classification task, the model achieves an accuracy of 0.64.
  5. Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I.: Improving language understanding by Generative Pre-Training 0.01
    0.00576784 = product of:
      0.04037488 = sum of:
        0.02018744 = weight(_text_:classification in 870) [ClassicSimilarity], result of:
          0.02018744 = score(doc=870,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 870, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=870)
        0.02018744 = weight(_text_:classification in 870) [ClassicSimilarity], result of:
          0.02018744 = score(doc=870,freq=2.0), product of:
            0.09562149 = queryWeight, product of:
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.03002521 = queryNorm
            0.21111822 = fieldWeight in 870, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1847067 = idf(docFreq=4974, maxDocs=44218)
              0.046875 = fieldNorm(doc=870)
      0.14285715 = coord(2/14)
    
    Abstract
    Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering, semantic similarity assessment, and document classification. Although large unlabeled text corpora are abundant, labeled data for learning these specific tasks is scarce, making it challenging for discriminatively trained models to perform adequately. We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture. We demonstrate the effectiveness of our approach on a wide range of benchmarks for natural language understanding. Our general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied. For instance, we achieve absolute improvements of 8.9% on commonsense reasoning (Stories Cloze Test), 5.7% on question answering (RACE), and 1.5% on textual entailment (MultiNLI).
  6. Nagy T., I.: Detecting multiword expressions and named entities in natural language texts (2014) 0.00
    0.0030528049 = product of:
      0.042739265 = sum of:
        0.042739265 = product of:
          0.08547853 = sum of:
            0.08547853 = weight(_text_:texts in 1536) [ClassicSimilarity], result of:
              0.08547853 = score(doc=1536,freq=12.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.51928985 = fieldWeight in 1536, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1536)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Abstract
    Multiword expressions (MWEs) are lexical items that can be decomposed into single words and display lexical, syntactic, semantic, pragmatic and/or statistical idiosyncrasy (Sag et al., 2002; Kim, 2008; Calzolari et al., 2002). The proper treatment of multiword expressions such as rock 'n' roll and make a decision is essential for many natural language processing (NLP) applications like information extraction and retrieval, terminology extraction and machine translation, and it is important to identify multiword expressions in context. For example, in machine translation we must know that MWEs form one semantic unit, hence their parts should not be translated separately. For this, multiword expressions should be identified first in the text to be translated. The chief aim of this thesis is to develop machine learning-based approaches for the automatic detection of different types of multiword expressions in English and Hungarian natural language texts. In our investigations, we pay attention to the characteristics of different types of multiword expressions such as nominal compounds, multiword named entities and light verb constructions, and we apply novel methods to identify MWEs in raw texts. In the thesis it will be demonstrated that nominal compounds and multiword amed entities may require a similar approach for their automatic detection as they behave in the same way from a linguistic point of view. Furthermore, it will be shown that the automatic detection of light verb constructions can be carried out using two effective machine learning-based approaches.
    In this thesis, we focused on the automatic detection of multiword expressions in natural language texts. On the basis of the main contributions, we can argue that: - Supervised machine learning methods can be successfully applied for the automatic detection of different types of multiword expressions in natural language texts. - Machine learning-based multiword expression detection can be successfully carried out for English as well as for Hungarian. - Our supervised machine learning-based model was successfully applied to the automatic detection of nominal compounds from English raw texts. - We developed a Wikipedia-based dictionary labeling method to automatically detect English nominal compounds. - A prior knowledge of nominal compounds can enhance Named Entity Recognition, while previously identified named entities can assist the nominal compound identification process. - The machine learning-based method can also provide acceptable results when it was trained on an automatically generated silver standard corpus. - As named entities form one semantic unit and may consist of more than one word and function as a noun, we can treat them in a similar way to nominal compounds. - Our sequence labelling-based tool can be successfully applied for identifying verbal light verb constructions in two typologically different languages, namely English and Hungarian. - Domain adaptation techniques may help diminish the distance between domains in the automatic detection of light verb constructions. - Our syntax-based method can be successfully applied for the full-coverage identification of light verb constructions. As a first step, a data-driven candidate extraction method can be utilized. After, a machine learning approach that makes use of an extended and rich feature set selects LVCs among extracted candidates. - When a precise syntactic parser is available for the actual domain, the full-coverage identification can be performed better. In other cases, the usage of the sequence labeling method is recommended.
  7. Altmann, E.G.; Cristadoro, G.; Esposti, M.D.: On the origin of long-range correlations in texts (2012) 0.00
    0.0030214933 = product of:
      0.042300906 = sum of:
        0.042300906 = product of:
          0.08460181 = sum of:
            0.08460181 = weight(_text_:texts in 330) [ClassicSimilarity], result of:
              0.08460181 = score(doc=330,freq=4.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.5139637 = fieldWeight in 330, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.046875 = fieldNorm(doc=330)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Abstract
    The complexity of human interactions with social and natural phenomena is mirrored in the way we describe our experiences through natural language. In order to retain and convey such a high dimensional information, the statistical properties of our linguistic output has to be highly correlated in time. An example are the robust observations, still largely not understood, of correlations on arbitrary long scales in literary texts. In this paper we explain how long-range correlations flow from highly structured linguistic levels down to the building blocks of a text (words, letters, etc..). By combining calculations and data analysis we show that correlations take form of a bursty sequence of events once we approach the semantically relevant topics of the text. The mechanisms we identify are fairly general and can be equally applied to other hierarchical settings.
  8. Collard, J.; Paiva, V. de; Fong, B.; Subrahmanian, E.: Extracting mathematical concepts from text (2022) 0.00
    0.0024926048 = product of:
      0.034896467 = sum of:
        0.034896467 = product of:
          0.069792934 = sum of:
            0.069792934 = weight(_text_:texts in 668) [ClassicSimilarity], result of:
              0.069792934 = score(doc=668,freq=2.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.42399842 = fieldWeight in 668, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=668)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Abstract
    We investigate different systems for extracting mathematical entities from English texts in the mathematical field of category theory as a first step for constructing a mathematical knowledge graph. We consider four different term extractors and compare their results. This small experiment showcases some of the issues with the construction and evaluation of terms extracted from noisy domain text. We also make available two open corpora in research mathematics, in particular in category theory: a small corpus of 755 abstracts from the journal TAC (3188 sentences), and a larger corpus from the nLab community wiki (15,000 sentences).
  9. Hausser, R.: Grammatical disambiguation : the linear complexity hypothesis for natural language (2020) 0.00
    0.0018186709 = product of:
      0.02546139 = sum of:
        0.02546139 = weight(_text_:subject in 22) [ClassicSimilarity], result of:
          0.02546139 = score(doc=22,freq=2.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.23709705 = fieldWeight in 22, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.046875 = fieldNorm(doc=22)
      0.071428575 = coord(1/14)
    
    Abstract
    DBS uses a strictly time-linear derivation order. Therefore the basic computational complexity degree of DBS is linear time. The only way to increase DBS complexity above linear is repeating ambiguity. In natural language, however, repeating ambiguity is prevented by grammatical disambiguation. A classic example of a grammatical ambiguity is the 'garden path' sentence The horse raced by the barn fell. The continuation horse+raced introduces an ambiguity between horse which raced and horse which was raced, leading to two parallel derivation strands up to The horse raced by the barn. Depending on whether the continuation is interpunctuation or a verb, they are grammatically disambiguated, resulting in unambiguous output. A repeated ambiguity occurs in The man who loves the woman who feeds Lucy who Peter loves., with who serving as subject or as object. These readings are grammatically disambiguated by continuing after who with a verb or a noun.
  10. Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009) 0.00
    0.0017434291 = product of:
      0.024408007 = sum of:
        0.024408007 = product of:
          0.048816014 = sum of:
            0.048816014 = weight(_text_:22 in 4888) [ClassicSimilarity], result of:
              0.048816014 = score(doc=4888,freq=2.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.46428138 = fieldWeight in 4888, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4888)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Date
    1. 3.2013 14:56:22
  11. Franke-Maier, M.: Computerlinguistik und Bibliotheken : Editorial (2016) 0.00
    0.0015155592 = product of:
      0.021217827 = sum of:
        0.021217827 = weight(_text_:subject in 3206) [ClassicSimilarity], result of:
          0.021217827 = score(doc=3206,freq=2.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.19758089 = fieldWeight in 3206, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3206)
      0.071428575 = coord(1/14)
    
    Abstract
    Vor 50 Jahren, im Februar 1966, wies Floyd M. Cammack auf den Zusammenhang von "Linguistics and Libraries" hin. Er ging dabei von dem Eintrag für "Linguistics" in den Library of Congress Subject Headings (LCSH) von 1957 aus, der als Verweis "See Language and Languages; Philology; Philology, Comparative" enthielt. Acht Jahre später kamen unter dem Schlagwort "Language and Languages" Ergänzungen wie "language data processing", "automatic indexing", "machine translation" und "psycholinguistics" hinzu. Für Cammack zeigt sich hier ein Netz komplexer Wechselbeziehungen, die unter dem Begriff "Linguistics" zusammengefasst werden sollten. Dieses System habe wichtigen Einfluss auf alle, die mit dem Sammeln, Organisieren, Speichern und Wiederauffinden von Informationen befasst seien. (Cammack 1966:73). Hier liegt - im übertragenen Sinne - ein Heft vor Ihnen, in dem es um computerlinguistische Verfahren in Bibliotheken geht. Letztlich geht es um eine Versachlichung der Diskussion, um den Stellenwert der Inhaltserschliessung und die Rekalibrierung ihrer Wertschätzung in Zeiten von Mega-Indizes und Big Data. Der derzeitige Widerspruch zwischen dem Wunsch nach relevanter Treffermenge in Rechercheoberflächen vs. der Erfahrung des Relevanz-Rankings ist zu lösen. Explizit auch die Frage, wie oft wir von letzterem enttäuscht wurden und was zu tun ist, um das Verhältnis von recall und precision wieder in ein angebrachtes Gleichgewicht zu bringen. Unsere Nutzerinnen und Nutzer werden es uns danken.
  12. Zhai, X.: ChatGPT user experience: : implications for education (2022) 0.00
    0.0015155592 = product of:
      0.021217827 = sum of:
        0.021217827 = weight(_text_:subject in 849) [ClassicSimilarity], result of:
          0.021217827 = score(doc=849,freq=2.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.19758089 = fieldWeight in 849, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.0390625 = fieldNorm(doc=849)
      0.071428575 = coord(1/14)
    
    Abstract
    ChatGPT, a general-purpose conversation chatbot released on November 30, 2022, by OpenAI, is expected to impact every aspect of society. However, the potential impacts of this NLP tool on education remain unknown. Such impact can be enormous as the capacity of ChatGPT may drive changes to educational learning goals, learning activities, and assessment and evaluation practices. This study was conducted by piloting ChatGPT to write an academic paper, titled Artificial Intelligence for Education (see Appendix A). The piloting result suggests that ChatGPT is able to help researchers write a paper that is coherent, (partially) accurate, informative, and systematic. The writing is extremely efficient (2-3 hours) and involves very limited professional knowledge from the author. Drawing upon the user experience, I reflect on the potential impacts of ChatGPT, as well as similar AI tools, on education. The paper concludes by suggesting adjusting learning goals-students should be able to use AI tools to conduct subject-domain tasks and education should focus on improving students' creativity and critical thinking rather than general skills. To accomplish the learning goals, researchers should design AI-involved learning tasks to engage students in solving real-world problems. ChatGPT also raises concerns that students may outsource assessment tasks. This paper concludes that new formats of assessments are needed to focus on creativity and critical thinking that AI cannot substitute.
  13. Aydin, Ö.; Karaarslan, E.: OpenAI ChatGPT generated literature review: : digital twin in healthcare (2022) 0.00
    0.0012124473 = product of:
      0.016974261 = sum of:
        0.016974261 = weight(_text_:subject in 851) [ClassicSimilarity], result of:
          0.016974261 = score(doc=851,freq=2.0), product of:
            0.10738805 = queryWeight, product of:
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03002521 = queryNorm
            0.15806471 = fieldWeight in 851, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.576596 = idf(docFreq=3361, maxDocs=44218)
              0.03125 = fieldNorm(doc=851)
      0.071428575 = coord(1/14)
    
    Abstract
    Literature review articles are essential to summarize the related work in the selected field. However, covering all related studies takes too much time and effort. This study questions how Artificial Intelligence can be used in this process. We used ChatGPT to create a literature review article to show the stage of the OpenAI ChatGPT artificial intelligence application. As the subject, the applications of Digital Twin in the health field were chosen. Abstracts of the last three years (2020, 2021 and 2022) papers were obtained from the keyword "Digital twin in healthcare" search results on Google Scholar and paraphrased by ChatGPT. Later on, we asked ChatGPT questions. The results are promising; however, the paraphrased parts had significant matches when checked with the Ithenticate tool. This article is the first attempt to show the compilation and expression of knowledge will be accelerated with the help of artificial intelligence. We are still at the beginning of such advances. The future academic publishing process will require less human effort, which in turn will allow academics to focus on their studies. In future studies, we will monitor citations to this study to evaluate the academic validity of the content produced by the ChatGPT. 1. Introduction OpenAI ChatGPT (ChatGPT, 2022) is a chatbot based on the OpenAI GPT-3 language model. It is designed to generate human-like text responses to user input in a conversational context. OpenAI ChatGPT is trained on a large dataset of human conversations and can be used to create responses to a wide range of topics and prompts. The chatbot can be used for customer service, content creation, and language translation tasks, creating replies in multiple languages. OpenAI ChatGPT is available through the OpenAI API, which allows developers to access and integrate the chatbot into their applications and systems. OpenAI ChatGPT is a variant of the GPT (Generative Pre-trained Transformer) language model developed by OpenAI. It is designed to generate human-like text, allowing it to engage in conversation with users naturally and intuitively. OpenAI ChatGPT is trained on a large dataset of human conversations, allowing it to understand and respond to a wide range of topics and contexts. It can be used in various applications, such as chatbots, customer service agents, and language translation systems. OpenAI ChatGPT is a state-of-the-art language model able to generate coherent and natural text that can be indistinguishable from text written by a human. As an artificial intelligence, ChatGPT may need help to change academic writing practices. However, it can provide information and guidance on ways to improve people's academic writing skills.
  14. Lezius, W.: Morphy - Morphologie und Tagging für das Deutsche (2013) 0.00
    0.0011622861 = product of:
      0.016272005 = sum of:
        0.016272005 = product of:
          0.03254401 = sum of:
            0.03254401 = weight(_text_:22 in 1490) [ClassicSimilarity], result of:
              0.03254401 = score(doc=1490,freq=2.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.30952093 = fieldWeight in 1490, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1490)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Date
    22. 3.2015 9:30:24
  15. Bager, J.: ¬Die Text-KI ChatGPT schreibt Fachtexte, Prosa, Gedichte und Programmcode (2023) 0.00
    0.0011622861 = product of:
      0.016272005 = sum of:
        0.016272005 = product of:
          0.03254401 = sum of:
            0.03254401 = weight(_text_:22 in 835) [ClassicSimilarity], result of:
              0.03254401 = score(doc=835,freq=2.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.30952093 = fieldWeight in 835, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=835)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Date
    29.12.2022 18:22:55
  16. Rieger, F.: Lügende Computer (2023) 0.00
    0.0011622861 = product of:
      0.016272005 = sum of:
        0.016272005 = product of:
          0.03254401 = sum of:
            0.03254401 = weight(_text_:22 in 912) [ClassicSimilarity], result of:
              0.03254401 = score(doc=912,freq=2.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.30952093 = fieldWeight in 912, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=912)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Date
    16. 3.2023 19:22:55
  17. RWI/PH: Auf der Suche nach dem entscheidenden Wort : die Häufung bestimmter Wörter innerhalb eines Textes macht diese zu Schlüsselwörtern (2012) 0.00
    0.0010682592 = product of:
      0.014955629 = sum of:
        0.014955629 = product of:
          0.029911257 = sum of:
            0.029911257 = weight(_text_:texts in 331) [ClassicSimilarity], result of:
              0.029911257 = score(doc=331,freq=2.0), product of:
                0.16460659 = queryWeight, product of:
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.03002521 = queryNorm
                0.18171361 = fieldWeight in 331, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.4822793 = idf(docFreq=499, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=331)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Footnote
    Pressemitteilung zum Artikel: Eduardo G. Altmann, Giampaolo Cristadoro and Mirko Degli Esposti: On the origin of long-range correlations in texts. In: Proceedings of the National Academy of Sciences, 2. Juli 2012. DOI: 10.1073/pnas.1117723109.
  18. Rötzer, F.: KI-Programm besser als Menschen im Verständnis natürlicher Sprache (2018) 0.00
    5.811431E-4 = product of:
      0.008136002 = sum of:
        0.008136002 = product of:
          0.016272005 = sum of:
            0.016272005 = weight(_text_:22 in 4217) [ClassicSimilarity], result of:
              0.016272005 = score(doc=4217,freq=2.0), product of:
                0.10514317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03002521 = queryNorm
                0.15476047 = fieldWeight in 4217, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4217)
          0.5 = coord(1/2)
      0.071428575 = coord(1/14)
    
    Date
    22. 1.2018 11:32:44