Search (62 results, page 1 of 4)

  • Filter: type_ss:"el"
  • Filter: theme_ss:"Computerlinguistik"
  1. Stoykova, V.; Petkova, E.: Automatic extraction of mathematical terms for precalculus (2012) 0.04
    0.039748333 = product of:
      0.11924499 = sum of:
        0.05872617 = weight(_text_:applications in 156) [ClassicSimilarity], result of:
          0.05872617 = score(doc=156,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.34048924 = fieldWeight in 156, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0546875 = fieldNorm(doc=156)
        0.0128330635 = weight(_text_:of in 156) [ClassicSimilarity], result of:
          0.0128330635 = score(doc=156,freq=6.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.20947541 = fieldWeight in 156, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=156)
        0.047685754 = weight(_text_:software in 156) [ClassicSimilarity], result of:
          0.047685754 = score(doc=156,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.30681872 = fieldWeight in 156, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0546875 = fieldNorm(doc=156)
      0.33333334 = coord(3/9)
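    The breakdown above is Lucene's ClassicSimilarity explain output: each matching query term contributes queryWeight x fieldWeight, where queryWeight = idf x queryNorm and fieldWeight = tf x idf x fieldNorm (with tf = sqrt(termFreq)), and the sum is scaled by a coordination factor for the fraction of query clauses matched. A minimal Python sketch reproducing the displayed score from the constants above (the constants are copied from the explain output; the rest is illustrative):
      from math import sqrt

      def term_score(freq, idf, query_norm, field_norm):
          query_weight = idf * query_norm               # idf(t) * queryNorm
          field_weight = sqrt(freq) * idf * field_norm  # tf(t in d) * idf(t) * fieldNorm
          return query_weight * field_weight

      # (termFreq, idf) for "applications", "of", "software" in doc 156
      terms = [(2.0, 4.4025097), (6.0, 1.5637573), (2.0, 3.9671519)]
      query_norm, field_norm = 0.03917671, 0.0546875
      raw = sum(term_score(f, i, query_norm, field_norm) for f, i in terms)
      print(raw * 3 / 9)  # coord(3/9): 3 of 9 query clauses matched -> ~0.039748, shown as 0.04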
    
    Abstract
    In this work, we present the results of research evaluating a methodology for extracting mathematical terms for precalculus using techniques for semantically oriented statistical search. We use a corpus-based approach and a combination of statistical techniques for extracting keywords, collocations and co-occurrences, as incorporated in the Sketch Engine software. We evaluate the candidate collocation terms for the basic concept function(s) and validate the methodology against precalculus domain concept definitions. Finally, we offer a hierarchical representation of the conceptual terms and discuss the results with respect to their possible applications.
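    The statistical side of such a pipeline can be illustrated with one common association measure; the sketch below ranks adjacent-bigram collocation candidates by pointwise mutual information (the toy corpus and frequency threshold are assumptions for illustration, not Sketch Engine's actual measures):
      from collections import Counter
      from math import log2

      def pmi_bigrams(tokens, min_freq=2):
          # Score adjacent word pairs by how much more often they co-occur
          # than their unigram frequencies would predict.
          unigrams, bigrams = Counter(tokens), Counter(zip(tokens, tokens[1:]))
          n = len(tokens)
          scores = {}
          for (w1, w2), f in bigrams.items():
              if f >= min_freq:
                  expected = (unigrams[w1] / n) * (unigrams[w2] / n)
                  scores[(w1, w2)] = log2((f / (n - 1)) / expected)
          return sorted(scores.items(), key=lambda kv: -kv[1])

      text = "the graph of a function the domain of a function".split()
      print(pmi_bigrams(text))  # ('of', 'a') and ('a', 'function') surface as candidates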
  2. Shen, M.; Liu, D.-R.; Huang, Y.-S.: Extracting semantic relations to enrich domain ontologies (2012) 0.03
    0.034636453 = product of:
      0.10390935 = sum of:
        0.05872617 = weight(_text_:applications in 267) [ClassicSimilarity], result of:
          0.05872617 = score(doc=267,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.34048924 = fieldWeight in 267, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0546875 = fieldNorm(doc=267)
        0.016567415 = weight(_text_:of in 267) [ClassicSimilarity], result of:
          0.016567415 = score(doc=267,freq=10.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2704316 = fieldWeight in 267, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=267)
        0.028615767 = weight(_text_:systems in 267) [ClassicSimilarity], result of:
          0.028615767 = score(doc=267,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.23767869 = fieldWeight in 267, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=267)
      0.33333334 = coord(3/9)
    
    Abstract
    Domain ontologies facilitate the organization, sharing and reuse of domain knowledge, and enable various vertical domain applications to operate successfully. Most methods for automatically constructing ontologies focus on taxonomic relations, such as is-kind-of and is-part-of relations. However, much of the domain-specific semantics is ignored. This work proposes a semi-unsupervised approach for extracting semantic relations from domain-specific text documents. The approach effectively utilizes text mining and existing taxonomic relations in domain ontologies to discover candidate keywords that can represent semantic relations. A preliminary experiment on the natural science domain (Taiwan K9 education) indicates that the proposed method yields valuable recommendations. This work enriches domain ontologies by adding distilled semantics.
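    The core idea can be sketched as mining candidate relation labels from sentences that mention two concepts already linked in the ontology and collecting the verbs between them; the verb list and mini-corpus below are toy assumptions standing in for a real tagger and document collection:
      from collections import Counter

      VERBS = {"nurses", "pumps", "causes"}  # stand-in for a real POS tagger

      def relation_candidates(sentences, concept_pairs):
          # concept_pairs: pairs already related taxonomically in the ontology
          found = Counter()
          for s in sentences:
              words = s.lower().strip(".").split()
              for a, b in concept_pairs:
                  if a in words and b in words:
                      i, j = sorted((words.index(a), words.index(b)))
                      for w in words[i + 1:j]:
                          if w in VERBS:
                              found[(a, w, b)] += 1
          return found.most_common()

      print(relation_candidates(
          ["A mammal nurses its young.", "The heart pumps blood."],
          [("mammal", "young"), ("heart", "blood")]))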
    Source
    Journal of Intelligent Information Systems
  3. Aydin, Ö.; Karaarslan, E.: OpenAI ChatGPT generated literature review : digital twin in healthcare (2022) 0.03
    0.03217137 = product of:
      0.0965141 = sum of:
        0.05812384 = weight(_text_:applications in 851) [ClassicSimilarity], result of:
          0.05812384 = score(doc=851,freq=6.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.33699697 = fieldWeight in 851, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03125 = fieldNorm(doc=851)
        0.01526523 = weight(_text_:of in 851) [ClassicSimilarity], result of:
          0.01526523 = score(doc=851,freq=26.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2491759 = fieldWeight in 851, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03125 = fieldNorm(doc=851)
        0.023125032 = weight(_text_:systems in 851) [ClassicSimilarity], result of:
          0.023125032 = score(doc=851,freq=4.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.19207339 = fieldWeight in 851, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03125 = fieldNorm(doc=851)
      0.33333334 = coord(3/9)
    
    Abstract
    Literature review articles are essential to summarize the related work in a selected field. However, covering all related studies takes too much time and effort. This study asks how artificial intelligence can be used in this process. We used ChatGPT to create a literature review article in order to show the current stage of the OpenAI ChatGPT artificial intelligence application. As the subject, the applications of Digital Twin in the health field were chosen. Abstracts of papers from the last three years (2020, 2021 and 2022) were obtained from the keyword "Digital twin in healthcare" search results on Google Scholar and paraphrased by ChatGPT. Later on, we asked ChatGPT questions. The results are promising; however, the paraphrased parts had significant matches when checked with the iThenticate tool. This article is a first attempt to show that the compilation and expression of knowledge will be accelerated with the help of artificial intelligence. We are still at the beginning of such advances. The future academic publishing process will require less human effort, which in turn will allow academics to focus on their studies. In future studies, we will monitor citations to this study to evaluate the academic validity of the content produced by ChatGPT.
    1. Introduction
    OpenAI ChatGPT (ChatGPT, 2022) is a chatbot based on the OpenAI GPT-3 language model. It is designed to generate human-like text responses to user input in a conversational context. OpenAI ChatGPT is trained on a large dataset of human conversations and can be used to create responses to a wide range of topics and prompts. The chatbot can be used for customer service, content creation and language translation tasks, creating replies in multiple languages. OpenAI ChatGPT is available through the OpenAI API, which allows developers to access and integrate the chatbot into their applications and systems. It is a variant of the GPT (Generative Pre-trained Transformer) language model developed by OpenAI, designed to generate human-like text so that it can engage in conversation with users naturally and intuitively. Trained on a large dataset of human conversations, it can understand and respond to a wide range of topics and contexts, and it can be used in various applications, such as chatbots, customer service agents and language translation systems. OpenAI ChatGPT is a state-of-the-art language model able to generate coherent and natural text that can be indistinguishable from text written by a human. As an artificial intelligence, ChatGPT may need help to change academic writing practices. However, it can provide information and guidance on ways to improve people's academic writing skills.
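    The API access mentioned in the introduction can be sketched as follows, using the openai Python client; the model name and prompt here are illustrative assumptions, and the paper's actual workflow (scraping Google Scholar abstracts and having ChatGPT paraphrase them via the web interface) is not reproduced:
      from openai import OpenAI

      client = OpenAI()  # reads OPENAI_API_KEY from the environment
      response = client.chat.completions.create(
          model="gpt-3.5-turbo",  # illustrative choice; the paper used the ChatGPT web interface
          messages=[{"role": "user",
                     "content": "Paraphrase: Digital twins are increasingly used in healthcare."}],
      )
      print(response.choices[0].message.content)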
  4. Park, J.S.; O'Brien, J.C.; Cai, C.J.; Ringel Morris, M.; Liang, P.; Bernstein, M.S.: Generative agents : interactive simulacra of human behavior (2023) 0.03
    0.030003514 = product of:
      0.09001054 = sum of:
        0.041947264 = weight(_text_:applications in 972) [ClassicSimilarity], result of:
          0.041947264 = score(doc=972,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.2432066 = fieldWeight in 972, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0390625 = fieldNorm(doc=972)
        0.0140020205 = weight(_text_:of in 972) [ClassicSimilarity], result of:
          0.0140020205 = score(doc=972,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.22855641 = fieldWeight in 972, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=972)
        0.034061253 = weight(_text_:software in 972) [ClassicSimilarity], result of:
          0.034061253 = score(doc=972,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.21915624 = fieldWeight in 972, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0390625 = fieldNorm(doc=972)
      0.33333334 = coord(3/9)
    
    Abstract
    Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools. In this paper, we introduce generative agents--computational software agents that simulate believable human behavior. Generative agents wake up, cook breakfast, and head to work; artists paint, while authors write; they form opinions, notice each other, and initiate conversations; they remember and reflect on days past as they plan the next day. To enable generative agents, we describe an architecture that extends a large language model to store a complete record of the agent's experiences using natural language, synthesize those memories over time into higher-level reflections, and retrieve them dynamically to plan behavior. We instantiate generative agents to populate an interactive sandbox environment inspired by The Sims, where end users can interact with a small town of twenty-five agents using natural language. In an evaluation, these generative agents produce believable individual and emergent social behaviors: for example, starting with only a single user-specified notion that one agent wants to throw a Valentine's Day party, the agents autonomously spread invitations to the party over the next two days, make new acquaintances, ask each other out on dates to the party, and coordinate to show up for the party together at the right time. We demonstrate through ablation that the components of our agent architecture--observation, planning, and reflection--each contribute critically to the believability of agent behavior. By fusing large language models with computational, interactive agents, this work introduces architectural and interaction patterns for enabling believable simulations of human behavior.
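    The dynamic retrieval step the abstract describes can be sketched as scoring each memory by recency, importance, and relevance; the equal weighting, the decay constant, and the toy embedding below are assumptions standing in for the paper's tuned components:
      from dataclasses import dataclass
      from math import sqrt

      def cosine(u, v):
          dot = sum(a * b for a, b in zip(u, v))
          return dot / ((sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))) or 1.0)

      @dataclass
      class Memory:
          text: str
          importance: float   # 1..10, scored once when the memory is stored
          last_access: float  # game-time hours

      def retrieve(memories, query, embed, now, k=1, decay=0.995):
          def score(m):
              recency = decay ** (now - m.last_access)        # exponential decay
              relevance = cosine(embed(m.text), embed(query)) # embedding similarity
              return recency + m.importance / 10.0 + relevance  # equal weights assumed
          return sorted(memories, key=score, reverse=True)[:k]

      embed = lambda s: [float(len(s)), 1.0]  # toy stand-in for a real embedding model
      mems = [Memory("planned a Valentine's Day party", 8.0, 10.0),
              Memory("ate breakfast", 2.0, 11.5)]
      print(retrieve(mems, "party", embed, now=12.0)[0].text)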
  5. Perovšek, M.; Kranjc, J.; Erjavec, T.; Cestnik, B.; Lavrač, N.: TextFlows : a visual programming platform for text mining and natural language processing (2016) 0.02
    0.01981098 = product of:
      0.08914941 = sum of:
        0.07118686 = weight(_text_:applications in 2697) [ClassicSimilarity], result of:
          0.07118686 = score(doc=2697,freq=4.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.41273528 = fieldWeight in 2697, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.046875 = fieldNorm(doc=2697)
        0.017962547 = weight(_text_:of in 2697) [ClassicSimilarity], result of:
          0.017962547 = score(doc=2697,freq=16.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2932045 = fieldWeight in 2697, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2697)
      0.22222222 = coord(2/9)
    
    Abstract
    Text mining and natural language processing are fast growing areas of research, with numerous applications in business, science and creative industries. This paper presents TextFlows, a web-based text mining and natural language processing platform supporting workflow construction, sharing and execution. The platform enables visual construction of text mining workflows through a web browser, and the execution of the constructed workflows on a processing cloud. This makes TextFlows an adaptable infrastructure for the construction and sharing of text processing workflows, which can be reused in various applications. The paper presents the implemented text mining and language processing modules, and describes some precomposed workflows. Their features are demonstrated on three use cases: comparison of document classifiers and of different part-of-speech taggers on a text categorization problem, and outlier detection in document corpora.
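    The first use case (comparing document classifiers on a text categorization problem) looks roughly like this when rewritten as plain scikit-learn code rather than as a TextFlows workflow; the dataset and the two classifiers are illustrative choices:
      from sklearn.datasets import fetch_20newsgroups
      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import cross_val_score
      from sklearn.naive_bayes import MultinomialNB
      from sklearn.pipeline import make_pipeline

      # Two-category text classification benchmark (downloads data on first run)
      data = fetch_20newsgroups(subset="train", categories=["sci.med", "sci.space"])
      for clf in (MultinomialNB(), LogisticRegression(max_iter=1000)):
          pipe = make_pipeline(TfidfVectorizer(), clf)
          scores = cross_val_score(pipe, data.data, data.target, cv=5)
          print(type(clf).__name__, round(scores.mean(), 3))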
    Source
    Science of Computer Programming. In press, 2016
  6. Jha, A.: Why GPT-4 isn't all it's cracked up to be (2023) 0.02
    0.01889123 = product of:
      0.05667369 = sum of:
        0.018522931 = weight(_text_:of in 923) [ClassicSimilarity], result of:
          0.018522931 = score(doc=923,freq=50.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.3023517 = fieldWeight in 923, product of:
              7.071068 = tf(freq=50.0), with freq of:
                50.0 = termFreq=50.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.02734375 = fieldNorm(doc=923)
        0.014307884 = weight(_text_:systems in 923) [ClassicSimilarity], result of:
          0.014307884 = score(doc=923,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.118839346 = fieldWeight in 923, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.02734375 = fieldNorm(doc=923)
        0.023842877 = weight(_text_:software in 923) [ClassicSimilarity], result of:
          0.023842877 = score(doc=923,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.15340936 = fieldWeight in 923, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.02734375 = fieldNorm(doc=923)
      0.33333334 = coord(3/9)
    
    Abstract
    "I still don't know what to think about GPT-4, the new large language model (LLM) from OpenAI. On the one hand it is a remarkable product that easily passes the Turing test. If you ask it questions, via the ChatGPT interface, GPT-4 can easily produce fluid sentences largely indistinguishable from those a person might write. But on the other hand, amid the exceptional levels of hype and anticipation, it's hard to know where GPT-4 and other LLMs truly fit in the larger project of making machines intelligent.
    They might appear intelligent, but LLMs are nothing of the sort. They don't understand the meanings of the words they are using, nor the concepts expressed within the sentences they create. When asked how to bring a cow back to life, earlier versions of ChatGPT, for example, which ran on a souped-up version of GPT-3, would confidently provide a list of instructions. So-called hallucinations like this happen because language models have no concept of what a "cow" is or that "death" is a non-reversible state of being. LLMs do not have minds that can think about objects in the world and how they relate to each other. All they "know" is how likely it is that some sets of words will follow other sets of words, having calculated those probabilities from their training data. To make sense of all this, I spoke with Gary Marcus, an emeritus professor of psychology and neural science at New York University, for "Babbage", our science and technology podcast. Last year, as the world was transfixed by the sudden appearance of ChatGPT, he made some fascinating predictions about GPT-4.
    He doesn't dismiss the potential of LLMs to become useful assistants in all sorts of ways; Google and Microsoft have already announced that they will be integrating LLMs into their search and office productivity software. But he talked me through some of his criticisms of the technology's apparent capabilities. At the heart of Dr Marcus's thoughtful critique is an attempt to put LLMs into proper context. Deep learning, the underlying technology that makes LLMs work, is only one piece of the puzzle in the quest for machine intelligence. To reach the level of artificial general intelligence (AGI) that many tech companies strive for, i.e. machines that can plan, reason and solve problems in the way human brains can, they will need to deploy a suite of other AI techniques. These include, for example, the kind of "symbolic AI" that was popular before artificial neural networks and deep learning became all the rage.
    People use symbols to think about the world: if I say the words "cat", "house" or "aeroplane", you know instantly what I mean. Symbols can also be used to describe the way things are behaving (running, falling, flying) or they can represent how things should behave in relation to each other (a "+" means add the numbers before and after). Symbolic AI is a way to embed this human knowledge and reasoning into computer systems. Though the idea has been around for decades, it fell by the wayside a few years ago as deep learning, buoyed by the sudden easy availability of lots of training data and cheap computing power, became more fashionable. In the near future at least, there's no doubt people will find LLMs useful. But whether they represent a critical step on the path towards AGI, or rather just an intriguing detour, remains to be seen."
  7. Lezius, W.: Morphy - Morphologie und Tagging für das Deutsche (2013) 0.02
    0.016828805 = product of:
      0.07572962 = sum of:
        0.054498006 = weight(_text_:software in 1490) [ClassicSimilarity], result of:
          0.054498006 = score(doc=1490,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.35064998 = fieldWeight in 1490, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0625 = fieldNorm(doc=1490)
        0.021231614 = product of:
          0.042463228 = sum of:
            0.042463228 = weight(_text_:22 in 1490) [ClassicSimilarity], result of:
              0.042463228 = score(doc=1490,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.30952093 = fieldWeight in 1490, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1490)
          0.5 = coord(1/2)
      0.22222222 = coord(2/9)
    
    Abstract
    Morphy is a freely available software package for the morphological analysis and synthesis of German and for context-sensitive part-of-speech tagging. Use of the software is not subject to any restrictions. Since further development has been discontinued, use Morphy as is, i.e. at your own risk, without any liability or warranty and, above all, without support. Morphy is available for the Windows platform only and runs only on standalone PCs.
    Date
    22. 3.2015 9:30:24
  8. Schmid, H.: Improvements in Part-of-Speech tagging with an application to German (1995) 0.02
    0.0153698595 = product of:
      0.069164366 = sum of:
        0.014666359 = weight(_text_:of in 124) [ClassicSimilarity], result of:
          0.014666359 = score(doc=124,freq=6.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.23940048 = fieldWeight in 124, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=124)
        0.054498006 = weight(_text_:software in 124) [ClassicSimilarity], result of:
          0.054498006 = score(doc=124,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.35064998 = fieldWeight in 124, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0625 = fieldNorm(doc=124)
      0.22222222 = coord(2/9)
    
    Abstract
    This paper presents a couple of extensions to a basic Markov model tagger (called TreeTagger) which improve its accuracy when trained on small corpora. The basic tagger was originally developed for English (Schmid, 1994). Together, the extensions reduced error rates on a German test corpus by more than a third.
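    The core of such a Markov model tagger is Viterbi decoding over transition and emission probabilities; a minimal sketch with toy probability tables (TreeTagger's actual contribution, estimating transition probabilities with decision trees, is not reproduced here):
      def viterbi(words, tags, trans, emit, start):
          # best[t]: probability of the best tag path ending in tag t
          best = {t: start.get(t, 0.0) * emit.get((t, words[0]), 0.0) for t in tags}
          back = []
          for w in words[1:]:
              prev, best, ptr = best, {}, {}
              for t in tags:
                  p, b = max(((prev[s] * trans.get((s, t), 0.0), s) for s in tags),
                             key=lambda x: x[0])
                  best[t] = p * emit.get((t, w), 0.0)
                  ptr[t] = b
              back.append(ptr)
          t = max(best, key=best.get)   # best final tag
          path = [t]
          for ptr in reversed(back):    # follow back-pointers to recover the path
              t = ptr[t]
              path.append(t)
          return path[::-1]

      tags = {"DET", "NN"}
      print(viterbi(["the", "dog"], tags,
                    trans={("DET", "NN"): 0.9},
                    emit={("DET", "the"): 0.6, ("NN", "dog"): 0.3},
                    start={"DET": 0.7, "NN": 0.3}))  # -> ['DET', 'NN']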
    Content
    Paper for: Proceedings of the ACL SIGDAT-Workshop. Dublin, Ireland, 1995. For the TreeTagger software, see: http://www.ims.uni-stuttgart.de/~schmid/.
  9. Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I.: Language models are unsupervised multitask learners 0.01
    0.011942157 = product of:
      0.053739704 = sum of:
        0.019052157 = weight(_text_:of in 871) [ClassicSimilarity], result of:
          0.019052157 = score(doc=871,freq=18.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.3109903 = fieldWeight in 871, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=871)
        0.034687545 = weight(_text_:systems in 871) [ClassicSimilarity], result of:
          0.034687545 = score(doc=871,freq=4.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.28811008 = fieldWeight in 871, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=871)
      0.22222222 = coord(2/9)
    
    Abstract
    Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText. When conditioned on a document plus questions, the answers generated by the language model reach 55 F1 on the CoQA dataset - matching or exceeding the performance of 3 out of 4 baseline systems without using the 127,000+ training examples. The capacity of the language model is essential to the success of zero-shot task transfer and increasing it improves performance in a log-linear fashion across tasks. Our largest model, GPT-2, is a 1.5B parameter Transformer that achieves state-of-the-art results on 7 out of 8 tested language modeling datasets in a zero-shot setting but still underfits WebText. Samples from the model reflect these improvements and contain coherent paragraphs of text. These findings suggest a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
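    The released GPT-2 weights can be prompted zero-shot along the lines the abstract describes; a minimal sketch using the Hugging Face transformers library (an anachronistic convenience, since the paper predates that API; prompt and sampling length are illustrative):
      from transformers import pipeline

      # Downloads the public 124M-parameter GPT-2 checkpoint on first run
      generator = pipeline("text-generation", model="gpt2")
      prompt = "Q: What is the capital of France?\nA:"
      print(generator(prompt, max_new_tokens=10)[0]["generated_text"])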
  10. Scobel, G.: GPT: Eine Software, die die Welt verändert (2023) 0.01
    0.0107044205 = product of:
      0.096339785 = sum of:
        0.096339785 = weight(_text_:software in 839) [ClassicSimilarity], result of:
          0.096339785 = score(doc=839,freq=4.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.6198675 = fieldWeight in 839, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.078125 = fieldNorm(doc=839)
      0.11111111 = coord(1/9)
    
    Abstract
    GPT-3 is one of those developments that gain influence and reach within just a few months. The software will have a massive impact on the economy and society.
  11. Dias, G.: Multiword unit hybrid extraction (n.d.) 0.01
    0.010392102 = product of:
      0.04676446 = sum of:
        0.018148692 = weight(_text_:of in 643) [ClassicSimilarity], result of:
          0.018148692 = score(doc=643,freq=12.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.29624295 = fieldWeight in 643, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=643)
        0.028615767 = weight(_text_:systems in 643) [ClassicSimilarity], result of:
          0.028615767 = score(doc=643,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.23767869 = fieldWeight in 643, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=643)
      0.22222222 = coord(2/9)
    
    Abstract
    This paper describes an original hybrid system that extracts multiword unit candidates from part-of-speech tagged corpora. While classical hybrid systems manually define local part-of-speech patterns that lead to the identification of well-known multiword units (mainly compound nouns), our solution automatically identifies relevant syntactical patterns from the corpus. Word statistics are then combined with the endogenously acquired linguistic information in order to extract the most relevant sequences of words. As a result, (1) human intervention is avoided, providing total flexibility of use of the system, and (2) different multiword units like phrasal verbs, adverbial locutions and prepositional locutions may be identified. The system has been tested on the Brown Corpus, leading to encouraging results.
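    The hybrid idea can be sketched as filtering n-gram candidates through part-of-speech patterns before ranking the survivors statistically; the hand-written patterns below are illustrative, whereas the paper acquires them endogenously from the corpus:
      import re

      # Illustrative syntactic patterns over POS sequences (the paper learns these)
      PATTERNS = [re.compile(p) for p in (r"^(ADJ )*NOUN NOUN$", r"^NOUN ADP NOUN$")]

      def candidates(tagged, n):
          # tagged: list of (word, pos) pairs; yield n-grams whose POS sequence matches
          for i in range(len(tagged) - n + 1):
              gram = tagged[i:i + n]
              if any(p.match(" ".join(t for _, t in gram)) for p in PATTERNS):
                  yield " ".join(w for w, _ in gram)

      sent = [("part", "NOUN"), ("of", "ADP"), ("speech", "NOUN"), ("tagging", "NOUN")]
      print(list(candidates(sent, 3)) + list(candidates(sent, 2)))
      # -> ['part of speech', 'speech tagging']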
  12. Nagy T., I.: Detecting multiword expressions and named entities in natural language texts (2014) 0.01
    0.010386474 = product of:
      0.046739135 = sum of:
        0.029363085 = weight(_text_:applications in 1536) [ClassicSimilarity], result of:
          0.029363085 = score(doc=1536,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.17024462 = fieldWeight in 1536, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1536)
        0.01737605 = weight(_text_:of in 1536) [ClassicSimilarity], result of:
          0.01737605 = score(doc=1536,freq=44.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.28363106 = fieldWeight in 1536, product of:
              6.6332498 = tf(freq=44.0), with freq of:
                44.0 = termFreq=44.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1536)
      0.22222222 = coord(2/9)
    
    Abstract
    Multiword expressions (MWEs) are lexical items that can be decomposed into single words and display lexical, syntactic, semantic, pragmatic and/or statistical idiosyncrasy (Sag et al., 2002; Kim, 2008; Calzolari et al., 2002). The proper treatment of multiword expressions such as rock 'n' roll and make a decision is essential for many natural language processing (NLP) applications like information extraction and retrieval, terminology extraction and machine translation, and it is important to identify multiword expressions in context. For example, in machine translation we must know that MWEs form one semantic unit, hence their parts should not be translated separately. For this, multiword expressions should be identified first in the text to be translated. The chief aim of this thesis is to develop machine learning-based approaches for the automatic detection of different types of multiword expressions in English and Hungarian natural language texts. In our investigations, we pay attention to the characteristics of different types of multiword expressions such as nominal compounds, multiword named entities and light verb constructions, and we apply novel methods to identify MWEs in raw texts. In the thesis it will be demonstrated that nominal compounds and multiword named entities may require a similar approach for their automatic detection as they behave in the same way from a linguistic point of view. Furthermore, it will be shown that the automatic detection of light verb constructions can be carried out using two effective machine learning-based approaches.
    In this thesis, we focused on the automatic detection of multiword expressions in natural language texts. On the basis of the main contributions, we can argue that:
    • Supervised machine learning methods can be successfully applied for the automatic detection of different types of multiword expressions in natural language texts.
    • Machine learning-based multiword expression detection can be successfully carried out for English as well as for Hungarian.
    • Our supervised machine learning-based model was successfully applied to the automatic detection of nominal compounds from English raw texts.
    • We developed a Wikipedia-based dictionary labeling method to automatically detect English nominal compounds.
    • Prior knowledge of nominal compounds can enhance named entity recognition, while previously identified named entities can assist the nominal compound identification process.
    • The machine learning-based method can also provide acceptable results when trained on an automatically generated silver-standard corpus.
    • As named entities form one semantic unit, may consist of more than one word and function as a noun, we can treat them in a similar way to nominal compounds.
    • Our sequence labelling-based tool can be successfully applied for identifying verbal light verb constructions in two typologically different languages, namely English and Hungarian.
    • Domain adaptation techniques may help diminish the distance between domains in the automatic detection of light verb constructions.
    • Our syntax-based method can be successfully applied for the full-coverage identification of light verb constructions: as a first step, a data-driven candidate extraction method is utilized; afterwards, a machine learning approach that makes use of an extended and rich feature set selects LVCs among the extracted candidates.
    • When a precise syntactic parser is available for the actual domain, full-coverage identification performs better; in other cases, the use of the sequence labeling method is recommended.
    Imprint
    Szeged : University of Szeged, Faculty of Science and Informatics, Doctoral School of Computer Science
  13. Chowdhury, A.; McCabe, M.C.: Improving information retrieval systems using part of speech tagging (1993) 0.01
    0.009913453 = product of:
      0.044610538 = sum of:
        0.020082738 = weight(_text_:of in 1061) [ClassicSimilarity], result of:
          0.020082738 = score(doc=1061,freq=20.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.32781258 = fieldWeight in 1061, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=1061)
        0.0245278 = weight(_text_:systems in 1061) [ClassicSimilarity], result of:
          0.0245278 = score(doc=1061,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.2037246 = fieldWeight in 1061, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=1061)
      0.22222222 = coord(2/9)
    
    Abstract
    The object of Information Retrieval is to retrieve all relevant documents for a user query and only those relevant documents. Much research has focused on achieving this objective with little regard for storage overhead or performance. In this paper we evaluate the use of part-of-speech tagging to improve the index storage overhead and general speed of the system with only a minimal reduction in precision/recall measurements. We tagged 500 MB of the Los Angeles Times 1990 and 1989 document collection provided by TREC for parts of speech. We then experimented to find the most relevant parts of speech to index. We show that 90% of precision/recall is achieved with 40% of the document collection's terms. We also show that this is an improvement in overhead with only a 1% reduction in precision/recall.
  14. Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009) 0.01
    0.009899746 = product of:
      0.044548854 = sum of:
        0.012701439 = weight(_text_:of in 4888) [ClassicSimilarity], result of:
          0.012701439 = score(doc=4888,freq=2.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.20732689 = fieldWeight in 4888, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.09375 = fieldNorm(doc=4888)
        0.031847417 = product of:
          0.063694835 = sum of:
            0.063694835 = weight(_text_:22 in 4888) [ClassicSimilarity], result of:
              0.063694835 = score(doc=4888,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.46428138 = fieldWeight in 4888, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4888)
          0.5 = coord(1/2)
      0.22222222 = coord(2/9)
    
    Date
    1. 3.2013 14:56:22
  15. Collard, J.; Paiva, V. de; Fong, B.; Subrahmanian, E.: Extracting mathematical concepts from text (2022) 0.01
    0.009652025 = product of:
      0.043434113 = sum of:
        0.014818345 = weight(_text_:of in 668) [ClassicSimilarity], result of:
          0.014818345 = score(doc=668,freq=8.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.24188137 = fieldWeight in 668, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=668)
        0.028615767 = weight(_text_:systems in 668) [ClassicSimilarity], result of:
          0.028615767 = score(doc=668,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.23767869 = fieldWeight in 668, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=668)
      0.22222222 = coord(2/9)
    
    Abstract
    We investigate different systems for extracting mathematical entities from English texts in the mathematical field of category theory as a first step for constructing a mathematical knowledge graph. We consider four different term extractors and compare their results. This small experiment showcases some of the issues with the construction and evaluation of terms extracted from noisy domain text. We also make available two open corpora in research mathematics, in particular in category theory: a small corpus of 755 abstracts from the journal TAC (3188 sentences), and a larger corpus from the nLab community wiki (15,000 sentences).
  16. Galitsky, B.: Can many agents answer questions better than one? (2005) 0.01
    0.008907516 = product of:
      0.04008382 = sum of:
        0.015556021 = weight(_text_:of in 3094) [ClassicSimilarity], result of:
          0.015556021 = score(doc=3094,freq=12.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.25392252 = fieldWeight in 3094, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=3094)
        0.0245278 = weight(_text_:systems in 3094) [ClassicSimilarity], result of:
          0.0245278 = score(doc=3094,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.2037246 = fieldWeight in 3094, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=3094)
      0.22222222 = coord(2/9)
    
    Abstract
    The paper addresses the issue of how online natural language question answering, based on deep semantic analysis, may compete with currently popular keyword search, open domain information retrieval systems, covering a horizontal domain. We suggest the multiagent question answering approach, where each domain is represented by an agent which tries to answer questions taking into account its specific knowledge. The meta-agent controls the cooperation between question answering agents and chooses the most relevant answer(s). We argue that multiagent question answering is optimal in terms of access to business and financial knowledge, flexibility in query phrasing, and efficiency and usability of advice. The knowledge and advice encoded in the system are initially prepared by domain experts. We analyze the commercial application of multiagent question answering and the robustness of the meta-agent. The paper suggests that a multiagent architecture is optimal when a real world question answering domain combines a number of vertical ones to form a horizontal domain.
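    The meta-agent pattern described here can be sketched as domain agents that each return an answer with a confidence, with the meta-agent selecting among them; the keyword-based confidence below is a toy stand-in for the paper's deep semantic analysis:
      from dataclasses import dataclass
      from typing import Callable, Tuple

      @dataclass
      class Agent:
          domain: str
          answer: Callable[[str], Tuple[str, float]]  # question -> (answer, confidence)

      def meta_agent(agents, question):
          # Ask every domain agent, keep the highest-confidence reply.
          replies = [(a.domain, *a.answer(question)) for a in agents]
          return max(replies, key=lambda r: r[2])

      tax = Agent("tax", lambda q: ("File the deferral form.", 0.9 if "tax" in q else 0.1))
      loan = Agent("loans", lambda q: ("Compare APRs first.", 0.9 if "loan" in q else 0.1))
      print(meta_agent([tax, loan], "How do I defer my tax payment?"))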
  17. Wong, W.; Liu, W.; Bennamoun, M.: Ontology learning from text : a look back and into the future (2010) 0.01
    0.008687538 = product of:
      0.03909392 = sum of:
        0.010478153 = weight(_text_:of in 4733) [ClassicSimilarity], result of:
          0.010478153 = score(doc=4733,freq=4.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.17103596 = fieldWeight in 4733, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4733)
        0.028615767 = weight(_text_:systems in 4733) [ClassicSimilarity], result of:
          0.028615767 = score(doc=4733,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.23767869 = fieldWeight in 4733, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4733)
      0.22222222 = coord(2/9)
    
    Abstract
    Ontologies are often viewed as the answer to the need for inter-operable semantics in modern information systems. The explosion of textual information on the "Read/Write" Web coupled with the increasing demand for ontologies to power the Semantic Web have made (semi-)automatic ontology learning from text a very promising research area. This together with the advanced state in related areas such as natural language processing have fuelled research into ontology learning over the past decade. This survey looks at how far we have come since the turn of the millennium, and discusses the remaining challenges that will define the research directions in this area in the near future.
  18. Aizawa, A.; Kohlhase, M.: Mathematical information retrieval (2021) 0.01
    0.008687538 = product of:
      0.03909392 = sum of:
        0.010478153 = weight(_text_:of in 667) [ClassicSimilarity], result of:
          0.010478153 = score(doc=667,freq=4.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.17103596 = fieldWeight in 667, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=667)
        0.028615767 = weight(_text_:systems in 667) [ClassicSimilarity], result of:
          0.028615767 = score(doc=667,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.23767869 = fieldWeight in 667, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=667)
      0.22222222 = coord(2/9)
    
    Abstract
    We present an overview of the NTCIR Math Tasks organized during NTCIR-10, 11, and 12. These tasks are primarily dedicated to techniques for searching mathematical content with formula expressions. In this chapter, we first summarize the task design and introduce test collections generated in the tasks. We also describe the features and main challenges of mathematical information retrieval systems and discuss future perspectives in the field.
  19. Dampz, N.: ChatGPT interpretiert jetzt auch Bilder : Neue Version (2023) 0.01
    0.0075691673 = product of:
      0.068122506 = sum of:
        0.068122506 = weight(_text_:software in 874) [ClassicSimilarity], result of:
          0.068122506 = score(doc=874,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.43831247 = fieldWeight in 874, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.078125 = fieldNorm(doc=874)
      0.11111111 = coord(1/9)
    
    Abstract
    The Californian company OpenAI has presented a new version of its chatbot ChatGPT. The most striking new feature: the software, which runs on artificial intelligence and was previously focused on text, now also interprets images.
  20. Leighton, T.: ChatGPT und Künstliche Intelligenz : Utopie oder Dystopie? (2023) 0.01
    0.0075691673 = product of:
      0.068122506 = sum of:
        0.068122506 = weight(_text_:software in 908) [ClassicSimilarity], result of:
          0.068122506 = score(doc=908,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.43831247 = fieldWeight in 908, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.078125 = fieldNorm(doc=908)
      0.11111111 = coord(1/9)
    
    Abstract
    The tool is becoming ever more sophisticated; it writes software and invents the most incredible fictions. How "smart" is it? What about the fears? And what about morality?

Years

Languages

  • e 44
  • d 16
  • el 1

Types

  • a 39
  • p 5
  • x 2
  • b 1
  • m 1