Search (15 results, page 1 of 1)

  • language_ss:"e"
  • theme_ss:"Computerlinguistik"
  • type_ss:"el"
  1. Rindflesch, T.C.; Aronson, A.R.: Semantic processing in information retrieval (1993) 0.02
    0.016522959 = product of:
      0.066091835 = sum of:
        0.050746426 = weight(_text_:retrieval in 4121) [ClassicSimilarity], result of:
          0.050746426 = score(doc=4121,freq=10.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.5231199 = fieldWeight in 4121, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4121)
        0.015345411 = product of:
          0.030690823 = sum of:
            0.030690823 = weight(_text_:29 in 4121) [ClassicSimilarity], result of:
              0.030690823 = score(doc=4121,freq=2.0), product of:
                0.11281017 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.032069415 = queryNorm
                0.27205724 = fieldWeight in 4121, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4121)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Intuition suggests that one way to enhance the information retrieval process would be the use of phrases to characterize the contents of text. A number of researchers, however, have noted that phrases alone do not improve retrieval effectiveness. In this paper we briefly review the use of phrases in information retrieval and then suggest extensions to this paradigm using semantic information. We claim that semantic processing, which can be viewed as expressing relations between the concepts represented by phrases, will in fact enhance retrieval effectiveness. The availability of the UMLS® domain model, which we exploit extensively, significantly contributes to the feasibility of this processing.
    Date
    29. 6.2015 14:51:28
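    The indented blocks beneath each result are Lucene "explain" traces for the relevance score (ClassicSimilarity, i.e. TF-IDF with a coordination factor). As a minimal sketch of how the numbers combine - reproducing the arithmetic of result 1's trace in plain Python rather than calling any Lucene API - each matched term contributes queryWeight * fieldWeight, with queryWeight = idf * queryNorm and fieldWeight = sqrt(termFreq) * idf * fieldNorm; clause scores are then summed and scaled by the coord() factors shown in the trace:

    import math

    def classic_similarity_term_score(freq, idf, query_norm, field_norm):
        """Reconstruct one weight(_text_:term) node of a ClassicSimilarity explain trace."""
        query_weight = idf * query_norm                    # queryWeight = idf * queryNorm
        field_weight = math.sqrt(freq) * idf * field_norm  # tf(freq) = sqrt(termFreq)
        return query_weight * field_weight

    # Values copied from the trace of result 1 (doc 4121).
    retrieval = classic_similarity_term_score(10.0, 3.024915, 0.032069415, 0.0546875)
    date_29 = classic_similarity_term_score(2.0, 3.5176873, 0.032069415, 0.0546875)

    # The trace halves the "29" clause (coord(1/2)) and scales the sum by coord(2/8).
    total = (retrieval + 0.5 * date_29) * (2 / 8)
    print(round(retrieval, 9))  # ~0.050746426, as in the trace
    print(round(total, 9))      # ~0.016522959, the document's final score
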
  2. Chowdhury, A.; McCabe, M.C.: Improving information retrieval systems using part of speech tagging (1993) 0.01
    0.009513528 = product of:
      0.038054112 = sum of:
        0.027509877 = weight(_text_:retrieval in 1061) [ClassicSimilarity], result of:
          0.027509877 = score(doc=1061,freq=4.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.2835858 = fieldWeight in 1061, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1061)
        0.010544236 = product of:
          0.021088472 = sum of:
            0.021088472 = weight(_text_:system in 1061) [ClassicSimilarity], result of:
              0.021088472 = score(doc=1061,freq=2.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.20878783 = fieldWeight in 1061, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1061)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The object of information retrieval is to retrieve all relevant documents for a user query, and only those relevant documents. Much research has focused on achieving this objective with little regard for storage overhead or performance. In this paper we evaluate the use of part-of-speech tagging to improve the index storage overhead and the general speed of the system, with only a minimal reduction in precision/recall measurements. We tagged 500 MB of the Los Angeles Times 1990 and 1989 document collection provided by TREC for parts of speech. We then experimented to find the most relevant parts of speech to index. We show that 90% of precision/recall is achieved with 40% of the document collection's terms. We also show that this is an improvement in overhead with only a 1% reduction in precision/recall.
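    The indexing strategy described in result 2 - keep only tokens whose part of speech carries retrieval value - can be sketched roughly as follows. This is a toy illustration, not the authors' TREC setup: the NLTK tagger, the Penn tag set and the choice of noun/adjective tags are all assumptions made for the example.

    from collections import defaultdict

    import nltk  # assumes the 'punkt' and 'averaged_perceptron_tagger' data are installed

    # Index only nouns and adjectives; other parts of speech are dropped from the index.
    INDEXED_TAGS = {"NN", "NNS", "NNP", "NNPS", "JJ"}

    def build_pos_filtered_index(docs):
        """docs: {doc_id: text}. Build an inverted index over POS-filtered terms."""
        index = defaultdict(set)
        for doc_id, text in docs.items():
            for token, tag in nltk.pos_tag(nltk.word_tokenize(text)):
                if tag in INDEXED_TAGS:
                    index[token.lower()].add(doc_id)
        return index

    docs = {1: "The tagger reduces index storage overhead.",
            2: "Precision and recall drop only slightly."}
    print(sorted(build_pos_filtered_index(docs)))  # fewer terms than a full index would hold
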
  3. Rajasurya, S.; Muralidharan, T.; Devi, S.; Swamynathan, S.: Semantic information retrieval using ontology in university domain (2012) 0.01
    0.00883785 = product of:
      0.0353514 = sum of:
        0.022924898 = weight(_text_:retrieval in 2861) [ClassicSimilarity], result of:
          0.022924898 = score(doc=2861,freq=4.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.23632148 = fieldWeight in 2861, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2861)
        0.012426502 = product of:
          0.024853004 = sum of:
            0.024853004 = weight(_text_:system in 2861) [ClassicSimilarity], result of:
              0.024853004 = score(doc=2861,freq=4.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.24605882 = fieldWeight in 2861, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2861)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Today's conventional search engines hardly provide content that is truly relevant to the user's search query, because the context and semantics of the user's request are not analyzed to the full extent. Hence the need for semantic web search arises. Semantic web search is an emerging area of web search that combines natural language processing and artificial intelligence. The objective of the work done here is to design, develop and implement a semantic search engine, SIEU (Semantic Information Extraction in University Domain), confined to the university domain. SIEU uses an ontology as a knowledge base for the information retrieval process. It is not a mere keyword search; it works one layer above what Google or any other search engine retrieves by analyzing just the keywords. Here the query is analyzed both syntactically and semantically. The developed system retrieves web results more relevant to the user query through keyword expansion. The results obtained will be accurate enough to satisfy the request made by the user, and the level of accuracy is enhanced since the query is analyzed semantically. The system will be of great use to developers and researchers who work on the web. The Google results are re-ranked and optimized to provide the relevant links. For ranking, an algorithm has been applied which fetches more apt results for the user query.
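    The keyword-expansion step attributed to SIEU in result 3 can be illustrated with a small sketch. The "ontology" below is a hypothetical toy dictionary invented for the example, not the authors' university ontology; a real system would expand against ontology relations rather than a flat synonym list.

    # Toy stand-in for an ontology: concept -> related terms used for query expansion.
    UNIVERSITY_ONTOLOGY = {
        "lecturer": ["faculty", "professor", "teaching staff"],
        "course": ["module", "subject"],
        "exam": ["assessment", "test"],
    }

    def expand_query(query):
        """Expand each query keyword with related terms before retrieval."""
        expanded = []
        for term in query.lower().split():
            expanded.append(term)
            expanded.extend(UNIVERSITY_ONTOLOGY.get(term, []))
        return expanded

    print(expand_query("exam schedule for course"))
    # ['exam', 'assessment', 'test', 'schedule', 'for', 'course', 'module', 'subject']
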
  4. Galitsky, B.: Can many agents answer questions better than one? (2005) 0.01
    0.007499164 = product of:
      0.029996656 = sum of:
        0.019452421 = weight(_text_:retrieval in 3094) [ClassicSimilarity], result of:
          0.019452421 = score(doc=3094,freq=2.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.20052543 = fieldWeight in 3094, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=3094)
        0.010544236 = product of:
          0.021088472 = sum of:
            0.021088472 = weight(_text_:system in 3094) [ClassicSimilarity], result of:
              0.021088472 = score(doc=3094,freq=2.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.20878783 = fieldWeight in 3094, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3094)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The paper addresses the issue of how online natural language question answering, based on deep semantic analysis, may compete with the currently popular keyword-search, open-domain information retrieval systems that cover a horizontal domain. We suggest a multiagent question answering approach, where each domain is represented by an agent which tries to answer questions taking into account its specific knowledge. The meta-agent controls the cooperation between the question answering agents and chooses the most relevant answer(s). We argue that multiagent question answering is optimal in terms of access to business and financial knowledge, flexibility in query phrasing, and efficiency and usability of advice. The knowledge and advice encoded in the system are initially prepared by domain experts. We analyze the commercial application of multiagent question answering and the robustness of the meta-agent. The paper suggests that a multiagent architecture is optimal when a real-world question answering domain combines a number of vertical domains to form a horizontal one.
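    The meta-agent behaviour described in result 4 - route a question to domain agents and keep the most relevant answer - might look roughly like the sketch below. The agent names, the confidence scores and the selection-by-maximum rule are illustrative assumptions, not the paper's implementation.

    from dataclasses import dataclass
    from typing import Callable, List, Optional, Tuple

    @dataclass
    class DomainAgent:
        name: str
        # Returns (answer, confidence in [0, 1]) or None if the question is out of scope.
        answer: Callable[[str], Optional[Tuple[str, float]]]

    def meta_agent(question: str, agents: List[DomainAgent]) -> Optional[str]:
        """Ask every domain agent and return the answer with the highest confidence."""
        candidates = []
        for agent in agents:
            result = agent.answer(question)
            if result is not None:
                candidates.append((result[1], result[0], agent.name))
        if not candidates:
            return None
        confidence, answer, name = max(candidates)
        return f"[{name}, {confidence:.1f}] {answer}"

    # Two toy domain agents with invented answers and confidences.
    tax = DomainAgent("tax", lambda q: ("Report it on your annual return.", 0.9) if "tax" in q else None)
    invest = DomainAgent("investment", lambda q: ("Diversify across asset classes.", 0.4))
    print(meta_agent("How do I report tax on dividends?", [tax, invest]))
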
  5. Aizawa, A.; Kohlhase, M.: Mathematical information retrieval (2021) 0.01
    0.0056736227 = product of:
      0.04538898 = sum of:
        0.04538898 = weight(_text_:retrieval in 667) [ClassicSimilarity], result of:
          0.04538898 = score(doc=667,freq=8.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.46789268 = fieldWeight in 667, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=667)
      0.125 = coord(1/8)
    
    Abstract
    We present an overview of the NTCIR Math Tasks organized during NTCIR-10, 11, and 12. These tasks are primarily dedicated to techniques for searching mathematical content with formula expressions. In this chapter, we first summarize the task design and introduce test collections generated in the tasks. We also describe the features and main challenges of mathematical information retrieval systems and discuss future perspectives in the field.
    Series
    ¬The Information retrieval series, vol 43
    Source
    Evaluating information retrieval and access tasks. Eds.: Sakai, T., Oard, D., Kando, N. [https://doi.org/10.1007/978-981-15-5554-1_12]
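    Formula search of the kind evaluated in the NTCIR Math Tasks (result 5) is often approximated by normalizing variable names and comparing symbol n-grams; the sketch below illustrates that general idea only and is not the implementation of any NTCIR system.

    import re

    def formula_ngrams(latex, n=2):
        """Tokenize a LaTeX string, replace single letters by VAR, return token n-grams."""
        tokens = re.findall(r"\\[A-Za-z]+|[A-Za-z]|[0-9]+|[^\sA-Za-z0-9]", latex)
        normalized = ["VAR" if re.fullmatch(r"[A-Za-z]", t) else t for t in tokens]
        return {tuple(normalized[i:i + n]) for i in range(len(normalized) - n + 1)}

    def formula_similarity(query, doc):
        """Jaccard overlap of n-gram sets; 1.0 means identical up to variable renaming."""
        a, b = formula_ngrams(query), formula_ngrams(doc)
        return len(a & b) / len(a | b) if a | b else 0.0

    print(formula_similarity(r"\frac{a}{b}", r"\frac{x}{y}"))  # 1.0: same structure
    print(formula_similarity(r"\frac{a}{b}", r"a^2 + b^2"))    # 0.0: no shared bigrams
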
  6. Shree, P.: ¬The journey of Open AI GPT models (2020) 0.00
    0.003898194 = product of:
      0.031185552 = sum of:
        0.031185552 = product of:
          0.062371105 = sum of:
            0.062371105 = weight(_text_:etc in 869) [ClassicSimilarity], result of:
              0.062371105 = score(doc=869,freq=2.0), product of:
                0.17370372 = queryWeight, product of:
                  5.4164915 = idf(docFreq=533, maxDocs=44218)
                  0.032069415 = queryNorm
                0.35906604 = fieldWeight in 869, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.4164915 = idf(docFreq=533, maxDocs=44218)
                  0.046875 = fieldNorm(doc=869)
          0.5 = coord(1/2)
      0.125 = coord(1/8)
    
    Abstract
    Generative Pre-trained Transformer (GPT) models by OpenAI have taken the natural language processing (NLP) community by storm by introducing very powerful language models. These models can perform various NLP tasks like question answering, textual entailment, text summarisation etc. without any supervised training. They need very few or no examples to understand a task and perform as well as, or even better than, state-of-the-art models trained in a supervised fashion. In this article we cover the journey of these models and see how they have evolved over a period of two years. 1. Discussion of the GPT-1 paper (Improving Language Understanding by Generative Pre-training). 2. Discussion of the GPT-2 paper (Language Models are Unsupervised Multitask Learners) and its subsequent improvements over GPT-1. 3. Discussion of the GPT-3 paper (Language Models are Few-Shot Learners) and the improvements which have made it one of the most powerful models NLP has seen to date. This article assumes familiarity with the basics of NLP terminology and the transformer architecture.
  7. Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009) 0.00
    0.0032587221 = product of:
      0.026069777 = sum of:
        0.026069777 = product of:
          0.052139554 = sum of:
            0.052139554 = weight(_text_:22 in 4888) [ClassicSimilarity], result of:
              0.052139554 = score(doc=4888,freq=2.0), product of:
                0.112301625 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032069415 = queryNorm
                0.46428138 = fieldWeight in 4888, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4888)
          0.5 = coord(1/2)
      0.125 = coord(1/8)
    
    Date
    1. 3.2013 14:56:22
  8. Harari, Y.N.: ¬[Yuval-Noah-Harari-argues-that] AI has hacked the operating system of human civilisation (2023) 0.00
    0.0031066255 = product of:
      0.024853004 = sum of:
        0.024853004 = product of:
          0.04970601 = sum of:
            0.04970601 = weight(_text_:system in 953) [ClassicSimilarity], result of:
              0.04970601 = score(doc=953,freq=4.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.49211764 = fieldWeight in 953, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.078125 = fieldNorm(doc=953)
          0.5 = coord(1/2)
      0.125 = coord(1/8)
    
    Source
    https://www.economist.com/by-invitation/2023/04/28/yuval-noah-harari-argues-that-ai-has-hacked-the-operating-system-of-human-civilisation?giftId=6982bba3-94bc-441d-9153-6d42468817ad
  9. Griffiths, T.L.; Steyvers, M.: ¬A probabilistic approach to semantic representation (2002) 0.00
    0.0031002415 = product of:
      0.024801932 = sum of:
        0.024801932 = product of:
          0.049603865 = sum of:
            0.049603865 = weight(_text_:29 in 3671) [ClassicSimilarity], result of:
              0.049603865 = score(doc=3671,freq=4.0), product of:
                0.11281017 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.032069415 = queryNorm
                0.43971092 = fieldWeight in 3671, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3671)
          0.5 = coord(1/2)
      0.125 = coord(1/8)
    
    Date
    29. 6.2015 14:55:01
    29. 6.2015 16:09:05
  10. Snajder, J.: Distributional semantics of multi-word expressions (2013) 0.00
    0.0027402523 = product of:
      0.021922018 = sum of:
        0.021922018 = product of:
          0.043844037 = sum of:
            0.043844037 = weight(_text_:29 in 2868) [ClassicSimilarity], result of:
              0.043844037 = score(doc=2868,freq=2.0), product of:
                0.11281017 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.032069415 = queryNorm
                0.38865322 = fieldWeight in 2868, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2868)
          0.5 = coord(1/2)
      0.125 = coord(1/8)
    
    Date
    29. 4.2016 12:04:50
  11. Dias, G.: Multiword unit hybrid extraction (n.d.) 0.00
    0.0026633765 = product of:
      0.021307012 = sum of:
        0.021307012 = product of:
          0.042614024 = sum of:
            0.042614024 = weight(_text_:system in 643) [ClassicSimilarity], result of:
              0.042614024 = score(doc=643,freq=6.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.42190298 = fieldWeight in 643, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=643)
          0.5 = coord(1/2)
      0.125 = coord(1/8)
    
    Abstract
    This paper describes an original hybrid system that extracts multiword unit candidates from part-of-speech tagged corpora. While classical hybrid systems manually define local part-of-speech patterns that lead to the identification of well-known multiword units (mainly compound nouns), our solution automatically identifies relevant syntactical patterns from the corpus. Word statistics are then combined with the endogenously acquired linguistic information in order to extract the most relevant sequences of words. As a result, (1) human intervention is avoided, providing total flexibility of use of the system, and (2) different multiword units such as phrasal verbs, adverbial locutions and prepositional locutions may be identified. The system has been tested on the Brown Corpus, leading to encouraging results.
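    A rough sketch of the hybrid idea in result 11 - generate candidate word sequences from part-of-speech patterns, then rank them with a word-association statistic - is given below. The fixed adjective-noun/noun-noun patterns and the PMI score are simplifications made for the example; the paper's system learns its patterns from the corpus itself.

    import math
    from collections import Counter

    PATTERNS = {("JJ", "NN"), ("NN", "NN")}  # assumed bigram POS patterns

    def candidate_bigrams(tagged):
        """tagged: [(word, pos), ...]. Yield bigrams whose POS sequence matches a pattern."""
        for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
            if (t1, t2) in PATTERNS:
                yield (w1.lower(), w2.lower())

    def rank_by_pmi(tagged):
        """Score each candidate by pointwise mutual information of its two words."""
        words = [w.lower() for w, _ in tagged]
        unigrams, n = Counter(words), len(words)
        scores = {}
        for (w1, w2), c in Counter(candidate_bigrams(tagged)).items():
            scores[(w1, w2)] = math.log((c / n) / ((unigrams[w1] / n) * (unigrams[w2] / n)))
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    tagged = [("hot", "JJ"), ("dog", "NN"), ("stands", "VBZ"), ("sell", "VBP"),
              ("hot", "JJ"), ("dog", "NN"), ("buns", "NNS")]
    print(rank_by_pmi(tagged))  # ('hot', 'dog') is the only (and top) candidate here
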
  12. Stoykova, V.; Petkova, E.: Automatic extraction of mathematical terms for precalculus (2012) 0.00
    0.0019181764 = product of:
      0.015345411 = sum of:
        0.015345411 = product of:
          0.030690823 = sum of:
            0.030690823 = weight(_text_:29 in 156) [ClassicSimilarity], result of:
              0.030690823 = score(doc=156,freq=2.0), product of:
                0.11281017 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.032069415 = queryNorm
                0.27205724 = fieldWeight in 156, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=156)
          0.5 = coord(1/2)
      0.125 = coord(1/8)
    
    Date
    29. 5.2012 10:17:08
  13. Snajder, J.; Almic, P.: Modeling semantic compositionality of Croatian multiword expressions (2015) 0.00
    0.0016441513 = product of:
      0.01315321 = sum of:
        0.01315321 = product of:
          0.02630642 = sum of:
            0.02630642 = weight(_text_:29 in 2920) [ClassicSimilarity], result of:
              0.02630642 = score(doc=2920,freq=2.0), product of:
                0.11281017 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.032069415 = queryNorm
                0.23319192 = fieldWeight in 2920, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2920)
          0.5 = coord(1/2)
      0.125 = coord(1/8)
    
    Date
    29. 4.2016 12:42:17
  14. Spitkovsky, V.; Norvig, P.: From words to concepts and back : dictionaries for linking text, entities and ideas (2012) 0.00
    0.0016210352 = product of:
      0.012968281 = sum of:
        0.012968281 = weight(_text_:retrieval in 337) [ClassicSimilarity], result of:
          0.012968281 = score(doc=337,freq=2.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.13368362 = fieldWeight in 337, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=337)
      0.125 = coord(1/8)
    
    Abstract
    Human language is both rich and ambiguous. When we hear or read words, we resolve meanings to mental representations, for example recognizing and linking names to the intended persons, locations or organizations. Bridging words and meaning - from turning search queries into relevant results to suggesting targeted keywords for advertisers - is also Google's core competency, and important for many other tasks in information retrieval and natural language processing. We are happy to release a resource, spanning 7,560,141 concepts and 175,100,788 unique text strings, that we hope will help everyone working in these areas. How do we represent concepts? Our approach piggybacks on the unique titles of entries from an encyclopedia, which are mostly proper and common noun phrases. We consider each individual Wikipedia article as representing a concept (an entity or an idea), identified by its URL. Text strings that refer to concepts were collected using the publicly available hypertext of anchors (the text you click on in a web link) that point to each Wikipedia page, thus drawing on the vast link structure of the web. For every English article we harvested the strings associated with its incoming hyperlinks from the rest of Wikipedia, the greater web, and also anchors of parallel, non-English Wikipedia pages. Our dictionaries are cross-lingual, and any concept deemed too fine can be broadened to a desired level of generality using Wikipedia's groupings of articles into hierarchical categories. The data set contains triples, each consisting of (i) text, a short, raw natural language string; (ii) url, a related concept, represented by an English Wikipedia article's canonical location; and (iii) count, an integer indicating the number of times text has been observed connected with the concept's url. Our database thus includes weights that measure degrees of association. For example, the top two entries for football indicate that it is an ambiguous term, which is almost twice as likely to refer to what we in the US call soccer. See also: Spitkovsky, V.I., A.X. Chang: A cross-lingual dictionary for English Wikipedia concepts. In: http://nlp.stanford.edu/pubs/crosswikis.pdf.
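    The (text, url, count) triples described in result 14 can be used directly as a weighted string-to-concept dictionary. A minimal sketch follows; the counts below are made up for illustration and are not taken from the released data set.

    from collections import defaultdict

    # (text, url, count) triples in the format the abstract describes; numbers invented.
    triples = [
        ("football", "en.wikipedia.org/wiki/Association_football", 1200),
        ("football", "en.wikipedia.org/wiki/American_football", 700),
        ("football", "en.wikipedia.org/wiki/Football_(ball)", 100),
    ]

    def concept_distribution(triples, text):
        """Estimate P(url | text) by normalizing the counts of concepts linked from the string."""
        counts = defaultdict(int)
        for t, url, c in triples:
            if t == text:
                counts[url] += c
        total = sum(counts.values())
        return {url: c / total for url, c in counts.items()}

    for url, p in sorted(concept_distribution(triples, "football").items(), key=lambda kv: -kv[1]):
        print(f"{p:.2f}  {url}")  # the most likely reading of the string comes first
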
  15. Nagy T., I.: Detecting multiword expressions and named entities in natural language texts (2014) 0.00
    0.0014184057 = product of:
      0.011347245 = sum of:
        0.011347245 = weight(_text_:retrieval in 1536) [ClassicSimilarity], result of:
          0.011347245 = score(doc=1536,freq=2.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.11697317 = fieldWeight in 1536, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1536)
      0.125 = coord(1/8)
    
    Abstract
    Multiword expressions (MWEs) are lexical items that can be decomposed into single words and display lexical, syntactic, semantic, pragmatic and/or statistical idiosyncrasy (Sag et al., 2002; Kim, 2008; Calzolari et al., 2002). The proper treatment of multiword expressions such as rock 'n' roll and make a decision is essential for many natural language processing (NLP) applications like information extraction and retrieval, terminology extraction and machine translation, and it is important to identify multiword expressions in context. For example, in machine translation we must know that MWEs form one semantic unit, hence their parts should not be translated separately. For this, multiword expressions should be identified first in the text to be translated. The chief aim of this thesis is to develop machine learning-based approaches for the automatic detection of different types of multiword expressions in English and Hungarian natural language texts. In our investigations, we pay attention to the characteristics of different types of multiword expressions such as nominal compounds, multiword named entities and light verb constructions, and we apply novel methods to identify MWEs in raw texts. In the thesis it will be demonstrated that nominal compounds and multiword named entities may require a similar approach for their automatic detection as they behave in the same way from a linguistic point of view. Furthermore, it will be shown that the automatic detection of light verb constructions can be carried out using two effective machine learning-based approaches.
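    As a rough illustration of the machine-learning set-up sketched in result 15 - classify candidate word sequences as multiword expressions or not from surface and syntactic features - one could do something like the following. The feature set, the tiny training sample and the logistic-regression model are all invented for the example; the thesis itself uses far richer features and real annotated corpora.

    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    def features(tokens, pos_tags):
        """A deliberately small feature set for one candidate expression."""
        return {
            "length": len(tokens),
            "pos_seq": "+".join(pos_tags),
            "any_capitalized": any(t[0].isupper() for t in tokens),
            "head": tokens[-1].lower(),
        }

    train = [
        (features(["make", "a", "decision"], ["VB", "DT", "NN"]), 1),  # light verb construction
        (features(["New", "York"], ["NNP", "NNP"]), 1),                # multiword named entity
        (features(["rock", "'n'", "roll"], ["NN", "CC", "NN"]), 1),    # idiomatic compound
        (features(["red", "car"], ["JJ", "NN"]), 0),                   # free combination
        (features(["walked", "home"], ["VBD", "NN"]), 0),
    ]
    model = make_pipeline(DictVectorizer(), LogisticRegression())
    model.fit([f for f, _ in train], [label for _, label in train])
    print(model.predict([features(["take", "a", "walk"], ["VB", "DT", "NN"])]))
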