Search (92 results, page 1 of 5)

  • theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.24
    
    Content
     Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
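     The relevance figure after each hit (0.24 above) is a Lucene ClassicSimilarity (tf-idf) score. As a sketch in LaTeX notation, each query term t that matches document d contributes

        \text{weight}(t,d) = \sqrt{f_{t,d}} \cdot \Bigl(1 + \ln\tfrac{N}{n_t+1}\Bigr)^{2} \cdot \text{queryNorm} \cdot \text{fieldNorm}(d),
        \qquad \text{score}(q,d) = \text{coord}(q,d) \sum_{t \in q} \text{weight}(t,d).

     For this hit: with N = 44218 documents, a rare term occurring in n_t = 24 of them has idf = 1 + ln(44218/25) ≈ 8.478, and at in-document frequency 2, queryNorm ≈ 0.0466 and fieldNorm ≈ 0.0469, it contributes √2 · 8.478² · 0.0466 · 0.0469 ≈ 0.222. The term contributions (nested coord factors down-weight terms matching only part of the query) sum to ≈ 0.315, and coord(3/4) = 0.75 scales this to 0.236, displayed rounded as 0.24.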
  2. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.15
    
    Source
     https://arxiv.org/abs/2212.06721
  3. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.12
    
    Content
     A thesis presented to The University of Guelph in partial fulfilment of requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
  4. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.06
    
    Abstract
     The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language System (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
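     The AZdict architecture above (a vocabulary list plus programs that retrieve words and determine similarity) can be illustrated with a minimal edit-distance suggester. This is a sketch only, not NLM's implementation; the toy chemical vocabulary and the distance cutoff are assumptions.

        # Minimal spelling-suggestion sketch: rank vocabulary entries by
        # Levenshtein distance to a possibly misspelled query term.

        def edit_distance(a: str, b: str) -> int:
            """Classic Levenshtein distance via dynamic programming."""
            prev = list(range(len(b) + 1))
            for i, ca in enumerate(a, 1):
                curr = [i]
                for j, cb in enumerate(b, 1):
                    curr.append(min(prev[j] + 1,                 # deletion
                                    curr[j - 1] + 1,             # insertion
                                    prev[j - 1] + (ca != cb)))   # substitution
                prev = curr
            return prev[-1]

        def suggest(term: str, vocabulary: list[str], max_dist: int = 2) -> list[str]:
            """Return vocabulary words within max_dist edits, closest first."""
            scored = [(edit_distance(term.lower(), w.lower()), w) for w in vocabulary]
            return [w for d, w in sorted(scored) if d <= max_dist]

        vocab = ["toluene", "toxicology", "benzene", "phenol"]   # invented vocabulary
        print(suggest("tolulene", vocab))                        # -> ['toluene']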
  5. Way, E.C.: Knowledge representation and metaphor (or: meaning) (1994) 0.05
    
    Content
     Contains the following 9 chapters: The literal and the metaphoric; Views of metaphor; Knowledge representation; Representation schemes and conceptual graphs; The dynamic type hierarchy theory of metaphor; Computational approaches to metaphor; The nature and structure of semantic hierarchies; Language games, open texture and family resemblance; Programming the dynamic type hierarchy; Subject index
    Footnote
     Published by Kluwer as early as 1991 // Rev. in: Knowledge organization 22(1995) no.1, S.48-49 (O. Sechser)
  6. Galitsky, B.: Can many agents answer questions better than one? (2005) 0.04
    
    Abstract
    The paper addresses the issue of how online natural language question answering, based on deep semantic analysis, may compete with currently popular keyword search, open domain information retrieval systems, covering a horizontal domain. We suggest the multiagent question answering approach, where each domain is represented by an agent which tries to answer questions taking into account its specific knowledge. The meta-agent controls the cooperation between question answering agents and chooses the most relevant answer(s). We argue that multiagent question answering is optimal in terms of access to business and financial knowledge, flexibility in query phrasing, and efficiency and usability of advice. The knowledge and advice encoded in the system are initially prepared by domain experts. We analyze the commercial application of multiagent question answering and the robustness of the meta-agent. The paper suggests that a multiagent architecture is optimal when a real world question answering domain combines a number of vertical ones to form a horizontal domain.
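     A minimal sketch of the multiagent arrangement described above: domain agents each attempt an answer with a confidence value, and a meta-agent keeps the most confident candidate. The agents, their matching rules and the confidence numbers are invented placeholders, not Galitsky's system.

        from dataclasses import dataclass
        from typing import Callable, Optional

        @dataclass
        class Answer:
            text: str
            confidence: float   # 0..1, how well the agent's domain matched the question

        DomainAgent = Callable[[str], Optional[Answer]]

        def tax_agent(question: str) -> Optional[Answer]:
            if "tax" in question.lower():
                return Answer("Quarterly filings are due ...", 0.9)
            return None

        def investing_agent(question: str) -> Optional[Answer]:
            if "invest" in question.lower() or "stock" in question.lower():
                return Answer("Diversification reduces risk ...", 0.8)
            return None

        def meta_agent(question: str, agents: list[DomainAgent]) -> Optional[Answer]:
            """Collect candidates from all domain agents; keep the most confident."""
            candidates = [a for agent in agents if (a := agent(question)) is not None]
            return max(candidates, key=lambda a: a.confidence, default=None)

        best = meta_agent("When are quarterly tax filings due?", [tax_agent, investing_agent])
        print(best.text if best else "no agent could answer")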
  7. Wright, S.E.: Leveraging terminology resources across application boundaries : accessing resources in future integrated environments (2000) 0.03
    
    Abstract
     The title for this conference, stated in English, is Language Technology for a Dynamic Economy in the Media Age. The question arises as to what media we are dealing with and to what extent we are moving away from the reality of different media to a world in which all sub-categories flow together into a unified stream of information that is constantly rescaled to appear in different hardware configurations. A few years ago, people who were interested in sharing data or getting different electronic "boxes" to talk to each other were focused on two major aspects: 1) developing data conversion technology, and 2) convincing potential users that sharing information was an even remotely interesting option. Although some content "owners" are still reticent about releasing their data, it has become dramatically apparent in the Web environment that a broad range of users does indeed want this technology. Even as researchers struggle with the remaining technical, legal, and ethical impediments that stand in the way of unlimited information access to existing multi-platform resources, the future view of the world will no longer be as obsessed with conversion capability as it will be with creating content, with an eye to morphing technologies that will enable the delivery of that content from an open-standards-based format such as XML (eXtensible Markup Language), MPEG (Moving Picture Experts Group), or WAP (Wireless Application Protocol) to a rich variety of display options.
  8. Reyes Ayala, B.; Knudson, R.; Chen, J.; Cao, G.; Wang, X.: Metadata records machine translation combining multi-engine outputs with limited parallel data (2018) 0.03
    
    Abstract
    One way to facilitate Multilingual Information Access (MLIA) for digital libraries is to generate multilingual metadata records by applying Machine Translation (MT) techniques. Current online MT services are available and affordable, but are not always effective for creating multilingual metadata records. In this study, we implemented 3 different MT strategies and evaluated their performance when translating English metadata records to Chinese and Spanish. These strategies included combining MT results from 3 online MT systems (Google, Bing, and Yahoo!) with and without additional linguistic resources, such as manually-generated parallel corpora, and metadata records in the two target languages obtained from international partners. The open-source statistical MT platform Moses was applied to design and implement the three translation strategies. Human evaluation of the MT results using adequacy and fluency demonstrated that two of the strategies produced higher quality translations than individual online MT systems for both languages. Especially, adding small, manually-generated parallel corpora of metadata records significantly improved translation performance. Our study suggested an effective and efficient MT approach for providing multilingual services for digital collections.
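     One generic way to choose among the outputs of several MT engines - not the Moses-based strategies evaluated in the study - is consensus selection: prefer the hypothesis that agrees most with the other hypotheses. A sketch, with a crude n-gram overlap standing in for a real similarity measure and invented example outputs.

        def ngram_overlap(a: str, b: str, n: int = 2) -> float:
            """Fraction of n-grams of a that also occur in b (crude similarity)."""
            def ngrams(s: str) -> set:
                toks = s.lower().split()
                return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}
            na, nb = ngrams(a), ngrams(b)
            return len(na & nb) / len(na) if na else 0.0

        def consensus_pick(candidates: list[str]) -> str:
            """Return the candidate most similar, on average, to the others."""
            return max(candidates,
                       key=lambda c: sum(ngram_overlap(c, o) for o in candidates if o is not c))

        hyps = ["el registro de metadatos fue traducido",   # engine A (invented)
                "el registro de metadatos se tradujo",      # engine B
                "metadatos registro fue traducido el"]      # engine C
        print(consensus_pick(hyps))                         # -> engine A's output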
  9. Dampz, N.: ChatGPT interpretiert jetzt auch Bilder : Neue Version (2023) 0.03
    
    Abstract
     The Californian company Open AI has presented a new version of its chatbot ChatGPT. The most striking innovation: the software, which runs on artificial intelligence and was previously focused on text, now also interprets images.
  10. Collovini de Abreu, S.; Vieira, R.: RelP: Portuguese open relation extraction (2017) 0.02
    
    Abstract
     Natural language texts are valuable data sources in many human activities. NLP techniques are being widely used in order to help find the right information to specific needs. In this paper, we present one such technique: relation extraction from texts. This task aims at identifying and classifying semantic relations that occur between entities in a text. For example, the sentence "Roberto Marinho is the founder of Rede Globo" expresses a relation occurring between "Roberto Marinho" and "Rede Globo." This work presents a system for Portuguese Open Relation Extraction, named RelP, which extracts any relation descriptor that describes an explicit relation between named entities in the organisation domain by applying Conditional Random Fields. For implementing RelP, we define the representation scheme, features based on previous work, and a reference corpus. RelP achieved state-of-the-art results for open relation extraction; the F-measure rate was around 60% between the named entities person, organisation and place. For better understanding of the output, we present a way of organizing the output from the mining of the extracted relation descriptors. This organization can be useful to classify relation types, to cluster the entities involved in a common relation and to populate datasets.
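     A toy version of the CRF tagging RelP describes, labeling the tokens of a relation descriptor with BIO tags. It uses sklearn-crfsuite, a common CRF library (an assumption, not the authors' code), and a single invented training sentence with hand-crafted features.

        import sklearn_crfsuite

        def token_features(sent: list[str], i: int) -> dict:
            w = sent[i]
            return {
                "word.lower": w.lower(),
                "is_title": w.istitle(),
                "prev": sent[i - 1].lower() if i > 0 else "<s>",
                "next": sent[i + 1].lower() if i < len(sent) - 1 else "</s>",
            }

        # "Roberto Marinho is the founder of Rede Globo" - the relation descriptor
        # "is the founder of" is marked with B-REL/I-REL tags.
        sent = ["Roberto", "Marinho", "is", "the", "founder", "of", "Rede", "Globo"]
        tags = ["O", "O", "B-REL", "I-REL", "I-REL", "I-REL", "O", "O"]

        X = [[token_features(sent, i) for i in range(len(sent))]]
        y = [tags]

        crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
        crf.fit(X, y)
        print(crf.predict(X)[0])   # predicted BIO tags for the descriptor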
  11. Shree, P.: ¬The journey of Open AI GPT models (2020) 0.02
    
    Source
    https://medium.com/walmartglobaltech/the-journey-of-open-ai-gpt-models-32d95b7b7fb2
  12. Liddy, E.D.: Natural language processing for information retrieval and knowledge discovery (1998) 0.02
    
    Date
    22. 9.1997 19:16:05
    Source
    Visualizing subject access for 21st century information resources: Papers presented at the 1997 Clinic on Library Applications of Data Processing, 2-4 Mar 1997, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Ed.: P.A. Cochrane et al
  13. Stede, M.: Lexicalization in natural language generation : a survey (1994/95) 0.02
    
    Abstract
     In natural language generation, a meaning representation of some kind is successively transformed into a sentence or a text. Naturally, a central subtask of this problem is the choice of words, or lexicalization. Proposes 4 major issues that determine how a generator tackles lexicalization, and surveys the contributions that research has made to them. Identifies open problems, and sketches a possible direction for research.
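     The lexical-choice step surveyed above can be made concrete with a toy dictionary-based chooser; the lexicon, the formality flag and the selection rule are invented for illustration.

        # Map a (concept, manner) meaning representation to a concrete verb.
        LEXICON = {
            ("MOVE", "fast"): ["dash", "race", "hurry"],
            ("MOVE", "slow"): ["stroll", "amble"],
        }

        def lexicalize(concept: str, manner: str, formal: bool = False) -> str:
            """Pick a verb for (concept, manner); fall back to a generic verb."""
            options = LEXICON.get((concept, manner), ["move"])
            return options[-1] if formal else options[0]

        print(lexicalize("MOVE", "fast"))                # -> dash
        print(lexicalize("MOVE", "slow", formal=True))   # -> amble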
  14. Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.02
    
    Date
    16. 2.2000 14:22:39
  15. Al-Sughaiyer, I.A.; Al-Kharashi, I.A.: Arabic morphological analysis techniques : a comprehensive survey (2004) 0.02
    
    Abstract
     After several decades of heavy research activity on English stemmers, Arabic morphological analysis techniques have become a popular area of research. The Arabic language is one of the Semitic languages; it exhibits a very systematic but complex morphological structure based on root-pattern schemes. As a consequence, a survey of such techniques proves all the more necessary. The aim of this paper is to summarize and organize the information available in the literature in an attempt to motivate researchers to look into these techniques and try to develop more advanced ones. This paper introduces, classifies, and surveys Arabic morphological analysis techniques. Furthermore, conclusions, open areas, and future directions are provided at the end.
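     The root-pattern scheme the survey refers to can be sketched in a few lines: the consonants of a (usually triliteral) root are interdigitated into a vowel pattern. The generator below runs in the easy, generation direction (the surveyed analyzers work in the harder, analysis direction) and uses transliterated forms of the stock root k-t-b 'write'.

        def apply_pattern(root: str, pattern: str) -> str:
            """Fill the C1, C2, C3 slots of a pattern with the root consonants."""
            c1, c2, c3 = root.split("-")
            return pattern.replace("C1", c1).replace("C2", c2).replace("C3", c3)

        root = "k-t-b"
        print(apply_pattern(root, "C1aC2aC3a"))    # kataba  'he wrote'
        print(apply_pattern(root, "C1aaC2iC3"))    # kaatib  'writer'
        print(apply_pattern(root, "maC1C2uuC3"))   # maktuub 'written'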
  16. Collard, J.; Paiva, V. de; Fong, B.; Subrahmanian, E.: Extracting mathematical concepts from text (2022) 0.02
    
    Abstract
    We investigate different systems for extracting mathematical entities from English texts in the mathematical field of category theory as a first step for constructing a mathematical knowledge graph. We consider four different term extractors and compare their results. This small experiment showcases some of the issues with the construction and evaluation of terms extracted from noisy domain text. We also make available two open corpora in research mathematics, in particular in category theory: a small corpus of 755 abstracts from the journal TAC (3188 sentences), and a larger corpus from the nLab community wiki (15,000 sentences).
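     For flavor, a minimal frequency-based multiword term extractor; it is none of the four extractors compared in the paper, and the toy corpus is invented.

        from collections import Counter
        import re

        STOP = {"a", "an", "the", "of", "in", "for", "and", "to", "is", "has", "every"}

        def candidate_terms(text: str, n: int = 2) -> Counter:
            """Count n-grams whose first and last tokens are content words."""
            tokens = re.findall(r"[a-z]+", text.lower())
            grams = (tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
            return Counter(g for g in grams if g[0] not in STOP and g[-1] not in STOP)

        corpus = ("A natural transformation relates two functors. Every natural "
                  "transformation has components; natural transformation composition "
                  "is defined componentwise.")
        for term, freq in candidate_terms(corpus).most_common(3):
            print(" ".join(term), freq)   # 'natural transformation' ranks first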
  17. Ferret, O.; Grau, B.; Hurault-Plantet, M.; Illouz, G.; Jacquemin, C.; Monceaux, L.; Robba, I.; Vilnat, A.: How NLP can improve question answering (2002) 0.02
    
    Abstract
     Answering open-domain factual questions requires Natural Language Processing for refining document selection and answer identification. With our system QALC, we have participated in the Question Answering track of the TREC8, TREC9 and TREC10 evaluations. QALC performs an analysis of documents relying on multiword term searches and their linguistic variation both to minimize the number of documents selected and to provide additional clues when comparing question and sentence representations. This comparison process also makes use of the results of a syntactic parsing of the questions and Named Entity recognition functionalities. Answer extraction relies on the application of syntactic patterns chosen according to the kind of information that is sought, and categorized depending on the syntactic form of the question. These patterns allow QALC to handle linguistic variations nicely at the answer level.
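     A toy version of answer extraction with patterns keyed to question type, using regular expressions in place of QALC's syntactic patterns; the patterns, the question typing and the sentence are invented.

        import re
        from typing import Optional

        PATTERNS = {
            "WHO":  re.compile(r"([A-Z][a-z]+(?: [A-Z][a-z]+)*) (?:invented|founded|wrote)"),
            "WHEN": re.compile(r"\bin (\d{4})\b"),
        }

        def question_type(question: str) -> str:
            return "WHO" if question.lower().startswith("who") else "WHEN"

        def extract_answer(question: str, sentence: str) -> Optional[str]:
            m = PATTERNS[question_type(question)].search(sentence)
            return m.group(1) if m else None

        doc = "Johannes Gutenberg invented the printing press in 1440."
        print(extract_answer("Who invented the printing press?", doc))       # Johannes Gutenberg
        print(extract_answer("When was the printing press invented?", doc))  # 1440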
  18. Albrecht, I.: GPT-3: die Zukunft studentischer Hausarbeiten oder eine Bedrohung der wissenschaftlichen Integrität? (2023) 0.02
    
    Abstract
     With the progress of artificial intelligences, advanced language processing models have also come onto the market. GPT-3 was the most powerful model of its time after its release. The program generates texts that are often indistinguishable from human-written content. GPT-3's size and complexity enable it to write and produce scientific articles on its own that, according to studies and investigations, are sufficient to pass university courses. With the development of such artificial intelligences, especially on an open-source basis, students could in the future have term papers written by text generators. This thesis is concerned, on the one hand, with the GPT-3 model and its capabilities and risks. On the other hand, it addresses the question of how colleges and universities can deal with machine-written papers in the future.
  19. Manhart, K.: Digitales Kauderwelsch : Online-Übersetzungsdienste (2004) 0.01
    
    Abstract
     Translate an English or French website into German in no time - nothing could be easier. Online translation services promise language transfer at the click of a mouse, free of charge. But what are they really worth? Online translation services set out to remove the language barrier on the WWW. The automatic translators promise to make e-mail correspondence intelligible and to allow German-language browsing of foreign-language web offerings. English, Spanish or even Chinese e-mails and websites can thus be quickly transferred into one's own language with a mouse click. Even complicated English manuals or Russian news reports are supposedly no problem for these services. And some homepage owners dream of using the digital translation helpers to put their German website online in perfect English - hoping for international contacts and higher visitor numbers. That sounds nice - but reality looks different. Anyone who has ever consulted such a service usually rubs their eyes in amazement at the results on offer. Even simple sentences give many online translators trouble - and unintentionally provide comic relief. The CNN headline "Iraq blast injures 31 U.S. troops" becomes, in German, the sentence: "Der Irak Knall verletzt 31 Vereinigte Staaten Truppen." Sites with difficult sentence structure are often rendered unintelligibly. The sentence "The Slider is equipped with a brilliant color screen and sports an innovative design that slides open with a push of your thumb" is translated by the best-known online translator, Babelfish, into the following gibberish: "Der Schweber wird mit einem leuchtenden Farbe Schirm ausgerüstet und ein erfinderisches Design sports, das geöffnetes mit einem Stoß Ihres Daumens schiebt." All the translators subject their users to such Dadaist texts.
  20. Chandrasekar, R.; Bangalore, S.: Glean : using syntactic information in document filtering (2002) 0.01
    
    Abstract
     In today's networked world, a huge amount of data is available in machine-processable form. Likewise, there are any number of search engines and specialized information retrieval (IR) programs that seek to extract relevant information from these data repositories. Most IR systems and Web search engines have been designed for speed and tend to maximize the quantity of information (recall) rather than the relevance of the information (precision) to the query. As a result, search engine users get inundated with information for practically any query, and are forced to scan a large number of potentially relevant items to get to the information of interest. The Holy Grail of IR is to somehow retrieve those and only those documents pertinent to the user's query. Polysemy and synonymy - the fact that often there are several meanings for a word or phrase, and likewise, many ways to express a concept - make this a very hard task. While conventional IR systems provide usable solutions, there are a number of open problems to be solved, in areas such as syntactic processing, semantic analysis, and user modeling, before we develop systems that "understand" user queries and text collections. Meanwhile, we can use tools and techniques available today to improve the precision of retrieval. In particular, using the approach described in this article, we can approximate understanding using the syntactic structure and patterns of language use that are latent in documents to make IR more effective.
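     The recall-versus-precision trade-off described above, made concrete: a filter that returns the whole collection has perfect recall but poor precision. A minimal sketch with toy numbers.

        def precision_recall(retrieved: set, relevant: set) -> tuple:
            """Precision and recall of a retrieved set against a relevance judgment."""
            hits = len(retrieved & relevant)
            precision = hits / len(retrieved) if retrieved else 0.0
            recall = hits / len(relevant) if relevant else 0.0
            return precision, recall

        collection = set(range(100))   # 100 documents
        relevant = {1, 2, 3, 4, 5}     # 5 of them are actually pertinent
        print(precision_recall(collection, relevant))   # (0.05, 1.0): everything returned
        print(precision_recall({1, 2, 9}, relevant))    # (~0.67, 0.4): selective filter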

Languages

  • e 71
  • d 21

Types

  • a 73
  • el 12
  • m 9
  • s 4
  • p 3
  • x 2
  • d 1