Search (94 results, page 3 of 5)

  • Filter: language_ss:"e"
  • Filter: theme_ss:"Computerlinguistik"
  1. Rindflesch, T.C.; Fiszman, M.: The interaction of domain knowledge and linguistic structure in natural language processing : interpreting hypernymic propositions in biomedical text (2003) 0.01
    0.0053038225 = product of:
      0.015911467 = sum of:
        0.015911467 = product of:
          0.047734402 = sum of:
            0.047734402 = weight(_text_:network in 2097) [ClassicSimilarity], result of:
              0.047734402 = score(doc=2097,freq=2.0), product of:
                0.19402927 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.043569047 = queryNorm
                0.2460165 = fieldWeight in 2097, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2097)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
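The score breakdown above (repeated in the same shape for every result on this page) is a Lucene "explain" tree for ClassicSimilarity, i.e. classic TF-IDF: tf = sqrt(termFreq), idf = 1 + ln(maxDocs / (docFreq + 1)), fieldWeight = tf * idf * fieldNorm, queryWeight = idf * queryNorm, and the raw term score is queryWeight * fieldWeight, scaled by a coord(1/3) factor at each of two boolean levels because only one of three query clauses matched. A minimal sketch that reproduces the numbers for this entry, using only the constants shown in the tree (the variable names are mine, not Lucene's):

```python
import math

# Constants copied from the explain tree above (term "network", doc 2097).
freq       = 2.0          # termFreq of "network" in the matched field
doc_freq   = 1398         # docFreq from the idf(...) line
max_docs   = 44218        # maxDocs from the idf(...) line
query_norm = 0.043569047  # queryNorm
field_norm = 0.0390625    # fieldNorm(doc=2097), field-length normalization

tf  = math.sqrt(freq)                            # 1.4142135
idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 4.4533744

query_weight = idf * query_norm                  # 0.19402927
field_weight = tf * idf * field_norm             # 0.2460165
term_score   = query_weight * field_weight       # 0.047734402

# coord(1/3) appears at two nesting levels of the tree, so it is applied
# twice: presumably one of three clauses matched at each boolean level.
final_score = term_score * (1 / 3) * (1 / 3)     # 0.0053038225
print(f"{final_score:.10f}")                     # displayed rounded as 0.01
```

The other score trees on this page have the same shape and differ only in these constants; the entries matching "29" and "22" score what appear to be tokens from date fields, with correspondingly lower idf values.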
    
    Abstract
    Interpretation of semantic propositions in free-text documents such as MEDLINE citations would provide valuable support for biomedical applications, and several approaches to semantic interpretation are being pursued in the biomedical informatics community. In this paper, we describe a methodology for interpreting linguistic structures that encode hypernymic propositions, in which a more specific concept is in a taxonomic relationship with a more general concept. In order to effectively process these constructions, we exploit underspecified syntactic analysis and structured domain knowledge from the Unified Medical Language System (UMLS). After introducing the syntactic processing on which our system depends, we focus on the UMLS knowledge that supports interpretation of hypernymic propositions. We first use semantic groups from the Semantic Network to ensure that the two concepts involved are compatible; hierarchical information in the Metathesaurus then determines which concept is more general and which more specific. A preliminary evaluation of a sample based on the semantic group Chemicals and Drugs provides 83% precision. An error analysis was conducted and potential solutions to the problems encountered are presented. The research discussed here serves as a paradigm for investigating the interaction between domain knowledge and linguistic structure in natural language processing, and could also make a contribution to research on automatic processing of discourse structure. Additional implications of the system we present include its integration in advanced semantic interpretation processors for biomedical text and its use for information extraction in specific domains. The approach has the potential to support a range of applications, including information retrieval and ontology engineering.
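The two-step check described in this abstract (semantic-group compatibility first, then hierarchy direction) can be sketched in a few lines. This is a hedged illustration only: the dictionaries below are invented stand-ins for the UMLS Semantic Network and Metathesaurus, and the depth-based generality test is an assumption, not the paper's exact procedure.

```python
# Toy stand-ins for UMLS resources; not real UMLS data.
SEMANTIC_GROUP = {
    "ibuprofen": "Chemicals and Drugs",
    "nsaid": "Chemicals and Drugs",
    "arthritis": "Disorders",
}
HIERARCHY_DEPTH = {  # smaller depth = more general concept (assumption)
    "nsaid": 2,
    "ibuprofen": 4,
}

def interpret_hypernymic(c1: str, c2: str):
    """Return (general, specific) if the pair can encode a hypernymic
    proposition, or None if the semantic groups are incompatible."""
    if SEMANTIC_GROUP.get(c1) != SEMANTIC_GROUP.get(c2):
        return None  # step 1: the two concepts must be compatible
    # step 2: hierarchical information decides which concept is more general
    return (c1, c2) if HIERARCHY_DEPTH[c1] < HIERARCHY_DEPTH[c2] else (c2, c1)

print(interpret_hypernymic("ibuprofen", "nsaid"))      # ('nsaid', 'ibuprofen')
print(interpret_hypernymic("ibuprofen", "arthritis"))  # None
```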
  2. Gencosman, B.C.; Ozmutlu, H.C.; Ozmutlu, S.: Character n-gram application for automatic new topic identification (2014) 0.01
    0.0053038225 = product of:
      0.015911467 = sum of:
        0.015911467 = product of:
          0.047734402 = sum of:
            0.047734402 = weight(_text_:network in 2688) [ClassicSimilarity], result of:
              0.047734402 = score(doc=2688,freq=2.0), product of:
                0.19402927 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.043569047 = queryNorm
                0.2460165 = fieldWeight in 2688, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2688)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    The widespread availability of the Internet and the variety of Internet-based applications have resulted in a significant increase in the number of web pages. Determining the behaviors of search engine users has become a critical step in enhancing search engine performance. Search engine user behaviors can be determined by content-based or content-ignorant algorithms. Although many content-ignorant studies have been performed to automatically identify new topics, previous results have demonstrated that spelling errors can cause significant errors in topic shift estimates. In this study, we focused on minimizing the number of wrong estimates caused by spelling errors. We developed a new hybrid algorithm combining character n-gram and neural network methodologies, and compared the experimental results with results from previous studies. For the FAST and Excite datasets, the proposed algorithm improved topic shift estimates by 6.987% and 2.639%, respectively. Moreover, we analyzed the performance of the character n-gram method in several respects, including a comparison with the Levenshtein edit-distance method. The experimental results demonstrated that the character n-gram method outperformed the Levenshtein edit-distance method in terms of topic identification.
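A hedged sketch of the character n-gram intuition this abstract relies on: spelling variants of a query share most of their character trigrams, so n-gram overlap between consecutive queries is robust evidence against a topic shift even when the spelling differs. The Dice coefficient and the 0.3 threshold are illustrative choices, not the paper's, and the neural-network stage of the hybrid algorithm is omitted.

```python
def char_ngrams(text: str, n: int = 3) -> set[str]:
    text = f" {text.lower()} "          # pad so word edges form n-grams too
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def ngram_similarity(q1: str, q2: str, n: int = 3) -> float:
    a, b = char_ngrams(q1, n), char_ngrams(q2, n)
    return 2 * len(a & b) / (len(a) + len(b))   # Dice coefficient

def topic_shift(q1: str, q2: str, threshold: float = 0.3) -> bool:
    """Flag a new topic when consecutive queries share few trigrams."""
    return ngram_similarity(q1, q2) < threshold

print(topic_shift("britney spears", "brittany spears"))  # False: same topic
print(topic_shift("britney spears", "weather forecast")) # True: topic shift
```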
  3. Doval, Y.; Gómez-Rodríguez, C.: Comparing neural- and N-gram-based language models for word segmentation (2019) 0.01
    0.0053038225 = product of:
      0.015911467 = sum of:
        0.015911467 = product of:
          0.047734402 = sum of:
            0.047734402 = weight(_text_:network in 4675) [ClassicSimilarity], result of:
              0.047734402 = score(doc=4675,freq=2.0), product of:
                0.19402927 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.043569047 = queryNorm
                0.2460165 = fieldWeight in 4675, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4675)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    Word segmentation is the task of inserting or deleting word-boundary characters in order to separate character sequences that correspond to words in some language. In this article we propose an approach based on a beam search algorithm and a language model working at the byte/character level, the latter component implemented either as an n-gram model or a recurrent neural network. The resulting system analyzes the text input with no word boundaries one token at a time, which can be a character or a byte, and uses the information gathered by the language model to determine whether a boundary must be placed at the current position. Our aim is to use this system as a preprocessing step for a microtext normalization system, which means that it needs to cope effectively with the data sparsity present in this kind of text. We also strove to surpass the performance of two readily available word segmentation systems: the well-known and accessible Word Breaker by Microsoft, and the Python module WordSegment by Grant Jenks. The results show that we have met our objectives, and we hope to continue to improve both the precision and the efficiency of our system in the future.
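A hedged sketch of the segmentation loop this abstract describes: consume the unsegmented input one character at a time, keep a beam of partial hypotheses, and let a language model score "place a boundary here" against "keep extending the current word". A toy word-frequency model stands in for the paper's character-level n-gram/RNN models; the vocabulary, unknown-word penalty, and beam width are illustrative.

```python
import math

# Toy unigram word model; real systems score at the byte/character level.
VOCAB = {"the": 0.04, "cat": 0.01, "sat": 0.01, "on": 0.03, "mat": 0.01}

def word_logprob(word: str) -> float:
    return math.log(VOCAB.get(word, 1e-12))  # harsh penalty for non-words

def segment(text: str, beam_width: int = 10) -> str:
    # Each hypothesis: (score, closed_words, current_partial_word).
    beam = [(0.0, [], "")]
    for ch in text:
        candidates = []
        for score, words, partial in beam:
            partial = partial + ch
            # Hypothesis 1: no boundary yet, keep extending the word.
            candidates.append((score, words, partial))
            # Hypothesis 2: place a boundary right after this character.
            candidates.append((score + word_logprob(partial),
                               words + [partial], ""))
        beam = sorted(candidates, key=lambda c: c[0],
                      reverse=True)[:beam_width]
    # Only hypotheses that closed their last word are complete.
    complete = [(score, words) for score, words, partial in beam if not partial]
    return " ".join(max(complete)[1])

print(segment("thecatsatonthemat"))  # -> "the cat sat on the mat"
```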
  4. Soni, S.; Lerman, K.; Eisenstein, J.: Follow the leader : documents on the leading edge of semantic change get more citations (2021) 0.01
    0.0053038225 = product of:
      0.015911467 = sum of:
        0.015911467 = product of:
          0.047734402 = sum of:
            0.047734402 = weight(_text_:network in 169) [ClassicSimilarity], result of:
              0.047734402 = score(doc=169,freq=2.0), product of:
                0.19402927 = queryWeight, product of:
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.043569047 = queryNorm
                0.2460165 = fieldWeight in 169, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.4533744 = idf(docFreq=1398, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=169)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    Diachronic word embeddings (vector representations of words over time) offer remarkable insights into the evolution of language and provide a tool for quantifying sociocultural change from text documents. Prior work has used such embeddings to identify shifts in the meaning of individual words. However, simply knowing that a word has changed in meaning is insufficient to identify the instances of word usage that convey the historical meaning or the newer meaning. In this study, we link diachronic word embeddings to documents, by situating those documents as leaders or laggards with respect to ongoing semantic changes. Specifically, we propose a novel method to quantify the degree of semantic progressiveness in each word usage, and then show how these usages can be aggregated to obtain scores for each document. We analyze two large collections of documents, representing legal opinions and scientific articles. Documents that are scored as semantically progressive receive a larger number of citations, indicating that they are especially influential. Our work thus provides a new technique for identifying lexical semantic leaders and demonstrates a new link between progressive use of language and influence in a citation network.
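A hedged sketch of the document-scoring idea: given an "old" and a "new" embedding for a word undergoing semantic change, a usage (represented here by a context vector) is progressive if it sits closer to the new vector than to the old one, and a document's score aggregates its usages. The cosine-difference score and the synthetic vectors are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(0)

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def usage_progressiveness(context_vec, old_vec, new_vec):
    # Positive when the usage leans toward the word's newer sense.
    return cos(context_vec, new_vec) - cos(context_vec, old_vec)

def document_score(context_vecs, old_vec, new_vec):
    return float(np.mean([usage_progressiveness(c, old_vec, new_vec)
                          for c in context_vecs]))

old_vec = rng.normal(size=50)   # embedding of a word, earlier time period
new_vec = rng.normal(size=50)   # embedding of the same word, later period
# A "leading" document: its usages lean toward the newer sense.
leader  = [new_vec + 0.5 * rng.normal(size=50) for _ in range(20)]
# A "lagging" document: its usages lean toward the older sense.
laggard = [old_vec + 0.5 * rng.normal(size=50) for _ in range(20)]

print(document_score(leader, old_vec, new_vec) >
      document_score(laggard, old_vec, new_vec))   # True
```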
  5. Chiba, K.; Kyojima, M.: Document transformation based on syntax-directed free translation (1995) 0.01
    0.0052947453 = product of:
      0.015884236 = sum of:
        0.015884236 = product of:
          0.047652703 = sum of:
            0.047652703 = weight(_text_:29 in 4069) [ClassicSimilarity], result of:
              0.047652703 = score(doc=4069,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.31092256 = fieldWeight in 4069, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4069)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Electronic publishing. 8(1995) no.1, S.15-29
  6. Mustafa El Hadi, W.: Terminologies, ontologies and information access (2006) 0.01
    0.0052947453 = product of:
      0.015884236 = sum of:
        0.015884236 = product of:
          0.047652703 = sum of:
            0.047652703 = weight(_text_:29 in 1488) [ClassicSimilarity], result of:
              0.047652703 = score(doc=1488,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.31092256 = fieldWeight in 1488, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1488)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    29. 2.2008 16:25:23
  7. Wanner, L.: Lexical choice in text generation and machine translation (1996) 0.01
    0.005247115 = product of:
      0.015741345 = sum of:
        0.015741345 = product of:
          0.047224034 = sum of:
            0.047224034 = weight(_text_:22 in 8521) [ClassicSimilarity], result of:
              0.047224034 = score(doc=8521,freq=2.0), product of:
                0.15257138 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043569047 = queryNorm
                0.30952093 = fieldWeight in 8521, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=8521)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    31. 7.1996 9:22:19
  8. Riloff, E.: An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.01
    0.005247115 = product of:
      0.015741345 = sum of:
        0.015741345 = product of:
          0.047224034 = sum of:
            0.047224034 = weight(_text_:22 in 6752) [ClassicSimilarity], result of:
              0.047224034 = score(doc=6752,freq=2.0), product of:
                0.15257138 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043569047 = queryNorm
                0.30952093 = fieldWeight in 6752, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6752)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    6. 3.1997 16:22:15
  9. Basili, R.; Pazienza, M.T.; Velardi, P.: An empirical symbolic approach to natural language processing (1996) 0.01
    0.005247115 = product of:
      0.015741345 = sum of:
        0.015741345 = product of:
          0.047224034 = sum of:
            0.047224034 = weight(_text_:22 in 6753) [ClassicSimilarity], result of:
              0.047224034 = score(doc=6753,freq=2.0), product of:
                0.15257138 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043569047 = queryNorm
                0.30952093 = fieldWeight in 6753, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6753)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    6. 3.1997 16:22:15
  10. Haas, S.W.: Natural language processing : toward large-scale, robust systems (1996) 0.01
    0.005247115 = product of:
      0.015741345 = sum of:
        0.015741345 = product of:
          0.047224034 = sum of:
            0.047224034 = weight(_text_:22 in 7415) [ClassicSimilarity], result of:
              0.047224034 = score(doc=7415,freq=2.0), product of:
                0.15257138 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043569047 = queryNorm
                0.30952093 = fieldWeight in 7415, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7415)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    State-of-the-art review of natural language processing, updating an earlier review published in ARIST 22(1987). Discusses important developments that have allowed for significant advances in the field: materials and resources; knowledge-based systems and statistical approaches; and a strong emphasis on evaluation. Reviews some natural language processing applications and common problems still awaiting solution. Considers closely related applications, such as language generation and the generation phase of machine translation, which face the same problems as natural language processing. Covers natural language methodologies for information retrieval only briefly.
  11. Way, E.C.: Knowledge representation and metaphor (or: meaning) (1994) 0.01
    0.005247115 = product of:
      0.015741345 = sum of:
        0.015741345 = product of:
          0.047224034 = sum of:
            0.047224034 = weight(_text_:22 in 771) [ClassicSimilarity], result of:
              0.047224034 = score(doc=771,freq=2.0), product of:
                0.15257138 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043569047 = queryNorm
                0.30952093 = fieldWeight in 771, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=771)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Footnote
    Originally published by Kluwer in 1991 // Rev. in: Knowledge organization 22(1995) no.1, S.48-49 (O. Sechser)
  12. Morris, V.: Automated language identification of bibliographic resources (2020) 0.01
    0.005247115 = product of:
      0.015741345 = sum of:
        0.015741345 = product of:
          0.047224034 = sum of:
            0.047224034 = weight(_text_:22 in 5749) [ClassicSimilarity], result of:
              0.047224034 = score(doc=5749,freq=2.0), product of:
                0.15257138 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043569047 = queryNorm
                0.30952093 = fieldWeight in 5749, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5749)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    2. 3.2020 19:04:22
  13. Gomez, F.: Learning word syntactic subcategorizations interactively (1995) 0.00
    0.0046329014 = product of:
      0.013898704 = sum of:
        0.013898704 = product of:
          0.041696113 = sum of:
            0.041696113 = weight(_text_:29 in 3130) [ClassicSimilarity], result of:
              0.041696113 = score(doc=3130,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.27205724 = fieldWeight in 3130, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3130)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    29. 1.1996 18:28:58
  14. Saeed, K.; Dardzinska, A.: Natural language processing : word recognition without segmentation (2001) 0.00
    0.0046329014 = product of:
      0.013898704 = sum of:
        0.013898704 = product of:
          0.041696113 = sum of:
            0.041696113 = weight(_text_:29 in 7707) [ClassicSimilarity], result of:
              0.041696113 = score(doc=7707,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.27205724 = fieldWeight in 7707, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7707)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    16.12.2001 18:29:38
  15. WordNet : an electronic lexical database (language, speech and communication) (1998) 0.00
    0.0046329014 = product of:
      0.013898704 = sum of:
        0.013898704 = product of:
          0.041696113 = sum of:
            0.041696113 = weight(_text_:29 in 2434) [ClassicSimilarity], result of:
              0.041696113 = score(doc=2434,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.27205724 = fieldWeight in 2434, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2434)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    29. 3.1996 18:16:49
  16. Chen, K.-H.: Evaluating Chinese text retrieval with multilingual queries (2002) 0.00
    0.0046329014 = product of:
      0.013898704 = sum of:
        0.013898704 = product of:
          0.041696113 = sum of:
            0.041696113 = weight(_text_:29 in 1851) [ClassicSimilarity], result of:
              0.041696113 = score(doc=1851,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.27205724 = fieldWeight in 1851, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1851)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Knowledge organization. 29(2002) nos.3/4, S.156-170
  17. Bowker, L.: Information retrieval in translation memory systems : assessment of current limitations and possibilities for future development (2002) 0.00
    0.0046329014 = product of:
      0.013898704 = sum of:
        0.013898704 = product of:
          0.041696113 = sum of:
            0.041696113 = weight(_text_:29 in 1854) [ClassicSimilarity], result of:
              0.041696113 = score(doc=1854,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.27205724 = fieldWeight in 1854, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1854)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Knowledge organization. 29(2002) nos.3/4, S.198-203
  18. Rindflesch, T.C.; Aronson, A.R.: Semantic processing in information retrieval (1993) 0.00
    0.0046329014 = product of:
      0.013898704 = sum of:
        0.013898704 = product of:
          0.041696113 = sum of:
            0.041696113 = weight(_text_:29 in 4121) [ClassicSimilarity], result of:
              0.041696113 = score(doc=4121,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.27205724 = fieldWeight in 4121, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4121)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    29. 6.2015 14:51:28
  19. Stoykova, V.; Petkova, E.: Automatic extraction of mathematical terms for precalculus (2012) 0.00
    0.0046329014 = product of:
      0.013898704 = sum of:
        0.013898704 = product of:
          0.041696113 = sum of:
            0.041696113 = weight(_text_:29 in 156) [ClassicSimilarity], result of:
              0.041696113 = score(doc=156,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.27205724 = fieldWeight in 156, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=156)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    29. 5.2012 10:17:08
  20. Rayson, P.; Piao, S.; Sharoff, S.; Evert, S.; Moiron, B.V.: Multiword expressions : hard going or plain sailing? (2015) 0.00
    0.0046329014 = product of:
      0.013898704 = sum of:
        0.013898704 = product of:
          0.041696113 = sum of:
            0.041696113 = weight(_text_:29 in 2918) [ClassicSimilarity], result of:
              0.041696113 = score(doc=2918,freq=2.0), product of:
                0.15326229 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043569047 = queryNorm
                0.27205724 = fieldWeight in 2918, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2918)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    29. 4.2016 12:05:56

Types

  • a 83
  • el 6
  • m 4
  • s 3
  • p 2
  • x 1