Search (110 results, page 2 of 6)

  • × theme_ss:"Computerlinguistik"
  • × year_i:[2010 TO 2020}
  1. Malo, P.; Sinha, A.; Korhonen, P.; Wallenius, J.; Takala, P.: Good debt or bad debt : detecting semantic orientations in economic texts (2014) 0.02
    Abstract
    The use of robo-readers to analyze news texts is an emerging technology trend in computational finance. Recent research has developed sophisticated financial polarity lexicons for investigating how financial sentiments relate to future company performance. However, based on experience from fields that commonly analyze sentiment, it is well known that the overall semantic orientation of a sentence may differ from that of individual words. This article investigates how semantic orientations can be better detected in financial and economic news by accommodating the overall phrase-structure information and domain-specific use of language. Our three main contributions are the following: (a) a human-annotated finance phrase bank that can be used for training and evaluating alternative models; (b) a technique to enhance financial lexicons with attributes that help to identify expected direction of events that affect sentiment; and (c) a linearized phrase-structure model for detecting contextual semantic orientations in economic texts. The relevance of the newly added lexicon features and the benefit of using the proposed learning algorithm are demonstrated in a comparative study against general sentiment models as well as the popular word frequency models used in recent financial studies. The proposed framework is parsimonious and avoids the explosion in feature space caused by the use of conventional n-gram features.
    Source
    Journal of the Association for Information Science and Technology. 65(2014) no.4, S.782-796
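    The relevance figure after each title is a Lucene ClassicSimilarity score. As a minimal sketch, the 0.02 for entry 1 can be reproduced from the idf, queryNorm, and fieldNorm values the engine reports for that record; the code below simply restates the ClassicSimilarity formula (sqrt(tf) * idf^2 * queryNorm * fieldNorm per term, times a coordination factor):

      # Minimal sketch of Lucene ClassicSimilarity scoring; the constants are
      # the idf/norm values the engine reports for entry 1 (doc 1226).
      import math

      def term_score(tf, idf, query_norm, field_norm):
          query_weight = idf * query_norm                   # idf * queryNorm
          field_weight = math.sqrt(tf) * idf * field_norm   # tf^0.5 * idf * fieldNorm
          return query_weight * field_weight

      query_norm, field_norm = 0.041294612, 0.0390625
      score = (term_score(6, 3.0620887, query_norm, field_norm)          # "use"
               + term_score(18, 1.5637573, query_norm, field_norm)       # "of"
               + term_score(2, 2.199415, query_norm, field_norm) * 0.5)  # "on", coord(1/2)
      print(round(score * 0.375, 3))  # coord(3/8) -> 0.022, shown as 0.02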
  2. Farrús, M.; Costa-jussà, M.R.; Popović, M.: Study and correlation analysis of linguistic, perceptual, and automatic machine translation evaluations (2012) 0.02
    Abstract
    Evaluation of machine translation output is an important task. Various human evaluation techniques as well as automatic metrics have been proposed and investigated in the last decade. However, very few evaluation methods take the linguistic aspect into account. In this article, we use an objective evaluation method for machine translation output that classifies all translation errors into one of the five following linguistic levels: orthographic, morphological, lexical, semantic, and syntactic. Linguistic guidelines for the target language are required, and human evaluators use them to classify the output errors. The experiments are performed on English-to-Catalan and Spanish-to-Catalan translation outputs generated by four different systems: two rule-based and two statistical. All translations are evaluated using the following three methods: a standard human perceptual evaluation method, several widely used automatic metrics, and the human linguistic evaluation. Pearson and Spearman correlation coefficients between the linguistic, perceptual, and automatic results are then calculated, showing that the semantic level correlates significantly with both perceptual evaluation and automatic metrics.
    Source
    Journal of the American Society for Information Science and Technology. 63(2012) no.1, S.174-184
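    The correlation step described in entry 2 is easy to reproduce. A minimal sketch with invented per-system scores (the paper's actual numbers are not reproduced here):

      # Pearson and Spearman correlations between two evaluation methods,
      # as in the paper; the score vectors below are hypothetical.
      from scipy.stats import pearsonr, spearmanr

      linguistic = [0.71, 0.64, 0.58, 0.49]  # hypothetical semantic-level scores
      perceptual = [3.9, 3.5, 3.1, 2.8]      # hypothetical human ratings
      print(pearsonr(linguistic, perceptual))   # (r, p-value)
      print(spearmanr(linguistic, perceptual))  # (rho, p-value)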
  3. Perovšek, M.; Kranjc, J.; Erjavec, T.; Cestnik, B.; Lavrač, N.: TextFlows : a visual programming platform for text mining and natural language processing (2016) 0.02
    Abstract
    Text mining and natural language processing are fast growing areas of research, with numerous applications in business, science and creative industries. This paper presents TextFlows, a web-based text mining and natural language processing platform supporting workflow construction, sharing and execution. The platform enables visual construction of text mining workflows through a web browser, and the execution of the constructed workflows on a processing cloud. This makes TextFlows an adaptable infrastructure for the construction and sharing of text processing workflows, which can be reused in various applications. The paper presents the implemented text mining and language processing modules, and describes some precomposed workflows. Their features are demonstrated on three use cases: comparison of document classifiers and of different part-of-speech taggers on a text categorization problem, and outlier detection in document corpora.
    Source
    Science of Computer Programming. In press, 2016
  4. Schmolz, H.: Anaphora resolution and text retrieval : a linguistic analysis of hypertexts (2015) 0.02
    RSWK
    Englisch / Anapher <Syntax> / Hypertext / Information Retrieval / Korpus <Linguistik>
    Subject
    Englisch / Anapher <Syntax> / Hypertext / Information Retrieval / Korpus <Linguistik>
  5. Fang, L.; Tuan, L.A.; Hui, S.C.; Wu, L.: Syntactic based approach for grammar question retrieval (2018) 0.02
    Abstract
    With the popularity of online educational platforms, English learners can learn and practice no matter where they are and what they do. English grammar is one of the important components in learning English. To learn English grammar effectively, it requires students to practice questions containing focused grammar knowledge. In this paper, we study a novel problem of retrieving English grammar questions with similar grammatical focus. Since the grammatical focus similarity is different from textual similarity or sentence syntactic similarity, existing approaches cannot be applied directly to our problem. To address this problem, we propose a syntactic based approach for English grammar question retrieval which can retrieve related grammar questions with similar grammatical focus effectively. In the proposed syntactic based approach, we first propose a new syntactic tree, namely parse-key tree, to capture English grammar questions' grammatical focus. Next, we propose two kernel functions, namely relaxed tree kernel and part-of-speech order kernel, to compute the similarity between two parse-key trees of the query and grammar questions in the collection. Then, the retrieved grammar questions are ranked according to the similarity between the parse-key trees. In addition, if a query is submitted together with answer choices, conceptual similarity and textual similarity are also incorporated to further improve the retrieval accuracy. The performance results have shown that our proposed approach outperforms the state-of-the-art methods based on statistical analysis and syntactic analysis.
  6. Ramisch, C.; Villavicencio, A.; Kordoni, V.: Introduction to the special issue on multiword expressions : from theory to practice and use (2013) 0.02
    Abstract
    We are in 2013, and multiword expressions have been around for a while in the computational linguistics research community. Since the first ACL workshop on MWEs 12 years ago in Sapporo, Japan, much has been discussed, proposed, experimented, evaluated and argued about MWEs. And yet, they deserve the publication of a whole special issue of the ACM Transactions on Speech and Language Processing. But what is it about multiword expressions that keeps them in fashion? Who are the people and the institutions who perform and publish groundbreaking fundamental and applied research in this field? What is the place and the relevance of our lively research community in the bigger picture of computational linguistics? Where do we come from as a community, and most importantly, where are we heading? In this introductory article, we share our point of view about the answers to these questions and introduce the articles that compose the current special issue.
    Source
    ACM Transactions on Speech and Language Processing. 10(2013) no.2
  7. Nissim, M.; Zaninello, A.: Modeling the internal variability of multiword expressions through a pattern-based method (2013) 0.02
    Abstract
    The issue of internal variability of multiword expressions (MWEs) is crucial for their identification and extraction in running text. We present a corpus-supported and computational study on Italian MWEs, aimed at defining an automatic method for modeling internal variation, exploiting frequency and part-of-speech (POS) information. We do so by deriving an XML-encoded lexicon of MWEs based on a manually compiled dictionary, which is then projected onto a large corpus. Since a search for fixed forms suffers from low recall, while an unconstrained flexible search for lemmas yields a loss in precision, we suggest a procedure aimed at maximizing precision in the identification of MWEs within a flexible search. Our method builds on the idea that internal variability can be modelled via the novel introduction of variation patterns, which work over POS patterns, and can be used as working tools for controlling precision. We also compare the performance of variation patterns to that of association measures, and explore the possibility of using variation patterns in MWE extraction in addition to identification. Finally, we suggest that corpus-derived, pattern-related information can be included in the original MWE lexicon by means of an enriched coding and the creation of an XML-based repository of patterns.
    Series
    Special issue on multiword expressions: from theory to practice and use
    Source
    ACM Transactions on Speech and Language Processing. 10(2013) no.2, Article7, S.1-26
  8. Al-Shawakfa, E.; Al-Badarneh, A.; Shatnawi, S.; Al-Rabab'ah, K.; Bani-Ismail, B.: A comparison study of some Arabic root finding algorithms (2010) 0.02
    Abstract
    Arabic has a complex structure, which makes it difficult to apply natural language processing (NLP). Much research on Arabic NLP (ANLP) does exist; however, it is not as mature as that of other languages. Finding Arabic roots is an important step toward conducting effective research on most ANLP applications. The authors have studied and compared six root-finding algorithms with success rates of over 90%. These algorithms did not all use the same testing corpus or benchmarking measures, so the authors unified the testing process by implementing their own algorithm descriptions and building a corpus out of 3823 triliteral roots, applying 73 triliteral patterns with 18 affixes, producing around 27.6 million words. They tested the algorithms with the generated corpus and have obtained interesting results; they offer to share the corpus freely for benchmarking and ANLP research.
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.5, S.1015-1024
  9. Stoykova, V.; Petkova, E.: Automatic extraction of mathematical terms for precalculus (2012) 0.02
    Abstract
    In this work, we present the results of research for evaluating a methodology for extracting mathematical terms for precalculus using the techniques for semantically-oriented statistical search. We use the corpus-based approach and the combination of different statistically-based techniques for extracting keywords, collocations and co-occurrences incorporated in the Sketch Engine software. We evaluate the collocation candidate terms for the basic concept function(s) and validate the related methodology against precalculus domain conceptual term definitions. Finally, we offer a hierarchical representation of the conceptual terms and discuss the results with respect to their possible applications.
    Content
    Contribution to: First World Conference on Innovation and Computer Sciences (INSODE 2011). See: http://www.sciencedirect.com/science/article/pii/S221201731200103X.
  10. Doko, A.; Stula, M.; Seric, L.: Improved sentence retrieval using local context and sentence length (2013) 0.02
    Abstract
    In this paper we propose improved variants of the sentence retrieval method TF-ISF (a TF-IDF or Term Frequency-Inverse Document Frequency variant for sentence retrieval). The improvement is achieved by using context consisting of neighboring sentences while at the same time promoting the retrieval of longer sentences. We thoroughly compare the new modified TF-ISF methods to the TF-ISF baseline, to an earlier attempt to include context into TF-ISF named tfmix, and to a language-modeling-based method named 3MMPDS, which uses context and promotes the retrieval of long sentences. Experimental results show that the TF-ISF method can be improved using local context. Results also show that the TF-ISF method can be improved by promoting the retrieval of longer sentences. Finally, we show that the best results are achieved when combining both modifications. All new methods (TF-ISF variants) also show statistically significantly better results than the other tested methods.
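    For context, the TF-ISF baseline that entry 10 improves on is tf-idf computed over sentences rather than documents. A minimal sketch of that baseline only; the paper's local-context and sentence-length modifications are deliberately left out:

      # TF-ISF baseline: score = sum over query terms of tf * log(1 + N/sf),
      # where sf is the number of sentences containing the term.
      import math

      def tf_isf(query, sentence, sentences):
          n = len(sentences)
          score = 0.0
          for term in query:
              sf = sum(term in s for s in sentences)  # sentence frequency
              if sf:
                  score += sentence.count(term) * math.log(1 + n / sf)
          return score

      sentences = [["sentence", "retrieval", "works"],
                   ["local", "context", "helps", "retrieval"],
                   ["unrelated", "text"]]
      query = ["sentence", "retrieval"]
      print(max(sentences, key=lambda s: tf_isf(query, s, sentences)))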
  11. Lhadj, L.S.; Boughanem, M.; Amrouche, K.: Enhancing information retrieval through concept-based language modeling and semantic smoothing (2016) 0.02
    Abstract
    Traditionally, many information retrieval models assume that terms occur in documents independently. Although these models have already shown good performance, the word independency assumption seems to be unrealistic from a natural language point of view, which considers that terms are related to each other. Therefore, such an assumption leads to two well-known problems in information retrieval (IR), namely, polysemy, or term mismatch, and synonymy. In language models, these issues have been addressed by considering dependencies such as bigrams, phrasal-concepts, or word relationships, but such models are estimated using simple n-grams or concept counting. In this paper, we address polysemy and synonymy mismatch with a concept-based language modeling approach that combines ontological concepts from external resources with frequently found collocations from the document collection. In addition, the concept-based model is enriched with subconcepts and semantic relationships through a semantic smoothing technique so as to perform semantic matching. Experiments carried out on TREC collections show that our model achieves significant improvements over a single word-based model and the Markov Random Field model (using a Markov classifier).
    Source
    Journal of the Association for Information Science and Technology. 67(2016) no.12, S.2909-2927
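    In general form, the semantic smoothing mentioned in entry 11 interpolates a document's maximum-likelihood language model with a model induced from related concepts. A generic formulation only (the paper's exact estimator may differ; \lambda and the concept set C(d) are placeholders):

      p(w \mid d) = \lambda \, p_{\mathrm{ml}}(w \mid d) + (1 - \lambda) \sum_{c \in C(d)} p(w \mid c) \, p(c \mid d)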
  12. Rosemblat, G.; Resnick, M.P.; Auston, I.; Shin, D.; Sneiderman, C.; Fiszman, M.; Rindflesch, T.C.: Extending SemRep to the public health domain (2013) 0.02
    Abstract
    We describe the use of a domain-independent method to extend a natural language processing (NLP) application, SemRep (Rindflesch, Fiszman, & Libbus, 2005), based on the knowledge sources afforded by the Unified Medical Language System (UMLS®; Humphreys, Lindberg, Schoolman, & Barnett, 1998) to support the area of health promotion within the public health domain. Public health professionals require good information about successful health promotion policies and programs that might be considered for application within their own communities. Our effort seeks to improve access to relevant information for the public health profession, to help those in the field remain an information-savvy workforce. Natural language processing and semantic techniques hold promise to help public health professionals navigate the growing ocean of information by organizing and structuring this knowledge into a focused public health framework paired with a user-friendly visualization application as a way to summarize results of PubMed® searches in this field of knowledge.
    Source
    Journal of the American Society for Information Science and Technology. 64(2013) no.10, S.1963-1974
  13. Clark, M.; Kim, Y.; Kruschwitz, U.; Song, D.; Albakour, D.; Dignum, S.; Beresi, U.C.; Fasli, M.; De Roeck, A.: Automatically structuring domain knowledge from text : an overview of current research (2012) 0.02
    Abstract
    This paper presents an overview of automatic methods for building domain knowledge structures (domain models) from text collections. Applications of domain models have a long history within knowledge engineering and artificial intelligence. In the last couple of decades they have surfaced noticeably as a useful tool within natural language processing, information retrieval and semantic web technology. Inspired by the ubiquitous propagation of domain model structures that are emerging in several research disciplines, we give an overview of the current research landscape and some techniques and approaches. We will also discuss trade-offs between different approaches and point to some recent trends.
    Content
    Contribution to a special issue "Soft Approaches to IA on the Web". See: doi:10.1016/j.ipm.2011.07.002.
  14. Hmeidi, I.I.; Al-Shalabi, R.F.; Al-Taani, A.T.; Najadat, H.; Al-Hazaimeh, S.A.: A novel approach to the extraction of roots from Arabic words using bigrams (2010) 0.02
    Abstract
    Root extraction is one of the most important topics in information retrieval (IR), natural language processing (NLP), text summarization, and many other important fields. In the last two decades, several algorithms have been proposed to extract Arabic roots. Most of these algorithms dealt with triliteral roots only, and some with fixed-length words only. In this study, a novel approach to the extraction of roots from Arabic words using bigrams is proposed. Two similarity measures are used: the dissimilarity measure called the Manhattan distance, and Dice's measure of similarity. The proposed algorithm is tested on the Holy Qur'an and on a corpus of 242 abstracts from the Proceedings of the Saudi Arabian National Computer Conferences. The two files used contain a wide range of data: the Holy Qur'an contains most of the ancient Arabic words, while the other file contains some modern Arabic words and some words borrowed from foreign languages in addition to the original Arabic words. The results of this study showed that combining N-grams with the Dice measure gives better results than using the Manhattan distance measure.
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.3, S.583-591
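    A minimal sketch of the bigram-plus-Dice idea from entry 14; illustrative only, with transliterated strings standing in for Arabic script and hypothetical candidate roots:

      # Match a word to the candidate root with the highest Dice similarity
      # over character bigrams.
      def bigrams(word):
          return {word[i:i + 2] for i in range(len(word) - 1)}

      def dice(a, b):
          x, y = bigrams(a), bigrams(b)
          return 2 * len(x & y) / (len(x) + len(y)) if x or y else 0.0

      word = "yaktubuna"                  # hypothetical transliterated form
      candidates = ["ktb", "qrr", "drs"]  # hypothetical triliteral roots
      print(max(candidates, key=lambda r: dice(word, r)))  # -> "ktb"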
  15. Schmolz, H.: Anaphora resolution and text retrieval : a linguistic analysis of hypertexts (2013) 0.02
    Content
    Winner of the 2014 VFI Dissertation Award: "A convincing and thorough linguistic and quantitative analysis of a text element that has so far received little attention in information retrieval, based on a large hypertext corpus built specifically for this purpose, including the evaluation of the author's own resolution rules for use in future IR systems."
  16. Muresan, S.; Klavans, J.L.: Inducing terminologies from text : a case study for the consumer health domain (2013) 0.02
    Abstract
    Specialized medical ontologies and terminologies, such as SNOMED CT and the Unified Medical Language System (UMLS), have been successfully leveraged in medical information systems to provide a standard web-accessible medium for interoperability, access, and reuse. However, these clinically oriented terminologies and ontologies cannot provide sufficient support when integrated into consumer-oriented applications, because these applications must "understand" both technical and lay vocabulary. The latter is not part of these specialized terminologies and ontologies. In this article, we propose a two-step approach for building consumer health terminologies from text: 1) automatic extraction of definitions from consumer-oriented articles and web documents, which reflects language in use, rather than relying solely on dictionaries, and 2) learning to map definitions expressed in natural language to terminological knowledge by inducing a syntactic-semantic grammar rather than using hand-written patterns or grammars. We present quantitative and qualitative evaluations of our two-step approach, which show that our framework could be used to induce consumer health terminologies from text.
    Source
    Journal of the American Society for Information Science and Technology. 64(2013) no.4, S.727-744
  17. Keselman, A.; Rosemblat, G.; Kilicoglu, H.; Fiszman, M.; Jin, H.; Shin, D.; Rindflesch, T.C.: Adapting semantic natural language processing technology to address information overload in influenza epidemic management (2010) 0.02
    Abstract
    The explosion of disaster health information results in information overload among response professionals. The objective of this project was to determine the feasibility of applying semantic natural language processing (NLP) technology to addressing this overload. The project characterizes concepts and relationships commonly used in disaster health-related documents on influenza pandemics, as the basis for adapting an existing semantic summarizer to the domain. Methods include human review and semantic NLP analysis of a set of relevant documents. This is followed by a pilot test in which two information specialists use the adapted application for a realistic information-seeking task. According to the results, the ontology of influenza epidemics management can be described via a manageable number of semantic relationships that involve concepts from a limited number of semantic types. Test users demonstrate several ways to engage with the application to obtain useful information. This suggests that existing semantic NLP algorithms can be adapted to support information summarization and visualization in influenza epidemics and other disaster health areas. However, additional research is needed in the areas of terminology development (as many relevant relationships and terms are not part of existing standardized vocabularies), NLP, and user interface design.
    Source
    Journal of the American Society for Information Science and Technology. 61(2010) no.12, S.2531-2543
  18. Karlova-Bourbonus, N.: Automatic detection of contradictions in texts (2018) 0.02
    Abstract
    Natural language contradictions are of a complex nature. As will be shown in Chapter 5, the realization of contradictions is not limited to examples such as Socrates is a man and Socrates is not a man (under the condition that Socrates refers to the same object in the real world), which is discussed by Aristotle (Section 3.1.1). Empirical evidence (see Chapter 5 for more details) shows that only a few contradictions occurring in real life are of that explicit (prototypical) kind. Rather, contradictions make use of a variety of natural language devices such as, e.g., paraphrasing, synonyms and antonyms, passive and active voice, diversity of negation expression, and figurative linguistic means such as idioms, irony, and metaphors. Additionally, the most sophisticated kind of contradictions, the so-called implicit contradictions, can be found only when applying world knowledge and after conducting a sequence of logical operations, such as, e.g., in: (1.1) The first prize was given to the experienced grandmaster L. Stein who, in total, collected ten points (7 wins and 3 draws). Those familiar with the chess rules know that a chess player gets one point for winning and zero points for losing the game. In case of a draw, each player gets a half point. Building on this idea and by conducting some simple mathematical operations, we can infer that in the case of 7 wins and 3 draws (the second part of the sentence), a player can only collect 8.5 points and not 10 points. Hence, we observe that there is a contradiction between the first and the second parts of the sentence.
    Implicit contradictions will only partially be the subject of the present study, which aims primarily at identifying the realization mechanisms and cues (Chapter 5) as well as finding the parts of contradictions by applying state-of-the-art algorithms for natural language processing, without conducting deep meaning processing. Further in focus are the explicit and implicit contradictions that can be detected by means of explicit linguistic, structural, and lexical cues, and by conducting some additional processing operations (e.g., counting a sum in order to detect contradictions arising from numerical divergences). One should note that additional complexity in finding contradictions can arise when parts of the contradictions occur on different levels of realization. Thus, a contradiction can be observed on the word and phrase level, such as in a married bachelor (for variations of contradictions on the lexical level, see Ganeev 2004), on the sentence level - between parts of a sentence or between two or more sentences - or on the text level - between portions of a text or between whole texts, such as a contradiction between the Bible and the Quran, for example. Only contradictions arising at the level of single sentences occurring in one or more texts, as well as parts of a sentence, will be considered for the purpose of this study. Though the focus of interest will be on single sentences, the study will make use of text particularities such as coreference resolution, without establishing the referents in the real world. Finally, another aspect to be considered is that the parts of a contradiction do not necessarily appear at the same time. They can be separated by many years and centuries, with or without a time expression, making their recognition by humans and their detection by machines challenging. According to Aristotle's ontological version of the LNC (Section 3.1.1), however, the same time reference is required in order for two statements to be judged as a contradiction. Taking this into account, we set the borders of the study by limiting the analyzed textual data thematically (only nine world events) and temporally (three days after the reported event happened) (Section 5.1). No sophisticated time processing will thus be conducted.
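    The numerical check walked through in example (1.1) amounts to a one-line computation; as a sketch:

      # 7 wins (1 point each) + 3 draws (0.5 each) yield 8.5 points, which
      # contradicts the claimed total of 10.
      wins, draws, claimed = 7, 3, 10.0
      actual = wins * 1.0 + draws * 0.5
      print(actual, actual == claimed)  # 8.5 False -> contradiction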
  19. Fegley, B.D.; Torvik, V.I.: On the role of poetic versus nonpoetic features in "kindred" and diachronic poetry attribution (2012) 0.02
    Abstract
    Author attribution studies have demonstrated remarkable success in applying orthographic and lexicographic features of text in a variety of discrimination problems. What might poetic features, such as syllabic stress and mood, contribute? We address this question in the context of two different attribution problems: (a) kindred: differentiate Langston Hughes' early poems from those of kindred poets and (b) diachronic: differentiate Hughes' early from his later poems. Using a diverse set of 535 generic text features, each categorized as poetic or nonpoetic, correlation-based greedy forward search ranked the features and a support vector machine classified the poems. A small subset of features (~10) achieved cross-validated precision and recall as high as 87%. Poetic features (rhyme patterns particularly) were nearly as effective as nonpoetic in kindred discrimination, but less effective diachronically. In other words, Hughes used both poetic and nonpoetic features in distinctive ways and his use of nonpoetic features evolved systematically while he continued to experiment with poetic features. These findings affirm qualitative studies attesting to structural elements from Black oral tradition and Black folk music (blues) and to the internal consistency of Hughes' early poetry.
    Source
    Journal of the American Society for Information Science and Technology. 63(2012) no.11, S.2165-2181
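    A hedged sketch of the two-stage pipeline in entry 19, substituting scikit-learn's SequentialFeatureSelector for the paper's correlation-based greedy forward search; the feature matrix and labels are random placeholders:

      # Greedy forward feature selection followed by an SVM classifier.
      import numpy as np
      from sklearn.feature_selection import SequentialFeatureSelector
      from sklearn.svm import SVC

      rng = np.random.default_rng(0)
      X = rng.random((40, 20))    # 40 poems, 20 generic text features
      y = rng.integers(0, 2, 40)  # 0 = kindred poet, 1 = Hughes

      selector = SequentialFeatureSelector(SVC(), n_features_to_select=5, cv=4)
      selector.fit(X, y)
      clf = SVC().fit(X[:, selector.get_support()], y)
      print(selector.get_support(indices=True))  # indices of selected features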
  20. Rozinajová, V.; Macko, P.: Using natural language to search linked data (2017) 0.02
    Abstract
    There are many endeavors aiming to offer users more effective ways of getting relevant information from the web. One of them is the concept of Linked Data, which provides interconnected data sources. But querying these types of data is difficult not only for conventional web users but also for experts in this field. Therefore, a more comfortable way of querying would be of great value. One direction could be to allow the user to use natural language. To make this task easier, we have proposed a method for translating a natural language query into a SPARQL query. It is based on sentence structure, utilizing dependencies between the words in user queries. The dependencies are used to map the query onto the semantic web structure, which is then translated into a SPARQL query. According to our first experiments, we are able to answer a significant group of user queries.
    Source
    Semantic keyword-based search on structured data sources: COST Action IC1302. Second International KEYSTONE Conference, IKC 2016, Cluj-Napoca, Romania, September 8-9, 2016, Revised Selected Papers. Eds.: A. Calì et al.
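    A toy illustration of the dependency-to-SPARQL idea in entry 20; the question, dependency triples, predicate lexicon, and query template are all invented for illustration:

      # Map toy dependency triples from a parsed question to a SPARQL query.
      QUESTION = "Who wrote Hamlet?"
      DEPS = [("wrote", "nsubj", "Who"), ("wrote", "obj", "Hamlet")]
      PREDICATES = {"wrote": "dbo:author"}  # hypothetical lexicon entry

      def to_sparql(deps):
          (verb, _, _), (_, _, obj) = deps
          return f"SELECT ?x WHERE {{ dbr:{obj} {PREDICATES[verb]} ?x . }}"

      print(to_sparql(DEPS))  # SELECT ?x WHERE { dbr:Hamlet dbo:author ?x . }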

Languages

  • e 96
  • d 12
  • el 1

Types

  • a 88
  • el 21
  • x 8
  • m 6
  • s 3