Search (13 results, page 1 of 1)

  • × theme_ss:"Computerlinguistik"
  • × year_i:[2010 TO 2020}
  1. Biselli, A.: Unter Generalverdacht durch Algorithmen (2014) 0.05
    0.046397857 = product of:
      0.092795715 = sum of:
        0.092795715 = product of:
          0.18559143 = sum of:
            0.18559143 = weight(_text_:news in 809) [ClassicSimilarity], result of:
              0.18559143 = score(doc=809,freq=2.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.6949563 = fieldWeight in 809, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.09375 = fieldNorm(doc=809)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    http://www.golem.de/news/textanalyse-unter-generalverdacht-durch-algorithmen-1402-104637.html
  2. Panicheva, P.; Cardiff, J.; Rosso, P.: Identifying subjective statements in news titles using a personal sense annotation framework (2013) 0.03
    0.03280824 = product of:
      0.06561648 = sum of:
        0.06561648 = product of:
          0.13123296 = sum of:
            0.13123296 = weight(_text_:news in 968) [ClassicSimilarity], result of:
              0.13123296 = score(doc=968,freq=4.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.49140832 = fieldWeight in 968, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.046875 = fieldNorm(doc=968)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Subjective language contains information about private states. The goal of subjective language identification is to determine that a private state is expressed, without considering its polarity or specific emotion. A component of word meaning, "Personal Sense," has clear potential in the field of subjective language identification, as it reflects a meaning of words in terms of unique personal experience and carries personal characteristics. In this paper we investigate how Personal Sense can be harnessed for the purpose of identifying subjectivity in news titles. In the process, we develop a new Personal Sense annotation framework for annotating and classifying subjectivity, polarity, and emotion. The Personal Sense framework yields high performance in a fine-grained subsentence subjectivity classification. Our experiments demonstrate lexico-syntactic features to be useful for the identification of subjectivity indicators and the targets that receive the subjective Personal Sense.
  3. AL-Smadi, M.; Jaradat, Z.; AL-Ayyoub, M.; Jararweh, Y.: Paraphrase identification and semantic text similarity analysis in Arabic news tweets using lexical, syntactic, and semantic features (2017) 0.03
    0.03280824 = product of:
      0.06561648 = sum of:
        0.06561648 = product of:
          0.13123296 = sum of:
            0.13123296 = weight(_text_:news in 5095) [ClassicSimilarity], result of:
              0.13123296 = score(doc=5095,freq=4.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.49140832 = fieldWeight in 5095, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5095)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The rapid growth in digital information has raised considerable challenges in particular when it comes to automated content analysis. Social media such as twitter share a lot of its users' information about their events, opinions, personalities, etc. Paraphrase Identification (PI) is concerned with recognizing whether two texts have the same/similar meaning, whereas the Semantic Text Similarity (STS) is concerned with the degree of that similarity. This research proposes a state-of-the-art approach for paraphrase identification and semantic text similarity analysis in Arabic news tweets. The approach adopts several phases of text processing, features extraction and text classification. Lexical, syntactic, and semantic features are extracted to overcome the weakness and limitations of the current technologies in solving these tasks for the Arabic language. Maximum Entropy (MaxEnt) and Support Vector Regression (SVR) classifiers are trained using these features and are evaluated using a dataset prepared for this research. The experimentation results show that the approach achieves good results in comparison to the baseline results.
  4. Malo, P.; Sinha, A.; Korhonen, P.; Wallenius, J.; Takala, P.: Good debt or bad debt : detecting semantic orientations in economic texts (2014) 0.03
    0.0273402 = product of:
      0.0546804 = sum of:
        0.0546804 = product of:
          0.1093608 = sum of:
            0.1093608 = weight(_text_:news in 1226) [ClassicSimilarity], result of:
              0.1093608 = score(doc=1226,freq=4.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.40950692 = fieldWeight in 1226, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1226)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The use of robo-readers to analyze news texts is an emerging technology trend in computational finance. Recent research has developed sophisticated financial polarity lexicons for investigating how financial sentiments relate to future company performance. However, based on experience from fields that commonly analyze sentiment, it is well known that the overall semantic orientation of a sentence may differ from that of individual words. This article investigates how semantic orientations can be better detected in financial and economic news by accommodating the overall phrase-structure information and domain-specific use of language. Our three main contributions are the following: (a) a human-annotated finance phrase bank that can be used for training and evaluating alternative models; (b) a technique to enhance financial lexicons with attributes that help to identify expected direction of events that affect sentiment; and (c) a linearized phrase-structure model for detecting contextual semantic orientations in economic texts. The relevance of the newly added lexicon features and the benefit of using the proposed learning algorithm are demonstrated in a comparative study against general sentiment models as well as the popular word frequency models used in recent financial studies. The proposed framework is parsimonious and avoids the explosion in feature space caused by the use of conventional n-gram features.
  5. Holland, M.: Erstes wissenschaftliches Buch eines Algorithmus' veröffentlicht (2019) 0.03
    0.027065417 = product of:
      0.054130834 = sum of:
        0.054130834 = product of:
          0.10826167 = sum of:
            0.10826167 = weight(_text_:news in 5227) [ClassicSimilarity], result of:
              0.10826167 = score(doc=5227,freq=2.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.40539116 = fieldWeight in 5227, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5227)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Series
    Heise Online: News
  6. Multi-source, multilingual information extraction and summarization (2013) 0.02
    0.01933244 = product of:
      0.03866488 = sum of:
        0.03866488 = product of:
          0.07732976 = sum of:
            0.07732976 = weight(_text_:news in 978) [ClassicSimilarity], result of:
              0.07732976 = score(doc=978,freq=2.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.28956512 = fieldWeight in 978, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=978)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Information extraction (IE) and text summarization (TS) are powerful technologies for finding relevant pieces of information in text and presenting them to the user in condensed form. The ongoing information explosion makes IE and TS critical for successful functioning within the information society. These technologies face particular challenges due to the inherent multi-source nature of the information explosion. The technologies must now handle not isolated texts or individual narratives, but rather large-scale repositories and streams---in general, in multiple languages---containing a multiplicity of perspectives, opinions, or commentaries on particular topics, entities or events. There is thus a need to adapt existing techniques and develop new ones to deal with these challenges. This volume contains a selection of papers that present a variety of methodologies for content identification and extraction, as well as for content fusion and regeneration. The chapters cover various aspects of the challenges, depending on the nature of the information sought---names vs. events,--- and the nature of the sources---news streams vs. image captions vs. scientific research papers, etc. This volume aims to offer a broad and representative sample of studies from this very active research field.
  7. Sankarasubramaniam, Y.; Ramanathan, K.; Ghosh, S.: Text summarization using Wikipedia (2014) 0.02
    0.01933244 = product of:
      0.03866488 = sum of:
        0.03866488 = product of:
          0.07732976 = sum of:
            0.07732976 = weight(_text_:news in 2693) [ClassicSimilarity], result of:
              0.07732976 = score(doc=2693,freq=2.0), product of:
                0.26705483 = queryWeight, product of:
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.05094824 = queryNorm
                0.28956512 = fieldWeight in 2693, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2416887 = idf(docFreq=635, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2693)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Automatic text summarization has been an active field of research for many years. Several approaches have been proposed, ranging from simple position and word-frequency methods, to learning and graph based algorithms. The advent of human-generated knowledge bases like Wikipedia offer a further possibility in text summarization - they can be used to understand the input text in terms of salient concepts from the knowledge base. In this paper, we study a novel approach that leverages Wikipedia in conjunction with graph-based ranking. Our approach is to first construct a bipartite sentence-concept graph, and then rank the input sentences using iterative updates on this graph. We consider several models for the bipartite graph, and derive convergence properties under each model. Then, we take up personalized and query-focused summarization, where the sentence ranks additionally depend on user interests and queries, respectively. Finally, we present a Wikipedia-based multi-document summarization algorithm. An important feature of the proposed algorithms is that they enable real-time incremental summarization - users can first view an initial summary, and then request additional content if interested. We evaluate the performance of our proposed summarizer using the ROUGE metric, and the results show that leveraging Wikipedia can significantly improve summary quality. We also present results from a user study, which suggests that using incremental summarization can help in better understanding news articles.
  8. Lezius, W.: Morphy - Morphologie und Tagging für das Deutsche (2013) 0.01
    0.013805566 = product of:
      0.027611133 = sum of:
        0.027611133 = product of:
          0.055222265 = sum of:
            0.055222265 = weight(_text_:22 in 1490) [ClassicSimilarity], result of:
              0.055222265 = score(doc=1490,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.30952093 = fieldWeight in 1490, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1490)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2015 9:30:24
  9. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.01
    0.010354174 = product of:
      0.020708349 = sum of:
        0.020708349 = product of:
          0.041416697 = sum of:
            0.041416697 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
              0.041416697 = score(doc=563,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.23214069 = fieldWeight in 563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=563)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    10. 1.2013 19:22:47
  10. Lawrie, D.; Mayfield, J.; McNamee, P.; Oard, P.W.: Cross-language person-entity linking from 20 languages (2015) 0.01
    0.010354174 = product of:
      0.020708349 = sum of:
        0.020708349 = product of:
          0.041416697 = sum of:
            0.041416697 = weight(_text_:22 in 1848) [ClassicSimilarity], result of:
              0.041416697 = score(doc=1848,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.23214069 = fieldWeight in 1848, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1848)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The goal of entity linking is to associate references to an entity that is found in unstructured natural language content to an authoritative inventory of known entities. This article describes the construction of 6 test collections for cross-language person-entity linking that together span 22 languages. Fully automated components were used together with 2 crowdsourced validation stages to affordably generate ground-truth annotations with an accuracy comparable to that of a completely manual process. The resulting test collections each contain between 642 (Arabic) and 2,361 (Romanian) person references in non-English texts for which the correct resolution in English Wikipedia is known, plus a similar number of references for which no correct resolution into English Wikipedia is believed to exist. Fully automated cross-language person-name linking experiments with 20 non-English languages yielded a resolution accuracy of between 0.84 (Serbian) and 0.98 (Romanian), which compares favorably with previously reported cross-language entity linking results for Spanish.
  11. Fóris, A.: Network theory and terminology (2013) 0.01
    0.008628479 = product of:
      0.017256958 = sum of:
        0.017256958 = product of:
          0.034513917 = sum of:
            0.034513917 = weight(_text_:22 in 1365) [ClassicSimilarity], result of:
              0.034513917 = score(doc=1365,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.19345059 = fieldWeight in 1365, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1365)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    2. 9.2014 21:22:48
  12. Rötzer, F.: KI-Programm besser als Menschen im Verständnis natürlicher Sprache (2018) 0.01
    0.006902783 = product of:
      0.013805566 = sum of:
        0.013805566 = product of:
          0.027611133 = sum of:
            0.027611133 = weight(_text_:22 in 4217) [ClassicSimilarity], result of:
              0.027611133 = score(doc=4217,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.15476047 = fieldWeight in 4217, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4217)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 1.2018 11:32:44
  13. Deventer, J.P. van; Kruger, C.J.; Johnson, R.D.: Delineating knowledge management through lexical analysis : a retrospective (2015) 0.01
    0.006039935 = product of:
      0.01207987 = sum of:
        0.01207987 = product of:
          0.02415974 = sum of:
            0.02415974 = weight(_text_:22 in 3807) [ClassicSimilarity], result of:
              0.02415974 = score(doc=3807,freq=2.0), product of:
                0.17841205 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05094824 = queryNorm
                0.1354154 = fieldWeight in 3807, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=3807)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    20. 1.2015 18:30:22