Search (27 results, page 1 of 2)

  • × theme_ss:"Computerlinguistik"
  • × year_i:[2010 TO 2020}
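  Note on the filters above: both chips are Lucene/Solr query syntax, and the closing "}" in year_i:[2010 TO 2020} is not a typo but a half-open range (2010 inclusive, 2020 exclusive). The per-result breakdowns below are standard Solr score-explanation output. A minimal sketch of how such a query might be reproduced, assuming a Solr backend; the core URL and parameter values are illustrative assumptions, not the catalog's actual endpoint:

    import requests

    SOLR_URL = "http://localhost:8983/solr/documents/select"  # hypothetical core

    params = {
        "q": 'theme_ss:"Computerlinguistik"',
        "fq": "year_i:[2010 TO 2020}",  # [ = inclusive lower bound, } = exclusive upper bound
        "rows": 20,                     # 27 hits -> page 1 of 2
        "wt": "json",
        "debugQuery": "true",           # adds per-document score explanations
    }

    data = requests.get(SOLR_URL, params=params).json()
    print(data["response"]["numFound"])  # 27
    print(data["debug"]["explain"])      # nested "product of / sum of" trees like those below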
  1. Rosemblat, G.; Resnick, M.P.; Auston, I.; Shin, D.; Sneiderman, C.; Fiszman, M.; Rindflesch, T.C.: Extending SemRep to the public health domain (2013) 0.03
    0.02591448 = product of:
      0.05182896 = sum of:
        0.05182896 = product of:
          0.07774344 = sum of:
            0.042335 = weight(_text_:i in 2096) [ClassicSimilarity], result of:
              0.042335 = score(doc=2096,freq=2.0), product of:
                0.16931784 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.044891298 = queryNorm
                0.25003272 = fieldWeight in 2096, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2096)
            0.035408445 = weight(_text_:c in 2096) [ClassicSimilarity], result of:
              0.035408445 = score(doc=2096,freq=2.0), product of:
                0.15484828 = queryWeight, product of:
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.044891298 = queryNorm
                0.22866541 = fieldWeight in 2096, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2096)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
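
    The tree above is Lucene ClassicSimilarity (TF-IDF) explain output. Each matching term contributes queryWeight x fieldWeight, where queryWeight = idf x queryNorm, fieldWeight = tf x idf x fieldNorm, tf = sqrt(termFreq), and idf = 1 + ln(maxDocs / (docFreq + 1)); the coord(m/n) factors then scale the sum by the fraction of query parts matched. A short sketch that recomputes this entry's score from the values shown, as a verification aid only (it is not part of the search interface):

      import math

      def idf(doc_freq, max_docs):
          # ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1))
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
          tf = math.sqrt(freq)  # tf = sqrt(termFreq)
          query_weight = idf(doc_freq, max_docs) * query_norm
          field_weight = tf * idf(doc_freq, max_docs) * field_norm
          return query_weight * field_weight

      MAX_DOCS, QUERY_NORM, FIELD_NORM = 44218, 0.044891298, 0.046875
      w_i = term_score(2.0, 2765, MAX_DOCS, QUERY_NORM, FIELD_NORM)  # ~0.042335
      w_c = term_score(2.0, 3817, MAX_DOCS, QUERY_NORM, FIELD_NORM)  # ~0.035408
      score = (w_i + w_c) * (2.0 / 3.0) * 0.5  # coord(2/3), then coord(1/2)
      print(score)  # ~0.02591448; the 0.03 in the result header is this value rounded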
    
  2. Kocijan, K.: Visualizing natural language resources (2015) 0.01
    0.0139097525 = product of:
      0.027819505 = sum of:
        0.027819505 = product of:
          0.08345851 = sum of:
            0.08345851 = weight(_text_:c in 2995) [ClassicSimilarity], result of:
              0.08345851 = score(doc=2995,freq=4.0), product of:
                0.15484828 = queryWeight, product of:
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.044891298 = queryNorm
                0.5389696 = fieldWeight in 2995, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.078125 = fieldNorm(doc=2995)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Source
    Re:inventing information science in the networked society: Proceedings of the 14th International Symposium on Information Science, Zadar/Croatia, 19th-21st May 2015. Eds.: F. Pehar, C. Schloegl and C. Wolff
  3. Manning, C.D.: Part-of-Speech Tagging from 97% to 100% : is it time for some linguistics? (2011) 0.01
    0.0131477695 = product of:
      0.026295539 = sum of:
        0.026295539 = product of:
          0.07888661 = sum of:
            0.07888661 = weight(_text_:i in 1121) [ClassicSimilarity], result of:
              0.07888661 = score(doc=1121,freq=10.0), product of:
                0.16931784 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.044891298 = queryNorm
                0.46590847 = fieldWeight in 1121, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1121)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    I examine what would be necessary to move part-of-speech tagging performance from its current level of about 97.3% token accuracy (56% sentence accuracy) to close to 100% accuracy. I suggest that it must still be possible to greatly increase tagging performance and examine some useful improvements that have recently been made to the Stanford Part-of-Speech Tagger. However, an error analysis of some of the remaining errors suggests that there is limited further mileage to be had either from better machine learning or better features in a discriminative sequence classifier. The prospects for further gains from semisupervised learning also seem quite limited. Rather, I suggest and begin to demonstrate that the largest opportunity for further progress comes from improving the taxonomic basis of the linguistic resources from which taggers are trained. That is, from improved descriptive linguistics. However, I conclude by suggesting that there are also limits to this process. The status of some words may not be able to be adequately captured by assigning them to one of a small number of categories. While conventions can be used in such cases to improve tagging consistency, they lack a strong linguistic basis.
    Source
    Computational Linguistics and Intelligent Text Processing, 12th International Conference, CICLing 2011, Proceedings, Part I. Ed.: Alexander Gelbukh
  4. Lu, C.; Bu, Y.; Wang, J.; Ding, Y.; Torvik, V.; Schnaars, M.; Zhang, C.: Examining scientific writing styles from the perspective of linguistic complexity : a cross-level moderation model (2019) 0.01
    0.008345851 = product of:
      0.016691701 = sum of:
        0.016691701 = product of:
          0.050075103 = sum of:
            0.050075103 = weight(_text_:c in 5219) [ClassicSimilarity], result of:
              0.050075103 = score(doc=5219,freq=4.0), product of:
                0.15484828 = queryWeight, product of:
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.044891298 = queryNorm
                0.32338172 = fieldWeight in 5219, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5219)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  5. Sünkler, S.; Kerkmann, F.; Schultheiß, S.: Ok Google ... the end of search as we know it : sprachgesteuerte Websuche im Test (2018) 0.01
    0.008231806 = product of:
      0.016463611 = sum of:
        0.016463611 = product of:
          0.049390834 = sum of:
            0.049390834 = weight(_text_:i in 5626) [ClassicSimilarity], result of:
              0.049390834 = score(doc=5626,freq=2.0), product of:
                0.16931784 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.044891298 = queryNorm
                0.29170483 = fieldWeight in 5626, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5626)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Content
    Cf.: https://www.b-i-t-online.de/heft/2018-01-index.php.
  6. Lezius, W.: Morphy - Morphologie und Tagging für das Deutsche (2013) 0.01
    0.008109535 = product of:
      0.01621907 = sum of:
        0.01621907 = product of:
          0.04865721 = sum of:
            0.04865721 = weight(_text_:22 in 1490) [ClassicSimilarity], result of:
              0.04865721 = score(doc=1490,freq=2.0), product of:
                0.15720168 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044891298 = queryNorm
                0.30952093 = fieldWeight in 1490, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1490)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Date
    22. 3.2015 9:30:24
  7. Becks, D.; Schulz, J.M.: Domänenübergreifende Phrasenextraktion mithilfe einer lexikonunabhängigen Analysekomponente (2010) 0.01
    0.007868543 = product of:
      0.015737087 = sum of:
        0.015737087 = product of:
          0.04721126 = sum of:
            0.04721126 = weight(_text_:c in 4661) [ClassicSimilarity], result of:
              0.04721126 = score(doc=4661,freq=2.0), product of:
                0.15484828 = queryWeight, product of:
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.044891298 = queryNorm
                0.3048872 = fieldWeight in 4661, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4661)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Source
    Information und Wissen: global, sozial und frei? Proceedings of the 12th International Symposium for Information Science (ISI 2011), Hildesheim, 9-11 March 2011. Eds.: J. Griesbaum, T. Mandl and C. Womser-Hacker
  8. Smalheiser, N.R.: Literature-based discovery : beyond the ABCs (2012) 0.01
    0.0070558335 = product of:
      0.014111667 = sum of:
        0.014111667 = product of:
          0.042335 = sum of:
            0.042335 = weight(_text_:i in 4967) [ClassicSimilarity], result of:
              0.042335 = score(doc=4967,freq=2.0), product of:
                0.16931784 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.044891298 = queryNorm
                0.25003272 = fieldWeight in 4967, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4967)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    Literature-based discovery (LBD) refers to a particular type of text mining that seeks to identify nontrivial assertions that are implicit, and not explicitly stated, and that are detected by juxtaposing (generally a large body of) documents. In this review, I will provide a brief overview of LBD, both past and present, and will propose some new directions for the next decade. The prevalent ABC model is not "wrong"; however, it is only one of several different types of models that can contribute to the development of the next generation of LBD tools. Perhaps the most urgent need is to develop a series of objective literature-based interestingness measures, which can customize the output of LBD systems for different types of scientific investigations.
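
    The "ABC model" referenced in this abstract is Swanson's classic LBD pattern: if the literature links A to B and B to C, but A and C never co-occur directly, then A-C is a candidate hidden association. A minimal sketch over a toy co-occurrence set; the term pairs are invented for illustration (they echo Swanson's fish-oil/Raynaud's example), not real corpus data:

      from collections import defaultdict

      # Toy co-occurrence pairs (illustrative only, not extracted from real literature)
      cooccurrences = [
          ("fish oil", "blood viscosity"),
          ("blood viscosity", "Raynaud's disease"),
          ("fish oil", "platelet aggregation"),
          ("platelet aggregation", "Raynaud's disease"),
      ]

      neighbors = defaultdict(set)
      for a, b in cooccurrences:
          neighbors[a].add(b)
          neighbors[b].add(a)

      def abc_candidates(a_term):
          """C terms reachable from A via some intermediate B, where A and C never co-occur."""
          direct = neighbors[a_term]
          candidates = set()
          for b in direct:
              candidates |= neighbors[b] - direct - {a_term}
          return candidates

      print(abc_candidates("fish oil"))  # {"Raynaud's disease"}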
  9. Lu, K.; Cai, X.; Ajiferuke, I.; Wolfram, D.: Vocabulary size and its effect on topic representation (2017) 0.01
    0.0070558335 = product of:
      0.014111667 = sum of:
        0.014111667 = product of:
          0.042335 = sum of:
            0.042335 = weight(_text_:i in 3414) [ClassicSimilarity], result of:
              0.042335 = score(doc=3414,freq=2.0), product of:
                0.16931784 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.044891298 = queryNorm
                0.25003272 = fieldWeight in 3414, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3414)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  10. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I.: Attention is all you need (2017) 0.01
    0.0070558335 = product of:
      0.014111667 = sum of:
        0.014111667 = product of:
          0.042335 = sum of:
            0.042335 = weight(_text_:i in 970) [ClassicSimilarity], result of:
              0.042335 = score(doc=970,freq=2.0), product of:
                0.16931784 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.044891298 = queryNorm
                0.25003272 = fieldWeight in 970, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.046875 = fieldNorm(doc=970)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  11. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.01
    0.006082151 = product of:
      0.012164302 = sum of:
        0.012164302 = product of:
          0.036492907 = sum of:
            0.036492907 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
              0.036492907 = score(doc=563,freq=2.0), product of:
                0.15720168 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044891298 = queryNorm
                0.23214069 = fieldWeight in 563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=563)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Date
    10. 1.2013 19:22:47
  12. Lawrie, D.; Mayfield, J.; McNamee, P.; Oard, D.W.: Cross-language person-entity linking from 20 languages (2015) 0.01
    0.006082151 = product of:
      0.012164302 = sum of:
        0.012164302 = product of:
          0.036492907 = sum of:
            0.036492907 = weight(_text_:22 in 1848) [ClassicSimilarity], result of:
              0.036492907 = score(doc=1848,freq=2.0), product of:
                0.15720168 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044891298 = queryNorm
                0.23214069 = fieldWeight in 1848, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1848)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    The goal of entity linking is to associate references to an entity that is found in unstructured natural language content to an authoritative inventory of known entities. This article describes the construction of 6 test collections for cross-language person-entity linking that together span 22 languages. Fully automated components were used together with 2 crowdsourced validation stages to affordably generate ground-truth annotations with an accuracy comparable to that of a completely manual process. The resulting test collections each contain between 642 (Arabic) and 2,361 (Romanian) person references in non-English texts for which the correct resolution in English Wikipedia is known, plus a similar number of references for which no correct resolution into English Wikipedia is believed to exist. Fully automated cross-language person-name linking experiments with 20 non-English languages yielded a resolution accuracy of between 0.84 (Serbian) and 0.98 (Romanian), which compares favorably with previously reported cross-language entity linking results for Spanish.
  13. Vasalou, A.; Gill, A.J.; Mazanderani, F.; Papoutsi, C.; Joinson, A.: Privacy dictionary : a new resource for the automated content analysis of privacy (2011) 0.01
    0.0059014075 = product of:
      0.011802815 = sum of:
        0.011802815 = product of:
          0.035408445 = sum of:
            0.035408445 = weight(_text_:c in 4915) [ClassicSimilarity], result of:
              0.035408445 = score(doc=4915,freq=2.0), product of:
                0.15484828 = queryWeight, product of:
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.044891298 = queryNorm
                0.22866541 = fieldWeight in 4915, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4915)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  14. Ramisch, C.; Villavicencio, A.; Kordoni, V.: Introduction to the special issue on multiword expressions : from theory to practice and use (2013) 0.01
    0.0059014075 = product of:
      0.011802815 = sum of:
        0.011802815 = product of:
          0.035408445 = sum of:
            0.035408445 = weight(_text_:c in 1124) [ClassicSimilarity], result of:
              0.035408445 = score(doc=1124,freq=2.0), product of:
                0.15484828 = queryWeight, product of:
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.044891298 = queryNorm
                0.22866541 = fieldWeight in 1124, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1124)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  15. Anguiano Peña, G.; Naumis Peña, C.: Method for selecting specialized terms from a general language corpus (2015) 0.01
    0.0059014075 = product of:
      0.011802815 = sum of:
        0.011802815 = product of:
          0.035408445 = sum of:
            0.035408445 = weight(_text_:c in 2196) [ClassicSimilarity], result of:
              0.035408445 = score(doc=2196,freq=2.0), product of:
                0.15484828 = queryWeight, product of:
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.044891298 = queryNorm
                0.22866541 = fieldWeight in 2196, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2196)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  16. Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.01
    0.0058798613 = product of:
      0.011759723 = sum of:
        0.011759723 = product of:
          0.035279166 = sum of:
            0.035279166 = weight(_text_:i in 1338) [ClassicSimilarity], result of:
              0.035279166 = score(doc=1338,freq=2.0), product of:
                0.16931784 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.044891298 = queryNorm
                0.20836058 = fieldWeight in 1338, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1338)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  17. Muneer, I.; Sharjeel, M.; Iqbal, M.; Adeel Nawab, R.M.; Rayson, P.: CLEU - a cross-language English-Urdu corpus and benchmark for text reuse experiments (2019) 0.01
    0.0058798613 = product of:
      0.011759723 = sum of:
        0.011759723 = product of:
          0.035279166 = sum of:
            0.035279166 = weight(_text_:i in 5299) [ClassicSimilarity], result of:
              0.035279166 = score(doc=5299,freq=2.0), product of:
                0.16931784 = queryWeight, product of:
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.044891298 = queryNorm
                0.20836058 = fieldWeight in 5299, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7717297 = idf(docFreq=2765, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5299)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  18. Fóris, A.: Network theory and terminology (2013) 0.01
    0.0050684595 = product of:
      0.010136919 = sum of:
        0.010136919 = product of:
          0.030410757 = sum of:
            0.030410757 = weight(_text_:22 in 1365) [ClassicSimilarity], result of:
              0.030410757 = score(doc=1365,freq=2.0), product of:
                0.15720168 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044891298 = queryNorm
                0.19345059 = fieldWeight in 1365, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1365)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Date
    2. 9.2014 21:22:48
  19. Malo, P.; Sinha, A.; Korhonen, P.; Wallenius, J.; Takala, P.: Good debt or bad debt : detecting semantic orientations in economic texts (2014) 0.00
    0.0049178395 = product of:
      0.009835679 = sum of:
        0.009835679 = product of:
          0.029507035 = sum of:
            0.029507035 = weight(_text_:c in 1226) [ClassicSimilarity], result of:
              0.029507035 = score(doc=1226,freq=2.0), product of:
                0.15484828 = queryWeight, product of:
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.044891298 = queryNorm
                0.1905545 = fieldWeight in 1226, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1226)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
    The use of robo-readers to analyze news texts is an emerging technology trend in computational finance. Recent research has developed sophisticated financial polarity lexicons for investigating how financial sentiments relate to future company performance. However, based on experience from fields that commonly analyze sentiment, it is well known that the overall semantic orientation of a sentence may differ from that of individual words. This article investigates how semantic orientations can be better detected in financial and economic news by accommodating the overall phrase-structure information and domain-specific use of language. Our three main contributions are the following: (a) a human-annotated finance phrase bank that can be used for training and evaluating alternative models; (b) a technique to enhance financial lexicons with attributes that help to identify expected direction of events that affect sentiment; and (c) a linearized phrase-structure model for detecting contextual semantic orientations in economic texts. The relevance of the newly added lexicon features and the benefit of using the proposed learning algorithm are demonstrated in a comparative study against general sentiment models as well as the popular word frequency models used in recent financial studies. The proposed framework is parsimonious and avoids the explosion in feature space caused by the use of conventional n-gram features.
  20. Lian, T.; Yu, C.; Wang, W.; Yuan, Q.; Hou, Z.: Doctoral dissertations on tourism in China : a co-word analysis (2016) 0.00
    0.0049178395 = product of:
      0.009835679 = sum of:
        0.009835679 = product of:
          0.029507035 = sum of:
            0.029507035 = weight(_text_:c in 3178) [ClassicSimilarity], result of:
              0.029507035 = score(doc=3178,freq=2.0), product of:
                0.15484828 = queryWeight, product of:
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.044891298 = queryNorm
                0.1905545 = fieldWeight in 3178, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.4494052 = idf(docFreq=3817, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3178)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)