Search (205 results, page 1 of 11)

  • Filter: theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.25
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
  2. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.16
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language and domain independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human written summaries in a large collection of web-pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the aligning process from a training set and focuses on selecting high quality multi-word terms from human written summaries to generate suitable results for web-page summarization.
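    The thesis's own association measures are not reproduced here; purely as a hedged sketch of the selection rule that LocalMaxs-style extraction applies, the Python snippet below keeps an n-gram as a multi-word term candidate when its "glue" score is at least as high as that of its contained (n-1)-grams and strictly higher than that of any observed (n+1)-gram containing it. The glue function and the toy counts are illustrative placeholders, not the measures proposed in the thesis.
      from typing import Dict, List, Tuple

      Ngram = Tuple[str, ...]

      def glue(ng: Ngram, freq: Dict[Ngram, int]) -> float:
          """Placeholder association ("glue") measure: squared n-gram frequency
          over the average product of the frequencies of its two-part splits."""
          f = freq.get(ng, 0)
          if len(ng) < 2 or f == 0:
              return 0.0
          splits = [(ng[:i], ng[i:]) for i in range(1, len(ng))]
          denom = sum(freq.get(a, 1) * freq.get(b, 1) for a, b in splits) / len(splits)
          return f * f / denom

      def local_max_terms(freq: Dict[Ngram, int]) -> List[Ngram]:
          """Simplified LocalMaxs-style selection of multi-word term candidates."""
          terms = []
          for ng in freq:
              n = len(ng)
              if n < 2:
                  continue
              g = glue(ng, freq)
              subs = [ng[:-1], ng[1:]] if n > 2 else []          # contained (n-1)-grams
              supers = [s for s in freq if len(s) == n + 1
                        and (s[:n] == ng or s[1:] == ng)]        # observed (n+1)-grams
              if all(g >= glue(s, freq) for s in subs) and \
                 all(g > glue(s, freq) for s in supers):
                  terms.append(ng)
          return terms

      counts = {("information",): 40, ("retrieval",): 25, ("system",): 30,
                ("information", "retrieval"): 18,
                ("information", "retrieval", "system"): 3}
      print(local_max_terms(counts))  # -> [('information', 'retrieval')]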
    Content
    A thesis presented to The University of Guelph in partial fulfilment of the requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
  3. Noever, D.; Ciolino, M.: The Turing deception (2022) 0.12
    Source
    https://arxiv.org/abs/2212.06721
  4. Dreehsen, B.: Der PC als Dolmetscher (1998) 0.04
    Abstract
    For English web pages and foreign-language correspondence, translation software that transfers the text into German (and vice versa) at the click of a mouse is helpful. The new versions already render the gist of the content quite well. CHIP tested the performance of five programs.
  5. Heyer, G.; Quasthoff, U.; Wittig, T.: Text Mining : Wissensrohstoff Text. Konzepte, Algorithmen, Ergebnisse (2006) 0.04
    Abstract
    A large part of the world's knowledge exists in the form of digital texts on the Internet or in intranets. Today's search engines use this raw material of knowledge only rudimentarily: they can recognize semantic relationships only to a limited extent. Everyone is waiting for the Semantic Web, in which the creators of texts insert the semantics themselves, but that will take a long time yet. There is, however, a technology that already makes it possible to analyze and organize semantic relationships in raw text: the research field of "text mining" uses statistical and pattern-based methods to extract, process, and exploit knowledge from texts, and thereby lays the groundwork for the search engines of the future. This is the first German textbook on this groundbreaking technology. What comes to mind when you hear the word "Stich"? Some think of tennis, others of the card game Skat. Such different contexts can be determined automatically by text mining and represented as word networks. Which terms most frequently appear to the left and right of the word "Festplatte"? Which word forms and proper names have newly entered the German language since 2001? Text mining answers these and many other questions. The textbook invites the reader into this new discipline and shows how the raw material text is turned into knowledge. It is aimed both at students and at practitioners with a background in computer science, business informatics, and/or linguistics who want to learn about the foundations, methods, and applications of text mining and who are looking for ideas for implementing their own applications. It is based on work carried out in recent years in the Automatic Language Processing group at the Institute of Computer Science of the University of Leipzig under the direction of Prof. Dr. Heyer. A wealth of practical examples of text mining concepts and algorithms gives the reader a comprehensive yet detailed understanding of the foundations and applications of text mining. Topics covered: knowledge and text; foundations of meaning analysis; text databases; language statistics; clustering; pattern analysis; hybrid methods; example applications; appendices on statistics and linguistic foundations. 360 pages, 54 figures, 58 tables, and 95 glossary entries, with a free e-learning course "Schnelleinstieg: Sprachstatistik"; an online certificate course with mentor and tutor support is announced to follow shortly.
    Date
    19. 7.2006 20:28:27
  6. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.04
    Abstract
    The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language Systems (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
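    AZdict and ChemSpell themselves are not reproduced here; the snippet below is only a minimal, hedged illustration of the general idea of ranking candidate corrections from a controlled vocabulary by surface similarity, using Python's standard difflib rather than the dictionary technology described in the paper. The vocabulary is an invented placeholder.
      import difflib

      # Illustrative vocabulary; the real system draws on SIS databases and the UMLS.
      vocabulary = ["acetaminophen", "toluene", "benzene", "toxicology", "dioxin"]

      def suggest(query: str, n: int = 3, cutoff: float = 0.6) -> list:
          """Return up to n vocabulary words whose similarity to the query
          exceeds the cutoff, best matches first."""
          return difflib.get_close_matches(query.lower(), vocabulary, n=n, cutoff=cutoff)

      print(suggest("acetominophen"))  # -> ['acetaminophen']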
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
  7. Schneider, R.: Web 3.0 ante portas? : Integration von Social Web und Semantic Web (2008) 0.03
    Abstract
    The Internet as a medium is changing, and with it its conditions of publication and reception. What opportunities do the two visions of the future currently being discussed in parallel, the Social Web and the Semantic Web, offer? To answer this question, the article examines the foundations of both models with respect to application and technology, but also highlights their shortcomings as well as the added value of a combination appropriate to the medium. Using the grammatical online information system grammis as an example, a strategy for integratively exploiting the strengths of each is sketched.
    Date
    22. 1.2011 10:38:28
    Source
    Kommunikation, Partizipation und Wirkungen im Social Web, Band 1. Hrsg.: A. Zerfaß u.a
    Theme
    Semantic Web
  8. Rötzer, F.: Computer ergooglen die Bedeutung von Worten (2005) 0.03
    Content
    "Wie könnten Computer Sprache lernen und dabei auch die Bedeutung von Worten sowie die Beziehungen zwischen ihnen verstehen? Dieses Problem der Semantik stellt eine gewaltige, bislang nur ansatzweise bewältigte Aufgabe dar, da Worte und Wortverbindungen oft mehrere oder auch viele Bedeutungen haben, die zudem vom außersprachlichen Kontext abhängen. Die beiden holländischen (Ein künstliches Bewusstsein aus einfachen Aussagen (1)). Paul Vitanyi (2) und Rudi Cilibrasi vom Nationalen Institut für Mathematik und Informatik (3) in Amsterdam schlagen eine elegante Lösung vor: zum Nachschlagen im Internet, der größten Datenbank, die es gibt, wird einfach Google benutzt. Objekte wie eine Maus können mit ihren Namen "Maus" benannt werden, die Bedeutung allgemeiner Begriffe muss aus ihrem Kontext gelernt werden. Ein semantisches Web zur Repräsentation von Wissen besteht aus den möglichen Verbindungen, die Objekte und ihre Namen eingehen können. Natürlich können in der Wirklichkeit neue Namen, aber auch neue Bedeutungen und damit neue Verknüpfungen geschaffen werden. Sprache ist lebendig und flexibel. Um einer Künstlichen Intelligenz alle Wortbedeutungen beizubringen, müsste mit der Hilfe von menschlichen Experten oder auch vielen Mitarbeitern eine riesige Datenbank mit den möglichen semantischen Netzen aufgebaut und dazu noch ständig aktualisiert werden. Das aber müsste gar nicht notwendig sein, denn mit dem Web gibt es nicht nur die größte und weitgehend kostenlos benutzbare semantische Datenbank, sie wird auch ständig von zahllosen Internetnutzern aktualisiert. Zudem gibt es Suchmaschinen wie Google, die Verbindungen zwischen Worten und damit deren Bedeutungskontext in der Praxis in ihrer Wahrscheinlichkeit quantitativ mit der Angabe der Webseiten, auf denen sie gefunden wurden, messen.
    With a method previously developed by Paul Vitanyi and others that measures how strongly objects are related (normalized information distance, NID), the closeness between particular objects (images, words, patterns, intervals, genomes, programs, etc.) can be analyzed across all their properties and determined on the basis of the dominant shared property. In a similar way, the commonly used, not necessarily "true", meanings of names can be uncovered with Google search. 'At this moment one database stands out as the pinnacle of computer-accessible human knowledge and the most inclusive summary of statistical information: the Google search engine. There can be no doubt that Google has already enabled science to accelerate tremendously and revolutionized the research process. It has dominated the attention of internet users for years, and has recently attracted substantial attention of many Wall Street investors, even reshaping their ideas of company financing.' (Paul Vitanyi and Rudi Cilibrasi) If you enter a word such as "Pferd" (horse), Google returns 4,310,000 indexed pages; for "Reiter" (rider) it is 3,400,000 pages. Combining both terms still yields 315,000 pages. For the co-occurrence of, say, "Pferd" and "Bart" (beard), an astonishing 67,100 pages are still listed, but one can already see that "Pferd" and "Reiter" are more closely connected. This yields a certain probability for the co-occurrence of terms. From this frequency, taken relative to the maximum number (5,000,000,000) of indexed pages, the two researchers derived a statistical quantity they call the "normalised Google distance" (NGD), which normally lies between 0 and 1. The smaller the NGD, the more closely two terms are related. "This is automatic meaning generation," Vitanyi told New Scientist (4). "It could well be a way to let a computer understand things and act semi-intelligently." If such searches are carried out again and again, a map of the connections between words can be built up, and from this map a computer may in turn, so the hope goes, grasp the meaning of individual words in different natural languages and contexts. A few such searches reportedly already allowed a computer to distinguish colors from numbers, to tell seventeenth-century Dutch painters apart, to separate emergencies from near-emergencies, and to understand electrical or religious terms. Moreover, a simple automatic English-Spanish translation could be accomplished. In this way, the researchers hope, the meanings of words could be learned, speech recognition could be improved, a semantic web could be built, and, finally, better automatic translation from one language to another could at last be achieved.
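    The article gives the page counts but not the formula itself. Purely as a hedged illustration, the Python snippet below applies the normalised Google distance as published by Cilibrasi and Vitányi to the counts quoted above; it is a reconstruction for the reader, not code from the article.
      import math

      def ngd(fx: float, fy: float, fxy: float, n: float) -> float:
          """Normalised Google distance from page-hit counts: fx and fy are the
          counts for each term alone, fxy the count for both together, and n
          the total number of indexed pages."""
          lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
          return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))

      # Counts quoted above: "Pferd", "Reiter", both together, and the index size.
      print(round(ngd(4_310_000, 3_400_000, 315_000, 5_000_000_000), 2))  # 0.36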
  9. Ruge, G.: A spreading activation network for automatic generation of thesaurus relationships (1991) 0.01
    Date
    8.10.2000 11:52:22
    Source
    Library science with a slant to documentation. 28(1991) no.4, S.125-130
  10. Baayen, R.H.; Lieber, H.: Word frequency distributions and lexical semantics (1997) 0.01
    Date
    28. 2.1999 10:48:22
  11. Clark, M.; Kim, Y.; Kruschwitz, U.; Song, D.; Albakour, D.; Dignum, S.; Beresi, U.C.; Fasli, M.; Roeck, A De: Automatically structuring domain knowledge from text : an overview of current research (2012) 0.01
    Abstract
    This paper presents an overview of automatic methods for building domain knowledge structures (domain models) from text collections. Applications of domain models have a long history within knowledge engineering and artificial intelligence. In the last couple of decades they have surfaced noticeably as a useful tool within natural language processing, information retrieval and semantic web technology. Inspired by the ubiquitous propagation of domain model structures that are emerging in several research disciplines, we give an overview of the current research landscape and some techniques and approaches. We will also discuss trade-offs between different approaches and point to some recent trends.
    Content
    Contribution to a special issue "Soft Approaches to IA on the Web". Cf.: doi:10.1016/j.ipm.2011.07.002.
    Date
    29. 1.2016 18:29:51
  12. Mustafa El Hadi, W.: Evaluating human language technology : general applications to information access and management (2002) 0.01
    Date
    6. 1.1997 18:30:28
    Source
    Knowledge organization. 29(2002) nos.3/4, S.124-134
  13. Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.01
    Abstract
    The language barrier is the major problem that people face in searching for, retrieving, and understanding multilingual collections on the Internet. This paper deals with query translation and document translation in a Chinese-English information retrieval system called MTIR. Bilingual dictionary and monolingual corpus-based approaches are adopted to select suitable translated query terms. A machine transliteration algorithm is introduced to resolve proper name searching. We consider several design issues for document translation, including which material is translated, what roles the HTML tags play in translation, what the tradeoff is between speed performance and translation performance, and in what form the translated result is presented. About 100,000 Web pages translated in the last 4 months of 1997 are used for a quantitative study of online and real-time Web page translation.
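    As a hedged illustration of the dictionary-plus-corpus idea described above (not the MTIR implementation), the sketch below translates each query term with a bilingual dictionary and, where several candidates exist, keeps the one that is most frequent in a target-language corpus. The dictionary entries and frequency counts are invented placeholders, and real corpus-based selection would typically use co-occurrence evidence rather than raw frequency.
      from collections import Counter

      # Placeholder bilingual dictionary: source term -> candidate translations.
      bilingual = {"資訊": ["information", "data"], "檢索": ["retrieval", "search"]}

      # Placeholder target-language corpus frequencies used for disambiguation.
      corpus_freq = Counter({"information": 120, "data": 95, "retrieval": 40, "search": 310})

      def translate_query(terms):
          """Pick, for each source term, the dictionary candidate with the
          highest target-corpus frequency; unknown terms pass through unchanged."""
          out = []
          for t in terms:
              candidates = bilingual.get(t, [t])
              out.append(max(candidates, key=lambda c: corpus_freq[c]))
          return out

      print(translate_query(["資訊", "檢索"]))  # -> ['information', 'search']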
    Date
    16. 2.2000 14:22:39
  14. Babik, W.: Keywords as linguistic tools in information and knowledge organization (2017) 0.01
    Source
    Theorie, Semantik und Organisation von Wissen: Proceedings der 13. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und dem 13. Internationalen Symposium der Informationswissenschaft der Higher Education Association for Information Science (HI) Potsdam (19.-20.03.2013): 'Theory, Information and Organization of Knowledge' / Proceedings der 14. Tagung der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) und Natural Language & Information Systems (NLDB) Passau (16.06.2015): 'Lexical Resources for Knowledge Organization' / Proceedings des Workshops der Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) auf der SEMANTICS Leipzig (1.09.2014): 'Knowledge Organization and Semantic Web' / Proceedings des Workshops der Polnischen und Deutschen Sektion der Internationalen Gesellschaft für Wissensorganisation (ISKO) Cottbus (29.-30.09.2011): 'Economics of Knowledge Production and Organization'. Hrsg. von W. Babik, H.P. Ohly u. K. Weber
  15. Nait-Baha, L.; Jackiewicz, A.; Djioua, B.; Laublet, P.: Query reformulation for information retrieval on the Web using the point of view methodology : preliminary results (2001) 0.01
    Date
    6. 1.1997 18:30:28
    Source
    Knowledge organization. 28(2001) no.3, S.129-136
  16. Whitelock, P.; Kilby, K.: Linguistic and computational techniques in machine translation system design : 2nd ed (1995) 0.01
    Date
    29. 3.1996 18:28:09
  17. Abdelali, A.: Localization in modern standard Arabic (2004) 0.01
    Abstract
    Modern Standard Arabic (MSA) is the official language used in all Arabic countries. In this paper we describe an investigation of the uniformity of MSA across different countries. Many studies have been carried out locally or regionally on Arabic and its dialects. Here we look at a more global scale by studying language variations between countries. The source material used in this investigation was derived from national newspapers available on the Web, which provided samples of common media usage in each country. This corpus has been used to investigate the lexical characteristics of Modern Standard Arabic as found in 10 different Arabic-speaking countries. We describe our collection methods, the types of lexical analysis performed, and the results of our investigations. With respect to newspaper articles, MSA seems to be very uniform across all the countries included in the study, but we have detected various types of differences, with implications for computational processing of MSA.
    Source
    Journal of the American Society for Information Science and technology. 55(2004) no.1, S.23-28
  18. Peis, E.; Herrera-Viedma, E.; Herrera, J.C.: On the evaluation of XML documents using Fuzzy linguistic techniques (2003) 0.01
    Abstract
    Recommender systems evaluate and filter the great amount of information available on the Web to assist people in their search processes. A fuzzy evaluation method for XML documents based on computing with words is presented. Given an XML document type (e.g., a scientific article), we consider that its elements are not equally informative. This is indicated by the use of a DTD and by defining linguistic importance attributes for the more meaningful elements of the DTD designed. Then, the evaluation method generates linguistic recommendations from linguistic evaluation judgements provided by different recommenders on meaningful elements of the DTD.
    Date
    6. 1.1997 18:30:28
  19. Zhang, C.; Zeng, D.; Li, J.; Wang, F.-Y.; Zuo, W.: Sentiment analysis of Chinese documents : from sentence to document level (2009) 0.01
    Abstract
    User-generated content on the Web has become an extremely valuable source for mining and analyzing user opinions on any topic. Recent years have seen an increasing body of work investigating methods to recognize favorable and unfavorable sentiments toward specific subjects from online text. However, most of these efforts focus on English and there have been very few studies on sentiment analysis of Chinese content. This paper aims to address the unique challenges posed by Chinese sentiment analysis. We propose a rule-based approach including two phases: (1) determining each sentence's sentiment based on word dependency, and (2) aggregating sentences to predict the document sentiment. We report the results of an experimental study comparing our approach with three machine learning-based approaches using two sets of Chinese articles. These results illustrate the effectiveness of our proposed method and its advantages against learning-based approaches.
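    The paper's dependency-based rules are not reproduced here; purely as a hedged sketch of the two-phase structure (score each sentence, then aggregate to the document level), the snippet below uses a trivial lexicon lookup as a stand-in for phase one and a length-weighted average for phase two. The lexicon and the example sentences are invented placeholders.
      # Placeholder polarity lexicon standing in for the paper's dependency-based rules.
      lexicon = {"good": 1.0, "excellent": 1.0, "bad": -1.0, "terrible": -1.0}

      def sentence_score(sentence: str) -> float:
          """Phase 1 (stand-in): average polarity of lexicon words in the sentence."""
          hits = [lexicon[w] for w in sentence.lower().split() if w in lexicon]
          return sum(hits) / len(hits) if hits else 0.0

      def document_score(sentences: list) -> float:
          """Phase 2: aggregate sentence scores, weighting longer sentences more."""
          weights = [len(s.split()) for s in sentences]
          total = sum(weights) or 1
          return sum(w * sentence_score(s) for w, s in zip(weights, sentences)) / total

      doc = ["The screen is excellent", "Battery life could be better"]
      print(round(document_score(doc), 2))  # -> 0.44 (mildly positive overall)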
    Date
    2. 2.2010 19:29:56
  20. Rettinger, A.; Schumilin, A.; Thoma, S.; Ell, B.: Learning a cross-lingual semantic representation of relations expressed in text (2015) 0.01
    Series
    Information Systems and Applications, incl. Internet/Web, and HCI; Bd. 9088
    Source
    The Semantic Web: latest advances and new domains. 12th European Semantic Web Conference, ESWC 2015 Portoroz, Slovenia, May 31 -- June 4, 2015. Proceedings. Eds.: F. Gandon u.a

Types

  • a 163
  • el 27
  • m 23
  • s 9
  • x 5
  • p 2
  • b 1
  • d 1