Search (260 results, page 2 of 13)

  • × theme_ss:"Computerlinguistik"
  1. Lezius, W.; Rapp, R.; Wettler, M.: A morphology-system and part-of-speech tagger for German (1996) 0.03
    0.026320523 = product of:
      0.06580131 = sum of:
        0.046599183 = weight(_text_:system in 1693) [ClassicSimilarity], result of:
          0.046599183 = score(doc=1693,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.3479797 = fieldWeight in 1693, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.078125 = fieldNorm(doc=1693)
        0.019202124 = product of:
          0.057606373 = sum of:
            0.057606373 = weight(_text_:22 in 1693) [ClassicSimilarity], result of:
              0.057606373 = score(doc=1693,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.38690117 = fieldWeight in 1693, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1693)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Date
    22. 3.2015 9:37:18
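
The indented figures under each entry are Lucene "explain" trees for the ClassicSimilarity ranking used by this catalogue: each weight(_text_:term) leaf multiplies a query weight (idf x queryNorm) by a field weight (tf x idf x fieldNorm), the leaves are summed, and the sum is scaled by a coord() factor for the fraction of query terms that matched. Below is a minimal sketch, assuming only the standard ClassicSimilarity formula, that reproduces the "system" leaf of entry 1 from the numbers shown above (the helper function is ours, not Lucene code):

```python
import math

def classic_similarity_leaf(freq: float, idf: float, query_norm: float, field_norm: float) -> float:
    """Recompute one weight(_text_:term) leaf of a Lucene ClassicSimilarity explanation."""
    tf = math.sqrt(freq)                    # tf(freq) = sqrt(termFreq)
    query_weight = idf * query_norm         # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm    # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight      # leaf score

# Values copied from entry 1 (doc 1693), term "system":
leaf = classic_similarity_leaf(freq=2.0, idf=3.1495528,
                               query_norm=0.04251826, field_norm=0.078125)
print(round(leaf, 9))                       # ~0.046599183, as shown in the tree
```

Multiplying the leaf sum 0.06580131 by the coord factor 0.4 = coord(2/5) then yields the displayed entry score of 0.026320523.
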
  2. Lund, B.D.; Wang, T.; Mannuru, N.R.; Nie, B.; Shimray, S.; Wang, Z.: ChatGPT and a new academic reality : Artificial Intelligence-written research papers and the ethics of the large language models in scholarly publishing (2023) 0.02
    0.024017572 = product of:
      0.06004393 = sum of:
        0.04841807 = weight(_text_:context in 943) [ClassicSimilarity], result of:
          0.04841807 = score(doc=943,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.27475408 = fieldWeight in 943, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.046875 = fieldNorm(doc=943)
        0.011625858 = product of:
          0.034877572 = sum of:
            0.034877572 = weight(_text_:29 in 943) [ClassicSimilarity], result of:
              0.034877572 = score(doc=943,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.23319192 = fieldWeight in 943, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=943)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    This article discusses OpenAI's ChatGPT, a generative pre-trained transformer, which uses natural language processing to fulfill text-based user requests (i.e., a "chatbot"). The history and principles behind ChatGPT and similar models are discussed. This technology is then discussed in relation to its potential impact on academia and scholarly research and publishing. ChatGPT is seen as a potential model for the automated preparation of essays and other types of scholarly manuscripts. Potential ethical issues that could arise with the emergence of large language models like GPT-3, the underlying technology behind ChatGPT, and its usage by academics and researchers, are discussed and situated within the context of broader advancements in artificial intelligence, machine learning, and natural language processing for research and scholarly publishing.
    Date
    19. 4.2023 19:29:44
  3. Hammwöhner, R.: TransRouter revisited : Decision support in the routing of translation projects (2000) 0.02
    0.023828931 = product of:
      0.059572328 = sum of:
        0.04613084 = weight(_text_:system in 5483) [ClassicSimilarity], result of:
          0.04613084 = score(doc=5483,freq=4.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.34448233 = fieldWeight in 5483, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5483)
        0.013441487 = product of:
          0.04032446 = sum of:
            0.04032446 = weight(_text_:22 in 5483) [ClassicSimilarity], result of:
              0.04032446 = score(doc=5483,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.2708308 = fieldWeight in 5483, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5483)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    This paper gives an outline of the final results of the TransRouter project, within the scope of which a decision support system for translation managers has been developed that supports the selection of appropriate routes for translation projects. The emphasis in this paper is on the decision model, which is based on a stepwise refined assessment of translation routes. The workflow of using this system is considered as well.
    Date
    10.12.2000 18:22:35
  4. Satta, G.; Stock, O.: Bidirectional context-free grammar parsing for natural language processing (1994) 0.02
    0.022824498 = product of:
      0.11412249 = sum of:
        0.11412249 = weight(_text_:context in 1443) [ClassicSimilarity], result of:
          0.11412249 = score(doc=1443,freq=4.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.64760154 = fieldWeight in 1443, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.078125 = fieldNorm(doc=1443)
      0.2 = coord(1/5)
    
    Abstract
    While natural language is usually analyzed from left to right, bidirectional parsing is very attractive for both theoretical and practical reasons. Describes a formal framework for bidirectional tabular parsing of general context-free languages, and studies some applications to natural language processing.
  5. Wiebe, J.; Hirst, G.; Horton, D.: Language use in context (1996) 0.02
    0.0225951 = product of:
      0.1129755 = sum of:
        0.1129755 = weight(_text_:context in 6828) [ClassicSimilarity], result of:
          0.1129755 = score(doc=6828,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.64109284 = fieldWeight in 6828, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.109375 = fieldNorm(doc=6828)
      0.2 = coord(1/5)
    
  6. Doko, A.; Stula, M.; Seric, L.: Improved sentence retrieval using local context and sentence length (2013) 0.02
    0.02165322 = product of:
      0.1082661 = sum of:
        0.1082661 = weight(_text_:context in 2705) [ClassicSimilarity], result of:
          0.1082661 = score(doc=2705,freq=10.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.6143688 = fieldWeight in 2705, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.046875 = fieldNorm(doc=2705)
      0.2 = coord(1/5)
    
    Abstract
    In this paper we propose improved variants of the sentence retrieval method TF-ISF (a TF-IDF or Term Frequency-Inverse Document Frequency variant for sentence retrieval). The improvement is achieved by using context consisting of neighboring sentences and at the same time promoting the retrieval of longer sentences. We thoroughly compare the new modified TF-ISF methods to the TF-ISF baseline, to an earlier attempt to include context into TF-ISF named tfmix, and to a language-modeling-based method named 3MMPDS that uses context and promotes the retrieval of long sentences. Experimental results show that the TF-ISF method can be improved using local context. Results also show that the TF-ISF method can be improved by promoting the retrieval of longer sentences. Finally, we show that the best results are achieved when both modifications are combined. All new methods (TF-ISF variants) also show statistically significantly better results than the other tested methods.
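
A minimal sketch of the idea in the abstract above: TF-ISF scoring extended with a window of neighbouring sentences and a bonus for longer sentences. It illustrates the general technique only; the exact weighting used by Doko et al. is not given here, and the context weight, length weight, and function names are our assumptions.

```python
import math
from collections import Counter

def tf_isf_scores(sentences, query_terms, context_window=1, context_weight=0.5, length_weight=0.1):
    """Score sentences against query terms with TF-ISF, local context, and a length bonus."""
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    # inverse sentence frequency: analogous to IDF, but counted over sentences
    isf = {}
    for term in query_terms:
        sf = sum(1 for toks in tokenized if term in toks)
        isf[term] = math.log(1 + n / (1 + sf))

    def base_score(tokens):
        counts = Counter(tokens)
        return sum(counts[t] * isf[t] for t in query_terms)

    scores = []
    for i, toks in enumerate(tokenized):
        # local context: neighbouring sentences contribute with a reduced weight
        neighbours = tokenized[max(0, i - context_window):i] + tokenized[i + 1:i + 1 + context_window]
        context_tokens = [t for sent in neighbours for t in sent]
        score = base_score(toks) + context_weight * base_score(context_tokens)
        score += length_weight * math.log(1 + len(toks))   # promote longer sentences
        scores.append(score)
    return scores

sentences = ["Local context helps sentence retrieval.",
             "Sentence length also matters.",
             "Unrelated text."]
print(tf_isf_scores(sentences, ["context", "sentence"]))
```
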
  7. Campe, P.: Case, semantic roles, and grammatical relations : a comprehensive bibliography (1994) 0.02
    0.0215282 = product of:
      0.107641 = sum of:
        0.107641 = weight(_text_:index in 8663) [ClassicSimilarity], result of:
          0.107641 = score(doc=8663,freq=2.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.5793543 = fieldWeight in 8663, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.09375 = fieldNorm(doc=8663)
      0.2 = coord(1/5)
    
    Abstract
    Contains references to more than 6000 publications with a subject and a language index as well as a guide to the relevant languages and language families
  8. Peters, W.; Vossen, P.; Diez-Orzas, P.; Adriaens, G.: Cross-linguistic alignment of WordNets with an inter-lingual-index (1998) 0.02
    0.0215282 = product of:
      0.107641 = sum of:
        0.107641 = weight(_text_:index in 6446) [ClassicSimilarity], result of:
          0.107641 = score(doc=6446,freq=2.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.5793543 = fieldWeight in 6446, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.09375 = fieldNorm(doc=6446)
      0.2 = coord(1/5)
    
  9. He, Q.: A study of the strength indexes in co-word analysis (2000) 0.02
    0.0215282 = product of:
      0.107641 = sum of:
        0.107641 = weight(_text_:index in 111) [ClassicSimilarity], result of:
          0.107641 = score(doc=111,freq=8.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.5793543 = fieldWeight in 111, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.046875 = fieldNorm(doc=111)
      0.2 = coord(1/5)
    
    Abstract
    Co-word analysis is a technique for detecting the knowledge structure of scientific literature and mapping the dynamics in a research field. It is used to count the co-occurrences of term pairs, compute the strength between term pairs, and map the research field by inserting terms and their linkages into a graphical structure according to the strength values. In previous co-word studies, there are two indexes used to measure the strength between term pairs in order to identify the major areas in a research field - the inclusion index (I) and the equivalence index (E). This study will conduct two co-word analysis experiments using the two indexes, respectively, and compare the results from the two experiments. The results show that, due to the difference in their computation, index I is more likely to identify general subject areas in a research field, while index E is more likely to identify subject areas at more specific levels.
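
The two strength measures mentioned in the abstract are conventionally computed from co-occurrence counts: with c_ij the number of records in which terms i and j occur together and c_i, c_j their individual record counts, the inclusion index is I = c_ij / min(c_i, c_j) and the equivalence index is E = c_ij^2 / (c_i * c_j). The abstract does not spell the formulas out, so the sketch below assumes these standard definitions:

```python
def inclusion_index(c_ij: int, c_i: int, c_j: int) -> float:
    """I = c_ij / min(c_i, c_j); reaches 1.0 when the rarer term always co-occurs with the other."""
    return c_ij / min(c_i, c_j)

def equivalence_index(c_ij: int, c_i: int, c_j: int) -> float:
    """E = c_ij^2 / (c_i * c_j); high only when both terms occur mostly together."""
    return (c_ij ** 2) / (c_i * c_j)

# Example: term A appears in 40 records, term B in 10, and they co-occur in 8.
print(inclusion_index(8, 40, 10))    # 0.8  -> strong link relative to the rarer term
print(equivalence_index(8, 40, 10))  # 0.16 -> weaker once both totals are considered
```
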
  10. Riloff, E.: An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.02
    0.02105642 = product of:
      0.05264105 = sum of:
        0.03727935 = weight(_text_:system in 6752) [ClassicSimilarity], result of:
          0.03727935 = score(doc=6752,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.27838376 = fieldWeight in 6752, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0625 = fieldNorm(doc=6752)
        0.015361699 = product of:
          0.046085097 = sum of:
            0.046085097 = weight(_text_:22 in 6752) [ClassicSimilarity], result of:
              0.046085097 = score(doc=6752,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.30952093 = fieldWeight in 6752, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6752)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    AutoSlog is a system that addresses the knowledge engineering bottleneck for information extraction. AutoSlog automatically creates domain specific dictionaries for information extraction, given an appropriate training corpus. Describes experiments with AutoSlog in terrorism, joint ventures and microelectronics domains. Compares the performance of AutoSlog across the 3 domains, discusses the lessons learned and presents results from 2 experiments which demonstrate that novice users can generate effective dictionaries using AutoSlog
    Date
    6. 3.1997 16:22:15
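
AutoSlog builds its dictionaries by taking each annotated target noun phrase in the training corpus and letting a small set of syntactic heuristics propose an extraction pattern anchored on the governing verb (for instance a pattern of the form "<subject> passive-verb"). The sketch below imitates one such heuristic step with spaCy dependency parses; it is a loose illustration, not Riloff's implementation, and the pattern notation and the use of spaCy are our assumptions.

```python
from typing import Optional
import spacy

nlp = spacy.load("en_core_web_sm")   # assumes the small English model is installed

def propose_pattern(sentence: str, target: str) -> Optional[str]:
    """Propose a one-slot extraction pattern for the target noun, AutoSlog-style."""
    doc = nlp(sentence)
    for tok in doc:
        if tok.text.lower() != target.lower():
            continue
        head = tok.head
        if tok.dep_ in ("nsubj", "nsubjpass") and head.pos_ in ("VERB", "AUX"):
            voice = "passive-verb" if tok.dep_ == "nsubjpass" else "active-verb"
            return f"<subject> {voice}:{head.lemma_}"
        if tok.dep_ == "dobj" and head.pos_ == "VERB":
            return f"active-verb:{head.lemma_} <direct-object>"
    return None

print(propose_pattern("The embassy was bombed by terrorists.", "embassy"))
# expected: "<subject> passive-verb:bomb"
```
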
  11. Basili, R.; Pazienza, M.T.; Velardi, P.: An empirical symbolic approach to natural language processing (1996) 0.02
    0.02105642 = product of:
      0.05264105 = sum of:
        0.03727935 = weight(_text_:system in 6753) [ClassicSimilarity], result of:
          0.03727935 = score(doc=6753,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.27838376 = fieldWeight in 6753, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0625 = fieldNorm(doc=6753)
        0.015361699 = product of:
          0.046085097 = sum of:
            0.046085097 = weight(_text_:22 in 6753) [ClassicSimilarity], result of:
              0.046085097 = score(doc=6753,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.30952093 = fieldWeight in 6753, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6753)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    Describes and evaluates the results of a large-scale lexical learning system, ARISTO-LEX, that uses a combination of probabilistic and knowledge-based methods for the acquisition of selectional restrictions of words in sublanguages. Presents experimental data obtained from different corpora in different domains and languages, and shows that the acquired lexical data not only have practical applications in natural language processing but are also useful for a comparative analysis of sublanguages.
    Date
    6. 3.1997 16:22:15
  12. Gill, A.J.; Hinrichs-Krapels, S.; Blanke, T.; Grant, J.; Hedges, M.; Tanner, S.: Insight workflow : systematically combining human and computational methods to explore textual data (2017) 0.02
    0.020014644 = product of:
      0.05003661 = sum of:
        0.040348392 = weight(_text_:context in 3682) [ClassicSimilarity], result of:
          0.040348392 = score(doc=3682,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.22896172 = fieldWeight in 3682, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3682)
        0.009688215 = product of:
          0.029064644 = sum of:
            0.029064644 = weight(_text_:29 in 3682) [ClassicSimilarity], result of:
              0.029064644 = score(doc=3682,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.19432661 = fieldWeight in 3682, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3682)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    Analyzing large quantities of real-world textual data has the potential to provide new insights for researchers. However, such data present challenges for both human and computational methods, requiring a diverse range of specialist skills, often shared across a number of individuals. In this paper we use the analysis of a real-world data set as our case study, and use this exploration as a demonstration of our "insight workflow," which we present for use and adaptation by other researchers. The data we use are impact case study documents collected as part of the UK Research Excellence Framework (REF), consisting of 6,679 documents and 6.25 million words; the analysis was commissioned by the Higher Education Funding Council for England (published as report HEFCE 2015). In our exploration and analysis we used a variety of techniques, ranging from keyword in context and frequency information to more sophisticated methods (topic modeling), with these automated techniques providing an empirical point of entry for in-depth and intensive human analysis. We present the 60 topics to demonstrate the output of our methods, and illustrate how the variety of analysis techniques can be combined to provide insights. We note potential limitations and propose future work.
    Date
    16.11.2017 14:00:29
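
The "entry point" techniques named in the abstract - keyword-in-context views and topic modeling - are easy to sketch. The snippet below is an illustration only, not the toolchain used for the HEFCE analysis; the library choice (scikit-learn) and all parameters are our assumptions (the report itself settled on 60 topics).

```python
import re
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

def kwic(text: str, keyword: str, window: int = 5):
    """Yield keyword-in-context snippets: `window` words either side of each hit."""
    words = text.split()
    for i, w in enumerate(words):
        if re.fullmatch(keyword, w, flags=re.IGNORECASE):
            yield " ".join(words[max(0, i - window):i + window + 1])

def topics(documents, n_topics=10, top_n=8):
    """Fit LDA on a document collection and return the top words per topic."""
    vec = CountVectorizer(stop_words="english", max_df=0.9, min_df=2)
    dtm = vec.fit_transform(documents)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0).fit(dtm)
    vocab = vec.get_feature_names_out()
    return [[vocab[i] for i in comp.argsort()[-top_n:][::-1]] for comp in lda.components_]
```
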
  13. Computational linguistics for the new millennium : divergence or synergy? Proceedings of the International Symposium held at the Ruprecht-Karls Universität Heidelberg, 21-22 July 2000. Festschrift in honour of Peter Hellwig on the occasion of his 60th birthday (2002) 0.02
    0.01997978 = product of:
      0.049949452 = sum of:
        0.040348392 = weight(_text_:context in 4900) [ClassicSimilarity], result of:
          0.040348392 = score(doc=4900,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.22896172 = fieldWeight in 4900, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4900)
        0.009601062 = product of:
          0.028803186 = sum of:
            0.028803186 = weight(_text_:22 in 4900) [ClassicSimilarity], result of:
              0.028803186 = score(doc=4900,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.19345059 = fieldWeight in 4900, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4900)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Content
    Contents: Manfred Klenner / Henriette Visser: Introduction - Khurshid Ahmad: Writing Linguistics: When I use a word it means what I choose it to mean - Jürgen Handke: 2000 and Beyond: The Potential of New Technologies in Linguistics - Jurij Apresjan / Igor Boguslavsky / Leonid Iomdin / Leonid Tsinman: Lexical Functions in NU: Possible Uses - Hubert Lehmann: Practical Machine Translation and Linguistic Theory - Karin Haenelt: A Contextbased Approach towards Content Processing of Electronic Documents - Petr Sgall / Eva Hajicová: Are Linguistic Frameworks Comparable? - Wolfgang Menzel: Theory and Applications in Computational Linguistics - Is there Common Ground? - Robert Porzel / Michael Strube: Towards Context-adaptive Natural Language Processing Systems - Nicoletta Calzolari: Language Resources in a Multilingual Setting: The European Perspective - Piek Vossen: Computational Linguistics for Theory and Practice.
  14. Melzer, C.: Der Maschine anpassen : PC-Spracherkennung - Programme sind mittlerweile alltagsreif (2005) 0.02
    0.01866849 = product of:
      0.046671223 = sum of:
        0.03995048 = weight(_text_:system in 4044) [ClassicSimilarity], result of:
          0.03995048 = score(doc=4044,freq=12.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.29833046 = fieldWeight in 4044, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02734375 = fieldNorm(doc=4044)
        0.0067207436 = product of:
          0.02016223 = sum of:
            0.02016223 = weight(_text_:22 in 4044) [ClassicSimilarity], result of:
              0.02016223 = score(doc=4044,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.1354154 = fieldWeight in 4044, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=4044)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Content
    "Der Spracherkennung am Computer schien vor wenigen Jahren die Zukunft zu gehören. Geradezu euphorisch waren viele Computernutzer, als sich auf den Bildschirmen die ersten gesprochenen Sätze als Text darstellten. Doch die Spracherkennung erwies sich als anfällig, die Nachbearbeitung nahm manchmal mehr Zeit in Anspruch als gespart wurde. Dabei ist die Kommunikation des Menschen mit der Maschine über die Tastatur eigentlich höchst kompliziert - selbst geübte Schreiber sprechen schneller als sie tippen. Deshalb hat sich inzwischen viel getan: Im Preis und in der Genauigkeit sind viele Spracherkennungsprogramme heute alltagsreif. Die besten Systeme kosten aber noch immer mehrere hundert Euro, die günstigsten weisen Lücken auf. Letztlich gilt: Respektable Ergebnisse sind erreichbar, wenn sich der Mensch der Maschine anpasst. Die Stiftung Warentest in Berlin hat die sechs gängigsten Systeme auf den Prüfstand gestellt. Die ersten Ergebnisse waren ernüchternd: Das deutlich gesprochene "Johann Wolfgang von Goethe" wurde als "Juan Wolf kann Mohnblüte", "Jaun Wolfgang von Göbel" oder "Johann-Wolfgang Wohngüte" geschrieben. Grundsätzlich gilt: Bei einem einfachen Basiswortschatz sind die Ergebnisse genau, sobald es etwas spezieller wird, wird die Software erfinderisch. "Zweiter Weltkrieg" kann dann zu "Zeit für Geld kriegt" werden. Doch ebenso wie der Nutzer lernt auch das System. Bei der Software ist Lernfähigkeit Standard. Ohnehin muss der Benutzer das System einrichten, indem er vorgegebene Texte liest. Dabei wird das Programm der Stimme und der Sprechgeschwindigkeit angepasst. Hier gilt, dass der Anwender deutlich, aber ganz normal vorlesen sollte. Wer akzentuiert und übertrieben betont, wird später mit ungenauen Ausgaben bestraft. Erkennt das System auch nach dem Training einzelne Wörter nicht, können sie nachträglich eingefügt werden. Gleiches gilt für kompliziertere Orts- oder Eigennamen. Wie gut das funktioniert, beweist ein Gegentest: Liest ein anderer den selben Text vor, sinkt das Erkennungsniveau rapide. Die beste Lernfähigkeit attestierten die Warentester dem System "Voice Pro 10" von linguatec. Das war das mit Abstand vielseitigste, mit fast 200 Euro jedoch auch das teuerste Programm.
    It is cheaper with "Via Voice Standard" from IBM. The software costs about 50 euros but has considerable weaknesses in its learning ability; it still does better, however, than "Voice Office Premium 10", which costs a good three times as much and was the only one of the six programs in the test to receive no more than a "satisfactory". "You no longer read so much about speech recognition because it simply works," believes Dorothee Wiegand of the Hanover-based computer magazine "c't". The technology, for example "Dragon Naturally Speaking" from ScanSoft, is mature: "Speech recognition is above all statistics, the evaluation of endless word possibilities. The real problem used to be the hardware," says Wiegand. Now that even simple home computers are fast and powerful, developers have far more options. "But even older computers cope with the systems. They just take a little longer!" "Every byte makes speech recognition a little faster, but it does not become less accurate otherwise," confirms Kristina Henry of linguatec in Munich. For this manufacturer's products, too, "practice and speaking clearly matter more than any hardware". Even voices from dictation machines are recognized clearly, Henry assures: "We want to go a step further and make dictation on the move possible." The user could then dial a number, dictate a text in the car, for example, and find it "typed" at home. In principle, speech recognition software is now also suitable for the private computer. What is clear, however, is that even the best-spoken text has to be post-edited. The user also needs patience: just as the system learns, the human must adapt to the system in pronunciation and speed. The results are then remarkable, though - and "Sexterminvereinbarung" instead of "zwecks Terminvereinbarung" is a thing of the past."
    Date
    3. 5.1997 8:44:22
  15. Bowker, L.: Information retrieval in translation memory systems : assessment of current limitations and possibilities for future development (2002) 0.02
    0.018473173 = product of:
      0.04618293 = sum of:
        0.03261943 = weight(_text_:system in 1854) [ClassicSimilarity], result of:
          0.03261943 = score(doc=1854,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.2435858 = fieldWeight in 1854, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1854)
        0.013563501 = product of:
          0.0406905 = sum of:
            0.0406905 = weight(_text_:29 in 1854) [ClassicSimilarity], result of:
              0.0406905 = score(doc=1854,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.27205724 = fieldWeight in 1854, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1854)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    A translation memory system is a new type of human language technology (HLT) tool that is gaining popularity among translators. Such tools allow translators to store previously translated texts in a type of aligned bilingual database, and to recycle relevant parts of these texts when producing new translations. Currently, these tools retrieve information from the database using superficial character string matching, which often results in poor precision and recall. This paper explains how translation memory systems work, and it considers some possible ways for introducing more sophisticated information retrieval techniques into such systems by taking syntactic and semantic similarity into account. Some of the suggested techniques are inspired by those used in other areas of HLT, and some by techniques used in information science.
    Source
    Knowledge organization. 29(2002) nos.3/4, S.198-203
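
Current systems perform the retrieval described above by superficial character-string ("fuzzy") matching between the new source segment and the stored segments. Below is a minimal sketch of that baseline using an edit-distance-style similarity; the 0.7 threshold and the helper names are our assumptions, and the syntactic and semantic extensions Bowker discusses would replace or supplement this similarity function.

```python
from difflib import SequenceMatcher

def fuzzy_match(new_segment: str, memory: dict[str, str], threshold: float = 0.7):
    """Return (source, translation, similarity) hits above the threshold.

    `memory` maps previously translated source segments to their translations."""
    hits = []
    for source, translation in memory.items():
        sim = SequenceMatcher(None, new_segment.lower(), source.lower()).ratio()
        if sim >= threshold:
            hits.append((source, translation, round(sim, 2)))
    return sorted(hits, key=lambda h: h[2], reverse=True)

tm = {"The printer is out of paper.": "Der Drucker hat kein Papier mehr."}
print(fuzzy_match("The printer is out of toner.", tm))
```
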
  16. Schwarz, C.: THESYS: Thesaurus Syntax System : a fully automatic thesaurus building aid (1988) 0.02
    0.018424368 = product of:
      0.04606092 = sum of:
        0.03261943 = weight(_text_:system in 1361) [ClassicSimilarity], result of:
          0.03261943 = score(doc=1361,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.2435858 = fieldWeight in 1361, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1361)
        0.013441487 = product of:
          0.04032446 = sum of:
            0.04032446 = weight(_text_:22 in 1361) [ClassicSimilarity], result of:
              0.04032446 = score(doc=1361,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.2708308 = fieldWeight in 1361, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1361)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Date
    6. 1.1999 10:22:07
  17. Kay, M.: The proper place of men and machines in language translation (1997) 0.02
    0.018424368 = product of:
      0.04606092 = sum of:
        0.03261943 = weight(_text_:system in 1178) [ClassicSimilarity], result of:
          0.03261943 = score(doc=1178,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.2435858 = fieldWeight in 1178, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1178)
        0.013441487 = product of:
          0.04032446 = sum of:
            0.04032446 = weight(_text_:22 in 1178) [ClassicSimilarity], result of:
              0.04032446 = score(doc=1178,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.2708308 = fieldWeight in 1178, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1178)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    Machine translation stands no chance of filling actual needs for translation because, although there has been progress in relevant areas of computer science, advances in linguistics have not touched the core problems. Cooperative man-machine systems need to be developed. Proposes a translator's amanuensis, incorporating into a word processor some simple facilities peculiar to translation. Gradual enhancements of such a system could lead to the original goal of machine translation.
    Date
    31. 7.1996 9:22:19
  18. Conceptual structures : logical, linguistic, and computational issues. 8th International Conference on Conceptual Structures, ICCS 2000, Darmstadt, Germany, August 14-18, 2000 (2000) 0.02
    0.017591758 = product of:
      0.043979395 = sum of:
        0.024209036 = weight(_text_:context in 691) [ClassicSimilarity], result of:
          0.024209036 = score(doc=691,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.13737704 = fieldWeight in 691, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.0234375 = fieldNorm(doc=691)
        0.01977036 = weight(_text_:system in 691) [ClassicSimilarity], result of:
          0.01977036 = score(doc=691,freq=4.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.14763528 = fieldWeight in 691, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0234375 = fieldNorm(doc=691)
      0.4 = coord(2/5)
    
    Content
    Concepts and Language: The Role of Conceptual Structure in Human Evolution (Keith Devlin) - Concepts in Linguistics - Concepts in Natural Language (Gisela Harras) - Patterns, Schemata, and Types: Author Support through Formalized Experience (Felix H. Gatzemeier) - Conventions and Notations for Knowledge Representation and Retrieval (Philippe Martin) - Conceptual Ontology: Ontology, Metadata, and Semiotics (John F. Sowa) - Pragmatically Yours (Mary Keeler) - Conceptual Modeling for Distributed Ontology Environments (Deborah L. McGuinness) - Discovery of Class Relations in Exception Structured Knowledge Bases (Hendra Suryanto, Paul Compton) - Conceptual Graphs: Perspectives: CGs Applications: Where Are We 7 Years after the First ICCS? (Michel Chein, David Genest) - The Engineering of a CG-Based System: Fundamental Issues (Guy W. Mineau) - Conceptual Graphs, Metamodeling, and Notation of Concepts (Olivier Gerbé, Guy W. Mineau, Rudolf K. Keller) - Knowledge Representation and Reasonings Based on Graph Homomorphism (Marie-Laure Mugnier) - User Modeling Using Conceptual Graphs for Intelligent Agents (James F. Baldwin, Trevor P. Martin, Aimilia Tzanavari) - Towards a Unified Querying System of Both Structured and Semi-structured Imprecise Data Using Fuzzy View (Patrice Buche, Ollivier Haemmerlé) - Formal Semantics of Conceptual Structures: The Extensional Semantics of the Conceptual Graph Formalism (Guy W. Mineau) - Semantics of Attribute Relations in Conceptual Graphs (Pavel Kocura) - Nested Concept Graphs and Triadic Power Context Families (Susanne Prediger) - Negations in Simple Concept Graphs (Frithjof Dau) - Extending the CG Model by Simulations (Jean-François Baget) - Contextual Logic and Formal Concept Analysis: Building and Structuring Description Logic Knowledge Bases Using Least Common Subsumers and Concept Analysis (Franz Baader, Ralf Molitor) - On the Contextual Logic of Ordinal Data (Silke Pollandt, Rudolf Wille) - Boolean Concept Logic (Rudolf Wille) - Lattices of Triadic Concept Graphs (Bernd Groh, Rudolf Wille) - Formalizing Hypotheses with Concepts (Bernhard Ganter, Sergei O. Kuznetsov) - Generalized Formal Concept Analysis (Laurent Chaudron, Nicolas Maille) - A Logical Generalization of Formal Concept Analysis (Sébastien Ferré, Olivier Ridoux) - On the Treatment of Incomplete Knowledge in Formal Concept Analysis (Peter Burmeister, Richard Holzer) - Conceptual Structures in Practice: Logic-Based Networks: Concept Graphs and Conceptual Structures (Peter W. Eklund) - Conceptual Knowledge Discovery and Data Analysis (Joachim Hereth, Gerd Stumme, Rudolf Wille, Uta Wille) - CEM - A Conceptual Email Manager (Richard Cole, Gerd Stumme) - A Contextual-Logic Extension of TOSCANA (Peter Eklund, Bernd Groh, Gerd Stumme, Rudolf Wille) - A Conceptual Graph Model for W3C Resource Description Framework (Olivier Corby, Rose Dieng, Cédric Hébert) - Computational Aspects of Conceptual Structures: Computing with Conceptual Structures (Bernhard Ganter) - Symmetry and the Computation of Conceptual Structures (Robert Levinson) - An Introduction to SNePS 3 (Stuart C. Shapiro) - Composition Norm Dynamics Calculation with Conceptual Graphs (Aldo de Moor) - From PROLOG++ to PROLOG+CG: A CG Object-Oriented Logic Programming Language (Adil Kabbaj, Martin Janta-Polczynski) - A Cost-Bounded Algorithm to Control Events Generalization (Gaël de Chalendar, Brigitte Grau, Olivier Ferret)
  19. Vlachidis, A.; Binding, C.; Tudhope, D.; May, K.: Excavating grey literature : a case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources (2010) 0.02
    0.017452361 = product of:
      0.0436309 = sum of:
        0.03588033 = weight(_text_:index in 3948) [ClassicSimilarity], result of:
          0.03588033 = score(doc=3948,freq=2.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.1931181 = fieldWeight in 3948, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.03125 = fieldNorm(doc=3948)
        0.0077505717 = product of:
          0.023251714 = sum of:
            0.023251714 = weight(_text_:29 in 3948) [ClassicSimilarity], result of:
              0.023251714 = score(doc=3948,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.15546128 = fieldWeight in 3948, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3948)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    Purpose - This paper sets out to discuss the use of information extraction (IE), a natural language-processing (NLP) technique to assist "rich" semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic-aware "rich" indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project. Design/methodology/approach - The paper proposes use of the English Heritage extension (CRM-EH) of the standard core ontology in cultural heritage, CIDOC CRM, and exploitation of domain thesauri resources for driving and enhancing an Ontology-Oriented Information Extraction process. The process of semantic indexing is based on a rule-based Information Extraction technique, which is facilitated by the General Architecture of Text Engineering (GATE) toolkit and expressed by Java Annotation Pattern Engine (JAPE) rules. Findings - Initial results suggest that the combination of information extraction with knowledge resources and standard conceptual models is capable of supporting semantic-aware term indexing. Additional efforts are required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms. Originality/value - The value of the paper lies in the semantic indexing of 535 unpublished online documents often referred to as "Grey Literature", from the Archaeological Data Service OASIS corpus (Online AccesS to the Index of archaeological investigationS), with respect to the CRM ontological concepts E49.Time Appellation and P19.Physical Object.
    Date
    29. 8.2010 12:03:40
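
In the STAR pipeline this rule-based, ontology-oriented extraction is realised with GATE gazetteers and JAPE rules that map matches to CRM-EH concepts such as E49.Time Appellation and P19.Physical Object. As a rough, language-neutral illustration of the same gazetteer-plus-rule idea (not the project's actual JAPE grammar; the vocabulary and class keys are toy examples):

```python
import re

# Toy gazetteer standing in for the domain thesauri (periods and artefact/monument types).
GAZETTEER = {
    "E49.Time_Appellation": ["bronze age", "iron age", "roman", "medieval"],
    "P19.Physical_Object": ["ditch", "pit", "pottery", "coin", "hearth"],
}

def annotate(text: str):
    """Return (start, end, surface, class) annotations from gazetteer lookups,
    a simplified stand-in for GATE gazetteer lookup plus JAPE rule matching."""
    annotations = []
    for cls, terms in GAZETTEER.items():
        for term in terms:
            for m in re.finditer(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE):
                annotations.append((m.start(), m.end(), m.group(0), cls))
    return sorted(annotations)

print(annotate("A Roman coin was recovered from the upper fill of the ditch."))
```
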
  20. Schürmann, H.: Software scannt Radio- und Fernsehsendungen : Recherche in Nachrichtenarchiven erleichtert (2001) 0.02
    0.017276151 = product of:
      0.043190375 = sum of:
        0.03646963 = weight(_text_:system in 5759) [ClassicSimilarity], result of:
          0.03646963 = score(doc=5759,freq=10.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.2723372 = fieldWeight in 5759, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02734375 = fieldNorm(doc=5759)
        0.0067207436 = product of:
          0.02016223 = sum of:
            0.02016223 = weight(_text_:22 in 5759) [ClassicSimilarity], result of:
              0.02016223 = score(doc=5759,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.1354154 = fieldWeight in 5759, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=5759)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Content
    Um Firmen und Agenturen die Beobachtungen von Medien zu erleichtern, entwickeln Forscher an der Duisburger Hochschule zurzeit ein System zur automatischen Themenerkennung in Rundfunk und Fernsehen. Das so genannte Alert-System soll dem Nutzer helfen, die für ihn relevanten Sprachinformationen aus Nachrichtensendungen herauszufiltem und weiterzuverarbeiten. Durch die automatische Analyse durch den Computer können mehrere Programme rund um die Uhr beobachtet werden. Noch erfolgt die Informationsgewinnung aus TV- und Radiosendungen auf klassischem Wege: Ein Mensch sieht, hört, liest und wertet aus. Das ist enorm zeitaufwendig und für eine Firma, die beispielsweise die Konkurrenz beobachten oder ihre Medienpräsenz dokumentieren lassen möchte, auch sehr teuer. Diese Arbeit ließe sich mit einem Spracherkenner automatisieren, sagten sich die Duisburger Forscher. Sie arbeiten nun zusammen mit Partnern aus Deutschland, Frankreich und Portugal in einem europaweiten Projekt an der Entwicklung einer entsprechenden Technologie (http://alert.uni-duisburg.de). An dem Projekt sind auch zwei Medienbeobachtungsuntemehmen beteiligt, die Oberserver Argus Media GmbH aus Baden-Baden und das französische Unternehmen Secodip. Unsere Arbeit würde schon dadurch erleichtert, wenn Informationen, die über unsere Kunden in den Medien erscheinen, vorselektiert würden", beschreibt Simone Holderbach, Leiterin der Produktentwicklung bei Oberserver, ihr Interesse an der Technik. Und wie funktioniert Alert? Das Spracherkennungssystem wird darauf getrimmt, Nachrichtensendungen in Radio und Fernsehen zu überwachen: Alles, was gesagt wird - sei es vom Nachrichtensprecher, Reporter oder Interviewten -, wird durch die automatische Spracherkennung in Text umgewandelt. Dabei werden Themen und Schlüsselwörter erkannt und gespeichert. Diese werden mit den Suchbegriffen des Nutzers verglichen. Gefundene Übereinstimmungen werden angezeigt und dem Benutzer automatisch mitgeteilt. Konventionelle Spracherkennungstechnik sei für die Medienbeobachtung nicht einsetzbar, da diese für einen anderen Zweck entwickelt worden sei, betont Prof. Gerhard Rigoll, Leiter des Fachgebiets Technische Informatik an der Duisburger Hochschule. Für die Umwandlung von Sprache in Text wurde die Alert-Software gründlich trainiert. Aus Zeitungstexten, Audio- und Video-Material wurden bislang rund 3 50 Millionen Wörter verarbeitet. Das System arbeitet in drei Sprachen. Doch so ganz fehlerfrei sei der automatisch gewonnene Text nicht, räumt Rigoll ein. Zurzeit liegt die Erkennungsrate bei 40 bis 70 Prozent. Und das wird sich in absehbarer Zeit auch nicht ändern." Musiküberlagerungen oder starke Hintergrundgeräusche bei Reportagen führen zu Ungenauigkeiten bei der Textumwandlung. Deshalb haben die, Duisburger Wissenschaftler Methoden entwickelt, die über die herkömmliche Suche nach Schlüsselwörtern hinausgehen und eine inhaltsorientierte Zuordnung ermöglichen. Dadurch erhält der Nutzer dann auch solche Nachrichten, die zwar zum Thema passen, in denen das Stichwort aber gar nicht auftaucht", bringt Rigoll den Vorteil der Technik auf den Punkt. Wird beispielsweise "Ölpreis" als Suchbegriff eingegeben, werden auch solche Nachrichten angezeigt, in denen Olkonzerne und Energieagenturen eine Rolle spielen. Rigoll: Das Alert-System liest sozusagen zwischen den Zeilen!' Das Forschungsprojekt wurde vor einem Jahr gestartet und läuft noch bis Mitte 2002. Wer sich über den Stand der Technik informieren möchte, kann dies in dieser Woche auf der Industriemesse in Hannover. 
Das Alert-System wird auf dem Gemeinschaftsstand "Forschungsland NRW" in Halle 18, Stand M12, präsentiert
    Source
    Handelsblatt. Nr.79 vom 24.4.2001, S.22
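
According to the description above, Alert converts broadcast speech to text, extracts keywords, compares them with each user's search terms, and in addition matches items that fit the topic even when the keyword itself never occurs. Below is a small sketch of that matching stage; the transcript is assumed to come from an ASR step not shown, and the related-terms table and function names are our own illustrations, not part of the Alert system.

```python
import re

# Toy expansion table standing in for Alert's content-oriented (topic-level) matching.
RELATED = {
    "ölpreis": {"ölkonzern", "opec", "energieagentur", "rohöl"},
}

def matches(transcript: str, search_terms: list[str]) -> dict[str, list[str]]:
    """Return, per search term, the direct or topically related words found in a transcript."""
    words = set(re.findall(r"\w+", transcript.lower()))
    result = {}
    for term in search_terms:
        t = term.lower()
        hits = ([t] if t in words else []) + sorted(words & RELATED.get(t, set()))
        if hits:
            result[term] = hits
    return result

transcript = "Die OPEC drosselt die Förderung, der Ölkonzern und die Energieagentur warnen."
print(matches(transcript, ["Ölpreis"]))   # matched although "Ölpreis" never occurs
```
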

Languages

  • e 198
  • d 54
  • m 3
  • ru 2
  • f 1

Types

  • a 211
  • el 23
  • m 23
  • s 13
  • x 8
  • p 3
  • d 2
  • b 1
  • pat 1
  • r 1
