Search (152 results, page 1 of 8)

  • × theme_ss:"Computerlinguistik"
  1. Morris, V.: Automated language identification of bibliographic resources (2020) 0.08
    0.07525398 = product of:
      0.11288096 = sum of:
        0.08516933 = weight(_text_:resources in 5749) [ClassicSimilarity], result of:
          0.08516933 = score(doc=5749,freq=4.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.45629224 = fieldWeight in 5749, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.0625 = fieldNorm(doc=5749)
        0.027711634 = product of:
          0.055423267 = sum of:
            0.055423267 = weight(_text_:22 in 5749) [ClassicSimilarity], result of:
              0.055423267 = score(doc=5749,freq=2.0), product of:
                0.17906146 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051133685 = queryNorm
                0.30952093 = fieldWeight in 5749, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5749)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    This article describes experiments in the use of machine learning techniques at the British Library to assign language codes to catalog records, in order to provide information about the language of content of the resources described. In the first phase of the project, language codes were assigned to 1.15 million records with 99.7% confidence. The automated language identification tools developed will be used to contribute to future enhancement of over 4 million legacy records.
    Date
    2. 3.2020 19:04:22
  2. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.07
    0.06799838 = product of:
      0.10199757 = sum of:
        0.08121385 = product of:
          0.24364153 = sum of:
            0.24364153 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.24364153 = score(doc=562,freq=2.0), product of:
                0.43351194 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.051133685 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.020783724 = product of:
          0.04156745 = sum of:
            0.04156745 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.04156745 = score(doc=562,freq=2.0), product of:
                0.17906146 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051133685 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Content
    Vgl.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
  3. Sprachtechnologie für eine dynamische Wirtschaft im Medienzeitalter - Language technologies for dynamic business in the age of the media - L'ingénierie linguistique au service de la dynamisation économique à l'ère du multimédia : Tagungsakten der XXVI. Jahrestagung der Internationalen Vereinigung Sprache und Wirtschaft e.V., 23.-25.11.2000 Fachhochschule Köln (2000) 0.06
    0.061991245 = product of:
      0.09298687 = sum of:
        0.06519419 = weight(_text_:resources in 5527) [ClassicSimilarity], result of:
          0.06519419 = score(doc=5527,freq=6.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.349276 = fieldWeight in 5527, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5527)
        0.027792677 = product of:
          0.055585355 = sum of:
            0.055585355 = weight(_text_:management in 5527) [ClassicSimilarity], result of:
              0.055585355 = score(doc=5527,freq=6.0), product of:
                0.17235184 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.051133685 = queryNorm
                0.32251096 = fieldWeight in 5527, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5527)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Content
    Enthält die Beiträge: WRIGHT, S.E.: Leveraging terminology resources across application boundaries: accessing resources in future integrated environments; PALME, K.: E-Commerce: Verhindert Sprache Business-to-business?; RÜEGGER, R.: Die qualität der virtuellen Information als Wettbewerbsvorteil: Information im Internet ist Sprache - noch; SCHIRMER, K. u. J. HALLER: Zugang zu mehrsprachigen Nachrichten im Internet; WEISS, A. u. W. WIEDEN: Die Herstellung mehrsprachiger Informations- und Wissensressourcen in Unternehmen; FULFORD, H.: Monolingual or multilingual web sites? An exploratory study of UK SMEs; SCHMIDTKE-NIKELLA, M.: Effiziente Hypermediaentwicklung: Die Autorenentlastung durch eine Engine; SCHMIDT, R.: Maschinelle Text-Ton-Synchronisation in Wissenschaft und Wirtschaft; HELBIG, H. u.a.: Natürlichsprachlicher Zugang zu Informationsanbietern im Internet und zu lokalen Datenbanken; SIENEL, J. u.a.: Sprachtechnologien für die Informationsgesellschaft des 21. Jahrhunderts; ERBACH, G.: Sprachdialogsysteme für Telefondienste: Stand der Technik und zukünftige Entwicklungen; SUSEN, A.: Spracherkennung: Akteulle Einsatzmöglichkeiten im Bereich der Telekommunikation; BENZMÜLLER, R.: Logox WebSpeech: die neue Technologie für sprechende Internetseiten; JAARANEN, K. u.a.: Webtran tools for in-company language support; SCHMITZ, K.-D.: Projektforschung und Infrastrukturen im Bereich der Terminologie: Wie kann die Wirtschaft davon profitieren?; SCHRÖTER, F. u. U. MEYER: Entwicklung sprachlicher Handlungskompetenz in englisch mit hilfe eines Multimedia-Sprachlernsystems; KLEIN, A.: Der Einsatz von Sprachverarbeitungstools beim Sprachenlernen im Intranet; HAUER, M.: Knowledge Management braucht Terminologie Management; HEYER, G. u.a.: Texttechnologische Anwendungen am Beispiel Text Mining
    Theme
    Information Resources Management
  4. Haas, S.W.: Natural language processing : toward large-scale, robust systems (1996) 0.06
    0.058623634 = product of:
      0.08793545 = sum of:
        0.06022381 = weight(_text_:resources in 7415) [ClassicSimilarity], result of:
          0.06022381 = score(doc=7415,freq=2.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.32264733 = fieldWeight in 7415, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.0625 = fieldNorm(doc=7415)
        0.027711634 = product of:
          0.055423267 = sum of:
            0.055423267 = weight(_text_:22 in 7415) [ClassicSimilarity], result of:
              0.055423267 = score(doc=7415,freq=2.0), product of:
                0.17906146 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051133685 = queryNorm
                0.30952093 = fieldWeight in 7415, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7415)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    State of the art review of natural language processing updating an earlier review published in ARIST 22(1987). Discusses important developments that have allowed for significant advances in the field of natural language processing: materials and resources; knowledge based systems and statistical approaches; and a strong emphasis on evaluation. Reviews some natural language processing applications and common problems still awaiting solution. Considers closely related applications such as language generation and th egeneration phase of machine translation which face the same problems as natural language processing. Covers natural language methodologies for information retrieval only briefly
  5. Dorr, B.J.: Large-scale dictionary construction for foreign language tutoring and interlingual machine translation (1997) 0.06
    0.05644048 = product of:
      0.08466072 = sum of:
        0.063876994 = weight(_text_:resources in 3244) [ClassicSimilarity], result of:
          0.063876994 = score(doc=3244,freq=4.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.34221917 = fieldWeight in 3244, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.046875 = fieldNorm(doc=3244)
        0.020783724 = product of:
          0.04156745 = sum of:
            0.04156745 = weight(_text_:22 in 3244) [ClassicSimilarity], result of:
              0.04156745 = score(doc=3244,freq=2.0), product of:
                0.17906146 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051133685 = queryNorm
                0.23214069 = fieldWeight in 3244, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3244)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Describes techniques for automatic construction of dictionaries for use in large-scale foreign language tutoring (FLT) and interlingual machine translation (MT) systems. The dictionaries are based on a language independent representation called lexical conceptual structure (LCS). Demonstrates that synonymous verb senses share distribution patterns. Shows how the syntax-semantics relation can be used to develop a lexical acquisition approach that contributes both toward the enrichment of existing online resources and toward the development of lexicons containing more complete information than is provided in any of these resources alone. Describes the structure of the LCS and shows how this representation is used in FLT and MT. Focuses on the problem of building LCS dictionaries for large-scale FLT and MT. Describes authoring tools for manual and semi-automatic construction of LCS dictionaries. Presents an approach that uses linguistic techniques for building word definitions automatically. The techniques have been implemented as part of a set of lixicon-development tools used in the MILT FLT project
    Date
    31. 7.1996 9:22:19
  6. Liddy, E.D.: Natural language processing for information retrieval and knowledge discovery (1998) 0.05
    0.05129568 = product of:
      0.07694352 = sum of:
        0.052695833 = weight(_text_:resources in 2345) [ClassicSimilarity], result of:
          0.052695833 = score(doc=2345,freq=2.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.28231642 = fieldWeight in 2345, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2345)
        0.02424768 = product of:
          0.04849536 = sum of:
            0.04849536 = weight(_text_:22 in 2345) [ClassicSimilarity], result of:
              0.04849536 = score(doc=2345,freq=2.0), product of:
                0.17906146 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051133685 = queryNorm
                0.2708308 = fieldWeight in 2345, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2345)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Date
    22. 9.1997 19:16:05
    Source
    Visualizing subject access for 21st century information resources: Papers presented at the 1997 Clinic on Library Applications of Data Processing, 2-4 Mar 1997, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Ed.: P.A. Cochrane et al
  7. Carter-Sigglow, J.: ¬Die Rolle der Sprache bei der Informationsvermittlung (2001) 0.05
    0.048266005 = product of:
      0.072399005 = sum of:
        0.045167856 = weight(_text_:resources in 5882) [ClassicSimilarity], result of:
          0.045167856 = score(doc=5882,freq=2.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.2419855 = fieldWeight in 5882, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.046875 = fieldNorm(doc=5882)
        0.027231153 = product of:
          0.054462306 = sum of:
            0.054462306 = weight(_text_:management in 5882) [ClassicSimilarity], result of:
              0.054462306 = score(doc=5882,freq=4.0), product of:
                0.17235184 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.051133685 = queryNorm
                0.31599492 = fieldWeight in 5882, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5882)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Source
    Information Research & Content Management: Orientierung, Ordnung und Organisation im Wissensmarkt; 23. DGI-Online-Tagung der DGI und 53. Jahrestagung der Deutschen Gesellschaft für Informationswissenschaft und Informationspraxis e.V. DGI, Frankfurt am Main, 8.-10.5.2001. Proceedings. Hrsg.: R. Schmidt
    Theme
    Information Resources Management
  8. Jaaranen, K.; Lehtola, A.; Tenni, J.; Bounsaythip, C.: Webtran tools for in-company language support (2000) 0.04
    0.042948794 = product of:
      0.06442319 = sum of:
        0.045167856 = weight(_text_:resources in 5553) [ClassicSimilarity], result of:
          0.045167856 = score(doc=5553,freq=2.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.2419855 = fieldWeight in 5553, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.046875 = fieldNorm(doc=5553)
        0.01925533 = product of:
          0.03851066 = sum of:
            0.03851066 = weight(_text_:management in 5553) [ClassicSimilarity], result of:
              0.03851066 = score(doc=5553,freq=2.0), product of:
                0.17235184 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.051133685 = queryNorm
                0.22344214 = fieldWeight in 5553, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5553)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Theme
    Information Resources Management
  9. Santana Suárez, O.; Carreras Riudavets, F.J.; Hernández Figueroa, Z.; González Cabrera, A.C.: Integration of an XML electronic dictionary with linguistic tools for natural language processing (2007) 0.04
    0.042948794 = product of:
      0.06442319 = sum of:
        0.045167856 = weight(_text_:resources in 921) [ClassicSimilarity], result of:
          0.045167856 = score(doc=921,freq=2.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.2419855 = fieldWeight in 921, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.046875 = fieldNorm(doc=921)
        0.01925533 = product of:
          0.03851066 = sum of:
            0.03851066 = weight(_text_:management in 921) [ClassicSimilarity], result of:
              0.03851066 = score(doc=921,freq=2.0), product of:
                0.17235184 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.051133685 = queryNorm
                0.22344214 = fieldWeight in 921, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046875 = fieldNorm(doc=921)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    This study proposes the codification of lexical information in electronic dictionaries, in accordance with a generic and extendable XML scheme model, and its conjunction with linguistic tools for the processing of natural language. Our approach is different from other similar studies in that we propose XML coding of those items from a dictionary of meanings that are less related to the lexical units. Linguistic information, such as morphology, syllables, phonology, etc., will be included by means of specific linguistic tools. The use of XML as a container for the information allows the use of other XML tools for carrying out searches or for enabling presentation of the information in different resources. This model is particularly important as it combines two parallel paradigms-extendable labelling of documents and computational linguistics-and it is also applicable to other languages. We have included a comparison with the labelling proposal of printed dictionaries carried out by the Text Encoding Initiative (TEI). The proposed design has been validated with a dictionary of more than 145 000 accepted meanings.
    Source
    Information processing and management. 43(2007) no.4, S.946-957
  10. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.04
    0.041422494 = product of:
      0.062133737 = sum of:
        0.037639882 = weight(_text_:resources in 2541) [ClassicSimilarity], result of:
          0.037639882 = score(doc=2541,freq=2.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.20165458 = fieldWeight in 2541, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2541)
        0.024493856 = product of:
          0.048987713 = sum of:
            0.048987713 = weight(_text_:22 in 2541) [ClassicSimilarity], result of:
              0.048987713 = score(doc=2541,freq=4.0), product of:
                0.17906146 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051133685 = queryNorm
                0.27358043 = fieldWeight in 2541, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2541)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET . Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language Systems (UMLS) . The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
  11. Mustafa el Hadi, W.: Human language technology and its role in information access and management (2003) 0.04
    0.040221673 = product of:
      0.060332507 = sum of:
        0.037639882 = weight(_text_:resources in 5524) [ClassicSimilarity], result of:
          0.037639882 = score(doc=5524,freq=2.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.20165458 = fieldWeight in 5524, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5524)
        0.022692626 = product of:
          0.045385253 = sum of:
            0.045385253 = weight(_text_:management in 5524) [ClassicSimilarity], result of:
              0.045385253 = score(doc=5524,freq=4.0), product of:
                0.17235184 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.051133685 = queryNorm
                0.2633291 = fieldWeight in 5524, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5524)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The role of linguistics in information access, extraction and dissemination is essential. Radical changes in the techniques of information and communication at the end of the twentieth century have had a significant effect on the function of the linguistic paradigm and its applications in all forms of communication. The introduction of new technical means have deeply changed the possibilities for the distribution of information. In this situation, what is the role of the linguistic paradigm and its practical applications, i.e., natural language processing (NLP) techniques when applied to information access? What solutions can linguistics offer in human computer interaction, extraction and management? Many fields show the relevance of the linguistic paradigm through the various technologies that require NLP, such as document and message understanding, information detection, extraction, and retrieval, question and answer, cross-language information retrieval (CLIR), text summarization, filtering, and spoken document retrieval. This paper focuses on the central role of human language technologies in the information society, surveys the current situation, describes the benefits of the above mentioned applications, outlines successes and challenges, and discusses solutions. It reviews the resources and means needed to advance information access and dissemination across language boundaries in the twenty-first century. Multilingualism, which is a natural result of globalization, requires more effort in the direction of language technology. The scope of human language technology (HLT) is large, so we limit our review to applications that involve multilinguality.
  12. Computational linguistics for the new millennium : divergence or synergy? Proceedings of the International Symposium held at the Ruprecht-Karls Universität Heidelberg, 21-22 July 2000. Festschrift in honour of Peter Hellwig on the occasion of his 60th birthday (2002) 0.04
    0.036639772 = product of:
      0.054959655 = sum of:
        0.037639882 = weight(_text_:resources in 4900) [ClassicSimilarity], result of:
          0.037639882 = score(doc=4900,freq=2.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.20165458 = fieldWeight in 4900, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4900)
        0.017319772 = product of:
          0.034639545 = sum of:
            0.034639545 = weight(_text_:22 in 4900) [ClassicSimilarity], result of:
              0.034639545 = score(doc=4900,freq=2.0), product of:
                0.17906146 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051133685 = queryNorm
                0.19345059 = fieldWeight in 4900, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4900)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Content
    Contents: Manfred Klenner / Henriette Visser: Introduction - Khurshid Ahmad: Writing Linguistics: When I use a word it means what I choose it to mean - Jürgen Handke: 2000 and Beyond: The Potential of New Technologies in Linguistics - Jurij Apresjan / Igor Boguslavsky / Leonid Iomdin / Leonid Tsinman: Lexical Functions in NU: Possible Uses - Hubert Lehmann: Practical Machine Translation and Linguistic Theory - Karin Haenelt: A Contextbased Approach towards Content Processing of Electronic Documents - Petr Sgall / Eva Hajicová: Are Linguistic Frameworks Comparable? - Wolfgang Menzel: Theory and Applications in Computational Linguistics - Is there Common Ground? - Robert Porzel / Michael Strube: Towards Context-adaptive Natural Language Processing Systems - Nicoletta Calzolari: Language Resources in a Multilingual Setting: The European Perspective - Piek Vossen: Computational Linguistics for Theory and Practice.
  13. Mustafa El Hadi, W.: Terminologies, ontologies and information access (2006) 0.03
    0.028389778 = product of:
      0.08516933 = sum of:
        0.08516933 = weight(_text_:resources in 1488) [ClassicSimilarity], result of:
          0.08516933 = score(doc=1488,freq=4.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.45629224 = fieldWeight in 1488, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.0625 = fieldNorm(doc=1488)
      0.33333334 = coord(1/3)
    
    Abstract
    Ontologies have become an important issue in research communities across several disciplines. This paper discusses some of the innovative techniques involving automatic terminology resources acquisition are briefly discussed. Suggests that NLP-based ontologies are useful in reducing the cost of ontology engineering. Emphasizes that linguistic ontologies covering both ontological and lexical information can offer solutions since they can be more easily updated by the resources of NLP products.
  14. Ballesteros, L.A.: Cross-language retrieval via transitive relation (2000) 0.03
    0.028055113 = product of:
      0.084165335 = sum of:
        0.084165335 = weight(_text_:resources in 30) [ClassicSimilarity], result of:
          0.084165335 = score(doc=30,freq=10.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.45091337 = fieldWeight in 30, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.0390625 = fieldNorm(doc=30)
      0.33333334 = coord(1/3)
    
    Abstract
    The growth in availability of multi-lingual data in all areas of the public and private sector is driving an increasing need for systems that facilitate access to multi-lingual resources. Cross-language Retrieval (CLR) technology is a means of addressing this need. A CLR system must address two main hurdles to effective cross-language retrieval. First, it must address the ambiguity that arises when trying to map the meaning of text across languages. That is, it must address both within-language ambiguity and cross-language ambiguity. Second, it has to incorporate multilingual resources that will enable it to perform the mapping across languages. The difficulty here is that there is a limited number of lexical resources and virtually none for some pairs of languages. This work focuses on a dictionary approach to addressing the problem of limited lexical resources. A dictionary approach is taken since bilingual dictionaries are more prevalent and simpler to apply than other resources. We show that a transitive translation approach, where a third language is employed as an interlingua between the source and target languages, is a viable means of performing CLR between languages for which no bilingual dictionary is available
  15. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.03
    0.027071282 = product of:
      0.08121385 = sum of:
        0.08121385 = product of:
          0.24364153 = sum of:
            0.24364153 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.24364153 = score(doc=862,freq=2.0), product of:
                0.43351194 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.051133685 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    https%3A%2F%2Farxiv.org%2Fabs%2F2212.06721&usg=AOvVaw3i_9pZm9y_dQWoHi6uv0EN
  16. Vlachidis, A.; Binding, C.; Tudhope, D.; May, K.: Excavating grey literature : a case study on the rich indexing of archaeological documents via natural language-processing techniques and knowledge-based resources (2010) 0.03
    0.026556207 = product of:
      0.07966862 = sum of:
        0.07966862 = weight(_text_:resources in 3948) [ClassicSimilarity], result of:
          0.07966862 = score(doc=3948,freq=14.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.42682233 = fieldWeight in 3948, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.03125 = fieldNorm(doc=3948)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose - This paper sets out to discuss the use of information extraction (IE), a natural language-processing (NLP) technique to assist "rich" semantic indexing of diverse archaeological text resources. The focus of the research is to direct a semantic-aware "rich" indexing of diverse natural language resources with properties capable of satisfying information retrieval from online publications and datasets associated with the Semantic Technologies for Archaeological Resources (STAR) project. Design/methodology/approach - The paper proposes use of the English Heritage extension (CRM-EH) of the standard core ontology in cultural heritage, CIDOC CRM, and exploitation of domain thesauri resources for driving and enhancing an Ontology-Oriented Information Extraction process. The process of semantic indexing is based on a rule-based Information Extraction technique, which is facilitated by the General Architecture of Text Engineering (GATE) toolkit and expressed by Java Annotation Pattern Engine (JAPE) rules. Findings - Initial results suggest that the combination of information extraction with knowledge resources and standard conceptual models is capable of supporting semantic-aware term indexing. Additional efforts are required for further exploitation of the technique and adoption of formal evaluation methods for assessing the performance of the method in measurable terms. Originality/value - The value of the paper lies in the semantic indexing of 535 unpublished online documents often referred to as "Grey Literature", from the Archaeological Data Service OASIS corpus (Online AccesS to the Index of archaeological investigationS), with respect to the CRM ontological concepts E49.Time Appellation and P19.Physical Object.
    Footnote
    Beitrag in einem Special Issue: Content architecture: exploiting and managing diverse resources: proceedings of the first national conference of the United Kingdom chapter of the International Society for Knowedge Organization (ISKO)
  17. Cimiano, P.; Völker, J.; Studer, R.: Ontologies on demand? : a description of the state-of-the-art, applications, challenges and trends for ontology learning from text (2006) 0.03
    0.026077677 = product of:
      0.078233026 = sum of:
        0.078233026 = weight(_text_:resources in 6014) [ClassicSimilarity], result of:
          0.078233026 = score(doc=6014,freq=6.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.4191312 = fieldWeight in 6014, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.046875 = fieldNorm(doc=6014)
      0.33333334 = coord(1/3)
    
    Abstract
    Ontologies are nowadays used for many applications requiring data, services and resources in general to be interoperable and machine understandable. Such applications are for example web service discovery and composition, information integration across databases, intelligent search, etc. The general idea is that data and services are semantically described with respect to ontologies, which are formal specifications of a domain of interest, and can thus be shared and reused in a way such that the shared meaning specified by the ontology remains formally the same across different parties and applications. As the cost of creating ontologies is relatively high, different proposals have emerged for learning ontologies from structured and unstructured resources. In this article we examine the maturity of techniques for ontology learning from textual resources, addressing the question whether the state-of-the-art is mature enough to produce ontologies 'on demand'.
  18. Kocijan, K.: Visualizing natural language resources (2015) 0.03
    0.025093256 = product of:
      0.075279765 = sum of:
        0.075279765 = weight(_text_:resources in 2995) [ClassicSimilarity], result of:
          0.075279765 = score(doc=2995,freq=2.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.40330917 = fieldWeight in 2995, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.078125 = fieldNorm(doc=2995)
      0.33333334 = coord(1/3)
    
  19. Wright, S.E.: Leveraging terminology resources across application boundaries : accessing resources in future integrated environments (2000) 0.02
    0.021731397 = product of:
      0.06519419 = sum of:
        0.06519419 = weight(_text_:resources in 5528) [ClassicSimilarity], result of:
          0.06519419 = score(doc=5528,freq=6.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.349276 = fieldWeight in 5528, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5528)
      0.33333334 = coord(1/3)
    
    Abstract
    The title for this conference, stated in English, is Language Technology for a Dynamic Economy - y in the Media Age - The question arises as to what the media are we are dealing with and to what extent we are moving away from tile reality of different media to a world in which all sub-categories flow together into a unified stream of information that is constantly resealed to appear in different hardware configurations. A few years ago, people who were interested in sharing data or getting different electronic "boxes" to talk to each other were focused on two major aspects: I ) developing data conversion technology, and 2) convincing potential users that sharing information was an even remotely interesting option. Although some content "owners" are still reticent about releasing their data, it has become dramatically apparent in the Web environment that a broad range of users does indeed want this technology. Even as researchers struggle with the remaining technical, legal, and ethical impediments that stand in the way of unlimited information access to existing multi-platform resources, the future view of the world will no longer be as obsessed with conversion capability as it will be with creating content, with ,in eye to morphing technologies that will enable the delivery of that content from ail open-standards-based format such as XML (eXtensibic Markup Language), MPEG (Moving Picture Experts Group), or WAP (Wireless Application Protocol) to a rich variety of display Options
  20. Shaalan, K.; Raza, H.: NERA: Named Entity Recognition for Arabic (2009) 0.02
    0.021731397 = product of:
      0.06519419 = sum of:
        0.06519419 = weight(_text_:resources in 2953) [ClassicSimilarity], result of:
          0.06519419 = score(doc=2953,freq=6.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.349276 = fieldWeight in 2953, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2953)
      0.33333334 = coord(1/3)
    
    Abstract
    Name identification has been worked on quite intensively for the past few years, and has been incorporated into several products revolving around natural language processing tasks. Many researchers have attacked the name identification problem in a variety of languages, but only a few limited research efforts have focused on named entity recognition for Arabic script. This is due to the lack of resources for Arabic named entities and the limited amount of progress made in Arabic natural language processing in general. In this article, we present the results of our attempt at the recognition and extraction of the 10 most important categories of named entities in Arabic script: the person name, location, company, date, time, price, measurement, phone number, ISBN, and file name. We developed the system Named Entity Recognition for Arabic (NERA) using a rule-based approach. The resources created are: a Whitelist representing a dictionary of names, and a grammar, in the form of regular expressions, which are responsible for recognizing the named entities. A filtration mechanism is used that serves two different purposes: (a) revision of the results from a named entity extractor by using metadata, in terms of a Blacklist or rejecter, about ill-formed named entities and (b) disambiguation of identical or overlapping textual matches returned by different name entity extractors to get the correct choice. In NERA, we addressed major challenges posed by NER in the Arabic language arising due to the complexity of the language, peculiarities in the Arabic orthographic system, nonstandardization of the written text, ambiguity, and lack of resources. NERA has been effectively evaluated using our own tagged corpus; it achieved satisfactory results in terms of precision, recall, and F-measure.

Years

Languages

  • e 128
  • d 22
  • m 1
  • More… Less…

Types

  • a 131
  • m 11
  • s 7
  • el 6
  • x 4
  • p 2
  • d 1
  • More… Less…