Search (187 results, page 1 of 10)

Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.26

0.25871408 = product of:
  0.46568534 = sum of:
    0.062223002 = product of:
      0.186669 = sum of:
        0.186669 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
          0.186669 = score(doc=562,freq=2.0), product of:
            0.3321406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03917671 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.33333334 = coord(1/3)
    0.186669 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
      0.186669 = score(doc=562,freq=2.0), product of:
        0.3321406 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03917671 = queryNorm
        0.56201804 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.014200641 = weight(_text_:of in 562) [ClassicSimilarity], result of:
      0.014200641 = score(doc=562,freq=10.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.23179851 = fieldWeight in 562, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.186669 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
      0.186669 = score(doc=562,freq=2.0), product of:
        0.3321406 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03917671 = queryNorm
        0.56201804 = fieldWeight in 562, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.046875 = fieldNorm(doc=562)
    0.015923709 = product of:
      0.031847417 = sum of:
        0.031847417 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
          0.031847417 = score(doc=562,freq=2.0), product of:
            0.13719016 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03917671 = queryNorm
            0.23214069 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
      0.5 = coord(1/2)
  0.5555556 = coord(5/9)
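
The leaf entries in the explanation above follow the arithmetic of Lucene's ClassicSimilarity: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), queryWeight is idf × queryNorm, and fieldWeight is tf × idf × fieldNorm. The following is a minimal Python sketch of that arithmetic, not the search engine's own code. The function names are hypothetical, and queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Score of one term in one document: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                      # idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the listing for doc 562 (queryNorm and fieldNorm taken as given).
score_3a = term_score(freq=2.0, doc_freq=24, max_docs=44218,
                      query_norm=0.03917671, field_norm=0.046875)
score_of = term_score(freq=10.0, doc_freq=25162, max_docs=44218,
                      query_norm=0.03917671, field_norm=0.046875)

# These should agree with the listed 0.186669 and 0.014200641 up to
# floating-point rounding (the listing uses 32-bit floats).
print(score_3a, score_of)
```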

Abstract: Document representations for text classification are typically based on the classical Bag-Of-Words paradigm. This approach comes with deficiencies that motivate the integration of features on a higher semantic level than single words. In this paper we propose an enhancement of the classical document representation through concepts extracted from background knowledge. Boosting is used for actual classification. Experimental evaluations on two well known text corpora support our approach through consistent improvement of the results.
Content: Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
Date: 8. 1.2013 10:22:32
Source: Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), 1-4 November 2004, Brighton, UK

Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.06

0.06319208 = product of:
  0.14218217 = sum of:
    0.08389453 = weight(_text_:applications in 2541) [ClassicSimilarity], result of:
      0.08389453 = score(doc=2541,freq=8.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.4864132 = fieldWeight in 2541, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2541)
    0.019081537 = weight(_text_:of in 2541) [ClassicSimilarity], result of:
      0.019081537 = score(doc=2541,freq=26.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.31146988 = fieldWeight in 2541, product of:
          5.0990195 = tf(freq=26.0), with freq of:
            26.0 = termFreq=26.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2541)
    0.020439833 = weight(_text_:systems in 2541) [ClassicSimilarity], result of:
      0.020439833 = score(doc=2541,freq=2.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.1697705 = fieldWeight in 2541, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2541)
    0.018766273 = product of:
      0.037532546 = sum of:
        0.037532546 = weight(_text_:22 in 2541) [ClassicSimilarity], result of:
          0.037532546 = score(doc=2541,freq=4.0), product of:
            0.13719016 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03917671 = queryNorm
            0.27358043 = fieldWeight in 2541, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2541)
      0.5 = coord(1/2)
  0.44444445 = coord(4/9)
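
The top-level score in the explanation above combines the per-clause scores: they are summed and then multiplied by a coordination factor coord(matched/total), here 4 of 9 query clauses. The nested `22` clause is scaled by its own coord(1/2) first. A small Python sketch of that step follows, with the clause scores copied from the listing and hypothetical names.

```python
def coord(matched: int, total: int) -> float:
    """Fraction of query clauses that the document matched."""
    return matched / total


# Nested clause for the term "22": its inner sum is scaled by coord(1/2).
clause_22 = 0.037532546 * coord(1, 2)

# Clause scores for doc 2541, copied from the listing:
# "applications", "of", "systems", and the nested "22" clause.
clause_scores = [0.08389453, 0.019081537, 0.020439833, clause_22]

# Final document score: sum of the clause scores times coord(4/9).
doc_score = sum(clause_scores) * coord(4, 9)

# Should agree with the listed 0.06319208 up to floating-point rounding.
print(doc_score)
```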

Abstract: The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language Systems (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
Date: 14. 8.2004 17:22:56
Source: Online. 28(2004) no.3, pp. 22-29

Computational linguistics for the new millennium : divergence or synergy? Proceedings of the International Symposium held at the Ruprecht-Karls Universität Heidelberg, 21-22 July 2000. Festschrift in honour of Peter Hellwig on the occasion of his 60th birthday (2002) 0.04

0.04177325 = product of:
  0.093989804 = sum of:
    0.041947264 = weight(_text_:applications in 4900) [ClassicSimilarity], result of:
      0.041947264 = score(doc=4900,freq=2.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.2432066 = fieldWeight in 4900, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4900)
    0.018332949 = weight(_text_:of in 4900) [ClassicSimilarity], result of:
      0.018332949 = score(doc=4900,freq=24.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.2992506 = fieldWeight in 4900, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4900)
    0.020439833 = weight(_text_:systems in 4900) [ClassicSimilarity], result of:
      0.020439833 = score(doc=4900,freq=2.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.1697705 = fieldWeight in 4900, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4900)
    0.013269759 = product of:
      0.026539518 = sum of:
        0.026539518 = weight(_text_:22 in 4900) [ClassicSimilarity], result of:
          0.026539518 = score(doc=4900,freq=2.0), product of:
            0.13719016 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03917671 = queryNorm
            0.19345059 = fieldWeight in 4900, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4900)
      0.5 = coord(1/2)
  0.44444445 = coord(4/9)

Abstract: The two seemingly conflicting tendencies, synergy and divergence, are both fundamental to the advancement of any science. Their interplay defines the demarcation line between application-oriented and theoretical research. The papers in this festschrift in honour of Peter Hellwig are geared to answer questions that arise from this insight: where does the discipline of Computational Linguistics currently stand, what has been achieved so far and what should be done next. Given the complexity of such questions, no simple answers can be expected. However, each of the practitioners and researchers are contributing from their very own perspective a piece of insight into the overall picture of today's and tomorrow's computational linguistics.
Content: Contents: Manfred Klenner / Henriette Visser: Introduction - Khurshid Ahmad: Writing Linguistics: When I use a word it means what I choose it to mean - Jürgen Handke: 2000 and Beyond: The Potential of New Technologies in Linguistics - Jurij Apresjan / Igor Boguslavsky / Leonid Iomdin / Leonid Tsinman: Lexical Functions in NLP: Possible Uses - Hubert Lehmann: Practical Machine Translation and Linguistic Theory - Karin Haenelt: A Contextbased Approach towards Content Processing of Electronic Documents - Petr Sgall / Eva Hajicová: Are Linguistic Frameworks Comparable? - Wolfgang Menzel: Theory and Applications in Computational Linguistics - Is there Common Ground? - Robert Porzel / Michael Strube: Towards Context-adaptive Natural Language Processing Systems - Nicoletta Calzolari: Language Resources in a Multilingual Setting: The European Perspective - Piek Vossen: Computational Linguistics for Theory and Practice.

Working with conceptual structures : contributions to ICCS 2000. 8th International Conference on Conceptual Structures: Logical, Linguistic, and Computational Issues. Darmstadt, August 14-18, 2000 (2000) 0.04
```
0.039856274 = product of:
  0.08967662 = sum of:
    0.029363085 = weight(_text_:applications in 5089) [ClassicSimilarity], result of:
      0.029363085 = score(doc=5089,freq=2.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.17024462 = fieldWeight in 5089, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.02734375 = fieldNorm(doc=5089)
    0.012286724 = weight(_text_:of in 5089) [ClassicSimilarity], result of:
      0.012286724 = score(doc=5089,freq=22.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.20055744 = fieldWeight in 5089, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.02734375 = fieldNorm(doc=5089)
    0.014307884 = weight(_text_:systems in 5089) [ClassicSimilarity], result of:
      0.014307884 = score(doc=5089,freq=2.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.118839346 = fieldWeight in 5089, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.02734375 = fieldNorm(doc=5089)
    0.033718925 = weight(_text_:software in 5089) [ClassicSimilarity], result of:
      0.033718925 = score(doc=5089,freq=4.0), product of:
        0.15541996 = queryWeight, product of:
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.03917671 = queryNorm
        0.21695362 = fieldWeight in 5089, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.9671519 = idf(docFreq=2274, maxDocs=44218)
          0.02734375 = fieldNorm(doc=5089)
  0.44444445 = coord(4/9)
```
Abstract

The 8th International Conference on Conceptual Structures - Logical, Linguistic, and Computational Issues (ICCS 2000) brings together a wide range of researchers and practitioners working with conceptual structures. During the last few years, the ICCS conference series has considerably widened its scope on different kinds of conceptual structures, stimulating research across domain boundaries. We hope that this stimulation is further enhanced by ICCS 2000 joining the long tradition of conferences in Darmstadt with extensive, lively discussions. This volume consists of contributions presented at ICCS 2000, complementing the volume "Conceptual Structures: Logical, Linguistic, and Computational Issues" (B. Ganter, G.W. Mineau (Eds.), LNAI 1867, Springer, Berlin-Heidelberg 2000). It contains submissions reviewed by the program committee, and position papers. We wish to express our appreciation to all the authors of submitted papers, to the general chair, the program chair, the editorial board, the program committee, and to the additional reviewers for making ICCS 2000 a valuable contribution in the knowledge processing research field. Special thanks go to the local organizers for making the conference an enjoyable and inspiring event. We are grateful to Darmstadt University of Technology, the Ernst Schröder Center for Conceptual Knowledge Processing, the Center for Interdisciplinary Studies in Technology, the Deutsche Forschungsgemeinschaft, Land Hessen, and NaviCon GmbH for their generous support.

Content

Concepts & Language: Knowledge organization by procedures of natural language processing. A case study using the method GABEK (J. Zelger, J. Gadner) - Computer aided narrative analysis using conceptual graphs (H. Schärfe, P. Øhrstrøm) - Pragmatic representation of argumentative text: a challenge for the conceptual graph approach (H. Irandoust, B. Moulin) - Conceptual graphs as a knowledge representation core in a complex language learning environment (G. Angelova, A. Nenkova, S. Boycheva, T. Nikolov) - Conceptual Modeling and Ontologies: Relationships and actions in conceptual categories (Ch. Landauer, K.L. Bellman) - Concept approximations for formal concept analysis (J. Saquer, J.S. Deogun) - Faceted information representation (U. Priß) - Simple concept graphs with universal quantifiers (J. Tappe) - A framework for comparing methods for using or reusing multiple ontologies in an application (J. van Zyl, D. Corbett) - Designing task/method knowledge-based systems with conceptual graphs (M. Leclère, F. Trichet, Ch. Choquet) - A logical ontology (J. Farkas, J. Sarbo) - Algorithms and Tools: Fast concept analysis (Ch. Lindig) - A framework for conceptual graph unification (D. Corbett) - Visual CP representation of knowledge (H.D. Pfeiffer, R.T. Hartley) - Maximal isojoin for representing software textual specifications and detecting semantic anomalies (Th. Charnois) - Troika: using grids, lattices and graphs in knowledge acquisition (H.S. Delugach, B.E. Lampkin) - Open world theorem prover for conceptual graphs (J.E. Heaton, P. Kocura) - NetCare: a practical conceptual graphs software tool (S. Polovina, D. Strang) - CGWorld - a web based workbench for conceptual graphs management and applications (P. Dobrev, K. Toutanova) - Position papers: The edition project: Peirce's existential graphs (R. Müller) - Mining association rules using formal concept analysis (N. Pasquier) - Contextual logic summary (R. Wille) - Information channels and conceptual scaling (K.E. Wolff) - Spatial concepts - a rule exploration (S. Rudolph) - The TEXT-TO-ONTO learning environment (A. Mädche, St. Staab) - Controlling the semantics of metadata on audio-visual documents using ontologies (Th. Dechilly, B. Bachimont) - Building the ontological foundations of a terminology from natural language to conceptual graphs with Ribosome, a knowledge extraction system (Ch. Jacquelinet, A. Burgun) - CharGer: some lessons learned and new directions (H.S. Delugach) - Knowledge management using conceptual graphs (W.K. Pun)

Chowdhury, G.G.: Natural language processing (2002) 0.03

0.034328938 = product of:
  0.10298681 = sum of:
    0.050336715 = weight(_text_:applications in 4284) [ClassicSimilarity], result of:
      0.050336715 = score(doc=4284,freq=2.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.2918479 = fieldWeight in 4284, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.046875 = fieldNorm(doc=4284)
    0.017962547 = weight(_text_:of in 4284) [ClassicSimilarity], result of:
      0.017962547 = score(doc=4284,freq=16.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.2932045 = fieldWeight in 4284, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=4284)
    0.034687545 = weight(_text_:systems in 4284) [ClassicSimilarity], result of:
      0.034687545 = score(doc=4284,freq=4.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.28811008 = fieldWeight in 4284, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.046875 = fieldNorm(doc=4284)
  0.33333334 = coord(3/9)

Abstract: Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things. NLP researchers aim to gather knowledge on how human beings understand and use language so that appropriate tools and techniques can be developed to make computer systems understand and manipulate natural languages to perform desired tasks. The foundations of NLP lie in a number of disciplines, namely, computer and information sciences, linguistics, mathematics, electrical and electronic engineering, artificial intelligence and robotics, and psychology. Applications of NLP include a number of fields of study, such as machine translation, natural language text processing and summarization, user interfaces, multilingual and cross-language information retrieval (CLIR), speech recognition, artificial intelligence, and expert systems. One important application area that is relatively new and has not been covered in previous ARIST chapters on NLP relates to the proliferation of the World Wide Web and digital libraries.
Source: Annual review of information science and technology. 37(2003), pp. 51-90

Humphreys, K.; Demetriou, G.; Gaizauskas, R.: Bioinformatics applications of information extraction from scientific journal articles (2000) 0.03

0.030757478 = product of:
  0.13840865 = sum of:
    0.11745234 = weight(_text_:applications in 4545) [ClassicSimilarity], result of:
      0.11745234 = score(doc=4545,freq=2.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.6809785 = fieldWeight in 4545, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.109375 = fieldNorm(doc=4545)
    0.020956306 = weight(_text_:of in 4545) [ClassicSimilarity], result of:
      0.020956306 = score(doc=4545,freq=4.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.34207192 = fieldWeight in 4545, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.109375 = fieldNorm(doc=4545)
  0.22222222 = coord(2/9)

Source: Journal of information science. 26(2000) no.2, pp. 75-85

Stede, M.: Lexicalization in natural language generation (2002) 0.03
```
0.02637424 = product of:
  0.079122715 = sum of:
    0.041947264 = weight(_text_:applications in 4245) [ClassicSimilarity], result of:
      0.041947264 = score(doc=4245,freq=2.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.2432066 = fieldWeight in 4245, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4245)
    0.016735615 = weight(_text_:of in 4245) [ClassicSimilarity], result of:
      0.016735615 = score(doc=4245,freq=20.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.27317715 = fieldWeight in 4245, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4245)
    0.020439833 = weight(_text_:systems in 4245) [ClassicSimilarity], result of:
      0.020439833 = score(doc=4245,freq=2.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.1697705 = fieldWeight in 4245, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4245)
  0.33333334 = coord(3/9)
```
Abstract

Natural language generation (NLG), the automatic production of text by computers, is commonly seen as a process consisting of the following distinct phases: Obviously, choosing words is a central aspect of generating language. In which of these phases it should take place is not entirely clear, however. The decision depends on various factors: what exactly is seen as an individual lexical item; how the relation between word meaning and background knowledge (concepts) is defined; how one accounts for the interactions between individual lexical choices in the same sentence; what criteria are employed for choosing between similar words; whether or not output is required in one or more languages. This article surveys these issues and the answers that have been proposed in NLG research. For many applications of natural language processing, large scale lexical resources have become available in recent years, such as the WordNet database. In language generation, however, generic lexicons are not in use yet; rather, almost every generation project develops its own format for lexical representations. The reason is that the entries of a generation lexicon need their specific interfaces to the input representations processed by the generator; lexical semantics in an NLG lexicon needs to be tailored to the input. On the other hand, the large lexicons used for language analysis typically have only very limited semantic information at all. Yet the syntactic behavior of words remains the same regardless of the particular application; thus, it should be possible to build at least parts of generic NLG lexical entries automatically, which could then be used by different systems.

Source

Encyclopedia of library and information science. Vol.70, [=Suppl.33]

Mustafa El Hadi, W.: Evaluating human language technology : general applications to information access and management (2002) 0.03

0.02636355 = product of:
  0.118635975 = sum of:
    0.10067343 = weight(_text_:applications in 1840) [ClassicSimilarity], result of:
      0.10067343 = score(doc=1840,freq=2.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.5836958 = fieldWeight in 1840, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.09375 = fieldNorm(doc=1840)
    0.017962547 = weight(_text_:of in 1840) [ClassicSimilarity], result of:
      0.017962547 = score(doc=1840,freq=4.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.2932045 = fieldWeight in 1840, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.09375 = fieldNorm(doc=1840)
  0.22222222 = coord(2/9)

Footnote: Guest editorial to a special issue of Knowledge Organization on "Evaluation of HLT"

Cimiano, P.; Völker, J.; Studer, R.: Ontologies on demand? : a description of the state-of-the-art, applications, challenges and trends for ontology learning from text (2006) 0.03
```
0.026105745 = product of:
  0.11747585 = sum of:
    0.10067343 = weight(_text_:applications in 6014) [ClassicSimilarity], result of:
      0.10067343 = score(doc=6014,freq=8.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.5836958 = fieldWeight in 6014, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.046875 = fieldNorm(doc=6014)
    0.016802425 = weight(_text_:of in 6014) [ClassicSimilarity], result of:
      0.016802425 = score(doc=6014,freq=14.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.2742677 = fieldWeight in 6014, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=6014)
  0.22222222 = coord(2/9)
```
Abstract

Ontologies are nowadays used for many applications requiring data, services and resources in general to be interoperable and machine understandable. Such applications are for example web service discovery and composition, information integration across databases, intelligent search, etc. The general idea is that data and services are semantically described with respect to ontologies, which are formal specifications of a domain of interest, and can thus be shared and reused in a way such that the shared meaning specified by the ontology remains formally the same across different parties and applications. As the cost of creating ontologies is relatively high, different proposals have emerged for learning ontologies from structured and unstructured resources. In this article we examine the maturity of techniques for ontology learning from textual resources, addressing the question whether the state-of-the-art is mature enough to produce ontologies 'on demand'.

Jones, I.; Cunliffe, D.; Tudhope, D.: Natural language processing and knowledge organization systems as an aid to retrieval (2004) 0.03
```
0.026089903 = product of:
  0.078269705 = sum of:
    0.029363085 = weight(_text_:applications in 2677) [ClassicSimilarity], result of:
      0.029363085 = score(doc=2677,freq=2.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.17024462 = fieldWeight in 2677, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.02734375 = fieldNorm(doc=2677)
    0.020290855 = weight(_text_:of in 2677) [ClassicSimilarity], result of:
      0.020290855 = score(doc=2677,freq=60.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.33120972 = fieldWeight in 2677, product of:
          7.745967 = tf(freq=60.0), with freq of:
            60.0 = termFreq=60.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.02734375 = fieldNorm(doc=2677)
    0.028615767 = weight(_text_:systems in 2677) [ClassicSimilarity], result of:
      0.028615767 = score(doc=2677,freq=8.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.23767869 = fieldWeight in 2677, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.02734375 = fieldNorm(doc=2677)
  0.33333334 = coord(3/9)
```
Abstract

This paper discusses research that employs methods from Natural Language Processing (NLP) in exploiting the intellectual resources of Knowledge Organization Systems (KOS), particularly in the retrieval of information. A technique for the disambiguation of homographs and nominal compounds in free text, where these are known ambiguous terms in the KOS itself, is described. The use of Roget's Thesaurus as an intermediary in the process is also reported. A short review of the relevant literature in the field is given. Design considerations, results and conclusions are presented from the implementation of a prototype system. The linguistic techniques are applied at two complementary levels, namely on a free text string used as an entry point to the KOS, and on the underlying controlled vocabulary itself.

Content

1. Introduction The need for research into the application of linguistic techniques in Information Retrieval (IR) in general, and a similar need in faceted Knowledge Organization Systems (KOS) has been indicated by various authors. Smeaton (1997) points out the inherent limitations of conventional approaches to IR based on "bags of words", mainly difficulties caused by lexical ambiguity in the words concerned, and goes on to suggest the possibility of using Natural Language Processing (NLP) in query formulation. Past experience with a faceted retrieval system highlighted the need for integrating the linguistic perspective in order to fully utilise the potential of a KOS (Tudhope et al., 2002). The present research seeks to address some of these needs in using NLP to improve the efficacy of KOS tools in query and retrieval systems. Syntactic parsing and part-of-speech tagging can substantially reduce lexical ambiguity through homograph disambiguation. Given the two strings "I table the motion" and "I put the motion on the table", for instance, the parser used in this research clearly indicates that 'table' in the first string is a verb, while 'table' in the second string is a noun, a distinction that would be missed in the "bag of words" approach. This syntactic disambiguation enables a more precise matching from free text to the controlled vocabulary of a KOS and vice versa. The use of a general linguistic resource, namely Roget's Thesaurus of English Words and Phrases (RTEWP), as an intermediary in this process, is investigated. The adaptation of the Link parser (Sleator & Temperley, 1993) to the purposes of the research is reported. The design and implementation of the early practical stages of the project are described, and the results of the initial experiments are presented and evaluated. Applications of the techniques developed are foreseen in the areas of query disambiguation, information retrieval and automatic indexing.
In the first section of the paper a brief review of the literature and relevant current work in the field is presented. The second section includes reports on the development of algorithms, the construction of data sets and theoretical and experimental work undertaken to date. The third section evaluates the results obtained, and outlines directions for future research.
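
The homograph example in the introduction can be illustrated with a toy sketch. This is not the Link parser used in the research; it is a hypothetical two-rule heuristic (the word lists and the function name `tag_homograph` are assumptions made for illustration) that only looks at the word immediately before the target.

```python
# Toy context rule, assumed for illustration only: a determiner before the
# target suggests a noun reading, a subject pronoun suggests a verb reading.
DETERMINERS = {"the", "a", "an"}
PRONOUNS = {"i", "you", "we", "they"}


def tag_homograph(sentence: str, target: str) -> str:
    """Guess whether `target` is used as a noun or a verb in `sentence`."""
    tokens = sentence.lower().split()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        previous = tokens[i - 1] if i > 0 else ""
        if previous in DETERMINERS:
            return "noun"
        if previous in PRONOUNS:
            return "verb"
    return "unknown"


verb_reading = tag_homograph("I table the motion", "table")
noun_reading = tag_homograph("I put the motion on the table", "table")
print(verb_reading, noun_reading)
```

A bag-of-words representation would treat both occurrences of "table" identically; even this crude positional rule separates them, which is the distinction a full syntactic parser draws far more reliably.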

Source

Knowledge organization and the global information society: Proceedings of the 8th International ISKO Conference 13-16 July 2004, London, UK. Ed.: I.C. McIlwaine

Jurafsky, D.; Martin, J.H.: Speech and language processing : an introduction to natural language processing, computational linguistics and speech recognition (2009) 0.03

0.025116816 = product of:
  0.07535045 = sum of:
    0.041947264 = weight(_text_:applications in 1081) [ClassicSimilarity], result of:
      0.041947264 = score(doc=1081,freq=2.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.2432066 = fieldWeight in 1081, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1081)
    0.012963352 = weight(_text_:of in 1081) [ClassicSimilarity], result of:
      0.012963352 = score(doc=1081,freq=12.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.21160212 = fieldWeight in 1081, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1081)
    0.020439833 = weight(_text_:systems in 1081) [ClassicSimilarity], result of:
      0.020439833 = score(doc=1081,freq=2.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.1697705 = fieldWeight in 1081, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1081)
  0.33333334 = coord(3/9)
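
The score tree above can be re-derived by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas that the printed values are consistent with: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf × queryNorm, and fieldWeight = tf × idf × fieldNorm. The function names are illustrative and are not Lucene API calls; queryNorm, fieldNorm and the coord factor are copied from the listing rather than computed.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as printed in the explain output."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one query term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm
    field_weight = tf * idf * field_norm
    return query_weight * field_weight


# Values copied from the explain tree for doc 1081.
MAX_DOCS = 44218
QUERY_NORM = 0.03917671
FIELD_NORM = 0.0390625
terms = {  # term -> (termFreq, docFreq)
    "applications": (2.0, 1471),
    "of": (12.0, 25162),
    "systems": (2.0, 5561),
}

partial = {
    term: term_score(freq, doc_freq, MAX_DOCS, QUERY_NORM, FIELD_NORM)
    for term, (freq, doc_freq) in terms.items()
}
coord = 3 / 9  # three of nine query clauses matched
total = coord * sum(partial.values())
print(partial, total)
```

The per-term values and the total should agree with the listed figures up to single-precision rounding, since the original computation uses 32-bit floats while this sketch uses Python's 64-bit floats.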

Abstract: For undergraduate or advanced undergraduate courses in Classical Natural Language Processing, Statistical Natural Language Processing, Speech Recognition, Computational Linguistics, and Human Language Processing. An explosion of Web-based language techniques, merging of distinct fields, availability of phone-based dialogue systems, and much more make this an exciting time in speech and language processing. The first of its kind to thoroughly cover language technology at all levels and with all modern technologies, this text takes an empirical approach to the subject, based on applying statistical and other machine-learning algorithms to large corpora. The authors cover areas that traditionally are taught in different courses, to describe a unified vision of speech and language processing. Emphasis is on practical applications and scientific evaluation. An accompanying Website contains teaching materials for instructors, with pointers to language processing resources on the Web. The Second Edition offers a significant amount of new and extended material.

Conceptual structures : logical, linguistic, and computational issues. 8th International Conference on Conceptual Structures, ICCS 2000, Darmstadt, Germany, August 14-18, 2000 (2000) 0.02
```
0.02493256 = product of:
  0.074797675 = sum of:
    0.03559343 = weight(_text_:applications in 691) [ClassicSimilarity], result of:
      0.03559343 = score(doc=691,freq=4.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.20636764 = fieldWeight in 691, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.0234375 = fieldNorm(doc=691)
    0.017962547 = weight(_text_:of in 691) [ClassicSimilarity], result of:
      0.017962547 = score(doc=691,freq=64.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.2932045 = fieldWeight in 691, product of:
          8.0 = tf(freq=64.0), with freq of:
            64.0 = termFreq=64.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0234375 = fieldNorm(doc=691)
    0.021241698 = weight(_text_:systems in 691) [ClassicSimilarity], result of:
      0.021241698 = score(doc=691,freq=6.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.17643067 = fieldWeight in 691, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.0234375 = fieldNorm(doc=691)
  0.33333334 = coord(3/9)
```
Abstract

Computer scientists create models of a perceived reality. Through AI techniques, these models aim at providing the basic support for emulating cognitive behavior such as reasoning and learning, which is one of the main goals of the AI research effort. Such computer models are formed through the interaction of various acquisition and inference mechanisms: perception, concept learning, conceptual clustering, hypothesis testing, probabilistic inference, etc., and are represented using different paradigms tightly linked to the processes that use them. Among these paradigms let us cite: biological models (neural nets, genetic programming), logic-based models (first-order logic, modal logic, rule-based systems), virtual reality models (object systems, agent systems), probabilistic models (Bayesian nets, fuzzy logic), linguistic models (conceptual dependency graphs, language-based representations), etc. One of the strengths of the Conceptual Graph (CG) theory is its versatility in terms of the representation paradigms under which it falls. It can be viewed and therefore used, under different representation paradigms, which makes it a popular choice for a wealth of applications. Its full coupling with different cognitive processes led to the opening of the field toward related research communities such as the Description Logic, Formal Concept Analysis, and Computational Linguistic communities. We now see more and more research results from one community enrich the other, laying the foundations of common philosophical grounds from which a successful synergy can emerge. ICCS 2000 embodies this spirit of research collaboration. It presents a set of papers that we believe, by their exposure, will benefit the whole community. For instance, the technical program proposes tracks on Conceptual Ontologies, Language, Formal Concept Analysis, Computational Aspects of Conceptual Structures, and Formal Semantics, with some papers on pragmatism and human related aspects of computing.
Never before was the program of ICCS formed by so heterogeneously rooted theories of knowledge representation and use. We hope that this swirl of ideas will benefit you as much as it already has benefited us while putting together this program.

Content

Concepts and Language: The Role of Conceptual Structure in Human Evolution (Keith Devlin) - Concepts in Linguistics - Concepts in Natural Language (Gisela Harras) - Patterns, Schemata, and Types: Author Support through Formalized Experience (Felix H. Gatzemeier) - Conventions and Notations for Knowledge Representation and Retrieval (Philippe Martin) - Conceptual Ontology: Ontology, Metadata, and Semiotics (John F. Sowa) - Pragmatically Yours (Mary Keeler) - Conceptual Modeling for Distributed Ontology Environments (Deborah L. McGuinness) - Discovery of Class Relations in Exception Structured Knowledge Bases (Hendra Suryanto, Paul Compton) - Conceptual Graphs: Perspectives: CGs Applications: Where Are We 7 Years after the First ICCS? (Michel Chein, David Genest) - The Engineering of a CG-Based System: Fundamental Issues (Guy W. Mineau) - Conceptual Graphs, Metamodeling, and Notation of Concepts (Olivier Gerbé, Guy W. Mineau, Rudolf K. Keller) - Knowledge Representation and Reasonings: Based on Graph Homomorphism (Marie-Laure Mugnier) - User Modeling Using Conceptual Graphs for Intelligent Agents (James F. Baldwin, Trevor P. Martin, Aimilia Tzanavari) - Towards a Unified Querying System of Both Structured and Semi-structured Imprecise Data Using Fuzzy View (Patrice Buche, Ollivier Haemmerlé) - Formal Semantics of Conceptual Structures: The Extensional Semantics of the Conceptual Graph Formalism (Guy W. Mineau) - Semantics of Attribute Relations in Conceptual Graphs (Pavel Kocura) - Nested Concept Graphs and Triadic Power Context Families (Susanne Prediger) - Negations in Simple Concept Graphs (Frithjof Dau) - Extending the CG Model by Simulations (Jean-François Baget) - Contextual Logic and Formal Concept Analysis: Building and Structuring Description Logic Knowledge Bases: Using Least Common Subsumers and Concept Analysis (Franz Baader, Ralf Molitor) - On the Contextual Logic of Ordinal Data (Silke Pollandt, Rudolf Wille) - Boolean Concept Logic (Rudolf Wille) - Lattices of Triadic Concept Graphs (Bernd Groh, Rudolf Wille) - Formalizing Hypotheses with Concepts (Bernhard Ganter, Sergei O. Kuznetsov) - Generalized Formal Concept Analysis (Laurent Chaudron, Nicolas Maille) - A Logical Generalization of Formal Concept Analysis (Sébastien Ferré, Olivier Ridoux) - On the Treatment of Incomplete Knowledge in Formal Concept Analysis (Peter Burmeister, Richard Holzer) - Conceptual Structures in Practice: Logic-Based Networks: Concept Graphs and Conceptual Structures (Peter W. Eklund) - Conceptual Knowledge Discovery and Data Analysis (Joachim Hereth, Gerd Stumme, Rudolf Wille, Uta Wille) - CEM - A Conceptual Email Manager (Richard Cole, Gerd Stumme) - A Contextual-Logic Extension of TOSCANA (Peter Eklund, Bernd Groh, Gerd Stumme, Rudolf Wille) - A Conceptual Graph Model for W3C Resource Description Framework (Olivier Corby, Rose Dieng, Cédric Hébert) - Computational Aspects of Conceptual Structures: Computing with Conceptual Structures (Bernhard Ganter) - Symmetry and the Computation of Conceptual Structures (Robert Levinson) - An Introduction to SNePS 3 (Stuart C. Shapiro) - Composition Norm Dynamics Calculation with Conceptual Graphs (Aldo de Moor) - From PROLOG++ to PROLOG+CG: A CG Object-Oriented Logic Programming Language (Adil Kabbaj, Martin Janta-Polczynski) - A Cost-Bounded Algorithm to Control Events Generalization (Gaël de Chalendar, Brigitte Grau, Olivier Ferret)
Mustafa el Hadi, W.: Human language technology and its role in information access and management (2003) 0.02
```
0.02304364 = product of:
  0.103696376 = sum of:
    0.08389453 = weight(_text_:applications in 5524) [ClassicSimilarity], result of:
      0.08389453 = score(doc=5524,freq=8.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.4864132 = fieldWeight in 5524, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5524)
    0.019801848 = weight(_text_:of in 5524) [ClassicSimilarity], result of:
      0.019801848 = score(doc=5524,freq=28.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.32322758 = fieldWeight in 5524, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5524)
  0.22222222 = coord(2/9)
```
Abstract

The role of linguistics in information access, extraction and dissemination is essential. Radical changes in the techniques of information and communication at the end of the twentieth century have had a significant effect on the function of the linguistic paradigm and its applications in all forms of communication. The introduction of new technical means has deeply changed the possibilities for the distribution of information. In this situation, what is the role of the linguistic paradigm and its practical applications, i.e., natural language processing (NLP) techniques when applied to information access? What solutions can linguistics offer in human computer interaction, extraction and management? Many fields show the relevance of the linguistic paradigm through the various technologies that require NLP, such as document and message understanding, information detection, extraction, and retrieval, question and answer, cross-language information retrieval (CLIR), text summarization, filtering, and spoken document retrieval. This paper focuses on the central role of human language technologies in the information society, surveys the current situation, describes the benefits of the above-mentioned applications, outlines successes and challenges, and discusses solutions. It reviews the resources and means needed to advance information access and dissemination across language boundaries in the twenty-first century. Multilingualism, which is a natural result of globalization, requires more effort in the direction of language technology. The scope of human language technology (HLT) is large, so we limit our review to applications that involve multilinguality.
Kreymer, O.: An evaluation of help mechanisms in natural language information retrieval systems (2002) 0.02
```
0.02191704 = product of:
  0.09862667 = sum of:
    0.021062955 = weight(_text_:of in 2557) [ClassicSimilarity], result of:
      0.021062955 = score(doc=2557,freq=22.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.34381276 = fieldWeight in 2557, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2557)
    0.07756372 = weight(_text_:systems in 2557) [ClassicSimilarity], result of:
      0.07756372 = score(doc=2557,freq=20.0), product of:
        0.12039685 = queryWeight, product of:
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.03917671 = queryNorm
        0.64423376 = fieldWeight in 2557, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          3.0731742 = idf(docFreq=5561, maxDocs=44218)
          0.046875 = fieldNorm(doc=2557)
  0.22222222 = coord(2/9)
```
Abstract

The field of natural language processing (NLP) demonstrates rapid changes in the design of information retrieval systems and human-computer interaction. While natural language is being looked on as the most effective tool for information retrieval in a contemporary information environment, the systems using it are only beginning to emerge. This study attempts to evaluate the current state of NLP information retrieval systems from the user's point of view: what techniques are used by these systems to guide their users through the search process? The analysis focused on the structure and components of the systems' help mechanisms. Results of the study demonstrated that systems which claimed to be using natural language searching in fact used a wide range of information retrieval techniques from real natural language processing to Boolean searching. As a result, the user assistance mechanisms of these systems also varied. While pseudo-NLP systems would suit a more traditional method of instruction, real NLP systems primarily utilised the methods of explanation and user-system dialogue.
Witschel, H.F.: Terminology extraction and automatic indexing : comparison and qualitative evaluation of methods (2005) 0.02
```
0.021405008 = product of:
  0.09632254 = sum of:
    0.0726548 = weight(_text_:applications in 1842) [ClassicSimilarity], result of:
      0.0726548 = score(doc=1842,freq=6.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.42124623 = fieldWeight in 1842, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1842)
    0.023667734 = weight(_text_:of in 1842) [ClassicSimilarity], result of:
      0.023667734 = score(doc=1842,freq=40.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.38633084 = fieldWeight in 1842, product of:
          6.3245554 = tf(freq=40.0), with freq of:
            40.0 = termFreq=40.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1842)
  0.22222222 = coord(2/9)
```
Abstract

Many terminology engineering processes involve the task of automatic terminology extraction: before the terminology of a given domain can be modelled, organised or standardised, important concepts (or terms) of this domain have to be identified and fed into terminological databases. These serve in further steps as a starting point for compiling dictionaries, thesauri or maybe even terminological ontologies for the domain. For the extraction of the initial concepts, extraction methods are needed that operate on specialised language texts. On the other hand, many machine learning or information retrieval applications require automatic indexing techniques. In Machine Learning applications concerned with the automatic clustering or classification of texts, often feature vectors are needed that describe the contents of a given text briefly but meaningfully. These feature vectors typically consist of a fairly small set of index terms together with weights indicating their importance. Short but meaningful descriptions of document contents as provided by good index terms are also useful to humans: some knowledge management applications (e.g. topic maps) use them as a set of basic concepts (topics). The author believes that the tasks of terminology extraction and automatic indexing have much in common and can thus benefit from the same set of basic algorithms. It is the goal of this paper to outline some methods that may be used in both contexts, but also to find the discriminating factors between the two tasks that call for the variation of parameters or application of different techniques. The discussion of these methods will be based on statistical, syntactical and especially morphological properties of (index) terms. The paper is concluded by the presentation of some qualitative and quantitative results comparing statistical and morphological methods.

Source

TKE 2005: Proc. of Terminology and Knowledge Engineering (TKE) 2005
Mustafa el Hadi, W.: Terminology & information retrieval : new tools for new needs. Integration of knowledge across boundaries (2003) 0.02
```
0.020907713 = product of:
  0.09408471 = sum of:
    0.07118686 = weight(_text_:applications in 2688) [ClassicSimilarity], result of:
      0.07118686 = score(doc=2688,freq=4.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.41273528 = fieldWeight in 2688, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.046875 = fieldNorm(doc=2688)
    0.022897845 = weight(_text_:of in 2688) [ClassicSimilarity], result of:
      0.022897845 = score(doc=2688,freq=26.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.37376386 = fieldWeight in 2688, product of:
          5.0990195 = tf(freq=26.0), with freq of:
            26.0 = termFreq=26.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2688)
  0.22222222 = coord(2/9)
```
Abstract

The radical changes in information and communication techniques at the end of the 20th century have significantly modified the function of terminology and its applications in all forms of communication. The introduction of new mediums has deeply changed the possibilities of distribution of scientific information. What in this situation is the role of terminology and its practical applications? What is the place for multiple functions of terminology in the communication society? What is the impact of natural language processing (NLP) techniques used in its processing and management? In this article we will focus on the possibilities NLP techniques offer and how they can be directed towards the satisfaction of the newly expressed needs.

Source

Challenges in knowledge representation and organization for the 21st century: Integration of knowledge across boundaries. Proceedings of the 7th ISKO International Conference Granada, Spain, July 10-13, 2002. Ed.: M. López-Huertas
The semantics of relationships : an interdisciplinary perspective (2002) 0.02
```
0.020700369 = product of:
  0.09315166 = sum of:
    0.0726548 = weight(_text_:applications in 1430) [ClassicSimilarity], result of:
      0.0726548 = score(doc=1430,freq=6.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.42124623 = fieldWeight in 1430, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1430)
    0.02049686 = weight(_text_:of in 1430) [ClassicSimilarity], result of:
      0.02049686 = score(doc=1430,freq=30.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.33457235 = fieldWeight in 1430, product of:
          5.477226 = tf(freq=30.0), with freq of:
            30.0 = termFreq=30.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1430)
  0.22222222 = coord(2/9)
```
Abstract

Work on relationships takes place in many communities, including, among others, data modeling, knowledge representation, natural language processing, linguistics, and information retrieval. Unfortunately, continued disciplinary splintering and specialization keeps any one person from being familiar with the full expanse of that work. By including contributions from experts in a variety of disciplines and backgrounds, this volume demonstrates both the parallels that inform work on relationships across a number of fields and the singular emphases that have yet to be fully embraced. The volume is organized into 3 parts: (1) Types of relationships (2) Relationships in knowledge representation and reasoning (3) Applications of relationships

Content

Contains the contributions: Pt.1: Types of relationships: CRUSE, D.A.: Hyponymy and its varieties; FELLBAUM, C.: On the semantics of troponymy; PRIBBENOW, S.: Meronymic relationships: from classical mereology to complex part-whole relations; KHOO, C. et al.: The many facets of cause-effect relation - Pt.2: Relationships in knowledge representation and reasoning: GREEN, R.: Internally-structured conceptual models in cognitive semantics; HOVY, E.: Comparing sets of semantic relations in ontologies; GUARINO, N., C. WELTY: Identity and subsumption; JOUIS, C.: Logic of relationships - Pt.3: Applications of relationships: EVENS, M.: Thesaural relations in information retrieval; KHOO, C., S.H. MYAENG: Identifying semantic relations in text for information retrieval and information extraction; McCRAY, A.T., O. BODENREIDER: A conceptual framework for the biomedical domain; HETZLER, B.: Visual analysis and exploration of relationships

Footnote

With a detailed introduction by the editors on the topics: Types of relationships - Relationships in knowledge representation and reasoning - Applications of relationships
Yang, C.C.; Luk, J.: Automatic generation of English/Chinese thesaurus based on a parallel corpus in laws (2003) 0.02
```
0.01812305 = product of:
  0.054369144 = sum of:
    0.029363085 = weight(_text_:applications in 1616) [ClassicSimilarity], result of:
      0.029363085 = score(doc=1616,freq=2.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.17024462 = fieldWeight in 1616, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1616)
    0.015717229 = weight(_text_:of in 1616) [ClassicSimilarity], result of:
      0.015717229 = score(doc=1616,freq=36.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.25655392 = fieldWeight in 1616, product of:
          6.0 = tf(freq=36.0), with freq of:
            36.0 = termFreq=36.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1616)
    0.009288831 = product of:
      0.018577661 = sum of:
        0.018577661 = weight(_text_:22 in 1616) [ClassicSimilarity], result of:
          0.018577661 = score(doc=1616,freq=2.0), product of:
            0.13719016 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03917671 = queryNorm
            0.1354154 = fieldWeight in 1616, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1616)
      0.5 = coord(1/2)
  0.33333334 = coord(3/9)
```
Abstract

The information available in languages other than English on the World Wide Web is increasing significantly. According to a report from Computer Economics in 1999, 54% of Internet users are English speakers ("English Will Dominate Web for Only Three More Years," Computer Economics, July 9, 1999, http://www.computereconomics.com/new4/pr/pr990610.html). However, it is predicted that there will be only a 60% increase in Internet users among English speakers versus a 150% growth among non-English speakers for the next five years. By 2005, 57% of Internet users will be non-English speakers. A report by CNN.com in 2000 showed that the number of Internet users in China had increased from 8.9 million to 16.9 million from January to June in 2000 ("Report: China Internet users double to 17 million," CNN.com, July, 2000, http://cnn.org/2000/TECH/computing/07/27/china.internet.reut/index.html). According to Nielsen/NetRatings, there was a dramatic leap from 22.5 million to 56.6 million Internet users from 2001 to 2002. China had become the second largest global at-home Internet population in 2002 (the US Internet population was 166 million) (Robyn Greenspan, "China Pulls Ahead of Japan," Internet.com, April 22, 2002, http://cyberatias.internet.com/big-picture/geographics/article/0,,5911_1013841,00.html). All of this evidence reveals the importance of cross-lingual research to satisfy the needs in the near future. Digital library research has been focusing on structural and semantic interoperability in the past. Searching and retrieving objects across variations in protocols, formats and disciplines are widely explored (Schatz, B., & Chen, H. (1999). Digital libraries: technological advances and social impacts. IEEE Computer, Special Issue on Digital Libraries, February, 32(2), 45-50.; Chen, H., Yen, J., & Yang, C.C. (1999). International activities: development of Asian digital libraries. IEEE Computer, Special Issue on Digital Libraries, 32(2), 48-49.).
However, research in crossing language boundaries, especially across European languages and Oriental languages, is still in the initial stage. In this proposal, we put our focus on cross-lingual semantic interoperability by developing automatic generation of a cross-lingual thesaurus based on an English/Chinese parallel corpus. When searchers encounter retrieval problems, professional librarians usually consult the thesaurus to identify other relevant vocabularies. In the problem of searching across language boundaries, a cross-lingual thesaurus, which is generated by co-occurrence analysis and a Hopfield network, can be used to generate additional semantically relevant terms that cannot be obtained from a dictionary. In particular, the automatically generated cross-lingual thesaurus is able to capture the unknown words that do not exist in a dictionary, such as names of persons, organizations, and events. Due to Hong Kong's unique historical background, both English and Chinese are used as official languages in all legal documents. Therefore, English/Chinese cross-lingual information retrieval is critical for applications in courts and the government. In this paper, we develop an automatic thesaurus by the Hopfield network based on a parallel corpus collected from the Web site of the Department of Justice of the Hong Kong Special Administrative Region (HKSAR) Government. Experiments are conducted to measure the precision and recall of the automatically generated English/Chinese thesaurus. The result shows that such a thesaurus is a promising tool to retrieve relevant terms, especially in the language that is not the same as the input term. The direct translation of the input term can also be retrieved in most of the cases.

Source

Journal of the American Society for Information Science and Technology. 54(2003) no.7, S.671-682
Liu, X.; Croft, W.B.: Statistical language modeling for information retrieval (2004) 0.02
```
0.017583163 = product of:
  0.079124235 = sum of:
    0.059322387 = weight(_text_:applications in 4277) [ClassicSimilarity], result of:
      0.059322387 = score(doc=4277,freq=4.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.34394607 = fieldWeight in 4277, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4277)
    0.019801848 = weight(_text_:of in 4277) [ClassicSimilarity], result of:
      0.019801848 = score(doc=4277,freq=28.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.32322758 = fieldWeight in 4277, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4277)
  0.22222222 = coord(2/9)
```
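The score tree above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of any library. The input values are copied from the explain output for doc 4277.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency as used by ClassicSimilarity (assumed formula).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, mirroring the explain tree.
    tf = math.sqrt(freq)
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm
    field_weight = tf * idf * field_norm
    return query_weight * field_weight

# Values taken from the explain output for doc 4277.
MAX_DOCS = 44218
QUERY_NORM = 0.03917671
FIELD_NORM = 0.0390625

score_applications = classic_term_score(4.0, 1471, MAX_DOCS, QUERY_NORM, FIELD_NORM)
score_of = classic_term_score(28.0, 25162, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# coord(2/9): two of the nine query clauses matched this document.
coord = 2 / 9
total = (score_applications + score_of) * coord

print(f"applications: {score_applications:.9f}")
print(f"of:           {score_of:.9f}")
print(f"total:        {total:.9f}")
```

Lucene computes these values in single precision, so the printed numbers should agree with the explain output only up to small rounding differences in the last digits.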
Abstract

This chapter reviews research and applications in statistical language modeling for information retrieval (IR), which has emerged within the past several years as a new probabilistic framework for describing information retrieval processes. Generally speaking, statistical language modeling, or more simply language modeling (LM), involves estimating a probability distribution that captures statistical regularities of natural language use. Applied to information retrieval, language modeling refers to the problem of estimating the likelihood that a query and a document could have been generated by the same language model, given the language model of the document either with or without a language model of the query. The roots of statistical language modeling date to the beginning of the twentieth century when Markov tried to model letter sequences in works of Russian literature (Manning & Schütze, 1999). Zipf (1929, 1932, 1949, 1965) studied the statistical properties of text and discovered that the frequency of words decays as a power function of each word's rank. However, it was Shannon's (1951) work that inspired later research in this area. In 1951, eager to explore the applications of his newly founded information theory to human language, Shannon used a prediction game involving n-grams to investigate the information content of English text. He evaluated n-gram models' performance by comparing their cross-entropy on texts with the true entropy estimated using predictions made by human subjects. For many years, statistical language models have been used primarily for automatic speech recognition. Since 1980, when the first significant language model was proposed (Rosenfeld, 2000), statistical language modeling has become a fundamental component of speech recognition, machine translation, and spelling correction.
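As an illustration of the query-likelihood idea described in this abstract, the following is a minimal sketch, not the chapter's own method. It assumes a unigram document model with Jelinek-Mercer smoothing against a collection model, which is one common choice; the function name, the smoothing weight `lam`, and the toy documents are all hypothetical.

```python
import math
from collections import Counter

def query_log_likelihood(query, doc, collection, lam=0.5):
    """Log-probability that the document's unigram model generates the query.

    Jelinek-Mercer smoothing (an assumption of this sketch) interpolates the
    document model with the collection model using weight `lam`.
    """
    doc_counts = Counter(doc)
    coll_counts = Counter(collection)
    score = 0.0
    for term in query:
        p_doc = doc_counts[term] / len(doc) if doc else 0.0
        p_coll = coll_counts[term] / len(collection)
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # Term unseen in the whole collection: the query cannot be generated.
            return float("-inf")
        score += math.log(p)
    return score

# Hypothetical toy collection of two tokenized documents.
docs = {
    "d1": "language models estimate word probabilities".split(),
    "d2": "speech recognition uses acoustic models".split(),
}
collection = [term for doc in docs.values() for term in doc]
query = "language models".split()

# Rank documents by how likely each document model is to generate the query.
ranked = sorted(
    docs,
    key=lambda name: query_log_likelihood(query, docs[name], collection),
    reverse=True,
)
print(ranked)
```

Smoothing matters here because a document that lacks one query term would otherwise receive zero probability, regardless of how well it matches the remaining terms.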

Source

Annual review of information science and technology. 39(2005), S.3-32
Rindflesch, T.C.; Fiszman, M.: The interaction of domain knowledge and linguistic structure in natural language processing : interpreting hypernymic propositions in biomedical text (2003) 0.02
```
0.016509151 = product of:
  0.07429118 = sum of:
    0.059322387 = weight(_text_:applications in 2097) [ClassicSimilarity], result of:
      0.059322387 = score(doc=2097,freq=4.0), product of:
        0.17247584 = queryWeight, product of:
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.03917671 = queryNorm
        0.34394607 = fieldWeight in 2097, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.4025097 = idf(docFreq=1471, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2097)
    0.014968789 = weight(_text_:of in 2097) [ClassicSimilarity], result of:
      0.014968789 = score(doc=2097,freq=16.0), product of:
        0.061262865 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03917671 = queryNorm
        0.24433708 = fieldWeight in 2097, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2097)
  0.22222222 = coord(2/9)
```
Abstract

Interpretation of semantic propositions in free-text documents such as MEDLINE citations would provide valuable support for biomedical applications, and several approaches to semantic interpretation are being pursued in the biomedical informatics community. In this paper, we describe a methodology for interpreting linguistic structures that encode hypernymic propositions, in which a more specific concept is in a taxonomic relationship with a more general concept. In order to effectively process these constructions, we exploit underspecified syntactic analysis and structured domain knowledge from the Unified Medical Language System (UMLS). After introducing the syntactic processing on which our system depends, we focus on the UMLS knowledge that supports interpretation of hypernymic propositions. We first use semantic groups from the Semantic Network to ensure that the two concepts involved are compatible; hierarchical information in the Metathesaurus then determines which concept is more general and which more specific. A preliminary evaluation of a sample based on the semantic group Chemicals and Drugs provides 83% precision. An error analysis was conducted and potential solutions to the problems encountered are presented. The research discussed here serves as a paradigm for investigating the interaction between domain knowledge and linguistic structure in natural language processing, and could also make a contribution to research on automatic processing of discourse structure. Additional implications of the system we present include its integration in advanced semantic interpretation processors for biomedical text and its use for information extraction in specific domains. The approach has the potential to support a range of applications, including information retrieval and ontology engineering.

Source

Journal of Biomedical Informatics, 36(2003) no.6, S.462-477
