Search (144 results, page 1 of 8)

  • Filter: theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.23
    Score breakdown (Lucene ClassicSimilarity, coord 4/8): matched terms "3a" (0.18881777, coord 1/3), "2f" (0.18881777, counted twice) and "22" (0.03221402, coord 1/2). Each term weight is queryWeight × fieldWeight, with tf = sqrt(termFreq) = 1.4142135 at freq 2.0, idf = 8.478011 for the rare terms (docFreq=24, maxDocs=44218) and 3.5018296 for "22" (docFreq=3622), queryNorm = 0.03962768 and fieldNorm = 0.046875.
    
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
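    Code example
    The relevance figure attached to each entry is a Lucene ClassicSimilarity score, and the breakdown above lists the constants behind it. As a minimal sketch, assuming only the standard ClassicSimilarity formulas, the snippet below reproduces one term's contribution from those constants; summing the four term branches (0.06293926 + 0.18881777 + 0.18881777 + 0.01610701) and applying coord 4/8 recovers the 0.22834091 total shown above.
      from math import log, sqrt

      # Lucene ClassicSimilarity (TF-IDF) pieces, with the constants taken
      # directly from the score breakdown of entry 1.

      def idf(doc_freq: int, max_docs: int) -> float:
          return 1.0 + log(max_docs / (doc_freq + 1))

      def tf(freq: float) -> float:
          return sqrt(freq)

      doc_freq, max_docs = 24, 44218
      query_norm = 0.03962768   # normalises all query term weights
      field_norm = 0.046875     # stored length norm for this field

      term_idf = idf(doc_freq, max_docs)              # ~8.478011
      query_weight = term_idf * query_norm            # ~0.3359639
      field_weight = tf(2.0) * term_idf * field_norm  # ~0.56201804
      term_score = query_weight * field_weight        # ~0.18881777

      print(f"idf={term_idf:.6f}  queryWeight={query_weight:.6f}  "
            f"fieldWeight={field_weight:.6f}  termScore={term_score:.6f}")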
  2. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.17
    Matched terms: "3a", "2f" (×2); coord 3/8.
    
    Source
    https://arxiv.org/abs/2212.06721
  3. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.15
    Matched terms: "2f" (×2), "22"; coord 3/8.
    
    Content
    A thesis presented to The University of Guelph in partial fulfilment of requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
  4. Semantic role universals and argument linking : theoretical, typological, and psycholinguistic perspectives (2006) 0.03
    Matched terms: "case", "studies", "area"; coord 3/8.
    
    Abstract
    The concept of semantic roles has been central to linguistic theory for many decades. More specifically, the assumption of such representations as mediators in the correspondence between a linguistic form and its associated meaning has helped to address a number of critical issues related to grammatical phenomena. Furthermore, in addition to featuring in all major theories of grammar, semantic (or 'thematic') roles have been referred to extensively within a wide range of other linguistic subdisciplines, including language typology and psycho-/neurolinguistics. This volume brings together insights from these different perspectives and thereby, for the first time, seeks to build upon the obvious potential for cross-fertilisation between hitherto autonomous approaches to a common theme. To this end, a view on semantic roles is adopted that goes beyond the mere assumption of generalised roles, but also focuses on their hierarchical organisation. The book is thus centred around the interdisciplinary examination of how these hierarchical dependencies subserve argument linking - both in terms of linguistic theory and with respect to real-time language processing - and how they interact with other information types in this process. Furthermore, the contributions examine the interaction between the role hierarchy and the conceptual content of (generalised) semantic roles and investigate their cross-linguistic applicability and psychological reality, as well as their explanatory potential in accounting for phenomena in the domain of language disorders. In bridging the gap between different disciplines, the book provides a valuable overview of current thought on semantic roles and argument linking, and may further serve as a point of departure for future interdisciplinary research in this area. As such, it will be of interest to scientists and advanced students in all domains of linguistics and cognitive science.
    Content
    Contents: Argument hierarchy and other factors determining argument realization / Dieter Wunderlich - Mismatches in semantic-role hierarchies and the dimensions of role semantics / Beatrice Primus - Thematic roles : universal, particular, and idiosyncratic aspects / Manfred Bierwisch - Experiencer constructions in Daghestanian languages / Bernard Comrie and Helma van den Berg - Clause-level vs. predicate-level linking / Balthasar Bickel - From meaning to syntax: semantic roles and beyond / Walter Bisang - Meaning, form and function in basic case roles / Georg Bossong - Semantic macroroles and language processing / Robert D. Van Valin, Jr. - Thematic roles as event structure relations / María Mercedes Piñango - Generalised semantic roles and syntactic templates: a new framework for language comprehension / Ina Bornkessel and Matthias Schlesewsky
    Series
    Trends in linguistics. Studies and monographs; 165
  5. Kracht, M.: Mathematical linguistics (2002) 0.03
    Matched terms: "case", "studies"; coord 2/8.
    
    Abstract
    This book studies language(s) and linguistic theories from a mathematical point of view. Starting with ideas already contained in Montague's work, it develops the mathematical foundations of present day linguistics. It equips the reader with all the background necessary to understand and evaluate theories as diverse as Montague Grammar, Categorial Grammar, HPSG and GB. The mathematical tools are mainly from universal algebra and logic, but no particular knowledge is presupposed beyond a certain mathematical sophistication that is in any case needed in order to fruitfully work within these theories. The presentation focuses on abstract mathematical structures and their computational properties, but plenty of examples from different natural languages are provided to illustrate the main concepts and results. In contrast to books devoted to so-called formal language theory, languages are seen here as semiotic systems, that is, as systems of signs. A language sign correlates form with meaning. Using the principle of compositionality it is possible to gain substantial insight into the interaction between form and meaning in natural languages.
    Series
    Studies in generative grammar; 63
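    Code example
    Where the abstract leans on the principle of compositionality, a toy sketch may help: a sign pairs a form with a meaning, and composing two signs concatenates the forms while applying one meaning to the other. The Sign class and the two-word lexicon are illustrative assumptions, not the book's formalism.
      # Toy semiotic signs: a form paired with a meaning, composed in parallel.
      class Sign:
          def __init__(self, form, meaning):
              self.form, self.meaning = form, meaning

          def apply_to(self, other):
              # Forms combine by concatenation, meanings by function application.
              return Sign(other.form + " " + self.form, self.meaning(other.meaning))

      john = Sign("John", "john")
      sleeps = Sign("sleeps", lambda subj: f"sleep({subj})")
      sentence = sleeps.apply_to(john)
      print(sentence.form, "=>", sentence.meaning)  # John sleeps => sleep(john)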
  6. Andrushchenko, M.; Sandberg, K.; Turunen, R.; Marjanen, J.; Hatavara, M.; Kurunmäki, J.; Nummenmaa, T.; Hyvärinen, M.; Teräs, K.; Peltonen, J.; Nummenmaa, J.: Using parsed and annotated corpora to analyze parliamentarians' talk in Finland (2022) 0.03
    Matched terms: "case", "studies"; coord 2/8.
    
    Abstract
    We present a search system for grammatically analyzed corpora of Finnish parliamentary records and interviews with former parliamentarians, annotated with metadata of talk structure and involved parliamentarians, and discuss their use through carefully chosen digital humanities case studies. We first introduce the construction, contents, and principles of use of the corpora. Then we discuss the application of the search system and the corpora to study how politicians talk about power, how ideological terms are used in political speech, and how to identify narratives in the data. All case studies stem from questions in the humanities and the social sciences, but rely on the grammatically parsed corpora in both identifying and quantifying passages of interest. Finally, the paper discusses the role of natural language processing methods for questions in the (digital) humanities. It makes the claim that a digital humanities inquiry of parliamentary speech and interviews with politicians cannot rely only on computational humanities modeling, but needs to accommodate a range of perspectives starting with simple searches, quantitative exploration, and ending with modeling. Furthermore, the digital humanities need a more thorough discussion about how the utilization of tools from information science and technologies alters the research questions posed in the humanities.
  7. Anguiano Peña, G.; Naumis Peña, C.: Method for selecting specialized terms from a general language corpus (2015) 0.02
    Matched terms: "case", "area"; coord 2/8.
    
    Abstract
    Among the many aspects studied by library and information science are linguistic phenomena associated with document content analysis, for purposes of both information organization and retrieval. To this end, terms used in scientific and technical language must be recovered and their domain and behavior studied. Through language, society controls the knowledge available to people. Document content analysis, in this case of scientific texts, facilitates gathering knowledge of lexical units and their major applications and separating such specialized terms from the general language, to create indexing languages. The model presented here, or other lexicographic resources with similar characteristics, may be useful in the near future in computer-assisted indexing or as corpus monitors, with respect to new text analyses or specialized corpora. Thus, using the techniques proposed herein for document content analysis of a lexicographically labeled general language corpus, components which enable the extraction of lexical units from specialized language may be obtained and characterized.
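    Code example
    One common way to realise the separation described above is to rank words by how much more frequent they are in a domain corpus than in a general reference corpus (a "weirdness"-style ratio). This is a hedged sketch in that spirit; the two toy corpora and the smoothing constant are assumptions, not the authors' model.
      from collections import Counter

      def relative_freqs(tokens):
          counts = Counter(tokens)
          total = sum(counts.values())
          return {w: c / total for w, c in counts.items()}

      def specialized_terms(domain_tokens, general_tokens, top_k=5, eps=1e-9):
          # Score = relative frequency in the domain corpus divided by the
          # (smoothed) relative frequency in the general corpus.
          dom = relative_freqs(domain_tokens)
          gen = relative_freqs(general_tokens)
          scored = {w: f / (gen.get(w, 0.0) + eps) for w, f in dom.items()}
          return sorted(scored, key=scored.get, reverse=True)[:top_k]

      domain = "the hopfield network updates neuron states until convergence".split()
      general = "the cat sat on the mat until the dog came home".split()
      print(specialized_terms(domain, general))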
  8. Melby, A.: Some notes on 'The proper place of men and machines in language translation' (1997) 0.02
    Matched terms: "case", "22"; coord 2/8.
    
    Abstract
    Responds to Kay, M.: The proper place of men and machines in language translation. Examines the appropriateness of machine translation (MT) under the following special circumstances: controlled domain-specific text and high-quality output; controlled domain-specific text and indicative output; dynamic general text and indicative output; and dynamic general text and high-quality output. MT is appropriate in the first three cases, but the fourth requires human translation. Examines how MT research could be more useful for aiding human translation.
    Date
    31. 7.1996 9:22:19
  9. Tao, J.; Zhou, L.; Hickey, K.: Making sense of the black-boxes : toward interpretable text classification using deep learning models (2023) 0.02
    Matched terms: "case", "studies"; coord 2/8.
    
    Abstract
    Text classification is a common task in data science. Despite the superior performance of deep learning based models in various text classification tasks, their black-box nature poses significant challenges for wide adoption. The knowledge-to-action framework emphasizes several principles concerning the application and use of knowledge, such as ease-of-use, customization, and feedback. With the guidance of the above principles and the properties of interpretable machine learning, we identify the design requirements for and propose an interpretable deep learning (IDeL) based framework for text classification models. IDeL comprises three main components: feature penetration, instance aggregation, and feature perturbation. We evaluate our implementation of the framework with two distinct case studies: fake news detection and social question categorization. The experimental results provide evidence for the efficacy of IDeL components in enhancing the interpretability of text classification models. Moreover, the findings are generalizable across binary and multi-label, multi-class classification problems. The proposed IDeL framework introduces a unique iField perspective for building trusted models in data science by improving the transparency and access to advanced black-box models.
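    Code example
    Of the three IDeL components named above, feature perturbation is the easiest to illustrate: remove each token in turn and watch how the classifier's score moves. This is a hedged occlusion-style sketch with a stand-in toy model, not the authors' implementation.
      def perturbation_importance(tokens, predict_proba):
          base = predict_proba(tokens)
          impact = {}
          for i, tok in enumerate(tokens):
              reduced = tokens[:i] + tokens[i + 1:]   # occlude one token
              impact[tok] = base - predict_proba(reduced)
          return sorted(impact.items(), key=lambda kv: -kv[1])

      # Stand-in "fake news" scorer: score rises with sensational vocabulary.
      SENSATIONAL = {"shocking", "miracle", "secret"}
      def toy_model(tokens):
          return sum(t in SENSATIONAL for t in tokens) / (len(tokens) + 1)

      tokens = "shocking secret cure discovered".split()
      print(perturbation_importance(tokens, toy_model))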
  10. Chowdhury, G.G.: Natural language processing (2002) 0.02
    Matched terms: "libraries", "area"; coord 2/8.
    
    Abstract
    Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things. NLP researchers aim to gather knowledge on how human beings understand and use language so that appropriate tools and techniques can be developed to make computer systems understand and manipulate natural languages to perform desired tasks. The foundations of NLP lie in a number of disciplines, namely, computer and information sciences, linguistics, mathematics, electrical and electronic engineering, artificial intelligence and robotics, and psychology. Applications of NLP include a number of fields of study, such as machine translation, natural language text processing and summarization, user interfaces, multilingual and cross-language information retrieval (CLIR), speech recognition, artificial intelligence, and expert systems. One important application area that is relatively new and has not been covered in previous ARIST chapters on NLP relates to the proliferation of the World Wide Web and digital libraries.
  11. Schneider, J.W.; Borlund, P.: ¬A bibliometric-based semiautomatic approach to identification of candidate thesaurus terms : parsing and filtering of noun phrases from citation contexts (2005) 0.02
    Matched terms: "area", "22"; coord 1/8.
    
    Abstract
    The present study investigates the ability of a bibliometric-based semi-automatic method to select candidate thesaurus terms from citation contexts. The method consists of document co-citation analysis, citation context analysis, and noun phrase parsing. The investigation is carried out within the specialty area of periodontology. The results clearly demonstrate that the method is able to select important candidate thesaurus terms within the chosen specialty area.
    Date
    8. 3.2007 19:55:22
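    Code example
    The noun-phrase side of the method described above can be sketched with an off-the-shelf chunker: pull noun phrases out of citation contexts and keep the most frequent ones as candidate thesaurus terms. The chunk grammar, the stand-in contexts and the frequency threshold are assumptions, not the authors' setup (the NLTK models must be downloaded once).
      import nltk
      from collections import Counter

      nltk.download("punkt", quiet=True)
      nltk.download("averaged_perceptron_tagger", quiet=True)

      GRAMMAR = nltk.RegexpParser("NP: {<JJ>*<NN.*>+}")  # adjectives + nouns

      def candidate_terms(contexts, min_freq=1):
          counts = Counter()
          for text in contexts:
              tagged = nltk.pos_tag(nltk.word_tokenize(text))
              tree = GRAMMAR.parse(tagged)
              for np in tree.subtrees(lambda t: t.label() == "NP"):
                  counts[" ".join(w.lower() for w, _ in np.leaves())] += 1
          return [term for term, c in counts.most_common() if c >= min_freq]

      contexts = [
          "Periodontal disease progression was measured by probing depth.",
          "Probing depth is a standard index of periodontal disease.",
      ]
      print(candidate_terms(contexts))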
  12. Pepper, S.: ¬The typology and semantics of binominal lexemes : noun-noun compounds and their functional equivalents (2020) 0.02
    Matched terms: "case", "studies"; coord 2/8.
    
    Abstract
    The dissertation establishes 'binominal lexeme' as a comparative concept and discusses its cross-linguistic typology and semantics. Informally, a binominal lexeme is a noun-noun compound or functional equivalent; more precisely, it is a lexical item that consists primarily of two thing-morphs between which there exists an unstated semantic relation. Examples of binominals include Mandarin Chinese 铁路 (tielù) [iron road], French chemin de fer [way of iron] and Russian железная дорога (zeleznaja doroga) [iron:adjz road]. All of these combine a word denoting 'iron' and a word denoting 'road' or 'way' to denote the meaning railway. In each case, the unstated semantic relation is one of composition: a railway is conceptualized as a road that is composed (or made) of iron. However, three different morphosyntactic strategies are employed: compounding, prepositional phrase and relational adjective. This study explores the range of such strategies used by a worldwide sample of 106 languages to express a set of 100 meanings from various semantic domains, resulting in a classification consisting of nine different morphosyntactic types. The semantic relations found in the data are also explored and a classification called the Hatcher-Bourque system is developed that operates at two levels of granularity, together with a tool for classifying binominals, the Bourquifier. The classification is extended to other subfields of language, including metonymy and lexical semantics, and beyond language to the domain of knowledge representation, resulting in a proposal for a general model of associative relations called the PHAB model. The many findings of the research include universals concerning the recruitment of anchoring nominal modification strategies, a method for comparing non-binary typologies, the non-universality (despite its predominance) of compounding, and a scale of frequencies for semantic relations which may provide insights into the associative nature of human thought.
    Imprint
    Oslo : University of Oslo / Faculty of Humanities / Department of Linguistics and Scandinavian Studies
  13. Ibekwe-SanJuan, F.; SanJuan, E.: From term variants to research topics (2002) 0.02
    Matched terms: "studies", "area"; coord 2/8.
    
    Abstract
    In a scientific and technological watch (STW) task, an expert user needs to survey the evolution of research topics in his area of specialisation in order to detect interesting changes. The majority of methods proposing evaluation metrics (bibliometrics and scientometrics studies) for STW rely solely on statistical data analysis methods (co-citation analysis, co-word analysis). Such methods usually work on structured databases where the units of analysis (words, keywords) are already attributed to documents by human indexers. The advent of huge amounts of unstructured textual data has rendered necessary the integration of natural language processing (NLP) techniques to first extract meaningful units from texts. We propose a method for STW which is NLP-oriented. The method not only analyses texts linguistically in order to extract terms from them, but also uses linguistic relations (syntactic variations) as the basis for clustering. Terms and variation relations are formalised as weighted di-graphs which the clustering algorithm, CPCL (Classification by Preferential Clustered Link), will seek to reduce in order to produce classes. These classes ideally represent the research topics present in the corpus. The results of the classification are subjected to validation by an expert in STW.
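    Code example
    The graph side of the approach can be sketched as follows: term variants linked by sufficiently strong variation relations fall into the same cluster, and the clusters stand in for research topics. The authors' CPCL algorithm is replaced here by plain connected components over a weight threshold, and the terms and weights are invented examples.
      # Cluster term variants over weighted variation links (union-find).
      variation_links = [
          ("information retrieval", "information retrieval system", 0.9),
          ("information retrieval system", "retrieval system", 0.7),
          ("text mining", "text data mining", 0.8),
          ("information retrieval", "text mining", 0.1),  # weak link, ignored
      ]

      def clusters(links, threshold=0.5):
          parent = {}
          def find(x):
              parent.setdefault(x, x)
              while parent[x] != x:
                  parent[x] = parent[parent[x]]  # path halving
                  x = parent[x]
              return x
          for a, b, w in links:
              if w >= threshold:
                  parent[find(a)] = find(b)      # union the two variants
          groups = {}
          for term in parent:
              groups.setdefault(find(term), set()).add(term)
          return list(groups.values())

      for group in clusters(variation_links):
          print(sorted(group))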
  14. Haas, S.W.: ¬A feasibility study of the case hierarchy model for the construction and porting of natural language interfaces (1990) 0.01
    Matched term: "case"; coord 1/8.
    
  15. Working with conceptual structures : contributions to ICCS 2000. 8th International Conference on Conceptual Structures: Logical, Linguistic, and Computational Issues. Darmstadt, August 14-18, 2000 (2000) 0.01
    Matched terms: "case", "studies"; coord 2/8.
    
    Abstract
    The 8th International Conference on Conceptual Structures - Logical, Linguistic, and Computational Issues (ICCS 2000) brings together a wide range of researchers and practitioners working with conceptual structures. During the last few years, the ICCS conference series has considerably widened its scope on different kinds of conceptual structures, stimulating research across domain boundaries. We hope that this stimulation is further enhanced by ICCS 2000 joining the long tradition of conferences in Darmstadt with extensive, lively discussions. This volume consists of contributions presented at ICCS 2000, complementing the volume "Conceptual Structures: Logical, Linguistic, and Computational Issues" (B. Ganter, G.W. Mineau (Eds.), LNAI 1867, Springer, Berlin-Heidelberg 2000). It contains submissions reviewed by the program committee, and position papers. We wish to express our appreciation to all the authors of submitted papers, to the general chair, the program chair, the editorial board, the program committee, and to the additional reviewers for making ICCS 2000 a valuable contribution in the knowledge processing research field. Special thanks go to the local organizers for making the conference an enjoyable and inspiring event. We are grateful to Darmstadt University of Technology, the Ernst Schröder Center for Conceptual Knowledge Processing, the Center for Interdisciplinary Studies in Technology, the Deutsche Forschungsgemeinschaft, Land Hessen, and NaviCon GmbH for their generous support
    Content
    Concepts & Language: Knowledge organization by procedures of natural language processing. A case study using the method GABEK (J. Zelger, J. Gadner) - Computer aided narrative analysis using conceptual graphs (H. Schärfe, P. Øhrstrøm) - Pragmatic representation of argumentative text: a challenge for the conceptual graph approach (H. Irandoust, B. Moulin) - Conceptual graphs as a knowledge representation core in a complex language learning environment (G. Angelova, A. Nenkova, S. Boycheva, T. Nikolov) - Conceptual Modeling and Ontologies: Relationships and actions in conceptual categories (Ch. Landauer, K.L. Bellman) - Concept approximations for formal concept analysis (J. Saquer, J.S. Deogun) - Faceted information representation (U. Priß) - Simple concept graphs with universal quantifiers (J. Tappe) - A framework for comparing methods for using or reusing multiple ontologies in an application (J. van Zyl, D. Corbett) - Designing task/method knowledge-based systems with conceptual graphs (M. Leclère, F. Trichet, Ch. Choquet) - A logical ontology (J. Farkas, J. Sarbo) - Algorithms and Tools: Fast concept analysis (Ch. Lindig) - A framework for conceptual graph unification (D. Corbett) - Visual CP representation of knowledge (H.D. Pfeiffer, R.T. Hartley) - Maximal isojoin for representing software textual specifications and detecting semantic anomalies (Th. Charnois) - Troika: using grids, lattices and graphs in knowledge acquisition (H.S. Delugach, B.E. Lampkin) - Open world theorem prover for conceptual graphs (J.E. Heaton, P. Kocura) - NetCare: a practical conceptual graphs software tool (S. Polovina, D. Strang) - CGWorld - a web based workbench for conceptual graphs management and applications (P. Dobrev, K. Toutanova) - Position papers: The edition project: Peirce's existential graphs (R. Müller) - Mining association rules using formal concept analysis (N. Pasquier) - Contextual logic summary (R. Wille) - Information channels and conceptual scaling (K.E. Wolff) - Spatial concepts - a rule exploration (S. Rudolph) - The TEXT-TO-ONTO learning environment (A. Mädche, St. Staab) - Controlling the semantics of metadata on audio-visual documents using ontologies (Th. Dechilly, B. Bachimont) - Building the ontological foundations of a terminology from natural language to conceptual graphs with Ribosome, a knowledge extraction system (Ch. Jacquelinet, A. Burgun) - CharGer: some lessons learned and new directions (H.S. Delugach) - Knowledge management using conceptual graphs (W.K. Pun)
  16. Witschel, H.F.: Global and local resources for peer-to-peer text retrieval (2008) 0.01
    Matched terms: "studies", "area"; coord 2/8.
    
    Abstract
    Chapter 5 empirically tackles the first of the two research questions formulated above, namely the question of global collection statistics. More precisely, it studies possibilities of radically simplified results merging. The simplification comes from the attempt - without having knowledge of the complete collection - to equip all peers with the same global statistics, making document scores comparable across peers. What is examined is the question of how we can obtain such global statistics and to what extent their use will lead to a drop in retrieval effectiveness. In chapter 6, the second research question is tackled, namely that of making forwarding decisions for queries, based on profiles of other peers. After a review of related work in that area, the chapter first defines the approaches that will be compared against each other. Then, a novel evaluation framework is introduced, including a new measure for comparing results of a distributed search engine against those of a centralised one. Finally, the actual evaluation is performed using the new framework.
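    Code example
    The merging problem the chapter studies can be shown in a few lines: if every peer computes IDF from its own local collection, scores for the same query term are not comparable across peers, while one shared set of global statistics makes a single merged ranking meaningful. All numbers below are toy assumptions.
      from math import log

      def score(tf, df, n_docs):
          return tf * log(n_docs / df)   # plain TF-IDF

      # Document X sits on a small peer where the term is locally rare;
      # document Y sits on a large peer where the same term is common.
      local_x = score(tf=2, df=2, n_docs=100)          # inflated local idf
      local_y = score(tf=3, df=20_000, n_docs=50_000)  # deflated local idf
      print("local :", round(local_x, 3), round(local_y, 3))  # X outranks Y

      # With shared global statistics both peers plug in the same df and N,
      # so the higher-tf document Y correctly wins in the merged list.
      g_df, g_n = 2 + 20_000, 100 + 50_000
      print("global:", round(score(2, g_df, g_n), 3), round(score(3, g_df, g_n), 3))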
  17. Campe, P.: Case, semantic roles, and grammatical relations : a comprehensive bibliography (1994) 0.01
    Matched term: "case"; coord 1/8.
    
  18. Kajanan, S.; Bao, Y.; Datta, A.; VanderMeer, D.; Dutta, K.: Efficient automatic search query formulation using phrase-level analysis (2014) 0.01
    Matched terms: "studies", "area"; coord 2/8.
    
    Abstract
    Over the past decade, the volume of information available digitally over the Internet has grown enormously. Technical developments in the area of search, such as Google's PageRank algorithm, have proved so good at serving relevant results that Internet search has become integrated into daily human activity. One can endlessly explore topics of interest simply by querying and reading through the resulting links. Yet, although search engines are well known for providing relevant results based on users' queries, users do not always receive the results they are looking for. Google's Director of Research describes clickstream evidence of frustrated users repeatedly reformulating queries and searching through page after page of results. Given the general quality of search engine results, one must consider the possibility that the frustrated user's query is not effective; that is, it does not describe the essence of the user's interest. Indeed, extensive research into human search behavior has found that humans are not very effective at formulating good search queries that describe what they are interested in. Ideally, the user should simply point to a portion of text that sparked the user's interest, and a system should automatically formulate a search query that captures the essence of the text. In this paper, we describe an implemented system that provides this capability. We first describe how our work differs from existing work in automatic query formulation, and propose a new method for improved quantification of the relevance of candidate search terms drawn from input text using phrase-level analysis. We then propose an implementable method designed to provide relevant queries based on a user's text input. We demonstrate the quality of our results and performance of our system through experimental studies. Our results demonstrate that our system produces relevant search terms with roughly two-thirds precision and recall compared to search terms selected by experts, and that typical users find significantly more relevant results (31% more relevant) more quickly (64% faster) using our system than self-formulated search queries. Further, we show that our implementation can scale to request loads of up to 10 requests per second within current online responsiveness expectations (<2-second response times at the highest loads tested).
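    Code example
    The core idea above, turning a passage the user points at into a search query, can be sketched with ordinary TF-IDF: weight the passage's terms against a background collection and keep the top few. The background documents, n-gram range and top_k are assumptions; the authors' phrase-level relevance model is more involved.
      from sklearn.feature_extraction.text import TfidfVectorizer

      background = [
          "stock markets fell sharply on inflation fears",
          "the football season opens with a derby win",
          "new telescope images reveal distant galaxies",
      ]

      def formulate_query(passage, top_k=4):
          vec = TfidfVectorizer(ngram_range=(1, 2), stop_words="english")
          matrix = vec.fit_transform(background + [passage])
          weights = matrix[len(background)].toarray().ravel()  # passage row
          terms = vec.get_feature_names_out()
          best = weights.argsort()[::-1][:top_k]
          return " ".join(terms[i] for i in best)

      passage = "astronomers release telescope images of distant galaxies colliding"
      print(formulate_query(passage))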
  19. Sharada, B.A.: Identification and interpretation of metaphors in document titles (1999) 0.01
    Matched term: "studies"; coord 1/8.
    
    Source
    Library science with a slant to documentation and information studies. 36(1999) no.1, S.27-33
  20. Yang, C.C.; Luk, J.: Automatic generation of English/Chinese thesaurus based on a parallel corpus in laws (2003) 0.01
    Matched terms: "libraries", "22"; coord 2/8.
    
    Abstract
    The information available in languages other than English on the World Wide Web is increasing significantly. According to a report from Computer Economics in 1999, 54% of Internet users are English speakers ("English Will Dominate Web for Only Three More Years," Computer Economics, July 9, 1999, http://www.computereconomics.com/new4/pr/pr990610.html). However, it is predicted that there will be only a 60% increase in Internet users among English speakers versus a 150% growth among non-English speakers for the next five years. By 2005, 57% of Internet users will be non-English speakers. A report by CNN.com in 2000 showed that the number of Internet users in China had increased from 8.9 million to 16.9 million from January to June in 2000 ("Report: China Internet users double to 17 million," CNN.com, July, 2000, http://cnn.org/2000/TECH/computing/07/27/china.internet.reut/index.html). According to Nielsen/NetRatings, there was a dramatic leap from 22.5 million to 56.6 million Internet users from 2001 to 2002. China had become the second largest global at-home Internet population in 2002 (the US's Internet population was 166 million) (Robyn Greenspan, "China Pulls Ahead of Japan," Internet.com, April 22, 2002, http://cyberatlas.internet.com/big-picture/geographics/article/0,,5911_1013841,00.html). All of this evidence reveals the importance of cross-lingual research to satisfy the needs of the near future. Digital library research has been focusing on structural and semantic interoperability in the past. Searching and retrieving objects across variations in protocols, formats and disciplines are widely explored (Schatz, B., & Chen, H. (1999). Digital libraries: technological advances and social impacts. IEEE Computer, Special Issue on Digital Libraries, February, 32(2), 45-50.; Chen, H., Yen, J., & Yang, C.C. (1999). International activities: development of Asian digital libraries. IEEE Computer, Special Issue on Digital Libraries, 32(2), 48-49.). However, research in crossing language boundaries, especially between European languages and Oriental languages, is still in its initial stage. In this proposal, we put our focus on cross-lingual semantic interoperability by developing automatic generation of a cross-lingual thesaurus based on an English/Chinese parallel corpus. When searchers encounter retrieval problems, professional librarians usually consult the thesaurus to identify other relevant vocabularies. For the problem of searching across language boundaries, a cross-lingual thesaurus, which is generated by co-occurrence analysis and a Hopfield network, can be used to generate additional semantically relevant terms that cannot be obtained from a dictionary. In particular, the automatically generated cross-lingual thesaurus is able to capture unknown words that do not exist in a dictionary, such as names of persons, organizations, and events. Due to Hong Kong's unique historical background, both English and Chinese are used as official languages in all legal documents. Therefore, English/Chinese cross-lingual information retrieval is critical for applications in courts and the government. In this paper, we develop an automatic thesaurus by the Hopfield network based on a parallel corpus collected from the Web site of the Department of Justice of the Hong Kong Special Administrative Region (HKSAR) Government. Experiments are conducted to measure the precision and recall of the automatically generated English/Chinese thesaurus. The results show that such a thesaurus is a promising tool for retrieving relevant terms, especially in a language different from that of the input term. The direct translation of the input term can also be retrieved in most cases.
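    Code example
    The co-occurrence half of the approach can be sketched from sentence-aligned pairs: count how often each English term co-occurs with each Chinese term and rank the associations. The Hopfield-network spreading activation used in the paper is replaced here by a plain co-occurrence ratio, and the three aligned "sentences" are invented examples.
      from collections import Counter, defaultdict

      aligned = [
          (["court", "judgment"], ["法院", "判决"]),
          (["court", "hearing"], ["法院", "聆讯"]),
          (["judgment", "appeal"], ["判决", "上诉"]),
      ]

      cooc = defaultdict(Counter)   # English term -> Chinese co-occurrence counts
      en_freq = Counter()
      for en_terms, zh_terms in aligned:
          for e in en_terms:
              en_freq[e] += 1
              for z in zh_terms:
                  cooc[e][z] += 1

      def related(term, top_k=2):
          # Normalised co-occurrence as a crude association strength.
          scores = {z: c / en_freq[term] for z, c in cooc[term].items()}
          return sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]

      print(related("court"))  # [('法院', 1.0), ('判决', 0.5)]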

Languages

  • e 119
  • d 21
  • m 2
  • ru 2
  • chi 1
  • f 1

Types

  • a 114
  • m 16
  • el 10
  • s 8
  • x 5
  • p 4
  • b 1
  • d 1
