Search (454 results, page 1 of 23)

  • Filter: theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.24
    0.23999754 = product of:
      0.3199967 = sum of:
        0.075188726 = product of:
          0.22556618 = sum of:
            0.22556618 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.22556618 = score(doc=562,freq=2.0), product of:
                0.40135044 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.047340166 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.22556618 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.22556618 = score(doc=562,freq=2.0), product of:
            0.40135044 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.047340166 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.019241815 = product of:
          0.03848363 = sum of:
            0.03848363 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.03848363 = score(doc=562,freq=2.0), product of:
                0.16577719 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.047340166 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
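    The breakdown above is Lucene "explain" output for the classic TF-IDF similarity; every score on this page is assembled the same way from per-term clauses. As a minimal sketch, the "3a in 562" clause can be reproduced in Python from the constants shown, assuming Lucene's ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs/(docFreq+1)), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm); the results agree with the listing up to float rounding:

      import math

      # Constants copied from the "3a in 562" clause above.
      MAX_DOCS   = 44218        # maxDocs in the index
      DOC_FREQ   = 24           # documents containing the term
      FREQ       = 2.0          # occurrences of the term in doc 562
      FIELD_NORM = 0.046875     # index-time length normalization
      QUERY_NORM = 0.047340166  # query normalization factor

      idf          = 1.0 + math.log(MAX_DOCS / (DOC_FREQ + 1.0))  # 8.478011
      tf           = math.sqrt(FREQ)                              # 1.4142135
      query_weight = idf * QUERY_NORM                             # 0.40135044
      field_weight = tf * idf * FIELD_NORM                        # 0.56201804
      score        = query_weight * field_weight                  # 0.22556618

      print(f"idf={idf:.6f} queryWeight={query_weight:.8f} "
            f"fieldWeight={field_weight:.8f} score={score:.8f}")

    The coord(1/3) and coord(3/4) factors then scale each sum by the fraction of query clauses that matched, giving the final 0.23999754 above.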
    
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
  2. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.19
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with the LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language- and domain-independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human-written summaries in a large collection of web pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the alignment process from a training set and focuses on selecting high-quality multi-word terms from human-written summaries to generate suitable results for web-page summarization.
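    The thesis's own association measures are not reproduced here, but the general idea can be illustrated with a standard one. A toy Python sketch scoring bigram cohesion with the Dice coefficient (an assumption for illustration, not the thesis's measures; LocalMaxs would additionally keep only n-grams whose score is a local maximum relative to their sub- and super-n-grams):

      from collections import Counter
      from itertools import tee

      def bigrams(tokens):
          a, b = tee(tokens)
          next(b, None)
          return zip(a, b)

      tokens = ("multi word term extraction finds multi word terms "
                "and multi word terms improve summarization").split()

      uni = Counter(tokens)
      bi = Counter(bigrams(tokens))

      # Dice association: 2*f(xy) / (f(x) + f(y)); higher = more cohesive.
      # Hapax pairs tie at 1.0, which is why real systems add frequency
      # thresholds and a filter such as LocalMaxs on top.
      dice = {p: 2 * c / (uni[p[0]] + uni[p[1]]) for p, c in bi.items()}
      for pair, s in sorted(dice.items(), key=lambda kv: -kv[1])[:3]:
          print(" ".join(pair), round(s, 3))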
    Content
    A thesis presented to the University of Guelph in partial fulfilment of the requirements for the degree of Master of Science in Computer Science. Cf.: http://www.inf.ufrgs.br/~ceramisch/download_files/publications/2009/p01.pdf.
    Date
    10. 1.2013 19:22:47
  3. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.15
    Source
    https://arxiv.org/abs/2212.06721
  4. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.05
    Abstract
    The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon, and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes the development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language System (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list; 2) the word attributes that define part-of-speech and morphological relationships between words in the list; and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
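    As a rough illustration of the "determine similarity between words" step, a minimal Python sketch using the standard library's difflib (the vocabulary entries below are hypothetical stand-ins for the AZdict word list; the paper does not specify ChemSpell's actual matching algorithm):

      import difflib

      # Hypothetical stand-in for the AZdict vocabulary list.
      vocabulary = ["toxicology", "benzene", "chloroform", "carcinogen",
                    "arsenic", "dioxin"]

      def suggest(term, n=3):
          # Rank vocabulary words by similarity ratio (0..1), keep the best n.
          return difflib.get_close_matches(term.lower(), vocabulary,
                                           n=n, cutoff=0.6)

      print(suggest("toxicolgy"))  # -> ['toxicology']
      print(suggest("benzine"))    # -> ['benzene']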
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
  5. Warner, A.J.: Natural language processing (1987) 0.04
    Source
    Annual review of information science and technology. 22(1987), S.79-108
  6. Byrne, C.C.; McCracken, S.A.: ¬An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval (1999) 0.03
    Date
    15. 3.2000 10:22:37
    Source
    Journal of information science. 25(1999) no.2, S.113-131
  7. Paolillo, J.C.: Linguistics and the information sciences (2009) 0.03
    Abstract
    Linguistics is the scientific study of language, with an emphasis on language as spoken in everyday settings by human beings. It has a long history of interdisciplinarity, both internally and in contribution to other fields, including information science. A linguistic perspective is beneficial in many ways in information science, since it examines the relationship between the forms of meaningful expressions and their social, cognitive, institutional, and communicative context, these being two perspectives on information that are actively studied, to different degrees, in information science. Examples of issues relevant to information science are presented, illustrating the approach a linguistic perspective takes to each.
    Date
    27. 8.2011 14:22:33
    Source
    Encyclopedia of library and information sciences. 3rd ed. Ed.: M.J. Bates
  8. Addison, E.R.; Wilson, H.D.; Feder, J.: ¬The impact of plain English searching on end users (1993) 0.03
    Abstract
    Commercial software products are available with plain English searching capabilities as engines for online and CD-ROM information services, and for internal text information management. With plain English interfaces, end users do not need to master the keyword-and-connector approach of the Boolean search query language. Describes plain English searching and its impact on the process of full-text retrieval. Explores the issues of ease of use, reliability, and implications for the total research process.
    Imprint
    Medford, NJ : Learned Information
  9. Riloff, E.: ¬An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.02
    Abstract
    AutoSlog is a system that addresses the knowledge engineering bottleneck for information extraction. AutoSlog automatically creates domain-specific dictionaries for information extraction, given an appropriate training corpus. Describes experiments with AutoSlog in the terrorism, joint-ventures, and microelectronics domains. Compares the performance of AutoSlog across the three domains, discusses the lessons learned, and presents results from two experiments which demonstrate that novice users can generate effective dictionaries using AutoSlog.
    Date
    6. 3.1997 16:22:15
  10. Liddy, E.D.: Natural language processing for information retrieval and knowledge discovery (1998) 0.02
    Abstract
    Natural language processing (NLP) is a powerful technology for the vital tasks of information retrieval (IR) and knowledge discovery (KD), which in turn feed the visualization systems of the present and future and enable knowledge workers to focus more of their time on analysis and prediction.
    Date
    22. 9.1997 19:16:05
    Imprint
    Urbana-Champaign, IL : Illinois University at Urbana-Champaign, Graduate School of Library and Information Science
    Source
    Visualizing subject access for 21st century information resources: Papers presented at the 1997 Clinic on Library Applications of Data Processing, 2-4 Mar 1997, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Ed.: P.A. Cochrane et al
  11. Wright, L.W.; Nardini, H.K.G.; Aronson, A.R.; Rindflesch, T.C.: Hierarchical concept indexing of full-text documents in the Unified Medical Language System Information sources Map (1999) 0.02
    Abstract
    Full-text documents are a vital and rapidly growing part of online biomedical information. A single large document can contain as much information as a small database, but normally lacks the tight structure and consistent indexing of a database. Retrieval systems will often miss highly relevant parts of a document if the document as a whole appears irrelevant. Access to full-text information is further complicated by the need to search separately many disparate information resources. This research explores how these problems can be addressed by the combined use of 2 techniques: 1) natural language processing for automatic concept-based indexing of full text, and 2) methods for exploiting the structure and hierarchy of full-text documents. We describe methods for applying these techniques to a large collection of full-text documents drawn from the Health Services / Technology Assessment Text (HSTAT) database at the NLM and examine how this hierarchical concept indexing can assist both document- and source-level retrieval in the context of NLM's Information Source Map project
    Source
    Journal of the American Society for Information Science. 50(1999) no.6, S.514-523
  12. Haas, S.W.: Natural language processing : toward large-scale, robust systems (1996) 0.02
    Abstract
    State-of-the-art review of natural language processing updating an earlier review published in ARIST 22(1987). Discusses important developments that have allowed for significant advances in the field of natural language processing: materials and resources; knowledge-based systems and statistical approaches; and a strong emphasis on evaluation. Reviews some natural language processing applications and common problems still awaiting solution. Considers closely related applications such as language generation and the generation phase of machine translation, which face the same problems as natural language processing. Covers natural language methodologies for information retrieval only briefly.
    Source
    Annual review of information science and technology. 31(1996), S.83-119
  13. Cimiano, P.; Völker, J.; Studer, R.: Ontologies on demand? : a description of the state-of-the-art, applications, challenges and trends for ontology learning from text (2006) 0.02
    Abstract
    Ontologies are nowadays used for many applications requiring data, services, and resources in general to be interoperable and machine-understandable. Examples of such applications are web service discovery and composition, information integration across databases, and intelligent search. The general idea is that data and services are semantically described with respect to ontologies, which are formal specifications of a domain of interest, and can thus be shared and reused such that the shared meaning specified by the ontology remains formally the same across different parties and applications. As the cost of creating ontologies is relatively high, different proposals have emerged for learning ontologies from structured and unstructured resources. In this article we examine the maturity of techniques for ontology learning from textual resources, addressing the question of whether the state of the art is mature enough to produce ontologies 'on demand'.
    Source
    Information - Wissenschaft und Praxis. 57(2006) H.6/7, S.315-320
  14. Wanner, L.: Lexical choice in text generation and machine translation (1996) 0.02
    Abstract
    Presents the state of the art in lexical choice research in text generation and machine translation. Discusses the existing implementations with respect to: the place of lexical choice in the overall generation process; the information flow within the generation process and the consequences thereof for lexical choice; the internal organization of the lexical choice process; and the phenomena covered by lexical choice. Identifies possible future directions in lexical choice research.
    Date
    31. 7.1996 9:22:19
  15. Morris, V.: Automated language identification of bibliographic resources (2020) 0.02
    Abstract
    This article describes experiments in the use of machine learning techniques at the British Library to assign language codes to catalog records, in order to provide information about the language of content of the resources described. In the first phase of the project, language codes were assigned to 1.15 million records with 99.7% confidence. The automated language identification tools developed will be used to contribute to future enhancement of over 4 million legacy records.
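    The article does not specify the British Library's features or model, but one common approach to this task is character-trigram profiling. A minimal sketch under that assumption (the per-language samples are tiny illustrative stand-ins for real training data):

      from collections import Counter

      def trigrams(text):
          text = f" {text.lower()} "
          return Counter(text[i:i + 3] for i in range(len(text) - 2))

      # Tiny per-language samples keyed by MARC language codes.
      profiles = {lang: trigrams(sample) for lang, sample in {
          "eng": "the quick brown fox jumps over the lazy dog by the river",
          "ger": "der schnelle braune fuchs springt ueber den faulen hund",
          "fre": "le renard brun rapide saute par dessus le chien paresseux",
      }.items()}

      def identify(text):
          t = trigrams(text)
          # Score a language by the overlapping trigram mass with its profile.
          return max(profiles, key=lambda lang:
                     sum(min(c, profiles[lang][g]) for g, c in t.items()))

      print(identify("the brown dog by the river"))  # -> eng
      print(identify("der braune hund"))             # -> ger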
    Date
    2. 3.2020 19:04:22
  16. Reyes Ayala, B.; Knudson, R.; Chen, J.; Cao, G.; Wang, X.: Metadata records machine translation combining multi-engine outputs with limited parallel data (2018) 0.02
    Abstract
    One way to facilitate Multilingual Information Access (MLIA) for digital libraries is to generate multilingual metadata records by applying Machine Translation (MT) techniques. Current online MT services are available and affordable, but are not always effective for creating multilingual metadata records. In this study, we implemented 3 different MT strategies and evaluated their performance when translating English metadata records to Chinese and Spanish. These strategies included combining MT results from 3 online MT systems (Google, Bing, and Yahoo!) with and without additional linguistic resources, such as manually-generated parallel corpora, and metadata records in the two target languages obtained from international partners. The open-source statistical MT platform Moses was applied to design and implement the three translation strategies. Human evaluation of the MT results using adequacy and fluency demonstrated that two of the strategies produced higher-quality translations than individual online MT systems for both languages. In particular, adding small, manually-generated parallel corpora of metadata records significantly improved translation performance. Our study suggested an effective and efficient MT approach for providing multilingual services for digital collections.
    Source
    Journal of the Association for Information Science and Technology. 69(2018) no.1, S.47-59
  17. Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.02
    Abstract
    The language barrier is the major problem that people face in searching for, retrieving, and understanding multilingual collections on the Internet. This paper deals with query translation and document translation in a Chinese-English information retrieval system called MTIR. Bilingual dictionary and monolingual corpus-based approaches are adopted to select suitable translated query terms. A machine transliteration algorithm is introduced to resolve proper name searching. We consider several design issues for document translation, including which material is translated, what roles the HTML tags play in translation, what the tradeoff is between speed and translation performance, and in what form the translated result is presented. About 100,000 Web pages translated in the last 4 months of 1997 are used for a quantitative study of online and real-time Web page translation.
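    A toy sketch of the dictionary-lookup-plus-corpus-selection idea in Python (the dictionary entries, pinyin placeholders, and raw-frequency criterion below are illustrative assumptions; MTIR's actual selection uses richer corpus statistics such as term co-occurrence):

      from collections import Counter

      # Hypothetical bilingual dictionary: English -> Chinese candidates
      # (pinyin placeholders stand in for the actual characters).
      bilingual = {
          "bank": ["yinhang", "hepan"],   # financial bank vs. river bank
          "information": ["xinxi"],
      }

      # Toy target-language corpus statistics used for disambiguation.
      corpus_freq = Counter({"yinhang": 120, "hepan": 7, "xinxi": 300})

      def translate_query(terms):
          # Keep untranslatable terms as-is; otherwise pick the candidate
          # best supported by the target-language corpus.
          return [max(bilingual.get(t, [t]), key=lambda c: corpus_freq[c])
                  for t in terms]

      print(translate_query(["bank", "information"]))  # ['yinhang', 'xinxi']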
    Date
    16. 2.2000 14:22:39
    Source
    Journal of the American Society for Information Science. 51(2000) no.3, S.281-296
  18. Navarretta, C.; Pedersen, B.S.; Hansen, D.H.: Language technology in knowledge-organization systems (2006) 0.02
    Abstract
    This paper describes the language technology methods developed in the Danish research project VID to extract, from Danish text material, information relevant for the population of knowledge organization systems (KOS) within specific corporate domains. The results achieved by applying these methods to a prototype search engine tuned to the patent and trademark domain indicate that the use of human language technology can support the construction of a linguistically based KOS and that linguistic information in search improves recall substantially without harming precision (near 90%). Finally, we describe two research experiments in which (1) linguistic analysis of Danish compounds is exploited to improve search strategies on these, and (2) linguistic knowledge is used to model corporate knowledge into a language-based ontology.
    Content
    Beitrag eines Themenheftes "Knowledge organization systems and services"
  19. Schneider, J.W.; Borlund, P.: ¬A bibliometric-based semiautomatic approach to identification of candidate thesaurus terms : parsing and filtering of noun phrases from citation contexts (2005) 0.02
    Date
    8. 3.2007 19:55:22
    Source
    Context: nature, impact and role. 5th International Conference an Conceptions of Library and Information Sciences, CoLIS 2005 Glasgow, UK, June 2005. Ed. by F. Crestani u. I. Ruthven
  20. Schöneberg, U.; Sperber, W.: POS tagging and its applications for mathematics (2014) 0.02
    Abstract
    Content analysis of scientific publications is a nontrivial task, but a useful and important one for scientific information services. In the Gutenberg era it was the domain of human experts; in the digital age many machine-based methods, e.g., graph analysis tools and machine-learning techniques, have been developed for it. Natural Language Processing (NLP) is a powerful machine-learning approach to semiautomatic speech and language processing, which is also applicable to mathematics. The well-established methods of NLP have to be adjusted for the special needs of mathematics, in particular for handling mathematical formulae. We demonstrate a mathematics-aware part-of-speech tagger and give a short overview of our adaptation of NLP methods for mathematical publications. We show how the tools developed are used for key-phrase extraction and classification in the zbMATH database.
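    To see why off-the-shelf taggers need such adjustment, a quick check with NLTK's standard English tagger (not the authors' mathematics-aware tagger; assumes a standard NLTK installation with the 'averaged_perceptron_tagger' resource) shows formula tokens receiving ordinary word tags:

      import nltk
      nltk.download("averaged_perceptron_tagger", quiet=True)

      # Pre-tokenized to avoid needing a sentence tokenizer; formula
      # symbols like 'f', '(', 'x', '^' get generic tags such as NN/JJ,
      # losing their mathematical role.
      tokens = "Let f ( x ) = x ^ 2 be a convex function .".split()
      print(nltk.pos_tag(tokens))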

Types

  • a 390
  • m 34
  • el 26
  • s 19
  • x 10
  • p 3
  • d 2
  • b 1
