Search (41 results, page 1 of 3)

  • × theme_ss:"Computerlinguistik"
  • × language_ss:"e"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.10
    0.10109107 = sum of:
      0.080492064 = product of:
        0.24147618 = sum of:
          0.24147618 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
            0.24147618 = score(doc=562,freq=2.0), product of:
              0.42965913 = queryWeight, product of:
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.050679237 = queryNorm
              0.56201804 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.33333334 = coord(1/3)
      0.020599011 = product of:
        0.041198023 = sum of:
          0.041198023 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
            0.041198023 = score(doc=562,freq=2.0), product of:
              0.17747006 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.050679237 = queryNorm
              0.23214069 = fieldWeight in 562, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=562)
        0.5 = coord(1/2)
    
    Content
    Cf.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
  2. Roberts, C.W.; Popping, R.: Computer-supported content analysis : some recent developments (1993) 0.04
    0.04420089 = product of:
      0.08840178 = sum of:
        0.08840178 = product of:
          0.17680356 = sum of:
            0.17680356 = weight(_text_:maps in 4236) [ClassicSimilarity], result of:
              0.17680356 = score(doc=4236,freq=2.0), product of:
                0.28477904 = queryWeight, product of:
                  5.619245 = idf(docFreq=435, maxDocs=44218)
                  0.050679237 = queryNorm
                0.6208447 = fieldWeight in 4236, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.619245 = idf(docFreq=435, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4236)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Presents an overview of some recent developments in the clause-based content analysis of linguistic data. Introduces network analysis of evaluative texts, for the analysis of cognitive maps, and linguistic content analysis. Focuses on the types of substantive inferences afforded by the three approaches.
  3. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.04
    0.040246032 = product of:
      0.080492064 = sum of:
        0.080492064 = product of:
          0.24147618 = sum of:
            0.24147618 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.24147618 = score(doc=862,freq=2.0), product of:
                0.42965913 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.050679237 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Source
    https://arxiv.org/abs/2212.06721
  4. Warner, A.J.: Natural language processing (1987) 0.03
    0.027465349 = product of:
      0.054930698 = sum of:
        0.054930698 = product of:
          0.109861396 = sum of:
            0.109861396 = weight(_text_:22 in 337) [ClassicSimilarity], result of:
              0.109861396 = score(doc=337,freq=2.0), product of:
                0.17747006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679237 = queryNorm
                0.61904186 = fieldWeight in 337, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=337)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Annual review of information science and technology. 22(1987), pp.79-108
  5. Hausser, R.: Language and nonlanguage cognition (2021) 0.03
    0.026520537 = product of:
      0.053041074 = sum of:
        0.053041074 = product of:
          0.10608215 = sum of:
            0.10608215 = weight(_text_:maps in 255) [ClassicSimilarity], result of:
              0.10608215 = score(doc=255,freq=2.0), product of:
                0.28477904 = queryWeight, product of:
                  5.619245 = idf(docFreq=435, maxDocs=44218)
                  0.050679237 = queryNorm
                0.37250686 = fieldWeight in 255, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.619245 = idf(docFreq=435, maxDocs=44218)
                  0.046875 = fieldNorm(doc=255)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    A basic distinction in agent-based data-driven Database Semantics (DBS) is between language and nonlanguage cognition. Language cognition transfers content between agents by means of raw data. Nonlanguage cognition maps between content and raw data inside the focus agent. Recognition applies a concept type to raw data, resulting in a concept token. In language recognition, the focus agent (hearer) takes raw language-data (surfaces) produced by another agent (speaker) as input, while nonlanguage recognition takes raw nonlanguage-data as input. In either case, the output is a content which is stored in the agent's onboard short-term memory. Action adapts a concept type to a purpose, resulting in a token. In language action, the focus agent (speaker) produces language-dependent surfaces for another agent (hearer), while nonlanguage action produces intentions for a nonlanguage purpose. In either case, the output is raw action data. As long as the procedural implementation of placeholder values works properly, it is compatible with the DBS requirement of input-output equivalence between the natural prototype and the artificial reconstruction.
  6. Humphrey, S.M.; Rogers, W.J.; Kilicoglu, H.; Demner-Fushman, D.; Rindflesch, T.C.: Word sense disambiguation by selecting the best semantic type based on journal descriptor indexing : preliminary experiment (2006) 0.03
    0.0250038 = product of:
      0.0500076 = sum of:
        0.0500076 = product of:
          0.1000152 = sum of:
            0.1000152 = weight(_text_:maps in 4912) [ClassicSimilarity], result of:
              0.1000152 = score(doc=4912,freq=4.0), product of:
                0.28477904 = queryWeight, product of:
                  5.619245 = idf(docFreq=435, maxDocs=44218)
                  0.050679237 = queryNorm
                0.35120282 = fieldWeight in 4912, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.619245 = idf(docFreq=435, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4912)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    An experiment was performed at the National Library of Medicine® (NLM®) in word sense disambiguation (WSD) using the Journal Descriptor Indexing (JDI) methodology. The motivation is the need to solve the ambiguity problem confronting NLM's MetaMap system, which maps free text to terms corresponding to concepts in NLM's Unified Medical Language System® (UMLS®) Metathesaurus®. If the text maps to more than one Metathesaurus concept at the same high confidence score, MetaMap has no way of knowing which concept is the correct mapping. We describe the JDI methodology, which is ultimately based on statistical associations between words in a training set of MEDLINE® citations and a small set of journal descriptors (assigned by humans to journals per se) assumed to be inherited by the citations. JDI is the basis for selecting the best meaning that is correlated to UMLS semantic types (STs) assigned to ambiguous concepts in the Metathesaurus. For example, the ambiguity "transport" has two meanings: "Biological Transport" assigned the ST Cell Function and "Patient transport" assigned the ST Health Care Activity. A JDI-based methodology can analyze text containing "transport" and determine which ST receives a higher score for that text, which then returns the associated meaning, presumed to apply to the ambiguity itself. We then present an experiment in which a baseline disambiguation method was compared to four versions of JDI in disambiguating 45 ambiguous strings from NLM's WSD Test Collection. Overall average precision for the highest-scoring JDI version was 0.7873 compared to 0.2492 for the baseline method, and average precision for individual ambiguities was greater than 0.90 for 23 of them (51%), greater than 0.85 for 24 (53%), and greater than 0.65 for 35 (79%). On the basis of these results, we hope to improve performance of JDI and test its use in applications.
  7. McMahon, J.G.; Smith, F.J.: Improved statistical language model performance with automatic generated word hierarchies (1996) 0.02
    0.02403218 = product of:
      0.04806436 = sum of:
        0.04806436 = product of:
          0.09612872 = sum of:
            0.09612872 = weight(_text_:22 in 3164) [ClassicSimilarity], result of:
              0.09612872 = score(doc=3164,freq=2.0), product of:
                0.17747006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679237 = queryNorm
                0.5416616 = fieldWeight in 3164, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3164)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Computational linguistics. 22(1996) no.2, pp.217-248
  8. Ruge, G.: ¬A spreading activation network for automatic generation of thesaurus relationships (1991) 0.02
    0.02403218 = product of:
      0.04806436 = sum of:
        0.04806436 = product of:
          0.09612872 = sum of:
            0.09612872 = weight(_text_:22 in 4506) [ClassicSimilarity], result of:
              0.09612872 = score(doc=4506,freq=2.0), product of:
                0.17747006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679237 = queryNorm
                0.5416616 = fieldWeight in 4506, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4506)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    8.10.2000 11:52:22
  9. Somers, H.: Example-based machine translation : Review article (1999) 0.02
    0.02403218 = product of:
      0.04806436 = sum of:
        0.04806436 = product of:
          0.09612872 = sum of:
            0.09612872 = weight(_text_:22 in 6672) [ClassicSimilarity], result of:
              0.09612872 = score(doc=6672,freq=2.0), product of:
                0.17747006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679237 = queryNorm
                0.5416616 = fieldWeight in 6672, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6672)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 7.1996 9:22:19
  10. New tools for human translators (1997) 0.02
    0.02403218 = product of:
      0.04806436 = sum of:
        0.04806436 = product of:
          0.09612872 = sum of:
            0.09612872 = weight(_text_:22 in 1179) [ClassicSimilarity], result of:
              0.09612872 = score(doc=1179,freq=2.0), product of:
                0.17747006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679237 = queryNorm
                0.5416616 = fieldWeight in 1179, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=1179)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 7.1996 9:22:19
  11. Baayen, R.H.; Lieber, H.: Word frequency distributions and lexical semantics (1997) 0.02
    0.02403218 = product of:
      0.04806436 = sum of:
        0.04806436 = product of:
          0.09612872 = sum of:
            0.09612872 = weight(_text_:22 in 3117) [ClassicSimilarity], result of:
              0.09612872 = score(doc=3117,freq=2.0), product of:
                0.17747006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679237 = queryNorm
                0.5416616 = fieldWeight in 3117, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3117)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    28. 2.1999 10:48:22
  12. Witschel, H.F.: Terminology extraction and automatic indexing : comparison and qualitative evaluation of methods (2005) 0.02
    0.022100445 = product of:
      0.04420089 = sum of:
        0.04420089 = product of:
          0.08840178 = sum of:
            0.08840178 = weight(_text_:maps in 1842) [ClassicSimilarity], result of:
              0.08840178 = score(doc=1842,freq=2.0), product of:
                0.28477904 = queryWeight, product of:
                  5.619245 = idf(docFreq=435, maxDocs=44218)
                  0.050679237 = queryNorm
                0.31042236 = fieldWeight in 1842, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.619245 = idf(docFreq=435, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1842)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Many terminology engineering processes involve the task of automatic terminology extraction: before the terminology of a given domain can be modelled, organised or standardised, important concepts (or terms) of this domain have to be identified and fed into terminological databases. These serve in further steps as a starting point for compiling dictionaries, thesauri or maybe even terminological ontologies for the domain. For the extraction of the initial concepts, extraction methods are needed that operate on specialised language texts. On the other hand, many machine learning or information retrieval applications require automatic indexing techniques. In Machine Learning applications concerned with the automatic clustering or classification of texts, often feature vectors are needed that describe the contents of a given text briefly but meaningfully. These feature vectors typically consist of a fairly small set of index terms together with weights indicating their importance. Short but meaningful descriptions of document contents as provided by good index terms are also useful to humans: some knowledge management applications (e.g. topic maps) use them as a set of basic concepts (topics). The author believes that the tasks of terminology extraction and automatic indexing have much in common and can thus benefit from the same set of basic algorithms. It is the goal of this paper to outline some methods that may be used in both contexts, but also to find the discriminating factors between the two tasks that call for the variation of parameters or application of different techniques. The discussion of these methods will be based on statistical, syntactical and especially morphological properties of (index) terms. The paper is concluded by the presentation of some qualitative and quantitative results comparing statistical and morphological methods.
  13. Lian, T.; Yu, C.; Wang, W.; Yuan, Q.; Hou, Z.: Doctoral dissertations on tourism in China : a co-word analysis (2016) 0.02
    0.022100445 = product of:
      0.04420089 = sum of:
        0.04420089 = product of:
          0.08840178 = sum of:
            0.08840178 = weight(_text_:maps in 3178) [ClassicSimilarity], result of:
              0.08840178 = score(doc=3178,freq=2.0), product of:
                0.28477904 = queryWeight, product of:
                  5.619245 = idf(docFreq=435, maxDocs=44218)
                  0.050679237 = queryNorm
                0.31042236 = fieldWeight in 3178, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.619245 = idf(docFreq=435, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3178)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The aim of this paper is to map the foci of research in doctoral dissertations on tourism in China. In the paper, co-word analysis is applied, with keywords coming from six public dissertation databases, i.e. CDFD, Wanfang Data, NLC, CALIS, ISTIC, and NSTL, as well as some university libraries providing doctoral dissertations on tourism. Altogether we have examined 928 doctoral dissertations on tourism written between 1989 and 2013. Doctoral dissertations on tourism in China involve 36 first level disciplines and 102 secondary level disciplines. We collect the top 68 keywords of practical significance in tourism which are mentioned at least four times. These keywords are classified into 12 categories based on co-word analysis, including cluster analysis, strategic diagrams analysis, and social network analysis. According to the strategic diagram of the 12 categories, we find the mature and immature areas in tourism study. From social networks, we can see the social network maps of the original co-occurrence matrix and the k-cores analysis of the binary matrix. The paper provides valuable insight into the study of tourism by analyzing doctoral dissertations on tourism in China.
  14. Byrne, C.C.; McCracken, S.A.: ¬An adaptive thesaurus employing semantic distance, relational inheritance and nominal compound interpretation for linguistic support of information retrieval (1999) 0.02
    0.020599011 = product of:
      0.041198023 = sum of:
        0.041198023 = product of:
          0.082396045 = sum of:
            0.082396045 = weight(_text_:22 in 4483) [ClassicSimilarity], result of:
              0.082396045 = score(doc=4483,freq=2.0), product of:
                0.17747006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679237 = queryNorm
                0.46428138 = fieldWeight in 4483, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4483)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    15. 3.2000 10:22:37
  15. Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009) 0.02
    0.020599011 = product of:
      0.041198023 = sum of:
        0.041198023 = product of:
          0.082396045 = sum of:
            0.082396045 = weight(_text_:22 in 4888) [ClassicSimilarity], result of:
              0.082396045 = score(doc=4888,freq=2.0), product of:
                0.17747006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679237 = queryNorm
                0.46428138 = fieldWeight in 4888, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4888)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 3.2013 14:56:22
  16. Hutchins, J.: From first conception to first demonstration : the nascent years of machine translation, 1947-1954. A chronology (1997) 0.02
    0.017165843 = product of:
      0.034331687 = sum of:
        0.034331687 = product of:
          0.06866337 = sum of:
            0.06866337 = weight(_text_:22 in 1463) [ClassicSimilarity], result of:
              0.06866337 = score(doc=1463,freq=2.0), product of:
                0.17747006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679237 = queryNorm
                0.38690117 = fieldWeight in 1463, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1463)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 7.1996 9:22:19
  17. Wanner, L.: Lexical choice in text generation and machine translation (1996) 0.01
    0.0137326745 = product of:
      0.027465349 = sum of:
        0.027465349 = product of:
          0.054930698 = sum of:
            0.054930698 = weight(_text_:22 in 8521) [ClassicSimilarity], result of:
              0.054930698 = score(doc=8521,freq=2.0), product of:
                0.17747006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679237 = queryNorm
                0.30952093 = fieldWeight in 8521, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=8521)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 7.1996 9:22:19
  18. Riloff, E.: ¬An empirical study of automated dictionary construction for information extraction in three domains (1996) 0.01
    0.0137326745 = product of:
      0.027465349 = sum of:
        0.027465349 = product of:
          0.054930698 = sum of:
            0.054930698 = weight(_text_:22 in 6752) [ClassicSimilarity], result of:
              0.054930698 = score(doc=6752,freq=2.0), product of:
                0.17747006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679237 = queryNorm
                0.30952093 = fieldWeight in 6752, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6752)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    6. 3.1997 16:22:15
  19. Basili, R.; Pazienza, M.T.; Velardi, P.: ¬An empirical symbolic approach to natural language processing (1996) 0.01
    0.0137326745 = product of:
      0.027465349 = sum of:
        0.027465349 = product of:
          0.054930698 = sum of:
            0.054930698 = weight(_text_:22 in 6753) [ClassicSimilarity], result of:
              0.054930698 = score(doc=6753,freq=2.0), product of:
                0.17747006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679237 = queryNorm
                0.30952093 = fieldWeight in 6753, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6753)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    6. 3.1997 16:22:15
  20. Haas, S.W.: Natural language processing : toward large-scale, robust systems (1996) 0.01
    0.0137326745 = product of:
      0.027465349 = sum of:
        0.027465349 = product of:
          0.054930698 = sum of:
            0.054930698 = weight(_text_:22 in 7415) [ClassicSimilarity], result of:
              0.054930698 = score(doc=7415,freq=2.0), product of:
                0.17747006 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679237 = queryNorm
                0.30952093 = fieldWeight in 7415, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7415)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    State of the art review of natural language processing updating an earlier review published in ARIST 22(1987). Discusses important developments that have allowed for significant advances in the field of natural language processing: materials and resources; knowledge based systems and statistical approaches; and a strong emphasis on evaluation. Reviews some natural language processing applications and common problems still awaiting solution. Considers closely related applications such as language generation and the generation phase of machine translation which face the same problems as natural language processing. Covers natural language methodologies for information retrieval only briefly.
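
The score breakdowns printed under each hit appear to follow the TF-IDF arithmetic of Lucene's ClassicSimilarity: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), and each term score is queryWeight × fieldWeight, scaled by the coord factors. As a rough sketch, the total for hit 1 (doc 562) can be reproduced from the numbers shown in its explain tree. The function and variable names below are illustrative assumptions and not part of the search system; queryNorm and fieldNorm are simply copied from the listing rather than derived.

```python
import math

# Values copied from the explain output above (not derived here).
QUERY_NORM = 0.050679237
MAX_DOCS = 44218


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                       # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM       # idf * queryNorm
    field_weight = tf * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Hit 1 (doc 562): term "3a" with coord(1/3), term "22" with coord(1/2).
score_3a = term_score(freq=2.0, doc_freq=24, field_norm=0.046875)
score_22 = term_score(freq=2.0, doc_freq=3622, field_norm=0.046875)
total = score_3a * (1 / 3) + score_22 * (1 / 2)

print(f"3a: {score_3a:.6f}  22: {score_22:.6f}  total: {total:.6f}")
```

The listing's values are single-precision floats, so a double-precision recomputation should agree only to about six decimal places. The same function also accounts for the ranking differences among the "22" hits: they share freq, docFreq and queryNorm, so only fieldNorm (a length normalisation that is larger for shorter fields) separates them.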