Search (202 results, page 2 of 11)

  • × theme_ss:"Computerlinguistik"
  • × type_ss:"a"
  • × year_i:[1990 TO 2000}
  1. Liddy, E.D.: Natural language processing for information retrieval and knowledge discovery (1998) 0.02
    0.018662978 = product of:
      0.027994465 = sum of:
        0.0065699257 = weight(_text_:a in 2345) [ClassicSimilarity], result of:
          0.0065699257 = score(doc=2345,freq=4.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.12611452 = fieldWeight in 2345, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2345)
        0.02142454 = product of:
          0.04284908 = sum of:
            0.04284908 = weight(_text_:22 in 2345) [ClassicSimilarity], result of:
              0.04284908 = score(doc=2345,freq=2.0), product of:
                0.15821345 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045180224 = queryNorm
                0.2708308 = fieldWeight in 2345, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2345)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Natural language processing (NLP) is a powerful technology for the vital tasks of information retrieval (IR) and knowledge discovery (KD) which, in turn, feed the visualization systems of the present and future and enable knowledge workers to focus more of their time on the vital tasks of analysis and prediction
    Date
    22. 9.1997 19:16:05
    Type
    a
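    The score breakdown above is Lucene's ClassicSimilarity explain output: each term weight is tf × idf × fieldNorm, scaled by the queryWeight (idf × queryNorm), and coord() scales the total by the fraction of query clauses that matched. A minimal Python sketch reproducing the displayed numbers, with every constant taken directly from the explain tree:

      import math

      # Minimal sketch of Lucene ClassicSimilarity; all constants are
      # copied from the explain output of result 1 above.
      def term_score(freq, idf, query_norm, field_norm):
          tf = math.sqrt(freq)                  # 1.4142135 = tf(freq=2.0)
          query_weight = idf * query_norm       # e.g. 0.15821345
          field_weight = tf * idf * field_norm  # e.g. 0.2708308
          return query_weight * field_weight

      score_a = term_score(4.0, 1.153047, 0.045180224, 0.0546875)
      score_22 = term_score(2.0, 3.5018296, 0.045180224, 0.0546875)

      # coord() multiplies by matched/total clauses: 1/2 inside, 2/3 outside.
      total = (score_a + score_22 * 0.5) * (2.0 / 3.0)
      print(total)  # ~0.018662978, the displayed document score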
  2. Godby, J.: WordSmith research project bridges gap between tokens and indexes (1998) 0.02
    0.018662978 = product of:
      0.027994465 = sum of:
        0.0065699257 = weight(_text_:a in 4729) [ClassicSimilarity], result of:
          0.0065699257 = score(doc=4729,freq=4.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.12611452 = fieldWeight in 4729, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4729)
        0.02142454 = product of:
          0.04284908 = sum of:
            0.04284908 = weight(_text_:22 in 4729) [ClassicSimilarity], result of:
              0.04284908 = score(doc=4729,freq=2.0), product of:
                0.15821345 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045180224 = queryNorm
                0.2708308 = fieldWeight in 4729, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4729)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Reports on an OCLC natural language processing research project to develop methods for identifying terminology in unstructured electronic text, especially material associated with new cultural trends and emerging subjects. Current OCLC production software can only identify single words as indexable terms in full text documents, thus a major goal of the WordSmith project is to develop software that can automatically identify and intelligently organize phrases for use in database indexes. By analyzing user terminology from local newspapers in the USA, the latest cultural trends and technical developments as well as personal and geographic names have been drawn out. Notes that this new vocabulary can also be mapped into reference works
    Source
    OCLC newsletter. 1998, no.234, Jul/Aug, S.22-24
    Type
    a
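    The abstract above turns on moving from single-word tokens to multi-word index phrases. OCLC's actual WordSmith method is not described in enough detail to reproduce; the following is a hypothetical stand-in that extracts candidate phrases by simple adjacency and frequency, only to illustrate the kind of output such software produces:

      import re
      from collections import Counter

      STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "for", "on"}

      def candidate_phrases(text, max_len=3, min_freq=2):
          tokens = re.findall(r"[a-z]+", text.lower())
          phrases = Counter()
          for n in range(2, max_len + 1):
              for i in range(len(tokens) - n + 1):
                  gram = tokens[i:i + n]
                  # keep phrases whose boundary words carry content
                  if gram[0] not in STOPWORDS and gram[-1] not in STOPWORDS:
                      phrases[" ".join(gram)] += 1
          return [p for p, f in phrases.most_common() if f >= min_freq]

      text = "machine translation systems and machine translation tools"
      print(candidate_phrases(text))  # ['machine translation']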
  3. Dorr, B.J.: Large-scale dictionary construction for foreign language tutoring and interlingual machine translation (1997) 0.02
    0.017551895 = product of:
      0.026327841 = sum of:
        0.007963953 = weight(_text_:a in 3244) [ClassicSimilarity], result of:
          0.007963953 = score(doc=3244,freq=8.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.15287387 = fieldWeight in 3244, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=3244)
        0.01836389 = product of:
          0.03672778 = sum of:
            0.03672778 = weight(_text_:22 in 3244) [ClassicSimilarity], result of:
              0.03672778 = score(doc=3244,freq=2.0), product of:
                0.15821345 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045180224 = queryNorm
                0.23214069 = fieldWeight in 3244, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3244)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Describes techniques for automatic construction of dictionaries for use in large-scale foreign language tutoring (FLT) and interlingual machine translation (MT) systems. The dictionaries are based on a language independent representation called lexical conceptual structure (LCS). Demonstrates that synonymous verb senses share distribution patterns. Shows how the syntax-semantics relation can be used to develop a lexical acquisition approach that contributes both toward the enrichment of existing online resources and toward the development of lexicons containing more complete information than is provided in any of these resources alone. Describes the structure of the LCS and shows how this representation is used in FLT and MT. Focuses on the problem of building LCS dictionaries for large-scale FLT and MT. Describes authoring tools for manual and semi-automatic construction of LCS dictionaries. Presents an approach that uses linguistic techniques for building word definitions automatically. The techniques have been implemented as part of a set of lexicon-development tools used in the MILT FLT project
    Date
    31. 7.1996 9:22:19
    Type
    a
  4. Rahmstorf, G.: Concept structures for large vocabularies (1998) 0.02
    0.017551895 = product of:
      0.026327841 = sum of:
        0.007963953 = weight(_text_:a in 75) [ClassicSimilarity], result of:
          0.007963953 = score(doc=75,freq=8.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.15287387 = fieldWeight in 75, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=75)
        0.01836389 = product of:
          0.03672778 = sum of:
            0.03672778 = weight(_text_:22 in 75) [ClassicSimilarity], result of:
              0.03672778 = score(doc=75,freq=2.0), product of:
                0.15821345 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045180224 = queryNorm
                0.23214069 = fieldWeight in 75, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=75)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    A technology is described which supports the acquisition, visualisation and manipulation of large vocabularies with associated structures. It is used for dictionary production, terminology data bases, thesauri, library classification systems etc. Essential features of the technology are a lexicographic user interface, variable word description, unlimited list of word readings, a concept language, automatic transformations of formulas into graphic structures, structure manipulation operations and retransformation into formulas. The concept language includes notations for undefined concepts. The structure of defined concepts can be constructed interactively. The technology supports the generation of large vocabularies with structures representing word senses. Concept structures and ordering systems for indexing and retrieval can be constructed separately and connected by associating relations.
    Date
    30.12.2001 19:01:22
    Type
    a
  5. Macleod, C.; Grishman, R.; Meyers, A.: COMLEX syntax : a large syntactic dictionary for natural language processing (1998) 0.01
    0.005364322 = product of:
      0.016092965 = sum of:
        0.016092965 = weight(_text_:a in 3167) [ClassicSimilarity], result of:
          0.016092965 = score(doc=3167,freq=6.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.3089162 = fieldWeight in 3167, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.109375 = fieldNorm(doc=3167)
      0.33333334 = coord(1/3)
    
    Type
    a
  6. Sembok, T.M.T.; Rijsbergen, C.J. van: SILOL: a simple logical-linguistic document retrieval system (1990) 0.01
    0.005309302 = product of:
      0.015927905 = sum of:
        0.015927905 = weight(_text_:a in 6684) [ClassicSimilarity], result of:
          0.015927905 = score(doc=6684,freq=18.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.30574775 = fieldWeight in 6684, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=6684)
      0.33333334 = coord(1/3)
    
    Abstract
    Describes a system called SILOL which is based on a logical-linguistic model of document retrieval systems. SILOL uses a shallow semantic translation of natural language texts into a first-order predicate representation to perform document indexing and retrieval. Some preliminary experiments have been carried out to test the retrieval effectiveness of this system. The results obtained show improvements in the level of retrieval effectiveness, which demonstrate that the approach of using a semantic theory of natural language and logic in document retrieval systems is a valid one
    Type
    a
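    A toy illustration of the underlying idea of indexing by first-order predicates rather than bare keywords; the triples and the containment test below are invented placeholders, not the paper's shallow semantic translation:

      # Hypothetical sketch: documents and queries are both reduced to
      # ground predicates, and matching becomes set containment.
      def to_predicates(triples):
          return {f"{verb}({subj}, {obj})" for subj, verb, obj in triples}

      doc_index = to_predicates([("system", "retrieve", "document"),
                                 ("user", "formulate", "query")])
      query = to_predicates([("system", "retrieve", "document")])

      print(query <= doc_index)  # True: the document satisfies the query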
  7. Solvberg, I.; Nordbo, I.; Aamodt, A.: Knowledge-based information retrieval (1991/92) 0.01
    0.0050056577 = product of:
      0.015016973 = sum of:
        0.015016973 = weight(_text_:a in 546) [ClassicSimilarity], result of:
          0.015016973 = score(doc=546,freq=4.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.28826174 = fieldWeight in 546, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.125 = fieldNorm(doc=546)
      0.33333334 = coord(1/3)
    
    Type
    a
  8. Stede, M.: Lexicalization in natural language generation : a survey (1994/95) 0.01
    0.0050056577 = product of:
      0.015016973 = sum of:
        0.015016973 = weight(_text_:a in 1913) [ClassicSimilarity], result of:
          0.015016973 = score(doc=1913,freq=16.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.28826174 = fieldWeight in 1913, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=1913)
      0.33333334 = coord(1/3)
    
    Abstract
    In natural language generation, a meaning representation of some kind is successively transformed into a sentence or a text. Naturally, a central subtask of this problem is the choice of words, or lexicalization. Proposes 4 major issues that determine how a generator tackles lexicalization, and surveys the contributions that research has made to them. Identifies open problems, and sketches a possible direction for research
    Type
    a
  9. Czejdo, B.D.; Tucci, R.P.: ¬A dataflow graphical language for database applications (1994) 0.00
    0.00494665 = product of:
      0.014839949 = sum of:
        0.014839949 = weight(_text_:a in 559) [ClassicSimilarity], result of:
          0.014839949 = score(doc=559,freq=10.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.28486365 = fieldWeight in 559, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=559)
      0.33333334 = coord(1/3)
    
    Abstract
    Discusses a graphical language for information retrieval and processing. A lot of recent activity has occurred in the area of improving access to database systems. However, current results are restricted to simple interfacing of database systems. Proposes a graphical language for specifying complex applications
    Type
    a
  10. Lawson, V.; Vasconcellos, M.: Forty ways to skin a cat : users report on machine translation (1994) 0.00
    0.0045979903 = product of:
      0.01379397 = sum of:
        0.01379397 = weight(_text_:a in 6956) [ClassicSimilarity], result of:
          0.01379397 = score(doc=6956,freq=6.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.26478532 = fieldWeight in 6956, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=6956)
      0.33333334 = coord(1/3)
    
    Abstract
    In the most extensive survey of machine translation (MT) use ever performed, explores the responses to a questionnaire survey of 40 MT users concerning their experiences
    Type
    a
  11. McKelvie, D.; Brew, C.; Thompson, H.S.: Using SGML as a basis for data-intensive natural language processing (1998) 0.00
    0.0045979903 = product of:
      0.01379397 = sum of:
        0.01379397 = weight(_text_:a in 3147) [ClassicSimilarity], result of:
          0.01379397 = score(doc=3147,freq=6.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.26478532 = fieldWeight in 3147, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=3147)
      0.33333334 = coord(1/3)
    
    Abstract
    Addresses the advantages and disadvantages of the SGML approach compared with a non-SGML database approach
    Type
    a
  12. Rodriguez, H.; Climent, S.; Vossen, P.; Bloksma, L.; Peters, W.; Alonge, A.; Bertagna, F.; Roventini, A.: ¬The top-down strategy for building EuroWordNet : vocabulary coverage, base concepts and top ontology (1998) 0.00
    0.0045979903 = product of:
      0.01379397 = sum of:
        0.01379397 = weight(_text_:a in 6441) [ClassicSimilarity], result of:
          0.01379397 = score(doc=6441,freq=6.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.26478532 = fieldWeight in 6441, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.09375 = fieldNorm(doc=6441)
      0.33333334 = coord(1/3)
    
    Type
    a
  13. Ruge, G.; Schwarz, C.: Term association and computational linguistics (1991) 0.00
    0.004424418 = product of:
      0.013273253 = sum of:
        0.013273253 = weight(_text_:a in 2310) [ClassicSimilarity], result of:
          0.013273253 = score(doc=2310,freq=8.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.25478977 = fieldWeight in 2310, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.078125 = fieldNorm(doc=2310)
      0.33333334 = coord(1/3)
    
    Abstract
    Most systems for term associations are statistically based. In general they exploit term co-occurrences. A critical overview of statistical approaches in this field is given. A new approach based on a linguistic analysis of large amounts of textual data is outlined
    Type
    a
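    A minimal sketch of the statistical baseline the abstract critiques: term association scored by raw co-occurrence counts within a context unit (a sentence here); the paper's linguistic alternative is not reproduced:

      from collections import Counter
      from itertools import combinations

      def cooccurrence(sentences):
          pairs = Counter()
          for tokens in sentences:
              # count each unordered pair of distinct terms once per sentence
              for a, b in combinations(sorted(set(tokens)), 2):
                  pairs[(a, b)] += 1
          return pairs

      sents = [["term", "association", "statistics"],
               ["term", "association", "linguistics"]]
      print(cooccurrence(sents).most_common(1))
      # [(('association', 'term'), 2)]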
  14. Driscoll, J.R.; Rajala, D.A.; Shaffer, W.H.: ¬The operation and performance of an artificially intelligent keywording system (1991) 0.00
    0.0043799505 = product of:
      0.013139851 = sum of:
        0.013139851 = weight(_text_:a in 6681) [ClassicSimilarity], result of:
          0.013139851 = score(doc=6681,freq=16.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.25222903 = fieldWeight in 6681, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=6681)
      0.33333334 = coord(1/3)
    
    Abstract
    Presents a new approach to text analysis for automating the key phrase indexing process, using artificial intelligence techniques. This mimics the behaviour of human experts by using a rule base consisting of insertion and deletion rules generated by subject-matter experts. The insertion rules are based on the idea that some phrases found in a text imply or trigger other phrases. The deletion rules apply to semantically ambiguous phrases where text presence alone does not determine appropriateness as a key phrase. The insertion and deletion rules are used to transform a list of found phrases into a list of key phrases for indexing a document. Statistical data are provided to demonstrate the performance of this expert rule-based system
    Type
    a
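    A minimal sketch of the insertion/deletion rule mechanism described above. The rules here are invented examples; in the paper the rule base was authored by subject-matter experts:

      # Insertion rules: a found phrase triggers additional key phrases.
      INSERTION = {"machine translation": ["natural language processing"]}
      # Deletion rules: phrases too ambiguous to stand alone as key phrases.
      DELETION = {"bank"}

      def key_phrases(found):
          result = set(found)
          for phrase in found:
              result.update(INSERTION.get(phrase, []))
          return sorted(result - DELETION)

      print(key_phrases(["machine translation", "bank"]))
      # ['machine translation', 'natural language processing']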
  15. Haas, S.W.: ¬A feasibility study of the case hierarchy model for the construction and porting of natural language interfaces (1990) 0.00
    0.0043799505 = product of:
      0.013139851 = sum of:
        0.013139851 = weight(_text_:a in 8071) [ClassicSimilarity], result of:
          0.013139851 = score(doc=8071,freq=4.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.25222903 = fieldWeight in 8071, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.109375 = fieldNorm(doc=8071)
      0.33333334 = coord(1/3)
    
    Type
    a
  16. Fellbaum, C.: ¬A semantic network of English : the mother of all WordNets (1998) 0.00
    0.0043799505 = product of:
      0.013139851 = sum of:
        0.013139851 = weight(_text_:a in 6416) [ClassicSimilarity], result of:
          0.013139851 = score(doc=6416,freq=4.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.25222903 = fieldWeight in 6416, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.109375 = fieldNorm(doc=6416)
      0.33333334 = coord(1/3)
    
    Type
    a
  17. Sharada, B.A.: Identification and interpretation of metaphors in document titles (1999) 0.00
    0.0043799505 = product of:
      0.013139851 = sum of:
        0.013139851 = weight(_text_:a in 6792) [ClassicSimilarity], result of:
          0.013139851 = score(doc=6792,freq=4.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.25222903 = fieldWeight in 6792, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.109375 = fieldNorm(doc=6792)
      0.33333334 = coord(1/3)
    
    Source
    Library science with a slant to documentation and information studies. 36(1999) no.1, S.27-33
    Type
    a
  18. Jacquemin, C.: What is the tree that we see through the window : a linguistic approach to windowing and term variation (1996) 0.00
    0.0040970687 = product of:
      0.012291206 = sum of:
        0.012291206 = weight(_text_:a in 5578) [ClassicSimilarity], result of:
          0.012291206 = score(doc=5578,freq=14.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.23593865 = fieldWeight in 5578, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5578)
      0.33333334 = coord(1/3)
    
    Abstract
    Provides a linguistic approach to text windowing through an extraction of term variants with the help of a partial parser. The syntactic grounding of the method ensures that words observed within restricted spans are lexically related and that spurious word co-occurrences are ruled out with a good level of confidence. The system is computationally tractable on large corpora and large lists of terms. Gives illustrative examples of term variation from a large medical corpus. An experimental evaluation of the method shows that only a small proportion of co-occurring words are lexically related and motivates the call for natural language parsing techniques in text windowing
    Type
    a
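    A sketch of the contrast the abstract draws: plain windowing admits any word pair inside the span, while a linguistic filter keeps only lexically related pairs. The relatedness table below stands in for the partial parser used in the paper:

      RELATED = {("blood", "pressure"), ("heart", "rate")}  # placeholder lexicon

      def windowed_pairs(tokens, span=5, filtered=True):
          pairs = set()
          for i, w in enumerate(tokens):
              for v in tokens[i + 1:i + span]:
                  pair = tuple(sorted((w, v)))
                  if not filtered or pair in RELATED:
                      pairs.add(pair)
          return pairs

      toks = ["high", "blood", "pressure", "and", "heart", "rate"]
      print(len(windowed_pairs(toks, filtered=False)))  # 14 mostly spurious pairs
      print(windowed_pairs(toks))  # only the two related pairs survive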
  19. Ekmekcioglu, F.C.; Lynch, M.F.; Willett, P.: Development and evaluation of conflation techniques for the implementation of a document retrieval system for Turkish text databases (1995) 0.00
    0.0040970687 = product of:
      0.012291206 = sum of:
        0.012291206 = weight(_text_:a in 5797) [ClassicSimilarity], result of:
          0.012291206 = score(doc=5797,freq=14.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.23593865 = fieldWeight in 5797, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5797)
      0.33333334 = coord(1/3)
    
    Abstract
    Considers language processing techniques necessary for the implementation of a document retrieval system for Turkish text databases. Introduces the main characteristics of the Turkish language. Discusses the development of a stopword list and the evaluation of a stemming algorithm that takes account of the language's morphological structure. A two-level description of Turkish morphology developed at Bilkent University, Ankara, is incorporated into a morphological parser, PC-KIMMO, to carry out stemming in Turkish databases. Describes the evaluation of string similarity measures - n-gram matching techniques - for Turkish. Reports experiments on 6 different Turkish text corpora
    Type
    a
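    One of the conflation techniques evaluated above is n-gram matching. Dice's coefficient over character bigrams is a common instance (the paper's exact measure may differ):

      def bigrams(word):
          return {word[i:i + 2] for i in range(len(word) - 1)}

      def dice(a, b):
          x, y = bigrams(a), bigrams(b)
          return 2 * len(x & y) / (len(x) + len(y))

      # An inflected Turkish form scores high against its stem form.
      print(round(dice("kitaplar", "kitap"), 2))  # 0.73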
  20. Kraaij, W.; Pohlmann, R.: Evaluation of a Dutch stemming algorithm (1995) 0.00
    0.0040970687 = product of:
      0.012291206 = sum of:
        0.012291206 = weight(_text_:a in 5798) [ClassicSimilarity], result of:
          0.012291206 = score(doc=5798,freq=14.0), product of:
            0.05209492 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.045180224 = queryNorm
            0.23593865 = fieldWeight in 5798, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5798)
      0.33333334 = coord(1/3)
    
    Abstract
    A stemming algorithm enables the recall of text retrieval systems to be enhanced. Describes the development of a Dutch version of the Porter stemming algorithm. The stemmer was evaluated using a method drawn from Paice. The evaluation method is based on a list of groups of morphologically related words. Ideally, each group must be stemmed to the same root. The result of applying the stemmer to these groups of words is used to calculate the understemming and overstemming indices. These parameters and the diversity of stem group categories that could be generated from the CELEX database enabled a careful analysis of the effects of each stemming rule. The test suite is highly suited to qualitative comparison of different versions of stemmers
    Type
    a
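    A sketch of the pairwise core of the Paice-style evaluation summarized above: pairs within a morphological group that stem differently count as understemming errors, and pairs across groups that stem identically count as overstemming errors. This is a simplification; Paice's published indices add further weighting:

      from itertools import combinations

      def paice_indices(groups, stem):
          under = under_total = over = over_total = 0
          for g in groups:  # intra-group pairs should share a stem
              for a, b in combinations(g, 2):
                  under_total += 1
                  under += stem(a) != stem(b)
          for g1, g2 in combinations(groups, 2):  # inter-group pairs should not
              for a in g1:
                  for b in g2:
                      over_total += 1
                      over += stem(a) == stem(b)
          return under / under_total, over / over_total

      groups = [["loop", "loops", "looping"], ["lopen", "loopt"]]
      stem = lambda w: w[:4]  # toy truncation stemmer, for illustration only
      print(paice_indices(groups, stem))  # (0.25, 0.5)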
