Search (2 results, page 1 of 1)

  • × author_ss:"Witschel, H.F."
  • × theme_ss:"Computerlinguistik"
  1. Witschel, H.F.: Global and local resources for peer-to-peer text retrieval (2008) 0.01
    0.008782639 = product of:
      0.026347917 = sum of:
        0.026347917 = weight(_text_:resources in 127) [ClassicSimilarity], result of:
          0.026347917 = score(doc=127,freq=2.0), product of:
            0.18665522 = queryWeight, product of:
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.051133685 = queryNorm
            0.14115821 = fieldWeight in 127, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.650338 = idf(docFreq=3122, maxDocs=44218)
              0.02734375 = fieldNorm(doc=127)
      0.33333334 = coord(1/3)
    
  2. Witschel, H.F.: Terminology extraction and automatic indexing : comparison and qualitative evaluation of methods (2005) 0.01
    0.005348703 = product of:
      0.016046109 = sum of:
        0.016046109 = product of:
          0.032092217 = sum of:
            0.032092217 = weight(_text_:management in 1842) [ClassicSimilarity], result of:
              0.032092217 = score(doc=1842,freq=2.0), product of:
                0.17235184 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.051133685 = queryNorm
                0.18620178 = fieldWeight in 1842, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1842)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Many terminology engineering processes involve the task of automatic terminology extraction: before the terminology of a given domain can be modelled, organised or standardised, important concepts (or terms) of this domain have to be identified and fed into terminological databases. These serve in further steps as a starting point for compiling dictionaries, thesauri or maybe even terminological ontologies for the domain. For the extraction of the initial concepts, extraction methods are needed that operate on specialised language texts. On the other hand, many machine learning or information retrieval applications require automatic indexing techniques. In Machine Learning applications concerned with the automatic clustering or classification of texts, often feature vectors are needed that describe the contents of a given text briefly but meaningfully. These feature vectors typically consist of a fairly small set of index terms together with weights indicating their importance. Short but meaningful descriptions of document contents as provided by good index terms are also useful to humans: some knowledge management applications (e.g. topic maps) use them as a set of basic concepts (topics). The author believes that the tasks of terminology extraction and automatic indexing have much in common and can thus benefit from the same set of basic algorithms. It is the goal of this paper to outline some methods that may be used in both contexts, but also to find the discriminating factors between the two tasks that call for the variation of parameters or application of different techniques. The discussion of these methods will be based on statistical, syntactical and especially morphological properties of (index) terms. The paper is concluded by the presentation of some qualitative and quantitative results comparing statistical and morphological methods.

Languages

Types