Search (4 results, page 1 of 1)

  • × theme_ss:"Automatisches Klassifizieren"
  • × theme_ss:"Computerlinguistik"
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.11
    0.11140175 = product of:
      0.3119249 = sum of:
        0.029976752 = product of:
          0.11990701 = sum of:
            0.11990701 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.11990701 = score(doc=562,freq=2.0), product of:
                0.21335082 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.025165197 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.25 = coord(1/4)
        0.11990701 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.11990701 = score(doc=562,freq=2.0), product of:
            0.21335082 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.025165197 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.11990701 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.11990701 = score(doc=562,freq=2.0), product of:
            0.21335082 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.025165197 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.03531506 = weight(_text_:representation in 562) [ClassicSimilarity], result of:
          0.03531506 = score(doc=562,freq=2.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.3050057 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.006819073 = product of:
          0.02045722 = sum of:
            0.02045722 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.02045722 = score(doc=562,freq=2.0), product of:
                0.08812423 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.025165197 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
      0.35714287 = coord(5/14)
    
    Abstract
    Document representations for text classification are typically based on the classical Bag-Of-Words paradigm. This approach comes with deficiencies that motivate the integration of features on a higher semantic level than single words. In this paper we propose an enhancement of the classical document representation through concepts extracted from background knowledge. Boosting is used for actual classification. Experimental evaluations on two well known text corpora support our approach through consistent improvement of the results.
    Content
    Vgl.: http://www.google.de/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CEAQFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.91.4940%26rep%3Drep1%26type%3Dpdf&ei=dOXrUMeIDYHDtQahsIGACg&usg=AFQjCNHFWVh6gNPvnOrOS9R3rkrXCNVD-A&sig2=5I2F5evRfMnsttSgFF9g7Q&bvm=bv.1357316858,d.Yms.
    Date
    8. 1.2013 10:22:32
  2. Sebastiani, F.: Machine learning in automated text categorization (2002) 0.00
    0.0025225044 = product of:
      0.03531506 = sum of:
        0.03531506 = weight(_text_:representation in 3389) [ClassicSimilarity], result of:
          0.03531506 = score(doc=3389,freq=2.0), product of:
            0.11578492 = queryWeight, product of:
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.025165197 = queryNorm
            0.3050057 = fieldWeight in 3389, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.600994 = idf(docFreq=1206, maxDocs=44218)
              0.046875 = fieldNorm(doc=3389)
      0.071428575 = coord(1/14)
    
    Abstract
    The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last 10 years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based an machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting in the manual definition of a classifier by domain experts) are a very good effectiveness, considerable savings in terms of expert labor power, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.
  3. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.00
    5.734144E-4 = product of:
      0.008027801 = sum of:
        0.008027801 = product of:
          0.024083402 = sum of:
            0.024083402 = weight(_text_:29 in 1595) [ClassicSimilarity], result of:
              0.024083402 = score(doc=1595,freq=2.0), product of:
                0.08852329 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.025165197 = queryNorm
                0.27205724 = fieldWeight in 1595, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1595)
          0.33333334 = coord(1/3)
      0.071428575 = coord(1/14)
    
    Date
    11. 5.2003 18:29:44
  4. Ibekwe-SanJuan, F.; SanJuan, E.: From term variants to research topics (2002) 0.00
    4.0958173E-4 = product of:
      0.005734144 = sum of:
        0.005734144 = product of:
          0.017202431 = sum of:
            0.017202431 = weight(_text_:29 in 1853) [ClassicSimilarity], result of:
              0.017202431 = score(doc=1853,freq=2.0), product of:
                0.08852329 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.025165197 = queryNorm
                0.19432661 = fieldWeight in 1853, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1853)
          0.33333334 = coord(1/3)
      0.071428575 = coord(1/14)
    
    Source
    Knowledge organization. 29(2002) nos.3/4, S.181-197