Search (1 results, page 1 of 1)

  • × author_ss:"Combarro, E.F."
  • × theme_ss:"Automatisches Klassifizieren"
  1. Díaz, I.; Ranilla, J.; Montañes, E.; Fernández, J.; Combarro, E.F.: Improving performance of text categorization by combining filtering and support vector machines (2004) 0.01
    0.00876316 = product of:
      0.01752632 = sum of:
        0.01752632 = product of:
          0.03505264 = sum of:
            0.03505264 = weight(_text_:classification in 2234) [ClassicSimilarity], result of:
              0.03505264 = score(doc=2234,freq=2.0), product of:
                0.16603322 = queryWeight, product of:
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.05213454 = queryNorm
                0.21111822 = fieldWeight in 2234, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1847067 = idf(docFreq=4974, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2234)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Text Categorization is the process of assigning documents to a set of previously fixed categories. A lot of research is going an with the goal of automating this time-consuming task. Several different algorithms have been applied, and Support Vector Machines (SVM) have shown very good results. In this report, we try to prove that a previous filtering of the words used by SVM in the classification can improve the overall performance. This hypothesis is systematically tested with three different measures of word relevance, an two different corpus (one of them considered in three different splits), and with both local and global vocabularies. The results show that filtering significantly improves the recall of the method, and that also has the effect of significantly improving the overall performance.