Search (2 results, page 1 of 1)

Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.00
```
0.0023678814 = product of:
  0.0047357627 = sum of:
    0.0047357627 = product of:
      0.009471525 = sum of:
        0.009471525 = weight(_text_:a in 1595) [ClassicSimilarity], result of:
          0.009471525 = score(doc=1595,freq=8.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.17835285 = fieldWeight in 1595, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1595)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
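The breakdown above can be checked by hand. The following is a minimal Python sketch, assuming the classic TF-IDF formulas that the printed values are consistent with: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names `classic_idf` and `classic_score` are illustrative only and are not part of any library API.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=(0.5, 0.5)):
    """Recompute the explain tree: queryWeight * fieldWeight, scaled by each coord factor."""
    tf = math.sqrt(freq)                   # tf(freq=8.0) -> 2.828427
    idf = classic_idf(doc_freq, max_docs)  # idf(docFreq=37942, maxDocs=44218) -> 1.153047
    query_weight = idf * query_norm        # queryWeight
    field_weight = tf * idf * field_norm   # fieldWeight
    score = query_weight * field_weight
    for coord in coords:                   # the two coord(1/2) factors in the tree
        score *= coord
    return score


# Values copied from the explain output for doc 1595.
score_1595 = classic_score(
    freq=8.0,
    doc_freq=37942,
    max_docs=44218,
    query_norm=0.046056706,
    field_norm=0.0546875,
)
print(f"{score_1595:.7f}")
```

The result agrees with the listed 0.0023678814 up to single-precision rounding. Because the matched term `a` occurs in 37942 of 44218 documents, its idf is close to 1 and the score is driven almost entirely by term frequency and the field norm.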
Abstract

This paper presents a method that exploits the hierarchical structure of an indexing vocabulary to guide the development and training of machine learning methods for automatic text categorization. We present the design of a hierarchical classifier based on the divide-and-conquer principle. The method is evaluated using backpropagation neural networks, as the machine learning algorithm, that learn to assign MeSH categories to a subset of MEDLINE records. Comparisons with traditional Rocchio's algorithm adapted for text categorization, as well as flat neural network classifiers, are provided. The results indicate that the use of hierarchical structures improves performance significantly.

Type

a

Srinivasan, P.; Ruiz, M.E.; Lam, W.: An investigation of indexing on the WWW (1996) 0.00

```
0.0020506454 = product of:
  0.004101291 = sum of:
    0.004101291 = product of:
      0.008202582 = sum of:
        0.008202582 = weight(_text_:a in 7424) [ClassicSimilarity], result of:
          0.008202582 = score(doc=7424,freq=6.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.1544581 = fieldWeight in 7424, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7424)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Abstract: Proposes a model that assists in understanding indexing on the WWW. It specifies key features of indexing strategies that are currently being used. Presents an experiment assessing the validity of Inverse Document Frequency (IDF) as a term weighting strategy for WWW documents. The experiment indicates that IDF scores are not stable in the heterogeneous and dynamic context of the WWW. Recommends further investigation to clarify the effectiveness of alternative indexing strategies for the WWW.
Type: a
