Document (#20327)

Author
Mostafa, J.
Quiroga, L.M.
Palakal, M.
Title
Filtering medical documents using automated and human classification methods
Source
Journal of the American Society for Information Science. 49(1998) no.14, pp. 1304-1318
Year
1998
Abstract
The goal of this research is to clarify the role of document classification in information filtering. An important function of classification, in managing computational complexity, is described and illustrated in the context of an existing filtering system. A parameter called classification homogeneity is presented for analyzing unsupervised automated classification by employing human classification as a control. Two significant components of the automated classification approach, vocabulary discovery and classification scheme generation, are described in detail. Results of classification performance revealed considerable variability in the homogeneity of automatically produced classes. Based on the classification performance, different types of interest profiles were created. Subsequently, these profiles were used to perform filtering sessions. The filtering results showed that with increasing homogeneity, filtering performance improves, and, conversely, with decreasing homogeneity, filtering performance degrades.
Theme
Automatic classification

Similar documents (author)

  1. Mostafa, J.: Digital image representation and access (1994) 5.50
    5.504072 = sum of:
      5.504072 = weight(author_txt:mostafa in 1102) [ClassicSimilarity], result of:
        5.504072 = fieldWeight in 1102, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.806516 = idf(docFreq=17, maxDocs=44218)
          0.625 = fieldNorm(doc=1102)
    
  2. Mostafa, S.P.: Enfoqies paradigmaticos de bibliotecologia : unidade na diversidad na unidad (1996) 5.50
    5.504072 = sum of:
      5.504072 = weight(author_txt:mostafa in 829) [ClassicSimilarity], result of:
        5.504072 = fieldWeight in 829, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.806516 = idf(docFreq=17, maxDocs=44218)
          0.625 = fieldNorm(doc=829)
    
  3. Mostafa, J.: Document search interface design : background and introduction to special topic section (2004) 5.50
    5.504072 = sum of:
      5.504072 = weight(author_txt:mostafa in 2503) [ClassicSimilarity], result of:
        5.504072 = fieldWeight in 2503, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.806516 = idf(docFreq=17, maxDocs=44218)
          0.625 = fieldNorm(doc=2503)
    
  4. Mostafa, J.: Bessere Suchmaschinen für das Web (2006) 5.50
    5.504072 = sum of:
      5.504072 = weight(author_txt:mostafa in 4871) [ClassicSimilarity], result of:
        5.504072 = fieldWeight in 4871, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.806516 = idf(docFreq=17, maxDocs=44218)
          0.625 = fieldNorm(doc=4871)
    
  5. Sugimoto, C.R.; Mostafa, J.: A note of concern and context : on careful use of terminologies (2018) 4.40
    4.403258 = sum of:
      4.403258 = weight(author_txt:mostafa in 7278) [ClassicSimilarity], result of:
        4.403258 = fieldWeight in 7278, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.806516 = idf(docFreq=17, maxDocs=44218)
          0.5 = fieldNorm(doc=7278)
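
The author scores above are each a single fieldWeight, the product of tf, idf and fieldNorm. The sketch below reproduces that arithmetic. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is consistent with the values in the listing (idf 8.806516 for docFreq=17, maxDocs=44218). The function names are illustrative and are not part of Lucene's API.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explanation trees above."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Entries 1-4 use fieldNorm 0.625; entry 5 uses 0.5 (values copied from the listing).
score_single_author = field_weight(1.0, 17, 44218, 0.625)
score_two_authors = field_weight(1.0, 17, 44218, 0.5)
print(f"{score_single_author:.2f} {score_two_authors:.2f}")
```

The only factor that differs between entries 1-4 and entry 5 is fieldNorm (0.625 against 0.5), which accounts for the lower score of the entry with two authors.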
    

Similar documents (content)

  1. Quiroga, L.M.; Mostafa, J.: An experiment in building profiles in information filtering : the role of context of user relevance feedback (2002) 0.18
    0.17585449 = sum of:
      0.17585449 = product of:
        0.73272705 = sum of:
          0.013538705 = weight(abstract_txt:results in 2579) [ClassicSimilarity], result of:
            0.013538705 = score(doc=2579,freq=2.0), product of:
              0.043984607 = queryWeight, product of:
                1.1350409 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.011127761 = queryNorm
              0.30780554 = fieldWeight in 2579, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
          0.040424265 = weight(abstract_txt:clarify in 2579) [ClassicSimilarity], result of:
            0.040424265 = score(doc=2579,freq=1.0), product of:
              0.09120333 = queryWeight, product of:
                1.155717 = boost
                7.0917172 = idf(docFreq=99, maxDocs=44218)
                0.011127761 = queryNorm
              0.44323233 = fieldWeight in 2579, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0917172 = idf(docFreq=99, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
          0.019408828 = weight(abstract_txt:were in 2579) [ClassicSimilarity], result of:
            0.019408828 = score(doc=2579,freq=3.0), product of:
              0.048852306 = queryWeight, product of:
                1.1961997 = boost
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.011127761 = queryNorm
              0.39729604 = fieldWeight in 2579, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
          0.12591329 = weight(abstract_txt:profiles in 2579) [ClassicSimilarity], result of:
            0.12591329 = score(doc=2579,freq=3.0), product of:
              0.16992864 = queryWeight, product of:
                2.2309737 = boost
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.011127761 = queryNorm
              0.74097747 = fieldWeight in 2579, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
          0.09002002 = weight(abstract_txt:performance in 2579) [ClassicSimilarity], result of:
            0.09002002 = score(doc=2579,freq=4.0), product of:
              0.15552804 = queryWeight, product of:
                3.0184257 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.011127761 = queryNorm
              0.5788025 = fieldWeight in 2579, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
          0.4434219 = weight(abstract_txt:filtering in 2579) [ClassicSimilarity], result of:
            0.4434219 = score(doc=2579,freq=4.0), product of:
              0.542592 = queryWeight, product of:
                7.4581633 = boost
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.011127761 = queryNorm
              0.817229 = fieldWeight in 2579, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.0625 = fieldNorm(doc=2579)
        0.24 = coord(6/25)
    
  2. Díaz, I.; Ranilla, J.; Montañes, E.; Fernández, J.; Combarro, E.F.: Improving performance of text categorization by combining filtering and support vector machines (2004) 0.12
    0.12431114 = sum of:
      0.12431114 = product of:
        0.6215557 = sum of:
          0.04299673 = weight(abstract_txt:improves in 2234) [ClassicSimilarity], result of:
            0.04299673 = score(doc=2234,freq=1.0), product of:
              0.08189667 = queryWeight, product of:
                1.0951643 = boost
                6.7201533 = idf(docFreq=144, maxDocs=44218)
                0.011127761 = queryNorm
              0.52501196 = fieldWeight in 2234, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7201533 = idf(docFreq=144, maxDocs=44218)
                0.078125 = fieldNorm(doc=2234)
          0.016923383 = weight(abstract_txt:results in 2234) [ClassicSimilarity], result of:
            0.016923383 = score(doc=2234,freq=2.0), product of:
              0.043984607 = queryWeight, product of:
                1.1350409 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.011127761 = queryNorm
              0.38475692 = fieldWeight in 2234, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.078125 = fieldNorm(doc=2234)
          0.0795672 = weight(abstract_txt:performance in 2234) [ClassicSimilarity], result of:
            0.0795672 = score(doc=2234,freq=2.0), product of:
              0.15552804 = queryWeight, product of:
                3.0184257 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.011127761 = queryNorm
              0.51159394 = fieldWeight in 2234, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.078125 = fieldNorm(doc=2234)
          0.09013512 = weight(abstract_txt:classification in 2234) [ClassicSimilarity], result of:
            0.09013512 = score(doc=2234,freq=1.0), product of:
              0.28900495 = queryWeight, product of:
                6.5057716 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.011127761 = queryNorm
              0.3118809 = fieldWeight in 2234, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.078125 = fieldNorm(doc=2234)
          0.3919333 = weight(abstract_txt:filtering in 2234) [ClassicSimilarity], result of:
            0.3919333 = score(doc=2234,freq=2.0), product of:
              0.542592 = queryWeight, product of:
                7.4581633 = boost
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.011127761 = queryNorm
              0.7223352 = fieldWeight in 2234, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.078125 = fieldNorm(doc=2234)
        0.2 = coord(5/25)
    
  3. Sebastiani, F.: Classification of text, automatic (2006) 0.10
    0.09781587 = sum of:
      0.09781587 = product of:
        0.6113492 = sum of:
          0.04372373 = weight(abstract_txt:computational in 5003) [ClassicSimilarity], result of:
            0.04372373 = score(doc=5003,freq=1.0), product of:
              0.0733387 = queryWeight, product of:
                1.036365 = boost
                6.3593493 = idf(docFreq=207, maxDocs=44218)
                0.011127761 = queryNorm
              0.596189 = fieldWeight in 5003, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3593493 = idf(docFreq=207, maxDocs=44218)
                0.09375 = fieldNorm(doc=5003)
          0.1268969 = weight(abstract_txt:automated in 5003) [ClassicSimilarity], result of:
            0.1268969 = score(doc=5003,freq=2.0), product of:
              0.17081246 = queryWeight, product of:
                2.73947 = boost
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.011127761 = queryNorm
              0.7429019 = fieldWeight in 5003, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.09375 = fieldNorm(doc=5003)
          0.10816214 = weight(abstract_txt:classification in 5003) [ClassicSimilarity], result of:
            0.10816214 = score(doc=5003,freq=1.0), product of:
              0.28900495 = queryWeight, product of:
                6.5057716 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.011127761 = queryNorm
              0.37425706 = fieldWeight in 5003, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.09375 = fieldNorm(doc=5003)
          0.3325664 = weight(abstract_txt:filtering in 5003) [ClassicSimilarity], result of:
            0.3325664 = score(doc=5003,freq=1.0), product of:
              0.542592 = queryWeight, product of:
                7.4581633 = boost
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.011127761 = queryNorm
              0.6129217 = fieldWeight in 5003, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.09375 = fieldNorm(doc=5003)
        0.16 = coord(4/25)
    
  4. Kenter, T.; Balog, K.; Rijke, M. de: Evaluating document filtering systems over time (2015) 0.09
    0.094132565 = sum of:
      0.094132565 = product of:
        0.58832854 = sum of:
          0.0330736 = weight(abstract_txt:employing in 2672) [ClassicSimilarity], result of:
            0.0330736 = score(doc=2672,freq=1.0), product of:
              0.087209724 = queryWeight, product of:
                1.1301305 = boost
                6.9347134 = idf(docFreq=116, maxDocs=44218)
                0.011127761 = queryNorm
              0.37924212 = fieldWeight in 2672, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9347134 = idf(docFreq=116, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2672)
          0.011846367 = weight(abstract_txt:results in 2672) [ClassicSimilarity], result of:
            0.011846367 = score(doc=2672,freq=2.0), product of:
              0.043984607 = queryWeight, product of:
                1.1350409 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.011127761 = queryNorm
              0.26932985 = fieldWeight in 2672, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2672)
          0.06821467 = weight(abstract_txt:performance in 2672) [ClassicSimilarity], result of:
            0.06821467 = score(doc=2672,freq=3.0), product of:
              0.15552804 = queryWeight, product of:
                3.0184257 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.011127761 = queryNorm
              0.43860045 = fieldWeight in 2672, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2672)
          0.4751939 = weight(abstract_txt:filtering in 2672) [ClassicSimilarity], result of:
            0.4751939 = score(doc=2672,freq=6.0), product of:
              0.542592 = queryWeight, product of:
                7.4581633 = boost
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.011127761 = queryNorm
              0.87578493 = fieldWeight in 2672, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2672)
        0.16 = coord(4/25)
    
  5. Morato, J.; Llorens, J.; Genova, G.; Moreiro, J.A.: Experiments in discourse analysis impact on information classification and retrieval algorithms (2003) 0.09
    0.09070682 = sum of:
      0.09070682 = product of:
        0.45353407 = sum of:
          0.01658146 = weight(abstract_txt:results in 1083) [ClassicSimilarity], result of:
            0.01658146 = score(doc=1083,freq=3.0), product of:
              0.043984607 = queryWeight, product of:
                1.1350409 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.011127761 = queryNorm
              0.37698326 = fieldWeight in 1083, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0625 = fieldNorm(doc=1083)
          0.011205692 = weight(abstract_txt:were in 1083) [ClassicSimilarity], result of:
            0.011205692 = score(doc=1083,freq=1.0), product of:
              0.048852306 = queryWeight, product of:
                1.1961997 = boost
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.011127761 = queryNorm
              0.22937898 = fieldWeight in 1083, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.0625 = fieldNorm(doc=1083)
          0.059819773 = weight(abstract_txt:automated in 1083) [ClassicSimilarity], result of:
            0.059819773 = score(doc=1083,freq=1.0), product of:
              0.17081246 = queryWeight, product of:
                2.73947 = boost
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.011127761 = queryNorm
              0.35020733 = fieldWeight in 1083, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.0625 = fieldNorm(doc=1083)
          0.1442162 = weight(abstract_txt:classification in 1083) [ClassicSimilarity], result of:
            0.1442162 = score(doc=1083,freq=4.0), product of:
              0.28900495 = queryWeight, product of:
                6.5057716 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.011127761 = queryNorm
              0.4990094 = fieldWeight in 1083, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=1083)
          0.22171095 = weight(abstract_txt:filtering in 1083) [ClassicSimilarity], result of:
            0.22171095 = score(doc=1083,freq=1.0), product of:
              0.542592 = queryWeight, product of:
                7.4581633 = boost
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.011127761 = queryNorm
              0.4086145 = fieldWeight in 1083, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.537832 = idf(docFreq=173, maxDocs=44218)
                0.0625 = fieldNorm(doc=1083)
        0.2 = coord(5/25)
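
The content scores above combine several matched terms: each term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is scaled by coord (matched terms divided by the 25 query terms). The sketch below recomputes the score of entry 3 (doc 5003) from the boost, docFreq, freq, queryNorm and fieldNorm values in the listing. It assumes the same tf and idf formulas the listing's numbers are consistent with; the function and variable names are illustrative, not Lucene's API.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.011127761  # queryNorm, copied from the listing
FIELD_NORM = 0.09375      # fieldNorm(doc=5003), copied from the listing
TOTAL_QUERY_TERMS = 25    # denominator of coord(4/25)


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Assumed idf form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, doc_freq: int, freq: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for one matched term."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (boost, docFreq, freq) for the four abstract_txt terms matched in doc 5003.
matched_terms = {
    "computational": (1.036365, 207, 1.0),
    "automated": (2.73947, 442, 2.0),
    "classification": (6.5057716, 2218, 1.0),
    "filtering": (7.4581633, 173, 1.0),
}


def document_score(terms: dict, field_norm: float, total_query_terms: int) -> float:
    """coord * sum of per-term scores."""
    raw = sum(term_score(b, df, f, field_norm) for b, df, f in terms.values())
    coord = len(terms) / total_query_terms
    return coord * raw


score_5003 = document_score(matched_terms, FIELD_NORM, TOTAL_QUERY_TERMS)
print(f"{score_5003:.2f}")
```

Because idf enters both queryWeight and fieldWeight, a rare, heavily boosted term such as "filtering" dominates every content score in the list.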