Document (#27884)

Author
Hassanien, A.-E.
Title
Rough set approach for attribute reduction and rule generation : a case of patients with suspected breast cancer
Source
Journal of the American Society for Information Science and Technology. 55(2004) no.11, pp. 954-962
Year
2004
Abstract
Rough set theory is a relatively new intelligent technique used in the discovery of data dependencies; it evaluates the importance of attributes, discovers the patterns of data, reduces all redundant objects and attributes, and seeks the minimum subset of attributes. Moreover, it is being used for the extraction of rules from databases. In this paper, we present a rough set approach to attribute reduction and generation of classification rules from a set of medical datasets. For this purpose, we first introduce a rough set reduction technique to find all reducts of the data that contain the minimal subset of attributes associated with a class label for classification. To evaluate the validity of the rules based on the approximation quality of the attributes, we introduce a statistical test to evaluate the significance of the rules. Experimental results from applying the rough set approach to the set of data samples are given and evaluated. In addition, the rough set classification accuracy is also compared to the well-known ID3 classifier algorithm. The study showed that the theory of rough sets is a useful tool for inductive learning and a valuable aid for building expert systems.
Field
Medicine
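
The abstract speaks of finding all reducts, that is, minimal attribute subsets that classify as well as the full attribute set. As a rough illustration of that idea only, the sketch below does an exhaustive search over a tiny decision table. It is not the reduction technique from the paper, which this record does not describe; the table `ROWS`, the attribute names `a`, `b`, `c`, `d`, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy decision table: condition attributes a, b, c and decision d.
# Attribute c is a copy of a, so one of the two is redundant.
ROWS = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]


def partition(rows, attrs):
    """Group row indices that are indiscernible on the given attributes."""
    blocks = {}
    for i, row in enumerate(rows):
        blocks.setdefault(tuple(row[a] for a in attrs), []).append(i)
    return list(blocks.values())


def positive_region(rows, attrs, decision):
    """Rows whose indiscernibility block agrees on a single decision value."""
    pos = set()
    for block in partition(rows, attrs):
        if len({rows[i][decision] for i in block}) == 1:
            pos.update(block)
    return pos


def reducts(rows, conditions, decision):
    """Exhaustive search for minimal subsets that keep the positive region.

    Subsets are tried smallest first, and any superset of an already found
    reduct is skipped, so only minimal subsets are returned. The cost is
    exponential in the number of attributes, so this suits toy data only.
    """
    full = positive_region(rows, conditions, decision)
    found = []
    for size in range(1, len(conditions) + 1):
        for subset in combinations(conditions, size):
            if any(set(r) <= set(subset) for r in found):
                continue
            if positive_region(rows, subset, decision) == full:
                found.append(subset)
    return found


print(reducts(ROWS, ["a", "b", "c"], "d"))
```

On this table neither `a`, `b`, nor `c` alone separates the decisions, while `(a, b)` and `(b, c)` both do, so the search reports those two pairs as reducts.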

Similar documents (content)

  1. Bell, D.A.; Guan, J.W.: Computational methods for rough classification and discovery (1998) 0.35
    0.34593135 = sum of:
      0.34593135 = product of:
        1.4413806 = sum of:
          0.07159905 = weight(abstract_txt:dependencies in 2909) [ClassicSimilarity], result of:
            0.07159905 = score(doc=2909,freq=1.0), product of:
              0.09733001 = queryWeight, product of:
                1.0554911 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.011751762 = queryNorm
              0.7356318 = fieldWeight in 2909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
          0.009386844 = weight(abstract_txt:from in 2909) [ClassicSimilarity], result of:
            0.009386844 = score(doc=2909,freq=1.0), product of:
              0.036226697 = queryWeight, product of:
                1.1153371 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.011751762 = queryNorm
              0.259114 = fieldWeight in 2909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
          0.04791386 = weight(abstract_txt:theory in 2909) [ClassicSimilarity], result of:
            0.04791386 = score(doc=2909,freq=3.0), product of:
              0.06505074 = queryWeight, product of:
                1.220316 = boost
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.011751762 = queryNorm
              0.7365613 = fieldWeight in 2909, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
          0.028285054 = weight(abstract_txt:classification in 2909) [ClassicSimilarity], result of:
            0.028285054 = score(doc=2909,freq=1.0), product of:
              0.075576544 = queryWeight, product of:
                1.610962 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.011751762 = queryNorm
              0.37425706 = fieldWeight in 2909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
          0.08514184 = weight(abstract_txt:rules in 2909) [ClassicSimilarity], result of:
            0.08514184 = score(doc=2909,freq=1.0), product of:
              0.17341657 = queryWeight, product of:
                2.817776 = boost
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.011751762 = queryNorm
              0.49096715 = fieldWeight in 2909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
          1.199054 = weight(abstract_txt:rough in 2909) [ClassicSimilarity], result of:
            1.199054 = score(doc=2909,freq=4.0), product of:
              0.7677393 = queryWeight, product of:
                7.843088 = boost
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.011751762 = queryNorm
              1.5617985 = fieldWeight in 2909, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
        0.24 = coord(6/25)
    
  2. Lingras, P.J.; Yao, Y.Y.: Data mining using extensions of the rough set model (1998) 0.31
    0.31431666 = sum of:
      0.31431666 = product of:
        1.5715833 = sum of:
          0.013275003 = weight(abstract_txt:from in 2910) [ClassicSimilarity], result of:
            0.013275003 = score(doc=2910,freq=2.0), product of:
              0.036226697 = queryWeight, product of:
                1.1153371 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.011751762 = queryNorm
              0.36644253 = fieldWeight in 2910, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.09375 = fieldNorm(doc=2910)
          0.0391215 = weight(abstract_txt:theory in 2910) [ClassicSimilarity], result of:
            0.0391215 = score(doc=2910,freq=2.0), product of:
              0.06505074 = queryWeight, product of:
                1.220316 = boost
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.011751762 = queryNorm
              0.6013998 = fieldWeight in 2910, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.09375 = fieldNorm(doc=2910)
          0.031133542 = weight(abstract_txt:data in 2910) [ClassicSimilarity], result of:
            0.031133542 = score(doc=2910,freq=2.0), product of:
              0.07038351 = queryWeight, product of:
                1.795133 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.011751762 = queryNorm
              0.44234142 = fieldWeight in 2910, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.09375 = fieldNorm(doc=2910)
          0.14746997 = weight(abstract_txt:rules in 2910) [ClassicSimilarity], result of:
            0.14746997 = score(doc=2910,freq=3.0), product of:
              0.17341657 = queryWeight, product of:
                2.817776 = boost
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.011751762 = queryNorm
              0.85037994 = fieldWeight in 2910, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.09375 = fieldNorm(doc=2910)
          1.3405832 = weight(abstract_txt:rough in 2910) [ClassicSimilarity], result of:
            1.3405832 = score(doc=2910,freq=5.0), product of:
              0.7677393 = queryWeight, product of:
                7.843088 = boost
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.011751762 = queryNorm
              1.7461438 = fieldWeight in 2910, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.09375 = fieldNorm(doc=2910)
        0.2 = coord(5/25)
    
  3. Miyamoto, S.: Application of rough sets to information retrieval (1998) 0.26
    0.25645092 = sum of:
      0.25645092 = product of:
        1.6028184 = sum of:
          0.0846052 = weight(abstract_txt:approximation in 559) [ClassicSimilarity], result of:
            0.0846052 = score(doc=559,freq=1.0), product of:
              0.10878607 = queryWeight, product of:
                1.1158808 = boost
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.011751762 = queryNorm
              0.7777209 = fieldWeight in 559, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.29569 = idf(docFreq=29, maxDocs=44218)
                0.09375 = fieldNorm(doc=559)
          0.02766308 = weight(abstract_txt:theory in 559) [ClassicSimilarity], result of:
            0.02766308 = score(doc=559,freq=1.0), product of:
              0.06505074 = queryWeight, product of:
                1.220316 = boost
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.011751762 = queryNorm
              0.42525387 = fieldWeight in 559, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.09375 = fieldNorm(doc=559)
          0.022014739 = weight(abstract_txt:data in 559) [ClassicSimilarity], result of:
            0.022014739 = score(doc=559,freq=1.0), product of:
              0.07038351 = queryWeight, product of:
                1.795133 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.011751762 = queryNorm
              0.31278262 = fieldWeight in 559, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.09375 = fieldNorm(doc=559)
          1.4685353 = weight(abstract_txt:rough in 559) [ClassicSimilarity], result of:
            1.4685353 = score(doc=559,freq=6.0), product of:
              0.7677393 = queryWeight, product of:
                7.843088 = boost
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.011751762 = queryNorm
              1.9128046 = fieldWeight in 559, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.09375 = fieldNorm(doc=559)
        0.16 = coord(4/25)
    
  4. Methodologies for knowledge discovery and data mining : Third Pacific-Asia Conference, PAKDD'99, Beijing, China, April 26-28, 1999, Proceedings (1999) 0.23
    0.22575237 = sum of:
      0.22575237 = product of:
        0.9406349 = sum of:
          0.010951319 = weight(abstract_txt:from in 3821) [ClassicSimilarity], result of:
            0.010951319 = score(doc=3821,freq=1.0), product of:
              0.036226697 = queryWeight, product of:
                1.1153371 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.011751762 = queryNorm
              0.30229968 = fieldWeight in 3821, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.109375 = fieldNorm(doc=3821)
          0.061581496 = weight(abstract_txt:generation in 3821) [ClassicSimilarity], result of:
            0.061581496 = score(doc=3821,freq=1.0), product of:
              0.10007391 = queryWeight, product of:
                1.5135843 = boost
                5.6261497 = idf(docFreq=432, maxDocs=44218)
                0.011751762 = queryNorm
              0.61536014 = fieldWeight in 3821, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6261497 = idf(docFreq=432, maxDocs=44218)
                0.109375 = fieldNorm(doc=3821)
          0.03299923 = weight(abstract_txt:classification in 3821) [ClassicSimilarity], result of:
            0.03299923 = score(doc=3821,freq=1.0), product of:
              0.075576544 = queryWeight, product of:
                1.610962 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.011751762 = queryNorm
              0.43663323 = fieldWeight in 3821, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.109375 = fieldNorm(doc=3821)
          0.036322467 = weight(abstract_txt:data in 3821) [ClassicSimilarity], result of:
            0.036322467 = score(doc=3821,freq=2.0), product of:
              0.07038351 = queryWeight, product of:
                1.795133 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.011751762 = queryNorm
              0.516065 = fieldWeight in 3821, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.109375 = fieldNorm(doc=3821)
          0.09933214 = weight(abstract_txt:rules in 3821) [ClassicSimilarity], result of:
            0.09933214 = score(doc=3821,freq=1.0), product of:
              0.17341657 = queryWeight, product of:
                2.817776 = boost
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.011751762 = queryNorm
              0.572795 = fieldWeight in 3821, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.109375 = fieldNorm(doc=3821)
          0.6994482 = weight(abstract_txt:rough in 3821) [ClassicSimilarity], result of:
            0.6994482 = score(doc=3821,freq=1.0), product of:
              0.7677393 = queryWeight, product of:
                7.843088 = boost
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.011751762 = queryNorm
              0.9110491 = fieldWeight in 3821, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.109375 = fieldNorm(doc=3821)
        0.24 = coord(6/25)
    
  5. Srinivasan, P.: Intelligent information retrieval using rough set approximations (1989) 0.16
    0.15878506 = sum of:
      0.15878506 = product of:
        0.99240667 = sum of:
          0.04791386 = weight(abstract_txt:theory in 2526) [ClassicSimilarity], result of:
            0.04791386 = score(doc=2526,freq=3.0), product of:
              0.06505074 = queryWeight, product of:
                1.220316 = boost
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.011751762 = queryNorm
              0.7365613 = fieldWeight in 2526, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.09375 = fieldNorm(doc=2526)
          0.04899115 = weight(abstract_txt:classification in 2526) [ClassicSimilarity], result of:
            0.04899115 = score(doc=2526,freq=3.0), product of:
              0.075576544 = queryWeight, product of:
                1.610962 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.011751762 = queryNorm
              0.6482322 = fieldWeight in 2526, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.09375 = fieldNorm(doc=2526)
          0.29597467 = weight(abstract_txt:attributes in 2526) [ClassicSimilarity], result of:
            0.29597467 = score(doc=2526,freq=3.0), product of:
              0.2972313 = queryWeight, product of:
                4.1244254 = boost
                6.1323667 = idf(docFreq=260, maxDocs=44218)
                0.011751762 = queryNorm
              0.99577224 = fieldWeight in 2526, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.1323667 = idf(docFreq=260, maxDocs=44218)
                0.09375 = fieldNorm(doc=2526)
          0.599527 = weight(abstract_txt:rough in 2526) [ClassicSimilarity], result of:
            0.599527 = score(doc=2526,freq=1.0), product of:
              0.7677393 = queryWeight, product of:
                7.843088 = boost
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.011751762 = queryNorm
              0.7808992 = fieldWeight in 2526, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.09375 = fieldNorm(doc=2526)
        0.16 = coord(4/25)
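
The score trees above follow Lucene's ClassicSimilarity (TF-IDF) formula. As a cross-check, the sketch below recomputes the score of the first similar document (doc 2909) from the boosts, document frequencies, and term frequencies printed in its tree. The variable and function names are illustrative, and the result matches the listed value only up to rounding, since Lucene computes in single precision.

```python
import math

# Constants copied from the explain tree for doc 2909.
MAX_DOCS = 44218
QUERY_NORM = 0.011751762
FIELD_NORM = 0.09375
COORD = 6 / 25  # 6 of the 25 query terms matched

# (term, boost, docFreq, freq) as printed in the tree.
TERMS = [
    ("dependencies", 1.0554911, 46, 1.0),
    ("from", 1.1153371, 7577, 1.0),
    ("theory", 1.220316, 1287, 3.0),
    ("classification", 1.610962, 2218, 1.0),
    ("rules", 2.817776, 638, 1.0),
    ("rough", 7.843088, 28, 4.0),
]


def idf(doc_freq, max_docs=MAX_DOCS):
    """ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq):
    """Per-term score = queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM  # tf = sqrt(freq)
    return query_weight * field_weight


# Sum the per-term scores, then scale by the coordination factor.
total = COORD * sum(term_score(b, df, f) for _, b, df, f in TERMS)
print(f"{total:.4f}")
```

The term "rough" dominates the sum because it combines the highest boost with the highest idf (it occurs in only 28 of 44218 documents), which is why all five similar documents are rough set papers.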