Document (#20911)

Author
Lingras, P.J.
Yao, Y.Y.
Title
Data mining using extensions of the rough set model
Source
Journal of the American Society for Information Science. 49(1998) no.5, pp.415-422
Year
1998
Abstract
Examines basic issues of data mining using the theory of rough sets, which is a recent proposal for generalizing classical set theory. The Pawlak rough set model is based on the concept of an equivalence relation. A generalized rough set model need not be based on equivalence relation axioms. The Pawlak rough set model has been used for deriving deterministic as well as probabilistic rules from a complete database. Demonstrates that a generalized rough set model can be used for generating rules from incomplete databases. These rules are based on plausibility functions proposed by Shafer. Discusses the importance of rule extraction from incomplete databases in data mining.
Footnote
Contribution to a special issue devoted to knowledge discovery and data mining
Theme
Data Mining
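
The abstract contrasts the Pawlak model, built on an equivalence relation, with a generalized model that drops the equivalence axioms. A minimal sketch of the standard lower and upper approximations follows. It is not taken from the article: the function names and the toy table are hypothetical, and the neighborhood function stands in for whatever relation is chosen. With an equivalence relation the neighborhood of an object is its equivalence class, which gives the Pawlak case; passing a neighborhood function derived from any other binary relation gives a generalized variant.

```python
def approximations(universe, neighborhood, target):
    """Return (lower, upper) approximations of `target`.

    lower: objects whose whole neighborhood lies inside the target set
    upper: objects whose neighborhood overlaps the target set
    """
    target = set(target)
    lower = {x for x in universe if neighborhood(x) <= target}
    upper = {x for x in universe if neighborhood(x) & target}
    return lower, upper


# Hypothetical toy table: object id -> value of one condition attribute.
table = {1: "a", 2: "a", 3: "b", 4: "b", 5: "c"}


def equivalence_class(x):
    """Pawlak case: objects indistinguishable from x on the attribute."""
    return {y for y, value in table.items() if value == table[x]}


# Target concept: the objects {1, 2, 3}.
lower, upper = approximations(table.keys(), equivalence_class, {1, 2, 3})
print(sorted(lower), sorted(upper))
```

Objects in the lower approximation support deterministic rules (the attribute value guarantees membership), while objects that are only in the upper approximation support possible or probabilistic rules.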

Similar documents (content)

  1. Bell, D.A.; Guan, J.W.: Computational methods for rough classification and discovery (1998) 0.41
    0.41457322 = sum of:
      0.41457322 = product of:
        1.4806186 = sum of:
          0.0134151345 = weight(abstract_txt:used in 2909) [ClassicSimilarity], result of:
            0.0134151345 = score(doc=2909,freq=1.0), product of:
              0.042596612 = queryWeight, product of:
                1.0330058 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.012275059 = queryNorm
              0.3149343 = fieldWeight in 2909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
          0.020785766 = weight(abstract_txt:using in 2909) [ClassicSimilarity], result of:
            0.020785766 = score(doc=2909,freq=2.0), product of:
              0.045270197 = queryWeight, product of:
                1.0649309 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.012275059 = queryNorm
              0.459149 = fieldWeight in 2909, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
          0.030407287 = weight(abstract_txt:databases in 2909) [ClassicSimilarity], result of:
            0.030407287 = score(doc=2909,freq=1.0), product of:
              0.0735016 = queryWeight, product of:
                1.3569494 = boost
                4.4127526 = idf(docFreq=1456, maxDocs=44218)
                0.012275059 = queryNorm
              0.41369557 = fieldWeight in 2909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4127526 = idf(docFreq=1456, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
          0.057205867 = weight(abstract_txt:theory in 2909) [ClassicSimilarity], result of:
            0.057205867 = score(doc=2909,freq=3.0), product of:
              0.07766613 = queryWeight, product of:
                1.3948616 = boost
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.012275059 = queryNorm
              0.7365613 = fieldWeight in 2909, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
          0.0554887 = weight(abstract_txt:relation in 2909) [ClassicSimilarity], result of:
            0.0554887 = score(doc=2909,freq=1.0), product of:
              0.10976101 = queryWeight, product of:
                1.6582091 = boost
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.012275059 = queryNorm
              0.5055411 = fieldWeight in 2909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
          0.076240145 = weight(abstract_txt:rules in 2909) [ClassicSimilarity], result of:
            0.076240145 = score(doc=2909,freq=1.0), product of:
              0.15528563 = queryWeight, product of:
                2.4156084 = boost
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.012275059 = queryNorm
              0.49096715 = fieldWeight in 2909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
          1.2270757 = weight(abstract_txt:rough in 2909) [ClassicSimilarity], result of:
            1.2270757 = score(doc=2909,freq=4.0), product of:
              0.7856812 = queryWeight, product of:
                7.6842074 = boost
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.012275059 = queryNorm
              1.5617985 = fieldWeight in 2909, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
        0.28 = coord(7/25)
    
  2. Hassanien, A.-E.: Rough set approach for attribute reduction and rule generation : a case of patients with suspected breast cancer (2004) 0.34
    0.3375019 = sum of:
      0.3375019 = product of:
        1.205364 = sum of:
          0.012647909 = weight(abstract_txt:used in 2883) [ClassicSimilarity], result of:
            0.012647909 = score(doc=2883,freq=2.0), product of:
              0.042596612 = queryWeight, product of:
                1.0330058 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.012275059 = queryNorm
              0.2969229 = fieldWeight in 2883, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0625 = fieldNorm(doc=2883)
          0.020271525 = weight(abstract_txt:databases in 2883) [ClassicSimilarity], result of:
            0.020271525 = score(doc=2883,freq=1.0), product of:
              0.0735016 = queryWeight, product of:
                1.3569494 = boost
                4.4127526 = idf(docFreq=1456, maxDocs=44218)
                0.012275059 = queryNorm
              0.27579704 = fieldWeight in 2883, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4127526 = idf(docFreq=1456, maxDocs=44218)
                0.0625 = fieldNorm(doc=2883)
          0.03113893 = weight(abstract_txt:theory in 2883) [ClassicSimilarity], result of:
            0.03113893 = score(doc=2883,freq=2.0), product of:
              0.07766613 = queryWeight, product of:
                1.3948616 = boost
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.012275059 = queryNorm
              0.40093318 = fieldWeight in 2883, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.0625 = fieldNorm(doc=2883)
          0.011465008 = weight(abstract_txt:based in 2883) [ClassicSimilarity], result of:
            0.011465008 = score(doc=2883,freq=1.0), product of:
              0.057542123 = queryWeight, product of:
                1.4704621 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.012275059 = queryNorm
              0.19924548 = fieldWeight in 2883, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=2883)
          0.026284087 = weight(abstract_txt:data in 2883) [ClassicSimilarity], result of:
            0.026284087 = score(doc=2883,freq=4.0), product of:
              0.06302481 = queryWeight, product of:
                1.5389223 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.012275059 = queryNorm
              0.41704348 = fieldWeight in 2883, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=2883)
          0.10165352 = weight(abstract_txt:rules in 2883) [ClassicSimilarity], result of:
            0.10165352 = score(doc=2883,freq=4.0), product of:
              0.15528563 = queryWeight, product of:
                2.4156084 = boost
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.012275059 = queryNorm
              0.65462285 = fieldWeight in 2883, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.0625 = fieldNorm(doc=2883)
          1.001903 = weight(abstract_txt:rough in 2883) [ClassicSimilarity], result of:
            1.001903 = score(doc=2883,freq=6.0), product of:
              0.7856812 = queryWeight, product of:
                7.6842074 = boost
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.012275059 = queryNorm
              1.2752031 = fieldWeight in 2883, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.0625 = fieldNorm(doc=2883)
        0.28 = coord(7/25)
    
  3. Yang, H.; King, I.; Lyu, M.R.: The generalized dependency degree between attributes (2007) 0.29
    0.29474592 = sum of:
      0.29474592 = product of:
        0.92108107 = sum of:
          0.0138571765 = weight(abstract_txt:using in 1322) [ClassicSimilarity], result of:
            0.0138571765 = score(doc=1322,freq=2.0), product of:
              0.045270197 = queryWeight, product of:
                1.0649309 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.012275059 = queryNorm
              0.30609933 = fieldWeight in 1322, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=1322)
          0.059000224 = weight(abstract_txt:generalized in 1322) [ClassicSimilarity], result of:
            0.059000224 = score(doc=1322,freq=2.0), product of:
              0.09438906 = queryWeight, product of:
                1.0873293 = boost
                7.071914 = idf(docFreq=101, maxDocs=44218)
                0.012275059 = queryNorm
              0.6250748 = fieldWeight in 1322, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.071914 = idf(docFreq=101, maxDocs=44218)
                0.0625 = fieldNorm(doc=1322)
          0.02201855 = weight(abstract_txt:theory in 1322) [ClassicSimilarity], result of:
            0.02201855 = score(doc=1322,freq=1.0), product of:
              0.07766613 = queryWeight, product of:
                1.3948616 = boost
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.012275059 = queryNorm
              0.28350258 = fieldWeight in 1322, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.0625 = fieldNorm(doc=1322)
          0.12703739 = weight(abstract_txt:deterministic in 1322) [ClassicSimilarity], result of:
            0.12703739 = score(doc=1322,freq=2.0), product of:
              0.15738872 = queryWeight, product of:
                1.4040645 = boost
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.012275059 = queryNorm
              0.8071569 = fieldWeight in 1322, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.0625 = fieldNorm(doc=1322)
          0.10042557 = weight(abstract_txt:equivalence in 1322) [ClassicSimilarity], result of:
            0.10042557 = score(doc=1322,freq=1.0), product of:
              0.21360041 = queryWeight, product of:
                2.3132167 = boost
                7.5225 = idf(docFreq=64, maxDocs=44218)
                0.012275059 = queryNorm
              0.47015625 = fieldWeight in 1322, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5225 = idf(docFreq=64, maxDocs=44218)
                0.0625 = fieldNorm(doc=1322)
          0.10168243 = weight(abstract_txt:incomplete in 1322) [ClassicSimilarity], result of:
            0.10168243 = score(doc=1322,freq=1.0), product of:
              0.2153789 = queryWeight, product of:
                2.3228269 = boost
                7.5537524 = idf(docFreq=62, maxDocs=44218)
                0.012275059 = queryNorm
              0.47210953 = fieldWeight in 1322, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5537524 = idf(docFreq=62, maxDocs=44218)
                0.0625 = fieldNorm(doc=1322)
          0.088034526 = weight(abstract_txt:rules in 1322) [ClassicSimilarity], result of:
            0.088034526 = score(doc=1322,freq=3.0), product of:
              0.15528563 = queryWeight, product of:
                2.4156084 = boost
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.012275059 = queryNorm
              0.56692 = fieldWeight in 1322, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.0625 = fieldNorm(doc=1322)
          0.40902522 = weight(abstract_txt:rough in 1322) [ClassicSimilarity], result of:
            0.40902522 = score(doc=1322,freq=1.0), product of:
              0.7856812 = queryWeight, product of:
                7.6842074 = boost
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.012275059 = queryNorm
              0.5205995 = fieldWeight in 1322, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.0625 = fieldNorm(doc=1322)
        0.32 = coord(8/25)
    
  4. Miyamoto, S.: Application of rough sets to information retrieval (1998) 0.26
    0.25786126 = sum of:
      0.25786126 = product of:
        1.6116328 = sum of:
          0.033027824 = weight(abstract_txt:theory in 559) [ClassicSimilarity], result of:
            0.033027824 = score(doc=559,freq=1.0), product of:
              0.07766613 = queryWeight, product of:
                1.3948616 = boost
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.012275059 = queryNorm
              0.42525387 = fieldWeight in 559, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5360413 = idf(docFreq=1287, maxDocs=44218)
                0.09375 = fieldNorm(doc=559)
          0.019713065 = weight(abstract_txt:data in 559) [ClassicSimilarity], result of:
            0.019713065 = score(doc=559,freq=1.0), product of:
              0.06302481 = queryWeight, product of:
                1.5389223 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.012275059 = queryNorm
              0.31278262 = fieldWeight in 559, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.09375 = fieldNorm(doc=559)
          0.056037318 = weight(abstract_txt:model in 559) [ClassicSimilarity], result of:
            0.056037318 = score(doc=559,freq=1.0), product of:
              0.1499489 = queryWeight, product of:
                3.0644808 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.012275059 = queryNorm
              0.37370944 = fieldWeight in 559, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.09375 = fieldNorm(doc=559)
          1.5028546 = weight(abstract_txt:rough in 559) [ClassicSimilarity], result of:
            1.5028546 = score(doc=559,freq=6.0), product of:
              0.7856812 = queryWeight, product of:
                7.6842074 = boost
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.012275059 = queryNorm
              1.9128046 = fieldWeight in 559, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.09375 = fieldNorm(doc=559)
        0.16 = coord(4/25)
    
  5. Methodologies for knowledge discovery and data mining : Third Pacific-Asia Conference, PAKDD'99, Beijing, China, April 26-28, 1999, Proceedings (1999) 0.21
    0.21437891 = sum of:
      0.21437891 = product of:
        1.0718945 = sum of:
          0.028374445 = weight(abstract_txt:based in 3821) [ClassicSimilarity], result of:
            0.028374445 = score(doc=3821,freq=2.0), product of:
              0.057542123 = queryWeight, product of:
                1.4704621 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.012275059 = queryNorm
              0.49310738 = fieldWeight in 3821, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.109375 = fieldNorm(doc=3821)
          0.0325249 = weight(abstract_txt:data in 3821) [ClassicSimilarity], result of:
            0.0325249 = score(doc=3821,freq=2.0), product of:
              0.06302481 = queryWeight, product of:
                1.5389223 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.012275059 = queryNorm
              0.516065 = fieldWeight in 3821, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.109375 = fieldNorm(doc=3821)
          0.08894683 = weight(abstract_txt:rules in 3821) [ClassicSimilarity], result of:
            0.08894683 = score(doc=3821,freq=1.0), product of:
              0.15528563 = queryWeight, product of:
                2.4156084 = boost
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.012275059 = queryNorm
              0.572795 = fieldWeight in 3821, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.236983 = idf(docFreq=638, maxDocs=44218)
                0.109375 = fieldNorm(doc=3821)
          0.20625423 = weight(abstract_txt:mining in 3821) [ClassicSimilarity], result of:
            0.20625423 = score(doc=3821,freq=2.0), product of:
              0.21592496 = queryWeight, product of:
                2.8484743 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.012275059 = queryNorm
              0.95521253 = fieldWeight in 3821, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.109375 = fieldNorm(doc=3821)
          0.71579415 = weight(abstract_txt:rough in 3821) [ClassicSimilarity], result of:
            0.71579415 = score(doc=3821,freq=1.0), product of:
              0.7856812 = queryWeight, product of:
                7.6842074 = boost
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.012275059 = queryNorm
              0.9110491 = fieldWeight in 3821, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.329592 = idf(docFreq=28, maxDocs=44218)
                0.109375 = fieldNorm(doc=3821)
        0.2 = coord(5/25)
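
The score breakdowns above are labeled ClassicSimilarity, Lucene's TF-IDF scoring. The sketch below recomputes the first entry (doc 2909) from the values listed in its tree, assuming the usual definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the listed idf values. The function and variable names are illustrative, not part of Lucene's API, and Lucene itself works in 32-bit floats, so the last digits may differ slightly.

```python
import math

MAX_DOCS = 44218          # maxDocs from the explanation trees
QUERY_NORM = 0.012275059  # queryNorm from the explanation trees


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as it appears in the listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq, field_norm):
    """score = queryWeight * fieldWeight for one matching term."""
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * field_norm
    return query_weight * field_weight


# (term, boost, docFreq, freq) copied from the tree for doc 2909.
TERMS_2909 = [
    ("used", 1.0330058, 4177, 1),
    ("using", 1.0649309, 3765, 2),
    ("databases", 1.3569494, 1456, 1),
    ("theory", 1.3948616, 1287, 3),
    ("relation", 1.6582091, 546, 1),
    ("rules", 2.4156084, 638, 1),
    ("rough", 7.6842074, 28, 4),
]


def document_score(terms, field_norm, matched, query_terms):
    """Sum the term scores, then scale by coord = matched / query_terms."""
    total = sum(term_score(b, df, f, field_norm) for _, b, df, f in terms)
    return total * matched / query_terms


score_2909 = document_score(TERMS_2909, 0.09375, matched=7, query_terms=25)
print(round(score_2909, 4))
```

The result should land close to the 0.41457322 listed for the first entry. The rare term "rough" (docFreq=28) dominates every ranking because its high idf enters both queryWeight and fieldWeight, on top of the largest boost.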