Document (#20898)

Author
Wong, S.K.M.
Butz, C.J.
Xiang, X.
Title
Automated database schema design using mined data dependencies
Source
Journal of the American Society for Information Science. 49(1998) no.5, pp. 455-470
Year
1998
Abstract
Data dependencies are used in database schema design to enforce the correctness of a database as well as to reduce redundant data. These dependencies are usually determined from the semantics of the attributes and are then enforced upon the relations. Describes a bottom-up procedure for discovering multivalued dependencies in observed data without knowing a priori the relationships among the attributes. The proposed algorithm is an application of the technique designed for learning conditional independencies in probabilistic reasoning. A prototype system for automated database schema design has been implemented. Experiments were carried out to demonstrate both the effectiveness and efficiency of the method.
Footnote
Contribution to a special issue devoted to knowledge discovery and data mining
Theme
Data Mining
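
Illustration (not part of the record): the abstract refers to discovering multivalued dependencies in observed data. The sketch below only checks, for a given relation, whether a candidate multivalued dependency X ->> Y holds, using the textbook definition: within each group of rows sharing the same X value, the Y values and the values of the remaining attributes must combine as a full cross product. It is not the article's algorithm, which is described as based on learning conditional independencies; the function name `holds_mvd` and the sample attributes `course`, `teacher`, and `book` are assumptions made for this example.

```python
from collections import defaultdict

def holds_mvd(rows, x, y):
    """Return True if the multivalued dependency x ->> y holds in rows.

    rows: list of dicts sharing the same keys (hypothetical input format).
    x, y: lists of attribute names; the remaining attributes form z.
    """
    if not rows:
        return True
    z = [a for a in rows[0] if a not in x and a not in y]
    # For each x value, collect the distinct y values, z values, and (y, z) pairs.
    groups = defaultdict(lambda: (set(), set(), set()))
    for row in rows:
        ys, zs, pairs = groups[tuple(row[a] for a in x)]
        yv = tuple(row[a] for a in y)
        zv = tuple(row[a] for a in z)
        ys.add(yv)
        zs.add(zv)
        pairs.add((yv, zv))
    # The dependency holds when every group is a full cross product of ys and zs.
    return all(len(pairs) == len(ys) * len(zs) for ys, zs, pairs in groups.values())

full = [
    {"course": "db", "teacher": "smith", "book": "codd"},
    {"course": "db", "teacher": "smith", "book": "date"},
    {"course": "db", "teacher": "jones", "book": "codd"},
    {"course": "db", "teacher": "jones", "book": "date"},
]
print(holds_mvd(full, ["course"], ["teacher"]))       # True
print(holds_mvd(full[:3], ["course"], ["teacher"]))   # False
```

Dropping one row from the sample breaks the cross product, so the check fails; a discovery procedure would have to search over candidate X and Y sets, which this sketch does not attempt.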

Similar documents (author)

  1. Wong, S.K.M.: On modelling information retrieval with probabilistic inference (1995) 8.00
    8.003967 = sum of:
      8.003967 = sum of:
        2.8975797 = weight(author_txt:wong in 1938) [ClassicSimilarity], result of:
          2.8975797 = score(doc=1938,freq=1.0), product of:
            0.56535524 = queryWeight, product of:
              8.200379 = idf(docFreq=32, maxDocs=44218)
              0.06894257 = queryNorm
            5.125237 = fieldWeight in 1938, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.200379 = idf(docFreq=32, maxDocs=44218)
              0.625 = fieldNorm(doc=1938)
        5.106387 = weight(author_txt:s.k.m in 1938) [ClassicSimilarity], result of:
          5.106387 = score(doc=1938,freq=1.0), product of:
            0.82484746 = queryWeight, product of:
              1.2078865 = boost
              9.905128 = idf(docFreq=5, maxDocs=44218)
              0.06894257 = queryNorm
            6.190705 = fieldWeight in 1938, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.905128 = idf(docFreq=5, maxDocs=44218)
              0.625 = fieldNorm(doc=1938)
    
  2. Wong, S.K.M.; Yao, Y.Y.: An information-theoretic measure of term specificity (1992) 6.40
    6.4031734 = sum of:
      6.4031734 = sum of:
        2.3180637 = weight(author_txt:wong in 4807) [ClassicSimilarity], result of:
          2.3180637 = score(doc=4807,freq=1.0), product of:
            0.56535524 = queryWeight, product of:
              8.200379 = idf(docFreq=32, maxDocs=44218)
              0.06894257 = queryNorm
            4.1001897 = fieldWeight in 4807, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.200379 = idf(docFreq=32, maxDocs=44218)
              0.5 = fieldNorm(doc=4807)
        4.0851097 = weight(author_txt:s.k.m in 4807) [ClassicSimilarity], result of:
          4.0851097 = score(doc=4807,freq=1.0), product of:
            0.82484746 = queryWeight, product of:
              1.2078865 = boost
              9.905128 = idf(docFreq=5, maxDocs=44218)
              0.06894257 = queryNorm
            4.952564 = fieldWeight in 4807, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.905128 = idf(docFreq=5, maxDocs=44218)
              0.5 = fieldNorm(doc=4807)
    
  3. Wong, S.K.M.; Yao, Y.Y.: Query formulation in linear retrieval models (1990) 6.40
    6.4031734 = sum of:
      6.4031734 = sum of:
        2.3180637 = weight(author_txt:wong in 3571) [ClassicSimilarity], result of:
          2.3180637 = score(doc=3571,freq=1.0), product of:
            0.56535524 = queryWeight, product of:
              8.200379 = idf(docFreq=32, maxDocs=44218)
              0.06894257 = queryNorm
            4.1001897 = fieldWeight in 3571, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.200379 = idf(docFreq=32, maxDocs=44218)
              0.5 = fieldNorm(doc=3571)
        4.0851097 = weight(author_txt:s.k.m in 3571) [ClassicSimilarity], result of:
          4.0851097 = score(doc=3571,freq=1.0), product of:
            0.82484746 = queryWeight, product of:
              1.2078865 = boost
              9.905128 = idf(docFreq=5, maxDocs=44218)
              0.06894257 = queryNorm
            4.952564 = fieldWeight in 3571, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.905128 = idf(docFreq=5, maxDocs=44218)
              0.5 = fieldNorm(doc=3571)
    
  4. Wong, S.K.M.; Yao, Y.Y.; Salton, G.; Buckley, C.: Evaluation of an adaptive linear model (1991) 4.00
    4.0019836 = sum of:
      4.0019836 = sum of:
        1.4487898 = weight(author_txt:wong in 4836) [ClassicSimilarity], result of:
          1.4487898 = score(doc=4836,freq=1.0), product of:
            0.56535524 = queryWeight, product of:
              8.200379 = idf(docFreq=32, maxDocs=44218)
              0.06894257 = queryNorm
            2.5626185 = fieldWeight in 4836, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.200379 = idf(docFreq=32, maxDocs=44218)
              0.3125 = fieldNorm(doc=4836)
        2.5531936 = weight(author_txt:s.k.m in 4836) [ClassicSimilarity], result of:
          2.5531936 = score(doc=4836,freq=1.0), product of:
            0.82484746 = queryWeight, product of:
              1.2078865 = boost
              9.905128 = idf(docFreq=5, maxDocs=44218)
              0.06894257 = queryNorm
            3.0953524 = fieldWeight in 4836, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.905128 = idf(docFreq=5, maxDocs=44218)
              0.3125 = fieldNorm(doc=4836)
    
  5. Wong, K.: Frühe Spuren des menschlichen Geistes (2005) 1.45
    1.4487898 = sum of:
      1.4487898 = product of:
        2.8975797 = sum of:
          2.8975797 = weight(author_txt:wong in 983) [ClassicSimilarity], result of:
            2.8975797 = score(doc=983,freq=1.0), product of:
              0.56535524 = queryWeight, product of:
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.06894257 = queryNorm
              5.125237 = fieldWeight in 983, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.625 = fieldNorm(doc=983)
        0.5 = coord(1/2)
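
Illustration (not part of the record): the score trees above can be reproduced with a few lines of arithmetic. Each term contributes queryWeight times fieldWeight, where queryWeight = boost * idf * queryNorm and fieldWeight = tf * idf * fieldNorm, with tf equal to the square root of the term frequency; a document matching only one of the two query terms is scaled by coord(1/2). The sketch below uses only constants copied from the listing; the function and variable names are assumptions for this example.

```python
import math

def term_score(freq, idf, query_norm, field_norm, boost=1.0):
    """Score of one term as shown in the explain trees: queryWeight * fieldWeight."""
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

# Constants copied from the author_txt explain output above.
QUERY_NORM = 0.06894257
IDF_WONG = 8.200379
IDF_SKM = 9.905128
BOOST_SKM = 1.2078865

# Document 1938 matches both terms, so the two term scores are summed.
score_1938 = (term_score(1.0, IDF_WONG, QUERY_NORM, 0.625)
              + term_score(1.0, IDF_SKM, QUERY_NORM, 0.625, boost=BOOST_SKM))

# Document 983 matches only "wong", so the sum is scaled by coord(1/2).
score_983 = 0.5 * term_score(1.0, IDF_WONG, QUERY_NORM, 0.625)

print(round(score_1938, 3))  # about 8.004
print(round(score_983, 3))   # about 1.449
```

The only quantity that varies between the first four results is fieldNorm (0.625, 0.5, 0.5, 0.3125), which is why the ranking follows it directly.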
    

Similar documents (content)

  1. Bosc, P.; Dubois, D.; Prade, H.: Fuzzy functional dependencies and redundancy elimination (1998) 0.15
    0.15427081 = sum of:
      0.15427081 = product of:
        0.9641926 = sum of:
          0.050585635 = weight(abstract_txt:design in 590) [ClassicSimilarity], result of:
            0.050585635 = score(doc=590,freq=2.0), product of:
              0.11677521 = queryWeight, product of:
                1.9414463 = boost
                3.9207718 = idf(docFreq=2382, maxDocs=44218)
                0.015341001 = queryNorm
              0.43318814 = fieldWeight in 590, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9207718 = idf(docFreq=2382, maxDocs=44218)
                0.078125 = fieldNorm(doc=590)
          0.041559007 = weight(abstract_txt:data in 590) [ClassicSimilarity], result of:
            0.041559007 = score(doc=590,freq=2.0), product of:
              0.112742804 = queryWeight, product of:
                2.202743 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.015341001 = queryNorm
              0.36861783 = fieldWeight in 590, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.078125 = fieldNorm(doc=590)
          0.10744843 = weight(abstract_txt:database in 590) [ClassicSimilarity], result of:
            0.10744843 = score(doc=590,freq=3.0), product of:
              0.18553038 = queryWeight, product of:
                2.8257058 = boost
                4.2799077 = idf(docFreq=1663, maxDocs=44218)
                0.015341001 = queryNorm
              0.579142 = fieldWeight in 590, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.2799077 = idf(docFreq=1663, maxDocs=44218)
                0.078125 = fieldNorm(doc=590)
          0.76459956 = weight(abstract_txt:dependencies in 590) [ClassicSimilarity], result of:
            0.76459956 = score(doc=590,freq=4.0), product of:
              0.62362677 = queryWeight, product of:
                5.1806207 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.015341001 = queryNorm
              1.2260531 = fieldWeight in 590, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.078125 = fieldNorm(doc=590)
        0.16 = coord(4/25)
    
  2. Hassanien, A.-E.: Rough set approach for attribute reduction and rule generation : a case of patients with suspected breast cancer (2004) 0.10
    0.095575556 = sum of:
      0.095575556 = product of:
        0.59734726 = sum of:
          0.081271775 = weight(abstract_txt:redundant in 2883) [ClassicSimilarity], result of:
            0.081271775 = score(doc=2883,freq=1.0), product of:
              0.16238101 = queryWeight, product of:
                1.3217735 = boost
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.015341001 = queryNorm
              0.5005005 = fieldWeight in 2883, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.0625 = fieldNorm(doc=2883)
          0.16321707 = weight(abstract_txt:attributes in 2883) [ClassicSimilarity], result of:
            0.16321707 = score(doc=2883,freq=5.0), product of:
              0.19044626 = queryWeight, product of:
                2.0243735 = boost
                6.1323667 = idf(docFreq=260, maxDocs=44218)
                0.015341001 = queryNorm
              0.8570243 = fieldWeight in 2883, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1323667 = idf(docFreq=260, maxDocs=44218)
                0.0625 = fieldNorm(doc=2883)
          0.04701865 = weight(abstract_txt:data in 2883) [ClassicSimilarity], result of:
            0.04701865 = score(doc=2883,freq=4.0), product of:
              0.112742804 = queryWeight, product of:
                2.202743 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.015341001 = queryNorm
              0.41704348 = fieldWeight in 2883, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=2883)
          0.3058398 = weight(abstract_txt:dependencies in 2883) [ClassicSimilarity], result of:
            0.3058398 = score(doc=2883,freq=1.0), product of:
              0.62362677 = queryWeight, product of:
                5.1806207 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.015341001 = queryNorm
              0.49042124 = fieldWeight in 2883, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.0625 = fieldNorm(doc=2883)
        0.16 = coord(4/25)
    
  3. Peckham, J.; MacKellar, B.; Vorback, J.: A unified approach to the design and generation of complex database schemata (1997) 0.09
    0.086401425 = sum of:
      0.086401425 = product of:
        0.5400089 = sum of:
          0.09744732 = weight(abstract_txt:automated in 1259) [ClassicSimilarity], result of:
            0.09744732 = score(doc=1259,freq=1.0), product of:
              0.15900348 = queryWeight, product of:
                1.8497275 = boost
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.015341001 = queryNorm
              0.6128628 = fieldWeight in 1259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6033173 = idf(docFreq=442, maxDocs=44218)
                0.109375 = fieldNorm(doc=1259)
          0.07081989 = weight(abstract_txt:design in 1259) [ClassicSimilarity], result of:
            0.07081989 = score(doc=1259,freq=2.0), product of:
              0.11677521 = queryWeight, product of:
                1.9414463 = boost
                3.9207718 = idf(docFreq=2382, maxDocs=44218)
                0.015341001 = queryNorm
              0.60646343 = fieldWeight in 1259, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9207718 = idf(docFreq=2382, maxDocs=44218)
                0.109375 = fieldNorm(doc=1259)
          0.1504278 = weight(abstract_txt:database in 1259) [ClassicSimilarity], result of:
            0.1504278 = score(doc=1259,freq=3.0), product of:
              0.18553038 = queryWeight, product of:
                2.8257058 = boost
                4.2799077 = idf(docFreq=1663, maxDocs=44218)
                0.015341001 = queryNorm
              0.81079876 = fieldWeight in 1259, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.2799077 = idf(docFreq=1663, maxDocs=44218)
                0.109375 = fieldNorm(doc=1259)
          0.22131386 = weight(abstract_txt:schema in 1259) [ClassicSimilarity], result of:
            0.22131386 = score(doc=1259,freq=1.0), product of:
              0.3144823 = queryWeight, product of:
                3.1860175 = boost
                6.434197 = idf(docFreq=192, maxDocs=44218)
                0.015341001 = queryNorm
              0.7037403 = fieldWeight in 1259, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.434197 = idf(docFreq=192, maxDocs=44218)
                0.109375 = fieldNorm(doc=1259)
        0.16 = coord(4/25)
    
  4. Leazer, G.H.: A conceptual schema for the control of bibliographic works (1994) 0.08
    0.080642104 = sum of:
      0.080642104 = product of:
        0.40321052 = sum of:
          0.028615559 = weight(abstract_txt:design in 3033) [ClassicSimilarity], result of:
            0.028615559 = score(doc=3033,freq=1.0), product of:
              0.11677521 = queryWeight, product of:
                1.9414463 = boost
                3.9207718 = idf(docFreq=2382, maxDocs=44218)
                0.015341001 = queryNorm
              0.24504824 = fieldWeight in 3033, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9207718 = idf(docFreq=2382, maxDocs=44218)
                0.0625 = fieldNorm(doc=3033)
          0.07299289 = weight(abstract_txt:attributes in 3033) [ClassicSimilarity], result of:
            0.07299289 = score(doc=3033,freq=1.0), product of:
              0.19044626 = queryWeight, product of:
                2.0243735 = boost
                6.1323667 = idf(docFreq=260, maxDocs=44218)
                0.015341001 = queryNorm
              0.38327292 = fieldWeight in 3033, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1323667 = idf(docFreq=260, maxDocs=44218)
                0.0625 = fieldNorm(doc=3033)
          0.05256845 = weight(abstract_txt:data in 3033) [ClassicSimilarity], result of:
            0.05256845 = score(doc=3033,freq=5.0), product of:
              0.112742804 = queryWeight, product of:
                2.202743 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.015341001 = queryNorm
              0.46626878 = fieldWeight in 3033, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=3033)
          0.07018502 = weight(abstract_txt:database in 3033) [ClassicSimilarity], result of:
            0.07018502 = score(doc=3033,freq=2.0), product of:
              0.18553038 = queryWeight, product of:
                2.8257058 = boost
                4.2799077 = idf(docFreq=1663, maxDocs=44218)
                0.015341001 = queryNorm
              0.37829396 = fieldWeight in 3033, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2799077 = idf(docFreq=1663, maxDocs=44218)
                0.0625 = fieldNorm(doc=3033)
          0.1788486 = weight(abstract_txt:schema in 3033) [ClassicSimilarity], result of:
            0.1788486 = score(doc=3033,freq=2.0), product of:
              0.3144823 = queryWeight, product of:
                3.1860175 = boost
                6.434197 = idf(docFreq=192, maxDocs=44218)
                0.015341001 = queryNorm
              0.568708 = fieldWeight in 3033, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.434197 = idf(docFreq=192, maxDocs=44218)
                0.0625 = fieldNorm(doc=3033)
        0.2 = coord(5/25)
    
  5. Bell, D.A.; Guan, J.W.: Computational methods for rough classification and discovery (1998) 0.08
    0.07568849 = sum of:
      0.07568849 = product of:
        0.6307374 = sum of:
          0.09753523 = weight(abstract_txt:discovering in 2909) [ClassicSimilarity], result of:
            0.09753523 = score(doc=2909,freq=1.0), product of:
              0.13994442 = queryWeight, product of:
                1.227064 = boost
                7.4342074 = idf(docFreq=70, maxDocs=44218)
                0.015341001 = queryNorm
              0.69695693 = fieldWeight in 2909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4342074 = idf(docFreq=70, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
          0.07444246 = weight(abstract_txt:database in 2909) [ClassicSimilarity], result of:
            0.07444246 = score(doc=2909,freq=1.0), product of:
              0.18553038 = queryWeight, product of:
                2.8257058 = boost
                4.2799077 = idf(docFreq=1663, maxDocs=44218)
                0.015341001 = queryNorm
              0.40124136 = fieldWeight in 2909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2799077 = idf(docFreq=1663, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
          0.4587597 = weight(abstract_txt:dependencies in 2909) [ClassicSimilarity], result of:
            0.4587597 = score(doc=2909,freq=1.0), product of:
              0.62362677 = queryWeight, product of:
                5.1806207 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.015341001 = queryNorm
              0.7356318 = fieldWeight in 2909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.09375 = fieldNorm(doc=2909)
        0.12 = coord(3/25)
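
Illustration (not part of the record): the idf values in the content-similarity trees appear consistent with idf = 1 + ln(maxDocs / (docFreq + 1)). That formula is inferred from the numbers shown in the listing, not from the system's configuration, so treat it as an assumption. With it, a whole document score can be recomputed from docFreq, term frequency, boost, and fieldNorm alone; the sketch below does this for the last result (doc 2909), which matches 3 of the 25 query terms. All names are hypothetical.

```python
import math

MAX_DOCS = 44218          # maxDocs as shown in the explain output
QUERY_NORM = 0.015341001  # queryNorm for the abstract_txt query

def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency, assuming 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, boost, field_norm):
    """queryWeight * fieldWeight for one term, with tf = sqrt(freq)."""
    w = idf(doc_freq)
    query_weight = boost * w * QUERY_NORM
    field_weight = math.sqrt(freq) * w * field_norm
    return query_weight * field_weight

# (freq, docFreq, boost, fieldNorm) for the three terms matched in doc 2909.
terms_2909 = [
    (1.0, 70, 1.227064, 0.09375),     # discovering
    (1.0, 1663, 2.8257058, 0.09375),  # database
    (1.0, 46, 5.1806207, 0.09375),    # dependencies
]

# The summed term scores are scaled by coord(3/25).
score_2909 = (3 / 25) * sum(term_score(*t) for t in terms_2909)

print(round(idf(46), 4))     # about 7.8467
print(round(score_2909, 4))  # about 0.0757
```

The rare term "dependencies" (docFreq=46) dominates every result in this list because idf enters the score twice, once in queryWeight and once in fieldWeight.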