Document (#20301)

Author
Trybula, W.J.
Title
Data mining and knowledge discovery
Source
Annual review of information science and technology. 32(1997), S.197-229
Year
1997
Abstract
State-of-the-art review of the recently developed concepts of data mining (defined as the automated process of evaluating data and finding relationships) and knowledge discovery (defined as the automated process of extracting information, especially unpredicted relationships or previously unknown patterns among the data), with particular reference to numerical data. Covers: the knowledge acquisition process; data mining; evaluation methods; and knowledge discovery. Concludes that existing work in the field is confusing because the terminology is inconsistent and poorly defined. Although methods are available for analyzing and cleaning databases, better coordinated efforts should be directed toward providing users with improved means of structuring search mechanisms to explore the data for relationships.
Theme
Data Mining
Literature review

Similar documents (content)

  1. Benoit, G.: Data mining (2002) 0.62
    0.61655885 = sum of:
      0.61655885 = product of:
        1.2844976 = sum of:
          0.04773725 = weight(abstract_txt:previously in 4296) [ClassicSimilarity], result of:
            0.04773725 = score(doc=4296,freq=1.0), product of:
              0.12447366 = queryWeight, product of:
                1.0493613 = boost
                6.1362057 = idf(docFreq=259, maxDocs=44218)
                0.019330919 = queryNorm
              0.38351285 = fieldWeight in 4296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1362057 = idf(docFreq=259, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.0689038 = weight(abstract_txt:extracting in 4296) [ClassicSimilarity], result of:
            0.0689038 = score(doc=4296,freq=1.0), product of:
              0.15897712 = queryWeight, product of:
                1.1859152 = boost
                6.9347134 = idf(docFreq=116, maxDocs=44218)
                0.019330919 = queryNorm
              0.4334196 = fieldWeight in 4296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9347134 = idf(docFreq=116, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.08637855 = weight(abstract_txt:inconsistent in 4296) [ClassicSimilarity], result of:
            0.08637855 = score(doc=4296,freq=1.0), product of:
              0.18483171 = queryWeight, product of:
                1.2787174 = boost
                7.4773793 = idf(docFreq=67, maxDocs=44218)
                0.019330919 = queryNorm
              0.4673362 = fieldWeight in 4296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4773793 = idf(docFreq=67, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.094616406 = weight(abstract_txt:poorly in 4296) [ClassicSimilarity], result of:
            0.094616406 = score(doc=4296,freq=1.0), product of:
              0.19640392 = queryWeight, product of:
                1.3181396 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.019330919 = queryNorm
              0.48174396 = fieldWeight in 4296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.10512534 = weight(abstract_txt:confusing in 4296) [ClassicSimilarity], result of:
            0.10512534 = score(doc=4296,freq=1.0), product of:
              0.2106901 = queryWeight, product of:
                1.3652381 = boost
                7.983315 = idf(docFreq=40, maxDocs=44218)
                0.019330919 = queryNorm
              0.4989572 = fieldWeight in 4296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.983315 = idf(docFreq=40, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.02946536 = weight(abstract_txt:methods in 4296) [ClassicSimilarity], result of:
            0.02946536 = score(doc=4296,freq=1.0), product of:
              0.11369038 = queryWeight, product of:
                1.4182839 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.019330919 = queryNorm
              0.259172 = fieldWeight in 4296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.041206844 = weight(abstract_txt:process in 4296) [ClassicSimilarity], result of:
            0.041206844 = score(doc=4296,freq=1.0), product of:
              0.16275182 = queryWeight, product of:
                2.0783079 = boost
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.019330919 = queryNorm
              0.25318822 = fieldWeight in 4296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.07412347 = weight(abstract_txt:knowledge in 4296) [ClassicSimilarity], result of:
            0.07412347 = score(doc=4296,freq=4.0), product of:
              0.1669077 = queryWeight, product of:
                2.43027 = boost
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.019330919 = queryNorm
              0.4440986 = fieldWeight in 4296, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.088472635 = weight(abstract_txt:defined in 4296) [ClassicSimilarity], result of:
            0.088472635 = score(doc=4296,freq=1.0), product of:
              0.2708646 = queryWeight, product of:
                2.681162 = boost
                5.2260876 = idf(docFreq=645, maxDocs=44218)
                0.019330919 = queryNorm
              0.32663047 = fieldWeight in 4296, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2260876 = idf(docFreq=645, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.19049056 = weight(abstract_txt:discovery in 4296) [ClassicSimilarity], result of:
            0.19049056 = score(doc=4296,freq=3.0), product of:
              0.31315175 = queryWeight, product of:
                2.8828654 = boost
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.019330919 = queryNorm
              0.6083011 = fieldWeight in 4296, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.32641098 = weight(abstract_txt:mining in 4296) [ClassicSimilarity], result of:
            0.32641098 = score(doc=4296,freq=5.0), product of:
              0.3782098 = queryWeight, product of:
                3.1682055 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.019330919 = queryNorm
              0.8630421 = fieldWeight in 4296, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
          0.13156648 = weight(abstract_txt:data in 4296) [ClassicSimilarity], result of:
            0.13156648 = score(doc=4296,freq=6.0), product of:
              0.25758365 = queryWeight, product of:
                3.9938753 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.019330919 = queryNorm
              0.5107719 = fieldWeight in 4296, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=4296)
        0.48 = coord(12/25)
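
The per-term numbers in the explanation above can be reproduced by hand. The following is a minimal sketch, not the search engine's own code: it assumes the usual TF-IDF definitions that the listed values are consistent with, namely tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost × idf × queryNorm, and fieldWeight = tf × idf × fieldNorm. The function names are illustrative only.

```python
import math


def idf(doc_freq, max_docs):
    # Inverse document frequency, matching the idf(docFreq=..., maxDocs=...) lines.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    # Term-frequency factor, matching the tf(freq=...) lines.
    return math.sqrt(freq)


def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # score = queryWeight * fieldWeight for a single matching term.
    i = idf(doc_freq, max_docs)
    query_weight = boost * i * query_norm
    field_weight = tf(freq) * i * field_norm
    return query_weight * field_weight


# Values copied from the "previously" term of document 4296 above.
previously = term_score(
    freq=1.0, doc_freq=259, max_docs=44218,
    boost=1.0493613, query_norm=0.019330919, field_norm=0.0625,
)
print(round(previously, 6))
```

Small differences in the last digits are to be expected, since the listing appears to use single-precision arithmetic while Python uses double precision.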
    
  2. Fayyad, U.M.: Data mining and knowledge discovery : making sense out of data (1996) 0.25
    0.2537179 = sum of:
      0.2537179 = product of:
        1.057158 = sum of:
          0.1378076 = weight(abstract_txt:extracting in 7007) [ClassicSimilarity], result of:
            0.1378076 = score(doc=7007,freq=1.0), product of:
              0.15897712 = queryWeight, product of:
                1.1859152 = boost
                6.9347134 = idf(docFreq=116, maxDocs=44218)
                0.019330919 = queryNorm
              0.8668392 = fieldWeight in 7007, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9347134 = idf(docFreq=116, maxDocs=44218)
                0.125 = fieldNorm(doc=7007)
          0.11655056 = weight(abstract_txt:process in 7007) [ClassicSimilarity], result of:
            0.11655056 = score(doc=7007,freq=2.0), product of:
              0.16275182 = queryWeight, product of:
                2.0783079 = boost
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.019330919 = queryNorm
              0.7161244 = fieldWeight in 7007, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.125 = fieldNorm(doc=7007)
          0.10482643 = weight(abstract_txt:knowledge in 7007) [ClassicSimilarity], result of:
            0.10482643 = score(doc=7007,freq=2.0), product of:
              0.1669077 = queryWeight, product of:
                2.43027 = boost
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.019330919 = queryNorm
              0.62805027 = fieldWeight in 7007, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.125 = fieldNorm(doc=7007)
          0.21995956 = weight(abstract_txt:discovery in 7007) [ClassicSimilarity], result of:
            0.21995956 = score(doc=7007,freq=1.0), product of:
              0.31315175 = queryWeight, product of:
                2.8828654 = boost
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.019330919 = queryNorm
              0.70240563 = fieldWeight in 7007, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.125 = fieldNorm(doc=7007)
          0.29195085 = weight(abstract_txt:mining in 7007) [ClassicSimilarity], result of:
            0.29195085 = score(doc=7007,freq=1.0), product of:
              0.3782098 = queryWeight, product of:
                3.1682055 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.019330919 = queryNorm
              0.7719283 = fieldWeight in 7007, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.125 = fieldNorm(doc=7007)
          0.1860631 = weight(abstract_txt:data in 7007) [ClassicSimilarity], result of:
            0.1860631 = score(doc=7007,freq=3.0), product of:
              0.25758365 = queryWeight, product of:
                3.9938753 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.019330919 = queryNorm
              0.72234046 = fieldWeight in 7007, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.125 = fieldNorm(doc=7007)
        0.24 = coord(6/25)
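
The final score of a document is the sum of its per-term weights multiplied by the coordination factor, i.e. the fraction of query terms that matched. A minimal sketch of that last step, using the six term weights listed for document 7007 above (the function name is illustrative, not part of any library):

```python
def combined_score(term_scores, matched_terms, total_terms):
    # Sum of per-term weights, scaled by coord = matched_terms / total_terms.
    coord = matched_terms / total_terms
    return sum(term_scores) * coord


# Per-term weights of document 7007: extracting, process, knowledge,
# discovery, mining, data.
weights_7007 = [
    0.1378076, 0.11655056, 0.10482643,
    0.21995956, 0.29195085, 0.1860631,
]

# 6 of the 25 query terms matched, hence coord(6/25) = 0.24.
score_7007 = combined_score(weights_7007, matched_terms=6, total_terms=25)
print(round(score_7007, 4))
```

This shows why a record that matches fewer query terms ranks lower even when its individual term weights are high: the coordination factor scales the whole sum down.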
    
  3. Berry, M.W.; Esau, R.; Kiefer, B.: The use of text mining techniques in electronic discovery for legal matters (2012) 0.22
    0.22077149 = sum of:
      0.22077149 = product of:
        0.7884696 = sum of:
          0.067734435 = weight(abstract_txt:analyzing in 91) [ClassicSimilarity], result of:
            0.067734435 = score(doc=91,freq=1.0), product of:
              0.119945705 = queryWeight, product of:
                1.0300983 = boost
                6.023564 = idf(docFreq=290, maxDocs=44218)
                0.019330919 = queryNorm
              0.5647091 = fieldWeight in 91, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.023564 = idf(docFreq=290, maxDocs=44218)
                0.09375 = fieldNorm(doc=91)
          0.071605876 = weight(abstract_txt:previously in 91) [ClassicSimilarity], result of:
            0.071605876 = score(doc=91,freq=1.0), product of:
              0.12447366 = queryWeight, product of:
                1.0493613 = boost
                6.1362057 = idf(docFreq=259, maxDocs=44218)
                0.019330919 = queryNorm
              0.5752693 = fieldWeight in 91, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1362057 = idf(docFreq=259, maxDocs=44218)
                0.09375 = fieldNorm(doc=91)
          0.044198044 = weight(abstract_txt:methods in 91) [ClassicSimilarity], result of:
            0.044198044 = score(doc=91,freq=1.0), product of:
              0.11369038 = queryWeight, product of:
                1.4182839 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.019330919 = queryNorm
              0.388758 = fieldWeight in 91, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.09375 = fieldNorm(doc=91)
          0.10705852 = weight(abstract_txt:process in 91) [ClassicSimilarity], result of:
            0.10705852 = score(doc=91,freq=3.0), product of:
              0.16275182 = queryWeight, product of:
                2.0783079 = boost
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.019330919 = queryNorm
              0.6578023 = fieldWeight in 91, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.09375 = fieldNorm(doc=91)
          0.16496965 = weight(abstract_txt:discovery in 91) [ClassicSimilarity], result of:
            0.16496965 = score(doc=91,freq=1.0), product of:
              0.31315175 = queryWeight, product of:
                2.8828654 = boost
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.019330919 = queryNorm
              0.5268042 = fieldWeight in 91, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.09375 = fieldNorm(doc=91)
          0.21896313 = weight(abstract_txt:mining in 91) [ClassicSimilarity], result of:
            0.21896313 = score(doc=91,freq=1.0), product of:
              0.3782098 = queryWeight, product of:
                3.1682055 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.019330919 = queryNorm
              0.57894623 = fieldWeight in 91, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.09375 = fieldNorm(doc=91)
          0.11393992 = weight(abstract_txt:data in 91) [ClassicSimilarity], result of:
            0.11393992 = score(doc=91,freq=2.0), product of:
              0.25758365 = queryWeight, product of:
                3.9938753 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.019330919 = queryNorm
              0.44234142 = fieldWeight in 91, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.09375 = fieldNorm(doc=91)
        0.28 = coord(7/25)
    
  4. Raghavan, V.V.; Deogun, J.S.; Sever, H.: Knowledge discovery and data mining : introduction (1998) 0.21
    0.2061924 = sum of:
      0.2061924 = product of:
        1.030962 = sum of:
          0.08741291 = weight(abstract_txt:process in 2899) [ClassicSimilarity], result of:
            0.08741291 = score(doc=2899,freq=2.0), product of:
              0.16275182 = queryWeight, product of:
                2.0783079 = boost
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.019330919 = queryNorm
              0.5370933 = fieldWeight in 2899, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
          0.11118521 = weight(abstract_txt:knowledge in 2899) [ClassicSimilarity], result of:
            0.11118521 = score(doc=2899,freq=4.0), product of:
              0.1669077 = queryWeight, product of:
                2.43027 = boost
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.019330919 = queryNorm
              0.6661479 = fieldWeight in 2899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
          0.23330234 = weight(abstract_txt:discovery in 2899) [ClassicSimilarity], result of:
            0.23330234 = score(doc=2899,freq=2.0), product of:
              0.31315175 = queryWeight, product of:
                2.8828654 = boost
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.019330919 = queryNorm
              0.7450137 = fieldWeight in 2899, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
          0.43792626 = weight(abstract_txt:mining in 2899) [ClassicSimilarity], result of:
            0.43792626 = score(doc=2899,freq=4.0), product of:
              0.3782098 = queryWeight, product of:
                3.1682055 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.019330919 = queryNorm
              1.1578925 = fieldWeight in 2899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
          0.16113538 = weight(abstract_txt:data in 2899) [ClassicSimilarity], result of:
            0.16113538 = score(doc=2899,freq=4.0), product of:
              0.25758365 = queryWeight, product of:
                3.9938753 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.019330919 = queryNorm
              0.62556523 = fieldWeight in 2899, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.09375 = fieldNorm(doc=2899)
        0.2 = coord(5/25)
    
  5. Bath, P.A.: Data mining in health and medical information (2003) 0.20
    0.20361826 = sum of:
      0.20361826 = product of:
        0.7272081 = sum of:
          0.067734435 = weight(abstract_txt:analyzing in 4263) [ClassicSimilarity], result of:
            0.067734435 = score(doc=4263,freq=1.0), product of:
              0.119945705 = queryWeight, product of:
                1.0300983 = boost
                6.023564 = idf(docFreq=290, maxDocs=44218)
                0.019330919 = queryNorm
              0.5647091 = fieldWeight in 4263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.023564 = idf(docFreq=290, maxDocs=44218)
                0.09375 = fieldNorm(doc=4263)
          0.044198044 = weight(abstract_txt:methods in 4263) [ClassicSimilarity], result of:
            0.044198044 = score(doc=4263,freq=1.0), product of:
              0.11369038 = queryWeight, product of:
                1.4182839 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.019330919 = queryNorm
              0.388758 = fieldWeight in 4263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.09375 = fieldNorm(doc=4263)
          0.061810266 = weight(abstract_txt:process in 4263) [ClassicSimilarity], result of:
            0.061810266 = score(doc=4263,freq=1.0), product of:
              0.16275182 = queryWeight, product of:
                2.0783079 = boost
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.019330919 = queryNorm
              0.37978232 = fieldWeight in 4263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0510116 = idf(docFreq=2091, maxDocs=44218)
                0.09375 = fieldNorm(doc=4263)
          0.055592604 = weight(abstract_txt:knowledge in 4263) [ClassicSimilarity], result of:
            0.055592604 = score(doc=4263,freq=1.0), product of:
              0.1669077 = queryWeight, product of:
                2.43027 = boost
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.019330919 = queryNorm
              0.33307394 = fieldWeight in 4263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5527887 = idf(docFreq=3442, maxDocs=44218)
                0.09375 = fieldNorm(doc=4263)
          0.16496965 = weight(abstract_txt:discovery in 4263) [ClassicSimilarity], result of:
            0.16496965 = score(doc=4263,freq=1.0), product of:
              0.31315175 = queryWeight, product of:
                2.8828654 = boost
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.019330919 = queryNorm
              0.5268042 = fieldWeight in 4263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.619245 = idf(docFreq=435, maxDocs=44218)
                0.09375 = fieldNorm(doc=4263)
          0.21896313 = weight(abstract_txt:mining in 4263) [ClassicSimilarity], result of:
            0.21896313 = score(doc=4263,freq=1.0), product of:
              0.3782098 = queryWeight, product of:
                3.1682055 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.019330919 = queryNorm
              0.57894623 = fieldWeight in 4263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.09375 = fieldNorm(doc=4263)
          0.11393992 = weight(abstract_txt:data in 4263) [ClassicSimilarity], result of:
            0.11393992 = score(doc=4263,freq=2.0), product of:
              0.25758365 = queryWeight, product of:
                3.9938753 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.019330919 = queryNorm
              0.44234142 = fieldWeight in 4263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.09375 = fieldNorm(doc=4263)
        0.28 = coord(7/25)