Search (28 results, page 2 of 2)

Principles of data mining and knowledge discovery (1998) 0.00
```
0.001682769 = product of:
  0.010096614 = sum of:
    0.010096614 = weight(_text_:in in 3822) [ClassicSimilarity], result of:
      0.010096614 = score(doc=3822,freq=4.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.17003182 = fieldWeight in 3822, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0625 = fieldNorm(doc=3822)
  0.16666667 = coord(1/6)
```
Abstract

The volume presents 26 revised papers corresponding to the oral presentations given at the conference; also included are refereed papers corresponding to the 30 poster presentations. These papers were selected from a total of 73 full draft submissions. The papers are organized in topical sections on rule evaluation, visualization, association rules and text mining, KDD process and software, tree construction, sequential and spatial data mining, and attribute selection

Series

Lecture notes in computer science; vol.1510
Wu, X.: Rule induction with extension matrices (1998) 0.00
```
0.0015457221 = product of:
  0.009274333 = sum of:
    0.009274333 = weight(_text_:in in 2912) [ClassicSimilarity], result of:
      0.009274333 = score(doc=2912,freq=6.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.1561842 = fieldWeight in 2912, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=2912)
  0.16666667 = coord(1/6)
```
Abstract

Presents a heuristic, attribute-based, noise-tolerant data mining program, HCV (Version 2.0), based on the newly-developed extension matrix approach. Gives a simple example of attribute-based induction to show the difference between the rules in variable-valued logic produced by HCV, the decision tree generated by C4.5 and the decision tree's decompiled rules by C4.5 rules. Outlines the extension matrix approach for data mining. Describes the HCV algorithm in detail. Outlines techniques developed and implemented in the HCV program for noise handling and discretization of continuous domains respectively. Follows these with a performance comparison of HCV with famous ID3-like algorithms including C4.5 and C4.5 rules on a collection of standard databases including the famous MONK's problems
Methodologies for knowledge discovery and data mining : Third Pacific-Asia Conference, PAKDD'99, Beijing, China, April 26-28, 1999, Proceedings (1999) 0.00
```
0.0014724231 = product of:
  0.008834538 = sum of:
    0.008834538 = weight(_text_:in in 3821) [ClassicSimilarity], result of:
      0.008834538 = score(doc=3821,freq=4.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.14877784 = fieldWeight in 3821, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3821)
  0.16666667 = coord(1/6)
```
Abstract

The 29 revised full papers presented together with 37 short papers were carefully selected from a total of 158 submissions. The book is divided into sections on emerging KDD technology; association rules; feature selection and generation; mining in semi-unstructured data; interestingness, surprisingness, and exceptions; rough sets, fuzzy logic, and neural networks; induction, classification, and clustering; visualization, causal models and graph-based methods; agent-based and distributed data mining; and advanced topics and new methodologies

Series

Lecture notes in computer science; vol.1574
Deogun, J.S.: Feature selection and effective classifiers (1998) 0.00
```
0.0012620769 = product of:
  0.0075724614 = sum of:
    0.0075724614 = weight(_text_:in in 2911) [ClassicSimilarity], result of:
      0.0075724614 = score(doc=2911,freq=4.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.12752387 = fieldWeight in 2911, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=2911)
  0.16666667 = coord(1/6)
```
Abstract

Develops and analyzes 4 algorithms for feature selection in the context of rough set methodology. Develops the notion of accuracy of classification that can be used for upper or lower classification methods and defines the feature selection problem. Presents a discussion of upper classifiers, develops 4 feature selection heuristics and discusses the family of stepwise backward selection algorithms. Analyzes the worst case time complexity of all algorithms presented. Discusses details of the experiments and results of using a family of stepwise backward selection learning data sets and a duodenal ulcer data set. Includes the experimental setup and results of comparison of lower classifiers and upper classifiers on the duodenal ulcer data set. Discusses extended decision tables

Fayyad, U.M.; Djorgovski, S.G.; Weir, N.: From digitized images to online catalogs : data mining a sky survey (1996) 0.00

```
0.0011898974 = product of:
  0.0071393843 = sum of:
    0.0071393843 = weight(_text_:in in 6625) [ClassicSimilarity], result of:
      0.0071393843 = score(doc=6625,freq=2.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.120230645 = fieldWeight in 6625, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0625 = fieldNorm(doc=6625)
  0.16666667 = coord(1/6)
```
Abstract

Offers a data mining approach based on machine learning classification methods to the problem of automated cataloguing of online databases of digital images resulting from sky surveys. The SKICAT system automates the reduction and analysis of 3 terabytes of images expected to contain about 2 billion sky objects. It offers a solution to problems associated with the analysis of large data sets in science

Trybula, W.J.: Data mining and knowledge discovery (1997) 0.00
```
0.0010411602 = product of:
  0.006246961 = sum of:
    0.006246961 = weight(_text_:in in 2300) [ClassicSimilarity], result of:
      0.006246961 = score(doc=2300,freq=2.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.10520181 = fieldWeight in 2300, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2300)
  0.16666667 = coord(1/6)
```
Abstract

State of the art review of the recently developed concepts of data mining (defined as the automated process of evaluating data and finding relationships) and knowledge discovery (defined as the automated process of extracting information, especially unpredicted relationships or previously unknown patterns among the data) with particular reference to numerical data. Includes: the knowledge acquisition process; data mining; evaluation methods; and knowledge discovery. Concludes that existing work in the field is confusing because the terminology is inconsistent and poorly defined. Although methods are available for analyzing and cleaning databases, better coordinated efforts should be directed toward providing users with improved means of structuring search mechanisms to explore the data for relationships
Raghavan, V.V.; Deogun, J.S.; Sever, H.: Knowledge discovery and data mining : introduction (1998) 0.00
```
0.0010411602 = product of:
  0.006246961 = sum of:
    0.006246961 = weight(_text_:in in 2899) [ClassicSimilarity], result of:
      0.006246961 = score(doc=2899,freq=2.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.10520181 = fieldWeight in 2899, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2899)
  0.16666667 = coord(1/6)
```
Abstract

Defines knowledge discovery and database mining. The challenge for knowledge discovery in databases (KDD) is to automatically process large quantities of raw data, identify the most significant and meaningful patterns, and present these as knowledge appropriate for achieving a user's goals. Data mining is the process of deriving useful knowledge from real world databases through the application of pattern extraction techniques. Explains the goals of, and motivation for, research work on data mining. Discusses the nature of database contents, along with problems within the field of data mining
Lingras, P.J.; Yao, Y.Y.: Data mining using extensions of the rough set model (1998) 0.00
```
0.0010411602 = product of:
  0.006246961 = sum of:
    0.006246961 = weight(_text_:in in 2910) [ClassicSimilarity], result of:
      0.006246961 = score(doc=2910,freq=2.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.10520181 = fieldWeight in 2910, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2910)
  0.16666667 = coord(1/6)
```
Abstract

Examines basic issues of data mining using the theory of rough sets, which is a recent proposal for generalizing classical set theory. The Pawlak rough set model is based on the concept of an equivalence relation. A generalized rough set model need not be based on equivalence relation axioms. The Pawlak rough set model has been used for deriving deterministic as well as probabilistic rules from a complete database. Demonstrates that a generalized rough set model can be used for generating rules from incomplete databases. These rules are based on plausibility functions proposed by Shafer. Discusses the importance of rule extraction from incomplete databases in data mining
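
The score breakdowns listed with each result all follow the same arithmetic, so they can be checked by hand. The sketch below reproduces the first result (doc 3822) from the values printed in its breakdown. It assumes the classic Lucene TF-IDF formulas, which match the printed numbers: tf is the square root of the term frequency and idf is 1 + ln(maxDocs / (docFreq + 1)). The function names `classic_idf` and `classic_score` are illustrative only and are not part of any library.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assumed to be 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(freq: float, doc_freq: int, max_docs: int,
                  field_norm: float, query_norm: float, coord: float) -> float:
    """Rebuild a single-term score from the components shown in a breakdown."""
    tf = math.sqrt(freq)                    # e.g. 2.0 = tf(freq=4.0)
    idf = classic_idf(doc_freq, max_docs)   # e.g. 1.3602545
    query_weight = idf * query_norm         # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm    # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight * coord


# Values copied from the breakdown for doc 3822.
score = classic_score(freq=4.0, doc_freq=30841, max_docs=44218,
                      field_norm=0.0625, query_norm=0.043654136, coord=1 / 6)
print(f"{score:.9f}")
```

The result should agree with the listed 0.001682769 up to floating-point rounding, since the search engine computes in single precision. Because only one of the six query clauses matches, coord(1/6) scales every score down, which is why all results display as 0.00.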
