Search (29 results, page 1 of 2)

  • × theme_ss:"Data Mining"
  • × year_i:[1990 TO 2000}
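
The year filter above uses a mixed-bracket range: the square bracket includes the lower bound and the curly brace excludes the upper bound, so it should match 1990 through 1999. A minimal sketch of that reading follows; the function name `in_year_filter` is hypothetical and not part of the search system.

```python
def in_year_filter(year: int, lower: int = 1990, upper: int = 2000) -> bool:
    """Mimic year_i:[1990 TO 2000} : lower bound inclusive, upper bound exclusive."""
    return lower <= year < upper


# A 1999 record passes the filter, a 2000 record does not.
print(in_year_filter(1999), in_year_filter(2000))
```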
  1. KDD : techniques and applications (1998) 0.09
    0.086345725 = product of:
      0.19427788 = sum of:
        0.10067343 = weight(_text_:applications in 6783) [ClassicSimilarity], result of:
          0.10067343 = score(doc=6783,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.5836958 = fieldWeight in 6783, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.09375 = fieldNorm(doc=6783)
        0.012701439 = weight(_text_:of in 6783) [ClassicSimilarity], result of:
          0.012701439 = score(doc=6783,freq=2.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.20732689 = fieldWeight in 6783, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.09375 = fieldNorm(doc=6783)
        0.0490556 = weight(_text_:systems in 6783) [ClassicSimilarity], result of:
          0.0490556 = score(doc=6783,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.4074492 = fieldWeight in 6783, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.09375 = fieldNorm(doc=6783)
        0.031847417 = product of:
          0.063694835 = sum of:
            0.063694835 = weight(_text_:22 in 6783) [ClassicSimilarity], result of:
              0.063694835 = score(doc=6783,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.46428138 = fieldWeight in 6783, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6783)
          0.5 = coord(1/2)
      0.44444445 = coord(4/9)
    
    Footnote
    A special issue of selected papers from the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'97), held in Singapore, 22-23 Feb 1997
    Source
    Knowledge-based systems. 10(1998) no.7, S.401-470
  2. Gaizauskas, R.; Wilks, Y.: Information extraction : beyond document retrieval (1998) 0.03
    0.031649083 = product of:
      0.09494725 = sum of:
        0.050336715 = weight(_text_:applications in 4716) [ClassicSimilarity], result of:
          0.050336715 = score(doc=4716,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.2918479 = fieldWeight in 4716, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.046875 = fieldNorm(doc=4716)
        0.020082738 = weight(_text_:of in 4716) [ClassicSimilarity], result of:
          0.020082738 = score(doc=4716,freq=20.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.32781258 = fieldWeight in 4716, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=4716)
        0.0245278 = weight(_text_:systems in 4716) [ClassicSimilarity], result of:
          0.0245278 = score(doc=4716,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.2037246 = fieldWeight in 4716, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=4716)
      0.33333334 = coord(3/9)
    
    Abstract
    In this paper we give a synoptic view of the growth of the text processing technology of information extraction (IE) whose function is to extract information about a pre-specified set of entities, relations or events from natural language texts and to record this information in structured representations called templates. Here we describe the nature of the IE task, review the history of the area from its origins in AI work in the 1960s and 70s till the present, discuss the techniques being used to carry out the task, describe application areas where IE systems are or are about to be at work, and conclude with a discussion of the challenges facing the area. What emerges is a picture of an exciting new text processing technology with a host of new applications, both on its own and in conjunction with other technologies, such as information retrieval, machine translation and data mining
    Source
    Journal of documentation. 54(1998) no.1, S.70-105
  3. Fayyad, U.; Piatetsky-Shapiro, G.; Smyth, P.: From data mining to knowledge discovery in databases (1996) 0.03
    0.029691901 = product of:
      0.13361356 = sum of:
        0.118644774 = weight(_text_:applications in 7458) [ClassicSimilarity], result of:
          0.118644774 = score(doc=7458,freq=4.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.68789214 = fieldWeight in 7458, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.078125 = fieldNorm(doc=7458)
        0.014968789 = weight(_text_:of in 7458) [ClassicSimilarity], result of:
          0.014968789 = score(doc=7458,freq=4.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.24433708 = fieldWeight in 7458, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=7458)
      0.22222222 = coord(2/9)
    
    Abstract
    Gives an overview of data mining and knowledge discovery in databases. Clarifies how they are related both to each other and to related fields. Mentions real world applications, data mining techniques, challenges involved in real world applications of knowledge discovery, and current and future research directions
  4. Hofstede, A.H.M. ter; Proper, H.A.; Van der Weide, T.P.: Exploiting fact verbalisation in conceptual information modelling (1997) 0.02
    0.02226542 = product of:
      0.06679626 = sum of:
        0.01960283 = weight(_text_:of in 2908) [ClassicSimilarity], result of:
          0.01960283 = score(doc=2908,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.31997898 = fieldWeight in 2908, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2908)
        0.028615767 = weight(_text_:systems in 2908) [ClassicSimilarity], result of:
          0.028615767 = score(doc=2908,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.23767869 = fieldWeight in 2908, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2908)
        0.018577661 = product of:
          0.037155323 = sum of:
            0.037155323 = weight(_text_:22 in 2908) [ClassicSimilarity], result of:
              0.037155323 = score(doc=2908,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.2708308 = fieldWeight in 2908, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2908)
          0.5 = coord(1/2)
      0.33333334 = coord(3/9)
    
    Abstract
    Focuses on the information modelling side of conceptual modelling. Deals with the exploitation of fact verbalisations after finishing the actual information system. Verbalisations are used as input for the design of the so-called information model. Exploits these verbalisations in 4 directions: considers their use for a conceptual query language, the verbalisation of instances, the description of the contents of a database and the verbalisation of queries in a computer supported query environment. Provides an example session with an envisioned tool for end user query formulations that exploits the verbalisation
    Source
    Information systems. 22(1997) nos.5/6, S.349-385
  5. Amir, A.; Feldman, R.; Kashi, R.: A new and versatile method for association generation (1997) 0.02
    0.020800991 = product of:
      0.06240297 = sum of:
        0.008467626 = weight(_text_:of in 1270) [ClassicSimilarity], result of:
          0.008467626 = score(doc=1270,freq=2.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.13821793 = fieldWeight in 1270, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=1270)
        0.03270373 = weight(_text_:systems in 1270) [ClassicSimilarity], result of:
          0.03270373 = score(doc=1270,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.2716328 = fieldWeight in 1270, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=1270)
        0.021231614 = product of:
          0.042463228 = sum of:
            0.042463228 = weight(_text_:22 in 1270) [ClassicSimilarity], result of:
              0.042463228 = score(doc=1270,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.30952093 = fieldWeight in 1270, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1270)
          0.5 = coord(1/2)
      0.33333334 = coord(3/9)
    
    Abstract
    Current algorithms for finding associations among the attributes describing data in a database have a number of shortcomings. Presents a novel method for association generation, that answers all desiderata. The method is different from all existing algorithms and especially suitable to textual databases with binary attributes. Uses subword trees for quick indexing into the required database statistics. Tests the algorithm on the Reuters-22173 database with satisfactory results
    Source
    Information systems. 22(1997) nos.5/6, S.333-347
  6. Principles of data mining and knowledge discovery (1998) 0.01
    0.0147717865 = product of:
      0.06647304 = sum of:
        0.011975031 = weight(_text_:of in 3822) [ClassicSimilarity], result of:
          0.011975031 = score(doc=3822,freq=4.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.19546966 = fieldWeight in 3822, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3822)
        0.054498006 = weight(_text_:software in 3822) [ClassicSimilarity], result of:
          0.054498006 = score(doc=3822,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.35064998 = fieldWeight in 3822, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0625 = fieldNorm(doc=3822)
      0.22222222 = coord(2/9)
    
    Abstract
    The volume presents 26 revised papers corresponding to the oral presentations given at the conference; also included are refereed papers corresponding to the 30 poster presentations. These papers were selected from a total of 73 full draft submissions. The papers are organized in topical sections on rule evaluation, visualization, association rules and text mining, KDD process and software, tree construction, sequential and spatial data mining, and attribute selection
  7. Cardie, C.: Empirical methods in information extraction (1997) 0.01
    0.010526687 = product of:
      0.04737009 = sum of:
        0.014666359 = weight(_text_:of in 3246) [ClassicSimilarity], result of:
          0.014666359 = score(doc=3246,freq=6.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.23940048 = fieldWeight in 3246, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3246)
        0.03270373 = weight(_text_:systems in 3246) [ClassicSimilarity], result of:
          0.03270373 = score(doc=3246,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.2716328 = fieldWeight in 3246, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=3246)
      0.22222222 = coord(2/9)
    
    Abstract
    Surveys the use of empirical, machine-learning methods for information extraction. Presents a generic architecture for information extraction systems and surveys the learning algorithms that have been developed to address the problems of accuracy, portability, and knowledge acquisition for each component of the architecture
  8. Galal, G.M.; Cook, D.J.; Holder, L.B.: Exploiting parallelism in a structural scientific discovery system to improve scalability (1999) 0.01
    0.009184495 = product of:
      0.041330226 = sum of:
        0.016802425 = weight(_text_:of in 2952) [ClassicSimilarity], result of:
          0.016802425 = score(doc=2952,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2742677 = fieldWeight in 2952, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2952)
        0.0245278 = weight(_text_:systems in 2952) [ClassicSimilarity], result of:
          0.0245278 = score(doc=2952,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.2037246 = fieldWeight in 2952, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=2952)
      0.22222222 = coord(2/9)
    
    Abstract
    The large amount of data collected today is quickly overwhelming researchers' abilities to interpret the data and discover interesting patterns. Knowledge discovery and data mining approaches hold the potential to automate the interpretation process, but these approaches frequently utilize computationally expensive algorithms. In particular, scientific discovery systems focus on the utilization of richer data representation, sometimes without regard for scalability. This research investigates approaches for scaling a particular knowledge discovery in databases (KDD) system, SUBDUE, using parallel and distributed resources. SUBDUE has been used to discover interesting and repetitive concepts in graph-based databases from a variety of domains, but requires a substantial amount of processing time. Experiments that demonstrate scalability of parallel versions of the SUBDUE system are performed using CAD circuit databases and artificially-generated databases, and potential achievements and obstacles are discussed
    Source
    Journal of the American Society for Information Science. 50(1999) no.1, S.65-73
  9. Chen, Z.: Knowledge discovery and system-user partnership : on a production 'adversarial partnership' approach (1994) 0.01
    0.009149191 = product of:
      0.041171357 = sum of:
        0.008467626 = weight(_text_:of in 6759) [ClassicSimilarity], result of:
          0.008467626 = score(doc=6759,freq=2.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.13821793 = fieldWeight in 6759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=6759)
        0.03270373 = weight(_text_:systems in 6759) [ClassicSimilarity], result of:
          0.03270373 = score(doc=6759,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.2716328 = fieldWeight in 6759, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=6759)
      0.22222222 = coord(2/9)
    
    Abstract
    Examines the relationship between systems and users from the knowledge discovery in databases or data mining perspectives. A comprehensive study on knowledge discovery in human computer symbiosis is needed. Proposes a database-user adversarial partnership, which is general enough to cover knowledge discovery and security issues related to databases and their users. It can be further generalized into system-user adversarial partnership. Discusses opportunities provided by knowledge discovery techniques and potential social implications
  10. Matson, L.D.; Bonski, D.J.: Do digital libraries need librarians? (1997) 0.01
    0.008481526 = product of:
      0.038166866 = sum of:
        0.016935252 = weight(_text_:of in 1737) [ClassicSimilarity], result of:
          0.016935252 = score(doc=1737,freq=8.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.27643585 = fieldWeight in 1737, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=1737)
        0.021231614 = product of:
          0.042463228 = sum of:
            0.042463228 = weight(_text_:22 in 1737) [ClassicSimilarity], result of:
              0.042463228 = score(doc=1737,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.30952093 = fieldWeight in 1737, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1737)
          0.5 = coord(1/2)
      0.22222222 = coord(2/9)
    
    Abstract
    Defines digital libraries and discusses the effects of new technology on librarians. Examines the different viewpoints of librarians and information technologists on digital libraries. Describes the development of a digital library at the National Drug Intelligence Center, USA, which was carried out in collaboration with information technology experts. The system is based on Web enabled search technology to find information, data visualization and data mining to visualize it and use of SGML as an information standard to store it
    Date
    22.11.1998 18:57:22
  11. Search tools (1997) 0.01
    0.008005542 = product of:
      0.03602494 = sum of:
        0.0074091726 = weight(_text_:of in 3834) [ClassicSimilarity], result of:
          0.0074091726 = score(doc=3834,freq=2.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.120940685 = fieldWeight in 3834, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3834)
        0.028615767 = weight(_text_:systems in 3834) [ClassicSimilarity], result of:
          0.028615767 = score(doc=3834,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.23767869 = fieldWeight in 3834, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3834)
      0.22222222 = coord(2/9)
    
    Abstract
    Offers brief accounts of Internet search tools. Covers the Lycos revamp; the new navigation service produced jointly by Excite and Netscape, delivering a language specific, locally relevant Web guide for Japan, Germany, France, the UK and Australia; InfoWatcher, a combination offline browser, search engine and push product from Carvelle Inc., USA; Alexa by Alexa Internet and WBI from IBM which are free and provide users with information on how others have used the Web sites which they are visiting; and Concept Explorer from Knowledge Discovery Systems, Inc., California which performs data mining from the Web, Usenet groups, MEDLINE and the US Patent and Trademark Office patent abstracts
  12. Chowdhury, G.G.: Template mining for information extraction from digital documents (1999) 0.00
    0.004128369 = product of:
      0.037155323 = sum of:
        0.037155323 = product of:
          0.074310645 = sum of:
            0.074310645 = weight(_text_:22 in 4577) [ClassicSimilarity], result of:
              0.074310645 = score(doc=4577,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.5416616 = fieldWeight in 4577, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4577)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Date
    2. 4.2000 18:01:22
  13. Lusti, M.: Data Warehousing and Data Mining : Eine Einführung in entscheidungsunterstützende Systeme (1999) 0.00
    0.0023590683 = product of:
      0.021231614 = sum of:
        0.021231614 = product of:
          0.042463228 = sum of:
            0.042463228 = weight(_text_:22 in 4261) [ClassicSimilarity], result of:
              0.042463228 = score(doc=4261,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.30952093 = fieldWeight in 4261, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4261)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Date
    17. 7.2002 19:22:06
  14. Deogun, J.S.: Feature selection and effective classifiers (1998) 0.00
    0.0023403284 = product of:
      0.021062955 = sum of:
        0.021062955 = weight(_text_:of in 2911) [ClassicSimilarity], result of:
          0.021062955 = score(doc=2911,freq=22.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.34381276 = fieldWeight in 2911, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=2911)
      0.11111111 = coord(1/9)
    
    Abstract
    Develops and analyzes 4 algorithms for feature selection in the context of rough set methodology. Develops the notion of accuracy of classification that can be used for upper or lower classification methods and defines the feature selection problem. Presents a discussion of upper classifiers and develops 4 feature selection heuristics and discusses the family of stepwise backward selection algorithms. Analyzes the worst case time complexity in all algorithms presented. Discusses details of the experiments and results of using a family of stepwise backward selection learning data sets and a duodenal ulcer data set. Includes the experimental setup and results of comparison of lower classifiers and upper classifiers on the duodenal ulcer data set. Discusses extended decision tables
    Source
    Journal of the American Society for Information Science. 49(1998) no.5, S.423-434
  15. Fayyad, U.M.; Djorgovski, S.G.; Weir, N.: From digitized images to online catalogs : data mining a sky server (1996) 0.00
    0.002304596 = product of:
      0.020741362 = sum of:
        0.020741362 = weight(_text_:of in 6625) [ClassicSimilarity], result of:
          0.020741362 = score(doc=6625,freq=12.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.33856338 = fieldWeight in 6625, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=6625)
      0.11111111 = coord(1/9)
    
    Abstract
    Offers a data mining approach based on machine learning classification methods to the problem of automated cataloguing of online databases of digital images resulting from sky surveys. The SKICAT system automates the reduction and analysis of 3 terabytes of images expected to contain about 2 billion sky objects. It offers a solution to problems associated with the analysis of large data sets in science
  16. Trybula, W.J.: Data mining and knowledge discovery (1997) 0.00
    0.0021780923 = product of:
      0.01960283 = sum of:
        0.01960283 = weight(_text_:of in 2300) [ClassicSimilarity], result of:
          0.01960283 = score(doc=2300,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.31997898 = fieldWeight in 2300, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2300)
      0.11111111 = coord(1/9)
    
    Abstract
    State of the art review of the recently developed concepts of data mining (defined as the automated process of evaluating data and finding relationships) and knowledge discovery (defined as the automated process of extracting information, especially unpredicted relationships or previously unknown patterns among the data) with particular reference to numerical data. Includes: the knowledge acquisition process; data mining; evaluation methods; and knowledge discovery. Concludes that existing work in the field is confusing because the terminology is inconsistent and poorly defined. Although methods are available for analyzing and cleaning databases, better coordinated efforts should be directed toward providing users with improved means of structuring search mechanisms to explore the data for relationships
    Source
    Annual review of information science and technology. 32(1997), S.197-229
  17. Raghavan, V.V.; Deogun, J.S.; Sever, H.: Knowledge discovery and data mining : introduction (1998) 0.00
    0.0021780923 = product of:
      0.01960283 = sum of:
        0.01960283 = weight(_text_:of in 2899) [ClassicSimilarity], result of:
          0.01960283 = score(doc=2899,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.31997898 = fieldWeight in 2899, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2899)
      0.11111111 = coord(1/9)
    
    Abstract
    Defines knowledge discovery and database mining. The challenge for knowledge discovery in databases (KDD) is to automatically process large quantities of raw data, identifying the most significant and meaningful patterns, and present these as knowledge appropriate for achieving a user's goals. Data mining is the process of deriving useful knowledge from real world databases through the application of pattern extraction techniques. Explains the goals of, and motivation for, research work on data mining. Discusses the nature of database contents, along with problems within the field of data mining
    Source
    Journal of the American Society for Information Science. 49(1998) no.5, S.397-402
  18. Fayyad, U.M.: Data mining and knowledge discovery : making sense out of data (1996) 0.00
    0.0020369943 = product of:
      0.018332949 = sum of:
        0.018332949 = weight(_text_:of in 7007) [ClassicSimilarity], result of:
          0.018332949 = score(doc=7007,freq=6.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2992506 = fieldWeight in 7007, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=7007)
      0.11111111 = coord(1/9)
    
    Abstract
    Defines knowledge discovery and data mining (KDD) as the overall process of extracting high level knowledge from low level data. Outlines the KDD process. Explains how KDD is related to the fields of: statistics, pattern recognition, machine learning, artificial intelligence, databases and data warehouses
  19. Saz, J.T.: Perspectivas en recuperacion y explotacion de informacion electronica : el 'data mining' (1997) 0.00
    0.0020369943 = product of:
      0.018332949 = sum of:
        0.018332949 = weight(_text_:of in 3723) [ClassicSimilarity], result of:
          0.018332949 = score(doc=3723,freq=6.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2992506 = fieldWeight in 3723, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=3723)
      0.11111111 = coord(1/9)
    
    Abstract
    Presents the concept and the techniques identified by the term data mining. Explains the principles and phases of developing a data mining process, and the main types of data mining tools
    Footnote
    Translation of the title: Perspectives on the retrieval and exploitation of electronic information: data mining
  20. Lingras, P.J.; Yao, Y.Y.: Data mining using extensions of the rough set model (1998) 0.00
    0.0020165213 = product of:
      0.018148692 = sum of:
        0.018148692 = weight(_text_:of in 2910) [ClassicSimilarity], result of:
          0.018148692 = score(doc=2910,freq=12.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.29624295 = fieldWeight in 2910, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2910)
      0.11111111 = coord(1/9)
    
    Abstract
    Examines basic issues of data mining using the theory of rough sets, which is a recent proposal for generalizing classical set theory. The Pawlak rough set model is based on the concept of an equivalence relation. A generalized rough set model need not be based on equivalence relation axioms. The Pawlak rough set model has been used for deriving deterministic as well as probabilistic rules from a complete database. Demonstrates that a generalised rough set model can be used for generating rules from incomplete databases. These rules are based on plausibility functions proposed by Shafer. Discusses the importance of rule extraction from incomplete databases in data mining
    Source
    Journal of the American Society for Information Science. 49(1998) no.5, S.415-422
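
The score breakdowns listed with each result follow one pattern: every matching term contributes queryWeight times fieldWeight, the contributions are summed, and the sum is scaled by a coord factor (matched clauses divided by total clauses). The sketch below retraces the numbers for result 1 (doc 6783). It is an illustration only: the idf formula `1 + ln(maxDocs / (docFreq + 1))` is an assumption inferred from the listed values, the helper names are hypothetical, and Python's double precision will differ from the listed single-precision figures in the last digits.

```python
import math

MAX_DOCS = 44218          # maxDocs as listed in every idf line
QUERY_NORM = 0.03917671   # queryNorm as listed for this query


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # Assumed formula, inferred from the listed idf values.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Result 1 (doc 6783): fieldNorm 0.09375, each matching term occurs twice.
FIELD_NORM = 0.09375
applications = term_score(2.0, 1471, FIELD_NORM)
of_term = term_score(2.0, 25162, FIELD_NORM)
systems = term_score(2.0, 5561, FIELD_NORM)
term_22 = term_score(2.0, 3622, FIELD_NORM) * (1 / 2)  # nested coord(1/2)

# Four of the nine query clauses matched, hence coord(4/9).
total = (applications + of_term + systems + term_22) * (4 / 9)
print(f"{total:.6f}")
```

Under these assumptions the total should land close to the 0.086345725 shown for result 1. The same helpers can be pointed at any other result by swapping in its freq, docFreq, fieldNorm and coord values.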