Search (157 results, page 1 of 8)

  • theme_ss:"Data Mining"
  1. Chowdhury, G.G.: Template mining for information extraction from digital documents (1999) 0.03
    0.03206814 = product of:
      0.06413628 = sum of:
        0.06413628 = product of:
          0.09620442 = sum of:
            0.009410121 = weight(_text_:a in 4577) [ClassicSimilarity], result of:
              0.009410121 = score(doc=4577,freq=2.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.17835285 = fieldWeight in 4577, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4577)
            0.0867943 = weight(_text_:22 in 4577) [ClassicSimilarity], result of:
              0.0867943 = score(doc=4577,freq=2.0), product of:
                0.16023713 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045758117 = queryNorm
                0.5416616 = fieldWeight in 4577, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4577)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Date
    2. 4.2000 18:01:22
    Type
    a
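    The indented breakdown above is Lucene's ClassicSimilarity (TF-IDF) explain output for result 1. As a minimal sketch, assuming the standard ClassicSimilarity formulas (tf = sqrt(freq); idf = 1 + ln(maxDocs/(docFreq+1)); per-term score = queryWeight * fieldWeight; coord factors scale for 2 of 3 matching clauses and 1 of 2 outer clauses), the following Python reproduces the displayed 0.03 from the constants shown in the breakdown:

      import math

      # Constants read off the explain tree of result 1 (doc 4577).
      query_norm = 0.045758117                      # query-level normalization
      field_norm = 0.109375                         # per-field length normalization
      idf_a  = 1 + math.log(44218 / (37942 + 1))    # ~1.153047, term "a"
      idf_22 = 1 + math.log(44218 / (3622 + 1))     # ~3.5018296, term "22"

      def term_score(freq, idf):
          query_weight = idf * query_norm                     # queryWeight = idf * queryNorm
          field_weight = math.sqrt(freq) * idf * field_norm   # fieldWeight = tf * idf * fieldNorm
          return query_weight * field_weight

      raw = term_score(2.0, idf_a) + term_score(2.0, idf_22)  # ~0.09620442
      score = raw * (2 / 3) * 0.5                             # coord(2/3), then coord(1/2)
      print(round(score, 8))                                  # ~0.03206814 -> shown as 0.03

    The same arithmetic applies to every breakdown below; only the term frequencies, idf values, and field norms change per document.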
  2. KDD : techniques and applications (1998) 0.03
    0.02748698 = product of:
      0.05497396 = sum of:
        0.05497396 = product of:
          0.08246094 = sum of:
            0.008065818 = weight(_text_:a in 6783) [ClassicSimilarity], result of:
              0.008065818 = score(doc=6783,freq=2.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.15287387 = fieldWeight in 6783, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6783)
            0.07439512 = weight(_text_:22 in 6783) [ClassicSimilarity], result of:
              0.07439512 = score(doc=6783,freq=2.0), product of:
                0.16023713 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045758117 = queryNorm
                0.46428138 = fieldWeight in 6783, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6783)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Footnote
    A special issue of selected papers from the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'97), held in Singapore, 22-23 Feb 1997
  3. Amir, A.; Feldman, R.; Kashi, R.: A new and versatile method for association generation (1997) 0.02
    0.020922724 = product of:
      0.04184545 = sum of:
        0.04184545 = product of:
          0.06276817 = sum of:
            0.013171425 = weight(_text_:a in 1270) [ClassicSimilarity], result of:
              0.013171425 = score(doc=1270,freq=12.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.24964198 = fieldWeight in 1270, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1270)
            0.049596746 = weight(_text_:22 in 1270) [ClassicSimilarity], result of:
              0.049596746 = score(doc=1270,freq=2.0), product of:
                0.16023713 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045758117 = queryNorm
                0.30952093 = fieldWeight in 1270, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1270)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Abstract
    Current algorithms for finding associations among the attributes describing data in a database have a number of shortcomings. Presents a novel method for association generation that addresses all of these desiderata. The method differs from all existing algorithms and is especially suitable for textual databases with binary attributes. Uses subword trees for quick indexing into the required database statistics. Tests the algorithm on the Reuters-22173 database with satisfactory results (a generic association-rule sketch follows this record)
    Source
    Information systems. 22(1997) nos.5/6, S.333-347
    Type
    a
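    The abstract above concerns association generation over textual databases with binary attributes. As a generic illustration only, not the authors' subword-tree algorithm (which the abstract does not detail), this sketch derives one-to-one association rules from toy binary term data using the usual support/confidence thresholds; all data and thresholds are hypothetical:

      from itertools import combinations

      # Toy "documents" as sets of binary term attributes (hypothetical data).
      docs = [
          {"data", "mining", "text"},
          {"data", "mining"},
          {"text", "retrieval"},
          {"data", "mining", "retrieval"},
      ]
      min_support, min_conf = 0.5, 0.8

      def support(itemset):
          # Fraction of documents containing every term in the itemset.
          return sum(itemset <= d for d in docs) / len(docs)

      for x, y in combinations(sorted(set().union(*docs)), 2):
          for lhs, rhs in ((x, y), (y, x)):
              s = support({lhs, rhs})
              if s >= min_support and s / support({lhs}) >= min_conf:
                  print(f"{lhs} -> {rhs}  support={s:.2f}  "
                        f"confidence={s / support({lhs}):.2f}")

    Run as-is, this prints the rules data -> mining and mining -> data, each with support 0.75 and confidence 1.00.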
  4. Matson, L.D.; Bonski, D.J.: Do digital libraries need librarians? (1997) 0.02
    0.01906709 = product of:
      0.03813418 = sum of:
        0.03813418 = product of:
          0.05720127 = sum of:
            0.007604526 = weight(_text_:a in 1737) [ClassicSimilarity], result of:
              0.007604526 = score(doc=1737,freq=4.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.14413087 = fieldWeight in 1737, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1737)
            0.049596746 = weight(_text_:22 in 1737) [ClassicSimilarity], result of:
              0.049596746 = score(doc=1737,freq=2.0), product of:
                0.16023713 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045758117 = queryNorm
                0.30952093 = fieldWeight in 1737, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1737)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Abstract
    Defines digital libraries and discusses the effects of new technology on librarians. Examines the differing viewpoints of librarians and information technologists on digital libraries. Describes the development of a digital library at the National Drug Intelligence Center, USA, carried out in collaboration with information technology experts. The system is based on Web-enabled search technology to find information, data visualization and data mining to visualize it, and the use of SGML as an information standard to store it
    Date
    22.11.1998 18:57:22
    Type
    a
  5. Schmid, J.: Data mining : wie finde ich in Datensammlungen entscheidungsrelevante Muster? [Data mining: how do I find decision-relevant patterns in data collections?] (1999) 0.02
    0.01769935 = product of:
      0.0353987 = sum of:
        0.0353987 = product of:
          0.053098045 = sum of:
            0.009410121 = weight(_text_:a in 4540) [ClassicSimilarity], result of:
              0.009410121 = score(doc=4540,freq=2.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.17835285 = fieldWeight in 4540, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4540)
            0.043687925 = weight(_text_:h in 4540) [ClassicSimilarity], result of:
              0.043687925 = score(doc=4540,freq=2.0), product of:
                0.113683715 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045758117 = queryNorm
                0.38429362 = fieldWeight in 4540, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4540)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Source
    ARBIDO. 14(1999) H.5, S.11-13
    Type
    a
  6. Hofstede, A.H.M. ter; Proper, H.A.; Van der Weide, T.P.: Exploiting fact verbalisation in conceptual information modelling (1997) 0.02
    0.017602425 = product of:
      0.03520485 = sum of:
        0.03520485 = product of:
          0.05280727 = sum of:
            0.009410121 = weight(_text_:a in 2908) [ClassicSimilarity], result of:
              0.009410121 = score(doc=2908,freq=8.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.17835285 = fieldWeight in 2908, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2908)
            0.04339715 = weight(_text_:22 in 2908) [ClassicSimilarity], result of:
              0.04339715 = score(doc=2908,freq=2.0), product of:
                0.16023713 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045758117 = queryNorm
                0.2708308 = fieldWeight in 2908, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2908)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Abstract
    Focuses on the information modelling side of conceptual modelling. Deals with the exploitation of fact verbalisations after the actual information system has been completed. Verbalisations are used as input for the design of the so-called information model. Exploits these verbalisations in 4 directions: considers their use for a conceptual query language, the verbalisation of instances, the description of the contents of a database, and the verbalisation of queries in a computer-supported query environment. Provides an example session with an envisioned tool for end-user query formulation that exploits the verbalisations
    Source
    Information systems. 22(1997) nos.5/6, S.349-385
    Type
    a
  7. Keim, D.A.: Data Mining mit bloßem Auge [Data mining with the naked eye] (2002) 0.02
    0.01517087 = product of:
      0.03034174 = sum of:
        0.03034174 = product of:
          0.04551261 = sum of:
            0.008065818 = weight(_text_:a in 1086) [ClassicSimilarity], result of:
              0.008065818 = score(doc=1086,freq=2.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.15287387 = fieldWeight in 1086, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1086)
            0.037446793 = weight(_text_:h in 1086) [ClassicSimilarity], result of:
              0.037446793 = score(doc=1086,freq=2.0), product of:
                0.113683715 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045758117 = queryNorm
                0.32939452 = fieldWeight in 1086, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1086)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Source
    Spektrum der Wissenschaft. 2002, H.11, S.88-91
    Type
    a
  8. Kruse, R.; Borgelt, C.: Suche im Datendschungel [Searching the data jungle] (2002) 0.02
    0.01517087 = product of:
      0.03034174 = sum of:
        0.03034174 = product of:
          0.04551261 = sum of:
            0.008065818 = weight(_text_:a in 1087) [ClassicSimilarity], result of:
              0.008065818 = score(doc=1087,freq=2.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.15287387 = fieldWeight in 1087, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1087)
            0.037446793 = weight(_text_:h in 1087) [ClassicSimilarity], result of:
              0.037446793 = score(doc=1087,freq=2.0), product of:
                0.113683715 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045758117 = queryNorm
                0.32939452 = fieldWeight in 1087, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1087)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Source
    Spektrum der Wissenschaft. 2002, H.11, S.80-81
    Type
    a
  9. Wrobel, S.: Lern- und Entdeckungsverfahren [Learning and discovery methods] (2002) 0.02
    0.01517087 = product of:
      0.03034174 = sum of:
        0.03034174 = product of:
          0.04551261 = sum of:
            0.008065818 = weight(_text_:a in 1105) [ClassicSimilarity], result of:
              0.008065818 = score(doc=1105,freq=2.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.15287387 = fieldWeight in 1105, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1105)
            0.037446793 = weight(_text_:h in 1105) [ClassicSimilarity], result of:
              0.037446793 = score(doc=1105,freq=2.0), product of:
                0.113683715 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045758117 = queryNorm
                0.32939452 = fieldWeight in 1105, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1105)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Source
    Spektrum der Wissenschaft. 2002, H.11, S.85-87
    Type
    a
  10. Brückner, T.; Dambeck, H.: Sortierautomaten : Grundlagen der Textklassifizierung [Sorting machines: foundations of text classification] (2003) 0.01
    0.013560796 = product of:
      0.027121592 = sum of:
        0.027121592 = product of:
          0.040682387 = sum of:
            0.0053772116 = weight(_text_:a in 2398) [ClassicSimilarity], result of:
              0.0053772116 = score(doc=2398,freq=2.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.10191591 = fieldWeight in 2398, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2398)
            0.035305176 = weight(_text_:h in 2398) [ClassicSimilarity], result of:
              0.035305176 = score(doc=2398,freq=4.0), product of:
                0.113683715 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045758117 = queryNorm
                0.31055614 = fieldWeight in 2398, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2398)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Source
    c't. 2003, H.19, S.192-197
    Type
    a
  11. Fonseca, F.; Marcinkowski, M.; Davis, C.: Cyber-human systems of thought and understanding (2019) 0.01
    0.013296565 = product of:
      0.02659313 = sum of:
        0.02659313 = product of:
          0.039889693 = sum of:
            0.008891728 = weight(_text_:a in 5011) [ClassicSimilarity], result of:
              0.008891728 = score(doc=5011,freq=14.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.1685276 = fieldWeight in 5011, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5011)
            0.030997967 = weight(_text_:22 in 5011) [ClassicSimilarity], result of:
              0.030997967 = score(doc=5011,freq=2.0), product of:
                0.16023713 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045758117 = queryNorm
                0.19345059 = fieldWeight in 5011, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5011)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Abstract
    The present challenge faced by scientists working with Big Data comes in the overwhelming volume and level of detail provided by current data sets. Exceeding traditional empirical approaches, Big Data opens a new perspective on scientific work in which data comes to play a role in the development of the scientific problematic to be developed. Addressing this reconfiguration of our relationship with data through readings of Wittgenstein, Macherey, and Popper, we propose a picture of science that encourages scientists to engage with the data in a direct way, using the data itself as an instrument for scientific investigation. Using GIS as a theme, we develop the concept of cyber-human systems of thought and understanding to bridge the divide between representative (theoretical) thinking and (non-theoretical) data-driven science. At the foundation of these systems, we invoke the concept of the "semantic pixel" to establish a logical and virtual space linking data and the work of scientists. It is with this discussion of the relationship between analysts in their pursuit of knowledge and the rise of Big Data that this present discussion of the philosophical foundations of Big Data addresses the central questions raised by social informatics research.
    Date
    7. 3.2019 16:32:22
    Type
    a
  12. Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.01
    0.012837617 = product of:
      0.025675233 = sum of:
        0.025675233 = product of:
          0.03851285 = sum of:
            0.007514882 = weight(_text_:a in 668) [ClassicSimilarity], result of:
              0.007514882 = score(doc=668,freq=10.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.14243183 = fieldWeight in 668, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=668)
            0.030997967 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
              0.030997967 = score(doc=668,freq=2.0), product of:
                0.16023713 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045758117 = queryNorm
                0.19345059 = fieldWeight in 668, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=668)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Abstract
    The 20th-century massification of higher education and research in academia is said to have produced structurally stratified higher education systems in many countries. Most manifestly, the research mission of universities appears to be divisive. Authors have claimed that the Swedish system, while formally unified, has developed into a binary state, and statistics seem to support this conclusion. This article makes use of a comprehensive statistical data source on Swedish higher education institutions to illustrate stratification, and uses literature on Swedish research policy history to contextualize the statistics. Highlighting the opportunities as well as the constraints of the data, the article argues that there is great merit in combining statistics with a qualitative analysis when studying the structural characteristics of national higher education systems. Not least, the article shows that it is an over-simplification to describe the Swedish system as binary; the stratification is more complex. On the basis of the analysis, the article also argues that while global trends certainly influence national developments, higher education systems have country-specific features that may enrich the understanding of how systems evolve and therefore should be analyzed as part of a broader study of the increasingly globalized academic system.
    Date
    22. 3.2013 19:43:01
    Type
    a
  13. Borgelt, C.; Kruse, R.: Unsicheres Wissen nutzen [Using uncertain knowledge] (2002) 0.01
    0.012642393 = product of:
      0.025284786 = sum of:
        0.025284786 = product of:
          0.037927177 = sum of:
            0.0067215143 = weight(_text_:a in 1104) [ClassicSimilarity], result of:
              0.0067215143 = score(doc=1104,freq=2.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.12739488 = fieldWeight in 1104, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1104)
            0.031205663 = weight(_text_:h in 1104) [ClassicSimilarity], result of:
              0.031205663 = score(doc=1104,freq=2.0), product of:
                0.113683715 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045758117 = queryNorm
                0.27449545 = fieldWeight in 1104, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1104)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Source
    Spektrum der Wissenschaft. 2002, H.11, S.82-84
    Type
    a
  14. Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.01
    0.01257316 = product of:
      0.02514632 = sum of:
        0.02514632 = product of:
          0.03771948 = sum of:
            0.0067215143 = weight(_text_:a in 1605) [ClassicSimilarity], result of:
              0.0067215143 = score(doc=1605,freq=8.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.12739488 = fieldWeight in 1605, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1605)
            0.030997967 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
              0.030997967 = score(doc=1605,freq=2.0), product of:
                0.16023713 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045758117 = queryNorm
                0.19345059 = fieldWeight in 1605, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1605)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Abstract
    Numerous studies have explored the possibility of uncovering information from web search queries but few have examined the factors that affect web query data sources. We conducted a study that investigated this issue by comparing Google Trends and Baidu Index. Data from these two services are based on queries entered by users into Google and Baidu, two of the largest search engines in the world. We first compared the features and functions of the two services based on documents and extensive testing. We then carried out an empirical study that collected query volume data from the two sources. We found that data from both sources could be used to predict the quality of Chinese universities and companies. Despite the differences between the two services in terms of technology, such as differing methods of language processing, the search volume data from the two were highly correlated and combining the two data sources did not improve the predictive power of the data. However, there was a major difference between the two in terms of data availability. Baidu Index was able to provide more search volume data than Google Trends did. Our analysis showed that the disadvantage of Google Trends in this regard was due to Google's smaller user base in China. The implication of this finding goes beyond China. Google's user bases in many countries are smaller than that in China, so the search volume data related to those countries could result in the same issue as that related to China. (A generic correlation sketch follows this record.)
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.1, S.13-22
    Type
    a
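    The abstract above reports that the query-volume series from Google Trends and Baidu Index were highly correlated. As a generic illustration with made-up numbers (not the study's data), Pearson's r between two weekly volume series can be computed like this (statistics.correlation requires Python 3.10+):

      from statistics import correlation

      # Hypothetical weekly search volumes for one keyword from two sources.
      google_trends = [55, 60, 58, 72, 80, 77, 90]
      baidu_index   = [50, 57, 55, 70, 76, 75, 88]

      # Pearson's r; values near 1.0 mean the two sources move together.
      print(round(correlation(google_trends, baidu_index), 3))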
  15. Tu, Y.-N.; Hsu, S.-L.: Constructing conceptual trajectory maps to trace the development of research fields (2016) 0.01
    0.010523799 = product of:
      0.021047598 = sum of:
        0.021047598 = product of:
          0.031571396 = sum of:
            0.0095056575 = weight(_text_:a in 3059) [ClassicSimilarity], result of:
              0.0095056575 = score(doc=3059,freq=16.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.18016359 = fieldWeight in 3059, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3059)
            0.022065736 = weight(_text_:h in 3059) [ClassicSimilarity], result of:
              0.022065736 = score(doc=3059,freq=4.0), product of:
                0.113683715 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045758117 = queryNorm
                0.1940976 = fieldWeight in 3059, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3059)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Abstract
    This study proposes a new method to construct and trace the trajectory of conceptual development of a research field by combining main path analysis, citation analysis, and text-mining techniques. Main path analysis, a method used commonly to trace the most critical path in a citation network, helps describe the developmental trajectory of a research field. This study extends the main path analysis method and applies text-mining techniques in the new method, which reflects the trajectory of conceptual development in an academic research field more accurately than citation frequency, which represents only the articles examined. Articles can be merged based on similarity of concepts, and by merging concepts the history of a research field can be described more precisely. The new method was applied to the "h-index" and "text mining" fields. The precision, recall, and F-measures of the h-index were 0.738, 0.652, and 0.658 and those of text-mining were 0.501, 0.653, and 0.551, respectively. Last, this study not only establishes the conceptual trajectory map of a research field, but also recommends keywords that are more precise than those used currently by researchers. These precise keywords could enable researchers to gather related works more quickly than before.
    Type
    a
  16. Tiefschürfen in Datenbanken [Digging deep in databases] (2002) 0.01
    0.010113914 = product of:
      0.020227827 = sum of:
        0.020227827 = product of:
          0.03034174 = sum of:
            0.0053772116 = weight(_text_:a in 996) [ClassicSimilarity], result of:
              0.0053772116 = score(doc=996,freq=2.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.10191591 = fieldWeight in 996, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=996)
            0.02496453 = weight(_text_:h in 996) [ClassicSimilarity], result of:
              0.02496453 = score(doc=996,freq=2.0), product of:
                0.113683715 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045758117 = queryNorm
                0.21959636 = fieldWeight in 996, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0625 = fieldNorm(doc=996)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Source
    Spektrum der Wissenschaft. 2002, H.11, S.80-91
    Type
    a
  17. Nohr, H.: Big Data im Lichte der EU-Datenschutz-Grundverordnung [Big data in light of the EU General Data Protection Regulation] (2017) 0.01
    0.010113914 = product of:
      0.020227827 = sum of:
        0.020227827 = product of:
          0.03034174 = sum of:
            0.0053772116 = weight(_text_:a in 4076) [ClassicSimilarity], result of:
              0.0053772116 = score(doc=4076,freq=2.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.10191591 = fieldWeight in 4076, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4076)
            0.02496453 = weight(_text_:h in 4076) [ClassicSimilarity], result of:
              0.02496453 = score(doc=4076,freq=2.0), product of:
                0.113683715 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045758117 = queryNorm
                0.21959636 = fieldWeight in 4076, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4076)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Type
    a
  18. Winterhalter, C.: Licence to mine : ein Überblick über Rahmenbedingungen von Text and Data Mining und den aktuellen Stand der Diskussion [Licence to mine: an overview of the framework conditions for text and data mining and the current state of the discussion] (2016) 0.01
    0.010113914 = product of:
      0.020227827 = sum of:
        0.020227827 = product of:
          0.03034174 = sum of:
            0.0053772116 = weight(_text_:a in 673) [ClassicSimilarity], result of:
              0.0053772116 = score(doc=673,freq=2.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.10191591 = fieldWeight in 673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=673)
            0.02496453 = weight(_text_:h in 673) [ClassicSimilarity], result of:
              0.02496453 = score(doc=673,freq=2.0), product of:
                0.113683715 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045758117 = queryNorm
                0.21959636 = fieldWeight in 673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0625 = fieldNorm(doc=673)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Source
    027.7 Zeitschrift für Bibliothekskultur. 4(2016), H.2
    Type
    a
  19. Kraker, P.; Kittel, C.; Enkhbayar, A.: Open Knowledge Maps : creating a visual interface to the world's scientific knowledge based on natural language processing (2016) 0.01
    0.010043396 = product of:
      0.020086791 = sum of:
        0.020086791 = product of:
          0.030130185 = sum of:
            0.011406789 = weight(_text_:a in 3205) [ClassicSimilarity], result of:
              0.011406789 = score(doc=3205,freq=16.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.2161963 = fieldWeight in 3205, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3205)
            0.018723397 = weight(_text_:h in 3205) [ClassicSimilarity], result of:
              0.018723397 = score(doc=3205,freq=2.0), product of:
                0.113683715 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045758117 = queryNorm
                0.16469726 = fieldWeight in 3205, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3205)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Abstract
    The goal of Open Knowledge Maps is to create a visual interface to the world's scientific knowledge. The base for this visual interface consists of so-called knowledge maps, which enable the exploration of existing knowledge and the discovery of new knowledge. Our open source knowledge mapping software applies a mixture of summarization techniques and similarity measures on article metadata, which are iteratively chained together. After processing, the representation is saved in a database for use in a web visualization. In the future, we want to create a space for collective knowledge mapping that brings together individuals and communities involved in exploration and discovery. We want to enable people to guide each other in their discovery by collaboratively annotating and modifying the automatically created maps.
    Source
    027.7 Zeitschrift für Bibliothekskultur. 4(2016), H.2
    Type
    a
  20. Raghavan, V.V.; Deogun, J.S.; Sever, H.: Knowledge discovery and data mining : introduction (1998) 0.01
    0.009997789 = product of:
      0.019995578 = sum of:
        0.019995578 = product of:
          0.029993366 = sum of:
            0.008149404 = weight(_text_:a in 2899) [ClassicSimilarity], result of:
              0.008149404 = score(doc=2899,freq=6.0), product of:
                0.052761257 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.045758117 = queryNorm
                0.1544581 = fieldWeight in 2899, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2899)
            0.021843962 = weight(_text_:h in 2899) [ClassicSimilarity], result of:
              0.021843962 = score(doc=2899,freq=2.0), product of:
                0.113683715 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045758117 = queryNorm
                0.19214681 = fieldWeight in 2899, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2899)
          0.6666667 = coord(2/3)
      0.5 = coord(1/2)
    
    Abstract
    Defines knowledge discovery and database mining. The challenge for knowledge discovery in databases (KDD) is to automatically process large quantities of raw data, identify the most significant and meaningful patterns, and present these as knowledge appropriate for achieving a user's goals. Data mining is the process of deriving useful knowledge from real-world databases through the application of pattern extraction techniques. Explains the goals of, and motivation for, research work on data mining. Discusses the nature of database contents, along with problems within the field of data mining
    Footnote
    Contribution to a special issue devoted to knowledge discovery and data mining
    Type
    a

Languages

  • e 125
  • d 31
  • sp 1

Types

  • a 141
  • el 15
  • m 12
  • s 10