Search (92 results, page 1 of 5)

  • × theme_ss:"Data Mining"
  1. Ohly, H.P.: Bibliometric mining : added value from document analysis and retrieval (2008) 0.05
    0.053571116 = product of:
      0.08035667 = sum of:
        0.016003672 = weight(_text_:on in 2386) [ClassicSimilarity], result of:
          0.016003672 = score(doc=2386,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.14580199 = fieldWeight in 2386, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=2386)
        0.064353004 = product of:
          0.12870601 = sum of:
            0.12870601 = weight(_text_:demand in 2386) [ClassicSimilarity], result of:
              0.12870601 = score(doc=2386,freq=2.0), product of:
                0.31127608 = queryWeight, product of:
                  6.237302 = idf(docFreq=234, maxDocs=44218)
                  0.04990557 = queryNorm
                0.4134786 = fieldWeight in 2386, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  6.237302 = idf(docFreq=234, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2386)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Bibliometrics is understood as statistical analysis of scientific structures and processes. The analyzed data result from information and administrative actions. The demand for quality judgments or the discovering of new structures and information means that Bibliometrics takes on the role of being exploratory and decision supporting. To the extent that it has acquired important features of Data Mining, the analysis of text and internet material can be viewed as an additional challenge. In the sense of an evaluative approach Bibliometrics can also be seen to apply inference procedures as well as navigation tools.
  2. KDD : techniques and applications (1998) 0.05
    0.04838429 = product of:
      0.07257643 = sum of:
        0.032007344 = weight(_text_:on in 6783) [ClassicSimilarity], result of:
          0.032007344 = score(doc=6783,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.29160398 = fieldWeight in 6783, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.09375 = fieldNorm(doc=6783)
        0.040569093 = product of:
          0.081138186 = sum of:
            0.081138186 = weight(_text_:22 in 6783) [ClassicSimilarity], result of:
              0.081138186 = score(doc=6783,freq=2.0), product of:
                0.1747608 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04990557 = queryNorm
                0.46428138 = fieldWeight in 6783, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6783)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Footnote
    A special issue of selected papers from the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'97), held Singapore, 22-23 Feb 1997
  3. Matson, L.D.; Bonski, D.J.: Do digital libraries need librarians? (1997) 0.04
    0.042669978 = product of:
      0.064004965 = sum of:
        0.0369589 = weight(_text_:on in 1737) [ClassicSimilarity], result of:
          0.0369589 = score(doc=1737,freq=6.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.33671528 = fieldWeight in 1737, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0625 = fieldNorm(doc=1737)
        0.027046064 = product of:
          0.054092128 = sum of:
            0.054092128 = weight(_text_:22 in 1737) [ClassicSimilarity], result of:
              0.054092128 = score(doc=1737,freq=2.0), product of:
                0.1747608 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04990557 = queryNorm
                0.30952093 = fieldWeight in 1737, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1737)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Defines digital libraries and discusses the effects of new technology on librarians. Examines the different viewpoints of librarians and information technologists on digital libraries. Describes the development of a digital library at the National Drug Intelligence Center, USA, which was carried out in collaboration with information technology experts. The system is based on Web enabled search technology to find information, data visualization and data mining to visualize it and use of SGML as an information standard to store it
    Date
    22.11.1998 18:57:22
  4. Amir, A.; Feldman, R.; Kashi, R.: ¬A new and versatile method for association generation (1997) 0.03
    0.032256197 = product of:
      0.048384294 = sum of:
        0.021338228 = weight(_text_:on in 1270) [ClassicSimilarity], result of:
          0.021338228 = score(doc=1270,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.19440265 = fieldWeight in 1270, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0625 = fieldNorm(doc=1270)
        0.027046064 = product of:
          0.054092128 = sum of:
            0.054092128 = weight(_text_:22 in 1270) [ClassicSimilarity], result of:
              0.054092128 = score(doc=1270,freq=2.0), product of:
                0.1747608 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04990557 = queryNorm
                0.30952093 = fieldWeight in 1270, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1270)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Current algorithms for finding associations among the attributes describing data in a database have a number of shortcomings. Presents a novel method for association generation, that answers all desiderata. The method is different from all existing algorithms and especially suitable to textual databases with binary attributes. Uses subword trees for quick indexing into the required database statistics. Tests the algorithm on the Reuters-22173 database with satisfactory results
    Source
    Information systems. 22(1997) nos.5/6, S.333-347
  5. Hofstede, A.H.M. ter; Proper, H.A.; Van der Weide, T.P.: Exploiting fact verbalisation in conceptual information modelling (1997) 0.03
    0.02822417 = product of:
      0.042336255 = sum of:
        0.01867095 = weight(_text_:on in 2908) [ClassicSimilarity], result of:
          0.01867095 = score(doc=2908,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.17010231 = fieldWeight in 2908, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2908)
        0.023665305 = product of:
          0.04733061 = sum of:
            0.04733061 = weight(_text_:22 in 2908) [ClassicSimilarity], result of:
              0.04733061 = score(doc=2908,freq=2.0), product of:
                0.1747608 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04990557 = queryNorm
                0.2708308 = fieldWeight in 2908, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2908)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Focuses on the information modelling side of conceptual modelling. Deals with the exploitation of fact verbalisations after finishing the actual information system. Verbalisations are used as input for the design of the so-called information model. Exploits these verbalisation in 4 directions: considers their use for a conceptual query language, the verbalisation of instances, the description of the contents of a database and for the verbalisation of queries in a computer supported query environment. Provides an example session with an envisioned tool for end user query formulations that exploits the verbalisation
    Source
    Information systems. 22(1997) nos.5/6, S.349-385
  6. Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.03
    0.026668733 = product of:
      0.0400031 = sum of:
        0.02309931 = weight(_text_:on in 668) [ClassicSimilarity], result of:
          0.02309931 = score(doc=668,freq=6.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.21044704 = fieldWeight in 668, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=668)
        0.01690379 = product of:
          0.03380758 = sum of:
            0.03380758 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
              0.03380758 = score(doc=668,freq=2.0), product of:
                0.1747608 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04990557 = queryNorm
                0.19345059 = fieldWeight in 668, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=668)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    20th century massification of higher education and research in academia is said to have produced structurally stratified higher education systems in many countries. Most manifestly, the research mission of universities appears to be divisive. Authors have claimed that the Swedish system, while formally unified, has developed into a binary state, and statistics seem to support this conclusion. This article makes use of a comprehensive statistical data source on Swedish higher education institutions to illustrate stratification, and uses literature on Swedish research policy history to contextualize the statistics. Highlighting the opportunities as well as constraints of the data, the article argues that there is great merit in combining statistics with a qualitative analysis when studying the structural characteristics of national higher education systems. Not least the article shows that it is an over-simplification to describe the Swedish system as binary; the stratification is more complex. On basis of the analysis, the article also argues that while global trends certainly influence national developments, higher education systems have country-specific features that may enrich the understanding of how systems evolve and therefore should be analyzed as part of a broader study of the increasingly globalized academic system.
    Date
    22. 3.2013 19:43:01
  7. Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.02
    0.023842867 = product of:
      0.0357643 = sum of:
        0.01886051 = weight(_text_:on in 1605) [ClassicSimilarity], result of:
          0.01886051 = score(doc=1605,freq=4.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.1718293 = fieldWeight in 1605, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1605)
        0.01690379 = product of:
          0.03380758 = sum of:
            0.03380758 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
              0.03380758 = score(doc=1605,freq=2.0), product of:
                0.1747608 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04990557 = queryNorm
                0.19345059 = fieldWeight in 1605, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1605)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Numerous studies have explored the possibility of uncovering information from web search queries but few have examined the factors that affect web query data sources. We conducted a study that investigated this issue by comparing Google Trends and Baidu Index. Data from these two services are based on queries entered by users into Google and Baidu, two of the largest search engines in the world. We first compared the features and functions of the two services based on documents and extensive testing. We then carried out an empirical study that collected query volume data from the two sources. We found that data from both sources could be used to predict the quality of Chinese universities and companies. Despite the differences between the two services in terms of technology, such as differing methods of language processing, the search volume data from the two were highly correlated and combining the two data sources did not improve the predictive power of the data. However, there was a major difference between the two in terms of data availability. Baidu Index was able to provide more search volume data than Google Trends did. Our analysis showed that the disadvantage of Google Trends in this regard was due to Google's smaller user base in China. The implication of this finding goes beyond China. Google's user bases in many countries are smaller than that in China, so the search volume data related to those countries could result in the same issue as that related to China.
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.1, S.13-22
  8. Fonseca, F.; Marcinkowski, M.; Davis, C.: Cyber-human systems of thought and understanding (2019) 0.02
    0.023842867 = product of:
      0.0357643 = sum of:
        0.01886051 = weight(_text_:on in 5011) [ClassicSimilarity], result of:
          0.01886051 = score(doc=5011,freq=4.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.1718293 = fieldWeight in 5011, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5011)
        0.01690379 = product of:
          0.03380758 = sum of:
            0.03380758 = weight(_text_:22 in 5011) [ClassicSimilarity], result of:
              0.03380758 = score(doc=5011,freq=2.0), product of:
                0.1747608 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04990557 = queryNorm
                0.19345059 = fieldWeight in 5011, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5011)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The present challenge faced by scientists working with Big Data comes in the overwhelming volume and level of detail provided by current data sets. Exceeding traditional empirical approaches, Big Data opens a new perspective on scientific work in which data comes to play a role in the development of the scientific problematic to be developed. Addressing this reconfiguration of our relationship with data through readings of Wittgenstein, Macherey, and Popper, we propose a picture of science that encourages scientists to engage with the data in a direct way, using the data itself as an instrument for scientific investigation. Using GIS as a theme, we develop the concept of cyber-human systems of thought and understanding to bridge the divide between representative (theoretical) thinking and (non-theoretical) data-driven science. At the foundation of these systems, we invoke the concept of the "semantic pixel" to establish a logical and virtual space linking data and the work of scientists. It is with this discussion of the relationship between analysts in their pursuit of knowledge and the rise of Big Data that this present discussion of the philosophical foundations of Big Data addresses the central questions raised by social informatics research.
    Date
    7. 3.2019 16:32:22
    Footnote
    Beitrag eines Special issue on social informatics of knowledge
  9. Chowdhury, G.G.: Template mining for information extraction from digital documents (1999) 0.02
    0.01577687 = product of:
      0.04733061 = sum of:
        0.04733061 = product of:
          0.09466122 = sum of:
            0.09466122 = weight(_text_:22 in 4577) [ClassicSimilarity], result of:
              0.09466122 = score(doc=4577,freq=2.0), product of:
                0.1747608 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04990557 = queryNorm
                0.5416616 = fieldWeight in 4577, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4577)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    2. 4.2000 18:01:22
  10. Huvila, I.: Mining qualitative data on human information behaviour from the Web (2010) 0.01
    0.0139165055 = product of:
      0.041749515 = sum of:
        0.041749515 = weight(_text_:on in 4676) [ClassicSimilarity], result of:
          0.041749515 = score(doc=4676,freq=10.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.38036036 = fieldWeight in 4676, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4676)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper discusses an approach of collecting qualitative data on human information behaviour that is based on mining web data using search engines. The approach is technically the same that has been used for some time in webometric research to make statistical inferences on web data, but the present paper shows how the same tools and data collecting methods can be used to gather data for qualitative data analysis on human information behaviour.
  11. Shi, X.; Yang, C.C.: Mining related queries from Web search engine query logs using an improved association rule mining model (2007) 0.01
    0.0108891195 = product of:
      0.032667357 = sum of:
        0.032667357 = weight(_text_:on in 597) [ClassicSimilarity], result of:
          0.032667357 = score(doc=597,freq=12.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.29761705 = fieldWeight in 597, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=597)
      0.33333334 = coord(1/3)
    
    Abstract
    With the overwhelming volume of information, the task of finding relevant information on a given topic on the Web is becoming increasingly difficult. Web search engines hence become one of the most popular solutions available on the Web. However, it has never been easy for novice users to organize and represent their information needs using simple queries. Users have to keep modifying their input queries until they get expected results. Therefore, it is often desirable for search engines to give suggestions on related queries to users. Besides, by identifying those related queries, search engines can potentially perform optimizations on their systems, such as query expansion and file indexing. In this work we propose a method that suggests a list of related queries given an initial input query. The related queries are based in the query log of previously submitted queries by human users, which can be identified using an enhanced model of association rules. Users can utilize the suggested related queries to tune or redirect the search process. Our method not only discovers the related queries, but also ranks them according to the degree of their relatedness. Unlike many other rival techniques, it also performs reasonably well on less frequent input queries.
  12. Lingras, P.J.; Yao, Y.Y.: Data mining using extensions of the rough set model (1998) 0.01
    0.010779679 = product of:
      0.032339036 = sum of:
        0.032339036 = weight(_text_:on in 2910) [ClassicSimilarity], result of:
          0.032339036 = score(doc=2910,freq=6.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.29462588 = fieldWeight in 2910, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2910)
      0.33333334 = coord(1/3)
    
    Abstract
    Examines basic issues of data mining using the theory of rough sets, which is a recent proposal for generalizing classical set theory. The Pawlak rough set model is based on the concept of an equivalence relation. A generalized rough set model need not be based on equivalence relation axioms. The Pawlak rough set model has been used for deriving deterministic as well as probabilistic rules froma complete database. Demonstrates that a generalised rough set model can be used for generating rules from incomplete databases. These rules are based on plausability functions proposed by Shafer. Discusses the importance of rule extraction from incomplete databases in data mining
  13. Analytische Informationssysteme : Data Warehouse, On-Line Analytical Processing, Data Mining (1999) 0.01
    0.010779679 = product of:
      0.032339036 = sum of:
        0.032339036 = weight(_text_:on in 1381) [ClassicSimilarity], result of:
          0.032339036 = score(doc=1381,freq=6.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.29462588 = fieldWeight in 1381, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1381)
      0.33333334 = coord(1/3)
    
    Abstract
    Neben den operativen Informationssystemen, welche die Abwicklung des betrieblichen Tagesgeschäftes unterstützen, treten heute verstärkt Informationssysteme für analytische Aufgaben der Fach- und Führungskräfte in den Vordergrund. In fast allen Unternehmen werden derzeit Begriffe und Konzepte wie Data Warehouse, On-Line Analytical Processing und Data Mining diskutiert und die zugehörigen Produkte evaluiert. Vor diesem Hintergrund zielt der vorliegende Sammelband darauf ab, einen aktuellen Überblick über Technologien, Produkte und Trends zu bieten. Als Entscheidungsgrundlage für den Praktiker beim Aufbau und Einsatz derartiger analytischer Informationssysteme können die unterschiedlichen Beiträge aus Wirtschaft und Wissenschaft wertvolle Hilfestellung leisten.
    Content
    Grundlagen.- Data Warehouse.- On-line Analytical Processing.- Data Mining.- Betriebswirtschaftliche und strategische Aspekte.
  14. Lam, W.; Yang, C.C.; Menczer, F.: Introduction to the special topic section on mining Web resources for enhancing information retrieval (2007) 0.01
    0.010779679 = product of:
      0.032339036 = sum of:
        0.032339036 = weight(_text_:on in 600) [ClassicSimilarity], result of:
          0.032339036 = score(doc=600,freq=6.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.29462588 = fieldWeight in 600, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=600)
      0.33333334 = coord(1/3)
    
    Abstract
    The amount of information on the Web has been expanding at an enormous pace. There are a variety of Web documents in different genres, such as news, reports, reviews. Traditionally, the information displayed on Web sites has been static. Recently, there are many Web sites offering content that is dynamically generated and frequently updated. It is also common for Web sites to contain information in different languages since many countries adopt more than one language. Moreover, content may exist in multimedia formats including text, images, video, and audio.
  15. Kong, S.; Ye, F.; Feng, L.; Zhao, Z.: Towards the prediction problems of bursting hashtags on Twitter (2015) 0.01
    0.010779679 = product of:
      0.032339036 = sum of:
        0.032339036 = weight(_text_:on in 2338) [ClassicSimilarity], result of:
          0.032339036 = score(doc=2338,freq=6.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.29462588 = fieldWeight in 2338, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2338)
      0.33333334 = coord(1/3)
    
    Abstract
    Hundreds of thousands of hashtags are generated every day on Twitter. Only a few will burst and become trending topics. In this article, we provide the definition of a bursting hashtag and conduct a systematic study of a series of challenging prediction problems that span the entire life cycles of bursting hashtags. Around the problem of "how to build a system to predict bursting hashtags," we explore different types of features and present machine learning solutions. On real data sets from Twitter, experiments are conducted to evaluate the effectiveness of the proposed solutions and the contributions of features.
  16. Leydesdorff, L.; Persson, O.: Mapping the geography of science : distribution patterns and networks of relations among cities and institutes (2010) 0.01
    0.010669115 = product of:
      0.032007344 = sum of:
        0.032007344 = weight(_text_:on in 3704) [ClassicSimilarity], result of:
          0.032007344 = score(doc=3704,freq=8.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.29160398 = fieldWeight in 3704, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=3704)
      0.33333334 = coord(1/3)
    
    Abstract
    Using Google Earth, Google Maps, and/or network visualization programs such as Pajek, one can overlay the network of relations among addresses in scientific publications onto the geographic map. The authors discuss the pros and cons of various options, and provide software (freeware) for bridging existing gaps between the Science Citation Indices (Thomson Reuters) and Scopus (Elsevier), on the one hand, and these various visualization tools on the other. At the level of city names, the global map can be drawn reliably on the basis of the available address information. At the level of the names of organizations and institutes, there are problems of unification both in the ISI databases and with Scopus. Pajek enables a combination of visualization and statistical analysis, whereas the Google Maps and its derivatives provide superior tools on the Internet.
  17. Chen, Z.: Knowledge discovery and system-user partnership : on a production 'adversarial partnership' approach (1994) 0.01
    0.010058938 = product of:
      0.030176813 = sum of:
        0.030176813 = weight(_text_:on in 6759) [ClassicSimilarity], result of:
          0.030176813 = score(doc=6759,freq=4.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.27492687 = fieldWeight in 6759, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0625 = fieldNorm(doc=6759)
      0.33333334 = coord(1/3)
    
    Abstract
    Examines the relationship between systems and users from the knowledge discovery in databases or data mining perspecitives. A comprehensive study on knowledge discovery in human computer symbiosis is needed. Proposes a database-user adversarial partnership, which is general enough to cover knowledge discovery and security of issues related to databases and their users. It can be further generalized into system-user adversarial paertnership. Discusses opportunities provided by knowledge discovery techniques and potential social implications
  18. Analytische Informationssysteme : Data Warehouse, On-Line Analytical Processing, Data Mining (1998) 0.01
    0.010058938 = product of:
      0.030176813 = sum of:
        0.030176813 = weight(_text_:on in 1380) [ClassicSimilarity], result of:
          0.030176813 = score(doc=1380,freq=4.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.27492687 = fieldWeight in 1380, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0625 = fieldNorm(doc=1380)
      0.33333334 = coord(1/3)
    
    Abstract
    Neben den operativen Informationssystemen treten heute verstärkt Informationssysteme für die analytischen Aufgaben der Fach- und Führungskräfte in den Vordergrund. In fast allen Unternehmen werden derzeit Begriffe und Konzepte wie Data Warehouse, On-Line Analytical Processing und Data Mining diskutiert und die zugehörigen Produkte evaluiert. Vor diesem Hintergrund zielt der vorliegende Sammelband darauf, einen aktuellen Überblick über Technologien, Produkte und Trends zu bieten. Als Entscheidungsgrundlage für den Praktiker beim Aufbau und Einsatz derartiger analytischer Informationssysteme können die unterschiedlichen Beiträge aus Wirtschaft und Wissenschaft wertvolle Hilfestellung leisten
  19. Pons-Porrata, A.; Berlanga-Llavori, R.; Ruiz-Shulcloper, J.: Topic discovery based on text mining techniques (2007) 0.01
    0.009239726 = product of:
      0.027719175 = sum of:
        0.027719175 = weight(_text_:on in 916) [ClassicSimilarity], result of:
          0.027719175 = score(doc=916,freq=6.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.25253648 = fieldWeight in 916, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=916)
      0.33333334 = coord(1/3)
    
    Abstract
    In this paper, we present a topic discovery system aimed to reveal the implicit knowledge present in news streams. This knowledge is expressed as a hierarchy of topic/subtopics, where each topic contains the set of documents that are related to it and a summary extracted from these documents. Summaries so built are useful to browse and select topics of interest from the generated hierarchies. Our proposal consists of a new incremental hierarchical clustering algorithm, which combines both partitional and agglomerative approaches, taking the main benefits from them. Finally, a new summarization method based on Testor Theory has been proposed to build the topic summaries. Experimental results in the TDT2 collection demonstrate its usefulness and effectiveness not only as a topic detection system, but also as a classification and summarization tool.
    Footnote
    Beitrag in: Special issue on Heterogeneous and Distributed IR
  20. Kulathuramaiyer, N.; Maurer, H.: Implications of emerging data mining (2009) 0.01
    0.009239726 = product of:
      0.027719175 = sum of:
        0.027719175 = weight(_text_:on in 3144) [ClassicSimilarity], result of:
          0.027719175 = score(doc=3144,freq=6.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.25253648 = fieldWeight in 3144, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=3144)
      0.33333334 = coord(1/3)
    
    Abstract
    Data Mining describes a technology that discovers non-trivial hidden patterns in a large collection of data. Although this technology has a tremendous impact on our lives, the invaluable contributions of this invisible technology often go unnoticed. This paper discusses advances in data mining while focusing on the emerging data mining capability. Such data mining applications perform multidimensional mining on a wide variety of heterogeneous data sources, providing solutions to many unresolved problems. This paper also highlights the advantages and disadvantages arising from the ever-expanding scope of data mining. Data Mining augments human intelligence by equipping us with a wealth of knowledge and by empowering us to perform our daily tasks better. As the mining scope and capacity increases, users and organizations become more willing to compromise privacy. The huge data stores of the 'master miners' allow them to gain deep insights into individual lifestyles and their social and behavioural patterns. Data integration and analysis capability of combining business and financial trends together with the ability to deterministically track market changes will drastically affect our lives.

Years

Languages

  • e 82
  • d 9
  • sp 1
  • More… Less…

Types

  • a 79
  • m 10
  • s 9
  • el 6
  • More… Less…