Search (26 results, page 1 of 2)

  • Filter: type_ss:"a"
  • Filter: theme_ss:"Data Mining"
  1. Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.04
    0.03819228 = product of:
      0.07638456 = sum of:
        0.06049643 = weight(_text_:services in 1605) [ClassicSimilarity], result of:
          0.06049643 = score(doc=1605,freq=6.0), product of:
            0.17221296 = queryWeight, product of:
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046906993 = queryNorm
            0.3512885 = fieldWeight in 1605, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1605)
        0.015888125 = product of:
          0.03177625 = sum of:
            0.03177625 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
              0.03177625 = score(doc=1605,freq=2.0), product of:
                0.1642603 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046906993 = queryNorm
                0.19345059 = fieldWeight in 1605, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1605)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Numerous studies have explored the possibility of uncovering information from web search queries, but few have examined the factors that affect web query data sources. We investigated this issue by comparing Google Trends and Baidu Index. Data from these two services are based on queries entered by users into Google and Baidu, two of the largest search engines in the world. We first compared the features and functions of the two services based on their documentation and on extensive testing. We then carried out an empirical study that collected query volume data from the two sources. We found that data from both sources could be used to predict the quality of Chinese universities and companies. Despite the differences between the two services in technology, such as differing methods of language processing, the search volume data from the two were highly correlated, and combining the two data sources did not improve the predictive power of the data. However, there was a major difference between the two in data availability: Baidu Index was able to provide more search volume data than Google Trends. Our analysis showed that Google Trends' disadvantage in this regard was due to Google's smaller user base in China. The implication of this finding goes beyond China: Google's user base in many countries is smaller than its base in China, so search volume data for those countries could suffer from the same issue.
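    At its core, the comparison of the two sources reduces to correlating two query-volume time series. A minimal sketch with invented weekly volumes (the numbers are illustrative only, not the study's data):

    ```python
    import numpy as np

    # Hypothetical weekly query volumes for one university name as the two
    # services might report them (illustrative values, not from the paper).
    google_trends = np.array([52, 61, 58, 70, 66, 73, 69, 80])
    baidu_index   = np.array([48, 57, 55, 68, 62, 71, 66, 77])

    r = np.corrcoef(google_trends, baidu_index)[0, 1]
    print(f"Pearson r = {r:.3f}")  # a high r mirrors the paper's finding
    ```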
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.1, S.13-22
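    The indented tree above each result is Lucene's ClassicSimilarity "explain" output: per term, score = queryWeight x fieldWeight, where queryWeight = idf x queryNorm and fieldWeight = sqrt(tf) x idf x fieldNorm, and coord factors scale for query terms missing from the document. A minimal sketch that recomputes the listed score of result 1 from the quantities shown in its breakdown:

    ```python
    import math

    def term_score(freq, idf, field_norm, query_norm):
        """One term's contribution in ClassicSimilarity: queryWeight * fieldWeight."""
        query_weight = idf * query_norm                     # idf(t) * queryNorm
        field_weight = math.sqrt(freq) * idf * field_norm   # tf * idf * fieldNorm
        return query_weight * field_weight

    # Quantities copied from the explain tree for doc 1605 above:
    services = term_score(6.0, 3.6713707, 0.0390625, 0.046906993)
    term_22  = term_score(2.0, 3.5018296, 0.0390625, 0.046906993) * 0.5  # coord(1/2)
    total = (services + term_22) * 0.5                                   # coord(2/4)
    print(round(total, 8))  # ~0.03819228, the listed score, up to float rounding
    ```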
  2. Chen, C.-C.; Chen, A.-P.: Using data mining technology to provide a recommendation service in the digital library (2007) 0.01
    0.012348781 = product of:
      0.049395125 = sum of:
        0.049395125 = weight(_text_:services in 2533) [ClassicSimilarity], result of:
          0.049395125 = score(doc=2533,freq=4.0), product of:
            0.17221296 = queryWeight, product of:
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046906993 = queryNorm
            0.28682584 = fieldWeight in 2533, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2533)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - As library holdings grow day by day, it becomes difficult for readers to find the books that interest them, or representative booklists. How to use meaningful information effectively to improve the service quality of the digital library is therefore very important. The purpose of this paper is to provide a recommendation system architecture to promote digital library services in electronic libraries. Design/methodology/approach - In the proposed architecture, a two-phase data mining process using association rule and clustering methods generates the recommendations. The process considers not only the relationships within a cluster of users but also the associations among the information they access. Findings - With the advanced filter, the recommendations supported by the proposed system architecture closely match users' needs. Originality/value - This paper not only constructs a recommendation service with which readers can search books from the web, but also takes the initiative in finding the most suitable books for readers. Furthermore, library managers can use it to purchase core and popular books on a limited budget, so as to maintain and satisfy readers' requirements while promoting digital library services.
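    The abstract names the two mining phases but not their exact algorithms. A minimal association-rule sketch over hypothetical loan histories (the data, the confidence threshold, and the rule form are assumptions, not the paper's specification):

    ```python
    from collections import Counter
    from itertools import combinations

    # Hypothetical loan histories; each set is the books one reader borrowed.
    histories = [
        {"data mining", "databases", "statistics"},
        {"data mining", "databases"},
        {"databases", "statistics"},
        {"data mining", "statistics"},
    ]

    item_counts, pair_counts = Counter(), Counter()
    for h in histories:
        item_counts.update(h)
        pair_counts.update(combinations(sorted(h), 2))

    # confidence(a -> b) = support({a, b}) / support({a});
    # recommend b to readers who borrowed a when confidence is high enough.
    MIN_CONF = 0.5
    for (a, b), n in pair_counts.items():
        for antecedent, consequent in ((a, b), (b, a)):
            conf = n / item_counts[antecedent]
            if conf >= MIN_CONF:
                print(f"{antecedent} -> {consequent} (confidence {conf:.2f})")
    ```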
  3. Chowdhury, G.G.: Template mining for information extraction from digital documents (1999) 0.01
    0.0111216875 = product of:
      0.04448675 = sum of:
        0.04448675 = product of:
          0.0889735 = sum of:
            0.0889735 = weight(_text_:22 in 4577) [ClassicSimilarity], result of:
              0.0889735 = score(doc=4577,freq=2.0), product of:
                0.1642603 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046906993 = queryNorm
                0.5416616 = fieldWeight in 4577, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4577)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    2. 4.2000 18:01:22
  4. Sun, X.; Lin, H.: Topical community detection from mining user tagging behavior and interest (2013) 0.01
    0.010478289 = product of:
      0.041913155 = sum of:
        0.041913155 = weight(_text_:services in 605) [ClassicSimilarity], result of:
          0.041913155 = score(doc=605,freq=2.0), product of:
            0.17221296 = queryWeight, product of:
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046906993 = queryNorm
            0.2433798 = fieldWeight in 605, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046875 = fieldNorm(doc=605)
      0.25 = coord(1/4)
    
    Abstract
    With the development of Web 2.0, social tagging systems, in which users can freely choose tags to annotate resources according to their interests, have attracted much attention. In particular, literature on the emergence of collective intelligence in social tagging systems has increased. In this article, we propose a probabilistic generative model to detect latent topical communities among users. Social tags and resource contents are leveraged to model user interest in two similar and correlated ways. Our primary goal is to capture user tagging behavior and interest and to discover the emergent topical community structure. The communities should be groups of users with frequent social interactions as well as similar topical interests, which would have important research implications for personalized information services. Experimental results on two real social tagging data sets of different genres show that the proposed generative model more accurately models user interest and detects high-quality and meaningful topical communities.
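    The abstract does not spell out the generative model, so as a rough stand-in the sketch below groups users into communities by clustering their tag profiles; the data and the choice of k-means are assumptions, not the paper's method:

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    # Hypothetical user-by-tag count matrix (rows: users, columns: tags).
    tags = ["python", "ml", "cooking", "baking"]
    X = np.array([
        [5, 3, 0, 0],
        [4, 6, 0, 1],
        [0, 0, 7, 4],
        [1, 0, 5, 6],
    ])

    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    print(labels)  # users with similar tag profiles share a community label
    ```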
  5. Lischka, K.: Spurensuche im Datenwust : Data-Mining-Software fahndet nach kriminellen Mitarbeitern, guten Kunden - und bald vielleicht auch nach Terroristen [Searching for clues in the data heap : data mining software hunts for criminal employees, good customers - and soon perhaps terrorists too] (2002) 0.01
    0.00918236 = product of:
      0.03672944 = sum of:
        0.03672944 = sum of:
          0.017663691 = weight(_text_:management in 1178) [ClassicSimilarity], result of:
            0.017663691 = score(doc=1178,freq=2.0), product of:
              0.15810528 = queryWeight, product of:
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.046906993 = queryNorm
              0.11172107 = fieldWeight in 1178, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3706124 = idf(docFreq=4130, maxDocs=44218)
                0.0234375 = fieldNorm(doc=1178)
          0.019065749 = weight(_text_:22 in 1178) [ClassicSimilarity], result of:
            0.019065749 = score(doc=1178,freq=2.0), product of:
              0.1642603 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046906993 = queryNorm
              0.116070345 = fieldWeight in 1178, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0234375 = fieldNorm(doc=1178)
      0.25 = coord(1/4)
    
    Content
    "Ob man als Terrorist einen Anschlag gegen die Vereinigten Staaten plant, als Kassierer Scheine aus der Kasse unterschlägt oder für bestimmte Produkte besonders gerne Geld ausgibt - einen Unterschied macht Data-Mining-Software da nicht. Solche Programme analysieren riesige Daten- mengen und fällen statistische Urteile. Mit diesen Methoden wollen nun die For- scher des "Information Awaren in den Vereinigten Staaten Spuren von Terroristen in den Datenbanken von Behörden und privaten Unternehmen wie Kreditkartenfirmen finden. 200 Millionen Dollar umfasst der Jahresetat für die verschiedenen Forschungsprojekte. Dass solche Software in der Praxis funktioniert, zeigen die steigenden Umsätze der Anbieter so genannter Customer-Relationship-Management-Software. Im vergangenen Jahr ist das Potenzial für analytische CRM-Anwendungen laut dem Marktforschungsinstitut IDC weltweit um 22 Prozent gewachsen, bis zum Jahr 2006 soll es in Deutschland mit einem jährlichen Plus von 14,1 Prozent so weitergehen. Und das trotz schwacher Konjunktur - oder gerade deswegen. Denn ähnlich wie Data-Mining der USRegierung helfen soll, Terroristen zu finden, entscheiden CRM-Programme heute, welche Kunden für eine Firma profitabel sind. Und welche es künftig sein werden, wie Manuela Schnaubelt, Sprecherin des CRM-Anbieters SAP, beschreibt: "Die Kundenbewertung ist ein zentraler Bestandteil des analytischen CRM. Sie ermöglicht es Unternehmen, sich auf die für sie wichtigen und richtigen Kunden zu fokussieren. Darüber hinaus können Firmen mit speziellen Scoring- Verfahren ermitteln, welche Kunden langfristig in welchem Maße zum Unternehmenserfolg beitragen." Die Folgen der Bewertungen sind für die Betroffenen nicht immer positiv: Attraktive Kunden profitieren von individuellen Sonderangeboten und besonderer Zuwendung. Andere hängen vielleicht so lauge in der Warteschleife des Telefonservice, bis die profitableren Kunden abgearbeitet sind. So könnte eine praktische Umsetzung dessen aussehen, was SAP-Spreche-rin Schnaubelt abstrakt beschreibt: "In vielen Unternehmen wird Kundenbewertung mit der klassischen ABC-Analyse durchgeführt, bei der Kunden anhand von Daten wie dem Umsatz kategorisiert werden. A-Kunden als besonders wichtige Kunden werden anders betreut als C-Kunden." Noch näher am geplanten Einsatz von Data-Mining zur Terroristenjagd ist eine Anwendung, die heute viele Firmen erfolgreich nutzen: Sie spüren betrügende Mitarbeiter auf. Werner Sülzer vom großen CRM-Anbieter NCR Teradata beschreibt die Möglichkeiten so: "Heute hinterlässt praktisch jeder Täter - ob Mitarbeiter, Kunde oder Lieferant - Datenspuren bei seinen wirtschaftskriminellen Handlungen. Es muss vorrangig darum gehen, einzelne Spuren zu Handlungsmustern und Täterprofilen zu verdichten. Das gelingt mittels zentraler Datenlager und hoch entwickelter Such- und Analyseinstrumente." Von konkreten Erfolgen sprich: Entlas-sungen krimineller Mitarbeiter-nach Einsatz solcher Programme erzählen Unternehmen nicht gerne. Matthias Wilke von der "Beratungsstelle für Technologiefolgen und Qualifizierung" (BTQ) der Gewerkschaft Verdi weiß von einem Fall 'aus der Schweiz. Dort setzt die Handelskette "Pick Pay" das Programm "Lord Lose Prevention" ein. Zwei Monate nach Einfüh-rung seien Unterschlagungen im Wert von etwa 200 000 Franken ermittelt worden. Das kostete mehr als 50 verdächtige Kassiererinnen und Kassierer den Job.
  6. Ku, L.-W.; Chen, H.-H.: Mining opinions from the Web : beyond relevance retrieval (2007) 0.01
    0.008731907 = product of:
      0.03492763 = sum of:
        0.03492763 = weight(_text_:services in 605) [ClassicSimilarity], result of:
          0.03492763 = score(doc=605,freq=2.0), product of:
            0.17221296 = queryWeight, product of:
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046906993 = queryNorm
            0.2028165 = fieldWeight in 605, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.0390625 = fieldNorm(doc=605)
      0.25 = coord(1/4)
    
    Abstract
    Documents discussing public affairs, common themes, interesting products, and so on are reported and distributed on the Web. Positive and negative opinions embedded in documents are useful references and feedback for governments seeking to improve their services, for companies marketing their products, and for customers deciding what to purchase. Web opinion mining aims to extract, summarize, and track various aspects of subjective information on the Web. Mining subjective information enables traditional information retrieval (IR) systems to retrieve more data from human viewpoints and provide information with finer granularity. Opinion extraction identifies opinion holders, extracts the relevant opinion sentences, and decides their polarities. Opinion summarization recognizes the major events embedded in documents and summarizes the supportive and the nonsupportive evidence. Opinion tracking captures subjective information from various genres and monitors the development of opinions along spatial and temporal dimensions. To demonstrate and evaluate the proposed opinion mining algorithms, news and bloggers' articles are adopted. Documents in the evaluation corpora are tagged at different granularities, from words and sentences to documents. In the experiments, positive and negative sentiment words and their weights are mined on the basis of Chinese word structures. The f-measure is 73.18% and 63.75% for verbs and nouns, respectively. Utilizing the mined sentiment words together with topical words, we achieve f-measures of 62.16% at the sentence level and 74.37% at the document level.
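    A minimal sketch of two ingredients the abstract mentions: a lexicon-based polarity score and the f-measure used in the evaluation. The lexicon and its weights are invented here; the paper mines its sentiment words and weights from Chinese word structures:

    ```python
    # Invented sentiment lexicon with polarity weights (illustrative only).
    LEXICON = {"excellent": 1.0, "good": 0.5, "poor": -0.5, "awful": -1.0}

    def polarity(sentence: str) -> float:
        """Sum of the lexicon weights of the words in a sentence."""
        return sum(LEXICON.get(w, 0.0) for w in sentence.lower().split())

    def f_measure(tp: int, fp: int, fn: int) -> float:
        """Harmonic mean of precision and recall."""
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        return 2 * precision * recall / (precision + recall)

    print(polarity("the service was excellent but the food was poor"))  # 0.5
    print(round(f_measure(tp=73, fp=20, fn=25), 4))  # illustrative counts
    ```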
  7. Cohen, D.J.: From Babel to knowledge : data mining large digital collections (2006) 0.01
    0.006985526 = product of:
      0.027942104 = sum of:
        0.027942104 = weight(_text_:services in 1178) [ClassicSimilarity], result of:
          0.027942104 = score(doc=1178,freq=2.0), product of:
            0.17221296 = queryWeight, product of:
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046906993 = queryNorm
            0.1622532 = fieldWeight in 1178, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.03125 = fieldNorm(doc=1178)
      0.25 = coord(1/4)
    
    Abstract
    In Jorge Luis Borges's curious short story The Library of Babel, the narrator describes an endless collection of books stored from floor to ceiling in a labyrinth of countless hexagonal rooms. The pages of the library's books seem to contain random sequences of letters and spaces; occasionally a few intelligible words emerge in the sea of paper and ink. Nevertheless, readers diligently, and exasperatingly, scan the shelves for coherent passages. The narrator himself has wandered numerous rooms in search of enlightenment, but with resignation he simply awaits his death and burial - which Borges explains (with signature dark humor) consists of being tossed unceremoniously over the library's banister. Borges's nightmare, of course, is a cursed vision of the research methods of disciplines such as literature, history, and philosophy, where the careful reading of books, one after the other, is supposed to lead inexorably to knowledge and understanding. Computer scientists would approach Borges's library far differently. Employing the information theory that forms the basis for search engines and other computerized techniques for assessing large masses of documents in one fell swoop, they would quickly realize the collection's incoherence through sampling and statistical methods - and wisely start looking for the library's exit. These computational methods, which allow us to find patterns, determine relationships, categorize documents, and extract information from massive corpuses, will form the basis for new tools for research in the humanities and other disciplines in the coming decade. For the past three years I have been experimenting with how to provide such end-user tools - that is, tools that harness the power of vast electronic collections while hiding much of their complicated technical plumbing. In particular, I have made extensive use of the application programming interfaces (APIs) the leading search engines provide for programmers to query their databases directly (from server to server without using their web interfaces). In addition, I have explored how one might extract information from large digital collections, from the well-curated lexicographic database WordNet to the democratic (and poorly curated) online reference work Wikipedia. While processing these digital corpuses is currently an imperfect science, even now useful tools can be created by combining various collections and methods for searching and analyzing them. And more importantly, these nascent services suggest a future in which information can be gleaned from, and sense can be made out of, even imperfect digital libraries of enormous scale. A brief examination of two approaches to data mining large digital collections hints at this future, while also providing some lessons about how to get there.
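    The essay's server-to-server querying of large collections can be illustrated against the public MediaWiki search API; this is a stand-in, since the search-engine APIs the author actually used are not detailed in the excerpt:

    ```python
    import requests

    # Query Wikipedia's search API directly, server to server.
    resp = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "list": "search",
            "srsearch": "Library of Babel Borges",
            "format": "json",
        },
        timeout=10,
    )
    for hit in resp.json()["query"]["search"][:5]:
        print(hit["title"])  # titles of the top matching articles
    ```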
  8. Matson, L.D.; Bonski, D.J.: Do digital libraries need librarians? (1997) 0.01
    0.0063552503 = product of:
      0.025421001 = sum of:
        0.025421001 = product of:
          0.050842002 = sum of:
            0.050842002 = weight(_text_:22 in 1737) [ClassicSimilarity], result of:
              0.050842002 = score(doc=1737,freq=2.0), product of:
                0.1642603 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046906993 = queryNorm
                0.30952093 = fieldWeight in 1737, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1737)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    22.11.1998 18:57:22
  9. Amir, A.; Feldman, R.; Kashi, R.: ¬A new and versatile method for association generation (1997) 0.01
    0.0063552503 = product of:
      0.025421001 = sum of:
        0.025421001 = product of:
          0.050842002 = sum of:
            0.050842002 = weight(_text_:22 in 1270) [ClassicSimilarity], result of:
              0.050842002 = score(doc=1270,freq=2.0), product of:
                0.1642603 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046906993 = queryNorm
                0.30952093 = fieldWeight in 1270, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1270)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information systems. 22(1997) nos.5/6, S.333-347
  10. Hofstede, A.H.M. ter; Proper, H.A.; Van der Weide, T.P.: Exploiting fact verbalisation in conceptual information modelling (1997) 0.01
    0.0055608437 = product of:
      0.022243375 = sum of:
        0.022243375 = product of:
          0.04448675 = sum of:
            0.04448675 = weight(_text_:22 in 2908) [ClassicSimilarity], result of:
              0.04448675 = score(doc=2908,freq=2.0), product of:
                0.1642603 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046906993 = queryNorm
                0.2708308 = fieldWeight in 2908, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2908)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information systems. 22(1997) nos.5/6, S.349-385
  11. Liu, W.; Weichselbraun, A.; Scharl, A.; Chang, E.: Semi-automatic ontology extension using spreading activation (2005) 0.01
    0.00515191 = product of:
      0.02060764 = sum of:
        0.02060764 = product of:
          0.04121528 = sum of:
            0.04121528 = weight(_text_:management in 3028) [ClassicSimilarity], result of:
              0.04121528 = score(doc=3028,freq=2.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.2606825 = fieldWeight in 3028, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3028)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Journal of universal knowledge management. 0(2005) no.1, S.50-58
  12. Wu, K.J.; Chen, M.-C.; Sun, Y.: Automatic topics discovery from hyperlinked documents (2004) 0.00
    0.004415923 = product of:
      0.017663691 = sum of:
        0.017663691 = product of:
          0.035327382 = sum of:
            0.035327382 = weight(_text_:management in 2563) [ClassicSimilarity], result of:
              0.035327382 = score(doc=2563,freq=2.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.22344214 = fieldWeight in 2563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2563)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 40(2004) no.2, S.239-255
  13. Chen, H.; Chau, M.: Web mining : machine learning for Web applications (2003) 0.00
    0.004415923 = product of:
      0.017663691 = sum of:
        0.017663691 = product of:
          0.035327382 = sum of:
            0.035327382 = weight(_text_:management in 4242) [ClassicSimilarity], result of:
              0.035327382 = score(doc=4242,freq=2.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.22344214 = fieldWeight in 4242, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4242)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    With more than two billion pages created by millions of Web page authors and organizations, the World Wide Web is a tremendously rich knowledge base. The knowledge comes not only from the content of the pages themselves, but also from the unique characteristics of the Web, such as its hyperlink structure and its diversity of content and languages. Analysis of these characteristics often reveals interesting patterns and new knowledge. Such knowledge can be used to improve users' efficiency and effectiveness in searching for information on the Web, and also for applications unrelated to the Web, such as support for decision making or business management. The Web's size and its unstructured and dynamic content, as well as its multilingual nature, make the extraction of useful knowledge a challenging research problem. Furthermore, the Web generates a large amount of data in other formats that contain valuable information. For example, Web server logs' information about user access patterns can be used for information personalization or improving Web page design.
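    As one concrete instance of mining the Web's hyperlink structure, a power-iteration PageRank over a toy link graph; PageRank is a standard example of the genre, not an algorithm this survey specifically develops:

    ```python
    # Toy hyperlink graph: page -> pages it links to (every page has outlinks).
    links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    nodes = sorted(links)
    damping = 0.85
    rank = {n: 1.0 / len(nodes) for n in nodes}

    for _ in range(50):  # power iteration until (approximate) convergence
        new_rank = {n: (1 - damping) / len(nodes) for n in nodes}
        for src, outs in links.items():
            share = damping * rank[src] / len(outs)
            for dst in outs:
                new_rank[dst] += share
        rank = new_rank

    print({n: round(r, 3) for n, r in rank.items()})  # "c" collects most rank
    ```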
  14. Pons-Porrata, A.; Berlanga-Llavori, R.; Ruiz-Shulcloper, J.: Topic discovery based on text mining techniques (2007) 0.00
    0.004415923 = product of:
      0.017663691 = sum of:
        0.017663691 = product of:
          0.035327382 = sum of:
            0.035327382 = weight(_text_:management in 916) [ClassicSimilarity], result of:
              0.035327382 = score(doc=916,freq=2.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.22344214 = fieldWeight in 916, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046875 = fieldNorm(doc=916)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 43(2007) no.3, S.752-768
  15. Sánchez, D.; Chamorro-Martínez, J.; Vila, M.A.: Modelling subjectivity in visual perception of orientation for image retrieval (2003) 0.00
    0.004415923 = product of:
      0.017663691 = sum of:
        0.017663691 = product of:
          0.035327382 = sum of:
            0.035327382 = weight(_text_:management in 1067) [ClassicSimilarity], result of:
              0.035327382 = score(doc=1067,freq=2.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.22344214 = fieldWeight in 1067, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1067)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 39(2003) no.2, S.251-266
  16. Berendt, B.; Krause, B.; Kolbe-Nusser, S.: Intelligent scientific authoring tools : interactive data mining for constructive uses of citation networks (2010) 0.00
    0.004415923 = product of:
      0.017663691 = sum of:
        0.017663691 = product of:
          0.035327382 = sum of:
            0.035327382 = weight(_text_:management in 4226) [ClassicSimilarity], result of:
              0.035327382 = score(doc=4226,freq=2.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.22344214 = fieldWeight in 4226, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4226)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 46(2010) no.1, S.1-10
  17. Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.00
    0.003972031 = product of:
      0.015888125 = sum of:
        0.015888125 = product of:
          0.03177625 = sum of:
            0.03177625 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
              0.03177625 = score(doc=668,freq=2.0), product of:
                0.1642603 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046906993 = queryNorm
                0.19345059 = fieldWeight in 668, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=668)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    22. 3.2013 19:43:01
  18. Fonseca, F.; Marcinkowski, M.; Davis, C.: Cyber-human systems of thought and understanding (2019) 0.00
    0.003972031 = product of:
      0.015888125 = sum of:
        0.015888125 = product of:
          0.03177625 = sum of:
            0.03177625 = weight(_text_:22 in 5011) [ClassicSimilarity], result of:
              0.03177625 = score(doc=5011,freq=2.0), product of:
                0.1642603 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046906993 = queryNorm
                0.19345059 = fieldWeight in 5011, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5011)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    7. 3.2019 16:32:22
  19. Hereth, J.; Stumme, G.; Wille, R.; Wille, U.: Conceptual knowledge discovery and data analysis (2000) 0.00
    0.0036799356 = product of:
      0.014719742 = sum of:
        0.014719742 = product of:
          0.029439485 = sum of:
            0.029439485 = weight(_text_:management in 5083) [ClassicSimilarity], result of:
              0.029439485 = score(doc=5083,freq=2.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.18620178 = fieldWeight in 5083, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5083)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    In this paper, we discuss Conceptual Knowledge Discovery in Databases (CKDD) in its connection with Data Analysis. Our approach is based on Formal Concept Analysis, a mathematical theory which has been developed and proven useful during the last 20 years. Formal Concept Analysis has led to a theory of conceptual information systems which has been applied by using the management system TOSCANA in a wide range of domains. In this paper, we use such an application in database marketing to demonstrate how methods and procedures of CKDD can be applied in Data Analysis. In particular, we show the interplay and integration of data mining and data analysis techniques based on Formal Concept Analysis. The main concern of this paper is to explain how the transition from data to knowledge can be supported by a TOSCANA system. To clarify the transition steps we discuss their correspondence to the five levels of knowledge representation established by R. Brachman and to the steps of empirically grounded theory building proposed by A. Strauss and J. Corbin
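    A minimal sketch of the basic object of Formal Concept Analysis: enumerating the formal concepts (closed extent/intent pairs) of a tiny binary context. The context is invented, and the TOSCANA system itself is not modeled:

    ```python
    from itertools import combinations

    # Invented formal context: objects and the attributes they have.
    objects = {"doc1": {"mining", "web"}, "doc2": {"mining"}, "doc3": {"web", "ir"}}
    attributes = {"mining", "web", "ir"}

    def extent(attr_set):
        """All objects having every attribute in attr_set."""
        return {o for o, attrs in objects.items() if attr_set <= attrs}

    def intent(obj_set):
        """All attributes shared by every object in obj_set."""
        return set.intersection(*(objects[o] for o in obj_set)) if obj_set else set(attributes)

    concepts = set()
    for r in range(len(attributes) + 1):
        for combo in combinations(sorted(attributes), r):
            b = intent(extent(set(combo)))  # closure of the attribute set
            concepts.add((frozenset(extent(b)), frozenset(b)))

    for ext, intn in sorted(concepts, key=lambda c: len(c[0])):
        print(sorted(ext), "<->", sorted(intn))
    ```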
  20. Liu, Y.; Huang, X.; An, A.: Personalized recommendation with adaptive mixture of markov models (2007) 0.00
    0.0036799356 = product of:
      0.014719742 = sum of:
        0.014719742 = product of:
          0.029439485 = sum of:
            0.029439485 = weight(_text_:management in 606) [ClassicSimilarity], result of:
              0.029439485 = score(doc=606,freq=2.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.18620178 = fieldWeight in 606, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=606)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    With more and more information available on the Internet, the task of making personalized recommendations to assist the user's navigation has become increasingly important. Considering there might be millions of users with different backgrounds accessing a Web site every day, it is infeasible to build a separate recommendation system for each user. To address this problem, clustering techniques can first be employed to discover user groups. Then, user navigation patterns for each group can be discovered, to allow the adaptation of a Web site to the interest of each individual group. In this paper, we propose to model user access sequences as stochastic processes, and an approach based on a mixture of Markov models is taken to cluster users and to capture the sequential relationships inherent in user access histories. Several important issues that arise in constructing the Markov models are also addressed. The first issue lies in the complexity of the mixture of Markov models. To improve the efficiency of building/maintaining the mixture of Markov models, we develop a lightweight adaptive algorithm to update the model parameters without recomputing model parameters from scratch. The second issue concerns the proper selection of training data for building the mixture of Markov models. We investigate two different training data selection strategies and perform extensive experiments to compare their effectiveness on a real dataset that is generated by a Web-based knowledge management system, Livelink.
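    A minimal sketch of the building block involved: estimating a single first-order Markov model from hypothetical page-visit sessions. The paper fits an adaptive mixture of such models with clustering, which this sketch does not attempt:

    ```python
    from collections import defaultdict

    # Hypothetical user sessions (sequences of visited pages).
    sessions = [
        ["home", "search", "doc", "doc"],
        ["home", "doc", "search", "doc"],
        ["search", "doc", "home"],
    ]

    counts = defaultdict(lambda: defaultdict(int))
    for s in sessions:
        for prev, cur in zip(s, s[1:]):   # consecutive page pairs
            counts[prev][cur] += 1

    # Normalize counts row-wise into transition probabilities P(cur | prev).
    transitions = {
        prev: {cur: n / sum(nxt.values()) for cur, n in nxt.items()}
        for prev, nxt in counts.items()
    }
    print(transitions["home"])  # {'search': 0.5, 'doc': 0.5}
    ```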