Search (37 results, page 2 of 2)

Liu, B.: Web data mining : exploring hyperlinks, contents, and usage data (2011) 0.01
```
0.012652363 = product of:
  0.037957087 = sum of:
    0.037957087 = weight(_text_:search in 354) [ClassicSimilarity], result of:
      0.037957087 = score(doc=354,freq=4.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.21722981 = fieldWeight in 354, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.03125 = fieldNorm(doc=354)
  0.33333334 = coord(1/3)
```
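The explain tree above can be recombined by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity composition (queryWeight = idf × queryNorm, fieldWeight = tf × idf × fieldNorm, tf = sqrt(freq)); the function name `classic_score` is hypothetical and not part of any library.

```python
import math

def classic_score(freq, idf, query_norm, field_norm, coord):
    """Recombine the factors listed in a ClassicSimilarity explain tree."""
    tf = math.sqrt(freq)                  # tf(freq=4.0) is 2.0 in the tree above
    query_weight = idf * query_norm       # the "queryWeight" node
    field_weight = tf * idf * field_norm  # the "fieldWeight" node
    return query_weight * field_weight * coord

# Values copied from the explain output for doc 354.
score_354 = classic_score(
    freq=4.0,
    idf=3.475677,
    query_norm=0.05027291,
    field_norm=0.03125,
    coord=1 / 3,
)
print(f"{score_354:.6f}")
```

The result agrees with the listed top-level value 0.012652363 up to floating-point rounding, since the catalog reports single-precision numbers.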
Abstract

Web mining aims to discover useful information and knowledge from the Web hyperlink structure, page contents, and usage data. Although Web mining uses many conventional data mining techniques, it is not purely an application of traditional data mining due to the semistructured and unstructured nature of the Web data and its heterogeneity. It has also developed many of its own algorithms and techniques. Liu has written a comprehensive text on Web data mining. Key topics of structure mining, content mining, and usage mining are covered both in breadth and in depth. His book brings together all the essential concepts and algorithms from related areas such as data mining, machine learning, and text processing to form an authoritative and coherent text. The book offers a rich blend of theory and practice, addressing seminal research ideas, as well as examining the technology from a practical point of view. It is suitable for students, researchers and practitioners interested in Web mining both as a learning text and a reference book. Lecturers can readily use it for classes on data mining, Web mining, and Web search. Additional teaching materials such as lecture slides, datasets, and implemented algorithms are available online.

Content

Contents: 1. Introduction 2. Association Rules and Sequential Patterns 3. Supervised Learning 4. Unsupervised Learning 5. Partially Supervised Learning 6. Information Retrieval and Web Search 7. Social Network Analysis 8. Web Crawling 9. Structured Data Extraction: Wrapper Generation 10. Information Integration
Survey of text mining : clustering, classification, and retrieval (2004) 0.01
```
0.011183213 = product of:
  0.03354964 = sum of:
    0.03354964 = weight(_text_:search in 804) [ClassicSimilarity], result of:
      0.03354964 = score(doc=804,freq=2.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.19200584 = fieldWeight in 804, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.0390625 = fieldNorm(doc=804)
  0.33333334 = coord(1/3)
```
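The `idf` and `tf` leaves in these trees are derived from raw counts. As a sketch, assuming the classic Lucene formulas idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq), the values shown in this listing can be recomputed as follows; `classic_idf` and `classic_tf` are illustrative names.

```python
import math

def classic_idf(doc_freq, max_docs):
    """Inverse document frequency as used by ClassicSimilarity (assumed formula)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    """Term-frequency factor: square root of the raw term count."""
    return math.sqrt(freq)

# Counts copied from the explain output in this listing.
idf_search = classic_idf(doc_freq=3718, max_docs=44218)  # term "search"
idf_22 = classic_idf(doc_freq=3622, max_docs=44218)      # term "22"
tf_two = classic_tf(2.0)

print(idf_search, idf_22, tf_two)
```

The rarer term ("22", found in 3622 documents) gets a slightly higher idf than "search" (3718 documents), which is why its queryWeight is marginally larger.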
Abstract

Extracting content from text continues to be an important research problem for information processing and management. Approaches to capture the semantics of text-based document collections may be based on Bayesian models, probability theory, vector space models, statistical models, or even graph theory. As the volume of digitized textual media continues to grow, so does the need for designing robust, scalable indexing and search strategies (software) to meet a variety of user needs. Knowledge extraction or creation from text requires systematic yet reliable processing that can be codified and adapted for changing needs and environments. This book will draw upon experts in both academia and industry to recommend practical approaches to the purification, indexing, and mining of textual information. It will address document identification, clustering and categorizing documents, cleaning text, and visualizing semantic models of text.
Chen, C.-C.; Chen, A.-P.: Using data mining technology to provide a recommendation service in the digital library (2007) 0.01
```
0.011183213 = product of:
  0.03354964 = sum of:
    0.03354964 = weight(_text_:search in 2533) [ClassicSimilarity], result of:
      0.03354964 = score(doc=2533,freq=2.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.19200584 = fieldWeight in 2533, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2533)
  0.33333334 = coord(1/3)
```
Abstract

Purpose - Since library storage has been increasing day by day, it is difficult for readers to find the books which interest them as well as representative booklists. How to utilize meaningful information effectively to improve the service quality of the digital library appears to be very important. The purpose of this paper is to provide a recommendation system architecture to promote digital library services in electronic libraries. Design/methodology/approach - In the proposed architecture, a two-phase data mining process used by association rule and clustering methods is designed to generate a recommendation system. The process considers not only the relationship of a cluster of users but also the associations among the information accessed. Findings - The process considered not only the relationship of a cluster of users but also the associations among the information accessed. With the advanced filter, the recommendation supported by the proposed system architecture would be closely served to meet users' needs. Originality/value - This paper not only constructs a recommendation service for readers to search books from the web but takes the initiative in finding the most suitable books for readers as well. Furthermore, library managers are expected to purchase core and hot books from a limited budget to maintain and satisfy the requirements of readers along with promoting digital library services.
Tonkin, E.L.; Tourte, G.J.L.: Working with text. tools, techniques and approaches for text mining (2016) 0.01
```
0.011183213 = product of:
  0.03354964 = sum of:
    0.03354964 = weight(_text_:search in 4019) [ClassicSimilarity], result of:
      0.03354964 = score(doc=4019,freq=2.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.19200584 = fieldWeight in 4019, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4019)
  0.33333334 = coord(1/3)
```
Abstract

What is text mining, and how can it be used? What relevance do these methods have to everyday work in information science and the digital humanities? How does one develop competences in text mining? Working with Text provides a series of cross-disciplinary perspectives on text mining and its applications. As text mining raises legal and ethical issues, the legal background of text mining and the responsibilities of the engineer are discussed in this book. Chapters provide an introduction to the use of the popular GATE text mining package with data drawn from social media, the use of text mining to support semantic search, the development of an authority system to support content tagging, and recent techniques in automatic language evaluation. Focused studies describe text mining on historical texts, automated indexing using constrained vocabularies, and the use of natural language processing to explore the climate science literature. Interviews are included that offer a glimpse into the real-life experience of working within commercial and academic text mining.

Lusti, M.: Data Warehousing and Data Mining : Eine Einführung in entscheidungsunterstützende Systeme (1999) 0.01

```
0.009081715 = product of:
  0.027245143 = sum of:
    0.027245143 = product of:
      0.054490287 = sum of:
        0.054490287 = weight(_text_:22 in 4261) [ClassicSimilarity], result of:
          0.054490287 = score(doc=4261,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.30952093 = fieldWeight in 4261, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=4261)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
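This tree nests two coord factors, coord(1/2) inside coord(1/3), which most likely means the term matched one of two clauses in an inner Boolean query that is itself one of three outer clauses. A minimal sketch of how the factors multiply through, assuming the same composition as in the other hits; `nested_score` and the constants are illustrative names, with values copied from the explain output.

```python
import math

IDF_22 = 3.5018296       # idf(docFreq=3622, maxDocs=44218)
QUERY_NORM = 0.05027291  # queryNorm shared by all hits of this query

def nested_score(freq, field_norm, coords):
    """Multiply the term weight by every coord factor on the path to the root."""
    query_weight = IDF_22 * QUERY_NORM
    field_weight = math.sqrt(freq) * IDF_22 * field_norm
    score = query_weight * field_weight
    for coord in coords:
        score *= coord
    return score

# doc 4261: fieldNorm 0.0625, coord(1/2) then coord(1/3)
score_4261 = nested_score(2.0, 0.0625, [1 / 2, 1 / 3])
# Same term statistics with fieldNorm 0.03125, as reported for doc 1507
score_half_norm = nested_score(2.0, 0.03125, [1 / 2, 1 / 3])

print(score_4261, score_half_norm)
```

Because idf, queryNorm, freq, and the coord factors are identical for all hits on the term "22" in this listing, their ranking is decided by fieldNorm alone: halving fieldNorm halves the score.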

Date: 17. 7.2002 19:22:06

Amir, A.; Feldman, R.; Kashi, R.: A new and versatile method for association generation (1997) 0.01

```
0.009081715 = product of:
  0.027245143 = sum of:
    0.027245143 = product of:
      0.054490287 = sum of:
        0.054490287 = weight(_text_:22 in 1270) [ClassicSimilarity], result of:
          0.054490287 = score(doc=1270,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.30952093 = fieldWeight in 1270, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1270)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Source: Information systems. 22(1997) nos. 5/6, pp. 333-347

Chakrabarti, S.: Mining the Web : discovering knowledge from hypertext data (2003) 0.01
```
0.0089465715 = product of:
  0.026839713 = sum of:
    0.026839713 = weight(_text_:search in 2222) [ClassicSimilarity], result of:
      0.026839713 = score(doc=2222,freq=2.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.15360467 = fieldWeight in 2222, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.03125 = fieldNorm(doc=2222)
  0.33333334 = coord(1/3)
```
Footnote

Part I, Infrastructure, has two chapters: Chapter 2 on crawling the Web and Chapter 3 on Web search and information retrieval. The second part of the book, containing chapters 4, 5, and 6, is the centerpiece. This part focuses specifically on machine learning in the context of hypertext. Part III is a collection of applications that utilize the techniques described in earlier chapters. Chapter 7 is on social network analysis. Chapter 8 is on resource discovery. Chapter 9 is on the future of Web mining. Overall, this is a valuable reference book for researchers and developers in the field of Web mining. It should be particularly useful for those who would like to design and probably code their own computer programs out of the equations and pseudocode on most of the pages. For a student, the most valuable feature of the book is perhaps the formal and consistent treatment of concepts across the board. For what is behind and beyond the technical details, one has to either dig deeper into the bibliographic notes at the end of each chapter, or resort to more in-depth analysis of relevant subjects in the literature. If you are looking for success stories about Web mining or lessons learned the hard way from failures, this is not the book."

Hofstede, A.H.M. ter; Proper, H.A.; Van der Weide, T.P.: Exploiting fact verbalisation in conceptual information modelling (1997) 0.01

```
0.007946501 = product of:
  0.0238395 = sum of:
    0.0238395 = product of:
      0.047679 = sum of:
        0.047679 = weight(_text_:22 in 2908) [ClassicSimilarity], result of:
          0.047679 = score(doc=2908,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.2708308 = fieldWeight in 2908, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2908)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Source: Information systems. 22(1997) nos. 5/6, pp. 349-385

Lackes, R.; Tillmanns, C.: Data Mining für die Unternehmenspraxis : Entscheidungshilfen und Fallstudien mit führenden Softwarelösungen (2006) 0.01

```
0.0068112854 = product of:
  0.020433856 = sum of:
    0.020433856 = product of:
      0.040867712 = sum of:
        0.040867712 = weight(_text_:22 in 1383) [ClassicSimilarity], result of:
          0.040867712 = score(doc=1383,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.23214069 = fieldWeight in 1383, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1383)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 22. 3.2008 14:46:06

Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.01

```
0.0056760716 = product of:
  0.017028214 = sum of:
    0.017028214 = product of:
      0.03405643 = sum of:
        0.03405643 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
          0.03405643 = score(doc=668,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.19345059 = fieldWeight in 668, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=668)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 22. 3.2013 19:43:01

Fonseca, F.; Marcinkowski, M.; Davis, C.: Cyber-human systems of thought and understanding (2019) 0.01

```
0.0056760716 = product of:
  0.017028214 = sum of:
    0.017028214 = product of:
      0.03405643 = sum of:
        0.03405643 = weight(_text_:22 in 5011) [ClassicSimilarity], result of:
          0.03405643 = score(doc=5011,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.19345059 = fieldWeight in 5011, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5011)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 7. 3.2019 16:32:22

Peters, G.; Gaese, V.: Das DocCat-System in der Textdokumentation von G+J (2003) 0.00

```
0.0045408574 = product of:
  0.013622572 = sum of:
    0.013622572 = product of:
      0.027245143 = sum of:
        0.027245143 = weight(_text_:22 in 1507) [ClassicSimilarity], result of:
          0.027245143 = score(doc=1507,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.15476047 = fieldWeight in 1507, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=1507)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 22. 4.2003 11:45:36

Hölzig, C.: Google spürt Grippewellen auf : Die neue Anwendung ist bisher auf die USA beschränkt (2008) 0.00

```
0.0045408574 = product of:
  0.013622572 = sum of:
    0.013622572 = product of:
      0.027245143 = sum of:
        0.027245143 = weight(_text_:22 in 2403) [ClassicSimilarity], result of:
          0.027245143 = score(doc=2403,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.15476047 = fieldWeight in 2403, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=2403)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 3. 5.1997 8:44:22

Jäger, L.: Von Big Data zu Big Brother (2018) 0.00

```
0.0045408574 = product of:
  0.013622572 = sum of:
    0.013622572 = product of:
      0.027245143 = sum of:
        0.027245143 = weight(_text_:22 in 5234) [ClassicSimilarity], result of:
          0.027245143 = score(doc=5234,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.15476047 = fieldWeight in 5234, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=5234)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 22. 1.2018 11:33:49

Lischka, K.: Spurensuche im Datenwust : Data-Mining-Software fahndet nach kriminellen Mitarbeitern, guten Kunden - und bald vielleicht auch nach Terroristen (2002) 0.00
```
0.0034056427 = product of:
  0.010216928 = sum of:
    0.010216928 = product of:
      0.020433856 = sum of:
        0.020433856 = weight(_text_:22 in 1178) [ClassicSimilarity], result of:
          0.020433856 = score(doc=1178,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.116070345 = fieldWeight in 1178, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0234375 = fieldNorm(doc=1178)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```
Content

"Whether one plans an attack on the United States as a terrorist, embezzles bills from the till as a cashier, or particularly likes spending money on certain products: data mining software makes no distinction. Such programs analyze huge volumes of data and reach statistical verdicts. With these methods, the researchers of the "Information Awaren in the United States now want to find traces of terrorists in the databases of government agencies and private companies such as credit card firms. The annual budget for the various research projects amounts to 200 million dollars. That such software works in practice is shown by the rising revenues of vendors of so-called customer relationship management software. Last year, the potential for analytical CRM applications grew by 22 percent worldwide according to the market research institute IDC; in Germany it is expected to continue with annual growth of 14.1 percent until 2006. And that despite a weak economy, or precisely because of it. For just as data mining is supposed to help the US government find terrorists, CRM programs today decide which customers are profitable for a company. And which ones will be in the future, as Manuela Schnaubelt, spokeswoman for the CRM vendor SAP, describes: "Customer valuation is a central component of analytical CRM. It enables companies to focus on the customers that are important and right for them. In addition, companies can use special scoring methods to determine which customers will contribute to the company's success in the long term, and to what extent." The consequences of these ratings are not always positive for those affected: attractive customers benefit from individual special offers and special attention. Others may be left waiting in the telephone service queue until the more profitable customers have been dealt with.
This could be a practical implementation of what SAP spokeswoman Schnaubelt describes in abstract terms: "In many companies, customer valuation is carried out with the classic ABC analysis, in which customers are categorized on the basis of data such as revenue. A customers, as particularly important customers, are served differently from C customers." Even closer to the planned use of data mining for hunting terrorists is an application that many companies already use successfully today: they track down fraudulent employees. Werner Sülzer of the large CRM vendor NCR Teradata describes the possibilities as follows: "Today practically every perpetrator, whether employee, customer, or supplier, leaves data traces in the course of white-collar crime. The priority must be to condense individual traces into patterns of behavior and offender profiles. This is achieved by means of central data warehouses and sophisticated search and analysis tools." Companies do not like to talk about concrete successes, that is, dismissals of criminal employees, following the use of such programs. Matthias Wilke of the "Beratungsstelle für Technologiefolgen und Qualifizierung" (BTQ) of the trade union Verdi knows of a case from Switzerland. There the retail chain "Pick Pay" uses the program "Lord Lose Prevention". Two months after its introduction, embezzlement worth about 200,000 francs was reportedly uncovered. That cost more than 50 suspected cashiers their jobs.

Medien-Informationsmanagement : Archivarische, dokumentarische, betriebswirtschaftliche, rechtliche und Berufsbild-Aspekte ; [Frühjahrstagung der Fachgruppe 7 im Jahr 2000 in Weimar und Folgetagung 2001 in Köln] (2003) 0.00

```
0.0034056427 = product of:
  0.010216928 = sum of:
    0.010216928 = product of:
      0.020433856 = sum of:
        0.020433856 = weight(_text_:22 in 1833) [ClassicSimilarity], result of:
          0.020433856 = score(doc=1833,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.116070345 = fieldWeight in 1833, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0234375 = fieldNorm(doc=1833)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 11. 5.2008 19:49:22

Information visualization in data mining and knowledge discovery (2002) 0.00

```
0.0022704287 = product of:
  0.006811286 = sum of:
    0.006811286 = product of:
      0.013622572 = sum of:
        0.013622572 = weight(_text_:22 in 1789) [ClassicSimilarity], result of:
          0.013622572 = score(doc=1789,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.07738023 = fieldWeight in 1789, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.015625 = fieldNorm(doc=1789)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)
```

Date: 23. 3.2008 19:10:22
