This database contains more than 40,000 documents on topics in the fields of descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 28 April 2022)
1. Chakrabarti, S.: Mining the Web : discovering knowledge from hypertext data.
San Francisco, CA : Morgan Kaufmann, 2003. 344 pp.
Note: Review in: JASIST 55(2004) no.3, pp.275-276 (C. Chen): "This is a book about finding significant statistical patterns on the Web - in particular, patterns that are associated with hypertext documents, topics, hyperlinks, and queries. The term pattern in this book refers to dependencies among such items. On the one hand, the Web contains useful information on just about every topic under the sun. On the other hand, just like searching for a needle in a haystack, one would need powerful tools to locate useful information on the vast land of the Web. Soumen Chakrabarti's book focuses on a wide range of techniques for machine learning and data mining on the Web. The goal of the book is to provide both the technical background and tools and tricks of the trade of Web content mining. Much of the technical content reflects the state of the art between 1995 and 2002. The targeted audience is researchers and innovative developers in this area, as well as newcomers who intend to enter this area. The book begins with an introduction chapter. The introduction chapter explains fundamental concepts such as crawling and indexing as well as clustering and classification. The remaining eight chapters are organized into three parts: i) infrastructure, ii) learning and iii) applications. Part I, Infrastructure, has two chapters: Chapter 2 on crawling the Web and Chapter 3 on Web search and information retrieval. The second part of the book, containing chapters 4, 5, and 6, is the centerpiece. This part specifically focuses on machine learning in the context of hypertext. Part III is a collection of applications that utilize the techniques described in earlier chapters. Chapter 7 is on social network analysis. Chapter 8 is on resource discovery. Chapter 9 is on the future of Web mining. Overall, this is a valuable reference book for researchers and developers in the field of Web mining.
It should be particularly useful for those who would like to design and probably code their own computer programs out of the equations and pseudocodes on most of the pages. For a student, the most valuable feature of the book is perhaps the formal and consistent treatments of concepts across the board. For what is behind and beyond the technical details, one has to either dig deeper into the bibliographic notes at the end of each chapter, or resort to more in-depth analysis of relevant subjects in the literature. If you are looking for successful stories about Web mining or hard-way-learned lessons of failures, this is not the book."
Subject area: Internet ; Data Mining
2. Chakrabarti, S. ; Dom, B. ; Kumar, S.R. ; Raghavan, P. ; Rajagopalan, S. ; Tomkins, A. ; Kleinberg, J.M. ; Gibson, D.: Neue Pfade durch den Internet-Dschungel : Die zweite Generation von Web-Suchmaschinen.
In: Spektrum der Wissenschaft. 1999, no.8, pp.44-49.
Abstract: The volume of data available on the WWW is growing at breathtaking speed, and finding relevant information is becoming correspondingly harder. A new analysis method promises an almost automatic remedy.
Contents: Exploiting hyperlinks for improved search and retrieval methods; presentation of the HITS algorithm
Note: See also: http://www.almaden.ibm.com/cs/k53/clever.html
Subject area: Search engines ; Retrieval algorithms
Object: Google ; HITS algorithm