Search (8 results, page 1 of 1)

Liu, B.: Web data mining : exploring hyperlinks, contents, and usage data (2011) 0.05
```
0.04718969 = product of:
  0.09437938 = sum of:
    0.07742243 = weight(_text_:data in 354) [ClassicSimilarity], result of:
      0.07742243 = score(doc=354,freq=28.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.52287495 = fieldWeight in 354, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03125 = fieldNorm(doc=354)
    0.016956951 = product of:
      0.033913903 = sum of:
        0.033913903 = weight(_text_:processing in 354) [ClassicSimilarity], result of:
          0.033913903 = score(doc=354,freq=2.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.17890452 = fieldWeight in 354, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.03125 = fieldNorm(doc=354)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
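The per-term figures in the explanation above can be checked by hand. The following is a minimal sketch, not the catalog's own code: it assumes the classic TF-IDF formulas `tf = sqrt(freq)` and `idf = 1 + ln(maxDocs / (docFreq + 1))`, which reproduce the values shown for the term "data" in document 354. The function name `classic_term_score` is hypothetical.

```python
import math

def classic_term_score(freq, doc_freq, max_docs, field_norm, query_norm):
    """Recompute one weight(...) node of the explanation (assumed formulas)."""
    tf = math.sqrt(freq)                              # 5.2915025 for freq=28
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # 3.1620505 for docFreq=5088
    query_weight = idf * query_norm                   # queryWeight node
    field_weight = tf * idf * field_norm              # fieldWeight node
    return query_weight * field_weight                # score node

# Inputs copied from the explanation for "data" in doc 354.
score = classic_term_score(
    freq=28.0, doc_freq=5088, max_docs=44218,
    field_norm=0.03125, query_norm=0.046827413,
)
print(f"{score:.8f}")
```

The result agrees with the listed 0.07742243 up to single-precision rounding, since the explanation was computed with 32-bit floats and this sketch uses Python's 64-bit floats.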
Abstract

Web mining aims to discover useful information and knowledge from the Web hyperlink structure, page contents, and usage data. Although Web mining uses many conventional data mining techniques, it is not purely an application of traditional data mining due to the semistructured and unstructured nature of the Web data and its heterogeneity. It has also developed many of its own algorithms and techniques. Liu has written a comprehensive text on Web data mining. Key topics of structure mining, content mining, and usage mining are covered both in breadth and in depth. His book brings together all the essential concepts and algorithms from related areas such as data mining, machine learning, and text processing to form an authoritative and coherent text. The book offers a rich blend of theory and practice, addressing seminal research ideas, as well as examining the technology from a practical point of view. It is suitable for students, researchers and practitioners interested in Web mining both as a learning text and a reference book. Lecturers can readily use it for classes on data mining, Web mining, and Web search. Additional teaching materials such as lecture slides, datasets, and implemented algorithms are available online.

Content

Contents: 1. Introduction 2. Association Rules and Sequential Patterns 3. Supervised Learning 4. Unsupervised Learning 5. Partially Supervised Learning 6. Information Retrieval and Web Search 7. Social Network Analysis 8. Web Crawling 9. Structured Data Extraction: Wrapper Generation 10. Information Integration

RSWK

World Wide Web / Data Mining

Series

Data-centric systems and applications

Subject

World Wide Web / Data Mining

Theme

Data Mining

Pang, B.; Lee, L.: Opinion mining and sentiment analysis (2008) 0.03

```
0.03268239 = product of:
  0.06536478 = sum of:
    0.04138403 = weight(_text_:data in 1171) [ClassicSimilarity], result of:
      0.04138403 = score(doc=1171,freq=8.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.2794884 = fieldWeight in 1171, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03125 = fieldNorm(doc=1171)
    0.02398075 = product of:
      0.0479615 = sum of:
        0.0479615 = weight(_text_:processing in 1171) [ClassicSimilarity], result of:
          0.0479615 = score(doc=1171,freq=4.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.2530092 = fieldWeight in 1171, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.03125 = fieldNorm(doc=1171)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```

LCSH: Text processing (Computer science)
RSWK: World Wide Web / Meinungsäußerung / Data Mining
Data Mining / Psycholinguistik (BVB)
Subject: World Wide Web / Meinungsäußerung / Data Mining
Data Mining / Psycholinguistik (BVB)
Text processing (Computer science)

Kantardzic, M.: Data mining : concepts, models, methods, and algorithms (2003) 0.02
```
0.022548601 = product of:
  0.090194404 = sum of:
    0.090194404 = weight(_text_:data in 2291) [ClassicSimilarity], result of:
      0.090194404 = score(doc=2291,freq=38.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.60913086 = fieldWeight in 2291, product of:
          6.164414 = tf(freq=38.0), with freq of:
            38.0 = termFreq=38.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03125 = fieldNorm(doc=2291)
  0.25 = coord(1/4)
```
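The top-level numbers in these explanations combine the per-term scores with a coordination factor. The sketch below illustrates that arithmetic under the assumption that each "product of / sum of" node is the sum of its matching clause scores multiplied by `coord(matched/total)`; the helper names `coord` and `combine` are hypothetical, and the clause scores are copied from the explanations for documents 2291 and 354.

```python
import math

def coord(matched, total):
    """Fraction of query clauses that matched, e.g. coord(1/4) = 0.25."""
    return matched / total

def combine(clause_scores, total_clauses):
    """Sum the matching clause scores and scale by the coord factor."""
    return sum(clause_scores) * coord(len(clause_scores), total_clauses)

# Doc 2291 (Kantardzic): only "data" matched, 1 of 4 clauses.
kantardzic = combine([0.090194404], total_clauses=4)

# Doc 354 (Liu): "data" matched directly; "processing" sits in a nested
# two-clause group of which one clause matched, giving coord(1/2).
nested = combine([0.033913903], total_clauses=2)
liu = combine([0.07742243, nested], total_clauses=4)

print(f"{kantardzic:.9f} {liu:.8f}")
```

This accounts for why the Liu record outranks the Kantardzic record even though its "data" score is lower: matching two of the four clauses doubles the coordination factor.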
Abstract

This book offers a comprehensive introduction to the exploding field of data mining. We are surrounded by data, numerical and otherwise, which must be analyzed and processed to convert it into information that informs, instructs, answers, or otherwise aids understanding and decision-making. Due to the ever-increasing complexity and size of today's data sets, a new term, data mining, was created to describe the indirect, automatic data analysis techniques that utilize more complex and sophisticated tools than those which analysts used in the past to do mere data analysis. "Data Mining: Concepts, Models, Methods, and Algorithms" discusses data mining principles and then describes representative state-of-the-art methods and algorithms originating from different disciplines such as statistics, machine learning, neural networks, fuzzy logic, and evolutionary computation. Detailed algorithms are provided with necessary explanations and illustrative examples. This text offers guidance: how and when to use a particular software tool (with their companion data sets) from among the hundreds offered when faced with a data set to mine. This allows analysts to create and perform their own data mining experiments using their knowledge of the methodologies and techniques provided. This book emphasizes the selection of appropriate methodologies and data analysis software, as well as parameter tuning. These critically important, qualitative decisions can only be made with the deeper understanding of parameter meaning and its role in the technique that is offered here. Data mining is an exploding field and this book offers much-needed guidance to selecting among the numerous analysis programs that are available.

LCSH

Data mining

RSWK

Data Mining / Lehrbuch

Subject

Data Mining / Lehrbuch
Data mining

Theme

Data Mining

Mining text data (2012) 0.02
```
0.017156914 = product of:
  0.068627656 = sum of:
    0.068627656 = weight(_text_:data in 362) [ClassicSimilarity], result of:
      0.068627656 = score(doc=362,freq=22.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.46347913 = fieldWeight in 362, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03125 = fieldNorm(doc=362)
  0.25 = coord(1/4)
```
Abstract

Text mining applications have experienced tremendous advances because of web 2.0 and social networking applications. Recent advances in hardware and software technology have led to a number of unique scenarios where text mining algorithms are learned. Mining Text Data introduces an important niche in the text analytics field, and is an edited volume contributed by leading international researchers and practitioners focused on social networks & data mining. This book contains a wide swath of topics across social networks & data mining. Each chapter contains a comprehensive survey including the key research content on the topic, and the future directions of research in the field. There is a special focus on Text Embedded with Heterogeneous and Multimedia Data which makes the mining process much more challenging. A number of methods have been designed such as transfer learning and cross-lingual mining for such cases. Mining Text Data simplifies the content, so that advanced-level students, practitioners and researchers in computer science can benefit from this book. Academic and corporate libraries, as well as ACM, IEEE, and Management Science focused on information security, electronic commerce, databases, data mining, machine learning, and statistics are the primary buyers for this reference book.

Content

Contents: An Introduction to Text Mining.- Information Extraction from Text.- A Survey of Text Summarization Techniques.- A Survey of Text Clustering Algorithms.- Dimensionality Reduction and Topic Modeling.- A Survey of Text Classification Algorithms.- Transfer Learning for Text Mining.- Probabilistic Models for Text Mining.- Mining Text Streams.- Translingual Mining from Text Data.- Text Mining in Multimedia.- Text Analytics in Social Media.- A Survey of Opinion Mining and Sentiment Analysis.- Biomedical Text Mining: A Survey of Recent Progress.- Index.

LCSH

Data mining

Subject

Data mining

Theme

Data Mining

Geiselberger, H. et al. [Eds.]: Big Data : das neue Versprechen der Allwissenheit (2013) 0.01

```
0.013439858 = product of:
  0.053759433 = sum of:
    0.053759433 = weight(_text_:data in 2484) [ClassicSimilarity], result of:
      0.053759433 = score(doc=2484,freq=6.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.3630661 = fieldWeight in 2484, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=2484)
  0.25 = coord(1/4)
```

Abstract: The term Big Data made its breakthrough this year at the latest, the year of surveillance; with this edited volume from Suhrkamp Verlag, everyone can now get a clear view of data. ... Experts from theory and practice sum up their experiences and opinions briefly and precisely in the Suhrkamp volume, offering a good overview of a topic that is only just getting started.

O'Neil, C.: Angriff der Algorithmen : wie sie Wahlen manipulieren, Berufschancen zerstören und unsere Gesundheit gefährden (2017) 0.01
```
0.011087317 = product of:
  0.044349268 = sum of:
    0.044349268 = weight(_text_:data in 4060) [ClassicSimilarity], result of:
      0.044349268 = score(doc=4060,freq=12.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.29951423 = fieldWeight in 4060, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.02734375 = fieldNorm(doc=4060)
  0.25 = coord(1/4)
```
Abstract

Algorithms influence our lives: they determine, for example, whether one gets a loan for a house and how much one pays for health insurance. Cathy O'Neil, a former hedge fund manager and now a Big Data whistleblower, explains how algorithms enable objective decisions in theory but serve powerful interests in real life. Algorithms influence politics, endanger free elections, and even manipulate democracy through social networks. Cathy O'Neil's urgent appeal shows how they amplify discrimination and inequality and thus become weapons that shake the foundations of our society.

Footnote

Original title: Weapons of math destruction : how Big Data increases inequality and threatens democracy. See also the review article: Krüger, J.: Wie der Mensch die Kontrolle über den Algorithmus behalten kann. [19.01.2018]. In: https://netzpolitik.org/2018/algorithmen-regulierung-im-kontext-aktueller-gesetzgebung/.

LCSH

Big data / Social aspects / United States
Big data / Political aspects / United States

Subject

Big data / Social aspects / United States
Big data / Political aspects / United States

Bergman, O.; Whittaker, S.: The science of managing our digital stuff (2016) 0.01
```
0.0103460075 = product of:
  0.04138403 = sum of:
    0.04138403 = weight(_text_:data in 3971) [ClassicSimilarity], result of:
      0.04138403 = score(doc=3971,freq=8.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.2794884 = fieldWeight in 3971, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03125 = fieldNorm(doc=3971)
  0.25 = coord(1/4)
```
Abstract

Why we organize our personal digital data the way we do and how design of new PIM systems can help us manage our information more efficiently. Each of us has an ever-growing collection of personal digital data: documents, photographs, PowerPoint presentations, videos, music, emails and texts sent and received. To access any of this, we have to find it. The ease (or difficulty) of finding something depends on how we organize our digital stuff. In this book, personal information management (PIM) experts Ofer Bergman and Steve Whittaker explain why we organize our personal digital data the way we do and how the design of new PIM systems can help us manage our collections more efficiently.

Content

Bergman and Whittaker report that many of us use hierarchical folders for our personal digital organizing. Critics of this method point out that information is hidden from sight in folders that are often within other folders so that we have to remember the exact location of information to access it. Because of this, information scientists suggest other methods: search, more flexible than navigating folders; tags, which allow multiple categorizations; and group information management. Yet Bergman and Whittaker have found in their pioneering PIM research that these other methods that work best for public information management don't work as well for personal information management. Bergman and Whittaker describe personal information collection as curation: we preserve and organize this data to ensure our future access to it. Unlike other information management fields, in PIM the same user organizes and retrieves the information. After explaining the cognitive and psychological reasons that so many prefer folders, Bergman and Whittaker propose the user-subjective approach to PIM, which does not replace folder hierarchies but exploits these unique characteristics of PIM.

Multi-source, multilingual information extraction and summarization (2013) 0.01

```
0.005299047 = product of:
  0.021196188 = sum of:
    0.021196188 = product of:
      0.042392377 = sum of:
        0.042392377 = weight(_text_:processing in 978) [ClassicSimilarity], result of:
          0.042392377 = score(doc=978,freq=2.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.22363065 = fieldWeight in 978, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0390625 = fieldNorm(doc=978)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Series: Theory and applications of natural language processing
