Document (#37355)

Author
Liu, B.
Title
Web data mining : exploring hyperlinks, contents, and usage data
Edition
2nd ed.
Imprint
Heidelberg : Springer
Year
2011
Pages
XX, 622 pp.
Isbn
978-3-642-19459-7
Series
Data-centric systems and applications
Abstract
Web mining aims to discover useful information and knowledge from the Web hyperlink structure, page contents, and usage data. Although Web mining uses many conventional data mining techniques, it is not purely an application of traditional data mining due to the semistructured and unstructured nature of the Web data and its heterogeneity. It has also developed many of its own algorithms and techniques. Liu has written a comprehensive text on Web data mining. Key topics of structure mining, content mining, and usage mining are covered both in breadth and in depth. His book brings together all the essential concepts and algorithms from related areas such as data mining, machine learning, and text processing to form an authoritative and coherent text. The book offers a rich blend of theory and practice, addressing seminal research ideas, as well as examining the technology from a practical point of view. It is suitable for students, researchers and practitioners interested in Web mining both as a learning text and a reference book. Lecturers can readily use it for classes on data mining, Web mining, and Web search. Additional teaching materials such as lecture slides, datasets, and implemented algorithms are available online.
Content
Contents: 1. Introduction 2. Association Rules and Sequential Patterns 3. Supervised Learning 4. Unsupervised Learning 5. Partially Supervised Learning 6. Information Retrieval and Web Search 7. Social Network Analysis 8. Web Crawling 9. Structured Data Extraction: Wrapper Generation 10. Information Integration
Footnote
Electronic edition at: http://springer.r.delivery.net/r/r?2.1.Ee.2Tp.1gd0L5.C3WE8i..N.WdtE.3uq2.bW89MQ%5f%5fCXPUFOH0.
Theme
Data Mining
RSWK
World Wide Web / Data Mining
BK
54.72
06.74
06.70
54.32
DDC
006.312 / DDC22ger
005.7402854678 / DDC22ger
005.72 / DDC22ger
GHBS
TZG (FH K)
TWX (FH GE)
LCC
QA76.9.D343
RVK
ST 530

Similar documents (content)

  1. Kantardzic, M.: Data mining : concepts, models, methods, and algorithms (2003) 0.38
    0.3774624 = sum of:
      0.3774624 = product of:
        1.17957 = sum of:
          0.00894829 = weight(abstract_txt:from in 2291) [ClassicSimilarity], result of:
            0.00894829 = score(doc=2291,freq=2.0), product of:
              0.036629032 = queryWeight, product of:
                1.137374 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.011652056 = queryNorm
              0.24429502 = fieldWeight in 2291, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
          0.026262822 = weight(abstract_txt:techniques in 2291) [ClassicSimilarity], result of:
            0.026262822 = score(doc=2291,freq=2.0), product of:
              0.06559377 = queryWeight, product of:
                1.2427285 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.011652056 = queryNorm
              0.40038592 = fieldWeight in 2291, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
          0.021423742 = weight(abstract_txt:learning in 2291) [ClassicSimilarity], result of:
            0.021423742 = score(doc=2291,freq=1.0), product of:
              0.07215092 = queryWeight, product of:
                1.3033645 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.011652056 = queryNorm
              0.29692957 = fieldWeight in 2291, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
          0.059450354 = weight(abstract_txt:book in 2291) [ClassicSimilarity], result of:
            0.059450354 = score(doc=2291,freq=3.0), product of:
              0.113084905 = queryWeight, product of:
                1.9984483 = boost
                4.856341 = idf(docFreq=934, maxDocs=44218)
                0.011652056 = queryNorm
              0.52571434 = fieldWeight in 2291, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.856341 = idf(docFreq=934, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
          0.026423816 = weight(abstract_txt:text in 2291) [ClassicSimilarity], result of:
            0.026423816 = score(doc=2291,freq=1.0), product of:
              0.104548715 = queryWeight, product of:
                2.2188058 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.011652056 = queryNorm
              0.25274166 = fieldWeight in 2291, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
          0.0965299 = weight(abstract_txt:algorithms in 2291) [ClassicSimilarity], result of:
            0.0965299 = score(doc=2291,freq=3.0), product of:
              0.15622225 = queryWeight, product of:
                2.3488865 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.011652056 = queryNorm
              0.6179011 = fieldWeight in 2291, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
          0.13376138 = weight(abstract_txt:data in 2291) [ClassicSimilarity], result of:
            0.13376138 = score(doc=2291,freq=13.0), product of:
              0.17791301 = queryWeight, product of:
                4.5765038 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.011652056 = queryNorm
              0.7518358 = fieldWeight in 2291, product of:
                3.6055512 = tf(freq=13.0), with freq of:
                  13.0 = termFreq=13.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
          0.8067697 = weight(abstract_txt:mining in 2291) [ClassicSimilarity], result of:
            0.8067697 = score(doc=2291,freq=6.0), product of:
              0.8533496 = queryWeight, product of:
                11.859257 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.011652056 = queryNorm
              0.94541526 = fieldWeight in 2291, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=2291)
        0.32 = coord(8/25)
    
  2. Mining text data (2012) 0.36
    0.36273775 = sum of:
      0.36273775 = product of:
        1.2954919 = sum of:
          0.0063273967 = weight(abstract_txt:from in 362) [ClassicSimilarity], result of:
            0.0063273967 = score(doc=362,freq=1.0), product of:
              0.036629032 = queryWeight, product of:
                1.137374 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.011652056 = queryNorm
              0.17274266 = fieldWeight in 362, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=362)
          0.030297747 = weight(abstract_txt:learning in 362) [ClassicSimilarity], result of:
            0.030297747 = score(doc=362,freq=2.0), product of:
              0.07215092 = queryWeight, product of:
                1.3033645 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.011652056 = queryNorm
              0.41992182 = fieldWeight in 362, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=362)
          0.059450354 = weight(abstract_txt:book in 362) [ClassicSimilarity], result of:
            0.059450354 = score(doc=362,freq=3.0), product of:
              0.113084905 = queryWeight, product of:
                1.9984483 = boost
                4.856341 = idf(docFreq=934, maxDocs=44218)
                0.011652056 = queryNorm
              0.52571434 = fieldWeight in 362, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.856341 = idf(docFreq=934, maxDocs=44218)
                0.0625 = fieldNorm(doc=362)
          0.06472487 = weight(abstract_txt:text in 362) [ClassicSimilarity], result of:
            0.06472487 = score(doc=362,freq=6.0), product of:
              0.104548715 = queryWeight, product of:
                2.2188058 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.011652056 = queryNorm
              0.6190881 = fieldWeight in 362, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=362)
          0.055731565 = weight(abstract_txt:algorithms in 362) [ClassicSimilarity], result of:
            0.055731565 = score(doc=362,freq=1.0), product of:
              0.15622225 = queryWeight, product of:
                2.3488865 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.011652056 = queryNorm
              0.35674536 = fieldWeight in 362, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.0625 = fieldNorm(doc=362)
          0.09087296 = weight(abstract_txt:data in 362) [ClassicSimilarity], result of:
            0.09087296 = score(doc=362,freq=6.0), product of:
              0.17791301 = queryWeight, product of:
                4.5765038 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.011652056 = queryNorm
              0.5107719 = fieldWeight in 362, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=362)
          0.9880871 = weight(abstract_txt:mining in 362) [ClassicSimilarity], result of:
            0.9880871 = score(doc=362,freq=9.0), product of:
              0.8533496 = queryWeight, product of:
                11.859257 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.011652056 = queryNorm
              1.1578925 = fieldWeight in 362, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=362)
        0.28 = coord(7/25)
    
  3. Relational data mining (2001) 0.28
    0.28359014 = sum of:
      0.28359014 = product of:
        1.1816256 = sum of:
          0.023213275 = weight(abstract_txt:techniques in 1303) [ClassicSimilarity], result of:
            0.023213275 = score(doc=1303,freq=1.0), product of:
              0.06559377 = queryWeight, product of:
                1.2427285 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.011652056 = queryNorm
              0.3538945 = fieldWeight in 1303, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.078125 = fieldNorm(doc=1303)
          0.02677968 = weight(abstract_txt:learning in 1303) [ClassicSimilarity], result of:
            0.02677968 = score(doc=1303,freq=1.0), product of:
              0.07215092 = queryWeight, product of:
                1.3033645 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.011652056 = queryNorm
              0.37116197 = fieldWeight in 1303, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.078125 = fieldNorm(doc=1303)
          0.07431295 = weight(abstract_txt:book in 1303) [ClassicSimilarity], result of:
            0.07431295 = score(doc=1303,freq=3.0), product of:
              0.113084905 = queryWeight, product of:
                1.9984483 = boost
                4.856341 = idf(docFreq=934, maxDocs=44218)
                0.011652056 = queryNorm
              0.65714294 = fieldWeight in 1303, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.856341 = idf(docFreq=934, maxDocs=44218)
                0.078125 = fieldNorm(doc=1303)
          0.033029772 = weight(abstract_txt:text in 1303) [ClassicSimilarity], result of:
            0.033029772 = score(doc=1303,freq=1.0), product of:
              0.104548715 = queryWeight, product of:
                2.2188058 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.011652056 = queryNorm
              0.3159271 = fieldWeight in 1303, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=1303)
          0.1036941 = weight(abstract_txt:data in 1303) [ClassicSimilarity], result of:
            0.1036941 = score(doc=1303,freq=5.0), product of:
              0.17791301 = queryWeight, product of:
                4.5765038 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.011652056 = queryNorm
              0.582836 = fieldWeight in 1303, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.078125 = fieldNorm(doc=1303)
          0.92059577 = weight(abstract_txt:mining in 1303) [ClassicSimilarity], result of:
            0.92059577 = score(doc=1303,freq=5.0), product of:
              0.8533496 = queryWeight, product of:
                11.859257 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.011652056 = queryNorm
              1.0788026 = fieldWeight in 1303, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.078125 = fieldNorm(doc=1303)
        0.24 = coord(6/25)
    
  4. Tonkin, E.L.; Tourte, G.J.L.: Working with text : tools, techniques and approaches for text mining (2016) 0.28
    0.28031206 = sum of:
      0.28031206 = product of:
        1.167967 = sum of:
          0.0063273967 = weight(abstract_txt:from in 4019) [ClassicSimilarity], result of:
            0.0063273967 = score(doc=4019,freq=1.0), product of:
              0.036629032 = queryWeight, product of:
                1.137374 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.011652056 = queryNorm
              0.17274266 = fieldWeight in 4019, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=4019)
          0.01857062 = weight(abstract_txt:techniques in 4019) [ClassicSimilarity], result of:
            0.01857062 = score(doc=4019,freq=1.0), product of:
              0.06559377 = queryWeight, product of:
                1.2427285 = boost
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.011652056 = queryNorm
              0.2831156 = fieldWeight in 4019, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5298495 = idf(docFreq=1295, maxDocs=44218)
                0.0625 = fieldNorm(doc=4019)
          0.034323677 = weight(abstract_txt:book in 4019) [ClassicSimilarity], result of:
            0.034323677 = score(doc=4019,freq=1.0), product of:
              0.113084905 = queryWeight, product of:
                1.9984483 = boost
                4.856341 = idf(docFreq=934, maxDocs=44218)
                0.011652056 = queryNorm
              0.3035213 = fieldWeight in 4019, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.856341 = idf(docFreq=934, maxDocs=44218)
                0.0625 = fieldNorm(doc=4019)
          0.083559446 = weight(abstract_txt:text in 4019) [ClassicSimilarity], result of:
            0.083559446 = score(doc=4019,freq=10.0), product of:
              0.104548715 = queryWeight, product of:
                2.2188058 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.011652056 = queryNorm
              0.79923934 = fieldWeight in 4019, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=4019)
          0.03709873 = weight(abstract_txt:data in 4019) [ClassicSimilarity], result of:
            0.03709873 = score(doc=4019,freq=1.0), product of:
              0.17791301 = queryWeight, product of:
                4.5765038 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.011652056 = queryNorm
              0.20852174 = fieldWeight in 4019, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=4019)
          0.9880871 = weight(abstract_txt:mining in 4019) [ClassicSimilarity], result of:
            0.9880871 = score(doc=4019,freq=9.0), product of:
              0.8533496 = queryWeight, product of:
                11.859257 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.011652056 = queryNorm
              1.1578925 = fieldWeight in 4019, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=4019)
        0.24 = coord(6/25)
    
  5. Perugini, S.; Ramakrishnan, N.: Mining Web functional dependencies for flexible information access (2007) 0.26
    0.26489884 = sum of:
      0.26489884 = product of:
        1.1037452 = sum of:
          0.10623897 = weight(abstract_txt:hyperlink in 602) [ClassicSimilarity], result of:
            0.10623897 = score(doc=602,freq=3.0), product of:
              0.09950475 = queryWeight, product of:
                1.0823104 = boost
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.011652056 = queryNorm
              1.0676774 = fieldWeight in 602, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.078125 = fieldNorm(doc=602)
          0.007909246 = weight(abstract_txt:from in 602) [ClassicSimilarity], result of:
            0.007909246 = score(doc=602,freq=1.0), product of:
              0.036629032 = queryWeight, product of:
                1.137374 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.011652056 = queryNorm
              0.21592833 = fieldWeight in 602, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.078125 = fieldNorm(doc=602)
          0.029232202 = weight(abstract_txt:structure in 602) [ClassicSimilarity], result of:
            0.029232202 = score(doc=602,freq=2.0), product of:
              0.060711276 = queryWeight, product of:
                1.1955827 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.011652056 = queryNorm
              0.48149544 = fieldWeight in 602, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.078125 = fieldNorm(doc=602)
          0.06729442 = weight(abstract_txt:usage in 602) [ClassicSimilarity], result of:
            0.06729442 = score(doc=602,freq=1.0), product of:
              0.15265866 = queryWeight, product of:
                2.3219416 = boost
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.011652056 = queryNorm
              0.44081625 = fieldWeight in 602, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.642448 = idf(docFreq=425, maxDocs=44218)
                0.078125 = fieldNorm(doc=602)
          0.069664456 = weight(abstract_txt:algorithms in 602) [ClassicSimilarity], result of:
            0.069664456 = score(doc=602,freq=1.0), product of:
              0.15622225 = queryWeight, product of:
                2.3488865 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.011652056 = queryNorm
              0.4459317 = fieldWeight in 602, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.078125 = fieldNorm(doc=602)
          0.8234059 = weight(abstract_txt:mining in 602) [ClassicSimilarity], result of:
            0.8234059 = score(doc=602,freq=4.0), product of:
              0.8533496 = queryWeight, product of:
                11.859257 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.011652056 = queryNorm
              0.9649104 = fieldWeight in 602, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.078125 = fieldNorm(doc=602)
        0.24 = coord(6/25)
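
The score trees above can be reproduced by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas that appear consistent with the numbers shown: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a final multiplication by coord (matched terms / query terms). The function names are hypothetical and chosen only for illustration; the input values are copied from the first similar document (doc 2291).

```python
import math

def tf(freq):
    """Term-frequency factor: square root of the raw term count."""
    return math.sqrt(freq)

def idf(doc_freq, max_docs):
    """Inverse document frequency (assumed form): 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    i = idf(doc_freq, max_docs)
    query_weight = boost * i * query_norm
    field_weight = tf(freq) * i * field_norm
    return query_weight * field_weight

def document_score(term_scores, matched, query_terms):
    """Sum of the per-term scores, scaled by coord = matched / query_terms."""
    return sum(term_scores) * (matched / query_terms)

# Constants copied from the explanation trees above.
MAX_DOCS = 44218
QUERY_NORM = 0.011652056

# Term "from" in doc 2291: freq=2, docFreq=7577, boost=1.137374, fieldNorm=0.0625.
from_score = term_score(
    freq=2.0, doc_freq=7577, max_docs=MAX_DOCS,
    boost=1.137374, query_norm=QUERY_NORM, field_norm=0.0625,
)

# The eight per-term weights listed for doc 2291, combined with coord(8/25).
doc_2291_terms = [
    0.00894829, 0.026262822, 0.021423742, 0.059450354,
    0.026423816, 0.0965299, 0.13376138, 0.8067697,
]
total = document_score(doc_2291_terms, matched=8, query_terms=25)

print(from_score, total)
```

The printed values can be compared with the "from" weight and the overall score listed for the first similar document. Small differences in the last digits are expected because the trees appear to have been computed in single precision.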