Document (#37356)

Author
Liu, B.
Title
Web data mining : exploring hyperlinks, contents, and usage data
Issue
2nd ed.
Imprint
Heidelberg : Springer
Year
2011
Pages
XX, 622 p.
Isbn
978-3-642-19459-7
Series
Data-centric systems and applications
Abstract
Web mining aims to discover useful information and knowledge from the Web hyperlink structure, page contents, and usage data. Although Web mining uses many conventional data mining techniques, it is not purely an application of traditional data mining, because Web data is semistructured or unstructured and heterogeneous. It has also developed many of its own algorithms and techniques. Liu has written a comprehensive text on Web data mining. Key topics of structure mining, content mining, and usage mining are covered both in breadth and in depth. His book brings together all the essential concepts and algorithms from related areas such as data mining, machine learning, and text processing to form an authoritative and coherent text. The book offers a rich blend of theory and practice, addressing seminal research ideas, as well as examining the technology from a practical point of view. It is suitable for students, researchers, and practitioners interested in Web mining both as a learning text and a reference book. Lecturers can readily use it for classes on data mining, Web mining, and Web search. Additional teaching materials such as lecture slides, datasets, and implemented algorithms are available online.
Content
Contents: 1. Introduction 2. Association Rules and Sequential Patterns 3. Supervised Learning 4. Unsupervised Learning 5. Partially Supervised Learning 6. Information Retrieval and Web Search 7. Social Network Analysis 8. Web Crawling 9. Structured Data Extraction: Wrapper Generation 10. Information Integration
Footnote
Electronic edition available at: http://springer.r.delivery.net/r/r?2.1.Ee.2Tp.1gd0L5.C3WE8i..N.WdtE.3uq2.bW89MQ%5f%5fCXPUFOH0.
Theme
Data Mining
RSWK
World Wide Web / Data Mining
BK
54.72
06.74
06.70
54.32
DDC
006.312 / DDC22ger
005.7402854678 / DDC22ger
005.72 / DDC22ger
GHBS
TZG (FH K)
TWX (FH GE)
LCC
QA76.9.D343
RVK
ST 530

Similar documents (content)

  1. Kantardzic, M.: Data mining : concepts, models, methods, and algorithms (2003) 0.38
    0.38176146 = sum of:
      0.38176146 = product of:
        1.1930046 = sum of:
          0.0091400845 = weight(abstract_txt:from in 3292) [ClassicSimilarity], result of:
            0.0091400845 = score(doc=3292,freq=2.0), product of:
              0.036972743 = queryWeight, product of:
                1.1459562 = boost
                2.796878 = idf(docFreq=7014, maxDocs=42306)
                0.011535599 = queryNorm
              0.24721143 = fieldWeight in 3292, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.796878 = idf(docFreq=7014, maxDocs=42306)
                0.0625 = fieldNorm(doc=3292)
          0.025845516 = weight(abstract_txt:techniques in 3292) [ClassicSimilarity], result of:
            0.025845516 = score(doc=3292,freq=2.0), product of:
              0.06458642 = queryWeight, product of:
                1.2366652 = boost
                4.527401 = idf(docFreq=1242, maxDocs=42306)
                0.011535599 = queryNorm
              0.4001695 = fieldWeight in 3292, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.527401 = idf(docFreq=1242, maxDocs=42306)
                0.0625 = fieldNorm(doc=3292)
          0.022018535 = weight(abstract_txt:learning in 3292) [ClassicSimilarity], result of:
            0.022018535 = score(doc=3292,freq=1.0), product of:
              0.07312851 = queryWeight, product of:
                1.315906 = boost
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.011535599 = queryNorm
              0.30109373 = fieldWeight in 3292, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.0625 = fieldNorm(doc=3292)
          0.059073217 = weight(abstract_txt:book in 3292) [ClassicSimilarity], result of:
            0.059073217 = score(doc=3292,freq=3.0), product of:
              0.11206711 = queryWeight, product of:
                1.9951072 = boost
                4.869359 = idf(docFreq=882, maxDocs=42306)
                0.011535599 = queryNorm
              0.5271236 = fieldWeight in 3292, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.869359 = idf(docFreq=882, maxDocs=42306)
                0.0625 = fieldNorm(doc=3292)
          0.026199404 = weight(abstract_txt:text in 3292) [ClassicSimilarity], result of:
            0.026199404 = score(doc=3292,freq=1.0), product of:
              0.10345831 = queryWeight, product of:
                2.2134984 = boost
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.011535599 = queryNorm
              0.25323635 = fieldWeight in 3292, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.0625 = fieldNorm(doc=3292)
          0.09769084 = weight(abstract_txt:algorithms in 3292) [ClassicSimilarity], result of:
            0.09769084 = score(doc=3292,freq=3.0), product of:
              0.15671852 = queryWeight, product of:
                2.3593225 = boost
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.011535599 = queryNorm
              0.6233522 = fieldWeight in 3292, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.0625 = fieldNorm(doc=3292)
          0.13741823 = weight(abstract_txt:data in 3292) [ClassicSimilarity], result of:
            0.13741823 = score(doc=3292,freq=13.0), product of:
              0.1802739 = queryWeight, product of:
                4.619904 = boost
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.011535599 = queryNorm
              0.7622746 = fieldWeight in 3292, product of:
                3.6055512 = tf(freq=13.0), with freq of:
                  13.0 = termFreq=13.0
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.0625 = fieldNorm(doc=3292)
          0.81561875 = weight(abstract_txt:mining in 3292) [ClassicSimilarity], result of:
            0.81561875 = score(doc=3292,freq=6.0), product of:
              0.8554635 = queryWeight, product of:
                11.907793 = boost
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.011535599 = queryNorm
              0.9534232 = fieldWeight in 3292, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.0625 = fieldNorm(doc=3292)
        0.32 = coord(8/25)
    
  2. Mining text data (2012) 0.37
    0.3666696 = sum of:
      0.3666696 = product of:
        1.3095343 = sum of:
          0.006463016 = weight(abstract_txt:from in 2363) [ClassicSimilarity], result of:
            0.006463016 = score(doc=2363,freq=1.0), product of:
              0.036972743 = queryWeight, product of:
                1.1459562 = boost
                2.796878 = idf(docFreq=7014, maxDocs=42306)
                0.011535599 = queryNorm
              0.17480488 = fieldWeight in 2363, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.796878 = idf(docFreq=7014, maxDocs=42306)
                0.0625 = fieldNorm(doc=2363)
          0.031138908 = weight(abstract_txt:learning in 2363) [ClassicSimilarity], result of:
            0.031138908 = score(doc=2363,freq=2.0), product of:
              0.07312851 = queryWeight, product of:
                1.315906 = boost
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.011535599 = queryNorm
              0.4258108 = fieldWeight in 2363, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.0625 = fieldNorm(doc=2363)
          0.059073217 = weight(abstract_txt:book in 2363) [ClassicSimilarity], result of:
            0.059073217 = score(doc=2363,freq=3.0), product of:
              0.11206711 = queryWeight, product of:
                1.9951072 = boost
                4.869359 = idf(docFreq=882, maxDocs=42306)
                0.011535599 = queryNorm
              0.5271236 = fieldWeight in 2363, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.869359 = idf(docFreq=882, maxDocs=42306)
                0.0625 = fieldNorm(doc=2363)
          0.06417517 = weight(abstract_txt:text in 2363) [ClassicSimilarity], result of:
            0.06417517 = score(doc=2363,freq=6.0), product of:
              0.10345831 = queryWeight, product of:
                2.2134984 = boost
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.011535599 = queryNorm
              0.6202999 = fieldWeight in 2363, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.0625 = fieldNorm(doc=2363)
          0.056401834 = weight(abstract_txt:algorithms in 2363) [ClassicSimilarity], result of:
            0.056401834 = score(doc=2363,freq=1.0), product of:
              0.15671852 = queryWeight, product of:
                2.3593225 = boost
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.011535599 = queryNorm
              0.35989258 = fieldWeight in 2363, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.0625 = fieldNorm(doc=2363)
          0.0933573 = weight(abstract_txt:data in 2363) [ClassicSimilarity], result of:
            0.0933573 = score(doc=2363,freq=6.0), product of:
              0.1802739 = queryWeight, product of:
                4.619904 = boost
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.011535599 = queryNorm
              0.51786363 = fieldWeight in 2363, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.0625 = fieldNorm(doc=2363)
          0.9989249 = weight(abstract_txt:mining in 2363) [ClassicSimilarity], result of:
            0.9989249 = score(doc=2363,freq=9.0), product of:
              0.8554635 = queryWeight, product of:
                11.907793 = boost
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.011535599 = queryNorm
              1.1677002 = fieldWeight in 2363, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.0625 = fieldNorm(doc=2363)
        0.28 = coord(7/25)
    
  3. Relational data mining (2001) 0.29
    0.28660336 = sum of:
      0.28660336 = product of:
        1.1941807 = sum of:
          0.022844424 = weight(abstract_txt:techniques in 2304) [ClassicSimilarity], result of:
            0.022844424 = score(doc=2304,freq=1.0), product of:
              0.06458642 = queryWeight, product of:
                1.2366652 = boost
                4.527401 = idf(docFreq=1242, maxDocs=42306)
                0.011535599 = queryNorm
              0.3537032 = fieldWeight in 2304, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.527401 = idf(docFreq=1242, maxDocs=42306)
                0.078125 = fieldNorm(doc=2304)
          0.027523167 = weight(abstract_txt:learning in 2304) [ClassicSimilarity], result of:
            0.027523167 = score(doc=2304,freq=1.0), product of:
              0.07312851 = queryWeight, product of:
                1.315906 = boost
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.011535599 = queryNorm
              0.37636715 = fieldWeight in 2304, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8174996 = idf(docFreq=929, maxDocs=42306)
                0.078125 = fieldNorm(doc=2304)
          0.07384152 = weight(abstract_txt:book in 2304) [ClassicSimilarity], result of:
            0.07384152 = score(doc=2304,freq=3.0), product of:
              0.11206711 = queryWeight, product of:
                1.9951072 = boost
                4.869359 = idf(docFreq=882, maxDocs=42306)
                0.011535599 = queryNorm
              0.65890443 = fieldWeight in 2304, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.869359 = idf(docFreq=882, maxDocs=42306)
                0.078125 = fieldNorm(doc=2304)
          0.032749254 = weight(abstract_txt:text in 2304) [ClassicSimilarity], result of:
            0.032749254 = score(doc=2304,freq=1.0), product of:
              0.10345831 = queryWeight, product of:
                2.2134984 = boost
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.011535599 = queryNorm
              0.31654543 = fieldWeight in 2304, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.078125 = fieldNorm(doc=2304)
          0.10652895 = weight(abstract_txt:data in 2304) [ClassicSimilarity], result of:
            0.10652895 = score(doc=2304,freq=5.0), product of:
              0.1802739 = queryWeight, product of:
                4.619904 = boost
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.011535599 = queryNorm
              0.5909283 = fieldWeight in 2304, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.078125 = fieldNorm(doc=2304)
          0.9306933 = weight(abstract_txt:mining in 2304) [ClassicSimilarity], result of:
            0.9306933 = score(doc=2304,freq=5.0), product of:
              0.8554635 = queryWeight, product of:
                11.907793 = boost
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.011535599 = queryNorm
              1.0879405 = fieldWeight in 2304, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.078125 = fieldNorm(doc=2304)
        0.24 = coord(6/25)
    
  4. Tonkin, E.L.; Tourte, G.J.L.: Working with text. tools, techniques and approaches for text mining (2016) 0.28
    0.2828957 = sum of:
      0.2828957 = product of:
        1.1787322 = sum of:
          0.006463016 = weight(abstract_txt:from in 938) [ClassicSimilarity], result of:
            0.006463016 = score(doc=938,freq=1.0), product of:
              0.036972743 = queryWeight, product of:
                1.1459562 = boost
                2.796878 = idf(docFreq=7014, maxDocs=42306)
                0.011535599 = queryNorm
              0.17480488 = fieldWeight in 938, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.796878 = idf(docFreq=7014, maxDocs=42306)
                0.0625 = fieldNorm(doc=938)
          0.01827554 = weight(abstract_txt:techniques in 938) [ClassicSimilarity], result of:
            0.01827554 = score(doc=938,freq=1.0), product of:
              0.06458642 = queryWeight, product of:
                1.2366652 = boost
                4.527401 = idf(docFreq=1242, maxDocs=42306)
                0.011535599 = queryNorm
              0.28296256 = fieldWeight in 938, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.527401 = idf(docFreq=1242, maxDocs=42306)
                0.0625 = fieldNorm(doc=938)
          0.034105938 = weight(abstract_txt:book in 938) [ClassicSimilarity], result of:
            0.034105938 = score(doc=938,freq=1.0), product of:
              0.11206711 = queryWeight, product of:
                1.9951072 = boost
                4.869359 = idf(docFreq=882, maxDocs=42306)
                0.011535599 = queryNorm
              0.30433494 = fieldWeight in 938, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.869359 = idf(docFreq=882, maxDocs=42306)
                0.0625 = fieldNorm(doc=938)
          0.08284979 = weight(abstract_txt:text in 938) [ClassicSimilarity], result of:
            0.08284979 = score(doc=938,freq=10.0), product of:
              0.10345831 = queryWeight, product of:
                2.2134984 = boost
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.011535599 = queryNorm
              0.80080366 = fieldWeight in 938, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                4.0517817 = idf(docFreq=1999, maxDocs=42306)
                0.0625 = fieldNorm(doc=938)
          0.038112957 = weight(abstract_txt:data in 938) [ClassicSimilarity], result of:
            0.038112957 = score(doc=938,freq=1.0), product of:
              0.1802739 = queryWeight, product of:
                4.619904 = boost
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.011535599 = queryNorm
              0.21141694 = fieldWeight in 938, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.0625 = fieldNorm(doc=938)
          0.9989249 = weight(abstract_txt:mining in 938) [ClassicSimilarity], result of:
            0.9989249 = score(doc=938,freq=9.0), product of:
              0.8554635 = queryWeight, product of:
                11.907793 = boost
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.011535599 = queryNorm
              1.1677002 = fieldWeight in 938, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.0625 = fieldNorm(doc=938)
        0.24 = coord(6/25)
    
  5. Perugini, S.; Ramakrishnan, N.: Mining Web functional dependencies for flexible information access (2007) 0.27
    0.26650795 = sum of:
      0.26650795 = product of:
        1.1104498 = sum of:
          0.10297043 = weight(abstract_txt:hyperlink in 2603) [ClassicSimilarity], result of:
            0.10297043 = score(doc=2603,freq=3.0), product of:
              0.09698674 = queryWeight, product of:
                1.0715753 = boost
                7.8460217 = idf(docFreq=44, maxDocs=42306)
                0.011535599 = queryNorm
              1.0616959 = fieldWeight in 2603, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.8460217 = idf(docFreq=44, maxDocs=42306)
                0.078125 = fieldNorm(doc=2603)
          0.00807877 = weight(abstract_txt:from in 2603) [ClassicSimilarity], result of:
            0.00807877 = score(doc=2603,freq=1.0), product of:
              0.036972743 = queryWeight, product of:
                1.1459562 = boost
                2.796878 = idf(docFreq=7014, maxDocs=42306)
                0.011535599 = queryNorm
              0.2185061 = fieldWeight in 2603, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.796878 = idf(docFreq=7014, maxDocs=42306)
                0.078125 = fieldNorm(doc=2603)
          0.029106515 = weight(abstract_txt:structure in 2603) [ClassicSimilarity], result of:
            0.029106515 = score(doc=2603,freq=2.0), product of:
              0.060247343 = queryWeight, product of:
                1.1944019 = boost
                4.372676 = idf(docFreq=1450, maxDocs=42306)
                0.011535599 = queryNorm
              0.48311698 = fieldWeight in 2603, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.372676 = idf(docFreq=1450, maxDocs=42306)
                0.078125 = fieldNorm(doc=2603)
          0.06735434 = weight(abstract_txt:usage in 2603) [ClassicSimilarity], result of:
            0.06735434 = score(doc=2603,freq=1.0), product of:
              0.15201807 = queryWeight, product of:
                2.3236716 = boost
                5.67127 = idf(docFreq=395, maxDocs=42306)
                0.011535599 = queryNorm
              0.44306797 = fieldWeight in 2603, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.67127 = idf(docFreq=395, maxDocs=42306)
                0.078125 = fieldNorm(doc=2603)
          0.07050229 = weight(abstract_txt:algorithms in 2603) [ClassicSimilarity], result of:
            0.07050229 = score(doc=2603,freq=1.0), product of:
              0.15671852 = queryWeight, product of:
                2.3593225 = boost
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.011535599 = queryNorm
              0.44986573 = fieldWeight in 2603, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.078125 = fieldNorm(doc=2603)
          0.8324374 = weight(abstract_txt:mining in 2603) [ClassicSimilarity], result of:
            0.8324374 = score(doc=2603,freq=4.0), product of:
              0.8554635 = queryWeight, product of:
                11.907793 = boost
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.011535599 = queryNorm
              0.97308344 = fieldWeight in 2603, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.078125 = fieldNorm(doc=2603)
        0.24 = coord(6/25)
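
The score breakdowns above can be recomputed by hand. The following Python sketch is an illustration only, not the catalog's actual code: the function names are hypothetical, and the formulas (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1)), and a coord factor equal to the share of query terms matched) are inferred from the numbers in the listings, which they appear to reproduce. The constants `MAX_DOCS` and `QUERY_NORM` are copied from the explain output.

```python
import math

MAX_DOCS = 42306          # maxDocs, copied from the explain output
QUERY_NORM = 0.011535599  # queryNorm, copied from the explain output


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as it appears in the listings."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                             # tf(freq)
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm     # depends on the query only
    field_weight = tf * term_idf * field_norm        # depends on the document
    return query_weight * field_weight


def document_score(term_scores, total_query_terms=25):
    """Sum of the matching term scores, scaled by coord(matched/total)."""
    coord = len(term_scores) / total_query_terms
    return sum(term_scores) * coord


# The term "mining" in doc 3292 (hit 1): freq=6, docFreq=226, fieldNorm=0.0625.
mining_3292 = term_score(freq=6.0, doc_freq=226,
                         boost=11.907793, field_norm=0.0625)

# Hit 5 (doc 2603): the six per-term scores, copied from its listing.
doc_2603 = document_score([0.10297043, 0.00807877, 0.029106515,
                           0.06735434, 0.07050229, 0.8324374])

print(f"mining in 3292: {mining_3292:.6f} (listing: 0.81561875)")
print(f"doc 2603 total: {doc_2603:.6f} (listing: 0.26650795)")
```

Small differences in the last digits are expected, because the listing values look like single-precision floats while Python computes in double precision. The sketch also shows why "mining" dominates every hit: its boost (11.907793) and idf (6.227734) are both far larger than those of the other matched terms.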