Search (3 results, page 1 of 1)

  • language_ss:"e"
  • theme_ss:"Data Mining"
  • theme_ss:"Internet"
  1. Fenstermacher, K.D.; Ginsburg, M.: Client-side monitoring for Web mining (2003) 0.00
    0.0026166062 = product of:
      0.018316243 = sum of:
        0.018316243 = product of:
          0.045790605 = sum of:
            0.02197135 = weight(_text_:retrieval in 1611) [ClassicSimilarity], result of:
              0.02197135 = score(doc=1611,freq=2.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.20052543 = fieldWeight in 1611, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1611)
            0.023819257 = weight(_text_:system in 1611) [ClassicSimilarity], result of:
              0.023819257 = score(doc=1611,freq=2.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.20878783 = fieldWeight in 1611, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1611)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    "Garbage in, garbage out" is a well-known phrase in computer analysis, and one that comes to mind when mining Web data to draw conclusions about Web users. The challenge is that data analysts wish to infer patterns of client-side behavior from server-side data. However, because only a fraction of the user's actions ever reaches the Web server, analysts must rely on incomplete data. In this paper, we propose a client-side monitoring system that is unobtrusive and supports flexible data collection. Moreover, the proposed framework encompasses client-side applications beyond the Web browser. Expanding monitoring beyond the browser to incorporate standard office productivity tools enables analysts to derive a much richer and more accurate picture of user behavior on the Web.
    Footnote
    Part of a special issue: "Web retrieval and mining: A machine learning perspective"
  2. Kong, S.; Ye, F.; Feng, L.; Zhao, Z.: Towards the prediction problems of bursting hashtags on Twitter (2015) 0.00
    7.9397525E-4 = product of:
      0.0055578267 = sum of:
        0.0055578267 = product of:
          0.027789133 = sum of:
            0.027789133 = weight(_text_:system in 2338) [ClassicSimilarity], result of:
              0.027789133 = score(doc=2338,freq=2.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.2435858 = fieldWeight in 2338, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2338)
          0.2 = coord(1/5)
      0.14285715 = coord(1/7)
    
    Abstract
    Hundreds of thousands of hashtags are generated every day on Twitter. Only a few will burst and become trending topics. In this article, we provide the definition of a bursting hashtag and conduct a systematic study of a series of challenging prediction problems that span the entire life cycles of bursting hashtags. Around the problem of "how to build a system to predict bursting hashtags," we explore different types of features and present machine learning solutions. On real data sets from Twitter, experiments are conducted to evaluate the effectiveness of the proposed solutions and the contributions of features.
  3. Chakrabarti, S.: Mining the Web : discovering knowledge from hypertext data (2003) 0.00
    4.1850194E-4 = product of:
      0.0029295133 = sum of:
        0.0029295133 = product of:
          0.014647567 = sum of:
            0.014647567 = weight(_text_:retrieval in 2222) [ClassicSimilarity], result of:
              0.014647567 = score(doc=2222,freq=2.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.13368362 = fieldWeight in 2222, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2222)
          0.2 = coord(1/5)
      0.14285715 = coord(1/7)
    
    Footnote
    Part I, Infrastructure, has two chapters: Chapter 2 on crawling the Web and Chapter 3 on Web search and information retrieval. The second part of the book, containing chapters 4, 5, and 6, is the centerpiece. This part specifically focuses on machine learning in the context of hypertext. Part III is a collection of applications that utilize the techniques described in earlier chapters. Chapter 7 is on social network analysis. Chapter 8 is on resource discovery. Chapter 9 is on the future of Web mining. Overall, this is a valuable reference book for researchers and developers in the field of Web mining. It should be particularly useful for those who would like to design, and probably code, their own computer programs from the equations and pseudocode on most of the pages. For a student, the most valuable feature of the book is perhaps the formal and consistent treatment of concepts across the board. For what is behind and beyond the technical details, one has to either dig deeper into the bibliographic notes at the end of each chapter, or resort to more in-depth analysis of relevant subjects in the literature. If you are looking for success stories about Web mining or lessons learned the hard way from failures, this is not the book.
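
The score breakdowns listed under each result follow the classic Lucene TF-IDF arithmetic, and the numbers for the first result (doc 1611) can be re-derived by hand. The sketch below is not the search engine's own code; it assumes the usual ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and takes queryNorm, fieldNorm, and the coord factors directly from the listing. The function names are made up for illustration.

```python
import math

# Values copied from the explain output above.
QUERY_NORM = 0.03622214
MAX_DOCS = 44218


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming the ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM                      # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm    # tf * idf * fieldNorm
    return query_weight * field_weight


# Doc 1611 matches "retrieval" (docFreq=5836) and "system" (docFreq=5152),
# each with freq=2.0 and fieldNorm=0.046875.
retrieval = term_score(2.0, 5836, 0.046875)
system = term_score(2.0, 5152, 0.046875)

# The term scores are summed, then scaled by the two coord factors
# shown in the listing: coord(2/5) and coord(1/7).
score_1611 = (retrieval + system) * (2 / 5) * (1 / 7)

print(f"retrieval={retrieval:.8f} system={system:.8f} total={score_1611:.10f}")
```

The result should agree with the listed 0.0026166062 up to small rounding differences, since the listing was computed in single precision. The same function reproduces the other two results once their fieldNorm and coord values are substituted.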
