Search (2 results, page 1 of 1)

  • × theme_ss:"Internet"
  • × theme_ss:"Data Mining"
  1. Kong, S.; Ye, F.; Feng, L.; Zhao, Z.: Towards the prediction problems of bursting hashtags on Twitter (2015) 0.04
    0.041841652 = product of:
      0.12552495 = sum of:
        0.12552495 = weight(_text_:systematic in 2338) [ClassicSimilarity], result of:
          0.12552495 = score(doc=2338,freq=2.0), product of:
            0.28397155 = queryWeight, product of:
              5.715473 = idf(docFreq=395, maxDocs=44218)
              0.049684696 = queryNorm
            0.44203353 = fieldWeight in 2338, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.715473 = idf(docFreq=395, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2338)
      0.33333334 = coord(1/3)
    
    Abstract
    Hundreds of thousands of hashtags are generated every day on Twitter. Only a few will burst and become trending topics. In this article, we provide the definition of a bursting hashtag and conduct a systematic study of a series of challenging prediction problems that span the entire life cycles of bursting hashtags. Around the problem of "how to build a system to predict bursting hashtags," we explore different types of features and present machine learning solutions. On real data sets from Twitter, experiments are conducted to evaluate the effectiveness of the proposed solutions and the contributions of features.
  2. Chakrabarti, S.: Mining the Web : discovering knowledge from hypertext data (2003) 0.01
    0.00536229 = product of:
      0.016086869 = sum of:
        0.016086869 = product of:
          0.032173738 = sum of:
            0.032173738 = weight(_text_:indexing in 2222) [ClassicSimilarity], result of:
              0.032173738 = score(doc=2222,freq=2.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.16916946 = fieldWeight in 2222, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2222)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Footnote
    Review in: JASIST 55(2004) no.3, pp. 275-276 (C. Chen): "This is a book about finding significant statistical patterns on the Web - in particular, patterns that are associated with hypertext documents, topics, hyperlinks, and queries. The term pattern in this book refers to dependencies among such items. On the one hand, the Web contains useful information on just about every topic under the sun. On the other hand, just like searching for a needle in a haystack, one would need powerful tools to locate useful information on the vast land of the Web. Soumen Chakrabarti's book focuses on a wide range of techniques for machine learning and data mining on the Web. The goal of the book is to provide both the technical background and tools and tricks of the trade of Web content mining. Much of the technical content reflects the state of the art between 1995 and 2002. The targeted audience is researchers and innovative developers in this area, as well as newcomers who intend to enter this area. The book begins with an introduction chapter. The introduction chapter explains fundamental concepts such as crawling and indexing as well as clustering and classification. The remaining eight chapters are organized into three parts: i) infrastructure, ii) learning and iii) applications.
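
The score breakdowns listed under each result can be checked by hand. The following is a minimal sketch, not the search engine's own code: it assumes the older ClassicSimilarity formulas idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq), which reproduce the idf and tf values shown above. The function names are hypothetical; queryNorm, fieldNorm, and the coord factors are copied from the listings rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming the older ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of a single term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                    # idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


QUERY_NORM = 0.049684696  # copied from the listings
MAX_DOCS = 44218

# Result 1: term "systematic" in doc 2338, scaled by coord(1/3).
score_1 = term_score(2.0, 395, MAX_DOCS, QUERY_NORM, 0.0546875) * (1 / 3)

# Result 2: term "indexing" in doc 2222, scaled by coord(1/2) and coord(1/3).
score_2 = term_score(2.0, 2614, MAX_DOCS, QUERY_NORM, 0.03125) * 0.5 * (1 / 3)

print(f"result 1: {score_1:.6f}")
print(f"result 2: {score_2:.6f}")
```

The listings use single-precision floats, so values computed this way in double precision may differ from them in the last digits.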
