Search (28 results, page 2 of 2)

  • theme_ss:"Data Mining"
  1. Dang, X.H.; Ong, K.-L.: Knowledge discovery in data streams (2009)
  2. Song, J.; Huang, Y.; Qi, X.; Li, Y.; Li, F.; Fu, K.; Huang, T.: Discovering hierarchical topic evolution in time-stamped documents (2016)
  3. Peters, G.; Gaese, V.: Das DocCat-System in der Textdokumentation von G+J (2003)
    Date
    22. 4.2003 11:45:36
  4. Hölzig, C.: Google spürt Grippewellen auf : Die neue Anwendung ist bisher auf die USA beschränkt (2008)
    Date
    3. 5.1997 8:44:22
  5. Jäger, L.: Von Big Data zu Big Brother (2018)
    Date
    22. 1.2018 11:33:49
  6. Kantardzic, M.: Data mining : concepts, models, methods, and algorithms (2003)
    Classification
    PZY (FH K)
    GHBS
    PZY (FH K)
  7. Liu, B.: Web data mining : exploring hyperlinks, contents, and usage data (2011)
    Classification
    TZG (FH K)
    GHBS
    TZG (FH K)
  8. Ma, Z.; Sun, A.; Cong, G.: On predicting the popularity of newly emerging hashtags in Twitter (2013)
    Abstract
    Because of Twitter's popularity and the viral nature of information dissemination on Twitter, predicting which Twitter topics will become popular in the near future becomes a task of considerable economic importance. Many Twitter topics are annotated by hashtags. In this article, we propose methods to predict the popularity of new hashtags on Twitter by formulating the problem as a classification task. We use five standard classification models (i.e., Naïve Bayes, k-nearest neighbors, decision trees, support vector machines, and logistic regression) for prediction. The main challenge is the identification of effective features for describing new hashtags. We extract 7 content features from a hashtag string and the collection of tweets containing the hashtag and 11 contextual features from the social graph formed by users who have adopted the hashtag. We conducted experiments on a Twitter data set consisting of 31 million tweets from 2 million Singapore-based users. The experimental results show that the standard classifiers using the extracted features significantly outperform the baseline methods that do not use these features. Among the five classifiers, the logistic regression model performs the best in terms of the Micro-F1 measure. We also observe that contextual features are more effective than content features.
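    The setup the abstract describes - five standard classifiers compared on an 18-dimensional (7 content + 11 contextual) feature vector per hashtag, evaluated by micro-F1 - can be sketched as follows. This is a minimal illustration, not the authors' code: scikit-learn, the synthetic feature matrix, and the toy labels are all assumptions standing in for the paper's real Twitter data and feature extractors.

    ```python
    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.svm import SVC
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import f1_score

    rng = np.random.default_rng(0)
    # Synthetic stand-in for the 7 content + 11 contextual features per hashtag.
    X = rng.normal(size=(500, 18))
    # Toy binary "popular" label, loosely tied to one contextual feature.
    y = (X[:, 7] + 0.5 * rng.normal(size=500) > 0).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    # The five classifier families named in the abstract.
    models = {
        "naive_bayes": GaussianNB(),
        "knn": KNeighborsClassifier(),
        "decision_tree": DecisionTreeClassifier(random_state=0),
        "svm": SVC(),
        "logistic_regression": LogisticRegression(max_iter=1000),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        # Micro-F1, the measure the abstract reports.
        scores[name] = f1_score(y_te, model.predict(X_te), average="micro")
        print(f"{name}: micro-F1 = {scores[name]:.3f}")
    ```

    On the paper's real features, logistic regression came out best by this measure; on synthetic data like the above, the ranking will of course vary.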

Languages

  • e 16
  • d 12

Types