Search (51 results, page 2 of 3)

  • Filter: theme_ss:"Data Mining"
  1. Kantardzic, M.: Data mining : concepts, models, methods, and algorithms (2003) 0.01
    0.0060027596 = product of:
      0.012005519 = sum of:
        0.012005519 = product of:
          0.036016557 = sum of:
            0.036016557 = weight(_text_:k in 2291) [ClassicSimilarity], result of:
              0.036016557 = score(doc=2291,freq=4.0), product of:
                0.16142878 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.045220956 = queryNorm
                0.22311112 = fieldWeight in 2291, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2291)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Classification
    PZY (FH K)
    GHBS
    PZY (FH K)
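     The breakdown above is Lucene explain() output under ClassicSimilarity (tf-idf with coordination factors), as the [ClassicSimilarity] tags indicate. Reading hit 1's tree bottom-up, with tf, idf, fieldNorm, and queryNorm taken verbatim from the lines above, the posted score reconstructs as:

       \begin{aligned}
       \mathrm{queryWeight} &= \mathrm{idf} \cdot \mathrm{queryNorm} = 3.569778 \times 0.045220956 = 0.16142878 \\
       \mathrm{fieldWeight} &= \mathrm{tf} \cdot \mathrm{idf} \cdot \mathrm{fieldNorm} = 2.0 \times 3.569778 \times 0.03125 = 0.22311112 \\
       \mathrm{weight} &= \mathrm{queryWeight} \cdot \mathrm{fieldWeight} = 0.036016557 \\
       \mathrm{score} &= \mathrm{weight} \cdot \mathrm{coord}(1/3) \cdot \mathrm{coord}(1/2) = 0.036016557 \times \tfrac{1}{3} \times \tfrac{1}{2} = 0.0060027596
       \end{aligned}

     coord(m/n) scales a score by the fraction of query clauses the document matched; every hit on this page matched 1 of 3 clauses inside 1 of 2 sub-queries, hence the constant 1/3 and 1/2 factors. The same chain reproduces every tree below.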
  2. Liu, B.: Web data mining : exploring hyperlinks, contents, and usage data (2011) 0.01
    0.0060027596 = product of:
      0.012005519 = sum of:
        0.012005519 = product of:
          0.036016557 = sum of:
            0.036016557 = weight(_text_:k in 354) [ClassicSimilarity], result of:
              0.036016557 = score(doc=354,freq=4.0), product of:
                0.16142878 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.045220956 = queryNorm
                0.22311112 = fieldWeight in 354, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.03125 = fieldNorm(doc=354)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Classification
    TZG (FH K)
    GHBS
    TZG (FH K)
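     Trees like these can be regenerated against the index itself via IndexSearcher.explain(). A minimal sketch follows; the index path and query term are illustrative assumptions, the field name "_text_" is taken from the weight(_text_:k ...) lines above, and ClassicSimilarity with coord() requires Lucene 7.x or earlier (coord and queryNorm were removed in 8.0):

       import java.nio.file.Paths;
       import org.apache.lucene.index.DirectoryReader;
       import org.apache.lucene.index.Term;
       import org.apache.lucene.search.Explanation;
       import org.apache.lucene.search.IndexSearcher;
       import org.apache.lucene.search.Query;
       import org.apache.lucene.search.ScoreDoc;
       import org.apache.lucene.search.TermQuery;
       import org.apache.lucene.search.similarities.ClassicSimilarity;
       import org.apache.lucene.store.FSDirectory;

       public class ExplainScores {
         public static void main(String[] args) throws Exception {
           // "/path/to/index" is a placeholder for the catalog's index directory.
           try (DirectoryReader reader = DirectoryReader.open(
               FSDirectory.open(Paths.get("/path/to/index")))) {
             IndexSearcher searcher = new IndexSearcher(reader);
             searcher.setSimilarity(new ClassicSimilarity()); // tf-idf scoring, as in the trees above
             Query query = new TermQuery(new Term("_text_", "k"));
             for (ScoreDoc sd : searcher.search(query, 10).scoreDocs) {
               // explain() returns the nested product/sum tree printed in this listing
               Explanation exp = searcher.explain(query, sd.doc);
               System.out.println(exp);
             }
           }
         }
       }

     The integer after "in" in each weight line (e.g. "in 354") is the internal Lucene document id passed to explain(), not a catalog record number.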
  3. Brückner, T.; Dambeck, H.: Sortierautomaten : Grundlagen der Textklassifizierung (2003) 0.01
    0.0058151204 = product of:
      0.011630241 = sum of:
        0.011630241 = product of:
          0.034890722 = sum of:
            0.034890722 = weight(_text_:h in 2398) [ClassicSimilarity], result of:
              0.034890722 = score(doc=2398,freq=4.0), product of:
                0.11234917 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045220956 = queryNorm
                0.31055614 = fieldWeight in 2398, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2398)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Source
    c't. 2003, H.19, S.192-197
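     The per-term tf and idf figures recurring in these trees follow the ClassicSimilarity definitions, and every constant on this page can be checked against the freq, docFreq, and maxDocs values printed in the trees:

       \mathrm{tf}(f) = \sqrt{f}: \qquad \sqrt{4} = 2.0, \qquad \sqrt{2} = 1.4142135

       \mathrm{idf}(t) = 1 + \ln\frac{\mathrm{maxDocs}}{\mathrm{docFreq}+1}: \qquad
       1 + \ln\tfrac{44218}{10021} = 2.4844491 \;(h), \qquad
       1 + \ln\tfrac{44218}{3385} = 3.569778 \;(k), \qquad
       1 + \ln\tfrac{44218}{3623} = 3.5018296 \;(22)

     (queryNorm, 0.045220956 throughout, is 1/sqrt of the sum of squared weights across all query clauses and therefore cannot be recomputed from any single tree.)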
  4. Ma, Z.; Sun, A.; Cong, G.: On predicting the popularity of newly emerging hashtags in Twitter (2013) 0.01
    0.00530574 = product of:
      0.01061148 = sum of:
        0.01061148 = product of:
          0.03183444 = sum of:
            0.03183444 = weight(_text_:k in 967) [ClassicSimilarity], result of:
              0.03183444 = score(doc=967,freq=2.0), product of:
                0.16142878 = queryWeight, product of:
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.045220956 = queryNorm
                0.19720423 = fieldWeight in 967, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.569778 = idf(docFreq=3384, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=967)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
     Because of Twitter's popularity and the viral nature of information dissemination on Twitter, predicting which Twitter topics will become popular in the near future becomes a task of considerable economic importance. Many Twitter topics are annotated by hashtags. In this article, we propose methods to predict the popularity of new hashtags on Twitter by formulating the problem as a classification task. We use five standard classification models (i.e., naïve Bayes, k-nearest neighbors, decision trees, support vector machines, and logistic regression) for prediction. The main challenge is the identification of effective features for describing new hashtags. We extract 7 content features from a hashtag string and the collection of tweets containing the hashtag, and 11 contextual features from the social graph formed by users who have adopted the hashtag. We conducted experiments on a Twitter data set consisting of 31 million tweets from 2 million Singapore-based users. The experimental results show that the standard classifiers using the extracted features significantly outperform the baseline methods that do not use these features. Among the five classifiers, the logistic regression model performs the best in terms of the Micro-F1 measure. We also observe that contextual features are more effective than content features.
  5. Borgelt, C.; Kruse, R.: Unsicheres Wissen nutzen (2002) 0.01
    0.005139889 = product of:
      0.010279778 = sum of:
        0.010279778 = product of:
          0.030839335 = sum of:
            0.030839335 = weight(_text_:h in 1104) [ClassicSimilarity], result of:
              0.030839335 = score(doc=1104,freq=2.0), product of:
                0.11234917 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045220956 = queryNorm
                0.27449545 = fieldWeight in 1104, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1104)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Source
    Spektrum der Wissenschaft. 2002, H.11, S.82-84
  6. Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.01
    0.00510568 = product of:
      0.01021136 = sum of:
        0.01021136 = product of:
          0.030634077 = sum of:
            0.030634077 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
              0.030634077 = score(doc=668,freq=2.0), product of:
                0.15835609 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045220956 = queryNorm
                0.19345059 = fieldWeight in 668, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=668)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Date
    22. 3.2013 19:43:01
  7. Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.01
    0.00510568 = product of:
      0.01021136 = sum of:
        0.01021136 = product of:
          0.030634077 = sum of:
            0.030634077 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
              0.030634077 = score(doc=1605,freq=2.0), product of:
                0.15835609 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045220956 = queryNorm
                0.19345059 = fieldWeight in 1605, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1605)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.1, S.13-22
  8. Fonseca, F.; Marcinkowski, M.; Davis, C.: Cyber-human systems of thought and understanding (2019) 0.01
    0.00510568 = product of:
      0.01021136 = sum of:
        0.01021136 = product of:
          0.030634077 = sum of:
            0.030634077 = weight(_text_:22 in 5011) [ClassicSimilarity], result of:
              0.030634077 = score(doc=5011,freq=2.0), product of:
                0.15835609 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045220956 = queryNorm
                0.19345059 = fieldWeight in 5011, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5011)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Date
    7. 3.2019 16:32:22
  9. Tiefschürfen in Datenbanken (2002) 0.00
    0.004111911 = product of:
      0.008223822 = sum of:
        0.008223822 = product of:
          0.024671467 = sum of:
            0.024671467 = weight(_text_:h in 996) [ClassicSimilarity], result of:
              0.024671467 = score(doc=996,freq=2.0), product of:
                0.11234917 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045220956 = queryNorm
                0.21959636 = fieldWeight in 996, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0625 = fieldNorm(doc=996)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Source
    Spektrum der Wissenschaft. 2002, H.11, S.80-91
  10. Nohr, H.: Big Data im Lichte der EU-Datenschutz-Grundverordnung (2017) 0.00
    0.004111911 = product of:
      0.008223822 = sum of:
        0.008223822 = product of:
          0.024671467 = sum of:
            0.024671467 = weight(_text_:h in 4076) [ClassicSimilarity], result of:
              0.024671467 = score(doc=4076,freq=2.0), product of:
                0.11234917 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045220956 = queryNorm
                0.21959636 = fieldWeight in 4076, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4076)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  11. Winterhalter, C.: Licence to mine : ein Überblick über Rahmenbedingungen von Text and Data Mining und den aktuellen Stand der Diskussion (2016) 0.00
    0.004111911 = product of:
      0.008223822 = sum of:
        0.008223822 = product of:
          0.024671467 = sum of:
            0.024671467 = weight(_text_:h in 673) [ClassicSimilarity], result of:
              0.024671467 = score(doc=673,freq=2.0), product of:
                0.11234917 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045220956 = queryNorm
                0.21959636 = fieldWeight in 673, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0625 = fieldNorm(doc=673)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Source
    027.7 Zeitschrift für Bibliothekskultur. 4(2016), H.2
  12. Peters, G.; Gaese, V.: Das DocCat-System in der Textdokumentation von G+J (2003) 0.00
    0.004084544 = product of:
      0.008169088 = sum of:
        0.008169088 = product of:
          0.024507262 = sum of:
            0.024507262 = weight(_text_:22 in 1507) [ClassicSimilarity], result of:
              0.024507262 = score(doc=1507,freq=2.0), product of:
                0.15835609 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045220956 = queryNorm
                0.15476047 = fieldWeight in 1507, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1507)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Date
    22. 4.2003 11:45:36
  13. Hölzig, C.: Google spürt Grippewellen auf : Die neue Anwendung ist bisher auf die USA beschränkt (2008) 0.00
    0.004084544 = product of:
      0.008169088 = sum of:
        0.008169088 = product of:
          0.024507262 = sum of:
            0.024507262 = weight(_text_:22 in 2403) [ClassicSimilarity], result of:
              0.024507262 = score(doc=2403,freq=2.0), product of:
                0.15835609 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045220956 = queryNorm
                0.15476047 = fieldWeight in 2403, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2403)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Date
    3. 5.1997 8:44:22
  14. Jäger, L.: Von Big Data zu Big Brother (2018) 0.00
    0.004084544 = product of:
      0.008169088 = sum of:
        0.008169088 = product of:
          0.024507262 = sum of:
            0.024507262 = weight(_text_:22 in 5234) [ClassicSimilarity], result of:
              0.024507262 = score(doc=5234,freq=2.0), product of:
                0.15835609 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045220956 = queryNorm
                0.15476047 = fieldWeight in 5234, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5234)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Date
    22. 1.2018 11:33:49
  15. Ku, L.-W.; Chen, H.-H.: Mining opinions from the Web : beyond relevance retrieval (2007) 0.00
    0.0036344507 = product of:
      0.0072689014 = sum of:
        0.0072689014 = product of:
          0.021806704 = sum of:
            0.021806704 = weight(_text_:h in 605) [ClassicSimilarity], result of:
              0.021806704 = score(doc=605,freq=4.0), product of:
                0.11234917 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045220956 = queryNorm
                0.1940976 = fieldWeight in 605, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=605)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  16. Tu, Y.-N.; Hsu, S.-L.: Constructing conceptual trajectory maps to trace the development of research fields (2016) 0.00
    0.0036344507 = product of:
      0.0072689014 = sum of:
        0.0072689014 = product of:
          0.021806704 = sum of:
            0.021806704 = weight(_text_:h in 3059) [ClassicSimilarity], result of:
              0.021806704 = score(doc=3059,freq=4.0), product of:
                0.11234917 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045220956 = queryNorm
                0.1940976 = fieldWeight in 3059, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3059)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
    Abstract
     This study proposes a new method to construct and trace the trajectory of conceptual development of a research field by combining main path analysis, citation analysis, and text-mining techniques. Main path analysis, a method used commonly to trace the most critical path in a citation network, helps describe the developmental trajectory of a research field. This study extends the main path analysis method and applies text-mining techniques in the new method, which reflects the trajectory of conceptual development in an academic research field more accurately than citation frequency, which represents only the articles examined. Articles can be merged based on similarity of concepts, and by merging concepts the history of a research field can be described more precisely. The new method was applied to the "h-index" and "text mining" fields. The precision, recall, and F-measures of the h-index were 0.738, 0.652, and 0.658, and those of text mining were 0.501, 0.653, and 0.551, respectively. Last, this study not only establishes the conceptual trajectory map of a research field, but also recommends keywords that are more precise than those used currently by researchers. These precise keywords could enable researchers to gather related works more quickly than before.
  17. Raghavan, V.V.; Deogun, J.S.; Sever, H.: Knowledge discovery and data mining : introduction (1998) 0.00
    0.0035979224 = product of:
      0.007195845 = sum of:
        0.007195845 = product of:
          0.021587534 = sum of:
            0.021587534 = weight(_text_:h in 2899) [ClassicSimilarity], result of:
              0.021587534 = score(doc=2899,freq=2.0), product of:
                0.11234917 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045220956 = queryNorm
                0.19214681 = fieldWeight in 2899, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2899)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  18. Chen, H.; Chau, M.: Web mining : machine learning for Web applications (2003) 0.00
    0.0030839336 = product of:
      0.006167867 = sum of:
        0.006167867 = product of:
          0.0185036 = sum of:
            0.0185036 = weight(_text_:h in 4242) [ClassicSimilarity], result of:
              0.0185036 = score(doc=4242,freq=2.0), product of:
                0.11234917 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045220956 = queryNorm
                0.16469726 = fieldWeight in 4242, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4242)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  19. Kulathuramaiyer, N.; Maurer, H.: Implications of emerging data mining (2009) 0.00
    0.0030839336 = product of:
      0.006167867 = sum of:
        0.006167867 = product of:
          0.0185036 = sum of:
            0.0185036 = weight(_text_:h in 3144) [ClassicSimilarity], result of:
              0.0185036 = score(doc=3144,freq=2.0), product of:
                0.11234917 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045220956 = queryNorm
                0.16469726 = fieldWeight in 3144, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3144)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    
  20. Chen, Y.-L.; Liu, Y.-H.; Ho, W.-L.: A text mining approach to assist the general public in the retrieval of legal documents (2013) 0.00
    0.0030839336 = product of:
      0.006167867 = sum of:
        0.006167867 = product of:
          0.0185036 = sum of:
            0.0185036 = weight(_text_:h in 521) [ClassicSimilarity], result of:
              0.0185036 = score(doc=521,freq=2.0), product of:
                0.11234917 = queryWeight, product of:
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.045220956 = queryNorm
                0.16469726 = fieldWeight in 521, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.4844491 = idf(docFreq=10020, maxDocs=44218)
                  0.046875 = fieldNorm(doc=521)
          0.33333334 = coord(1/3)
      0.5 = coord(1/2)
    

Languages

  • e (English) 29
  • d (German) 22

Types