Search (8 results, page 1 of 1)

  • × theme_ss:"Data Mining"
  • × year_i:[2010 TO 2020}
  1. Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.04
    0.03819228 = product of:
      0.07638456 = sum of:
        0.06049643 = weight(_text_:services in 1605) [ClassicSimilarity], result of:
          0.06049643 = score(doc=1605,freq=6.0), product of:
            0.17221296 = queryWeight, product of:
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046906993 = queryNorm
            0.3512885 = fieldWeight in 1605, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1605)
        0.015888125 = product of:
          0.03177625 = sum of:
            0.03177625 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
              0.03177625 = score(doc=1605,freq=2.0), product of:
                0.1642603 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046906993 = queryNorm
                0.19345059 = fieldWeight in 1605, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1605)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Numerous studies have explored the possibility of uncovering information from web search queries but few have examined the factors that affect web query data sources. We conducted a study that investigated this issue by comparing Google Trends and Baidu Index. Data from these two services are based on queries entered by users into Google and Baidu, two of the largest search engines in the world. We first compared the features and functions of the two services based on documents and extensive testing. We then carried out an empirical study that collected query volume data from the two sources. We found that data from both sources could be used to predict the quality of Chinese universities and companies. Despite the differences between the two services in terms of technology, such as differing methods of language processing, the search volume data from the two were highly correlated and combining the two data sources did not improve the predictive power of the data. However, there was a major difference between the two in terms of data availability. Baidu Index was able to provide more search volume data than Google Trends did. Our analysis showed that the disadvantage of Google Trends in this regard was due to Google's smaller user base in China. The implication of this finding goes beyond China. Google's user bases in many countries are smaller than that in China, so the search volume data related to those countries could result in the same issue as that related to China.
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.1, S.13-22
  2. Sun, X.; Lin, H.: Topical community detection from mining user tagging behavior and interest (2013) 0.01
    0.010478289 = product of:
      0.041913155 = sum of:
        0.041913155 = weight(_text_:services in 605) [ClassicSimilarity], result of:
          0.041913155 = score(doc=605,freq=2.0), product of:
            0.17221296 = queryWeight, product of:
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046906993 = queryNorm
            0.2433798 = fieldWeight in 605, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6713707 = idf(docFreq=3057, maxDocs=44218)
              0.046875 = fieldNorm(doc=605)
      0.25 = coord(1/4)
    
    Abstract
    With the development of Web2.0, social tagging systems in which users can freely choose tags to annotate resources according to their interests have attracted much attention. In particular, literature on the emergence of collective intelligence in social tagging systems has increased. In this article, we propose a probabilistic generative model to detect latent topical communities among users. Social tags and resource contents are leveraged to model user interest in two similar and correlated ways. Our primary goal is to capture user tagging behavior and interest and discover the emergent topical community structure. The communities should be groups of users with frequent social interactions as well as similar topical interests, which would have important research implications for personalized information services. Experimental results on two real social tagging data sets with different genres have shown that the proposed generative model more accurately models user interest and detects high-quality and meaningful topical communities.
  3. Mining text data (2012) 0.01
    0.005099069 = product of:
      0.020396275 = sum of:
        0.020396275 = product of:
          0.04079255 = sum of:
            0.04079255 = weight(_text_:management in 362) [ClassicSimilarity], result of:
              0.04079255 = score(doc=362,freq=6.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.25800878 = fieldWeight in 362, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.03125 = fieldNorm(doc=362)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Text mining applications have experienced tremendous advances because of web 2.0 and social networking applications. Recent advances in hardware and software technology have lead to a number of unique scenarios where text mining algorithms are learned. Mining Text Data introduces an important niche in the text analytics field, and is an edited volume contributed by leading international researchers and practitioners focused on social networks & data mining. This book contains a wide swath in topics across social networks & data mining. Each chapter contains a comprehensive survey including the key research content on the topic, and the future directions of research in the field. There is a special focus on Text Embedded with Heterogeneous and Multimedia Data which makes the mining process much more challenging. A number of methods have been designed such as transfer learning and cross-lingual mining for such cases. Mining Text Data simplifies the content, so that advanced-level students, practitioners and researchers in computer science can benefit from this book. Academic and corporate libraries, as well as ACM, IEEE, and Management Science focused on information security, electronic commerce, databases, data mining, machine learning, and statistics are the primary buyers for this reference book.
    LCSH
    Database management
    Subject
    Database management
  4. Berendt, B.; Krause, B.; Kolbe-Nusser, S.: Intelligent scientific authoring tools : interactive data mining for constructive uses of citation networks (2010) 0.00
    0.004415923 = product of:
      0.017663691 = sum of:
        0.017663691 = product of:
          0.035327382 = sum of:
            0.035327382 = weight(_text_:management in 4226) [ClassicSimilarity], result of:
              0.035327382 = score(doc=4226,freq=2.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.22344214 = fieldWeight in 4226, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4226)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 46(2010) no.1, S.1-10
  5. Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.00
    0.003972031 = product of:
      0.015888125 = sum of:
        0.015888125 = product of:
          0.03177625 = sum of:
            0.03177625 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
              0.03177625 = score(doc=668,freq=2.0), product of:
                0.1642603 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046906993 = queryNorm
                0.19345059 = fieldWeight in 668, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=668)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    22. 3.2013 19:43:01
  6. Fonseca, F.; Marcinkowski, M.; Davis, C.: Cyber-human systems of thought and understanding (2019) 0.00
    0.003972031 = product of:
      0.015888125 = sum of:
        0.015888125 = product of:
          0.03177625 = sum of:
            0.03177625 = weight(_text_:22 in 5011) [ClassicSimilarity], result of:
              0.03177625 = score(doc=5011,freq=2.0), product of:
                0.1642603 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046906993 = queryNorm
                0.19345059 = fieldWeight in 5011, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5011)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    7. 3.2019 16:32:22
  7. Saggi, M.K.; Jain, S.: ¬A survey towards an integration of big data analytics to big insights for value-creation (2018) 0.00
    0.0036799356 = product of:
      0.014719742 = sum of:
        0.014719742 = product of:
          0.029439485 = sum of:
            0.029439485 = weight(_text_:management in 5053) [ClassicSimilarity], result of:
              0.029439485 = score(doc=5053,freq=2.0), product of:
                0.15810528 = queryWeight, product of:
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.046906993 = queryNorm
                0.18620178 = fieldWeight in 5053, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.3706124 = idf(docFreq=4130, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5053)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 54(2018) no.5, S.758-790
  8. Jäger, L.: Von Big Data zu Big Brother (2018) 0.00
    0.0031776251 = product of:
      0.0127105005 = sum of:
        0.0127105005 = product of:
          0.025421001 = sum of:
            0.025421001 = weight(_text_:22 in 5234) [ClassicSimilarity], result of:
              0.025421001 = score(doc=5234,freq=2.0), product of:
                0.1642603 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046906993 = queryNorm
                0.15476047 = fieldWeight in 5234, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5234)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    22. 1.2018 11:33:49