Search (14 results, page 1 of 1)

  • year_i:[2010 TO 2020}
  • theme_ss:"Data Mining"
  1. Wattenberg, M.; Viégas, F.; Johnson, I.: How to use t-SNE effectively (2016) 0.03
    0.031122928 = product of:
      0.062245857 = sum of:
        0.062245857 = product of:
          0.124491714 = sum of:
            0.124491714 = weight(_text_:t in 3887) [ClassicSimilarity], result of:
              0.124491714 = score(doc=3887,freq=8.0), product of:
                0.17876579 = queryWeight, product of:
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.04537884 = queryNorm
                0.69639564 = fieldWeight in 3887, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3887)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Although extremely useful for visualizing high-dimensional data, t-SNE plots can sometimes be mysterious or misleading. By exploring how it behaves in simple cases, we can learn to use it more effectively. We'll walk through a series of simple examples to illustrate what t-SNE diagrams can and cannot show. The t-SNE technique really is useful, but only if you know how to interpret it.
  2. Maaten, L. van den: Accelerating t-SNE using Tree-Based Algorithms (2014) 0.03
    0.03044693 = product of:
      0.06089386 = sum of:
        0.06089386 = product of:
          0.12178772 = sum of:
            0.12178772 = weight(_text_:t in 3886) [ClassicSimilarity], result of:
              0.12178772 = score(doc=3886,freq=10.0), product of:
                0.17876579 = queryWeight, product of:
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.04537884 = queryNorm
                0.6812697 = fieldWeight in 3886, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3886)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The paper investigates the acceleration of t-SNE, an embedding technique that is commonly used for the visualization of high-dimensional data in scatter plots, using two tree-based algorithms. In particular, the paper develops variants of the Barnes-Hut algorithm and of the dual-tree algorithm that approximate the gradient used for learning t-SNE embeddings in O(N log N). Our experiments show that the resulting algorithms substantially accelerate t-SNE, and that they make it possible to learn embeddings of data sets with millions of objects. Somewhat counterintuitively, the Barnes-Hut variant of t-SNE appears to outperform the dual-tree variant.
  3. Maaten, L. van den; Hinton, G.: Visualizing non-metric similarities in multiple maps (2012) 0.02
    0.020214936 = product of:
      0.04042987 = sum of:
        0.04042987 = product of:
          0.08085974 = sum of:
            0.08085974 = weight(_text_:t in 3884) [ClassicSimilarity], result of:
              0.08085974 = score(doc=3884,freq=6.0), product of:
                0.17876579 = queryWeight, product of:
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.04537884 = queryNorm
                0.45232224 = fieldWeight in 3884, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3884)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Techniques for multidimensional scaling visualize objects as points in a low-dimensional metric map. As a result, the visualizations are subject to the fundamental limitations of metric spaces. These limitations prevent multidimensional scaling from faithfully representing non-metric similarity data such as word associations or event co-occurrences. In particular, multidimensional scaling cannot faithfully represent intransitive pairwise similarities in a visualization, and it cannot faithfully visualize "central" objects. In this paper, we present an extension of a recently proposed multidimensional scaling technique called t-SNE. The extension aims to address the problems of traditional multidimensional scaling techniques when these techniques are used to visualize non-metric similarities. The new technique, called multiple maps t-SNE, alleviates these problems by constructing a collection of maps that reveal complementary structure in the similarity data. We apply multiple maps t-SNE to a large data set of word association data and to a data set of NIPS co-authorships, demonstrating its ability to successfully visualize non-metric similarities.
  4. Mandl, T.: Text mining und data mining (2013) 0.02
    0.01945183 = product of:
      0.03890366 = sum of:
        0.03890366 = product of:
          0.07780732 = sum of:
            0.07780732 = weight(_text_:t in 713) [ClassicSimilarity], result of:
              0.07780732 = score(doc=713,freq=2.0), product of:
                0.17876579 = queryWeight, product of:
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.04537884 = queryNorm
                0.43524727 = fieldWeight in 713, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.078125 = fieldNorm(doc=713)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  5. Huvila, I.: Mining qualitative data on human information behaviour from the Web (2010) 0.01
    0.013616281 = product of:
      0.027232561 = sum of:
        0.027232561 = product of:
          0.054465123 = sum of:
            0.054465123 = weight(_text_:t in 4676) [ClassicSimilarity], result of:
              0.054465123 = score(doc=4676,freq=2.0), product of:
                0.17876579 = queryWeight, product of:
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.04537884 = queryNorm
                0.30467308 = fieldWeight in 4676, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4676)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information und Wissen: global, sozial und frei? Proceedings des 12. Internationalen Symposiums für Informationswissenschaft (ISI 2011) ; Hildesheim, 9. - 11. März 2011. Hrsg.: J. Griesbaum, T. Mandl u. C. Womser-Hacker
  6. Song, J.; Huang, Y.; Qi, X.; Li, Y.; Li, F.; Fu, K.; Huang, T.: Discovering hierarchical topic evolution in time-stamped documents (2016) 0.01
    0.011671098 = product of:
      0.023342196 = sum of:
        0.023342196 = product of:
          0.04668439 = sum of:
            0.04668439 = weight(_text_:t in 2853) [ClassicSimilarity], result of:
              0.04668439 = score(doc=2853,freq=2.0), product of:
                0.17876579 = queryWeight, product of:
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.04537884 = queryNorm
                0.26114836 = fieldWeight in 2853, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2853)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  7. Wei, C.-P.; Lee, Y.-H.; Chiang, Y.-S.; Chen, C.-T.; Yang, C.C.C.: Exploiting temporal characteristics of features for effectively discovering event episodes from news corpora (2014) 0.01
    0.009725915 = product of:
      0.01945183 = sum of:
        0.01945183 = product of:
          0.03890366 = sum of:
            0.03890366 = weight(_text_:t in 1225) [ClassicSimilarity], result of:
              0.03890366 = score(doc=1225,freq=2.0), product of:
                0.17876579 = queryWeight, product of:
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.04537884 = queryNorm
                0.21762364 = fieldWeight in 1225, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1225)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  8. Ekbia, H.; Mattioli, M.; Kouper, I.; Arave, G.; Ghazinejad, A.; Bowman, T.; Suri, V.R.; Tsou, A.; Weingart, S.; Sugimoto, C.R.: Big data, bigger dilemmas : a critical review (2015) 0.01
    0.009725915 = product of:
      0.01945183 = sum of:
        0.01945183 = product of:
          0.03890366 = sum of:
            0.03890366 = weight(_text_:t in 2155) [ClassicSimilarity], result of:
              0.03890366 = score(doc=2155,freq=2.0), product of:
                0.17876579 = queryWeight, product of:
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.04537884 = queryNorm
                0.21762364 = fieldWeight in 2155, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2155)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  9. Li, D.; Tang, J.; Ding, Y.; Shuai, X.; Chambers, T.; Sun, G.; Luo, Z.; Zhang, J.: Topic-level opinion influence model (TOIM) : an investigation using tencent microblogging (2015) 0.01
    0.009725915 = product of:
      0.01945183 = sum of:
        0.01945183 = product of:
          0.03890366 = sum of:
            0.03890366 = weight(_text_:t in 2345) [ClassicSimilarity], result of:
              0.03890366 = score(doc=2345,freq=2.0), product of:
                0.17876579 = queryWeight, product of:
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.04537884 = queryNorm
                0.21762364 = fieldWeight in 2345, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2345)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  10. Gill, A.J.; Hinrichs-Krapels, S.; Blanke, T.; Grant, J.; Hedges, M.; Tanner, S.: Insight workflow : systematically combining human and computational methods to explore textual data (2017) 0.01
    0.009725915 = product of:
      0.01945183 = sum of:
        0.01945183 = product of:
          0.03890366 = sum of:
            0.03890366 = weight(_text_:t in 3682) [ClassicSimilarity], result of:
              0.03890366 = score(doc=3682,freq=2.0), product of:
                0.17876579 = queryWeight, product of:
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.04537884 = queryNorm
                0.21762364 = fieldWeight in 3682, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.9394085 = idf(docFreq=2338, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3682)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  11. Hallonsten, O.; Holmberg, D.: Analyzing structural stratification in the Swedish higher education system : data contextualization with policy-history analysis (2013) 0.01
    0.0076852585 = product of:
      0.015370517 = sum of:
        0.015370517 = product of:
          0.030741034 = sum of:
            0.030741034 = weight(_text_:22 in 668) [ClassicSimilarity], result of:
              0.030741034 = score(doc=668,freq=2.0), product of:
                0.15890898 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04537884 = queryNorm
                0.19345059 = fieldWeight in 668, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=668)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2013 19:43:01
  12. Vaughan, L.; Chen, Y.: Data mining from web search queries : a comparison of Google trends and Baidu index (2015) 0.01
    0.0076852585 = product of:
      0.015370517 = sum of:
        0.015370517 = product of:
          0.030741034 = sum of:
            0.030741034 = weight(_text_:22 in 1605) [ClassicSimilarity], result of:
              0.030741034 = score(doc=1605,freq=2.0), product of:
                0.15890898 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04537884 = queryNorm
                0.19345059 = fieldWeight in 1605, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1605)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Journal of the Association for Information Science and Technology. 66(2015) no.1, S.13-22
  13. Fonseca, F.; Marcinkowski, M.; Davis, C.: Cyber-human systems of thought and understanding (2019) 0.01
    0.0076852585 = product of:
      0.015370517 = sum of:
        0.015370517 = product of:
          0.030741034 = sum of:
            0.030741034 = weight(_text_:22 in 5011) [ClassicSimilarity], result of:
              0.030741034 = score(doc=5011,freq=2.0), product of:
                0.15890898 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04537884 = queryNorm
                0.19345059 = fieldWeight in 5011, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5011)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    7. 3.2019 16:32:22
  14. Jäger, L.: Von Big Data zu Big Brother (2018) 0.01
    0.006148207 = product of:
      0.012296414 = sum of:
        0.012296414 = product of:
          0.024592828 = sum of:
            0.024592828 = weight(_text_:22 in 5234) [ClassicSimilarity], result of:
              0.024592828 = score(doc=5234,freq=2.0), product of:
                0.15890898 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04537884 = queryNorm
                0.15476047 = fieldWeight in 5234, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5234)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 1.2018 11:33:49
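
  Note on the score trees
    Each score breakdown above follows the same arithmetic, which can be retraced by hand. The sketch below is an illustration only, not the search engine's own code: the function name and parameters are hypothetical, and it assumes the ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), with queryNorm, fieldNorm and the two coord(1/2) factors taken as given from the listing. Lucene computes in single precision, so the last digits may differ slightly.

```python
import math


def classic_similarity_score(freq, doc_freq, max_docs, query_norm, field_norm,
                             coords=(0.5, 0.5)):
    """Recompute one score tree from the listing (hypothetical helper)."""
    # tf(freq) is the square root of the term frequency.
    tf = math.sqrt(freq)
    # idf(docFreq, maxDocs) as assumed above for ClassicSimilarity.
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    # queryWeight = idf * queryNorm
    query_weight = idf * query_norm
    # fieldWeight = tf * idf * fieldNorm
    field_weight = tf * idf * field_norm
    # weight(_text_:term in doc) = queryWeight * fieldWeight
    score = query_weight * field_weight
    # Each coord(1/2) line in the tree multiplies the score by 0.5.
    for c in coords:
        score *= c
    return score


if __name__ == "__main__":
    # Inputs copied from result 1 (doc=3887): freq=8.0, docFreq=2338,
    # maxDocs=44218, queryNorm=0.04537884, fieldNorm=0.0625.
    s = classic_similarity_score(8.0, 2338, 44218, 0.04537884, 0.0625)
    print(round(s, 4))
```

    With the inputs of result 1 this should agree with the listed 0.031122928 to within rounding. Because idf and queryNorm are identical for every hit on the same term, the ranking among those hits depends only on freq and fieldNorm.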