Document (#40887)

Author
Maaten, L. van den
Title
Accelerating t-SNE using Tree-Based Algorithms
Source
Journal of machine learning research. 15(2014), pp. 3221-3245
Year
2014
Abstract
The paper investigates the acceleration of t-SNE, an embedding technique that is commonly used for the visualization of high-dimensional data in scatter plots, using two tree-based algorithms. In particular, the paper develops variants of the Barnes-Hut algorithm and of the dual-tree algorithm that approximate the gradient used for learning t-SNE embeddings in O(N log N). Our experiments show that the resulting algorithms substantially accelerate t-SNE, and that they make it possible to learn embeddings of data sets with millions of objects. Somewhat counterintuitively, the Barnes-Hut variant of t-SNE appears to outperform the dual-tree variant.
Content
See also: https://lvdmaaten.github.io/tsne/.
Theme
Data Mining
Visualization
Object
tSNE

Similar documents (content)

  1. Zhu, Y.; Quan, L.; Chen, P.-Y.; Kim, M.C.; Che, C.: Predicting coauthorship using bibliographic network embedding (2023) 0.29
    0.2877294 = sum of:
      0.2877294 = product of:
        0.79924834 = sum of:
          0.022126839 = weight(abstract_txt:data in 917) [ClassicSimilarity], result of:
            0.022126839 = score(doc=917,freq=4.0), product of:
              0.05305643 = queryWeight, product of:
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.015902547 = queryNorm
              0.41704348 = fieldWeight in 917, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=917)
          0.011293315 = weight(abstract_txt:used in 917) [ClassicSimilarity], result of:
            0.011293315 = score(doc=917,freq=1.0), product of:
              0.05378891 = queryWeight, product of:
                1.0068792 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.015902547 = queryNorm
              0.2099562 = fieldWeight in 917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0625 = fieldNorm(doc=917)
          0.0469704 = weight(abstract_txt:dimensional in 917) [ClassicSimilarity], result of:
            0.0469704 = score(doc=917,freq=1.0), product of:
              0.11041243 = queryWeight, product of:
                1.0200583 = boost
                6.806538 = idf(docFreq=132, maxDocs=44218)
                0.015902547 = queryNorm
              0.42540863 = fieldWeight in 917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.806538 = idf(docFreq=132, maxDocs=44218)
                0.0625 = fieldNorm(doc=917)
          0.01237307 = weight(abstract_txt:using in 917) [ClassicSimilarity], result of:
            0.01237307 = score(doc=917,freq=1.0), product of:
              0.057164986 = queryWeight, product of:
                1.0379969 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.015902547 = queryNorm
              0.21644491 = fieldWeight in 917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=917)
          0.14392729 = weight(abstract_txt:embedding in 917) [ClassicSimilarity], result of:
            0.14392729 = score(doc=917,freq=4.0), product of:
              0.14673844 = queryWeight, product of:
                1.1759475 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.015902547 = queryNorm
              0.9808425 = fieldWeight in 917, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.0625 = fieldNorm(doc=917)
          0.11644061 = weight(abstract_txt:gradient in 917) [ClassicSimilarity], result of:
            0.11644061 = score(doc=917,freq=1.0), product of:
              0.20224203 = queryWeight, product of:
                1.3805486 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.015902547 = queryNorm
              0.5757488 = fieldWeight in 917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.0625 = fieldNorm(doc=917)
          0.01372848 = weight(abstract_txt:that in 917) [ClassicSimilarity], result of:
            0.01372848 = score(doc=917,freq=3.0), product of:
              0.05352167 = queryWeight, product of:
                1.4204005 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015902547 = queryNorm
              0.2565032 = fieldWeight in 917, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=917)
          0.08310011 = weight(abstract_txt:algorithms in 917) [ClassicSimilarity], result of:
            0.08310011 = score(doc=917,freq=1.0), product of:
              0.23293957 = queryWeight, product of:
                2.5662458 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.015902547 = queryNorm
              0.35674536 = fieldWeight in 917, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.0625 = fieldNorm(doc=917)
          0.34928825 = weight(abstract_txt:embeddings in 917) [ClassicSimilarity], result of:
            0.34928825 = score(doc=917,freq=2.0), product of:
              0.42065343 = queryWeight, product of:
                2.8157444 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.015902547 = queryNorm
              0.8303468 = fieldWeight in 917, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.0625 = fieldNorm(doc=917)
        0.36 = coord(9/25)
    
  2. Su, S.; Li, X.; Cheng, X.; Sun, C.: Location-aware targeted influence maximization in social networks (2018) 0.15
    0.15044573 = sum of:
      0.15044573 = product of:
        0.6268572 = sum of:
          0.011063419 = weight(abstract_txt:data in 4034) [ClassicSimilarity], result of:
            0.011063419 = score(doc=4034,freq=1.0), product of:
              0.05305643 = queryWeight, product of:
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.015902547 = queryNorm
              0.20852174 = fieldWeight in 4034, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=4034)
          0.012418759 = weight(abstract_txt:paper in 4034) [ClassicSimilarity], result of:
            0.012418759 = score(doc=4034,freq=1.0), product of:
              0.057305623 = queryWeight, product of:
                1.0392729 = boost
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.015902547 = queryNorm
              0.216711 = fieldWeight in 4034, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.0625 = fieldNorm(doc=4034)
          0.07316671 = weight(abstract_txt:approximate in 4034) [ClassicSimilarity], result of:
            0.07316671 = score(doc=4034,freq=1.0), product of:
              0.14836933 = queryWeight, product of:
                1.1824644 = boost
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.015902547 = queryNorm
              0.49313906 = fieldWeight in 4034, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.0625 = fieldNorm(doc=4034)
          0.055327233 = weight(abstract_txt:algorithm in 4034) [ClassicSimilarity], result of:
            0.055327233 = score(doc=4034,freq=1.0), product of:
              0.1551569 = queryWeight, product of:
                1.7100804 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.015902547 = queryNorm
              0.35658893 = fieldWeight in 4034, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=4034)
          0.14393361 = weight(abstract_txt:algorithms in 4034) [ClassicSimilarity], result of:
            0.14393361 = score(doc=4034,freq=3.0), product of:
              0.23293957 = queryWeight, product of:
                2.5662458 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.015902547 = queryNorm
              0.6179011 = fieldWeight in 4034, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.0625 = fieldNorm(doc=4034)
          0.3309475 = weight(abstract_txt:tree in 4034) [ClassicSimilarity], result of:
            0.3309475 = score(doc=4034,freq=3.0), product of:
              0.44663638 = queryWeight, product of:
                4.1032033 = boost
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.015902547 = queryNorm
              0.74097747 = fieldWeight in 4034, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.0625 = fieldNorm(doc=4034)
        0.24 = coord(6/25)
    
  3. Safder, I.; Ali, M.; Aljohani, N.R.; Nawaz, R.; Hassan, S.-U.: Neural machine translation for in-text citation classification (2023) 0.15
    0.1503496 = sum of:
      0.1503496 = product of:
        0.53696287 = sum of:
          0.011063419 = weight(abstract_txt:data in 1053) [ClassicSimilarity], result of:
            0.011063419 = score(doc=1053,freq=1.0), product of:
              0.05305643 = queryWeight, product of:
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.015902547 = queryNorm
              0.20852174 = fieldWeight in 1053, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=1053)
          0.01237307 = weight(abstract_txt:using in 1053) [ClassicSimilarity], result of:
            0.01237307 = score(doc=1053,freq=1.0), product of:
              0.057164986 = queryWeight, product of:
                1.0379969 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.015902547 = queryNorm
              0.21644491 = fieldWeight in 1053, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=1053)
          0.017562777 = weight(abstract_txt:paper in 1053) [ClassicSimilarity], result of:
            0.017562777 = score(doc=1053,freq=2.0), product of:
              0.057305623 = queryWeight, product of:
                1.0392729 = boost
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.015902547 = queryNorm
              0.30647564 = fieldWeight in 1053, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.467376 = idf(docFreq=3749, maxDocs=44218)
                0.0625 = fieldNorm(doc=1053)
          0.066785604 = weight(abstract_txt:outperform in 1053) [ClassicSimilarity], result of:
            0.066785604 = score(doc=1053,freq=1.0), product of:
              0.1396123 = queryWeight, product of:
                1.1470381 = boost
                7.653836 = idf(docFreq=56, maxDocs=44218)
                0.015902547 = queryNorm
              0.47836474 = fieldWeight in 1053, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.653836 = idf(docFreq=56, maxDocs=44218)
                0.0625 = fieldNorm(doc=1053)
          0.071963646 = weight(abstract_txt:embedding in 1053) [ClassicSimilarity], result of:
            0.071963646 = score(doc=1053,freq=1.0), product of:
              0.14673844 = queryWeight, product of:
                1.1759475 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.015902547 = queryNorm
              0.49042124 = fieldWeight in 1053, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.0625 = fieldNorm(doc=1053)
          0.007926142 = weight(abstract_txt:that in 1053) [ClassicSimilarity], result of:
            0.007926142 = score(doc=1053,freq=1.0), product of:
              0.05352167 = queryWeight, product of:
                1.4204005 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015902547 = queryNorm
              0.1480922 = fieldWeight in 1053, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=1053)
          0.34928825 = weight(abstract_txt:embeddings in 1053) [ClassicSimilarity], result of:
            0.34928825 = score(doc=1053,freq=2.0), product of:
              0.42065343 = queryWeight, product of:
                2.8157444 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.015902547 = queryNorm
              0.8303468 = fieldWeight in 1053, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.0625 = fieldNorm(doc=1053)
        0.28 = coord(7/25)
    
  4. White, K.J.; Sutcliffe, R.F.E.: Applying incremental tree induction to retrieval : from manuals and medical texts (2006) 0.15
    0.14897008 = sum of:
      0.14897008 = product of:
        0.6207087 = sum of:
          0.019557547 = weight(abstract_txt:data in 5044) [ClassicSimilarity], result of:
            0.019557547 = score(doc=5044,freq=2.0), product of:
              0.05305643 = queryWeight, product of:
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.015902547 = queryNorm
              0.36861783 = fieldWeight in 5044, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.078125 = fieldNorm(doc=5044)
          0.030932678 = weight(abstract_txt:using in 5044) [ClassicSimilarity], result of:
            0.030932678 = score(doc=5044,freq=4.0), product of:
              0.057164986 = queryWeight, product of:
                1.0379969 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.015902547 = queryNorm
              0.5411123 = fieldWeight in 5044, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.078125 = fieldNorm(doc=5044)
          0.06755972 = weight(abstract_txt:substantially in 5044) [ClassicSimilarity], result of:
            0.06755972 = score(doc=5044,freq=1.0), product of:
              0.12124216 = queryWeight, product of:
                1.0689142 = boost
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.015902547 = queryNorm
              0.5572296 = fieldWeight in 5044, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.132539 = idf(docFreq=95, maxDocs=44218)
                0.078125 = fieldNorm(doc=5044)
          0.019815354 = weight(abstract_txt:that in 5044) [ClassicSimilarity], result of:
            0.019815354 = score(doc=5044,freq=4.0), product of:
              0.05352167 = queryWeight, product of:
                1.4204005 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015902547 = queryNorm
              0.3702305 = fieldWeight in 5044, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=5044)
          0.06915904 = weight(abstract_txt:algorithm in 5044) [ClassicSimilarity], result of:
            0.06915904 = score(doc=5044,freq=1.0), product of:
              0.1551569 = queryWeight, product of:
                1.7100804 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.015902547 = queryNorm
              0.44573617 = fieldWeight in 5044, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.078125 = fieldNorm(doc=5044)
          0.41368437 = weight(abstract_txt:tree in 5044) [ClassicSimilarity], result of:
            0.41368437 = score(doc=5044,freq=3.0), product of:
              0.44663638 = queryWeight, product of:
                4.1032033 = boost
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.015902547 = queryNorm
              0.92622185 = fieldWeight in 5044, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.078125 = fieldNorm(doc=5044)
        0.24 = coord(6/25)
    
  5. French, J.C.; Brown, D.E.; Kim, N.-H.: A classification approach to Boolean query reformulation (1997) 0.14
    0.14497577 = sum of:
      0.14497577 = product of:
        0.7248788 = sum of:
          0.01237307 = weight(abstract_txt:using in 197) [ClassicSimilarity], result of:
            0.01237307 = score(doc=197,freq=1.0), product of:
              0.057164986 = queryWeight, product of:
                1.0379969 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.015902547 = queryNorm
              0.21644491 = fieldWeight in 197, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=197)
          0.011209257 = weight(abstract_txt:that in 197) [ClassicSimilarity], result of:
            0.011209257 = score(doc=197,freq=2.0), product of:
              0.05352167 = queryWeight, product of:
                1.4204005 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.015902547 = queryNorm
              0.20943399 = fieldWeight in 197, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=197)
          0.07824452 = weight(abstract_txt:algorithm in 197) [ClassicSimilarity], result of:
            0.07824452 = score(doc=197,freq=2.0), product of:
              0.1551569 = queryWeight, product of:
                1.7100804 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.015902547 = queryNorm
              0.5042929 = fieldWeight in 197, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=197)
          0.1175213 = weight(abstract_txt:algorithms in 197) [ClassicSimilarity], result of:
            0.1175213 = score(doc=197,freq=2.0), product of:
              0.23293957 = queryWeight, product of:
                2.5662458 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.015902547 = queryNorm
              0.5045141 = fieldWeight in 197, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.0625 = fieldNorm(doc=197)
          0.50553066 = weight(abstract_txt:tree in 197) [ClassicSimilarity], result of:
            0.50553066 = score(doc=197,freq=7.0), product of:
              0.44663638 = queryWeight, product of:
                4.1032033 = boost
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.015902547 = queryNorm
              1.1318618 = fieldWeight in 197, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.8448567 = idf(docFreq=127, maxDocs=44218)
                0.0625 = fieldNorm(doc=197)
        0.2 = coord(5/25)
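
The score breakdowns above all follow the same arithmetic, which can be sketched in a few lines. The following is a minimal illustration, not the catalog's actual code: it assumes the formula shape visible in the explain trees (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, final score = coord * sum of term scores). The helper names `idf`, `term_score` and `document_score` are hypothetical; the numeric inputs are copied from the breakdown of entry 5 (doc=197).

```python
import math

# Collection-level constants as printed in the explain trees above.
MAX_DOCS = 44218
QUERY_NORM = 0.015902547


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency in the form the breakdowns appear to use."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, boost=1.0):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, field_norm, matched, total):
    """coord(matched/total) times the sum of the per-term scores."""
    coord = matched / total
    return coord * sum(
        term_score(freq, doc_freq, field_norm, boost)
        for _term, freq, doc_freq, boost in terms
    )


# (term, freq, docFreq, boost) for doc=197, copied from entry 5 above.
DOC_197_TERMS = [
    ("using", 1.0, 3765, 1.0379969),
    ("that", 2.0, 11241, 1.4204005),
    ("algorithm", 2.0, 399, 1.7100804),
    ("algorithms", 2.0, 398, 2.5662458),
    ("tree", 7.0, 127, 4.1032033),
]

score_197 = document_score(DOC_197_TERMS, field_norm=0.0625, matched=5, total=25)
print(f"doc=197 score: {score_197:.7f}")
```

Under these assumptions the recomputed value should agree with the 0.14497577 listed for entry 5 up to floating-point rounding. The breakdown also shows why rare query terms such as "tree" and "embeddings" dominate the ranking: idf enters each term score twice, once through queryWeight and once through fieldWeight.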