Document (#40888)

Author
Maaten, L. van den
Title
Accelerating t-SNE using Tree-Based Algorithms
Source
Journal of machine learning research. 15(2014), pp. 3221-3245
Year
2014
Abstract
The paper investigates the acceleration of t-SNE, an embedding technique that is commonly used for the visualization of high-dimensional data in scatter plots, using two tree-based algorithms. In particular, the paper develops variants of the Barnes-Hut algorithm and of the dual-tree algorithm that approximate the gradient used for learning t-SNE embeddings in O(N log N). Our experiments show that the resulting algorithms substantially accelerate t-SNE, and that they make it possible to learn embeddings of data sets with millions of objects. Somewhat counterintuitively, the Barnes-Hut variant of t-SNE appears to outperform the dual-tree variant.
Content
See also: https://lvdmaaten.github.io/tsne/.
Theme
Data Mining
Visualization
Object
tSNE

Similar documents (content)

  1. Su, S.; Li, X.; Cheng, X.; Sun, C.: Location-aware targeted influence maximization in social networks (2018) 0.16
    0.16411485 = sum of:
      0.16411485 = product of:
        0.6838119 = sum of:
          0.012482834 = weight(abstract_txt:data in 953) [ClassicSimilarity], result of:
            0.012482834 = score(doc=953,freq=1.0), product of:
              0.05904368 = queryWeight, product of:
                1.0165083 = boost
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.01717128 = queryNorm
              0.21141694 = fieldWeight in 953, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.0625 = fieldNorm(doc=953)
          0.013967475 = weight(abstract_txt:paper in 953) [ClassicSimilarity], result of:
            0.013967475 = score(doc=953,freq=1.0), product of:
              0.063637026 = queryWeight, product of:
                1.0553079 = boost
                3.5117857 = idf(docFreq=3431, maxDocs=42306)
                0.01717128 = queryNorm
              0.21948661 = fieldWeight in 953, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5117857 = idf(docFreq=3431, maxDocs=42306)
                0.0625 = fieldNorm(doc=953)
          0.07788474 = weight(abstract_txt:approximate in 953) [ClassicSimilarity], result of:
            0.07788474 = score(doc=953,freq=1.0), product of:
              0.15882646 = queryWeight, product of:
                1.1788828 = boost
                7.8460217 = idf(docFreq=44, maxDocs=42306)
                0.01717128 = queryNorm
              0.49037635 = fieldWeight in 953, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8460217 = idf(docFreq=44, maxDocs=42306)
                0.0625 = fieldNorm(doc=953)
          0.060623404 = weight(abstract_txt:algorithm in 953) [ClassicSimilarity], result of:
            0.060623404 = score(doc=953,freq=1.0), product of:
              0.16932645 = queryWeight, product of:
                1.7214192 = boost
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.01717128 = queryNorm
              0.35802677 = fieldWeight in 953, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.0625 = fieldNorm(doc=953)
          0.1599795 = weight(abstract_txt:algorithms in 953) [ClassicSimilarity], result of:
            0.1599795 = score(doc=953,freq=3.0), product of:
              0.25664383 = queryWeight, product of:
                2.595585 = boost
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.01717128 = queryNorm
              0.6233522 = fieldWeight in 953, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.0625 = fieldNorm(doc=953)
          0.3588739 = weight(abstract_txt:tree in 953) [ClassicSimilarity], result of:
            0.3588739 = score(doc=953,freq=3.0), product of:
              0.48405582 = queryWeight, product of:
                4.11611 = boost
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.01717128 = queryNorm
              0.7413895 = fieldWeight in 953, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.0625 = fieldNorm(doc=953)
        0.24 = coord(6/25)
    
  2. White, K.J.; Sutcliffe, R.F.E.: Applying incremental tree induction to retrieval : from manuals and medical texts (2006) 0.16
    0.16243403 = sum of:
      0.16243403 = product of:
        0.6768085 = sum of:
          0.02206674 = weight(abstract_txt:data in 45) [ClassicSimilarity], result of:
            0.02206674 = score(doc=45,freq=2.0), product of:
              0.05904368 = queryWeight, product of:
                1.0165083 = boost
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.01717128 = queryNorm
              0.37373587 = fieldWeight in 45, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.078125 = fieldNorm(doc=45)
          0.03417724 = weight(abstract_txt:using in 45) [ClassicSimilarity], result of:
            0.03417724 = score(doc=45,freq=4.0), product of:
              0.06273298 = queryWeight, product of:
                1.047785 = boost
                3.486752 = idf(docFreq=3518, maxDocs=42306)
                0.01717128 = queryNorm
              0.544805 = fieldWeight in 45, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.486752 = idf(docFreq=3518, maxDocs=42306)
                0.078125 = fieldNorm(doc=45)
          0.07376591 = weight(abstract_txt:substantially in 45) [ClassicSimilarity], result of:
            0.07376591 = score(doc=45,freq=1.0), product of:
              0.13200338 = queryWeight, product of:
                1.0747359 = boost
                7.1528745 = idf(docFreq=89, maxDocs=42306)
                0.01717128 = queryNorm
              0.55881834 = fieldWeight in 45, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1528745 = idf(docFreq=89, maxDocs=42306)
                0.078125 = fieldNorm(doc=45)
          0.022426981 = weight(abstract_txt:that in 45) [ClassicSimilarity], result of:
            0.022426981 = score(doc=45,freq=4.0), product of:
              0.059684534 = queryWeight, product of:
                1.4453404 = boost
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.01717128 = queryNorm
              0.37575868 = fieldWeight in 45, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.078125 = fieldNorm(doc=45)
          0.07577925 = weight(abstract_txt:algorithm in 45) [ClassicSimilarity], result of:
            0.07577925 = score(doc=45,freq=1.0), product of:
              0.16932645 = queryWeight, product of:
                1.7214192 = boost
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.01717128 = queryNorm
              0.44753346 = fieldWeight in 45, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.078125 = fieldNorm(doc=45)
          0.4485924 = weight(abstract_txt:tree in 45) [ClassicSimilarity], result of:
            0.4485924 = score(doc=45,freq=3.0), product of:
              0.48405582 = queryWeight, product of:
                4.11611 = boost
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.01717128 = queryNorm
              0.9267369 = fieldWeight in 45, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.078125 = fieldNorm(doc=45)
        0.24 = coord(6/25)
    
  3. French, J.C.; Brown, D.E.; Kim, N.-H.: A classification approach to Boolean query reformulation (1997) 0.16
    0.15818074 = sum of:
      0.15818074 = product of:
        0.7909037 = sum of:
          0.013670896 = weight(abstract_txt:using in 198) [ClassicSimilarity], result of:
            0.013670896 = score(doc=198,freq=1.0), product of:
              0.06273298 = queryWeight, product of:
                1.047785 = boost
                3.486752 = idf(docFreq=3518, maxDocs=42306)
                0.01717128 = queryNorm
              0.217922 = fieldWeight in 198, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.486752 = idf(docFreq=3518, maxDocs=42306)
                0.0625 = fieldNorm(doc=198)
          0.012686617 = weight(abstract_txt:that in 198) [ClassicSimilarity], result of:
            0.012686617 = score(doc=198,freq=2.0), product of:
              0.059684534 = queryWeight, product of:
                1.4453404 = boost
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.01717128 = queryNorm
              0.2125612 = fieldWeight in 198, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.0625 = fieldNorm(doc=198)
          0.08573444 = weight(abstract_txt:algorithm in 198) [ClassicSimilarity], result of:
            0.08573444 = score(doc=198,freq=2.0), product of:
              0.16932645 = queryWeight, product of:
                1.7214192 = boost
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.01717128 = queryNorm
              0.5063263 = fieldWeight in 198, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.0625 = fieldNorm(doc=198)
          0.13062271 = weight(abstract_txt:algorithms in 198) [ClassicSimilarity], result of:
            0.13062271 = score(doc=198,freq=2.0), product of:
              0.25664383 = queryWeight, product of:
                2.595585 = boost
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.01717128 = queryNorm
              0.50896496 = fieldWeight in 198, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.0625 = fieldNorm(doc=198)
          0.548189 = weight(abstract_txt:tree in 198) [ClassicSimilarity], result of:
            0.548189 = score(doc=198,freq=7.0), product of:
              0.48405582 = queryWeight, product of:
                4.11611 = boost
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.01717128 = queryNorm
              1.1324912 = fieldWeight in 198, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.0625 = fieldNorm(doc=198)
        0.2 = coord(5/25)
    
  4. French, J.C.; Powell, A.L.; Schulman, E.: Using clustering strategies for creating authority files (2000) 0.15
    0.14952417 = sum of:
      0.14952417 = product of:
        0.5340149 = sum of:
          0.02203671 = weight(abstract_txt:used in 5812) [ClassicSimilarity], result of:
            0.02203671 = score(doc=5812,freq=2.0), product of:
              0.0589901 = queryWeight, product of:
                1.016047 = boost
                3.381136 = idf(docFreq=3910, maxDocs=42306)
                0.01717128 = queryNorm
              0.37356627 = fieldWeight in 5812, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.381136 = idf(docFreq=3910, maxDocs=42306)
                0.078125 = fieldNorm(doc=5812)
          0.02702613 = weight(abstract_txt:data in 5812) [ClassicSimilarity], result of:
            0.02702613 = score(doc=5812,freq=3.0), product of:
              0.05904368 = queryWeight, product of:
                1.0165083 = boost
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.01717128 = queryNorm
              0.45773113 = fieldWeight in 5812, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.078125 = fieldNorm(doc=5812)
          0.01708862 = weight(abstract_txt:using in 5812) [ClassicSimilarity], result of:
            0.01708862 = score(doc=5812,freq=1.0), product of:
              0.06273298 = queryWeight, product of:
                1.047785 = boost
                3.486752 = idf(docFreq=3518, maxDocs=42306)
                0.01717128 = queryNorm
              0.2724025 = fieldWeight in 5812, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.486752 = idf(docFreq=3518, maxDocs=42306)
                0.078125 = fieldNorm(doc=5812)
          0.084824316 = weight(abstract_txt:variants in 5812) [ClassicSimilarity], result of:
            0.084824316 = score(doc=5812,freq=1.0), product of:
              0.14488658 = queryWeight, product of:
                1.125961 = boost
                7.493801 = idf(docFreq=63, maxDocs=42306)
                0.01717128 = queryNorm
              0.5854532 = fieldWeight in 5812, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.493801 = idf(docFreq=63, maxDocs=42306)
                0.078125 = fieldNorm(doc=5812)
          0.13768207 = weight(abstract_txt:approximate in 5812) [ClassicSimilarity], result of:
            0.13768207 = score(doc=5812,freq=2.0), product of:
              0.15882646 = queryWeight, product of:
                1.1788828 = boost
                7.8460217 = idf(docFreq=44, maxDocs=42306)
                0.01717128 = queryNorm
              0.8668711 = fieldWeight in 5812, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.8460217 = idf(docFreq=44, maxDocs=42306)
                0.078125 = fieldNorm(doc=5812)
          0.011213491 = weight(abstract_txt:that in 5812) [ClassicSimilarity], result of:
            0.011213491 = score(doc=5812,freq=1.0), product of:
              0.059684534 = queryWeight, product of:
                1.4453404 = boost
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.01717128 = queryNorm
              0.18787934 = fieldWeight in 5812, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.078125 = fieldNorm(doc=5812)
          0.23414356 = weight(abstract_txt:variant in 5812) [ClassicSimilarity], result of:
            0.23414356 = score(doc=5812,freq=2.0), product of:
              0.28510362 = queryWeight, product of:
                2.2337039 = boost
                7.4331765 = idf(docFreq=67, maxDocs=42306)
                0.01717128 = queryNorm
              0.82125777 = fieldWeight in 5812, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.4331765 = idf(docFreq=67, maxDocs=42306)
                0.078125 = fieldNorm(doc=5812)
        0.28 = coord(7/25)
    
  5. Rasmussen, E.: Clustering algorithms (1992) 0.11
    0.11411759 = sum of:
      0.11411759 = product of:
        0.47548997 = sum of:
          0.017629368 = weight(abstract_txt:used in 4514) [ClassicSimilarity], result of:
            0.017629368 = score(doc=4514,freq=2.0), product of:
              0.0589901 = queryWeight, product of:
                1.016047 = boost
                3.381136 = idf(docFreq=3910, maxDocs=42306)
                0.01717128 = queryNorm
              0.298853 = fieldWeight in 4514, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.381136 = idf(docFreq=3910, maxDocs=42306)
                0.0625 = fieldNorm(doc=4514)
          0.021620903 = weight(abstract_txt:data in 4514) [ClassicSimilarity], result of:
            0.021620903 = score(doc=4514,freq=3.0), product of:
              0.05904368 = queryWeight, product of:
                1.0165083 = boost
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.01717128 = queryNorm
              0.3661849 = fieldWeight in 4514, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.382671 = idf(docFreq=3904, maxDocs=42306)
                0.0625 = fieldNorm(doc=4514)
          0.012686617 = weight(abstract_txt:that in 4514) [ClassicSimilarity], result of:
            0.012686617 = score(doc=4514,freq=2.0), product of:
              0.059684534 = queryWeight, product of:
                1.4453404 = boost
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.01717128 = queryNorm
              0.2125612 = fieldWeight in 4514, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.0625 = fieldNorm(doc=4514)
          0.08573444 = weight(abstract_txt:algorithm in 4514) [ClassicSimilarity], result of:
            0.08573444 = score(doc=4514,freq=2.0), product of:
              0.16932645 = queryWeight, product of:
                1.7214192 = boost
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.01717128 = queryNorm
              0.5063263 = fieldWeight in 4514, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.7284284 = idf(docFreq=373, maxDocs=42306)
                0.0625 = fieldNorm(doc=4514)
          0.13062271 = weight(abstract_txt:algorithms in 4514) [ClassicSimilarity], result of:
            0.13062271 = score(doc=4514,freq=2.0), product of:
              0.25664383 = queryWeight, product of:
                2.595585 = boost
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.01717128 = queryNorm
              0.50896496 = fieldWeight in 4514, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.758281 = idf(docFreq=362, maxDocs=42306)
                0.0625 = fieldNorm(doc=4514)
          0.20719595 = weight(abstract_txt:tree in 4514) [ClassicSimilarity], result of:
            0.20719595 = score(doc=4514,freq=1.0), product of:
              0.48405582 = queryWeight, product of:
                4.11611 = boost
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.01717128 = queryNorm
              0.42804146 = fieldWeight in 4514, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8486633 = idf(docFreq=121, maxDocs=42306)
                0.0625 = fieldNorm(doc=4514)
        0.24 = coord(6/25)
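
The score breakdowns above appear to follow the classic Lucene TF-IDF scheme: each matched term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is scaled by coord (matched terms / query terms). The sketch below recomputes the first entry (doc 953) from the values printed in its breakdown. The function names are illustrative only, and the idf formula 1 + ln(maxDocs / (docFreq + 1)) is an assumption inferred from the listed numbers, not something the record states.

```python
import math

# Constants copied from the breakdown of doc 953.
MAX_DOCS = 42306
QUERY_NORM = 0.01717128
FIELD_NORM = 0.0625
COORD = 6 / 25  # six of the 25 query terms matched


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency (assumed form: 1 + ln(maxDocs / (docFreq + 1)))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    """Term frequency factor: square root of the raw term count."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, boost):
    """Contribution of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = tf(freq) * term_idf * FIELD_NORM
    return query_weight * field_weight


# term -> (freq in doc 953, docFreq, boost), as listed in the breakdown.
matched_terms = {
    "data": (1.0, 3904, 1.0165083),
    "paper": (1.0, 3431, 1.0553079),
    "approximate": (1.0, 44, 1.1788828),
    "algorithm": (1.0, 373, 1.7214192),
    "algorithms": (3.0, 362, 2.595585),
    "tree": (3.0, 121, 4.11611),
}

term_sum = sum(term_score(*values) for values in matched_terms.values())
total = COORD * term_sum

print(f"sum of term scores: {term_sum:.7f}")
print(f"final score:        {total:.8f}")
```

Under these assumptions the result should land very close to the 0.16411485 shown for the first entry. Small differences in the last digits are expected, because the listed values look like single-precision floats while Python computes in double precision.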