Document (#38143)

Author
Mohr, J.W.
Bogdanov, P.
Title
Topic models : what they are and why they matter
Source
Poetics. xxx(2013), pp. xxx-xxx
Year
2013
Abstract
We provide a brief, non-technical introduction to the text mining methodology known as "topic modeling." We summarize the theory and background of the method and discuss what kinds of things are found by topic models. Using a text corpus comprised of the eight articles from the special issue of Poetics on the subject of topic models, we run a topic model on these articles, both as a way to introduce the methodology and also to help summarize some of the ways in which social and cultural scientists are using topic models. We review some of the critiques and debates over the use of the method and finally, we link these developments back to some of the original innovations in the field of content analysis that were pioneered by Harold D. Lasswell and colleagues during and just after World War II.
Content
Cf.: http://dx.doi.org/10.1016/j.poetic.2013.10.001/.
Theme
Data Mining
Knowledge representation

Similar documents (author)

  1. Mohr, J.: Die Schule verschläft das Informationszeitalter (1994) 6.19
    6.190705 = sum of:
      6.190705 = weight(author_txt:mohr in 7834) [ClassicSimilarity], result of:
        6.190705 = fieldWeight in 7834, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.625 = fieldNorm(doc=7834)
    
  2. Mohr, H.: Wissen : Prinzip und Ressource ; [die Zukunft gehört der Wissensgesellschaft] (1999) 6.19
    6.190705 = sum of:
      6.190705 = weight(author_txt:mohr in 5019) [ClassicSimilarity], result of:
        6.190705 = fieldWeight in 5019, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.625 = fieldNorm(doc=5019)
    
  3. Rostek, L.; Mohr, W.; Fischer, D.H.: Weaving a web : the structure and creation of an object network representing an electronic reference work (1993) 3.71
    3.7144227 = sum of:
      3.7144227 = weight(author_txt:mohr in 8418) [ClassicSimilarity], result of:
        3.7144227 = fieldWeight in 8418, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.375 = fieldNorm(doc=8418)
    
  4. Anderson, R.; Birbeck, M.; Kay, M.; Livingstone, S.; Loesgen, B.; Martin, D.; Mohr, S.; Ozu, N.; Peat, B.; Pinnock, J.; Stark, P.; Williams, K.: XML professionell : behandelt W3C DOM, SAX, CSS, XSLT, DTDs, XML Schemas, XLink, XPointer, XPath, E-Commerce, BizTalk, B2B, SOAP, WAP, WML (2000) 1.86
    1.8572114 = sum of:
      1.8572114 = weight(author_txt:mohr in 729) [ClassicSimilarity], result of:
        1.8572114 = fieldWeight in 729, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.1875 = fieldNorm(doc=729)
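
The author scores above all follow the same arithmetic: a term-frequency factor, an inverse-document-frequency factor, and a per-document field norm are multiplied together. The following is a minimal sketch of that calculation, not the catalog's actual code. The function names are hypothetical, and the `tf` and `idf` formulas are assumptions chosen because they reproduce the values listed (`tf` as the square root of the frequency, `idf` as `1 + ln(maxDocs / (docFreq + 1))`).

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor (assumed: square root of the raw frequency)."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency (assumed formula that matches the listing)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the explain trees above."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from entry 1 above (doc 7834, term "mohr" in author_txt).
score_7834 = field_weight(freq=1.0, doc_freq=5, max_docs=44218, field_norm=0.625)
print(round(score_7834, 4))
```

With the inputs of entry 1 this should come out close to the listed 6.190705, up to floating-point rounding. Only `fieldNorm` differs between the four entries, which is why the same author match produces four different scores.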
    

Similar documents (content)

  1. Lin, J.: User simulations for evaluating answers to question series (2007) 0.18
    0.18393742 = sum of:
      0.18393742 = product of:
        0.5748044 = sum of:
          0.022542937 = weight(abstract_txt:these in 914) [ClassicSimilarity], result of:
            0.022542937 = score(doc=914,freq=2.0), product of:
              0.06399846 = queryWeight, product of:
                1.0410672 = boost
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.019282121 = queryNorm
              0.35224184 = fieldWeight in 914, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.078125 = fieldNorm(doc=914)
          0.020431008 = weight(abstract_txt:using in 914) [ClassicSimilarity], result of:
            0.020431008 = score(doc=914,freq=1.0), product of:
              0.07551485 = queryWeight, product of:
                1.1308635 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.019282121 = queryNorm
              0.27055615 = fieldWeight in 914, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.078125 = fieldNorm(doc=914)
          0.02598306 = weight(abstract_txt:they in 914) [ClassicSimilarity], result of:
            0.02598306 = score(doc=914,freq=1.0), product of:
              0.088640615 = queryWeight, product of:
                1.2252096 = boost
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.019282121 = queryNorm
              0.29312816 = fieldWeight in 914, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.078125 = fieldNorm(doc=914)
          0.112633206 = weight(abstract_txt:comprised in 914) [ClassicSimilarity], result of:
            0.112633206 = score(doc=914,freq=1.0), product of:
              0.18704244 = queryWeight, product of:
                1.258488 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.019282121 = queryNorm
              0.60217994 = fieldWeight in 914, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.078125 = fieldNorm(doc=914)
          0.06667864 = weight(abstract_txt:methodology in 914) [ClassicSimilarity], result of:
            0.06667864 = score(doc=914,freq=2.0), product of:
              0.13187234 = queryWeight, product of:
                1.4944139 = boost
                4.5764427 = idf(docFreq=1236, maxDocs=44218)
                0.019282121 = queryNorm
              0.50563025 = fieldWeight in 914, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5764427 = idf(docFreq=1236, maxDocs=44218)
                0.078125 = fieldNorm(doc=914)
          0.03671049 = weight(abstract_txt:some in 914) [ClassicSimilarity], result of:
            0.03671049 = score(doc=914,freq=1.0), product of:
              0.12776044 = queryWeight, product of:
                1.8015149 = boost
                3.6779325 = idf(docFreq=3037, maxDocs=44218)
                0.019282121 = queryNorm
              0.28733847 = fieldWeight in 914, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6779325 = idf(docFreq=3037, maxDocs=44218)
                0.078125 = fieldNorm(doc=914)
          0.09838154 = weight(abstract_txt:models in 914) [ClassicSimilarity], result of:
            0.09838154 = score(doc=914,freq=1.0), product of:
              0.2713053 = queryWeight, product of:
                3.0313644 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.019282121 = queryNorm
              0.362623 = fieldWeight in 914, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.078125 = fieldNorm(doc=914)
          0.19144353 = weight(abstract_txt:topic in 914) [ClassicSimilarity], result of:
            0.19144353 = score(doc=914,freq=1.0), product of:
              0.48406842 = queryWeight, product of:
                4.959159 = boost
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.019282121 = queryNorm
              0.3954886 = fieldWeight in 914, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.078125 = fieldNorm(doc=914)
        0.32 = coord(8/25)
    
  2. Tseng, Y.-H.; Lin, C.-J.; Lin, Y.-I.: Text mining techniques for patent analysis (2007) 0.15
    0.15147805 = sum of:
      0.15147805 = product of:
        0.47336894 = sum of:
          0.0463392 = weight(abstract_txt:mining in 935) [ClassicSimilarity], result of:
            0.0463392 = score(doc=935,freq=1.0), product of:
              0.12006089 = queryWeight, product of:
                1.0082768 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.019282121 = queryNorm
              0.38596416 = fieldWeight in 935, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=935)
          0.022087477 = weight(abstract_txt:these in 935) [ClassicSimilarity], result of:
            0.022087477 = score(doc=935,freq=3.0), product of:
              0.06399846 = queryWeight, product of:
                1.0410672 = boost
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.019282121 = queryNorm
              0.34512514 = fieldWeight in 935, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.0625 = fieldNorm(doc=935)
          0.02078645 = weight(abstract_txt:they in 935) [ClassicSimilarity], result of:
            0.02078645 = score(doc=935,freq=1.0), product of:
              0.088640615 = queryWeight, product of:
                1.2252096 = boost
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.019282121 = queryNorm
              0.23450254 = fieldWeight in 935, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.0625 = fieldNorm(doc=935)
          0.036802996 = weight(abstract_txt:text in 935) [ClassicSimilarity], result of:
            0.036802996 = score(doc=935,freq=2.0), product of:
              0.1029654 = queryWeight, product of:
                1.3205038 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.019282121 = queryNorm
              0.3574307 = fieldWeight in 935, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=935)
          0.03588312 = weight(abstract_txt:method in 935) [ClassicSimilarity], result of:
            0.03588312 = score(doc=935,freq=1.0), product of:
              0.1275575 = queryWeight, product of:
                1.469762 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.019282121 = queryNorm
              0.28130937 = fieldWeight in 935, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=935)
          0.053342916 = weight(abstract_txt:methodology in 935) [ClassicSimilarity], result of:
            0.053342916 = score(doc=935,freq=2.0), product of:
              0.13187234 = queryWeight, product of:
                1.4944139 = boost
                4.5764427 = idf(docFreq=1236, maxDocs=44218)
                0.019282121 = queryNorm
              0.4045042 = fieldWeight in 935, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5764427 = idf(docFreq=1236, maxDocs=44218)
                0.0625 = fieldNorm(doc=935)
          0.04153318 = weight(abstract_txt:some in 935) [ClassicSimilarity], result of:
            0.04153318 = score(doc=935,freq=2.0), product of:
              0.12776044 = queryWeight, product of:
                1.8015149 = boost
                3.6779325 = idf(docFreq=3037, maxDocs=44218)
                0.019282121 = queryNorm
              0.32508639 = fieldWeight in 935, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6779325 = idf(docFreq=3037, maxDocs=44218)
                0.0625 = fieldNorm(doc=935)
          0.21659364 = weight(abstract_txt:topic in 935) [ClassicSimilarity], result of:
            0.21659364 = score(doc=935,freq=2.0), product of:
              0.48406842 = queryWeight, product of:
                4.959159 = boost
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.019282121 = queryNorm
              0.44744426 = fieldWeight in 935, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.0625 = fieldNorm(doc=935)
        0.32 = coord(8/25)
    
  3. Pepper, S.: The TAO of topic maps : finding the way in the age of infoglut (2002) 0.14
    0.14355744 = sum of:
      0.14355744 = product of:
        0.59815603 = sum of:
          0.015940264 = weight(abstract_txt:these in 4724) [ClassicSimilarity], result of:
            0.015940264 = score(doc=4724,freq=1.0), product of:
              0.06399846 = queryWeight, product of:
                1.0410672 = boost
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.019282121 = queryNorm
              0.24907261 = fieldWeight in 4724, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.078125 = fieldNorm(doc=4724)
          0.07101046 = weight(abstract_txt:things in 4724) [ClassicSimilarity], result of:
            0.07101046 = score(doc=4724,freq=1.0), product of:
              0.13752367 = queryWeight, product of:
                1.079115 = boost
                6.609291 = idf(docFreq=161, maxDocs=44218)
                0.019282121 = queryNorm
              0.51635087 = fieldWeight in 4724, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.609291 = idf(docFreq=161, maxDocs=44218)
                0.078125 = fieldNorm(doc=4724)
          0.020431008 = weight(abstract_txt:using in 4724) [ClassicSimilarity], result of:
            0.020431008 = score(doc=4724,freq=1.0), product of:
              0.07551485 = queryWeight, product of:
                1.1308635 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.019282121 = queryNorm
              0.27055615 = fieldWeight in 4724, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.078125 = fieldNorm(doc=4724)
          0.02598306 = weight(abstract_txt:they in 4724) [ClassicSimilarity], result of:
            0.02598306 = score(doc=4724,freq=1.0), product of:
              0.088640615 = queryWeight, product of:
                1.2252096 = boost
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.019282121 = queryNorm
              0.29312816 = fieldWeight in 4724, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.078125 = fieldNorm(doc=4724)
          0.03671049 = weight(abstract_txt:some in 4724) [ClassicSimilarity], result of:
            0.03671049 = score(doc=4724,freq=1.0), product of:
              0.12776044 = queryWeight, product of:
                1.8015149 = boost
                3.6779325 = idf(docFreq=3037, maxDocs=44218)
                0.019282121 = queryNorm
              0.28733847 = fieldWeight in 4724, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6779325 = idf(docFreq=3037, maxDocs=44218)
                0.078125 = fieldNorm(doc=4724)
          0.42808074 = weight(abstract_txt:topic in 4724) [ClassicSimilarity], result of:
            0.42808074 = score(doc=4724,freq=5.0), product of:
              0.48406842 = queryWeight, product of:
                4.959159 = boost
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.019282121 = queryNorm
              0.88433933 = fieldWeight in 4724, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.078125 = fieldNorm(doc=4724)
        0.24 = coord(6/25)
    
  4. Wu, I.-C.; Vakkari, P.: Supporting navigation in Wikipedia by information visualization : extended evaluation measures (2014) 0.13
    0.12759557 = sum of:
      0.12759557 = product of:
        0.5316482 = sum of:
          0.03390552 = weight(abstract_txt:introduce in 1797) [ClassicSimilarity], result of:
            0.03390552 = score(doc=1797,freq=1.0), product of:
              0.11809784 = queryWeight, product of:
                6.124733 = idf(docFreq=262, maxDocs=44218)
                0.019282121 = queryNorm
              0.28709686 = fieldWeight in 1797, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.124733 = idf(docFreq=262, maxDocs=44218)
                0.046875 = fieldNorm(doc=1797)
          0.019128317 = weight(abstract_txt:these in 1797) [ClassicSimilarity], result of:
            0.019128317 = score(doc=1797,freq=4.0), product of:
              0.06399846 = queryWeight, product of:
                1.0410672 = boost
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.019282121 = queryNorm
              0.29888713 = fieldWeight in 1797, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.046875 = fieldNorm(doc=1797)
          0.023708811 = weight(abstract_txt:what in 1797) [ClassicSimilarity], result of:
            0.023708811 = score(doc=1797,freq=1.0), product of:
              0.11722265 = queryWeight, product of:
                1.4089637 = boost
                4.314763 = idf(docFreq=1606, maxDocs=44218)
                0.019282121 = queryNorm
              0.20225452 = fieldWeight in 1797, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.314763 = idf(docFreq=1606, maxDocs=44218)
                0.046875 = fieldNorm(doc=1797)
          0.028289353 = weight(abstract_txt:methodology in 1797) [ClassicSimilarity], result of:
            0.028289353 = score(doc=1797,freq=1.0), product of:
              0.13187234 = queryWeight, product of:
                1.4944139 = boost
                4.5764427 = idf(docFreq=1236, maxDocs=44218)
                0.019282121 = queryNorm
              0.21452075 = fieldWeight in 1797, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5764427 = idf(docFreq=1236, maxDocs=44218)
                0.046875 = fieldNorm(doc=1797)
          0.04564837 = weight(abstract_txt:articles in 1797) [ClassicSimilarity], result of:
            0.04564837 = score(doc=1797,freq=2.0), product of:
              0.14399427 = queryWeight, product of:
                1.5615885 = boost
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.019282121 = queryNorm
              0.31701517 = fieldWeight in 1797, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7821565 = idf(docFreq=1006, maxDocs=44218)
                0.046875 = fieldNorm(doc=1797)
          0.38096783 = weight(abstract_txt:topic in 1797) [ClassicSimilarity], result of:
            0.38096783 = score(doc=1797,freq=11.0), product of:
              0.48406842 = queryWeight, product of:
                4.959159 = boost
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.019282121 = queryNorm
              0.78701234 = fieldWeight in 1797, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.046875 = fieldNorm(doc=1797)
        0.24 = coord(6/25)
    
  5. Zhao, X.; Jin, P.; Yue, L.: Discovering topic time from web news (2015) 0.12
    0.12465697 = sum of:
      0.12465697 = product of:
        0.6232848 = sum of:
          0.0463392 = weight(abstract_txt:mining in 2673) [ClassicSimilarity], result of:
            0.0463392 = score(doc=2673,freq=1.0), product of:
              0.12006089 = queryWeight, product of:
                1.0082768 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.019282121 = queryNorm
              0.38596416 = fieldWeight in 2673, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=2673)
          0.012752211 = weight(abstract_txt:these in 2673) [ClassicSimilarity], result of:
            0.012752211 = score(doc=2673,freq=1.0), product of:
              0.06399846 = queryWeight, product of:
                1.0410672 = boost
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.019282121 = queryNorm
              0.19925809 = fieldWeight in 2673, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1881294 = idf(docFreq=4957, maxDocs=44218)
                0.0625 = fieldNorm(doc=2673)
          0.026023647 = weight(abstract_txt:text in 2673) [ClassicSimilarity], result of:
            0.026023647 = score(doc=2673,freq=1.0), product of:
              0.1029654 = queryWeight, product of:
                1.3205038 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.019282121 = queryNorm
              0.25274166 = fieldWeight in 2673, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=2673)
          0.07870523 = weight(abstract_txt:models in 2673) [ClassicSimilarity], result of:
            0.07870523 = score(doc=2673,freq=1.0), product of:
              0.2713053 = queryWeight, product of:
                3.0313644 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.019282121 = queryNorm
              0.2900984 = fieldWeight in 2673, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.0625 = fieldNorm(doc=2673)
          0.4594645 = weight(abstract_txt:topic in 2673) [ClassicSimilarity], result of:
            0.4594645 = score(doc=2673,freq=9.0), product of:
              0.48406842 = queryWeight, product of:
                4.959159 = boost
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.019282121 = queryNorm
              0.9491726 = fieldWeight in 2673, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.0625 = fieldNorm(doc=2673)
        0.2 = coord(5/25)
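
The content scores above add two things to the per-term field weight: a query weight (`boost * idf * queryNorm`) and a final `coord` factor, the fraction of query terms that matched. The sketch below recomputes entry 5 (doc 2673) from the numbers in its explain tree. It is an illustration under assumptions rather than the catalog's implementation: the helper names are hypothetical, and the `tf` and `idf` formulas are the ones that reproduce the listed values.

```python
import math

MAX_DOCS = 44218          # maxDocs, copied from the listing
QUERY_NORM = 0.019282121  # queryNorm, copied from the listing


def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency (assumed formula that matches the listing)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


# (term, freq, docFreq, boost) for doc 2673, copied from entry 5 above.
matched_terms = [
    ("mining", 1.0, 249, 1.0082768),
    ("these", 1.0, 4957, 1.0410672),
    ("text", 1.0, 2106, 1.3205038),
    ("models", 1.0, 1158, 3.0313644),
    ("topic", 9.0, 760, 4.959159),
]
FIELD_NORM_2673 = 0.0625

term_sum = sum(
    term_score(freq, doc_freq, boost, FIELD_NORM_2673)
    for _, freq, doc_freq, boost in matched_terms
)
coord = len(matched_terms) / 25  # 5 of the 25 query terms matched
total_2673 = term_sum * coord
print(round(total_2673, 4))
```

The result should land close to the listed 0.12465697. Where a term shows no `boost` line in the listing (for example "introduce" in entry 4), a boost of 1.0 is the natural reading.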