Document (#30292)

Author
Newman, D.J.
Block, S.
Title
Probabilistic topic decomposition of an eighteenth-century American newspaper
Source
Journal of the American Society for Information Science and Technology. 57(2006) no.6, pp.753-767
Year
2006
Abstract
We use a probabilistic mixture decomposition method to determine topics in the Pennsylvania Gazette, a major colonial U.S. newspaper published from 1728 to 1800. We assess the value of several topic decomposition techniques for historical research and compare the accuracy and efficacy of various methods. After determining the topics covered by the 80,000 articles and advertisements in the entire 18th-century run of the Gazette, we calculate how the prevalence of those topics changed over time, and give historically relevant examples of our findings. This approach reveals important information about the content of this colonial newspaper, and suggests the value of such approaches to a more complete understanding of early American print culture and society.
Theme
Automatic indexing
Form
Newspapers

Similar documents (author)

  1. Newman, N.: Search strategies and activities of BBC news interactive (2007) 2.32
    2.32216 = sum of:
      2.32216 = product of:
        4.64432 = sum of:
          4.64432 = weight(author_txt:newman in 381) [ClassicSimilarity], result of:
            4.64432 = score(doc=381,freq=1.0), product of:
              0.7502086 = queryWeight, product of:
                1.0651829 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.071104616 = queryNorm
              6.190705 = fieldWeight in 381, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.625 = fieldNorm(doc=381)
        0.5 = coord(1/2)
    
  2. Block, B.: Quality control makes WorldCat ever better (1999) 1.92
    1.9214078 = sum of:
      1.9214078 = product of:
        3.8428156 = sum of:
          3.8428156 = weight(author_txt:block in 4398) [ClassicSimilarity], result of:
            3.8428156 = score(doc=4398,freq=1.0), product of:
              0.66120124 = queryWeight, product of:
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.071104616 = queryNorm
              5.81187 = fieldWeight in 4398, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.625 = fieldNorm(doc=4398)
        0.5 = coord(1/2)
    
  3. Block, B.: Kooperative Neukatalogisierung : Eine neue Facette in der Zusammenlegung der Verbünde (2007) 1.92
    1.9214078 = sum of:
      1.9214078 = product of:
        3.8428156 = sum of:
          3.8428156 = weight(author_txt:block in 6394) [ClassicSimilarity], result of:
            3.8428156 = score(doc=6394,freq=1.0), product of:
              0.66120124 = queryWeight, product of:
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.071104616 = queryNorm
              5.81187 = fieldWeight in 6394, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.625 = fieldNorm(doc=6394)
        0.5 = coord(1/2)
    
  4. Block, C.H.: Das Intranet : die neue Informationsverarbeitung (2004) 1.92
    1.9214078 = sum of:
      1.9214078 = product of:
        3.8428156 = sum of:
          3.8428156 = weight(author_txt:block in 2396) [ClassicSimilarity], result of:
            3.8428156 = score(doc=2396,freq=1.0), product of:
              0.66120124 = queryWeight, product of:
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.071104616 = queryNorm
              5.81187 = fieldWeight in 2396, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.625 = fieldNorm(doc=2396)
        0.5 = coord(1/2)
    
  5. Block, B.: Noch kooperativer katalogisieren : aus der Arbeit der AG Kooperative Neukatalogisierung (2009) 1.92
    1.9214078 = sum of:
      1.9214078 = product of:
        3.8428156 = sum of:
          3.8428156 = weight(author_txt:block in 3045) [ClassicSimilarity], result of:
            3.8428156 = score(doc=3045,freq=1.0), product of:
              0.66120124 = queryWeight, product of:
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.071104616 = queryNorm
              5.81187 = fieldWeight in 3045, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.625 = fieldNorm(doc=3045)
        0.5 = coord(1/2)
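
The author score trees above all follow the same pattern, which can be reproduced in a few lines of Python. This is a minimal sketch inferred from the numbers printed in the explanations, not the search engine's own code. The function names are hypothetical, and the formulas idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq) are assumptions that are consistent with the idf and tf values listed in the trees.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed formula; consistent with idf(docFreq=5, maxDocs=44218) = 9.905128.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    # queryWeight = boost * idf * queryNorm
    # fieldWeight = tf * idf * fieldNorm, with tf assumed to be sqrt(freq)
    i = idf(doc_freq, max_docs)
    query_weight = boost * i * query_norm
    field_weight = math.sqrt(freq) * i * field_norm
    return query_weight * field_weight

# Values copied from the explanations for "newman" (doc 381) and "block" (doc 4398).
newman = term_score(freq=1.0, doc_freq=5, max_docs=44218,
                    query_norm=0.071104616, field_norm=0.625, boost=1.0651829)
block = term_score(freq=1.0, doc_freq=10, max_docs=44218,
                   query_norm=0.071104616, field_norm=0.625)

# Only one of the two author terms matches each document, hence coord(1/2).
coord = 1 / 2
newman_total = newman * coord
block_total = block * coord
print(newman_total, block_total)
```

The two printed values can be compared with the totals 2.32216 and 1.9214078 listed above. Small differences in the last digits are expected, because the sketch uses double precision while the listed values appear to be single-precision floats.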
    

Similar documents (content)

  1. Karpuk, S.: Cataloging seventeenth- and eighteenth-century German dissertations : guidelines and observations (2010) 0.10
    0.09896052 = sum of:
      0.09896052 = product of:
        0.82467103 = sum of:
          0.64913744 = weight(title_txt:eighteenth in 3555) [ClassicSimilarity], result of:
            0.64913744 = score(doc=3555,freq=1.0), product of:
              0.21598664 = queryWeight, product of:
                1.5301634 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.01467673 = queryNorm
              3.005452 = fieldWeight in 3555, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.3125 = fieldNorm(doc=3555)
          0.08307882 = weight(abstract_txt:american in 3555) [ClassicSimilarity], result of:
            0.08307882 = score(doc=3555,freq=1.0), product of:
              0.13915344 = queryWeight, product of:
                1.7369461 = boost
                5.4585624 = idf(docFreq=511, maxDocs=44218)
                0.01467673 = queryNorm
              0.5970303 = fieldWeight in 3555, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4585624 = idf(docFreq=511, maxDocs=44218)
                0.109375 = fieldNorm(doc=3555)
          0.09245477 = weight(abstract_txt:century in 3555) [ClassicSimilarity], result of:
            0.09245477 = score(doc=3555,freq=1.0), product of:
              0.14943533 = queryWeight, product of:
                1.7999731 = boost
                5.6566324 = idf(docFreq=419, maxDocs=44218)
                0.01467673 = queryNorm
              0.6186942 = fieldWeight in 3555, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6566324 = idf(docFreq=419, maxDocs=44218)
                0.109375 = fieldNorm(doc=3555)
        0.12 = coord(3/25)
    
  2. Kratz, I.: La conversion retrospective des fonds anciens : l'exemple des bibliothèques Americaines (1997) 0.07
    0.07063234 = sum of:
      0.07063234 = product of:
        0.44145212 = sum of:
          0.14421463 = weight(abstract_txt:18th in 7905) [ClassicSimilarity], result of:
            0.14421463 = score(doc=7905,freq=1.0), product of:
              0.1767914 = queryWeight, product of:
                1.3843788 = boost
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.01467673 = queryNorm
              0.81573325 = fieldWeight in 7905, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.701155 = idf(docFreq=19, maxDocs=44218)
                0.09375 = fieldNorm(doc=7905)
          0.14678012 = weight(abstract_txt:1800 in 7905) [ClassicSimilarity], result of:
            0.14678012 = score(doc=7905,freq=1.0), product of:
              0.17888191 = queryWeight, product of:
                1.3925397 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.01467673 = queryNorm
              0.820542 = fieldWeight in 7905, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.09375 = fieldNorm(doc=7905)
          0.07121041 = weight(abstract_txt:american in 7905) [ClassicSimilarity], result of:
            0.07121041 = score(doc=7905,freq=1.0), product of:
              0.13915344 = queryWeight, product of:
                1.7369461 = boost
                5.4585624 = idf(docFreq=511, maxDocs=44218)
                0.01467673 = queryNorm
              0.5117402 = fieldWeight in 7905, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4585624 = idf(docFreq=511, maxDocs=44218)
                0.09375 = fieldNorm(doc=7905)
          0.079246946 = weight(abstract_txt:century in 7905) [ClassicSimilarity], result of:
            0.079246946 = score(doc=7905,freq=1.0), product of:
              0.14943533 = queryWeight, product of:
                1.7999731 = boost
                5.6566324 = idf(docFreq=419, maxDocs=44218)
                0.01467673 = queryNorm
              0.5303093 = fieldWeight in 7905, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6566324 = idf(docFreq=419, maxDocs=44218)
                0.09375 = fieldNorm(doc=7905)
        0.16 = coord(4/25)
    
  3. Stover, M.: The best family studies databases on CD-ROM : a survey of nine products (1993) 0.05
    0.05486032 = sum of:
      0.05486032 = product of:
        0.45716935 = sum of:
          0.08307882 = weight(abstract_txt:american in 5297) [ClassicSimilarity], result of:
            0.08307882 = score(doc=5297,freq=1.0), product of:
              0.13915344 = queryWeight, product of:
                1.7369461 = boost
                5.4585624 = idf(docFreq=511, maxDocs=44218)
                0.01467673 = queryNorm
              0.5970303 = fieldWeight in 5297, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4585624 = idf(docFreq=511, maxDocs=44218)
                0.109375 = fieldNorm(doc=5297)
          0.1008149 = weight(abstract_txt:topics in 5297) [ClassicSimilarity], result of:
            0.1008149 = score(doc=5297,freq=1.0), product of:
              0.18122327 = queryWeight, product of:
                2.427683 = boost
                5.086191 = idf(docFreq=742, maxDocs=44218)
                0.01467673 = queryNorm
              0.5563022 = fieldWeight in 5297, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.086191 = idf(docFreq=742, maxDocs=44218)
                0.109375 = fieldNorm(doc=5297)
          0.27327564 = weight(abstract_txt:newspaper in 5297) [ClassicSimilarity], result of:
            0.27327564 = score(doc=5297,freq=1.0), product of:
              0.35231525 = queryWeight, product of:
                3.384938 = boost
                7.0917172 = idf(docFreq=99, maxDocs=44218)
                0.01467673 = queryNorm
              0.7756566 = fieldWeight in 5297, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0917172 = idf(docFreq=99, maxDocs=44218)
                0.109375 = fieldNorm(doc=5297)
        0.12 = coord(3/25)
    
  4. Luyt, B.: Centres of calculation and unruly colonists : the colonial library in Singapore and its users, 1874-1900 (2008) 0.05
    0.05098709 = sum of:
      0.05098709 = product of:
        0.63733864 = sum of:
          0.024399238 = weight(abstract_txt:value in 1895) [ClassicSimilarity], result of:
            0.024399238 = score(doc=1895,freq=1.0), product of:
              0.089284614 = queryWeight, product of:
                1.3913221 = boost
                4.3723974 = idf(docFreq=1516, maxDocs=44218)
                0.01467673 = queryNorm
              0.27327484 = fieldWeight in 1895, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3723974 = idf(docFreq=1516, maxDocs=44218)
                0.0625 = fieldNorm(doc=1895)
          0.6129394 = weight(abstract_txt:colonial in 1895) [ClassicSimilarity], result of:
            0.6129394 = score(doc=1895,freq=6.0), product of:
              0.42145744 = queryWeight, product of:
                3.0228474 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.01467673 = queryNorm
              1.454333 = fieldWeight in 1895, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.0625 = fieldNorm(doc=1895)
        0.08 = coord(2/25)
    
  5. Ding, W.; Chen, C.: Dynamic topic detection and tracking : a comparison of HDP, C-word, and cocitation methods (2014) 0.05
    0.050723232 = sum of:
      0.050723232 = product of:
        0.4226936 = sum of:
          0.08198224 = weight(abstract_txt:topic in 1502) [ClassicSimilarity], result of:
            0.08198224 = score(doc=1502,freq=3.0), product of:
              0.119681 = queryWeight, product of:
                1.6108384 = boost
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.01467673 = queryNorm
              0.6850063 = fieldWeight in 1502, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.078125 = fieldNorm(doc=1502)
          0.1966901 = weight(abstract_txt:probabilistic in 1502) [ClassicSimilarity], result of:
            0.1966901 = score(doc=1502,freq=3.0), product of:
              0.21448669 = queryWeight, product of:
                2.1564507 = boost
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.01467673 = queryNorm
              0.91702706 = fieldWeight in 1502, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.078125 = fieldNorm(doc=1502)
          0.14402129 = weight(abstract_txt:topics in 1502) [ClassicSimilarity], result of:
            0.14402129 = score(doc=1502,freq=4.0), product of:
              0.18122327 = queryWeight, product of:
                2.427683 = boost
                5.086191 = idf(docFreq=742, maxDocs=44218)
                0.01467673 = queryNorm
              0.7947174 = fieldWeight in 1502, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.086191 = idf(docFreq=742, maxDocs=44218)
                0.078125 = fieldNorm(doc=1502)
        0.12 = coord(3/25)
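
For the content matches, the per-term scores are summed and then scaled by coord(matched terms / query terms). The sketch below recomputes the last entry (doc 1502) from the values printed in its explanation. It is an illustration under assumptions, not the search engine's implementation: the helper name `term_score` is hypothetical, and the formulas idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq) are inferred from the listed numbers (for example, tf = 1.7320508 for freq=3.0).

```python
import math

MAX_DOCS = 44218          # maxDocs as printed in every idf line
QUERY_NORM = 0.01467673   # queryNorm shared by all content-query terms

def term_score(freq, doc_freq, boost, field_norm):
    # Assumed formulas, chosen because they reproduce the printed idf and tf values.
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

# (term, freq, docFreq, boost) copied from the explanation for doc 1502.
matches = [
    ("topic", 3.0, 760, 1.6108384),
    ("probabilistic", 3.0, 136, 2.1564507),
    ("topics", 4.0, 742, 2.427683),
]
FIELD_NORM = 0.078125  # fieldNorm(doc=1502)

per_term = {t: term_score(f, df, b, FIELD_NORM) for t, f, df, b in matches}
total = sum(per_term.values())

# 3 of the 25 query terms match this document, hence coord(3/25).
score = total * (3 / 25)
print(per_term, score)
```

The result can be compared with the listed total of 0.050723232. The coord factor explains why these content scores are so much lower than the author scores: a document matching only a few of the 25 query terms keeps only a small fraction of its summed term weights.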