Document (#44102)

Author
Grabus, S.
Logan, P.M.
Greenberg, J.
Title
Temporal concept drift and alignment : an empirical approach to comparing knowledge organization systems over time
Source
Knowledge organization. 49(2022) no.2, p.69-78
Year
2022
Abstract
This research explores temporal concept drift and temporal alignment in knowledge organization systems (KOS). A comparative analysis is pursued using the 1910 Library of Congress Subject Headings, 2020 FAST Topical, and automatic indexing. The use case involves a sample of 90 nineteenth-century Encyclopedia Britannica entries. The entries were indexed using two approaches: 1) full-text indexing; and 2) named entity recognition, performed on the entries with Stanza, Stanford's NLP toolkit, after which the extracted entities were automatically indexed with the Helping Interdisciplinary Vocabulary application (HIVE), using both 1910 LCSH and FAST Topical. The analysis focused on three goals: 1) identifying results that were exclusive to the 1910 LCSH output; 2) identifying terms in the exclusive set that have been deprecated from the contemporary LCSH, demonstrating temporal concept drift; and 3) exploring the historical significance of these deprecated terms. Results confirm that historical vocabularies can be used to generate anachronistic subject headings representing conceptual drift across time in KOS and historical resources. A methodological contribution is made demonstrating how to study changes in KOS over time and improve the contextualization of historical humanities resources.
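The first two analysis goals in the abstract can be pictured as set operations. The following is only a minimal sketch under stated assumptions: the function names are hypothetical, the headings are invented placeholders rather than data from the study, and it is not a description of the authors' actual tooling. The third goal (exploring historical significance) is interpretive and is not modelled here.

```python
def exclusive_terms(historical_output, contemporary_output):
    """Goal 1: headings returned only by the historical vocabulary."""
    return set(historical_output) - set(contemporary_output)


def deprecated_terms(exclusive, current_vocabulary):
    """Goal 2: exclusive headings no longer present in the current vocabulary."""
    return {term for term in exclusive if term not in current_vocabulary}


# Placeholder headings (assumptions for illustration, not study data).
lcsh_1910_output = {"heading A", "heading B", "heading C"}
fast_output = {"heading B"}
current_lcsh = {"heading B", "heading C"}

exclusive = exclusive_terms(lcsh_1910_output, fast_output)
deprecated = deprecated_terms(exclusive, current_lcsh)
print(sorted(exclusive))
print(sorted(deprecated))
```

In this toy input, "heading A" stands in for a candidate case of temporal concept drift: it is produced only by the historical vocabulary and is absent from the current one.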
Content
Cf.: https://www.nomos-elibrary.de/10.5771/0943-7444-2022-2/ko-knowledge-organization-jahrgang-49-2022-heft-2?page=1.
Object
LCSH
FAST

Similar documents (author)

  1. Logan, E.: Cognitive styles and online behaviour of novice searchers (1990) 2.55
    2.5531936 = sum of:
      2.5531936 = product of:
        5.106387 = sum of:
          5.106387 = weight(author_txt:logan in 6891) [ClassicSimilarity], result of:
            5.106387 = score(doc=6891,freq=1.0), product of:
              0.82484746 = queryWeight, product of:
                1.2078865 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.06894257 = queryNorm
              6.190705 = fieldWeight in 6891, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.625 = fieldNorm(doc=6891)
        0.5 = coord(1/2)
    
  2. Logan, E.: ¬The Internet challenge (1995) 2.55
    2.5531936 = sum of:
      2.5531936 = product of:
        5.106387 = sum of:
          5.106387 = weight(author_txt:logan in 2731) [ClassicSimilarity], result of:
            5.106387 = score(doc=2731,freq=1.0), product of:
              0.82484746 = queryWeight, product of:
                1.2078865 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.06894257 = queryNorm
              6.190705 = fieldWeight in 2731, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.625 = fieldNorm(doc=2731)
        0.5 = coord(1/2)
    
  3. Logan, E.: ¬The Internet challenge accepted (1996) 2.55
    2.5531936 = sum of:
      2.5531936 = product of:
        5.106387 = sum of:
          5.106387 = weight(author_txt:logan in 5859) [ClassicSimilarity], result of:
            5.106387 = score(doc=5859,freq=1.0), product of:
              0.82484746 = queryWeight, product of:
                1.2078865 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.06894257 = queryNorm
              6.190705 = fieldWeight in 5859, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.625 = fieldNorm(doc=5859)
        0.5 = coord(1/2)
    
  4. Greenberg, A.M.: ¬An author index to Library of Congress Classification: class P, subclasses PN, PR, PS, PZ; general literature, english, juvenile belles lettres (1981) 1.45
    1.4487898 = sum of:
      1.4487898 = product of:
        2.8975797 = sum of:
          2.8975797 = weight(author_txt:greenberg in 3519) [ClassicSimilarity], result of:
            2.8975797 = score(doc=3519,freq=1.0), product of:
              0.56535524 = queryWeight, product of:
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.06894257 = queryNorm
              5.125237 = fieldWeight in 3519, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.625 = fieldNorm(doc=3519)
        0.5 = coord(1/2)
    
  5. Greenberg, J.: Subject control of ephemera : MARC format options (1996) 1.45
    1.4487898 = sum of:
      1.4487898 = product of:
        2.8975797 = sum of:
          2.8975797 = weight(author_txt:greenberg in 543) [ClassicSimilarity], result of:
            2.8975797 = score(doc=543,freq=1.0), product of:
              0.56535524 = queryWeight, product of:
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.06894257 = queryNorm
              5.125237 = fieldWeight in 543, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.625 = fieldNorm(doc=543)
        0.5 = coord(1/2)
    

Similar documents (content)

  1. Li, W.; Wong, K.-F.; Yuan, C.: Toward automatic Chinese temporal information extraction (2001) 0.08
    0.08130448 = sum of:
      0.08130448 = product of:
        0.508153 = sum of:
          0.015272913 = weight(abstract_txt:over in 6029) [ClassicSimilarity], result of:
            0.015272913 = score(doc=6029,freq=1.0), product of:
              0.057572734 = queryWeight, product of:
                1.0389746 = boost
                4.244485 = idf(docFreq=1723, maxDocs=44218)
                0.013055302 = queryNorm
              0.2652803 = fieldWeight in 6029, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.244485 = idf(docFreq=1723, maxDocs=44218)
                0.0625 = fieldNorm(doc=6029)
          0.03024638 = weight(abstract_txt:time in 6029) [ClassicSimilarity], result of:
            0.03024638 = score(doc=6029,freq=2.0), product of:
              0.0824907 = queryWeight, product of:
                1.5231569 = boost
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.013055302 = queryNorm
              0.36666414 = fieldWeight in 6029, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.0625 = fieldNorm(doc=6029)
          0.04745867 = weight(abstract_txt:concept in 6029) [ClassicSimilarity], result of:
            0.04745867 = score(doc=6029,freq=3.0), product of:
              0.097305186 = queryWeight, product of:
                1.6542842 = boost
                4.505458 = idf(docFreq=1327, maxDocs=44218)
                0.013055302 = queryNorm
              0.48773012 = fieldWeight in 6029, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.505458 = idf(docFreq=1327, maxDocs=44218)
                0.0625 = fieldNorm(doc=6029)
          0.41517508 = weight(abstract_txt:temporal in 6029) [ClassicSimilarity], result of:
            0.41517508 = score(doc=6029,freq=10.0), product of:
              0.30439192 = queryWeight, product of:
                3.3785322 = boost
                6.901097 = idf(docFreq=120, maxDocs=44218)
                0.013055302 = queryNorm
              1.3639491 = fieldWeight in 6029, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                6.901097 = idf(docFreq=120, maxDocs=44218)
                0.0625 = fieldNorm(doc=6029)
        0.16 = coord(4/25)
    
  2. Jatowt, A.; Yeung, C.M.A.; Tanaka, K.: Generic method for detecting focus time of documents (2015) 0.08
    0.07771361 = sum of:
      0.07771361 = product of:
        0.38856804 = sum of:
          0.015038765 = weight(abstract_txt:resources in 2668) [ClassicSimilarity], result of:
            0.015038765 = score(doc=2668,freq=1.0), product of:
              0.056982793 = queryWeight, product of:
                1.0336378 = boost
                4.2226825 = idf(docFreq=1761, maxDocs=44218)
                0.013055302 = queryNorm
              0.26391765 = fieldWeight in 2668, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2226825 = idf(docFreq=1761, maxDocs=44218)
                0.0625 = fieldNorm(doc=2668)
          0.012443433 = weight(abstract_txt:using in 2668) [ClassicSimilarity], result of:
            0.012443433 = score(doc=2668,freq=1.0), product of:
              0.05749007 = queryWeight, product of:
                1.271565 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.013055302 = queryNorm
              0.21644491 = fieldWeight in 2668, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.0625 = fieldNorm(doc=2668)
          0.06416226 = weight(abstract_txt:time in 2668) [ClassicSimilarity], result of:
            0.06416226 = score(doc=2668,freq=9.0), product of:
              0.0824907 = queryWeight, product of:
                1.5231569 = boost
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.013055302 = queryNorm
              0.7778121 = fieldWeight in 2668, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.0625 = fieldNorm(doc=2668)
          0.06952287 = weight(abstract_txt:historical in 2668) [ClassicSimilarity], result of:
            0.06952287 = score(doc=2668,freq=1.0), product of:
              0.19923429 = queryWeight, product of:
                2.7333393 = boost
                5.583205 = idf(docFreq=451, maxDocs=44218)
                0.013055302 = queryNorm
              0.34895033 = fieldWeight in 2668, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.583205 = idf(docFreq=451, maxDocs=44218)
                0.0625 = fieldNorm(doc=2668)
          0.22740074 = weight(abstract_txt:temporal in 2668) [ClassicSimilarity], result of:
            0.22740074 = score(doc=2668,freq=3.0), product of:
              0.30439192 = queryWeight, product of:
                3.3785322 = boost
                6.901097 = idf(docFreq=120, maxDocs=44218)
                0.013055302 = queryNorm
              0.7470656 = fieldWeight in 2668, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.901097 = idf(docFreq=120, maxDocs=44218)
                0.0625 = fieldNorm(doc=2668)
        0.2 = coord(5/25)
    
  3. Carlyle, A.: Matching LCSH and user vocabulary in the library catalog (1989) 0.07
    0.073776536 = sum of:
      0.073776536 = product of:
        0.36888266 = sum of:
          0.06826491 = weight(abstract_txt:headings in 449) [ClassicSimilarity], result of:
            0.06826491 = score(doc=449,freq=4.0), product of:
              0.084808744 = queryWeight, product of:
                1.261005 = boost
                5.1515374 = idf(docFreq=695, maxDocs=44218)
                0.013055302 = queryNorm
              0.8049277 = fieldWeight in 449, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.1515374 = idf(docFreq=695, maxDocs=44218)
                0.078125 = fieldNorm(doc=449)
          0.015554292 = weight(abstract_txt:using in 449) [ClassicSimilarity], result of:
            0.015554292 = score(doc=449,freq=1.0), product of:
              0.05749007 = queryWeight, product of:
                1.271565 = boost
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.013055302 = queryNorm
              0.27055615 = fieldWeight in 449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4631186 = idf(docFreq=3765, maxDocs=44218)
                0.078125 = fieldNorm(doc=449)
          0.018512668 = weight(abstract_txt:were in 449) [ClassicSimilarity], result of:
            0.018512668 = score(doc=449,freq=1.0), product of:
              0.064566225 = queryWeight, product of:
                1.3475498 = boost
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.013055302 = queryNorm
              0.28672373 = fieldWeight in 449, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.078125 = fieldNorm(doc=449)
          0.037807975 = weight(abstract_txt:time in 449) [ClassicSimilarity], result of:
            0.037807975 = score(doc=449,freq=2.0), product of:
              0.0824907 = queryWeight, product of:
                1.5231569 = boost
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.013055302 = queryNorm
              0.45833018 = fieldWeight in 449, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.078125 = fieldNorm(doc=449)
          0.22874282 = weight(abstract_txt:lcsh in 449) [ClassicSimilarity], result of:
            0.22874282 = score(doc=449,freq=6.0), product of:
              0.18990682 = queryWeight, product of:
                2.3110664 = boost
                6.29421 = idf(docFreq=221, maxDocs=44218)
                0.013055302 = queryNorm
              1.2045003 = fieldWeight in 449, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.29421 = idf(docFreq=221, maxDocs=44218)
                0.078125 = fieldNorm(doc=449)
        0.2 = coord(5/25)
    
  4. Yi, K.; Chan, L.M.: Revisiting the syntactical and structural analysis of Library of Congress Subject Headings for the digital environment (2010) 0.07
    0.07187267 = sum of:
      0.07187267 = product of:
        0.35936332 = sum of:
          0.037596915 = weight(abstract_txt:resources in 3431) [ClassicSimilarity], result of:
            0.037596915 = score(doc=3431,freq=4.0), product of:
              0.056982793 = queryWeight, product of:
                1.0336378 = boost
                4.2226825 = idf(docFreq=1761, maxDocs=44218)
                0.013055302 = queryNorm
              0.65979415 = fieldWeight in 3431, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.2226825 = idf(docFreq=1761, maxDocs=44218)
                0.078125 = fieldNorm(doc=3431)
          0.022050807 = weight(abstract_txt:organization in 3431) [ClassicSimilarity], result of:
            0.022050807 = score(doc=3431,freq=1.0), product of:
              0.06337898 = queryWeight, product of:
                1.0901071 = boost
                4.4533744 = idf(docFreq=1398, maxDocs=44218)
                0.013055302 = queryNorm
              0.34791988 = fieldWeight in 3431, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4533744 = idf(docFreq=1398, maxDocs=44218)
                0.078125 = fieldNorm(doc=3431)
          0.034132455 = weight(abstract_txt:headings in 3431) [ClassicSimilarity], result of:
            0.034132455 = score(doc=3431,freq=1.0), product of:
              0.084808744 = queryWeight, product of:
                1.261005 = boost
                5.1515374 = idf(docFreq=695, maxDocs=44218)
                0.013055302 = queryNorm
              0.40246385 = fieldWeight in 3431, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1515374 = idf(docFreq=695, maxDocs=44218)
                0.078125 = fieldNorm(doc=3431)
          0.018512668 = weight(abstract_txt:were in 3431) [ClassicSimilarity], result of:
            0.018512668 = score(doc=3431,freq=1.0), product of:
              0.064566225 = queryWeight, product of:
                1.3475498 = boost
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.013055302 = queryNorm
              0.28672373 = fieldWeight in 3431, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.078125 = fieldNorm(doc=3431)
          0.24707048 = weight(abstract_txt:lcsh in 3431) [ClassicSimilarity], result of:
            0.24707048 = score(doc=3431,freq=7.0), product of:
              0.18990682 = queryWeight, product of:
                2.3110664 = boost
                6.29421 = idf(docFreq=221, maxDocs=44218)
                0.013055302 = queryNorm
              1.3010089 = fieldWeight in 3431, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.29421 = idf(docFreq=221, maxDocs=44218)
                0.078125 = fieldNorm(doc=3431)
        0.2 = coord(5/25)
    
  5. Frost, C.O.; Dede, B.A.: Subject heading compatibility between LCSH and catalog files of a large research library : a suggested model for analysis (1988) 0.07
    0.06734207 = sum of:
      0.06734207 = product of:
        0.42088795 = sum of:
          0.070942976 = weight(abstract_txt:headings in 655) [ClassicSimilarity], result of:
            0.070942976 = score(doc=655,freq=3.0), product of:
              0.084808744 = queryWeight, product of:
                1.261005 = boost
                5.1515374 = idf(docFreq=695, maxDocs=44218)
                0.013055302 = queryNorm
              0.8365054 = fieldWeight in 655, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.1515374 = idf(docFreq=695, maxDocs=44218)
                0.09375 = fieldNorm(doc=655)
          0.03141704 = weight(abstract_txt:were in 655) [ClassicSimilarity], result of:
            0.03141704 = score(doc=655,freq=2.0), product of:
              0.064566225 = queryWeight, product of:
                1.3475498 = boost
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.013055302 = queryNorm
              0.48658627 = fieldWeight in 655, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.09375 = fieldNorm(doc=655)
          0.12443322 = weight(abstract_txt:topical in 655) [ClassicSimilarity], result of:
            0.12443322 = score(doc=655,freq=2.0), product of:
              0.14119598 = queryWeight, product of:
                1.6270754 = boost
                6.6470313 = idf(docFreq=155, maxDocs=44218)
                0.013055302 = queryNorm
              0.8812802 = fieldWeight in 655, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6470313 = idf(docFreq=155, maxDocs=44218)
                0.09375 = fieldNorm(doc=655)
          0.1940947 = weight(abstract_txt:lcsh in 655) [ClassicSimilarity], result of:
            0.1940947 = score(doc=655,freq=3.0), product of:
              0.18990682 = queryWeight, product of:
                2.3110664 = boost
                6.29421 = idf(docFreq=221, maxDocs=44218)
                0.013055302 = queryNorm
              1.0220523 = fieldWeight in 655, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.29421 = idf(docFreq=221, maxDocs=44218)
                0.09375 = fieldNorm(doc=655)
        0.16 = coord(4/25)
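
The score explanations in both lists above follow the same arithmetic: each matching term contributes queryWeight (boost x idf x queryNorm) times fieldWeight (tf x idf x fieldNorm), the contributions are summed, and the sum is multiplied by the coord factor. The sketch below reproduces two of the listed scores in Python. The function names are hypothetical, and the idf formula 1 + ln(maxDocs / (docFreq + 1)) is an assumption that happens to agree with the idf values printed in the listing; tf is taken as the square root of the term frequency, as the listing shows.

```python
import math

MAX_DOCS = 44218  # maxDocs as printed in the listing


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed formula; it matches the printed values, e.g. docFreq=5 -> 9.905128.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost, doc_freq, freq, field_norm, query_norm):
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(terms, query_norm, matched, total):
    # Sum of per-term scores, scaled by coord(matched/total).
    raw = sum(term_score(query_norm=query_norm, **t) for t in terms)
    return raw * (matched / total)


# Author match for doc 6891 ("logan"), coord(1/2).
author_score = document_score(
    [dict(boost=1.2078865, doc_freq=5, freq=1.0, field_norm=0.625)],
    query_norm=0.06894257, matched=1, total=2,
)

# Content match for doc 655 (headings, were, topical, lcsh), coord(4/25).
content_score = document_score(
    [
        dict(boost=1.261005, doc_freq=695, freq=3.0, field_norm=0.09375),
        dict(boost=1.3475498, doc_freq=3061, freq=2.0, field_norm=0.09375),
        dict(boost=1.6270754, doc_freq=155, freq=2.0, field_norm=0.09375),
        dict(boost=2.3110664, doc_freq=221, freq=3.0, field_norm=0.09375),
    ],
    query_norm=0.013055302, matched=4, total=25,
)

print(round(author_score, 4))   # approximately 2.5532
print(round(content_score, 4))  # approximately 0.0673
```

Small differences in the last digits against the listing are expected, since the listing was computed in single precision and Python uses double precision.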