Document (#39277)

Author
Gatenby, J.
Thornburg, G.
Weitz, J.
Title
Collected work clustering in WorldCat : three techniques for maintaining records
Source
Code4Lib journal. Issue 30(2015), [http://journal.code4lib.org]
Year
2015
Abstract
WorldCat records are clustered into works, and within works, into content and manifestation clusters. A recent project revisited the clustering of collected works that had been previously sidelined because of the challenges posed by their complexity. Attention was given to both the identification of collected works and to the determination of the component works within them. By extensively analysing cast-list information, performance notes, contents notes, titles, uniform titles and added entries, the contents of collected works could be identified and differentiated so that correct clustering was achieved. Further work is envisaged in the form of refining the tests and weights and also in the creation and use of name/title authority records and other knowledge cards in clustering. There is a requirement to link collected works with their component works for use in search and retrieval.
Content
Cf.: http://journal.code4lib.org/articles/10963.
Theme
Descriptive cataloguing
Object
WorldCat
FRBR

Similar documents (content)

  1. FictionFinder : a FRBR-based prototype for fiction in WorldCat (n.d.) 0.29
    0.2910194 = sum of:
      0.2910194 = product of:
        0.90943563 = sum of:
          0.03165059 = weight(abstract_txt:into in 2432) [ClassicSimilarity], result of:
            0.03165059 = score(doc=2432,freq=2.0), product of:
              0.064468876 = queryWeight, product of:
                1.02255 = boost
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.017026292 = queryNorm
              0.49094376 = fieldWeight in 2432, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.09375 = fieldNorm(doc=2432)
          0.034054514 = weight(abstract_txt:work in 2432) [ClassicSimilarity], result of:
            0.034054514 = score(doc=2432,freq=2.0), product of:
              0.06769325 = queryWeight, product of:
                1.0478091 = boost
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.017026292 = queryNorm
              0.50307107 = fieldWeight in 2432, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.09375 = fieldNorm(doc=2432)
          0.09684498 = weight(abstract_txt:manifestation in 2432) [ClassicSimilarity], result of:
            0.09684498 = score(doc=2432,freq=1.0), product of:
              0.13587731 = queryWeight, product of:
                1.0497067 = boost
                7.602543 = idf(docFreq=59, maxDocs=44218)
                0.017026292 = queryNorm
              0.7127384 = fieldWeight in 2432, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.602543 = idf(docFreq=59, maxDocs=44218)
                0.09375 = fieldNorm(doc=2432)
          0.10648003 = weight(abstract_txt:clustered in 2432) [ClassicSimilarity], result of:
            0.10648003 = score(doc=2432,freq=1.0), product of:
              0.14474636 = queryWeight, product of:
                1.0834237 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.017026292 = queryNorm
              0.7356318 = fieldWeight in 2432, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.09375 = fieldNorm(doc=2432)
          0.082626514 = weight(abstract_txt:titles in 2432) [ClassicSimilarity], result of:
            0.082626514 = score(doc=2432,freq=1.0), product of:
              0.15399921 = queryWeight, product of:
                1.5804063 = boost
                5.723078 = idf(docFreq=392, maxDocs=44218)
                0.017026292 = queryNorm
              0.53653854 = fieldWeight in 2432, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.723078 = idf(docFreq=392, maxDocs=44218)
                0.09375 = fieldNorm(doc=2432)
          0.12763429 = weight(abstract_txt:records in 2432) [ClassicSimilarity], result of:
            0.12763429 = score(doc=2432,freq=5.0), product of:
              0.13776033 = queryWeight, product of:
                1.8307002 = boost
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.017026292 = queryNorm
              0.9264952 = fieldWeight in 2432, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.09375 = fieldNorm(doc=2432)
          0.17713057 = weight(abstract_txt:worldcat in 2432) [ClassicSimilarity], result of:
            0.17713057 = score(doc=2432,freq=1.0), product of:
              0.2560361 = queryWeight, product of:
                2.0377932 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.017026292 = queryNorm
              0.6918187 = fieldWeight in 2432, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.09375 = fieldNorm(doc=2432)
          0.25301418 = weight(abstract_txt:works in 2432) [ClassicSimilarity], result of:
            0.25301418 = score(doc=2432,freq=1.0), product of:
              0.5154922 = queryWeight, product of:
                5.7829647 = boost
                5.2354193 = idf(docFreq=639, maxDocs=44218)
                0.017026292 = queryNorm
              0.49082056 = fieldWeight in 2432, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2354193 = idf(docFreq=639, maxDocs=44218)
                0.09375 = fieldNorm(doc=2432)
        0.32 = coord(8/25)
    
  2. Dwyer, J.: Bibliographic records enhancement : from the drawing board to the catalog screen (1991) 0.17
    0.17245735 = sum of:
      0.17245735 = product of:
        0.86228675 = sum of:
          0.0963976 = weight(abstract_txt:titles in 514) [ClassicSimilarity], result of:
            0.0963976 = score(doc=514,freq=1.0), product of:
              0.15399921 = queryWeight, product of:
                1.5804063 = boost
                5.723078 = idf(docFreq=392, maxDocs=44218)
                0.017026292 = queryNorm
              0.62596166 = fieldWeight in 514, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.723078 = idf(docFreq=392, maxDocs=44218)
                0.109375 = fieldNorm(doc=514)
          0.19895433 = weight(abstract_txt:contents in 514) [ClassicSimilarity], result of:
            0.19895433 = score(doc=514,freq=4.0), product of:
              0.15726182 = queryWeight, product of:
                1.5970597 = boost
                5.7833843 = idf(docFreq=369, maxDocs=44218)
                0.017026292 = queryNorm
              1.2651153 = fieldWeight in 514, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.7833843 = idf(docFreq=369, maxDocs=44218)
                0.109375 = fieldNorm(doc=514)
          0.17757483 = weight(abstract_txt:notes in 514) [ClassicSimilarity], result of:
            0.17757483 = score(doc=514,freq=3.0), product of:
              0.1604556 = queryWeight, product of:
                1.6131953 = boost
                5.8418155 = idf(docFreq=348, maxDocs=44218)
                0.017026292 = queryNorm
              1.1066914 = fieldWeight in 514, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.8418155 = idf(docFreq=348, maxDocs=44218)
                0.109375 = fieldNorm(doc=514)
          0.09417684 = weight(abstract_txt:records in 514) [ClassicSimilarity], result of:
            0.09417684 = score(doc=514,freq=2.0), product of:
              0.13776033 = queryWeight, product of:
                1.8307002 = boost
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.017026292 = queryNorm
              0.68362814 = fieldWeight in 514, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.109375 = fieldNorm(doc=514)
          0.29518318 = weight(abstract_txt:works in 514) [ClassicSimilarity], result of:
            0.29518318 = score(doc=514,freq=1.0), product of:
              0.5154922 = queryWeight, product of:
                5.7829647 = boost
                5.2354193 = idf(docFreq=639, maxDocs=44218)
                0.017026292 = queryNorm
              0.57262397 = fieldWeight in 514, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2354193 = idf(docFreq=639, maxDocs=44218)
                0.109375 = fieldNorm(doc=514)
        0.2 = coord(5/25)
    
  3. Hickey, T.B.; O'Neill, E.T.; Toves, J.: Experiments with the IFLA Functional Requirements for Bibliographic Records (FRBR) (2002) 0.16
    0.15611859 = sum of:
      0.15611859 = product of:
        0.7805929 = sum of:
          0.029840464 = weight(abstract_txt:into in 1660) [ClassicSimilarity], result of:
            0.029840464 = score(doc=1660,freq=1.0), product of:
              0.064468876 = queryWeight, product of:
                1.02255 = boost
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.017026292 = queryNorm
              0.46286622 = fieldWeight in 1660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.125 = fieldNorm(doc=1660)
          0.045406017 = weight(abstract_txt:work in 1660) [ClassicSimilarity], result of:
            0.045406017 = score(doc=1660,freq=2.0), product of:
              0.06769325 = queryWeight, product of:
                1.0478091 = boost
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.017026292 = queryNorm
              0.6707614 = fieldWeight in 1660, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.125 = fieldNorm(doc=1660)
          0.13182011 = weight(abstract_txt:records in 1660) [ClassicSimilarity], result of:
            0.13182011 = score(doc=1660,freq=3.0), product of:
              0.13776033 = queryWeight, product of:
                1.8307002 = boost
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.017026292 = queryNorm
              0.95688003 = fieldWeight in 1660, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.125 = fieldNorm(doc=1660)
          0.23617408 = weight(abstract_txt:worldcat in 1660) [ClassicSimilarity], result of:
            0.23617408 = score(doc=1660,freq=1.0), product of:
              0.2560361 = queryWeight, product of:
                2.0377932 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.017026292 = queryNorm
              0.9224249 = fieldWeight in 1660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.125 = fieldNorm(doc=1660)
          0.33735222 = weight(abstract_txt:works in 1660) [ClassicSimilarity], result of:
            0.33735222 = score(doc=1660,freq=1.0), product of:
              0.5154922 = queryWeight, product of:
                5.7829647 = boost
                5.2354193 = idf(docFreq=639, maxDocs=44218)
                0.017026292 = queryNorm
              0.6544274 = fieldWeight in 1660, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2354193 = idf(docFreq=639, maxDocs=44218)
                0.125 = fieldNorm(doc=1660)
        0.2 = coord(5/25)
    
  4. Carlyle, A.; Summerlin, J.: Transforming catalog displays : records clustering for works of fiction (2000) 0.14
    0.13952596 = sum of:
      0.13952596 = product of:
        0.6976298 = sum of:
          0.01865029 = weight(abstract_txt:into in 100) [ClassicSimilarity], result of:
            0.01865029 = score(doc=100,freq=1.0), product of:
              0.064468876 = queryWeight, product of:
                1.02255 = boost
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.017026292 = queryNorm
              0.28929138 = fieldWeight in 100, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.078125 = fieldNorm(doc=100)
          0.020066816 = weight(abstract_txt:work in 100) [ClassicSimilarity], result of:
            0.020066816 = score(doc=100,freq=1.0), product of:
              0.06769325 = queryWeight, product of:
                1.0478091 = boost
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.017026292 = queryNorm
              0.29643747 = fieldWeight in 100, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.078125 = fieldNorm(doc=100)
          0.09513297 = weight(abstract_txt:records in 100) [ClassicSimilarity], result of:
            0.09513297 = score(doc=100,freq=4.0), product of:
              0.13776033 = queryWeight, product of:
                1.8307002 = boost
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.017026292 = queryNorm
              0.6905687 = fieldWeight in 100, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.078125 = fieldNorm(doc=100)
          0.35293463 = weight(abstract_txt:clustering in 100) [ClassicSimilarity], result of:
            0.35293463 = score(doc=100,freq=4.0), product of:
              0.36336735 = queryWeight, product of:
                3.433187 = boost
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.017026292 = queryNorm
              0.9712888 = fieldWeight in 100, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.078125 = fieldNorm(doc=100)
          0.21084514 = weight(abstract_txt:works in 100) [ClassicSimilarity], result of:
            0.21084514 = score(doc=100,freq=1.0), product of:
              0.5154922 = queryWeight, product of:
                5.7829647 = boost
                5.2354193 = idf(docFreq=639, maxDocs=44218)
                0.017026292 = queryNorm
              0.40901715 = fieldWeight in 100, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2354193 = idf(docFreq=639, maxDocs=44218)
                0.078125 = fieldNorm(doc=100)
        0.2 = coord(5/25)
    
  5. Smiraglia, R.P.: Knowledge sharing and content genealogy : extending the "works" model as a metaphor for non-documentary artefacts with case studies of Etruscan artefacts (2004) 0.13
    0.12626043 = sum of:
      0.12626043 = product of:
        0.6313021 = sum of:
          0.013055203 = weight(abstract_txt:into in 2671) [ClassicSimilarity], result of:
            0.013055203 = score(doc=2671,freq=1.0), product of:
              0.064468876 = queryWeight, product of:
                1.02255 = boost
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.017026292 = queryNorm
              0.20250396 = fieldWeight in 2671, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2671)
          0.014046771 = weight(abstract_txt:work in 2671) [ClassicSimilarity], result of:
            0.014046771 = score(doc=2671,freq=1.0), product of:
              0.06769325 = queryWeight, product of:
                1.0478091 = boost
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.017026292 = queryNorm
              0.20750624 = fieldWeight in 2671, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7943997 = idf(docFreq=2703, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2671)
          0.04708842 = weight(abstract_txt:records in 2671) [ClassicSimilarity], result of:
            0.04708842 = score(doc=2671,freq=2.0), product of:
              0.13776033 = queryWeight, product of:
                1.8307002 = boost
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.017026292 = queryNorm
              0.34181407 = fieldWeight in 2671, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4196396 = idf(docFreq=1446, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2671)
          0.11433699 = weight(abstract_txt:collected in 2671) [ClassicSimilarity], result of:
            0.11433699 = score(doc=2671,freq=1.0), product of:
              0.37176245 = queryWeight, product of:
                3.882507 = boost
                5.6238427 = idf(docFreq=433, maxDocs=44218)
                0.017026292 = queryNorm
              0.3075539 = fieldWeight in 2671, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6238427 = idf(docFreq=433, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2671)
          0.44277477 = weight(abstract_txt:works in 2671) [ClassicSimilarity], result of:
            0.44277477 = score(doc=2671,freq=9.0), product of:
              0.5154922 = queryWeight, product of:
                5.7829647 = boost
                5.2354193 = idf(docFreq=639, maxDocs=44218)
                0.017026292 = queryNorm
              0.85893595 = fieldWeight in 2671, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                5.2354193 = idf(docFreq=639, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2671)
        0.2 = coord(5/25)
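
The score trees above follow the TF-IDF arithmetic of Lucene's ClassicSimilarity: tf is the square root of the term frequency, idf is 1 + ln(maxDocs / (docFreq + 1)), each term score is queryWeight times fieldWeight, and the sum is scaled by coord (matched terms over query terms). The sketch below recomputes the score of entry 1 (doc 2432) from the figures listed in its tree. The function and variable names are illustrative assumptions, not part of the record or of Lucene's API, and the result may differ from the listed value in the last digits because Lucene works in single precision.

```python
import math

# Constants copied from the explain output above.
MAX_DOCS = 44218
QUERY_NORM = 0.017026292


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    """Term frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    """Score of one term: queryWeight * fieldWeight."""
    i = idf(doc_freq)
    query_weight = boost * i * query_norm
    field_weight = tf(freq) * i * field_norm
    return query_weight * field_weight


# (term, freq, docFreq, boost) for entry 1, doc 2432, as listed above.
TERMS = [
    ("into", 2.0, 2962, 1.02255),
    ("work", 2.0, 2703, 1.0478091),
    ("manifestation", 1.0, 59, 1.0497067),
    ("clustered", 1.0, 46, 1.0834237),
    ("titles", 1.0, 392, 1.5804063),
    ("records", 5.0, 1446, 1.8307002),
    ("worldcat", 1.0, 74, 2.0377932),
    ("works", 1.0, 639, 5.7829647),
]
FIELD_NORM = 0.09375  # fieldNorm(doc=2432)

total = sum(term_score(f, df, b, FIELD_NORM) for _, f, df, b in TERMS)
coord = 8 / 25  # 8 of the 25 query terms matched this document
final = total * coord

# Should land close to the 0.2910194 reported for entry 1.
print(f"sum={total:.6f} coord={coord:.2f} score={final:.6f}")
```

The same functions apply to entries 2 to 5 by substituting their term lists, fieldNorm values, and coord(5/25).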