Document (#39278)

Author
Gatenby, J.
Thornburg, G.
Weitz, J.
Title
Collected work clustering in WorldCat : three techniques for maintaining records
Source
Code4Lib journal. Issue 30(2015), [http://journal.code4lib.org]
Year
2015
Abstract
WorldCat records are clustered into works, and within works, into content and manifestation clusters. A recent project revisited the clustering of collected works that had been previously sidelined because of the challenges posed by their complexity. Attention was given to both the identification of collected works and to the determination of the component works within them. By extensively analysing cast-list information, performance notes, contents notes, titles, uniform titles and added entries, the contents of collected works could be identified and differentiated so that correct clustering was achieved. Further work is envisaged in the form of refining the tests and weights and also in the creation and use of name/title authority records and other knowledge cards in clustering. There is a requirement to link collected works with their component works for use in search and retrieval.
Content
Cf.: http://journal.code4lib.org/articles/10963.
Theme
Descriptive cataloguing (Formalerschließung)
Object
WorldCat
FRBR

Similar documents (content)

  1. FictionFinder : a FRBR-based prototype for fiction in WorldCat (n.d.) 0.29
    0.28985575 = sum of:
      0.28985575 = product of:
        0.90579927 = sum of:
          0.03190575 = weight(abstract_txt:into in 3433) [ClassicSimilarity], result of:
            0.03190575 = score(doc=3433,freq=2.0), product of:
              0.06467179 = queryWeight, product of:
                1.0243733 = boost
                3.72107 = idf(docFreq=2802, maxDocs=42596)
                0.016966365 = queryNorm
              0.49334884 = fieldWeight in 3433, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.72107 = idf(docFreq=2802, maxDocs=42596)
                0.09375 = fieldNorm(doc=3433)
          0.09607321 = weight(abstract_txt:manifestation in 3433) [ClassicSimilarity], result of:
            0.09607321 = score(doc=3433,freq=1.0), product of:
              0.13485605 = queryWeight, product of:
                1.0459743 = boost
                7.5990725 = idf(docFreq=57, maxDocs=42596)
                0.016966365 = queryNorm
              0.7124131 = fieldWeight in 3433, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5990725 = idf(docFreq=57, maxDocs=42596)
                0.09375 = fieldNorm(doc=3433)
          0.034765802 = weight(abstract_txt:work in 3433) [ClassicSimilarity], result of:
            0.034765802 = score(doc=3433,freq=2.0), product of:
              0.06848105 = queryWeight, product of:
                1.0541102 = boost
                3.82909 = idf(docFreq=2515, maxDocs=42596)
                0.016966365 = queryNorm
              0.5076704 = fieldWeight in 3433, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.82909 = idf(docFreq=2515, maxDocs=42596)
                0.09375 = fieldNorm(doc=3433)
          0.105135955 = weight(abstract_txt:clustered in 3433) [ClassicSimilarity], result of:
            0.105135955 = score(doc=3433,freq=1.0), product of:
              0.14320882 = queryWeight, product of:
                1.0778806 = boost
                7.8308744 = idf(docFreq=45, maxDocs=42596)
                0.016966365 = queryNorm
              0.73414445 = fieldWeight in 3433, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8308744 = idf(docFreq=45, maxDocs=42596)
                0.09375 = fieldNorm(doc=3433)
          0.08203315 = weight(abstract_txt:titles in 3433) [ClassicSimilarity], result of:
            0.08203315 = score(doc=3433,freq=1.0), product of:
              0.15292265 = queryWeight, product of:
                1.5752037 = boost
                5.7219796 = idf(docFreq=378, maxDocs=42596)
                0.016966365 = queryNorm
              0.5364356 = fieldWeight in 3433, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7219796 = idf(docFreq=378, maxDocs=42596)
                0.09375 = fieldNorm(doc=3433)
          0.12635523 = weight(abstract_txt:records in 3433) [ClassicSimilarity], result of:
            0.12635523 = score(doc=3433,freq=5.0), product of:
              0.1365363 = queryWeight, product of:
                1.8229321 = boost
                4.414574 = idf(docFreq=1400, maxDocs=42596)
                0.016966365 = queryNorm
              0.9254333 = fieldWeight in 3433, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.414574 = idf(docFreq=1400, maxDocs=42596)
                0.09375 = fieldNorm(doc=3433)
          0.17522097 = weight(abstract_txt:worldcat in 3433) [ClassicSimilarity], result of:
            0.17522097 = score(doc=3433,freq=1.0), product of:
              0.25363135 = queryWeight, product of:
                2.0286274 = boost
                7.369056 = idf(docFreq=72, maxDocs=42596)
                0.016966365 = queryNorm
              0.690849 = fieldWeight in 3433, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.369056 = idf(docFreq=72, maxDocs=42596)
                0.09375 = fieldNorm(doc=3433)
          0.25430918 = weight(abstract_txt:works in 3433) [ClassicSimilarity], result of:
            0.25430918 = score(doc=3433,freq=1.0), product of:
              0.51610756 = queryWeight, product of:
                5.787632 = boost
                5.2559414 = idf(docFreq=603, maxDocs=42596)
                0.016966365 = queryNorm
              0.4927445 = fieldWeight in 3433, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2559414 = idf(docFreq=603, maxDocs=42596)
                0.09375 = fieldNorm(doc=3433)
        0.32 = coord(8/25)
    
  2. Dwyer, J.: Bibliographic records enhancement : from the drawing board to the catalog screen (1991) 0.17
    0.17148769 = sum of:
      0.17148769 = product of:
        0.85743845 = sum of:
          0.09570534 = weight(abstract_txt:titles in 819) [ClassicSimilarity], result of:
            0.09570534 = score(doc=819,freq=1.0), product of:
              0.15292265 = queryWeight, product of:
                1.5752037 = boost
                5.7219796 = idf(docFreq=378, maxDocs=42596)
                0.016966365 = queryNorm
              0.6258415 = fieldWeight in 819, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.7219796 = idf(docFreq=378, maxDocs=42596)
                0.109375 = fieldNorm(doc=819)
          0.19747491 = weight(abstract_txt:contents in 819) [ClassicSimilarity], result of:
            0.19747491 = score(doc=819,freq=4.0), product of:
              0.15613574 = queryWeight, product of:
                1.591666 = boost
                5.78178 = idf(docFreq=356, maxDocs=42596)
                0.016966365 = queryNorm
              1.2647643 = fieldWeight in 819, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.78178 = idf(docFreq=356, maxDocs=42596)
                0.109375 = fieldNorm(doc=819)
          0.17433104 = weight(abstract_txt:notes in 819) [ClassicSimilarity], result of:
            0.17433104 = score(doc=819,freq=3.0), product of:
              0.15814559 = queryWeight, product of:
                1.6018777 = boost
                5.818874 = idf(docFreq=343, maxDocs=42596)
                0.016966365 = queryNorm
              1.1023452 = fieldWeight in 819, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.818874 = idf(docFreq=343, maxDocs=42596)
                0.109375 = fieldNorm(doc=819)
          0.09323308 = weight(abstract_txt:records in 819) [ClassicSimilarity], result of:
            0.09323308 = score(doc=819,freq=2.0), product of:
              0.1365363 = queryWeight, product of:
                1.8229321 = boost
                4.414574 = idf(docFreq=1400, maxDocs=42596)
                0.016966365 = queryNorm
              0.68284464 = fieldWeight in 819, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.414574 = idf(docFreq=1400, maxDocs=42596)
                0.109375 = fieldNorm(doc=819)
          0.296694 = weight(abstract_txt:works in 819) [ClassicSimilarity], result of:
            0.296694 = score(doc=819,freq=1.0), product of:
              0.51610756 = queryWeight, product of:
                5.787632 = boost
                5.2559414 = idf(docFreq=603, maxDocs=42596)
                0.016966365 = queryNorm
              0.57486856 = fieldWeight in 819, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2559414 = idf(docFreq=603, maxDocs=42596)
                0.109375 = fieldNorm(doc=819)
        0.2 = coord(5/25)
    
  3. Hickey, T.B.; O'Neill, E.T.; Toves, J.: Experiments with the IFLA Functional Requirements for Bibliographic Records (FRBR) (2002) 0.16
    0.15592828 = sum of:
      0.15592828 = product of:
        0.7796414 = sum of:
          0.030081034 = weight(abstract_txt:into in 2661) [ClassicSimilarity], result of:
            0.030081034 = score(doc=2661,freq=1.0), product of:
              0.06467179 = queryWeight, product of:
                1.0243733 = boost
                3.72107 = idf(docFreq=2802, maxDocs=42596)
                0.016966365 = queryNorm
              0.46513376 = fieldWeight in 2661, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.72107 = idf(docFreq=2802, maxDocs=42596)
                0.125 = fieldNorm(doc=2661)
          0.046354406 = weight(abstract_txt:work in 2661) [ClassicSimilarity], result of:
            0.046354406 = score(doc=2661,freq=2.0), product of:
              0.06848105 = queryWeight, product of:
                1.0541102 = boost
                3.82909 = idf(docFreq=2515, maxDocs=42596)
                0.016966365 = queryNorm
              0.6768939 = fieldWeight in 2661, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.82909 = idf(docFreq=2515, maxDocs=42596)
                0.125 = fieldNorm(doc=2661)
          0.13049911 = weight(abstract_txt:records in 2661) [ClassicSimilarity], result of:
            0.13049911 = score(doc=2661,freq=3.0), product of:
              0.1365363 = queryWeight, product of:
                1.8229321 = boost
                4.414574 = idf(docFreq=1400, maxDocs=42596)
                0.016966365 = queryNorm
              0.9557833 = fieldWeight in 2661, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.414574 = idf(docFreq=1400, maxDocs=42596)
                0.125 = fieldNorm(doc=2661)
          0.23362796 = weight(abstract_txt:worldcat in 2661) [ClassicSimilarity], result of:
            0.23362796 = score(doc=2661,freq=1.0), product of:
              0.25363135 = queryWeight, product of:
                2.0286274 = boost
                7.369056 = idf(docFreq=72, maxDocs=42596)
                0.016966365 = queryNorm
              0.921132 = fieldWeight in 2661, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.369056 = idf(docFreq=72, maxDocs=42596)
                0.125 = fieldNorm(doc=2661)
          0.33907887 = weight(abstract_txt:works in 2661) [ClassicSimilarity], result of:
            0.33907887 = score(doc=2661,freq=1.0), product of:
              0.51610756 = queryWeight, product of:
                5.787632 = boost
                5.2559414 = idf(docFreq=603, maxDocs=42596)
                0.016966365 = queryNorm
              0.6569927 = fieldWeight in 2661, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2559414 = idf(docFreq=603, maxDocs=42596)
                0.125 = fieldNorm(doc=2661)
        0.2 = coord(5/25)
    
  4. Carlyle, A.; Summerlin, J.: Transforming catalog displays : records clustering for works of fiction (2000) 0.14
    0.13922724 = sum of:
      0.13922724 = product of:
        0.6961362 = sum of:
          0.018800646 = weight(abstract_txt:into in 1101) [ClassicSimilarity], result of:
            0.018800646 = score(doc=1101,freq=1.0), product of:
              0.06467179 = queryWeight, product of:
                1.0243733 = boost
                3.72107 = idf(docFreq=2802, maxDocs=42596)
                0.016966365 = queryNorm
              0.2907086 = fieldWeight in 1101, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.72107 = idf(docFreq=2802, maxDocs=42596)
                0.078125 = fieldNorm(doc=1101)
          0.020485947 = weight(abstract_txt:work in 1101) [ClassicSimilarity], result of:
            0.020485947 = score(doc=1101,freq=1.0), product of:
              0.06848105 = queryWeight, product of:
                1.0541102 = boost
                3.82909 = idf(docFreq=2515, maxDocs=42596)
                0.016966365 = queryNorm
              0.29914767 = fieldWeight in 1101, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.82909 = idf(docFreq=2515, maxDocs=42596)
                0.078125 = fieldNorm(doc=1101)
          0.09417962 = weight(abstract_txt:records in 1101) [ClassicSimilarity], result of:
            0.09417962 = score(doc=1101,freq=4.0), product of:
              0.1365363 = queryWeight, product of:
                1.8229321 = boost
                4.414574 = idf(docFreq=1400, maxDocs=42596)
                0.016966365 = queryNorm
              0.6897772 = fieldWeight in 1101, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.414574 = idf(docFreq=1400, maxDocs=42596)
                0.078125 = fieldNorm(doc=1101)
          0.35074565 = weight(abstract_txt:clustering in 1101) [ClassicSimilarity], result of:
            0.35074565 = score(doc=1101,freq=4.0), product of:
              0.3610643 = queryWeight, product of:
                3.4230094 = boost
                6.2170978 = idf(docFreq=230, maxDocs=42596)
                0.016966365 = queryNorm
              0.97142154 = fieldWeight in 1101, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2170978 = idf(docFreq=230, maxDocs=42596)
                0.078125 = fieldNorm(doc=1101)
          0.2119243 = weight(abstract_txt:works in 1101) [ClassicSimilarity], result of:
            0.2119243 = score(doc=1101,freq=1.0), product of:
              0.51610756 = queryWeight, product of:
                5.787632 = boost
                5.2559414 = idf(docFreq=603, maxDocs=42596)
                0.016966365 = queryNorm
              0.41062042 = fieldWeight in 1101, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2559414 = idf(docFreq=603, maxDocs=42596)
                0.078125 = fieldNorm(doc=1101)
        0.2 = coord(5/25)
    
  5. Smiraglia, R.P.: Knowledge sharing and content genealogy : extending the "works" model as a metaphor for non-documentary artefacts with case studies of Etruscan artefacts (2004) 0.13
    0.12740086 = sum of:
      0.12740086 = product of:
        0.6370043 = sum of:
          0.013160452 = weight(abstract_txt:into in 3672) [ClassicSimilarity], result of:
            0.013160452 = score(doc=3672,freq=1.0), product of:
              0.06467179 = queryWeight, product of:
                1.0243733 = boost
                3.72107 = idf(docFreq=2802, maxDocs=42596)
                0.016966365 = queryNorm
              0.20349602 = fieldWeight in 3672, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.72107 = idf(docFreq=2802, maxDocs=42596)
                0.0546875 = fieldNorm(doc=3672)
          0.014340162 = weight(abstract_txt:work in 3672) [ClassicSimilarity], result of:
            0.014340162 = score(doc=3672,freq=1.0), product of:
              0.06848105 = queryWeight, product of:
                1.0541102 = boost
                3.82909 = idf(docFreq=2515, maxDocs=42596)
                0.016966365 = queryNorm
              0.20940337 = fieldWeight in 3672, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.82909 = idf(docFreq=2515, maxDocs=42596)
                0.0546875 = fieldNorm(doc=3672)
          0.04661654 = weight(abstract_txt:records in 3672) [ClassicSimilarity], result of:
            0.04661654 = score(doc=3672,freq=2.0), product of:
              0.1365363 = queryWeight, product of:
                1.8229321 = boost
                4.414574 = idf(docFreq=1400, maxDocs=42596)
                0.016966365 = queryNorm
              0.34142232 = fieldWeight in 3672, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.414574 = idf(docFreq=1400, maxDocs=42596)
                0.0546875 = fieldNorm(doc=3672)
          0.117846124 = weight(abstract_txt:collected in 3672) [ClassicSimilarity], result of:
            0.117846124 = score(doc=3672,freq=1.0), product of:
              0.37849304 = queryWeight, product of:
                3.9183185 = boost
                5.693369 = idf(docFreq=389, maxDocs=42596)
                0.016966365 = queryNorm
              0.31135613 = fieldWeight in 3672, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.693369 = idf(docFreq=389, maxDocs=42596)
                0.0546875 = fieldNorm(doc=3672)
          0.44504103 = weight(abstract_txt:works in 3672) [ClassicSimilarity], result of:
            0.44504103 = score(doc=3672,freq=9.0), product of:
              0.51610756 = queryWeight, product of:
                5.787632 = boost
                5.2559414 = idf(docFreq=603, maxDocs=42596)
                0.016966365 = queryNorm
              0.8623029 = fieldWeight in 3672, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                5.2559414 = idf(docFreq=603, maxDocs=42596)
                0.0546875 = fieldNorm(doc=3672)
        0.2 = coord(5/25)
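
The score trees above all follow the same pattern: each matching term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is scaled by the coord factor (matching terms / query terms). As a minimal sketch, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the values printed in the trees, the following Python recomputes the score of entry 1 (doc 3433) from the figures listed there. The function and variable names are illustrative only and are not part of any Lucene API.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    # queryWeight = boost * idf * queryNorm
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    i = idf(doc_freq, max_docs)
    query_weight = boost * i * query_norm
    field_weight = math.sqrt(freq) * i * field_norm
    return query_weight * field_weight

# Constants copied from the explain tree of entry 1 (doc 3433).
MAX_DOCS = 42596
QUERY_NORM = 0.016966365
FIELD_NORM = 0.09375
QUERY_TERMS = 25  # denominator of coord(8/25)

# (term, freq, docFreq, boost) as printed for doc 3433.
terms_3433 = [
    ("into",          2.0, 2802, 1.0243733),
    ("manifestation", 1.0,   57, 1.0459743),
    ("work",          2.0, 2515, 1.0541102),
    ("clustered",     1.0,   45, 1.0778806),
    ("titles",        1.0,  378, 1.5752037),
    ("records",       5.0, 1400, 1.8229321),
    ("worldcat",      1.0,   72, 2.0286274),
    ("works",         1.0,  603, 5.787632),
]

per_term = {
    term: term_score(freq, df, MAX_DOCS, boost, QUERY_NORM, FIELD_NORM)
    for term, freq, df, boost in terms_3433
}

# coord rewards documents that match more of the query terms.
coord = len(terms_3433) / QUERY_TERMS
total = sum(per_term.values()) * coord

print(f"sum of term scores: {sum(per_term.values()):.6f}")
print(f"coord: {coord:.2f}")
print(f"final score: {total:.6f}")
```

Under these assumptions the result agrees with the listed 0.28985575 up to rounding (the trees print single-precision values). The same function can be applied to entries 2 to 5 by substituting their fieldNorm, term frequencies and coord(5/25).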