Frank, E.; Paynter, G.W.: Predicting Library of Congress Classifications from Library of Congress Subject Headings (2004)
- Abstract
- This paper addresses the problem of automatically assigning a Library of Congress Classification (LCC) to a work given its set of Library of Congress Subject Headings (LCSH). LCCs are organized in a tree: The root node of this hierarchy comprises all possible topics, and leaf nodes correspond to the most specialized topic areas defined. We describe a procedure that, given a resource identified by its LCSH, automatically places that resource in the LCC hierarchy. The procedure uses machine learning techniques and training data from a large library catalog to learn a model that maps from sets of LCSH to classifications from the LCC tree. We present empirical results for our technique showing its accuracy on an independent collection of 50,000 LCSH/LCC pairs.
- Source
- Journal of the American Society for Information Science and Technology. 55(2004) no.3, p.214-227