Sitas, A.; Kapidakis, S.: Duplicate detection algorithms of bibliographic descriptions (2008)
- Abstract
- Purpose - The paper focuses on duplicate record detection algorithms used in bibliographic databases.
- Design/methodology/approach - Individual algorithms, their application process for duplicate detection, and their results are described based on the available literature (published articles), information found on various library web sites, and follow-up e-mail communications.
- Findings - Algorithms are categorized according to whether they are applied as a single step or as two consecutive steps. The results of deletion, merging, and temporary and virtual consolidation of duplicate records are studied.
- Originality/value - The paper presents an overview of duplicate detection algorithms and the current state of their application in different library systems.
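The two-step categorization mentioned in the findings can be illustrated with a minimal sketch: a first step generates a coarse match key to group candidate duplicates, and a second step compares the candidates field by field. The record fields, key construction, and similarity threshold below are illustrative assumptions, not the specific algorithms surveyed in the paper.

```python
# Hypothetical two-step duplicate detection for bibliographic records.
# Step 1: build a coarse match key (blocking) from normalized title + year.
# Step 2: compare candidate pairs in detail (title similarity + author match).
import re
from collections import defaultdict
from difflib import SequenceMatcher

def match_key(record):
    """Step 1: normalize the title and year into a coarse blocking key."""
    title = re.sub(r"[^a-z0-9 ]", "", record["title"].lower())
    # Use the first four letters of the first three title words.
    prefix = "".join(w[:4] for w in title.split()[:3])
    return f"{prefix}:{record.get('year', '')}"

def is_duplicate(a, b, threshold=0.9):
    """Step 2: detailed field-by-field comparison of a candidate pair."""
    title_sim = SequenceMatcher(None, a["title"].lower(), b["title"].lower()).ratio()
    same_author = a["author"].lower() == b["author"].lower()
    return title_sim >= threshold and same_author

def find_duplicates(records):
    """Run both steps and return ids of records judged to be duplicates."""
    blocks = defaultdict(list)
    for rec in records:                # step 1: candidate generation
        blocks[match_key(rec)].append(rec)
    pairs = []
    for group in blocks.values():      # step 2: pairwise comparison
        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                if is_duplicate(group[i], group[j]):
                    pairs.append((group[i]["id"], group[j]["id"]))
    return pairs
```

Blocking keeps the detailed comparison from being run on every pair of records, which is why the surveyed systems typically split detection into these two consecutive steps.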