Donner, P.: Enhanced self-citation detection by fuzzy author name matching and complementary error estimates (2016)
- Abstract
- In this article I investigate the shortcomings of exact string match-based author self-citation detection methods. The contributions of this study are twofold. First, I apply a fuzzy string matching algorithm for self-citation detection and benchmark this approach and other common methods of exclusively author name-based self-citation detection against a manually curated ground truth sample. Near full recall can be achieved with the proposed method while incurring only negligible precision loss. Second, I report some important observations from the results about the extent of latent self-citations and their characteristics and give an example of the effect of improved self-citation detection on the document level self-citation rate of real data.
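
The contrast between exact and fuzzy author-name matching can be sketched as follows. This is a minimal illustration, not the algorithm used in the article: the abstract does not name the fuzzy matching algorithm, so Python's standard-library `difflib.SequenceMatcher` stands in for it. The helper names (`normalize`, `is_exact_self_citation`, `is_fuzzy_self_citation`) and the 0.85 similarity threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a name and replace punctuation with spaces, collapsing whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())


def is_exact_self_citation(citing_authors, cited_authors):
    """Exact matching: flag a self-citation only if a normalized name appears in both lists."""
    citing = {normalize(n) for n in citing_authors}
    cited = {normalize(n) for n in cited_authors}
    return bool(citing & cited)


def is_fuzzy_self_citation(citing_authors, cited_authors, threshold=0.85):
    """Fuzzy matching: flag a self-citation if any name pair is similar enough.

    The threshold is an assumed value for illustration, not one taken from the article.
    """
    return any(
        SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold
        for a in citing_authors
        for b in cited_authors
    )


if __name__ == "__main__":
    citing = ["Donner, P."]
    cited_misspelled = ["Doner, P."]  # hypothetical spelling variant of the same author
    print(is_exact_self_citation(citing, cited_misspelled))
    print(is_fuzzy_self_citation(citing, cited_misspelled))
```

In this sketch a misspelled name variant escapes the exact comparison but is caught by the similarity threshold, which is the kind of latent self-citation the abstract refers to. Lowering the threshold raises recall at the cost of precision, so a real evaluation would compare both against a manually curated sample, as the article does.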