Document (#38293)

Author
Ferreira, A.A.
Veloso, A.
Gonçalves, M.A.
Laender, A.H.F.
Title
Self-training author name disambiguation for information scarce scenarios
Source
Journal of the Association for Information Science and Technology. 65(2014) no.6, pp. 1257-1278
Year
2014
Abstract
We present a novel 3-step self-training method for author name disambiguation, SAND (self-training associative name disambiguator), which requires no manual labeling and no parameterization (in real-world scenarios), and is particularly suitable for the common situation in which only the most basic information about a citation record is available (i.e., author names, and work and venue titles). During the first step, real-world heuristics on coauthors produce highly pure (although fragmented) clusters. In the second step, the most representative of these clusters are selected to serve as training data for the third, supervised author assignment step. The third step exploits a state-of-the-art transductive disambiguation method capable of detecting unseen authors not included in any training example and of incorporating reliable predictions into the training data. Experiments conducted with standard public collections, using the minimum set of attributes present in a citation, demonstrate that our proposed method outperforms all representative unsupervised author grouping disambiguation methods and is very competitive with fully supervised author assignment methods. Thus, unlike other bootstrapping methods that exploit privileged, hard-to-obtain information such as self-citations and personal information, our proposed method produces top-notch performance with no (manual) training data or parameterization and in the presence of scarce information.
Object
SAND

Similar documents (author)

  1. Cota, R.G.; Ferreira, A.A.; Nascimento, C.; Gonçalves, M.A.; Laender, A.H.F.: An unsupervised heuristic-based hierarchical method for name disambiguation in bibliographic citations (2010) 5.75
    5.748005 = sum of:
      5.748005 = sum of:
        1.1513202 = weight(author_txt:gonçalves in 3986) [ClassicSimilarity], result of:
          1.1513202 = score(doc=3986,freq=1.0), product of:
            0.43033007 = queryWeight, product of:
              8.561393 = idf(docFreq=22, maxDocs=44218)
              0.050264027 = queryNorm
            2.6754353 = fieldWeight in 3986, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.561393 = idf(docFreq=22, maxDocs=44218)
              0.3125 = fieldNorm(doc=3986)
        1.332519 = weight(author_txt:ferreira in 3986) [ClassicSimilarity], result of:
          1.332519 = score(doc=3986,freq=1.0), product of:
            0.47437292 = queryWeight, product of:
              1.049927 = boost
              8.988837 = idf(docFreq=14, maxDocs=44218)
              0.050264027 = queryNorm
            2.8090117 = fieldWeight in 3986, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.988837 = idf(docFreq=14, maxDocs=44218)
              0.3125 = fieldNorm(doc=3986)
        1.6320827 = weight(author_txt:laender in 3986) [ClassicSimilarity], result of:
          1.6320827 = score(doc=3986,freq=1.0), product of:
            0.5430407 = queryWeight, product of:
              1.1233506 = boost
              9.617446 = idf(docFreq=7, maxDocs=44218)
              0.050264027 = queryNorm
            3.005452 = fieldWeight in 3986, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.617446 = idf(docFreq=7, maxDocs=44218)
              0.3125 = fieldNorm(doc=3986)
        1.6320827 = weight(author_txt:a.h.f in 3986) [ClassicSimilarity], result of:
          1.6320827 = score(doc=3986,freq=1.0), product of:
            0.5430407 = queryWeight, product of:
              1.1233506 = boost
              9.617446 = idf(docFreq=7, maxDocs=44218)
              0.050264027 = queryNorm
            3.005452 = fieldWeight in 3986, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.617446 = idf(docFreq=7, maxDocs=44218)
              0.3125 = fieldNorm(doc=3986)
    
  2. Santana, A.F.; Gonçalves, M.A.; Laender, A.H.F.; Ferreira, A.A.: Incremental author name disambiguation by exploiting domain-specific heuristics (2017) 5.75
    5.748005 = sum of:
      5.748005 = sum of:
        1.1513202 = weight(author_txt:gonçalves in 3587) [ClassicSimilarity], result of:
          1.1513202 = score(doc=3587,freq=1.0), product of:
            0.43033007 = queryWeight, product of:
              8.561393 = idf(docFreq=22, maxDocs=44218)
              0.050264027 = queryNorm
            2.6754353 = fieldWeight in 3587, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.561393 = idf(docFreq=22, maxDocs=44218)
              0.3125 = fieldNorm(doc=3587)
        1.332519 = weight(author_txt:ferreira in 3587) [ClassicSimilarity], result of:
          1.332519 = score(doc=3587,freq=1.0), product of:
            0.47437292 = queryWeight, product of:
              1.049927 = boost
              8.988837 = idf(docFreq=14, maxDocs=44218)
              0.050264027 = queryNorm
            2.8090117 = fieldWeight in 3587, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.988837 = idf(docFreq=14, maxDocs=44218)
              0.3125 = fieldNorm(doc=3587)
        1.6320827 = weight(author_txt:laender in 3587) [ClassicSimilarity], result of:
          1.6320827 = score(doc=3587,freq=1.0), product of:
            0.5430407 = queryWeight, product of:
              1.1233506 = boost
              9.617446 = idf(docFreq=7, maxDocs=44218)
              0.050264027 = queryNorm
            3.005452 = fieldWeight in 3587, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.617446 = idf(docFreq=7, maxDocs=44218)
              0.3125 = fieldNorm(doc=3587)
        1.6320827 = weight(author_txt:a.h.f in 3587) [ClassicSimilarity], result of:
          1.6320827 = score(doc=3587,freq=1.0), product of:
            0.5430407 = queryWeight, product of:
              1.1233506 = boost
              9.617446 = idf(docFreq=7, maxDocs=44218)
              0.050264027 = queryNorm
            3.005452 = fieldWeight in 3587, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.617446 = idf(docFreq=7, maxDocs=44218)
              0.3125 = fieldNorm(doc=3587)
    
  3. Silva, A.J.C.; Gonçalves, M.A.; Laender, A.H.F.; Modesto, M.A.B.; Cristo, M.; Ziviani, N.: Finding what is missing from a digital library : a case study in the computer science field (2009) 2.65
    2.649291 = sum of:
      2.649291 = product of:
        3.5323882 = sum of:
          0.9210562 = weight(author_txt:gonçalves in 4219) [ClassicSimilarity], result of:
            0.9210562 = score(doc=4219,freq=1.0), product of:
              0.43033007 = queryWeight, product of:
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.050264027 = queryNorm
              2.1403482 = fieldWeight in 4219, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.25 = fieldNorm(doc=4219)
          1.3056661 = weight(author_txt:laender in 4219) [ClassicSimilarity], result of:
            1.3056661 = score(doc=4219,freq=1.0), product of:
              0.5430407 = queryWeight, product of:
                1.1233506 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.050264027 = queryNorm
              2.4043615 = fieldWeight in 4219, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.25 = fieldNorm(doc=4219)
          1.3056661 = weight(author_txt:a.h.f in 4219) [ClassicSimilarity], result of:
            1.3056661 = score(doc=4219,freq=1.0), product of:
              0.5430407 = queryWeight, product of:
                1.1233506 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.050264027 = queryNorm
              2.4043615 = fieldWeight in 4219, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.25 = fieldNorm(doc=4219)
        0.75 = coord(3/4)
    
  4. Pereira, D.A.; Ribeiro-Neto, B.; Ziviani, N.; Laender, A.H.F.; Gonçalves, M.A.: A generic Web-based entity resolution framework (2011) 2.65
    2.649291 = sum of:
      2.649291 = product of:
        3.5323882 = sum of:
          0.9210562 = weight(author_txt:gonçalves in 4450) [ClassicSimilarity], result of:
            0.9210562 = score(doc=4450,freq=1.0), product of:
              0.43033007 = queryWeight, product of:
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.050264027 = queryNorm
              2.1403482 = fieldWeight in 4450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.25 = fieldNorm(doc=4450)
          1.3056661 = weight(author_txt:laender in 4450) [ClassicSimilarity], result of:
            1.3056661 = score(doc=4450,freq=1.0), product of:
              0.5430407 = queryWeight, product of:
                1.1233506 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.050264027 = queryNorm
              2.4043615 = fieldWeight in 4450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.25 = fieldNorm(doc=4450)
          1.3056661 = weight(author_txt:a.h.f in 4450) [ClassicSimilarity], result of:
            1.3056661 = score(doc=4450,freq=1.0), product of:
              0.5430407 = queryWeight, product of:
                1.1233506 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.050264027 = queryNorm
              2.4043615 = fieldWeight in 4450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.25 = fieldNorm(doc=4450)
        0.75 = coord(3/4)
    
  5. Ribeiro-Neto, B.; Laender, A.H.F.; Lima, L.R.S. de: An experimental study in automatically categorizing medical documents (2001) 1.63
    1.6320827 = sum of:
      1.6320827 = product of:
        3.2641654 = sum of:
          1.6320827 = weight(author_txt:laender in 5702) [ClassicSimilarity], result of:
            1.6320827 = score(doc=5702,freq=1.0), product of:
              0.5430407 = queryWeight, product of:
                1.1233506 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.050264027 = queryNorm
              3.005452 = fieldWeight in 5702, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.3125 = fieldNorm(doc=5702)
          1.6320827 = weight(author_txt:a.h.f in 5702) [ClassicSimilarity], result of:
            1.6320827 = score(doc=5702,freq=1.0), product of:
              0.5430407 = queryWeight, product of:
                1.1233506 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.050264027 = queryNorm
              3.005452 = fieldWeight in 5702, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.3125 = fieldNorm(doc=5702)
        0.5 = coord(2/4)
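
The score explanations above all follow the same TF-IDF arithmetic. The sketch below reproduces it in Python, assuming the formulas that the listed numbers are consistent with: idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq), queryWeight = boost * idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and a coord factor equal to the fraction of query terms matched. The function names (idf, term_score, document_score) are illustrative and not part of any library; queryNorm and fieldNorm are taken from the listing as given rather than derived.

```python
import math


def idf(doc_freq, max_docs):
    """Inverse document frequency, as assumed from the listed values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def document_score(term_scores, total_query_terms):
    """Sum of the matching term scores, scaled by coord(matched/total)."""
    coord = len(term_scores) / total_query_terms
    return coord * sum(term_scores)


# Values copied from the explanation for doc 5702 (author_txt:laender).
MAX_DOCS = 44218
QUERY_NORM = 0.050264027

laender = term_score(1.0, 7, MAX_DOCS, QUERY_NORM, 0.3125, boost=1.1233506)
# Both "laender" and "a.h.f" carry the same statistics in the listing,
# and 2 of the 4 query terms match, hence coord(2/4).
doc_5702 = document_score([laender, laender], 4)

print(laender, doc_5702)
```

Up to single-precision rounding, the two printed values should agree with the 1.6320827 shown for doc 5702, both as the per-term score and as the final score after coord(2/4). Entries 1 and 2 match all four query terms, so their coord factor is 1 and does not appear in the listing.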
    

Similar documents (content)

  1. Cota, R.G.; Ferreira, A.A.; Nascimento, C.; Gonçalves, M.A.; Laender, A.H.F.: An unsupervised heuristic-based hierarchical method for name disambiguation in bibliographic citations (2010) 0.54
    0.53813595 = sum of:
      0.53813595 = product of:
        1.1211165 = sum of:
          0.12569997 = weight(abstract_txt:venue in 3986) [ClassicSimilarity], result of:
            0.12569997 = score(doc=3986,freq=2.0), product of:
              0.1570059 = queryWeight, product of:
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.017333722 = queryNorm
              0.8006066 = fieldWeight in 3986, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.0625 = fieldNorm(doc=3986)
          0.035780825 = weight(abstract_txt:real in 3986) [ClassicSimilarity], result of:
            0.035780825 = score(doc=3986,freq=1.0), product of:
              0.10784817 = queryWeight, product of:
                1.1720966 = boost
                5.308326 = idf(docFreq=594, maxDocs=44218)
                0.017333722 = queryNorm
              0.33177036 = fieldWeight in 3986, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.308326 = idf(docFreq=594, maxDocs=44218)
                0.0625 = fieldNorm(doc=3986)
          0.014697173 = weight(abstract_txt:information in 3986) [ClassicSimilarity], result of:
            0.014697173 = score(doc=3986,freq=3.0), product of:
              0.05608 = queryWeight, product of:
                1.3363832 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.017333722 = queryNorm
              0.26207513 = fieldWeight in 3986, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=3986)
          0.036183294 = weight(abstract_txt:methods in 3986) [ClassicSimilarity], result of:
            0.036183294 = score(doc=3986,freq=2.0), product of:
              0.09871998 = queryWeight, product of:
                1.3734257 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.017333722 = queryNorm
              0.36652455 = fieldWeight in 3986, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.0625 = fieldNorm(doc=3986)
          0.114578284 = weight(abstract_txt:clusters in 3986) [ClassicSimilarity], result of:
            0.114578284 = score(doc=3986,freq=3.0), product of:
              0.16245772 = queryWeight, product of:
                1.4385574 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.017333722 = queryNorm
              0.70528066 = fieldWeight in 3986, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=3986)
          0.07469298 = weight(abstract_txt:representative in 3986) [ClassicSimilarity], result of:
            0.07469298 = score(doc=3986,freq=1.0), product of:
              0.17615665 = queryWeight, product of:
                1.4979818 = boost
                6.784232 = idf(docFreq=135, maxDocs=44218)
                0.017333722 = queryNorm
              0.4240145 = fieldWeight in 3986, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.784232 = idf(docFreq=135, maxDocs=44218)
                0.0625 = fieldNorm(doc=3986)
          0.099421374 = weight(abstract_txt:supervised in 3986) [ClassicSimilarity], result of:
            0.099421374 = score(doc=3986,freq=1.0), product of:
              0.21315673 = queryWeight, product of:
                1.6478077 = boost
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.017333722 = queryNorm
              0.4664238 = fieldWeight in 3986, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.0625 = fieldNorm(doc=3986)
          0.09628005 = weight(abstract_txt:name in 3986) [ClassicSimilarity], result of:
            0.09628005 = score(doc=3986,freq=2.0), product of:
              0.18956459 = queryWeight, product of:
                1.9031861 = boost
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.017333722 = queryNorm
              0.5079011 = fieldWeight in 3986, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.0625 = fieldNorm(doc=3986)
          0.09754504 = weight(abstract_txt:method in 3986) [ClassicSimilarity], result of:
            0.09754504 = score(doc=3986,freq=5.0), product of:
              0.15507293 = queryWeight, product of:
                1.9876504 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.017333722 = queryNorm
              0.6290269 = fieldWeight in 3986, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=3986)
          0.1892028 = weight(abstract_txt:disambiguation in 3986) [ClassicSimilarity], result of:
            0.1892028 = score(doc=3986,freq=1.0), product of:
              0.41242114 = queryWeight, product of:
                3.2414734 = boost
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.017333722 = queryNorm
              0.45876116 = fieldWeight in 3986, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.0625 = fieldNorm(doc=3986)
          0.12518309 = weight(abstract_txt:author in 3986) [ClassicSimilarity], result of:
            0.12518309 = score(doc=3986,freq=2.0), product of:
              0.28451604 = queryWeight, product of:
                3.2973952 = boost
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.017333722 = queryNorm
              0.43998608 = fieldWeight in 3986, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.0625 = fieldNorm(doc=3986)
          0.11185173 = weight(abstract_txt:training in 3986) [ClassicSimilarity], result of:
            0.11185173 = score(doc=3986,freq=1.0), product of:
              0.3500771 = queryWeight, product of:
                3.9506893 = boost
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.017333722 = queryNorm
              0.319506 = fieldWeight in 3986, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.0625 = fieldNorm(doc=3986)
        0.48 = coord(12/25)
    
  2. Liu, Y.; Li, W.; Huang, Z.; Fang, Q.: A fast method based on multiple clustering for name disambiguation in bibliographic citations (2015) 0.34
    0.33788005 = sum of:
      0.33788005 = product of:
        0.76790917 = sum of:
          0.088883296 = weight(abstract_txt:venue in 1672) [ClassicSimilarity], result of:
            0.088883296 = score(doc=1672,freq=1.0), product of:
              0.1570059 = queryWeight, product of:
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.017333722 = queryNorm
              0.56611437 = fieldWeight in 1672, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.0625 = fieldNorm(doc=1672)
          0.023425367 = weight(abstract_txt:proposed in 1672) [ClassicSimilarity], result of:
            0.023425367 = score(doc=1672,freq=1.0), product of:
              0.0813149 = queryWeight, product of:
                1.0177523 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.017333722 = queryNorm
              0.2880821 = fieldWeight in 1672, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.0625 = fieldNorm(doc=1672)
          0.02808618 = weight(abstract_txt:citation in 1672) [ClassicSimilarity], result of:
            0.02808618 = score(doc=1672,freq=1.0), product of:
              0.09177146 = queryWeight, product of:
                1.0812119 = boost
                4.896717 = idf(docFreq=897, maxDocs=44218)
                0.017333722 = queryNorm
              0.30604482 = fieldWeight in 1672, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.896717 = idf(docFreq=897, maxDocs=44218)
                0.0625 = fieldNorm(doc=1672)
          0.013325481 = weight(abstract_txt:data in 1672) [ClassicSimilarity], result of:
            0.013325481 = score(doc=1672,freq=1.0), product of:
              0.06390452 = queryWeight, product of:
                1.1050156 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.017333722 = queryNorm
              0.20852174 = fieldWeight in 1672, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.0625 = fieldNorm(doc=1672)
          0.008485417 = weight(abstract_txt:information in 1672) [ClassicSimilarity], result of:
            0.008485417 = score(doc=1672,freq=1.0), product of:
              0.05608 = queryWeight, product of:
                1.3363832 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.017333722 = queryNorm
              0.15130915 = fieldWeight in 1672, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=1672)
          0.036183294 = weight(abstract_txt:methods in 1672) [ClassicSimilarity], result of:
            0.036183294 = score(doc=1672,freq=2.0), product of:
              0.09871998 = queryWeight, product of:
                1.3734257 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.017333722 = queryNorm
              0.36652455 = fieldWeight in 1672, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.0625 = fieldNorm(doc=1672)
          0.066151805 = weight(abstract_txt:clusters in 1672) [ClassicSimilarity], result of:
            0.066151805 = score(doc=1672,freq=1.0), product of:
              0.16245772 = queryWeight, product of:
                1.4385574 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.017333722 = queryNorm
              0.407194 = fieldWeight in 1672, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=1672)
          0.11791851 = weight(abstract_txt:name in 1672) [ClassicSimilarity], result of:
            0.11791851 = score(doc=1672,freq=3.0), product of:
              0.18956459 = queryWeight, product of:
                1.9031861 = boost
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.017333722 = queryNorm
              0.6220493 = fieldWeight in 1672, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.0625 = fieldNorm(doc=1672)
          0.043623466 = weight(abstract_txt:method in 1672) [ClassicSimilarity], result of:
            0.043623466 = score(doc=1672,freq=1.0), product of:
              0.15507293 = queryWeight, product of:
                1.9876504 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.017333722 = queryNorm
              0.28130937 = fieldWeight in 1672, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=1672)
          0.1526235 = weight(abstract_txt:step in 1672) [ClassicSimilarity], result of:
            0.1526235 = score(doc=1672,freq=2.0), product of:
              0.28365698 = queryWeight, product of:
                2.6882443 = boost
                6.087415 = idf(docFreq=272, maxDocs=44218)
                0.017333722 = queryNorm
              0.53805655 = fieldWeight in 1672, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.087415 = idf(docFreq=272, maxDocs=44218)
                0.0625 = fieldNorm(doc=1672)
          0.1892028 = weight(abstract_txt:disambiguation in 1672) [ClassicSimilarity], result of:
            0.1892028 = score(doc=1672,freq=1.0), product of:
              0.41242114 = queryWeight, product of:
                3.2414734 = boost
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.017333722 = queryNorm
              0.45876116 = fieldWeight in 1672, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.0625 = fieldNorm(doc=1672)
        0.44 = coord(11/25)
    
  3. Levin, M.; Krawczyk, S.; Bethard, S.; Jurafsky, D.: Citation-based bootstrapping for large-scale author disambiguation (2012) 0.29
    0.28557426 = sum of:
      0.28557426 = product of:
        1.0199081 = sum of:
          0.05617236 = weight(abstract_txt:citation in 246) [ClassicSimilarity], result of:
            0.05617236 = score(doc=246,freq=4.0), product of:
              0.09177146 = queryWeight, product of:
                1.0812119 = boost
                4.896717 = idf(docFreq=897, maxDocs=44218)
                0.017333722 = queryNorm
              0.61208963 = fieldWeight in 246, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.896717 = idf(docFreq=897, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
          0.066151805 = weight(abstract_txt:clusters in 246) [ClassicSimilarity], result of:
            0.066151805 = score(doc=246,freq=1.0), product of:
              0.16245772 = queryWeight, product of:
                1.4385574 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.017333722 = queryNorm
              0.407194 = fieldWeight in 246, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
          0.22231296 = weight(abstract_txt:supervised in 246) [ClassicSimilarity], result of:
            0.22231296 = score(doc=246,freq=5.0), product of:
              0.21315673 = queryWeight, product of:
                1.6478077 = boost
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.017333722 = queryNorm
              1.0429554 = fieldWeight in 246, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
          0.14252873 = weight(abstract_txt:self in 246) [ClassicSimilarity], result of:
            0.14252873 = score(doc=246,freq=3.0), product of:
              0.2367466 = queryWeight, product of:
                2.455918 = boost
                5.561322 = idf(docFreq=461, maxDocs=44218)
                0.017333722 = queryNorm
              0.60203075 = fieldWeight in 246, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.561322 = idf(docFreq=461, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
          0.26757315 = weight(abstract_txt:disambiguation in 246) [ClassicSimilarity], result of:
            0.26757315 = score(doc=246,freq=2.0), product of:
              0.41242114 = queryWeight, product of:
                3.2414734 = boost
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.017333722 = queryNorm
              0.64878625 = fieldWeight in 246, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
          0.15331735 = weight(abstract_txt:author in 246) [ClassicSimilarity], result of:
            0.15331735 = score(doc=246,freq=3.0), product of:
              0.28451604 = queryWeight, product of:
                3.2973952 = boost
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.017333722 = queryNorm
              0.5388707 = fieldWeight in 246, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
          0.11185173 = weight(abstract_txt:training in 246) [ClassicSimilarity], result of:
            0.11185173 = score(doc=246,freq=1.0), product of:
              0.3500771 = queryWeight, product of:
                3.9506893 = boost
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.017333722 = queryNorm
              0.319506 = fieldWeight in 246, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.0625 = fieldNorm(doc=246)
        0.28 = coord(7/25)
    
  4. Santana, A.F.; Gonçalves, M.A.; Laender, A.H.F.; Ferreira, A.A.: Incremental author name disambiguation by exploiting domain-specific heuristics (2017) 0.28
    0.27607137 = sum of:
      0.27607137 = product of:
        0.9859692 = sum of:
          0.035107724 = weight(abstract_txt:citation in 3587) [ClassicSimilarity], result of:
            0.035107724 = score(doc=3587,freq=1.0), product of:
              0.09177146 = queryWeight, product of:
                1.0812119 = boost
                4.896717 = idf(docFreq=897, maxDocs=44218)
                0.017333722 = queryNorm
              0.38255602 = fieldWeight in 3587, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.896717 = idf(docFreq=897, maxDocs=44218)
                0.078125 = fieldNorm(doc=3587)
          0.044726033 = weight(abstract_txt:real in 3587) [ClassicSimilarity], result of:
            0.044726033 = score(doc=3587,freq=1.0), product of:
              0.10784817 = queryWeight, product of:
                1.1720966 = boost
                5.308326 = idf(docFreq=594, maxDocs=44218)
                0.017333722 = queryNorm
              0.41471297 = fieldWeight in 3587, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.308326 = idf(docFreq=594, maxDocs=44218)
                0.078125 = fieldNorm(doc=3587)
          0.06300693 = weight(abstract_txt:manual in 3587) [ClassicSimilarity], result of:
            0.06300693 = score(doc=3587,freq=1.0), product of:
              0.13552874 = queryWeight, product of:
                1.3139315 = boost
                5.950684 = idf(docFreq=312, maxDocs=44218)
                0.017333722 = queryNorm
              0.4648972 = fieldWeight in 3587, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.950684 = idf(docFreq=312, maxDocs=44218)
                0.078125 = fieldNorm(doc=3587)
          0.14739814 = weight(abstract_txt:name in 3587) [ClassicSimilarity], result of:
            0.14739814 = score(doc=3587,freq=3.0), product of:
              0.18956459 = queryWeight, product of:
                1.9031861 = boost
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.017333722 = queryNorm
              0.7775616 = fieldWeight in 3587, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.078125 = fieldNorm(doc=3587)
          0.094447576 = weight(abstract_txt:method in 3587) [ClassicSimilarity], result of:
            0.094447576 = score(doc=3587,freq=3.0), product of:
              0.15507293 = queryWeight, product of:
                1.9876504 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.017333722 = queryNorm
              0.60905266 = fieldWeight in 3587, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.078125 = fieldNorm(doc=3587)
          0.40963608 = weight(abstract_txt:disambiguation in 3587) [ClassicSimilarity], result of:
            0.40963608 = score(doc=3587,freq=3.0), product of:
              0.41242114 = queryWeight, product of:
                3.2414734 = boost
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.017333722 = queryNorm
              0.99324703 = fieldWeight in 3587, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.078125 = fieldNorm(doc=3587)
          0.1916467 = weight(abstract_txt:author in 3587) [ClassicSimilarity], result of:
            0.1916467 = score(doc=3587,freq=3.0), product of:
              0.28451604 = queryWeight, product of:
                3.2973952 = boost
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.017333722 = queryNorm
              0.6735884 = fieldWeight in 3587, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.078125 = fieldNorm(doc=3587)
        0.28 = coord(7/25)
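
The arithmetic in the tree above for doc 3587 can be recomputed by hand. The following is a minimal sketch, not the search engine's own code: it assumes the nesting shown in the explanation (queryWeight = boost × idf × queryNorm, fieldWeight = tf × idf × fieldNorm, tf = sqrt(freq), and a final multiplication by the coord factor). The names `term_score`, `QUERY_NORM`, `FIELD_NORM` and `terms` are illustrative; all numeric values are copied from the lines above.

```python
import math

# Values shared by every term in the explanation for doc 3587.
QUERY_NORM = 0.017333722
FIELD_NORM = 0.078125


def term_score(freq, idf, boost, query_norm=QUERY_NORM, field_norm=FIELD_NORM):
    """Score of one query term, following the nesting of the explain tree."""
    query_weight = boost * idf * query_norm           # "queryWeight, product of"
    field_weight = math.sqrt(freq) * idf * field_norm  # "fieldWeight, product of"
    return query_weight * field_weight


# (term, freq, idf, boost) as printed in the explanation for doc 3587.
terms = [
    ("citation",       1.0, 4.896717,  1.0812119),
    ("real",           1.0, 5.308326,  1.1720966),
    ("manual",         1.0, 5.950684,  1.3139315),
    ("name",           3.0, 5.746245,  1.9031861),
    ("method",         3.0, 4.50095,   1.9876504),
    ("disambiguation", 3.0, 7.3401785, 3.2414734),
    ("author",         3.0, 4.9778743, 3.2973952),
]

# Sum of the per-term scores, then scale by coord = matched terms / query terms.
term_sum = sum(term_score(freq, idf, boost) for _, freq, idf, boost in terms)
coord = 7 / 25
total = term_sum * coord

print(f"sum of term scores: {term_sum:.7f}")
print(f"final score:        {total:.7f}")
```

The printed values should agree with the sum (0.9859692) and the final score (0.27607137) listed above, up to the rounding of the single-precision numbers in the explanation.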
    
  5. Donner, P.: Enhanced self-citation detection by fuzzy author name matching and complementary error estimates (2016) 0.27
    0.27448678 = sum of:
      0.27448678 = product of:
        0.7624632 = sum of:
          0.02928171 = weight(abstract_txt:proposed in 2776) [ClassicSimilarity], result of:
            0.02928171 = score(doc=2776,freq=1.0), product of:
              0.0813149 = queryWeight, product of:
                1.0177523 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.017333722 = queryNorm
              0.36010262 = fieldWeight in 2776, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.078125 = fieldNorm(doc=2776)
          0.078503266 = weight(abstract_txt:citation in 2776) [ClassicSimilarity], result of:
            0.078503266 = score(doc=2776,freq=5.0), product of:
              0.09177146 = queryWeight, product of:
                1.0812119 = boost
                4.896717 = idf(docFreq=897, maxDocs=44218)
                0.017333722 = queryNorm
              0.8554213 = fieldWeight in 2776, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.896717 = idf(docFreq=897, maxDocs=44218)
                0.078125 = fieldNorm(doc=2776)
          0.016656851 = weight(abstract_txt:data in 2776) [ClassicSimilarity], result of:
            0.016656851 = score(doc=2776,freq=1.0), product of:
              0.06390452 = queryWeight, product of:
                1.1050156 = boost
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.017333722 = queryNorm
              0.26065218 = fieldWeight in 2776, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3363478 = idf(docFreq=4274, maxDocs=44218)
                0.078125 = fieldNorm(doc=2776)
          0.044726033 = weight(abstract_txt:real in 2776) [ClassicSimilarity], result of:
            0.044726033 = score(doc=2776,freq=1.0), product of:
              0.10784817 = queryWeight, product of:
                1.1720966 = boost
                5.308326 = idf(docFreq=594, maxDocs=44218)
                0.017333722 = queryNorm
              0.41471297 = fieldWeight in 2776, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.308326 = idf(docFreq=594, maxDocs=44218)
                0.078125 = fieldNorm(doc=2776)
          0.04522912 = weight(abstract_txt:methods in 2776) [ClassicSimilarity], result of:
            0.04522912 = score(doc=2776,freq=2.0), product of:
              0.09871998 = queryWeight, product of:
                1.3734257 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.017333722 = queryNorm
              0.4581557 = fieldWeight in 2776, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.078125 = fieldNorm(doc=2776)
          0.08510035 = weight(abstract_txt:name in 2776) [ClassicSimilarity], result of:
            0.08510035 = score(doc=2776,freq=1.0), product of:
              0.18956459 = queryWeight, product of:
                1.9031861 = boost
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.017333722 = queryNorm
              0.44892538 = fieldWeight in 2776, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.078125 = fieldNorm(doc=2776)
          0.054529335 = weight(abstract_txt:method in 2776) [ClassicSimilarity], result of:
            0.054529335 = score(doc=2776,freq=1.0), product of:
              0.15507293 = queryWeight, product of:
                1.9876504 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.017333722 = queryNorm
              0.3516367 = fieldWeight in 2776, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.078125 = fieldNorm(doc=2776)
          0.25195763 = weight(abstract_txt:self in 2776) [ClassicSimilarity], result of:
            0.25195763 = score(doc=2776,freq=6.0), product of:
              0.2367466 = queryWeight, product of:
                2.455918 = boost
                5.561322 = idf(docFreq=461, maxDocs=44218)
                0.017333722 = queryNorm
              1.0642502 = fieldWeight in 2776, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.561322 = idf(docFreq=461, maxDocs=44218)
                0.078125 = fieldNorm(doc=2776)
          0.15647887 = weight(abstract_txt:author in 2776) [ClassicSimilarity], result of:
            0.15647887 = score(doc=2776,freq=2.0), product of:
              0.28451604 = queryWeight, product of:
                3.2973952 = boost
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.017333722 = queryNorm
              0.5499826 = fieldWeight in 2776, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.078125 = fieldNorm(doc=2776)
        0.36 = coord(9/25)
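
The idf and tf values printed in these trees are consistent with the formulas idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq). The sketch below checks this against a few values copied from the explanation for doc 2776. It is an illustration under that assumption; the function names `classic_idf` and `classic_tf` are hypothetical and not part of any library.

```python
import math


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency as assumed here: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq):
    """Term-frequency factor as assumed here: square root of the raw frequency."""
    return math.sqrt(freq)


MAX_DOCS = 44218

# (term, docFreq, idf printed in the explanation)
idf_samples = [
    ("self",     461,  5.561322),
    ("citation", 897,  4.896717),
    ("data",     4274, 3.3363478),
]

for term, doc_freq, printed in idf_samples:
    computed = classic_idf(doc_freq, MAX_DOCS)
    print(f"{term:10s} computed idf={computed:.6f}  printed idf={printed}")

# (freq, tf printed in the explanation)
tf_samples = [(2.0, 1.4142135), (5.0, 2.236068), (6.0, 2.4494898)]

for freq, printed in tf_samples:
    print(f"freq={freq}  computed tf={classic_tf(freq):.7f}  printed tf={printed}")
```

Rarer terms get a larger idf, which is why "self" (docFreq=461) contributes more per occurrence than "data" (docFreq=4274) in the tree above.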