Document (#36102)

Author
Fagni, T.
Sebastiani, F.
Title
Selecting negative examples for hierarchical text classification: An experimental comparison
Source
Journal of the American Society for Information Science and Technology. 61(2010) no.11, pp.2256-2265
Year
2010
Abstract
Hierarchical text classification (HTC) approaches have recently attracted a lot of interest on the part of researchers in human language technology and machine learning, since they have been shown to bring about equal, if not better, classification accuracy with respect to their "flat" counterparts while allowing exponential time savings at both learning and classification time. A typical component of HTC methods is a "local" policy for selecting negative examples: Given a category c, its negative training examples are by default identified with the training examples that are negative for c and positive for the categories which are siblings of c in the hierarchy. However, this policy has always been taken for granted and never been subjected to careful scrutiny since first proposed 15 years ago. This article proposes a thorough experimental comparison between this policy and three other policies for the selection of negative examples in HTC contexts, one of which (BEST LOCAL (k)) is being proposed for the first time in this article. We compare these policies on the hierarchical versions of three supervised learning algorithms (boosting, support vector machines, and naïve Bayes) by performing experiments on two standard TC datasets, REUTERS-21578 and RCV1-V2.
Theme
Automatic classification

Similar documents (author)

  1. Sebastiani, F.: On the role of logic in information retrieval (1998) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:sebastiani in 1140) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 1140, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=1140)
    
  2. Sebastiani, F.: Machine learning in automated text categorization (2002) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:sebastiani in 3389) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 3389, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=3389)
    
  3. Sebastiani, F.: A tutorial on automated text categorisation (1999) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:sebastiani in 3390) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 3390, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=3390)
    
  4. Sebastiani, F.: Classification of text, automatic (2006) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:sebastiani in 5003) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 5003, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=5003)
    
  5. Debole, F.; Sebastiani, F.: An analysis of the relative hardness of Reuters-21578 subsets (2005) 4.75
    4.749831 = sum of:
      4.749831 = weight(author_txt:sebastiani in 3456) [ClassicSimilarity], result of:
        4.749831 = fieldWeight in 3456, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.5 = fieldNorm(doc=3456)
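
The author-match scores above can be reproduced with a short sketch. It assumes the tf and idf formulas that fit the listed values, namely tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are illustrative and are not part of any library API.

```python
import math


def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency in the form that matches the listed values.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explanation trees.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Entries 1 to 4: single author, fieldNorm 0.625.
single_author = field_weight(1.0, 8, 44218, 0.625)
# Entry 5: two authors, so a longer field and a smaller fieldNorm of 0.5.
two_authors = field_weight(1.0, 8, 44218, 0.5)
```

The only difference between the 5.94 and 4.75 entries is the field norm, which is lower when the author field holds more than one name.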
    

Similar documents (content)

  1. Yang, M.; Kiang, M.; Chen, H.; Li, Y.: Artificial immune system for illicit content identification in social media (2012) 0.17
    0.16921048 = sum of:
      0.16921048 = product of:
        0.60432315 = sum of:
          0.030414695 = weight(abstract_txt:proposed in 4980) [ClassicSimilarity], result of:
            0.030414695 = score(doc=4980,freq=1.0), product of:
              0.105576485 = queryWeight, product of:
                1.0973184 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.020873643 = queryNorm
              0.2880821 = fieldWeight in 4980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.0625 = fieldNorm(doc=4980)
          0.008727425 = weight(abstract_txt:this in 4980) [ClassicSimilarity], result of:
            0.008727425 = score(doc=4980,freq=1.0), product of:
              0.057868954 = queryWeight, product of:
                1.1489123 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.020873643 = queryNorm
              0.1508136 = fieldWeight in 4980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=4980)
          0.03325724 = weight(abstract_txt:time in 4980) [ClassicSimilarity], result of:
            0.03325724 = score(doc=4980,freq=1.0), product of:
              0.12827227 = queryWeight, product of:
                1.4813616 = boost
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.020873643 = queryNorm
              0.2592707 = fieldWeight in 4980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.0625 = fieldNorm(doc=4980)
          0.049955834 = weight(abstract_txt:learning in 4980) [ClassicSimilarity], result of:
            0.049955834 = score(doc=4980,freq=1.0), product of:
              0.16824135 = queryWeight, product of:
                1.6965282 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.020873643 = queryNorm
              0.29692957 = fieldWeight in 4980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=4980)
          0.03951853 = weight(abstract_txt:classification in 4980) [ClassicSimilarity], result of:
            0.03951853 = score(doc=4980,freq=1.0), product of:
              0.15838793 = queryWeight, product of:
                1.9007505 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.020873643 = queryNorm
              0.2495047 = fieldWeight in 4980, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=4980)
          0.16685747 = weight(abstract_txt:examples in 4980) [ClassicSimilarity], result of:
            0.16685747 = score(doc=4980,freq=3.0), product of:
              0.30904016 = queryWeight, product of:
                2.9684281 = boost
                4.9875827 = idf(docFreq=819, maxDocs=44218)
                0.020873643 = queryNorm
              0.53992164 = fieldWeight in 4980, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.9875827 = idf(docFreq=819, maxDocs=44218)
                0.0625 = fieldNorm(doc=4980)
          0.275592 = weight(abstract_txt:negative in 4980) [ClassicSimilarity], result of:
            0.275592 = score(doc=4980,freq=2.0), product of:
              0.4943023 = queryWeight, product of:
                3.7541826 = boost
                6.3078156 = idf(docFreq=218, maxDocs=44218)
                0.020873643 = queryNorm
              0.5575374 = fieldWeight in 4980, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.3078156 = idf(docFreq=218, maxDocs=44218)
                0.0625 = fieldNorm(doc=4980)
        0.28 = coord(7/25)
    
  2. Gauch, S.; Chandramouli, A.; Ranganathan, S.: Training a hierarchical classifier using inter document relationships (2009) 0.17
    0.1650501 = sum of:
      0.1650501 = product of:
        0.58946466 = sum of:
          0.03343705 = weight(abstract_txt:three in 2697) [ClassicSimilarity], result of:
            0.03343705 = score(doc=2697,freq=1.0), product of:
              0.09691481 = queryWeight, product of:
                1.0513424 = boost
                4.41619 = idf(docFreq=1451, maxDocs=44218)
                0.020873643 = queryNorm
              0.34501487 = fieldWeight in 2697, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.41619 = idf(docFreq=1451, maxDocs=44218)
                0.078125 = fieldNorm(doc=2697)
          0.010909281 = weight(abstract_txt:this in 2697) [ClassicSimilarity], result of:
            0.010909281 = score(doc=2697,freq=1.0), product of:
              0.057868954 = queryWeight, product of:
                1.1489123 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.020873643 = queryNorm
              0.18851699 = fieldWeight in 2697, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.078125 = fieldNorm(doc=2697)
          0.045396958 = weight(abstract_txt:since in 2697) [ClassicSimilarity], result of:
            0.045396958 = score(doc=2697,freq=1.0), product of:
              0.11882907 = queryWeight, product of:
                1.1641539 = boost
                4.890058 = idf(docFreq=903, maxDocs=44218)
                0.020873643 = queryNorm
              0.3820358 = fieldWeight in 2697, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.890058 = idf(docFreq=903, maxDocs=44218)
                0.078125 = fieldNorm(doc=2697)
          0.10373174 = weight(abstract_txt:training in 2697) [ClassicSimilarity], result of:
            0.10373174 = score(doc=2697,freq=4.0), product of:
              0.12986515 = queryWeight, product of:
                1.2170135 = boost
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.020873643 = queryNorm
              0.79876494 = fieldWeight in 2697, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.078125 = fieldNorm(doc=2697)
          0.10736154 = weight(abstract_txt:selecting in 2697) [ClassicSimilarity], result of:
            0.10736154 = score(doc=2697,freq=1.0), product of:
              0.21092951 = queryWeight, product of:
                1.5510212 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.020873643 = queryNorm
              0.5089925 = fieldWeight in 2697, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.078125 = fieldNorm(doc=2697)
          0.09879634 = weight(abstract_txt:classification in 2697) [ClassicSimilarity], result of:
            0.09879634 = score(doc=2697,freq=4.0), product of:
              0.15838793 = queryWeight, product of:
                1.9007505 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.020873643 = queryNorm
              0.6237618 = fieldWeight in 2697, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.078125 = fieldNorm(doc=2697)
          0.18983175 = weight(abstract_txt:hierarchical in 2697) [ClassicSimilarity], result of:
            0.18983175 = score(doc=2697,freq=3.0), product of:
              0.24479777 = queryWeight, product of:
                2.0464373 = boost
                5.7307405 = idf(docFreq=389, maxDocs=44218)
                0.020873643 = queryNorm
              0.7754636 = fieldWeight in 2697, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7307405 = idf(docFreq=389, maxDocs=44218)
                0.078125 = fieldNorm(doc=2697)
        0.28 = coord(7/25)
    
  3. Sun, A.; Lim, E.-P.; Ng, W.-K.: Performance measurement framework for hierarchical text classification (2003) 0.15
    0.15297447 = sum of:
      0.15297447 = product of:
        0.63739365 = sum of:
          0.1357688 = weight(abstract_txt:bayes in 1808) [ClassicSimilarity], result of:
            0.1357688 = score(doc=1808,freq=2.0), product of:
              0.1803121 = queryWeight, product of:
                1.01402 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.020873643 = queryNorm
              0.75296557 = fieldWeight in 1808, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.0625 = fieldNorm(doc=1808)
          0.030414695 = weight(abstract_txt:proposed in 1808) [ClassicSimilarity], result of:
            0.030414695 = score(doc=1808,freq=1.0), product of:
              0.105576485 = queryWeight, product of:
                1.0973184 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.020873643 = queryNorm
              0.2880821 = fieldWeight in 1808, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.0625 = fieldNorm(doc=1808)
          0.008727425 = weight(abstract_txt:this in 1808) [ClassicSimilarity], result of:
            0.008727425 = score(doc=1808,freq=1.0), product of:
              0.057868954 = queryWeight, product of:
                1.1489123 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.020873643 = queryNorm
              0.1508136 = fieldWeight in 1808, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=1808)
          0.15091239 = weight(abstract_txt:21578 in 1808) [ClassicSimilarity], result of:
            0.15091239 = score(doc=1808,freq=1.0), product of:
              0.24377255 = queryWeight, product of:
                1.1790344 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.020873643 = queryNorm
              0.6190705 = fieldWeight in 1808, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.0625 = fieldNorm(doc=1808)
          0.09680024 = weight(abstract_txt:classification in 1808) [ClassicSimilarity], result of:
            0.09680024 = score(doc=1808,freq=6.0), product of:
              0.15838793 = queryWeight, product of:
                1.9007505 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.020873643 = queryNorm
              0.6111592 = fieldWeight in 1808, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=1808)
          0.21477012 = weight(abstract_txt:hierarchical in 1808) [ClassicSimilarity], result of:
            0.21477012 = score(doc=1808,freq=6.0), product of:
              0.24479777 = queryWeight, product of:
                2.0464373 = boost
                5.7307405 = idf(docFreq=389, maxDocs=44218)
                0.020873643 = queryNorm
              0.8773369 = fieldWeight in 1808, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.7307405 = idf(docFreq=389, maxDocs=44218)
                0.0625 = fieldNorm(doc=1808)
        0.24 = coord(6/25)
    
  4. Ru, C.; Tang, J.; Li, S.; Xie, S.; Wang, T.: Using semantic similarity to reduce wrong labels in distant supervision for relation extraction (2018) 0.14
    0.14172728 = sum of:
      0.14172728 = product of:
        0.50616884 = sum of:
          0.030414695 = weight(abstract_txt:proposed in 5055) [ClassicSimilarity], result of:
            0.030414695 = score(doc=5055,freq=1.0), product of:
              0.105576485 = queryWeight, product of:
                1.0973184 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.020873643 = queryNorm
              0.2880821 = fieldWeight in 5055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.0625 = fieldNorm(doc=5055)
          0.008727425 = weight(abstract_txt:this in 5055) [ClassicSimilarity], result of:
            0.008727425 = score(doc=5055,freq=1.0), product of:
              0.057868954 = queryWeight, product of:
                1.1489123 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.020873643 = queryNorm
              0.1508136 = fieldWeight in 5055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=5055)
          0.041492693 = weight(abstract_txt:training in 5055) [ClassicSimilarity], result of:
            0.041492693 = score(doc=5055,freq=1.0), product of:
              0.12986515 = queryWeight, product of:
                1.2170135 = boost
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.020873643 = queryNorm
              0.319506 = fieldWeight in 5055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.0625 = fieldNorm(doc=5055)
          0.048849292 = weight(abstract_txt:experimental in 5055) [ClassicSimilarity], result of:
            0.048849292 = score(doc=5055,freq=1.0), product of:
              0.14479394 = queryWeight, product of:
                1.2850626 = boost
                5.397938 = idf(docFreq=543, maxDocs=44218)
                0.020873643 = queryNorm
              0.3373711 = fieldWeight in 5055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.397938 = idf(docFreq=543, maxDocs=44218)
                0.0625 = fieldNorm(doc=5055)
          0.022055658 = weight(abstract_txt:been in 5055) [ClassicSimilarity], result of:
            0.022055658 = score(doc=5055,freq=1.0), product of:
              0.09754881 = queryWeight, product of:
                1.291831 = boost
                3.617579 = idf(docFreq=3226, maxDocs=44218)
                0.020873643 = queryNorm
              0.22609869 = fieldWeight in 5055, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.617579 = idf(docFreq=3226, maxDocs=44218)
                0.0625 = fieldNorm(doc=5055)
          0.07903706 = weight(abstract_txt:classification in 5055) [ClassicSimilarity], result of:
            0.07903706 = score(doc=5055,freq=4.0), product of:
              0.15838793 = queryWeight, product of:
                1.9007505 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.020873643 = queryNorm
              0.4990094 = fieldWeight in 5055, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=5055)
          0.275592 = weight(abstract_txt:negative in 5055) [ClassicSimilarity], result of:
            0.275592 = score(doc=5055,freq=2.0), product of:
              0.4943023 = queryWeight, product of:
                3.7541826 = boost
                6.3078156 = idf(docFreq=218, maxDocs=44218)
                0.020873643 = queryNorm
              0.5575374 = fieldWeight in 5055, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.3078156 = idf(docFreq=218, maxDocs=44218)
                0.0625 = fieldNorm(doc=5055)
        0.28 = coord(7/25)
    
  5. Borodin, Y.; Polishchuk, V.; Mahmud, J.; Ramakrishnan, I.V.; Stent, A.: Live and learn from mistakes : a lightweight system for document classification (2013) 0.13
    0.12730153 = sum of:
      0.12730153 = product of:
        0.53042305 = sum of:
          0.09600304 = weight(abstract_txt:bayes in 2722) [ClassicSimilarity], result of:
            0.09600304 = score(doc=2722,freq=1.0), product of:
              0.1803121 = queryWeight, product of:
                1.01402 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.020873643 = queryNorm
              0.5324271 = fieldWeight in 2722, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.0625 = fieldNorm(doc=2722)
          0.02674964 = weight(abstract_txt:three in 2722) [ClassicSimilarity], result of:
            0.02674964 = score(doc=2722,freq=1.0), product of:
              0.09691481 = queryWeight, product of:
                1.0513424 = boost
                4.41619 = idf(docFreq=1451, maxDocs=44218)
                0.020873643 = queryNorm
              0.27601188 = fieldWeight in 2722, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.41619 = idf(docFreq=1451, maxDocs=44218)
                0.0625 = fieldNorm(doc=2722)
          0.022055658 = weight(abstract_txt:been in 2722) [ClassicSimilarity], result of:
            0.022055658 = score(doc=2722,freq=1.0), product of:
              0.09754881 = queryWeight, product of:
                1.291831 = boost
                3.617579 = idf(docFreq=3226, maxDocs=44218)
                0.020873643 = queryNorm
              0.22609869 = fieldWeight in 2722, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.617579 = idf(docFreq=3226, maxDocs=44218)
                0.0625 = fieldNorm(doc=2722)
          0.11170464 = weight(abstract_txt:learning in 2722) [ClassicSimilarity], result of:
            0.11170464 = score(doc=2722,freq=5.0), product of:
              0.16824135 = queryWeight, product of:
                1.6965282 = boost
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.020873643 = queryNorm
              0.66395473 = fieldWeight in 2722, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.750873 = idf(docFreq=1038, maxDocs=44218)
                0.0625 = fieldNorm(doc=2722)
          0.07903706 = weight(abstract_txt:classification in 2722) [ClassicSimilarity], result of:
            0.07903706 = score(doc=2722,freq=4.0), product of:
              0.15838793 = queryWeight, product of:
                1.9007505 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.020873643 = queryNorm
              0.4990094 = fieldWeight in 2722, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=2722)
          0.19487299 = weight(abstract_txt:negative in 2722) [ClassicSimilarity], result of:
            0.19487299 = score(doc=2722,freq=1.0), product of:
              0.4943023 = queryWeight, product of:
                3.7541826 = boost
                6.3078156 = idf(docFreq=218, maxDocs=44218)
                0.020873643 = queryNorm
              0.39423847 = fieldWeight in 2722, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3078156 = idf(docFreq=218, maxDocs=44218)
                0.0625 = fieldNorm(doc=2722)
        0.24 = coord(6/25)
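
The content-based scores combine per-term weights with a coordination factor. The sketch below recomputes the last entry (doc 2722) from the values listed in its explanation tree, under the same assumed tf and idf formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function and variable names are illustrative only.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.020873643


def term_score(freq: float, doc_freq: int, boost: float, field_norm: float) -> float:
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = boost * idf * QUERY_NORM           # queryWeight
    field_weight = math.sqrt(freq) * idf * field_norm  # fieldWeight
    return query_weight * field_weight


# (term, freq, docFreq, boost) as listed for doc 2722.
terms = [
    ("bayes", 1.0, 23, 1.01402),
    ("three", 1.0, 1451, 1.0513424),
    ("been", 1.0, 3226, 1.291831),
    ("learning", 5.0, 1038, 1.6965282),
    ("classification", 4.0, 2218, 1.9007505),
    ("negative", 1.0, 218, 3.7541826),
]

term_sum = sum(term_score(f, df, b, 0.0625) for _, f, df, b in terms)
coord = len(terms) / 25  # 6 of the 25 query terms matched
total = term_sum * coord
```

The coordination factor scales the summed term weights by the fraction of query terms found in the document, so entries that match 7 of 25 terms are scaled by 0.28 and those that match 6 by 0.24.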