Document (#34114)

Author
Hu, G.
Zhou, S.
Guan, J.
Hu, X.
Title
Towards effective document clustering : a constrained K-means based approach
Source
Information processing and management. 44(2008) no.4, pp.1397-1409
Year
2008
Abstract
Document clustering is an important tool for organizing and browsing document collections. In real applications, some limited knowledge about the cluster membership of a small number of documents is often available, such as pairs of documents known to belong to the same cluster. This kind of prior knowledge can serve as constraints for the clustering process. We integrate the constraints into the trace formulation of the sum of squared Euclidean distances function of K-means. The combined criterion function is then transformed into a trace maximization problem, which is optimized by eigen-decomposition. Our experimental evaluation shows that the proposed semi-supervised clustering method achieves better performance than three existing methods.
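
The trace formulation mentioned in the abstract can be checked numerically. The sketch below covers only the standard, unconstrained identity: the K-means sum of squared distances equals tr(G) - tr(H^T G H), where G is the Gram matrix of the documents and H is the normalized cluster indicator matrix, and the relaxed maximum of tr(H^T G H) over orthonormal H is the sum of the top-k eigenvalues of G. The abstract does not say how the pairwise constraints are folded into G, so that step is not reproduced here. All function names and the random data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def kmeans_sse(X, labels, k):
    """Sum of squared Euclidean distances from each row of X to its cluster centroid."""
    return sum(((X[labels == j] - X[labels == j].mean(axis=0)) ** 2).sum()
               for j in range(k))

def indicator(labels, k):
    """Normalized indicator H (n x k): H[i, j] = 1/sqrt(n_j) if point i is in cluster j."""
    H = np.zeros((labels.size, k))
    for j in range(k):
        members = labels == j
        H[members, j] = 1.0 / np.sqrt(members.sum())
    return H

def trace_form(X, labels, k):
    """The same criterion written as tr(G) - tr(H^T G H), with G = X X^T."""
    G = X @ X.T
    H = indicator(labels, k)
    return np.trace(G) - np.trace(H.T @ G @ H)

def relaxed_max(X, k):
    """Maximum of tr(H^T G H) over all orthonormal H: sum of the k largest eigenvalues of G."""
    eigvals = np.linalg.eigvalsh(X @ X.T)  # returned in ascending order
    return eigvals[-k:].sum()

# Toy data (assumption): 12 "documents" with 4 features, 3 equally sized clusters.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4))
labels = np.arange(12) % 3

print(kmeans_sse(X, labels, 3), trace_form(X, labels, 3))
print(np.trace(X @ X.T) - relaxed_max(X, 3))  # lower bound on any 3-cluster SSE
```

The two values on the first printed line should agree up to floating-point error, and the second line is a lower bound on them, which is what makes the eigen-decomposition a relaxation of the discrete clustering problem.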
Theme
Automatic classification

Similar documents (author)

  1. Bell, D.A.; Guan, J.W.: Computational methods for rough classification and discovery (1998) 2.05
    2.0494254 = sum of:
      2.0494254 = product of:
        4.0988507 = sum of:
          4.0988507 = weight(author_txt:guan in 2909) [ClassicSimilarity], result of:
            4.0988507 = score(doc=2909,freq=1.0), product of:
              0.8407056 = queryWeight, product of:
                1.2460221 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.06919425 = queryNorm
              4.8754888 = fieldWeight in 2909, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.5 = fieldNorm(doc=2909)
        0.5 = coord(1/2)
    
  2. Cowie, J.; Guan, Z.: CRL English routing system for TREC-5 (1997) 2.05
    2.0494254 = sum of:
      2.0494254 = product of:
        4.0988507 = sum of:
          4.0988507 = weight(author_txt:guan in 3106) [ClassicSimilarity], result of:
            4.0988507 = score(doc=3106,freq=1.0), product of:
              0.8407056 = queryWeight, product of:
                1.2460221 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.06919425 = queryNorm
              4.8754888 = fieldWeight in 3106, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.5 = fieldNorm(doc=3106)
        0.5 = coord(1/2)
    
  3. Wang, J.; Guan, J.: The analysis and evaluation of knowledge efficiency in research groups (2005) 2.05
    2.0494254 = sum of:
      2.0494254 = product of:
        4.0988507 = sum of:
          4.0988507 = weight(author_txt:guan in 4238) [ClassicSimilarity], result of:
            4.0988507 = score(doc=4238,freq=1.0), product of:
              0.8407056 = queryWeight, product of:
                1.2460221 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.06919425 = queryNorm
              4.8754888 = fieldWeight in 4238, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.5 = fieldNorm(doc=4238)
        0.5 = coord(1/2)
    
  4. Guan, J.C.; Gao, X.: Exploring the h-index at patent level (2009) 2.05
    2.0494254 = sum of:
      2.0494254 = product of:
        4.0988507 = sum of:
          4.0988507 = weight(author_txt:guan in 2696) [ClassicSimilarity], result of:
            4.0988507 = score(doc=2696,freq=1.0), product of:
              0.8407056 = queryWeight, product of:
                1.2460221 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.06919425 = queryNorm
              4.8754888 = fieldWeight in 2696, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.5 = fieldNorm(doc=2696)
        0.5 = coord(1/2)
    
  5. Ma, N.; Guan, J.; Zhao, Y.: Bringing PageRank to the citation analysis (2008) 1.54
    1.537069 = sum of:
      1.537069 = product of:
        3.074138 = sum of:
          3.074138 = weight(author_txt:guan in 2064) [ClassicSimilarity], result of:
            3.074138 = score(doc=2064,freq=1.0), product of:
              0.8407056 = queryWeight, product of:
                1.2460221 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.06919425 = queryNorm
              3.6566167 = fieldWeight in 2064, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.375 = fieldNorm(doc=2064)
        0.5 = coord(1/2)
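
The score trees above follow Lucene's ClassicSimilarity (TF-IDF) scoring. As a rough check, the sketch below recomputes two of the author-match scores from the leaf values shown in the trees, using the ClassicSimilarity defaults idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq). The function names are illustrative only, and the boost, queryNorm and fieldNorm values are copied from the trees rather than derived.

```python
import math

def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity default: idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    # ClassicSimilarity default: tf = sqrt(term frequency)
    return math.sqrt(freq)

def term_score(boost, doc_freq, max_docs, query_norm, freq, field_norm):
    """score = queryWeight * fieldWeight, as laid out in the explain tree."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm
    field_weight = classic_tf(freq) * idf * field_norm
    return query_weight * field_weight

# Leaf values copied from the trees for the term author_txt:guan.
common = dict(boost=1.2460221, doc_freq=6, max_docs=44218,
              query_norm=0.06919425, freq=1.0)
coord = 1 / 2  # one of two query clauses matched

score_2909 = coord * term_score(field_norm=0.5, **common)    # result 1
score_2064 = coord * term_score(field_norm=0.375, **common)  # result 5

print(score_2909, score_2064)
```

The only input that differs between the two documents is fieldNorm (0.5 versus 0.375), which is why result 5, with three authors in the field, ranks below the two-author records.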
    

Similar documents (content)

  1. AlQenaei, Z.M.; Monarchi, D.E.: The use of learning techniques to analyze the results of a manual classification system (2016) 0.22
    0.22331612 = sum of:
      0.22331612 = product of:
        0.69786286 = sum of:
          0.053754322 = weight(abstract_txt:pairs in 2836) [ClassicSimilarity], result of:
            0.053754322 = score(doc=2836,freq=1.0), product of:
              0.12649848 = queryWeight, product of:
                6.7990475 = idf(docFreq=133, maxDocs=44218)
                0.018605324 = queryNorm
              0.42494047 = fieldWeight in 2836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7990475 = idf(docFreq=133, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
          0.0710839 = weight(abstract_txt:supervised in 2836) [ClassicSimilarity], result of:
            0.0710839 = score(doc=2836,freq=1.0), product of:
              0.15240195 = queryWeight, product of:
                1.0976216 = boost
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.018605324 = queryNorm
              0.4664238 = fieldWeight in 2836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
          0.0833111 = weight(abstract_txt:belonging in 2836) [ClassicSimilarity], result of:
            0.0833111 = score(doc=2836,freq=1.0), product of:
              0.16941231 = queryWeight, product of:
                1.1572571 = boost
                7.8682456 = idf(docFreq=45, maxDocs=44218)
                0.018605324 = queryNorm
              0.49176535 = fieldWeight in 2836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8682456 = idf(docFreq=45, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
          0.08473111 = weight(abstract_txt:decomposition in 2836) [ClassicSimilarity], result of:
            0.08473111 = score(doc=2836,freq=1.0), product of:
              0.17133193 = queryWeight, product of:
                1.163795 = boost
                7.912698 = idf(docFreq=43, maxDocs=44218)
                0.018605324 = queryNorm
              0.4945436 = fieldWeight in 2836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.912698 = idf(docFreq=43, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
          0.033862393 = weight(abstract_txt:documents in 2836) [ClassicSimilarity], result of:
            0.033862393 = score(doc=2836,freq=2.0), product of:
              0.09295829 = queryWeight, product of:
                1.2123176 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.018605324 = queryNorm
              0.36427513 = fieldWeight in 2836, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
          0.16620703 = weight(abstract_txt:euclidean in 2836) [ClassicSimilarity], result of:
            0.16620703 = score(doc=2836,freq=1.0), product of:
              0.26847836 = queryWeight, product of:
                1.4568405 = boost
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.018605324 = queryNorm
              0.6190705 = fieldWeight in 2836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.905128 = idf(docFreq=5, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
          0.040583942 = weight(abstract_txt:document in 2836) [ClassicSimilarity], result of:
            0.040583942 = score(doc=2836,freq=1.0), product of:
              0.15127005 = queryWeight, product of:
                1.8940631 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.018605324 = queryNorm
              0.26828802 = fieldWeight in 2836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
          0.16432902 = weight(abstract_txt:clustering in 2836) [ClassicSimilarity], result of:
            0.16432902 = score(doc=2836,freq=1.0), product of:
              0.42296642 = queryWeight, product of:
                3.657129 = boost
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.018605324 = queryNorm
              0.38851553 = fieldWeight in 2836, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.0625 = fieldNorm(doc=2836)
        0.32 = coord(8/25)
    
  2. Dunlavy, D.M.; O'Leary, D.P.; Conroy, J.M.; Schlesinger, J.D.: QCS: A system for querying, clustering and summarizing documents (2007) 0.16
    0.15565611 = sum of:
      0.15565611 = product of:
        0.5559147 = sum of:
          0.015196502 = weight(abstract_txt:into in 947) [ClassicSimilarity], result of:
            0.015196502 = score(doc=947,freq=1.0), product of:
              0.075042985 = queryWeight, product of:
                1.0892496 = boost
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.018605324 = queryNorm
              0.20250396 = fieldWeight in 947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.07413972 = weight(abstract_txt:decomposition in 947) [ClassicSimilarity], result of:
            0.07413972 = score(doc=947,freq=1.0), product of:
              0.17133193 = queryWeight, product of:
                1.163795 = boost
                7.912698 = idf(docFreq=43, maxDocs=44218)
                0.018605324 = queryNorm
              0.43272567 = fieldWeight in 947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.912698 = idf(docFreq=43, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.029629592 = weight(abstract_txt:documents in 947) [ClassicSimilarity], result of:
            0.029629592 = score(doc=947,freq=2.0), product of:
              0.09295829 = queryWeight, product of:
                1.2123176 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.018605324 = queryNorm
              0.31874073 = fieldWeight in 947, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.037745267 = weight(abstract_txt:means in 947) [ClassicSimilarity], result of:
            0.037745267 = score(doc=947,freq=1.0), product of:
              0.13763303 = queryWeight, product of:
                1.4751415 = boost
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.018605324 = queryNorm
              0.2742457 = fieldWeight in 947, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.05022007 = weight(abstract_txt:document in 947) [ClassicSimilarity], result of:
            0.05022007 = score(doc=947,freq=2.0), product of:
              0.15127005 = queryWeight, product of:
                1.8940631 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.018605324 = queryNorm
              0.3319895 = fieldWeight in 947, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.14563674 = weight(abstract_txt:cluster in 947) [ClassicSimilarity], result of:
            0.14563674 = score(doc=947,freq=3.0), product of:
              0.23475844 = queryWeight, product of:
                1.9265618 = boost
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.018605324 = queryNorm
              0.6203685 = fieldWeight in 947, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
          0.20334677 = weight(abstract_txt:clustering in 947) [ClassicSimilarity], result of:
            0.20334677 = score(doc=947,freq=2.0), product of:
              0.42296642 = queryWeight, product of:
                3.657129 = boost
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.018605324 = queryNorm
              0.4807634 = fieldWeight in 947, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.0546875 = fieldNorm(doc=947)
        0.28 = coord(7/25)
    
  3. Zamir, O.; Etzioni, O.: Grouper : a dynamic clustering interface to Web search results (1999) 0.15
    0.14845154 = sum of:
      0.14845154 = product of:
        0.74225765 = sum of:
          0.02170929 = weight(abstract_txt:into in 6207) [ClassicSimilarity], result of:
            0.02170929 = score(doc=6207,freq=1.0), product of:
              0.075042985 = queryWeight, product of:
                1.0892496 = boost
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.018605324 = queryNorm
              0.28929138 = fieldWeight in 6207, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.078125 = fieldNorm(doc=6207)
          0.029930409 = weight(abstract_txt:documents in 6207) [ClassicSimilarity], result of:
            0.029930409 = score(doc=6207,freq=1.0), product of:
              0.09295829 = queryWeight, product of:
                1.2123176 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.018605324 = queryNorm
              0.32197678 = fieldWeight in 6207, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.078125 = fieldNorm(doc=6207)
          0.07174295 = weight(abstract_txt:document in 6207) [ClassicSimilarity], result of:
            0.07174295 = score(doc=6207,freq=2.0), product of:
              0.15127005 = queryWeight, product of:
                1.8940631 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.018605324 = queryNorm
              0.4742707 = fieldWeight in 6207, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.078125 = fieldNorm(doc=6207)
          0.20805247 = weight(abstract_txt:cluster in 6207) [ClassicSimilarity], result of:
            0.20805247 = score(doc=6207,freq=3.0), product of:
              0.23475844 = queryWeight, product of:
                1.9265618 = boost
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.018605324 = queryNorm
              0.88624066 = fieldWeight in 6207, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.078125 = fieldNorm(doc=6207)
          0.41082254 = weight(abstract_txt:clustering in 6207) [ClassicSimilarity], result of:
            0.41082254 = score(doc=6207,freq=4.0), product of:
              0.42296642 = queryWeight, product of:
                3.657129 = boost
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.018605324 = queryNorm
              0.9712888 = fieldWeight in 6207, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.078125 = fieldNorm(doc=6207)
        0.2 = coord(5/25)
    
  4. Rooney, N.; Patterson, D.; Galushka, M.; Dobrynin, V.; Smirnova, E.: An investigation into the stability of contextual document clustering (2008) 0.14
    0.14406924 = sum of:
      0.14406924 = product of:
        0.6002885 = sum of:
          0.017367432 = weight(abstract_txt:into in 1356) [ClassicSimilarity], result of:
            0.017367432 = score(doc=1356,freq=1.0), product of:
              0.075042985 = queryWeight, product of:
                1.0892496 = boost
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.018605324 = queryNorm
              0.23143311 = fieldWeight in 1356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.0625 = fieldNorm(doc=1356)
          0.033862393 = weight(abstract_txt:documents in 1356) [ClassicSimilarity], result of:
            0.033862393 = score(doc=1356,freq=2.0), product of:
              0.09295829 = queryWeight, product of:
                1.2123176 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.018605324 = queryNorm
              0.36427513 = fieldWeight in 1356, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=1356)
          0.043137446 = weight(abstract_txt:means in 1356) [ClassicSimilarity], result of:
            0.043137446 = score(doc=1356,freq=1.0), product of:
              0.13763303 = queryWeight, product of:
                1.4751415 = boost
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.018605324 = queryNorm
              0.31342366 = fieldWeight in 1356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.0625 = fieldNorm(doc=1356)
          0.081167884 = weight(abstract_txt:document in 1356) [ClassicSimilarity], result of:
            0.081167884 = score(doc=1356,freq=4.0), product of:
              0.15127005 = queryWeight, product of:
                1.8940631 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.018605324 = queryNorm
              0.53657603 = fieldWeight in 1356, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=1356)
          0.09609532 = weight(abstract_txt:cluster in 1356) [ClassicSimilarity], result of:
            0.09609532 = score(doc=1356,freq=1.0), product of:
              0.23475844 = queryWeight, product of:
                1.9265618 = boost
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.018605324 = queryNorm
              0.40933704 = fieldWeight in 1356, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.0625 = fieldNorm(doc=1356)
          0.32865804 = weight(abstract_txt:clustering in 1356) [ClassicSimilarity], result of:
            0.32865804 = score(doc=1356,freq=4.0), product of:
              0.42296642 = queryWeight, product of:
                3.657129 = boost
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.018605324 = queryNorm
              0.77703106 = fieldWeight in 1356, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.0625 = fieldNorm(doc=1356)
        0.24 = coord(6/25)
    
  5. Na, S.-H.; Kang, I.-S.; Lee, J.-H.: Adaptive document clustering based on query-based similarity (2007) 0.14
    0.13925038 = sum of:
      0.13925038 = product of:
        0.69625187 = sum of:
          0.023944328 = weight(abstract_txt:documents in 920) [ClassicSimilarity], result of:
            0.023944328 = score(doc=920,freq=1.0), product of:
              0.09295829 = queryWeight, product of:
                1.2123176 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.018605324 = queryNorm
              0.2575814 = fieldWeight in 920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=920)
          0.043137446 = weight(abstract_txt:means in 920) [ClassicSimilarity], result of:
            0.043137446 = score(doc=920,freq=1.0), product of:
              0.13763303 = queryWeight, product of:
                1.4751415 = boost
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.018605324 = queryNorm
              0.31342366 = fieldWeight in 920, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0147786 = idf(docFreq=797, maxDocs=44218)
                0.0625 = fieldNorm(doc=920)
          0.09074845 = weight(abstract_txt:document in 920) [ClassicSimilarity], result of:
            0.09074845 = score(doc=920,freq=5.0), product of:
              0.15127005 = queryWeight, product of:
                1.8940631 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.018605324 = queryNorm
              0.59991026 = fieldWeight in 920, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=920)
          0.1358993 = weight(abstract_txt:cluster in 920) [ClassicSimilarity], result of:
            0.1358993 = score(doc=920,freq=2.0), product of:
              0.23475844 = queryWeight, product of:
                1.9265618 = boost
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.018605324 = queryNorm
              0.57888997 = fieldWeight in 920, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5493927 = idf(docFreq=171, maxDocs=44218)
                0.0625 = fieldNorm(doc=920)
          0.4025223 = weight(abstract_txt:clustering in 920) [ClassicSimilarity], result of:
            0.4025223 = score(doc=920,freq=6.0), product of:
              0.42296642 = queryWeight, product of:
                3.657129 = boost
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.018605324 = queryNorm
              0.95166487 = fieldWeight in 920, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.0625 = fieldNorm(doc=920)
        0.2 = coord(5/25)
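
For the content-based matches, each tree sums the per-term scores and multiplies the sum by coord, the fraction of query terms found in the document. The sketch below recomputes the total for result 5 (doc 920) from the boost, docFreq and freq values listed in its tree. It assumes the ClassicSimilarity defaults idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq); the variable and function names are illustrative.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.018605324   # shared by all terms of the content query
FIELD_NORM = 0.0625        # fieldNorm(doc=920)

# (term, boost, docFreq, freq in doc 920), copied from the tree for result 5.
matched_terms = [
    ("documents",  1.2123176, 1949, 1.0),
    ("means",      1.4751415,  797, 1.0),
    ("document",   1.8940631, 1642, 5.0),
    ("cluster",    1.9265618,  171, 2.0),
    ("clustering", 3.657129,   239, 6.0),
]

def term_score(boost, doc_freq, freq):
    """queryWeight * fieldWeight for one matched term."""
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight

per_term = {term: term_score(boost, df, freq)
            for term, boost, df, freq in matched_terms}

coord = len(matched_terms) / 25   # 5 of the 25 query terms matched
total = coord * sum(per_term.values())

for term, value in per_term.items():
    print(f"{term:12s} {value:.6f}")
print("total", total)
```

Because coord scales the whole sum, a document matching 8 of 25 terms (result 1) can outrank one with higher individual term scores but only 5 matches (result 3).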