Document (#34064)

Author
Stamatatos, E.
Title
Author identification : using text sampling to handle the class imbalance problem
Source
Information processing and management. 44(2008) no.2, p.790-799
Year
2008
Abstract
Authorship analysis of electronic texts assists digital forensics and anti-terror investigation. Author identification can be seen as a single-label multi-class text categorization problem. Very often, there are extremely few training texts, at least for some of the candidate authors, or there is significant variation in text length among the available training texts of the candidate authors. Moreover, in this task there is usually no similarity between the distribution of training and test texts over the classes; that is, a basic assumption of inductive learning does not apply. In this paper, we present methods to handle imbalanced multi-class textual datasets. The main idea is to segment the training texts into text samples according to the size of the class, thus producing a fairer classification model. Hence, minority classes can be segmented into many short samples and majority classes into fewer, longer samples. We explore text sampling methods in order to construct a training set according to a desirable distribution over the classes. Essentially, by text sampling we provide new synthetic data that artificially increase the training size of a class. Based on two text corpora of two languages, namely, newswire stories in English and newspaper reportage in Arabic, we present a series of authorship identification experiments on various multi-class imbalanced cases that reveal the properties of the presented methods.
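
The segmentation idea in the abstract can be illustrated with a minimal Python sketch. It shows one simple variant only: every author contributes the same number of samples, so authors with little text get short samples and authors with much text get long ones. The function name `sample_texts`, its parameters, and the word-based chunking are assumptions made for illustration; this is not claimed to be any of the specific methods evaluated in the paper.

```python
def sample_texts(texts_by_author, samples_per_author):
    """Split each author's training material into equally sized word chunks.

    Hypothetical helper: all texts of an author are concatenated and cut
    into `samples_per_author` chunks, so chunk length scales with the
    amount of text available for that author.
    """
    samples = {}
    for author, texts in texts_by_author.items():
        words = " ".join(texts).split()
        # Chunk size in words; at least 1 so very small classes still work.
        size = max(1, len(words) // samples_per_author)
        chunks = [words[i:i + size] for i in range(0, len(words), size)]
        # Any leftover words beyond the requested number of chunks are dropped.
        samples[author] = [" ".join(c) for c in chunks[:samples_per_author]]
    return samples


# Toy corpus: author "a" has ten times as much text as author "b".
corpus = {"a": ["w " * 100], "b": ["w " * 10]}
balanced = sample_texts(corpus, samples_per_author=5)
```

With this toy input both authors end up with five training samples, of 20 words each for "a" and 2 words each for "b", so the class distribution of the training set is flat even though the raw text sizes differ.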

Similar documents (author)

  1. Stamatatos, E.: A survey of modern authorship attribution methods (2009) 6.19
    6.190705 = sum of:
      6.190705 = weight(author_txt:stamatatos in 2741) [ClassicSimilarity], result of:
        6.190705 = fieldWeight in 2741, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.625 = fieldNorm(doc=2741)
    
  2. Stamatatos, E.: Plagiarism detection using stopword n-grams (2011) 6.19
    6.190705 = sum of:
      6.190705 = weight(author_txt:stamatatos in 4955) [ClassicSimilarity], result of:
        6.190705 = fieldWeight in 4955, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.625 = fieldNorm(doc=4955)
    
  3. Stamatatos, E.: Masking topic-related information to enhance authorship attribution (2018) 6.19
    6.190705 = sum of:
      6.190705 = weight(author_txt:stamatatos in 4124) [ClassicSimilarity], result of:
        6.190705 = fieldWeight in 4124, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.625 = fieldNorm(doc=4124)
    
  4. Potha, N.; Stamatatos, E.: Improving author verification based on topic modeling (2019) 4.95
    4.952564 = sum of:
      4.952564 = weight(author_txt:stamatatos in 5385) [ClassicSimilarity], result of:
        4.952564 = fieldWeight in 5385, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.5 = fieldNorm(doc=5385)
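
The author-match scores above can be reproduced from the values printed in the score trees. The following Python sketch assumes that tf is the square root of the term frequency and that idf is 1 + ln(maxDocs / (docFreq + 1)); both formulas are inferred from the numbers shown here and agree with them, and the function names are illustrative only. The fieldNorm values are taken as given from the listing.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Inverse document frequency, as inferred from the listing above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq).
    return math.sqrt(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Hits 1 to 3: fieldNorm 0.625; hit 4 (two authors in the field): fieldNorm 0.5.
top_hit = field_weight(1.0, 5, 44218, 0.625)
fourth_hit = field_weight(1.0, 5, 44218, 0.5)
```

The only difference between the first three hits and the fourth is the fieldNorm, which is lower for the record whose author field holds two names.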
    

Similar documents (content)

  1. Li, T.; Zhu, S.; Ogihara, M.: Text categorization via generalized discriminant analysis (2008) 0.20
    0.20059262 = sum of:
      0.20059262 = product of:
        0.8358026 = sum of:
          0.0172096 = weight(abstract_txt:into in 2119) [ClassicSimilarity], result of:
            0.0172096 = score(doc=2119,freq=1.0), product of:
              0.07436101 = queryWeight, product of:
                1.1825029 = boost
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.016982343 = queryNorm
              0.23143311 = fieldWeight in 2119, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.0625 = fieldNorm(doc=2119)
          0.069879256 = weight(abstract_txt:handle in 2119) [ClassicSimilarity], result of:
            0.069879256 = score(doc=2119,freq=1.0), product of:
              0.16533566 = queryWeight, product of:
                1.439684 = boost
                6.7624135 = idf(docFreq=138, maxDocs=44218)
                0.016982343 = queryNorm
              0.42265084 = fieldWeight in 2119, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7624135 = idf(docFreq=138, maxDocs=44218)
                0.0625 = fieldNorm(doc=2119)
          0.17438748 = weight(abstract_txt:multi in 2119) [ClassicSimilarity], result of:
            0.17438748 = score(doc=2119,freq=6.0), product of:
              0.19162752 = queryWeight, product of:
                1.8982722 = boost
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.016982343 = queryNorm
              0.91003364 = fieldWeight in 2119, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.0625 = fieldNorm(doc=2119)
          0.11694634 = weight(abstract_txt:text in 2119) [ClassicSimilarity], result of:
            0.11694634 = score(doc=2119,freq=5.0), product of:
              0.20693062 = queryWeight, product of:
                3.0132165 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.016982343 = queryNorm
              0.5651476 = fieldWeight in 2119, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=2119)
          0.0905652 = weight(abstract_txt:training in 2119) [ClassicSimilarity], result of:
            0.0905652 = score(doc=2119,freq=1.0), product of:
              0.28345382 = queryWeight, product of:
                3.265019 = boost
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.016982343 = queryNorm
              0.319506 = fieldWeight in 2119, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.0625 = fieldNorm(doc=2119)
          0.36681473 = weight(abstract_txt:class in 2119) [ClassicSimilarity], result of:
            0.36681473 = score(doc=2119,freq=7.0), product of:
              0.3765072 = queryWeight, product of:
                3.7629738 = boost
                5.8917522 = idf(docFreq=331, maxDocs=44218)
                0.016982343 = queryNorm
              0.97425693 = fieldWeight in 2119, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.8917522 = idf(docFreq=331, maxDocs=44218)
                0.0625 = fieldNorm(doc=2119)
        0.24 = coord(6/25)
    
  2. Xu, L.; Qiu, J.: Unsupervised multi-class sentiment classification approach (2019) 0.19
    0.1919904 = sum of:
      0.1919904 = product of:
        0.68568003 = sum of:
          0.0172096 = weight(abstract_txt:into in 5003) [ClassicSimilarity], result of:
            0.0172096 = score(doc=5003,freq=1.0), product of:
              0.07436101 = queryWeight, product of:
                1.1825029 = boost
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.016982343 = queryNorm
              0.23143311 = fieldWeight in 5003, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7029297 = idf(docFreq=2962, maxDocs=44218)
                0.0625 = fieldNorm(doc=5003)
          0.039610334 = weight(abstract_txt:distribution in 5003) [ClassicSimilarity], result of:
            0.039610334 = score(doc=5003,freq=1.0), product of:
              0.11324178 = queryWeight, product of:
                1.1914814 = boost
                5.596568 = idf(docFreq=445, maxDocs=44218)
                0.016982343 = queryNorm
              0.3497855 = fieldWeight in 5003, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.596568 = idf(docFreq=445, maxDocs=44218)
                0.0625 = fieldNorm(doc=5003)
          0.023352008 = weight(abstract_txt:there in 5003) [ClassicSimilarity], result of:
            0.023352008 = score(doc=5003,freq=1.0), product of:
              0.0911411 = queryWeight, product of:
                1.3091418 = boost
                4.099491 = idf(docFreq=1992, maxDocs=44218)
                0.016982343 = queryNorm
              0.2562182 = fieldWeight in 5003, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.099491 = idf(docFreq=1992, maxDocs=44218)
                0.0625 = fieldNorm(doc=5003)
          0.034180116 = weight(abstract_txt:methods in 5003) [ClassicSimilarity], result of:
            0.034180116 = score(doc=5003,freq=2.0), product of:
              0.09325464 = queryWeight, product of:
                1.3242341 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.016982343 = queryNorm
              0.36652455 = fieldWeight in 5003, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.0625 = fieldNorm(doc=5003)
          0.15919326 = weight(abstract_txt:multi in 5003) [ClassicSimilarity], result of:
            0.15919326 = score(doc=5003,freq=5.0), product of:
              0.19162752 = queryWeight, product of:
                1.8982722 = boost
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.016982343 = queryNorm
              0.8307432 = fieldWeight in 5003, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.0625 = fieldNorm(doc=5003)
          0.10211961 = weight(abstract_txt:texts in 5003) [ClassicSimilarity], result of:
            0.10211961 = score(doc=5003,freq=1.0), product of:
              0.28897068 = queryWeight, product of:
                3.0094063 = boost
                5.6542544 = idf(docFreq=420, maxDocs=44218)
                0.016982343 = queryNorm
              0.3533909 = fieldWeight in 5003, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6542544 = idf(docFreq=420, maxDocs=44218)
                0.0625 = fieldNorm(doc=5003)
          0.31001505 = weight(abstract_txt:class in 5003) [ClassicSimilarity], result of:
            0.31001505 = score(doc=5003,freq=5.0), product of:
              0.3765072 = queryWeight, product of:
                3.7629738 = boost
                5.8917522 = idf(docFreq=331, maxDocs=44218)
                0.016982343 = queryNorm
              0.8233974 = fieldWeight in 5003, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.8917522 = idf(docFreq=331, maxDocs=44218)
                0.0625 = fieldNorm(doc=5003)
        0.28 = coord(7/25)
    
  3. Koppel, M.; Schler, J.; Argamon, S.: Computational methods in authorship attribution (2009) 0.19
    0.1905994 = sum of:
      0.1905994 = product of:
        0.59562314 = sum of:
          0.03941757 = weight(abstract_txt:author in 2683) [ClassicSimilarity], result of:
            0.03941757 = score(doc=2683,freq=2.0), product of:
              0.08958822 = queryWeight, product of:
                1.0597645 = boost
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.016982343 = queryNorm
              0.43998608 = fieldWeight in 2683, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.0625 = fieldNorm(doc=2683)
          0.046704017 = weight(abstract_txt:there in 2683) [ClassicSimilarity], result of:
            0.046704017 = score(doc=2683,freq=4.0), product of:
              0.0911411 = queryWeight, product of:
                1.3091418 = boost
                4.099491 = idf(docFreq=1992, maxDocs=44218)
                0.016982343 = queryNorm
              0.5124364 = fieldWeight in 2683, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.099491 = idf(docFreq=1992, maxDocs=44218)
                0.0625 = fieldNorm(doc=2683)
          0.034180116 = weight(abstract_txt:methods in 2683) [ClassicSimilarity], result of:
            0.034180116 = score(doc=2683,freq=2.0), product of:
              0.09325464 = queryWeight, product of:
                1.3242341 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.016982343 = queryNorm
              0.36652455 = fieldWeight in 2683, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.0625 = fieldNorm(doc=2683)
          0.069879256 = weight(abstract_txt:handle in 2683) [ClassicSimilarity], result of:
            0.069879256 = score(doc=2683,freq=1.0), product of:
              0.16533566 = queryWeight, product of:
                1.439684 = boost
                6.7624135 = idf(docFreq=138, maxDocs=44218)
                0.016982343 = queryNorm
              0.42265084 = fieldWeight in 2683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7624135 = idf(docFreq=138, maxDocs=44218)
                0.0625 = fieldNorm(doc=2683)
          0.10859841 = weight(abstract_txt:authorship in 2683) [ClassicSimilarity], result of:
            0.10859841 = score(doc=2683,freq=2.0), product of:
              0.17606513 = queryWeight, product of:
                1.485664 = boost
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.016982343 = queryNorm
              0.6168082 = fieldWeight in 2683, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.0625 = fieldNorm(doc=2683)
          0.1539786 = weight(abstract_txt:candidate in 2683) [ClassicSimilarity], result of:
            0.1539786 = score(doc=2683,freq=3.0), product of:
              0.19411877 = queryWeight, product of:
                1.5599751 = boost
                7.3274393 = idf(docFreq=78, maxDocs=44218)
                0.016982343 = queryNorm
              0.79321855 = fieldWeight in 2683, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.3274393 = idf(docFreq=78, maxDocs=44218)
                0.0625 = fieldNorm(doc=2683)
          0.05229999 = weight(abstract_txt:text in 2683) [ClassicSimilarity], result of:
            0.05229999 = score(doc=2683,freq=1.0), product of:
              0.20693062 = queryWeight, product of:
                3.0132165 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.016982343 = queryNorm
              0.25274166 = fieldWeight in 2683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=2683)
          0.0905652 = weight(abstract_txt:training in 2683) [ClassicSimilarity], result of:
            0.0905652 = score(doc=2683,freq=1.0), product of:
              0.28345382 = queryWeight, product of:
                3.265019 = boost
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.016982343 = queryNorm
              0.319506 = fieldWeight in 2683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.0625 = fieldNorm(doc=2683)
        0.32 = coord(8/25)
    
  4. Potha, N.; Stamatatos, E.: Improving author verification based on topic modeling (2019) 0.19
    0.18727809 = sum of:
      0.18727809 = product of:
        0.6688503 = sum of:
          0.10475085 = weight(abstract_txt:forensics in 5385) [ClassicSimilarity], result of:
            0.10475085 = score(doc=5385,freq=1.0), product of:
              0.1718816 = queryWeight, product of:
                1.0379672 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.016982343 = queryNorm
              0.6094361 = fieldWeight in 5385, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.0625 = fieldNorm(doc=5385)
          0.062324647 = weight(abstract_txt:author in 5385) [ClassicSimilarity], result of:
            0.062324647 = score(doc=5385,freq=5.0), product of:
              0.08958822 = queryWeight, product of:
                1.0597645 = boost
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.016982343 = queryNorm
              0.69567907 = fieldWeight in 5385, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.0625 = fieldNorm(doc=5385)
          0.054043505 = weight(abstract_txt:methods in 5385) [ClassicSimilarity], result of:
            0.054043505 = score(doc=5385,freq=5.0), product of:
              0.09325464 = queryWeight, product of:
                1.3242341 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.016982343 = queryNorm
              0.5795262 = fieldWeight in 5385, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.0625 = fieldNorm(doc=5385)
          0.13300535 = weight(abstract_txt:authorship in 5385) [ClassicSimilarity], result of:
            0.13300535 = score(doc=5385,freq=3.0), product of:
              0.17606513 = queryWeight, product of:
                1.485664 = boost
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.016982343 = queryNorm
              0.75543267 = fieldWeight in 5385, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.0625 = fieldNorm(doc=5385)
          0.10211961 = weight(abstract_txt:texts in 5385) [ClassicSimilarity], result of:
            0.10211961 = score(doc=5385,freq=1.0), product of:
              0.28897068 = queryWeight, product of:
                3.0094063 = boost
                5.6542544 = idf(docFreq=420, maxDocs=44218)
                0.016982343 = queryNorm
              0.3533909 = fieldWeight in 5385, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6542544 = idf(docFreq=420, maxDocs=44218)
                0.0625 = fieldNorm(doc=5385)
          0.07396336 = weight(abstract_txt:text in 5385) [ClassicSimilarity], result of:
            0.07396336 = score(doc=5385,freq=2.0), product of:
              0.20693062 = queryWeight, product of:
                3.0132165 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.016982343 = queryNorm
              0.3574307 = fieldWeight in 5385, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=5385)
          0.13864294 = weight(abstract_txt:class in 5385) [ClassicSimilarity], result of:
            0.13864294 = score(doc=5385,freq=1.0), product of:
              0.3765072 = queryWeight, product of:
                3.7629738 = boost
                5.8917522 = idf(docFreq=331, maxDocs=44218)
                0.016982343 = queryNorm
              0.36823452 = fieldWeight in 5385, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8917522 = idf(docFreq=331, maxDocs=44218)
                0.0625 = fieldNorm(doc=5385)
        0.28 = coord(7/25)
    
  5. Schaalje, G.B.; Blades, N.J.; Funai, T.: An open-set size-adjusted Bayesian classifier for authorship attribution (2013) 0.17
    0.17366062 = sum of:
      0.17366062 = product of:
        0.6202165 = sum of:
          0.039610334 = weight(abstract_txt:distribution in 1041) [ClassicSimilarity], result of:
            0.039610334 = score(doc=1041,freq=1.0), product of:
              0.11324178 = queryWeight, product of:
                1.1914814 = boost
                5.596568 = idf(docFreq=445, maxDocs=44218)
                0.016982343 = queryNorm
              0.3497855 = fieldWeight in 1041, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.596568 = idf(docFreq=445, maxDocs=44218)
                0.0625 = fieldNorm(doc=1041)
          0.080168545 = weight(abstract_txt:size in 1041) [ClassicSimilarity], result of:
            0.080168545 = score(doc=1041,freq=3.0), product of:
              0.12563093 = queryWeight, product of:
                1.2549667 = boost
                5.8947687 = idf(docFreq=330, maxDocs=44218)
                0.016982343 = queryNorm
              0.63812745 = fieldWeight in 1041, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.8947687 = idf(docFreq=330, maxDocs=44218)
                0.0625 = fieldNorm(doc=1041)
          0.04186192 = weight(abstract_txt:methods in 1041) [ClassicSimilarity], result of:
            0.04186192 = score(doc=1041,freq=3.0), product of:
              0.09325464 = queryWeight, product of:
                1.3242341 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.016982343 = queryNorm
              0.44889906 = fieldWeight in 1041, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.0625 = fieldNorm(doc=1041)
          0.13300535 = weight(abstract_txt:authorship in 1041) [ClassicSimilarity], result of:
            0.13300535 = score(doc=1041,freq=3.0), product of:
              0.17606513 = queryWeight, product of:
                1.485664 = boost
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.016982343 = queryNorm
              0.75543267 = fieldWeight in 1041, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.9783883 = idf(docFreq=111, maxDocs=44218)
                0.0625 = fieldNorm(doc=1041)
          0.14441893 = weight(abstract_txt:texts in 1041) [ClassicSimilarity], result of:
            0.14441893 = score(doc=1041,freq=2.0), product of:
              0.28897068 = queryWeight, product of:
                3.0094063 = boost
                5.6542544 = idf(docFreq=420, maxDocs=44218)
                0.016982343 = queryNorm
              0.4997702 = fieldWeight in 1041, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6542544 = idf(docFreq=420, maxDocs=44218)
                0.0625 = fieldNorm(doc=1041)
          0.09058624 = weight(abstract_txt:text in 1041) [ClassicSimilarity], result of:
            0.09058624 = score(doc=1041,freq=3.0), product of:
              0.20693062 = queryWeight, product of:
                3.0132165 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.016982343 = queryNorm
              0.4377614 = fieldWeight in 1041, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=1041)
          0.0905652 = weight(abstract_txt:training in 1041) [ClassicSimilarity], result of:
            0.0905652 = score(doc=1041,freq=1.0), product of:
              0.28345382 = queryWeight, product of:
                3.265019 = boost
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.016982343 = queryNorm
              0.319506 = fieldWeight in 1041, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.112096 = idf(docFreq=723, maxDocs=44218)
                0.0625 = fieldNorm(doc=1041)
        0.28 = coord(7/25)
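
The content-similarity scores combine per-term scores with a coordination factor. As a worked check, the Python sketch below recomputes the score of the first content hit (doc 2119) from the frequencies, document frequencies, and boosts printed in its tree. It assumes each term score is queryWeight (boost * idf * queryNorm) times fieldWeight (sqrt(freq) * idf * fieldNorm), and that the sum is multiplied by coord (matching terms / query terms). The formulas are inferred from the listing and the names are illustrative.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.016982343
FIELD_NORM = 0.0625  # fieldNorm printed for doc 2119


def idf(doc_freq):
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))


def term_score(freq, doc_freq, boost):
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * FIELD_NORM
    return query_weight * field_weight


# term: (freq in doc 2119, docFreq, boost), copied from the listing.
terms = {
    "into": (1.0, 2962, 1.1825029),
    "handle": (1.0, 138, 1.439684),
    "multi": (6.0, 314, 1.8982722),
    "text": (5.0, 2106, 3.0132165),
    "training": (1.0, 723, 3.265019),
    "class": (7.0, 331, 3.7629738),
}

total = sum(term_score(f, df, b) for f, df, b in terms.values())
coord = 6 / 25  # 6 of the 25 query terms match this document
score = total * coord
```

The result agrees with the listed 0.20059262 up to rounding. The coord factor explains why a hit matching more query terms can rank above one with a larger raw sum, as the third content hit does over the fourth.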