Document (#32832)

Author
Peng, F.
Huang, X.
Title
Machine learning for Asian language text classification
Source
Journal of documentation. 63(2007) no.3, pp.378-397
Year
2007
Abstract
Purpose - The purpose of this research is to compare several machine learning techniques on the task of text classification for Asian languages such as Chinese and Japanese, in which written text carries no word boundary information. The paper advocates a simple language-modeling-based approach for this task. Design/methodology/approach - Naïve Bayes, maximum entropy, support vector machine, and language modeling approaches were implemented and applied to Chinese and Japanese text classification. To investigate the influence of word segmentation, different word segmentation approaches were applied to Chinese text, and a segmentation-based approach was compared with a non-segmentation-based approach. Findings - There were two findings. First, the experiments show that statistical language modeling can significantly outperform standard techniques, given the same set of features. Second, classification with word-level features normally yields improved classification performance, but classification performance is not monotonically related to segmentation accuracy. In particular, classification performance may initially improve with increased segmentation accuracy, but beyond a certain level of segmentation accuracy it stops improving and can even decrease. Practical implications - Applying the findings to real web text classification is ongoing work. Originality/value - The paper is very relevant to Chinese and Japanese information processing, e.g. webpage classification and web search.
Theme
Computational linguistics
Automatic classification

Similar documents (author)

  1. Huang, X.; Peng, F.; An, A.; Schuurmans, D.: Dynamic Web log session identification with statistical language models (2004) 3.61
    3.6108594 = sum of:
      3.6108594 = sum of:
        1.1867 = weight(author_txt:huang in 3096) [ClassicSimilarity], result of:
          1.1867 = score(doc=3096,freq=1.0), product of:
            0.52763635 = queryWeight, product of:
              7.1970778 = idf(docFreq=89, maxDocs=44218)
              0.07331258 = queryNorm
            2.2490869 = fieldWeight in 3096, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.1970778 = idf(docFreq=89, maxDocs=44218)
              0.3125 = fieldNorm(doc=3096)
        2.4241595 = weight(author_txt:peng in 3096) [ClassicSimilarity], result of:
          2.4241595 = score(doc=3096,freq=1.0), product of:
            0.8494703 = queryWeight, product of:
              1.2688397 = boost
              9.131938 = idf(docFreq=12, maxDocs=44218)
              0.07331258 = queryNorm
            2.8537307 = fieldWeight in 3096, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              9.131938 = idf(docFreq=12, maxDocs=44218)
              0.3125 = fieldNorm(doc=3096)
    
  2. Choi, B.; Peng, X.: Dynamic and hierarchical classification of Web pages (2004) 1.94
    1.9393276 = sum of:
      1.9393276 = product of:
        3.8786552 = sum of:
          3.8786552 = weight(author_txt:peng in 2555) [ClassicSimilarity], result of:
            3.8786552 = score(doc=2555,freq=1.0), product of:
              0.8494703 = queryWeight, product of:
                1.2688397 = boost
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.07331258 = queryNorm
              4.565969 = fieldWeight in 2555, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.5 = fieldNorm(doc=2555)
        0.5 = coord(1/2)
    
  3. Peng, T.-Q.; Zhu, J.J.H.: Where you publish matters most : a multilevel analysis of factors affecting citations of internet studies (2012) 1.70
    1.6969116 = sum of:
      1.6969116 = product of:
        3.3938231 = sum of:
          3.3938231 = weight(author_txt:peng in 386) [ClassicSimilarity], result of:
            3.3938231 = score(doc=386,freq=1.0), product of:
              0.8494703 = queryWeight, product of:
                1.2688397 = boost
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.07331258 = queryNorm
              3.9952228 = fieldWeight in 386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.4375 = fieldNorm(doc=386)
        0.5 = coord(1/2)
    
  4. Mukhopadhyay, S.; Peng, S.; Raje, R.; Palakal, M.; Mostafa, J.: Multi-agent information classification using dynamic acquaintance lists (2003) 1.21
    1.2120798 = sum of:
      1.2120798 = product of:
        2.4241595 = sum of:
          2.4241595 = weight(author_txt:peng in 1755) [ClassicSimilarity], result of:
            2.4241595 = score(doc=1755,freq=1.0), product of:
              0.8494703 = queryWeight, product of:
                1.2688397 = boost
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.07331258 = queryNorm
              2.8537307 = fieldWeight in 1755, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.3125 = fieldNorm(doc=1755)
        0.5 = coord(1/2)
    
  5. Mukhopadhyay, S.; Peng, S.; Raje, R.; Mostafa, J.; Palakal, M.: Distributed multi-agent information filtering : a comparative study (2005) 1.21
    1.2120798 = sum of:
      1.2120798 = product of:
        2.4241595 = sum of:
          2.4241595 = weight(author_txt:peng in 3559) [ClassicSimilarity], result of:
            2.4241595 = score(doc=3559,freq=1.0), product of:
              0.8494703 = queryWeight, product of:
                1.2688397 = boost
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.07331258 = queryNorm
              2.8537307 = fieldWeight in 3559, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.131938 = idf(docFreq=12, maxDocs=44218)
                0.3125 = fieldNorm(doc=3559)
        0.5 = coord(1/2)
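
The author scores above can be reproduced by hand. The following is a minimal sketch, not the catalog's actual code: it assumes the ClassicSimilarity formulas that can be read off the explain trees, namely idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq), queryWeight = boost * idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. The function names are illustrative; queryNorm, boost, and fieldNorm are copied from the listing rather than derived.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency in the form assumed from the explain output."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                      # term frequency is damped by a square root
    query_weight = boost * idf * query_norm   # depends only on the query term
    field_weight = tf * idf * field_norm      # depends on the document field
    return query_weight * field_weight


# Entry 2 (doc 2555): only "peng" matches, so one of two query terms is hit.
peng_2555 = term_score(freq=1.0, doc_freq=12, max_docs=44218,
                       query_norm=0.07331258, field_norm=0.5,
                       boost=1.2688397)
coord = 1 / 2                                 # matched terms / query terms
final_2555 = peng_2555 * coord

print(round(peng_2555, 4), round(final_2555, 4))
```

Because idf enters both queryWeight and fieldWeight, a rare surname such as "peng" (docFreq=12) contributes roughly in proportion to idf squared, which is why a single matching author already outweighs the more common "huang" in entry 1.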
    

Similar documents (content)

  1. Yang, C.C.; Li, K.W.: A heuristic method based on a statistical approach for Chinese text segmentation (2005) 0.46
    0.45651335 = sum of:
      0.45651335 = product of:
        1.6304048 = sum of:
          0.013831377 = weight(abstract_txt:based in 4580) [ClassicSimilarity], result of:
            0.013831377 = score(doc=4580,freq=2.0), product of:
              0.04908649 = queryWeight, product of:
                1.0716567 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.014368049 = queryNorm
              0.28177565 = fieldWeight in 4580, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.021146245 = weight(abstract_txt:approach in 4580) [ClassicSimilarity], result of:
            0.021146245 = score(doc=4580,freq=1.0), product of:
              0.09033653 = queryWeight, product of:
                1.6787103 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.014368049 = queryNorm
              0.234083 = fieldWeight in 4580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.039959952 = weight(abstract_txt:performance in 4580) [ClassicSimilarity], result of:
            0.039959952 = score(doc=4580,freq=1.0), product of:
              0.13807802 = queryWeight, product of:
                2.0754216 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.014368049 = queryNorm
              0.28940126 = fieldWeight in 4580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.09140547 = weight(abstract_txt:word in 4580) [ClassicSimilarity], result of:
            0.09140547 = score(doc=4580,freq=2.0), product of:
              0.19025937 = queryWeight, product of:
                2.4362233 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.014368049 = queryNorm
              0.48042563 = fieldWeight in 4580, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.105631754 = weight(abstract_txt:text in 4580) [ClassicSimilarity], result of:
            0.105631754 = score(doc=4580,freq=7.0), product of:
              0.15796782 = queryWeight, product of:
                2.7187796 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014368049 = queryNorm
              0.6686916 = fieldWeight in 4580, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.30239916 = weight(abstract_txt:chinese in 4580) [ClassicSimilarity], result of:
            0.30239916 = score(doc=4580,freq=9.0), product of:
              0.25586691 = queryWeight, product of:
                2.8252125 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.014368049 = queryNorm
              1.1818612 = fieldWeight in 4580, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          1.0560309 = weight(abstract_txt:segmentation in 4580) [ClassicSimilarity], result of:
            1.0560309 = score(doc=4580,freq=9.0), product of:
              0.70972615 = queryWeight, product of:
                6.2245574 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.014368049 = queryNorm
              1.4879413 = fieldWeight in 4580, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
        0.28 = coord(7/25)
    
  2. Wang, F.L.; Yang, C.C.: Mining Web data for Chinese segmentation (2007) 0.38
    0.3803271 = sum of:
      0.3803271 = product of:
        1.5846963 = sum of:
          0.009780261 = weight(abstract_txt:based in 604) [ClassicSimilarity], result of:
            0.009780261 = score(doc=604,freq=1.0), product of:
              0.04908649 = queryWeight, product of:
                1.0716567 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.014368049 = queryNorm
              0.19924548 = fieldWeight in 604, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.06374024 = weight(abstract_txt:language in 604) [ClassicSimilarity], result of:
            0.06374024 = score(doc=604,freq=3.0), product of:
              0.14079264 = queryWeight, product of:
                2.3430903 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.014368049 = queryNorm
              0.45272425 = fieldWeight in 604, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.09140547 = weight(abstract_txt:word in 604) [ClassicSimilarity], result of:
            0.09140547 = score(doc=604,freq=2.0), product of:
              0.19025937 = queryWeight, product of:
                2.4362233 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.014368049 = queryNorm
              0.48042563 = fieldWeight in 604, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.03992505 = weight(abstract_txt:text in 604) [ClassicSimilarity], result of:
            0.03992505 = score(doc=604,freq=1.0), product of:
              0.15796782 = queryWeight, product of:
                2.7187796 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014368049 = queryNorm
              0.25274166 = fieldWeight in 604, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.266691 = weight(abstract_txt:chinese in 604) [ClassicSimilarity], result of:
            0.266691 = score(doc=604,freq=7.0), product of:
              0.25586691 = queryWeight, product of:
                2.8252125 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.014368049 = queryNorm
              1.0423036 = fieldWeight in 604, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          1.1131543 = weight(abstract_txt:segmentation in 604) [ClassicSimilarity], result of:
            1.1131543 = score(doc=604,freq=10.0), product of:
              0.70972615 = queryWeight, product of:
                6.2245574 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.014368049 = queryNorm
              1.5684279 = fieldWeight in 604, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
        0.24 = coord(6/25)
    
  3. Huang, X.; Robertson, S.E.: Application of probabilistic methods to Chinese text retrieval (1997) 0.37
    0.37337983 = sum of:
      0.37337983 = product of:
        1.166812 = sum of:
          0.026843037 = weight(abstract_txt:purpose in 4706) [ClassicSimilarity], result of:
            0.026843037 = score(doc=4706,freq=1.0), product of:
              0.06414924 = queryWeight, product of:
                1.0002874 = boost
                4.463432 = idf(docFreq=1384, maxDocs=44218)
                0.014368049 = queryNorm
              0.41844672 = fieldWeight in 4706, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.463432 = idf(docFreq=1384, maxDocs=44218)
                0.09375 = fieldNorm(doc=4706)
          0.020747066 = weight(abstract_txt:based in 4706) [ClassicSimilarity], result of:
            0.020747066 = score(doc=4706,freq=2.0), product of:
              0.04908649 = queryWeight, product of:
                1.0716567 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.014368049 = queryNorm
              0.42266348 = fieldWeight in 4706, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.09375 = fieldNorm(doc=4706)
          0.033283696 = weight(abstract_txt:applied in 4706) [ClassicSimilarity], result of:
            0.033283696 = score(doc=4706,freq=1.0), product of:
              0.07403858 = queryWeight, product of:
                1.0746279 = boost
                4.79515 = idf(docFreq=993, maxDocs=44218)
                0.014368049 = queryNorm
              0.4495453 = fieldWeight in 4706, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.79515 = idf(docFreq=993, maxDocs=44218)
                0.09375 = fieldNorm(doc=4706)
          0.05520067 = weight(abstract_txt:language in 4706) [ClassicSimilarity], result of:
            0.05520067 = score(doc=4706,freq=1.0), product of:
              0.14079264 = queryWeight, product of:
                2.3430903 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.014368049 = queryNorm
              0.3920707 = fieldWeight in 4706, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.09375 = fieldNorm(doc=4706)
          0.13710822 = weight(abstract_txt:word in 4706) [ClassicSimilarity], result of:
            0.13710822 = score(doc=4706,freq=2.0), product of:
              0.19025937 = queryWeight, product of:
                2.4362233 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.014368049 = queryNorm
              0.72063845 = fieldWeight in 4706, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.09375 = fieldNorm(doc=4706)
          0.10372832 = weight(abstract_txt:text in 4706) [ClassicSimilarity], result of:
            0.10372832 = score(doc=4706,freq=3.0), product of:
              0.15796782 = queryWeight, product of:
                2.7187796 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014368049 = queryNorm
              0.6566421 = fieldWeight in 4706, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.09375 = fieldNorm(doc=4706)
          0.2618854 = weight(abstract_txt:chinese in 4706) [ClassicSimilarity], result of:
            0.2618854 = score(doc=4706,freq=3.0), product of:
              0.25586691 = queryWeight, product of:
                2.8252125 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.014368049 = queryNorm
              1.0235219 = fieldWeight in 4706, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.09375 = fieldNorm(doc=4706)
          0.52801543 = weight(abstract_txt:segmentation in 4706) [ClassicSimilarity], result of:
            0.52801543 = score(doc=4706,freq=1.0), product of:
              0.70972615 = queryWeight, product of:
                6.2245574 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.014368049 = queryNorm
              0.74397063 = fieldWeight in 4706, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.09375 = fieldNorm(doc=4706)
        0.32 = coord(8/25)
    
  4. Lee, K.H.; Ng, M.K.M.; Lu, Q.: Text segmentation for Chinese spell checking (1999) 0.34
    0.34324613 = sum of:
      0.34324613 = product of:
        1.2258791 = sum of:
          0.025899677 = weight(abstract_txt:level in 3913) [ClassicSimilarity], result of:
            0.025899677 = score(doc=3913,freq=2.0), product of:
              0.065145455 = queryWeight, product of:
                1.0080246 = boost
                4.497956 = idf(docFreq=1337, maxDocs=44218)
                0.014368049 = queryNorm
              0.39756688 = fieldWeight in 3913, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.497956 = idf(docFreq=1337, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.013831377 = weight(abstract_txt:based in 3913) [ClassicSimilarity], result of:
            0.013831377 = score(doc=3913,freq=2.0), product of:
              0.04908649 = queryWeight, product of:
                1.0716567 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.014368049 = queryNorm
              0.28177565 = fieldWeight in 3913, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.036800444 = weight(abstract_txt:language in 3913) [ClassicSimilarity], result of:
            0.036800444 = score(doc=3913,freq=1.0), product of:
              0.14079264 = queryWeight, product of:
                2.3430903 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.014368049 = queryNorm
              0.26138046 = fieldWeight in 3913, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.12926687 = weight(abstract_txt:word in 3913) [ClassicSimilarity], result of:
            0.12926687 = score(doc=3913,freq=4.0), product of:
              0.19025937 = queryWeight, product of:
                2.4362233 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.014368049 = queryNorm
              0.67942446 = fieldWeight in 3913, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.06915221 = weight(abstract_txt:text in 3913) [ClassicSimilarity], result of:
            0.06915221 = score(doc=3913,freq=3.0), product of:
              0.15796782 = queryWeight, product of:
                2.7187796 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014368049 = queryNorm
              0.4377614 = fieldWeight in 3913, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.2469079 = weight(abstract_txt:chinese in 3913) [ClassicSimilarity], result of:
            0.2469079 = score(doc=3913,freq=6.0), product of:
              0.25586691 = queryWeight, product of:
                2.8252125 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.014368049 = queryNorm
              0.96498567 = fieldWeight in 3913, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.70402056 = weight(abstract_txt:segmentation in 3913) [ClassicSimilarity], result of:
            0.70402056 = score(doc=3913,freq=4.0), product of:
              0.70972615 = queryWeight, product of:
                6.2245574 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.014368049 = queryNorm
              0.9919609 = fieldWeight in 3913, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
        0.28 = coord(7/25)
    
  5. Doval, Y.; Gómez-Rodríguez, C.: Comparing neural- and N-gram-based language models for word segmentation (2019) 0.31
    0.30925763 = sum of:
      0.30925763 = product of:
        0.85904896 = sum of:
          0.018313836 = weight(abstract_txt:level in 4675) [ClassicSimilarity], result of:
            0.018313836 = score(doc=4675,freq=1.0), product of:
              0.065145455 = queryWeight, product of:
                1.0080246 = boost
                4.497956 = idf(docFreq=1337, maxDocs=44218)
                0.014368049 = queryNorm
              0.28112224 = fieldWeight in 4675, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.497956 = idf(docFreq=1337, maxDocs=44218)
                0.0625 = fieldNorm(doc=4675)
          0.009780261 = weight(abstract_txt:based in 4675) [ClassicSimilarity], result of:
            0.009780261 = score(doc=4675,freq=1.0), product of:
              0.04908649 = queryWeight, product of:
                1.0716567 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.014368049 = queryNorm
              0.19924548 = fieldWeight in 4675, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=4675)
          0.023840923 = weight(abstract_txt:task in 4675) [ClassicSimilarity], result of:
            0.023840923 = score(doc=4675,freq=1.0), product of:
              0.0776688 = queryWeight, product of:
                1.1006579 = boost
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.014368049 = queryNorm
              0.30695623 = fieldWeight in 4675, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9112997 = idf(docFreq=884, maxDocs=44218)
                0.0625 = fieldNorm(doc=4675)
          0.021146245 = weight(abstract_txt:approach in 4675) [ClassicSimilarity], result of:
            0.021146245 = score(doc=4675,freq=1.0), product of:
              0.09033653 = queryWeight, product of:
                1.6787103 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.014368049 = queryNorm
              0.234083 = fieldWeight in 4675, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=4675)
          0.039959952 = weight(abstract_txt:performance in 4675) [ClassicSimilarity], result of:
            0.039959952 = score(doc=4675,freq=1.0), product of:
              0.13807802 = queryWeight, product of:
                2.0754216 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.014368049 = queryNorm
              0.28940126 = fieldWeight in 4675, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.0625 = fieldNorm(doc=4675)
          0.06374024 = weight(abstract_txt:language in 4675) [ClassicSimilarity], result of:
            0.06374024 = score(doc=4675,freq=3.0), product of:
              0.14079264 = queryWeight, product of:
                2.3430903 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.014368049 = queryNorm
              0.45272425 = fieldWeight in 4675, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0625 = fieldNorm(doc=4675)
          0.14452475 = weight(abstract_txt:word in 4675) [ClassicSimilarity], result of:
            0.14452475 = score(doc=4675,freq=5.0), product of:
              0.19025937 = queryWeight, product of:
                2.4362233 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.014368049 = queryNorm
              0.75961965 = fieldWeight in 4675, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=4675)
          0.03992505 = weight(abstract_txt:text in 4675) [ClassicSimilarity], result of:
            0.03992505 = score(doc=4675,freq=1.0), product of:
              0.15796782 = queryWeight, product of:
                2.7187796 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.014368049 = queryNorm
              0.25274166 = fieldWeight in 4675, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=4675)
          0.49781772 = weight(abstract_txt:segmentation in 4675) [ClassicSimilarity], result of:
            0.49781772 = score(doc=4675,freq=2.0), product of:
              0.70972615 = queryWeight, product of:
                6.2245574 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.014368049 = queryNorm
              0.7014223 = fieldWeight in 4675, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=4675)
        0.36 = coord(9/25)
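
For the content-based scores, each entry sums its per-term scores and multiplies the sum by a coordination factor, the fraction of the 25 query terms that the document matches. The sketch below recomputes entry 1 (doc 4580) under that reading. It assumes the ClassicSimilarity formulas visible in the explain trees (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the variable names are illustrative, and the per-term scores, boost, queryNorm, and fieldNorm are copied from the listing.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.014368049

# "segmentation" in doc 4580, recomputed from its raw inputs.
freq, doc_freq, boost, field_norm = 9.0, 42, 6.2245574, 0.0625
idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
tf = math.sqrt(freq)                          # 9 occurrences give tf = 3.0
segmentation_score = (boost * idf * QUERY_NORM) * (tf * idf * field_norm)

# Remaining per-term scores for doc 4580, copied from the explain output.
other_terms = {
    "based": 0.013831377,
    "approach": 0.021146245,
    "performance": 0.039959952,
    "word": 0.09140547,
    "text": 0.105631754,
    "chinese": 0.30239916,
}

matched = len(other_terms) + 1                # 7 of the 25 query terms match
coord = matched / 25
total = (sum(other_terms.values()) + segmentation_score) * coord

print(round(segmentation_score, 4), coord, round(total, 4))
```

The coordination factor explains why entry 5 ranks last even though it matches the most terms (9 of 25): its larger coord of 0.36 does not make up for a much smaller "segmentation" contribution and the absence of "chinese".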