Document (#30275)

Author
Zheng, R.
Li, J.
Chen, H.
Huang, Z.
Title
¬A framework for authorship identification of online messages : writing-style features and classification techniques
Source
Journal of the American Society for Information Science and Technology. 57(2006) no.3, pp. 378-393
Year
2006
Abstract
With the rapid proliferation of Internet technologies and applications, misuse of online messages for inappropriate or illegal purposes has become a major concern for society. The anonymous nature of online-message distribution makes identity tracing a critical problem. We developed a framework for authorship identification of online messages to address the identity-tracing problem. In this framework, four types of writing-style features (lexical, syntactic, structural, and content-specific features) are extracted and inductive learning algorithms are used to build feature-based classification models to identify authorship of online messages. To examine this framework, we conducted experiments on English and Chinese online-newsgroup messages. We compared the discriminating power of the four types of features and of three classification techniques: decision trees, backpropagation neural networks, and support vector machines. The experimental results showed that the proposed approach was able to identify authors of online messages with satisfactory accuracy of 70 to 95%. All four types of message features contributed to discriminating authors of online messages. Support vector machines outperformed the other two classification techniques in our experiments. The high performance we achieved for both the English and Chinese datasets showed the potential of this approach in a multiple-language context.

Similar documents (author)

  1. Huang, Z.; Chung, Z.W.; Chen, H.: ¬A graph model for e-commerce recommender systems (2004) 1.52
    1.5175261 = sum of:
      1.5175261 = product of:
        2.2762892 = sum of:
          0.8705119 = weight(author_txt:chen in 2499) [ClassicSimilarity], result of:
            0.8705119 = score(doc=2499,freq=1.0), product of:
              0.37682405 = queryWeight, product of:
                6.1603417 = idf(docFreq=249, maxDocs=43556)
                0.06116934 = queryNorm
              2.3101282 = fieldWeight in 2499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1603417 = idf(docFreq=249, maxDocs=43556)
                0.375 = fieldNorm(doc=2499)
          1.4057773 = weight(author_txt:huang in 2499) [ClassicSimilarity], result of:
            1.4057773 = score(doc=2499,freq=1.0), product of:
              0.5186804 = queryWeight, product of:
                1.1732231 = boost
                7.2274556 = idf(docFreq=85, maxDocs=43556)
                0.06116934 = queryNorm
              2.710296 = fieldWeight in 2499, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2274556 = idf(docFreq=85, maxDocs=43556)
                0.375 = fieldNorm(doc=2499)
        0.6666667 = coord(2/3)
    
  2. Huang, C.; Fu, T.; Chen, H.: Text-based video content classification for online video-sharing sites (2010) 1.52
    1.5175261 = sum of:
      1.5175261 = product of:
        2.2762892 = sum of:
          0.8705119 = weight(author_txt:chen in 450) [ClassicSimilarity], result of:
            0.8705119 = score(doc=450,freq=1.0), product of:
              0.37682405 = queryWeight, product of:
                6.1603417 = idf(docFreq=249, maxDocs=43556)
                0.06116934 = queryNorm
              2.3101282 = fieldWeight in 450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1603417 = idf(docFreq=249, maxDocs=43556)
                0.375 = fieldNorm(doc=450)
          1.4057773 = weight(author_txt:huang in 450) [ClassicSimilarity], result of:
            1.4057773 = score(doc=450,freq=1.0), product of:
              0.5186804 = queryWeight, product of:
                1.1732231 = boost
                7.2274556 = idf(docFreq=85, maxDocs=43556)
                0.06116934 = queryNorm
              2.710296 = fieldWeight in 450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2274556 = idf(docFreq=85, maxDocs=43556)
                0.375 = fieldNorm(doc=450)
        0.6666667 = coord(2/3)
    
  3. Huang, M.-H.; Huang, W.-T.; Chang, C.-C.; Chen, D. Z.; Lin, C.-P.: The greater scattering phenomenon beyond Bradford's law in patent citation (2014) 1.27
    1.2704806 = sum of:
      1.2704806 = product of:
        1.9057208 = sum of:
          0.5803412 = weight(author_txt:chen in 3350) [ClassicSimilarity], result of:
            0.5803412 = score(doc=3350,freq=1.0), product of:
              0.37682405 = queryWeight, product of:
                6.1603417 = idf(docFreq=249, maxDocs=43556)
                0.06116934 = queryNorm
              1.5400854 = fieldWeight in 3350, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1603417 = idf(docFreq=249, maxDocs=43556)
                0.25 = fieldNorm(doc=3350)
          1.3253796 = weight(author_txt:huang in 3350) [ClassicSimilarity], result of:
            1.3253796 = score(doc=3350,freq=2.0), product of:
              0.5186804 = queryWeight, product of:
                1.1732231 = boost
                7.2274556 = idf(docFreq=85, maxDocs=43556)
                0.06116934 = queryNorm
              2.5552914 = fieldWeight in 3350, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2274556 = idf(docFreq=85, maxDocs=43556)
                0.25 = fieldNorm(doc=3350)
        0.6666667 = coord(2/3)
    
  4. Chen, Z.; Huang, Y.; Tian, J.; Liu, X.; Fu, K.; Huang, T.: Joint model for subsentence-level sentiment analysis with Markov logic (2015) 1.27
    1.2704806 = sum of:
      1.2704806 = product of:
        1.9057208 = sum of:
          0.5803412 = weight(author_txt:chen in 4208) [ClassicSimilarity], result of:
            0.5803412 = score(doc=4208,freq=1.0), product of:
              0.37682405 = queryWeight, product of:
                6.1603417 = idf(docFreq=249, maxDocs=43556)
                0.06116934 = queryNorm
              1.5400854 = fieldWeight in 4208, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1603417 = idf(docFreq=249, maxDocs=43556)
                0.25 = fieldNorm(doc=4208)
          1.3253796 = weight(author_txt:huang in 4208) [ClassicSimilarity], result of:
            1.3253796 = score(doc=4208,freq=2.0), product of:
              0.5186804 = queryWeight, product of:
                1.1732231 = boost
                7.2274556 = idf(docFreq=85, maxDocs=43556)
                0.06116934 = queryNorm
              2.5552914 = fieldWeight in 4208, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2274556 = idf(docFreq=85, maxDocs=43556)
                0.25 = fieldNorm(doc=4208)
        0.6666667 = coord(2/3)
    
  5. Huang, M.-H.; Tang, M.-C.; Chen, D.-Z.: Inequality of publishing performance and international collaboration in physics (2011) 1.26
    1.2646052 = sum of:
      1.2646052 = product of:
        1.8969077 = sum of:
          0.72542655 = weight(author_txt:chen in 1465) [ClassicSimilarity], result of:
            0.72542655 = score(doc=1465,freq=1.0), product of:
              0.37682405 = queryWeight, product of:
                6.1603417 = idf(docFreq=249, maxDocs=43556)
                0.06116934 = queryNorm
              1.9251068 = fieldWeight in 1465, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1603417 = idf(docFreq=249, maxDocs=43556)
                0.3125 = fieldNorm(doc=1465)
          1.1714811 = weight(author_txt:huang in 1465) [ClassicSimilarity], result of:
            1.1714811 = score(doc=1465,freq=1.0), product of:
              0.5186804 = queryWeight, product of:
                1.1732231 = boost
                7.2274556 = idf(docFreq=85, maxDocs=43556)
                0.06116934 = queryNorm
              2.25858 = fieldWeight in 1465, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2274556 = idf(docFreq=85, maxDocs=43556)
                0.3125 = fieldNorm(doc=1465)
        0.6666667 = coord(2/3)
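
The score trees above all follow the same arithmetic, so the first entry (doc 2499) can be recomputed by hand. The sketch below is only an illustration of that arithmetic using the numbers printed in the explanation. The helper name `term_score` and its parameters are assumptions made for this example and are not part of the catalog software or of any Lucene API.

```python
import math

def term_score(idf, query_norm, field_norm, freq, boost=1.0):
    """Recompute one 'weight(...)' node of the explanation tree.

    queryWeight = boost * idf * queryNorm
    fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    score       = queryWeight * fieldWeight
    """
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

# Values copied from the explanation for doc 2499 (entry 1).
QUERY_NORM = 0.06116934

chen = term_score(idf=6.1603417, query_norm=QUERY_NORM,
                  field_norm=0.375, freq=1.0)
huang = term_score(idf=7.2274556, query_norm=QUERY_NORM,
                   field_norm=0.375, freq=1.0, boost=1.1732231)

# coord(2/3): two of the three query terms matched this document.
total = (chen + huang) * (2 / 3)

print(f"chen={chen:.7f} huang={huang:.7f} total={total:.7f}")
```

The same helper applied with `field_norm=0.25` and `freq=2.0` for the `huang` term gives the lower per-term weights shown for entries 3 and 4, where the author field is longer.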
    

Similar documents (content)

  1. Kucukyilmaz, T.; Cambazoglu, B.B.; Aykanat, C.; Can, F.: Chat mining : Predicting user and message attributes in computer-mediated communication (2008) 0.40
    0.4028981 = sum of:
      0.4028981 = product of:
        1.1191614 = sum of:
          0.040222093 = weight(abstract_txt:authors in 4097) [ClassicSimilarity], result of:
            0.040222093 = score(doc=4097,freq=2.0), product of:
              0.077975035 = queryWeight, product of:
                4.668787 = idf(docFreq=1110, maxDocs=43556)
                0.016701348 = queryNorm
              0.51583296 = fieldWeight in 4097, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.668787 = idf(docFreq=1110, maxDocs=43556)
                0.078125 = fieldNorm(doc=4097)
          0.075392276 = weight(abstract_txt:writing in 4097) [ClassicSimilarity], result of:
            0.075392276 = score(doc=4097,freq=1.0), product of:
              0.14935061 = queryWeight, product of:
                1.3839669 = boost
                6.461447 = idf(docFreq=184, maxDocs=43556)
                0.016701348 = queryNorm
              0.50480056 = fieldWeight in 4097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.461447 = idf(docFreq=184, maxDocs=43556)
                0.078125 = fieldNorm(doc=4097)
          0.07735422 = weight(abstract_txt:style in 4097) [ClassicSimilarity], result of:
            0.07735422 = score(doc=4097,freq=1.0), product of:
              0.15193057 = queryWeight, product of:
                1.3958694 = boost
                6.517017 = idf(docFreq=174, maxDocs=43556)
                0.016701348 = queryNorm
              0.5091419 = fieldWeight in 4097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.517017 = idf(docFreq=174, maxDocs=43556)
                0.078125 = fieldNorm(doc=4097)
          0.08584858 = weight(abstract_txt:identity in 4097) [ClassicSimilarity], result of:
            0.08584858 = score(doc=4097,freq=1.0), product of:
              0.16285878 = queryWeight, product of:
                1.4451995 = boost
                6.7473288 = idf(docFreq=138, maxDocs=43556)
                0.016701348 = queryNorm
              0.5271351 = fieldWeight in 4097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7473288 = idf(docFreq=138, maxDocs=43556)
                0.078125 = fieldNorm(doc=4097)
          0.038916975 = weight(abstract_txt:techniques in 4097) [ClassicSimilarity], result of:
            0.038916975 = score(doc=4097,freq=1.0), product of:
              0.1100134 = queryWeight, product of:
                1.4547576 = boost
                4.527969 = idf(docFreq=1278, maxDocs=43556)
                0.016701348 = queryNorm
              0.35374758 = fieldWeight in 4097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.527969 = idf(docFreq=1278, maxDocs=43556)
                0.078125 = fieldNorm(doc=4097)
          0.15534313 = weight(abstract_txt:message in 4097) [ClassicSimilarity], result of:
            0.15534313 = score(doc=4097,freq=2.0), product of:
              0.19194369 = queryWeight, product of:
                1.56895 = boost
                7.3250937 = idf(docFreq=77, maxDocs=43556)
                0.016701348 = queryNorm
              0.80931616 = fieldWeight in 4097, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.3250937 = idf(docFreq=77, maxDocs=43556)
                0.078125 = fieldNorm(doc=4097)
          0.035754114 = weight(abstract_txt:classification in 4097) [ClassicSimilarity], result of:
            0.035754114 = score(doc=4097,freq=1.0), product of:
              0.11443262 = queryWeight, product of:
                1.713216 = boost
                3.9993203 = idf(docFreq=2169, maxDocs=43556)
                0.016701348 = queryNorm
              0.3124469 = fieldWeight in 4097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9993203 = idf(docFreq=2169, maxDocs=43556)
                0.078125 = fieldNorm(doc=4097)
          0.055143733 = weight(abstract_txt:online in 4097) [ClassicSimilarity], result of:
            0.055143733 = score(doc=4097,freq=1.0), product of:
              0.19245975 = queryWeight, product of:
                3.1421156 = boost
                3.667467 = idf(docFreq=3023, maxDocs=43556)
                0.016701348 = queryNorm
              0.28652087 = fieldWeight in 4097, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.667467 = idf(docFreq=3023, maxDocs=43556)
                0.078125 = fieldNorm(doc=4097)
          0.55518633 = weight(abstract_txt:messages in 4097) [ClassicSimilarity], result of:
            0.55518633 = score(doc=4097,freq=3.0), product of:
              0.59511 = queryWeight, product of:
                5.168385 = boost
                6.894311 = idf(docFreq=119, maxDocs=43556)
                0.016701348 = queryNorm
              0.9329138 = fieldWeight in 4097, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.894311 = idf(docFreq=119, maxDocs=43556)
                0.078125 = fieldNorm(doc=4097)
        0.36 = coord(9/25)
    
  2. Huang, C.; Fu, T.; Chen, H.: Text-based video content classification for online video-sharing sites (2010) 0.26
    0.2593414 = sum of:
      0.2593414 = product of:
        0.64835346 = sum of:
          0.0298134 = weight(abstract_txt:experiments in 450) [ClassicSimilarity], result of:
            0.0298134 = score(doc=450,freq=1.0), product of:
              0.102062166 = queryWeight, product of:
                1.1440753 = boost
                5.3414435 = idf(docFreq=566, maxDocs=43556)
                0.016701348 = queryNorm
              0.2921102 = fieldWeight in 450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3414435 = idf(docFreq=566, maxDocs=43556)
                0.0546875 = fieldNorm(doc=450)
          0.038417198 = weight(abstract_txt:showed in 450) [ClassicSimilarity], result of:
            0.038417198 = score(doc=450,freq=1.0), product of:
              0.12085768 = queryWeight, product of:
                1.2449713 = boost
                5.8125057 = idf(docFreq=353, maxDocs=43556)
                0.016701348 = queryNorm
              0.3178714 = fieldWeight in 450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8125057 = idf(docFreq=353, maxDocs=43556)
                0.0546875 = fieldNorm(doc=450)
          0.07657678 = weight(abstract_txt:vector in 450) [ClassicSimilarity], result of:
            0.07657678 = score(doc=450,freq=2.0), product of:
              0.15193057 = queryWeight, product of:
                1.3958694 = boost
                6.517017 = idf(docFreq=174, maxDocs=43556)
                0.016701348 = queryNorm
              0.5040248 = fieldWeight in 450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.517017 = idf(docFreq=174, maxDocs=43556)
                0.0546875 = fieldNorm(doc=450)
          0.04565152 = weight(abstract_txt:types in 450) [ClassicSimilarity], result of:
            0.04565152 = score(doc=450,freq=3.0), product of:
              0.107617766 = queryWeight, product of:
                1.4388311 = boost
                4.4783974 = idf(docFreq=1343, maxDocs=43556)
                0.016701348 = queryNorm
              0.42420062 = fieldWeight in 450, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.4783974 = idf(docFreq=1343, maxDocs=43556)
                0.0546875 = fieldNorm(doc=450)
          0.03852584 = weight(abstract_txt:techniques in 450) [ClassicSimilarity], result of:
            0.03852584 = score(doc=450,freq=2.0), product of:
              0.1100134 = queryWeight, product of:
                1.4547576 = boost
                4.527969 = idf(docFreq=1278, maxDocs=43556)
                0.016701348 = queryNorm
              0.35019222 = fieldWeight in 450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.527969 = idf(docFreq=1278, maxDocs=43556)
                0.0546875 = fieldNorm(doc=450)
          0.043349564 = weight(abstract_txt:classification in 450) [ClassicSimilarity], result of:
            0.043349564 = score(doc=450,freq=3.0), product of:
              0.11443262 = queryWeight, product of:
                1.713216 = boost
                3.9993203 = idf(docFreq=2169, maxDocs=43556)
                0.016701348 = queryNorm
              0.37882173 = fieldWeight in 450, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.9993203 = idf(docFreq=2169, maxDocs=43556)
                0.0546875 = fieldNorm(doc=450)
          0.053365048 = weight(abstract_txt:framework in 450) [ClassicSimilarity], result of:
            0.053365048 = score(doc=450,freq=2.0), product of:
              0.15046254 = queryWeight, product of:
                1.9644971 = boost
                4.5859094 = idf(docFreq=1206, maxDocs=43556)
                0.016701348 = queryNorm
              0.35467333 = fieldWeight in 450, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5859094 = idf(docFreq=1206, maxDocs=43556)
                0.0546875 = fieldNorm(doc=450)
          0.15654099 = weight(abstract_txt:discriminating in 450) [ClassicSimilarity], result of:
            0.15654099 = score(doc=450,freq=1.0), product of:
              0.3083253 = queryWeight, product of:
                1.9885054 = boost
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.016701348 = queryNorm
              0.50771374 = fieldWeight in 450, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.0546875 = fieldNorm(doc=450)
          0.0797995 = weight(abstract_txt:features in 450) [ClassicSimilarity], result of:
            0.0797995 = score(doc=450,freq=3.0), product of:
              0.18515274 = queryWeight, product of:
                2.4364488 = boost
                4.550104 = idf(docFreq=1250, maxDocs=43556)
                0.016701348 = queryNorm
              0.4309928 = fieldWeight in 450, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.550104 = idf(docFreq=1250, maxDocs=43556)
                0.0546875 = fieldNorm(doc=450)
          0.08631359 = weight(abstract_txt:online in 450) [ClassicSimilarity], result of:
            0.08631359 = score(doc=450,freq=5.0), product of:
              0.19245975 = queryWeight, product of:
                3.1421156 = boost
                3.667467 = idf(docFreq=3023, maxDocs=43556)
                0.016701348 = queryNorm
              0.44847608 = fieldWeight in 450, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.667467 = idf(docFreq=3023, maxDocs=43556)
                0.0546875 = fieldNorm(doc=450)
        0.4 = coord(10/25)
    
  3. Zhang, Y.; Zhang, C.; Li, J.: Joint modeling of characters, words, and conversation contexts for microblog keyphrase extraction (2020) 0.18
    0.18083192 = sum of:
      0.18083192 = product of:
        0.75346637 = sum of:
          0.06377905 = weight(abstract_txt:identification in 2102) [ClassicSimilarity], result of:
            0.06377905 = score(doc=2102,freq=2.0), product of:
              0.1230376 = queryWeight, product of:
                1.2561489 = boost
                5.8646917 = idf(docFreq=335, maxDocs=43556)
                0.016701348 = queryNorm
              0.5183704 = fieldWeight in 2102, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.8646917 = idf(docFreq=335, maxDocs=43556)
                0.0625 = fieldNorm(doc=2102)
          0.061883382 = weight(abstract_txt:style in 2102) [ClassicSimilarity], result of:
            0.061883382 = score(doc=2102,freq=1.0), product of:
              0.15193057 = queryWeight, product of:
                1.3958694 = boost
                6.517017 = idf(docFreq=174, maxDocs=43556)
                0.016701348 = queryNorm
              0.40731356 = fieldWeight in 2102, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.517017 = idf(docFreq=174, maxDocs=43556)
                0.0625 = fieldNorm(doc=2102)
          0.087875344 = weight(abstract_txt:message in 2102) [ClassicSimilarity], result of:
            0.087875344 = score(doc=2102,freq=1.0), product of:
              0.19194369 = queryWeight, product of:
                1.56895 = boost
                7.3250937 = idf(docFreq=77, maxDocs=43556)
                0.016701348 = queryNorm
              0.45781836 = fieldWeight in 2102, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3250937 = idf(docFreq=77, maxDocs=43556)
                0.0625 = fieldNorm(doc=2102)
          0.043125473 = weight(abstract_txt:framework in 2102) [ClassicSimilarity], result of:
            0.043125473 = score(doc=2102,freq=1.0), product of:
              0.15046254 = queryWeight, product of:
                1.9644971 = boost
                4.5859094 = idf(docFreq=1206, maxDocs=43556)
                0.016701348 = queryNorm
              0.28661934 = fieldWeight in 2102, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5859094 = idf(docFreq=1206, maxDocs=43556)
                0.0625 = fieldNorm(doc=2102)
          0.052654017 = weight(abstract_txt:features in 2102) [ClassicSimilarity], result of:
            0.052654017 = score(doc=2102,freq=1.0), product of:
              0.18515274 = queryWeight, product of:
                2.4364488 = boost
                4.550104 = idf(docFreq=1250, maxDocs=43556)
                0.016701348 = queryNorm
              0.2843815 = fieldWeight in 2102, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.550104 = idf(docFreq=1250, maxDocs=43556)
                0.0625 = fieldNorm(doc=2102)
          0.44414908 = weight(abstract_txt:messages in 2102) [ClassicSimilarity], result of:
            0.44414908 = score(doc=2102,freq=3.0), product of:
              0.59511 = queryWeight, product of:
                5.168385 = boost
                6.894311 = idf(docFreq=119, maxDocs=43556)
                0.016701348 = queryNorm
              0.74633104 = fieldWeight in 2102, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.894311 = idf(docFreq=119, maxDocs=43556)
                0.0625 = fieldNorm(doc=2102)
        0.24 = coord(6/25)
    
  4. Ku, Y.; Chiu, C.; Zhang, Y.; Chen, H.; Su, H.: Text mining self-disclosing health information for public health service (2014) 0.14
    0.14042015 = sum of:
      0.14042015 = product of:
        0.58508396 = sum of:
          0.027248288 = weight(abstract_txt:identify in 3260) [ClassicSimilarity], result of:
            0.027248288 = score(doc=3260,freq=1.0), product of:
              0.08793369 = queryWeight, product of:
                1.0619397 = boost
                4.95797 = idf(docFreq=831, maxDocs=43556)
                0.016701348 = queryNorm
              0.30987313 = fieldWeight in 3260, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.95797 = idf(docFreq=831, maxDocs=43556)
                0.0625 = fieldNorm(doc=3260)
          0.03113358 = weight(abstract_txt:techniques in 3260) [ClassicSimilarity], result of:
            0.03113358 = score(doc=3260,freq=1.0), product of:
              0.1100134 = queryWeight, product of:
                1.4547576 = boost
                4.527969 = idf(docFreq=1278, maxDocs=43556)
                0.016701348 = queryNorm
              0.28299806 = fieldWeight in 3260, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.527969 = idf(docFreq=1278, maxDocs=43556)
                0.0625 = fieldNorm(doc=3260)
          0.028603293 = weight(abstract_txt:classification in 3260) [ClassicSimilarity], result of:
            0.028603293 = score(doc=3260,freq=1.0), product of:
              0.11443262 = queryWeight, product of:
                1.713216 = boost
                3.9993203 = idf(docFreq=2169, maxDocs=43556)
                0.016701348 = queryNorm
              0.24995752 = fieldWeight in 3260, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9993203 = idf(docFreq=2169, maxDocs=43556)
                0.0625 = fieldNorm(doc=3260)
          0.060988627 = weight(abstract_txt:framework in 3260) [ClassicSimilarity], result of:
            0.060988627 = score(doc=3260,freq=2.0), product of:
              0.15046254 = queryWeight, product of:
                1.9644971 = boost
                4.5859094 = idf(docFreq=1206, maxDocs=43556)
                0.016701348 = queryNorm
              0.40534094 = fieldWeight in 3260, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5859094 = idf(docFreq=1206, maxDocs=43556)
                0.0625 = fieldNorm(doc=3260)
          0.07446402 = weight(abstract_txt:features in 3260) [ClassicSimilarity], result of:
            0.07446402 = score(doc=3260,freq=2.0), product of:
              0.18515274 = queryWeight, product of:
                2.4364488 = boost
                4.550104 = idf(docFreq=1250, maxDocs=43556)
                0.016701348 = queryNorm
              0.40217617 = fieldWeight in 3260, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.550104 = idf(docFreq=1250, maxDocs=43556)
                0.0625 = fieldNorm(doc=3260)
          0.3626462 = weight(abstract_txt:messages in 3260) [ClassicSimilarity], result of:
            0.3626462 = score(doc=3260,freq=2.0), product of:
              0.59511 = queryWeight, product of:
                5.168385 = boost
                6.894311 = idf(docFreq=119, maxDocs=43556)
                0.016701348 = queryNorm
              0.6093767 = fieldWeight in 3260, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.894311 = idf(docFreq=119, maxDocs=43556)
                0.0625 = fieldNorm(doc=3260)
        0.24 = coord(6/25)
    
  5. Peng, F.; Huang, X.: Machine learning for Asian language text classification (2007) 0.14
    0.13801312 = sum of:
      0.13801312 = product of:
        0.49290404 = sum of:
          0.034072455 = weight(abstract_txt:experiments in 2829) [ClassicSimilarity], result of:
            0.034072455 = score(doc=2829,freq=1.0), product of:
              0.102062166 = queryWeight, product of:
                1.1440753 = boost
                5.3414435 = idf(docFreq=566, maxDocs=43556)
                0.016701348 = queryNorm
              0.33384022 = fieldWeight in 2829, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3414435 = idf(docFreq=566, maxDocs=43556)
                0.0625 = fieldNorm(doc=2829)
          0.11341153 = weight(abstract_txt:chinese in 2829) [ClassicSimilarity], result of:
            0.11341153 = score(doc=2829,freq=4.0), product of:
              0.14333336 = queryWeight, product of:
                1.3558006 = boost
                6.3299446 = idf(docFreq=210, maxDocs=43556)
                0.016701348 = queryNorm
              0.7912431 = fieldWeight in 2829, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.3299446 = idf(docFreq=210, maxDocs=43556)
                0.0625 = fieldNorm(doc=2829)
          0.061883382 = weight(abstract_txt:vector in 2829) [ClassicSimilarity], result of:
            0.061883382 = score(doc=2829,freq=1.0), product of:
              0.15193057 = queryWeight, product of:
                1.3958694 = boost
                6.517017 = idf(docFreq=174, maxDocs=43556)
                0.016701348 = queryNorm
              0.40731356 = fieldWeight in 2829, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.517017 = idf(docFreq=174, maxDocs=43556)
                0.0625 = fieldNorm(doc=2829)
          0.04402953 = weight(abstract_txt:techniques in 2829) [ClassicSimilarity], result of:
            0.04402953 = score(doc=2829,freq=2.0), product of:
              0.1100134 = queryWeight, product of:
                1.4547576 = boost
                4.527969 = idf(docFreq=1278, maxDocs=43556)
                0.016701348 = queryNorm
              0.40021968 = fieldWeight in 2829, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.527969 = idf(docFreq=1278, maxDocs=43556)
                0.0625 = fieldNorm(doc=2829)
          0.07923324 = weight(abstract_txt:machines in 2829) [ClassicSimilarity], result of:
            0.07923324 = score(doc=2829,freq=1.0), product of:
              0.17914337 = queryWeight, product of:
                1.5157325 = boost
                7.0766325 = idf(docFreq=99, maxDocs=43556)
                0.016701348 = queryNorm
              0.44228953 = fieldWeight in 2829, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0766325 = idf(docFreq=99, maxDocs=43556)
                0.0625 = fieldNorm(doc=2829)
          0.08580988 = weight(abstract_txt:classification in 2829) [ClassicSimilarity], result of:
            0.08580988 = score(doc=2829,freq=9.0), product of:
              0.11443262 = queryWeight, product of:
                1.713216 = boost
                3.9993203 = idf(docFreq=2169, maxDocs=43556)
                0.016701348 = queryNorm
              0.74987257 = fieldWeight in 2829, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                3.9993203 = idf(docFreq=2169, maxDocs=43556)
                0.0625 = fieldNorm(doc=2829)
          0.07446402 = weight(abstract_txt:features in 2829) [ClassicSimilarity], result of:
            0.07446402 = score(doc=2829,freq=2.0), product of:
              0.18515274 = queryWeight, product of:
                2.4364488 = boost
                4.550104 = idf(docFreq=1250, maxDocs=43556)
                0.016701348 = queryNorm
              0.40217617 = fieldWeight in 2829, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.550104 = idf(docFreq=1250, maxDocs=43556)
                0.0625 = fieldNorm(doc=2829)
        0.28 = coord(7/25)
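
The leaf values in these trees (`idf`, `tf`, `coord`) can be checked against the classic TF-IDF formulas that Lucene's ClassicSimilarity documents: idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(freq), and coord = matched terms / query terms. The following is a minimal sketch under that assumption. The function names are chosen for this example only, and the results agree with the printed values up to single-precision rounding.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency: rarer terms get larger values.
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def classic_tf(freq):
    # Term frequency is dampened by a square root.
    return math.sqrt(freq)

def coord(overlap, max_overlap):
    # Fraction of query terms that matched the document.
    return overlap / max_overlap

MAX_DOCS = 43556

# 'messages' (docFreq=119) and 'classification' (docFreq=2169)
idf_messages = classic_idf(119, MAX_DOCS)
idf_classification = classic_idf(2169, MAX_DOCS)

# fieldWeight of 'classification' in doc 2829: tf * idf * fieldNorm
field_weight = classic_tf(9.0) * idf_classification * 0.0625

print(f"idf(messages)={idf_messages:.6f}")
print(f"idf(classification)={idf_classification:.6f}")
print(f"fieldWeight={field_weight:.6f} coord={coord(7, 25):.2f}")
```

Because `classification` occurs in 2169 of the 43556 records, its idf is much lower than that of `messages`, which is why nine occurrences in doc 2829 still contribute less to the sum than two or three occurrences of `messages` do in the entries ranked higher.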