Document (#30278)

Author
Zheng, R.
Li, J.
Chen, H.
Huang, Z.
Title
A framework for authorship identification of online messages : writing-style features and classification techniques
Source
Journal of the American Society for Information Science and Technology. 57(2006) no.3, pp. 378-393
Year
2006
Abstract
With the rapid proliferation of Internet technologies and applications, misuse of online messages for inappropriate or illegal purposes has become a major concern for society. The anonymous nature of online-message distribution makes identity tracing a critical problem. We developed a framework for authorship identification of online messages to address the identity-tracing problem. In this framework, four types of writing-style features (lexical, syntactic, structural, and content-specific features) are extracted and inductive learning algorithms are used to build feature-based classification models to identify authorship of online messages. To examine this framework, we conducted experiments on English and Chinese online-newsgroup messages. We compared the discriminating power of the four types of features and of three classification techniques: decision trees, backpropagation neural networks, and support vector machines. The experimental results showed that the proposed approach was able to identify authors of online messages with satisfactory accuracy of 70 to 95%. All four types of message features contributed to discriminating authors of online messages. Support vector machines outperformed the other two classification techniques in our experiments. The high performance we achieved for both the English and Chinese datasets showed the potential of this approach in a multiple-language context.
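
The four feature types named in the abstract can be illustrated with a minimal sketch. It is not the authors' feature set: the word lists, feature names, and the `extract_features` helper are all assumptions made for illustration, and the tokenizer handles English text only. The resulting dictionary is the kind of per-message vector that a classifier such as a support vector machine could be trained on.

```python
import re
import string

# Hypothetical, tiny stand-ins for two of the feature types in the abstract.
FUNCTION_WORDS = ("the", "of", "and", "to", "in")   # syntactic (assumed list)
CONTENT_KEYWORDS = ("sale", "price")                # content-specific (assumed list)

def extract_features(message):
    """Return a dict of illustrative writing-style features for one message."""
    words = re.findall(r"[A-Za-z']+", message.lower())
    n_words = len(words) or 1          # avoid division by zero on empty input
    n_chars = len(message) or 1
    paragraphs = [p for p in re.split(r"\n\s*\n", message) if p.strip()]

    features = {
        # lexical: word-level statistics
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "vocabulary_richness": len(set(words)) / n_words,
        # syntactic: punctuation usage
        "punctuation_per_char": sum(c in string.punctuation for c in message) / n_chars,
        # structural: layout of the message
        "num_lines": len(message.splitlines()),
        "num_paragraphs": len(paragraphs),
    }
    # syntactic: relative frequency of each function word
    for w in FUNCTION_WORDS:
        features["fw_" + w] = words.count(w) / n_words
    # content-specific: relative frequency of each topic keyword
    for k in CONTENT_KEYWORDS:
        features["kw_" + k] = words.count(k) / n_words
    return features

example = extract_features("Hi all,\n\nThe price of the item is low.\n")
```

A real system would use far larger feature lists and would need a different tokenizer for Chinese messages.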

Similar documents (author)

  1. Huang, Z.; Chung, Z.W.; Chen, H.: A graph model for e-commerce recommender systems (2004) 1.52
    1.5177186 = sum of:
      1.5177186 = product of:
        2.2765777 = sum of:
          0.8560624 = weight(author_txt:chen in 2502) [ClassicSimilarity], result of:
            0.8560624 = score(doc=2502,freq=1.0), product of:
              0.36911428 = queryWeight, product of:
                6.184624 = idf(docFreq=236, maxDocs=42306)
                0.05968257 = queryNorm
              2.3192341 = fieldWeight in 2502, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.184624 = idf(docFreq=236, maxDocs=42306)
                0.375 = fieldNorm(doc=2502)
          1.4205153 = weight(author_txt:huang in 2502) [ClassicSimilarity], result of:
            1.4205153 = score(doc=2502,freq=1.0), product of:
              0.517354 = queryWeight, product of:
                1.1838958 = boost
                7.321951 = idf(docFreq=75, maxDocs=42306)
                0.05968257 = queryNorm
              2.7457316 = fieldWeight in 2502, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.321951 = idf(docFreq=75, maxDocs=42306)
                0.375 = fieldNorm(doc=2502)
        0.6666667 = coord(2/3)
    
  2. Huang, C.; Fu, T.; Chen, H.: Text-based video content classification for online video-sharing sites (2010) 1.52
    1.5177186 = sum of:
      1.5177186 = product of:
        2.2765777 = sum of:
          0.8560624 = weight(author_txt:chen in 453) [ClassicSimilarity], result of:
            0.8560624 = score(doc=453,freq=1.0), product of:
              0.36911428 = queryWeight, product of:
                6.184624 = idf(docFreq=236, maxDocs=42306)
                0.05968257 = queryNorm
              2.3192341 = fieldWeight in 453, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.184624 = idf(docFreq=236, maxDocs=42306)
                0.375 = fieldNorm(doc=453)
          1.4205153 = weight(author_txt:huang in 453) [ClassicSimilarity], result of:
            1.4205153 = score(doc=453,freq=1.0), product of:
              0.517354 = queryWeight, product of:
                1.1838958 = boost
                7.321951 = idf(docFreq=75, maxDocs=42306)
                0.05968257 = queryNorm
              2.7457316 = fieldWeight in 453, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.321951 = idf(docFreq=75, maxDocs=42306)
                0.375 = fieldNorm(doc=453)
        0.6666667 = coord(2/3)
    
  3. Huang, M.-H.; Huang, W.-T.; Chang, C.-C.; Chen, D. Z.; Lin, C.-P.: The greater scattering phenomenon beyond Bradford's law in patent citation (2014) 1.27
    1.273322 = sum of:
      1.273322 = product of:
        1.9099829 = sum of:
          0.5707083 = weight(author_txt:chen in 3353) [ClassicSimilarity], result of:
            0.5707083 = score(doc=3353,freq=1.0), product of:
              0.36911428 = queryWeight, product of:
                6.184624 = idf(docFreq=236, maxDocs=42306)
                0.05968257 = queryNorm
              1.546156 = fieldWeight in 3353, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.184624 = idf(docFreq=236, maxDocs=42306)
                0.25 = fieldNorm(doc=3353)
          1.3392746 = weight(author_txt:huang in 3353) [ClassicSimilarity], result of:
            1.3392746 = score(doc=3353,freq=2.0), product of:
              0.517354 = queryWeight, product of:
                1.1838958 = boost
                7.321951 = idf(docFreq=75, maxDocs=42306)
                0.05968257 = queryNorm
              2.5887005 = fieldWeight in 3353, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.321951 = idf(docFreq=75, maxDocs=42306)
                0.25 = fieldNorm(doc=3353)
        0.6666667 = coord(2/3)
    
  4. Chen, Z.; Huang, Y.; Tian, J.; Liu, X.; Fu, K.; Huang, T.: Joint model for subsentence-level sentiment analysis with Markov logic (2015) 1.27
    1.273322 = sum of:
      1.273322 = product of:
        1.9099829 = sum of:
          0.5707083 = weight(author_txt:chen in 4211) [ClassicSimilarity], result of:
            0.5707083 = score(doc=4211,freq=1.0), product of:
              0.36911428 = queryWeight, product of:
                6.184624 = idf(docFreq=236, maxDocs=42306)
                0.05968257 = queryNorm
              1.546156 = fieldWeight in 4211, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.184624 = idf(docFreq=236, maxDocs=42306)
                0.25 = fieldNorm(doc=4211)
          1.3392746 = weight(author_txt:huang in 4211) [ClassicSimilarity], result of:
            1.3392746 = score(doc=4211,freq=2.0), product of:
              0.517354 = queryWeight, product of:
                1.1838958 = boost
                7.321951 = idf(docFreq=75, maxDocs=42306)
                0.05968257 = queryNorm
              2.5887005 = fieldWeight in 4211, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.321951 = idf(docFreq=75, maxDocs=42306)
                0.25 = fieldNorm(doc=4211)
        0.6666667 = coord(2/3)
    
  5. Huang, M.-H.; Tang, M.-C.; Chen, D.-Z.: Inequality of publishing performance and international collaboration in physics (2011) 1.26
    1.2647655 = sum of:
      1.2647655 = product of:
        1.8971481 = sum of:
          0.71338534 = weight(author_txt:chen in 1468) [ClassicSimilarity], result of:
            0.71338534 = score(doc=1468,freq=1.0), product of:
              0.36911428 = queryWeight, product of:
                6.184624 = idf(docFreq=236, maxDocs=42306)
                0.05968257 = queryNorm
              1.932695 = fieldWeight in 1468, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.184624 = idf(docFreq=236, maxDocs=42306)
                0.3125 = fieldNorm(doc=1468)
          1.1837628 = weight(author_txt:huang in 1468) [ClassicSimilarity], result of:
            1.1837628 = score(doc=1468,freq=1.0), product of:
              0.517354 = queryWeight, product of:
                1.1838958 = boost
                7.321951 = idf(docFreq=75, maxDocs=42306)
                0.05968257 = queryNorm
              2.2881098 = fieldWeight in 1468, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.321951 = idf(docFreq=75, maxDocs=42306)
                0.3125 = fieldNorm(doc=1468)
        0.6666667 = coord(2/3)
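
The score trees above can be checked by hand. The sketch below recomputes the first entry (doc 2502) from the leaf values in its tree. It assumes that tf is the square root of the term frequency, as the tf lines show, and that idf is 1 + ln(maxDocs / (docFreq + 1)), which reproduces the idf values printed here. The function names are illustrative, not part of any library.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed form; it reproduces the idf values printed in the trees.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    i = idf(doc_freq, max_docs)
    query_weight = boost * i * query_norm             # "queryWeight" in the tree
    field_weight = math.sqrt(freq) * i * field_norm   # "fieldWeight" in the tree
    return query_weight * field_weight

MAX_DOCS = 42306
QUERY_NORM = 0.05968257

# Leaf values copied from the tree for doc 2502.
chen = term_score(1.0, 236, MAX_DOCS, QUERY_NORM, 0.375)
huang = term_score(1.0, 75, MAX_DOCS, QUERY_NORM, 0.375, boost=1.1838958)

# coord(2/3): two of the three query terms matched this document.
total = (chen + huang) * (2 / 3)
print(round(chen, 4), round(huang, 4), round(total, 4))
```

The values agree with the tree to about four decimal places. Small differences in later digits are expected because the listing appears to use single-precision arithmetic.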
    

Similar documents (content)

  1. Kucukyilmaz, T.; Cambazoglu, B.B.; Aykanat, C.; Can, F.: Chat mining : Predicting user and message attributes in computer-mediated communication (2008) 0.41
    0.40500972 = sum of:
      0.40500972 = product of:
        1.125027 = sum of:
          0.040325273 = weight(abstract_txt:authors in 4100) [ClassicSimilarity], result of:
            0.040325273 = score(doc=4100,freq=2.0), product of:
              0.077829875 = queryWeight, product of:
                4.689494 = idf(docFreq=1056, maxDocs=42306)
                0.016596647 = queryNorm
              0.51812077 = fieldWeight in 4100, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.689494 = idf(docFreq=1056, maxDocs=42306)
                0.078125 = fieldNorm(doc=4100)
          0.07550861 = weight(abstract_txt:writing in 4100) [ClassicSimilarity], result of:
            0.07550861 = score(doc=4100,freq=1.0), product of:
              0.14897123 = queryWeight, product of:
                1.3834964 = boost
                6.4878983 = idf(docFreq=174, maxDocs=42306)
                0.016596647 = queryNorm
              0.50686705 = fieldWeight in 4100, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.4878983 = idf(docFreq=174, maxDocs=42306)
                0.078125 = fieldNorm(doc=4100)
          0.079592906 = weight(abstract_txt:style in 4100) [ClassicSimilarity], result of:
            0.079592906 = score(doc=4100,freq=1.0), product of:
              0.15429588 = queryWeight, product of:
                1.4080043 = boost
                6.602828 = idf(docFreq=155, maxDocs=42306)
                0.016596647 = queryNorm
              0.51584595 = fieldWeight in 4100, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.602828 = idf(docFreq=155, maxDocs=42306)
                0.078125 = fieldNorm(doc=4100)
          0.03848775 = weight(abstract_txt:techniques in 4100) [ClassicSimilarity], result of:
            0.03848775 = score(doc=4100,freq=1.0), product of:
              0.10881369 = queryWeight, product of:
                1.4481523 = boost
                4.527401 = idf(docFreq=1242, maxDocs=42306)
                0.016596647 = queryNorm
              0.3537032 = fieldWeight in 4100, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.527401 = idf(docFreq=1242, maxDocs=42306)
                0.078125 = fieldNorm(doc=4100)
          0.08946281 = weight(abstract_txt:identity in 4100) [ClassicSimilarity], result of:
            0.08946281 = score(doc=4100,freq=1.0), product of:
              0.16680144 = queryWeight, product of:
                1.4639516 = boost
                6.8651924 = idf(docFreq=119, maxDocs=42306)
                0.016596647 = queryNorm
              0.53634316 = fieldWeight in 4100, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8651924 = idf(docFreq=119, maxDocs=42306)
                0.078125 = fieldNorm(doc=4100)
          0.1543244 = weight(abstract_txt:message in 4100) [ClassicSimilarity], result of:
            0.1543244 = score(doc=4100,freq=2.0), product of:
              0.19042231 = queryWeight, product of:
                1.5641764 = boost
                7.335196 = idf(docFreq=74, maxDocs=42306)
                0.016596647 = queryNorm
              0.8104323 = fieldWeight in 4100, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.335196 = idf(docFreq=74, maxDocs=42306)
                0.078125 = fieldNorm(doc=4100)
          0.035597626 = weight(abstract_txt:classification in 4100) [ClassicSimilarity], result of:
            0.035597626 = score(doc=4100,freq=1.0), product of:
              0.11369171 = queryWeight, product of:
                1.7092525 = boost
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.016596647 = queryNorm
              0.31310663 = fieldWeight in 4100, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.078125 = fieldNorm(doc=4100)
          0.055018824 = weight(abstract_txt:online in 4100) [ClassicSimilarity], result of:
            0.055018824 = score(doc=4100,freq=1.0), product of:
              0.19148391 = queryWeight, product of:
                3.137061 = boost
                3.6778073 = idf(docFreq=2906, maxDocs=42306)
                0.016596647 = queryNorm
              0.2873287 = fieldWeight in 4100, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6778073 = idf(docFreq=2906, maxDocs=42306)
                0.078125 = fieldNorm(doc=4100)
          0.5567088 = weight(abstract_txt:messages in 4100) [ClassicSimilarity], result of:
            0.5567088 = score(doc=4100,freq=3.0), product of:
              0.59407204 = queryWeight, product of:
                5.1686893 = boost
                6.9252963 = idf(docFreq=112, maxDocs=42306)
                0.016596647 = queryNorm
              0.9371066 = fieldWeight in 4100, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.9252963 = idf(docFreq=112, maxDocs=42306)
                0.078125 = fieldNorm(doc=4100)
        0.36 = coord(9/25)
    
  2. Huang, C.; Fu, T.; Chen, H.: Text-based video content classification for online video-sharing sites (2010) 0.26
    0.25805175 = sum of:
      0.25805175 = product of:
        0.6451294 = sum of:
          0.03000961 = weight(abstract_txt:experiments in 453) [ClassicSimilarity], result of:
            0.03000961 = score(doc=453,freq=1.0), product of:
              0.102144025 = queryWeight, product of:
                1.1456008 = boost
                5.372288 = idf(docFreq=533, maxDocs=42306)
                0.016596647 = queryNorm
              0.29379702 = fieldWeight in 453, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.372288 = idf(docFreq=533, maxDocs=42306)
                0.0546875 = fieldNorm(doc=453)
          0.038462102 = weight(abstract_txt:showed in 453) [ClassicSimilarity], result of:
            0.038462102 = score(doc=453,freq=1.0), product of:
              0.12052063 = queryWeight, product of:
                1.2443929 = boost
                5.835573 = idf(docFreq=335, maxDocs=42306)
                0.016596647 = queryNorm
              0.31913292 = fieldWeight in 453, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.835573 = idf(docFreq=335, maxDocs=42306)
                0.0546875 = fieldNorm(doc=453)
          0.07555177 = weight(abstract_txt:vector in 453) [ClassicSimilarity], result of:
            0.07555177 = score(doc=453,freq=2.0), product of:
              0.15003498 = queryWeight, product of:
                1.3884271 = boost
                6.5110207 = idf(docFreq=170, maxDocs=42306)
                0.016596647 = queryNorm
              0.503561 = fieldWeight in 453, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5110207 = idf(docFreq=170, maxDocs=42306)
                0.0546875 = fieldNorm(doc=453)
          0.045882367 = weight(abstract_txt:types in 453) [ClassicSimilarity], result of:
            0.045882367 = score(doc=453,freq=3.0), product of:
              0.107595295 = queryWeight, product of:
                1.4400219 = boost
                4.5019827 = idf(docFreq=1274, maxDocs=42306)
                0.016596647 = queryNorm
              0.4264347 = fieldWeight in 453, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.5019827 = idf(docFreq=1274, maxDocs=42306)
                0.0546875 = fieldNorm(doc=453)
          0.03810093 = weight(abstract_txt:techniques in 453) [ClassicSimilarity], result of:
            0.03810093 = score(doc=453,freq=2.0), product of:
              0.10881369 = queryWeight, product of:
                1.4481523 = boost
                4.527401 = idf(docFreq=1242, maxDocs=42306)
                0.016596647 = queryNorm
              0.35014832 = fieldWeight in 453, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.527401 = idf(docFreq=1242, maxDocs=42306)
                0.0546875 = fieldNorm(doc=453)
          0.04315983 = weight(abstract_txt:classification in 453) [ClassicSimilarity], result of:
            0.04315983 = score(doc=453,freq=3.0), product of:
              0.11369171 = queryWeight, product of:
                1.7092525 = boost
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.016596647 = queryNorm
              0.37962162 = fieldWeight in 453, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.0546875 = fieldNorm(doc=453)
          0.15342003 = weight(abstract_txt:discriminating in 453) [ClassicSimilarity], result of:
            0.15342003 = score(doc=453,freq=1.0), product of:
              0.303129 = queryWeight, product of:
                1.9735155 = boost
                9.254789 = idf(docFreq=10, maxDocs=42306)
                0.016596647 = queryNorm
              0.5061213 = fieldWeight in 453, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.254789 = idf(docFreq=10, maxDocs=42306)
                0.0546875 = fieldNorm(doc=453)
          0.05451634 = weight(abstract_txt:framework in 453) [ClassicSimilarity], result of:
            0.05451634 = score(doc=453,freq=2.0), product of:
              0.15207478 = queryWeight, product of:
                1.976835 = boost
                4.635178 = idf(docFreq=1115, maxDocs=42306)
                0.016596647 = queryNorm
              0.35848376 = fieldWeight in 453, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.635178 = idf(docFreq=1115, maxDocs=42306)
                0.0546875 = fieldNorm(doc=453)
          0.0799083 = weight(abstract_txt:features in 453) [ClassicSimilarity], result of:
            0.0799083 = score(doc=453,freq=3.0), product of:
              0.18466032 = queryWeight, product of:
                2.4354746 = boost
                4.5684576 = idf(docFreq=1192, maxDocs=42306)
                0.016596647 = queryNorm
              0.43273127 = fieldWeight in 453, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.5684576 = idf(docFreq=1192, maxDocs=42306)
                0.0546875 = fieldNorm(doc=453)
          0.08611808 = weight(abstract_txt:online in 453) [ClassicSimilarity], result of:
            0.08611808 = score(doc=453,freq=5.0), product of:
              0.19148391 = queryWeight, product of:
                3.137061 = boost
                3.6778073 = idf(docFreq=2906, maxDocs=42306)
                0.016596647 = queryNorm
              0.44974056 = fieldWeight in 453, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.6778073 = idf(docFreq=2906, maxDocs=42306)
                0.0546875 = fieldNorm(doc=453)
        0.4 = coord(10/25)
    
  3. Ku, Y.; Chiu, C.; Zhang, Y.; Chen, H.; Su, H.: Text mining self-disclosing health information for public health service (2014) 0.14
    0.1409362 = sum of:
      0.1409362 = product of:
        0.58723414 = sum of:
          0.02745526 = weight(abstract_txt:identify in 3263) [ClassicSimilarity], result of:
            0.02745526 = score(doc=3263,freq=1.0), product of:
              0.08806334 = queryWeight, product of:
                1.0637128 = boost
                4.988275 = idf(docFreq=783, maxDocs=42306)
                0.016596647 = queryNorm
              0.3117672 = fieldWeight in 3263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.988275 = idf(docFreq=783, maxDocs=42306)
                0.0625 = fieldNorm(doc=3263)
          0.0307902 = weight(abstract_txt:techniques in 3263) [ClassicSimilarity], result of:
            0.0307902 = score(doc=3263,freq=1.0), product of:
              0.10881369 = queryWeight, product of:
                1.4481523 = boost
                4.527401 = idf(docFreq=1242, maxDocs=42306)
                0.016596647 = queryNorm
              0.28296256 = fieldWeight in 3263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.527401 = idf(docFreq=1242, maxDocs=42306)
                0.0625 = fieldNorm(doc=3263)
          0.028478103 = weight(abstract_txt:classification in 3263) [ClassicSimilarity], result of:
            0.028478103 = score(doc=3263,freq=1.0), product of:
              0.11369171 = queryWeight, product of:
                1.7092525 = boost
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.016596647 = queryNorm
              0.2504853 = fieldWeight in 3263, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.0625 = fieldNorm(doc=3263)
          0.06230439 = weight(abstract_txt:framework in 3263) [ClassicSimilarity], result of:
            0.06230439 = score(doc=3263,freq=2.0), product of:
              0.15207478 = queryWeight, product of:
                1.976835 = boost
                4.635178 = idf(docFreq=1115, maxDocs=42306)
                0.016596647 = queryNorm
              0.4096957 = fieldWeight in 3263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.635178 = idf(docFreq=1115, maxDocs=42306)
                0.0625 = fieldNorm(doc=3263)
          0.07456554 = weight(abstract_txt:features in 3263) [ClassicSimilarity], result of:
            0.07456554 = score(doc=3263,freq=2.0), product of:
              0.18466032 = queryWeight, product of:
                2.4354746 = boost
                4.5684576 = idf(docFreq=1192, maxDocs=42306)
                0.016596647 = queryNorm
              0.4037984 = fieldWeight in 3263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5684576 = idf(docFreq=1192, maxDocs=42306)
                0.0625 = fieldNorm(doc=3263)
          0.3636407 = weight(abstract_txt:messages in 3263) [ClassicSimilarity], result of:
            0.3636407 = score(doc=3263,freq=2.0), product of:
              0.59407204 = queryWeight, product of:
                5.1686893 = boost
                6.9252963 = idf(docFreq=112, maxDocs=42306)
                0.016596647 = queryNorm
              0.6121155 = fieldWeight in 3263, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9252963 = idf(docFreq=112, maxDocs=42306)
                0.0625 = fieldNorm(doc=3263)
        0.24 = coord(6/25)
    
  4. Stamatatos, E.: A survey of modern authorship attribution methods (2009) 0.14
    0.13884912 = sum of:
      0.13884912 = product of:
        0.6942456 = sum of:
          0.022811422 = weight(abstract_txt:authors in 561) [ClassicSimilarity], result of:
            0.022811422 = score(doc=561,freq=1.0), product of:
              0.077829875 = queryWeight, product of:
                4.689494 = idf(docFreq=1056, maxDocs=42306)
                0.016596647 = queryNorm
              0.29309338 = fieldWeight in 561, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.689494 = idf(docFreq=1056, maxDocs=42306)
                0.0625 = fieldNorm(doc=561)
          0.028478103 = weight(abstract_txt:classification in 561) [ClassicSimilarity], result of:
            0.028478103 = score(doc=561,freq=1.0), product of:
              0.11369171 = queryWeight, product of:
                1.7092525 = boost
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.016596647 = queryNorm
              0.2504853 = fieldWeight in 561, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.0625 = fieldNorm(doc=561)
          0.23530027 = weight(abstract_txt:authorship in 561) [ClassicSimilarity], result of:
            0.23530027 = score(doc=561,freq=4.0), product of:
              0.26595214 = queryWeight, product of:
                2.2639885 = boost
                7.0779734 = idf(docFreq=96, maxDocs=42306)
                0.016596647 = queryNorm
              0.8847467 = fieldWeight in 561, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.0779734 = idf(docFreq=96, maxDocs=42306)
                0.0625 = fieldNorm(doc=561)
          0.044015057 = weight(abstract_txt:online in 561) [ClassicSimilarity], result of:
            0.044015057 = score(doc=561,freq=1.0), product of:
              0.19148391 = queryWeight, product of:
                3.137061 = boost
                3.6778073 = idf(docFreq=2906, maxDocs=42306)
                0.016596647 = queryNorm
              0.22986296 = fieldWeight in 561, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6778073 = idf(docFreq=2906, maxDocs=42306)
                0.0625 = fieldNorm(doc=561)
          0.3636407 = weight(abstract_txt:messages in 561) [ClassicSimilarity], result of:
            0.3636407 = score(doc=561,freq=2.0), product of:
              0.59407204 = queryWeight, product of:
                5.1686893 = boost
                6.9252963 = idf(docFreq=112, maxDocs=42306)
                0.016596647 = queryNorm
              0.6121155 = fieldWeight in 561, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9252963 = idf(docFreq=112, maxDocs=42306)
                0.0625 = fieldNorm(doc=561)
        0.2 = coord(5/25)
    
  5. Peng, F.; Huang, X.: Machine learning for Asian language text classification (2007) 0.14
    0.13784236 = sum of:
      0.13784236 = product of:
        0.49229413 = sum of:
          0.034296695 = weight(abstract_txt:experiments in 2832) [ClassicSimilarity], result of:
            0.034296695 = score(doc=2832,freq=1.0), product of:
              0.102144025 = queryWeight, product of:
                1.1456008 = boost
                5.372288 = idf(docFreq=533, maxDocs=42306)
                0.016596647 = queryNorm
              0.335768 = fieldWeight in 2832, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.372288 = idf(docFreq=533, maxDocs=42306)
                0.0625 = fieldNorm(doc=2832)
          0.112448744 = weight(abstract_txt:chinese in 2832) [ClassicSimilarity], result of:
            0.112448744 = score(doc=2832,freq=4.0), product of:
              0.14201292 = queryWeight, product of:
                1.3507991 = boost
                6.334564 = idf(docFreq=203, maxDocs=42306)
                0.016596647 = queryNorm
              0.7918205 = fieldWeight in 2832, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.334564 = idf(docFreq=203, maxDocs=42306)
                0.0625 = fieldNorm(doc=2832)
          0.061055053 = weight(abstract_txt:vector in 2832) [ClassicSimilarity], result of:
            0.061055053 = score(doc=2832,freq=1.0), product of:
              0.15003498 = queryWeight, product of:
                1.3884271 = boost
                6.5110207 = idf(docFreq=170, maxDocs=42306)
                0.016596647 = queryNorm
              0.4069388 = fieldWeight in 2832, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5110207 = idf(docFreq=170, maxDocs=42306)
                0.0625 = fieldNorm(doc=2832)
          0.04354392 = weight(abstract_txt:techniques in 2832) [ClassicSimilarity], result of:
            0.04354392 = score(doc=2832,freq=2.0), product of:
              0.10881369 = queryWeight, product of:
                1.4481523 = boost
                4.527401 = idf(docFreq=1242, maxDocs=42306)
                0.016596647 = queryNorm
              0.4001695 = fieldWeight in 2832, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.527401 = idf(docFreq=1242, maxDocs=42306)
                0.0625 = fieldNorm(doc=2832)
          0.08094987 = weight(abstract_txt:machines in 2832) [ClassicSimilarity], result of:
            0.08094987 = score(doc=2832,freq=1.0), product of:
              0.18107377 = queryWeight, product of:
                1.5252976 = boost
                7.1528745 = idf(docFreq=89, maxDocs=42306)
                0.016596647 = queryNorm
              0.44705465 = fieldWeight in 2832, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1528745 = idf(docFreq=89, maxDocs=42306)
                0.0625 = fieldNorm(doc=2832)
          0.08543431 = weight(abstract_txt:classification in 2832) [ClassicSimilarity], result of:
            0.08543431 = score(doc=2832,freq=9.0), product of:
              0.11369171 = queryWeight, product of:
                1.7092525 = boost
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.016596647 = queryNorm
              0.7514559 = fieldWeight in 2832, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                4.007765 = idf(docFreq=2089, maxDocs=42306)
                0.0625 = fieldNorm(doc=2832)
          0.07456554 = weight(abstract_txt:features in 2832) [ClassicSimilarity], result of:
            0.07456554 = score(doc=2832,freq=2.0), product of:
              0.18466032 = queryWeight, product of:
                2.4354746 = boost
                4.5684576 = idf(docFreq=1192, maxDocs=42306)
                0.016596647 = queryNorm
              0.4037984 = fieldWeight in 2832, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5684576 = idf(docFreq=1192, maxDocs=42306)
                0.0625 = fieldNorm(doc=2832)
        0.28 = coord(7/25)