Document (#21774)

Author
Kwok, K.L.
Title
Employing multiple representations for Chinese information retrieval
Source
Journal of the American Society for Information Science. 50(1999) no.8, p.709-723
Year
1999
Abstract
For information retrieval in the Chinese language, 3 representation methods for texts are popular, namely: 1-gram or character, bigram, and short-word. Each has its advantages as well as drawbacks. Employing more than one method may combine their advantages and enhance retrieval effectiveness. We investigated 2 ways of using them simultaneously: mixing representations in documents and queries, and combining retrieval lists obtained via different representations. The experiments were done with the 170 MB evaluated Chinese corpora and 54 long and short queries available from the TREC program, using our Probabilistic Indexing and Retrieval Components System (PIRCS retrieval system). Experiments show that good retrieval need not depend on accurate word segmentation; approximate segmentation into short-words will do. Results also show and confirm that bigram representation alone works well; mixing characters with bigram representation boosts effectiveness further, but it is preferable to mix characters with short-word indexing, which is more efficient, needs fewer resources, and gives better retrieval more often. Combining retrieval lists from short-word with character representation and from bigram indexing provides the best retrieval results, but at a substantial cost.
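To make the representations named in the abstract concrete, the following is a minimal Python sketch of character (1-gram) and overlapping-bigram term extraction, a mixed representation, and a simple merge of two retrieval lists. All function names are hypothetical. The linear score combination is an assumption made for illustration; the abstract does not state how PIRCS combines lists, and short-word segmentation is omitted because it would require a lexicon.

```python
def char_terms(text):
    """1-gram representation: every non-whitespace character is a term."""
    return [ch for ch in text if not ch.isspace()]


def bigram_terms(text):
    """Overlapping bigrams: each pair of adjacent characters is a term.

    Whitespace is dropped first, so pairs may span a removed space.
    """
    chars = char_terms(text)
    return [a + b for a, b in zip(chars, chars[1:])]


def mixed_terms(text):
    """Mixed representation: characters and bigrams indexed together."""
    return char_terms(text) + bigram_terms(text)


def combine_lists(scores_a, scores_b, weight_a=0.5):
    """Merge two retrieval lists (doc id -> score) by a weighted sum.

    Assumption: linear combination; a document missing from one list
    contributes a score of 0.0 for that list.
    """
    docs = set(scores_a) | set(scores_b)
    return {
        d: weight_a * scores_a.get(d, 0.0)
        + (1.0 - weight_a) * scores_b.get(d, 0.0)
        for d in docs
    }


if __name__ == "__main__":
    print(char_terms("信息检索"))
    print(bigram_terms("信息检索"))
    print(combine_lists({"d1": 1.0}, {"d1": 0.5, "d2": 1.0}))
```

In this sketch the mixed representation simply concatenates the two term lists, so a character and a bigram containing it are both counted for the same text.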

Similar documents (author)

  1. Kwok, K.L.: The use of titles and cited titles as document representations for automatic classification (1975) 5.62
    5.6180234 = sum of:
      5.6180234 = weight(author_txt:kwok in 4347) [ClassicSimilarity], result of:
        5.6180234 = fieldWeight in 4347, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.988837 = idf(docFreq=14, maxDocs=44218)
          0.625 = fieldNorm(doc=4347)
    
  2. Kwok, K.L.: A network approach to probabilistic information retrieval (1995) 5.62
    5.6180234 = sum of:
      5.6180234 = weight(author_txt:kwok in 5696) [ClassicSimilarity], result of:
        5.6180234 = fieldWeight in 5696, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.988837 = idf(docFreq=14, maxDocs=44218)
          0.625 = fieldNorm(doc=5696)
    
  3. Kwok, K.L.: Improving English and Chinese ad-hoc retrieval : a TIPSTER text phase 3 project report (2000) 5.62
    5.6180234 = sum of:
      5.6180234 = weight(author_txt:kwok in 6388) [ClassicSimilarity], result of:
        5.6180234 = fieldWeight in 6388, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.988837 = idf(docFreq=14, maxDocs=44218)
          0.625 = fieldNorm(doc=6388)
    
  4. Kwok, K.L.; Grunfeld, L.: TREC-4 ad-hoc, routing retrieval and filtering experiments using PIRCS (1996) 4.49
    4.4944186 = sum of:
      4.4944186 = weight(author_txt:kwok in 7561) [ClassicSimilarity], result of:
        4.4944186 = fieldWeight in 7561, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.988837 = idf(docFreq=14, maxDocs=44218)
          0.5 = fieldNorm(doc=7561)
    
  5. Kwok, K.L.; Grunfeld, L.: TREC-5 English and Chinese retrieval experiments using PIRCS (1997) 4.49
    4.4944186 = sum of:
      4.4944186 = weight(author_txt:kwok in 3102) [ClassicSimilarity], result of:
        4.4944186 = fieldWeight in 3102, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.988837 = idf(docFreq=14, maxDocs=44218)
          0.5 = fieldNorm(doc=3102)
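
The author-match scores above can be reproduced by hand. The sketch below assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is consistent with the values printed in the listing (for example idf = 8.988837 for docFreq=14, maxDocs=44218). The function names are hypothetical and are not part of any library.

```python
import math


def classic_tf(freq):
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq, max_docs):
    """Inverse document frequency, in the form matching the listing."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq, doc_freq, max_docs, field_norm):
    """fieldWeight = tf * idf * fieldNorm, as in the explanation tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


if __name__ == "__main__":
    # Single-author records (fieldNorm 0.625) and two-author records (0.5).
    print(round(field_weight(1.0, 14, 44218, 0.625), 4))
    print(round(field_weight(1.0, 14, 44218, 0.5), 4))
```

The only factor that differs between the 5.62 and 4.49 matches is fieldNorm (0.625 versus 0.5); the lower value appears on the records that list two authors.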
    

Similar documents (content)

  1. Wang, F.L.; Yang, C.C.: Mining Web data for Chinese segmentation (2007) 0.35
    0.35223752 = sum of:
      0.35223752 = product of:
        0.8805938 = sum of:
          0.018248035 = weight(abstract_txt:show in 604) [ClassicSimilarity], result of:
            0.018248035 = score(doc=604,freq=1.0), product of:
              0.066267446 = queryWeight, product of:
                1.1104047 = boost
                4.4059124 = idf(docFreq=1466, maxDocs=44218)
                0.013545127 = queryNorm
              0.27536952 = fieldWeight in 604, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4059124 = idf(docFreq=1466, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.009426156 = weight(abstract_txt:with in 604) [ClassicSimilarity], result of:
            0.009426156 = score(doc=604,freq=2.0), product of:
              0.04266246 = queryWeight, product of:
                1.2599959 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.013545127 = queryNorm
              0.22094731 = fieldWeight in 604, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.0323141 = weight(abstract_txt:experiments in 604) [ClassicSimilarity], result of:
            0.0323141 = score(doc=604,freq=1.0), product of:
              0.09699534 = queryWeight, product of:
                1.3434039 = boost
                5.3304167 = idf(docFreq=581, maxDocs=44218)
                0.013545127 = queryNorm
              0.33315104 = fieldWeight in 604, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3304167 = idf(docFreq=581, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.00900943 = weight(abstract_txt:from in 604) [ClassicSimilarity], result of:
            0.00900943 = score(doc=604,freq=1.0), product of:
              0.05215521 = queryWeight, product of:
                1.3931409 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.013545127 = queryNorm
              0.17274266 = fieldWeight in 604, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.083442554 = weight(abstract_txt:character in 604) [ClassicSimilarity], result of:
            0.083442554 = score(doc=604,freq=2.0), product of:
              0.14490095 = queryWeight, product of:
                1.641976 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.013545127 = queryNorm
              0.57585925 = fieldWeight in 604, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.037244234 = weight(abstract_txt:indexing in 604) [ClassicSimilarity], result of:
            0.037244234 = score(doc=604,freq=2.0), product of:
              0.09687594 = queryWeight, product of:
                1.6443142 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.013545127 = queryNorm
              0.38445285 = fieldWeight in 604, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.33718026 = weight(abstract_txt:segmentation in 604) [ClassicSimilarity], result of:
            0.33718026 = score(doc=604,freq=10.0), product of:
              0.21497977 = queryWeight, product of:
                2.0 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.013545127 = queryNorm
              1.5684279 = fieldWeight in 604, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.21205299 = weight(abstract_txt:chinese in 604) [ClassicSimilarity], result of:
            0.21205299 = score(doc=604,freq=7.0), product of:
              0.20344648 = queryWeight, product of:
                2.3828785 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.013545127 = queryNorm
              1.0423036 = fieldWeight in 604, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.09690519 = weight(abstract_txt:word in 604) [ClassicSimilarity], result of:
            0.09690519 = score(doc=604,freq=2.0), product of:
              0.20170695 = queryWeight, product of:
                2.7397227 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.013545127 = queryNorm
              0.48042563 = fieldWeight in 604, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.04477084 = weight(abstract_txt:retrieval in 604) [ClassicSimilarity], result of:
            0.04477084 = score(doc=604,freq=1.0), product of:
              0.20613085 = queryWeight, product of:
                4.3791285 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.013545127 = queryNorm
              0.21719621 = fieldWeight in 604, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
        0.4 = coord(10/25)
    
  2. Lee, K.H.; Ng, M.K.M.; Lu, Q.: Text segmentation for Chinese spell checking (1999) 0.28
    0.2759495 = sum of:
      0.2759495 = product of:
        0.76652634 = sum of:
          0.009426156 = weight(abstract_txt:with in 3913) [ClassicSimilarity], result of:
            0.009426156 = score(doc=3913,freq=2.0), product of:
              0.04266246 = queryWeight, product of:
                1.2599959 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.013545127 = queryNorm
              0.22094731 = fieldWeight in 3913, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.017821793 = weight(abstract_txt:more in 3913) [ClassicSimilarity], result of:
            0.017821793 = score(doc=3913,freq=2.0), product of:
              0.059266713 = queryWeight, product of:
                1.2861222 = boost
                3.402088 = idf(docFreq=4002, maxDocs=44218)
                0.013545127 = queryNorm
              0.30070493 = fieldWeight in 3913, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.402088 = idf(docFreq=4002, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.0323141 = weight(abstract_txt:experiments in 3913) [ClassicSimilarity], result of:
            0.0323141 = score(doc=3913,freq=1.0), product of:
              0.09699534 = queryWeight, product of:
                1.3434039 = boost
                5.3304167 = idf(docFreq=581, maxDocs=44218)
                0.013545127 = queryNorm
              0.33315104 = fieldWeight in 3913, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3304167 = idf(docFreq=581, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.015604791 = weight(abstract_txt:from in 3913) [ClassicSimilarity], result of:
            0.015604791 = score(doc=3913,freq=3.0), product of:
              0.05215521 = queryWeight, product of:
                1.3931409 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.013545127 = queryNorm
              0.29919907 = fieldWeight in 3913, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.059002794 = weight(abstract_txt:character in 3913) [ClassicSimilarity], result of:
            0.059002794 = score(doc=3913,freq=1.0), product of:
              0.14490095 = queryWeight, product of:
                1.641976 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.013545127 = queryNorm
              0.407194 = fieldWeight in 3913, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.08573765 = weight(abstract_txt:characters in 3913) [ClassicSimilarity], result of:
            0.08573765 = score(doc=3913,freq=1.0), product of:
              0.18589622 = queryWeight, product of:
                1.8598009 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.013545127 = queryNorm
              0.46121246 = fieldWeight in 3913, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.21325152 = weight(abstract_txt:segmentation in 3913) [ClassicSimilarity], result of:
            0.21325152 = score(doc=3913,freq=4.0), product of:
              0.21497977 = queryWeight, product of:
                2.0 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.013545127 = queryNorm
              0.9919609 = fieldWeight in 3913, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.19632293 = weight(abstract_txt:chinese in 3913) [ClassicSimilarity], result of:
            0.19632293 = score(doc=3913,freq=6.0), product of:
              0.20344648 = queryWeight, product of:
                2.3828785 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.013545127 = queryNorm
              0.96498567 = fieldWeight in 3913, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.13704464 = weight(abstract_txt:word in 3913) [ClassicSimilarity], result of:
            0.13704464 = score(doc=3913,freq=4.0), product of:
              0.20170695 = queryWeight, product of:
                2.7397227 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.013545127 = queryNorm
              0.67942446 = fieldWeight in 3913, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
        0.36 = coord(9/25)
    
  3. Chau, M.; Lu, Y.; Fang, X.; Yang, C.C.: Characteristics of character usage in Chinese Web searching (2009) 0.27
    0.27138087 = sum of:
      0.27138087 = product of:
        0.8480652 = sum of:
          0.013330597 = weight(abstract_txt:with in 2456) [ClassicSimilarity], result of:
            0.013330597 = score(doc=2456,freq=4.0), product of:
              0.04266246 = queryWeight, product of:
                1.2599959 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.013545127 = queryNorm
              0.31246668 = fieldWeight in 2456, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=2456)
          0.017821793 = weight(abstract_txt:more in 2456) [ClassicSimilarity], result of:
            0.017821793 = score(doc=2456,freq=2.0), product of:
              0.059266713 = queryWeight, product of:
                1.2861222 = boost
                3.402088 = idf(docFreq=4002, maxDocs=44218)
                0.013545127 = queryNorm
              0.30070493 = fieldWeight in 2456, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.402088 = idf(docFreq=4002, maxDocs=44218)
                0.0625 = fieldNorm(doc=2456)
          0.040180515 = weight(abstract_txt:queries in 2456) [ClassicSimilarity], result of:
            0.040180515 = score(doc=2456,freq=2.0), product of:
              0.08902046 = queryWeight, product of:
                1.2869928 = boost
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.013545127 = queryNorm
              0.4513627 = fieldWeight in 2456, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.106586 = idf(docFreq=727, maxDocs=44218)
                0.0625 = fieldNorm(doc=2456)
          0.015604791 = weight(abstract_txt:from in 2456) [ClassicSimilarity], result of:
            0.015604791 = score(doc=2456,freq=3.0), product of:
              0.05215521 = queryWeight, product of:
                1.3931409 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.013545127 = queryNorm
              0.29919907 = fieldWeight in 2456, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=2456)
          0.083442554 = weight(abstract_txt:character in 2456) [ClassicSimilarity], result of:
            0.083442554 = score(doc=2456,freq=2.0), product of:
              0.14490095 = queryWeight, product of:
                1.641976 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.013545127 = queryNorm
              0.57585925 = fieldWeight in 2456, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=2456)
          0.08573765 = weight(abstract_txt:characters in 2456) [ClassicSimilarity], result of:
            0.08573765 = score(doc=2456,freq=1.0), product of:
              0.18589622 = queryWeight, product of:
                1.8598009 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.013545127 = queryNorm
              0.46121246 = fieldWeight in 2456, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.0625 = fieldNorm(doc=2456)
          0.19632293 = weight(abstract_txt:chinese in 2456) [ClassicSimilarity], result of:
            0.19632293 = score(doc=2456,freq=6.0), product of:
              0.20344648 = queryWeight, product of:
                2.3828785 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.013545127 = queryNorm
              0.96498567 = fieldWeight in 2456, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0625 = fieldNorm(doc=2456)
          0.39562434 = weight(abstract_txt:bigram in 2456) [ClassicSimilarity], result of:
            0.39562434 = score(doc=2456,freq=1.0), product of:
              0.6491646 = queryWeight, product of:
                4.915001 = boost
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.013545127 = queryNorm
              0.6094361 = fieldWeight in 2456, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.7509775 = idf(docFreq=6, maxDocs=44218)
                0.0625 = fieldNorm(doc=2456)
        0.32 = coord(8/25)
    
  4. Yang, C.C.; Li, K.W.: A heuristic method based on a statistical approach for Chinese text segmentation (2005) 0.27
    0.26822996 = sum of:
      0.26822996 = product of:
        0.8382186 = sum of:
          0.0115446355 = weight(abstract_txt:with in 4580) [ClassicSimilarity], result of:
            0.0115446355 = score(doc=4580,freq=3.0), product of:
              0.04266246 = queryWeight, product of:
                1.2599959 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.013545127 = queryNorm
              0.27060407 = fieldWeight in 4580, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.01260191 = weight(abstract_txt:more in 4580) [ClassicSimilarity], result of:
            0.01260191 = score(doc=4580,freq=1.0), product of:
              0.059266713 = queryWeight, product of:
                1.2861222 = boost
                3.402088 = idf(docFreq=4002, maxDocs=44218)
                0.013545127 = queryNorm
              0.2126305 = fieldWeight in 4580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.402088 = idf(docFreq=4002, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.02633565 = weight(abstract_txt:indexing in 4580) [ClassicSimilarity], result of:
            0.02633565 = score(doc=4580,freq=1.0), product of:
              0.09687594 = queryWeight, product of:
                1.6443142 = boost
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.013545127 = queryNorm
              0.27184922 = fieldWeight in 4580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3495874 = idf(docFreq=1551, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.08573765 = weight(abstract_txt:characters in 4580) [ClassicSimilarity], result of:
            0.08573765 = score(doc=4580,freq=1.0), product of:
              0.18589622 = queryWeight, product of:
                1.8598009 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.013545127 = queryNorm
              0.46121246 = fieldWeight in 4580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.31987727 = weight(abstract_txt:segmentation in 4580) [ClassicSimilarity], result of:
            0.31987727 = score(doc=4580,freq=9.0), product of:
              0.21497977 = queryWeight, product of:
                2.0 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.013545127 = queryNorm
              1.4879413 = fieldWeight in 4580, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.2404455 = weight(abstract_txt:chinese in 4580) [ClassicSimilarity], result of:
            0.2404455 = score(doc=4580,freq=9.0), product of:
              0.20344648 = queryWeight, product of:
                2.3828785 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.013545127 = queryNorm
              1.1818612 = fieldWeight in 4580, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.09690519 = weight(abstract_txt:word in 4580) [ClassicSimilarity], result of:
            0.09690519 = score(doc=4580,freq=2.0), product of:
              0.20170695 = queryWeight, product of:
                2.7397227 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.013545127 = queryNorm
              0.48042563 = fieldWeight in 4580, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.04477084 = weight(abstract_txt:retrieval in 4580) [ClassicSimilarity], result of:
            0.04477084 = score(doc=4580,freq=1.0), product of:
              0.20613085 = queryWeight, product of:
                4.3791285 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.013545127 = queryNorm
              0.21719621 = fieldWeight in 4580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
        0.32 = coord(8/25)
    
  5. Khoo, C.S.G.; Dai, D.; Loh, T.E.: Using statistical and contextual information to identify two- and three-character words in Chinese text (2002) 0.26
    0.25529867 = sum of:
      0.25529867 = product of:
        0.7978084 = sum of:
          0.009426156 = weight(abstract_txt:with in 5206) [ClassicSimilarity], result of:
            0.009426156 = score(doc=5206,freq=2.0), product of:
              0.04266246 = queryWeight, product of:
                1.2599959 = boost
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.013545127 = queryNorm
              0.22094731 = fieldWeight in 5206, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4997334 = idf(docFreq=9868, maxDocs=44218)
                0.0625 = fieldNorm(doc=5206)
          0.017821793 = weight(abstract_txt:more in 5206) [ClassicSimilarity], result of:
            0.017821793 = score(doc=5206,freq=2.0), product of:
              0.059266713 = queryWeight, product of:
                1.2861222 = boost
                3.402088 = idf(docFreq=4002, maxDocs=44218)
                0.013545127 = queryNorm
              0.30070493 = fieldWeight in 5206, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.402088 = idf(docFreq=4002, maxDocs=44218)
                0.0625 = fieldNorm(doc=5206)
          0.00900943 = weight(abstract_txt:from in 5206) [ClassicSimilarity], result of:
            0.00900943 = score(doc=5206,freq=1.0), product of:
              0.05215521 = queryWeight, product of:
                1.3931409 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.013545127 = queryNorm
              0.17274266 = fieldWeight in 5206, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=5206)
          0.14452675 = weight(abstract_txt:character in 5206) [ClassicSimilarity], result of:
            0.14452675 = score(doc=5206,freq=6.0), product of:
              0.14490095 = queryWeight, product of:
                1.641976 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.013545127 = queryNorm
              0.9974175 = fieldWeight in 5206, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=5206)
          0.14850196 = weight(abstract_txt:characters in 5206) [ClassicSimilarity], result of:
            0.14850196 = score(doc=5206,freq=3.0), product of:
              0.18589622 = queryWeight, product of:
                1.8598009 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.013545127 = queryNorm
              0.7988434 = fieldWeight in 5206, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.0625 = fieldNorm(doc=5206)
          0.26117873 = weight(abstract_txt:segmentation in 5206) [ClassicSimilarity], result of:
            0.26117873 = score(doc=5206,freq=6.0), product of:
              0.21497977 = queryWeight, product of:
                2.0 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.013545127 = queryNorm
              1.2148991 = fieldWeight in 5206, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=5206)
          0.13882127 = weight(abstract_txt:chinese in 5206) [ClassicSimilarity], result of:
            0.13882127 = score(doc=5206,freq=3.0), product of:
              0.20344648 = queryWeight, product of:
                2.3828785 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.013545127 = queryNorm
              0.6823479 = fieldWeight in 5206, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0625 = fieldNorm(doc=5206)
          0.06852232 = weight(abstract_txt:word in 5206) [ClassicSimilarity], result of:
            0.06852232 = score(doc=5206,freq=1.0), product of:
              0.20170695 = queryWeight, product of:
                2.7397227 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.013545127 = queryNorm
              0.33971223 = fieldWeight in 5206, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=5206)
        0.32 = coord(8/25)
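
The content-match trees above follow one pattern: each term score is queryWeight times fieldWeight, and the document score is the sum of term scores multiplied by coord (matched terms over query terms). The sketch below recomputes the first content match (doc=604) from the factors printed in its tree. The function names are hypothetical, and the inputs are copied from the listing rather than derived from an index.

```python
import math


def term_score(boost, idf, query_norm, freq, field_norm):
    """score = queryWeight * fieldWeight for one matching term."""
    query_weight = boost * idf * query_norm
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight


def document_score(term_scores, matched, total_terms):
    """Sum the term scores and apply coord = matched / total_terms."""
    return sum(term_scores) * (matched / total_terms)


QUERY_NORM = 0.013545127
FIELD_NORM = 0.0625

# (boost, idf, freq) for the ten matching terms of doc=604, from the listing.
DOC_604_TERMS = [
    (1.1104047, 4.4059124, 1.0),   # show
    (1.2599959, 2.4997334, 2.0),   # with
    (1.3434039, 5.3304167, 1.0),   # experiments
    (1.3931409, 2.7638826, 1.0),   # from
    (1.641976, 6.515104, 2.0),     # character
    (1.6443142, 4.3495874, 2.0),   # indexing
    (2.0, 7.935687, 10.0),         # segmentation
    (2.3828785, 6.30326, 7.0),     # chinese
    (2.7397227, 5.4353957, 2.0),   # word
    (4.3791285, 3.4751394, 1.0),   # retrieval
]

if __name__ == "__main__":
    scores = [
        term_score(boost, idf, QUERY_NORM, freq, FIELD_NORM)
        for boost, idf, freq in DOC_604_TERMS
    ]
    print(round(document_score(scores, matched=10, total_terms=25), 4))
```

Because idf enters both queryWeight and fieldWeight, rare terms such as "segmentation" and "bigram" dominate the sums even when their raw frequency is modest.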