Document (#41676)

Author
Doval, Y.
Gómez-Rodríguez, C.
Title
Comparing neural- and N-gram-based language models for word segmentation
Source
Journal of the Association for Information Science and Technology. 70(2019) no.2, pp.187-197
Year
2019
Abstract
Word segmentation is the task of inserting or deleting word boundary characters in order to separate character sequences that correspond to words in some language. In this article we propose an approach based on a beam search algorithm and a language model working at the byte/character level, the latter component implemented either as an n-gram model or a recurrent neural network. The resulting system analyzes the text input with no word boundaries one token at a time, which can be a character or a byte, and uses the information gathered by the language model to determine if a boundary must be placed in the current position or not. Our aim is to use this system in a preprocessing step for a microtext normalization system. This means that it needs to effectively cope with the data sparsity present in this kind of text. We also strove to surpass the performance of two readily available word segmentation systems: the well-known and accessible Word Breaker by Microsoft, and the Python module WordSegment by Grant Jenks. The results show that we have met our objectives, and we hope to continue to improve both the precision and the efficiency of our system in the future.
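
The approach described in the abstract (a beam search that reads unsegmented text one character at a time and asks a character-level language model whether to place a boundary) can be illustrated with a minimal sketch. This is not the authors' implementation: the class `CharBigramLM`, the function `segment`, the add-one smoothing, the tiny training corpus, and the beam width are all assumptions chosen for brevity, and a toy bigram model stands in for the n-gram or recurrent neural network models of the article.

```python
import math
from collections import Counter


class CharBigramLM:
    """Toy character bigram model with add-one smoothing (hypothetical stand-in)."""

    def __init__(self, corpus):
        self.vocab = set(corpus) | {" "}
        self.bigrams = Counter(zip(corpus, corpus[1:]))  # counts of (prev, next)
        self.contexts = Counter(corpus[:-1])             # counts of prev alone

    def logprob(self, prev, ch):
        # Reserve one extra slot of probability mass for unseen characters.
        v = len(self.vocab) + 1
        return math.log((self.bigrams[(prev, ch)] + 1) / (self.contexts[prev] + v))


def segment(text, lm, beam_width=8):
    """Insert spaces into `text` (assumed to contain none) using beam search."""
    if not text:
        return ""
    # Each hypothesis is (log-probability, output produced so far).
    beam = [(lm.logprob(" ", text[0]), text[0])]
    for ch in text[1:]:
        candidates = []
        for score, out in beam:
            prev = out[-1]
            # Option 1: no boundary at the current position.
            candidates.append((score + lm.logprob(prev, ch), out + ch))
            # Option 2: place a boundary, then continue with the character.
            boundary = lm.logprob(prev, " ") + lm.logprob(" ", ch)
            candidates.append((score + boundary, out + " " + ch))
        # Keep only the most probable hypotheses.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    return beam[0][1]


lm = CharBigramLM("the cat sat on the mat the dog sat on the log")
print(segment("thecatsat", lm))
```

With so small a model the segmentation itself is not reliable. The sketch only shows the control flow: every hypothesis is extended in two ways per input character, and the beam width bounds how many hypotheses survive each step.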
Content
Cf.: https://onlinelibrary.wiley.com/doi/10.1002/asi.24082.
Theme
Computational linguistics

Similar documents (author)

  1. Cuesta, P.; Gómez, A.M.; Rodríguez, F.J.: Using agents for information retrieval (2003) 4.37
    4.3749547 = sum of:
      4.3749547 = sum of:
        1.7909348 = weight(author_txt:rodríguez in 2745) [ClassicSimilarity], result of:
          1.7909348 = score(doc=2745,freq=1.0), product of:
            0.6165822 = queryWeight, product of:
              7.7456436 = idf(docFreq=51, maxDocs=44218)
              0.07960374 = queryNorm
            2.9046164 = fieldWeight in 2745, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.7456436 = idf(docFreq=51, maxDocs=44218)
              0.375 = fieldNorm(doc=2745)
        2.58402 = weight(author_txt:gómez in 2745) [ClassicSimilarity], result of:
          2.58402 = score(doc=2745,freq=1.0), product of:
            0.7872906 = queryWeight, product of:
              1.1299833 = boost
              8.752448 = idf(docFreq=18, maxDocs=44218)
              0.07960374 = queryNorm
            3.282168 = fieldWeight in 2745, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.752448 = idf(docFreq=18, maxDocs=44218)
              0.375 = fieldNorm(doc=2745)
    
  2. Vilares, D.; Alonso, M.A.; Gómez-Rodríguez, C.: On the usefulness of lexical and syntactic processing in polarity classification of Twitter messages (2015) 4.37
    4.3749547 = sum of:
      4.3749547 = sum of:
        1.7909348 = weight(author_txt:rodríguez in 2161) [ClassicSimilarity], result of:
          1.7909348 = score(doc=2161,freq=1.0), product of:
            0.6165822 = queryWeight, product of:
              7.7456436 = idf(docFreq=51, maxDocs=44218)
              0.07960374 = queryNorm
            2.9046164 = fieldWeight in 2161, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.7456436 = idf(docFreq=51, maxDocs=44218)
              0.375 = fieldNorm(doc=2161)
        2.58402 = weight(author_txt:gómez in 2161) [ClassicSimilarity], result of:
          2.58402 = score(doc=2161,freq=1.0), product of:
            0.7872906 = queryWeight, product of:
              1.1299833 = boost
              8.752448 = idf(docFreq=18, maxDocs=44218)
              0.07960374 = queryNorm
            3.282168 = fieldWeight in 2161, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.752448 = idf(docFreq=18, maxDocs=44218)
              0.375 = fieldNorm(doc=2161)
    
  3. Olmeda-Gómez, C.; Perianes-Rodríguez, A.; Ovalle-Perandones, M.A.: Mapas de ciencias multidisciplinares : la biología molecular en la Comunidad de Madrid (2007) 3.65
    3.6457958 = sum of:
      3.6457958 = sum of:
        1.4924457 = weight(author_txt:rodríguez in 1120) [ClassicSimilarity], result of:
          1.4924457 = score(doc=1120,freq=1.0), product of:
            0.6165822 = queryWeight, product of:
              7.7456436 = idf(docFreq=51, maxDocs=44218)
              0.07960374 = queryNorm
            2.4205136 = fieldWeight in 1120, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              7.7456436 = idf(docFreq=51, maxDocs=44218)
              0.3125 = fieldNorm(doc=1120)
        2.15335 = weight(author_txt:gómez in 1120) [ClassicSimilarity], result of:
          2.15335 = score(doc=1120,freq=1.0), product of:
            0.7872906 = queryWeight, product of:
              1.1299833 = boost
              8.752448 = idf(docFreq=18, maxDocs=44218)
              0.07960374 = queryNorm
            2.73514 = fieldWeight in 1120, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = termFreq=1.0
              8.752448 = idf(docFreq=18, maxDocs=44218)
              0.3125 = fieldNorm(doc=1120)
    
  4. Gómez Prada, R. Gómez => Gómez Prada, R.: 2.24
    2.2378268 = sum of:
      2.2378268 = product of:
        4.4756536 = sum of:
          4.4756536 = weight(author_txt:gómez in 5119) [ClassicSimilarity], result of:
            4.4756536 = score(doc=5119,freq=3.0), product of:
              0.7872906 = queryWeight, product of:
                1.1299833 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.07960374 = queryNorm
              5.6848817 = fieldWeight in 5119, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.375 = fieldNorm(doc=5119)
        0.5 = coord(1/2)
    
  5. Gómez, C. Olmeda- -> Olmeda-Gómez, C.: 1.83
    1.827178 = sum of:
      1.827178 = product of:
        3.654356 = sum of:
          3.654356 = weight(author_txt:gómez in 7447) [ClassicSimilarity], result of:
            3.654356 = score(doc=7447,freq=2.0), product of:
              0.7872906 = queryWeight, product of:
                1.1299833 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.07960374 = queryNorm
              4.6416864 = fieldWeight in 7447, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.375 = fieldNorm(doc=7447)
        0.5 = coord(1/2)
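
The score breakdowns above can be checked by hand. The sketch below recomputes the 2.2378268 shown for entry 4 (doc 5119) from the factors listed in its explanation. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the tf and idf values printed in the listing; the function names `idf` and `term_score` are illustrative and not part of any library API.

```python
import math


def idf(doc_freq, max_docs):
    # Assumed formula; it reproduces 8.752448 for docFreq=18, maxDocs=44218.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, field_norm, query_norm, boost=1.0):
    tf = math.sqrt(freq)                      # 1.7320508 for freq=3.0
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight


# Factors copied from the explanation of "author_txt:gómez in 5119".
gomez = term_score(freq=3.0, doc_freq=18, max_docs=44218,
                   field_norm=0.375, query_norm=0.07960374, boost=1.1299833)

# Only one of the two query terms matched, hence coord(1/2) = 0.5.
total = gomez * (1 / 2)
print(total)
```

The result should agree with the listed score to about four decimal places; small differences beyond that are expected because the listing was evidently produced with single-precision arithmetic.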
    

Similar documents (content)

  1. Kwok, K.L.: Employing multiple representations for Chinese information retrieval (1999) 0.26
    0.26305026 = sum of:
      0.26305026 = product of:
        0.93946517 = sum of:
          0.073158465 = weight(abstract_txt:characters in 3773) [ClassicSimilarity], result of:
            0.073158465 = score(doc=3773,freq=2.0), product of:
              0.11216273 = queryWeight, product of:
                1.0284736 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.014778638 = queryNorm
              0.6522529 = fieldWeight in 3773, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.0625 = fieldNorm(doc=3773)
          0.027928233 = weight(abstract_txt:system in 3773) [ClassicSimilarity], result of:
            0.027928233 = score(doc=3773,freq=2.0), product of:
              0.09369602 = queryWeight, product of:
                1.8800068 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.014778638 = queryNorm
              0.2980728 = fieldWeight in 3773, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.0625 = fieldNorm(doc=3773)
          0.12866788 = weight(abstract_txt:gram in 3773) [ClassicSimilarity], result of:
            0.12866788 = score(doc=3773,freq=1.0), product of:
              0.25942126 = queryWeight, product of:
                2.2120078 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.014778638 = queryNorm
              0.49598044 = fieldWeight in 3773, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=3773)
          0.03766395 = weight(abstract_txt:language in 3773) [ClassicSimilarity], result of:
            0.03766395 = score(doc=3773,freq=1.0), product of:
              0.14409627 = queryWeight, product of:
                2.3314452 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.014778638 = queryNorm
              0.26138046 = fieldWeight in 3773, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0625 = fieldNorm(doc=3773)
          0.15103823 = weight(abstract_txt:character in 3773) [ClassicSimilarity], result of:
            0.15103823 = score(doc=3773,freq=2.0), product of:
              0.26228324 = queryWeight, product of:
                2.724048 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.014778638 = queryNorm
              0.57585925 = fieldWeight in 3773, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=3773)
          0.2729458 = weight(abstract_txt:segmentation in 3773) [ClassicSimilarity], result of:
            0.2729458 = score(doc=3773,freq=2.0), product of:
              0.38913193 = queryWeight, product of:
                3.3180118 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.014778638 = queryNorm
              0.7014223 = fieldWeight in 3773, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=3773)
          0.24806261 = weight(abstract_txt:word in 3773) [ClassicSimilarity], result of:
            0.24806261 = score(doc=3773,freq=4.0), product of:
              0.36510697 = queryWeight, product of:
                4.545216 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.014778638 = queryNorm
              0.67942446 = fieldWeight in 3773, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=3773)
        0.28 = coord(7/25)
    
  2. Wang, F.L.; Yang, C.C.: Mining Web data for Chinese segmentation (2007) 0.25
    0.25484043 = sum of:
      0.25484043 = product of:
        1.0618352 = sum of:
          0.052594118 = weight(abstract_txt:sequences in 604) [ClassicSimilarity], result of:
            0.052594118 = score(doc=604,freq=1.0), product of:
              0.11340711 = queryWeight, product of:
                1.034163 = boost
                7.4202213 = idf(docFreq=71, maxDocs=44218)
                0.014778638 = queryNorm
              0.46376383 = fieldWeight in 604, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.4202213 = idf(docFreq=71, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.0072348244 = weight(abstract_txt:this in 604) [ClassicSimilarity], result of:
            0.0072348244 = score(doc=604,freq=1.0), product of:
              0.047971964 = queryWeight, product of:
                1.3452178 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.014778638 = queryNorm
              0.1508136 = fieldWeight in 604, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.065235876 = weight(abstract_txt:language in 604) [ClassicSimilarity], result of:
            0.065235876 = score(doc=604,freq=3.0), product of:
              0.14409627 = queryWeight, product of:
                2.3314452 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.014778638 = queryNorm
              0.45272425 = fieldWeight in 604, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.15103823 = weight(abstract_txt:character in 604) [ClassicSimilarity], result of:
            0.15103823 = score(doc=604,freq=2.0), product of:
              0.26228324 = queryWeight, product of:
                2.724048 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.014778638 = queryNorm
              0.57585925 = fieldWeight in 604, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.6103254 = weight(abstract_txt:segmentation in 604) [ClassicSimilarity], result of:
            0.6103254 = score(doc=604,freq=10.0), product of:
              0.38913193 = queryWeight, product of:
                3.3180118 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.014778638 = queryNorm
              1.5684279 = fieldWeight in 604, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
          0.17540674 = weight(abstract_txt:word in 604) [ClassicSimilarity], result of:
            0.17540674 = score(doc=604,freq=2.0), product of:
              0.36510697 = queryWeight, product of:
                4.545216 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.014778638 = queryNorm
              0.48042563 = fieldWeight in 604, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=604)
        0.24 = coord(6/25)
    
  3. Peng, F.; Huang, X.: Machine learning for Asian language text classification (2007) 0.24
    0.23678936 = sum of:
      0.23678936 = product of:
        0.98662233 = sum of:
          0.010231586 = weight(abstract_txt:this in 831) [ClassicSimilarity], result of:
            0.010231586 = score(doc=831,freq=2.0), product of:
              0.047971964 = queryWeight, product of:
                1.3452178 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.014778638 = queryNorm
              0.21328263 = fieldWeight in 831, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.024462238 = weight(abstract_txt:model in 831) [ClassicSimilarity], result of:
            0.024462238 = score(doc=831,freq=1.0), product of:
              0.098186865 = queryWeight, product of:
                1.6666952 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.014778638 = queryNorm
              0.24913962 = fieldWeight in 831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.11790314 = weight(abstract_txt:boundary in 831) [ClassicSimilarity], result of:
            0.11790314 = score(doc=831,freq=1.0), product of:
              0.24474233 = queryWeight, product of:
                2.148515 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.014778638 = queryNorm
              0.48174396 = fieldWeight in 831, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.0753279 = weight(abstract_txt:language in 831) [ClassicSimilarity], result of:
            0.0753279 = score(doc=831,freq=4.0), product of:
              0.14409627 = queryWeight, product of:
                2.3314452 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.014778638 = queryNorm
              0.5227609 = fieldWeight in 831, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.51063484 = weight(abstract_txt:segmentation in 831) [ClassicSimilarity], result of:
            0.51063484 = score(doc=831,freq=7.0), product of:
              0.38913193 = queryWeight, product of:
                3.3180118 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.014778638 = queryNorm
              1.3122408 = fieldWeight in 831, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
          0.24806261 = weight(abstract_txt:word in 831) [ClassicSimilarity], result of:
            0.24806261 = score(doc=831,freq=4.0), product of:
              0.36510697 = queryWeight, product of:
                4.545216 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.014778638 = queryNorm
              0.67942446 = fieldWeight in 831, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=831)
        0.24 = coord(6/25)
    
  4. Lee, K.H.; Ng, M.K.M.; Lu, Q.: Text segmentation for Chinese spell checking (1999) 0.20
    0.20171827 = sum of:
      0.20171827 = product of:
        0.8404928 = sum of:
          0.05173085 = weight(abstract_txt:characters in 3913) [ClassicSimilarity], result of:
            0.05173085 = score(doc=3913,freq=1.0), product of:
              0.11216273 = queryWeight, product of:
                1.0284736 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.014778638 = queryNorm
              0.46121246 = fieldWeight in 3913, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.010231586 = weight(abstract_txt:this in 3913) [ClassicSimilarity], result of:
            0.010231586 = score(doc=3913,freq=2.0), product of:
              0.047971964 = queryWeight, product of:
                1.3452178 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.014778638 = queryNorm
              0.21328263 = fieldWeight in 3913, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.03766395 = weight(abstract_txt:language in 3913) [ClassicSimilarity], result of:
            0.03766395 = score(doc=3913,freq=1.0), product of:
              0.14409627 = queryWeight, product of:
                2.3314452 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.014778638 = queryNorm
              0.26138046 = fieldWeight in 3913, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.106800154 = weight(abstract_txt:character in 3913) [ClassicSimilarity], result of:
            0.106800154 = score(doc=3913,freq=1.0), product of:
              0.26228324 = queryWeight, product of:
                2.724048 = boost
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.014778638 = queryNorm
              0.407194 = fieldWeight in 3913, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.515104 = idf(docFreq=177, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.38600364 = weight(abstract_txt:segmentation in 3913) [ClassicSimilarity], result of:
            0.38600364 = score(doc=3913,freq=4.0), product of:
              0.38913193 = queryWeight, product of:
                3.3180118 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.014778638 = queryNorm
              0.9919609 = fieldWeight in 3913, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
          0.24806261 = weight(abstract_txt:word in 3913) [ClassicSimilarity], result of:
            0.24806261 = score(doc=3913,freq=4.0), product of:
              0.36510697 = queryWeight, product of:
                4.545216 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.014778638 = queryNorm
              0.67942446 = fieldWeight in 3913, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=3913)
        0.24 = coord(6/25)
    
  5. Yang, C.C.; Li, K.W.: A heuristic method based on a statistical approach for Chinese text segmentation (2005) 0.20
    0.19662298 = sum of:
      0.19662298 = product of:
        0.9831149 = sum of:
          0.05173085 = weight(abstract_txt:characters in 4580) [ClassicSimilarity], result of:
            0.05173085 = score(doc=4580,freq=1.0), product of:
              0.11216273 = queryWeight, product of:
                1.0284736 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.014778638 = queryNorm
              0.46121246 = fieldWeight in 4580, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.010231586 = weight(abstract_txt:this in 4580) [ClassicSimilarity], result of:
            0.010231586 = score(doc=4580,freq=2.0), product of:
              0.047971964 = queryWeight, product of:
                1.3452178 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.014778638 = queryNorm
              0.21328263 = fieldWeight in 4580, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.16674022 = weight(abstract_txt:boundary in 4580) [ClassicSimilarity], result of:
            0.16674022 = score(doc=4580,freq=2.0), product of:
              0.24474233 = queryWeight, product of:
                2.148515 = boost
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.014778638 = queryNorm
              0.68128884 = fieldWeight in 4580, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.7079034 = idf(docFreq=53, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.5790055 = weight(abstract_txt:segmentation in 4580) [ClassicSimilarity], result of:
            0.5790055 = score(doc=4580,freq=9.0), product of:
              0.38913193 = queryWeight, product of:
                3.3180118 = boost
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.014778638 = queryNorm
              1.4879413 = fieldWeight in 4580, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
          0.17540674 = weight(abstract_txt:word in 4580) [ClassicSimilarity], result of:
            0.17540674 = score(doc=4580,freq=2.0), product of:
              0.36510697 = queryWeight, product of:
                4.545216 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.014778638 = queryNorm
              0.48042563 = fieldWeight in 4580, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=4580)
        0.2 = coord(5/25)