Document (#28392)

Author
Li, K.W.
Yang, C.C.
Title
Automatic crosslingual thesaurus generated from the Hong Kong SAR Police Department Web Corpus for Crime Analysis
Source
Journal of the American Society for Information Science and Technology. 56(2005) no.3, pp. 272-281
Year
2005
Abstract
For the sake of national security, very large volumes of data and information are generated and gathered daily. Much of this data and information is written in different languages, stored in different locations, and may be seemingly unconnected. Crosslingual semantic interoperability is a major challenge to generate an overview of this disparate data and information so that it can be analyzed, shared, searched, and summarized. The recent terrorist attacks and the tragic events of September 11, 2001 have prompted increased attention on national security and criminal analysis. Many Asian countries and cities, such as Japan, Taiwan, and Singapore, have been advised that they may become the next targets of terrorist attacks. Semantic interoperability has been a focus in digital library research. Traditional information retrieval (IR) approaches normally require a document to share some common keywords with the query. Generating the associations for the related terms between the two term spaces of users and documents is an important issue. The problem can be viewed as the creation of a thesaurus. Apart from this, terrorists and criminals may communicate through letters, e-mails, and faxes in languages other than English. The translation ambiguity significantly exacerbates the retrieval problem. The problem is expanded to crosslingual semantic interoperability. In this paper, we focus on the English/Chinese crosslingual semantic interoperability problem. However, the developed techniques are not limited to English and Chinese languages but can be applied to many other languages. English and Chinese are popular languages in the Asian region. Much information about national security or crime is communicated in these languages. An efficient automatically generated thesaurus between these languages is important to crosslingual information retrieval between English and Chinese languages.
To facilitate crosslingual information retrieval, a corpus-based approach uses the term co-occurrence statistics in parallel or comparable corpora to construct a statistical translation model to cross the language boundary. In this paper, the text-based approach to align English/Chinese Hong Kong Police press release documents from the Web is first presented. We also introduce an algorithmic approach to generate a robust knowledge base based on statistical correlation analysis of the semantics (knowledge) embedded in the bilingual press release corpus. The research output consisted of a thesaurus-like, semantic network knowledge base, which can aid in semantics-based crosslingual information management and retrieval.
Footnote
Contribution to a special issue on: 'Intelligence and security informatics'
Theme
Multilingual problems
Design and application of the thesaurus principle
Semantic interoperability

Similar documents (author)

  1. Yang, S.C.: An interpretive and situated approach to an evaluation of Perseus digital libraries (2001) 4.50
    4.4981737 = sum of:
      4.4981737 = weight(author_txt:yang in 6933) [ClassicSimilarity], result of:
        4.4981737 = fieldWeight in 6933, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.625 = fieldNorm(doc=6933)
    
  2. Yang, K.: Information retrieval on the Web (2004) 4.50
    4.4981737 = sum of:
      4.4981737 = weight(author_txt:yang in 4278) [ClassicSimilarity], result of:
        4.4981737 = fieldWeight in 4278, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.625 = fieldNorm(doc=4278)
    
  3. Yang, C.C.: Content-based image retrieval : a comparison between query by example and image browsing map approaches (2005) 4.50
    4.4981737 = sum of:
      4.4981737 = weight(author_txt:yang in 4649) [ClassicSimilarity], result of:
        4.4981737 = fieldWeight in 4649, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.625 = fieldNorm(doc=4649)
    
  4. Salton, G.; Yang, C.S.: On the specification of term values in automatic indexing (1973) 3.60
    3.5985389 = sum of:
      3.5985389 = weight(author_txt:yang in 5476) [ClassicSimilarity], result of:
        3.5985389 = fieldWeight in 5476, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.5 = fieldNorm(doc=5476)
    
  5. Yang, Y.; Chute, C.G.A.: A schematic analysis of the Unified Medical Language System (1992) 3.60
    3.5985389 = sum of:
      3.5985389 = weight(author_txt:yang in 6445) [ClassicSimilarity], result of:
        3.5985389 = fieldWeight in 6445, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          7.1970778 = idf(docFreq=89, maxDocs=44218)
          0.5 = fieldNorm(doc=6445)
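
The author-match scores above can be reproduced by hand. The following is a minimal Python sketch, assuming the classic Lucene TF-IDF definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and are not part of Lucene's API.

```python
import math

def tf(freq):
    # Classic TF-IDF term frequency factor: square root of the raw count.
    return math.sqrt(freq)

def idf(doc_freq, max_docs):
    # Classic TF-IDF inverse document frequency (assumed formula).
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, as shown in the explain trees.
    return tf(freq) * idf(doc_freq, max_docs) * field_norm

# author_txt:yang, docFreq=89, maxDocs=44218 (values taken from the record)
w_norm_0625 = field_weight(1.0, 89, 44218, 0.625)  # e.g. doc 6933
w_norm_05 = field_weight(1.0, 89, 44218, 0.5)      # e.g. doc 5476

print(f"{w_norm_0625:.4f}")  # 4.4982
print(f"{w_norm_05:.4f}")    # 3.5985
```

The only factor that separates the 4.50 entries from the 3.60 entries is fieldNorm (0.625 versus 0.5); tf and idf are identical for all five author matches.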
    

Similar documents (content)
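
Each content-based score in this section is a sum of per-term scores scaled by a coordination factor. The following is a minimal Python sketch, assuming the classic Lucene formula score = queryWeight × fieldWeight with queryWeight = boost × idf × queryNorm. It is checked against the figures this record reports for the term "generate" in document 1616 and for that document's total. The function names are illustrative, not Lucene's API.

```python
import math

QUERY_NORM = 0.01828928  # queryNorm as reported in the explain output
MAX_DOCS = 44218         # maxDocs as reported in the explain output

def idf(doc_freq, max_docs=MAX_DOCS):
    # Classic TF-IDF inverse document frequency (assumed formula).
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(freq, doc_freq, boost, field_norm, query_norm=QUERY_NORM):
    # Per-term score = queryWeight * fieldWeight.
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

def coord(matched, total):
    # Fraction of query terms that the document matched.
    return matched / total

# abstract_txt:generate in doc 1616 (freq=1, docFreq=285)
generate_1616 = term_score(freq=1.0, doc_freq=285,
                           boost=1.0013845, field_norm=0.0390625)
print(f"{generate_1616:.5f}")  # 0.02611

# Document total = (sum of term scores) * coord(matched terms / query terms)
total_1616 = 1.1161813 * coord(17, 25)
print(f"{total_1616:.4f}")  # 0.7590
```

The coordination factor explains why a document with a larger raw sum can still rank lower: it is penalized for matching fewer of the 25 query terms.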

  1. Yang, C.C.; Luk, J.: Automatic generation of English/Chinese thesaurus based on a parallel corpus in laws (2003) 0.76
    0.7590033 = sum of:
      0.7590033 = product of:
        1.1161813 = sum of:
          0.005882915 = weight(abstract_txt:this in 1616) [ClassicSimilarity], result of:
            0.005882915 = score(doc=1616,freq=2.0), product of:
              0.044132352 = queryWeight, product of:
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.01828928 = queryNorm
              0.13330165 = fieldWeight in 1616, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.02610719 = weight(abstract_txt:generate in 1616) [ClassicSimilarity], result of:
            0.02610719 = score(doc=1616,freq=1.0), product of:
              0.11063659 = queryWeight, product of:
                1.0013845 = boost
                6.0408955 = idf(docFreq=285, maxDocs=44218)
                0.01828928 = queryNorm
              0.23597248 = fieldWeight in 1616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0408955 = idf(docFreq=285, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.027890544 = weight(abstract_txt:translation in 1616) [ClassicSimilarity], result of:
            0.027890544 = score(doc=1616,freq=1.0), product of:
              0.11561921 = queryWeight, product of:
                1.0236853 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.01828928 = queryNorm
              0.2412276 = fieldWeight in 1616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.05587394 = weight(abstract_txt:asian in 1616) [ClassicSimilarity], result of:
            0.05587394 = score(doc=1616,freq=1.0), product of:
              0.18373767 = queryWeight, product of:
                1.2904779 = boost
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.01828928 = queryNorm
              0.30409628 = fieldWeight in 1616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.051780473 = weight(abstract_txt:generated in 1616) [ClassicSimilarity], result of:
            0.051780473 = score(doc=1616,freq=3.0), product of:
              0.13862005 = queryWeight, product of:
                1.372809 = boost
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.01828928 = queryNorm
              0.37354246 = fieldWeight in 1616, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.103540786 = weight(abstract_txt:hong in 1616) [ClassicSimilarity], result of:
            0.103540786 = score(doc=1616,freq=2.0), product of:
              0.22001705 = queryWeight, product of:
                1.4121461 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.01828928 = queryNorm
              0.47060347 = fieldWeight in 1616, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.0732144 = weight(abstract_txt:kong in 1616) [ClassicSimilarity], result of:
            0.0732144 = score(doc=1616,freq=1.0), product of:
              0.22001705 = queryWeight, product of:
                1.4121461 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.01828928 = queryNorm
              0.33276692 = fieldWeight in 1616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.017572269 = weight(abstract_txt:retrieval in 1616) [ClassicSimilarity], result of:
            0.017572269 = score(doc=1616,freq=2.0), product of:
              0.09153361 = queryWeight, product of:
                1.4401634 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.01828928 = queryNorm
              0.19197613 = fieldWeight in 1616, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.021020874 = weight(abstract_txt:problem in 1616) [ClassicSimilarity], result of:
            0.021020874 = score(doc=1616,freq=1.0), product of:
              0.12064311 = queryWeight, product of:
                1.4788282 = boost
                4.460548 = idf(docFreq=1388, maxDocs=44218)
                0.01828928 = queryNorm
              0.17424016 = fieldWeight in 1616, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.460548 = idf(docFreq=1388, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.056980222 = weight(abstract_txt:corpus in 1616) [ClassicSimilarity], result of:
            0.056980222 = score(doc=1616,freq=2.0), product of:
              0.16913307 = queryWeight, product of:
                1.5163916 = boost
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.01828928 = queryNorm
              0.33689582 = fieldWeight in 1616, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.009505755 = weight(abstract_txt:information in 1616) [ClassicSimilarity], result of:
            0.009505755 = score(doc=1616,freq=2.0), product of:
              0.071076564 = queryWeight, product of:
                1.6052574 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.01828928 = queryNorm
              0.13373965 = fieldWeight in 1616, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.08639741 = weight(abstract_txt:thesaurus in 1616) [ClassicSimilarity], result of:
            0.08639741 = score(doc=1616,freq=7.0), product of:
              0.1618216 = queryWeight, product of:
                1.7127135 = boost
                5.1660094 = idf(docFreq=685, maxDocs=44218)
                0.01828928 = queryNorm
              0.53390527 = fieldWeight in 1616, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.1660094 = idf(docFreq=685, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.037505303 = weight(abstract_txt:semantic in 1616) [ClassicSimilarity], result of:
            0.037505303 = score(doc=1616,freq=2.0), product of:
              0.15173665 = queryWeight, product of:
                1.8542433 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.01828928 = queryNorm
              0.24717367 = fieldWeight in 1616, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.0776854 = weight(abstract_txt:interoperability in 1616) [ClassicSimilarity], result of:
            0.0776854 = score(doc=1616,freq=2.0), product of:
              0.2288855 = queryWeight, product of:
                2.0369277 = boost
                6.1439276 = idf(docFreq=257, maxDocs=44218)
                0.01828928 = queryNorm
              0.33940727 = fieldWeight in 1616, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1439276 = idf(docFreq=257, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.1482934 = weight(abstract_txt:chinese in 1616) [ClassicSimilarity], result of:
            0.1482934 = score(doc=1616,freq=4.0), product of:
              0.3011387 = queryWeight, product of:
                2.61219 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.01828928 = queryNorm
              0.4924422 = fieldWeight in 1616, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.18462549 = weight(abstract_txt:english in 1616) [ClassicSimilarity], result of:
            0.18462549 = score(doc=1616,freq=9.0), product of:
              0.2826264 = queryWeight, product of:
                2.772161 = boost
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.01828928 = queryNorm
              0.6532493 = fieldWeight in 1616, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
          0.13230482 = weight(abstract_txt:languages in 1616) [ClassicSimilarity], result of:
            0.13230482 = score(doc=1616,freq=4.0), product of:
              0.32641926 = queryWeight, product of:
                3.4400866 = boost
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.01828928 = queryNorm
              0.40532172 = fieldWeight in 1616, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1616)
        0.68 = coord(17/25)
    
  2. Yang, C.C.; Li, K.W.: Automatic construction of English/Chinese parallel corpora (2003) 0.63
    0.63084924 = sum of:
      0.63084924 = product of:
        1.2131717 = sum of:
          0.0058237887 = weight(abstract_txt:this in 1683) [ClassicSimilarity], result of:
            0.0058237887 = score(doc=1683,freq=1.0), product of:
              0.044132352 = queryWeight, product of:
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.01828928 = queryNorm
              0.1319619 = fieldWeight in 1683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.06763098 = weight(abstract_txt:translation in 1683) [ClassicSimilarity], result of:
            0.06763098 = score(doc=1683,freq=3.0), product of:
              0.11561921 = queryWeight, product of:
                1.0236853 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.01828928 = queryNorm
              0.58494586 = fieldWeight in 1683, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.06522902 = weight(abstract_txt:press in 1683) [ClassicSimilarity], result of:
            0.06522902 = score(doc=1683,freq=1.0), product of:
              0.16277982 = queryWeight, product of:
                1.2146517 = boost
                7.3274393 = idf(docFreq=78, maxDocs=44218)
                0.01828928 = queryNorm
              0.40071934 = fieldWeight in 1683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3274393 = idf(docFreq=78, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.10303288 = weight(abstract_txt:release in 1683) [ClassicSimilarity], result of:
            0.10303288 = score(doc=1683,freq=2.0), product of:
              0.17523219 = queryWeight, product of:
                1.260255 = boost
                7.602543 = idf(docFreq=59, maxDocs=44218)
                0.01828928 = queryNorm
              0.5879792 = fieldWeight in 1683, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.602543 = idf(docFreq=59, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.07822351 = weight(abstract_txt:asian in 1683) [ClassicSimilarity], result of:
            0.07822351 = score(doc=1683,freq=1.0), product of:
              0.18373767 = queryWeight, product of:
                1.2904779 = boost
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.01828928 = queryNorm
              0.42573476 = fieldWeight in 1683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.102500156 = weight(abstract_txt:hong in 1683) [ClassicSimilarity], result of:
            0.102500156 = score(doc=1683,freq=1.0), product of:
              0.22001705 = queryWeight, product of:
                1.4121461 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.01828928 = queryNorm
              0.4658737 = fieldWeight in 1683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.102500156 = weight(abstract_txt:kong in 1683) [ClassicSimilarity], result of:
            0.102500156 = score(doc=1683,freq=1.0), product of:
              0.22001705 = queryWeight, product of:
                1.4121461 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.01828928 = queryNorm
              0.4658737 = fieldWeight in 1683, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.024601175 = weight(abstract_txt:retrieval in 1683) [ClassicSimilarity], result of:
            0.024601175 = score(doc=1683,freq=2.0), product of:
              0.09153361 = queryWeight, product of:
                1.4401634 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.01828928 = queryNorm
              0.26876658 = fieldWeight in 1683, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.09770072 = weight(abstract_txt:corpus in 1683) [ClassicSimilarity], result of:
            0.09770072 = score(doc=1683,freq=3.0), product of:
              0.16913307 = queryWeight, product of:
                1.5163916 = boost
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.01828928 = queryNorm
              0.577656 = fieldWeight in 1683, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.016298974 = weight(abstract_txt:information in 1683) [ClassicSimilarity], result of:
            0.016298974 = score(doc=1683,freq=3.0), product of:
              0.071076564 = queryWeight, product of:
                1.6052574 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.01828928 = queryNorm
              0.22931573 = fieldWeight in 1683, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.20761076 = weight(abstract_txt:chinese in 1683) [ClassicSimilarity], result of:
            0.20761076 = score(doc=1683,freq=4.0), product of:
              0.3011387 = queryWeight, product of:
                2.61219 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.01828928 = queryNorm
              0.68941903 = fieldWeight in 1683, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.21104454 = weight(abstract_txt:english in 1683) [ClassicSimilarity], result of:
            0.21104454 = score(doc=1683,freq=6.0), product of:
              0.2826264 = queryWeight, product of:
                2.772161 = boost
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.01828928 = queryNorm
              0.7467262 = fieldWeight in 1683, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
          0.13097508 = weight(abstract_txt:languages in 1683) [ClassicSimilarity], result of:
            0.13097508 = score(doc=1683,freq=2.0), product of:
              0.32641926 = queryWeight, product of:
                3.4400866 = boost
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.01828928 = queryNorm
              0.40124804 = fieldWeight in 1683, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1683)
        0.52 = coord(13/25)
    
  3. Yang, C.C.; Lam, W.: Introduction to the special topic section on multilingual information systems (2006) 0.36
    0.36435974 = sum of:
      0.36435974 = product of:
        1.1386242 = sum of:
          0.008319698 = weight(abstract_txt:this in 5043) [ClassicSimilarity], result of:
            0.008319698 = score(doc=5043,freq=1.0), product of:
              0.044132352 = queryWeight, product of:
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.01828928 = queryNorm
              0.18851699 = fieldWeight in 5043, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.078125 = fieldNorm(doc=5043)
          0.05979094 = weight(abstract_txt:generated in 5043) [ClassicSimilarity], result of:
            0.05979094 = score(doc=5043,freq=1.0), product of:
              0.13862005 = queryWeight, product of:
                1.372809 = boost
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.01828928 = queryNorm
              0.43132967 = fieldWeight in 5043, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.078125 = fieldNorm(doc=5043)
          0.1464288 = weight(abstract_txt:hong in 5043) [ClassicSimilarity], result of:
            0.1464288 = score(doc=5043,freq=1.0), product of:
              0.22001705 = queryWeight, product of:
                1.4121461 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.01828928 = queryNorm
              0.66553384 = fieldWeight in 5043, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.078125 = fieldNorm(doc=5043)
          0.1464288 = weight(abstract_txt:kong in 5043) [ClassicSimilarity], result of:
            0.1464288 = score(doc=5043,freq=1.0), product of:
              0.22001705 = queryWeight, product of:
                1.4121461 = boost
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.01828928 = queryNorm
              0.66553384 = fieldWeight in 5043, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.518833 = idf(docFreq=23, maxDocs=44218)
                0.078125 = fieldNorm(doc=5043)
          0.030059837 = weight(abstract_txt:information in 5043) [ClassicSimilarity], result of:
            0.030059837 = score(doc=5043,freq=5.0), product of:
              0.071076564 = queryWeight, product of:
                1.6052574 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.01828928 = queryNorm
              0.42292193 = fieldWeight in 5043, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.078125 = fieldNorm(doc=5043)
          0.1482934 = weight(abstract_txt:chinese in 5043) [ClassicSimilarity], result of:
            0.1482934 = score(doc=5043,freq=1.0), product of:
              0.3011387 = queryWeight, product of:
                2.61219 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.01828928 = queryNorm
              0.4924422 = fieldWeight in 5043, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.078125 = fieldNorm(doc=5043)
          0.27522346 = weight(abstract_txt:english in 5043) [ClassicSimilarity], result of:
            0.27522346 = score(doc=5043,freq=5.0), product of:
              0.2826264 = queryWeight, product of:
                2.772161 = boost
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.01828928 = queryNorm
              0.9738066 = fieldWeight in 5043, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.078125 = fieldNorm(doc=5043)
          0.3240793 = weight(abstract_txt:languages in 5043) [ClassicSimilarity], result of:
            0.3240793 = score(doc=5043,freq=6.0), product of:
              0.32641926 = queryWeight, product of:
                3.4400866 = boost
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.01828928 = queryNorm
              0.99283147 = fieldWeight in 5043, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.078125 = fieldNorm(doc=5043)
        0.32 = coord(8/25)
    
  4. Foo, S.; Hui, S.C.; Lim, H.K.; Hui, L.: Automated thesaurus for enhanced Chinese text retrieval (2000) 0.34
    0.3369616 = sum of:
      0.3369616 = product of:
        0.842404 = sum of:
          0.0066557587 = weight(abstract_txt:this in 759) [ClassicSimilarity], result of:
            0.0066557587 = score(doc=759,freq=1.0), product of:
              0.044132352 = queryWeight, product of:
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.01828928 = queryNorm
              0.1508136 = fieldWeight in 759, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=759)
          0.041771505 = weight(abstract_txt:generate in 759) [ClassicSimilarity], result of:
            0.041771505 = score(doc=759,freq=1.0), product of:
              0.11063659 = queryWeight, product of:
                1.0013845 = boost
                6.0408955 = idf(docFreq=285, maxDocs=44218)
                0.01828928 = queryNorm
              0.37755597 = fieldWeight in 759, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0408955 = idf(docFreq=285, maxDocs=44218)
                0.0625 = fieldNorm(doc=759)
          0.0893983 = weight(abstract_txt:asian in 759) [ClassicSimilarity], result of:
            0.0893983 = score(doc=759,freq=1.0), product of:
              0.18373767 = queryWeight, product of:
                1.2904779 = boost
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.01828928 = queryNorm
              0.48655403 = fieldWeight in 759, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.0625 = fieldNorm(doc=759)
          0.06764573 = weight(abstract_txt:generated in 759) [ClassicSimilarity], result of:
            0.06764573 = score(doc=759,freq=2.0), product of:
              0.13862005 = queryWeight, product of:
                1.372809 = boost
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.01828928 = queryNorm
              0.4879938 = fieldWeight in 759, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.0625 = fieldNorm(doc=759)
          0.039761506 = weight(abstract_txt:retrieval in 759) [ClassicSimilarity], result of:
            0.039761506 = score(doc=759,freq=4.0), product of:
              0.09153361 = queryWeight, product of:
                1.4401634 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.01828928 = queryNorm
              0.43439242 = fieldWeight in 759, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=759)
          0.010754534 = weight(abstract_txt:information in 759) [ClassicSimilarity], result of:
            0.010754534 = score(doc=759,freq=1.0), product of:
              0.071076564 = queryWeight, product of:
                1.6052574 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.01828928 = queryNorm
              0.15130915 = fieldWeight in 759, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=759)
          0.116830625 = weight(abstract_txt:thesaurus in 759) [ClassicSimilarity], result of:
            0.116830625 = score(doc=759,freq=5.0), product of:
              0.1618216 = queryWeight, product of:
                1.7127135 = boost
                5.1660094 = idf(docFreq=685, maxDocs=44218)
                0.01828928 = queryNorm
              0.72197175 = fieldWeight in 759, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.1660094 = idf(docFreq=685, maxDocs=44218)
                0.0625 = fieldNorm(doc=759)
          0.2652753 = weight(abstract_txt:chinese in 759) [ClassicSimilarity], result of:
            0.2652753 = score(doc=759,freq=5.0), product of:
              0.3011387 = queryWeight, product of:
                2.61219 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.01828928 = queryNorm
              0.88090736 = fieldWeight in 759, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0625 = fieldNorm(doc=759)
          0.09846693 = weight(abstract_txt:english in 759) [ClassicSimilarity], result of:
            0.09846693 = score(doc=759,freq=1.0), product of:
              0.2826264 = queryWeight, product of:
                2.772161 = boost
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.01828928 = queryNorm
              0.34839964 = fieldWeight in 759, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.0625 = fieldNorm(doc=759)
          0.10584386 = weight(abstract_txt:languages in 759) [ClassicSimilarity], result of:
            0.10584386 = score(doc=759,freq=1.0), product of:
              0.32641926 = queryWeight, product of:
                3.4400866 = boost
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.01828928 = queryNorm
              0.32425737 = fieldWeight in 759, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.0625 = fieldNorm(doc=759)
        0.4 = coord(10/25)
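
Each `weight(...)` node above is the product of a `queryWeight` and a `fieldWeight`. As a minimal sketch, the following recomputes the `abstract_txt:chinese` node for doc 759 from the values printed in the tree. The function names are hypothetical. The idf formula `1 + ln(maxDocs / (docFreq + 1))` and `tf = sqrt(freq)` are assumptions inferred from the numbers shown (they match the classic TF-IDF similarity); the tree itself does not state them.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Assumed formula: reproduces idf(docFreq=219, maxDocs=44218) = 6.30326
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    # Assumed formula: tf(freq=5.0) = 2.236068 is the square root of 5
    return math.sqrt(freq)

# Values copied from the "chinese in 759" node
boost = 2.61219
query_norm = 0.01828928
field_norm = 0.0625
term_freq = 5.0

idf = classic_idf(doc_freq=219, max_docs=44218)

# queryWeight = boost * idf * queryNorm
query_weight = boost * idf * query_norm

# fieldWeight = tf * idf * fieldNorm
field_weight = classic_tf(term_freq) * idf * field_norm

# Node score = queryWeight * fieldWeight
score = query_weight * field_weight

print(f"idf={idf:.5f} queryWeight={query_weight:.7f} "
      f"fieldWeight={field_weight:.7f} score={score:.7f}")
```

The results should agree with the reported 0.3011387, 0.88090736 and 0.2652753 up to single-precision rounding, since the engine works in 32-bit floats and this sketch uses 64-bit floats.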
    
  5. Li, K.W.; Yang, C.C.: Conceptual analysis of parallel corpus collected from the Web (2006) 0.32
    0.32472566 = sum of:
      0.32472566 = product of:
        0.9020157 = sum of:
          0.0066557587 = weight(abstract_txt:this in 5051) [ClassicSimilarity], result of:
            0.0066557587 = score(doc=5051,freq=1.0), product of:
              0.044132352 = queryWeight, product of:
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.01828928 = queryNorm
              0.1508136 = fieldWeight in 5051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.044624873 = weight(abstract_txt:translation in 5051) [ClassicSimilarity], result of:
            0.044624873 = score(doc=5051,freq=1.0), product of:
              0.11561921 = queryWeight, product of:
                1.0236853 = boost
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.01828928 = queryNorm
              0.38596416 = fieldWeight in 5051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1754265 = idf(docFreq=249, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.0893983 = weight(abstract_txt:asian in 5051) [ClassicSimilarity], result of:
            0.0893983 = score(doc=5051,freq=1.0), product of:
              0.18373767 = queryWeight, product of:
                1.2904779 = boost
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.01828928 = queryNorm
              0.48655403 = fieldWeight in 5051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7848644 = idf(docFreq=49, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.019880753 = weight(abstract_txt:retrieval in 5051) [ClassicSimilarity], result of:
            0.019880753 = score(doc=5051,freq=1.0), product of:
              0.09153361 = queryWeight, product of:
                1.4401634 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.01828928 = queryNorm
              0.21719621 = fieldWeight in 5051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.06446576 = weight(abstract_txt:corpus in 5051) [ClassicSimilarity], result of:
            0.06446576 = score(doc=5051,freq=1.0), product of:
              0.16913307 = queryWeight, product of:
                1.5163916 = boost
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.01828928 = queryNorm
              0.3811541 = fieldWeight in 5051, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0984654 = idf(docFreq=269, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.0186274 = weight(abstract_txt:information in 5051) [ClassicSimilarity], result of:
            0.0186274 = score(doc=5051,freq=3.0), product of:
              0.071076564 = queryWeight, product of:
                1.6052574 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.01828928 = queryNorm
              0.26207513 = fieldWeight in 5051, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.20548135 = weight(abstract_txt:chinese in 5051) [ClassicSimilarity], result of:
            0.20548135 = score(doc=5051,freq=3.0), product of:
              0.3011387 = queryWeight, product of:
                2.61219 = boost
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.01828928 = queryNorm
              0.6823479 = fieldWeight in 5051, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.30326 = idf(docFreq=219, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.24119374 = weight(abstract_txt:english in 5051) [ClassicSimilarity], result of:
            0.24119374 = score(doc=5051,freq=6.0), product of:
              0.2826264 = queryWeight, product of:
                2.772161 = boost
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.01828928 = queryNorm
              0.85340136 = fieldWeight in 5051, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.574394 = idf(docFreq=455, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
          0.21168771 = weight(abstract_txt:languages in 5051) [ClassicSimilarity], result of:
            0.21168771 = score(doc=5051,freq=4.0), product of:
              0.32641926 = queryWeight, product of:
                3.4400866 = boost
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.01828928 = queryNorm
              0.64851475 = fieldWeight in 5051, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.0625 = fieldNorm(doc=5051)
        0.36 = coord(9/25)
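
The headline score of entry 5 (0.32) is the sum of its nine term scores multiplied by the coordination factor `coord(9/25)`. A minimal sketch of that aggregation follows, using the boost, idf and termFreq values printed for doc 5051. The helper names are hypothetical, and `tf = sqrt(freq)` is an assumption inferred from the tree. The `this` node has no boost line, so a boost of 1.0 is assumed, which is consistent with its reported `queryWeight`.

```python
import math

QUERY_NORM = 0.01828928   # queryNorm, shared by all terms
FIELD_NORM = 0.0625       # fieldNorm(doc=5051)

# (term, boost, idf, termFreq) copied from the entry for doc 5051
TERMS = [
    ("this",        1.0,       2.4130175, 1.0),  # boost assumed 1.0
    ("translation", 1.0236853, 6.1754265, 1.0),
    ("asian",       1.2904779, 7.7848644, 1.0),
    ("retrieval",   1.4401634, 3.4751394, 1.0),
    ("corpus",      1.5163916, 6.0984654, 1.0),
    ("information", 1.6052574, 2.4209464, 3.0),
    ("chinese",     2.61219,   6.30326,   3.0),
    ("english",     2.772161,  5.574394,  6.0),
    ("languages",   3.4400866, 5.188118,  4.0),
]

def term_score(boost, idf, freq):
    """queryWeight * fieldWeight for one matching term."""
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight

def document_score(terms, total_query_terms):
    """Sum of term scores scaled by coord = matched / total query terms."""
    raw_sum = sum(term_score(b, i, f) for _, b, i, f in terms)
    coord = len(terms) / total_query_terms
    return raw_sum, coord, raw_sum * coord

raw_sum, coord, final = document_score(TERMS, total_query_terms=25)
print(f"sum={raw_sum:.7f} coord={coord:.2f} score={final:.8f}")
```

Up to rounding, this should reproduce the reported sum 0.9020157, coord 0.36 and final score 0.32472566. The coord factor is why entry 5 ranks where it does: only 9 of the 25 query terms match, so its raw sum is scaled down by 0.36.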