Search (2 results, page 1 of 1)

  • × author_ss:"Chen, J."
  • × theme_ss:"Computerlinguistik"
  1. Chen, J.: ¬A lexical knowledge base approach for English-Chinese cross-language information retrieval (2006) 0.01
    0.006203569 = product of:
      0.015508923 = sum of:
        0.0076151006 = weight(_text_:a in 4923) [ClassicSimilarity], result of:
          0.0076151006 = score(doc=4923,freq=10.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.14243183 = fieldWeight in 4923, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4923)
        0.007893822 = product of:
          0.015787644 = sum of:
            0.015787644 = weight(_text_:information in 4923) [ClassicSimilarity], result of:
              0.015787644 = score(doc=4923,freq=8.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.19395474 = fieldWeight in 4923, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4923)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    This study proposes and explores a natural language processing- (NLP) based strategy to address out-ofdictionary and vocabulary mismatch problems in query translation based English-Chinese Cross-Language Information Retrieval (EC-CLIR). The strategy, named the LKB approach, is to construct a lexical knowledge base (LKB) and to use it for query translation. In this article, the author describes the LKB construction process, which customizes available translation resources based an the document collection of the EC-CLIR system. The evaluation shows that the LKB approach is very promising. It consistently increased the percentage of correct translations and decreased the percentage of missing translations in addition to effectively detecting the vocabulary gap between the document collection and the translation resource of the system. The comparative analysis of the top EC-CLIR results using the LKB and two other translation resources demonstrates that the LKB approach has produced significant improvement in EC-CLIR performance compared to performance using the original translation resource without customization. It has also achieved the same level of performance as a sophisticated machine translation system. The study concludes that the LKB approach has the potential to be an empirical model for developing real-world CLIR systems. Linguistic knowledge and NLP techniques, if appropriately used, can improve the effectiveness of English-Chinese crosslanguage information retrieval.
    Source
    Journal of the American Society for Information Science and Technology. 57(2006) no.2, S.233-243
    Type
    a
  2. Reyes Ayala, B.; Knudson, R.; Chen, J.; Cao, G.; Wang, X.: Metadata records machine translation combining multi-engine outputs with limited parallel data (2018) 0.00
    0.003594941 = product of:
      0.008987352 = sum of:
        0.0034055763 = weight(_text_:a in 4010) [ClassicSimilarity], result of:
          0.0034055763 = score(doc=4010,freq=2.0), product of:
            0.053464882 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046368346 = queryNorm
            0.06369744 = fieldWeight in 4010, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4010)
        0.0055817757 = product of:
          0.011163551 = sum of:
            0.011163551 = weight(_text_:information in 4010) [ClassicSimilarity], result of:
              0.011163551 = score(doc=4010,freq=4.0), product of:
                0.08139861 = queryWeight, product of:
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.046368346 = queryNorm
                0.13714671 = fieldWeight in 4010, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.7554779 = idf(docFreq=20772, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4010)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    One way to facilitate Multilingual Information Access (MLIA) for digital libraries is to generate multilingual metadata records by applying Machine Translation (MT) techniques. Current online MT services are available and affordable, but are not always effective for creating multilingual metadata records. In this study, we implemented 3 different MT strategies and evaluated their performance when translating English metadata records to Chinese and Spanish. These strategies included combining MT results from 3 online MT systems (Google, Bing, and Yahoo!) with and without additional linguistic resources, such as manually-generated parallel corpora, and metadata records in the two target languages obtained from international partners. The open-source statistical MT platform Moses was applied to design and implement the three translation strategies. Human evaluation of the MT results using adequacy and fluency demonstrated that two of the strategies produced higher quality translations than individual online MT systems for both languages. Especially, adding small, manually-generated parallel corpora of metadata records significantly improved translation performance. Our study suggested an effective and efficient MT approach for providing multilingual services for digital collections.
    Source
    Journal of the Association for Information Science and Technology. 69(2018) no.1, S.47-59
    Type
    a