Search (207 results, page 1 of 11)

Peters, C.; Braschler, M.; Clough, P.: Multilingual information retrieval : from research to practice (2012) 0.03
```
0.031534832 = product of:
  0.1103719 = sum of:
    0.025709987 = weight(_text_:wide in 361) [ClassicSimilarity], result of:
      0.025709987 = score(doc=361,freq=2.0), product of:
        0.1312982 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029633347 = queryNorm
        0.1958137 = fieldWeight in 361, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.03125 = fieldNorm(doc=361)
    0.029287368 = weight(_text_:elektronische in 361) [ClassicSimilarity], result of:
      0.029287368 = score(doc=361,freq=2.0), product of:
        0.14013545 = queryWeight, product of:
          4.728978 = idf(docFreq=1061, maxDocs=44218)
          0.029633347 = queryNorm
        0.20899329 = fieldWeight in 361, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.728978 = idf(docFreq=1061, maxDocs=44218)
          0.03125 = fieldNorm(doc=361)
    0.015630832 = weight(_text_:information in 361) [ClassicSimilarity], result of:
      0.015630832 = score(doc=361,freq=30.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.3004734 = fieldWeight in 361, product of:
          5.477226 = tf(freq=30.0), with freq of:
            30.0 = termFreq=30.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=361)
    0.03974371 = weight(_text_:retrieval in 361) [ClassicSimilarity], result of:
      0.03974371 = score(doc=361,freq=22.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.44337842 = fieldWeight in 361, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=361)
  0.2857143 = coord(4/14)
```
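The score tree above can be recomputed by hand. The following is a minimal sketch, assuming the formulas that the printed figures are consistent with: `tf = sqrt(freq)`, `idf = 1 + ln(maxDocs / (docFreq + 1))`, `queryWeight = idf * queryNorm`, `fieldWeight = tf * idf * fieldNorm`, with the sum over matched terms scaled by `coord`. The function and variable names are illustrative and are not part of the search system. The constants are copied from the explanation for doc 361.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as it appears in the explain output."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one matched term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                  # tf(freq) = sqrt(termFreq)
    query_weight = idf * query_norm       # depends only on the query and index
    field_weight = tf * idf * field_norm  # depends on the document field
    return query_weight * field_weight

# Values copied from the explanation for doc 361.
MAX_DOCS = 44218
QUERY_NORM = 0.029633347
FIELD_NORM = 0.03125

# term -> (termFreq in the document, docFreq in the index)
matched = {
    "wide": (2.0, 1430),
    "elektronische": (2.0, 1061),
    "information": (30.0, 20772),
    "retrieval": (22.0, 5836),
}

term_scores = {
    term: term_score(freq, doc_freq, MAX_DOCS, QUERY_NORM, FIELD_NORM)
    for term, (freq, doc_freq) in matched.items()
}

# coord(4/14): 4 of the 14 query clauses matched this document.
coord = len(matched) / 14
total = coord * sum(term_scores.values())

for term, score in term_scores.items():
    print(f"{term}: {score:.9f}")
print(f"total: {total:.9f}")
```

Small differences in the last digits are to be expected, because the explain output is printed from single-precision values while this sketch uses double precision.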
Abstract

We are living in a multilingual world and the diversity in languages which are used to interact with information access systems has generated a wide variety of challenges to be addressed by computer and information scientists. The growing amount of non-English information accessible globally and the increased worldwide exposure of enterprises also necessitate the adaptation of Information Retrieval (IR) methods to new, multilingual settings. Peters, Braschler and Clough present a comprehensive description of the technologies involved in designing and developing systems for Multilingual Information Retrieval (MLIR). They provide readers with broad coverage of the various issues involved in creating systems to make accessible digitally stored materials regardless of the language(s) they are written in. Details on Cross-Language Information Retrieval (CLIR) are also covered that help readers to understand how to develop retrieval systems that cross language boundaries. Their work is divided into six chapters and accompanies the reader step-by-step through the various stages involved in building, using and evaluating MLIR systems. The book concludes with some examples of recent applications that utilise MLIR technologies. Some of the techniques described have recently started to appear in commercial search systems, while others have the potential to be part of future incarnations. The book is intended for graduate students, scholars, and practitioners with a basic understanding of classical text retrieval methods. It offers guidelines and information on all aspects that need to be taken into consideration when building MLIR systems, while avoiding too many 'hands-on details' that could rapidly become obsolete. Thus it bridges the gap between the material covered by most of the classical IR textbooks and the novel requirements related to the acquisition and dissemination of information in whatever language it is stored.

Content

Contents: 1 Introduction; 2 Within-Language Information Retrieval; 3 Cross-Language Information Retrieval; 4 Interaction and User Interfaces; 5 Evaluation for Multilingual Information Retrieval Systems; 6 Applications of Multilingual Information Access

Footnote

Elektronische Ausgabe unter: http://springer.r.delivery.net/r/r?2.1.Ee.2Tp.1gd0L5.C3WE8i..N.WdtG.3uq2.bW89MQ%5f%5fCXWIFOJ0.

RSWK

Information-Retrieval-System / Mehrsprachigkeit / Abfrage / Zugriff

Subject

Information-Retrieval-System / Mehrsprachigkeit / Abfrage / Zugriff

Li, K.W.; Yang, C.C.: Conceptual analysis of parallel corpus collected from the Web (2006) 0.03

0.0291164 = product of:
  0.101907395 = sum of:
    0.045449268 = weight(_text_:wide in 5051) [ClassicSimilarity], result of:
      0.045449268 = score(doc=5051,freq=4.0), product of:
        0.1312982 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029633347 = queryNorm
        0.34615302 = fieldWeight in 5051, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5051)
    0.03019857 = weight(_text_:web in 5051) [ClassicSimilarity], result of:
      0.03019857 = score(doc=5051,freq=6.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.3122631 = fieldWeight in 5051, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5051)
    0.011280581 = weight(_text_:information in 5051) [ClassicSimilarity], result of:
      0.011280581 = score(doc=5051,freq=10.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.21684799 = fieldWeight in 5051, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5051)
    0.014978974 = weight(_text_:retrieval in 5051) [ClassicSimilarity], result of:
      0.014978974 = score(doc=5051,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.16710453 = fieldWeight in 5051, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5051)
  0.2857143 = coord(4/14)

Abstract: As illustrated by the World Wide Web, the volume of information in languages other than English has grown significantly in recent years. This highlights the importance of multilingual corpora. Much effort has been devoted to the compilation of multilingual corpora for the purpose of cross-lingual information retrieval and machine translation. Existing parallel corpora mostly involve European languages, such as English-French and English-Spanish. There is still a lack of parallel corpora between European languages and Asian languages. In the authors' previous work, an alignment method to identify one-to-one Chinese and English title pairs was developed to construct an English-Chinese parallel corpus that works automatically from the World Wide Web, and 100% precision and 87% recall were obtained. Careful analysis of these results has helped the authors to understand how the alignment method can be improved. A conceptual analysis was conducted, which includes the analysis of conceptual equivalent and conceptual information alternation in the aligned and nonaligned English-Chinese title pairs that are obtained by the alignment method. The result of the analysis not only reflects the characteristics of parallel corpora, but also gives insight into the strengths and weaknesses of the alignment method. In particular, conceptual alternation, such as omission and addition, is found to have a significant impact on the performance of the alignment method.
Footnote: Contribution to a special topic section on multilingual information systems
Source: Journal of the American Society for Information Science and Technology. 57(2006) no.5, S.632-644

Talvensaari, T.; Juhola, M.; Laurikkala, J.; Järvelin, K.: Corpus-based cross-language information retrieval in retrieval of highly relevant documents (2007) 0.03

0.025605785 = product of:
  0.08962024 = sum of:
    0.032137483 = weight(_text_:wide in 139) [ClassicSimilarity], result of:
      0.032137483 = score(doc=139,freq=2.0), product of:
        0.1312982 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029633347 = queryNorm
        0.24476713 = fieldWeight in 139, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=139)
    0.017435152 = weight(_text_:web in 139) [ClassicSimilarity], result of:
      0.017435152 = score(doc=139,freq=2.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.18028519 = fieldWeight in 139, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=139)
    0.010089659 = weight(_text_:information in 139) [ClassicSimilarity], result of:
      0.010089659 = score(doc=139,freq=8.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.19395474 = fieldWeight in 139, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=139)
    0.029957948 = weight(_text_:retrieval in 139) [ClassicSimilarity], result of:
      0.029957948 = score(doc=139,freq=8.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.33420905 = fieldWeight in 139, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=139)
  0.2857143 = coord(4/14)

Abstract: Information retrieval systems' ability to retrieve highly relevant documents has become more and more important in the age of extremely large collections, such as the World Wide Web (WWW). The authors' aim was to find out how corpus-based cross-language information retrieval (CLIR) manages in retrieving highly relevant documents. They created a Finnish-Swedish comparable corpus from two loosely related document collections and used it as a source of knowledge for query translation. Finnish test queries were translated into Swedish and run against a Swedish test collection. Graded relevance assessments were used in evaluating the results and three relevance criterion levels (liberal, regular, and stringent) were applied. The runs were also evaluated with generalized recall and precision, which weight the retrieved documents according to their relevance level. The performance of the Comparable Corpus Translation system (COCOT) was compared to that of a dictionary-based query translation program; the two translation methods were also combined. The results indicate that corpus-based CLIR performs particularly well with highly relevant documents. In average precision, COCOT even matched the monolingual baseline on the highest relevance level. The performance of the different query translation methods was further analyzed by finding out reasons for poor rankings of highly relevant documents.
Source: Journal of the American Society for Information Science and Technology. 58(2007) no.3, S.322-334

Larkey, L.S.; Connell, M.E.: Structured queries, language modelling, and relevance modelling in cross-language information retrieval (2005) 0.02

0.022149958 = product of:
  0.07752485 = sum of:
    0.032137483 = weight(_text_:wide in 1022) [ClassicSimilarity], result of:
      0.032137483 = score(doc=1022,freq=2.0), product of:
        0.1312982 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029633347 = queryNorm
        0.24476713 = fieldWeight in 1022, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1022)
    0.008737902 = weight(_text_:information in 1022) [ClassicSimilarity], result of:
      0.008737902 = score(doc=1022,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.16796975 = fieldWeight in 1022, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1022)
    0.029957948 = weight(_text_:retrieval in 1022) [ClassicSimilarity], result of:
      0.029957948 = score(doc=1022,freq=8.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.33420905 = fieldWeight in 1022, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1022)
    0.0066915164 = product of:
      0.020074548 = sum of:
        0.020074548 = weight(_text_:22 in 1022) [ClassicSimilarity], result of:
          0.020074548 = score(doc=1022,freq=2.0), product of:
            0.103770934 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029633347 = queryNorm
            0.19345059 = fieldWeight in 1022, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1022)
      0.33333334 = coord(1/3)
  0.2857143 = coord(4/14)
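
The explanation for doc 1022 contains a nested clause: the match on `22` sits inside its own `product of` / `sum of` group with `coord(1/3)`, and that product is then one of the addends of the top-level sum. A minimal sketch of how such a tree is evaluated follows, assuming each group is "sum of its children, times matched/total". The tuple layout and the `evaluate` helper are hypothetical and serve only as illustration. The leaf values are copied from the explanation.

```python
def evaluate(node):
    """Evaluate a score tree.

    A leaf is a float (an already computed term score).
    A group is a tuple (children, matched_clauses, total_clauses) and
    evaluates to sum(children) * coord(matched/total).
    """
    if isinstance(node, float):
        return node
    children, matched, total = node
    return sum(evaluate(child) for child in children) * matched / total

# Nested clause: only the term "22" matched, 1 of 3 sub-clauses.
nested_clause = ([0.020074548], 1, 3)

# Top level for doc 1022: three term matches plus the nested clause,
# 4 of 14 top-level clauses matched.
doc_1022 = (
    [
        0.032137483,   # wide
        0.008737902,   # information
        0.029957948,   # retrieval
        nested_clause,
    ],
    4,
    14,
)

nested_score = evaluate(nested_clause)
final_score = evaluate(doc_1022)
print(f"nested clause: {nested_score:.10f}")
print(f"final score:   {final_score:.9f}")
```

Because the nested group is scaled by 1/3 before it enters the outer sum, the `22` match contributes far less to the final score than a direct top-level term match with the same weight would.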

Abstract: Two probabilistic approaches to cross-lingual retrieval are in wide use today, those based on probabilistic models of relevance, as exemplified by INQUERY, and those based on language modeling. INQUERY, as a query net model, allows the easy incorporation of query operators, including a synonym operator, which has proven to be extremely useful in cross-language information retrieval (CLIR), in an approach often called structured query translation. In contrast, language models incorporate translation probabilities into a unified framework. We compare the two approaches on Arabic and Spanish data sets, using two kinds of bilingual dictionaries--one derived from a conventional dictionary, and one derived from a parallel corpus. We find that structured query processing gives slightly better results when queries are not expanded. On the other hand, when queries are expanded, language modeling gives better results, but only when using a probabilistic dictionary derived from a parallel corpus. We pursue two additional issues inherent in the comparison of structured query processing with language modeling. The first concerns query expansion, and the second is the role of translation probabilities. We compare conventional expansion techniques (pseudo-relevance feedback) with relevance modeling, a new IR approach which fits into the formal framework of language modeling. We find that relevance modeling and pseudo-relevance feedback achieve comparable levels of retrieval and that good translation probabilities confer a small but significant advantage.
Date: 26.12.2007 20:22:11
Source: Information processing and management. 41(2005) no.3, S.457-474

Cao, L.; Leong, M.-K.; Low, H.-B.: Searching heterogeneous multilingual bibliographic sources (1998) 0.02

0.021766637 = product of:
  0.10157764 = sum of:
    0.051419973 = weight(_text_:wide in 3564) [ClassicSimilarity], result of:
      0.051419973 = score(doc=3564,freq=2.0), product of:
        0.1312982 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029633347 = queryNorm
        0.3916274 = fieldWeight in 3564, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0625 = fieldNorm(doc=3564)
    0.039451245 = weight(_text_:web in 3564) [ClassicSimilarity], result of:
      0.039451245 = score(doc=3564,freq=4.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.4079388 = fieldWeight in 3564, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0625 = fieldNorm(doc=3564)
    0.010706427 = product of:
      0.032119278 = sum of:
        0.032119278 = weight(_text_:22 in 3564) [ClassicSimilarity], result of:
          0.032119278 = score(doc=3564,freq=2.0), product of:
            0.103770934 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029633347 = queryNorm
            0.30952093 = fieldWeight in 3564, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=3564)
      0.33333334 = coord(1/3)
  0.21428572 = coord(3/14)

Abstract: Proposes a Web-based architecture for searching distributed heterogeneous multi-Asian language bibliographic sources, and describes a successful pilot implementation of the system at the Chinese Library (CLib) system developed in Singapore and tested at 2 university libraries and a public library.
Date: 1. 8.1996 22:08:06
Footnote: Contribution to a special issue devoted to the Proceedings of the 7th International World Wide Web Conference, held 14-18 April 1998, Brisbane, Australia

Freitas-Junior, H.R.; Ribeiro-Neto, B.A.; Freitas-Vale, R. de; Laender, A.H.F.; Lima, L.R.S. de: Categorization-driven cross-language retrieval of medical information (2006) 0.02

0.02090708 = product of:
  0.073174775 = sum of:
    0.017435152 = weight(_text_:web in 5282) [ClassicSimilarity], result of:
      0.017435152 = score(doc=5282,freq=2.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.18028519 = fieldWeight in 5282, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5282)
    0.012357258 = weight(_text_:information in 5282) [ClassicSimilarity], result of:
      0.012357258 = score(doc=5282,freq=12.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.23754507 = fieldWeight in 5282, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5282)
    0.036690846 = weight(_text_:retrieval in 5282) [ClassicSimilarity], result of:
      0.036690846 = score(doc=5282,freq=12.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.40932083 = fieldWeight in 5282, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5282)
    0.0066915164 = product of:
      0.020074548 = sum of:
        0.020074548 = weight(_text_:22 in 5282) [ClassicSimilarity], result of:
          0.020074548 = score(doc=5282,freq=2.0), product of:
            0.103770934 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029633347 = queryNorm
            0.19345059 = fieldWeight in 5282, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5282)
      0.33333334 = coord(1/3)
  0.2857143 = coord(4/14)

Abstract: The Web has become a large repository of documents (or pages) written in many different languages. In this context, traditional information retrieval (IR) techniques cannot be used whenever the user query and the documents being retrieved are in different languages. To address this problem, new cross-language information retrieval (CLIR) techniques have been proposed. In this work, we describe a method for cross-language retrieval of medical information. This method combines query terms and related medical concepts obtained automatically through a categorization procedure. The medical concepts are used to create a linguistic abstraction that allows retrieval of information in a language-independent way, minimizing linguistic problems such as polysemy. To evaluate our method, we carried out experiments using the OHSUMED test collection, whose documents are written in English, with queries expressed in Portuguese, Spanish, and French. The results indicate that our cross-language retrieval method is as effective as a standard vector space model algorithm operating on queries and documents in the same language. Further, our results are better than previous results in the literature.
Date: 22. 7.2006 16:46:36
Source: Journal of the American Society for Information Science and Technology. 57(2006) no.4, S.501-510

Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.02

0.018879574 = product of:
  0.06607851 = sum of:
    0.029588435 = weight(_text_:web in 4436) [ClassicSimilarity], result of:
      0.029588435 = score(doc=4436,freq=4.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.3059541 = fieldWeight in 4436, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=4436)
    0.0104854815 = weight(_text_:information in 4436) [ClassicSimilarity], result of:
      0.0104854815 = score(doc=4436,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.20156369 = fieldWeight in 4436, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=4436)
    0.01797477 = weight(_text_:retrieval in 4436) [ClassicSimilarity], result of:
      0.01797477 = score(doc=4436,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.20052543 = fieldWeight in 4436, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=4436)
    0.008029819 = product of:
      0.024089456 = sum of:
        0.024089456 = weight(_text_:22 in 4436) [ClassicSimilarity], result of:
          0.024089456 = score(doc=4436,freq=2.0), product of:
            0.103770934 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029633347 = queryNorm
            0.23214069 = fieldWeight in 4436, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
      0.33333334 = coord(1/3)
  0.2857143 = coord(4/14)

Abstract: The language barrier is the major problem that people face in searching for, retrieving, and understanding multilingual collections on the Internet. This paper deals with query translation and document translation in a Chinese-English information retrieval system called MTIR. Bilingual dictionary and monolingual corpus-based approaches are adopted to select suitable translated query terms. A machine transliteration algorithm is introduced to resolve proper name searching. We consider several design issues for document translation, including which material is translated, what roles the HTML tags play in translation, what the tradeoff is between the speed performance and the translation performance, and what form the translated result is presented in. About 100,000 Web pages translated in the last 4 months of 1997 are used for quantitative study of online and real-time Web page translation.
Date: 16. 2.2000 14:22:39
Source: Journal of the American Society for Information Science. 51(2000) no.3, S.281-296

Peters, C.; Picchi, E.: Across languages, across cultures : issues in multilinguality and digital libraries (1997) 0.02

0.018600317 = product of:
  0.08680148 = sum of:
    0.051419973 = weight(_text_:wide in 1233) [ClassicSimilarity], result of:
      0.051419973 = score(doc=1233,freq=2.0), product of:
        0.1312982 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029633347 = queryNorm
        0.3916274 = fieldWeight in 1233, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0625 = fieldNorm(doc=1233)
    0.011415146 = weight(_text_:information in 1233) [ClassicSimilarity], result of:
      0.011415146 = score(doc=1233,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.21943474 = fieldWeight in 1233, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0625 = fieldNorm(doc=1233)
    0.023966359 = weight(_text_:retrieval in 1233) [ClassicSimilarity], result of:
      0.023966359 = score(doc=1233,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.26736724 = fieldWeight in 1233, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=1233)
  0.21428572 = coord(3/14)

Abstract: With the recent rapid diffusion over the international computer networks of world-wide distributed document bases, the question of multilingual access and multilingual information retrieval is becoming increasingly relevant. We briefly discuss just some of the issues that must be addressed in order to implement a multilingual interface for a Digital Library system and describe our own approach to this problem.
Theme: Information Gateway

Qin, J.; Zhou, Y.; Chau, M.; Chen, H.: Multilingual Web retrieval : an experiment in English-Chinese business intelligence (2006) 0.02
```
0.017988376 = product of:
  0.08394576 = sum of:
    0.042707227 = weight(_text_:web in 5054) [ClassicSimilarity], result of:
      0.042707227 = score(doc=5054,freq=12.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.4416067 = fieldWeight in 5054, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5054)
    0.011280581 = weight(_text_:information in 5054) [ClassicSimilarity], result of:
      0.011280581 = score(doc=5054,freq=10.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.21684799 = fieldWeight in 5054, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5054)
    0.029957948 = weight(_text_:retrieval in 5054) [ClassicSimilarity], result of:
      0.029957948 = score(doc=5054,freq=8.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.33420905 = fieldWeight in 5054, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5054)
  0.21428572 = coord(3/14)
```
Abstract

As increasing numbers of non-English resources have become available on the Web, the interesting and important issue of how Web users can retrieve documents in different languages has arisen. Cross-language information retrieval (CLIR), the study of retrieving information in one language by queries expressed in another language, is a promising approach to the problem. Cross-language information retrieval has attracted much attention in recent years. Most research systems have achieved satisfactory performance on standard Text REtrieval Conference (TREC) collections such as news articles, but CLIR techniques have not been widely studied and evaluated for applications such as Web portals. In this article, the authors present their research in developing and evaluating a multilingual English-Chinese Web portal that incorporates various CLIR techniques for use in the business domain. A dictionary-based approach was adopted and combines phrasal translation, co-occurrence analysis, and pre- and posttranslation query expansion. The portal was evaluated by domain experts, using a set of queries in both English and Chinese. The experimental results showed that co-occurrence-based phrasal translation achieved a 74.6% improvement in precision over simple word-by-word translation. When used together, pre- and posttranslation query expansion improved the performance slightly, achieving a 78.0% improvement over the baseline word-by-word translation approach. In general, applying CLIR techniques in Web applications shows promise.

Footnote

Contribution to a special topic section on multilingual information systems

Source

Journal of the American Society for Information Science and Technology. 57(2006) no.5, S.671-683

Li, Q.; Chen, Y.P.; Myaeng, S.-H.; Jin, Y.; Kang, B.-Y.: Concept unification of terms in different languages via web mining for Information Retrieval (2009) 0.02

```
0.017982516 = product of:
  0.08391841 = sum of:
    0.034870304 = weight(_text_:web in 4215) [ClassicSimilarity], result of:
      0.034870304 = score(doc=4215,freq=8.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.36057037 = fieldWeight in 4215, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4215)
    0.012357258 = weight(_text_:information in 4215) [ClassicSimilarity], result of:
      0.012357258 = score(doc=4215,freq=12.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.23754507 = fieldWeight in 4215, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4215)
    0.036690846 = weight(_text_:retrieval in 4215) [ClassicSimilarity], result of:
      0.036690846 = score(doc=4215,freq=12.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.40932083 = fieldWeight in 4215, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4215)
  0.21428572 = coord(3/14)
```

Abstract: For historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in East Asian languages such as Chinese, Korean, and Japanese. Although such English terms and their equivalents in these East Asian languages refer to the same concept, they are often erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and proposes a novel technique to solve it. Our method first extracts English terms from native Web documents in an East Asian language, and then unifies the extracted terms and their equivalents in the native language as one index unit. For Cross-Language Information Retrieval (CLIR), one of the major hindrances to achieving retrieval performance at the level of Mono-Lingual Information Retrieval (MLIR) is the translation of terms in search queries which cannot be found in a bilingual dictionary. The Web mining approach proposed in this paper for concept unification of terms in different languages can also be applied to solve this well-known challenge in CLIR. Experimental results based on NTCIR and KT-Set test collections show that the high translation precision of our approach greatly improves performance of both Mono-Lingual and Cross-Language Information Retrieval.
Source: Information processing and management. 45(2009) no.2, S.246-262
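
The score explanation for doc 4215 above can be checked by hand. The following is a minimal Python sketch, assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The query norm is copied from the listing rather than derived, the 14 query clauses are inferred from `coord(3/14)`, and the function and constant names are illustrative only.

```python
import math

QUERY_NORM = 0.029633347  # copied from the listing, not derived here
MAX_DOCS = 44218          # maxDocs as shown in every idf line
QUERY_CLAUSES = 14        # assumption: denominator of coord(3/14)


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


def classic_score(terms, field_norm, query_clauses=QUERY_CLAUSES):
    """Sum the per-term scores, then scale by the coord factor."""
    total = sum(term_score(freq, doc_freq, field_norm) for freq, doc_freq in terms)
    coord = len(terms) / query_clauses
    return coord * total


# (freq, docFreq) for "web", "information", "retrieval" in doc 4215
DOC_4215 = [(8.0, 4597), (12.0, 20772), (12.0, 5836)]
score_4215 = classic_score(DOC_4215, field_norm=0.0390625)
print(score_4215)
```

The printed value should agree with the listed 0.017982516 to about six decimal places. Small differences are expected because the listing was presumably computed in single precision, whereas Python uses double precision.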

Airio, E.: Who benefits from CLIR in web retrieval? (2008) 0.02

```
0.017303396 = product of:
  0.080749184 = sum of:
    0.036238287 = weight(_text_:web in 2342) [ClassicSimilarity], result of:
      0.036238287 = score(doc=2342,freq=6.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.37471575 = fieldWeight in 2342, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=2342)
    0.00856136 = weight(_text_:information in 2342) [ClassicSimilarity], result of:
      0.00856136 = score(doc=2342,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.16457605 = fieldWeight in 2342, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=2342)
    0.03594954 = weight(_text_:retrieval in 2342) [ClassicSimilarity], result of:
      0.03594954 = score(doc=2342,freq=8.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.40105087 = fieldWeight in 2342, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2342)
  0.21428572 = coord(3/14)
```

Abstract: Purpose - The aim of the current paper is to test whether query translation is beneficial in web retrieval. Design/methodology/approach - The language pairs were Finnish-Swedish, English-German and Finnish-French. A total of 12-18 participants were recruited for each language pair. Each participant performed four retrieval tasks. The author's aim was to compare the performance of the translated queries with that of the target language queries. Thus, the author asked participants to formulate a source language query and a target language query for each task. The source language queries were translated into the target language utilizing a dictionary-based system. For English-German, machine translation was also utilized. The author used Google as the search engine. Findings - The results differed depending on the language pair. The author concluded that the dictionary coverage had an effect on the results. On average, the results of query translation were better than in the traditional laboratory tests. Originality/value - This research shows that query translation on the web is beneficial, especially for users with moderate and non-active language skills. This is valuable information for developers of cross-language information retrieval systems.

Cheng, P.J.; Teng, J.W.; Chen, R.C.; Wang, J.H.; Lu, W.H.; Chien, L.F.: Translating unknown queries with Web corpora for cross-language information retrieval (2004) 0.02

```
0.016949398 = product of:
  0.07909719 = sum of:
    0.034870304 = weight(_text_:web in 4131) [ClassicSimilarity], result of:
      0.034870304 = score(doc=4131,freq=2.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.36057037 = fieldWeight in 4131, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.078125 = fieldNorm(doc=4131)
    0.014268933 = weight(_text_:information in 4131) [ClassicSimilarity], result of:
      0.014268933 = score(doc=4131,freq=4.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.27429342 = fieldWeight in 4131, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.078125 = fieldNorm(doc=4131)
    0.029957948 = weight(_text_:retrieval in 4131) [ClassicSimilarity], result of:
      0.029957948 = score(doc=4131,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.33420905 = fieldWeight in 4131, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.078125 = fieldNorm(doc=4131)
  0.21428572 = coord(3/14)
```

Source: SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin et al.

Yang, C.C.; Lam, W.: Introduction to the special topic section on multilingual information systems (2006) 0.02

```
0.01617943 = product of:
  0.075504005 = sum of:
    0.03856498 = weight(_text_:wide in 5043) [ClassicSimilarity], result of:
      0.03856498 = score(doc=5043,freq=2.0), product of:
        0.1312982 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029633347 = queryNorm
        0.29372054 = fieldWeight in 5043, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.046875 = fieldNorm(doc=5043)
    0.020922182 = weight(_text_:web in 5043) [ClassicSimilarity], result of:
      0.020922182 = score(doc=5043,freq=2.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.21634221 = fieldWeight in 5043, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=5043)
    0.016016837 = weight(_text_:information in 5043) [ClassicSimilarity], result of:
      0.016016837 = score(doc=5043,freq=14.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.3078936 = fieldWeight in 5043, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=5043)
  0.21428572 = coord(3/14)
```

Abstract: The information available in languages other than English on the World Wide Web and global information systems is increasing significantly. According to some recent reports, the growth of non-English speaking Internet users is significantly higher than the growth of English-speaking Internet users. Asia and Europe have become the two most-populated regions of Internet users. However, there are many different languages in the many different countries of Asia and Europe. And there are many countries in the world using more than one language as their official languages. For example, Chinese and English are official languages in Hong Kong SAR; English and French are official languages in Canada. In the global economy, information systems are no longer utilized by users in a single geographical region but all over the world. Information can be generated, stored, processed, and accessed in several different languages. All of this reveals the importance of research in multilingual information systems.
Source: Journal of the American Society for Information Science and Technology. 57(2006) no.5, S.629-631

Wang, J.-H.; Teng, J.-W.; Lu, W.-H.; Chien, L.-F.: Exploiting the Web as the multilingual corpus for unknown query translation (2006) 0.02

```
0.015065275 = product of:
  0.07030462 = sum of:
    0.041844364 = weight(_text_:web in 5050) [ClassicSimilarity], result of:
      0.041844364 = score(doc=5050,freq=8.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.43268442 = fieldWeight in 5050, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=5050)
    0.0104854815 = weight(_text_:information in 5050) [ClassicSimilarity], result of:
      0.0104854815 = score(doc=5050,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.20156369 = fieldWeight in 5050, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=5050)
    0.01797477 = weight(_text_:retrieval in 5050) [ClassicSimilarity], result of:
      0.01797477 = score(doc=5050,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.20052543 = fieldWeight in 5050, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=5050)
  0.21428572 = coord(3/14)
```

Abstract: Users' cross-lingual queries to a digital library system might be short and the query terms may not be included in a common translation dictionary (unknown terms). In this article, the authors investigate the feasibility of exploiting the Web as the multilingual corpus source to translate unknown query terms for cross-language information retrieval in digital libraries. They propose a Web-based term translation approach to determine effective translations for unknown query terms by mining bilingual search-result pages obtained from a real Web search engine. This approach can enhance the construction of a domain-specific bilingual lexicon and bring multilingual support to a digital library that only has monolingual document collections. Very promising results have been obtained in generating effective translation equivalents for many unknown terms, including proper nouns, technical terms, and Web query terms, and in assisting bilingual lexicon construction for a real digital library system.
Footnote: Contribution to a special topic section on multilingual information systems
Source: Journal of the American Society for Information Science and Technology. 57(2006) no.5, S.660-670

Weber, A.; Nöthiger, R.: ETHICS: ETH Library Information Control System : an online public access catalogue at the ETH-Bibliothek, Zürich, Switzerland (1988) 0.01

```
0.013055014 = product of:
  0.0913851 = sum of:
    0.07725957 = weight(_text_:bibliothek in 7470) [ClassicSimilarity], result of:
      0.07725957 = score(doc=7470,freq=2.0), product of:
        0.121660605 = queryWeight, product of:
          4.1055303 = idf(docFreq=1980, maxDocs=44218)
          0.029633347 = queryNorm
        0.63504183 = fieldWeight in 7470, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.1055303 = idf(docFreq=1980, maxDocs=44218)
          0.109375 = fieldNorm(doc=7470)
    0.014125523 = weight(_text_:information in 7470) [ClassicSimilarity], result of:
      0.014125523 = score(doc=7470,freq=2.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.27153665 = fieldWeight in 7470, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.109375 = fieldNorm(doc=7470)
  0.14285715 = coord(2/14)
```

Li, K.W.; Yang, C.C.: Automatic crosslingual thesaurus generated from the Hong Kong SAR Police Department Web Corpus for Crime Analysis (2005) 0.01
```
0.012563232 = product of:
  0.058628418 = sum of:
    0.019725623 = weight(_text_:web in 3391) [ClassicSimilarity], result of:
      0.019725623 = score(doc=3391,freq=4.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.2039694 = fieldWeight in 3391, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.03125 = fieldNorm(doc=3391)
    0.01210759 = weight(_text_:information in 3391) [ClassicSimilarity], result of:
      0.01210759 = score(doc=3391,freq=18.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.23274568 = fieldWeight in 3391, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.03125 = fieldNorm(doc=3391)
    0.026795205 = weight(_text_:retrieval in 3391) [ClassicSimilarity], result of:
      0.026795205 = score(doc=3391,freq=10.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.29892567 = fieldWeight in 3391, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=3391)
  0.21428572 = coord(3/14)
```
Abstract: For the sake of national security, very large volumes of data and information are generated and gathered daily. Much of this data and information is written in different languages, stored in different locations, and may be seemingly unconnected. Crosslingual semantic interoperability is a major challenge in generating an overview of this disparate data and information so that it can be analyzed, shared, searched, and summarized. The recent terrorist attacks and the tragic events of September 11, 2001 have prompted increased attention on national security and criminal analysis. Many Asian countries and cities, such as Japan, Taiwan, and Singapore, have been advised that they may become the next targets of terrorist attacks. Semantic interoperability has been a focus in digital library research. Traditional information retrieval (IR) approaches normally require a document to share some common keywords with the query. Generating the associations for the related terms between the two term spaces of users and documents is an important issue. The problem can be viewed as the creation of a thesaurus. Apart from this, terrorists and criminals may communicate through letters, e-mails, and faxes in languages other than English. The translation ambiguity significantly exacerbates the retrieval problem. The problem is expanded to crosslingual semantic interoperability. In this paper, we focus on the English/Chinese crosslingual semantic interoperability problem. However, the developed techniques are not limited to English and Chinese languages but can be applied to many other languages. English and Chinese are popular languages in the Asian region. Much information about national security or crime is communicated in these languages. An efficient automatically generated thesaurus between these languages is important to crosslingual information retrieval between English and Chinese languages.
To facilitate crosslingual information retrieval, a corpus-based approach uses the term co-occurrence statistics in parallel or comparable corpora to construct a statistical translation model to cross the language boundary. In this paper, the text-based approach to align English/Chinese Hong Kong Police press release documents from the Web is first presented. We also introduce an algorithmic approach to generate a robust knowledge base based on statistical correlation analysis of the semantics (knowledge) embedded in the bilingual press release corpus. The research output consisted of a thesaurus-like, semantic network knowledge base, which can aid in semantics-based crosslingual information management and retrieval.
Source: Journal of the American Society for Information Science and Technology. 56(2005) no.3, S.272-281

Gey, F.C.; Kando, N.; Peters, C.: Cross-Language Information Retrieval : the way ahead (2005) 0.01

```
0.012177391 = product of:
  0.056827825 = sum of:
    0.020922182 = weight(_text_:web in 1018) [ClassicSimilarity], result of:
      0.020922182 = score(doc=1018,freq=2.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.21634221 = fieldWeight in 1018, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=1018)
    0.0104854815 = weight(_text_:information in 1018) [ClassicSimilarity], result of:
      0.0104854815 = score(doc=1018,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.20156369 = fieldWeight in 1018, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.046875 = fieldNorm(doc=1018)
    0.025420163 = weight(_text_:retrieval in 1018) [ClassicSimilarity], result of:
      0.025420163 = score(doc=1018,freq=4.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.2835858 = fieldWeight in 1018, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=1018)
  0.21428572 = coord(3/14)
```

Abstract: This introductory paper covers not only the research content of the articles in this special issue of IP&M but attempts to characterize the state-of-the-art in the Cross-Language Information Retrieval (CLIR) domain. We present our view of some major directions for CLIR research in the future. In particular, we find that insufficient attention has been given to the Web as a resource for multilingual research, and to languages which are spoken by hundreds of millions of people in the world but have been mainly neglected by the CLIR research community. In addition, we find that most CLIR evaluation has focussed narrowly on the news genre to the exclusion of other important genres such as scientific and technical literature. The paper concludes by describing an ambitious 5-year research plan proposed by James Mayfield and Paul McNamee.
Source: Information processing and management. 41(2005) no.3, S.415-432

Fulford, H.: Monolingual or multilingual web sites? : An exploratory study of UK SMEs (2000) 0.01
```
0.012063278 = product of:
  0.08444294 = sum of:
    0.032137483 = weight(_text_:wide in 5561) [ClassicSimilarity], result of:
      0.032137483 = score(doc=5561,freq=2.0), product of:
        0.1312982 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029633347 = queryNorm
        0.24476713 = fieldWeight in 5561, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5561)
    0.052305456 = weight(_text_:web in 5561) [ClassicSimilarity], result of:
      0.052305456 = score(doc=5561,freq=18.0), product of:
        0.09670874 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.029633347 = queryNorm
        0.5408555 = fieldWeight in 5561, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5561)
  0.14285715 = coord(2/14)
```
Abstract: The strategic importance of the internet as a tool for penetrating global markets is increasingly being realized by UK-based SMEs (Small and Medium-sized Enterprises). This may be evidenced by the proliferation over the past few years of SME web sites promoting products and services, and more recently still by the growing number of SMEs offering facilities on their web sites for conducting business transactions online. In this paper, we report on an exploratory study considering the use being made of the world wide web by UK-based SMEs. The study is focussed on the strategies SMEs are employing to communicate via the web with an international client base. We investigate in particular the languages being used to present web content, considering specifically the extent to which English is being employed. Preliminary results obtained to date suggest that there is heavy reliance on the assumption that the language of the web is English. Based on the findings of our study, we discuss some of the performance and competition issues surrounding the use of foreign languages in business, and consider some of the possible barriers to SMEs creating multilingual web sites. We conclude by making some recommendations for SMEs endeavouring to establish a multilingual online presence, and note the strategic role to be played by web designers, IT consultants, business strategists, professional translators, and localization specialists to help achieve this presence effectively and professionally.

Grefenstette, G.: The problem of cross-language information retrieval (1998) 0.01

```
0.011891057 = product of:
  0.083237395 = sum of:
    0.020970963 = weight(_text_:information in 6301) [ClassicSimilarity], result of:
      0.020970963 = score(doc=6301,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.40312737 = fieldWeight in 6301, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.09375 = fieldNorm(doc=6301)
    0.06226643 = weight(_text_:retrieval in 6301) [ClassicSimilarity], result of:
      0.06226643 = score(doc=6301,freq=6.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.6946405 = fieldWeight in 6301, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.09375 = fieldNorm(doc=6301)
  0.14285715 = coord(2/14)
```

Series: The Kluwer International series on information retrieval
Source: Cross-language information retrieval. Ed.: G. Grefenstette

Ballesteros, L.; Croft, W.B.: Statistical methods for cross-language information retrieval (1998) 0.01

```
0.011891057 = product of:
  0.083237395 = sum of:
    0.020970963 = weight(_text_:information in 6303) [ClassicSimilarity], result of:
      0.020970963 = score(doc=6303,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.40312737 = fieldWeight in 6303, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.09375 = fieldNorm(doc=6303)
    0.06226643 = weight(_text_:retrieval in 6303) [ClassicSimilarity], result of:
      0.06226643 = score(doc=6303,freq=6.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.6946405 = fieldWeight in 6303, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.09375 = fieldNorm(doc=6303)
  0.14285715 = coord(2/14)
```

Series: The Kluwer International series on information retrieval
Source: Cross-language information retrieval. Ed.: G. Grefenstette
