Search (47 results, page 1 of 3)

  • × language_ss:"e"
  • × theme_ss:"Multilinguale Probleme"
  • × year_i:[2000 TO 2010}
  1. Fulford, H.: Monolingual or multilingual web sites? : An exploratory study of UK SMEs (2000) 0.04
    0.042189382 = product of:
      0.12656814 = sum of:
        0.04816959 = weight(_text_:wide in 5561) [ClassicSimilarity], result of:
          0.04816959 = score(doc=5561,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.24476713 = fieldWeight in 5561, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5561)
        0.078398556 = weight(_text_:web in 5561) [ClassicSimilarity], result of:
          0.078398556 = score(doc=5561,freq=18.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.5408555 = fieldWeight in 5561, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5561)
      0.33333334 = coord(2/6)
    
    Abstract
    The strategic importance of the internet as a tool for penetrating global markets is increasingly being realized by UK-based SMEs (Small and Medium-sized Enterprises). This may be evidenced by the proliferation over the past few years of SME web sites promoting products and services, and more recently still by the growing number of SMEs offering facilities on their web sites for conducting business transactions online. In this paper, we report on an exploratory study considering the use being made of the world wide web by UK-based SMEs. The study is focussed on the strategies SMEs are employing to communicate via the web with an international client base. We investigate in particular the languages being used to present web content, considering specifically the extent to which English is being employed. Preliminary results obtained to date suggest that there is heavy reliance on the assumption that the language of the web is English. Based on the findings of our study, we discuss some of the performance and competition issues surrounding the use of foreign languages in business, and consider some of the possible barriers to SMEs creating multilingual web sites. We conclude by making some recommendations for SMEs endeavouring to establish a multilingual online presence, and note the strategic role to be played by web designers, IT consultants, business strategists, professional translators, and localization specialists to help achieve this presence effectively and professionally.
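The score tree for result 1 can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function and variable names are illustrative and not part of the listing, while all numeric inputs are copied from the explain output above.

```python
import math

# Constants copied from the explain output for doc 5561.
MAX_DOCS = 44218
QUERY_NORM = 0.044416238
FIELD_NORM = 0.0390625

def idf(doc_freq, max_docs=MAX_DOCS):
    # ClassicSimilarity inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    # score = queryWeight * fieldWeight for one matching term.
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

wide = term_score(freq=2.0, doc_freq=1430, field_norm=FIELD_NORM)
web = term_score(freq=18.0, doc_freq=4597, field_norm=FIELD_NORM)

# coord(2/6): two of the six query clauses matched this document.
total = (wide + web) * (2 / 6)

# The listing reports 0.042189382 for this document.
print(f"wide={wide:.8f} web={web:.8f} total={total:.8f}")
```

Lucene computes these values in single precision, so a double-precision recomputation may differ from the listing in the last digits.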
  2. Li, K.W.; Yang, C.C.: Conceptual analysis of parallel corpus collected from the Web (2006) 0.04
    0.037795175 = product of:
      0.11338552 = sum of:
        0.06812209 = weight(_text_:wide in 5051) [ClassicSimilarity], result of:
          0.06812209 = score(doc=5051,freq=4.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.34615302 = fieldWeight in 5051, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5051)
        0.045263432 = weight(_text_:web in 5051) [ClassicSimilarity], result of:
          0.045263432 = score(doc=5051,freq=6.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.3122631 = fieldWeight in 5051, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5051)
      0.33333334 = coord(2/6)
    
    Abstract
    As illustrated by the World Wide Web, the volume of information in languages other than English has grown significantly in recent years. This highlights the importance of multilingual corpora. Much effort has been devoted to the compilation of multilingual corpora for the purpose of cross-lingual information retrieval and machine translation. Existing parallel corpora mostly involve European languages, such as English-French and English-Spanish. There is still a lack of parallel corpora between European languages and Asian languages. In the authors' previous work, an alignment method to identify one-to-one Chinese and English title pairs was developed to construct an English-Chinese parallel corpus that works automatically from the World Wide Web, and a 100% precision and 87% recall were obtained. Careful analysis of these results has helped the authors to understand how the alignment method can be improved. A conceptual analysis was conducted, which includes the analysis of conceptual equivalent and conceptual information alternation in the aligned and nonaligned English-Chinese title pairs that are obtained by the alignment method. The result of the analysis not only reflects the characteristics of parallel corpora, but also gives insight into the strengths and weaknesses of the alignment method. In particular, conceptual alternation, such as omission and addition, is found to have a significant impact on the performance of the alignment method.
  3. Yang, C.C.; Lam, W.: Introduction to the special topic section on multilingual information systems (2006) 0.03
    0.029720977 = product of:
      0.08916293 = sum of:
        0.057803504 = weight(_text_:wide in 5043) [ClassicSimilarity], result of:
          0.057803504 = score(doc=5043,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.29372054 = fieldWeight in 5043, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=5043)
        0.031359423 = weight(_text_:web in 5043) [ClassicSimilarity], result of:
          0.031359423 = score(doc=5043,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.21634221 = fieldWeight in 5043, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5043)
      0.33333334 = coord(2/6)
    
    Abstract
    The information available in languages other than English on the World Wide Web and global information systems is increasing significantly. According to some recent reports, the growth of non-English speaking Internet users is significantly higher than the growth of English-speaking Internet users. Asia and Europe have become the two most-populated regions of Internet users. However, there are many different languages in the many different countries of Asia and Europe. And there are many countries in the world using more than one language as their official languages. For example, Chinese and English are official languages in Hong Kong SAR; English and French are official languages in Canada. In the global economy, information systems are no longer utilized by users in a single geographical region but all over the world. Information can be generated, stored, processed, and accessed in several different languages. All of this reveals the importance of research in multilingual information systems.
  4. Talvensaari, T.; Juhola, M.; Laurikkala, J.; Järvelin, K.: Corpus-based cross-language information retrieval in retrieval of highly relevant documents (2007) 0.02
    0.02476748 = product of:
      0.07430244 = sum of:
        0.04816959 = weight(_text_:wide in 139) [ClassicSimilarity], result of:
          0.04816959 = score(doc=139,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.24476713 = fieldWeight in 139, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=139)
        0.026132854 = weight(_text_:web in 139) [ClassicSimilarity], result of:
          0.026132854 = score(doc=139,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.18028519 = fieldWeight in 139, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=139)
      0.33333334 = coord(2/6)
    
    Abstract
    Information retrieval systems' ability to retrieve highly relevant documents has become more and more important in the age of extremely large collections, such as the World Wide Web (WWW). The authors' aim was to find out how corpus-based cross-language information retrieval (CLIR) manages in retrieving highly relevant documents. They created a Finnish-Swedish comparable corpus from two loosely related document collections and used it as a source of knowledge for query translation. Finnish test queries were translated into Swedish and run against a Swedish test collection. Graded relevance assessments were used in evaluating the results, and three relevance criterion levels (liberal, regular, and stringent) were applied. The runs were also evaluated with generalized recall and precision, which weight the retrieved documents according to their relevance level. The performance of the Comparable Corpus Translation system (COCOT) was compared to that of a dictionary-based query translation program; the two translation methods were also combined. The results indicate that corpus-based CLIR performs particularly well with highly relevant documents. In average precision, COCOT even matched the monolingual baseline on the highest relevance level. The performance of the different query translation methods was further analyzed by finding out reasons for poor rankings of highly relevant documents.
  5. Larkey, L.S.; Connell, M.E.: Structured queries, language modelling, and relevance modelling in cross-language information retrieval (2005) 0.02
    0.021071352 = product of:
      0.063214056 = sum of:
        0.04816959 = weight(_text_:wide in 1022) [ClassicSimilarity], result of:
          0.04816959 = score(doc=1022,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.24476713 = fieldWeight in 1022, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1022)
        0.0150444675 = product of:
          0.030088935 = sum of:
            0.030088935 = weight(_text_:22 in 1022) [ClassicSimilarity], result of:
              0.030088935 = score(doc=1022,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.19345059 = fieldWeight in 1022, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1022)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    Two probabilistic approaches to cross-lingual retrieval are in wide use today, those based on probabilistic models of relevance, as exemplified by INQUERY, and those based on language modeling. INQUERY, as a query net model, allows the easy incorporation of query operators, including a synonym operator, which has proven to be extremely useful in cross-language information retrieval (CLIR), in an approach often called structured query translation. In contrast, language models incorporate translation probabilities into a unified framework. We compare the two approaches on Arabic and Spanish data sets, using two kinds of bilingual dictionaries--one derived from a conventional dictionary, and one derived from a parallel corpus. We find that structured query processing gives slightly better results when queries are not expanded. On the other hand, when queries are expanded, language modeling gives better results, but only when using a probabilistic dictionary derived from a parallel corpus. We pursue two additional issues inherent in the comparison of structured query processing with language modeling. The first concerns query expansion, and the second is the role of translation probabilities. We compare conventional expansion techniques (pseudo-relevance feedback) with relevance modeling, a new IR approach which fits into the formal framework of language modeling. We find that relevance modeling and pseudo-relevance feedback achieve comparable levels of retrieval and that good translation probabilities confer a small but significant advantage.
    Date
    26.12.2007 20:22:11
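Result 5 differs from the earlier entries in that one of its two matching clauses is itself a nested product with coord(1/2). A minimal sketch of that arithmetic follows, again assuming the ClassicSimilarity formulas; the helper name is illustrative and the inputs are taken from the explain output for doc 1022.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.044416238
FIELD_NORM = 0.0390625  # fieldNorm(doc=1022)

def term_score(freq, doc_freq, field_norm):
    # queryWeight * fieldWeight for a single term.
    term_idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    return (term_idf * QUERY_NORM) * (math.sqrt(freq) * term_idf * field_norm)

# Top-level clause: the term "wide".
wide = term_score(freq=2.0, doc_freq=1430, field_norm=FIELD_NORM)

# Nested clause: the term "22", scaled by its inner coord(1/2).
nested = term_score(freq=2.0, doc_freq=3622, field_norm=FIELD_NORM) * (1 / 2)

# Outer coord(2/6) is applied to the sum of both clauses.
score = (wide + nested) * (2 / 6)

# The listing reports 0.021071352 for this document.
print(f"nested={nested:.8f} score={score:.8f}")
```

The inner coord halves the contribution of the nested clause before the outer coord is applied, which is why this entry ranks below entries whose second term matched at the top level.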
  6. Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.02
    0.020800762 = product of:
      0.062402282 = sum of:
        0.04434892 = weight(_text_:web in 4436) [ClassicSimilarity], result of:
          0.04434892 = score(doc=4436,freq=4.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.3059541 = fieldWeight in 4436, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
        0.01805336 = product of:
          0.03610672 = sum of:
            0.03610672 = weight(_text_:22 in 4436) [ClassicSimilarity], result of:
              0.03610672 = score(doc=4436,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.23214069 = fieldWeight in 4436, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4436)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    The language barrier is the major problem that people face in searching for, retrieving, and understanding multilingual collections on the Internet. This paper deals with query translation and document translation in a Chinese-English information retrieval system called MTIR. Bilingual dictionary and monolingual corpus-based approaches are adopted to select suitable translated query terms. A machine transliteration algorithm is introduced to resolve proper name searching. We consider several design issues for document translation, including which material is translated, what roles the HTML tags play in translation, what the tradeoff is between the speed performance and the translation performance, and what form the translated result is presented in. About 100,000 Web pages translated in the last 4 months of 1997 are used for quantitative study of online and real-time Web page translation.
    Date
    16. 2.2000 14:22:39
  7. Dilevko, J.; Dali, K.: The challenge of building multilingual collections in Canadian public libraries (2002) 0.02
    0.019216085 = product of:
      0.057648253 = sum of:
        0.036585998 = weight(_text_:web in 139) [ClassicSimilarity], result of:
          0.036585998 = score(doc=139,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.25239927 = fieldWeight in 139, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=139)
        0.021062255 = product of:
          0.04212451 = sum of:
            0.04212451 = weight(_text_:22 in 139) [ClassicSimilarity], result of:
              0.04212451 = score(doc=139,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.2708308 = fieldWeight in 139, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=139)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    A Web-based survey was conducted to determine the extent to which Canadian public libraries are collecting multilingual materials (foreign languages other than English and French), the methods that they use to select these materials, and whether public librarians are sufficiently prepared to provide their multilingual clientele with an adequate range of materials and services. There is room for improvement with regard to collection development of multilingual materials in Canadian public libraries, as well as in educating staff about keeping multilingual collections current, diverse, and of sufficient interest to potential users to keep such materials circulating. The main constraints preventing public libraries from developing better multilingual collections are addressed, and recommendations for improving the state of multilingual holdings are provided.
    Date
    10. 9.2000 17:38:22
  8. Cunliffe, D.; Herring, S.C.: Introduction to minority languages, multimedia and the Web (2005) 0.01
    0.0147829745 = product of:
      0.08869784 = sum of:
        0.08869784 = weight(_text_:web in 4771) [ClassicSimilarity], result of:
          0.08869784 = score(doc=4771,freq=4.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.6119082 = fieldWeight in 4771, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.09375 = fieldNorm(doc=4771)
      0.16666667 = coord(1/6)
    
    Content
    Introduction to a special issue on "Minority languages, multimedia and the Web"
  9. Freitas-Junior, H.R.; Ribeiro-Neto, B.A.; Freitas-Vale, R. de; Laender, A.H.F.; Lima, L.R.S. de: Categorization-driven cross-language retrieval of medical information (2006) 0.01
    0.013725774 = product of:
      0.04117732 = sum of:
        0.026132854 = weight(_text_:web in 5282) [ClassicSimilarity], result of:
          0.026132854 = score(doc=5282,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.18028519 = fieldWeight in 5282, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5282)
        0.0150444675 = product of:
          0.030088935 = sum of:
            0.030088935 = weight(_text_:22 in 5282) [ClassicSimilarity], result of:
              0.030088935 = score(doc=5282,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.19345059 = fieldWeight in 5282, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5282)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    The Web has become a large repository of documents (or pages) written in many different languages. In this context, traditional information retrieval (IR) techniques cannot be used whenever the user query and the documents being retrieved are in different languages. To address this problem, new cross-language information retrieval (CLIR) techniques have been proposed. In this work, we describe a method for cross-language retrieval of medical information. This method combines query terms and related medical concepts obtained automatically through a categorization procedure. The medical concepts are used to create a linguistic abstraction that allows retrieval of information in a language-independent way, minimizing linguistic problems such as polysemy. To evaluate our method, we carried out experiments using the OHSUMED test collection, whose documents are written in English, with queries expressed in Portuguese, Spanish, and French. The results indicate that our cross-language retrieval method is as effective as a standard vector space model algorithm operating on queries and documents in the same language. Further, our results are better than previous results in the literature.
    Date
    22. 7.2006 16:46:36
  10. Cunliffe, D.; Jones, H.; Jarvis, M.; Egan, K.; Huws, R.; Munro, S.: Information architecture for bilingual Web sites (2002) 0.01
    0.012195333 = product of:
      0.073171996 = sum of:
        0.073171996 = weight(_text_:web in 1014) [ClassicSimilarity], result of:
          0.073171996 = score(doc=1014,freq=8.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.50479853 = fieldWeight in 1014, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1014)
      0.16666667 = coord(1/6)
    
    Abstract
    Creating an information architecture for a bilingual Web site presents particular challenges beyond those that exist for single and multilanguage sites. This article reports work in progress on the development of a content-based bilingual Web site to facilitate the sharing of resources and information between Speech and Language Therapists. The development of the information architecture is based on a combination of two aspects: an abstract structural analysis of existing bilingual Web designs focusing on the presentation of bilingual material, and a bilingual card-sorting activity conducted with potential users. Issues for bilingual developments are discussed, and some observations are made regarding the use of card-sorting activities.
  11. Frâncu, V.: Harmonizing a universal classification system with an interdisciplinary multilingual thesaurus : advantages and limitations (2000) 0.01
    0.011353682 = product of:
      0.06812209 = sum of:
        0.06812209 = weight(_text_:wide in 108) [ClassicSimilarity], result of:
          0.06812209 = score(doc=108,freq=4.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.34615302 = fieldWeight in 108, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=108)
      0.16666667 = coord(1/6)
    
    Abstract
    The case under consideration is a project of building an interdisciplinary multilingual thesaurus (Romanian-English-French) starting from a list of indexing terms based on an abridged version of the Universal Decimal Classification (UDC). The resulting thesaurus is intended for public libraries for both indexing and searching purposes in bibliographic databases covering a wide range of topics but with a fairly low level of specificity. The problems encountered in such an approach fall into two groups: 1) concordance or compatibility problems in terms of the indexing languages considered (between a classification system and a thesaurus); 2) equivalence and, hence, translatability problems in terms of the natural languages involved. Additionally, the question of ambiguity, given the co-occurrence of terms in more than one class, will be discussed with reference to homographs and polysemantic words. In a thesaurus with such a wide coverage yet with a low specificity level, the method adopted in the thesaurus construction was to provide as many lead-in terms as possible and post them up to the closest in meaning broader term in order to improve the recall ratio.
  12. Subirats, I.; Prasad, A.R.D.; Keizer, J.; Bagdanov, A.: Implementation of rich metadata formats and semantic tools using DSpace (2008) 0.01
    0.010980619 = product of:
      0.032941855 = sum of:
        0.020906283 = weight(_text_:web in 2656) [ClassicSimilarity], result of:
          0.020906283 = score(doc=2656,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.14422815 = fieldWeight in 2656, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=2656)
        0.012035574 = product of:
          0.024071148 = sum of:
            0.024071148 = weight(_text_:22 in 2656) [ClassicSimilarity], result of:
              0.024071148 = score(doc=2656,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.15476047 = fieldWeight in 2656, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2656)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Source
    Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas
    Theme
    Semantic Web
  13. Qin, J.; Zhou, Y.; Chau, M.; Chen, H.: Multilingual Web retrieval : an experiment in English-Chinese business intelligence (2006) 0.01
    0.010668693 = product of:
      0.064012155 = sum of:
        0.064012155 = weight(_text_:web in 5054) [ClassicSimilarity], result of:
          0.064012155 = score(doc=5054,freq=12.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.4416067 = fieldWeight in 5054, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5054)
      0.16666667 = coord(1/6)
    
    Abstract
    As increasing numbers of non-English resources have become available on the Web, the interesting and important issue of how Web users can retrieve documents in different languages has arisen. Cross-language information retrieval (CLIR), the study of retrieving information in one language by queries expressed in another language, is a promising approach to the problem. Cross-language information retrieval has attracted much attention in recent years. Most research systems have achieved satisfactory performance on standard Text REtrieval Conference (TREC) collections such as news articles, but CLIR techniques have not been widely studied and evaluated for applications such as Web portals. In this article, the authors present their research in developing and evaluating a multilingual English-Chinese Web portal that incorporates various CLIR techniques for use in the business domain. A dictionary-based approach was adopted that combines phrasal translation, co-occurrence analysis, and pre- and posttranslation query expansion. The portal was evaluated by domain experts, using a set of queries in both English and Chinese. The experimental results showed that co-occurrence-based phrasal translation achieved a 74.6% improvement in precision over simple word-by-word translation. When used together, pre- and posttranslation query expansion improved the performance slightly, achieving a 78.0% improvement over the baseline word-by-word translation approach. In general, applying CLIR techniques in Web applications shows promise.
  14. Turner, J.M.: Cultural markers and localising the MIC site (2008) 0.01
    0.010668693 = product of:
      0.064012155 = sum of:
        0.064012155 = weight(_text_:web in 2243) [ClassicSimilarity], result of:
          0.064012155 = score(doc=2243,freq=12.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.4416067 = fieldWeight in 2243, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2243)
      0.16666667 = coord(1/6)
    
    Content
    Merely translating web sites is not sufficient for serving international clienteles. Web sites need to be "localised". This involves adapting various informational aspects to address the local population in such a way that users understand the content and its use in the context of their own culture. A cultural marker denotes a convention used on a web site to address a particular population. Research in the area of localisation has concentrated on commercial web sites and software. We found that localisation of cultural web sites increases the complexity of the information management issues. As a project of the Section on Audiovisual and Multimedia of IFLA, a kit for localising the Moving Image Collections (MIC) site was developed, then tested by using it to localise a selection of pages from the web site in French, Spanish, and Arabic. The kit, in the form of a .pdf file, can be used to produce a version of the MIC site localised for any other language or ethnic community.
  15. Chan, L.M.; Lin, X.; Zeng, M.L.: Structural and multilingual approaches to subject access on the Web (2000) 0.01
    0.010453141 = product of:
      0.062718846 = sum of:
        0.062718846 = weight(_text_:web in 507) [ClassicSimilarity], result of:
          0.062718846 = score(doc=507,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.43268442 = fieldWeight in 507, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.09375 = fieldNorm(doc=507)
      0.16666667 = coord(1/6)
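
    Note on the score trees
    The nested figures under each hit follow Lucene's ClassicSimilarity (TF-IDF) arithmetic. As a minimal sketch, the following Python recomputes the score of entry 15 (doc=507) from the values printed in its tree. The function and variable names are illustrative assumptions and not part of the search system; queryNorm and fieldNorm are copied from the listing rather than derived, and the engine works in 32-bit floats, so the last digits may differ slightly.

```python
import math

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Per-term score as laid out in the explain tree (hypothetical helper)."""
    tf = math.sqrt(freq)                              # 1.4142135 for freq=2.0
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # about 3.2635 for docFreq=4597
    query_weight = idf * query_norm                   # "queryWeight" line
    field_weight = tf * idf * field_norm              # "fieldWeight" line
    return query_weight * field_weight

# Values copied from entry 15 (term "web", doc=507).
term_score = classic_term_score(
    freq=2.0,
    doc_freq=4597,
    max_docs=44218,
    query_norm=0.044416238,  # taken from the listing, not recomputed
    field_norm=0.09375,      # taken from the listing, not recomputed
)

# coord(1/6): one of the six query terms matched this document.
final_score = term_score * (1 / 6)

print(round(term_score, 6), round(final_score, 6))
```

    The same steps account for the other single-term hits on this page: only freq, fieldNorm, and (for the term "wide") docFreq change.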
    
  16. Wang, J.-H.; Teng, J.-W.; Lu, W.-H.; Chien, L.-F.: Exploiting the Web as the multilingual corpus for unknown query translation (2006) 0.01
    0.010453141 = product of:
      0.062718846 = sum of:
        0.062718846 = weight(_text_:web in 5050) [ClassicSimilarity], result of:
          0.062718846 = score(doc=5050,freq=8.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.43268442 = fieldWeight in 5050, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5050)
      0.16666667 = coord(1/6)
    
    Abstract
    Users' cross-lingual queries to a digital library system might be short and the query terms may not be included in a common translation dictionary (unknown terms). In this article, the authors investigate the feasibility of exploiting the Web as the multilingual corpus source to translate unknown query terms for cross-language information retrieval in digital libraries. They propose a Web-based term translation approach to determine effective translations for unknown query terms by mining bilingual search-result pages obtained from a real Web search engine. This approach can enhance the construction of a domain-specific bilingual lexicon and bring multilingual support to a digital library that only has monolingual document collections. Very promising results have been obtained in generating effective translation equivalents for many unknown terms, including proper nouns, technical terms, and Web query terms, and in assisting bilingual lexicon construction for a real digital library system.
  17. Mitchell, J.S.; Rype, I.; Svanberg, M.: Mixed translation models for the Dewey Decimal Classification (DDC) System (2008) 0.01
    0.009633917 = product of:
      0.057803504 = sum of:
        0.057803504 = weight(_text_:wide in 2246) [ClassicSimilarity], result of:
          0.057803504 = score(doc=2246,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.29372054 = fieldWeight in 2246, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=2246)
      0.16666667 = coord(1/6)
    
    Content
    This paper explores the feasibility of developing mixed translations of the Dewey Decimal Classification (DDC) system in countries/language groups where English enjoys wide use in academic and social discourse. A mixed translation uses existing DDC data in the vernacular plus additional data from the English-language full edition of the DDC to form a single mixed edition. Two approaches to mixed translations using Norwegian/English and Swedish/English DDC data are described, along with the design of a pilot study to evaluate use of a mixed translation as a classifier's tool.
  18. Airio, E.: Who benefits from CLIR in web retrieval? (2008) 0.01
    0.009052687 = product of:
      0.054316122 = sum of:
        0.054316122 = weight(_text_:web in 2342) [ClassicSimilarity], result of:
          0.054316122 = score(doc=2342,freq=6.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.37471575 = fieldWeight in 2342, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=2342)
      0.16666667 = coord(1/6)
    
    Abstract
    Purpose - The aim of the current paper is to test whether query translation is beneficial in web retrieval. Design/methodology/approach - The language pairs were Finnish-Swedish, English-German and Finnish-French. A total of 12-18 participants were recruited for each language pair. Each participant performed four retrieval tasks. The author's aim was to compare the performance of the translated queries with that of the target language queries. Thus, the author asked participants to formulate a source language query and a target language query for each task. The source language queries were translated into the target language utilizing a dictionary-based system. For English-German, machine translation was also utilized. The author used Google as the search engine. Findings - The results differed depending on the language pair. The author concluded that the dictionary coverage had an effect on the results. On average, the results of query translation were better than in the traditional laboratory tests. Originality/value - This research shows that query translation on the web is beneficial especially for users with moderate and non-active language skills. This is valuable information for developers of cross-language information retrieval systems.
  19. Cheng, P.J.; Teng, J.W.; Chen, R.C.; Wang, J.H.; Lu, W.H.; Chien, L.F.: Translating unknown queries with Web corpora for cross-language information retrieval (2004) 0.01
    0.008710952 = product of:
      0.052265707 = sum of:
        0.052265707 = weight(_text_:web in 4131) [ClassicSimilarity], result of:
          0.052265707 = score(doc=4131,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.36057037 = fieldWeight in 4131, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.078125 = fieldNorm(doc=4131)
      0.16666667 = coord(1/6)
    
  20. Li, Q.; Chen, Y.P.; Myaeng, S.-H.; Jin, Y.; Kang, B.-Y.: Concept unification of terms in different languages via web mining for Information Retrieval (2009) 0.01
    0.008710952 = product of:
      0.052265707 = sum of:
        0.052265707 = weight(_text_:web in 4215) [ClassicSimilarity], result of:
          0.052265707 = score(doc=4215,freq=8.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.36057037 = fieldWeight in 4215, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4215)
      0.16666667 = coord(1/6)
    
    Abstract
    For historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in East Asian languages such as Chinese, Korean, and Japanese. Although such English terms and their equivalences in these East Asian languages refer to the same concept, they are often erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and proposes a novel technique to solve it. Our method first extracts English terms from native Web documents in an East Asian language, and then unifies the extracted terms and their equivalences in the native language as one index unit. For Cross-Language Information Retrieval (CLIR), one of the major hindrances to achieving retrieval performance at the level of Mono-Lingual Information Retrieval (MLIR) is the translation of terms in search queries that cannot be found in a bilingual dictionary. The Web mining approach proposed in this paper for concept unification of terms in different languages can also be applied to solve this well-known challenge in CLIR. Experimental results based on NTCIR and KT-Set test collections show that the high translation precision of our approach greatly improves performance of both Mono-Lingual and Cross-Language Information Retrieval.

Types

  • a 43
  • el 3
  • x 1