Search (105 results, page 2 of 6)

  • theme_ss:"Multilinguale Probleme"
  1. Li, K.W.; Yang, C.C.: Automatic crosslingual thesaurus generated from the Hong Kong SAR Police Department Web Corpus for Crime Analysis (2005) 0.02
    0.018133802 = product of:
      0.06346831 = sum of:
        0.028243072 = weight(_text_:digital in 3391) [ClassicSimilarity], result of:
          0.028243072 = score(doc=3391,freq=2.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.17432621 = fieldWeight in 3391, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.03125 = fieldNorm(doc=3391)
        0.03522524 = weight(_text_:techniques in 3391) [ClassicSimilarity], result of:
          0.03522524 = score(doc=3391,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.19468555 = fieldWeight in 3391, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.03125 = fieldNorm(doc=3391)
      0.2857143 = coord(2/7)
    
    Abstract
    For the sake of national security, very large volumes of data and information are generated and gathered daily. Much of this data and information is written in different languages, stored in different locations, and may be seemingly unconnected. Crosslingual semantic interoperability is a major challenge in generating an overview of this disparate data and information so that it can be analyzed, shared, searched, and summarized. The recent terrorist attacks and the tragic events of September 11, 2001 have prompted increased attention on national security and criminal analysis. Many Asian countries and cities, such as Japan, Taiwan, and Singapore, have been advised that they may become the next targets of terrorist attacks. Semantic interoperability has been a focus in digital library research. Traditional information retrieval (IR) approaches normally require a document to share some common keywords with the query. Generating the associations for the related terms between the two term spaces of users and documents is an important issue. The problem can be viewed as the creation of a thesaurus. Apart from this, terrorists and criminals may communicate through letters, e-mails, and faxes in languages other than English. The translation ambiguity significantly exacerbates the retrieval problem. The problem is expanded to crosslingual semantic interoperability. In this paper, we focus on the English/Chinese crosslingual semantic interoperability problem. However, the developed techniques are not limited to English and Chinese languages but can be applied to many other languages. English and Chinese are popular languages in the Asian region. Much information about national security or crime is communicated in these languages. An efficient automatically generated thesaurus between these languages is important to crosslingual information retrieval between English and Chinese languages.
To facilitate crosslingual information retrieval, a corpus-based approach uses the term co-occurrence statistics in parallel or comparable corpora to construct a statistical translation model to cross the language boundary. In this paper, the text-based approach to align English/Chinese Hong Kong Police press release documents from the Web is first presented. We also introduce an algorithmic approach to generate a robust knowledge base based on statistical correlation analysis of the semantics (knowledge) embedded in the bilingual press release corpus. The research output consisted of a thesaurus-like, semantic network knowledge base, which can aid in semantics-based crosslingual information management and retrieval.
  2. Seo, H.-C.; Kim, S.-B.; Rim, H.-C.; Myaeng, S.-H.: Improving query translation in English-Korean Cross-language information retrieval (2005) 0.02
    0.0175181 = product of:
      0.061313346 = sum of:
        0.04461906 = weight(_text_:processing in 1023) [ClassicSimilarity], result of:
          0.04461906 = score(doc=1023,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 1023, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=1023)
        0.016694285 = product of:
          0.03338857 = sum of:
            0.03338857 = weight(_text_:22 in 1023) [ClassicSimilarity], result of:
              0.03338857 = score(doc=1023,freq=2.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.23214069 = fieldWeight in 1023, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1023)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Date
    26.12.2007 20:22:38
    Source
    Information processing and management. 41(2005) no.3, p.507-522
  3. Mitchell, J.S.; Zeng, M.L.; Zumer, M.: Modeling classification systems in multicultural and multilingual contexts (2014) 0.02
    0.01570807 = product of:
      0.054978244 = sum of:
        0.03530384 = weight(_text_:digital in 1962) [ClassicSimilarity], result of:
          0.03530384 = score(doc=1962,freq=2.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.21790776 = fieldWeight in 1962, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1962)
        0.019674404 = product of:
          0.039348807 = sum of:
            0.039348807 = weight(_text_:22 in 1962) [ClassicSimilarity], result of:
              0.039348807 = score(doc=1962,freq=4.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.27358043 = fieldWeight in 1962, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1962)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    This article reports on the second part of an initiative of the authors on researching classification systems with the conceptual model defined by the Functional Requirements for Subject Authority Data (FRSAD) final report. In an earlier study, the authors explored whether the FRSAD conceptual model could be extended beyond subject authority data to model classification data. The focus of the current study is to determine if classification data modeled using FRSAD can be used to solve real-world discovery problems in multicultural and multilingual contexts. The article discusses the relationships between entities (same type or different types) in the context of classification systems that involve multiple translations and/or multicultural implementations. Results of two case studies are presented in detail: (a) two instances of the Dewey Decimal Classification [DDC] (DDC 22 in English, and the Swedish-English mixed translation of DDC 22), and (b) Chinese Library Classification. The use cases of conceptual models in practice are also discussed.
    Footnote
    Contribution in a special issue "Beyond libraries: Subject metadata in the digital environment and Semantic Web" - Contains contributions from the IFLA Satellite Post-Conference of the same name, 17-18 August 2012, Tallinn.
  4. Oard, D.W.; Resnik, P.: Support for interactive document selection in cross-language information retrieval (1999) 0.01
    0.014873021 = product of:
      0.10411114 = sum of:
        0.10411114 = weight(_text_:processing in 5938) [ClassicSimilarity], result of:
          0.10411114 = score(doc=5938,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.6261658 = fieldWeight in 5938, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.109375 = fieldNorm(doc=5938)
      0.14285715 = coord(1/7)
    
    Source
    Information processing and management. 35(1999) no.3, p.363-379
  5. Capstick, J.: A system for supporting cross-lingual information retrieval (2000) 0.01
    0.014873021 = product of:
      0.10411114 = sum of:
        0.10411114 = weight(_text_:processing in 4993) [ClassicSimilarity], result of:
          0.10411114 = score(doc=4993,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.6261658 = fieldWeight in 4993, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.109375 = fieldNorm(doc=4993)
      0.14285715 = coord(1/7)
    
    Source
    Information processing and management. 36(2000) no.2, p.275-289
  6. Nichols, D.M.; Witten, I.H.; Keegan, T.T.; Bainbridge, D.; Dewsnip, M.: Digital libraries and minority languages (2005) 0.01
    0.01222961 = product of:
      0.08560727 = sum of:
        0.08560727 = weight(_text_:digital in 5914) [ClassicSimilarity], result of:
          0.08560727 = score(doc=5914,freq=6.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.5283983 = fieldWeight in 5914, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5914)
      0.14285715 = coord(1/7)
    
    Abstract
    Digital libraries have a pivotal role to play in the preservation and maintenance of international cultures in general and minority languages in particular. This paper outlines a software tool for building digital libraries that is well adapted for creating and distributing local information collections in minority languages, and describes some contexts in which it is used. The system can make multilingual documents available in structured collections and allows them to be accessed via multilingual interfaces. It is issued under a free open-source licence, which encourages participatory design of the software, and an end-user interface allows community-based localization of the various language interfaces, of which there are many.
  7. Wang, J.-H.; Teng, J.-W.; Lu, W.-H.; Chien, L.-F.: Exploiting the Web as the multilingual corpus for unknown query translation (2006) 0.01
    0.012104175 = product of:
      0.084729224 = sum of:
        0.084729224 = weight(_text_:digital in 5050) [ClassicSimilarity], result of:
          0.084729224 = score(doc=5050,freq=8.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.52297866 = fieldWeight in 5050, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.046875 = fieldNorm(doc=5050)
      0.14285715 = coord(1/7)
    
    Abstract
    Users' cross-lingual queries to a digital library system might be short and the query terms may not be included in a common translation dictionary (unknown terms). In this article, the authors investigate the feasibility of exploiting the Web as the multilingual corpus source to translate unknown query terms for cross-language information retrieval in digital libraries. They propose a Web-based term translation approach to determine effective translations for unknown query terms by mining bilingual search-result pages obtained from a real Web search engine. This approach can enhance the construction of a domain-specific bilingual lexicon and bring multilingual support to a digital library that only has monolingual document collections. Very promising results have been obtained in generating effective translation equivalents for many unknown terms, including proper nouns, technical terms, and Web query terms, and in assisting bilingual lexicon construction for a real digital library system.
  8. Peters, C.; Picchi, E.: Across languages, across cultures : issues in multilinguality and digital libraries (1997) 0.01
    0.011411925 = product of:
      0.07988347 = sum of:
        0.07988347 = weight(_text_:digital in 1233) [ClassicSimilarity], result of:
          0.07988347 = score(doc=1233,freq=4.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.493069 = fieldWeight in 1233, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0625 = fieldNorm(doc=1233)
      0.14285715 = coord(1/7)
    
    Abstract
    With the recent rapid diffusion over the international computer networks of world-wide distributed document bases, the question of multilingual access and multilingual information retrieval is becoming increasingly relevant. We briefly discuss just some of the issues that must be addressed in order to implement a multilingual interface for a Digital Library system and describe our own approach to this problem.
  9. Qin, J.; Zhou, Y.; Chau, M.; Chen, H.: Multilingual Web retrieval : an experiment in English-Chinese business intelligence (2006) 0.01
    0.010894983 = product of:
      0.07626488 = sum of:
        0.07626488 = weight(_text_:techniques in 5054) [ClassicSimilarity], result of:
          0.07626488 = score(doc=5054,freq=6.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.42150658 = fieldWeight in 5054, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5054)
      0.14285715 = coord(1/7)
    
    Abstract
    As increasing numbers of non-English resources have become available on the Web, the interesting and important issue of how Web users can retrieve documents in different languages has arisen. Cross-language information retrieval (CLIR), the study of retrieving information in one language by queries expressed in another language, is a promising approach to the problem. Cross-language information retrieval has attracted much attention in recent years. Most research systems have achieved satisfactory performance on standard Text REtrieval Conference (TREC) collections such as news articles, but CLIR techniques have not been widely studied and evaluated for applications such as Web portals. In this article, the authors present their research in developing and evaluating a multilingual English-Chinese Web portal that incorporates various CLIR techniques for use in the business domain. A dictionary-based approach was adopted and combines phrasal translation, co-occurrence analysis, and pre- and posttranslation query expansion. The portal was evaluated by domain experts, using a set of queries in both English and Chinese. The experimental results showed that co-occurrence-based phrasal translation achieved a 74.6% improvement in precision over simple word-by-word translation. When used together, pre- and posttranslation query expansion improved the performance slightly, achieving a 78.0% improvement over the baseline word-by-word translation approach. In general, applying CLIR techniques in Web applications shows promise.
  10. Pearce, C.; Nicholas, C.: TELLTALE: Experiments in a dynamic hypertext environment for degraded and multilingual data (1996) 0.01
    0.010674859 = product of:
      0.07472401 = sum of:
        0.07472401 = weight(_text_:techniques in 4071) [ClassicSimilarity], result of:
          0.07472401 = score(doc=4071,freq=4.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.4129904 = fieldWeight in 4071, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=4071)
      0.14285715 = coord(1/7)
    
    Abstract
    Methods and tools for finding documents relevant to a user's needs in document corpora can be found in the information retrieval, library science, and hypertext communities. Typically, these systems provide retrieval capabilities for fairly static corpora, their algorithms are dependent on the language for which they are written, e.g. English, and they do not perform well when presented with misspelled words or text that has been degraded by OCR techniques. In this article, we present experimentation results for the TELLTALE system. TELLTALE is a dynamic hypertext environment that provides full-text search from a hypertext-style user interface for text corpora that may be garbled by OCR or transmission errors, and that may contain languages other than English. TELLTALE uses several techniques based on n-grams (n-character sequences of text). With these results we show that the dynamic linkage mechanisms in TELLTALE are tolerant of garbles in up to 30% of the characters in the body of the texts.
  11. Jones, R.K.: Language universalization for improved information management : necessity for Esperanto (1978) 0.01
    0.010516814 = product of:
      0.0736177 = sum of:
        0.0736177 = weight(_text_:processing in 7408) [ClassicSimilarity], result of:
          0.0736177 = score(doc=7408,freq=4.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.4427661 = fieldWeight in 7408, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7408)
      0.14285715 = coord(1/7)
    
    Abstract
    Lacking a universal working language, information managers around the world cannot now deal reliably and efficiently with multilingual documentation. Language mismatch paralyses international cooperative efforts such as multinational bibliographic standardisation, linking of collections, and sharing the work of classification and indexing. Knowledge of the same second language by all information managers can open the communication channels needed for worldwide cooperation. Ethnic and ideological rivalries preclude success in this role by any of the conventional languages. The planned language, Esperanto, is the logical choice because of its neutrality, rational structure, clarity and expressive power. Pioneering projects in automatic language processing, not possible in English, are feasible in Esperanto.
    Source
    Information processing and management. 14(1978) no.6, p.363-368
  12. Multilingual web software (1996) 0.01
    0.010086811 = product of:
      0.07060768 = sum of:
        0.07060768 = weight(_text_:digital in 4710) [ClassicSimilarity], result of:
          0.07060768 = score(doc=4710,freq=2.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.4358155 = fieldWeight in 4710, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.078125 = fieldNorm(doc=4710)
      0.14285715 = coord(1/7)
    
    Source
    Digital publishing technologies. 1(1996) no.10, p.19-20
  13. Wen, D.; Sakaguchi, T.; Sugimoto, S.; Tabata, K.: Multilingual Access to Dublin Core Metadata of ULIS Library (2002) 0.01
    0.010086811 = product of:
      0.07060768 = sum of:
        0.07060768 = weight(_text_:digital in 2342) [ClassicSimilarity], result of:
          0.07060768 = score(doc=2342,freq=2.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.4358155 = fieldWeight in 2342, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.078125 = fieldNorm(doc=2342)
      0.14285715 = coord(1/7)
    
    Source
    Journal of digital information. 2(2002) no.2,
  14. Garcia Jiménez, A.; Díaz Esteban, A.; Gervás, P.: Knowledge organization in a multilingual system for the personalization of digital news services : how to integrate knowledge (2003) 0.01
    0.010086811 = product of:
      0.07060768 = sum of:
        0.07060768 = weight(_text_:digital in 2748) [ClassicSimilarity], result of:
          0.07060768 = score(doc=2748,freq=8.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.4358155 = fieldWeight in 2748, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2748)
      0.14285715 = coord(1/7)
    
    Abstract
    In this paper we are concerned with the type of services that send periodic news selections to subscribers of a digital newspaper by means of electronic mail. The aims are to study the influence of categorisation in information retrieval and in digital newspapers, different models to solve problems of bilingualism in digital information services and to analyse the evaluation in information filtering and personalisation in information agents. Hermes is a multilingual system for the personalisation of news services which allows integration and categorisation of information in two languages. In order to customise information for each user, Hermes provides the means for representing a user's interests homogeneously across the operating languages of the system. A simple system is applied to train automatically a dynamic news item classifier for both languages, by taking the Yahoo set of categories as reference framework and using the web pages classified under them as training collection. Traditional evaluation methods have been applied and their shortcomings for the present endeavour have been noted.
  15. Siebinga, S.: Implementing multilingual information access in the European Library (2008) 0.01
    0.010086811 = product of:
      0.07060768 = sum of:
        0.07060768 = weight(_text_:digital in 1296) [ClassicSimilarity], result of:
          0.07060768 = score(doc=1296,freq=2.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.4358155 = fieldWeight in 1296, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.078125 = fieldNorm(doc=1296)
      0.14285715 = coord(1/7)
    
    Content
    Presentation at "One more step towards the European digital library": International Conference, Deutsche Nationalbibliothek, Frankfurt am Main, 31 January - 1 February 2008.
  16. Ata, B.M.A.: SISDOM: a multilingual document retrieval system (1995) 0.01
    0.010064354 = product of:
      0.07045048 = sum of:
        0.07045048 = weight(_text_:techniques in 895) [ClassicSimilarity], result of:
          0.07045048 = score(doc=895,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.3893711 = fieldWeight in 895, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0625 = fieldNorm(doc=895)
      0.14285715 = coord(1/7)
    
    Abstract
    The Malay language is widely used in Malaysia, Indonesia and Brunei. The growth in the number of documents written in Malay justifies the need for a document retrieval system for that language. Describes the implementation of a bilingual Malay and English full text document retrieval system: SIStem capaian DOkumen Multilingua (SISDOM), by the Kebangsaan University Malaysia. The system incorporates many facilities for users, including the choice of search techniques, browsing of retrieved documents, and ranking of documents.
  17. Borgman, C.L.: Multi-media, multi-cultural, and multi-lingual digital libraries : or how do we exchange data in 400 languages? (1997) 0.01
    0.009985435 = product of:
      0.06989804 = sum of:
        0.06989804 = weight(_text_:digital in 1263) [ClassicSimilarity], result of:
          0.06989804 = score(doc=1263,freq=16.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.43143538 = fieldWeight in 1263, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1263)
      0.14285715 = coord(1/7)
    
    Abstract
    The Internet would not be very useful if communication were limited to textual exchanges between speakers of English located in the United States. Rather, its value lies in its ability to enable people from multiple nations, speaking multiple languages, to employ multiple media in interacting with each other. While computer networks broke through national boundaries long ago, they remain much more effective for textual communication than for exchanges of sound, images, or mixed media -- and more effective for communication in English than for exchanges in most other languages, much less interactions involving multiple languages. Supporting searching and display in multiple languages is an increasingly important issue for all digital libraries accessible on the Internet. Even if a digital library contains materials in only one language, the content needs to be searchable and displayable on computers in countries speaking other languages. We need to exchange data between digital libraries, whether in a single language or in multiple languages. Data exchanges may be large batch updates or interactive hyperlinks. In any of these cases, character sets must be represented in a consistent manner if exchanges are to succeed. Issues of interoperability, portability, and data exchange related to multi-lingual character sets have received surprisingly little attention in the digital library community or in discussions of standards for information infrastructure, except in Europe. The landmark collection of papers on Standards Policy for Information Infrastructure, for example, contains no discussion of multi-lingual issues except for a passing reference to the Unicode standard. The goal of this short essay is to draw attention to the multi-lingual issues involved in designing digital libraries accessible on the Internet. Many of the multi-lingual design issues parallel those of multi-media digital libraries, a topic more familiar to most readers of D-Lib Magazine. 
This essay draws examples from multi-media DLs to illustrate some of the urgent design challenges in creating a globally distributed network serving people who speak many languages other than English. First we introduce some general issues of medium, culture, and language, then discuss the design challenges in the transition from local to global systems, lastly addressing technical matters. The technical issues involve the choice of character sets to represent languages, similar to the choices made in representing images or sound. However, the scale of the language problem is far greater. Standards for multi-media representation are being adopted fairly rapidly, in parallel with the availability of multi-media content in electronic form. By contrast, we have hundreds (and sometimes thousands) of years worth of textual materials in hundreds of languages, created long before data encoding standards existed. Textual content from past and present is being encoded in language and application-specific representations that are difficult to exchange without losing data -- if they exchange at all. We illustrate the multi-language DL challenge with examples drawn from the research library community, which typically handles collections of materials in 400 or so languages. These are problems faced not only by developers of digital libraries, but by those who develop and manage any communication technology that crosses national or linguistic boundaries.
  18. Chen, K.-H.: Evaluating Chinese text retrieval with multilingual queries (2002) 0.01
    0.008806311 = product of:
      0.06164417 = sum of:
        0.06164417 = weight(_text_:techniques in 1851) [ClassicSimilarity], result of:
          0.06164417 = score(doc=1851,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.3406997 = fieldWeight in 1851, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1851)
      0.14285715 = coord(1/7)
    
    Abstract
    This paper reports the design of a Chinese test collection with multilingual queries and the application of this test collection to evaluate information retrieval systems. The effective indexing units, IR models, translation techniques, and query expansion for Chinese text retrieval are identified. The collaboration of East Asian countries for construction of test collections for cross-language multilingual text retrieval is also discussed in this paper. As well, a tool is designed to help assessors judge relevance and gather the events of relevance judgment. The log file created by this tool will be used to analyze the behaviors of assessors in the future.
  19. Chen, S.S.-J.: Methodological considerations for developing Art & Architecture Thesaurus in Chinese and its applications (2021) 0.01
    0.008735436 = product of:
      0.061148047 = sum of:
        0.061148047 = weight(_text_:digital in 579) [ClassicSimilarity], result of:
          0.061148047 = score(doc=579,freq=6.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.37742734 = fieldWeight in 579, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0390625 = fieldNorm(doc=579)
      0.14285715 = coord(1/7)
    
    Abstract
    A multilingual thesaurus' development needs the appropriate methodological considerations not only for linguistics, but also cultural heterogeneity, as demonstrated in this report on the multilingual project of the Art & Architecture Thesaurus (AAT) in the Chinese language, which has been a collaboration between the Academia Sinica Center for Digital Culture and the Getty Research Institute for more than a decade. After a brief overview of the project, the paper will introduce a holistic methodology for considering how to enable Western art to be accessible to Chinese users and Chinese art accessible to Western users. The conceptual and structural issues will be discussed, especially the challenges of developing terminology in two different cultures. For instance, some terms shared by Western and Chinese cultures could be understood differently in each culture, which raises questions regarding their locations within the hierarchical structure of the AAT. Finally, the report will provide cases to demonstrate how the Chinese-Language AAT language supports online exhibitions, digital humanities and linking of digital art history content to the web of data.
  20. Zumer, M.; Clavel, G.: Extending the multilingual capacity of The European Library : results & findings (2008) 0.01
    0.00806945 = product of:
      0.056486145 = sum of:
        0.056486145 = weight(_text_:digital in 858) [ClassicSimilarity], result of:
          0.056486145 = score(doc=858,freq=2.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.34865242 = fieldWeight in 858, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0625 = fieldNorm(doc=858)
      0.14285715 = coord(1/7)
    
    Content
    Presentation at "One more step towards the European digital library": International Conference, Deutsche Nationalbibliothek, Frankfurt am Main, 31 January - 1 February 2008.
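
The score breakdowns listed under each hit are labelled [ClassicSimilarity], Lucene's classic TF-IDF scoring. As a minimal sketch of how the numbers in such a tree combine, the Python below recomputes the totals for hits 1 and 2 from the values shown in the list. The function names (`idf`, `term_score`, `coord`) are hypothetical helpers introduced here for illustration and are not part of any library; the formulas assumed are tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which match the figures printed in the breakdowns.

```python
import math

# Collection-wide constants copied from the score breakdowns in the list.
MAX_DOCS = 44218
QUERY_NORM = 0.04107254


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency, assuming 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one term in one document: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


def coord(matched: int, total: int) -> float:
    """Fraction of query clauses that matched the document."""
    return matched / total


# Hit 1 (doc 3391): "digital" and "techniques" match, 2 of 7 clauses.
hit1 = coord(2, 7) * (
    term_score(2.0, 2326, 0.03125)
    + term_score(2.0, 1467, 0.03125)
)

# Hit 2 (doc 1023): "processing" plus a nested clause for "22",
# which carries its own coord(1/2) before the outer coord(2/7).
hit2 = coord(2, 7) * (
    term_score(2.0, 2097, 0.046875)
    + coord(1, 2) * term_score(2.0, 3622, 0.046875)
)

print(f"hit 1: {hit1:.6f}")
print(f"hit 2: {hit2:.6f}")
```

The recomputed totals should agree with the listed 0.018133802 and 0.0175181 up to small rounding differences, since the search engine works in single precision while this sketch uses double precision. The same functions apply to any other hit by substituting its freq, docFreq, fieldNorm and coord values.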

Languages

  • e 97
  • d 5
  • f 2
  • ro 1

Types

  • a 94
  • el 13
  • m 2
  • x 2
  • r 1
  • s 1