Search (71 results, page 1 of 4)

Weihs, J.: Three tales of multilingual cataloguing (1998) 0.03

0.02533477 = product of:
  0.06333692 = sum of:
    0.018058153 = weight(_text_:of in 6063) [ClassicSimilarity], result of:
      0.018058153 = score(doc=6063,freq=2.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.27643585 = fieldWeight in 6063, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.125 = fieldNorm(doc=6063)
    0.045278773 = product of:
      0.090557545 = sum of:
        0.090557545 = weight(_text_:22 in 6063) [ClassicSimilarity], result of:
          0.090557545 = score(doc=6063,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.61904186 = fieldWeight in 6063, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=6063)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Date: 2. 8.2001 8:55:22
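
The score breakdown for doc 6063 above can be reproduced by hand. What follows is a minimal Python sketch, assuming the classic Lucene TF-IDF formulas that the printed factors appear to follow: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative and not part of any library. The queryNorm and fieldNorm values are copied from the listing rather than derived.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """score = queryWeight * fieldWeight, mirroring one 'weight(...)' node of the tree."""
    tf = math.sqrt(freq)                       # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm       # queryWeight = idf * queryNorm
    field_weight = tf * term_idf * field_norm  # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the explain tree for doc 6063.
MAX_DOCS = 44218
QUERY_NORM = 0.04177434
FIELD_NORM = 0.125

of_score = term_score(2.0, 25162, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "22" clause sits in a nested query where 1 of 2 clauses matched: coord(1/2).
t22_score = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * 0.5
# 2 of the 5 top-level clauses matched: coord(2/5).
total = (of_score + t22_score) * 0.4

print(f"of={of_score:.6f} 22={t22_score:.6f} total={total:.6f}")
```

Because the listing prints single-precision floats, a double-precision recomputation should agree with the listed figures only to about six decimal places.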

Timotin, A.: Multilingvism si tezaure de concepte (1994) 0.02

0.01927099 = product of:
  0.048177473 = sum of:
    0.025538085 = weight(_text_:of in 7887) [ClassicSimilarity], result of:
      0.025538085 = score(doc=7887,freq=16.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.39093933 = fieldWeight in 7887, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=7887)
    0.022639386 = product of:
      0.045278773 = sum of:
        0.045278773 = weight(_text_:22 in 7887) [ClassicSimilarity], result of:
          0.045278773 = score(doc=7887,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.30952093 = fieldWeight in 7887, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=7887)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: Discusses the importance and utility of a thesaurus of concepts to provide logical support for multilingualism. Deals in particular with the IEC (International Electrotechnical Commission) Thesaurus, and the work of the IEC Thesaurus Working Group, consisting of specialists of the Research Institute in Electrical Engineering (ICPE) and the University Politehnica of Bucharest. Describes how this group contributed to the thesaurus and implemented the multilingual database required by the editing and updating of multilingual dictionaries in electrical engineering
Source: Probleme de Informare si Documentare. 28(1994) no.1, S.13-22

Schubert, K.: Parameters for the design of an intermediate language for multilingual thesauri (1995) 0.02

0.016284827 = product of:
  0.040712066 = sum of:
    0.020902606 = weight(_text_:of in 2092) [ClassicSimilarity], result of:
      0.020902606 = score(doc=2092,freq=14.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.31997898 = fieldWeight in 2092, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2092)
    0.019809462 = product of:
      0.039618924 = sum of:
        0.039618924 = weight(_text_:22 in 2092) [ClassicSimilarity], result of:
          0.039618924 = score(doc=2092,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.2708308 = fieldWeight in 2092, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2092)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: The architecture of multilingual software systems is sometimes centred around an intermediate language. The extent to which this approach can be useful for multilingual thesauri is analyzed, in particular regarding the functionality the thesaurus is designed to fulfil. Both the runtime use and the construction and maintenance of the system are taken into consideration. Using the perspective of general language technology makes it possible to draw on experience from a broader range of fields beyond thesaurus design itself, as well as to consider the possibility of using a thesaurus as a knowledge module in various systems which process natural language. Therefore the features which thesauri and other natural-language processing systems have in common are emphasized, especially at the level of systems design and their core functionality
Source: Knowledge organization. 22(1995) nos.3/4, S.136-140

Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.02

0.015834233 = product of:
  0.03958558 = sum of:
    0.011286346 = weight(_text_:of in 4157) [ClassicSimilarity], result of:
      0.011286346 = score(doc=4157,freq=2.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.17277241 = fieldWeight in 4157, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.078125 = fieldNorm(doc=4157)
    0.028299233 = product of:
      0.056598466 = sum of:
        0.056598466 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
          0.056598466 = score(doc=4157,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.38690117 = fieldWeight in 4157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=4157)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Source: Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill

Heinzelin, D. de; ¬d'¬Hautcourt, F.; Pols, R.: ¬Un nouveaux thesaurus multilingue informatise relatif aux instruments de musique (1998) 0.02

0.015311283 = product of:
  0.038278207 = sum of:
    0.01563882 = weight(_text_:of in 932) [ClassicSimilarity], result of:
      0.01563882 = score(doc=932,freq=6.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.23940048 = fieldWeight in 932, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=932)
    0.022639386 = product of:
      0.045278773 = sum of:
        0.045278773 = weight(_text_:22 in 932) [ClassicSimilarity], result of:
          0.045278773 = score(doc=932,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.30952093 = fieldWeight in 932, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=932)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: Describes the development and structure of a multilingual thesaurus for classifying and defining musical instruments, designed at the Brussels Theatre de la Monnaie as part of a project to create a multimedia database of theatre and musical arts
Date: 1. 8.1996 22:01:00

Cao, L.; Leong, M.-K.; Low, H.-B.: Searching heterogeneous multilingual bibliographic sources (1998) 0.01

0.014163372 = product of:
  0.03540843 = sum of:
    0.0127690425 = weight(_text_:of in 3564) [ClassicSimilarity], result of:
      0.0127690425 = score(doc=3564,freq=4.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.19546966 = fieldWeight in 3564, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=3564)
    0.022639386 = product of:
      0.045278773 = sum of:
        0.045278773 = weight(_text_:22 in 3564) [ClassicSimilarity], result of:
          0.045278773 = score(doc=3564,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.30952093 = fieldWeight in 3564, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=3564)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: Proposes a Web-based architecture for searching distributed heterogeneous multi-asian language bibliographic sources, and describes a successful pilot implementation of the system at the Chinese Library (CLib) system developed in Singapore and tested at 2 university libraries and a public library
Date: 1. 8.1996 22:08:06
Footnote: Contribution to a special issue devoted to the Proceedings of the 7th International World Wide Web Conference, held 14-18 April 1998, Brisbane, Australia

Grefenstette, G.: ¬The problem of cross-language information retrieval (1998) 0.01

0.013399891 = product of:
  0.033499725 = sum of:
    0.019956108 = product of:
      0.09978054 = sum of:
        0.09978054 = weight(_text_:problem in 6301) [ClassicSimilarity], result of:
          0.09978054 = score(doc=6301,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.5627445 = fieldWeight in 6301, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.09375 = fieldNorm(doc=6301)
      0.2 = coord(1/5)
    0.013543615 = weight(_text_:of in 6301) [ClassicSimilarity], result of:
      0.013543615 = score(doc=6301,freq=2.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.20732689 = fieldWeight in 6301, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.09375 = fieldNorm(doc=6301)
  0.4 = coord(2/5)

Cousins, S.A.; Hartley, R.J.: Towards multilingual online public access catalogues (1994) 0.01

0.012397245 = product of:
  0.030993111 = sum of:
    0.011641062 = product of:
      0.05820531 = sum of:
        0.05820531 = weight(_text_:problem in 7207) [ClassicSimilarity], result of:
          0.05820531 = score(doc=7207,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.3282676 = fieldWeight in 7207, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7207)
      0.2 = coord(1/5)
    0.01935205 = weight(_text_:of in 7207) [ClassicSimilarity], result of:
      0.01935205 = score(doc=7207,freq=12.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.29624295 = fieldWeight in 7207, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7207)
  0.4 = coord(2/5)

Abstract: With increasing moves towards an integrated Europe the need for multilingual access to information becomes more pressing. One aspect of this need which has largely been neglected is the provision of multilingual access to OPACs and this paper is concerned with exploring this problem area. The need for multilingual OPAC search capabilities and the difficulties associated with this are discussed. The problems of subject access in particular are highlighted. Research into subject searching in monolingual OPACs is reviewed and its relevance to multilingual OPACs is outlined. Given the limitations of current machine translation of natural language it is likely that the utilisation of controlled subject search facilities. Finally some possible directions for further research are considered

Peters, C.; Picchi, E.: Across languages, across cultures : issues in multilinguality and digital libraries (1997) 0.01

0.011577157 = product of:
  0.028942892 = sum of:
    0.0133040715 = product of:
      0.066520356 = sum of:
        0.066520356 = weight(_text_:problem in 1233) [ClassicSimilarity], result of:
          0.066520356 = score(doc=1233,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.375163 = fieldWeight in 1233, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0625 = fieldNorm(doc=1233)
      0.2 = coord(1/5)
    0.01563882 = weight(_text_:of in 1233) [ClassicSimilarity], result of:
      0.01563882 = score(doc=1233,freq=6.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.23940048 = fieldWeight in 1233, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=1233)
  0.4 = coord(2/5)

Abstract: With the recent rapid diffusion over the international computer networks of world-wide distributed document bases, the question of multilingual access and multilingual information retrieval is becoming increasingly relevant. We briefly discuss just some of the issues that must be addressed in order to implement a multilingual interface for a Digital Library system and describe our own approach to this problem.

Hainebach, R.: ¬The EUROCAT project : the integration of European community multidisciplinary and document-oriented databases on CD-ROM; an exercise in merging data from several databases into a single database as well as solving the problem of multilingualism (1993) 0.01

0.011157829 = product of:
  0.027894573 = sum of:
    0.009978054 = product of:
      0.04989027 = sum of:
        0.04989027 = weight(_text_:problem in 7404) [ClassicSimilarity], result of:
          0.04989027 = score(doc=7404,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.28137225 = fieldWeight in 7404, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.046875 = fieldNorm(doc=7404)
      0.2 = coord(1/5)
    0.01791652 = weight(_text_:of in 7404) [ClassicSimilarity], result of:
      0.01791652 = score(doc=7404,freq=14.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.2742677 = fieldWeight in 7404, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=7404)
  0.4 = coord(2/5)

Abstract: The Institutions of the European Communities produce document-oriented databases based on publications and documents distributed either by the Office for Official Publications of the European Communities or by the individual EC institutions themselves. These databases are known under the names of ABEL, CATEL, CELEX, CORDIS RTD publications, ECLAS, EPOQUE, EURISTOTE, RAPID and SCAD and are available via hosts such as EUROBASES, ECHO and the Office for Official Publications. Until the establishment of the EUROCAT project, no single database held a comprehensive and complete collection of all European Community documents and publications. Describes the work on integrating and harmonising the data from the databases to produce the multilingual EUROCAT database using MS-DOS based software. The resulting database will be available on CD-ROM

Hainebach, R.: European Community databases : a subject analysis (1992) 0.01

0.010976778 = product of:
  0.027441945 = sum of:
    0.011641062 = product of:
      0.05820531 = sum of:
        0.05820531 = weight(_text_:problem in 7402) [ClassicSimilarity], result of:
          0.05820531 = score(doc=7402,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.3282676 = fieldWeight in 7402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7402)
      0.2 = coord(1/5)
    0.015800884 = weight(_text_:of in 7402) [ClassicSimilarity], result of:
      0.015800884 = score(doc=7402,freq=8.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.24188137 = fieldWeight in 7402, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7402)
  0.4 = coord(2/5)

Abstract: With the introduction of the single market, more and more European Community information databases are becoming available either online or on CD-ROM. Some databases are full text but many are bibliographic. Users may access them by free text or through controlled descriptors but ideally they should be able to search through 1 or more subject access points. Each of the databases uses a different method and the different subject access methods, employing thesauri or classification schemes, are examined. Proposes a solution to the problem of multiple thesauri and multilingualism
Source: Online information 92. Proc. of the 16th Int. Online Information Meeting, London, 8-10.12.1992. Ed. by David I. Raitt

Oard, D.W.: Alternative approaches for cross-language text retrieval (1997) 0.01
```
0.010920028 = product of:
  0.027300071 = sum of:
    0.010081458 = product of:
      0.050407287 = sum of:
        0.050407287 = weight(_text_:problem in 1164) [ClassicSimilarity], result of:
          0.050407287 = score(doc=1164,freq=6.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.28428814 = fieldWeight in 1164, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1164)
      0.2 = coord(1/5)
    0.017218614 = weight(_text_:of in 1164) [ClassicSimilarity], result of:
      0.017218614 = score(doc=1164,freq=38.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.2635841 = fieldWeight in 1164, product of:
          6.164414 = tf(freq=38.0), with freq of:
            38.0 = termFreq=38.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1164)
  0.4 = coord(2/5)
```
Abstract

The explosive growth of the Internet and other sources of networked information has made automatic mediation of access to networked information sources an increasingly important problem. Much of this information is expressed as electronic text, and it is becoming practical to automatically convert some printed documents and recorded speech to electronic text as well. Thus, automated systems capable of detecting useful documents are finding widespread application. With even a small number of languages it can be inconvenient to issue the same query repeatedly in every language, so users who are able to read more than one language will likely prefer a multilingual text retrieval system over a collection of monolingual systems. And since reading ability in a language does not always imply fluent writing ability in that language, such users will likely find cross-language text retrieval particularly useful for languages in which they are less confident of their ability to express their information needs effectively. The use of such systems can also be beneficial if the user is able to read only a single language. For example, when only a small portion of the document collection will ever be examined by the user, performing retrieval before translation can be significantly more economical than performing translation before retrieval. So when the application is sufficiently important to justify the time and effort required for translation, those costs can be minimized if an effective cross-language text retrieval system is available. Even when translation is not available, there are circumstances in which cross-language text retrieval could be useful to a monolingual user. For example, a researcher might find a paper published in an unfamiliar language useful if that paper contains references to works by the same author that are in the researcher's native language.
Multilingual text retrieval can be defined as selection of useful documents from collections that may contain several languages (English, French, Chinese, etc.). This formulation allows for the possibility that individual documents might contain more than one language, a common occurrence in some applications. Both cross-language and within-language retrieval are included in this formulation, but it is the cross-language aspect of the problem which distinguishes multilingual text retrieval from its well-studied monolingual counterpart. At the SIGIR 96 workshop on "Cross-Linguistic Information Retrieval" the participants discussed the proliferation of terminology being used to describe the field and settled on "Cross-Language" as the best single description of the salient aspect of the problem. "Multilingual" was felt to be too broad, since that term has also been used to describe systems able to perform within-language retrieval in more than one language but that lack any cross-language capability. "Cross-lingual" and "cross-linguistic" were felt to be equally good descriptions of the field, but "cross-language" was selected as the preferred term in the interest of standardization. Unfortunately, at about the same time the U.S. Defense Advanced Research Projects Agency (DARPA) introduced "translingual" as their preferred term, so we are still some distance from reaching consensus on this matter.
I will not attempt to draw a sharp distinction between retrieval and filtering in this survey. Although my own work on adaptive cross-language text filtering has led me to make this distinction fairly carefully in other presentations (cf. Oard 1997b), such an approach does little to help understand the fundamental techniques which have been applied or the results that have been obtained in this case. Since it is still common to view filtering (detection of useful documents in dynamic document streams) as a kind of retrieval, I will simply adopt that perspective here.

Borgman, C.L.: Multi-media, multi-cultural, and multi-lingual digital libraries : or how do we exchange data In 400 languages? (1997) 0.01
```
0.009569087 = product of:
  0.023922717 = sum of:
    0.005820531 = product of:
      0.029102655 = sum of:
        0.029102655 = weight(_text_:problem in 1263) [ClassicSimilarity], result of:
          0.029102655 = score(doc=1263,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.1641338 = fieldWeight in 1263, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1263)
      0.2 = coord(1/5)
    0.018102186 = weight(_text_:of in 1263) [ClassicSimilarity], result of:
      0.018102186 = score(doc=1263,freq=42.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.2771099 = fieldWeight in 1263, product of:
          6.4807405 = tf(freq=42.0), with freq of:
            42.0 = termFreq=42.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1263)
  0.4 = coord(2/5)
```
Abstract

The Internet would not be very useful if communication were limited to textual exchanges between speakers of English located in the United States. Rather, its value lies in its ability to enable people from multiple nations, speaking multiple languages, to employ multiple media in interacting with each other. While computer networks broke through national boundaries long ago, they remain much more effective for textual communication than for exchanges of sound, images, or mixed media -- and more effective for communication in English than for exchanges in most other languages, much less interactions involving multiple languages. Supporting searching and display in multiple languages is an increasingly important issue for all digital libraries accessible on the Internet. Even if a digital library contains materials in only one language, the content needs to be searchable and displayable on computers in countries speaking other languages. We need to exchange data between digital libraries, whether in a single language or in multiple languages. Data exchanges may be large batch updates or interactive hyperlinks. In any of these cases, character sets must be represented in a consistent manner if exchanges are to succeed. Issues of interoperability, portability, and data exchange related to multi-lingual character sets have received surprisingly little attention in the digital library community or in discussions of standards for information infrastructure, except in Europe. The landmark collection of papers on Standards Policy for Information Infrastructure, for example, contains no discussion of multi-lingual issues except for a passing reference to the Unicode standard. The goal of this short essay is to draw attention to the multi-lingual issues involved in designing digital libraries accessible on the Internet. Many of the multi-lingual design issues parallel those of multi-media digital libraries, a topic more familiar to most readers of D-Lib Magazine. 
This essay draws examples from multi-media DLs to illustrate some of the urgent design challenges in creating a globally distributed network serving people who speak many languages other than English. First we introduce some general issues of medium, culture, and language, then discuss the design challenges in the transition from local to global systems, lastly addressing technical matters. The technical issues involve the choice of character sets to represent languages, similar to the choices made in representing images or sound. However, the scale of the language problem is far greater. Standards for multi-media representation are being adopted fairly rapidly, in parallel with the availability of multi-media content in electronic form. By contrast, we have hundreds (and sometimes thousands) of years' worth of textual materials in hundreds of languages, created long before data encoding standards existed. Textual content from past and present is being encoded in language and application-specific representations that are difficult to exchange without losing data -- if they exchange at all. We illustrate the multi-language DL challenge with examples drawn from the research library community, which typically handles collections of materials in 400 or so languages. These are problems faced not only by developers of digital libraries, but by those who develop and manage any communication technology that crosses national or linguistic boundaries.

Clavel, G.; Dale, P.; Heiner-Freiling, M.; Kunz, M.; Landry, P.; MacEwan, A.; Naudi, M.; Oddy, P.; Saget, A.: CoBRA+ working group on multilingual subject access : final report (1999) 0.01
```
0.008648566 = product of:
  0.021621415 = sum of:
    0.005820531 = product of:
      0.029102655 = sum of:
        0.029102655 = weight(_text_:problem in 6067) [ClassicSimilarity], result of:
          0.029102655 = score(doc=6067,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.1641338 = fieldWeight in 6067, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.02734375 = fieldNorm(doc=6067)
      0.2 = coord(1/5)
    0.015800884 = weight(_text_:of in 6067) [ClassicSimilarity], result of:
      0.015800884 = score(doc=6067,freq=32.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.24188137 = fieldWeight in 6067, product of:
          5.656854 = tf(freq=32.0), with freq of:
            32.0 = termFreq=32.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.02734375 = fieldNorm(doc=6067)
  0.4 = coord(2/5)
```
Abstract

This final report defines the problem of multilingual subject access, summarises the work carried out by the CoBRA+ working group on multilingual subject access from autumn 1997 until February 1999 and its results, identifies and discusses issues to be resolved, and presents a proposal for a prototype to the directors of the institutions concerned. For a summary of results, and the proposal, see 'CoBRA+ working group on multilingual subject access: proposals for discussion', March 18th 1999. This report will be distributed to members of the CENL and posted on the GABRIEL website. Genevieve Clavel has compiled it on the basis of the group's reports, discussions within the group and comments provided by the partners.

Content

Background to the study: The question of multilingual access to bibliographic databases affects not only searchers in countries in which several languages are spoken such as Switzerland, but also all those who search material in databases containing material in more than one language, which is the case in the majority of scientific or research databases. The growth of networks means that we can easily access catalogues outside our own immediate circle - in another town, another country, another continent. In doing so we encounter problems concerning not only search interfaces, but also concerning subject access or even author access in another language. In France for example, each document, independently of the language in which it has been written, is indexed using a French-language subject heading language. Thus, in order to search by subject headings for documents written in English or German, held in the Bibliothèque nationale de France, the researcher from abroad has to master the French language. In theory, the indexer should be able to analyse a document and assign headings in his/her native language, while the user should be able to search in his/her native language. The language of the document itself should have no influence on the language of the subject heading language used for indexing nor on the language used for searching. (Practically speaking of course, there are restrictions, since there is a limit to the number of languages in which subject heading languages could be maintained and thus in which the user may search.) In the example below, we are concerned with three languages: German, French and English. If we can imagine a system in which there are equivalents among subject headings in these three languages, the following scenario may be envisaged: a German-speaking indexer will use German-language subject headings to index all the documents received, regardless of the language in which they are written.
The user may search for these documents by entering subject headings in German, but also in French or in English, thanks to the equivalents that have been established, without needing to know the other languages or the structure of the other SHLs. Ideally, this approach should not be confined to one database, but would allow the different databases to be brought together in a virtual system: an English-speaking user in London should be able to search the database of the Deutsche Bibliothek in Frankfurt using English-language headings, and retrieve documents which have been indexed using the German subject headings list.
Cross-language information retrieval (1998) 0.01
```
0.00804753 = product of:
  0.020118825 = sum of:
    0.004157522 = product of:
      0.020787612 = sum of:
        0.020787612 = weight(_text_:problem in 6299) [ClassicSimilarity], result of:
          0.020787612 = score(doc=6299,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.11723843 = fieldWeight in 6299, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.01953125 = fieldNorm(doc=6299)
      0.2 = coord(1/5)
    0.015961302 = weight(_text_:of in 6299) [ClassicSimilarity], result of:
      0.015961302 = score(doc=6299,freq=64.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.24433708 = fieldWeight in 6299, product of:
          8.0 = tf(freq=64.0), with freq of:
            64.0 = termFreq=64.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.01953125 = fieldNorm(doc=6299)
  0.4 = coord(2/5)
```
Content

Contains the contributions: GREFENSTETTE, G.: The Problem of Cross-Language Information Retrieval; DAVIS, M.W.: On the Effective Use of Large Parallel Corpora in Cross-Language Text Retrieval; BALLESTEROS, L. and W.B. CROFT: Statistical Methods for Cross-Language Information Retrieval; Distributed Cross-Lingual Information Retrieval; Automatic Cross-Language Information Retrieval Using Latent Semantic Indexing; EVANS, D.A. et al.: Mapping Vocabularies Using Latent Semantics; PICCHI, E. and C. PETERS: Cross-Language Information Retrieval: A System for Comparable Corpus Querying; YAMABANA, K. et al.: A Language Conversion Front-End for Cross-Language Information Retrieval; GACHOT, D.A. et al.: The Systran NLP Browser: An Application of Machine Translation Technology in Cross-Language Information Retrieval; HULL, D.: A Weighted Boolean Model for Cross-Language Text Retrieval; SHERIDAN, P. et al.: Building a Large Multilingual Test Collection from Comparable News Documents; OARD, D.W. and B.J. DORR: Evaluating Cross-Language Text Filtering Effectiveness

Footnote

Review in: Machine translation review: 1999, no.10, pp. 26-27 (D. Lewis): "Cross Language Information Retrieval (CLIR) addresses the growing need to access large volumes of data across language boundaries. The typical requirement is for the user to input a free form query, usually a brief description of a topic, into a search or retrieval engine which returns a list, in ranked order, of documents or web pages that are relevant to the topic. The search engine matches the terms in the query to indexed terms, usually keywords previously derived from the target documents. Unlike monolingual information retrieval, CLIR requires query terms in one language to be matched to indexed terms in another. Matching can be done by bilingual dictionary lookup, full machine translation, or by applying statistical methods. A query's success is measured in terms of recall (how many potentially relevant target documents are found) and precision (what proportion of documents found are relevant). Issues in CLIR are how to translate query terms into index terms, how to eliminate alternative translations (e.g. to decide that French 'traitement' in a query means 'treatment' and not 'salary'), and how to rank or weight translation alternatives that are retained (e.g. how to order the French terms 'aventure', 'business', 'affaire', and 'liaison' as relevant translations of English 'affair'). Grefenstette provides a lucid and useful overview of the field and the problems. The volume brings together a number of experiments and projects in CLIR. Mark Davis (New Mexico State University) describes Recuerdo, a Spanish retrieval engine which reduces translation ambiguities by scanning indexes for parallel texts; it also uses either a bilingual dictionary or direct equivalents from a parallel corpus in order to compare results for queries on parallel texts.
Lisa Ballesteros and Bruce Croft (University of Massachusetts) use a 'local feedback' technique which automatically enhances a query by adding extra terms to it both before and after translation; such terms can be derived from documents known to be relevant to the query.
Christian Fluhr et al (DIST/SMTI, France) outline the EMIR (European Multilingual Information Retrieval) and ESPRIT projects. They found that using SYSTRAN to machine translate queries and to access material from various multilingual databases produced less relevant results than a method referred to as 'multilingual reformulation' (the mechanics of which are only hinted at). An interesting technique is Latent Semantic Indexing (LSI), described by Michael Littman et al (Brown University) and, most clearly, by David Evans et al (Carnegie Mellon University). LSI involves creating matrices of documents and the terms they contain and 'fitting' related documents into a reduced matrix space. This effectively allows queries to be mapped onto a common semantic representation of the documents. Eugenio Picchi and Carol Peters (Pisa) report on a procedure to create links between translation equivalents in an Italian-English parallel corpus. The links are used to construct parallel linguistic contexts in real-time for any term or combination of terms that is being searched for in either language. Their interest is primarily lexicographic but they plan to apply the same procedure to comparable corpora, i.e. to texts which are not translations of each other but which share the same domain. Kiyoshi Yamabana et al (NEC, Japan) address the issue of how to disambiguate between alternative translations of query terms. Their DMAX (double maximise) method looks at co-occurrence frequencies between both source language words and target language words in order to arrive at the most probable translation. The statistical data for the decision are derived, not from the translation texts but independently from monolingual corpora in each language. An interactive user interface allows the user to influence the selection of terms during the matching process.
Denis Gachot et al (SYSTRAN) describe the SYSTRAN NLP browser, a prototype tool which collects parsing information derived from a text or corpus previously translated with SYSTRAN. The user enters queries into the browser in either a structured or free form and receives grammatical and lexical information about the source text and/or its translation.
The retrieved output from a query including the phrase 'big rockets' may be, for instance, a sentence containing 'giant rocket' which is semantically ranked above 'military rocket'. David Hull (Xerox Research Centre, Grenoble) describes an implementation of a weighted Boolean model for Spanish-English CLIR. Users construct Boolean-type queries, weighting each term in the query, which is then translated by an on-line dictionary before being applied to the database. Comparisons with the performance of unweighted free-form queries ('vector space' models) proved encouraging. Two contributions consider the evaluation of CLIR systems. In order to by-pass the time-consuming and expensive process of assembling a standard collection of documents and of user queries against which the performance of a CLIR system is manually assessed, Páraic Sheridan et al (ETH Zurich) propose a method based on retrieving 'seed documents'. This involves identifying a unique document in a database (the 'seed document') and, for a number of queries, measuring how fast it is retrieved. The authors have also assembled a large database of multilingual news documents for testing purposes. By storing the (fairly short) documents in a structured form tagged with descriptor codes (e.g. for topic, country and area), the test suite is easily expanded while remaining consistent for the purposes of testing. Douglas Oard and Bonnie Dorr (University of Maryland) describe an evaluation methodology which appears to apply LSI techniques in order to filter and rank incoming documents designed for testing CLIR systems. The volume provides the reader an excellent overview of several projects in CLIR. It is well supported with references and is intended as a secondary text for researchers and practitioners. It highlights the need for a good, general tutorial introduction to the field."
Turner, J.M.: Cross-language transfer of indexing concepts for storage and retrieval of moving images : preliminary results (1996) 0.01
```
0.0052405605 = product of:
  0.026202802 = sum of:
    0.026202802 = weight(_text_:of in 7400) [ClassicSimilarity], result of:
      0.026202802 = score(doc=7400,freq=22.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.40111488 = fieldWeight in 7400, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7400)
  0.2 = coord(1/5)
```
Abstract

In previous research, participants who screened a videotape of stock footage from the National Film Board of Canada's stockshot collection were asked to assign terms in English that could be used for retrieval of each shot. The most popular terms were analyzed as potential indexing terms. In the current research a French language version of the research tapes was prepared, using the same images, and the data collected were in French. Compares the most popular terms identified in each of the 2 studies for each of the shots in order to determine the rate of correspondence between potential indexing terms in each language.

Source

Global complexity: information, chaos and control. Proceedings of the 59th Annual Meeting of the American Society for Information Science, ASIS'96, Baltimore, Maryland, 21-24 Oct 1996. Ed.: S. Hardin

Stegentritt, E.: Evaluationsresultate des mehrsprachigen Suchsystems CANAL/LS (1998) 0.00

0.0047777384 = product of:
  0.023888692 = sum of:
    0.023888692 = weight(_text_:of in 7216) [ClassicSimilarity], result of:
      0.023888692 = score(doc=7216,freq=14.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.36569026 = fieldWeight in 7216, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=7216)
  0.2 = coord(1/5)

Abstract: The search system CANAL/LS simplifies the searching of library catalogues by analyzing search questions linguistically and translating them if required. The linguistic analysis reduces the search question words to their basic forms so that they can be compared with basic title forms. Consequently all variants of words and parts of compounds in German can be found. Presents the results of an analysis of search questions in a catalogue of 45,000 titles in the field of psychology.

Stancikova, P.: International integrated database systems linked to multilingual thesauri covering the field of environment and agriculture (1996) 0.00

0.004691646 = product of:
  0.02345823 = sum of:
    0.02345823 = weight(_text_:of in 2825) [ClassicSimilarity], result of:
      0.02345823 = score(doc=2825,freq=6.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.3591007 = fieldWeight in 2825, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.09375 = fieldNorm(doc=2825)
  0.2 = coord(1/5)

Source: Compatibility and integration of order systems: Research Seminar Proceedings of the TIP/ISKO Meeting, Warsaw, 13-15 September 1995

Lonsdale, D.; Mitamura, T.; Nyberg, E.: Acquisition of large lexicons for practical knowledge-based MT (1994/95) 0.00
```
0.0044919094 = product of:
  0.022459546 = sum of:
    0.022459546 = weight(_text_:of in 7409) [ClassicSimilarity], result of:
      0.022459546 = score(doc=7409,freq=22.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.34381276 = fieldWeight in 7409, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=7409)
  0.2 = coord(1/5)
```
Abstract

Although knowledge based MT systems have the potential to achieve high translation accuracy, each successful application system requires a large amount of hand coded lexical knowledge. Systems like KBMT-89 and its descendants have demonstrated how knowledge based translation can produce good results in technical domains with tractable domain semantics. Nevertheless, the magnitude of the development task for large scale applications with tens of thousands of domain concepts precludes a purely hand crafted approach. The current challenge for the next generation of knowledge based MT systems is to utilize online textual resources and corpus analysis software in order to automate the most laborious aspects of the knowledge acquisition process. This partial automation can in turn maximize the productivity of human knowledge engineers and help to make large scale applications of knowledge based MT a viable approach. Discusses the corpus based knowledge acquisition methodology used in KANT, a knowledge based translation system for multilingual document production. This methodology can be generalized beyond the KANT interlingua approach for use with any system that requires similar kinds of knowledge.
Martinez Arellano, F.F.: Subject searching in online catalogs including Spanish and English material (1999) 0.00
```
0.0044919094 = product of:
  0.022459546 = sum of:
    0.022459546 = weight(_text_:of in 5350) [ClassicSimilarity], result of:
      0.022459546 = score(doc=5350,freq=22.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.34381276 = fieldWeight in 5350, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=5350)
  0.2 = coord(1/5)
```
Abstract

The use of title words, the combination of these through the use of logic operators, and the possibility of truncating them when carrying out subject searches, are some of the search options that have been incorporated into the online catalog. Several arguments in favor of these options have been expressed which state that they represent an approach for the use of natural language and that they facilitate information retrieval. However, there are also arguments against them, which support the necessity of using controlled language to obtain more precision in search results. This paper reports the main results from a study whose objective was to compare advantages and disadvantages of retrieval by keywords from the title and by subject headings included in the records of LIBRUNAM, an online catalog containing records for English and Spanish items at the National Autonomous University of Mexico.
