Search (55 results, page 1 of 3)

  • × year_i:[1990 TO 2000}
  • × theme_ss:"Multilinguale Probleme"
  1. Kunz, M.: Mehrsprachigkeit in der Sacherschließung (1998) 0.07
    0.06574899 = product of:
      0.4602429 = sum of:
        0.4602429 = weight(_text_:mehrsprachigkeit in 1247) [ClassicSimilarity], result of:
          0.4602429 = score(doc=1247,freq=2.0), product of:
            0.3070917 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03622214 = queryNorm
            1.4987148 = fieldWeight in 1247, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.125 = fieldNorm(doc=1247)
      0.14285715 = coord(1/7)
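
    Note on reading the score trees: the numbers in each tree can be re-derived by hand. The following is a minimal sketch, not the search engine's own code. It assumes the classic TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the values printed for result 1. The function names are hypothetical; queryNorm, fieldNorm and coord are copied from the tree above rather than computed.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed formula; it matches idf(docFreq=24, maxDocs=44218) = 8.478011 shown above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the "product of" lines of the tree.
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                    # idf * queryNorm
    field_weight = classic_tf(freq) * idf * field_norm # tf * idf * fieldNorm
    return query_weight * field_weight

# Values taken from result 1 (doc=1247, term "mehrsprachigkeit").
score = term_score(freq=2.0, doc_freq=24, max_docs=44218,
                   query_norm=0.03622214, field_norm=0.125)
final = score * (1 / 7)  # coord(1/7): one of seven query clauses matched

print(round(score, 4), round(final, 4))  # 0.4602 0.0657
```

    The final value rounds to the 0.07 displayed next to the title. The other trees follow the same pattern, with an extra coord factor at each nesting level.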
    
  2. Cao, L.; Leong, M.-K.; Low, H.-B.: Searching heterogeneous multilingual bibliographic sources (1998) 0.01
    0.008175202 = product of:
      0.028613206 = sum of:
        0.008982805 = product of:
          0.044914022 = sum of:
            0.044914022 = weight(_text_:system in 3564) [ClassicSimilarity], result of:
              0.044914022 = score(doc=3564,freq=4.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.3936941 = fieldWeight in 3564, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3564)
          0.2 = coord(1/5)
        0.0196304 = product of:
          0.0392608 = sum of:
            0.0392608 = weight(_text_:22 in 3564) [ClassicSimilarity], result of:
              0.0392608 = score(doc=3564,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.30952093 = fieldWeight in 3564, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3564)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Proposes a Web-based architecture for searching distributed heterogeneous multi-Asian language bibliographic sources, and describes a successful pilot implementation of the system at the Chinese Library (CLib) system developed in Singapore and tested at 2 university libraries and a public library
    Date
    1. 8.1996 22:08:06
  3. Picchi, E.; Peters, C.: Cross-language information retrieval : a system for comparable corpus querying (1998) 0.01
    0.007071401 = product of:
      0.049499806 = sum of:
        0.049499806 = product of:
          0.12374951 = sum of:
            0.076111 = weight(_text_:retrieval in 6305) [ClassicSimilarity], result of:
              0.076111 = score(doc=6305,freq=6.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.6946405 = fieldWeight in 6305, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6305)
            0.047638513 = weight(_text_:system in 6305) [ClassicSimilarity], result of:
              0.047638513 = score(doc=6305,freq=2.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.41757566 = fieldWeight in 6305, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6305)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Series
    The Kluwer International series on information retrieval
    Source
    Cross-language information retrieval. Ed.: G. Grefenstette
  4. Lassalle, E.: Text retrieval : from a monolingual system to a multilingual system (1993) 0.01
    0.0065628807 = product of:
      0.045940164 = sum of:
        0.045940164 = product of:
          0.11485041 = sum of:
            0.036250874 = weight(_text_:retrieval in 7403) [ClassicSimilarity], result of:
              0.036250874 = score(doc=7403,freq=4.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.33085006 = fieldWeight in 7403, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7403)
            0.078599535 = weight(_text_:system in 7403) [ClassicSimilarity], result of:
              0.078599535 = score(doc=7403,freq=16.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.68896466 = fieldWeight in 7403, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7403)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    Describes the TELMI monolingual text retrieval system and its future extension, a multilingual system. TELMI is designed for medium sized databases containing short texts. The characteristics of the system are fine-grained natural language processing (NLP); an open domain and a large scale knowledge base; automated indexing based on conceptual representation of texts and reusability of the NLP tools. Discusses the French MINITEL service, the MGS information service and the TELMI research system covering the full text system; NLP architecture; the lexical level; the syntactic level; the semantic level and an example of the use of a generic system
  5. Schubert, K.: Parameters for the design of an intermediate language for multilingual thesauri (1995) 0.01
    0.0064955507 = product of:
      0.022734426 = sum of:
        0.0055578267 = product of:
          0.027789133 = sum of:
            0.027789133 = weight(_text_:system in 2092) [ClassicSimilarity], result of:
              0.027789133 = score(doc=2092,freq=2.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.2435858 = fieldWeight in 2092, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2092)
          0.2 = coord(1/5)
        0.0171766 = product of:
          0.0343532 = sum of:
            0.0343532 = weight(_text_:22 in 2092) [ClassicSimilarity], result of:
              0.0343532 = score(doc=2092,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.2708308 = fieldWeight in 2092, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2092)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    The architecture of multilingual software systems is sometimes centred around an intermediate language. The question is analyzed to what extent this approach can be useful for multilingual thesauri, in particular regarding the functionality the thesaurus is designed to fulfil. Both the runtime use and the construction and maintenance of the system are taken into consideration. Using the perspective of general language technology makes it possible to draw on experience from a broader range of fields beyond thesaurus design itself, as well as to consider the possibility of using a thesaurus as a knowledge module in various systems which process natural language. Therefore the features which thesauri and other natural-language processing systems have in common are emphasized, especially at the level of systems design and their core functionality
    Source
    Knowledge organization. 22(1995) nos.3/4, pp. 136-140
  6. Ata, B.M.A.: SISDOM: a multilingual document retrieval system (1995) 0.01
    0.006042793 = product of:
      0.04229955 = sum of:
        0.04229955 = product of:
          0.10574888 = sum of:
            0.05074066 = weight(_text_:retrieval in 895) [ClassicSimilarity], result of:
              0.05074066 = score(doc=895,freq=6.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.46309367 = fieldWeight in 895, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0625 = fieldNorm(doc=895)
            0.055008218 = weight(_text_:system in 895) [ClassicSimilarity], result of:
              0.055008218 = score(doc=895,freq=6.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.48217484 = fieldWeight in 895, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0625 = fieldNorm(doc=895)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    The Malay language is widely used in Malaysia, Indonesia and Brunei. The growth in the number of documents written in Malay justifies the need for a document retrieval system for that language. Describes the implementation of a bilingual Malay and English full text document retrieval system: SIStem capaian DOkumen Multilingua (SISDOM), by the Kebangsaan University Malaysia. The system incorporates many facilities for users, including the choice of search techniques, browsing of retrieved documents, and ranking of documents
  7. Weihs, J.: Three tales of multilingual cataloguing (1998) 0.01
    0.0056086862 = product of:
      0.0392608 = sum of:
        0.0392608 = product of:
          0.0785216 = sum of:
            0.0785216 = weight(_text_:22 in 6063) [ClassicSimilarity], result of:
              0.0785216 = score(doc=6063,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.61904186 = fieldWeight in 6063, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=6063)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Date
    2. 8.2001 8:55:22
  8. Pollitt, A.S.; Ellis, G.: Multilingual access to document databases (1993) 0.00
    0.004099487 = product of:
      0.028696407 = sum of:
        0.028696407 = product of:
          0.071741015 = sum of:
            0.0380555 = weight(_text_:retrieval in 1302) [ClassicSimilarity], result of:
              0.0380555 = score(doc=1302,freq=6.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.34732026 = fieldWeight in 1302, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1302)
            0.033685513 = weight(_text_:system in 1302) [ClassicSimilarity], result of:
              0.033685513 = score(doc=1302,freq=4.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.29527056 = fieldWeight in 1302, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1302)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    This paper examines the reasons why approaches to facilitate document retrieval which apply AI (Artificial Intelligence) or Expert Systems techniques, relying on so-called "natural language" query statements from the end-user, will result in sub-optimal solutions. It does so by reflecting on the nature of language and the fundamental problems in document retrieval. Support is given to the work of thesaurus builders and indexers with illustrations of how their work may be utilised in a generally applicable computer-based document retrieval system using Multilingual MenUSE software. The EuroMenUSE interface providing multilingual document access to EPOQUE, the European Parliament's Online Query System, is described.
  9. Oard, D.W.: Alternative approaches for cross-language text retrieval (1997) 0.00
    0.0039593396 = product of:
      0.027715376 = sum of:
        0.027715376 = product of:
          0.06928844 = sum of:
            0.04963856 = weight(_text_:retrieval in 1164) [ClassicSimilarity], result of:
              0.04963856 = score(doc=1164,freq=30.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.45303512 = fieldWeight in 1164, product of:
                  5.477226 = tf(freq=30.0), with freq of:
                    30.0 = termFreq=30.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1164)
            0.019649884 = weight(_text_:system in 1164) [ClassicSimilarity], result of:
              0.019649884 = score(doc=1164,freq=4.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.17224117 = fieldWeight in 1164, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1164)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    The explosive growth of the Internet and other sources of networked information have made automatic mediation of access to networked information sources an increasingly important problem. Much of this information is expressed as electronic text, and it is becoming practical to automatically convert some printed documents and recorded speech to electronic text as well. Thus, automated systems capable of detecting useful documents are finding widespread application. With even a small number of languages it can be inconvenient to issue the same query repeatedly in every language, so users who are able to read more than one language will likely prefer a multilingual text retrieval system over a collection of monolingual systems. And since reading ability in a language does not always imply fluent writing ability in that language, such users will likely find cross-language text retrieval particularly useful for languages in which they are less confident of their ability to express their information needs effectively. The use of such systems can also be beneficial if the user is able to read only a single language. For example, when only a small portion of the document collection will ever be examined by the user, performing retrieval before translation can be significantly more economical than performing translation before retrieval. So when the application is sufficiently important to justify the time and effort required for translation, those costs can be minimized if an effective cross-language text retrieval system is available. Even when translation is not available, there are circumstances in which cross-language text retrieval could be useful to a monolingual user. For example, a researcher might find a paper published in an unfamiliar language useful if that paper contains references to works by the same author that are in the researcher's native language.
    Multilingual text retrieval can be defined as selection of useful documents from collections that may contain several languages (English, French, Chinese, etc.). This formulation allows for the possibility that individual documents might contain more than one language, a common occurrence in some applications. Both cross-language and within-language retrieval are included in this formulation, but it is the cross-language aspect of the problem which distinguishes multilingual text retrieval from its well studied monolingual counterpart. At the SIGIR 96 workshop on "Cross-Linguistic Information Retrieval" the participants discussed the proliferation of terminology being used to describe the field and settled on "Cross-Language" as the best single description of the salient aspect of the problem. "Multilingual" was felt to be too broad, since that term has also been used to describe systems able to perform within-language retrieval in more than one language but that lack any cross-language capability. "Cross-lingual" and "cross-linguistic" were felt to be equally good descriptions of the field, but "cross-language" was selected as the preferred term in the interest of standardization. Unfortunately, at about the same time the U.S. Defense Advanced Research Projects Agency (DARPA) introduced "translingual" as their preferred term, so we are still some distance from reaching consensus on this matter.
    I will not attempt to draw a sharp distinction between retrieval and filtering in this survey. Although my own work on adaptive cross-language text filtering has led me to make this distinction fairly carefully in other presentations (cf. (Oard 1997b)), such an approach does little to help understand the fundamental techniques which have been applied or the results that have been obtained in this case. Since it is still common to view filtering (detection of useful documents in dynamic document streams) as a kind of retrieval, I will simply adopt that perspective here.
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  10. Ferber, R.: Automated indexing with thesaurus descriptors : a co-occurrence based approach to multilingual retrieval (1997) 0.00
    0.0037481284 = product of:
      0.026236897 = sum of:
        0.026236897 = product of:
          0.065592244 = sum of:
            0.025893483 = weight(_text_:retrieval in 4144) [ClassicSimilarity], result of:
              0.025893483 = score(doc=4144,freq=4.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.23632148 = fieldWeight in 4144, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4144)
            0.03969876 = weight(_text_:system in 4144) [ClassicSimilarity], result of:
              0.03969876 = score(doc=4144,freq=8.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.3479797 = fieldWeight in 4144, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4144)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    Indexing documents with descriptors from a multilingual thesaurus is an approach to multilingual information retrieval. However, manual indexing is expensive. Automated indexing methods in general use terms found in the document. Thesaurus descriptors are complex terms that are often not used in documents or have specific meanings within the thesaurus; therefore most weighting schemes of automated indexing methods are not suited to select thesaurus descriptors. In this paper a linear associative system is described that uses similarity values extracted from a large corpus of manually indexed documents to construct a rank ordering of the descriptors for a given document title. The system is adaptive and has to be tuned with a training sample of records for the specific task. The system was tested on a corpus of some 80,000 bibliographic records. The results show a high variability with changing parameter values. This indicates that it is very important to empirically adapt the model to the specific situation it is used in. The overall median of the manually assigned descriptors in the automatically generated ranked list of all 3,631 descriptors is 14 for the set used to adapt the system and 11 for a test set not used in the optimization process. This result shows that the optimization is not a fitting to a specific training set but a real adaptation of the model to the setting
  11. Pollitt, A.S.; Ellis, G.P.; Smith, M.P.; Gregory, M.R.; Li, C.S.; Zangenberg, H.: ¬A common query interface for multilingual document retrieval from databases of the European Community Institutions (1993) 0.00
    0.003710458 = product of:
      0.025973205 = sum of:
        0.025973205 = product of:
          0.06493301 = sum of:
            0.025633242 = weight(_text_:retrieval in 7736) [ClassicSimilarity], result of:
              0.025633242 = score(doc=7736,freq=2.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.23394634 = fieldWeight in 7736, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7736)
            0.039299767 = weight(_text_:system in 7736) [ClassicSimilarity], result of:
              0.039299767 = score(doc=7736,freq=4.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.34448233 = fieldWeight in 7736, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7736)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    Presents EuroMenUSE, a PC-based front-end system developed to improve access to EPOQUE, the major document database of the European Parliament. EuroMenUSE is an exemplar and the first commercial product to result from the application of the Multilingual MenUSE software shell; in this system it uses the EUROVOC thesaurus. This Common Query interface replaces the Common Command Language and provides a more effective way for end-users to access document databases
  12. Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.00
    0.0035054288 = product of:
      0.024538001 = sum of:
        0.024538001 = product of:
          0.049076002 = sum of:
            0.049076002 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
              0.049076002 = score(doc=4157,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.38690117 = fieldWeight in 4157, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4157)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Source
    Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Ed. by Marlies Ockenfeld and Gerhard J. Mantwill
  13. Peters, C.; Picchi, E.: Across languages, across cultures : issues in multilinguality and digital libraries (1997) 0.00
    0.0034888082 = product of:
      0.024421657 = sum of:
        0.024421657 = product of:
          0.06105414 = sum of:
            0.029295133 = weight(_text_:retrieval in 1233) [ClassicSimilarity], result of:
              0.029295133 = score(doc=1233,freq=2.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.26736724 = fieldWeight in 1233, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1233)
            0.03175901 = weight(_text_:system in 1233) [ClassicSimilarity], result of:
              0.03175901 = score(doc=1233,freq=2.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.27838376 = fieldWeight in 1233, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1233)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    With the recent rapid diffusion over the international computer networks of world-wide distributed document bases, the question of multilingual access and multilingual information retrieval is becoming increasingly relevant. We briefly discuss just some of the issues that must be addressed in order to implement a multilingual interface for a Digital Library system and describe our own approach to this problem.
  14. Pearce, C.; Nicholas, C.: TELLTALE: Experiments in a dynamic hypertext environment for degraded and multilingual data (1996) 0.00
    0.003136654 = product of:
      0.021956576 = sum of:
        0.021956576 = product of:
          0.054891437 = sum of:
            0.03107218 = weight(_text_:retrieval in 4071) [ClassicSimilarity], result of:
              0.03107218 = score(doc=4071,freq=4.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.2835858 = fieldWeight in 4071, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4071)
            0.023819257 = weight(_text_:system in 4071) [ClassicSimilarity], result of:
              0.023819257 = score(doc=4071,freq=2.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.20878783 = fieldWeight in 4071, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4071)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    Methods and tools for finding documents relevant to a user's needs in a document corpus can be found in the information retrieval, library science, and hypertext communities. Typically, these systems provide retrieval capabilities for fairly static corpora, their algorithms are dependent on the language for which they are written, e.g. English, and they do not perform well when presented with misspelled words or text that has been degraded by OCR techniques. In this article, we present experimentation results for the TELLTALE system. TELLTALE is a dynamic hypertext environment that provides full-text search from a hypertext-style user interface for text corpora that may be garbled by OCR or transmission errors, and that may contain languages other than English. TELLTALE uses several techniques based on n-grams (n character sequences of text). With these results we show that the dynamic linkage mechanisms in TELLTALE are tolerant of garbles in up to 30% of the characters in the body of the texts
  15. Cross-language information retrieval (1998) 0.00
    0.002894546 = product of:
      0.02026182 = sum of:
        0.02026182 = product of:
          0.05065455 = sum of:
            0.03661892 = weight(_text_:retrieval in 6299) [ClassicSimilarity], result of:
              0.03661892 = score(doc=6299,freq=32.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.33420905 = fieldWeight in 6299, product of:
                  5.656854 = tf(freq=32.0), with freq of:
                    32.0 = termFreq=32.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=6299)
            0.014035632 = weight(_text_:system in 6299) [ClassicSimilarity], result of:
              0.014035632 = score(doc=6299,freq=4.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.12302941 = fieldWeight in 6299, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=6299)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Content
    Contains the contributions: GREFENSTETTE, G.: The Problem of Cross-Language Information Retrieval; DAVIS, M.W.: On the Effective Use of Large Parallel Corpora in Cross-Language Text Retrieval; BALLESTEROS, L. and W.B. CROFT: Statistical Methods for Cross-Language Information Retrieval; Distributed Cross-Lingual Information Retrieval; Automatic Cross-Language Information Retrieval Using Latent Semantic Indexing; EVANS, D.A. et al.: Mapping Vocabularies Using Latent Semantics; PICCHI, E. and C. PETERS: Cross-Language Information Retrieval: A System for Comparable Corpus Querying; YAMABANA, K. et al.: A Language Conversion Front-End for Cross-Language Information Retrieval; GACHOT, D.A. et al.: The Systran NLP Browser: An Application of Machine Translation Technology in Cross-Language Information Retrieval; HULL, D.: A Weighted Boolean Model for Cross-Language Text Retrieval; SHERIDAN, P. et al.: Building a Large Multilingual Test Collection from Comparable News Documents; OARD, D.W. and B.J. DORR: Evaluating Cross-Language Text Filtering Effectiveness
    Footnote
    Reviewed in: Machine translation review: 1999, no.10, pp.26-27 (D. Lewis): "Cross Language Information Retrieval (CLIR) addresses the growing need to access large volumes of data across language boundaries. The typical requirement is for the user to input a free form query, usually a brief description of a topic, into a search or retrieval engine which returns a list, in ranked order, of documents or web pages that are relevant to the topic. The search engine matches the terms in the query to indexed terms, usually keywords previously derived from the target documents. Unlike monolingual information retrieval, CLIR requires query terms in one language to be matched to indexed terms in another. Matching can be done by bilingual dictionary lookup, full machine translation, or by applying statistical methods. A query's success is measured in terms of recall (how many potentially relevant target documents are found) and precision (what proportion of documents found are relevant). Issues in CLIR are how to translate query terms into index terms, how to eliminate alternative translations (e.g. to decide that French 'traitement' in a query means 'treatment' and not 'salary'), and how to rank or weight translation alternatives that are retained (e.g. how to order the French terms 'aventure', 'business', 'affaire', and 'liaison' as relevant translations of English 'affair'). Grefenstette provides a lucid and useful overview of the field and the problems. The volume brings together a number of experiments and projects in CLIR. Mark Davis (New Mexico State University) describes Recuerdo, a Spanish retrieval engine which reduces translation ambiguities by scanning indexes for parallel texts; it also uses either a bilingual dictionary or direct equivalents from a parallel corpus in order to compare results for queries on parallel texts.
Lisa Ballesteros and Bruce Croft (University of Massachusetts) use a 'local feedback' technique which automatically enhances a query by adding extra terms to it both before and after translation; such terms can be derived from documents known to be relevant to the query.
    Christian Fluhr et al (DIST/SMTI, France) outline the EMIR (European Multilingual Information Retrieval) and ESPRIT projects. They found that using SYSTRAN to machine translate queries and to access material from various multilingual databases produced less relevant results than a method referred to as 'multilingual reformulation' (the mechanics of which are only hinted at). An interesting technique is Latent Semantic Indexing (LSI), described by Michael Littman et al (Brown University) and, most clearly, by David Evans et al (Carnegie Mellon University). LSI involves creating matrices of documents and the terms they contain and 'fitting' related documents into a reduced matrix space. This effectively allows queries to be mapped onto a common semantic representation of the documents. Eugenio Picchi and Carol Peters (Pisa) report on a procedure to create links between translation equivalents in an Italian-English parallel corpus. The links are used to construct parallel linguistic contexts in real-time for any term or combination of terms that is being searched for in either language. Their interest is primarily lexicographic but they plan to apply the same procedure to comparable corpora, i.e. to texts which are not translations of each other but which share the same domain. Kiyoshi Yamabana et al (NEC, Japan) address the issue of how to disambiguate between alternative translations of query terms. Their DMAX (double maximise) method looks at co-occurrence frequencies between both source language words and target language words in order to arrive at the most probable translation. The statistical data for the decision are derived, not from the translation texts but independently from monolingual corpora in each language. An interactive user interface allows the user to influence the selection of terms during the matching process.
Denis Gachot et al (SYSTRAN) describe the SYSTRAN NLP browser, a prototype tool which collects parsing information derived from a text or corpus previously translated with SYSTRAN. The user enters queries into the browser in either a structured or free form and receives grammatical and lexical information about the source text and/or its translation.
    The retrieved output from a query including the phrase 'big rockets' may be, for instance, a sentence containing 'giant rocket' which is semantically ranked above 'military rocket'. David Hull (Xerox Research Centre, Grenoble) describes an implementation of a weighted Boolean model for Spanish-English CLIR. Users construct Boolean-type queries, weighting each term in the query, which is then translated by an on-line dictionary before being applied to the database. Comparisons with the performance of unweighted free-form queries ('vector space' models) proved encouraging. Two contributions consider the evaluation of CLIR systems. In order to by-pass the time-consuming and expensive process of assembling a standard collection of documents and of user queries against which the performance of a CLIR system is manually assessed, Páraic Sheridan et al (ETH Zurich) propose a method based on retrieving 'seed documents'. This involves identifying a unique document in a database (the 'seed document') and, for a number of queries, measuring how fast it is retrieved. The authors have also assembled a large database of multilingual news documents for testing purposes. By storing the (fairly short) documents in a structured form tagged with descriptor codes (e.g. for topic, country and area), the test suite is easily expanded while remaining consistent for the purposes of testing. Douglas Oard and Bonnie Dorr (University of Maryland) describe an evaluation methodology which appears to apply LSI techniques in order to filter and rank incoming documents designed for testing CLIR systems. The volume provides the reader with an excellent overview of several projects in CLIR. It is well supported with references and is intended as a secondary text for researchers and practitioners. It highlights the need for a good, general tutorial introduction to the field."
    Series
    The Kluwer International series on information retrieval
  16. Timotin, A.: Multilingvism si tezaure de concepte (1994) 0.00
    0.0028043431 = product of:
      0.0196304 = sum of:
        0.0196304 = product of:
          0.0392608 = sum of:
            0.0392608 = weight(_text_:22 in 7887) [ClassicSimilarity], result of:
              0.0392608 = score(doc=7887,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.30952093 = fieldWeight in 7887, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7887)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Source
    Probleme de Informare si Documentare. 28(1994) no.1, pp.13-22
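
    The score tree for result 16 can be re-derived by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas that the printed values are consistent with: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function name classic_term_score is hypothetical and is not part of any library; queryNorm, fieldNorm and the coord factors are copied from the listing rather than recomputed.

```python
import math


def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Recompute one 'weight(...)' node of the explain tree (hypothetical helper).

    Assumes tf = sqrt(freq) and idf = 1 + ln(max_docs / (doc_freq + 1)),
    which reproduce the tf and idf values printed in the listing.
    """
    tf = math.sqrt(freq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    query_weight = idf * query_norm        # "queryWeight" line
    field_weight = tf * idf * field_norm   # "fieldWeight" line
    return query_weight * field_weight     # "score(doc=..., freq=...)" line


# Values taken from the tree for result 16 (term "22", doc 7887).
term_weight = classic_term_score(
    freq=2.0,
    doc_freq=3622,
    max_docs=44218,
    query_norm=0.03622214,
    field_norm=0.0625,
)

# Apply the two coordination factors shown in the tree: coord(1/2), then coord(1/7).
final_score = term_weight * (1 / 2) * (1 / 7)

print(f"term weight: {term_weight:.7f}")
print(f"final score: {final_score:.10f}")
```

    The listing reports 0.0392608 for the term weight and 0.0028043431 for the final score; the sketch uses double precision, so the last digits may differ slightly from the single-precision values shown.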
  17. Heinzelin, D. de; d'Hautcourt, F.; Pols, R.: Un nouveaux thesaurus multilingue informatise relatif aux instruments de musique (1998) 0.00
    0.0028043431 = product of:
      0.0196304 = sum of:
        0.0196304 = product of:
          0.0392608 = sum of:
            0.0392608 = weight(_text_:22 in 932) [ClassicSimilarity], result of:
              0.0392608 = score(doc=932,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.30952093 = fieldWeight in 932, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=932)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Date
    1. 8.1996 22:01:00
  18. Multilingual information management : current levels and future abilities. A report Commissioned by the US National Science Foundation and also delivered to the European Commission's Language Engineering Office and the US Defense Advanced Research Projects Agency, April 1999 (1999) 0.00
    0.0027789979 = product of:
      0.019452984 = sum of:
        0.019452984 = product of:
          0.048632458 = sum of:
            0.032752953 = weight(_text_:retrieval in 6068) [ClassicSimilarity], result of:
              0.032752953 = score(doc=6068,freq=10.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.29892567 = fieldWeight in 6068, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03125 = fieldNorm(doc=6068)
            0.015879504 = weight(_text_:system in 6068) [ClassicSimilarity], result of:
              0.015879504 = score(doc=6068,freq=2.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.13919188 = fieldWeight in 6068, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03125 = fieldNorm(doc=6068)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    Over the past 50 years, a variety of language-related capabilities has been developed in machine translation, information retrieval, speech recognition, text summarization, and so on. These applications rest upon a set of core techniques such as language modeling, information extraction, parsing, generation, and multimedia planning and integration; and they involve methods using statistics, rules, grammars, lexicons, ontologies, training techniques, and so on. It is a puzzling fact that although all of this work deals with language in some form or other, the major applications have each developed a separate research field. For example, there is no reason why speech recognition techniques involving n-grams and hidden Markov models could not have been used in machine translation 15 years earlier than they were, or why some of the lexical and semantic insights from the subarea called Computational Linguistics are still not used in information retrieval.
    This picture will rapidly change. The twin challenges of massive information overload via the web and ubiquitous computers present us with an unavoidable task: developing techniques to handle multilingual and multi-modal information robustly and efficiently, with as high quality performance as possible. The most effective way for us to address such a mammoth task, and to ensure that our various techniques and applications fit together, is to start talking across the artificial research boundaries. Extending the current technologies will require integrating the various capabilities into multi-functional and multi-lingual natural language systems. However, at this time there is no clear vision of how these technologies could or should be assembled into a coherent framework. What would be involved in connecting a speech recognition system to an information retrieval engine, and then using machine translation and summarization software to process the retrieved text? How can traditional parsing and generation be enhanced with statistical techniques? What would be the effect of carefully crafted lexicons on traditional information retrieval? At which points should machine translation be interleaved within information retrieval systems to enable multilingual processing?
  19. Grefenstette, G.: The problem of cross-language information retrieval (1998) 0.00
    0.0021746 = product of:
      0.015222199 = sum of:
        0.015222199 = product of:
          0.076111 = sum of:
            0.076111 = weight(_text_:retrieval in 6301) [ClassicSimilarity], result of:
              0.076111 = score(doc=6301,freq=6.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.6946405 = fieldWeight in 6301, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6301)
          0.2 = coord(1/5)
      0.14285715 = coord(1/7)
    
    Series
    The Kluwer International series on information retrieval
    Source
    Cross-language information retrieval. Ed.: G. Grefenstette
  20. Davis, M.W.: On the effective use of large parallel corpora in cross-language text retrieval (1998) 0.00
    0.0021746 = product of:
      0.015222199 = sum of:
        0.015222199 = product of:
          0.076111 = sum of:
            0.076111 = weight(_text_:retrieval in 6302) [ClassicSimilarity], result of:
              0.076111 = score(doc=6302,freq=6.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.6946405 = fieldWeight in 6302, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6302)
          0.2 = coord(1/5)
      0.14285715 = coord(1/7)
    
    Series
    The Kluwer International series on information retrieval
    Source
    Cross-language information retrieval. Ed.: G. Grefenstette

Languages

  • e 48
  • d 3
  • ro 2
  • f 1
  • sp 1

Types

  • a 51
  • el 6
  • r 2
  • m 1
  • s 1