Search (55 results, page 2 of 3)

  • × theme_ss:"Multilinguale Probleme"
  • × year_i:[1990 TO 2000}
  1. Davis, M.; Dunning, T.: A TREC evaluation of query translation methods for multi-lingual text retrieval (1996) 0.01
    0.009726323 = product of:
      0.02917897 = sum of:
        0.02917897 = product of:
          0.08753691 = sum of:
            0.08753691 = weight(_text_:retrieval in 1917) [ClassicSimilarity], result of:
              0.08753691 = score(doc=1917,freq=4.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.5671716 = fieldWeight in 1917, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.09375 = fieldNorm(doc=1917)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    The Fourth Text Retrieval Conference (TREC-4). Ed.: K. Harman
  2. Davis, M.: New experiments in cross-language text retrieval at NMSU's computing research lab (1997) 0.01
    0.009726323 = product of:
      0.02917897 = sum of:
        0.02917897 = product of:
          0.08753691 = sum of:
            0.08753691 = weight(_text_:retrieval in 3111) [ClassicSimilarity], result of:
              0.08753691 = score(doc=3111,freq=4.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.5671716 = fieldWeight in 3111, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.09375 = fieldNorm(doc=3111)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    The Fifth Text Retrieval Conference (TREC-5). Ed.: E.M. Voorhees u. D.K. Harman
  3. Sheridan, P.; Ballerini, J.P.; Schäuble, P.: Building a large multilingual test collection from comparable news documents (1998) 0.01
    0.009726323 = product of:
      0.02917897 = sum of:
        0.02917897 = product of:
          0.08753691 = sum of:
            0.08753691 = weight(_text_:retrieval in 6298) [ClassicSimilarity], result of:
              0.08753691 = score(doc=6298,freq=4.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.5671716 = fieldWeight in 6298, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6298)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Series
    The Kluwer International series on information retrieval
    Source
    Cross-language information retrieval. Ed.: G. Grefenstette
  4. Evans, D.A.; Handerson, S.K.; Monarch, I.A.; Pereiro, J.; Delon, L.; Hersch, W.R.: Mapping vocabularies using latent semantics (1998) 0.01
    0.009726323 = product of:
      0.02917897 = sum of:
        0.02917897 = product of:
          0.08753691 = sum of:
            0.08753691 = weight(_text_:retrieval in 6304) [ClassicSimilarity], result of:
              0.08753691 = score(doc=6304,freq=4.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.5671716 = fieldWeight in 6304, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6304)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Series
    The Kluwer International series on information retrieval
    Source
    Cross-language information retrieval. Ed.: G. Grefenstette
  5. Oard, D.W.; Resnik, P.: Support for interactive document selection in cross-language information retrieval (1999) 0.01
    0.008023808 = product of:
      0.024071421 = sum of:
        0.024071421 = product of:
          0.07221426 = sum of:
            0.07221426 = weight(_text_:retrieval in 5938) [ClassicSimilarity], result of:
              0.07221426 = score(doc=5938,freq=2.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.46789268 = fieldWeight in 5938, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.109375 = fieldNorm(doc=5938)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
  6. Oard, D.W.; Diekema, A.R.: Cross-language information retrieval (1999) 0.01
    0.008023808 = product of:
      0.024071421 = sum of:
        0.024071421 = product of:
          0.07221426 = sum of:
            0.07221426 = weight(_text_:retrieval in 4690) [ClassicSimilarity], result of:
              0.07221426 = score(doc=4690,freq=2.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.46789268 = fieldWeight in 4690, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4690)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
  7. Gonzalo, J.; Verdejo, F.; Peters, C.; Calzolari, N.: Applying EuroWordNet to cross-language text retrieval (1998) 0.01
    0.008023808 = product of:
      0.024071421 = sum of:
        0.024071421 = product of:
          0.07221426 = sum of:
            0.07221426 = weight(_text_:retrieval in 6445) [ClassicSimilarity], result of:
              0.07221426 = score(doc=6445,freq=2.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.46789268 = fieldWeight in 6445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6445)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
  8. Ata, B.M.A.: SISDOM: a multilingual document retrieval system (1995) 0.01
    0.00794151 = product of:
      0.023824528 = sum of:
        0.023824528 = product of:
          0.07147358 = sum of:
            0.07147358 = weight(_text_:retrieval in 895) [ClassicSimilarity], result of:
              0.07147358 = score(doc=895,freq=6.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.46309367 = fieldWeight in 895, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0625 = fieldNorm(doc=895)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    The Malay language is widely used in Malaysia, Indonesia and Brunei. The growth in the number of documents written in Malay justifies the need for a document retrieval system for that language. Describes the implementation of a bilingual Malay and English full text document retrieval system: SIStem capaian DOkumen Multilingua (SISDOM), by the Kebangsaan University Malaysia. The system incorporates many facilities for users, including the choice of search techniques, browsing of retrieved documents, and ranking of documents.
  9. Oard, D.W.: Alternative approaches for cross-language text retrieval (1997) 0.01
    0.007769018 = product of:
      0.023307053 = sum of:
        0.023307053 = product of:
          0.06992116 = sum of:
            0.06992116 = weight(_text_:retrieval in 1164) [ClassicSimilarity], result of:
              0.06992116 = score(doc=1164,freq=30.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.45303512 = fieldWeight in 1164, product of:
                  5.477226 = tf(freq=30.0), with freq of:
                    30.0 = termFreq=30.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1164)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    The explosive growth of the Internet and other sources of networked information has made automatic mediation of access to networked information sources an increasingly important problem. Much of this information is expressed as electronic text, and it is becoming practical to automatically convert some printed documents and recorded speech to electronic text as well. Thus, automated systems capable of detecting useful documents are finding widespread application. With even a small number of languages it can be inconvenient to issue the same query repeatedly in every language, so users who are able to read more than one language will likely prefer a multilingual text retrieval system over a collection of monolingual systems. And since reading ability in a language does not always imply fluent writing ability in that language, such users will likely find cross-language text retrieval particularly useful for languages in which they are less confident of their ability to express their information needs effectively. The use of such systems can also be beneficial if the user is able to read only a single language. For example, when only a small portion of the document collection will ever be examined by the user, performing retrieval before translation can be significantly more economical than performing translation before retrieval. So when the application is sufficiently important to justify the time and effort required for translation, those costs can be minimized if an effective cross-language text retrieval system is available. Even when translation is not available, there are circumstances in which cross-language text retrieval could be useful to a monolingual user. For example, a researcher might find a paper published in an unfamiliar language useful if that paper contains references to works by the same author that are in the researcher's native language.
    Multilingual text retrieval can be defined as selection of useful documents from collections that may contain several languages (English, French, Chinese, etc.). This formulation allows for the possibility that individual documents might contain more than one language, a common occurrence in some applications. Both cross-language and within-language retrieval are included in this formulation, but it is the cross-language aspect of the problem which distinguishes multilingual text retrieval from its well-studied monolingual counterpart. At the SIGIR 96 workshop on "Cross-Linguistic Information Retrieval" the participants discussed the proliferation of terminology being used to describe the field and settled on "Cross-Language" as the best single description of the salient aspect of the problem. "Multilingual" was felt to be too broad, since that term has also been used to describe systems able to perform within-language retrieval in more than one language but that lack any cross-language capability. "Cross-lingual" and "cross-linguistic" were felt to be equally good descriptions of the field, but "cross-language" was selected as the preferred term in the interest of standardization. Unfortunately, at about the same time the U.S. Defense Advanced Research Projects Agency (DARPA) introduced "translingual" as their preferred term, so we are still some distance from reaching consensus on this matter.
    I will not attempt to draw a sharp distinction between retrieval and filtering in this survey. Although my own work on adaptive cross-language text filtering has led me to make this distinction fairly carefully in other presentations (cf. Oard 1997b), such an approach does little to help understand the fundamental techniques which have been applied or the results that have been obtained in this case. Since it is still common to view filtering (detection of useful documents in dynamic document streams) as a kind of retrieval, I will simply adopt that perspective here.
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  10. Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.01
    0.007680971 = product of:
      0.023042914 = sum of:
        0.023042914 = product of:
          0.06912874 = sum of:
            0.06912874 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
              0.06912874 = score(doc=4157,freq=2.0), product of:
                0.17867287 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051022716 = queryNorm
                0.38690117 = fieldWeight in 4157, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4157)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill
  11. Hainebach, R.: European Community databases : a subject analysis (1992) 0.01
    0.006994778 = product of:
      0.020984333 = sum of:
        0.020984333 = product of:
          0.062952995 = sum of:
            0.062952995 = weight(_text_:online in 7402) [ClassicSimilarity], result of:
              0.062952995 = score(doc=7402,freq=6.0), product of:
                0.1548489 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.051022716 = queryNorm
                0.4065447 = fieldWeight in 7402, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7402)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    With the introduction of the single market, more and more European Community information databases are becoming available either online or on CD-ROM. Some databases are full text but many are bibliographic. Users may access them by free text or through controlled descriptors, but ideally they should be able to search through 1 or more subject access points. Each of the databases uses a different method and the different subject access methods, employing thesauri or classification schemes, are examined. Proposes a solution to the problem of multiple thesauri and multilingualism.
    Source
    Online information 92. Proc. of the 16th Int. Online Information Meeting, London, 8-10.12.1992. Ed. by David I. Raitt
  12. Multi-sprachige, multi-charakter Fragestellungen für die Online-Umgebung : Vorträge, von einer Arbeitsgruppe von der IFLA Sektion für Katalogisierung finanziert, Istanbul, Türkei, 24.8.1999 (1998) 0.01
    0.0069230343 = product of:
      0.020769102 = sum of:
        0.020769102 = product of:
          0.062307306 = sum of:
            0.062307306 = weight(_text_:online in 4185) [ClassicSimilarity], result of:
              0.062307306 = score(doc=4185,freq=2.0), product of:
                0.1548489 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.051022716 = queryNorm
                0.40237486 = fieldWeight in 4185, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.09375 = fieldNorm(doc=4185)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
  13. Hlava, M.M.K.: Machine-Aided Indexing (MAI) in a multilingual environment (1992) 0.01
    0.0065270998 = product of:
      0.0195813 = sum of:
        0.0195813 = product of:
          0.058743894 = sum of:
            0.058743894 = weight(_text_:online in 2378) [ClassicSimilarity], result of:
              0.058743894 = score(doc=2378,freq=4.0), product of:
                0.1548489 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.051022716 = queryNorm
                0.37936267 = fieldWeight in 2378, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2378)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Online information 92. Proc. of the 16th Int. Online Information Meeting, London, 8-10.12.1992. Ed. by David I. Raitt
  14. Zimmermann, H.H.: Überlegungen zu einem multilingualen Thesaurus-Konzept (1995) 0.01
    0.0064842156 = product of:
      0.019452646 = sum of:
        0.019452646 = product of:
          0.058357935 = sum of:
            0.058357935 = weight(_text_:retrieval in 2076) [ClassicSimilarity], result of:
              0.058357935 = score(doc=2076,freq=4.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.37811437 = fieldWeight in 2076, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2076)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    The thesaurus topic is first placed in the context of the overall indexing and retrieval capabilities of an information retrieval system. On this basis, a multilingual thesaurus concept is developed. Key elements are: enabling access through the user's own vocabulary, a systematic and transparent differentiation of meanings, and a basic set of relations anchored in a single ("designated") natural language.
    Source
    Konstruktion und Retrieval von Wissen: 3. Tagung der Deutschen ISKO-Sektion einschließlich der Vorträge des Workshops "Thesauri als terminologische Lexika", Weilburg, 27.-29.10.1993. Hrsg.: N. Meder u.a
  15. Timotin, A.: Multilingvism si tezaure de concepte (1994) 0.01
    0.006144777 = product of:
      0.01843433 = sum of:
        0.01843433 = product of:
          0.055302992 = sum of:
            0.055302992 = weight(_text_:22 in 7887) [ClassicSimilarity], result of:
              0.055302992 = score(doc=7887,freq=2.0), product of:
                0.17867287 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051022716 = queryNorm
                0.30952093 = fieldWeight in 7887, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7887)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Probleme de Informare si Documentare. 28(1994) no.1, S.13-22
  16. Cao, L.; Leong, M.-K.; Low, H.-B.: Searching heterogeneous multilingual bibliographic sources (1998) 0.01
    0.006144777 = product of:
      0.01843433 = sum of:
        0.01843433 = product of:
          0.055302992 = sum of:
            0.055302992 = weight(_text_:22 in 3564) [ClassicSimilarity], result of:
              0.055302992 = score(doc=3564,freq=2.0), product of:
                0.17867287 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051022716 = queryNorm
                0.30952093 = fieldWeight in 3564, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3564)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    1. 8.1996 22:08:06
  17. Heinzelin, D. de; d'Hautcourt, F.; Pols, R.: Un nouveaux thesaurus multilingue informatise relatif aux instruments de musique (1998) 0.01
    0.006144777 = product of:
      0.01843433 = sum of:
        0.01843433 = product of:
          0.055302992 = sum of:
            0.055302992 = weight(_text_:22 in 932) [ClassicSimilarity], result of:
              0.055302992 = score(doc=932,freq=2.0), product of:
                0.17867287 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051022716 = queryNorm
                0.30952093 = fieldWeight in 932, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=932)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Date
    1. 8.1996 22:01:00
  18. Lehtinen, R.; Clavel-Merrin, G.: Mehrsprachige und verschiedenartige Daten in Bibliothekssystemen und -netzen : Erfahrungen und Perspektiven aus der Schweiz und Finnland (1998) 0.01
    0.005769195 = product of:
      0.017307585 = sum of:
        0.017307585 = product of:
          0.051922753 = sum of:
            0.051922753 = weight(_text_:online in 4186) [ClassicSimilarity], result of:
              0.051922753 = score(doc=4186,freq=2.0), product of:
                0.1548489 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.051022716 = queryNorm
                0.33531237 = fieldWeight in 4186, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4186)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Source
    Multi-sprachige, multi-charakter Fragestellungen für die Online-Umgebung: Vorträge, von einer Arbeitsgruppe von der IFLA Sektion für Katalogisierung finanziert, Istanbul, Türkei, 24.8.1999. Ed.: J.D. Byrum u. O. Madison
  19. Cross-language information retrieval (1998) 0.01
    0.005731291 = product of:
      0.017193872 = sum of:
        0.017193872 = product of:
          0.051581617 = sum of:
            0.051581617 = weight(_text_:retrieval in 6299) [ClassicSimilarity], result of:
              0.051581617 = score(doc=6299,freq=32.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.33420905 = fieldWeight in 6299, product of:
                  5.656854 = tf(freq=32.0), with freq of:
                    32.0 = termFreq=32.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=6299)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Content
    Contains the contributions: GREFENSTETTE, G.: The Problem of Cross-Language Information Retrieval; DAVIS, M.W.: On the Effective Use of Large Parallel Corpora in Cross-Language Text Retrieval; BALLESTEROS, L. and W.B. CROFT: Statistical Methods for Cross-Language Information Retrieval; Distributed Cross-Lingual Information Retrieval; Automatic Cross-Language Information Retrieval Using Latent Semantic Indexing; EVANS, D.A. et al.: Mapping Vocabularies Using Latent Semantics; PICCHI, E. and C. PETERS: Cross-Language Information Retrieval: A System for Comparable Corpus Querying; YAMABANA, K. et al.: A Language Conversion Front-End for Cross-Language Information Retrieval; GACHOT, D.A. et al.: The Systran NLP Browser: An Application of Machine Translation Technology in Cross-Language Information Retrieval; HULL, D.: A Weighted Boolean Model for Cross-Language Text Retrieval; SHERIDAN, P. et al.: Building a Large Multilingual Test Collection from Comparable News Documents; OARD, D.W. and B.J. DORR: Evaluating Cross-Language Text Filtering Effectiveness
    Footnote
    Review in: Machine translation review: 1999, no.10, S.26-27 (D. Lewis): "Cross Language Information Retrieval (CLIR) addresses the growing need to access large volumes of data across language boundaries. The typical requirement is for the user to input a free form query, usually a brief description of a topic, into a search or retrieval engine which returns a list, in ranked order, of documents or web pages that are relevant to the topic. The search engine matches the terms in the query to indexed terms, usually keywords previously derived from the target documents. Unlike monolingual information retrieval, CLIR requires query terms in one language to be matched to indexed terms in another. Matching can be done by bilingual dictionary lookup, full machine translation, or by applying statistical methods. A query's success is measured in terms of recall (how many potentially relevant target documents are found) and precision (what proportion of documents found are relevant). Issues in CLIR are how to translate query terms into index terms, how to eliminate alternative translations (e.g. to decide that French 'traitement' in a query means 'treatment' and not 'salary'), and how to rank or weight translation alternatives that are retained (e.g. how to order the French terms 'aventure', 'business', 'affaire', and 'liaison' as relevant translations of English 'affair'). Grefenstette provides a lucid and useful overview of the field and the problems. The volume brings together a number of experiments and projects in CLIR. Mark Davis (New Mexico State University) describes Recuerdo, a Spanish retrieval engine which reduces translation ambiguities by scanning indexes for parallel texts; it also uses either a bilingual dictionary or direct equivalents from a parallel corpus in order to compare results for queries on parallel texts. Lisa Ballesteros and Bruce Croft (University of Massachusetts) use a 'local feedback' technique which automatically enhances a query by adding extra terms to it both before and after translation; such terms can be derived from documents known to be relevant to the query.
    Christian Fluhr et al (DIST/SMTI, France) outline the EMIR (European Multilingual Information Retrieval) and ESPRIT projects. They found that using SYSTRAN to machine translate queries and to access material from various multilingual databases produced less relevant results than a method referred to as 'multilingual reformulation' (the mechanics of which are only hinted at). An interesting technique is Latent Semantic Indexing (LSI), described by Michael Littman et al (Brown University) and, most clearly, by David Evans et al (Carnegie Mellon University). LSI involves creating matrices of documents and the terms they contain and 'fitting' related documents into a reduced matrix space. This effectively allows queries to be mapped onto a common semantic representation of the documents. Eugenio Picchi and Carol Peters (Pisa) report on a procedure to create links between translation equivalents in an Italian-English parallel corpus. The links are used to construct parallel linguistic contexts in real time for any term or combination of terms that is being searched for in either language. Their interest is primarily lexicographic but they plan to apply the same procedure to comparable corpora, i.e. to texts which are not translations of each other but which share the same domain. Kiyoshi Yamabana et al (NEC, Japan) address the issue of how to disambiguate between alternative translations of query terms. Their DMAX (double maximise) method looks at co-occurrence frequencies between both source language words and target language words in order to arrive at the most probable translation. The statistical data for the decision are derived, not from the translation texts but independently from monolingual corpora in each language. An interactive user interface allows the user to influence the selection of terms during the matching process. Denis Gachot et al (SYSTRAN) describe the SYSTRAN NLP browser, a prototype tool which collects parsing information derived from a text or corpus previously translated with SYSTRAN. The user enters queries into the browser in either a structured or free form and receives grammatical and lexical information about the source text and/or its translation.
    Series
    The Kluwer International series on information retrieval
  20. Turner, J.M.: Cross-language transfer of indexing concepts for storage and retrieval of moving images : preliminary results (1996) 0.01
    0.005673688 = product of:
      0.017021064 = sum of:
        0.017021064 = product of:
          0.05106319 = sum of:
            0.05106319 = weight(_text_:retrieval in 7400) [ClassicSimilarity], result of:
              0.05106319 = score(doc=7400,freq=4.0), product of:
                0.15433937 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.051022716 = queryNorm
                0.33085006 = fieldWeight in 7400, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7400)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Abstract
    In previous research, participants who screened a videotape of stock footage from the National Film Board of Canada's stockshot collection were asked to assign terms in English that could be used for retrieval of each shot. The most popular terms were analyzed as potential indexing terms. In the current research a French language version of the research tapes was prepared, using the same images, and the data collected were in French. Compares the most popular terms identified in each of the 2 studies for each of the shots in order to determine the rate of correspondence between potential indexing terms in each language.
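
Note on the score breakdowns: the nested explain trees listed with each record can be reproduced by hand. The sketch below is a minimal illustration, not the search engine's own code. It assumes the ClassicSimilarity-style formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which are consistent with the idf, tf, queryWeight and fieldWeight values printed above. The function names classic_idf and classic_score are hypothetical; queryNorm, fieldNorm and the two coord(1/3) factors are copied from the listing rather than derived.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed idf formula; it reproduces the listed 3.024915
    # for docFreq=5836, maxDocs=44218.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=()):
    # tf is the square root of the raw term frequency (2.0 for freq=4.0).
    tf = math.sqrt(freq)
    idf = classic_idf(doc_freq, max_docs)
    # queryWeight = idf * queryNorm
    query_weight = idf * query_norm
    # fieldWeight = tf * idf * fieldNorm
    field_weight = tf * idf * field_norm
    score = query_weight * field_weight
    # Each coord factor (here 1/3, applied twice) scales the score down.
    for c in coords:
        score *= c
    return score


# Values taken from the explain tree of record 1 (doc=1917, term "retrieval").
score_1917 = classic_score(
    freq=4.0,
    doc_freq=5836,
    max_docs=44218,
    query_norm=0.051022716,
    field_norm=0.09375,
    coords=(1 / 3, 1 / 3),
)
print(f"{score_1917:.6f}")
```

The result should agree with the listed 0.009726323 up to floating-point rounding, since the engine reports single-precision values. Swapping in freq=30.0 and fieldNorm=0.02734375 gives the value listed for doc=1164 in the same way, which shows why a record with 30 occurrences can still rank below one with 4: the smaller fieldNorm of a long field outweighs the larger tf.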

Languages

  • e 47
  • d 4
  • ro 2
  • f 1
  • sp 1

Types

  • a 50
  • el 5
  • s 2
  • m 1
  • r 1
  • x 1