Search (19 results, page 1 of 1)

Oard, D.W.: Serving users in many languages : cross-language information retrieval for digital libraries (1997) 0.06

0.056203403 = product of:
  0.13114128 = sum of:
    0.03718255 = weight(_text_:processing in 1261) [ClassicSimilarity], result of:
      0.03718255 = score(doc=1261,freq=2.0), product of:
        0.1662677 = queryWeight, product of:
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.04107254 = queryNorm
        0.22363065 = fieldWeight in 1261, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1261)
    0.04992717 = weight(_text_:digital in 1261) [ClassicSimilarity], result of:
      0.04992717 = score(doc=1261,freq=4.0), product of:
        0.16201277 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.04107254 = queryNorm
        0.3081681 = fieldWeight in 1261, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1261)
    0.044031553 = weight(_text_:techniques in 1261) [ClassicSimilarity], result of:
      0.044031553 = score(doc=1261,freq=2.0), product of:
        0.18093403 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.04107254 = queryNorm
        0.24335694 = fieldWeight in 1261, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1261)
  0.42857143 = coord(3/7)
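
The explanation tree above can be reproduced by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function and variable names are illustrative only, and queryNorm and fieldNorm are copied from the listing rather than derived.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    tf = math.sqrt(freq)                         # tf(freq) = sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm         # queryWeight = idf * queryNorm
    field_weight = tf * term_idf * field_norm    # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explanation for doc 1261.
MAX_DOCS = 44218
QUERY_NORM = 0.04107254
FIELD_NORM = 0.0390625

# (termFreq, docFreq) for "processing", "digital", "techniques"
matched_terms = [(2.0, 2097), (4.0, 2326), (2.0, 1467)]

total = sum(term_score(f, df, MAX_DOCS, QUERY_NORM, FIELD_NORM)
            for f, df in matched_terms)
score = total * (3 / 7)   # coord(3/7): 3 of 7 query clauses matched
print(round(score, 6))
```

The result agrees with the listed 0.056203403 up to the single-precision rounding used in the listing.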

Abstract: We are rapidly constructing an extensive network infrastructure for moving information across national boundaries, but much remains to be done before linguistic barriers can be surmounted as effectively as geographic ones. Users seeking information from a digital library could benefit from the ability to query large collections once using a single language, even when more than one language is present in the collection. If the information they locate is not available in a language that they can read, some form of translation will be needed. At present, multilingual thesauri such as EUROVOC help to address this challenge by facilitating controlled vocabulary search using terms from several languages, and services such as INSPEC produce English abstracts for documents in other languages. On the other hand, support for free text searching across languages is not yet widely deployed, and fully automatic machine translation is presently neither sufficiently fast nor sufficiently accurate to adequately support interactive cross-language information seeking. An active and rapidly growing research community has coalesced around these and other related issues, applying techniques drawn from several fields - notably information retrieval and natural language processing - to provide access to large multilingual collections.

Multilingual information management : current levels and future abilities. A report Commissioned by the US National Science Foundation and also delivered to the European Commission's Language Engineering Office and the US Defense Advanced Research Projects Agency, April 1999 (1999) 0.03
```
0.033151403 = product of:
  0.1160299 = sum of:
    0.02974604 = weight(_text_:processing in 6068) [ClassicSimilarity], result of:
      0.02974604 = score(doc=6068,freq=2.0), product of:
        0.1662677 = queryWeight, product of:
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.04107254 = queryNorm
        0.17890452 = fieldWeight in 6068, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.03125 = fieldNorm(doc=6068)
    0.08628386 = weight(_text_:techniques in 6068) [ClassicSimilarity], result of:
      0.08628386 = score(doc=6068,freq=12.0), product of:
        0.18093403 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.04107254 = queryNorm
        0.47688022 = fieldWeight in 6068, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.03125 = fieldNorm(doc=6068)
  0.2857143 = coord(2/7)
```
Abstract

Over the past 50 years, a variety of language-related capabilities has been developed in machine translation, information retrieval, speech recognition, text summarization, and so on. These applications rest upon a set of core techniques such as language modeling, information extraction, parsing, generation, and multimedia planning and integration; and they involve methods using statistics, rules, grammars, lexicons, ontologies, training techniques, and so on. It is a puzzling fact that although all of this work deals with language in some form or other, the major applications have each developed a separate research field. For example, there is no reason why speech recognition techniques involving n-grams and hidden Markov models could not have been used in machine translation 15 years earlier than they were, or why some of the lexical and semantic insights from the subarea called Computational Linguistics are still not used in information retrieval.
This picture will rapidly change. The twin challenges of massive information overload via the web and ubiquitous computers present us with an unavoidable task: developing techniques to handle multilingual and multi-modal information robustly and efficiently, with as high quality performance as possible. The most effective way for us to address such a mammoth task, and to ensure that our various techniques and applications fit together, is to start talking across the artificial research boundaries. Extending the current technologies will require integrating the various capabilities into multi-functional and multi-lingual natural language systems. However, at this time there is no clear vision of how these technologies could or should be assembled into a coherent framework. What would be involved in connecting a speech recognition system to an information retrieval engine, and then using machine translation and summarization software to process the retrieved text? How can traditional parsing and generation be enhanced with statistical techniques? What would be the effect of carefully crafted lexicons on traditional information retrieval? At which points should machine translation be interleaved within information retrieval systems to enable multilingual processing?

Pollitt, A.S.; Ellis, G.: Multilingual access to document databases (1993) 0.03

0.027844835 = product of:
  0.09745692 = sum of:
    0.04461906 = weight(_text_:processing in 1302) [ClassicSimilarity], result of:
      0.04461906 = score(doc=1302,freq=2.0), product of:
        0.1662677 = queryWeight, product of:
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.04107254 = queryNorm
        0.26835677 = fieldWeight in 1302, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.046875 = fieldNorm(doc=1302)
    0.052837856 = weight(_text_:techniques in 1302) [ClassicSimilarity], result of:
      0.052837856 = score(doc=1302,freq=2.0), product of:
        0.18093403 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.04107254 = queryNorm
        0.2920283 = fieldWeight in 1302, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.046875 = fieldNorm(doc=1302)
  0.2857143 = coord(2/7)

Abstract: This paper examines the reasons why approaches to facilitate document retrieval which apply AI (Artificial Intelligence) or Expert Systems techniques, relying on so-called "natural language" query statements from the end-user, will result in sub-optimal solutions. It does so by reflecting on the nature of language and the fundamental problems in document retrieval. Support is given to the work of thesaurus builders and indexers with illustrations of how their work may be utilised in a generally applicable computer-based document retrieval system using Multilingual MenUSE software. The EuroMenUSE interface providing multilingual document access to EPOQUE, the European Parliament's Online Query System, is described.
Source: Information as a Global Commodity - Communication, Processing and Use (CAIS/ACSI '93) : 21st Annual Conference Canadian Association for Information Science, Antigonish, Nova Scotia, Canada. July 1993

Schubert, K.: Parameters for the design of an intermediate language for multilingual thesauri (1995) 0.02

0.020437783 = product of:
  0.071532235 = sum of:
    0.05205557 = weight(_text_:processing in 2092) [ClassicSimilarity], result of:
      0.05205557 = score(doc=2092,freq=2.0), product of:
        0.1662677 = queryWeight, product of:
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.04107254 = queryNorm
        0.3130829 = fieldWeight in 2092, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2092)
    0.019476667 = product of:
      0.038953334 = sum of:
        0.038953334 = weight(_text_:22 in 2092) [ClassicSimilarity], result of:
          0.038953334 = score(doc=2092,freq=2.0), product of:
            0.14382903 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04107254 = queryNorm
            0.2708308 = fieldWeight in 2092, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2092)
      0.5 = coord(1/2)
  0.2857143 = coord(2/7)
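
This explanation nests one coord factor inside another: the "22" clause is scaled by coord(1/2) before the outer sum is scaled by coord(2/7). A minimal sketch of evaluating such a tree follows; the tuple-based tree representation and the `evaluate` helper are hypothetical, and the leaf values are the term scores copied from the listing.

```python
def evaluate(node):
    """Evaluate a nested explanation tree given as tuples.

    ("leaf", value)                    -> an already computed term score
    ("sum", [children])                -> sum of the child scores
    ("coord", matched, total, child)   -> child score * matched / total
    """
    kind = node[0]
    if kind == "leaf":
        return node[1]
    if kind == "sum":
        return sum(evaluate(child) for child in node[1])
    if kind == "coord":
        _, matched, total, child = node
        return evaluate(child) * matched / total
    raise ValueError(f"unknown node kind: {kind}")

# Tree for doc 2092, with leaf scores taken from the listing.
tree = ("coord", 2, 7, ("sum", [
    ("leaf", 0.05205557),                                  # _text_:processing
    ("coord", 1, 2, ("sum", [("leaf", 0.038953334)])),     # _text_:22
]))

score = evaluate(tree)
print(round(score, 6))
```

The inner clause contributes 0.019476667 and the overall result matches the listed 0.020437783 up to rounding.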

Abstract: The architecture of multilingual software systems is sometimes centred around an intermediate language. The extent to which this approach can be useful for multilingual thesauri is analyzed, in particular with regard to the functionality the thesaurus is designed to fulfil. Both the runtime use and the construction and maintenance of the system are taken into consideration. Using the perspective of general language technology makes it possible to draw on experience from a broader range of fields beyond thesaurus design itself, as well as to consider the possibility of using a thesaurus as a knowledge module in various systems which process natural language. Therefore the features which thesauri and other natural-language processing systems have in common are emphasized, especially at the level of systems design and their core functionality.
Source: Knowledge organization. 22(1995) nos.3/4, S.136-140

Oard, D.W.; Resnik, P.: Support for interactive document selection in cross-language information retrieval (1999) 0.01

0.014873021 = product of:
  0.10411114 = sum of:
    0.10411114 = weight(_text_:processing in 5938) [ClassicSimilarity], result of:
      0.10411114 = score(doc=5938,freq=2.0), product of:
        0.1662677 = queryWeight, product of:
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.04107254 = queryNorm
        0.6261658 = fieldWeight in 5938, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.109375 = fieldNorm(doc=5938)
  0.14285715 = coord(1/7)

Source: Information processing and management. 35(1999) no.3, S.363-379

Peters, C.; Picchi, E.: Across languages, across cultures : issues in multilinguality and digital libraries (1997) 0.01

0.011411925 = product of:
  0.07988347 = sum of:
    0.07988347 = weight(_text_:digital in 1233) [ClassicSimilarity], result of:
      0.07988347 = score(doc=1233,freq=4.0), product of:
        0.16201277 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.04107254 = queryNorm
        0.493069 = fieldWeight in 1233, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.0625 = fieldNorm(doc=1233)
  0.14285715 = coord(1/7)

Abstract: With the recent rapid diffusion over the international computer networks of world-wide distributed document bases, the question of multilingual access and multilingual information retrieval is becoming increasingly relevant. We briefly discuss just some of the issues that must be addressed in order to implement a multilingual interface for a Digital Library system and describe our own approach to this problem.

Pearce, C.; Nicholas, C.: TELLTALE: Experiments in a dynamic hypertext environment for degraded and multilingual data (1996) 0.01
```
0.010674859 = product of:
  0.07472401 = sum of:
    0.07472401 = weight(_text_:techniques in 4071) [ClassicSimilarity], result of:
      0.07472401 = score(doc=4071,freq=4.0), product of:
        0.18093403 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.04107254 = queryNorm
        0.4129904 = fieldWeight in 4071, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.046875 = fieldNorm(doc=4071)
  0.14285715 = coord(1/7)
```
Abstract

Methods and tools for finding documents relevant to a user's needs in a document corpus can be found in the information retrieval, library science, and hypertext communities. Typically, these systems provide retrieval capabilities for fairly static corpora, their algorithms are dependent on the language for which they are written, e.g. English, and they do not perform well when presented with misspelled words or text that has been degraded by OCR techniques. In this article, we present experimentation results for the TELLTALE system. TELLTALE is a dynamic hypertext environment that provides full-text search from a hypertext-style user interface for text corpora that may be garbled by OCR or transmission errors, and that may contain languages other than English. TELLTALE uses several techniques based on n-grams (n character sequences of text). With these results we show that the dynamic linkage mechanisms in TELLTALE are tolerant of garbles in up to 30% of the characters in the body of the texts.
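
To illustrate why character n-grams tolerate garbled text, here is a minimal sketch that compares texts by the cosine similarity of their character trigram counts. This is a generic illustration and not TELLTALE's actual algorithm; the helper names and the sample strings are invented for the example.

```python
import math
from collections import Counter

def char_ngrams(text: str, n: int = 3) -> Counter:
    # Count every overlapping run of n characters (spaces included).
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two n-gram count vectors.
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

clean = char_ngrams("multilingual information retrieval")
# Simulated OCR damage: "l" read as "1", "m" read as "rn".
garbled = char_ngrams("multi1ingual inforrnation retrieva1")
unrelated = char_ngrams("speech recognition hardware")

sim_garbled = cosine(clean, garbled)
sim_unrelated = cosine(clean, unrelated)
print(sim_garbled > sim_unrelated)
```

Each damaged character only disturbs the few n-grams that overlap it, so most of the n-gram profile survives and the garbled text still scores far closer to the clean text than an unrelated one does.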

Multilingual web software (1996) 0.01

0.010086811 = product of:
  0.07060768 = sum of:
    0.07060768 = weight(_text_:digital in 4710) [ClassicSimilarity], result of:
      0.07060768 = score(doc=4710,freq=2.0), product of:
        0.16201277 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.04107254 = queryNorm
        0.4358155 = fieldWeight in 4710, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.078125 = fieldNorm(doc=4710)
  0.14285715 = coord(1/7)

Source: Digital publishing technologies. 1(1996) no.10, S.19-20

Ata, B.M.A.: SISDOM: a multilingual document retrieval system (1995) 0.01
```
0.010064354 = product of:
  0.07045048 = sum of:
    0.07045048 = weight(_text_:techniques in 895) [ClassicSimilarity], result of:
      0.07045048 = score(doc=895,freq=2.0), product of:
        0.18093403 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.04107254 = queryNorm
        0.3893711 = fieldWeight in 895, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.0625 = fieldNorm(doc=895)
  0.14285715 = coord(1/7)
```
Abstract

The Malay language is widely used in Malaysia, Indonesia and Brunei. The growth in the number of documents written in Malay justifies the need for a document retrieval system for that language. Describes the implementation of a bilingual Malay and English full text document retrieval system: SIStem capaian DOkumen Multilingua (SISDOM), by the Kebangsaan University Malaysia. The system incorporates many facilities for users, including the choice of search techniques, browsing of retrieved documents, and ranking of documents.

Borgman, C.L.: Multi-media, multi-cultural, and multi-lingual digital libraries : or how do we exchange data In 400 languages? (1997) 0.01
```
0.009985435 = product of:
  0.06989804 = sum of:
    0.06989804 = weight(_text_:digital in 1263) [ClassicSimilarity], result of:
      0.06989804 = score(doc=1263,freq=16.0), product of:
        0.16201277 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.04107254 = queryNorm
        0.43143538 = fieldWeight in 1263, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1263)
  0.14285715 = coord(1/7)
```
Abstract

The Internet would not be very useful if communication were limited to textual exchanges between speakers of English located in the United States. Rather, its value lies in its ability to enable people from multiple nations, speaking multiple languages, to employ multiple media in interacting with each other. While computer networks broke through national boundaries long ago, they remain much more effective for textual communication than for exchanges of sound, images, or mixed media -- and more effective for communication in English than for exchanges in most other languages, much less interactions involving multiple languages. Supporting searching and display in multiple languages is an increasingly important issue for all digital libraries accessible on the Internet. Even if a digital library contains materials in only one language, the content needs to be searchable and displayable on computers in countries speaking other languages. We need to exchange data between digital libraries, whether in a single language or in multiple languages. Data exchanges may be large batch updates or interactive hyperlinks. In any of these cases, character sets must be represented in a consistent manner if exchanges are to succeed. Issues of interoperability, portability, and data exchange related to multi-lingual character sets have received surprisingly little attention in the digital library community or in discussions of standards for information infrastructure, except in Europe. The landmark collection of papers on Standards Policy for Information Infrastructure, for example, contains no discussion of multi-lingual issues except for a passing reference to the Unicode standard. The goal of this short essay is to draw attention to the multi-lingual issues involved in designing digital libraries accessible on the Internet. Many of the multi-lingual design issues parallel those of multi-media digital libraries, a topic more familiar to most readers of D-Lib Magazine. 
This essay draws examples from multi-media DLs to illustrate some of the urgent design challenges in creating a globally distributed network serving people who speak many languages other than English. First we introduce some general issues of medium, culture, and language, then discuss the design challenges in the transition from local to global systems, lastly addressing technical matters. The technical issues involve the choice of character sets to represent languages, similar to the choices made in representing images or sound. However, the scale of the language problem is far greater. Standards for multi-media representation are being adopted fairly rapidly, in parallel with the availability of multi-media content in electronic form. By contrast, we have hundreds (and sometimes thousands) of years' worth of textual materials in hundreds of languages, created long before data encoding standards existed. Textual content from past and present is being encoded in language and application-specific representations that are difficult to exchange without losing data -- if they exchange at all. We illustrate the multi-language DL challenge with examples drawn from the research library community, which typically handles collections of materials in 400 or so languages. These are problems faced not only by developers of digital libraries, but by those who develop and manage any communication technology that crosses national or linguistic boundaries.

Lassalle, E.: Text retrieval : from a monolingual system to a multilingual system (1993) 0.01
```
0.0074365106 = product of:
  0.05205557 = sum of:
    0.05205557 = weight(_text_:processing in 7403) [ClassicSimilarity], result of:
      0.05205557 = score(doc=7403,freq=2.0), product of:
        0.1662677 = queryWeight, product of:
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.04107254 = queryNorm
        0.3130829 = fieldWeight in 7403, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.048147 = idf(docFreq=2097, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7403)
  0.14285715 = coord(1/7)
```
Abstract

Describes the TELMI monolingual text retrieval system and its future extension, a multilingual system. TELMI is designed for medium-sized databases containing short texts. The characteristics of the system are fine-grained natural language processing (NLP); an open domain and a large-scale knowledge base; automated indexing based on conceptual representation of texts; and reusability of the NLP tools. Discusses the French MINITEL service, the MGS information service and the TELMI research system covering the full text system; NLP architecture; the lexical level; the syntactic level; the semantic level and an example of the use of a generic system.

Weihs, J.: Three tales of multilingual cataloguing (1998) 0.01

0.006359728 = product of:
  0.044518095 = sum of:
    0.044518095 = product of:
      0.08903619 = sum of:
        0.08903619 = weight(_text_:22 in 6063) [ClassicSimilarity], result of:
          0.08903619 = score(doc=6063,freq=2.0), product of:
            0.14382903 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04107254 = queryNorm
            0.61904186 = fieldWeight in 6063, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=6063)
      0.5 = coord(1/2)
  0.14285715 = coord(1/7)

Date: 2. 8.2001 8:55:22

Ferber, R.: Automated indexing with thesaurus descriptors : a co-occurrence based approach to multilingual retrieval (1997) 0.01

0.0050434056 = product of:
  0.03530384 = sum of:
    0.03530384 = weight(_text_:digital in 4144) [ClassicSimilarity], result of:
      0.03530384 = score(doc=4144,freq=2.0), product of:
        0.16201277 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.04107254 = queryNorm
        0.21790776 = fieldWeight in 4144, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4144)
  0.14285715 = coord(1/7)

Source: Research and advanced technology for digital libraries: First European Conference, ECDL'97, Pisa, Italy, September 1997, Proceedings. Ed.: C. Peters u. C. Thanos

Oard, D.W.: Alternative approaches for cross-language text retrieval (1997) 0.00
```
0.0044031553 = product of:
  0.030822085 = sum of:
    0.030822085 = weight(_text_:techniques in 1164) [ClassicSimilarity], result of:
      0.030822085 = score(doc=1164,freq=2.0), product of:
        0.18093403 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.04107254 = queryNorm
        0.17034985 = fieldWeight in 1164, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1164)
  0.14285715 = coord(1/7)
```
Abstract

I will not attempt to draw a sharp distinction between retrieval and filtering in this survey. Although my own work on adaptive cross-language text filtering has led me to make this distinction fairly carefully in other presentations (cf. Oard 1997b), such an approach does little to help understand the fundamental techniques which have been applied or the results that have been obtained in this case. Since it is still common to view filtering (detection of useful documents in dynamic document streams) as a kind of retrieval, I will simply adopt that perspective here.

Kutschekmanesch, S.; Lutes, B.; Moelle, K.; Thiel, U.; Tzeras, K.: Automated multilingual indexing : a synthesis of rule-based and thesaurus-based methods (1998) 0.00

0.0039748303 = product of:
  0.027823811 = sum of:
    0.027823811 = product of:
      0.055647623 = sum of:
        0.055647623 = weight(_text_:22 in 4157) [ClassicSimilarity], result of:
          0.055647623 = score(doc=4157,freq=2.0), product of:
            0.14382903 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04107254 = queryNorm
            0.38690117 = fieldWeight in 4157, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=4157)
      0.5 = coord(1/2)
  0.14285715 = coord(1/7)

Source: Information und Märkte: 50. Deutscher Dokumentartag 1998, Kongreß der Deutschen Gesellschaft für Dokumentation e.V. (DGD), Rheinische Friedrich-Wilhelms-Universität Bonn, 22.-24. September 1998. Hrsg. von Marlies Ockenfeld u. Gerhard J. Mantwill

Timotin, A.: Multilingvism si tezaure de concepte (1994) 0.00

0.003179864 = product of:
  0.022259047 = sum of:
    0.022259047 = product of:
      0.044518095 = sum of:
        0.044518095 = weight(_text_:22 in 7887) [ClassicSimilarity], result of:
          0.044518095 = score(doc=7887,freq=2.0), product of:
            0.14382903 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04107254 = queryNorm
            0.30952093 = fieldWeight in 7887, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=7887)
      0.5 = coord(1/2)
  0.14285715 = coord(1/7)

Source: Probleme de Informare si Documentare. 28(1994) no.1, S.13-22

Cao, L.; Leong, M.-K.; Low, H.-B.: Searching heterogeneous multilingual bibliographic sources (1998) 0.00

0.003179864 = product of:
  0.022259047 = sum of:
    0.022259047 = product of:
      0.044518095 = sum of:
        0.044518095 = weight(_text_:22 in 3564) [ClassicSimilarity], result of:
          0.044518095 = score(doc=3564,freq=2.0), product of:
            0.14382903 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04107254 = queryNorm
            0.30952093 = fieldWeight in 3564, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=3564)
      0.5 = coord(1/2)
  0.14285715 = coord(1/7)

Date: 1. 8.1996 22:08:06

Heinzelin, D. de; d'Hautcourt, F.; Pols, R.: Un nouveaux thesaurus multilingue informatise relatif aux instruments de musique (1998) 0.00

0.003179864 = product of:
  0.022259047 = sum of:
    0.022259047 = product of:
      0.044518095 = sum of:
        0.044518095 = weight(_text_:22 in 932) [ClassicSimilarity], result of:
          0.044518095 = score(doc=932,freq=2.0), product of:
            0.14382903 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04107254 = queryNorm
            0.30952093 = fieldWeight in 932, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=932)
      0.5 = coord(1/2)
  0.14285715 = coord(1/7)

Date: 1. 8.1996 22:01:00

Cross-language information retrieval (1998) 0.00
```
0.003145111 = product of:
  0.022015776 = sum of:
    0.022015776 = weight(_text_:techniques in 6299) [ClassicSimilarity], result of:
      0.022015776 = score(doc=6299,freq=2.0), product of:
        0.18093403 = queryWeight, product of:
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.04107254 = queryNorm
        0.12167847 = fieldWeight in 6299, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.405231 = idf(docFreq=1467, maxDocs=44218)
          0.01953125 = fieldNorm(doc=6299)
  0.14285715 = coord(1/7)
```
Footnote

The retrieved output from a query including the phrase 'big rockets' may be, for instance, a sentence containing 'giant rocket' which is semantically ranked above 'military rocket'. David Hull (Xerox Research Centre, Grenoble) describes an implementation of a weighted Boolean model for Spanish-English CLIR. Users construct Boolean-type queries, weighting each term in the query, which is then translated by an on-line dictionary before being applied to the database. Comparisons with the performance of unweighted free-form queries ('vector space' models) proved encouraging. Two contributions consider the evaluation of CLIR systems. In order to by-pass the time-consuming and expensive process of assembling a standard collection of documents and of user queries against which the performance of a CLIR system is manually assessed, Páraic Sheridan et al. (ETH Zurich) propose a method based on retrieving 'seed documents'. This involves identifying a unique document in a database (the 'seed document') and, for a number of queries, measuring how fast it is retrieved. The authors have also assembled a large database of multilingual news documents for testing purposes. By storing the (fairly short) documents in a structured form tagged with descriptor codes (e.g. for topic, country and area), the test suite is easily expanded while remaining consistent for the purposes of testing. Douglas Oard and Bonnie Dorr (University of Maryland) describe an evaluation methodology which appears to apply LSI techniques in order to filter and rank incoming documents designed for testing CLIR systems. The volume provides the reader with an excellent overview of several projects in CLIR. It is well supported with references and is intended as a secondary text for researchers and practitioners. It highlights the need for a good, general tutorial introduction to the field.
