Search (105 results, page 1 of 6)

  • Active filter: theme_ss:"Multilinguale Probleme"
  1. Oard, D.W.: Serving users in many languages : cross-language information retrieval for digital libraries (1997) 0.06
    0.056203403 = product of:
      0.13114128 = sum of:
        0.03718255 = weight(_text_:processing in 1261) [ClassicSimilarity], result of:
          0.03718255 = score(doc=1261,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.22363065 = fieldWeight in 1261, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1261)
        0.04992717 = weight(_text_:digital in 1261) [ClassicSimilarity], result of:
          0.04992717 = score(doc=1261,freq=4.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.3081681 = fieldWeight in 1261, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1261)
        0.044031553 = weight(_text_:techniques in 1261) [ClassicSimilarity], result of:
          0.044031553 = score(doc=1261,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.24335694 = fieldWeight in 1261, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1261)
      0.42857143 = coord(3/7)
    
    Abstract
    We are rapidly constructing an extensive network infrastructure for moving information across national boundaries, but much remains to be done before linguistic barriers can be surmounted as effectively as geographic ones. Users seeking information from a digital library could benefit from the ability to query large collections once using a single language, even when more than one language is present in the collection. If the information they locate is not available in a language that they can read, some form of translation will be needed. At present, multilingual thesauri such as EUROVOC help to address this challenge by facilitating controlled vocabulary search using terms from several languages, and services such as INSPEC produce English abstracts for documents in other languages. On the other hand, support for free text searching across languages is not yet widely deployed, and fully automatic machine translation is presently neither sufficiently fast nor sufficiently accurate to adequately support interactive cross-language information seeking. An active and rapidly growing research community has coalesced around these and other related issues, applying techniques drawn from several fields - notably information retrieval and natural language processing - to provide access to large multilingual collections.
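    The score breakdown above each record is Lucene ClassicSimilarity "explain" output. As a rough guide to reading it, the sketch below recomputes the "processing" clause of record 1 from the numbers shown in the tree; the function is purely illustrative and not part of the retrieval system that produced this page.

```python
import math

def classic_similarity_clause(freq, idf, query_norm, field_norm):
    """Recompute one term clause of a Lucene ClassicSimilarity explain tree:
    score = queryWeight * fieldWeight
          = (idf * queryNorm) * (sqrt(freq) * idf * fieldNorm)
    """
    query_weight = idf * query_norm                     # queryWeight
    field_weight = math.sqrt(freq) * idf * field_norm   # tf(freq) * idf * fieldNorm
    return query_weight * field_weight

# Numbers from the "processing" clause of record 1 (doc 1261):
clause = classic_similarity_clause(freq=2.0, idf=4.048147,
                                   query_norm=0.04107254, field_norm=0.0390625)
print(round(clause, 8))  # ~0.03718255, as reported in the tree above
```

    The document score shown next to each title is then the sum of its matching clause scores multiplied by the coord factor (here 0.42857143 = 3/7, because three of the seven query clauses matched).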
  2. Larkey, L.S.; Connell, M.E.: Structured queries, language modelling, and relevance modelling in cross-language information retrieval (2005) 0.05
    0.0524338 = product of:
      0.12234553 = sum of:
        0.06440207 = weight(_text_:processing in 1022) [ClassicSimilarity], result of:
          0.06440207 = score(doc=1022,freq=6.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.38733965 = fieldWeight in 1022, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1022)
        0.044031553 = weight(_text_:techniques in 1022) [ClassicSimilarity], result of:
          0.044031553 = score(doc=1022,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.24335694 = fieldWeight in 1022, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1022)
        0.013911906 = product of:
          0.027823811 = sum of:
            0.027823811 = weight(_text_:22 in 1022) [ClassicSimilarity], result of:
              0.027823811 = score(doc=1022,freq=2.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.19345059 = fieldWeight in 1022, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1022)
          0.5 = coord(1/2)
      0.42857143 = coord(3/7)
    
    Abstract
    Two probabilistic approaches to cross-lingual retrieval are in wide use today, those based on probabilistic models of relevance, as exemplified by INQUERY, and those based on language modeling. INQUERY, as a query net model, allows the easy incorporation of query operators, including a synonym operator, which has proven to be extremely useful in cross-language information retrieval (CLIR), in an approach often called structured query translation. In contrast, language models incorporate translation probabilities into a unified framework. We compare the two approaches on Arabic and Spanish data sets, using two kinds of bilingual dictionaries--one derived from a conventional dictionary, and one derived from a parallel corpus. We find that structured query processing gives slightly better results when queries are not expanded. On the other hand, when queries are expanded, language modeling gives better results, but only when using a probabilistic dictionary derived from a parallel corpus. We pursue two additional issues inherent in the comparison of structured query processing with language modeling. The first concerns query expansion, and the second is the role of translation probabilities. We compare conventional expansion techniques (pseudo-relevance feedback) with relevance modeling, a new IR approach which fits into the formal framework of language modeling. We find that relevance modeling and pseudo-relevance feedback achieve comparable levels of retrieval and that good translation probabilities confer a small but significant advantage.
    Date
    26.12.2007 20:22:11
    Source
    Information processing and management. 41(2005) no.3, S.457-474
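    For readers unfamiliar with the second of the two approaches compared above, here is a minimal, hedged sketch of query-likelihood retrieval with translation probabilities folded in; the function name, the Jelinek-Mercer smoothing choice, and the data layout are assumptions made for illustration, not the paper's implementation.

```python
from collections import Counter

def clir_query_likelihood(query_terms, doc_terms, collection_terms,
                          translation_probs, lam=0.5):
    """Sketch: score a target-language document against a source-language query.

    p(e | D) = sum_f p(e | f) * p_smoothed(f | D)
    p_smoothed(f | D) = lam * tf(f, D)/|D| + (1 - lam) * tf(f, C)/|C|
    """
    d, c = Counter(doc_terms), Counter(collection_terms)
    d_len, c_len = sum(d.values()), sum(c.values())

    def p_smoothed(f):
        return lam * d[f] / d_len + (1 - lam) * c[f] / c_len

    score = 1.0
    for e in query_terms:
        # translation_probs[e] maps target-language terms f to p(e | f)
        score *= sum(p * p_smoothed(f)
                     for f, p in translation_probs.get(e, {}).items())
    return score
```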
  3. Kishida, K.: Technical issues of cross-language information retrieval : a review (2005) 0.05
    0.05186169 = product of:
      0.1815159 = sum of:
        0.05949208 = weight(_text_:processing in 1019) [ClassicSimilarity], result of:
          0.05949208 = score(doc=1019,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.35780904 = fieldWeight in 1019, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0625 = fieldNorm(doc=1019)
        0.12202381 = weight(_text_:techniques in 1019) [ClassicSimilarity], result of:
          0.12202381 = score(doc=1019,freq=6.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.6744105 = fieldWeight in 1019, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0625 = fieldNorm(doc=1019)
      0.2857143 = coord(2/7)
    
    Abstract
    This paper reviews state-of-the-art techniques and methods for enhancing the effectiveness of cross-language information retrieval (CLIR). The following research issues are covered: (1) matching strategies and translation techniques, (2) methods for solving the problem of translation ambiguity, (3) formal models for CLIR such as application of the language model, (4) the pivot language approach, (5) methods for searching multilingual document collections, (6) techniques for combining multiple language resources, etc.
    Source
    Information processing and management. 41(2005) no.3, S.433-456
  4. Levergood, B.; Farrenkopf, S.; Frasnelli, E.: ¬The specification of the language of the field and interoperability : cross-language access to catalogues and online libraries (CACAO) (2008) 0.05
    0.048921943 = product of:
      0.1141512 = sum of:
        0.04461906 = weight(_text_:processing in 2646) [ClassicSimilarity], result of:
          0.04461906 = score(doc=2646,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 2646, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=2646)
        0.052837856 = weight(_text_:techniques in 2646) [ClassicSimilarity], result of:
          0.052837856 = score(doc=2646,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.2920283 = fieldWeight in 2646, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=2646)
        0.016694285 = product of:
          0.03338857 = sum of:
            0.03338857 = weight(_text_:22 in 2646) [ClassicSimilarity], result of:
              0.03338857 = score(doc=2646,freq=2.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.23214069 = fieldWeight in 2646, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2646)
          0.5 = coord(1/2)
      0.42857143 = coord(3/7)
    
    Abstract
    The CACAO Project (Cross-language Access to Catalogues and Online Libraries) has been designed to implement natural language processing and cross-language information retrieval techniques to provide cross-language access to information in libraries, a critical issue in the linguistically diverse European Union. This project report addresses two metadata-related challenges for the library community in this context: "false friends" (identical words having different meanings in different languages) and term ambiguity. The possible solutions involve enriching the metadata with attributes specifying language or the source authority file, or associating potential search terms to classes in a classification system. The European Library will evaluate an early implementation of this work in late 2008.
    Source
    Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas
  5. Levow, G.-A.; Oard, D.W.; Resnik, P.: Dictionary-based techniques for cross-language information retrieval (2005) 0.05
    0.04650517 = product of:
      0.1627681 = sum of:
        0.04461906 = weight(_text_:processing in 1025) [ClassicSimilarity], result of:
          0.04461906 = score(doc=1025,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 1025, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=1025)
        0.11814904 = weight(_text_:techniques in 1025) [ClassicSimilarity], result of:
          0.11814904 = score(doc=1025,freq=10.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.65299517 = fieldWeight in 1025, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=1025)
      0.2857143 = coord(2/7)
    
    Abstract
    Cross-language information retrieval (CLIR) systems allow users to find documents written in different languages from that of their query. Simple knowledge structures such as bilingual term lists have proven to be a remarkably useful basis for bridging that language gap. A broad array of dictionary-based techniques have demonstrated utility, but comparison across techniques has been difficult because evaluation results often span only a limited range of conditions. This article identifies the key issues in dictionary-based CLIR, develops unified frameworks for term selection and term translation that help to explain the relationships among existing techniques, and illustrates the effect of those techniques using four contrasting languages for systematic experiments with a uniform query translation architecture. Key results include identification of a previously unseen dependence of pre- and post-translation expansion on orthographic cognates and development of a query-specific measure for translation fanout that helps to explain the utility of structured query methods.
    Source
    Information processing and management. 41(2005) no.3, S.523-548
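    The "structured query methods" mentioned in the abstract are commonly implemented in the style of Pirkola's synonym-based aggregation. The sketch below is a hedged illustration of that general idea; the postings layout with document-id sets is an assumption for this example, not the authors' architecture.

```python
from collections import Counter

def structured_term_statistics(translations, doc_terms, postings):
    """Sketch of structured query translation: all target-language translations
    of one source term are treated as a single query concept, so their term
    frequencies in a document are summed and the document frequency is counted
    over documents containing any of them.
    `postings` maps each target term to the set of document ids containing it
    (an illustrative assumption, not an API of any system listed here)."""
    tf = sum(Counter(doc_terms)[t] for t in translations)
    matching_docs = set()
    for t in translations:
        matching_docs |= postings.get(t, set())
    return tf, len(matching_docs)

# e.g. tf and df for the concept behind one English query term with two
# hypothetical Spanish translations:
tf, df = structured_term_statistics({"banco", "orilla"},
                                    ["el", "banco", "de", "la", "orilla"],
                                    {"banco": {1, 2}, "orilla": {2, 3}})
print(tf, df)  # 2 3
```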
  6. Kishida, K.: Term disambiguation techniques based on target document collection for cross-language information retrieval : an empirical comparison of performance between techniques (2007) 0.04
    0.035784476 = product of:
      0.12524566 = sum of:
        0.03718255 = weight(_text_:processing in 897) [ClassicSimilarity], result of:
          0.03718255 = score(doc=897,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.22363065 = fieldWeight in 897, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0390625 = fieldNorm(doc=897)
        0.088063106 = weight(_text_:techniques in 897) [ClassicSimilarity], result of:
          0.088063106 = score(doc=897,freq=8.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.4867139 = fieldWeight in 897, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0390625 = fieldNorm(doc=897)
      0.2857143 = coord(2/7)
    
    Abstract
    Dictionary-based query translation for cross-language information retrieval often yields various translation candidates having different meanings for a source term in the query. This paper examines methods for resolving translation ambiguity based only on the target document collection. First, we discuss two kinds of disambiguation technique: (1) a method using term co-occurrence statistics in the collection, and (2) a technique based on pseudo-relevance feedback. Next, these techniques are empirically compared using the CLEF 2003 test collection for German to Italian bilingual searches, which are executed using English as a pivot language. The experiments showed that a variant of the term co-occurrence based techniques, in which the best-sequence algorithm for selecting translations is used with the Cosine coefficient, is dominant, and that the PRF method shows comparably high search performance, although statistical tests did not sufficiently support these conclusions. Furthermore, we repeat the same experiments for French to Italian (pivot) and English to Italian (non-pivot) searches on the same CLEF 2003 test collection in order to verify our findings. Again, similar results were observed, except that the Dice coefficient slightly outperforms the Cosine coefficient for disambiguation based on term co-occurrence in English to Italian searches.
    Source
    Information processing and management. 43(2007) no.1, S.103-120
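    As a rough illustration of the term co-occurrence approach described above, the sketch below picks translation candidates by maximising the average pairwise Cosine coefficient computed from document occurrence sets; it enumerates combinations exhaustively rather than using the paper's best-sequence algorithm, and all names are illustrative.

```python
import itertools
import math

def cosine_coefficient(t1, t2, postings):
    """Cosine coefficient between two terms based on the documents they occur in."""
    d1, d2 = postings.get(t1, set()), postings.get(t2, set())
    if not d1 or not d2:
        return 0.0
    return len(d1 & d2) / math.sqrt(len(d1) * len(d2))

def disambiguate_by_cooccurrence(candidate_sets, postings):
    """Choose one translation per source term so that the average pairwise
    co-occurrence score over the target collection is maximised (brute force)."""
    best, best_score = None, -1.0
    for combo in itertools.product(*candidate_sets):
        pairs = list(itertools.combinations(combo, 2))
        score = (sum(cosine_coefficient(a, b, postings) for a, b in pairs)
                 / max(len(pairs), 1))
        if score > best_score:
            best, best_score = combo, score
    return best
```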
  7. Bilal, D.; Bachir, I.: Children's interaction with cross-cultural and multilingual digital libraries : I. Understanding interface design representations (2007) 0.03
    0.034843888 = product of:
      0.12195361 = sum of:
        0.05205557 = weight(_text_:processing in 894) [ClassicSimilarity], result of:
          0.05205557 = score(doc=894,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3130829 = fieldWeight in 894, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=894)
        0.06989804 = weight(_text_:digital in 894) [ClassicSimilarity], result of:
          0.06989804 = score(doc=894,freq=4.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.43143538 = fieldWeight in 894, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.0546875 = fieldNorm(doc=894)
      0.2857143 = coord(2/7)
    
    Abstract
    This paper reports the results of a study that examined Arabic-speaking children's interaction with the International Children's Digital Library (ICDL). Assessment of the ICDL for Arabic-speaking children as a culturally diverse group was grounded in "representations" and "meaning" rather than in internationalization and localization. The utility of the ICDL navigation controls was judged based on the extent to which they supported children's navigation. Most of the ICDL representations and their meanings were found to be highly appropriate for older children but inappropriate for younger ones. The design of the ICDL navigation controls was supportive of children's navigation. Recommendations for assessing the cross-cultural usability of the ICDL are made and suggestions for system design improvements are provided.
    Source
    Information processing and management. 43(2007) no.1, S.47-64
  8. Mustafa el Hadi, W.: Dynamics of the linguistic paradigm in information retrieval (2000) 0.03
    0.03409802 = product of:
      0.11934307 = sum of:
        0.04461906 = weight(_text_:processing in 151) [ClassicSimilarity], result of:
          0.04461906 = score(doc=151,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 151, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=151)
        0.07472401 = weight(_text_:techniques in 151) [ClassicSimilarity], result of:
          0.07472401 = score(doc=151,freq=4.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.4129904 = fieldWeight in 151, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=151)
      0.2857143 = coord(2/7)
    
    Abstract
    In this paper we briefly sketch the dynamics of the linguistic paradigm in Information Retrieval (IR) and its adaptation to the Internet. The emergence of Natural Language Processing (NLP) techniques has been a major factor leading to this adaptation. These techniques and tools try to adapt to the current needs, i.e. retrieving information from documents written and indexed in a foreign language by using a native language query to express the information need. This process, known as cross-language IR (CLIR), is a field at the crossroads of both Machine Translation and IR. This field represents a real challenge to the IR community and will require solid cooperation with the NLP community.
  9. Multilingual information management : current levels and future abilities. A report Commissioned by the US National Science Foundation and also delivered to the European Commission's Language Engineering Office and the US Defense Advanced Research Projects Agency, April 1999 (1999) 0.03
    0.033151403 = product of:
      0.1160299 = sum of:
        0.02974604 = weight(_text_:processing in 6068) [ClassicSimilarity], result of:
          0.02974604 = score(doc=6068,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.17890452 = fieldWeight in 6068, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.03125 = fieldNorm(doc=6068)
        0.08628386 = weight(_text_:techniques in 6068) [ClassicSimilarity], result of:
          0.08628386 = score(doc=6068,freq=12.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.47688022 = fieldWeight in 6068, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.03125 = fieldNorm(doc=6068)
      0.2857143 = coord(2/7)
    
    Abstract
    Over the past 50 years, a variety of language-related capabilities has been developed in machine translation, information retrieval, speech recognition, text summarization, and so on. These applications rest upon a set of core techniques such as language modeling, information extraction, parsing, generation, and multimedia planning and integration; and they involve methods using statistics, rules, grammars, lexicons, ontologies, training techniques, and so on. It is a puzzling fact that although all of this work deals with language in some form or other, the major applications have each developed a separate research field. For example, there is no reason why speech recognition techniques involving n-grams and hidden Markov models could not have been used in machine translation 15 years earlier than they were, or why some of the lexical and semantic insights from the subarea called Computational Linguistics are still not used in information retrieval.
    This picture will rapidly change. The twin challenges of massive information overload via the web and ubiquitous computers present us with an unavoidable task: developing techniques to handle multilingual and multi-modal information robustly and efficiently, with as high quality performance as possible. The most effective way for us to address such a mammoth task, and to ensure that our various techniques and applications fit together, is to start talking across the artificial research boundaries. Extending the current technologies will require integrating the various capabilities into multi-functional and multi-lingual natural language systems. However, at this time there is no clear vision of how these technologies could or should be assembled into a coherent framework. What would be involved in connecting a speech recognition system to an information retrieval engine, and then using machine translation and summarization software to process the retrieved text? How can traditional parsing and generation be enhanced with statistical techniques? What would be the effect of carefully crafted lexicons on traditional information retrieval? At which points should machine translation be interleaved within information retrieval systems to enable multilingual processing?
  10. Wang, J.; Oard, D.W.: Matching meaning for cross-language information retrieval (2012) 0.03
    0.03248564 = product of:
      0.11369974 = sum of:
        0.05205557 = weight(_text_:processing in 7430) [ClassicSimilarity], result of:
          0.05205557 = score(doc=7430,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3130829 = fieldWeight in 7430, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7430)
        0.06164417 = weight(_text_:techniques in 7430) [ClassicSimilarity], result of:
          0.06164417 = score(doc=7430,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.3406997 = fieldWeight in 7430, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7430)
      0.2857143 = coord(2/7)
    
    Abstract
    This article describes a framework for cross-language information retrieval that efficiently leverages statistical estimation of translation probabilities. The framework provides a unified perspective into which some earlier work on techniques for cross-language information retrieval based on translation probabilities can be cast. Modeling synonymy and filtering translation probabilities using bidirectional evidence are shown to yield a balance between retrieval effectiveness and query-time (or indexing-time) efficiency that seems well suited to large-scale applications. Evaluations with six test collections show consistent improvements over strong baselines.
    Source
    Information processing and management. 48(2012) no.4, S.631-653
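    The "filtering translation probabilities using bidirectional evidence" mentioned in the abstract can be pictured roughly as below; the threshold, dictionary layout, and renormalisation step are assumptions made for this sketch rather than the authors' exact procedure.

```python
def filter_bidirectional(p_f_given_e, p_e_given_f, min_prob=0.01):
    """Keep a translation candidate f for source term e only if both directions
    p(f|e) and p(e|f) reach a minimum probability, then renormalise what is left."""
    filtered = {}
    for e, candidates in p_f_given_e.items():
        kept = {f: p for f, p in candidates.items()
                if p >= min_prob
                and p_e_given_f.get(f, {}).get(e, 0.0) >= min_prob}
        total = sum(kept.values())
        if total > 0:
            filtered[e] = {f: p / total for f, p in kept.items()}
    return filtered
```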
  11. Bilal, D.; Bachir, I.: Children's interaction with cross-cultural and multilingual digital libraries : II. Information seeking, success, and affective experience (2007) 0.03
    0.029866192 = product of:
      0.10453167 = sum of:
        0.04461906 = weight(_text_:processing in 895) [ClassicSimilarity], result of:
          0.04461906 = score(doc=895,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 895, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=895)
        0.059912607 = weight(_text_:digital in 895) [ClassicSimilarity], result of:
          0.059912607 = score(doc=895,freq=4.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.36980176 = fieldWeight in 895, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.046875 = fieldNorm(doc=895)
      0.2857143 = coord(2/7)
    
    Abstract
    This paper reports the results of a study that investigated Arabic-speaking children's interaction with the International Children's Digital Library (ICDL) to find Arabic books on four tasks. Children's information seeking activities were captured by using HyperCam software. Children's success was assessed based on a measure the researchers developed. Children's perceptions of and affective experience in using the ICDL were gathered through group interviews. Findings revealed that children's information seeking behavior was characterized by browsing using a single function; that is, looking under "Arabic" from the Simple interface pull-down menu. Children were more successful on the fully self-generated, open-ended task than on the assigned and semi-assigned tasks. Children made suggestions for improving the Arabic collection and the design of the ICDL. The findings have implications for practitioners, researchers, and system designers.
    Source
    Information processing and management. 43(2007) no.1, S.65-80
  12. Mustafa el Hadi, W.: Human language technology and its role in information access and management (2003) 0.03
    0.02841502 = product of:
      0.09945257 = sum of:
        0.03718255 = weight(_text_:processing in 5524) [ClassicSimilarity], result of:
          0.03718255 = score(doc=5524,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.22363065 = fieldWeight in 5524, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5524)
        0.062270015 = weight(_text_:techniques in 5524) [ClassicSimilarity], result of:
          0.062270015 = score(doc=5524,freq=4.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.34415868 = fieldWeight in 5524, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5524)
      0.2857143 = coord(2/7)
    
    Abstract
    The role of linguistics in information access, extraction and dissemination is essential. Radical changes in the techniques of information and communication at the end of the twentieth century have had a significant effect on the function of the linguistic paradigm and its applications in all forms of communication. The introduction of new technical means has deeply changed the possibilities for the distribution of information. In this situation, what is the role of the linguistic paradigm and its practical applications, i.e., natural language processing (NLP) techniques when applied to information access? What solutions can linguistics offer in human computer interaction, extraction and management? Many fields show the relevance of the linguistic paradigm through the various technologies that require NLP, such as document and message understanding, information detection, extraction, and retrieval, question and answer, cross-language information retrieval (CLIR), text summarization, filtering, and spoken document retrieval. This paper focuses on the central role of human language technologies in the information society, surveys the current situation, describes the benefits of the above-mentioned applications, outlines successes and challenges, and discusses solutions. It reviews the resources and means needed to advance information access and dissemination across language boundaries in the twenty-first century. Multilingualism, which is a natural result of globalization, requires more effort in the direction of language technology. The scope of human language technology (HLT) is large, so we limit our review to applications that involve multilinguality.
  13. Pollitt, A.S.; Ellis, G.: Multilingual access to document databases (1993) 0.03
    0.027844835 = product of:
      0.09745692 = sum of:
        0.04461906 = weight(_text_:processing in 1302) [ClassicSimilarity], result of:
          0.04461906 = score(doc=1302,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 1302, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=1302)
        0.052837856 = weight(_text_:techniques in 1302) [ClassicSimilarity], result of:
          0.052837856 = score(doc=1302,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.2920283 = fieldWeight in 1302, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=1302)
      0.2857143 = coord(2/7)
    
    Abstract
    This paper examines the reasons why approaches to facilitating document retrieval that apply AI (Artificial Intelligence) or Expert Systems techniques, relying on so-called "natural language" query statements from the end-user, will result in sub-optimal solutions. It does so by reflecting on the nature of language and the fundamental problems in document retrieval. Support is given to the work of thesaurus builders and indexers with illustrations of how their work may be utilised in a generally applicable computer-based document retrieval system using Multilingual MenUSE software. The EuroMenUSE interface, providing multilingual document access to EPOQUE, the European Parliament's Online Query System, is described.
    Source
    Information as a Global Commodity - Communication, Processing and Use (CAIS/ACSI '93) : 21st Annual Conference Canadian Association for Information Science, Antigonish, Nova Scotia, Canada. July 1993
  14. Oard, D.W.; He, D.; Wang, J.: User-assisted query translation for interactive cross-language information retrieval (2008) 0.03
    0.027844835 = product of:
      0.09745692 = sum of:
        0.04461906 = weight(_text_:processing in 2030) [ClassicSimilarity], result of:
          0.04461906 = score(doc=2030,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 2030, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=2030)
        0.052837856 = weight(_text_:techniques in 2030) [ClassicSimilarity], result of:
          0.052837856 = score(doc=2030,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.2920283 = fieldWeight in 2030, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=2030)
      0.2857143 = coord(2/7)
    
    Abstract
    Interactive Cross-Language Information Retrieval (CLIR), a process in which searcher and system collaborate to find documents that satisfy an information need regardless of the language in which those documents are written, calls for designs in which synergies between searcher and system can be leveraged so that the strengths of one can cover weaknesses of the other. This paper describes an approach that employs user-assisted query translation to help searchers better understand the system's operation. Supporting interaction and interface designs are introduced, and results from three user studies are presented. The results indicate that experienced searchers presented with this new system evolve new search strategies that make effective use of the new capabilities, that they achieve retrieval effectiveness comparable to results obtained using fully automatic techniques, and that reported satisfaction with support for cross-language searching increased. The paper concludes with a description of a freely available interactive CLIR system that incorporates lessons learned from this research.
    Source
    Information processing and management. 44(2008) no.1, S.181-211
  15. Airio, E.; Kettunen, K.: Does dictionary based bilingual retrieval work in a non-normalized index? (2009) 0.03
    0.027844835 = product of:
      0.09745692 = sum of:
        0.04461906 = weight(_text_:processing in 4224) [ClassicSimilarity], result of:
          0.04461906 = score(doc=4224,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.26835677 = fieldWeight in 4224, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046875 = fieldNorm(doc=4224)
        0.052837856 = weight(_text_:techniques in 4224) [ClassicSimilarity], result of:
          0.052837856 = score(doc=4224,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.2920283 = fieldWeight in 4224, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=4224)
      0.2857143 = coord(2/7)
    
    Abstract
    Many operational IR indexes are non-normalized, i.e. no lemmatization, stemming, or similar techniques have been employed in indexing. This poses a challenge for dictionary-based cross-language retrieval (CLIR), because translations are mostly lemmas. In this study, we face the challenge of dictionary-based CLIR in a non-normalized index. We test two alternative approaches: FCG (Frequent Case Generation) and s-gramming. The idea of FCG is to automatically generate the most frequent inflected forms for a given lemma. FCG has been tested in monolingual retrieval and has been shown to be a good method for inflected retrieval, especially for highly inflected languages. S-gramming is an approximate string matching technique (an extension of n-gramming). The language pairs in our tests were English-Finnish, English-Swedish, Swedish-Finnish and Finnish-Swedish. Both our approaches performed quite well, but the results varied depending on the language pair. S-gramming and FCG performed about equally for all language pairs except Finnish-Swedish, where s-gramming outperformed FCG.
    Source
    Information processing and management. 45(2009) no.6, S.703-713
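    S-gramming, one of the two approaches tested above, matches word forms by comparing sets of skipping character pairs. A minimal sketch of the idea follows; the gap lengths and the Jaccard comparison are illustrative choices, not necessarily the configuration used in the study.

```python
def s_grams(word, gaps=(0, 1)):
    """Character s-grams: ordered character pairs separated by each gap length.
    Gap 0 gives ordinary bigrams; gap 1 skips one character, and so on."""
    grams = set()
    for g in gaps:
        for i in range(len(word) - g - 1):
            grams.add((g, word[i], word[i + g + 1]))
    return grams

def s_gram_similarity(a, b, gaps=(0, 1)):
    """Jaccard similarity over s-gram sets, usable for matching inflected word
    forms in a non-normalized index against dictionary lemmas."""
    ga, gb = s_grams(a, gaps), s_grams(b, gaps)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)

# e.g. an inflected Finnish form against its lemma:
print(round(s_gram_similarity("taloissa", "talo"), 2))  # ~0.38
```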
  16. Wang, F.L.; Yang, C.C.: ¬The impact analysis of language differences on an automatic multilingual text summarization system (2006) 0.02
    0.02320403 = product of:
      0.0812141 = sum of:
        0.03718255 = weight(_text_:processing in 5049) [ClassicSimilarity], result of:
          0.03718255 = score(doc=5049,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.22363065 = fieldWeight in 5049, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5049)
        0.044031553 = weight(_text_:techniques in 5049) [ClassicSimilarity], result of:
          0.044031553 = score(doc=5049,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.24335694 = fieldWeight in 5049, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5049)
      0.2857143 = coord(2/7)
    
    Abstract
    Based on the salient features of the documents, automatic text summarization systems extract the key sentences from source documents. This process supports the users in evaluating the relevance of the extracted documents returned by information retrieval systems, so that efficient filtering can be achieved. Indirectly, these systems help to resolve the problem of information overload. Many automatic text summarization systems have been implemented for use with different languages. It has been established that the grammatical and lexical differences between languages have a significant effect on text processing. However, the impact of these language differences on automatic text summarization systems has not yet been investigated. The authors provide an impact analysis of language differences on automatic text summarization. It includes the effect on the extraction processes, the scoring mechanisms, the performance, and the matching of the extracted sentences, using a parallel corpus in English and Chinese as the test object. The analysis results provide a greater understanding of language differences and promote the future development of more advanced text summarization techniques.
  17. Freitas-Junior, H.R.; Ribeiro-Neto, B.A.; Freitas-Vale, R. de; Laender, A.H.F.; Lima, L.R.S. de: Categorization-driven cross-language retrieval of medical information (2006) 0.02
    0.021766264 = product of:
      0.07618192 = sum of:
        0.062270015 = weight(_text_:techniques in 5282) [ClassicSimilarity], result of:
          0.062270015 = score(doc=5282,freq=4.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.34415868 = fieldWeight in 5282, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5282)
        0.013911906 = product of:
          0.027823811 = sum of:
            0.027823811 = weight(_text_:22 in 5282) [ClassicSimilarity], result of:
              0.027823811 = score(doc=5282,freq=2.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.19345059 = fieldWeight in 5282, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5282)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    The Web has become a large repository of documents (or pages) written in many different languages. In this context, traditional information retrieval (IR) techniques cannot be used whenever the user query and the documents being retrieved are in different languages. To address this problem, new cross-language information retrieval (CLIR) techniques have been proposed. In this work, we describe a method for cross-language retrieval of medical information. This method combines query terms and related medical concepts obtained automatically through a categorization procedure. The medical concepts are used to create a linguistic abstraction that allows retrieval of information in a language-independent way, minimizing linguistic problems such as polysemy. To evaluate our method, we carried out experiments using the OHSUMED test collection, whose documents are written in English, with queries expressed in Portuguese, Spanish, and French. The results indicate that our cross-language retrieval method is as effective as a standard vector space model algorithm operating on queries and documents in the same language. Further, our results are better than previous results in the literature.
    Date
    22. 7.2006 16:46:36
  18. Schubert, K.: Parameters for the design of an intermediate language for multilingual thesauri (1995) 0.02
    0.020437783 = product of:
      0.071532235 = sum of:
        0.05205557 = weight(_text_:processing in 2092) [ClassicSimilarity], result of:
          0.05205557 = score(doc=2092,freq=2.0), product of:
            0.1662677 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.04107254 = queryNorm
            0.3130829 = fieldWeight in 2092, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2092)
        0.019476667 = product of:
          0.038953334 = sum of:
            0.038953334 = weight(_text_:22 in 2092) [ClassicSimilarity], result of:
              0.038953334 = score(doc=2092,freq=2.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.2708308 = fieldWeight in 2092, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2092)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    The architecture of multilingual software systems is sometimes centred around an intermediate language. The paper analyzes to what extent this approach can be useful for multilingual thesauri, in particular regarding the functionality the thesaurus is designed to fulfil. Both the runtime use and the construction and maintenance of the system are taken into consideration. Adopting the perspective of general language technology makes it possible to draw on experience from a broader range of fields beyond thesaurus design itself, as well as to consider the possibility of using a thesaurus as a knowledge module in various systems which process natural language. Therefore the features which thesauri and other natural-language processing systems have in common are emphasized, especially at the level of systems design and their core functionality.
    Source
    Knowledge organization. 22(1995) nos.3/4, S.136-140
  19. Fluhr, C.: Crosslingual access to photo databases (2012) 0.02
    0.019866327 = product of:
      0.06953214 = sum of:
        0.052837856 = weight(_text_:techniques in 93) [ClassicSimilarity], result of:
          0.052837856 = score(doc=93,freq=2.0), product of:
            0.18093403 = queryWeight, product of:
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.04107254 = queryNorm
            0.2920283 = fieldWeight in 93, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.405231 = idf(docFreq=1467, maxDocs=44218)
              0.046875 = fieldNorm(doc=93)
        0.016694285 = product of:
          0.03338857 = sum of:
            0.03338857 = weight(_text_:22 in 93) [ClassicSimilarity], result of:
              0.03338857 = score(doc=93,freq=2.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.23214069 = fieldWeight in 93, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=93)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    This paper is about searching for photos in the databases of agencies that sell photos over the Internet. The problem differs considerably from that of photo databases managed by librarians, and also from the corpora generally used for research purposes. The descriptions consist mainly of single words, which is well known to be a poor basis for effective search, and this aggravates the problem of semantic ambiguity. Semantic ambiguity is crucial for cross-language querying. On the other hand, users are not aware of documentation techniques and generally use very simple queries, yet expect precise answers. This paper reports the experience gained in three years of use (2006-2008) of cross-language access to several of the main international commercial photo databases. The languages used were French, English, and German.
    Date
    17. 4.2012 14:25:22
  20. Mitchell, J.S.; Zeng, M.L.; Zumer, M.: Modeling classification systems in multicultural and multilingual contexts (2012) 0.02
    0.018849686 = product of:
      0.0659739 = sum of:
        0.042364612 = weight(_text_:digital in 1967) [ClassicSimilarity], result of:
          0.042364612 = score(doc=1967,freq=2.0), product of:
            0.16201277 = queryWeight, product of:
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.04107254 = queryNorm
            0.26148933 = fieldWeight in 1967, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.944552 = idf(docFreq=2326, maxDocs=44218)
              0.046875 = fieldNorm(doc=1967)
        0.023609286 = product of:
          0.047218572 = sum of:
            0.047218572 = weight(_text_:22 in 1967) [ClassicSimilarity], result of:
              0.047218572 = score(doc=1967,freq=4.0), product of:
                0.14382903 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04107254 = queryNorm
                0.32829654 = fieldWeight in 1967, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1967)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    This paper reports on the second part of an initiative of the authors on researching classification systems with the conceptual model defined by the Functional Requirements for Subject Authority Data (FRSAD) final report. In an earlier study, the authors explored whether the FRSAD conceptual model could be extended beyond subject authority data to model classification data. The focus of the current study is to determine if classification data modeled using FRSAD can be used to solve real-world discovery problems in multicultural and multilingual contexts. The paper discusses the relationships between entities (same type or different types) in the context of classification systems that involve multiple translations and /or multicultural implementations. Results of two case studies are presented in detail: (a) two instances of the DDC (DDC 22 in English, and the Swedish-English mixed translation of DDC 22), and (b) Chinese Library Classification. The use cases of conceptual models in practice are also discussed.
    Source
    Beyond libraries - subject metadata in the digital environment and semantic web. IFLA Satellite Post-Conference, 17-18 August 2012, Tallinn

Languages

  • e 97
  • d 5
  • f 2
  • ro 1

Types

  • a 94
  • el 13
  • m 2
  • x 2
  • r 1
  • s 1