Search (134 results, page 1 of 7)

Larkey, L.S.; Connell, M.E.: Structured queries, language modelling, and relevance modelling in cross-language information retrieval (2005) 0.06

0.06432374 = product of:
  0.10291798 = sum of:
    0.04174695 = weight(_text_:retrieval in 1022) [ClassicSimilarity], result of:
      0.04174695 = score(doc=1022,freq=8.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.33420905 = fieldWeight in 1022, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1022)
    0.021389665 = weight(_text_:use in 1022) [ClassicSimilarity], result of:
      0.021389665 = score(doc=1022,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.1691581 = fieldWeight in 1022, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1022)
    0.014758972 = weight(_text_:of in 1022) [ClassicSimilarity], result of:
      0.014758972 = score(doc=1022,freq=14.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.22855641 = fieldWeight in 1022, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1022)
    0.0110352645 = product of:
      0.022070529 = sum of:
        0.022070529 = weight(_text_:on in 1022) [ClassicSimilarity], result of:
          0.022070529 = score(doc=1022,freq=8.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.24300331 = fieldWeight in 1022, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1022)
      0.5 = coord(1/2)
    0.013987125 = product of:
      0.02797425 = sum of:
        0.02797425 = weight(_text_:22 in 1022) [ClassicSimilarity], result of:
          0.02797425 = score(doc=1022,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.19345059 = fieldWeight in 1022, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1022)
      0.5 = coord(1/2)
  0.625 = coord(5/8)

Abstract: Two probabilistic approaches to cross-lingual retrieval are in wide use today, those based on probabilistic models of relevance, as exemplified by INQUERY, and those based on language modeling. INQUERY, as a query net model, allows the easy incorporation of query operators, including a synonym operator, which has proven to be extremely useful in cross-language information retrieval (CLIR), in an approach often called structured query translation. In contrast, language models incorporate translation probabilities into a unified framework. We compare the two approaches on Arabic and Spanish data sets, using two kinds of bilingual dictionaries--one derived from a conventional dictionary, and one derived from a parallel corpus. We find that structured query processing gives slightly better results when queries are not expanded. On the other hand, when queries are expanded, language modeling gives better results, but only when using a probabilistic dictionary derived from a parallel corpus. We pursue two additional issues inherent in the comparison of structured query processing with language modeling. The first concerns query expansion, and the second is the role of translation probabilities. We compare conventional expansion techniques (pseudo-relevance feedback) with relevance modeling, a new IR approach which fits into the formal framework of language modeling. We find that relevance modeling and pseudo-relevance feedback achieve comparable levels of retrieval and that good translation probabilities confer a small but significant advantage.
Date: 26.12.2007 20:22:11

Lin, W.-C.; Chang, Y.-C.; Chen, H.-H.: Integrating textual and visual information for cross-language image retrieval : a trans-media dictionary approach (2007) 0.05

0.05351604 = product of:
  0.10703208 = sum of:
    0.06135524 = weight(_text_:retrieval in 904) [ClassicSimilarity], result of:
      0.06135524 = score(doc=904,freq=12.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.49118498 = fieldWeight in 904, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=904)
    0.025667597 = weight(_text_:use in 904) [ClassicSimilarity], result of:
      0.025667597 = score(doc=904,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.20298971 = fieldWeight in 904, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=904)
    0.013388081 = weight(_text_:of in 904) [ClassicSimilarity], result of:
      0.013388081 = score(doc=904,freq=8.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.20732689 = fieldWeight in 904, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=904)
    0.006621159 = product of:
      0.013242318 = sum of:
        0.013242318 = weight(_text_:on in 904) [ClassicSimilarity], result of:
          0.013242318 = score(doc=904,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.14580199 = fieldWeight in 904, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=904)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: This paper explores the integration of textual and visual information for cross-language image retrieval. An approach which automatically transforms textual queries into visual representations is proposed. First, we mine the relationships between text and images and employ the mined relationships to construct visual queries from textual ones. Then, the retrieval results of textual and visual queries are combined. To evaluate the proposed approach, we conduct English monolingual and Chinese-English cross-language retrieval experiments. The selection of suitable textual query terms to construct visual queries is the major issue. Experimental results show that the proposed approach improves retrieval performance, and use of nouns is appropriate to generate visual queries.
Footnote: Beitrag in: Special issue on AIRS2005: Information Retrieval Research in Asia

McCulloch, E.: Multiple terminologies : an obstacle to information retrieval (2004) 0.05

0.04890834 = product of:
  0.09781668 = sum of:
    0.041327372 = weight(_text_:retrieval in 2798) [ClassicSimilarity], result of:
      0.041327372 = score(doc=2798,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.33085006 = fieldWeight in 2798, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2798)
    0.029945528 = weight(_text_:use in 2798) [ClassicSimilarity], result of:
      0.029945528 = score(doc=2798,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.23682132 = fieldWeight in 2798, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2798)
    0.015619429 = weight(_text_:of in 2798) [ClassicSimilarity], result of:
      0.015619429 = score(doc=2798,freq=8.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.24188137 = fieldWeight in 2798, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2798)
    0.010924355 = product of:
      0.02184871 = sum of:
        0.02184871 = weight(_text_:on in 2798) [ClassicSimilarity], result of:
          0.02184871 = score(doc=2798,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.24056101 = fieldWeight in 2798, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2798)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: An issue currently at the forefront of digital library research is the prevalence of disparate terminologies and the associated limitations imposed on user searching. It is thought that semantic interoperability is achievable by improving the compatibility between terminologies and classification schemes, enabling users to search multiple resources simultaneously and improve retrieval effectiveness through the use of associated terms drawn from several schemes. This column considers the terminology issue before outlining various proposed methods of tackling it, with a particular focus on terminology mapping.

Subirats, I.; Prasad, A.R.D.; Keizer, J.; Bagdanov, A.: Implementation of rich metadata formats and demantic tools using DSpace (2008) 0.04

0.04497591 = product of:
  0.071961455 = sum of:
    0.016698781 = weight(_text_:retrieval in 2656) [ClassicSimilarity], result of:
      0.016698781 = score(doc=2656,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.13368362 = fieldWeight in 2656, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=2656)
    0.024199642 = weight(_text_:use in 2656) [ClassicSimilarity], result of:
      0.024199642 = score(doc=2656,freq=4.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.19138055 = fieldWeight in 2656, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.03125 = fieldNorm(doc=2656)
    0.0154592255 = weight(_text_:of in 2656) [ClassicSimilarity], result of:
      0.0154592255 = score(doc=2656,freq=24.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.23940048 = fieldWeight in 2656, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03125 = fieldNorm(doc=2656)
    0.004414106 = product of:
      0.008828212 = sum of:
        0.008828212 = weight(_text_:on in 2656) [ClassicSimilarity], result of:
          0.008828212 = score(doc=2656,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.097201325 = fieldWeight in 2656, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.03125 = fieldNorm(doc=2656)
      0.5 = coord(1/2)
    0.0111897 = product of:
      0.0223794 = sum of:
        0.0223794 = weight(_text_:22 in 2656) [ClassicSimilarity], result of:
          0.0223794 = score(doc=2656,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.15476047 = fieldWeight in 2656, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=2656)
      0.5 = coord(1/2)
  0.625 = coord(5/8)

Abstract: This poster explores the customization of DSpace to allow the use of the AGRIS Application Profile metadata standard and the AGROVOC thesaurus. The objective is the adaptation of DSpace, through the least invasive code changes either in the form of plug-ins or add-ons, to the specific needs of the Agricultural Sciences and Technology community. Metadata standards such as AGRIS AP, and Knowledge Organization Systems such as the AGROVOC thesaurus, provide mechanisms for sharing information in a standardized manner by recommending the use of common semantics and interoperable syntax (Subirats et al., 2007). AGRIS AP was created to enhance the description, exchange and subsequent retrieval of agricultural Document-like Information Objects (DLIOs). It is a metadata schema which draws from Metadata standards such as Dublin Core (DC), the Australian Government Locator Service Metadata (AGLS) and the Agricultural Metadata Element Set (AgMES) namespaces. It allows sharing of information across dispersed bibliographic systems (FAO, 2005). AGROVOC68 is a multilingual structured thesaurus covering agricultural and related domains. Its main role is to standardize the indexing process in order to make searching simpler and more efficient. AGROVOC is developed by FAO (Lauser et al., 2006). The customization of the DSpace is taking place in several phases. First, the AGRIS AP metadata schema was mapped onto the metadata DSpace model, with several enhancements implemented to support AGRIS AP elements. Next, AGROVOC will be integrated as a controlled vocabulary accessed through a local SKOS or OWL file. Eventually the system will be configurable to access AGROVOC through local files or remotely via webservices. Finally, spell checking and tooltips will be incorporated in the user interface to support metadata editing. Adapting DSpace to support AGRIS AP and annotation using the semantically-rich AGROVOC thesaurus transform DSpace into a powerful, domain-specific system for annotation and exchange of bibliographic metadata in the agricultural domain.
Source: Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas

Menard, E.: Study on the influence of vocabularies used for image indexing in a multilingual retrieval environment : reflections on scribbles (2007) 0.04

0.044290036 = product of:
  0.08858007 = sum of:
    0.04174695 = weight(_text_:retrieval in 1089) [ClassicSimilarity], result of:
      0.04174695 = score(doc=1089,freq=8.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.33420905 = fieldWeight in 1089, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1089)
    0.021389665 = weight(_text_:use in 1089) [ClassicSimilarity], result of:
      0.021389665 = score(doc=1089,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.1691581 = fieldWeight in 1089, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1089)
    0.017640345 = weight(_text_:of in 1089) [ClassicSimilarity], result of:
      0.017640345 = score(doc=1089,freq=20.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.27317715 = fieldWeight in 1089, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1089)
    0.007803111 = product of:
      0.015606222 = sum of:
        0.015606222 = weight(_text_:on in 1089) [ClassicSimilarity], result of:
          0.015606222 = score(doc=1089,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.1718293 = fieldWeight in 1089, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1089)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: For many years, the Web became an important media for the diffusion of multilingual resources. Linguistic differenees still form a major obstacle to scientific, cultural, and educational exchange. Besides this linguistic diversity, a multitude of databases and collections now contain documents in various formats, which may also adversely affect the retrieval process. This paper describes a research project aiming to verify the existing relations between two indexing approaches: traditional image indexing recommending the use of controlled vocabularies or free image indexing using uncontrolled vocabulary, and their respective performance for image retrieval, in a multilingual context. This research also compares image retrieval within two contexts: a monolingual context where the language of the query is the same as the indexing language; and a multilingual context where the language of the query is different from the indexing language. This research will indicate whether one of these indexing approaches surpasses the other, in terms of effectiveness, efficiency, and satisfaction of the image searchers. This paper presents the context and the problem statement of the research project. The experiment carried out is also described, as well as the data collection methods

Qin, J.; Zhou, Y.; Chau, M.; Chen, H.: Multilingual Web retrieval : an experiment in English-Chinese business intelligence (2006) 0.04

0.04258352 = product of:
  0.08516704 = sum of:
    0.04174695 = weight(_text_:retrieval in 5054) [ClassicSimilarity], result of:
      0.04174695 = score(doc=5054,freq=8.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.33420905 = fieldWeight in 5054, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5054)
    0.021389665 = weight(_text_:use in 5054) [ClassicSimilarity], result of:
      0.021389665 = score(doc=5054,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.1691581 = fieldWeight in 5054, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5054)
    0.012473608 = weight(_text_:of in 5054) [ClassicSimilarity], result of:
      0.012473608 = score(doc=5054,freq=10.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.19316542 = fieldWeight in 5054, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5054)
    0.00955682 = product of:
      0.01911364 = sum of:
        0.01911364 = weight(_text_:on in 5054) [ClassicSimilarity], result of:
          0.01911364 = score(doc=5054,freq=6.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.21044704 = fieldWeight in 5054, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5054)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: As increasing numbers of non-English resources have become available on the Web, the interesting and important issue of how Web users can retrieve documents in different languages has arisen. Cross-language information retrieval (CLIP), the study of retrieving information in one language by queries expressed in another language, is a promising approach to the problem. Cross-language information retrieval has attracted much attention in recent years. Most research systems have achieved satisfactory performance on standard Text REtrieval Conference (TREC) collections such as news articles, but CLIR techniques have not been widely studied and evaluated for applications such as Web portals. In this article, the authors present their research in developing and evaluating a multilingual English-Chinese Web portal that incorporates various CLIP techniques for use in the business domain. A dictionary-based approach was adopted and combines phrasal translation, co-occurrence analysis, and pre- and posttranslation query expansion. The portal was evaluated by domain experts, using a set of queries in both English and Chinese. The experimental results showed that co-occurrence-based phrasal translation achieved a 74.6% improvement in precision over simple word-byword translation. When used together, pre- and posttranslation query expansion improved the performance slightly, achieving a 78.0% improvement over the baseline word-by-word translation approach. In general, applying CLIR techniques in Web applications shows promise.
Footnote: Beitrag einer special topic section on multilingual information systems
Source: Journal of the American Society for Information Science and Technology. 57(2006) no.5, S.671-683

Freitas-Junior, H.R.; Ribeiro-Neto, B.A.; Freitas-Vale, R. de; Laender, A.H.F.; Lima, L.R.S. de: Categorization-driven cross-language retrieval of medical information (2006) 0.04

0.041553866 = product of:
  0.08310773 = sum of:
    0.051129367 = weight(_text_:retrieval in 5282) [ClassicSimilarity], result of:
      0.051129367 = score(doc=5282,freq=12.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.40932083 = fieldWeight in 5282, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5282)
    0.012473608 = weight(_text_:of in 5282) [ClassicSimilarity], result of:
      0.012473608 = score(doc=5282,freq=10.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.19316542 = fieldWeight in 5282, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5282)
    0.0055176322 = product of:
      0.0110352645 = sum of:
        0.0110352645 = weight(_text_:on in 5282) [ClassicSimilarity], result of:
          0.0110352645 = score(doc=5282,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.121501654 = fieldWeight in 5282, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5282)
      0.5 = coord(1/2)
    0.013987125 = product of:
      0.02797425 = sum of:
        0.02797425 = weight(_text_:22 in 5282) [ClassicSimilarity], result of:
          0.02797425 = score(doc=5282,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.19345059 = fieldWeight in 5282, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5282)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: The Web has become a large repository of documents (or pages) written in many different languages. In this context, traditional information retrieval (IR) techniques cannot be used whenever the user query and the documents being retrieved are in different languages. To address this problem, new cross-language information retrieval (CLIR) techniques have been proposed. In this work, we describe a method for cross-language retrieval of medical information. This method combines query terms and related medical concepts obtained automatically through a categorization procedure. The medical concepts are used to create a linguistic abstraction that allows retrieval of information in a language-independent way, minimizing linguistic problems such as polysemy. To evaluate our method, we carried out experiments using the OHSUMED test collection, whose documents are written in English, with queries expressed in Portuguese, Spanish, and French. The results indicate that our cross-language retrieval method is as effective as a standard vector space model algorithm operating on queries and documents in the same language. Further, our results are better than previous results in the literature.
Date: 22. 7.2006 16:46:36
Source: Journal of the American Society for Information Science and Technology. 57(2006) no.4, S.501-510

Seo, H.-C.; Kim, S.-B.; Rim, H.-C.; Myaeng, S.-H.: lmproving query translation in English-Korean Cross-language information retrieval (2005) 0.04

0.03888139 = product of:
  0.07776278 = sum of:
    0.035423465 = weight(_text_:retrieval in 1023) [ClassicSimilarity], result of:
      0.035423465 = score(doc=1023,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.2835858 = fieldWeight in 1023, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=1023)
    0.018933605 = weight(_text_:of in 1023) [ClassicSimilarity], result of:
      0.018933605 = score(doc=1023,freq=16.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.2932045 = fieldWeight in 1023, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=1023)
    0.006621159 = product of:
      0.013242318 = sum of:
        0.013242318 = weight(_text_:on in 1023) [ClassicSimilarity], result of:
          0.013242318 = score(doc=1023,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.14580199 = fieldWeight in 1023, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=1023)
      0.5 = coord(1/2)
    0.016784549 = product of:
      0.033569098 = sum of:
        0.033569098 = weight(_text_:22 in 1023) [ClassicSimilarity], result of:
          0.033569098 = score(doc=1023,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.23214069 = fieldWeight in 1023, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1023)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: Query translation is a viable method for cross-language information retrieval (CLIR), but it suffers from translation ambiguities caused by multiple translations of individual query terms. Previous research has employed various methods for disambiguation, including the method of selecting an individual target query term from multiple candidates by comparing their statistical associations with the candidate translations of other query terms. This paper proposes a new method where we examine all combinations of target query term translations corresponding to the source query terms, instead of looking at the candidates for each query term and selecting the best one at a time. The goodness value for a combination of target query terms is computed based on the association value between each pair of the terms in the combination. We tested our method using the NTCIR-3 English-Korean CLIR test collection. The results show some improvements regardless of the association measures we used.
Date: 26.12.2007 20:22:38

Landry, P.: Multilingual subject access : the linking approach of MACS (2004) 0.04

0.037809726 = product of:
  0.100825936 = sum of:
    0.029945528 = weight(_text_:use in 4825) [ClassicSimilarity], result of:
      0.029945528 = score(doc=4825,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.23682132 = fieldWeight in 4825, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4825)
    0.020662563 = weight(_text_:of in 4825) [ClassicSimilarity], result of:
      0.020662563 = score(doc=4825,freq=14.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.31997898 = fieldWeight in 4825, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=4825)
    0.05021784 = product of:
      0.10043568 = sum of:
        0.10043568 = weight(_text_:line in 4825) [ClassicSimilarity], result of:
          0.10043568 = score(doc=4825,freq=2.0), product of:
            0.23157367 = queryWeight, product of:
              5.6078424 = idf(docFreq=440, maxDocs=44218)
              0.041294612 = queryNorm
            0.4337094 = fieldWeight in 4825, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.6078424 = idf(docFreq=440, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4825)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: In line with the international flavour of the book, Patrice Landry looks at the multilingual problem. This chapter is mainly concerned with a review of MACS (Multilingual Access to Subjects); a project with the strategy of developing a Web-based link and search interface through which equivalents between three Subject Heading Languages can be created and maintained, and by which users can access online databases in the language of their choice. The three systems in the project are German, French and English language. With the dramatic spread of use of the Web, particularly in the Far East, such projects are going to be increasingly valuable and important.

Wang, F.L.; Yang, C.C.: ¬The impact analysis of language differences on an automatic multilingual text summarization system (2006) 0.04

0.03768139 = product of:
  0.07536278 = sum of:
    0.020873476 = weight(_text_:retrieval in 5049) [ClassicSimilarity], result of:
      0.020873476 = score(doc=5049,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.16710453 = fieldWeight in 5049, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5049)
    0.021389665 = weight(_text_:use in 5049) [ClassicSimilarity], result of:
      0.021389665 = score(doc=5049,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.1691581 = fieldWeight in 5049, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5049)
    0.01850135 = weight(_text_:of in 5049) [ClassicSimilarity], result of:
      0.01850135 = score(doc=5049,freq=22.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.28651062 = fieldWeight in 5049, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5049)
    0.014598285 = product of:
      0.02919657 = sum of:
        0.02919657 = weight(_text_:on in 5049) [ClassicSimilarity], result of:
          0.02919657 = score(doc=5049,freq=14.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.3214632 = fieldWeight in 5049, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5049)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: Based on the salient features of the documents, automatic text summarization systems extract the key sentences from source documents. This process supports the users in evaluating the relevance of the extracted documents returned by information retrieval systems. Because of this tool, efficient filtering can be achieved. Indirectly, these systems help to resolve the problem of information overloading. Many automatic text summarization systems have been implemented for use with different languages. It has been established that the grammatical and lexical differences between languages have a significant effect on text processing. However, the impact of the language differences on the automatic text summarization systems has not yet been investigated. The authors provide an impact analysis of language difference on automatic text summarization. It includes the effect on the extraction processes, the scoring mechanisms, the performance, and the matching of the extracted sentences, using the parallel corpus in English and Chinese as the tested object. The analysis results provide a greater understanding of language differences and promote the future development of more advanced text summarization techniques.
Footnote: Beitrag einer special topic section on multilingual information systems
Source: Journal of the American Society for Information Science and Technology. 57(2006) no.5, S.684-696

Talvensaari, T.; Laurikkala, J.; Järvelin, K.; Juhola, M.: ¬A study on automatic creation of a comparable document collection in cross-language information retrieval (2006) 0.04

0.037033595 = product of:
  0.07406719 = sum of:
    0.029519552 = weight(_text_:retrieval in 5601) [ClassicSimilarity], result of:
      0.029519552 = score(doc=5601,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.23632148 = fieldWeight in 5601, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5601)
    0.021389665 = weight(_text_:use in 5601) [ClassicSimilarity], result of:
      0.021389665 = score(doc=5601,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.1691581 = fieldWeight in 5601, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5601)
    0.017640345 = weight(_text_:of in 5601) [ClassicSimilarity], result of:
      0.017640345 = score(doc=5601,freq=20.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.27317715 = fieldWeight in 5601, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5601)
    0.0055176322 = product of:
      0.0110352645 = sum of:
        0.0110352645 = weight(_text_:on in 5601) [ClassicSimilarity], result of:
          0.0110352645 = score(doc=5601,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.121501654 = fieldWeight in 5601, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5601)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: Purpose - To present a method for creating a comparable document collection from two document collections in different languages. Design/methodology/approach - The best query keys were extracted from a Finnish source collection (articles of the newspaper Aamulehti) with the relative average term frequency formula. The keys were translated into English with a dictionary-based query translation program. The resulting lists of words were used as queries that were run against the target collection (Los Angeles Times articles) with the nearest neighbor method. The documents were aligned with unrestricted and date-restricted alignment schemes, which were also combined. Findings - The combined alignment scheme was found the best, when the relatedness of the document pairs was assessed with a five-degree relevance scale. Of the 400 document pairs, roughly 40 percent were highly or fairly related and 75 percent included at least lexical similarity. Research limitations/implications - The number of alignment pairs was small due to the short common time period of the two collections, and their geographical (and thus, topical) remoteness. In future, our aim is to build larger comparable corpora in various languages and use them as source of translation knowledge for the purposes of cross-language information retrieval (CLIR). Practical implications - Readily available parallel corpora are scarce. With this method, two unrelated document collections can relatively easily be aligned to create a CLIR resource. Originality/value - The method can be applied to weakly linked collections and morphologically complex languages, such as Finnish.
Source: Journal of documentation. 62(2006) no.3, S.372-387

Francu, V.: Multilingual access to information using an intermediate language (2003) 0.04
```
0.036819052 = product of:
  0.073638104 = sum of:
    0.028923139 = weight(_text_:retrieval in 1742) [ClassicSimilarity], result of:
      0.028923139 = score(doc=1742,freq=6.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.23154683 = fieldWeight in 1742, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=1742)
    0.01711173 = weight(_text_:use in 1742) [ClassicSimilarity], result of:
      0.01711173 = score(doc=1742,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.13532647 = fieldWeight in 1742, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.03125 = fieldNorm(doc=1742)
    0.019957775 = weight(_text_:of in 1742) [ClassicSimilarity], result of:
      0.019957775 = score(doc=1742,freq=40.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.3090647 = fieldWeight in 1742, product of:
          6.3245554 = tf(freq=40.0), with freq of:
            40.0 = termFreq=40.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03125 = fieldNorm(doc=1742)
    0.007645456 = product of:
      0.015290912 = sum of:
        0.015290912 = weight(_text_:on in 1742) [ClassicSimilarity], result of:
          0.015290912 = score(doc=1742,freq=6.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.16835764 = fieldWeight in 1742, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.03125 = fieldNorm(doc=1742)
      0.5 = coord(1/2)
  0.5 = coord(4/8)
```
Abstract

While being theoretically so widely available, information can be restricted from a more general use by linguistic barriers. The linguistic aspects of the information languages and particularly the chances of an enhanced access to information by means of multilingual access facilities will make the substance of this thesis. The main problem of this research is thus to demonstrate that information retrieval can be improved by using multilingual thesaurus terms based on an intermediate or switching language to search with. Universal classification systems in general can play the role of switching languages for reasons dealt with in the forthcoming pages. The Universal Decimal Classification (UDC) in particular is the classification system used as example of a switching language for our objectives. The question may arise: why a universal classification system and not another thesaurus? Because the UDC like most of the classification systems uses symbols. Therefore, it is language independent and the problems of compatibility between such a thesaurus and different other thesauri in different languages are avoided. Another question may still arise? Why not then, assign running numbers to the descriptors in a thesaurus and make a switching language out of the resulting enumerative system? Because of some other characteristics of the UDC: hierarchical structure and terminological richness, consistency and control. One big problem to find an answer to is: can a thesaurus be made having as a basis a classification system in any and all its parts? To what extent this question can be given an affirmative answer? This depends much on the attributes of the universal classification system which can be favourably used to this purpose. Examples of different situations will be given and discussed upon beginning with those classes of UDC which are best fitted for building a thesaurus structure out of them (classes which are both hierarchical and faceted)...

Content

Inhalt: INFORMATION LANGUAGES: A LINGUISTIC APPROACH MULTILINGUAL ASPECTS IN INFORMATION STORAGE AND RETRIEVAL COMPATIBILITY AND CONVERTIBILITY OF INFORMATION LANGUAGES CURRENT TRENDS IN MULTILINGUAL ACCESS BUILDING UDC-BASED MULTILINGUAL THESAURI ONLINE APPLICATIONS OF THE UDC-BASED MULTILINGUAL THESAURI THE IMPACT OF SPECIFICITY ON THE RETRIEVAL POWER OF A UDC-BASED MULTILINGUAL THESAURUS FINAL REMARKS AND GENERAL CONCLUSIONS Proefschrift voorgelegd tot het behalen van de graad van doctor in de Taal- en Letterkunde aan de Universiteit Antwerpen. - Vgl.: http://dlist.sir.arizona.edu/1862/.

Rosemblat, G.; Graham, L.: Cross-language search in a monolingual health information system : flexible designs and lexical processes (2006) 0.03

0.033773154 = product of:
  0.09006175 = sum of:
    0.035423465 = weight(_text_:retrieval in 241) [ClassicSimilarity], result of:
      0.035423465 = score(doc=241,freq=4.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.2835858 = fieldWeight in 241, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=241)
    0.011594418 = weight(_text_:of in 241) [ClassicSimilarity], result of:
      0.011594418 = score(doc=241,freq=6.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.17955035 = fieldWeight in 241, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=241)
    0.043043863 = product of:
      0.086087726 = sum of:
        0.086087726 = weight(_text_:line in 241) [ClassicSimilarity], result of:
          0.086087726 = score(doc=241,freq=2.0), product of:
            0.23157367 = queryWeight, product of:
              5.6078424 = idf(docFreq=440, maxDocs=44218)
              0.041294612 = queryNorm
            0.37175092 = fieldWeight in 241, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.6078424 = idf(docFreq=440, maxDocs=44218)
              0.046875 = fieldNorm(doc=241)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: The predominance of English-only online health information poses a serious challenge to nonEnglish speakers. To overcome this barrier, we incorporated cross-language information retrieval (CLIR) techniques into a fully functional prototype. It supports Spanish language searches over an English data set using a Spanish-English bilingual term list (BTL). The modular design allows for system and BTL growth and takes advantage of English-system enhancements. Language-based design decisions and implications for integrating non-English components with the existing monolingual architecture are presented. Algorithmic and BTL improvements are used to bring CUR retrieval scores in line with the monolingual values. After validating these changes, we conducted a failure analysis and error categorization for the worst performing queries. We conclude with a comprehensive discussion and directions for future work.
Source: Knowledge organization for a global learning society: Proceedings of the 9th International ISKO Conference, 4-7 July 2006, Vienna, Austria. Hrsg.: G. Budin, C. Swertz u. K. Mitgutsch

Fujita, S.: NTCIR-2 as a Rosetta stone in laboratory experiments of IR systems (2005) 0.03

0.032473434 = product of:
  0.086595826 = sum of:
    0.06534432 = weight(_text_:retrieval in 1017) [ClassicSimilarity], result of:
      0.06534432 = score(doc=1017,freq=10.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.5231199 = fieldWeight in 1017, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1017)
    0.013526822 = weight(_text_:of in 1017) [ClassicSimilarity], result of:
      0.013526822 = score(doc=1017,freq=6.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.20947541 = fieldWeight in 1017, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1017)
    0.007724685 = product of:
      0.01544937 = sum of:
        0.01544937 = weight(_text_:on in 1017) [ClassicSimilarity], result of:
          0.01544937 = score(doc=1017,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.17010231 = fieldWeight in 1017, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1017)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: This paper presents a laboratory based evaluation study of cross-language information retrieval technologies, utilizing partially parallel test collections, NTCIR-2 (used together with NTCIR-1), where Japanese-English parallel document collections, parallel topic sets and their relevance judgments are available. These enable us to observe and compare monolingual retrieval processes in two languages as well as retrieval across languages. Our experiments focused on (1) the Rosetta stone question (whether a partially parallel collection helps in cross-language information access or not?) and (2) two aspects of retrieval difficulties namely "collection discrepancy" and "query discrepancy". Japanese and English monolingual retrieval systems are combined by dictionary based query translation modules so that a symmetrical bilingual evaluation environment is implemented.

Oard, D.W.; He, D.; Wang, J.: User-assisted query translation for interactive cross-language information retrieval (2008) 0.03

0.031507738 = product of:
  0.08402064 = sum of:
    0.04338471 = weight(_text_:retrieval in 2030) [ClassicSimilarity], result of:
      0.04338471 = score(doc=2030,freq=6.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.34732026 = fieldWeight in 2030, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2030)
    0.025667597 = weight(_text_:use in 2030) [ClassicSimilarity], result of:
      0.025667597 = score(doc=2030,freq=2.0), product of:
        0.12644777 = queryWeight, product of:
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.041294612 = queryNorm
        0.20298971 = fieldWeight in 2030, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.0620887 = idf(docFreq=5623, maxDocs=44218)
          0.046875 = fieldNorm(doc=2030)
    0.014968331 = weight(_text_:of in 2030) [ClassicSimilarity], result of:
      0.014968331 = score(doc=2030,freq=10.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.23179851 = fieldWeight in 2030, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2030)
  0.375 = coord(3/8)

Abstract: Interactive Cross-Language Information Retrieval (CLIR), a process in which searcher and system collaborate to find documents that satisfy an information need regardless of the language in which those documents are written, calls for designs in which synergies between searcher and system can be leveraged so that the strengths of one can cover weaknesses of the other. This paper describes an approach that employs user-assisted query translation to help searchers better understand the system's operation. Supporting interaction and interface designs are introduced, and results from three user studies are presented. The results indicate that experienced searchers presented with this new system evolve new search strategies that make effective use of the new capabilities, that they achieve retrieval effectiveness comparable to results obtained using fully automatic techniques, and that reported satisfaction with support for cross-language searching increased. The paper concludes with a description of a freely available interactive CLIR system that incorporates lessons learned from this research.

Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.03

0.031395435 = product of:
  0.06279087 = sum of:
    0.025048172 = weight(_text_:retrieval in 4436) [ClassicSimilarity], result of:
      0.025048172 = score(doc=4436,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.20052543 = fieldWeight in 4436, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=4436)
    0.011594418 = weight(_text_:of in 4436) [ClassicSimilarity], result of:
      0.011594418 = score(doc=4436,freq=6.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.17955035 = fieldWeight in 4436, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=4436)
    0.009363732 = product of:
      0.018727465 = sum of:
        0.018727465 = weight(_text_:on in 4436) [ClassicSimilarity], result of:
          0.018727465 = score(doc=4436,freq=4.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.20619515 = fieldWeight in 4436, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
      0.5 = coord(1/2)
    0.016784549 = product of:
      0.033569098 = sum of:
        0.033569098 = weight(_text_:22 in 4436) [ClassicSimilarity], result of:
          0.033569098 = score(doc=4436,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.23214069 = fieldWeight in 4436, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: Language barrier is the major problem that people face in searching for, retrieving, and understanding multilingual collections on the Internet. This paper deals with query translation and document translation in a Chinese-English information retrieval system called MTIR. Bilingual dictionary and monolingual corpus-based approaches are adopted to select suitable tranlated query terms. A machine transliteration algorithm is introduced to resolve proper name searching. We consider several design issues for document translation, including which material is translated, what roles the HTML tags play in translation, what the tradeoff is between the speed performance and the translation performance, and what from the translated result is presented in. About 100.000 Web pages translated in the last 4 months of 1997 are used for quantitative study of online and real-time Web page translation
Date: 16. 2.2000 14:22:39
Source: Journal of the American Society for Information Science. 51(2000) no.3, S.281-296

Levergood, B.; Farrenkopf, S.; Frasnelli, E.: ¬The specification of the language of the field and interoperability : cross-language access to catalogues and online libraries (CACAO) (2008) 0.03

0.03092098 = product of:
  0.06184196 = sum of:
    0.025048172 = weight(_text_:retrieval in 2646) [ClassicSimilarity], result of:
      0.025048172 = score(doc=2646,freq=2.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.20052543 = fieldWeight in 2646, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2646)
    0.013388081 = weight(_text_:of in 2646) [ClassicSimilarity], result of:
      0.013388081 = score(doc=2646,freq=8.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.20732689 = fieldWeight in 2646, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2646)
    0.006621159 = product of:
      0.013242318 = sum of:
        0.013242318 = weight(_text_:on in 2646) [ClassicSimilarity], result of:
          0.013242318 = score(doc=2646,freq=2.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.14580199 = fieldWeight in 2646, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=2646)
      0.5 = coord(1/2)
    0.016784549 = product of:
      0.033569098 = sum of:
        0.033569098 = weight(_text_:22 in 2646) [ClassicSimilarity], result of:
          0.033569098 = score(doc=2646,freq=2.0), product of:
            0.1446067 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.041294612 = queryNorm
            0.23214069 = fieldWeight in 2646, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2646)
      0.5 = coord(1/2)
  0.5 = coord(4/8)

Abstract: The CACAO Project (Cross-language Access to Catalogues and Online Libraries) has been designed to implement natural language processing and cross-language information retrieval techniques to provide cross-language access to information in libraries, a critical issue in the linguistically diverse European Union. This project report addresses two metadata-related challenges for the library community in this context: "false friends" (identical words having different meanings in different languages) and term ambiguity. The possible solutions involve enriching the metadata with attributes specifying language or the source authority file, or associating potential search terms to classes in a classification system. The European Library will evaluate an early implementation of this work in late 2008.
Source: Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas

Francu, V.: ¬The impact of specificity on the retrieval power of a UDC-based multilingual thesaurus (2003) 0.03

0.030852102 = product of:
  0.08227227 = sum of:
    0.050096344 = weight(_text_:retrieval in 5518) [ClassicSimilarity], result of:
      0.050096344 = score(doc=5518,freq=8.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.40105087 = fieldWeight in 5518, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=5518)
    0.018933605 = weight(_text_:of in 5518) [ClassicSimilarity], result of:
      0.018933605 = score(doc=5518,freq=16.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.2932045 = fieldWeight in 5518, product of:
          4.0 = tf(freq=16.0), with freq of:
            16.0 = termFreq=16.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=5518)
    0.013242318 = product of:
      0.026484637 = sum of:
        0.026484637 = weight(_text_:on in 5518) [ClassicSimilarity], result of:
          0.026484637 = score(doc=5518,freq=8.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.29160398 = fieldWeight in 5518, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=5518)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: The article describes the research done over a bibliographic database in order to show the impact the specificity of the knowledge organising tools may have on information retrieval (IR). For this purpose two multilingual Universal Decimal Classification (UDC) based thesauri having different degrees of specificity are considered. Issues of harmonising a classificatory structure with a thesaurus structure are introduced, and significant aspects of information retrieval in a multilingual environment are examined in an extensive manner. Aspects of complementarity are discussed with particular emphasis on the real impact produced on IR by alternative search facilities. Finally, a number of conclusions are formulated as they arise from the study.
Content: Beitrag eines Themenheftes "Knowledge organization and classification in international information retrieval"

Petrelli, D.; Levin, S.; Beaulieu, M.; Sanderson, M.: Which user interaction for cross-language information retrieval? : design issues and reflections (2006) 0.03

0.029900867 = product of:
  0.079735644 = sum of:
    0.050096344 = weight(_text_:retrieval in 5053) [ClassicSimilarity], result of:
      0.050096344 = score(doc=5053,freq=8.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.40105087 = fieldWeight in 5053, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=5053)
    0.016396983 = weight(_text_:of in 5053) [ClassicSimilarity], result of:
      0.016396983 = score(doc=5053,freq=12.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.25392252 = fieldWeight in 5053, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=5053)
    0.013242318 = product of:
      0.026484637 = sum of:
        0.026484637 = weight(_text_:on in 5053) [ClassicSimilarity], result of:
          0.026484637 = score(doc=5053,freq=8.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.29160398 = fieldWeight in 5053, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=5053)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: A novel and complex form of information access is cross-language information retrieval: searching for texts written in foreign languages based on native language queries. Although the underlying technology for achieving such a search is relatively well understood, the appropriate interface design is not. The authors present three user evaluations undertaken during the iterative design of Clarity, a cross-language retrieval system for lowdensity languages, and shows how the user-interaction design evolved depending on the results of usability tests. The first test was instrumental to identify weaknesses in both functionalities and interface; the second was run to determine if query translation should be shown or not; the final was a global assessment and focused on user satisfaction criteria. Lessons were learned at every stage of the process leading to a much more informed view of what a cross-language retrieval system should offer to users.
Footnote: Beitrag einer special topic section on multilingual information systems
Source: Journal of the American Society for Information Science and Technology. 57(2006) no.5, S.709-722

Airio, E.: Who benefits from CLIR in web retrieval? (2008) 0.03

0.029728236 = product of:
  0.079275295 = sum of:
    0.050096344 = weight(_text_:retrieval in 2342) [ClassicSimilarity], result of:
      0.050096344 = score(doc=2342,freq=8.0), product of:
        0.124912694 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.041294612 = queryNorm
        0.40105087 = fieldWeight in 2342, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2342)
    0.017710768 = weight(_text_:of in 2342) [ClassicSimilarity], result of:
      0.017710768 = score(doc=2342,freq=14.0), product of:
        0.06457475 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.041294612 = queryNorm
        0.2742677 = fieldWeight in 2342, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2342)
    0.011468184 = product of:
      0.022936368 = sum of:
        0.022936368 = weight(_text_:on in 2342) [ClassicSimilarity], result of:
          0.022936368 = score(doc=2342,freq=6.0), product of:
            0.090823986 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.041294612 = queryNorm
            0.25253648 = fieldWeight in 2342, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=2342)
      0.5 = coord(1/2)
  0.375 = coord(3/8)

Abstract: Purpose - The aim of the current paper is to test whether query translation is beneficial in web retrieval. Design/methodology/approach - The language pairs were Finnish-Swedish, English-German and Finnish-French. A total of 12-18 participants were recruited for each language pair. Each participant performed four retrieval tasks. The author's aim was to compare the performance of the translated queries with that of the target language queries. Thus, the author asked participants to formulate a source language query and a target language query for each task. The source language queries were translated into the target language utilizing a dictionary-based system. In English-German, also machine translation was utilized. The author used Google as the search engine. Findings - The results differed depending on the language pair. The author concluded that the dictionary coverage had an effect on the results. On average, the results of query-translation were better than in the traditional laboratory tests. Originality/value - This research shows that query translation in web is beneficial especially for users with moderate and non-active language skills. This is valuable information for developers of cross-language information retrieval systems.
Source: Journal of documentation. 64(2008) no.5, S.760-778

Search (134 results, page 1 of 7)

Authors

Languages

Types

Themes

Classifications