Search (218 results, page 1 of 11)

Cao, L.; Leong, M.-K.; Low, H.-B.: Searching heterogeneous multilingual bibliographic sources (1998) 0.10

0.09784785 = product of:
  0.1956957 = sum of:
    0.07356817 = weight(_text_:wide in 3564) [ClassicSimilarity], result of:
      0.07356817 = score(doc=3564,freq=2.0), product of:
        0.18785246 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.042397358 = queryNorm
        0.3916274 = fieldWeight in 3564, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0625 = fieldNorm(doc=3564)
    0.05644414 = weight(_text_:web in 3564) [ClassicSimilarity], result of:
      0.05644414 = score(doc=3564,freq=4.0), product of:
        0.13836423 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.042397358 = queryNorm
        0.4079388 = fieldWeight in 3564, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0625 = fieldNorm(doc=3564)
    0.065683395 = product of:
      0.09852509 = sum of:
        0.052571043 = weight(_text_:system in 3564) [ClassicSimilarity], result of:
          0.052571043 = score(doc=3564,freq=4.0), product of:
            0.13353272 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.042397358 = queryNorm
            0.3936941 = fieldWeight in 3564, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0625 = fieldNorm(doc=3564)
        0.045954052 = weight(_text_:22 in 3564) [ClassicSimilarity], result of:
          0.045954052 = score(doc=3564,freq=2.0), product of:
            0.14846832 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.042397358 = queryNorm
            0.30952093 = fieldWeight in 3564, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=3564)
      0.6666667 = coord(2/3)
  0.5 = coord(3/6)

Abstract: Proposes a Web-based architecture for searching distributed heterogeneous multi-Asian language bibliographic sources, and describes a successful pilot implementation of the system at the Chinese Library (CLib) system developed in Singapore and tested at 2 university libraries and a public library.
Date: 1. 8.1996 22:08:06
Footnote: Contribution to a special issue devoted to the Proceedings of the 7th International World Wide Web Conference, held 14-18 April 1998, Brisbane, Australia
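
The score breakdown for this first record (doc=3564) can be re-derived by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The queryNorm and fieldNorm values are copied from the explain output rather than derived, and the helper names (`idf`, `term_score`) are illustrative only. Lucene computes in 32-bit floats, so the result should agree with the reported 0.09784785 only up to small rounding differences.

```python
import math

# Values copied from the explain output above (not derived here).
QUERY_NORM = 0.042397358
MAX_DOCS = 44218
FIELD_NORM = 0.0625  # fieldNorm(doc=3564)


def idf(doc_freq, max_docs=MAX_DOCS):
    """ClassicSimilarity inverse document frequency."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm):
    """Score of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Per-term scores for doc=3564, using freq and docFreq from the tree.
wide = term_score(2.0, 1430, FIELD_NORM)
web = term_score(4.0, 4597, FIELD_NORM)
system = term_score(4.0, 5152, FIELD_NORM)
term_22 = term_score(2.0, 3622, FIELD_NORM)

# Nested clause: two of its three sub-clauses matched, so coord(2/3).
nested = (system + term_22) * (2 / 3)

# Top level: three of six clauses matched, so coord(3/6).
score = (wide + web + nested) * (3 / 6)

print(f"wide={wide:.8f} web={web:.8f} nested={nested:.8f} score={score:.8f}")
```

The same two helpers apply to the other records in this list; only freq, docFreq, fieldNorm, and the coord fractions change.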

Talvensaari, T.; Juhola, M.; Laurikkala, J.; Järvelin, K.: Corpus-based cross-language information retrieval in retrieval of highly relevant documents (2007) 0.08

0.08102091 = product of:
  0.12153135 = sum of:
    0.045980107 = weight(_text_:wide in 139) [ClassicSimilarity], result of:
      0.045980107 = score(doc=139,freq=2.0), product of:
        0.18785246 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.042397358 = queryNorm
        0.24476713 = fieldWeight in 139, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=139)
    0.02494502 = weight(_text_:web in 139) [ClassicSimilarity], result of:
      0.02494502 = score(doc=139,freq=2.0), product of:
        0.13836423 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.042397358 = queryNorm
        0.18028519 = fieldWeight in 139, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=139)
    0.04286178 = weight(_text_:retrieval in 139) [ClassicSimilarity], result of:
      0.04286178 = score(doc=139,freq=8.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.33420905 = fieldWeight in 139, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=139)
    0.007744446 = product of:
      0.023233337 = sum of:
        0.023233337 = weight(_text_:system in 139) [ClassicSimilarity], result of:
          0.023233337 = score(doc=139,freq=2.0), product of:
            0.13353272 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.042397358 = queryNorm
            0.17398985 = fieldWeight in 139, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=139)
      0.33333334 = coord(1/3)
  0.6666667 = coord(4/6)

Abstract: Information retrieval systems' ability to retrieve highly relevant documents has become more and more important in the age of extremely large collections, such as the World Wide Web (WWW). The authors' aim was to find out how corpus-based cross-language information retrieval (CLIR) manages in retrieving highly relevant documents. They created a Finnish-Swedish comparable corpus from two loosely related document collections and used it as a source of knowledge for query translation. Finnish test queries were translated into Swedish and run against a Swedish test collection. Graded relevance assessments were used in evaluating the results, and three relevance criterion levels (liberal, regular, and stringent) were applied. The runs were also evaluated with generalized recall and precision, which weight the retrieved documents according to their relevance level. The performance of the Comparable Corpus Translation system (COCOT) was compared to that of a dictionary-based query translation program; the two translation methods were also combined. The results indicate that corpus-based CLIR performs particularly well with highly relevant documents. In average precision, COCOT even matched the monolingual baseline on the highest relevance level. The performance of the different query translation methods was further analyzed by finding out reasons for poor rankings of highly relevant documents.

Li, K.W.; Yang, C.C.: Conceptual analysis of parallel corpus collected from the Web (2006) 0.06

0.06483132 = product of:
  0.12966263 = sum of:
    0.065025695 = weight(_text_:wide in 5051) [ClassicSimilarity], result of:
      0.065025695 = score(doc=5051,freq=4.0), product of:
        0.18785246 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.042397358 = queryNorm
        0.34615302 = fieldWeight in 5051, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5051)
    0.043206044 = weight(_text_:web in 5051) [ClassicSimilarity], result of:
      0.043206044 = score(doc=5051,freq=6.0), product of:
        0.13836423 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.042397358 = queryNorm
        0.3122631 = fieldWeight in 5051, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5051)
    0.02143089 = weight(_text_:retrieval in 5051) [ClassicSimilarity], result of:
      0.02143089 = score(doc=5051,freq=2.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.16710453 = fieldWeight in 5051, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5051)
  0.5 = coord(3/6)

Abstract: As illustrated by the World Wide Web, the volume of information in languages other than English has grown significantly in recent years. This highlights the importance of multilingual corpora. Much effort has been devoted to the compilation of multilingual corpora for the purpose of cross-lingual information retrieval and machine translation. Existing parallel corpora mostly involve European languages, such as English-French and English-Spanish. There is still a lack of parallel corpora between European languages and Asian languages. In the authors' previous work, an alignment method to identify one-to-one Chinese and English title pairs was developed to construct an English-Chinese parallel corpus that works automatically from the World Wide Web, and 100% precision and 87% recall were obtained. Careful analysis of these results has helped the authors to understand how the alignment method can be improved. A conceptual analysis was conducted, which includes the analysis of conceptual equivalent and conceptual information alternation in the aligned and nonaligned English-Chinese title pairs that are obtained by the alignment method. The result of the analysis not only reflects the characteristics of parallel corpora, but also gives insight into the strengths and weaknesses of the alignment method. In particular, conceptual alternation, such as omission and addition, is found to have a significant impact on the performance of the alignment method.

Powell, J.; Fox, E.A.: Multilingual federated searching across heterogeneous collections (1998) 0.06

0.06293566 = product of:
  0.12587132 = sum of:
    0.07356817 = weight(_text_:wide in 1250) [ClassicSimilarity], result of:
      0.07356817 = score(doc=1250,freq=2.0), product of:
        0.18785246 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.042397358 = queryNorm
        0.3916274 = fieldWeight in 1250, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0625 = fieldNorm(doc=1250)
    0.03991203 = weight(_text_:web in 1250) [ClassicSimilarity], result of:
      0.03991203 = score(doc=1250,freq=2.0), product of:
        0.13836423 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.042397358 = queryNorm
        0.2884563 = fieldWeight in 1250, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0625 = fieldNorm(doc=1250)
    0.012391115 = product of:
      0.037173342 = sum of:
        0.037173342 = weight(_text_:system in 1250) [ClassicSimilarity], result of:
          0.037173342 = score(doc=1250,freq=2.0), product of:
            0.13353272 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.042397358 = queryNorm
            0.27838376 = fieldWeight in 1250, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0625 = fieldNorm(doc=1250)
      0.33333334 = coord(1/3)
  0.5 = coord(3/6)

Abstract: This article describes a scalable system for searching heterogeneous multilingual collections on the World Wide Web. It details a markup language for describing the characteristics of a search engine and its interface, and a protocol for requesting word translations between languages.

Peters, C.; Picchi, E.: Across languages, across cultures : issues in multilinguality and digital libraries (1997) 0.06

0.060124356 = product of:
  0.12024871 = sum of:
    0.07356817 = weight(_text_:wide in 1233) [ClassicSimilarity], result of:
      0.07356817 = score(doc=1233,freq=2.0), product of:
        0.18785246 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.042397358 = queryNorm
        0.3916274 = fieldWeight in 1233, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0625 = fieldNorm(doc=1233)
    0.034289423 = weight(_text_:retrieval in 1233) [ClassicSimilarity], result of:
      0.034289423 = score(doc=1233,freq=2.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.26736724 = fieldWeight in 1233, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=1233)
    0.012391115 = product of:
      0.037173342 = sum of:
        0.037173342 = weight(_text_:system in 1233) [ClassicSimilarity], result of:
          0.037173342 = score(doc=1233,freq=2.0), product of:
            0.13353272 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.042397358 = queryNorm
            0.27838376 = fieldWeight in 1233, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0625 = fieldNorm(doc=1233)
      0.33333334 = coord(1/3)
  0.5 = coord(3/6)

Abstract: With the recent rapid diffusion over the international computer networks of world-wide distributed document bases, the question of multilingual access and multilingual information retrieval is becoming increasingly relevant. We briefly discuss just some of the issues that must be addressed in order to implement a multilingual interface for a Digital Library system and describe our own approach to this problem.

Airio, E.: Who benefits from CLIR in web retrieval? (2008) 0.06

0.056287363 = product of:
  0.11257473 = sum of:
    0.051847253 = weight(_text_:web in 2342) [ClassicSimilarity], result of:
      0.051847253 = score(doc=2342,freq=6.0), product of:
        0.13836423 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.042397358 = queryNorm
        0.37471575 = fieldWeight in 2342, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=2342)
    0.051434137 = weight(_text_:retrieval in 2342) [ClassicSimilarity], result of:
      0.051434137 = score(doc=2342,freq=8.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.40105087 = fieldWeight in 2342, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=2342)
    0.0092933355 = product of:
      0.027880006 = sum of:
        0.027880006 = weight(_text_:system in 2342) [ClassicSimilarity], result of:
          0.027880006 = score(doc=2342,freq=2.0), product of:
            0.13353272 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.042397358 = queryNorm
            0.20878783 = fieldWeight in 2342, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=2342)
      0.33333334 = coord(1/3)
  0.5 = coord(3/6)

Abstract: Purpose - The aim of the current paper is to test whether query translation is beneficial in web retrieval. Design/methodology/approach - The language pairs were Finnish-Swedish, English-German and Finnish-French. A total of 12-18 participants were recruited for each language pair. Each participant performed four retrieval tasks. The author's aim was to compare the performance of the translated queries with that of the target language queries. Thus, the author asked participants to formulate a source language query and a target language query for each task. The source language queries were translated into the target language utilizing a dictionary-based system. For English-German, machine translation was also utilized. The author used Google as the search engine. Findings - The results differed depending on the language pair. The author concluded that the dictionary coverage had an effect on the results. On average, the results of query translation were better than in the traditional laboratory tests. Originality/value - This research shows that query translation on the web is beneficial especially for users with moderate and non-active language skills. This is valuable information for developers of cross-language information retrieval systems.

Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.05

0.054806937 = product of:
  0.10961387 = sum of:
    0.042333104 = weight(_text_:web in 4436) [ClassicSimilarity], result of:
      0.042333104 = score(doc=4436,freq=4.0), product of:
        0.13836423 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.042397358 = queryNorm
        0.3059541 = fieldWeight in 4436, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=4436)
    0.025717068 = weight(_text_:retrieval in 4436) [ClassicSimilarity], result of:
      0.025717068 = score(doc=4436,freq=2.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.20052543 = fieldWeight in 4436, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=4436)
    0.041563697 = product of:
      0.062345542 = sum of:
        0.027880006 = weight(_text_:system in 4436) [ClassicSimilarity], result of:
          0.027880006 = score(doc=4436,freq=2.0), product of:
            0.13353272 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.042397358 = queryNorm
            0.20878783 = fieldWeight in 4436, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
        0.034465536 = weight(_text_:22 in 4436) [ClassicSimilarity], result of:
          0.034465536 = score(doc=4436,freq=2.0), product of:
            0.14846832 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.042397358 = queryNorm
            0.23214069 = fieldWeight in 4436, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
      0.6666667 = coord(2/3)
  0.5 = coord(3/6)

Abstract: The language barrier is the major problem that people face in searching for, retrieving, and understanding multilingual collections on the Internet. This paper deals with query translation and document translation in a Chinese-English information retrieval system called MTIR. Bilingual dictionary and monolingual corpus-based approaches are adopted to select suitable translated query terms. A machine transliteration algorithm is introduced to resolve proper name searching. We consider several design issues for document translation, including which material is translated, what roles the HTML tags play in translation, what the tradeoff is between the speed performance and the translation performance, and what form the translated result is presented in. About 100,000 Web pages translated in the last 4 months of 1997 are used for a quantitative study of online and real-time Web page translation.
Date: 16. 2.2000 14:22:39

Peters, C.; Braschler, M.; Clough, P.: Multilingual information retrieval : from research to practice (2012) 0.05
```
0.05120425 = product of:
  0.1024085 = sum of:
    0.036784086 = weight(_text_:wide in 361) [ClassicSimilarity], result of:
      0.036784086 = score(doc=361,freq=2.0), product of:
        0.18785246 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.042397358 = queryNorm
        0.1958137 = fieldWeight in 361, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.03125 = fieldNorm(doc=361)
    0.056862578 = weight(_text_:retrieval in 361) [ClassicSimilarity], result of:
      0.056862578 = score(doc=361,freq=22.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.44337842 = fieldWeight in 361, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=361)
    0.008761841 = product of:
      0.026285522 = sum of:
        0.026285522 = weight(_text_:system in 361) [ClassicSimilarity], result of:
          0.026285522 = score(doc=361,freq=4.0), product of:
            0.13353272 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.042397358 = queryNorm
            0.19684705 = fieldWeight in 361, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03125 = fieldNorm(doc=361)
      0.33333334 = coord(1/3)
  0.5 = coord(3/6)
```
Abstract: We are living in a multilingual world and the diversity in languages which are used to interact with information access systems has generated a wide variety of challenges to be addressed by computer and information scientists. The growing amount of non-English information accessible globally and the increased worldwide exposure of enterprises also necessitates the adaptation of Information Retrieval (IR) methods to new, multilingual settings. Peters, Braschler and Clough present a comprehensive description of the technologies involved in designing and developing systems for Multilingual Information Retrieval (MLIR). They provide readers with broad coverage of the various issues involved in creating systems to make accessible digitally stored materials regardless of the language(s) they are written in. Details on Cross-Language Information Retrieval (CLIR) are also covered that help readers to understand how to develop retrieval systems that cross language boundaries. Their work is divided into six chapters and accompanies the reader step-by-step through the various stages involved in building, using and evaluating MLIR systems. The book concludes with some examples of recent applications that utilise MLIR technologies. Some of the techniques described have recently started to appear in commercial search systems, while others have the potential to be part of future incarnations. The book is intended for graduate students, scholars, and practitioners with a basic understanding of classical text retrieval methods. It offers guidelines and information on all aspects that need to be taken into consideration when building MLIR systems, while avoiding too many 'hands-on details' that could rapidly become obsolete. Thus it bridges the gap between the material covered by most of the classical IR textbooks and the novel requirements related to the acquisition and dissemination of information in whatever language it is stored.

Content: 1 Introduction; 2 Within-Language Information Retrieval; 3 Cross-Language Information Retrieval; 4 Interaction and User Interfaces; 5 Evaluation for Multilingual Information Retrieval Systems; 6 Applications of Multilingual Information Access

RSWK: Information-Retrieval-System / Mehrsprachigkeit / Abfrage / Zugriff
Subject: Information-Retrieval-System / Mehrsprachigkeit / Abfrage / Zugriff

Wang, J.-H.; Teng, J.-W.; Lu, W.-H.; Chien, L.-F.: Exploiting the Web as the multilingual corpus for unknown query translation (2006) 0.05

0.049363937 = product of:
  0.098727874 = sum of:
    0.059868045 = weight(_text_:web in 5050) [ClassicSimilarity], result of:
      0.059868045 = score(doc=5050,freq=8.0), product of:
        0.13836423 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.042397358 = queryNorm
        0.43268442 = fieldWeight in 5050, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.046875 = fieldNorm(doc=5050)
    0.025717068 = weight(_text_:retrieval in 5050) [ClassicSimilarity], result of:
      0.025717068 = score(doc=5050,freq=2.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.20052543 = fieldWeight in 5050, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.046875 = fieldNorm(doc=5050)
    0.01314276 = product of:
      0.03942828 = sum of:
        0.03942828 = weight(_text_:system in 5050) [ClassicSimilarity], result of:
          0.03942828 = score(doc=5050,freq=4.0), product of:
            0.13353272 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.042397358 = queryNorm
            0.29527056 = fieldWeight in 5050, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=5050)
      0.33333334 = coord(1/3)
  0.5 = coord(3/6)

Abstract: Users' cross-lingual queries to a digital library system might be short and the query terms may not be included in a common translation dictionary (unknown terms). In this article, the authors investigate the feasibility of exploiting the Web as the multilingual corpus source to translate unknown query terms for cross-language information retrieval in digital libraries. They propose a Web-based term translation approach to determine effective translations for unknown query terms by mining bilingual search-result pages obtained from a real Web search engine. This approach can enhance the construction of a domain-specific bilingual lexicon and bring multilingual support to a digital library that only has monolingual document collections. Very promising results have been obtained in generating effective translation equivalents for many unknown terms, including proper nouns, technical terms, and Web query terms, and in assisting bilingual lexicon construction for a real digital library system.

Larkey, L.S.; Connell, M.E.: Structured queries, language modelling, and relevance modelling in cross-language information retrieval (2005) 0.05
```
0.04920782 = product of:
  0.09841564 = sum of:
    0.045980107 = weight(_text_:wide in 1022) [ClassicSimilarity], result of:
      0.045980107 = score(doc=1022,freq=2.0), product of:
        0.18785246 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.042397358 = queryNorm
        0.24476713 = fieldWeight in 1022, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1022)
    0.04286178 = weight(_text_:retrieval in 1022) [ClassicSimilarity], result of:
      0.04286178 = score(doc=1022,freq=8.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.33420905 = fieldWeight in 1022, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1022)
    0.009573761 = product of:
      0.028721282 = sum of:
        0.028721282 = weight(_text_:22 in 1022) [ClassicSimilarity], result of:
          0.028721282 = score(doc=1022,freq=2.0), product of:
            0.14846832 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.042397358 = queryNorm
            0.19345059 = fieldWeight in 1022, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1022)
      0.33333334 = coord(1/3)
  0.5 = coord(3/6)
```
Abstract: Two probabilistic approaches to cross-lingual retrieval are in wide use today, those based on probabilistic models of relevance, as exemplified by INQUERY, and those based on language modeling. INQUERY, as a query net model, allows the easy incorporation of query operators, including a synonym operator, which has proven to be extremely useful in cross-language information retrieval (CLIR), in an approach often called structured query translation. In contrast, language models incorporate translation probabilities into a unified framework. We compare the two approaches on Arabic and Spanish data sets, using two kinds of bilingual dictionaries: one derived from a conventional dictionary, and one derived from a parallel corpus. We find that structured query processing gives slightly better results when queries are not expanded. On the other hand, when queries are expanded, language modeling gives better results, but only when using a probabilistic dictionary derived from a parallel corpus. We pursue two additional issues inherent in the comparison of structured query processing with language modeling. The first concerns query expansion, and the second is the role of translation probabilities. We compare conventional expansion techniques (pseudo-relevance feedback) with relevance modeling, a new IR approach which fits into the formal framework of language modeling. We find that relevance modeling and pseudo-relevance feedback achieve comparable levels of retrieval and that good translation probabilities confer a small but significant advantage.

Date: 26.12.2007 20:22:11

Ata, B.M.A.: SISDOM: a multilingual document retrieval system (1995) 0.04

0.044409744 = product of:
  0.13322923 = sum of:
    0.059391025 = weight(_text_:retrieval in 895) [ClassicSimilarity], result of:
      0.059391025 = score(doc=895,freq=6.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.46309367 = fieldWeight in 895, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0625 = fieldNorm(doc=895)
    0.073838204 = product of:
      0.110757306 = sum of:
        0.064386114 = weight(_text_:system in 895) [ClassicSimilarity], result of:
          0.064386114 = score(doc=895,freq=6.0), product of:
            0.13353272 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.042397358 = queryNorm
            0.48217484 = fieldWeight in 895, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0625 = fieldNorm(doc=895)
        0.04637119 = weight(_text_:29 in 895) [ClassicSimilarity], result of:
          0.04637119 = score(doc=895,freq=2.0), product of:
            0.14914064 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.042397358 = queryNorm
            0.31092256 = fieldWeight in 895, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0625 = fieldNorm(doc=895)
      0.6666667 = coord(2/3)
  0.33333334 = coord(2/6)

Abstract: The Malay language is widely used in Malaysia, Indonesia and Brunei. The growth in the number of documents written in Malay justifies the need for a document retrieval system for that language. Describes the implementation of a bilingual Malay and English full text document retrieval system: SIStem capaian DOkumen Multilingua (SISDOM), by the Kebangsaan University Malaysia. The system incorporates many facilities for users, including the choice of search techniques, browsing of retrieved documents, and ranking of documents.
Date: 31. 7.1996 9:29:12
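
The score breakdown for the SISDOM record above (doc=895) can be recomputed by hand. The following is a minimal sketch, not the search engine's own code: the function names are hypothetical, the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are inferred from the numbers shown in the tree, and queryNorm and fieldNorm are copied from the listing rather than derived. The engine appears to use 32-bit floats, so the last digits may differ slightly.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as it appears in the tree (assumed formula)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    """Score of one term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                       # tf(freq=6.0) = 2.4494898
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm       # idf * queryNorm
    field_weight = tf * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain tree for doc=895.
MAX_DOCS = 44218
QUERY_NORM = 0.042397358
FIELD_NORM = 0.0625

retrieval = term_score(6.0, 5836, MAX_DOCS, QUERY_NORM, FIELD_NORM)
system = term_score(6.0, 5152, MAX_DOCS, QUERY_NORM, FIELD_NORM)
term_29 = term_score(2.0, 3565, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# Inner clause: two of its three sub-terms matched, hence coord(2/3).
inner = (system + term_29) * (2 / 3)
# Outer query: two of its six clauses matched, hence coord(2/6).
total = (retrieval + inner) * (2 / 6)

print(f"{total:.6f}")
```

The result should agree with the listed score of 0.044409744 to within rounding. The two coord factors explain why this record ranks well below the first hit even though its individual term weights are comparable: only a third of the query clauses matched.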

Freitas-Junior, H.R.; Ribeiro-Neto, B.A.; Freitas-Vale, R. de; Laender, A.H.F.; Lima, L.R.S. de: Categorization-driven cross-language retrieval of medical information (2006) 0.04

0.043506764 = product of:
  0.08701353 = sum of:
    0.02494502 = weight(_text_:web in 5282) [ClassicSimilarity], result of:
      0.02494502 = score(doc=5282,freq=2.0), product of:
        0.13836423 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.042397358 = queryNorm
        0.18028519 = fieldWeight in 5282, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5282)
    0.052494746 = weight(_text_:retrieval in 5282) [ClassicSimilarity], result of:
      0.052494746 = score(doc=5282,freq=12.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.40932083 = fieldWeight in 5282, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5282)
    0.009573761 = product of:
      0.028721282 = sum of:
        0.028721282 = weight(_text_:22 in 5282) [ClassicSimilarity], result of:
          0.028721282 = score(doc=5282,freq=2.0), product of:
            0.14846832 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.042397358 = queryNorm
            0.19345059 = fieldWeight in 5282, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5282)
      0.33333334 = coord(1/3)
  0.5 = coord(3/6)

Abstract: The Web has become a large repository of documents (or pages) written in many different languages. In this context, traditional information retrieval (IR) techniques cannot be used whenever the user query and the documents being retrieved are in different languages. To address this problem, new cross-language information retrieval (CLIR) techniques have been proposed. In this work, we describe a method for cross-language retrieval of medical information. This method combines query terms and related medical concepts obtained automatically through a categorization procedure. The medical concepts are used to create a linguistic abstraction that allows retrieval of information in a language-independent way, minimizing linguistic problems such as polysemy. To evaluate our method, we carried out experiments using the OHSUMED test collection, whose documents are written in English, with queries expressed in Portuguese, Spanish, and French. The results indicate that our cross-language retrieval method is as effective as a standard vector space model algorithm operating on queries and documents in the same language. Further, our results are better than previous results in the literature.
Date: 22. 7.2006 16:46:36

Fulford, H.: Monolingual or multilingual web sites? : An exploratory study of UK SMEs (2000) 0.04
```
0.04027172 = product of:
  0.12081516 = sum of:
    0.045980107 = weight(_text_:wide in 5561) [ClassicSimilarity], result of:
      0.045980107 = score(doc=5561,freq=2.0), product of:
        0.18785246 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.042397358 = queryNorm
        0.24476713 = fieldWeight in 5561, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5561)
    0.074835055 = weight(_text_:web in 5561) [ClassicSimilarity], result of:
      0.074835055 = score(doc=5561,freq=18.0), product of:
        0.13836423 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.042397358 = queryNorm
        0.5408555 = fieldWeight in 5561, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5561)
  0.33333334 = coord(2/6)
```
Abstract

The strategic importance of the internet as a tool for penetrating global markets is increasingly being realized by UK-based SMEs (Small and Medium-sized Enterprises). This may be evidenced by the proliferation over the past few years of SME web sites promoting products and services, and more recently still by the growing number of SMEs offering facilities on their web sites for conducting business transactions online. In this paper, we report on an exploratory study considering the use being made of the world wide web by UK-based SMEs. The study is focussed on the strategies SMEs are employing to communicate via the web with an international client base. We investigate in particular the languages being used to present web content, considering specifically the extent to which English is being employed. Preliminary results obtained to date suggest that there is heavy reliance on the assumption that the language of the web is English. Based on the findings of our study, we discuss some of the performance and competition issues surrounding the use of foreign languages in business, and consider some of the possible barriers to SMEs creating multilingual web sites. We conclude by making some recommendations for SMEs endeavouring to establish a multilingual online presence, and note the strategic role to be played by web designers, IT consultants, business strategists, professional translators, and localization specialists to help achieve this presence effectively and professionally.

Hull, D.: A weighted Boolean model for cross-language text retrieval (1998) 0.04

0.037424047 = product of:
  0.112272136 = sum of:
    0.08908654 = weight(_text_:retrieval in 6307) [ClassicSimilarity], result of:
      0.08908654 = score(doc=6307,freq=6.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.6946405 = fieldWeight in 6307, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.09375 = fieldNorm(doc=6307)
    0.023185596 = product of:
      0.06955679 = sum of:
        0.06955679 = weight(_text_:29 in 6307) [ClassicSimilarity], result of:
          0.06955679 = score(doc=6307,freq=2.0), product of:
            0.14914064 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.042397358 = queryNorm
            0.46638384 = fieldWeight in 6307, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.09375 = fieldNorm(doc=6307)
      0.33333334 = coord(1/3)
  0.33333334 = coord(2/6)

Date: 5. 8.2001 14:04:29
Series: The Kluwer International series on information retrieval
Source: Cross-language information retrieval. Ed.: G. Grefenstette

Picchi, E.; Peters, C.: Cross-language information retrieval : a system for comparable corpus querying (1998) 0.04

0.03589107 = product of:
  0.10767321 = sum of:
    0.08908654 = weight(_text_:retrieval in 6305) [ClassicSimilarity], result of:
      0.08908654 = score(doc=6305,freq=6.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.6946405 = fieldWeight in 6305, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.09375 = fieldNorm(doc=6305)
    0.018586671 = product of:
      0.05576001 = sum of:
        0.05576001 = weight(_text_:system in 6305) [ClassicSimilarity], result of:
          0.05576001 = score(doc=6305,freq=2.0), product of:
            0.13353272 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.042397358 = queryNorm
            0.41757566 = fieldWeight in 6305, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.09375 = fieldNorm(doc=6305)
      0.33333334 = coord(1/3)
  0.33333334 = coord(2/6)

Series: The Kluwer International series on information retrieval
Source: Cross-language information retrieval. Ed.: G. Grefenstette

Subirats, I.; Prasad, A.R.D.; Keizer, J.; Bagdanov, A.: Implementation of rich metadata formats and semantic tools using DSpace (2008) 0.03
```
0.03497121 = product of:
  0.06994242 = sum of:
    0.019956015 = weight(_text_:web in 2656) [ClassicSimilarity], result of:
      0.019956015 = score(doc=2656,freq=2.0), product of:
        0.13836423 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.042397358 = queryNorm
        0.14422815 = fieldWeight in 2656, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.03125 = fieldNorm(doc=2656)
    0.017144712 = weight(_text_:retrieval in 2656) [ClassicSimilarity], result of:
      0.017144712 = score(doc=2656,freq=2.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.13368362 = fieldWeight in 2656, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=2656)
    0.032841697 = product of:
      0.049262546 = sum of:
        0.026285522 = weight(_text_:system in 2656) [ClassicSimilarity], result of:
          0.026285522 = score(doc=2656,freq=4.0), product of:
            0.13353272 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.042397358 = queryNorm
            0.19684705 = fieldWeight in 2656, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03125 = fieldNorm(doc=2656)
        0.022977026 = weight(_text_:22 in 2656) [ClassicSimilarity], result of:
          0.022977026 = score(doc=2656,freq=2.0), product of:
            0.14846832 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.042397358 = queryNorm
            0.15476047 = fieldWeight in 2656, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=2656)
      0.6666667 = coord(2/3)
  0.5 = coord(3/6)
```
Abstract

This poster explores the customization of DSpace to allow the use of the AGRIS Application Profile metadata standard and the AGROVOC thesaurus. The objective is the adaptation of DSpace, through the least invasive code changes either in the form of plug-ins or add-ons, to the specific needs of the Agricultural Sciences and Technology community. Metadata standards such as AGRIS AP, and Knowledge Organization Systems such as the AGROVOC thesaurus, provide mechanisms for sharing information in a standardized manner by recommending the use of common semantics and interoperable syntax (Subirats et al., 2007). AGRIS AP was created to enhance the description, exchange and subsequent retrieval of agricultural Document-like Information Objects (DLIOs). It is a metadata schema which draws from metadata standards such as Dublin Core (DC), the Australian Government Locator Service Metadata (AGLS) and the Agricultural Metadata Element Set (AgMES) namespaces. It allows sharing of information across dispersed bibliographic systems (FAO, 2005). AGROVOC is a multilingual structured thesaurus covering agricultural and related domains. Its main role is to standardize the indexing process in order to make searching simpler and more efficient. AGROVOC is developed by FAO (Lauser et al., 2006). The customization of DSpace is taking place in several phases. First, the AGRIS AP metadata schema was mapped onto the metadata DSpace model, with several enhancements implemented to support AGRIS AP elements. Next, AGROVOC will be integrated as a controlled vocabulary accessed through a local SKOS or OWL file. Eventually the system will be configurable to access AGROVOC through local files or remotely via webservices. Finally, spell checking and tooltips will be incorporated in the user interface to support metadata editing.
Adapting DSpace to support AGRIS AP and annotation using the semantically-rich AGROVOC thesaurus transforms DSpace into a powerful, domain-specific system for annotation and exchange of bibliographic metadata in the agricultural domain.

Source

Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas

Theme

Semantic Web

Qin, J.; Zhou, Y.; Chau, M.; Chen, H.: Multilingual Web retrieval : an experiment in English-Chinese business intelligence (2006) 0.03
```
0.03465478 = product of:
  0.10396434 = sum of:
    0.06110257 = weight(_text_:web in 5054) [ClassicSimilarity], result of:
      0.06110257 = score(doc=5054,freq=12.0), product of:
        0.13836423 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.042397358 = queryNorm
        0.4416067 = fieldWeight in 5054, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5054)
    0.04286178 = weight(_text_:retrieval in 5054) [ClassicSimilarity], result of:
      0.04286178 = score(doc=5054,freq=8.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.33420905 = fieldWeight in 5054, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5054)
  0.33333334 = coord(2/6)
```
Abstract

As increasing numbers of non-English resources have become available on the Web, the interesting and important issue of how Web users can retrieve documents in different languages has arisen. Cross-language information retrieval (CLIR), the study of retrieving information in one language by queries expressed in another language, is a promising approach to the problem. Cross-language information retrieval has attracted much attention in recent years. Most research systems have achieved satisfactory performance on standard Text REtrieval Conference (TREC) collections such as news articles, but CLIR techniques have not been widely studied and evaluated for applications such as Web portals. In this article, the authors present their research in developing and evaluating a multilingual English-Chinese Web portal that incorporates various CLIR techniques for use in the business domain. A dictionary-based approach was adopted that combines phrasal translation, co-occurrence analysis, and pre- and posttranslation query expansion. The portal was evaluated by domain experts, using a set of queries in both English and Chinese. The experimental results showed that co-occurrence-based phrasal translation achieved a 74.6% improvement in precision over simple word-by-word translation. When used together, pre- and posttranslation query expansion improved the performance slightly, achieving a 78.0% improvement over the baseline word-by-word translation approach. In general, applying CLIR techniques in Web applications shows promise.

Effektive Information Retrieval Verfahren in Theorie und Praxis : ausgewählte und erweiterte Beiträge des Vierten Hildesheimer Evaluierungs- und Retrievalworkshop (HIER 2005), Hildesheim, 20.7.2005 (2006) 0.03
```
0.034629855 = product of:
  0.06925971 = sum of:
    0.014111035 = weight(_text_:web in 5973) [ClassicSimilarity], result of:
      0.014111035 = score(doc=5973,freq=4.0), product of:
        0.13836423 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.042397358 = queryNorm
        0.1019847 = fieldWeight in 5973, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.015625 = fieldNorm(doc=5973)
    0.046952724 = weight(_text_:retrieval in 5973) [ClassicSimilarity], result of:
      0.046952724 = score(doc=5973,freq=60.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.36610767 = fieldWeight in 5973, product of:
          7.745967 = tf(freq=60.0), with freq of:
            60.0 = termFreq=60.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.015625 = fieldNorm(doc=5973)
    0.008195952 = product of:
      0.024587855 = sum of:
        0.024587855 = weight(_text_:system in 5973) [ClassicSimilarity], result of:
          0.024587855 = score(doc=5973,freq=14.0), product of:
            0.13353272 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.042397358 = queryNorm
            0.18413356 = fieldWeight in 5973, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.015625 = fieldNorm(doc=5973)
      0.33333334 = coord(1/3)
  0.5 = coord(3/6)
```
Abstract

Information retrieval has developed into a key technology of the knowledge society. The number of daily queries sent to Internet search engines is only one indicator of the great importance of this topic. The volume covers topics such as information retrieval fundamentals, retrieval systems, digital libraries, evaluation, and multilingual systems, describes application scenarios, and engages with new challenges for information retrieval. The contributions address current topics and new challenges for information retrieval. The intensive participation of information science at the University of Hildesheim in the Cross Language Evaluation Forum (CLEF), a European evaluation initiative for research on multilingual retrieval systems, touches several of the contributions. Application scenarios and engagement with current and practical questions likewise play a major role.

Content

Jan-Hendrik Scheufen: RECOIN: Modell offener Schnittstellen für Information-Retrieval-Systeme und -Komponenten
Markus Nick, Klaus-Dieter Althoff: Designing Maintainable Experience-based Information Systems
Gesine Quint, Steffen Weichert: Die benutzerzentrierte Entwicklung des Produkt-Retrieval-Systems EIKON der Blaupunkt GmbH
Claus-Peter Klas, Sascha Kriewel, André Schaefer, Gudrun Fischer: Das DAFFODIL System - Strategische Literaturrecherche in Digitalen Bibliotheken
Matthias Meiert: Entwicklung eines Modells zur Integration digitaler Dokumente in die Universitätsbibliothek Hildesheim
Daniel Harbig, René Schneider: Ontology Learning im Rahmen von MyShelf
Michael Kluck, Marco Winter: Topic-Entwicklung und Relevanzbewertung bei GIRT: ein Werkstattbericht
Thomas Mandl: Neue Entwicklungen bei den Evaluierungsinitiativen im Information Retrieval
Joachim Pfister: Clustering von Patent-Dokumenten am Beispiel der Datenbanken des Fachinformationszentrums Karlsruhe
Ralph Kölle, Glenn Langemeier, Wolfgang Semar: Programmieren lernen in kollaborativen Lernumgebungen
Olga Tartakovski, Margaryta Shramko: Implementierung eines Werkzeugs zur Sprachidentifikation in mono- und multilingualen Texten
Nina Kummer: Indexierungstechniken für das japanische Retrieval
Suriya Na Nhongkai, Hans-Joachim Bentz: Bilinguale Suche mittels Konzeptnetzen
Robert Strötgen, Thomas Mandl, René Schneider: Entwicklung und Evaluierung eines Question Answering Systems im Rahmen des Cross Language Evaluation Forum (CLEF)
Niels Jensen: Evaluierung von mehrsprachigem Web-Retrieval: Experimente mit dem EuroGOV-Korpus im Rahmen des Cross Language Evaluation Forum (CLEF)

Footnote

Review in: Information - Wissenschaft und Praxis 57(2006) no.5, pp.290-291 (C. Schindler): "Less than a year after the "Vierter Hildesheimer Evaluierungs- und Retrievalworkshop" (HIER 2005) in July 2005, the accompanying proceedings volume has appeared. Information science at Hildesheim had issued the invitation in order to present its research results, and those of several external experts, on the topic of information retrieval to a specialist audience and to put them up for discussion. Under the title "Effektive Information Retrieval Verfahren in Theorie und Praxis", almost all of the workshop's contributions are collected in the newly published volume, which comprises 15 contributions. With its focus on information retrieval (IR), the volume presents a subfield of information science that has always been at the center of information science research. Whether because of the growth in the performance of processors and storage media, the spread of the Internet across national borders, or the steady increase in knowledge production, it remains the case that in an increasingly interconnected world, orientation and the finding of documents in large bodies of knowledge have become a central challenge. The new volume presents current approaches to this topic, information retrieval, by means of practice-oriented projects and theoretical discussions. The core topic of information retrieval is divided in the volume into the areas of retrieval systems, digital library, evaluation, and multilingual systems. The articles of the individual sections are on the whole quite heterogeneous and therefore do not overlap in content. However, complete thematic coverage of the different areas is likewise not given, which can only be expected to a limited extent when presenting the research results of one institute and its cooperation partners.
Thus both in the structure and in the individual contributions a thematic concentration can be recognized that reflects the particular profile and distinctiveness of Hildesheim information science in the field of information retrieval. Part of this is the multilingual and interdisciplinary orientation, which focuses on the interfaces between information science, linguistics, and computer science in its practice-oriented and international research.
The first chapter, "Retrieval Systems", presents various information retrieval systems and discusses methods for designing them. Jan-Hendrik Scheufen introduces the meta-framework RECOIN for information retrieval research, which is distinguished by flexible handling of a wide variety of applications and thereby enables centralized logging and control of retrieval processes. This concept of an open, component-based system was realized as a plug-in for the Java-based open-source platform Eclipse. Markus Nick and Klaus-Dieter Althoff explain in their contribution, which incidentally is the only English-language text in the book, the DILLEBIS method for the maintenance of experience-based information systems. They call this approach a Maintainable Experience-based Information System and argue that experience-based systems should be aligned with this model. Gesine Quint and Steffen Weichert, by contrast, present in their contribution the user-centered development of the product retrieval system EIKON, which was realized in cooperation with Blaupunkt GmbH. In an iterative design cycle, group-specific interaction options were designed for a car multimedia accessories system. In the second chapter, several authors deal more specifically with the application area "Digital Library". Claus-Peter Klas, Sascha Kriewel, André Schaefer, and Gudrun Fischer of the University of Duisburg-Essen present the DAFFODIL system, which uses a large number of tools to provide strategic support for literature searches in digital libraries. In addition, the logging of all events allows the system to be used as an evaluation platform.
The paper by Matthias Meiert explains the implementation of electronic publication processes at universities, using the example of theses from the degree program Internationales Informationsmanagement at the University of Hildesheim. Besides the general conditions, both the current state and the target state of scholarly electronic publishing are presented in the form of group-specific recommendations. Daniel Harbig and René Schneider describe in their paper two approaches to the machine learning of ontologies, applied to the virtual library shelf MyShelf. After evaluating both approaches, the authors argue for a semi-automated procedure for creating ontologies.
"Evaluation", the topic of the third chapter, is not limited in its breadth to information retrieval but also includes individual aspects of human-machine interaction and e-learning. Michael Kluck and Marco Winter, of the Stiftung Wissenschaft und Politik and the Informationszentrum Sozialwissenschaften, address in their contribution the influence of the topic on the assessment of relevance and show procedures for topic creation that are used at the Cross Language Evaluation Forum (CLEF). In the following paper, Thomas Mandl presents various evaluation initiatives in information retrieval and current developments. Joachim Pfister explains in his contribution the automated grouping, known as clustering, of patent documents in the databases of the Fachinformationszentrum Karlsruhe and evaluates different clustering methods on the basis of user ratings. Ralph Kölle, Glenn Langemeier, and Wolfgang Semar address collaborative learning under the special conditions of programming. Here the system VitaminL for the synchronous working of programming tasks and the metric system K-3 for assessing collaborative work are applied in a course. The current research focus of Hildesheim information science emerges in the fourth chapter under the topic "Multilingual Systems". Most of the contributions to the proceedings are found here. Olga Tartakovski and Margaryta Shramko describe and test the system LangIdent, which identifies the language of mono- and multilingual texts. Nina Kummer presents the peculiarities of Japanese characters and experimentally compares the different indexing techniques.
Suriya Na Nhongkai and Hans-Joachim Bentz present and test a bilingual search based on concept networks, in which the concept structure is the connecting element between the two text collections. The development and evaluation of a multilingual question answering system within the Cross Language Evaluation Forum (CLEF), which allows concrete questions to be formulated in everyday language, is the subject of the contribution by Robert Strötgen, Thomas Mandl, and René Schneider. The volume closes with the paper by Niels Jensen, who evaluates a multilingual web retrieval system, likewise in connection with CLEF, using the multilingual EuroGOV corpus.
In conclusion, the proceedings give a successful overview of the information retrieval projects of Hildesheim information science and its cooperation partners. The individual contributions are very stimulating and of a high standard. A small obstacle for the reader is orientation within the volume in terms of content and structure. The relation of the individual articles to the topic of the chapter is briefly explained in the preface. However, orientation in the book is made more difficult by the lack of chapter headings at the beginning of the individual sections. It should also be mentioned that one of the articles bears a different title from the one announced in the table of contents. If the reader disregards these formal shortcomings, he is richly rewarded with practice-oriented and theoretically well-founded project descriptions and research results. This is especially so because current topics in information science are not only taken up but also developed further and shaped by the particular interdisciplinary and international conditions in Hildesheim. The various projects show how well Hildesheim information science is integrated into the community of supraregional information institutions and other German information science research groups. With a further opening of the expert group, the workshop thus has the potential to become an independent institution in the field of information retrieval. In this sense, one can hope for further fruitful workshops and their publications. A next workshop of the University of Hildesheim on information retrieval, organized with the Fachgruppe Information Retrieval of the Gesellschaft für Informatik, is already announced for 9 to 13 October 2006."

Li, Q.; Chen, Y.P.; Myaeng, S.-H.; Jin, Y.; Kang, B.-Y.: Concept unification of terms in different languages via web mining for Information Retrieval (2009) 0.03
```
0.034128264 = product of:
  0.10238479 = sum of:
    0.04989004 = weight(_text_:web in 4215) [ClassicSimilarity], result of:
      0.04989004 = score(doc=4215,freq=8.0), product of:
        0.13836423 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.042397358 = queryNorm
        0.36057037 = fieldWeight in 4215, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4215)
    0.052494746 = weight(_text_:retrieval in 4215) [ClassicSimilarity], result of:
      0.052494746 = score(doc=4215,freq=12.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.40932083 = fieldWeight in 4215, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4215)
  0.33333334 = coord(2/6)
```
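The breakdown above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which are consistent with the figures shown. The function and constant names are illustrative only; `queryNorm`, `fieldNorm`, and the frequencies are copied from the explain output for doc 4215.

```python
import math

# Values copied from the explain output above (doc 4215).
QUERY_NORM = 0.042397358
MAX_DOCS = 44218
FIELD_NORM = 0.0390625


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as ClassicSimilarity computes it."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int) -> float:
    """Score of one term: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, MAX_DOCS)
    tf = math.sqrt(freq)
    query_weight = idf * QUERY_NORM          # idf * queryNorm
    field_weight = tf * idf * FIELD_NORM     # tf * idf * fieldNorm
    return query_weight * field_weight


web = term_score(freq=8.0, doc_freq=4597)
retrieval = term_score(freq=12.0, doc_freq=5836)

# Two of the six query clauses matched, hence coord(2/6).
total = (web + retrieval) * (2 / 6)

print(f"web={web:.8f} retrieval={retrieval:.8f} total={total:.8f}")
```

Lucene computes these values in single precision, so the double-precision results here should agree with the listing only to about six decimal places.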
Abstract

For historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in East Asian languages such as Chinese, Korean, and Japanese. Although such English terms and their equivalents in these East Asian languages refer to the same concept, they are often erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and proposes a novel technique to solve it. Our method first extracts English terms from native Web documents in an East Asian language, and then unifies the extracted terms and their equivalents in the native language as one index unit. For Cross-Language Information Retrieval (CLIR), one of the major hindrances to achieving retrieval performance at the level of Mono-Lingual Information Retrieval (MLIR) is the translation of terms in search queries that cannot be found in a bilingual dictionary. The Web mining approach proposed in this paper for concept unification of terms in different languages can also be applied to solve this well-known challenge in CLIR. Experimental results based on NTCIR and KT-Set test collections show that the high translation precision of our approach greatly improves performance of both Mono-Lingual and Cross-Language Information Retrieval.
Multilingual information management : current levels and future abilities. A report Commissioned by the US National Science Foundation and also delivered to the European Commission's Language Engineering Office and the US Defense Advanced Research Projects Agency, April 1999 (1999) 0.03
```
0.032244157 = product of:
  0.064488314 = sum of:
    0.019956015 = weight(_text_:web in 6068) [ClassicSimilarity], result of:
      0.019956015 = score(doc=6068,freq=2.0), product of:
        0.13836423 = queryWeight, product of:
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.042397358 = queryNorm
        0.14422815 = fieldWeight in 6068, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.2635105 = idf(docFreq=4597, maxDocs=44218)
          0.03125 = fieldNorm(doc=6068)
    0.038336743 = weight(_text_:retrieval in 6068) [ClassicSimilarity], result of:
      0.038336743 = score(doc=6068,freq=10.0), product of:
        0.12824841 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.042397358 = queryNorm
        0.29892567 = fieldWeight in 6068, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.03125 = fieldNorm(doc=6068)
    0.0061955573 = product of:
      0.018586671 = sum of:
        0.018586671 = weight(_text_:system in 6068) [ClassicSimilarity], result of:
          0.018586671 = score(doc=6068,freq=2.0), product of:
            0.13353272 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.042397358 = queryNorm
            0.13919188 = fieldWeight in 6068, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03125 = fieldNorm(doc=6068)
      0.33333334 = coord(1/3)
  0.5 = coord(3/6)
```
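This breakdown differs from a flat sum in that one clause is itself a nested group with its own coord factor. The following is a minimal sketch of that nesting, again assuming the standard ClassicSimilarity formulas. The `Term` and `Group` classes are hypothetical helpers for illustration, not Lucene types; the numeric inputs are copied from the explain output for doc 6068.

```python
import math
from dataclasses import dataclass
from typing import List, Union

# Values copied from the explain output above (doc 6068).
QUERY_NORM = 0.042397358
MAX_DOCS = 44218
FIELD_NORM = 0.03125


@dataclass
class Term:
    freq: float      # term frequency in the document field
    doc_freq: int    # number of documents containing the term


@dataclass
class Group:
    clauses: List["Node"]  # only the clauses that matched
    total: int             # clauses in the query group, matched or not


Node = Union[Term, Group]


def score(node: Node) -> float:
    """Evaluate a term or a (possibly nested) group of clauses."""
    if isinstance(node, Term):
        idf = 1.0 + math.log(MAX_DOCS / (node.doc_freq + 1))
        return (idf * QUERY_NORM) * (math.sqrt(node.freq) * idf * FIELD_NORM)
    # Sum of matching clauses, scaled by coord(matched/total).
    coord = len(node.clauses) / node.total
    return sum(score(clause) for clause in node.clauses) * coord


inner = Group([Term(2.0, 5152)], total=3)             # "system", coord(1/3)
doc = Group([Term(2.0, 4597),                         # "web"
             Term(10.0, 5836),                        # "retrieval"
             inner], total=6)                         # coord(3/6)

doc_score = score(doc)
print(f"inner={score(inner):.10f} doc={doc_score:.9f}")
```

The nested group counts as a single matched clause in the outer coord, which is why the outer factor is 3/6 rather than 2/6.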
Abstract

Over the past 50 years, a variety of language-related capabilities has been developed in machine translation, information retrieval, speech recognition, text summarization, and so on. These applications rest upon a set of core techniques such as language modeling, information extraction, parsing, generation, and multimedia planning and integration; and they involve methods using statistics, rules, grammars, lexicons, ontologies, training techniques, and so on. It is a puzzling fact that although all of this work deals with language in some form or other, the major applications have each developed a separate research field. For example, there is no reason why speech recognition techniques involving n-grams and hidden Markov models could not have been used in machine translation 15 years earlier than they were, or why some of the lexical and semantic insights from the subarea called Computational Linguistics are still not used in information retrieval.
This picture will rapidly change. The twin challenges of massive information overload via the web and ubiquitous computers present us with an unavoidable task: developing techniques to handle multilingual and multi-modal information robustly and efficiently, with as high quality performance as possible. The most effective way for us to address such a mammoth task, and to ensure that our various techniques and applications fit together, is to start talking across the artificial research boundaries. Extending the current technologies will require integrating the various capabilities into multi-functional and multi-lingual natural language systems. However, at this time there is no clear vision of how these technologies could or should be assembled into a coherent framework. What would be involved in connecting a speech recognition system to an information retrieval engine, and then using machine translation and summarization software to process the retrieved text? How can traditional parsing and generation be enhanced with statistical techniques? What would be the effect of carefully crafted lexicons on traditional information retrieval? At which points should machine translation be interleaved within information retrieval systems to enable multilingual processing?
