Search (106 results, page 1 of 6)

  • language_ss:"e"
  • theme_ss:"Computerlinguistik"
  • year_i:[2000 TO 2010}
  1. Hotho, A.; Bloehdorn, S.: Data Mining 2004 : Text classification by boosting weak learners based on terms and concepts (2004) 0.08
    0.081290185 = product of:
      0.21677382 = sum of:
        0.050934732 = product of:
          0.1528042 = sum of:
            0.1528042 = weight(_text_:3a in 562) [ClassicSimilarity], result of:
              0.1528042 = score(doc=562,freq=2.0), product of:
                0.27188486 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.032069415 = queryNorm
                0.56201804 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.33333334 = coord(1/3)
        0.1528042 = weight(_text_:2f in 562) [ClassicSimilarity], result of:
          0.1528042 = score(doc=562,freq=2.0), product of:
            0.27188486 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.032069415 = queryNorm
            0.56201804 = fieldWeight in 562, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=562)
        0.013034889 = product of:
          0.026069777 = sum of:
            0.026069777 = weight(_text_:22 in 562) [ClassicSimilarity], result of:
              0.026069777 = score(doc=562,freq=2.0), product of:
                0.112301625 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032069415 = queryNorm
                0.23214069 = fieldWeight in 562, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=562)
          0.5 = coord(1/2)
      0.375 = coord(3/8)
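    The indented tree above is Lucene's explain() output for the entry's relevance score under ClassicSimilarity (TF-IDF). As a minimal sketch of how one leaf weight is derived, reusing only the constants printed in the tree (the function name is ours, not Lucene's):

    ```python
    import math

    def classic_similarity_weight(freq, idf, query_norm, field_norm):
        # Lucene ClassicSimilarity leaf weight = queryWeight * fieldWeight
        tf = math.sqrt(freq)                  # 1.4142135 for freq=2.0
        query_weight = idf * query_norm       # 0.27188486
        field_weight = tf * idf * field_norm  # 0.56201804
        return query_weight * field_weight

    w = classic_similarity_weight(2.0, 8.478011, 0.032069415, 0.046875)
    print(round(w, 7))  # 0.1528042, the weight of _text_:3a in doc 562
    ```

    The enclosing levels then apply the coord() factors (the fraction of query clauses matched): 0.21677382 × 0.375 ≈ 0.081290185, the final score shown next to the entry.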
    
    Content
    Cf.: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.91.4940&rep=rep1&type=pdf.
    Date
    8. 1.2013 10:22:32
  2. Doszkocs, T.E.; Zamora, A.: Dictionary services and spelling aids for Web searching (2004) 0.03
    0.029168403 = product of:
      0.07778241 = sum of:
        0.022924898 = weight(_text_:retrieval in 2541) [ClassicSimilarity], result of:
          0.022924898 = score(doc=2541,freq=4.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.23632148 = fieldWeight in 2541, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2541)
        0.039495744 = sum of:
          0.017573725 = weight(_text_:system in 2541) [ClassicSimilarity], result of:
            0.017573725 = score(doc=2541,freq=2.0), product of:
              0.10100432 = queryWeight, product of:
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.032069415 = queryNorm
              0.17398985 = fieldWeight in 2541, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2541)
          0.021922018 = weight(_text_:29 in 2541) [ClassicSimilarity], result of:
            0.021922018 = score(doc=2541,freq=2.0), product of:
              0.11281017 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.032069415 = queryNorm
              0.19432661 = fieldWeight in 2541, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2541)
        0.015361764 = product of:
          0.030723527 = sum of:
            0.030723527 = weight(_text_:22 in 2541) [ClassicSimilarity], result of:
              0.030723527 = score(doc=2541,freq=4.0), product of:
                0.112301625 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032069415 = queryNorm
                0.27358043 = fieldWeight in 2541, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2541)
          0.5 = coord(1/2)
      0.375 = coord(3/8)
    
    Abstract
    The Specialized Information Services Division (SIS) of the National Library of Medicine (NLM) provides Web access to more than a dozen scientific databases on toxicology and the environment on TOXNET. Search queries on TOXNET often include misspelled or variant English words, medical and scientific jargon and chemical names. Following the example of search engines like Google and ClinicalTrials.gov, we set out to develop a spelling "suggestion" system for increased recall and precision in TOXNET searching. This paper describes development of dictionary technology that can be used in a variety of applications such as orthographic verification, writing aid, natural language processing, and information storage and retrieval. The design of the technology allows building complex applications using the components developed in the earlier phases of the work in a modular fashion without extensive rewriting of computer code. Since many of the potential applications envisioned for this work have on-line or web-based interfaces, the dictionaries and other computer components must have fast response, and must be adaptable to open-ended database vocabularies, including chemical nomenclature. The dictionary vocabulary for this work was derived from SIS and other databases and specialized resources, such as NLM's Unified Medical Language System (UMLS). The resulting technology, A-Z Dictionary (AZdict), has three major constituents: 1) the vocabulary list, 2) the word attributes that define part of speech and morphological relationships between words in the list, and 3) a set of programs that implements the retrieval of words and their attributes, and determines similarity between words (ChemSpell). These three components can be used in various applications such as spelling verification, spelling aid, part-of-speech tagging, paraphrasing, and many other natural language processing functions.
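    The suggestion step described above can be pictured with standard approximate string matching. A minimal sketch using Python's standard library; the vocabulary and the suggest() helper are hypothetical stand-ins, not AZdict or ChemSpell code:

    ```python
    import difflib

    # Hypothetical mini-vocabulary standing in for the AZdict word list.
    vocabulary = ["toxicology", "benzene", "acetaminophen", "arsenic", "asbestos"]

    def suggest(query, n=3):
        # AZdict/ChemSpell use richer word attributes and similarity measures;
        # this only illustrates the suggestion step with stdlib sequence matching.
        return difflib.get_close_matches(query.lower(), vocabulary, n=n, cutoff=0.6)

    print(suggest("benzine"))     # ['benzene']
    print(suggest("toxocology"))  # ['toxicology']
    ```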
    Date
    14. 8.2004 17:22:56
    Source
    Online. 28(2004) no.3, S.22-29
  3. Bowker, L.: Information retrieval in translation memory systems : assessment of current limitations and possibilities for future development (2002) 0.02
    0.021847224 = product of:
      0.087388895 = sum of:
        0.032094855 = weight(_text_:retrieval in 1854) [ClassicSimilarity], result of:
          0.032094855 = score(doc=1854,freq=4.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.33085006 = fieldWeight in 1854, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1854)
        0.05529404 = sum of:
          0.024603218 = weight(_text_:system in 1854) [ClassicSimilarity], result of:
            0.024603218 = score(doc=1854,freq=2.0), product of:
              0.10100432 = queryWeight, product of:
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.032069415 = queryNorm
              0.2435858 = fieldWeight in 1854, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1854)
          0.030690823 = weight(_text_:29 in 1854) [ClassicSimilarity], result of:
            0.030690823 = score(doc=1854,freq=2.0), product of:
              0.11281017 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.032069415 = queryNorm
              0.27205724 = fieldWeight in 1854, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1854)
      0.25 = coord(2/8)
    
    Abstract
    A translation memory system is a new type of human language technology (HLT) tool that is gaining popularity among translators. Such tools allow translators to store previously translated texts in a type of aligned bilingual database, and to recycle relevant parts of these texts when producing new translations. Currently, these tools retrieve information from the database using superficial character string matching, which often results in poor precision and recall. This paper explains how translation memory systems work, and it considers some possible ways of introducing more sophisticated information retrieval techniques into such systems by taking syntactic and semantic similarity into account. Some of the suggested techniques are inspired by those used in other areas of HLT, and some by techniques used in information science.
    Source
    Knowledge organization. 29(2002) nos.3/4, S.198-203
  4. Herrera-Viedma, E.: Modeling the retrieval process for an information retrieval system using an ordinal fuzzy linguistic approach (2001) 0.02
    0.02010944 = product of:
      0.08043776 = sum of:
        0.028077152 = weight(_text_:retrieval in 5752) [ClassicSimilarity], result of:
          0.028077152 = score(doc=5752,freq=6.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.28943354 = fieldWeight in 5752, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5752)
        0.052360605 = sum of:
          0.030438587 = weight(_text_:system in 5752) [ClassicSimilarity], result of:
            0.030438587 = score(doc=5752,freq=6.0), product of:
              0.10100432 = queryWeight, product of:
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.032069415 = queryNorm
              0.30135927 = fieldWeight in 5752, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5752)
          0.021922018 = weight(_text_:29 in 5752) [ClassicSimilarity], result of:
            0.021922018 = score(doc=5752,freq=2.0), product of:
              0.11281017 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.032069415 = queryNorm
              0.19432661 = fieldWeight in 5752, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5752)
      0.25 = coord(2/8)
    
    Abstract
    A linguistic model for an Information Retrieval System (IRS) defined using an ordinal fuzzy linguistic approach is proposed. The ordinal fuzzy linguistic approach is presented, and its use for modeling the imprecision and subjectivity that appear in the user-IRS interaction is studied. The user queries and IRS responses are modeled linguistically using the concept of fuzzy linguistic variables. The system accepts Boolean queries whose terms can be weighted simultaneously by means of ordinal linguistic values according to three possible semantics: a symmetrical threshold semantics, a quantitative semantics, and an importance semantics. The first one identifies a new threshold semantics used to express qualitative restrictions on the documents retrieved for a given term. It is monotone increasing in index term weight for the threshold values that are on the right of the mid-value, and decreasing for the threshold values that are on the left of the mid-value. The second one is a new semantic proposal introduced to express quantitative restrictions on the documents retrieved for a term, i.e., restrictions on the number of documents that must be retrieved containing that term. The last one is the usual semantics of relative importance that has an effect when the term is in a Boolean expression. A bottom-up evaluation mechanism of queries is presented that coherently integrates the use of the three semantics and satisfies the separability property. The advantage of this IRS with respect to others is that users can express linguistically different semantic restrictions on the desired documents simultaneously, incorporating more flexibility into the user-IRS interaction.
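    A minimal sketch of the symmetrical threshold semantics described above, assuming a hypothetical seven-label scale and reducing the graded relevance values to a Boolean accept for brevity (the paper computes graded RSVs, not a yes/no match):

    ```python
    # Hypothetical seven-label ordinal scale; the paper's label set S may differ.
    S = ["null", "very_low", "low", "medium", "high", "very_high", "total"]
    MID = len(S) // 2  # index of the mid-value "medium"

    def threshold_match(term_weight, threshold):
        # For thresholds at or above the mid-value, preference grows with the
        # index term weight; below the mid-value, it grows as the weight falls.
        w, t = S.index(term_weight), S.index(threshold)
        return w >= t if t >= MID else w <= t

    print(threshold_match("high", "medium"))   # True: satisfies "at least medium"
    print(threshold_match("very_low", "low"))  # True: "at most low" rewards low weights
    ```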
    Date
    29. 9.2001 14:00:25
  5. Beitzel, S.M.; Jensen, E.C.; Chowdhury, A.; Grossman, D.; Frieder, O.; Goharian, N.: Fusion of effective retrieval strategies in the same information retrieval system (2004) 0.02
    0.019944277 = product of:
      0.07977711 = sum of:
        0.061513953 = weight(_text_:retrieval in 2502) [ClassicSimilarity], result of:
          0.061513953 = score(doc=2502,freq=20.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.63411707 = fieldWeight in 2502, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2502)
        0.018263152 = product of:
          0.036526304 = sum of:
            0.036526304 = weight(_text_:system in 2502) [ClassicSimilarity], result of:
              0.036526304 = score(doc=2502,freq=6.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.36163113 = fieldWeight in 2502, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2502)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Prior efforts have shown that under certain situations retrieval effectiveness may be improved via the use of data fusion techniques. Although these improvements have been observed from the fusion of result sets from several distinct information retrieval systems, it has often been thought that fusing different document retrieval strategies in a single information retrieval system will lead to similar improvements. In this study, we show that this is not the case. We hold constant systemic differences such as parsing, stemming, phrase processing, and relevance feedback, and fuse result sets generated from highly effective retrieval strategies in the same information retrieval system. From this, we show that data fusion of highly effective retrieval strategies alone shows little or no improvement in retrieval effectiveness. Furthermore, we present a detailed analysis of the performance of modern data fusion approaches, and demonstrate the reasons why they do not perform well when applied to this problem. Detailed results and analyses are included to support our conclusions.
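    The abstract does not name the fusion algorithms evaluated; as a hedged illustration of the family of approaches in question, here is the standard CombMNZ baseline (max-normalized scores; the normalization and exact variants studied in the paper may differ):

    ```python
    from collections import defaultdict

    def comb_mnz(result_sets):
        # CombMNZ: sum of normalized scores times the number of strategies
        # that retrieved the document.
        fused, hits = defaultdict(float), defaultdict(int)
        for scores in result_sets:
            top = max(scores.values())  # max-normalize each run's scores
            for doc, s in scores.items():
                fused[doc] += s / top
                hits[doc] += 1
        return {doc: fused[doc] * hits[doc] for doc in fused}

    runs = [{"d1": 2.0, "d2": 1.0}, {"d1": 0.9, "d3": 0.5}]
    print(sorted(comb_mnz(runs).items(), key=lambda kv: -kv[1]))
    # d1 is retrieved by both runs, so it is boosted: [('d1', 4.0), ...]
    ```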
  6. Airio, E.; Kettunen, K.: Does dictionary based bilingual retrieval work in a non-normalized index? (2009) 0.02
    0.0175226 = product of:
      0.0700904 = sum of:
        0.038904842 = weight(_text_:retrieval in 4224) [ClassicSimilarity], result of:
          0.038904842 = score(doc=4224,freq=8.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.40105087 = fieldWeight in 4224, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=4224)
        0.031185552 = product of:
          0.062371105 = sum of:
            0.062371105 = weight(_text_:etc in 4224) [ClassicSimilarity], result of:
              0.062371105 = score(doc=4224,freq=2.0), product of:
                0.17370372 = queryWeight, product of:
                  5.4164915 = idf(docFreq=533, maxDocs=44218)
                  0.032069415 = queryNorm
                0.35906604 = fieldWeight in 4224, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.4164915 = idf(docFreq=533, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4224)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Many operational IR indexes are non-normalized, i.e. no lemmatization or stemming techniques, etc. have been employed in indexing. This poses a challenge for dictionary-based cross-language retrieval (CLIR), because translations are mostly lemmas. In this study, we face the challenge of dictionary-based CLIR in a non-normalized index. We test two alternative approaches: FCG (Frequent Case Generation) and s-gramming. The idea of FCG is to automatically generate the most frequent inflected forms for a given lemma. FCG has been tested in monolingual retrieval and has been shown to be a good method for inflected retrieval, especially for highly inflected languages. S-gramming is an approximate string matching technique (an extension of n-gramming). The language pairs in our tests were English-Finnish, English-Swedish, Swedish-Finnish and Finnish-Swedish. Both our approaches performed quite well, but the results varied depending on the language pair. S-gramming and FCG performed about equally for all language pairs except Finnish-Swedish, where s-gramming outperformed FCG.
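    A minimal sketch of s-gramming, assuming character pairs with skip distances 0 and 1 and Jaccard overlap; the gram classes and similarity measure used in the paper may differ:

    ```python
    def s_grams(word, skips=(0, 1)):
        # Character pairs `skip` positions apart; skip=0 gives ordinary
        # bigrams, skip=1 the "skipped" bigrams that tolerate inflection.
        return {(word[i], word[i + skip + 1])
                for skip in skips for i in range(len(word) - skip - 1)}

    def s_gram_similarity(a, b, skips=(0, 1)):
        # Jaccard overlap of the two words' s-gram sets.
        ga, gb = s_grams(a, skips), s_grams(b, skips)
        return len(ga & gb) / len(ga | gb) if ga | gb else 0.0

    # Matching a Finnish lemma against an inflected form in a non-normalized index:
    print(round(s_gram_similarity("kirjasto", "kirjastossa"), 2))  # ~0.76
    ```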
  7. Liu, S.; Liu, F.; Yu, C.; Meng, W.: ¬An effective approach to document retrieval via utilizing WordNet and recognizing phrases (2004) 0.02
    0.016942954 = product of:
      0.067771815 = sum of:
        0.045849796 = weight(_text_:retrieval in 4078) [ClassicSimilarity], result of:
          0.045849796 = score(doc=4078,freq=4.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.47264296 = fieldWeight in 4078, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=4078)
        0.021922018 = product of:
          0.043844037 = sum of:
            0.043844037 = weight(_text_:29 in 4078) [ClassicSimilarity], result of:
              0.043844037 = score(doc=4078,freq=2.0), product of:
                0.11281017 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.032069415 = queryNorm
                0.38865322 = fieldWeight in 4078, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4078)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Date
    10.10.2005 10:29:08
    Source
    SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin, u.a.
  8. Jones, I.; Cunliffe, D.; Tudhope, D.: Natural language processing and knowledge organization systems as an aid to retrieval (2004) 0.02
    0.01672344 = product of:
      0.06689376 = sum of:
        0.02779496 = weight(_text_:retrieval in 2677) [ClassicSimilarity], result of:
          0.02779496 = score(doc=2677,freq=12.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.28652456 = fieldWeight in 2677, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02734375 = fieldNorm(doc=2677)
        0.03909879 = sum of:
          0.017397102 = weight(_text_:system in 2677) [ClassicSimilarity], result of:
            0.017397102 = score(doc=2677,freq=4.0), product of:
              0.10100432 = queryWeight, product of:
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.032069415 = queryNorm
              0.17224117 = fieldWeight in 2677, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.02734375 = fieldNorm(doc=2677)
          0.021701692 = weight(_text_:29 in 2677) [ClassicSimilarity], result of:
            0.021701692 = score(doc=2677,freq=4.0), product of:
              0.11281017 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.032069415 = queryNorm
              0.19237353 = fieldWeight in 2677, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.02734375 = fieldNorm(doc=2677)
      0.25 = coord(2/8)
    
    Abstract
    This paper discusses research that employs methods from Natural Language Processing (NLP) in exploiting the intellectual resources of Knowledge Organization Systems (KOS), particularly in the retrieval of information. A technique for the disambiguation of homographs and nominal compounds in free text, where these are known ambiguous terms in the KOS itself, is described. The use of Roget's Thesaurus as an intermediary in the process is also reported. A short review of the relevant literature in the field is given. Design considerations, results and conclusions are presented from the implementation of a prototype system. The linguistic techniques are applied at two complementary levels, namely on a free text string used as an entry point to the KOS, and on the underlying controlled vocabulary itself.
    Content
    1. Introduction The need for research into the application of linguistic techniques in Information Retrieval (IR) in general, and a similar need in faceted Knowledge Organization Systems (KOS), has been indicated by various authors. Smeaton (1997) points out the inherent limitations of conventional approaches to IR based on "bags of words", mainly difficulties caused by lexical ambiguity in the words concerned, and goes on to suggest the possibility of using Natural Language Processing (NLP) in query formulation. Past experience with a faceted retrieval system highlighted the need for integrating the linguistic perspective in order to fully utilise the potential of a KOS (Tudhope et al., 2002). The present research seeks to address some of these needs in using NLP to improve the efficacy of KOS tools in query and retrieval systems. Syntactic parsing and part-of-speech tagging can substantially reduce lexical ambiguity through homograph disambiguation. Given the two strings "I table the motion" and "I put the motion on the table", for instance, the parser used in this research clearly indicates that 'table' in the first string is a verb, while 'table' in the second string is a noun, a distinction that would be missed in the "bag of words" approach. This syntactic disambiguation enables a more precise matching from free text to the controlled vocabulary of a KOS and vice versa. The use of a general linguistic resource, namely Roget's Thesaurus of English Words and Phrases (RTEWP), as an intermediary in this process, is investigated. The adaptation of the Link parser (Sleator & Temperley, 1993) to the purposes of the research is reported. The design and implementation of the early practical stages of the project are described, and the results of the initial experiments are presented and evaluated. Applications of the techniques developed are foreseen in the areas of query disambiguation, information retrieval and automatic indexing. In the first section of the paper a brief review of the literature and relevant current work in the field is presented. The second section includes reports on the development of algorithms, the construction of data sets and theoretical and experimental work undertaken to date. The third section evaluates the results obtained, and outlines directions for future research.
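    The homograph disambiguation step can be reproduced with any off-the-shelf part-of-speech tagger (the project itself adapts the Link parser); a sketch using NLTK's default tagger:

    ```python
    import nltk
    # One-time model downloads (uncomment on first run):
    # nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")

    for sentence in ["I table the motion", "I put the motion on the table"]:
        print(nltk.pos_tag(nltk.word_tokenize(sentence)))
    # A correct tagging assigns "table" a verb tag (VBP) in the first sentence
    # and a noun tag (NN) in the second - exactly the distinction a
    # "bag of words" index loses.
    ```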
    Date
    29. 8.2004 19:29:56
  9. Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.02
    0.016136829 = product of:
      0.043031543 = sum of:
        0.019452421 = weight(_text_:retrieval in 4436) [ClassicSimilarity], result of:
          0.019452421 = score(doc=4436,freq=2.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.20052543 = fieldWeight in 4436, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
        0.010544236 = product of:
          0.021088472 = sum of:
            0.021088472 = weight(_text_:system in 4436) [ClassicSimilarity], result of:
              0.021088472 = score(doc=4436,freq=2.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.20878783 = fieldWeight in 4436, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4436)
          0.5 = coord(1/2)
        0.013034889 = product of:
          0.026069777 = sum of:
            0.026069777 = weight(_text_:22 in 4436) [ClassicSimilarity], result of:
              0.026069777 = score(doc=4436,freq=2.0), product of:
                0.112301625 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.032069415 = queryNorm
                0.23214069 = fieldWeight in 4436, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4436)
          0.5 = coord(1/2)
      0.375 = coord(3/8)
    
    Abstract
    The language barrier is the major problem that people face in searching for, retrieving, and understanding multilingual collections on the Internet. This paper deals with query translation and document translation in a Chinese-English information retrieval system called MTIR. Bilingual dictionary and monolingual corpus-based approaches are adopted to select suitable translated query terms. A machine transliteration algorithm is introduced to resolve proper name searching. We consider several design issues for document translation, including which material is translated, what roles the HTML tags play in translation, what the tradeoff is between speed performance and translation performance, and what form the translated result is presented in. About 100,000 Web pages translated in the last 4 months of 1997 were used for a quantitative study of online and real-time Web page translation.
    Date
    16. 2.2000 14:22:39
  10. Rosemblat, G.; Tse, T.; Gemoets, D.: Adapting a monolingual consumer health system for Spanish cross-language information retrieval (2004) 0.02
    0.015605161 = product of:
      0.062420644 = sum of:
        0.022924898 = weight(_text_:retrieval in 2673) [ClassicSimilarity], result of:
          0.022924898 = score(doc=2673,freq=4.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.23632148 = fieldWeight in 2673, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2673)
        0.039495744 = sum of:
          0.017573725 = weight(_text_:system in 2673) [ClassicSimilarity], result of:
            0.017573725 = score(doc=2673,freq=2.0), product of:
              0.10100432 = queryWeight, product of:
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.032069415 = queryNorm
              0.17398985 = fieldWeight in 2673, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2673)
          0.021922018 = weight(_text_:29 in 2673) [ClassicSimilarity], result of:
            0.021922018 = score(doc=2673,freq=2.0), product of:
              0.11281017 = queryWeight, product of:
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.032069415 = queryNorm
              0.19432661 = fieldWeight in 2673, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5176873 = idf(docFreq=3565, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2673)
      0.25 = coord(2/8)
    
    Abstract
    This preliminary study applies a bilingual term list (BTL) approach to cross-language information retrieval (CLIR) in the consumer health domain and compares it to a machine translation (MT) approach. We compiled a Spanish-English BTL of 34,980 medical and general terms. We collected a training set of 466 general health queries from MedlinePlus en español and 488 domain-specific queries from ClinicalTrials.gov translated into Spanish. We submitted the training set queries in English against a test bed of 7,170 ClinicalTrials.gov English documents, and compared MT and BTL against this English monolingual standard. The BTL approach was less effective (F = 0.420) than the MT approach (F = 0.578). A failure analysis of the results led to substitution of BTL dictionary sources and the addition of rudimentary normalisation of plural forms. These changes improved the CLIR effectiveness of the same training set queries (F = 0.474), and yielded comparable results for a test set of 954 new queries (F = 0.484). These results will shape our efforts to support Spanish speakers' needs for consumer health information currently only available in English.
    Date
    29. 8.2004 19:12:06
  11. Chen, K.-H.: Evaluating Chinese text retrieval with multilingual queries (2002) 0.02
    0.015183598 = product of:
      0.06073439 = sum of:
        0.04538898 = weight(_text_:retrieval in 1851) [ClassicSimilarity], result of:
          0.04538898 = score(doc=1851,freq=8.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.46789268 = fieldWeight in 1851, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1851)
        0.015345411 = product of:
          0.030690823 = sum of:
            0.030690823 = weight(_text_:29 in 1851) [ClassicSimilarity], result of:
              0.030690823 = score(doc=1851,freq=2.0), product of:
                0.11281017 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.032069415 = queryNorm
                0.27205724 = fieldWeight in 1851, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1851)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    This paper reports the design of a Chinese test collection with multilingual queries and the application of this test collection to evaluate information retrieval systems. The effective indexing units, IR models, translation techniques, and query expansion for Chinese text retrieval are identified. The collaboration of East Asian countries for construction of test collections for cross-language multilingual text retrieval is also discussed in this paper. As well, a tool is designed to help assessors judge relevance and gather relevance judgment events. The log file created by this tool will be used to analyze the behaviors of assessors in the future.
    Source
    Knowledge organization. 29(2002) nos.3/4, S.156-170
  12. Conceptual structures : logical, linguistic, and computational issues. 8th International Conference on Conceptual Structures, ICCS 2000, Darmstadt, Germany, August 14-18, 2000 (2000) 0.01
    0.014712609 = product of:
      0.039233625 = sum of:
        0.0097262105 = weight(_text_:retrieval in 691) [ClassicSimilarity], result of:
          0.0097262105 = score(doc=691,freq=2.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.10026272 = fieldWeight in 691, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0234375 = fieldNorm(doc=691)
        0.0074559003 = product of:
          0.014911801 = sum of:
            0.014911801 = weight(_text_:system in 691) [ClassicSimilarity], result of:
              0.014911801 = score(doc=691,freq=4.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.14763528 = fieldWeight in 691, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=691)
          0.5 = coord(1/2)
        0.022051515 = product of:
          0.04410303 = sum of:
            0.04410303 = weight(_text_:etc in 691) [ClassicSimilarity], result of:
              0.04410303 = score(doc=691,freq=4.0), product of:
                0.17370372 = queryWeight, product of:
                  5.4164915 = idf(docFreq=533, maxDocs=44218)
                  0.032069415 = queryNorm
                0.25389802 = fieldWeight in 691, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.4164915 = idf(docFreq=533, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=691)
          0.5 = coord(1/2)
      0.375 = coord(3/8)
    
    Abstract
    Computer scientists create models of a perceived reality. Through AI techniques, these models aim at providing the basic support for emulating cognitive behavior such as reasoning and learning, which is one of the main goals of the AI research effort. Such computer models are formed through the interaction of various acquisition and inference mechanisms: perception, concept learning, conceptual clustering, hypothesis testing, probabilistic inference, etc., and are represented using different paradigms tightly linked to the processes that use them. Among these paradigms let us cite: biological models (neural nets, genetic programming), logic-based models (first-order logic, modal logic, rule-based systems), virtual reality models (object systems, agent systems), probabilistic models (Bayesian nets, fuzzy logic), linguistic models (conceptual dependency graphs, language-based representations), etc. One of the strengths of the Conceptual Graph (CG) theory is its versatility in terms of the representation paradigms under which it falls. It can be viewed, and therefore used, under different representation paradigms, which makes it a popular choice for a wealth of applications. Its full coupling with different cognitive processes led to the opening of the field toward related research communities such as the Description Logic, Formal Concept Analysis, and Computational Linguistics communities. We now see more and more research results from one community enrich the other, laying the foundations of common philosophical grounds from which a successful synergy can emerge. ICCS 2000 embodies this spirit of research collaboration. It presents a set of papers that we believe, by their exposure, will benefit the whole community. For instance, the technical program proposes tracks on Conceptual Ontologies, Language, Formal Concept Analysis, Computational Aspects of Conceptual Structures, and Formal Semantics, with some papers on pragmatism and human-related aspects of computing. Never before was the program of ICCS formed by so heterogeneously rooted theories of knowledge representation and use. We hope that this swirl of ideas will benefit you as much as it already has benefited us while putting together this program.
    Content
    Concepts and Language: The Role of Conceptual Structure in Human Evolution (Keith Devlin) - Concepts in Linguistics - Concepts in Natural Language (Gisela Harras) - Patterns, Schemata, and Types: Author Support through Formalized Experience (Felix H. Gatzemeier) - Conventions and Notations for Knowledge Representation and Retrieval (Philippe Martin) - Conceptual Ontology: Ontology, Metadata, and Semiotics (John F. Sowa) - Pragmatically Yours (Mary Keeler) - Conceptual Modeling for Distributed Ontology Environments (Deborah L. McGuinness) - Discovery of Class Relations in Exception Structured Knowledge Bases (Hendra Suryanto, Paul Compton) - Conceptual Graphs: Perspectives: CGs Applications: Where Are We 7 Years after the First ICCS? (Michel Chein, David Genest) - The Engineering of a CG-Based System: Fundamental Issues (Guy W. Mineau) - Conceptual Graphs, Metamodeling, and Notation of Concepts (Olivier Gerbé, Guy W. Mineau, Rudolf K. Keller) - Knowledge Representation and Reasonings Based on Graph Homomorphism (Marie-Laure Mugnier) - User Modeling Using Conceptual Graphs for Intelligent Agents (James F. Baldwin, Trevor P. Martin, Aimilia Tzanavari) - Towards a Unified Querying System of Both Structured and Semi-structured Imprecise Data Using Fuzzy View (Patrice Buche, Ollivier Haemmerlé) - Formal Semantics of Conceptual Structures: The Extensional Semantics of the Conceptual Graph Formalism (Guy W. Mineau) - Semantics of Attribute Relations in Conceptual Graphs (Pavel Kocura) - Nested Concept Graphs and Triadic Power Context Families (Susanne Prediger) - Negations in Simple Concept Graphs (Frithjof Dau) - Extending the CG Model by Simulations (Jean-François Baget) - Contextual Logic and Formal Concept Analysis: Building and Structuring Description Logic Knowledge Bases Using Least Common Subsumers and Concept Analysis (Franz Baader, Ralf Molitor) - On the Contextual Logic of Ordinal Data (Silke Pollandt, Rudolf Wille) - Boolean Concept Logic (Rudolf Wille) - Lattices of Triadic Concept Graphs (Bernd Groh, Rudolf Wille) - Formalizing Hypotheses with Concepts (Bernhard Ganter, Sergei O. Kuznetsov) - Generalized Formal Concept Analysis (Laurent Chaudron, Nicolas Maille) - A Logical Generalization of Formal Concept Analysis (Sébastien Ferré, Olivier Ridoux) - On the Treatment of Incomplete Knowledge in Formal Concept Analysis (Peter Burmeister, Richard Holzer) - Conceptual Structures in Practice: Logic-Based Networks: Concept Graphs and Conceptual Structures (Peter W. Eklund) - Conceptual Knowledge Discovery and Data Analysis (Joachim Hereth, Gerd Stumme, Rudolf Wille, Uta Wille) - CEM - A Conceptual Email Manager (Richard Cole, Gerd Stumme) - A Contextual-Logic Extension of TOSCANA (Peter Eklund, Bernd Groh, Gerd Stumme, Rudolf Wille) - A Conceptual Graph Model for W3C Resource Description Framework (Olivier Corby, Rose Dieng, Cédric Hébert) - Computational Aspects of Conceptual Structures: Computing with Conceptual Structures (Bernhard Ganter) - Symmetry and the Computation of Conceptual Structures (Robert Levinson) - An Introduction to SNePS 3 (Stuart C. Shapiro) - Composition Norm Dynamics Calculation with Conceptual Graphs (Aldo de Moor) - From PROLOG++ to PROLOG+CG: A CG Object-Oriented Logic Programming Language (Adil Kabbaj, Martin Janta-Polczynski) - A Cost-Bounded Algorithm to Control Events Generalization (Gaël de Chalendar, Brigitte Grau, Olivier Ferret)
  13. Oard, D.W.; He, D.; Wang, J.: User-assisted query translation for interactive cross-language information retrieval (2008) 0.01
    0.013695264 = product of:
      0.054781057 = sum of:
        0.033692583 = weight(_text_:retrieval in 2030) [ClassicSimilarity], result of:
          0.033692583 = score(doc=2030,freq=6.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.34732026 = fieldWeight in 2030, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2030)
        0.021088472 = product of:
          0.042176943 = sum of:
            0.042176943 = weight(_text_:system in 2030) [ClassicSimilarity], result of:
              0.042176943 = score(doc=2030,freq=8.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.41757566 = fieldWeight in 2030, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2030)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Interactive Cross-Language Information Retrieval (CLIR), a process in which searcher and system collaborate to find documents that satisfy an information need regardless of the language in which those documents are written, calls for designs in which synergies between searcher and system can be leveraged so that the strengths of one can cover weaknesses of the other. This paper describes an approach that employs user-assisted query translation to help searchers better understand the system's operation. Supporting interaction and interface designs are introduced, and results from three user studies are presented. The results indicate that experienced searchers presented with this new system evolve new search strategies that make effective use of the new capabilities, that they achieve retrieval effectiveness comparable to results obtained using fully automatic techniques, and that reported satisfaction with support for cross-language searching increased. The paper concludes with a description of a freely available interactive CLIR system that incorporates lessons learned from this research.
  14. Kreymer, O.: ¬An evaluation of help mechanisms in natural language information retrieval systems (2002) 0.01
    0.013510293 = product of:
      0.054041173 = sum of:
        0.043496937 = weight(_text_:retrieval in 2557) [ClassicSimilarity], result of:
          0.043496937 = score(doc=2557,freq=10.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.44838852 = fieldWeight in 2557, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2557)
        0.010544236 = product of:
          0.021088472 = sum of:
            0.021088472 = weight(_text_:system in 2557) [ClassicSimilarity], result of:
              0.021088472 = score(doc=2557,freq=2.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.20878783 = fieldWeight in 2557, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2557)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The field of natural language processing (NLP) demonstrates rapid changes in the design of information retrieval systems and human-computer interaction. While natural language is being looked on as the most effective tool for information retrieval in a contemporary information environment, the systems using it are only beginning to emerge. This study attempts to evaluate the current state of NLP information retrieval systems from the user's point of view: what techniques are used by these systems to guide their users through the search process? The analysis focused on the structure and components of the systems' help mechanisms. Results of the study demonstrated that systems which claimed to be using natural language searching in fact used a wide range of information retrieval techniques from real natural language processing to Boolean searching. As a result, the user assistance mechanisms of these systems also varied. While pseudo-NLP systems would suit a more traditional method of instruction, real NLP systems primarily utilised the methods of explanation and user-system dialogue.
  15. Airio, E.: Who benefits from CLIR in web retrieval? (2008) 0.01
    0.01236227 = product of:
      0.04944908 = sum of:
        0.038904842 = weight(_text_:retrieval in 2342) [ClassicSimilarity], result of:
          0.038904842 = score(doc=2342,freq=8.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.40105087 = fieldWeight in 2342, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2342)
        0.010544236 = product of:
          0.021088472 = sum of:
            0.021088472 = weight(_text_:system in 2342) [ClassicSimilarity], result of:
              0.021088472 = score(doc=2342,freq=2.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.20878783 = fieldWeight in 2342, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2342)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Purpose - The aim of the current paper is to test whether query translation is beneficial in web retrieval. Design/methodology/approach - The language pairs were Finnish-Swedish, English-German and Finnish-French. A total of 12-18 participants were recruited for each language pair. Each participant performed four retrieval tasks. The author's aim was to compare the performance of the translated queries with that of the target language queries. Thus, the author asked participants to formulate a source language query and a target language query for each task. The source language queries were translated into the target language utilizing a dictionary-based system. In English-German, machine translation was also utilized. The author used Google as the search engine. Findings - The results differed depending on the language pair. The author concluded that the dictionary coverage had an effect on the results. On average, the results of query translation were better than in traditional laboratory tests. Originality/value - This research shows that query translation on the web is beneficial especially for users with moderate and non-active language skills. This is valuable information for developers of cross-language information retrieval systems.
  16. Jacquemin, C.: Spotting and discovering terms through natural language processing (2001) 0.01
    0.0121684875 = product of:
      0.04867395 = sum of:
        0.036247447 = weight(_text_:retrieval in 119) [ClassicSimilarity], result of:
          0.036247447 = score(doc=119,freq=10.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.37365708 = fieldWeight in 119, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=119)
        0.012426502 = product of:
          0.024853004 = sum of:
            0.024853004 = weight(_text_:system in 119) [ClassicSimilarity], result of:
              0.024853004 = score(doc=119,freq=4.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.24605882 = fieldWeight in 119, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=119)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    In this book Christian Jacquemin shows how the power of natural language processing (NLP) can be used to advance text indexing and information retrieval (IR). Jacquemin's novel tool is FASTR, a parser that normalizes terms and recognizes term variants. Since there are more meanings in a language than there are words, FASTR uses a metagrammar composed of shallow linguistic transformations that describe the morphological, syntactic, semantic, and pragmatic variations of words and terms. The acquired parsed terms can then be applied for precise retrieval and assembly of information. The use of a corpus-based unification grammar to define, recognize, and combine term variants from their base forms allows for intelligent information access to, or "linguistic data tuning" of, heterogeneous texts. FASTR can be used to do automatic controlled indexing, to carry out content-based Web searches through conceptually related alternative query formulations, to abstract scientific and technical extracts, and even to translate and collect terms from multilingual material. Jacquemin provides a comprehensive account of the method and implementation of this innovative retrieval technique for text processing.
    RSWK
    Automatische Indexierung  / Computerlinguistik  / Information Retrieval
    Textverstehendes System (HBZ)
    Subject
    Automatische Indexierung  / Computerlinguistik  / Information Retrieval
    Textverstehendes System (HBZ)
  17. Herrera-Viedma, E.; Cordón, O.; Herrera, J.C.; Luque, M.: ¬An IRS based on multi-granular linguistic information (2003) 0.01
    0.012149587 = product of:
      0.04859835 = sum of:
        0.027509877 = weight(_text_:retrieval in 2740) [ClassicSimilarity], result of:
          0.027509877 = score(doc=2740,freq=4.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.2835858 = fieldWeight in 2740, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2740)
        0.021088472 = product of:
          0.042176943 = sum of:
            0.042176943 = weight(_text_:system in 2740) [ClassicSimilarity], result of:
              0.042176943 = score(doc=2740,freq=8.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.41757566 = fieldWeight in 2740, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2740)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    An information retrieval system (IRS) based on fuzzy multi-granular linguistic information is proposed. The system has an evaluation method to process multi-granular linguistic information, in such a way that the inputs to the IRS are represented in a different linguistic domain than the outputs. The system accepts Boolean queries whose terms are weighted by means of the ordinal linguistic values represented by the linguistic variable "Importance" assessed on a label set S. The system evaluates the weighted queries according to a threshold semantics and obtains the linguistic retrieval status values (RSV) of documents represented by a linguistic variable "Relevance" expressed in a different label set S'. The advantage of this linguistic IRS with respect to others is that the use of the multi-granular linguistic information facilitates and improves the IRS-user interaction.
  18. Ballesteros, L.A.: Cross-language retrieval via transitive relation (2000) 0.01
    0.012123488 = product of:
      0.04849395 = sum of:
        0.03970709 = weight(_text_:retrieval in 30) [ClassicSimilarity], result of:
          0.03970709 = score(doc=30,freq=12.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.40932083 = fieldWeight in 30, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=30)
        0.008786863 = product of:
          0.017573725 = sum of:
            0.017573725 = weight(_text_:system in 30) [ClassicSimilarity], result of:
              0.017573725 = score(doc=30,freq=2.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.17398985 = fieldWeight in 30, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=30)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    The growth in availability of multi-lingual data in all areas of the public and private sector is driving an increasing need for systems that facilitate access to multi-lingual resources. Cross-language Retrieval (CLR) technology is a means of addressing this need. A CLR system must address two main hurdles to effective cross-language retrieval. First, it must address the ambiguity that arises when trying to map the meaning of text across languages. That is, it must address both within-language ambiguity and cross-language ambiguity. Second, it has to incorporate multilingual resources that will enable it to perform the mapping across languages. The difficulty here is that there is a limited number of lexical resources and virtually none for some pairs of languages. This work focuses on a dictionary approach to addressing the problem of limited lexical resources. A dictionary approach is taken since bilingual dictionaries are more prevalent and simpler to apply than other resources. We show that a transitive translation approach, where a third language is employed as an interlingua between the source and target languages, is a viable means of performing CLR between languages for which no bilingual dictionary is available
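    A minimal sketch of the transitive approach, with toy dictionaries standing in for real bilingual resources (all entries hypothetical). Note how ambiguity fans out at both hops, which is why the approach is paired with disambiguation techniques:

    ```python
    # Toy English->Spanish and Spanish->Finnish dictionaries (hypothetical).
    en_to_es = {"library": ["biblioteca"], "bank": ["banco", "orilla"]}
    es_to_fi = {"biblioteca": ["kirjasto"], "banco": ["pankki", "penkki"],
                "orilla": ["ranta"]}

    def transitive_translate(term, source_to_pivot, pivot_to_target):
        # Source -> pivot -> target: collect every target-language candidate
        # reachable through any pivot-language translation.
        return {t for p in source_to_pivot.get(term, [])
                  for t in pivot_to_target.get(p, [])}

    print(transitive_translate("library", en_to_es, es_to_fi))  # {'kirjasto'}
    print(transitive_translate("bank", en_to_es, es_to_fi))     # three candidates
    ```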
    Series
    The Kluwer international series on information retrieval; 7
    Source
    Advances in information retrieval: Recent research from the Center for Intelligent Information Retrieval. Ed.: W.B. Croft
  19. Sidhom, S.; Hassoun, M.: Morpho-syntactic parsing for a text mining environment : An NP recognition model for knowledge visualization and information retrieval (2002) 0.01
    0.011711448 = product of:
      0.046845794 = sum of:
        0.033692583 = weight(_text_:retrieval in 1852) [ClassicSimilarity], result of:
          0.033692583 = score(doc=1852,freq=6.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.34732026 = fieldWeight in 1852, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1852)
        0.01315321 = product of:
          0.02630642 = sum of:
            0.02630642 = weight(_text_:29 in 1852) [ClassicSimilarity], result of:
              0.02630642 = score(doc=1852,freq=2.0), product of:
                0.11281017 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.032069415 = queryNorm
                0.23319192 = fieldWeight in 1852, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1852)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    Sidhom and Hassoun discuss the crucial role of NLP tools in Knowledge Extraction and Management as well as in the design of Information Retrieval Systems. The authors focus more specifically on the morpho-syntactic issues by describing their morpho-syntactic analysis platform, which has been implemented to cover automatic indexing and information retrieval tasks. To this end they implemented the Cascaded "Augmented Transition Network (ATN)". They used this formalism in order to analyse French text descriptions of multimedia documents. An implementation of an ATN parsing automaton is briefly described. The platform, in its logical operation, is considered an investigative tool towards knowledge organization (based on an NP recognition model) and the management of multiform e-documents (text, multimedia, audio, image) using their text descriptions.
    Source
    Knowledge organization. 29(2002) nos.3/4, S.171-180
  20. Chen, J.: ¬A lexical knowledge base approach for English-Chinese cross-language information retrieval (2006) 0.01
    0.010824111 = product of:
      0.043296445 = sum of:
        0.028077152 = weight(_text_:retrieval in 4923) [ClassicSimilarity], result of:
          0.028077152 = score(doc=4923,freq=6.0), product of:
            0.09700725 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.032069415 = queryNorm
            0.28943354 = fieldWeight in 4923, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4923)
        0.015219294 = product of:
          0.030438587 = sum of:
            0.030438587 = weight(_text_:system in 4923) [ClassicSimilarity], result of:
              0.030438587 = score(doc=4923,freq=6.0), product of:
                0.10100432 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.032069415 = queryNorm
                0.30135927 = fieldWeight in 4923, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4923)
          0.5 = coord(1/2)
      0.25 = coord(2/8)
    
    Abstract
    This study proposes and explores a natural language processing (NLP) based strategy to address out-of-dictionary and vocabulary mismatch problems in query-translation-based English-Chinese Cross-Language Information Retrieval (EC-CLIR). The strategy, named the LKB approach, is to construct a lexical knowledge base (LKB) and to use it for query translation. In this article, the author describes the LKB construction process, which customizes available translation resources based on the document collection of the EC-CLIR system. The evaluation shows that the LKB approach is very promising. It consistently increased the percentage of correct translations and decreased the percentage of missing translations, in addition to effectively detecting the vocabulary gap between the document collection and the translation resource of the system. The comparative analysis of the top EC-CLIR results using the LKB and two other translation resources demonstrates that the LKB approach has produced significant improvement in EC-CLIR performance compared to performance using the original translation resource without customization. It has also achieved the same level of performance as a sophisticated machine translation system. The study concludes that the LKB approach has the potential to be an empirical model for developing real-world CLIR systems. Linguistic knowledge and NLP techniques, if appropriately used, can improve the effectiveness of English-Chinese cross-language information retrieval.

Types

  • a 97
  • m 7
  • s 5
  • el 3