Search (100 results, page 1 of 5)

Guglielmo, E.J.; Rowe, N.C.: Natural-language retrieval of images based on descriptive captions (1996) 0.05
```
0.04653595 = product of:
  0.11633987 = sum of:
    0.0538205 = weight(_text_:index in 6624) [ClassicSimilarity], result of:
      0.0538205 = score(doc=6624,freq=2.0), product of:
        0.18579477 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.04251826 = queryNorm
        0.28967714 = fieldWeight in 6624, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.046875 = fieldNorm(doc=6624)
    0.06251937 = weight(_text_:system in 6624) [ClassicSimilarity], result of:
      0.06251937 = score(doc=6624,freq=10.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.46686378 = fieldWeight in 6624, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.046875 = fieldNorm(doc=6624)
  0.4 = coord(2/5)
```
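The arithmetic in the explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of the search system, and because the engine works in single precision the last digits may differ slightly.

```python
import math

# Values copied from the explain tree for doc 6624.
MAX_DOCS = 44218
QUERY_NORM = 0.04251826


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency as printed in the tree (assumed formula)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """score = queryWeight * fieldWeight for one matching term."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm                      # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm    # tf * idf * fieldNorm
    return query_weight * field_weight


def document_score(term_scores, matched, total):
    """Sum of the term scores scaled by coord(matched/total)."""
    return sum(term_scores) * (matched / total)


index_score = term_score(freq=2.0, doc_freq=1520, field_norm=0.046875)
system_score = term_score(freq=10.0, doc_freq=5152, field_norm=0.046875)
score_6624 = document_score([index_score, system_score], matched=2, total=5)

print(f"index term:  {index_score:.6f}")
print(f"system term: {system_score:.6f}")
print(f"document:    {score_6624:.6f}")
```

The three printed values should agree closely with the 0.0538205, 0.06251937 and 0.04653595 reported in the tree.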
Abstract

Describes a prototype intelligent information retrieval system that uses natural-language understanding to efficiently locate captioned data. Multimedia data generally requires captions to explain its features and significance. Such descriptive captions often rely on long nominal compounds (strings of consecutive nouns) which create problems of ambiguous word sense. Presents a system in which captions and user queries are parsed and interpreted to produce a logical form, using a detailed theory of the meaning of nominal compounds. A fine-grain match can then compare the logical form of the query to the logical forms for each caption. To improve system efficiency, the system performs a coarse-grain match with index files, using nouns and verbs extracted from the query. Experiments with randomly selected queries and captions from an existing image library show an increase of 30% in precision and 50% in recall over the keyphrase approach currently used. Processing times have a median of 7 seconds as compared to 8 minutes for the existing system.
Park, T.K.: ¬The nature of relevance in information retrieval : an empirical study (1993) 0.04
```
0.044728827 = product of:
  0.11182207 = sum of:
    0.08386256 = weight(_text_:context in 5336) [ClassicSimilarity], result of:
      0.08386256 = score(doc=5336,freq=6.0), product of:
        0.17622331 = queryWeight, product of:
          4.14465 = idf(docFreq=1904, maxDocs=44218)
          0.04251826 = queryNorm
        0.475888 = fieldWeight in 5336, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          4.14465 = idf(docFreq=1904, maxDocs=44218)
          0.046875 = fieldNorm(doc=5336)
    0.027959513 = weight(_text_:system in 5336) [ClassicSimilarity], result of:
      0.027959513 = score(doc=5336,freq=2.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.20878783 = fieldWeight in 5336, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.046875 = fieldNorm(doc=5336)
  0.4 = coord(2/5)
```
Abstract

Experimental research in information retrieval (IR) depends on the idea of relevance. Because of its key role in IR, recent questions about relevance have raised issues of methodological concern and have shaken the philosophical foundations of IR theory development. Despite an existing set of theoretical definitions of this concept, our understanding of relevance from users' perspectives is still limited. Using naturalistic inquiry methodology, this article reports an empirical study of user-based relevance interpretations. A model is presented that reflects the nature of the thought process of users who are evaluating bibliographic citations produced by a document retrieval system. Three major categories of variables affecting relevance assessments - internal context, external context, and problem context - are identified and described. Users' relevance assessments involve multiple layers of interpretations that are derived from individuals' experiences, perceptions, and private knowledge related to the particular information problems at hand.

Dunlop, M.D.; Johnson, C.W.; Reid, J.: Exploring the layers of information retrieval evaluation (1998) 0.04

```
0.041047435 = product of:
  0.10261859 = sum of:
    0.05648775 = weight(_text_:context in 3762) [ClassicSimilarity], result of:
      0.05648775 = score(doc=3762,freq=2.0), product of:
        0.17622331 = queryWeight, product of:
          4.14465 = idf(docFreq=1904, maxDocs=44218)
          0.04251826 = queryNorm
        0.32054642 = fieldWeight in 3762, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.14465 = idf(docFreq=1904, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3762)
    0.04613084 = weight(_text_:system in 3762) [ClassicSimilarity], result of:
      0.04613084 = score(doc=3762,freq=4.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.34448233 = fieldWeight in 3762, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3762)
  0.4 = coord(2/5)
```

Abstract: Presents current work on modelling interactive information retrieval systems and users' interactions with them. Analyzes the papers in this special issue in the context of evaluation in information retrieval (IR) by examining the different layers at which IR use could be evaluated. IR poses the double evaluation problem of evaluating both the underlying system effectiveness and the overall ability of the system to aid users. The papers look at different issues in combining human-computer interaction (HCI) research with IR research and provide insights into the problem of evaluating the information seeking process

Shakir, H.S.; Nagao, M.: Context-sensitive processing of semantic queries in an image database system (1996) 0.04

```
0.0385732 = product of:
  0.096433006 = sum of:
    0.068473496 = weight(_text_:context in 6626) [ClassicSimilarity], result of:
      0.068473496 = score(doc=6626,freq=4.0), product of:
        0.17622331 = queryWeight, product of:
          4.14465 = idf(docFreq=1904, maxDocs=44218)
          0.04251826 = queryNorm
        0.38856095 = fieldWeight in 6626, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.14465 = idf(docFreq=1904, maxDocs=44218)
          0.046875 = fieldNorm(doc=6626)
    0.027959513 = weight(_text_:system in 6626) [ClassicSimilarity], result of:
      0.027959513 = score(doc=6626,freq=2.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.20878783 = fieldWeight in 6626, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.046875 = fieldNorm(doc=6626)
  0.4 = coord(2/5)
```

Abstract: In an image database environment, an image can be retrieved using common names of entities that appear in it. Shows how an image is abstracted into a hierarchy of entity names and features and how relations are established between entities visible in the image. Semantic queries are also hierarchical. Its core is a fuzzy matching technique that compares semantic queries to image abstractions by assessing the similarity of contexts between the query and the candidate image. An important object of this matching technique is to distinguish between abstractions of different images that have the same labels but are different in context from each other. Each image is tagged with a matching degree even when it does not provide an exact match of the query. Experiments have been conducted to evaluate the strategy

Aldous, K.J.: ¬A system for the automatic retrieval of information from a specialist database (1996) 0.04
```
0.035183515 = product of:
  0.08795879 = sum of:
    0.04841807 = weight(_text_:context in 4078) [ClassicSimilarity], result of:
      0.04841807 = score(doc=4078,freq=2.0), product of:
        0.17622331 = queryWeight, product of:
          4.14465 = idf(docFreq=1904, maxDocs=44218)
          0.04251826 = queryNorm
        0.27475408 = fieldWeight in 4078, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.14465 = idf(docFreq=1904, maxDocs=44218)
          0.046875 = fieldNorm(doc=4078)
    0.03954072 = weight(_text_:system in 4078) [ClassicSimilarity], result of:
      0.03954072 = score(doc=4078,freq=4.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.29527056 = fieldWeight in 4078, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.046875 = fieldNorm(doc=4078)
  0.4 = coord(2/5)
```
Abstract

Accessing useful information from a complex database requires knowledge of the structure of the database and an understanding of the methods of information retrieval. A means of overcoming this knowledge barrier to the use of narrow domain databases is proposed in which the user is required to enter only a series of terms which identify the required material. Describes a method which classifies terms according to their meaning in the context of the database and which uses this classification to access and execute modules of code stored in the database to effect retrieval. Presents an implementation of the method using a database of technical information on the nature and use of fungicides. Initial results of trials with potential users indicate that the system can produce relevant responses to queries expressed in this style. Since the code modules are part of the database, extensions may be easily implemented to handle most queries which users are likely to pose.

Lespinasse, K.: TREC: une conference pour l'evaluation des systemes de recherche d'information (1997) 0.03

```
0.03196765 = product of:
  0.07991912 = sum of:
    0.064557426 = weight(_text_:context in 744) [ClassicSimilarity], result of:
      0.064557426 = score(doc=744,freq=2.0), product of:
        0.17622331 = queryWeight, product of:
          4.14465 = idf(docFreq=1904, maxDocs=44218)
          0.04251826 = queryNorm
        0.36633876 = fieldWeight in 744, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.14465 = idf(docFreq=1904, maxDocs=44218)
          0.0625 = fieldNorm(doc=744)
    0.015361699 = product of:
      0.046085097 = sum of:
        0.046085097 = weight(_text_:22 in 744) [ClassicSimilarity], result of:
          0.046085097 = score(doc=744,freq=2.0), product of:
            0.1488917 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04251826 = queryNorm
            0.30952093 = fieldWeight in 744, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=744)
      0.33333334 = coord(1/3)
  0.4 = coord(2/5)
```
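This tree has an extra level: the match on the term `22` sits in an inner clause with its own coord(1/3) factor, which is applied before the outer coord(2/5). A small sketch of that nesting, again assuming the usual ClassicSimilarity formulas and using illustrative helper names that do not come from the search system:

```python
import math

# Values copied from the explain tree for doc 744.
MAX_DOCS = 44218
QUERY_NORM = 0.04251826


def term_score(freq, doc_freq, field_norm):
    """queryWeight * fieldWeight for one term (assumed ClassicSimilarity formulas)."""
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    return (idf * QUERY_NORM) * (math.sqrt(freq) * idf * field_norm)


def coord(matched, total):
    """Fraction of clauses at one level of the query that matched."""
    return matched / total


# Outer clause: the term "context" matched directly.
context_score = term_score(freq=2.0, doc_freq=1904, field_norm=0.0625)

# Inner clause: only one of its three sub-clauses ("22") matched,
# so its sum is scaled by coord(1/3) before joining the outer sum.
inner_score = term_score(freq=2.0, doc_freq=3622, field_norm=0.0625) * coord(1, 3)

# Outer level: two of five top-level clauses matched.
score_744 = (context_score + inner_score) * coord(2, 5)

print(f"context clause: {context_score:.6f}")
print(f"inner clause:   {inner_score:.6f}")
print(f"document:       {score_744:.6f}")
```

The nested factor explains why the `22` match contributes only about a third of its raw term score to the final value.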

Abstract: TREC is an annual conference held in the USA devoted to electronic systems for large full text information searching. The conference deals with evaluation and comparison techniques developed since 1992 by participants from the research and industrial fields. The work of the conference is destined for designers (rather than users) of systems which access full text information. Describes the context, objectives, organization, evaluation methods and limits of TREC
Date: 1. 8.1996 22:01:00

¬The Fifth Text Retrieval Conference (TREC-5) (1997) 0.03

```
0.027233064 = product of:
  0.06808266 = sum of:
    0.05272096 = weight(_text_:system in 3087) [ClassicSimilarity], result of:
      0.05272096 = score(doc=3087,freq=4.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.3936941 = fieldWeight in 3087, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.0625 = fieldNorm(doc=3087)
    0.015361699 = product of:
      0.046085097 = sum of:
        0.046085097 = weight(_text_:22 in 3087) [ClassicSimilarity], result of:
          0.046085097 = score(doc=3087,freq=2.0), product of:
            0.1488917 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04251826 = queryNorm
            0.30952093 = fieldWeight in 3087, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=3087)
      0.33333334 = coord(1/3)
  0.4 = coord(2/5)
```

Abstract: Proceedings of the 5th TREC conference held in Gaithersburg, Maryland, Nov 20-22, 1996. The aim of the conference was to discuss retrieval techniques for large test collections. Different research groups used different techniques, such as automated thesauri, term weighting, natural language techniques, relevance feedback and advanced pattern matching, for information retrieval from the same large database. This procedure makes it possible to compare the results. The proceedings include papers, tables of the system results, and brief system descriptions including timing and storage information

Harman, D.K.: ¬The first text retrieval conference : TREC-1, 1992 (1993) 0.02

```
0.021112198 = product of:
  0.052780494 = sum of:
    0.03727935 = weight(_text_:system in 1317) [ClassicSimilarity], result of:
      0.03727935 = score(doc=1317,freq=2.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.27838376 = fieldWeight in 1317, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.0625 = fieldNorm(doc=1317)
    0.015501143 = product of:
      0.04650343 = sum of:
        0.04650343 = weight(_text_:29 in 1317) [ClassicSimilarity], result of:
          0.04650343 = score(doc=1317,freq=2.0), product of:
            0.14956595 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.04251826 = queryNorm
            0.31092256 = fieldWeight in 1317, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0625 = fieldNorm(doc=1317)
      0.33333334 = coord(1/3)
  0.4 = coord(2/5)
```

Abstract: Reports on the 1st Text Retrieval Conference (TREC-1) held in Rockville, MD, 4-6 Nov. 1992. The TREC experiment is being run by the National Institute of Standards and Technology to allow information retrieval researchers to scale up from small collections of data to larger sized experiments. Groups of researchers have been provided with text documents compressed on CD-ROM. They used experimental retrieval systems to search the text and evaluate the results
Source: Information processing and management. 29(1993) no.4, pp. 411-414

Evans, D.A.; Lefferts, R.G.: CLARIT-TREC experiments (1995) 0.02

```
0.018639674 = product of:
  0.09319837 = sum of:
    0.09319837 = weight(_text_:system in 1912) [ClassicSimilarity], result of:
      0.09319837 = score(doc=1912,freq=8.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.6959594 = fieldWeight in 1912, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.078125 = fieldNorm(doc=1912)
  0.2 = coord(1/5)
```

Abstract: Describes the following elements of the CLARIT information management system: natural language processing, document indexing, vector space querying and query augmentation. Reports on the results of processing carried out as part of TREC-2 and on system parameterization. Results demonstrate high precision and excellent recall, but the system is not yet optimized

Losee, R.M.: Determining information retrieval and filtering performance without experimentation (1995) 0.02

```
0.018424368 = product of:
  0.04606092 = sum of:
    0.03261943 = weight(_text_:system in 3368) [ClassicSimilarity], result of:
      0.03261943 = score(doc=3368,freq=2.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.2435858 = fieldWeight in 3368, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.0546875 = fieldNorm(doc=3368)
    0.013441487 = product of:
      0.04032446 = sum of:
        0.04032446 = weight(_text_:22 in 3368) [ClassicSimilarity], result of:
          0.04032446 = score(doc=3368,freq=2.0), product of:
            0.1488917 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04251826 = queryNorm
            0.2708308 = fieldWeight in 3368, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3368)
      0.33333334 = coord(1/3)
  0.4 = coord(2/5)
```

Abstract: The performance of an information retrieval or text and media filtering system may be determined through analytic methods as well as by traditional simulation or experimental methods. These analytic methods can provide precise statements about expected performance. They can thus determine which of 2 similarly performing systems is superior. For both a single query term and a multiple query term retrieval model, a model for comparing the performance of different probabilistic retrieval methods is developed. This method may be used in computing the average search length for a query, given only knowledge of database parameter values. Describes predictive models for inverse document frequency, binary independence, and relevance feedback based retrieval and filtering. Simulations illustrate how the single term model performs and sample performance predictions are given for single term and multiple term problems
Date: 22. 2.1996 13:14:10

Huffman, G.D.; Vital, D.A.; Bivins, R.G.: Generating indices with lexical association methods : term uniqueness (1990) 0.02
```
0.017055526 = product of:
  0.042638816 = sum of:
    0.032950602 = weight(_text_:system in 4152) [ClassicSimilarity], result of:
      0.032950602 = score(doc=4152,freq=4.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.24605882 = fieldWeight in 4152, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4152)
    0.009688215 = product of:
      0.029064644 = sum of:
        0.029064644 = weight(_text_:29 in 4152) [ClassicSimilarity], result of:
          0.029064644 = score(doc=4152,freq=2.0), product of:
            0.14956595 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.04251826 = queryNorm
            0.19432661 = fieldWeight in 4152, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4152)
      0.33333334 = coord(1/3)
  0.4 = coord(2/5)
```
Abstract

A software system has been developed which orders citations retrieved from an online database in terms of relevancy. The system resulted from an effort generated by NASA's Technology Utilization Program to create new advanced software tools to largely automate the process of determining relevancy of database citations retrieved to support large technology transfer studies. The ranking is based on the generation of an enriched vocabulary using lexical association methods, a user assessment of the vocabulary and a combination of the user assessment and the lexical metric. One of the key elements in relevancy ranking is the enriched vocabulary - the terms must be both unique and descriptive. This paper examines term uniqueness. Six lexical association methods were employed to generate characteristic word indices. A limited subset of the terms - the highest 20, 40, 60 and 7,5% of the unique words - were compared and uniqueness factors developed. Computational times were also measured. It was found that methods based on occurrences and signal produced virtually the same terms. The limited subset of terms produced by the exact and centroid discrimination value were also nearly identical. Unique term sets were produced by the occurrence, variance and discrimination value (centroid). An end-user evaluation showed that the generated terms were largely distinct and had values of word precision which were consistent with values of the search precision.

Date

23.11.1995 11:29:46
Leppanen, E.: Homografiongelma tekstihaussa ja homografien disambiguoinnin vaikutukset (1996) 0.02
```
0.015834149 = product of:
  0.03958537 = sum of:
    0.027959513 = weight(_text_:system in 27) [ClassicSimilarity], result of:
      0.027959513 = score(doc=27,freq=2.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.20878783 = fieldWeight in 27, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.046875 = fieldNorm(doc=27)
    0.011625858 = product of:
      0.034877572 = sum of:
        0.034877572 = weight(_text_:29 in 27) [ClassicSimilarity], result of:
          0.034877572 = score(doc=27,freq=2.0), product of:
            0.14956595 = queryWeight, product of:
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.04251826 = queryNorm
            0.23319192 = fieldWeight in 27, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5176873 = idf(docFreq=3565, maxDocs=44218)
              0.046875 = fieldNorm(doc=27)
      0.33333334 = coord(1/3)
  0.4 = coord(2/5)
```
Abstract

Homonymy is known to often cause false drops in free text searching in a full text database. The problem is quite common and difficult to avoid in Finnish, but nobody has examined it before. Reports on a study that examined the frequency of, and solutions to, the homonymy problem, based on searches made in a Finnish full text database containing about 55,000 newspaper articles. The results indicate that homonymy is not a very serious problem in full text searching, with only about 1 search result set out of 4 containing false drops caused by homonymy. Several other reasons for nonrelevance were much more common. However, in some result sets there were a considerable number of homonymy errors, so the number seems to be very random. A study was also made into whether homonyms can be disambiguated by syntactic analysis. The result was that 75.2% of homonyms were disambiguated by this method. Verb homonyms were considerably easier to disambiguate than substantives. Although homonymy is not a very big problem it could perhaps easily be eliminated if there was a suitable syntactic analyzer in the IR system.

Date

9.12.1997 18:33:29
Cross-language information retrieval (1998) 0.02
```
0.015560204 = product of:
  0.03890051 = sum of:
    0.022425208 = weight(_text_:index in 6299) [ClassicSimilarity], result of:
      0.022425208 = score(doc=6299,freq=2.0), product of:
        0.18579477 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.04251826 = queryNorm
        0.12069881 = fieldWeight in 6299, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.01953125 = fieldNorm(doc=6299)
    0.016475301 = weight(_text_:system in 6299) [ClassicSimilarity], result of:
      0.016475301 = score(doc=6299,freq=4.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.12302941 = fieldWeight in 6299, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.01953125 = fieldNorm(doc=6299)
  0.4 = coord(2/5)
```
Content

Contains the contributions: GREFENSTETTE, G.: The Problem of Cross-Language Information Retrieval; DAVIS, M.W.: On the Effective Use of Large Parallel Corpora in Cross-Language Text Retrieval; BALLESTEROS, L. and W.B. CROFT: Statistical Methods for Cross-Language Information Retrieval; Distributed Cross-Lingual Information Retrieval; Automatic Cross-Language Information Retrieval Using Latent Semantic Indexing; EVANS, D.A. et al.: Mapping Vocabularies Using Latent Semantics; PICCHI, E. and C. PETERS: Cross-Language Information Retrieval: A System for Comparable Corpus Querying; YAMABANA, K. et al.: A Language Conversion Front-End for Cross-Language Information Retrieval; GACHOT, D.A. et al.: The Systran NLP Browser: An Application of Machine Translation Technology in Cross-Language Information Retrieval; HULL, D.: A Weighted Boolean Model for Cross-Language Text Retrieval; SHERIDAN, P. et al.: Building a Large Multilingual Test Collection from Comparable News Documents; OARD, D.W. and B.J. DORR: Evaluating Cross-Language Text Filtering Effectiveness

Footnote

Review in: Machine translation review: 1999, no.10, pp. 26-27 (D. Lewis): "Cross Language Information Retrieval (CLIR) addresses the growing need to access large volumes of data across language boundaries. The typical requirement is for the user to input a free form query, usually a brief description of a topic, into a search or retrieval engine which returns a list, in ranked order, of documents or web pages that are relevant to the topic. The search engine matches the terms in the query to indexed terms, usually keywords previously derived from the target documents. Unlike monolingual information retrieval, CLIR requires query terms in one language to be matched to indexed terms in another. Matching can be done by bilingual dictionary lookup, full machine translation, or by applying statistical methods. A query's success is measured in terms of recall (how many potentially relevant target documents are found) and precision (what proportion of documents found are relevant). Issues in CLIR are how to translate query terms into index terms, how to eliminate alternative translations (e.g. to decide that French 'traitement' in a query means 'treatment' and not 'salary'), and how to rank or weight translation alternatives that are retained (e.g. how to order the French terms 'aventure', 'business', 'affaire', and 'liaison' as relevant translations of English 'affair'). Grefenstette provides a lucid and useful overview of the field and the problems. The volume brings together a number of experiments and projects in CLIR. Mark Davis (New Mexico State University) describes Recuerdo, a Spanish retrieval engine which reduces translation ambiguities by scanning indexes for parallel texts; it also uses either a bilingual dictionary or direct equivalents from a parallel corpus in order to compare results for queries on parallel texts.
Lisa Ballesteros and Bruce Croft (University of Massachusetts) use a 'local feedback' technique which automatically enhances a query by adding extra terms to it both before and after translation; such terms can be derived from documents known to be relevant to the query.
The retrieved output from a query including the phrase 'big rockets' may be, for instance, a sentence containing 'giant rocket' which is semantically ranked above 'military rocket'. David Hull (Xerox Research Centre, Grenoble) describes an implementation of a weighted Boolean model for Spanish-English CLIR. Users construct Boolean-type queries, weighting each term in the query, which is then translated by an on-line dictionary before being applied to the database. Comparisons with the performance of unweighted free-form queries ('vector space' models) proved encouraging. Two contributions consider the evaluation of CLIR systems. In order to by-pass the time-consuming and expensive process of assembling a standard collection of documents and of user queries against which the performance of a CLIR system is manually assessed, Páraic Sheridan et al. (ETH Zurich) propose a method based on retrieving 'seed documents'. This involves identifying a unique document in a database (the 'seed document') and, for a number of queries, measuring how fast it is retrieved. The authors have also assembled a large database of multilingual news documents for testing purposes. By storing the (fairly short) documents in a structured form tagged with descriptor codes (e.g. for topic, country and area), the test suite is easily expanded while remaining consistent for the purposes of testing. Douglas Oard and Bonnie Dorr (University of Maryland) describe an evaluation methodology which appears to apply LSI techniques in order to filter and rank incoming documents designed for testing CLIR systems. The volume provides the reader an excellent overview of several projects in CLIR. It is well supported with references and is intended as a secondary text for researchers and practitioners. It highlights the need for a good, general tutorial introduction to the field."
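The review's definitions of recall and precision can be made concrete with a short sketch. The document identifiers and the helper name below are hypothetical and chosen only for illustration; they do not come from the reviewed volume.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision: share of retrieved documents that are relevant.
    recall:    share of relevant documents that were retrieved.
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical result list and relevance judgements for a single query.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7", "d8", "d9"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four retrieved documents are relevant (precision 0.50) and two of the five relevant documents were found (recall 0.40).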

Cleverdon, C.W.; Mills, J.: ¬The testing of index language devices (1997) 0.01

```
0.014352133 = product of:
  0.07176066 = sum of:
    0.07176066 = weight(_text_:index in 576) [ClassicSimilarity], result of:
      0.07176066 = score(doc=576,freq=2.0), product of:
        0.18579477 = queryWeight, product of:
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.04251826 = queryNorm
        0.3862362 = fieldWeight in 576, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.369764 = idf(docFreq=1520, maxDocs=44218)
          0.0625 = fieldNorm(doc=576)
  0.2 = coord(1/5)
```

Gluck, M.: Understanding performance in information systems : blending relevance and competence (1995) 0.01
```
0.0136973085 = product of:
  0.06848654 = sum of:
    0.06848654 = weight(_text_:system in 2678) [ClassicSimilarity], result of:
      0.06848654 = score(doc=2678,freq=12.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.51142365 = fieldWeight in 2678, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.046875 = fieldNorm(doc=2678)
  0.2 = coord(1/5)
```
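The figures in the explanation above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function names are illustrative and are not part of Lucene; the input values are copied from the explanation for doc 2678.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency as assumed for ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_score(freq: float, doc_freq: int, max_docs: int,
                  field_norm: float, query_norm: float, coord: float) -> float:
    """Single-term score: queryWeight * fieldWeight * coord."""
    tf = math.sqrt(freq)                   # tf(freq=12.0) in the listing
    idf = classic_idf(doc_freq, max_docs)  # idf(docFreq=5152, maxDocs=44218)
    query_weight = idf * query_norm        # queryWeight
    field_weight = tf * idf * field_norm   # fieldWeight
    return query_weight * field_weight * coord


# Values taken from the explanation for the term "system" in doc 2678.
score = classic_score(freq=12.0, doc_freq=5152, max_docs=44218,
                      field_norm=0.046875, query_norm=0.04251826,
                      coord=1 / 5)
print(score)
```

The result should agree with the listed 0.0136973085 up to rounding, since Lucene computes these values in single precision while the sketch uses double precision.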
Abstract

Presents brief accounts of the user-based performance measure of relevance and the information system-based performance measure of competence. Relevance and competence are shown to be complex notions that have not been studied conjointly. Reports the results of an experiment that used a geographical information system to illustrate how collecting and analyzing data simultaneously from both system and user views of performance can suggest improvements. The user's view was formed by respondents describing how user needs were met by a geographic information system. The system view of the user was described by the accuracy and time on tasks of subjects as they read and answered questions concerning text and maps. This research generated 2 hypotheses: relevance varies directly with levels of competence and experience, and relevance varies directly with the difficulty of the task. Findings also indicate that through a merged, no-fault model, information science can contribute to constructing a holistic view of the system performance by illustrating relationships among factors such as competence and relevance, and by exposing new factors such as expectation

Huang, M.-H.: ¬The evaluation of information retrieval systems (1997) 0.01
```
0.013180241 = product of:
  0.065901205 = sum of:
    0.065901205 = weight(_text_:system in 1827) [ClassicSimilarity], result of:
      0.065901205 = score(doc=1827,freq=4.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.49211764 = fieldWeight in 1827, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.078125 = fieldNorm(doc=1827)
  0.2 = coord(1/5)
```
Abstract

Describes the current status of retrieval system evaluation and predicts its future development. Discusses various performance measures and 'utility' concepts from a historical perspective. Also addresses the current status of search evaluation and discusses the empirical findings of retrieval system evaluation

Hersh, W.R.; Hickam, D.H.: ¬An evaluation of interactive Boolean and natural language searching with an online medical textbook (1995) 0.01
```
0.013047772 = product of:
  0.06523886 = sum of:
    0.06523886 = weight(_text_:system in 2651) [ClassicSimilarity], result of:
      0.06523886 = score(doc=2651,freq=8.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.4871716 = fieldWeight in 2651, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2651)
  0.2 = coord(1/5)
```
Abstract

Few studies have compared the interactive use of Boolean and natural language search systems. Studies the use of 3 retrieval systems by senior medical students searching on queries generated by actual physicians in a clinical setting. The searchers were randomized to search on 2 of 3 different retrieval systems: a Boolean system, a word-based natural language system, and a concept-based natural language system. Results showed no statistically significant differences in recall or precision among the 3 systems. Likewise, there was no user preference for any system over the others. The study revealed problems with traditional measures of retrieval evaluation when applied to the interactive search setting

Wolfram, D.; Volz, A.; Dimitroff, A.: ¬The effect of linkage structure on retrieval performance in a hypertext-based bibliographic retrieval system (1996) 0.01
```
0.013047772 = product of:
  0.06523886 = sum of:
    0.06523886 = weight(_text_:system in 6622) [ClassicSimilarity], result of:
      0.06523886 = score(doc=6622,freq=8.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.4871716 = fieldWeight in 6622, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.0546875 = fieldNorm(doc=6622)
  0.2 = coord(1/5)
```
Abstract

Investigates how linkage environments in a hypertext-based bibliographic retrieval system affect retrieval performance for novice and experienced searchers. Two systems, one with inter-record linkages to authors and descriptors and one that also included title and abstract keywords, were tested. No significant differences in retrieval performance and system usage were found for most search tests. The enhanced system did provide better performance where title and abstract keywords provided the most direct access to relevant records. The findings have implications for the design of bibliographic information retrieval systems using hypertext linkages

Drabenstott, K.M.; Weller, M.S.: ¬A comparative approach to system evaluation : delegating control of retrieval tests to an experimental online system (1996) 0.01
```
0.013047772 = product of:
  0.06523886 = sum of:
    0.06523886 = weight(_text_:system in 7435) [ClassicSimilarity], result of:
      0.06523886 = score(doc=7435,freq=8.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.4871716 = fieldWeight in 7435, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7435)
  0.2 = coord(1/5)
```
Abstract

Describes the comparative approach to system evaluation used in this research project, which delegated the administration of an online retrieval test to an experimental online catalogue to produce data for evaluating the effectiveness of a new subject access design. Describes the methods enlisted to sort out problem test administrations, e.g. to identify out-of-scope queries, incomplete system administration, and suspect post-search questionnaire responses. Covers how the researchers handled problem search administrations and what actions they would use to reduce or eliminate the occurrence of such administrations in future online retrieval tests that delegate control of retrieval tests to online systems

Wan, T.-L.; Evens, M.; Wan, Y.-W.; Pao, Y.-Y.: Experiments with automatic indexing and a relational thesaurus in a Chinese information retrieval system (1997) 0.01
```
0.013047772 = product of:
  0.06523886 = sum of:
    0.06523886 = weight(_text_:system in 956) [ClassicSimilarity], result of:
      0.06523886 = score(doc=956,freq=8.0), product of:
        0.13391352 = queryWeight, product of:
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.04251826 = queryNorm
        0.4871716 = fieldWeight in 956, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.1495528 = idf(docFreq=5152, maxDocs=44218)
          0.0546875 = fieldNorm(doc=956)
  0.2 = coord(1/5)
```
Abstract

This article describes a series of experiments with an interactive Chinese information retrieval system named CIRS and an interactive relational thesaurus. 2 important issues have been explored: whether thesauri enhance the retrieval effectiveness of Chinese documents, and whether automatic indexing can compete with manual indexing in a Chinese information retrieval system. Recall and precision are used to measure and evaluate the effectiveness of the system. Statistical analysis of the recall and precision measures suggests that the use of the relational thesaurus does improve the retrieval effectiveness both in the automatic indexing environment and in the manual indexing environment and that automatic indexing is at least as good as manual indexing
