Search (387 results, page 1 of 20)

  • theme_ss:"Retrievalstudien"
  1. Ding, C.H.Q.: ¬A probabilistic model for Latent Semantic Indexing (2005) 0.07
    0.06737011 = product of:
      0.16842526 = sum of:
        0.028076671 = weight(_text_:retrieval in 3459) [ClassicSimilarity], result of:
          0.028076671 = score(doc=3459,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.20052543 = fieldWeight in 3459, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=3459)
        0.1403486 = weight(_text_:semantic in 3459) [ClassicSimilarity], result of:
          0.1403486 = score(doc=3459,freq=14.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.7292479 = fieldWeight in 3459, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.046875 = fieldNorm(doc=3459)
      0.4 = coord(2/5)
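    The explanation tree above is standard Lucene ClassicSimilarity output, and its value can be recomputed by hand. A minimal sketch, with the constants copied from the tree (the helper name is ours, for illustration only):

      import math

      query_norm = 0.04628742  # queryNorm, shared by all query terms

      def term_score(freq, idf, field_norm):
          """ClassicSimilarity term weight = queryWeight * fieldWeight."""
          query_weight = idf * query_norm          # idf * queryNorm
          tf = math.sqrt(freq)                     # tf(freq) = sqrt(freq)
          field_weight = tf * idf * field_norm     # tf * idf * fieldNorm
          return query_weight * field_weight

      retrieval = term_score(freq=2.0,  idf=3.024915,  field_norm=0.046875)
      semantic  = term_score(freq=14.0, idf=4.1578603, field_norm=0.046875)

      coord = 2 / 5  # 2 of the 5 query clauses matched: coord(2/5)
      print((retrieval + semantic) * coord)  # ~0.06737011, matching the tree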
    
    Abstract
    Latent Semantic Indexing (LSI), when applied to semantic spaces built on text collections, improves information retrieval, information filtering, and word sense disambiguation. A new dual probability model based on the similarity concepts is introduced to provide a deeper understanding of LSI. Semantic associations can be quantitatively characterized by their statistical significance, the likelihood. Semantic dimensions containing redundant and noisy information can be separated out and should be ignored because of their negative contribution to the overall statistical significance. LSI is the optimal solution of the model. The peak in the likelihood curve indicates the existence of an intrinsic semantic dimension. The importance of LSI dimensions follows the Zipf distribution, indicating that LSI dimensions represent latent concepts. Document frequency of words follows the Zipf distribution, and the number of distinct words follows a log-normal distribution. Experiments on five standard document collections confirm and illustrate the analysis.
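    As a concrete gloss on the LSI construction analyzed in this abstract, here is a minimal truncated-SVD sketch on a toy term-document matrix; the data and the dimension choice are illustrative, not from the paper:

      import numpy as np

      # Toy term-document count matrix (rows: terms, columns: documents).
      A = np.array([
          [2., 0., 1., 0.],
          [1., 1., 0., 0.],
          [0., 2., 1., 1.],
          [0., 0., 2., 1.],
      ])

      # LSI keeps only the k strongest semantic dimensions; dimensions with
      # small singular values carry the redundant/noisy information that,
      # per the abstract, should be discarded.
      U, s, Vt = np.linalg.svd(A, full_matrices=False)
      k = 2
      A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # rank-k approximation
      print(np.round(A_k, 2))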
    Object
    Latent Semantic Indexing
  2. Lochbaum, K.E.; Streeter, A.R.: Comparing and combining the effectiveness of latent semantic indexing and the ordinary vector space model for information retrieval (1989) 0.07
    0.06689858 = product of:
      0.16724643 = sum of:
        0.048630223 = weight(_text_:retrieval in 3458) [ClassicSimilarity], result of:
          0.048630223 = score(doc=3458,freq=6.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.34732026 = fieldWeight in 3458, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=3458)
        0.118616216 = weight(_text_:semantic in 3458) [ClassicSimilarity], result of:
          0.118616216 = score(doc=3458,freq=10.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.616327 = fieldWeight in 3458, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.046875 = fieldNorm(doc=3458)
      0.4 = coord(2/5)
    
    Abstract
    A retrieval system was built to find individuals with appropriate expertise within a large research establishment on the basis of the documents they authored. The expert-locating system uses a new method for automatic indexing and retrieval based on singular value decomposition, a matrix decomposition technique related to factor analysis. Organizational groups, represented by the documents they write, and the terms contained in those documents, are fit simultaneously into a 100-dimensional "semantic" space. User queries are positioned in the semantic space, and the most similar groups are returned to the user. Here we compared the standard vector space model with this new technique and found that combining the two methods improved performance over either alone. We also examined the effects of various experimental variables on the system's retrieval accuracy. In particular, we studied the effects of term weighting functions in the construction of the semantic space and of queries, suffix stripping, and the use of lexical units larger than a single word.
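    One common way to realize the query positioning described here is LSI "folding-in"; the sketch below assumes that step and uses toy data, so it is not necessarily the authors' exact procedure:

      import numpy as np

      # Columns stand in for organizational groups (toy data).
      A = np.array([
          [2., 0., 1.],
          [1., 1., 0.],
          [0., 2., 1.],
      ])
      U, s, Vt = np.linalg.svd(A, full_matrices=False)
      k = 2
      groups_k = Vt[:k, :].T  # each row: one group in the k-dim space

      # Fold the query vector q into the same space: q_hat = inv(S_k) U_k^T q
      q = np.array([1., 1., 0.])
      q_hat = np.linalg.inv(np.diag(s[:k])) @ U[:, :k].T @ q

      def cosine(a, b):
          return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

      # Rank groups by similarity to the query, most similar first.
      ranking = sorted(range(groups_k.shape[0]),
                       key=lambda i: cosine(q_hat, groups_k[i]), reverse=True)
      print(ranking)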
    Object
    Latent Semantic Indexing
  3. ¬The Eleventh Text Retrieval Conference, TREC 2002 (2003) 0.06
    0.06343392 = product of:
      0.1585848 = sum of:
        0.064840294 = weight(_text_:retrieval in 4049) [ClassicSimilarity], result of:
          0.064840294 = score(doc=4049,freq=6.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.46309367 = fieldWeight in 4049, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=4049)
        0.09374451 = sum of:
          0.043574058 = weight(_text_:web in 4049) [ClassicSimilarity], result of:
            0.043574058 = score(doc=4049,freq=2.0), product of:
              0.15105948 = queryWeight, product of:
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.04628742 = queryNorm
              0.2884563 = fieldWeight in 4049, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.0625 = fieldNorm(doc=4049)
          0.05017045 = weight(_text_:22 in 4049) [ClassicSimilarity], result of:
            0.05017045 = score(doc=4049,freq=2.0), product of:
              0.16209066 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04628742 = queryNorm
              0.30952093 = fieldWeight in 4049, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=4049)
      0.4 = coord(2/5)
    
    Abstract
    Proceedings of the 11th TREC conference held in Gaithersburg, Maryland (USA), November 19-22, 2002. The aim of the conference was discussion of retrieval and related information-seeking tasks for large test collections. 93 research groups used different techniques for information retrieval from the same large database. This procedure makes it possible to compare the results. The tasks were: cross-language searching, filtering, interactive searching, searching for novelty, question answering, searching for video shots, and Web searching.
  4. Cross-language information retrieval (1998) 0.06
    0.055131674 = product of:
      0.09188612 = sum of:
        0.04679445 = weight(_text_:retrieval in 6299) [ClassicSimilarity], result of:
          0.04679445 = score(doc=6299,freq=32.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.33420905 = fieldWeight in 6299, product of:
              5.656854 = tf(freq=32.0), with freq of:
                32.0 = termFreq=32.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.01953125 = fieldNorm(doc=6299)
        0.038283218 = weight(_text_:semantic in 6299) [ClassicSimilarity], result of:
          0.038283218 = score(doc=6299,freq=6.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.19891867 = fieldWeight in 6299, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.01953125 = fieldNorm(doc=6299)
        0.0068084467 = product of:
          0.013616893 = sum of:
            0.013616893 = weight(_text_:web in 6299) [ClassicSimilarity], result of:
              0.013616893 = score(doc=6299,freq=2.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.09014259 = fieldWeight in 6299, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=6299)
          0.5 = coord(1/2)
      0.6 = coord(3/5)
    
    Content
    Contains the contributions: GREFENSTETTE, G.: The Problem of Cross-Language Information Retrieval; DAVIS, M.W.: On the Effective Use of Large Parallel Corpora in Cross-Language Text Retrieval; BALLESTEROS, L. and W.B. CROFT: Statistical Methods for Cross-Language Information Retrieval; Distributed Cross-Lingual Information Retrieval; Automatic Cross-Language Information Retrieval Using Latent Semantic Indexing; EVANS, D.A. et al.: Mapping Vocabularies Using Latent Semantics; PICCHI, E. and C. PETERS: Cross-Language Information Retrieval: A System for Comparable Corpus Querying; YAMABANA, K. et al.: A Language Conversion Front-End for Cross-Language Information Retrieval; GACHOT, D.A. et al.: The Systran NLP Browser: An Application of Machine Translation Technology in Cross-Language Information Retrieval; HULL, D.: A Weighted Boolean Model for Cross-Language Text Retrieval; SHERIDAN, P. et al.: Building a Large Multilingual Test Collection from Comparable News Documents; OARD, D.W. and B.J. DORR: Evaluating Cross-Language Text Filtering Effectiveness
    Footnote
    Review in: Machine translation review: 1999, no.10, S.26-27 (D. Lewis): "Cross Language Information Retrieval (CLIR) addresses the growing need to access large volumes of data across language boundaries. The typical requirement is for the user to input a free-form query, usually a brief description of a topic, into a search or retrieval engine which returns a list, in ranked order, of documents or web pages that are relevant to the topic. The search engine matches the terms in the query to indexed terms, usually keywords previously derived from the target documents. Unlike monolingual information retrieval, CLIR requires query terms in one language to be matched to indexed terms in another. Matching can be done by bilingual dictionary lookup, full machine translation, or by applying statistical methods. A query's success is measured in terms of recall (how many potentially relevant target documents are found) and precision (what proportion of documents found are relevant). Issues in CLIR are how to translate query terms into index terms, how to eliminate alternative translations (e.g. to decide that French 'traitement' in a query means 'treatment' and not 'salary'), and how to rank or weight translation alternatives that are retained (e.g. how to order the French terms 'aventure', 'business', 'affaire', and 'liaison' as relevant translations of English 'affair'). Grefenstette provides a lucid and useful overview of the field and the problems. The volume brings together a number of experiments and projects in CLIR. Mark Davis (New Mexico State University) describes Recuerdo, a Spanish retrieval engine which reduces translation ambiguities by scanning indexes for parallel texts; it also uses either a bilingual dictionary or direct equivalents from a parallel corpus in order to compare results for queries on parallel texts. Lisa Ballesteros and Bruce Croft (University of Massachusetts) use a 'local feedback' technique which automatically enhances a query by adding extra terms to it both before and after translation; such terms can be derived from documents known to be relevant to the query.
    Christian Fluhr et al. (DIST/SMTI, France) outline the EMIR (European Multilingual Information Retrieval) and ESPRIT projects. They found that using SYSTRAN to machine translate queries and to access material from various multilingual databases produced less relevant results than a method referred to as 'multilingual reformulation' (the mechanics of which are only hinted at). An interesting technique is Latent Semantic Indexing (LSI), described by Michael Littman et al. (Brown University) and, most clearly, by David Evans et al. (Carnegie Mellon University). LSI involves creating matrices of documents and the terms they contain and 'fitting' related documents into a reduced matrix space. This effectively allows queries to be mapped onto a common semantic representation of the documents. Eugenio Picchi and Carol Peters (Pisa) report on a procedure to create links between translation equivalents in an Italian-English parallel corpus. The links are used to construct parallel linguistic contexts in real time for any term or combination of terms that is being searched for in either language. Their interest is primarily lexicographic, but they plan to apply the same procedure to comparable corpora, i.e. to texts which are not translations of each other but which share the same domain. Kiyoshi Yamabana et al. (NEC, Japan) address the issue of how to disambiguate between alternative translations of query terms. Their DMAX (double maximise) method looks at co-occurrence frequencies between both source language words and target language words in order to arrive at the most probable translation. The statistical data for the decision are derived not from the translation texts but independently from monolingual corpora in each language. An interactive user interface allows the user to influence the selection of terms during the matching process. Denis Gachot et al. (SYSTRAN) describe the SYSTRAN NLP browser, a prototype tool which collects parsing information derived from a text or corpus previously translated with SYSTRAN. The user enters queries into the browser in either a structured or free form and receives grammatical and lexical information about the source text and/or its translation.
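    Since the review defines recall and precision only in passing, here is a minimal computation over hypothetical document IDs:

      def recall_precision(retrieved, relevant):
          """Recall: share of relevant documents found.
          Precision: share of found documents that are relevant."""
          hits = len(set(retrieved) & set(relevant))
          return hits / len(relevant), hits / len(retrieved)

      retrieved = ["d1", "d3", "d7", "d9"]   # hypothetical result list
      relevant  = ["d1", "d2", "d3", "d5"]   # hypothetical judgments
      r, p = recall_precision(retrieved, relevant)
      print(f"recall={r:.2f} precision={p:.2f}")  # recall=0.50 precision=0.50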
    Series
    The Kluwer International series on information retrieval
  5. Ellis, D.: Progress and problems in information retrieval (1996) 0.05
    0.052387595 = product of:
      0.13096899 = sum of:
        0.105883755 = weight(_text_:retrieval in 789) [ClassicSimilarity], result of:
          0.105883755 = score(doc=789,freq=16.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.75622874 = fieldWeight in 789, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=789)
        0.025085226 = product of:
          0.05017045 = sum of:
            0.05017045 = weight(_text_:22 in 789) [ClassicSimilarity], result of:
              0.05017045 = score(doc=789,freq=2.0), product of:
                0.16209066 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04628742 = queryNorm
                0.30952093 = fieldWeight in 789, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=789)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    An introduction to the principal generic approaches to information retrieval research with their associated concepts, models and systems, this text is designed to keep the information professional up to date with the major themes and developments that have preoccupied researchers in recent months in relation to textual and documentary retrieval systems.
    COMPASS
    Information retrieval
    Content
    First published 1991 as New horizons in information retrieval
    Date
    26. 7.2002 20:22:46
    LCSH
    Information retrieval
    Subject
    Information retrieval
  6. Keyes, J.G.: Using conceptual categories of questions to measure differences in retrieval performance (1996) 0.05
    0.049459886 = product of:
      0.12364971 = sum of:
        0.048630223 = weight(_text_:retrieval in 7440) [ClassicSimilarity], result of:
          0.048630223 = score(doc=7440,freq=6.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.34732026 = fieldWeight in 7440, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=7440)
        0.075019486 = weight(_text_:semantic in 7440) [ClassicSimilarity], result of:
          0.075019486 = score(doc=7440,freq=4.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.38979942 = fieldWeight in 7440, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.046875 = fieldNorm(doc=7440)
      0.4 = coord(2/5)
    
    Abstract
    The form of a question denotes the relationship between the current state of knowledge of the questioner and the propositional content of the question. To assess whether these semantic differences have implications for information retrieval, the study uses the CF database, a 1,239-document test database containing titles and abstracts of documents pertaining to cystic fibrosis. The database has an accompanying list of 100 questions, which were divided into 5 conceptual categories based on their semantic representation. 2 retrieval methods were used to investigate potential differences in outcomes across conceptual categories: the cosine measurement and the similarity measurement. The ranked results produced by the different algorithms will vary for individual conceptual categories as well as for overall performance.
  7. Pao, M.L.; Worthen, D.B.: Retrieval effectiveness by semantic and citation searching (1989) 0.05
    0.045890357 = product of:
      0.114725895 = sum of:
        0.03970641 = weight(_text_:retrieval in 2288) [ClassicSimilarity], result of:
          0.03970641 = score(doc=2288,freq=4.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.2835858 = fieldWeight in 2288, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2288)
        0.075019486 = weight(_text_:semantic in 2288) [ClassicSimilarity], result of:
          0.075019486 = score(doc=2288,freq=4.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.38979942 = fieldWeight in 2288, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.046875 = fieldNorm(doc=2288)
      0.4 = coord(2/5)
    
    Abstract
    A pilot study on the relative retrieval effectiveness of semantic relevance (by terms) and pragmatic relevance (by citations) is reported. A single database was constructed to provide access by both descriptors and cited references. For each question from a set of queries, two equivalent sets were retrieved. All retrieved items were evaluated by subject experts for relevance to their originating queries. We conclude that there are essentially two types of relevance at work, resulting in two different sets of documents. Using both search methods to create a union set is likely to increase recall. The few items retrieved by the intersection of the two methods tend to result in higher precision. Suggestions are made to develop a front-end system that displays the overlapping items for higher precision and manipulates and ranks the union of the sets retrieved by the two search modes for improved output.
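    The union/intersection logic suggested for the proposed front-end can be sketched directly (hypothetical document IDs):

      by_terms     = {"d1", "d2", "d4", "d8"}  # semantic relevance hits
      by_citations = {"d2", "d3", "d8", "d9"}  # pragmatic relevance hits

      union        = by_terms | by_citations   # broader set: recall-oriented
      intersection = by_terms & by_citations   # agreement: precision-oriented

      print(sorted(union))         # ['d1', 'd2', 'd3', 'd4', 'd8', 'd9']
      print(sorted(intersection))  # ['d2', 'd8']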
  8. Leiva-Mederos, A.; Senso, J.A.; Hidalgo-Delgado, Y.; Hipola, P.: Working framework of semantic interoperability for CRIS with heterogeneous data sources (2017) 0.04
    0.044913407 = product of:
      0.11228351 = sum of:
        0.01871778 = weight(_text_:retrieval in 3706) [ClassicSimilarity], result of:
          0.01871778 = score(doc=3706,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.13368362 = fieldWeight in 3706, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=3706)
        0.09356573 = weight(_text_:semantic in 3706) [ClassicSimilarity], result of:
          0.09356573 = score(doc=3706,freq=14.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.4861653 = fieldWeight in 3706, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.03125 = fieldNorm(doc=3706)
      0.4 = coord(2/5)
    
    Abstract
    Purpose Information from Current Research Information Systems (CRIS) is stored in different formats, on platforms that are not compatible, or even in independent networks. It would be helpful to have a well-defined methodology that allows data processing to be managed from a single site, so as to take advantage of the capacity to link dispersed data found in different systems, platforms, sources and/or formats. Based on functionalities and materials of the VLIR project, the purpose of this paper is to present a model that provides interoperability by means of semantic alignment techniques and metadata crosswalks, and facilitates the fusion of information stored in diverse sources. Design/methodology/approach After reviewing the state of the art regarding the diverse mechanisms for achieving semantic interoperability, the paper analyzes the following: the specific coverage of the data sets (type of data, thematic coverage and geographic coverage); the technical specifications needed to retrieve and analyze a distribution of the data set (format, protocol, etc.); the conditions of re-utilization (copyright and licenses); and the "dimensions" included in the data set as well as the semantics of these dimensions (the syntax and the taxonomies of reference). The semantic interoperability framework presented here implements semantic alignment and metadata crosswalks to convert information from three different systems (ABCD, Moodle and DSpace) and integrate all the databases in a single RDF file. Findings The paper also includes an evaluation based on the comparison - by means of calculations of recall and precision - of the proposed model and identical consultations made on Open Archives Initiative and SQL, in order to estimate its efficiency. The results are satisfactory, owing to the fact that semantic interoperability facilitates exact retrieval of information. Originality/value The proposed model enhances management of the syntactic and semantic interoperability of the CRIS system designed. In a real usage setting it achieves very positive results.
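    A minimal sketch of the metadata-crosswalk step, assuming the rdflib library and hypothetical field names for the three systems (the abstract does not publish the actual mapping tables):

      from rdflib import Graph, Literal, URIRef
      from rdflib.namespace import DCTERMS

      # Hypothetical crosswalk: per-system field names mapped onto one
      # shared vocabulary (Dublin Core terms, as a plausible target).
      CROSSWALK = {
          "abcd":   {"titulo": DCTERMS.title, "autor": DCTERMS.creator},
          "moodle": {"fullname": DCTERMS.title, "teacher": DCTERMS.creator},
          "dspace": {"dc.title": DCTERMS.title,
                     "dc.contributor.author": DCTERMS.creator},
      }

      def to_rdf(system, record_uri, record, graph):
          """Map one source record into the shared RDF graph."""
          subject = URIRef(record_uri)
          for field, value in record.items():
              predicate = CROSSWALK[system].get(field)
              if predicate is not None:  # unmapped fields are skipped
                  graph.add((subject, predicate, Literal(value)))

      g = Graph()
      to_rdf("dspace", "http://example.org/rec/1",
             {"dc.title": "CRIS interoperability",
              "dc.contributor.author": "Doe, J."}, g)
      print(g.serialize(format="turtle"))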
  9. Tomaiuolo, N.G.; Parker, J.: Maximizing relevant retrieval : keyword and natural language searching (1998) 0.04
    0.04376455 = product of:
      0.109411374 = sum of:
        0.06551223 = weight(_text_:retrieval in 6418) [ClassicSimilarity], result of:
          0.06551223 = score(doc=6418,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.46789268 = fieldWeight in 6418, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.109375 = fieldNorm(doc=6418)
        0.043899145 = product of:
          0.08779829 = sum of:
            0.08779829 = weight(_text_:22 in 6418) [ClassicSimilarity], result of:
              0.08779829 = score(doc=6418,freq=2.0), product of:
                0.16209066 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04628742 = queryNorm
                0.5416616 = fieldWeight in 6418, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6418)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Source
    Online. 22(1998) no.6, S.57-58
  10. Voorhees, E.M.; Harman, D.: Overview of the Sixth Text REtrieval Conference (TREC-6) (2000) 0.04
    0.04376455 = product of:
      0.109411374 = sum of:
        0.06551223 = weight(_text_:retrieval in 6438) [ClassicSimilarity], result of:
          0.06551223 = score(doc=6438,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.46789268 = fieldWeight in 6438, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.109375 = fieldNorm(doc=6438)
        0.043899145 = product of:
          0.08779829 = sum of:
            0.08779829 = weight(_text_:22 in 6438) [ClassicSimilarity], result of:
              0.08779829 = score(doc=6438,freq=2.0), product of:
                0.16209066 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04628742 = queryNorm
                0.5416616 = fieldWeight in 6438, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6438)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Date
    11. 8.2001 16:22:19
  11. Dalrymple, P.W.: Retrieval by reformulation in two library catalogs : toward a cognitive model of searching behavior (1990) 0.04
    0.04376455 = product of:
      0.109411374 = sum of:
        0.06551223 = weight(_text_:retrieval in 5089) [ClassicSimilarity], result of:
          0.06551223 = score(doc=5089,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.46789268 = fieldWeight in 5089, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.109375 = fieldNorm(doc=5089)
        0.043899145 = product of:
          0.08779829 = sum of:
            0.08779829 = weight(_text_:22 in 5089) [ClassicSimilarity], result of:
              0.08779829 = score(doc=5089,freq=2.0), product of:
                0.16209066 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04628742 = queryNorm
                0.5416616 = fieldWeight in 5089, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=5089)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Date
    22. 7.2006 18:43:54
  12. Gödert, W.; Liebig, M.: Maschinelle Indexierung auf dem Prüfstand : Ergebnisse eines Retrievaltests zum MILOS II Projekt (1997) 0.04
    0.043284822 = product of:
      0.108212054 = sum of:
        0.04632414 = weight(_text_:retrieval in 1174) [ClassicSimilarity], result of:
          0.04632414 = score(doc=1174,freq=4.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.33085006 = fieldWeight in 1174, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1174)
        0.06188791 = weight(_text_:semantic in 1174) [ClassicSimilarity], result of:
          0.06188791 = score(doc=1174,freq=2.0), product of:
            0.19245663 = queryWeight, product of:
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.04628742 = queryNorm
            0.32156807 = fieldWeight in 1174, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1578603 = idf(docFreq=1879, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1174)
      0.4 = coord(2/5)
    
    Abstract
    The test ran from Nov 95 to Aug 96 at the Fachhochschule für Bibliothekswesen (College of Librarianship) in Cologne. The test basis was a database of 190,000 book titles published between 1990 and 1995. The MILOS II mechanized indexing methods proved helpful in avoiding or reducing the number of unsatisfied or zero-result retrieval searches. Retrieval based on mechanized indexing is 3 times more successful than retrieval from title keyword data. MILOS II also used a standardized semantic vocabulary. Mechanized indexing demands high-quality software and output data.
  13. Mandl, T.: Web- und Multimedia-Dokumente : Neuere Entwicklungen bei der Evaluierung von Information Retrieval Systemen (2003) 0.04
    0.042198192 = product of:
      0.10549548 = sum of:
        0.08370846 = weight(_text_:retrieval in 1734) [ClassicSimilarity], result of:
          0.08370846 = score(doc=1734,freq=10.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.59785134 = fieldWeight in 1734, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=1734)
        0.021787029 = product of:
          0.043574058 = sum of:
            0.043574058 = weight(_text_:web in 1734) [ClassicSimilarity], result of:
              0.043574058 = score(doc=1734,freq=2.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.2884563 = fieldWeight in 1734, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1734)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    The amount of data on the Internet continues to grow rapidly. With it grows the need for high-quality information retrieval services for orientation and problem-oriented search. Deciding whether to use or procure information retrieval software requires meaningful evaluation results. This paper presents recent developments in the evaluation of information retrieval systems and shows the trend toward specialization and diversification of evaluation studies, which increase the realism of the results. The focus is on the retrieval of specialized texts, web pages, and multimedia objects.
  14. Sanderson, M.: ¬The Reuters test collection (1996) 0.04
    0.03998254 = product of:
      0.09995635 = sum of:
        0.07487112 = weight(_text_:retrieval in 6971) [ClassicSimilarity], result of:
          0.07487112 = score(doc=6971,freq=8.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.5347345 = fieldWeight in 6971, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=6971)
        0.025085226 = product of:
          0.05017045 = sum of:
            0.05017045 = weight(_text_:22 in 6971) [ClassicSimilarity], result of:
              0.05017045 = score(doc=6971,freq=2.0), product of:
                0.16209066 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04628742 = queryNorm
                0.30952093 = fieldWeight in 6971, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=6971)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Describes the Reuters test collection, which at 22,173 references is significantly larger than most traditional test collections. In addition, Reuters has none of the recall-calculation problems normally associated with some of the larger test collections available. Explains the method derived by D.D. Lewis to perform retrieval experiments on the Reuters collection and illustrates the use of the collection with some simple retrieval experiments that compare the performance of stemming algorithms.
    Source
    Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
  15. Ravana, S.D.; Taheri, M.S.; Rajagopal, P.: Document-based approach to improve the accuracy of pairwise comparison in evaluating information retrieval systems (2015) 0.04
    0.039646205 = product of:
      0.099115506 = sum of:
        0.040525187 = weight(_text_:retrieval in 2587) [ClassicSimilarity], result of:
          0.040525187 = score(doc=2587,freq=6.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.28943354 = fieldWeight in 2587, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2587)
        0.05859032 = sum of:
          0.027233787 = weight(_text_:web in 2587) [ClassicSimilarity], result of:
            0.027233787 = score(doc=2587,freq=2.0), product of:
              0.15105948 = queryWeight, product of:
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.04628742 = queryNorm
              0.18028519 = fieldWeight in 2587, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.2635105 = idf(docFreq=4597, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2587)
          0.031356532 = weight(_text_:22 in 2587) [ClassicSimilarity], result of:
            0.031356532 = score(doc=2587,freq=2.0), product of:
              0.16209066 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04628742 = queryNorm
              0.19345059 = fieldWeight in 2587, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2587)
      0.4 = coord(2/5)
    
    Abstract
    Purpose The purpose of this paper is to propose a method that yields more accurate results when comparing the performance of paired information retrieval (IR) systems than the current method, which is based on the mean effectiveness scores of the systems across a set of identified topics/queries. Design/methodology/approach In the proposed approach, instead of the classic method of using a set of topic scores, document-level scores are considered as the evaluation unit. These document scores are the defined document weights, which play the role of the mean average precision (MAP) score of the systems as the significance test's statistic. The experiments were conducted using the TREC 9 Web track collection. Findings The p-values generated through the two types of significance tests, namely Student's t-test and the Mann-Whitney test, show that by using the document-level scores as an evaluation unit, the difference between IR systems is more significant compared with utilizing topic scores. Originality/value Utilizing a suitable test collection is a primary prerequisite for comparative evaluation of IR systems. However, in addition to reusable test collections, accurate statistical testing is a necessity for these evaluations. The findings of this study will assist IR researchers in evaluating their retrieval systems and algorithms more accurately.
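    The two significance tests named in the abstract are available in scipy; a sketch with hypothetical per-unit effectiveness scores for two systems (per-topic scores in the classic setup, per-document weights in the proposed one):

      import numpy as np
      from scipy import stats

      system_a = np.array([0.42, 0.55, 0.31, 0.60, 0.48, 0.52, 0.39, 0.57])
      system_b = np.array([0.38, 0.49, 0.35, 0.51, 0.45, 0.47, 0.36, 0.50])

      t, p_t = stats.ttest_rel(system_a, system_b)     # paired Student's t-test
      u, p_u = stats.mannwhitneyu(system_a, system_b)  # Mann-Whitney U test
      print(f"t-test p={p_t:.3f}, Mann-Whitney p={p_u:.3f}")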
    Date
    20. 1.2015 18:30:22
  16. Losee, R.M.: Determining information retrieval and filtering performance without experimentation (1995) 0.04
    0.038077794 = product of:
      0.09519448 = sum of:
        0.07324491 = weight(_text_:retrieval in 3368) [ClassicSimilarity], result of:
          0.07324491 = score(doc=3368,freq=10.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.5231199 = fieldWeight in 3368, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3368)
        0.021949572 = product of:
          0.043899145 = sum of:
            0.043899145 = weight(_text_:22 in 3368) [ClassicSimilarity], result of:
              0.043899145 = score(doc=3368,freq=2.0), product of:
                0.16209066 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04628742 = queryNorm
                0.2708308 = fieldWeight in 3368, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3368)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    The performance of an information retrieval or text and media filtering system may be determined through analytic methods as well as by traditional simulation or experimental methods. These analytic methods can provide precise statements about expected performance. They can thus determine which of 2 similarly performing systems is superior. For both single query term and multiple query term retrieval models, a model for comparing the performance of different probabilistic retrieval methods is developed. This method may be used in computing the average search length for a query, given only knowledge of database parameter values. Describes predictive models for inverse document frequency, binary independence, and relevance feedback based retrieval and filtering. Simulations illustrate how the single term model performs, and sample performance predictions are given for single term and multiple term problems.
    Date
    22. 2.1996 13:14:10
  17. Behnert, C.; Lewandowski, D.: ¬A framework for designing retrieval effectiveness studies of library information systems using human relevance assessments (2017) 0.04
    0.03729829 = product of:
      0.093245715 = sum of:
        0.07398852 = weight(_text_:retrieval in 3700) [ClassicSimilarity], result of:
          0.07398852 = score(doc=3700,freq=20.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.5284309 = fieldWeight in 3700, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3700)
        0.019257195 = product of:
          0.03851439 = sum of:
            0.03851439 = weight(_text_:web in 3700) [ClassicSimilarity], result of:
              0.03851439 = score(doc=3700,freq=4.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.25496176 = fieldWeight in 3700, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3700)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Purpose This paper demonstrates how to apply traditional information retrieval evaluation methods based on standards from the Text REtrieval Conference (TREC) and web search evaluation to all types of modern library information systems, including online public access catalogs, discovery systems, and digital libraries that provide web search features to gather information from heterogeneous sources. Design/methodology/approach We apply conventional procedures from information retrieval evaluation to the library information system context, considering the specific characteristics of modern library materials. Findings We introduce a framework consisting of five parts: (1) search queries, (2) search results, (3) assessors, (4) testing, and (5) data analysis. We show how to deal with comparability problems resulting from diverse document types, e.g., electronic articles vs. printed monographs, and what issues need to be considered for retrieval tests in the library context. Practical implications The framework can be used as a guideline for conducting retrieval effectiveness studies in the library context. Originality/value Although a considerable amount of research has been done on information retrieval evaluation, and standards for conducting retrieval effectiveness studies do exist, to our knowledge this is the first attempt to provide a systematic framework for evaluating the retrieval effectiveness of twenty-first-century library information systems. We demonstrate which issues must be considered and what decisions must be made by researchers prior to a retrieval test.
  18. Crestani, F.; Rijsbergen, C.J. van: Information retrieval by imaging (1996) 0.04
    0.037239123 = product of:
      0.093097806 = sum of:
        0.07428389 = weight(_text_:retrieval in 6967) [ClassicSimilarity], result of:
          0.07428389 = score(doc=6967,freq=14.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.5305404 = fieldWeight in 6967, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=6967)
        0.01881392 = product of:
          0.03762784 = sum of:
            0.03762784 = weight(_text_:22 in 6967) [ClassicSimilarity], result of:
              0.03762784 = score(doc=6967,freq=2.0), product of:
                0.16209066 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04628742 = queryNorm
                0.23214069 = fieldWeight in 6967, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6967)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Explains briefly what constitutes the imaging process and how imaging can be used in information retrieval. Proposes an approach based on the concept 'a term is a possible world', which enables the exploitation of term-to-term relationships estimated using an information-theoretic measure. Reports results of an evaluation exercise comparing the performance of imaging retrieval, using possible-world semantics, with a benchmark, using the Cranfield 2 document collection to measure precision and recall. Initially the performance of imaging retrieval was seen to be better, but statistical analysis proved that the difference was not significant. The problem with imaging retrieval lies in the amount of computation that needs to be performed at run time, and a later experiment investigated the possibility of reducing this amount. Notes lines of further investigation.
    Source
    Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
  19. ¬The Fifth Text Retrieval Conference (TREC-5) (1997) 0.04
    0.035970207 = product of:
      0.08992552 = sum of:
        0.064840294 = weight(_text_:retrieval in 3087) [ClassicSimilarity], result of:
          0.064840294 = score(doc=3087,freq=6.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.46309367 = fieldWeight in 3087, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=3087)
        0.025085226 = product of:
          0.05017045 = sum of:
            0.05017045 = weight(_text_:22 in 3087) [ClassicSimilarity], result of:
              0.05017045 = score(doc=3087,freq=2.0), product of:
                0.16209066 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04628742 = queryNorm
                0.30952093 = fieldWeight in 3087, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3087)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Proceedings of the 5th TREC conference held in Gaithersburg, Maryland, Nov 20-22, 1996. The aim of the conference was discussion of retrieval techniques for large test collections. Different research groups used different techniques, such as automated thesauri, term weighting, natural language techniques, relevance feedback and advanced pattern matching, for information retrieval from the same large database. This procedure makes it possible to compare the results. The proceedings include papers, tables of the system results, and brief system descriptions including timing and storage information.
  20. Agata, T.: ¬A measure for evaluating search engines on the World Wide Web : retrieval test with ESL (Expected Search Length) (1997) 0.04
    0.035533555 = product of:
      0.08883388 = sum of:
        0.056153342 = weight(_text_:retrieval in 3892) [ClassicSimilarity], result of:
          0.056153342 = score(doc=3892,freq=2.0), product of:
            0.14001551 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.04628742 = queryNorm
            0.40105087 = fieldWeight in 3892, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=3892)
        0.03268054 = product of:
          0.06536108 = sum of:
            0.06536108 = weight(_text_:web in 3892) [ClassicSimilarity], result of:
              0.06536108 = score(doc=3892,freq=2.0), product of:
                0.15105948 = queryWeight, product of:
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.04628742 = queryNorm
                0.43268442 = fieldWeight in 3892, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.2635105 = idf(docFreq=4597, maxDocs=44218)
                  0.09375 = fieldNorm(doc=3892)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
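    The ESL of the title is Cooper's expected search length; a simplified sketch that ignores the tie-averaging of the full definition (ranking and judgments are hypothetical):

      def search_length(ranking, relevant, wanted=1):
          """Non-relevant documents inspected before the `wanted`-th
          relevant one is reached in the ranked list."""
          found = 0
          nonrelevant_seen = 0
          for doc in ranking:
              if doc in relevant:
                  found += 1
                  if found == wanted:
                      return nonrelevant_seen
              else:
                  nonrelevant_seen += 1
          return None  # fewer than `wanted` relevant documents in the list

      print(search_length(["d4", "d1", "d9", "d2"],
                          relevant={"d1", "d2"}, wanted=2))  # 2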
    

Types

  • a 359
  • s 15
  • m 8
  • el 6
  • r 3
  • x 2
  • d 1
  • p 1