Search (102 results, page 1 of 6)

  • theme_ss:"Retrievalstudien"
  1. Ng, K.B.; Loewenstern, D.; Basu, C.; Hirsh, H.; Kantor, P.B.: Data fusion of machine-learning methods for the TREC5 routing task (and other work) (1997) 0.03
    0.03406783 = product of:
      0.06813566 = sum of:
        0.042235587 = weight(_text_:data in 3107) [ClassicSimilarity], result of:
          0.042235587 = score(doc=3107,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.34936053 = fieldWeight in 3107, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.078125 = fieldNorm(doc=3107)
        0.02590007 = product of:
          0.05180014 = sum of:
            0.05180014 = weight(_text_:22 in 3107) [ClassicSimilarity], result of:
              0.05180014 = score(doc=3107,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.38690117 = fieldWeight in 3107, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3107)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Date
    27. 2.1999 20:59:22
  2. Sievert, M.E.; McKinin, E.J.: Why full-text misses some relevant documents : an analysis of documents not retrieved by CCML or MEDIS (1989) 0.03
    0.03311137 = product of:
      0.06622274 = sum of:
        0.0506827 = weight(_text_:data in 3564) [ClassicSimilarity], result of:
          0.0506827 = score(doc=3564,freq=8.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.4192326 = fieldWeight in 3564, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=3564)
        0.015540041 = product of:
          0.031080082 = sum of:
            0.031080082 = weight(_text_:22 in 3564) [ClassicSimilarity], result of:
              0.031080082 = score(doc=3564,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.23214069 = fieldWeight in 3564, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3564)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Searches conducted as part of the MEDLINE/Full-Text Research Project revealed that the full-text data bases of clinical medical journal articles (CCML (Comprehensive Core Medical Library) from BRS Information Technologies, and MEDIS from Mead Data Central) did not retrieve all the relevant citations. An analysis of the data indicated that 204 relevant citations were retrieved only by MEDLINE. A comparison of the strategies used on the full-text data bases with the text of the articles of these 204 citations revealed that 2 reasons contributed to these failures. The searcher often constructed a restrictive strategy which resulted in the loss of relevant documents; and as in other kinds of retrieval, the problems of natural language caused the loss of relevant documents.
    Date
    9. 1.1996 10:22:31
  3. Smithson, S.: Information retrieval evaluation in practice : a case study approach (1994) 0.03
    0.029970573 = product of:
      0.059941147 = sum of:
        0.041811097 = weight(_text_:data in 7302) [ClassicSimilarity], result of:
          0.041811097 = score(doc=7302,freq=4.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.34584928 = fieldWeight in 7302, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7302)
        0.01813005 = product of:
          0.0362601 = sum of:
            0.0362601 = weight(_text_:22 in 7302) [ClassicSimilarity], result of:
              0.0362601 = score(doc=7302,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.2708308 = fieldWeight in 7302, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7302)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The evaluation of information retrieval systems is an important yet difficult operation. This paper describes an exploratory evaluation study that takes an interpretive approach to evaluation. The longitudinal study examines evaluation through the information-seeking behaviour of 22 case studies of 'real' users. The eclectic approach to data collection produced behavioural data that is compared with relevance judgements and satisfaction ratings. The study demonstrates considerable variations among the cases, among different evaluation measures within the same case, and among the same measures at different stages within a single case. It is argued that those involved in evaluation should be aware of the difficulties, and base any evaluation on a good understanding of the cases in question.
  4. Larsen, B.; Ingwersen, P.; Lund, B.: Data fusion according to the principle of polyrepresentation (2009) 0.03
    0.02587114 = product of:
      0.05174228 = sum of:
        0.041382253 = weight(_text_:data in 2752) [ClassicSimilarity], result of:
          0.041382253 = score(doc=2752,freq=12.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.342302 = fieldWeight in 2752, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=2752)
        0.010360028 = product of:
          0.020720055 = sum of:
            0.020720055 = weight(_text_:22 in 2752) [ClassicSimilarity], result of:
              0.020720055 = score(doc=2752,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.15476047 = fieldWeight in 2752, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2752)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    We report data fusion experiments carried out on the four best-performing retrieval models from TREC 5. Three were conceptually/algorithmically very different from one another; one was algorithmically similar to one of the former. The objective of the test was to observe the performance of the 11 logical data fusion combinations compared to the performance of the four individual models and their intermediate fusions when following the principle of polyrepresentation. This principle is based on the cognitive IR perspective (Ingwersen & Järvelin, 2005) and implies that each retrieval model is regarded as a representation of a unique interpretation of information retrieval (IR). It predicts that only fusions of very different, but equally good, IR models may outperform each constituent as well as their intermediate fusions. Two kinds of experiments were carried out. One tested restricted fusions, which entails that only the inner disjoint overlap documents between fused models are ranked. The second set of experiments was based on traditional data fusion methods. The experiments involved the 30 TREC 5 topics that contain more than 44 relevant documents. In all tests, the Borda and CombSUM scoring methods were used. Performance was measured by precision and recall, with document cutoff values (DCVs) at 100 and 15 documents, respectively. Results show that restricted fusions made of two, three, or four cognitively/algorithmically very different retrieval models perform significantly better than do the individual models at DCV100. At DCV15, however, the results of polyrepresentative fusion were less predictable. The traditional fusion method based on polyrepresentation principles demonstrates a clear picture of performance at both DCV levels and verifies the polyrepresentation predictions for data fusion in IR. Data fusion improves retrieval performance over the constituent IR models only if the models are all quite conceptually/algorithmically dissimilar and equally well performing, in that order of importance.
    Date
    22. 3.2009 18:48:28
  5. Belkin, N.J.: An overview of results from Rutgers' investigations of interactive information retrieval (1998) 0.02
    0.017033914 = product of:
      0.03406783 = sum of:
        0.021117793 = weight(_text_:data in 2339) [ClassicSimilarity], result of:
          0.021117793 = score(doc=2339,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.17468026 = fieldWeight in 2339, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2339)
        0.012950035 = product of:
          0.02590007 = sum of:
            0.02590007 = weight(_text_:22 in 2339) [ClassicSimilarity], result of:
              0.02590007 = score(doc=2339,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.19345059 = fieldWeight in 2339, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2339)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Date
    22. 9.1997 19:16:05
    Source
    Visualizing subject access for 21st century information resources: Papers presented at the 1997 Clinic on Library Applications of Data Processing, 2-4 Mar 1997, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign. Ed.: P.A. Cochrane et al
  6. Chu, H.: Factors affecting relevance judgment : a report from TREC Legal track (2011) 0.02
    0.017033914 = product of:
      0.03406783 = sum of:
        0.021117793 = weight(_text_:data in 4540) [ClassicSimilarity], result of:
          0.021117793 = score(doc=4540,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.17468026 = fieldWeight in 4540, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4540)
        0.012950035 = product of:
          0.02590007 = sum of:
            0.02590007 = weight(_text_:22 in 4540) [ClassicSimilarity], result of:
              0.02590007 = score(doc=4540,freq=2.0), product of:
                0.13388468 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03823278 = queryNorm
                0.19345059 = fieldWeight in 4540, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4540)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Purpose - This study intends to identify factors that affect relevance judgment of retrieved information as part of the 2007 TREC Legal track interactive task. Design/methodology/approach - Data were gathered and analyzed from the participants of the 2007 TREC Legal track interactive task using a questionnaire which includes not only a list of 80 relevance factors identified in prior research, but also a space for expressing their thoughts on relevance judgment in the process. Findings - This study finds that topicality remains a primary criterion, out of various options, for determining relevance, while specificity of the search request, task, or retrieved results also helps greatly in relevance judgment. Research limitations/implications - Relevance research should focus on the topicality and specificity of what is being evaluated as well as conducted in real environments. Practical implications - If multiple relevance factors are presented to assessors, the total number in a list should be below ten to take account of the limited processing capacity of human beings' short-term memory. Otherwise, the assessors might either completely ignore or inadequately consider some of the relevance factors when making judgment decisions. Originality/value - This study presents a method for reducing the artificiality of relevance research design, an apparent limitation in many related studies. Specifically, relevance judgment was made in this research as part of the 2007 TREC Legal track interactive task rather than a study devised for the sake of it. The assessors also served as searchers so that their searching experience would facilitate their subsequent relevance judgments.
    Date
    12. 7.2011 18:29:22
  7. Van der Walt, H.E.A.; Brakel, P.A. van: Method for the evaluation of the retrieval effectiveness of a CD-ROM bibliographic database (1991) 0.02
    0.016527288 = product of:
      0.06610915 = sum of:
        0.06610915 = weight(_text_:data in 3114) [ClassicSimilarity], result of:
          0.06610915 = score(doc=3114,freq=10.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.5468357 = fieldWeight in 3114, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3114)
      0.25 = coord(1/4)
    
    Abstract
    Addresses the problem of how potential users of CD-ROM data bases can objectively establish which version of the same data base is best suited for a specific situation. The problem was solved by applying the retrieval effectiveness of current on-line data base search systems as a standard measurement. Five search queries from the medical sciences were presented by experienced users of MEDLINE. Search strategies were written for both DIALOG and DATA-STAR. Search results were compared to create a recall base from documents present in both on-line searches. This recall base was then used to establish the recall and precision of 4 CD-ROM data bases: MEDLINE, Compact Cambridge MEDLINE, DIALOG OnDisc, Comprehensive MEDLINE/EBSCO.
  8. Wilbur, W.J.: Global term weights for document retrieval learned from TREC data (2001) 0.01
    0.014782455 = product of:
      0.05912982 = sum of:
        0.05912982 = weight(_text_:data in 2647) [ClassicSimilarity], result of:
          0.05912982 = score(doc=2647,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.48910472 = fieldWeight in 2647, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.109375 = fieldNorm(doc=2647)
      0.25 = coord(1/4)
    
  9. Feng, S.: A comparative study of indexing languages in single and multidatabase searching (1989) 0.01
    0.014630836 = product of:
      0.058523346 = sum of:
        0.058523346 = weight(_text_:data in 2494) [ClassicSimilarity], result of:
          0.058523346 = score(doc=2494,freq=6.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.48408815 = fieldWeight in 2494, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0625 = fieldNorm(doc=2494)
      0.25 = coord(1/4)
    
    Abstract
    An experiment was conducted using 3 data bases in library and information science - Library and Information Science Abstracts (LISA), Information Science Abstracts and ERIC - to investigate some of the main factors affecting on-line searching: effectiveness of search vocabularies, combinations of fields searched, and overlaps among databases. Natural language, controlled vocabulary and a mixture of natural language and controlled terms were tested using different fields of bibliographic records. Also discusses a comparative evaluation of single and multi-data base searching, measuring the overlap among data bases and their influence upon on-line searching.
  10. McCain, K.W.; White, H.D.; Griffith, B.C.: Comparing retrieval performance in online data bases (1987) 0.01
    0.012931953 = product of:
      0.051727813 = sum of:
        0.051727813 = weight(_text_:data in 1167) [ClassicSimilarity], result of:
          0.051727813 = score(doc=1167,freq=12.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.4278775 = fieldWeight in 1167, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1167)
      0.25 = coord(1/4)
    
    Abstract
    This study systematically compares retrievals on 11 topics across five well-known data bases, with MEDLINE's subject indexing as a focus. Each topic was posed by a researcher in the medical behavioral sciences. Each was searched in MEDLINE, EXCERPTA MEDICA, and PSYCHINFO, which permit descriptor searches, and in SCISEARCH and SOCIAL SCISEARCH, which express topics through cited references. Searches on each topic were made with (1) descriptors, (2) cited references, and (3) natural language (a capability common to all five data bases). The researchers who posed the topics judged the results. In every case, the set of records judged relevant was used to calculate recall, precision, and novelty ratios. Overall, MEDLINE had the highest recall percentage (37%), followed by SSCI (31%). All searches resulted in high precision ratios; novelty ratios of data bases and searches varied widely. Differences in record format among data bases affected the success of the natural language retrievals. Some 445 documents judged relevant were not retrieved from MEDLINE using its descriptors; they were found in MEDLINE through natural language or in an alternative data base. An analysis was performed to examine possible faults in MEDLINE subject indexing as the reason for their nonretrieval. However, no patterns of indexing failure could be seen in those documents subsequently found in MEDLINE through known-item searches. Documents not found in MEDLINE primarily represent failures of coverage - articles were from nonindexed or selectively indexed journals.
  11. Wildemuth, B.M.: Measures of success in searching a full-text fact base (1990) 0.01
    0.0128019815 = product of:
      0.051207926 = sum of:
        0.051207926 = weight(_text_:data in 2050) [ClassicSimilarity], result of:
          0.051207926 = score(doc=2050,freq=6.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.42357713 = fieldWeight in 2050, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2050)
      0.25 = coord(1/4)
    
    Abstract
    The traditional measures of online searching proficiency (recall and precision) are less appropriate when applied to the searching of full text databases. The pilot study investigated and evaluated 5 measures of overall success in searching a full text data bank. Data was drawn from INQUIRER searches conducted by medical students at North Carolina Univ. at Chapel Hill. INQUIRER is an online database of facts and concepts in microbiology. The 5 measures were: success/failure; precision; search term overlap; number of search cycles; and time per search. Concludes that the last 4 measures look promising for the evaluation of fact data bases such as INQUIRER.
  12. Kelledy, F.; Smeaton, A.F.: Thresholding the postings lists in information retrieval : experiments on TREC data (1995) 0.01
    0.0128019815 = product of:
      0.051207926 = sum of:
        0.051207926 = weight(_text_:data in 5804) [ClassicSimilarity], result of:
          0.051207926 = score(doc=5804,freq=6.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.42357713 = fieldWeight in 5804, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5804)
      0.25 = coord(1/4)
    
    Abstract
    A variety of methods for speeding up the response time of information retrieval processes have been put forward, one of which is the idea of thresholding. Thresholding relies on the data in information retrieval storage structures being organised to allow cut-off points to be used during processing. These cut-off points or thresholds are designed and used to reduce the amount of information processed and to maintain the quality or minimise the degradation of response to a user's query. TREC is an annual series of benchmarking exercises to compare indexing and retrieval techniques. Reports experiments with a portion of the TREC data where features are introduced into the retrieval process to improve response time. These features improve response time while maintaining the same level of retrieval effectiveness.
  13. Ahlgren, P.; Grönqvist, L.: Evaluation of retrieval effectiveness with incomplete relevance data : theoretical and experimental comparison of three measures (2008) 0.01
    0.0128019815 = product of:
      0.051207926 = sum of:
        0.051207926 = weight(_text_:data in 2032) [ClassicSimilarity], result of:
          0.051207926 = score(doc=2032,freq=6.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.42357713 = fieldWeight in 2032, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2032)
      0.25 = coord(1/4)
    
    Abstract
    This paper investigates two relatively new measures of retrieval effectiveness in relation to the problem of incomplete relevance data. The measures, Bpref and RankEff, which do not take into account documents that have not been relevance judged, are compared theoretically and experimentally. The experimental comparisons involve a third measure, the well-known mean uninterpolated average precision. The results indicate that RankEff is the most stable of the three measures when the amount of relevance data is reduced, with respect to system ranking and absolute values. In addition, RankEff has the lowest error-rate.
  14. Schabas, A.H.: A comparative evaluation of the retrieval effectiveness of titles, Library of Congress Subject Headings and PRECIS strings for computer searching of UK MARC data (1979) 0.01
    0.012670675 = product of:
      0.0506827 = sum of:
        0.0506827 = weight(_text_:data in 5277) [ClassicSimilarity], result of:
          0.0506827 = score(doc=5277,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.4192326 = fieldWeight in 5277, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.09375 = fieldNorm(doc=5277)
      0.25 = coord(1/4)
    
  15. Sievert, M.E.; McKinin, E.J.; Slough, M.: A comparison of indexing and full-text for the retrieval of clinical medical literature (1988) 0.01
    0.012670675 = product of:
      0.0506827 = sum of:
        0.0506827 = weight(_text_:data in 3563) [ClassicSimilarity], result of:
          0.0506827 = score(doc=3563,freq=8.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.4192326 = fieldWeight in 3563, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=3563)
      0.25 = coord(1/4)
    
    Abstract
    The availability of two full text data bases in the clinical medical journal literature, MEDIS from Mead Data Central and CCML from BRS Information Technologies, provided an opportunity to compare the efficacy of the full text with the traditional, indexed system, MEDLINE, for retrieval effectiveness. 100 searches were solicited from an academic health sciences library and the requests were searched on all 3 data bases. The results were compared and preliminary analysis suggests that the full text data bases retrieve a greater number of relevant citations and MEDLINE achieves higher precision.
  16. Savoy, J.; Calvé, A. le; Vrajitoru, D.: Report on the TREC5 experiment : data fusion and collection fusion (1997) 0.01
    0.012670675 = product of:
      0.0506827 = sum of:
        0.0506827 = weight(_text_:data in 3108) [ClassicSimilarity], result of:
          0.0506827 = score(doc=3108,freq=2.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.4192326 = fieldWeight in 3108, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.09375 = fieldNorm(doc=3108)
      0.25 = coord(1/4)
    
  17. Taghva, K.: The effects of noisy data on text retrieval (1994) 0.01
    0.011946027 = product of:
      0.04778411 = sum of:
        0.04778411 = weight(_text_:data in 7227) [ClassicSimilarity], result of:
          0.04778411 = score(doc=7227,freq=4.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.3952563 = fieldWeight in 7227, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0625 = fieldNorm(doc=7227)
      0.25 = coord(1/4)
    
    Abstract
    Reports the results of experiments on query evaluation in the presence of noisy data: in particular, an OCR-generated database and its corresponding 99.8% correct version are used to process a set of queries to determine the effect the degraded version has on retrieval. With the set of scientific documents used in the testing, the effect is insignificant. Improves the result by applying an automatic postprocessing system designed to correct the kinds of errors generated by recognition devices.
  18. Ekmekcioglu, F.C.; Robertson, A.M.; Willett, P.: Effectiveness of query expansion in ranked-output document retrieval systems (1992) 0.01
    0.011946027 = product of:
      0.04778411 = sum of:
        0.04778411 = weight(_text_:data in 5689) [ClassicSimilarity], result of:
          0.04778411 = score(doc=5689,freq=4.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.3952563 = fieldWeight in 5689, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0625 = fieldNorm(doc=5689)
      0.25 = coord(1/4)
    
    Abstract
    Reports an evaluation of 3 methods for the expansion of natural language queries in ranked output retrieval systems. The methods are based on term co-occurrence data, on Soundex codes, and on a string similarity measure. Searches for 110 queries in a data base of 26,280 titles and abstracts suggest that there is no significant difference in retrieval effectiveness between any of these methods and unexpanded searches.
  19. Ruthven, I.: Relevance behaviour in TREC (2014) 0.01
    0.011805206 = product of:
      0.047220822 = sum of:
        0.047220822 = weight(_text_:data in 1785) [ClassicSimilarity], result of:
          0.047220822 = score(doc=1785,freq=10.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.39059696 = fieldWeight in 1785, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1785)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - The purpose of this paper is to examine how various types of TREC data can be used to better understand relevance and serve as test-bed for exploring relevance. The author proposes that there are many interesting studies that can be performed on the TREC data collections that are not directly related to evaluating systems but to learning more about human judgements of information and relevance and that these studies can provide useful research questions for other types of investigation. Design/methodology/approach - Through several case studies the author shows how existing data from TREC can be used to learn more about the factors that may affect relevance judgements and interactive search decisions and answer new research questions for exploring relevance. Findings - The paper uncovers factors, such as familiarity, interest and strictness of relevance criteria, that affect the nature of relevance assessments within TREC, contrasting these against findings from user studies of relevance. Research limitations/implications - The research only considers certain uses of TREC data and assessment given by professional relevance assessors but motivates further exploration of the TREC data so that the research community can further exploit the effort involved in the construction of TREC test collections. Originality/value - The paper presents an original viewpoint on relevance investigations and TREC itself by motivating TREC as a source of inspiration on understanding relevance rather than purely as a source of evaluation material.
  20. Leiva-Mederos, A.; Senso, J.A.; Hidalgo-Delgado, Y.; Hipola, P.: Working framework of semantic interoperability for CRIS with heterogeneous data sources (2017) 0.01
    0.011174486 = product of:
      0.044697944 = sum of:
        0.044697944 = weight(_text_:data in 3706) [ClassicSimilarity], result of:
          0.044697944 = score(doc=3706,freq=14.0), product of:
            0.120893985 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03823278 = queryNorm
            0.36972845 = fieldWeight in 3706, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=3706)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - Information from Current Research Information Systems (CRIS) is stored in different formats, in platforms that are not compatible, or even in independent networks. It would be helpful to have a well-defined methodology to allow for management data processing from a single site, so as to take advantage of the capacity to link disperse data found in different systems, platforms, sources and/or formats. Based on functionalities and materials of the VLIR project, the purpose of this paper is to present a model that provides for interoperability by means of semantic alignment techniques and metadata crosswalks, and facilitates the fusion of information stored in diverse sources. Design/methodology/approach - After reviewing the state of the art regarding the diverse mechanisms for achieving semantic interoperability, the paper analyzes the following: the specific coverage of the data sets (type of data, thematic coverage and geographic coverage); the technical specifications needed to retrieve and analyze a distribution of the data set (format, protocol, etc.); the conditions of re-utilization (copyright and licenses); and the "dimensions" included in the data set as well as the semantics of these dimensions (the syntax and the taxonomies of reference). The semantic interoperability framework here presented implements semantic alignment and metadata crosswalk to convert information from three different systems (ABCD, Moodle and DSpace) to integrate all the databases in a single RDF file. Findings - The paper also includes an evaluation based on the comparison - by means of calculations of recall and precision - of the proposed model and identical consultations made on Open Archives Initiative and SQL, in order to estimate its efficiency. The results have been satisfactory enough, due to the fact that the semantic interoperability facilitates the exact retrieval of information. Originality/value - The proposed model enhances management of the syntactic and semantic interoperability of the CRIS system designed. In a real setting of use it achieves very positive results.
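
  Note on the score breakdowns: the indented blocks under each hit are Lucene "explain" output for the relevance score shown after the title. Read bottom-up, each term contribution appears to follow the classic TF-IDF (ClassicSimilarity) pattern: queryWeight = idf * queryNorm, fieldWeight = sqrt(termFreq) * idf * fieldNorm, and the per-term score is their product, with coord() factors scaling for the fraction of query clauses that matched. The following minimal Python sketch (the term_score helper and variable names are illustrative only, not part of any catalog API) recomputes the score of hit 1 (doc 3107) from the numbers shown above:

      import math

      def term_score(freq, idf, query_norm, field_norm):
          # ClassicSimilarity-style term weight: queryWeight * fieldWeight
          query_weight = idf * query_norm                 # e.g. 3.1620505 * 0.03823278 ~ 0.120894
          field_weight = math.sqrt(freq) * idf * field_norm
          return query_weight * field_weight

      query_norm = 0.03823278                             # queryNorm from the explanation
      s_data = term_score(2.0, 3.1620505, query_norm, 0.078125)  # ~0.0422 for _text_:data
      s_22   = term_score(2.0, 3.5018296, query_norm, 0.078125)  # ~0.0518 for _text_:22

      # coord(1/2) on the inner sum and coord(2/4) on the outer sum, as in the explanation
      score = (s_data + 0.5 * s_22) * 0.5
      print(round(score, 8))                              # ~0.03406783, displayed as 0.03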

Languages

  • e 96
  • d 4
  • f 1

Types

  • a 90
  • s 7
  • m 4
  • el 2
  • d 1
  • r 1