Search (271 results, page 1 of 14)

  • theme_ss:"Retrievalalgorithmen"
  1. Smeaton, A.F.; Rijsbergen, C.J. van: ¬The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.12
    0.11607331 = product of:
      0.17410997 = sum of:
        0.12554102 = weight(_text_:retrieval in 2134) [ClassicSimilarity], result of:
          0.12554102 = score(doc=2134,freq=6.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.8104139 = fieldWeight in 2134, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.109375 = fieldNorm(doc=2134)
        0.048568945 = product of:
          0.09713789 = sum of:
            0.09713789 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
              0.09713789 = score(doc=2134,freq=2.0), product of:
                0.17933317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211275 = queryNorm
                0.5416616 = fieldWeight in 2134, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2134)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Date
    30. 3.2001 13:32:22
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
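The score breakdown above is Lucene "explain" output for the ClassicSimilarity (TF-IDF) model. As a sketch, assuming Lucene's classic formulas (tf = √freq, idf = 1 + ln(maxDocs/(docFreq+1))) and copying the norms from the tree, the top entry's score can be reproduced:

```python
import math

# Sketch reproducing the ClassicSimilarity breakdown shown above; the
# tf/idf shapes are Lucene's classic formulas, the norms are taken
# directly from the explain tree.

def tf(freq):
    return math.sqrt(freq)

def idf(doc_freq, max_docs):
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    i = idf(doc_freq, max_docs)
    query_weight = i * query_norm             # idf * queryNorm
    field_weight = tf(freq) * i * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

MAX_DOCS, QUERY_NORM, FIELD_NORM = 44218, 0.051211275, 0.109375

# term "retrieval": freq=6, docFreq=5836; term "22": freq=2, docFreq=3622
s_retrieval = term_score(6, 5836, MAX_DOCS, QUERY_NORM, FIELD_NORM)
s_22 = term_score(2, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * 0.5  # coord(1/2)

total = (s_retrieval + s_22) * (2 / 3)  # coord(2/3): 2 of 3 query clauses match
```

The result agrees with the listed 0.11607331 up to float32 rounding.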
  2. Chang, C.-H.; Hsu, C.-C.: Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval (1998) 0.10
    0.1045112 = product of:
      0.1567668 = sum of:
        0.051251903 = weight(_text_:retrieval in 1319) [ClassicSimilarity], result of:
          0.051251903 = score(doc=1319,freq=4.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.33085006 = fieldWeight in 1319, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1319)
        0.10551489 = sum of:
          0.056945946 = weight(_text_:conference in 1319) [ClassicSimilarity], result of:
            0.056945946 = score(doc=1319,freq=2.0), product of:
              0.19418365 = queryWeight, product of:
                3.7918143 = idf(docFreq=2710, maxDocs=44218)
                0.051211275 = queryNorm
              0.2932582 = fieldWeight in 1319, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7918143 = idf(docFreq=2710, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1319)
          0.048568945 = weight(_text_:22 in 1319) [ClassicSimilarity], result of:
            0.048568945 = score(doc=1319,freq=2.0), product of:
              0.17933317 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051211275 = queryNorm
              0.2708308 = fieldWeight in 1319, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=1319)
      0.6666667 = coord(2/3)
    
    Date
    1. 8.1996 22:08:06
    Footnote
    Contribution to a special issue devoted to the Proceedings of the 7th International World Wide Web Conference, held 14-18 April 1998, Brisbane, Australia
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  3. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.09
    0.09222864 = product of:
      0.13834296 = sum of:
        0.08283559 = weight(_text_:retrieval in 402) [ClassicSimilarity], result of:
          0.08283559 = score(doc=402,freq=2.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.5347345 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
        0.05550737 = product of:
          0.11101474 = sum of:
            0.11101474 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.11101474 = score(doc=402,freq=2.0), product of:
                0.17933317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211275 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  4. Ravana, S.D.; Rajagopal, P.; Balakrishnan, V.: Ranking retrieval systems using pseudo relevance judgments (2015) 0.09
    0.08971586 = product of:
      0.13457379 = sum of:
        0.04483608 = weight(_text_:retrieval in 2591) [ClassicSimilarity], result of:
          0.04483608 = score(doc=2591,freq=6.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.28943354 = fieldWeight in 2591, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2591)
        0.08973771 = sum of:
          0.040675674 = weight(_text_:conference in 2591) [ClassicSimilarity], result of:
            0.040675674 = score(doc=2591,freq=2.0), product of:
              0.19418365 = queryWeight, product of:
                3.7918143 = idf(docFreq=2710, maxDocs=44218)
                0.051211275 = queryNorm
              0.20947012 = fieldWeight in 2591, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7918143 = idf(docFreq=2710, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2591)
          0.049062043 = weight(_text_:22 in 2591) [ClassicSimilarity], result of:
            0.049062043 = score(doc=2591,freq=4.0), product of:
              0.17933317 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051211275 = queryNorm
              0.27358043 = fieldWeight in 2591, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2591)
      0.6666667 = coord(2/3)
    
    Abstract
    Purpose In a system-based approach, replicating the web would require large test collections, and having human assessors judge the relevance of every document per topic is infeasible. The sheer number of documents to be judged also introduces errors arising from assessor disagreement. The paper aims to discuss these issues. Design/methodology/approach This study explores exponential variation and document ranking methods that generate a reliable set of relevance judgments (pseudo relevance judgments) to reduce human effort. These methods overcome the problem of judging large numbers of documents while avoiding errors from human disagreement during the judgment process. The study exploits two key factors to generate the alternate methods: the number of occurrences of each document per topic across all system runs, and document rankings. Findings The effectiveness of the proposed methods is evaluated by correlating the rankings of systems by mean average precision under the original Text REtrieval Conference (TREC) relevance judgments with the rankings under the pseudo relevance judgments. The results suggest that the proposed document ranking method with a pool depth of 100 is a reliable alternative that reduces the human effort and disagreement errors involved in generating TREC-like relevance judgments. Originality/value The simple methods proposed here improve the correlation coefficient when generating alternate relevance judgments without human assessors, contributing to information retrieval evaluation.
    Date
    20. 1.2015 18:30:22
    18. 9.2018 18:22:56
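A minimal sketch of the occurrence-counting idea from the abstract above, with made-up runs and a simple majority-vote pool standing in for the paper's pool-depth methods:

```python
from collections import Counter

# Sketch only: documents returned by many systems for a topic are taken
# as pseudo-relevant, removing the human assessor. All run data below is
# made up; the paper's actual methods also use pool depth and rank.

runs = {                        # system -> ranked result list for one topic
    "sysA": ["d1", "d2", "d3", "d4"],
    "sysB": ["d2", "d1", "d5", "d3"],
    "sysC": ["d2", "d6", "d1", "d7"],
}

# pseudo-qrels: documents retrieved by a majority of systems
counts = Counter(doc for ranked in runs.values() for doc in ranked)
pseudo_qrels = {doc for doc, n in counts.items() if n >= 2}

def average_precision(ranked, relevant):
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / max(len(relevant), 1)

# systems are then ranked by average precision against the pseudo-qrels;
# the paper validates such rankings by their correlation with rankings
# under the official TREC judgments
scores = {s: average_precision(r, pseudo_qrels) for s, r in runs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```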
  5. Klas, C.-P.; Fuhr, N.; Schaefer, A.: Evaluating strategic support for information access in the DAFFODIL system (2004) 0.09
    0.08958103 = product of:
      0.13437153 = sum of:
        0.043930206 = weight(_text_:retrieval in 2419) [ClassicSimilarity], result of:
          0.043930206 = score(doc=2419,freq=4.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.2835858 = fieldWeight in 2419, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2419)
        0.09044133 = sum of:
          0.04881081 = weight(_text_:conference in 2419) [ClassicSimilarity], result of:
            0.04881081 = score(doc=2419,freq=2.0), product of:
              0.19418365 = queryWeight, product of:
                3.7918143 = idf(docFreq=2710, maxDocs=44218)
                0.051211275 = queryNorm
              0.25136417 = fieldWeight in 2419, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7918143 = idf(docFreq=2710, maxDocs=44218)
                0.046875 = fieldNorm(doc=2419)
          0.041630525 = weight(_text_:22 in 2419) [ClassicSimilarity], result of:
            0.041630525 = score(doc=2419,freq=2.0), product of:
              0.17933317 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051211275 = queryNorm
              0.23214069 = fieldWeight in 2419, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2419)
      0.6666667 = coord(2/3)
    
    Abstract
    The digital library system Daffodil is targeted at strategic support of users during the information search process. For searching, exploring and managing digital library objects it provides user-customisable information seeking patterns over a federation of heterogeneous digital libraries. In this paper, evaluation results with respect to retrieval effectiveness, efficiency and user satisfaction are presented. The analysis focuses on strategic support for the scientific work-flow. Daffodil supports the whole work-flow, from data source selection through information seeking to the representation, organisation and reuse of information. By embedding high-level search functionality into the scientific work-flow, the user experiences better strategic system support through a more systematic work process. These ideas have been implemented in Daffodil and evaluated qualitatively with 28 participants, ranging from information seeking novices to experts. The results are promising, as they support the chosen model.
    Date
    16.11.2008 16:22:48
    Source
    Research and advanced technology for digital libraries : 8th European conference, ECDL 2004, Bath, UK, September 12-17, 2004 : proceedings. Eds.: Heery, R. and E. Lyon
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  6. Liu, A.; Zou, Q.; Chu, W.W.: Configurable indexing and ranking for XML information retrieval (2004) 0.08
    0.07592845 = product of:
      0.113892674 = sum of:
        0.073217005 = weight(_text_:retrieval in 4114) [ClassicSimilarity], result of:
          0.073217005 = score(doc=4114,freq=4.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.47264296 = fieldWeight in 4114, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=4114)
        0.040675674 = product of:
          0.08135135 = sum of:
            0.08135135 = weight(_text_:conference in 4114) [ClassicSimilarity], result of:
              0.08135135 = score(doc=4114,freq=2.0), product of:
                0.19418365 = queryWeight, product of:
                  3.7918143 = idf(docFreq=2710, maxDocs=44218)
                  0.051211275 = queryNorm
                0.41894025 = fieldWeight in 4114, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7918143 = idf(docFreq=2710, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4114)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Source
    SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Eds.: K. Järvelin et al.
  7. Clarke, C.L.A.; Cormack, G.V.; Burkowski, F.J.: Shortest substring ranking : multitext experiments for TREC-4 (1996) 0.07
    0.07395834 = product of:
      0.110937506 = sum of:
        0.062126692 = weight(_text_:retrieval in 549) [ClassicSimilarity], result of:
          0.062126692 = score(doc=549,freq=2.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.40105087 = fieldWeight in 549, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=549)
        0.04881081 = product of:
          0.09762162 = sum of:
            0.09762162 = weight(_text_:conference in 549) [ClassicSimilarity], result of:
              0.09762162 = score(doc=549,freq=2.0), product of:
                0.19418365 = queryWeight, product of:
                  3.7918143 = idf(docFreq=2710, maxDocs=44218)
                  0.051211275 = queryNorm
                0.50272834 = fieldWeight in 549, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7918143 = idf(docFreq=2710, maxDocs=44218)
                  0.09375 = fieldNorm(doc=549)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Source
    The Fourth Text REtrieval Conference (TREC-4). Ed.: D.K. Harman
  8. Savoy, J.; Ndarugendamwo, M.; Vrajitoru, D.: Report on the TREC-4 experiment : combining probabilistic and vector-space schemes (1996) 0.07
    0.07395834 = product of:
      0.110937506 = sum of:
        0.062126692 = weight(_text_:retrieval in 7574) [ClassicSimilarity], result of:
          0.062126692 = score(doc=7574,freq=2.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.40105087 = fieldWeight in 7574, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=7574)
        0.04881081 = product of:
          0.09762162 = sum of:
            0.09762162 = weight(_text_:conference in 7574) [ClassicSimilarity], result of:
              0.09762162 = score(doc=7574,freq=2.0), product of:
                0.19418365 = queryWeight, product of:
                  3.7918143 = idf(docFreq=2710, maxDocs=44218)
                  0.051211275 = queryNorm
                0.50272834 = fieldWeight in 7574, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7918143 = idf(docFreq=2710, maxDocs=44218)
                  0.09375 = fieldNorm(doc=7574)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Source
    The Fourth Text REtrieval Conference (TREC-4). Ed.: D.K. Harman
  9. Belkin, N.J.; Cool, C.; Koenemann, J.; Ng, K.B.; Park, S.: Using relevance feedback and ranking in interactive searching (1996) 0.07
    0.07395834 = product of:
      0.110937506 = sum of:
        0.062126692 = weight(_text_:retrieval in 7588) [ClassicSimilarity], result of:
          0.062126692 = score(doc=7588,freq=2.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.40105087 = fieldWeight in 7588, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=7588)
        0.04881081 = product of:
          0.09762162 = sum of:
            0.09762162 = weight(_text_:conference in 7588) [ClassicSimilarity], result of:
              0.09762162 = score(doc=7588,freq=2.0), product of:
                0.19418365 = queryWeight, product of:
                  3.7918143 = idf(docFreq=2710, maxDocs=44218)
                  0.051211275 = queryNorm
                0.50272834 = fieldWeight in 7588, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7918143 = idf(docFreq=2710, maxDocs=44218)
                  0.09375 = fieldNorm(doc=7588)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Source
    The Fourth Text REtrieval Conference (TREC-4). Ed.: D.K. Harman
  10. Faloutsos, C.: Signature files (1992) 0.07
    0.06632762 = product of:
      0.09949142 = sum of:
        0.07173773 = weight(_text_:retrieval in 3499) [ClassicSimilarity], result of:
          0.07173773 = score(doc=3499,freq=6.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.46309367 = fieldWeight in 3499, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
        0.027753685 = product of:
          0.05550737 = sum of:
            0.05550737 = weight(_text_:22 in 3499) [ClassicSimilarity], result of:
              0.05550737 = score(doc=3499,freq=2.0), product of:
                0.17933317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211275 = queryNorm
                0.30952093 = fieldWeight in 3499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3499)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Presents a survey and discussion of signature-based text retrieval methods. It describes the main idea behind the signature approach and its advantages over other text retrieval methods, provides a classification of the signature methods that have appeared in the literature, describes the main representatives of each class together with their relative advantages and drawbacks, and gives a list of applications as well as commercial or university prototypes that use the signature approach
    Date
    7. 5.1999 15:22:48
    Source
    Information retrieval: data structures and algorithms. Ed.: W.B. Frakes u. R. Baeza-Yates
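A minimal sketch of superimposed coding, the core signature-file idea described in the abstract above (bit-string width, bits per word, and the hash construction are arbitrary choices here, not taken from the survey):

```python
import hashlib

# Sketch of superimposed coding: each word sets a few bits in a fixed-size
# bit string; a block's signature is the OR of its word signatures.
# Querying a signature can yield false drops but never false dismissals.

SIG_BITS, BITS_PER_WORD = 64, 3

def word_signature(word):
    sig = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.md5(f"{word}:{i}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return sig

def block_signature(words):
    sig = 0
    for w in words:
        sig |= word_signature(w)
    return sig

def maybe_contains(block_sig, word):
    # a block can only contain the word if all of the word's bits are set
    w = word_signature(word)
    return block_sig & w == w

blocks = [["signature", "files", "text"], ["query", "expansion"]]
sigs = [block_signature(b) for b in blocks]
matches = [i for i, s in enumerate(sigs) if maybe_contains(s, "signature")]
```

Only blocks in `matches` need to be fetched and scanned; the occasional false drop is filtered out at that stage.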
  11. Losada, D.E.; Barreiro, A.: Embedding term similarity and inverse document frequency into a logical model of information retrieval (2003) 0.07
    0.06632762 = product of:
      0.09949142 = sum of:
        0.07173773 = weight(_text_:retrieval in 1422) [ClassicSimilarity], result of:
          0.07173773 = score(doc=1422,freq=6.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.46309367 = fieldWeight in 1422, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=1422)
        0.027753685 = product of:
          0.05550737 = sum of:
            0.05550737 = weight(_text_:22 in 1422) [ClassicSimilarity], result of:
              0.05550737 = score(doc=1422,freq=2.0), product of:
                0.17933317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211275 = queryNorm
                0.30952093 = fieldWeight in 1422, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1422)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    We propose a novel approach to incorporate term similarity and inverse document frequency into a logical model of information retrieval. The ability of the logic to handle expressive representations along with the use of such classical notions are promising characteristics for IR systems. The approach proposed here has been efficiently implemented and experiments against test collections are presented.
    Date
    22. 3.2003 19:27:23
    Footnote
    Beitrag eines Themenheftes: Mathematical, logical, and formal methods in information retrieval
  12. Crestani, F.; Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue (2003) 0.06
    0.064603075 = product of:
      0.096904606 = sum of:
        0.076089345 = weight(_text_:retrieval in 1451) [ClassicSimilarity], result of:
          0.076089345 = score(doc=1451,freq=12.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.49118498 = fieldWeight in 1451, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1451)
        0.020815263 = product of:
          0.041630525 = sum of:
            0.041630525 = weight(_text_:22 in 1451) [ClassicSimilarity], result of:
              0.041630525 = score(doc=1451,freq=2.0), product of:
                0.17933317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211275 = queryNorm
                0.23214069 = fieldWeight in 1451, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1451)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Research on the use of mathematical, logical, and formal methods has been central to Information Retrieval research for a long time. Research in this area is important not only because it helps enhance retrieval effectiveness, but also because it helps clarify the underlying concepts of Information Retrieval. In this article we outline some of the major aspects of the subject, and summarize the papers of this special issue with respect to how they relate to these aspects. We conclude by highlighting some directions of future research, which are needed to better understand the formal characteristics of Information Retrieval.
    Date
    22. 3.2003 19:27:36
    Footnote
    Einführung zu den Beiträgen eines Themenheftes: Mathematical, logical, and formal methods in information retrieval
  13. Yu, K.; Tresp, V.; Yu, S.: ¬A nonparametric hierarchical Bayesian framework for information filtering (2004) 0.06
    0.061631948 = product of:
      0.09244792 = sum of:
        0.051772244 = weight(_text_:retrieval in 4117) [ClassicSimilarity], result of:
          0.051772244 = score(doc=4117,freq=2.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.33420905 = fieldWeight in 4117, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.078125 = fieldNorm(doc=4117)
        0.040675674 = product of:
          0.08135135 = sum of:
            0.08135135 = weight(_text_:conference in 4117) [ClassicSimilarity], result of:
              0.08135135 = score(doc=4117,freq=2.0), product of:
                0.19418365 = queryWeight, product of:
                  3.7918143 = idf(docFreq=2710, maxDocs=44218)
                  0.051211275 = queryNorm
                0.41894025 = fieldWeight in 4117, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7918143 = idf(docFreq=2710, maxDocs=44218)
                  0.078125 = fieldNorm(doc=4117)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Source
    SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Eds.: K. Järvelin et al.
  14. Burgin, R.: ¬The retrieval effectiveness of 5 clustering algorithms as a function of indexing exhaustivity (1995) 0.06
    0.060375374 = product of:
      0.09056306 = sum of:
        0.073217005 = weight(_text_:retrieval in 3365) [ClassicSimilarity], result of:
          0.073217005 = score(doc=3365,freq=16.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.47264296 = fieldWeight in 3365, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3365)
        0.017346052 = product of:
          0.034692105 = sum of:
            0.034692105 = weight(_text_:22 in 3365) [ClassicSimilarity], result of:
              0.034692105 = score(doc=3365,freq=2.0), product of:
                0.17933317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211275 = queryNorm
                0.19345059 = fieldWeight in 3365, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3365)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The retrieval effectiveness of 5 hierarchical clustering methods (single link, complete link, group average, Ward's method, and weighted average) is examined as a function of indexing exhaustivity with 4 test collections (CR, Cranfield, Medlars, and Time). Evaluations of retrieval effectiveness, based on 3 measures of optimal retrieval performance, confirm earlier findings that the performance of a retrieval system based on single link clustering varies as a function of indexing exhaustivity, but fail to find similar patterns for the other clustering methods. The data also confirm earlier findings regarding the poor performance of single link clustering in a retrieval environment, which appears to derive from that method's tendency to produce a small number of large, ill-defined document clusters. By contrast, the data examined here found the retrieval performance of the other clustering methods to be generally comparable. The data presented also provide an opportunity to examine the theoretical limits of cluster-based retrieval and to compare these theoretical limits to the effectiveness of operational implementations. Performance standards of the 4 document collections examined were found to vary widely, and the effectiveness of operational implementations was found to be in the range defined as unacceptable. Further improvements in search strategies and document representations warrant investigation.
    Date
    22. 2.1996 11:20:06
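The chaining behaviour blamed above for single link's poor retrieval performance is easy to reproduce. A minimal pure-Python sketch, with toy 1-D points standing in for document vectors (both the data and the stopping rule are made up for illustration):

```python
# Single-link agglomerative clustering, one of the five methods compared
# in the study above, in its simplest (naive O(n^3)) form.

def single_link(points, stop_at):
    clusters = [[p] for p in points]
    while len(clusters) > stop_at:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single link: cluster distance = closest pair of members
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)
    return clusters

# The chaining effect: nearby points are absorbed link by link into one
# large, elongated cluster, leaving an outlier as a singleton - the
# "small number of large, ill-defined clusters" noted in the abstract.
clusters = single_link([1, 2, 3, 10, 11, 30], stop_at=2)
```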
  15. MacFarlane, A.; Robertson, S.E.; McCann, J.A.: Parallel computing for passage retrieval (2004) 0.06
    0.05755153 = product of:
      0.08632729 = sum of:
        0.058573607 = weight(_text_:retrieval in 5108) [ClassicSimilarity], result of:
          0.058573607 = score(doc=5108,freq=4.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.37811437 = fieldWeight in 5108, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=5108)
        0.027753685 = product of:
          0.05550737 = sum of:
            0.05550737 = weight(_text_:22 in 5108) [ClassicSimilarity], result of:
              0.05550737 = score(doc=5108,freq=2.0), product of:
                0.17933317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211275 = queryNorm
                0.30952093 = fieldWeight in 5108, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5108)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    In this paper, methods both for speeding up passage processing and for examining more passages using parallel computers are explored. The number of passages processed is varied in order to examine the effect on retrieval effectiveness and efficiency. The particular algorithm applied has previously been used to good effect in Okapi experiments at TREC. This algorithm and the mechanism for applying parallel computing to speed up processing are described.
    Date
    20. 1.2007 18:30:22
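A sketch of the general idea only: score fixed-size word-window passages in parallel and keep the best per document. The toy overlap score and window parameters are stand-ins, not the Okapi weighting the paper actually uses:

```python
from concurrent.futures import ThreadPoolExecutor

# Hedged sketch: passage retrieval as best-window selection, parallelised
# across documents. All data and the scoring function are made up.

def passages(words, size=4, step=2):
    # overlapping fixed-size word windows
    return [words[i:i + size]
            for i in range(0, max(len(words) - size, 0) + 1, step)]

def score(passage, query):
    return len(set(passage) & set(query))   # toy overlap score

def best_passage(words, query):
    return max((score(p, query), p) for p in passages(words))

docs = ["parallel computing methods for passage retrieval at trec".split(),
        "signature files for text search".split()]
query = ["passage", "retrieval"]

with ThreadPoolExecutor() as pool:   # score documents concurrently
    results = list(pool.map(lambda d: best_passage(d, query), docs))
```

Processing more windows per document improves the chance of finding the best passage at the cost of more work, which is the effectiveness/efficiency trade-off the paper varies.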
  16. Witschel, H.F.: Global term weights in distributed environments (2008) 0.06
    0.05529464 = product of:
      0.08294196 = sum of:
        0.062126692 = weight(_text_:retrieval in 2096) [ClassicSimilarity], result of:
          0.062126692 = score(doc=2096,freq=8.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.40105087 = fieldWeight in 2096, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2096)
        0.020815263 = product of:
          0.041630525 = sum of:
            0.041630525 = weight(_text_:22 in 2096) [ClassicSimilarity], result of:
              0.041630525 = score(doc=2096,freq=2.0), product of:
                0.17933317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211275 = queryNorm
                0.23214069 = fieldWeight in 2096, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2096)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    This paper examines the estimation of global term weights (such as IDF) in information retrieval scenarios where a global view on the collection is not available. In particular, the two options of either sampling documents or of using a reference corpus independent of the target retrieval collection are compared using standard IR test collections. In addition, the possibility of pruning term lists based on frequency is evaluated. The results show that very good retrieval performance can be reached when just the most frequent terms of a collection - an "extended stop word list" - are known and all terms which are not in that list are treated equally. However, the list cannot always be fully estimated from a general-purpose reference corpus, but some "domain-specific stop words" need to be added. A good solution for achieving this is to mix estimates from small samples of the target retrieval collection with ones derived from a reference corpus.
    Date
    1. 8.2008 9:44:22
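The sampling option that Witschel's abstract discusses can be sketched roughly as follows. This is illustrative only: `idf = log(N/df)` is assumed as the weighting variant, and `most_frequent_terms` merely stands in for the paper's "extended stop word list"; neither is necessarily the paper's exact formulation.

```python
import math
from collections import Counter

def idf_from_sample(sample_docs, term):
    """Estimate a global IDF from a small document sample when no global
    view of the collection is available (idf = log(N/df) assumed)."""
    df = sum(1 for doc in sample_docs if term in doc)
    if df == 0:
        return None  # unseen term: treated equally with all other rare terms
    return math.log(len(sample_docs) / df)

def most_frequent_terms(sample_docs, k):
    """A stand-in for the 'extended stop word list': keep individual
    weights only for the top-k most frequent terms of the sample."""
    counts = Counter(t for doc in sample_docs for t in doc)
    return {t for t, _ in counts.most_common(k)}
```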
  17. Dang, E.K.F.; Luk, R.W.P.; Allan, J.: Beyond bag-of-words : bigram-enhanced context-dependent term weights (2014) 0.05
    0.05214731 = product of:
      0.07822096 = sum of:
        0.05788313 = weight(_text_:retrieval in 1283) [ClassicSimilarity], result of:
          0.05788313 = score(doc=1283,freq=10.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.37365708 = fieldWeight in 1283, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1283)
        0.020337837 = product of:
          0.040675674 = sum of:
            0.040675674 = weight(_text_:conference in 1283) [ClassicSimilarity], result of:
              0.040675674 = score(doc=1283,freq=2.0), product of:
                0.19418365 = queryWeight, product of:
                  3.7918143 = idf(docFreq=2710, maxDocs=44218)
                  0.051211275 = queryNorm
                0.20947012 = fieldWeight in 1283, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7918143 = idf(docFreq=2710, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1283)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    While term independence is a widely held assumption in most of the established information retrieval approaches, it is clearly not true and various works in the past have investigated a relaxation of the assumption. One approach is to use n-grams in document representation instead of unigrams. However, the majority of early works on n-grams obtained only modest performance improvement. On the other hand, the use of information based on supporting terms or "contexts" of queries has been found to be promising. In particular, recent studies showed that using new context-dependent term weights improved the performance of relevance feedback (RF) retrieval compared with using traditional bag-of-words BM25 term weights. Calculation of the new term weights requires an estimation of the local probability of relevance of each query term occurrence. In previous studies, the estimation of this probability was based on unigrams that occur in the neighborhood of a query term. We explore an integration of the n-gram and context approaches by computing context-dependent term weights based on a mixture of unigrams and bigrams. Extensive experiments are performed using the title queries of the Text Retrieval Conference (TREC)-6, TREC-7, TREC-8, and TREC-2005 collections, for RF with relevance judgment of either the top 10 or top 20 documents of an initial retrieval. We identify some crucial elements needed in the use of bigrams in our methods, such as proper inverse document frequency (IDF) weighting of the bigrams and noise reduction by pruning bigrams with large document frequency values. We show that enhancing context-dependent term weights with bigrams is effective in further improving retrieval performance.
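The noise-reduction step the abstract names, pruning bigrams with large document-frequency values, can be sketched as follows (a rough illustration; `max_df` is a hypothetical tuning parameter, not a value from the paper):

```python
from collections import Counter

def bigrams(tokens):
    """Adjacent-token bigrams of one tokenized document."""
    return list(zip(tokens, tokens[1:]))

def prune_bigrams(tokenized_docs, max_df):
    """Keep only bigrams whose document frequency does not exceed max_df,
    dropping the high-DF bigrams the paper treats as noise."""
    df = Counter()
    for toks in tokenized_docs:
        for bg in set(bigrams(toks)):  # count each bigram once per document
            df[bg] += 1
    return {bg for bg, n in df.items() if n <= max_df}
```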
  18. Ding, Y.; Chowdhury, G.; Foo, S.: Organising keywords in a Web search environment : a methodology based on co-word analysis (2000) 0.05
    0.052139133 = product of:
      0.0782087 = sum of:
        0.0538033 = weight(_text_:retrieval in 105) [ClassicSimilarity], result of:
          0.0538033 = score(doc=105,freq=6.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.34732026 = fieldWeight in 105, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=105)
        0.024405405 = product of:
          0.04881081 = sum of:
            0.04881081 = weight(_text_:conference in 105) [ClassicSimilarity], result of:
              0.04881081 = score(doc=105,freq=2.0), product of:
                0.19418365 = queryWeight, product of:
                  3.7918143 = idf(docFreq=2710, maxDocs=44218)
                  0.051211275 = queryNorm
                0.25136417 = fieldWeight in 105, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.7918143 = idf(docFreq=2710, maxDocs=44218)
                  0.046875 = fieldNorm(doc=105)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The rapid development of the Internet and the World Wide Web has caused some critical problems for information retrieval. Researchers have made several attempts to solve these problems. Thesauri and subject heading lists, as traditional information retrieval tools, have been criticised for their inefficiency in tackling these newly emerging problems. This paper proposes an information retrieval tool generated by co-word analysis, comprising keyword clusters with relationships based on the co-occurrences of keywords in the literature. Such a tool can play the role of an associative thesaurus that can provide information about the keywords in a domain that might be useful for information searching and query expansion
    Source
    Dynamism and stability in knowledge organization: Proceedings of the 6th International ISKO-Conference, 10-13 July 2000, Toronto, Canada. Ed.: C. Beghtol et al
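The keyword co-occurrence counts underlying such an associative thesaurus can be sketched as follows (the paper's clustering step itself is not reproduced here):

```python
from collections import Counter
from itertools import combinations

def coword_counts(keyword_sets):
    """Count pairwise keyword co-occurrences across documents - the raw
    material for co-word keyword clusters. Pairs are stored in sorted
    order so (a, b) and (b, a) are counted together."""
    pairs = Counter()
    for kws in keyword_sets:
        for a, b in combinations(sorted(kws), 2):
            pairs[(a, b)] += 1
    return pairs
```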
  19. Dominich, S.: Mathematical foundations of information retrieval (2001) 0.05
    0.05015279 = product of:
      0.07522918 = sum of:
        0.05788313 = weight(_text_:retrieval in 1753) [ClassicSimilarity], result of:
          0.05788313 = score(doc=1753,freq=10.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.37365708 = fieldWeight in 1753, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1753)
        0.017346052 = product of:
          0.034692105 = sum of:
            0.034692105 = weight(_text_:22 in 1753) [ClassicSimilarity], result of:
              0.034692105 = score(doc=1753,freq=2.0), product of:
                0.17933317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211275 = queryNorm
                0.19345059 = fieldWeight in 1753, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1753)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    This book offers a comprehensive and consistent mathematical approach to information retrieval (IR), without which no implementation is possible, and sheds entirely new light on the structure of IR models. It contains descriptions of all IR models in a unified formal style and language, along with examples for each, thus offering a comprehensive overview of them. The book also creates mathematical foundations and a consistent mathematical theory (including all mathematical results achieved so far) of IR as a stand-alone mathematical discipline, which can thus be read and taught independently. It also contains all the necessary mathematical knowledge on which IR relies, sparing the reader a search through different sources. The book will be of interest to computer and information scientists, librarians, mathematicians, undergraduate students, and researchers whose work involves information retrieval.
    Date
    22. 3.2008 12:26:32
    LCSH
    Information storage and retrieval
    Subject
    Information storage and retrieval
  20. Joss, M.W.; Wszola, S.: ¬The engines that can : text search and retrieval software, their strategies, and vendors (1996) 0.05
    0.04974571 = product of:
      0.07461856 = sum of:
        0.0538033 = weight(_text_:retrieval in 5123) [ClassicSimilarity], result of:
          0.0538033 = score(doc=5123,freq=6.0), product of:
            0.15490976 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051211275 = queryNorm
            0.34732026 = fieldWeight in 5123, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=5123)
        0.020815263 = product of:
          0.041630525 = sum of:
            0.041630525 = weight(_text_:22 in 5123) [ClassicSimilarity], result of:
              0.041630525 = score(doc=5123,freq=2.0), product of:
                0.17933317 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051211275 = queryNorm
                0.23214069 = fieldWeight in 5123, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5123)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Traces the development of text searching and retrieval software designed to cope with the increasing demands made by the storage and handling of large amounts of data, from CD-ROM to multi-gigabyte storage media and online information services, with particular reference to the need to cope with graphics as well as conventional ASCII text. Includes details of: Boolean searching; fuzzy searching and matching; relevance ranking; proximity searching; and improved strategies for dealing with text searching in very large databases. Concludes that the best searching tools for CD-ROM publishers are those optimized for searching and retrieval on CD-ROM. CD-ROM drives have much slower random seek times than hard discs, so the software most appropriate to the medium is that which can effectively arrange the indexes and text on the CD-ROM to avoid continuous random-access searching. Lists and reviews a selection of software packages designed to achieve the sort of results required for rapid CD-ROM searching
    Date
    12. 9.1996 13:56:22
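The Boolean searching the review covers rests on an inverted index; a minimal sketch follows (illustrative only - production CD-ROM engines lay such indexes out contiguously on disc precisely to avoid the slow random seeks discussed above):

```python
def build_index(docs):
    """Build a minimal inverted index: term -> set of document ids."""
    index = {}
    for doc_id, text in enumerate(docs):
        for term in set(text.lower().split()):
            index.setdefault(term, set()).add(doc_id)
    return index

def boolean_and(index, *terms):
    """Boolean AND: documents containing every query term."""
    sets = [index.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()

docs = ["text search and retrieval",
        "fuzzy search matching",
        "relevance ranking"]
idx = build_index(docs)
```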

Types

  • a 247
  • m 12
  • el 5
  • s 5
  • r 4
  • p 2
  • x 2
  • d 1