Search (257 results, page 1 of 13)

Smeaton, A.F.; Rijsbergen, C.J. van: ¬The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.05

0.049302004 = product of:
  0.147906 = sum of:
    0.147906 = product of:
      0.22185901 = sum of:
        0.12507877 = weight(_text_:retrieval in 2134) [ClassicSimilarity], result of:
          0.12507877 = score(doc=2134,freq=6.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.8104139 = fieldWeight in 2134, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.109375 = fieldNorm(doc=2134)
        0.09678023 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
          0.09678023 = score(doc=2134,freq=2.0), product of:
            0.17867287 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051022716 = queryNorm
            0.5416616 = fieldWeight in 2134, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=2134)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)
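
The arithmetic in the explain tree above can be checked by hand. The following is a minimal sketch, assuming the standard ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm). The function and variable names are illustrative and are not part of any Lucene API; all input numbers are copied from the listing.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight
    tf = math.sqrt(freq)                       # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm       # idf * queryNorm
    field_weight = tf * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Values taken from the explain output for doc 2134.
QUERY_NORM = 0.051022716
MAX_DOCS = 44218

retrieval = term_score(6.0, 5836, MAX_DOCS, QUERY_NORM, 0.109375)
term_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, 0.109375)

# Two of three clauses matched in the inner query, one of three in the outer.
inner = (retrieval + term_22) * (2 / 3)
total = inner * (1 / 3)

print(retrieval, term_22, total)
```

The results should agree with the listed values (0.12507877, 0.09678023 and 0.049302004) up to single-precision rounding, since Lucene stores these quantities as 32-bit floats.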

Date: 30. 3.2001 13:32:22
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Dominich, S.: Mathematical foundations of information retrieval (2001) 0.04

0.042983107 = product of:
  0.12894931 = sum of:
    0.12894931 = sum of:
      0.036714934 = weight(_text_:online in 1753) [ClassicSimilarity], result of:
        0.036714934 = score(doc=1753,freq=4.0), product of:
          0.1548489 = queryWeight, product of:
            3.0349014 = idf(docFreq=5778, maxDocs=44218)
            0.051022716 = queryNorm
          0.23710167 = fieldWeight in 1753, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            3.0349014 = idf(docFreq=5778, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1753)
      0.05767 = weight(_text_:retrieval in 1753) [ClassicSimilarity], result of:
        0.05767 = score(doc=1753,freq=10.0), product of:
          0.15433937 = queryWeight, product of:
            3.024915 = idf(docFreq=5836, maxDocs=44218)
            0.051022716 = queryNorm
          0.37365708 = fieldWeight in 1753, product of:
            3.1622777 = tf(freq=10.0), with freq of:
              10.0 = termFreq=10.0
            3.024915 = idf(docFreq=5836, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1753)
      0.03456437 = weight(_text_:22 in 1753) [ClassicSimilarity], result of:
        0.03456437 = score(doc=1753,freq=2.0), product of:
          0.17867287 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.051022716 = queryNorm
          0.19345059 = fieldWeight in 1753, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=1753)
  0.33333334 = coord(1/3)

Abstract: This book offers a comprehensive and consistent mathematical approach to information retrieval (IR) without which no implementation is possible, and sheds an entirely new light upon the structure of IR models. It contains the descriptions of all IR models in a unified formal style and language, along with examples for each, thus offering a comprehensive overview of them. The book also creates mathematical foundations and a consistent mathematical theory (including all mathematical results achieved so far) of IR as a stand-alone mathematical discipline, which thus can be read and taught independently. Also, the book contains all necessary mathematical knowledge on which IR relies, to help the reader avoid searching different sources. The book will be of interest to computer or information scientists, librarians, mathematicians, undergraduate students and researchers whose work involves information retrieval.
Date: 22. 3.2008 12:26:32
LCSH: Information storage and retrieval
RSWK: Online-Recherche / Mathematische Methode
Subject: Online-Recherche / Mathematische Methode
Information storage and retrieval

Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.04

0.04291924 = product of:
  0.12875772 = sum of:
    0.12875772 = product of:
      0.19313657 = sum of:
        0.08253059 = weight(_text_:retrieval in 402) [ClassicSimilarity], result of:
          0.08253059 = score(doc=402,freq=2.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.5347345 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
        0.110605985 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.110605985 = score(doc=402,freq=2.0), product of:
            0.17867287 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051022716 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)

Source: Information processing and management. 22(1986) no.6, S.465-476

Joss, M.W.; Wszola, S.: ¬The engines that can : text search and retrieval software, their strategies, and vendors (1996) 0.04

0.042078696 = product of:
  0.12623608 = sum of:
    0.12623608 = sum of:
      0.031153653 = weight(_text_:online in 5123) [ClassicSimilarity], result of:
        0.031153653 = score(doc=5123,freq=2.0), product of:
          0.1548489 = queryWeight, product of:
            3.0349014 = idf(docFreq=5778, maxDocs=44218)
            0.051022716 = queryNorm
          0.20118743 = fieldWeight in 5123, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.0349014 = idf(docFreq=5778, maxDocs=44218)
            0.046875 = fieldNorm(doc=5123)
      0.05360519 = weight(_text_:retrieval in 5123) [ClassicSimilarity], result of:
        0.05360519 = score(doc=5123,freq=6.0), product of:
          0.15433937 = queryWeight, product of:
            3.024915 = idf(docFreq=5836, maxDocs=44218)
            0.051022716 = queryNorm
          0.34732026 = fieldWeight in 5123, product of:
            2.4494898 = tf(freq=6.0), with freq of:
              6.0 = termFreq=6.0
            3.024915 = idf(docFreq=5836, maxDocs=44218)
            0.046875 = fieldNorm(doc=5123)
      0.04147724 = weight(_text_:22 in 5123) [ClassicSimilarity], result of:
        0.04147724 = score(doc=5123,freq=2.0), product of:
          0.17867287 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.051022716 = queryNorm
          0.23214069 = fieldWeight in 5123, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046875 = fieldNorm(doc=5123)
  0.33333334 = coord(1/3)

Abstract: Traces the development of text searching and retrieval software designed to cope with the increasing demands made by the storage and handling of large amounts of data, recorded on high data storage media, from CD-ROM to multi gigabyte storage media and online information services, with particular reference to the need to cope with graphics as well as conventional ASCII text. Includes details of: Boolean searching, fuzzy searching and matching; relevance ranking; proximity searching and improved strategies for dealing with text searching in very large databases. Concludes that the best searching tools for CD-ROM publishers are those optimized for searching and retrieval on CD-ROM. CD-ROM drives have relatively slower random seek times than hard discs and so the software most appropriate to the medium is that which can effectively arrange the indexes and text on the CD-ROM to avoid continuous random access searching. Lists and reviews a selection of software packages designed to achieve the sort of results required for rapid CD-ROM searching.
Date: 12. 9.1996 13:56:22

Weller, K.; Stock, W.G.: Transitive meronymy : automatic concept-based query expansion using weighted transitive part-whole relations (2008) 0.04

0.041076105 = product of:
  0.061614156 = sum of:
    0.044593092 = weight(_text_:im in 1835) [ClassicSimilarity], result of:
      0.044593092 = score(doc=1835,freq=4.0), product of:
        0.1442303 = queryWeight, product of:
          2.8267863 = idf(docFreq=7115, maxDocs=44218)
          0.051022716 = queryNorm
        0.30917975 = fieldWeight in 1835, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.8267863 = idf(docFreq=7115, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1835)
    0.017021064 = product of:
      0.05106319 = sum of:
        0.05106319 = weight(_text_:retrieval in 1835) [ClassicSimilarity], result of:
          0.05106319 = score(doc=1835,freq=4.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.33085006 = fieldWeight in 1835, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1835)
      0.33333334 = coord(1/3)
  0.6666667 = coord(2/3)

Abstract: Transitive meronymy. Automatic concept-based query expansion using weighted transitive part-whole relations. Our theoretically oriented work isolates transitive part-whole relations. We discuss the use of meronymy in automatic concept-based query expansion in information retrieval. For practical reasons, we propose specifying the part-whole relations and assigning different weighting values to the individual kinds, which are then used in retrieval. For the design of knowledge organization systems it is significant that, within the concept ladder of an abstraction relation, a concept passes on all of its parts (as well as all transitive parts of those parts) to its narrower concepts.

Faloutsos, C.: Signature files (1992) 0.03

0.028172573 = product of:
  0.08451772 = sum of:
    0.08451772 = product of:
      0.12677658 = sum of:
        0.07147358 = weight(_text_:retrieval in 3499) [ClassicSimilarity], result of:
          0.07147358 = score(doc=3499,freq=6.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.46309367 = fieldWeight in 3499, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
        0.055302992 = weight(_text_:22 in 3499) [ClassicSimilarity], result of:
          0.055302992 = score(doc=3499,freq=2.0), product of:
            0.17867287 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051022716 = queryNorm
            0.30952093 = fieldWeight in 3499, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)

Abstract: Presents a survey and discussion on signature-based text retrieval methods. It describes the main idea behind the signature approach and its advantages over other text retrieval methods, it provides a classification of the signature methods that have appeared in the literature, it describes the main representatives of each class, together with the relative advantages and drawbacks, and it gives a list of applications as well as commercial or university prototypes that use the signature approach
Date: 7. 5.1999 15:22:48
Source: Information retrieval: data structures and algorithms. Ed.: W.B. Frakes and R. Baeza-Yates

Losada, D.E.; Barreiro, A.: Embedding term similarity and inverse document frequency into a logical model of information retrieval (2003) 0.03

0.028172573 = product of:
  0.08451772 = sum of:
    0.08451772 = product of:
      0.12677658 = sum of:
        0.07147358 = weight(_text_:retrieval in 1422) [ClassicSimilarity], result of:
          0.07147358 = score(doc=1422,freq=6.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.46309367 = fieldWeight in 1422, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=1422)
        0.055302992 = weight(_text_:22 in 1422) [ClassicSimilarity], result of:
          0.055302992 = score(doc=1422,freq=2.0), product of:
            0.17867287 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051022716 = queryNorm
            0.30952093 = fieldWeight in 1422, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1422)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)

Abstract: We propose a novel approach to incorporate term similarity and inverse document frequency into a logical model of information retrieval. The ability of the logic to handle expressive representations along with the use of such classical notions are promising characteristics for IR systems. The approach proposed here has been efficiently implemented and experiments against test collections are presented.
Date: 22. 3.2003 19:27:23
Footnote: Contribution to a special issue: Mathematical, logical, and formal methods in information retrieval

Feder, J.D.; Hobbs, E.T.: Speech recognition and full-text retrieval : interface and integration (1995) 0.03

0.027601168 = product of:
  0.0828035 = sum of:
    0.0828035 = product of:
      0.12420525 = sum of:
        0.062307306 = weight(_text_:online in 2725) [ClassicSimilarity], result of:
          0.062307306 = score(doc=2725,freq=2.0), product of:
            0.1548489 = queryWeight, product of:
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.051022716 = queryNorm
            0.40237486 = fieldWeight in 2725, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.09375 = fieldNorm(doc=2725)
        0.06189794 = weight(_text_:retrieval in 2725) [ClassicSimilarity], result of:
          0.06189794 = score(doc=2725,freq=2.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.40105087 = fieldWeight in 2725, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=2725)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)

Source: Proceedings of the 16th National Online Meeting 1995, New York, 2-4 May 1995. Ed.: M.E. Williams

Crestani, F.; Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue (2003) 0.03

0.02606365 = product of:
  0.07819095 = sum of:
    0.07819095 = product of:
      0.11728642 = sum of:
        0.07580918 = weight(_text_:retrieval in 1451) [ClassicSimilarity], result of:
          0.07580918 = score(doc=1451,freq=12.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.49118498 = fieldWeight in 1451, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1451)
        0.04147724 = weight(_text_:22 in 1451) [ClassicSimilarity], result of:
          0.04147724 = score(doc=1451,freq=2.0), product of:
            0.17867287 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051022716 = queryNorm
            0.23214069 = fieldWeight in 1451, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1451)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)

Abstract: Research on the use of mathematical, logical, and formal methods has been central to Information Retrieval research for a long time. Research in this area is important not only because it helps enhance retrieval effectiveness, but also because it helps clarify the underlying concepts of Information Retrieval. In this article we outline some of the major aspects of the subject, and summarize the papers of this special issue with respect to how they relate to these aspects. We conclude by highlighting some directions of future research, which are needed to better understand the formal characteristics of Information Retrieval.
Date: 22. 3.2003 19:27:36
Footnote: Introduction to the contributions of a special issue: Mathematical, logical, and formal methods in information retrieval

MacFarlane, A.; Robertson, S.E.; McCann, J.A.: Parallel computing for passage retrieval (2004) 0.03

0.025257986 = product of:
  0.075773954 = sum of:
    0.075773954 = product of:
      0.11366093 = sum of:
        0.058357935 = weight(_text_:retrieval in 5108) [ClassicSimilarity], result of:
          0.058357935 = score(doc=5108,freq=4.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.37811437 = fieldWeight in 5108, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=5108)
        0.055302992 = weight(_text_:22 in 5108) [ClassicSimilarity], result of:
          0.055302992 = score(doc=5108,freq=2.0), product of:
            0.17867287 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051022716 = queryNorm
            0.30952093 = fieldWeight in 5108, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=5108)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)

Abstract: In this paper, methods for both speeding up passage processing and examining more passages using parallel computers are explored. The number of passages processed is varied in order to examine the effect on retrieval effectiveness and efficiency. The particular algorithm applied has previously been used to good effect in Okapi experiments at TREC. This algorithm and the mechanism for applying parallel computing to speed up processing are described.
Date: 20. 1.2007 18:30:22

Burgin, R.: ¬The retrieval effectiveness of 5 clustering algorithms as a function of indexing exhaustivity (1995) 0.02
```
0.023891509 = product of:
  0.071674526 = sum of:
    0.071674526 = product of:
      0.10751179 = sum of:
        0.07294742 = weight(_text_:retrieval in 3365) [ClassicSimilarity], result of:
          0.07294742 = score(doc=3365,freq=16.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.47264296 = fieldWeight in 3365, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3365)
        0.03456437 = weight(_text_:22 in 3365) [ClassicSimilarity], result of:
          0.03456437 = score(doc=3365,freq=2.0), product of:
            0.17867287 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051022716 = queryNorm
            0.19345059 = fieldWeight in 3365, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3365)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)
```
Abstract: The retrieval effectiveness of 5 hierarchical clustering methods (single link, complete link, group average, Ward's method, and weighted average) is examined as a function of indexing exhaustivity with 4 test collections (CR, Cranfield, Medlars, and Time). Evaluations of retrieval effectiveness, based on 3 measures of optimal retrieval performance, confirm earlier findings that the performance of a retrieval system based on single link clustering varies as a function of indexing exhaustivity but fail to find similar patterns for other clustering methods. The data also confirm earlier findings regarding the poor performance of single link clustering in a retrieval environment. The poor performance of single link clustering appears to derive from that method's tendency to produce a small number of large, ill-defined document clusters. By contrast, the data examined here found the retrieval performance of the other clustering methods to be generally comparable. The data presented also provide an opportunity to examine the theoretical limits of cluster-based retrieval and to compare these theoretical limits to the effectiveness of operational implementations. Performance standards of the 4 document collections examined were found to vary widely, and the effectiveness of operational implementations was found to be in the range defined as unacceptable. Further improvements in search strategies and document representations warrant investigation.
Date: 22. 2.1996 11:20:06

Witschel, H.F.: Global term weights in distributed environments (2008) 0.02

0.022972263 = product of:
  0.06891679 = sum of:
    0.06891679 = product of:
      0.10337518 = sum of:
        0.06189794 = weight(_text_:retrieval in 2096) [ClassicSimilarity], result of:
          0.06189794 = score(doc=2096,freq=8.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.40105087 = fieldWeight in 2096, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=2096)
        0.04147724 = weight(_text_:22 in 2096) [ClassicSimilarity], result of:
          0.04147724 = score(doc=2096,freq=2.0), product of:
            0.17867287 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051022716 = queryNorm
            0.23214069 = fieldWeight in 2096, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2096)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)

Abstract: This paper examines the estimation of global term weights (such as IDF) in information retrieval scenarios where a global view on the collection is not available. In particular, the two options of either sampling documents or of using a reference corpus independent of the target retrieval collection are compared using standard IR test collections. In addition, the possibility of pruning term lists based on frequency is evaluated. The results show that very good retrieval performance can be reached when just the most frequent terms of a collection - an "extended stop word list" - are known and all terms which are not in that list are treated equally. However, the list cannot always be fully estimated from a general-purpose reference corpus, but some "domain-specific stop words" need to be added. A good solution for achieving this is to mix estimates from small samples of the target retrieval collection with ones derived from a reference corpus.
Date: 1. 8.2008 9:44:22

Khoo, C.S.G.; Wan, K.-W.: ¬A simple relevancy-ranking strategy for an interface to Boolean OPACs (2004) 0.02
```
0.022649694 = product of:
  0.06794908 = sum of:
    0.06794908 = sum of:
      0.025700454 = weight(_text_:online in 2509) [ClassicSimilarity], result of:
        0.025700454 = score(doc=2509,freq=4.0), product of:
          0.1548489 = queryWeight, product of:
            3.0349014 = idf(docFreq=5778, maxDocs=44218)
            0.051022716 = queryNorm
          0.16597117 = fieldWeight in 2509, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            3.0349014 = idf(docFreq=5778, maxDocs=44218)
            0.02734375 = fieldNorm(doc=2509)
      0.018053565 = weight(_text_:retrieval in 2509) [ClassicSimilarity], result of:
        0.018053565 = score(doc=2509,freq=2.0), product of:
          0.15433937 = queryWeight, product of:
            3.024915 = idf(docFreq=5836, maxDocs=44218)
            0.051022716 = queryNorm
          0.11697317 = fieldWeight in 2509, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.024915 = idf(docFreq=5836, maxDocs=44218)
            0.02734375 = fieldNorm(doc=2509)
      0.024195058 = weight(_text_:22 in 2509) [ClassicSimilarity], result of:
        0.024195058 = score(doc=2509,freq=2.0), product of:
          0.17867287 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.051022716 = queryNorm
          0.1354154 = fieldWeight in 2509, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.02734375 = fieldNorm(doc=2509)
  0.33333334 = coord(1/3)
```
Abstract: A relevancy-ranking algorithm for a natural language interface to Boolean online public access catalogs (OPACs) was formulated and compared with that currently used in a knowledge-based search interface called the E-Referencer, being developed by the authors. The algorithm makes use of seven well-known ranking criteria: breadth of match, section weighting, proximity of query words, variant word forms (stemming), document frequency, term frequency and document length. The algorithm converts a natural language query into a series of increasingly broader Boolean search statements. In a small experiment with ten subjects in which the algorithm was simulated by hand, the algorithm obtained good results with a mean overall precision of 0.42 and mean average precision of 0.62, representing a 27 percent improvement in precision and 41 percent improvement in average precision compared to the E-Referencer. The usefulness of each step in the algorithm was analyzed and suggestions are made for improving the algorithm.
Content: "Most Web search engines accept natural language queries, perform some kind of fuzzy matching and produce ranked output, displaying first the documents that are most likely to be relevant. On the other hand, most library online public access catalogs (OPACs) on the Web are still Boolean retrieval systems that perform exact matching, and require users to express their search requests precisely in a Boolean search language and to refine their search statements to improve the search results. It is well-documented that users have difficulty searching Boolean OPACs effectively (e.g. Borgman, 1996; Ensor, 1992; Wallace, 1993). One approach to making OPACs easier to use is to develop a natural language search interface that acts as a middleware between the user's Web browser and the OPAC system. The search interface can accept a natural language query from the user and reformulate it as a series of Boolean search statements that are then submitted to the OPAC. The records retrieved by the OPAC are ranked by the search interface before forwarding them to the user's Web browser. The user, then, does not need to interact directly with the Boolean OPAC but with the natural language search interface or search intermediary. The search interface interacts with the OPAC system on the user's behalf. The advantage of this approach is that no modification to the OPAC or library system is required. Furthermore, the search interface can access multiple OPACs, acting as a meta search engine, and integrate search results from various OPACs before sending them to the user. The search interface needs to incorporate a method for converting the user's natural language query into a series of Boolean search statements, and for ranking the OPAC records retrieved. The purpose of this study was to develop a relevancy-ranking algorithm for a search interface to Boolean OPAC systems.
This is part of an on-going effort to develop a knowledge-based search interface to OPACs called the E-Referencer (Khoo et al., 1998, 1999; Poo et al., 2000). E-Referencer v. 2 that has been implemented applies a repertoire of initial search strategies and reformulation strategies to retrieve records from OPACs using the Z39.50 protocol, and also assists users in mapping query keywords to the Library of Congress subject headings."
Source: Electronic library. 22(2004) no.2, S.112-120

Jones, G.; Robertson, A.M.; Willett, P.: ¬An introduction to genetic algorithms and to their use in information retrieval (1994) 0.02

0.022199143 = product of:
  0.066597424 = sum of:
    0.066597424 = product of:
      0.09989613 = sum of:
        0.0415382 = weight(_text_:online in 7415) [ClassicSimilarity], result of:
          0.0415382 = score(doc=7415,freq=2.0), product of:
            0.1548489 = queryWeight, product of:
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.051022716 = queryNorm
            0.2682499 = fieldWeight in 7415, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.0625 = fieldNorm(doc=7415)
        0.058357935 = weight(_text_:retrieval in 7415) [ClassicSimilarity], result of:
          0.058357935 = score(doc=7415,freq=4.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.37811437 = fieldWeight in 7415, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=7415)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)

Abstract: This paper provides an introduction to genetic algorithms, a new approach to the investigation of computationally-intensive problems that may be insoluble using conventional, deterministic approaches. A genetic algorithm takes an initial set of possible starting solutions and then iteratively improves these solutions using operators that are analogous to those involved in Darwinian evolution. The approach is illustrated by reference to several problems in information retrieval.
Source: Online and CD-ROM review. 18(1994) no.1, S.3-12

Chang, C.-H.; Hsu, C.-C.: Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval (1998) 0.02

0.022100737 = product of:
  0.06630221 = sum of:
    0.06630221 = product of:
      0.09945331 = sum of:
        0.05106319 = weight(_text_:retrieval in 1319) [ClassicSimilarity], result of:
          0.05106319 = score(doc=1319,freq=4.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.33085006 = fieldWeight in 1319, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1319)
        0.048390117 = weight(_text_:22 in 1319) [ClassicSimilarity], result of:
          0.048390117 = score(doc=1319,freq=2.0), product of:
            0.17867287 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051022716 = queryNorm
            0.2708308 = fieldWeight in 1319, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1319)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)

Date: 1. 8.1996 22:08:06
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Liddy, E.D.; Paik, W.; McKenna, M.; Yu, E.S.: ¬A natural language text retrieval system with relevance feedback (1995) 0.02

0.021974515 = product of:
  0.06592354 = sum of:
    0.06592354 = product of:
      0.09888531 = sum of:
        0.03634593 = weight(_text_:online in 3131) [ClassicSimilarity], result of:
          0.03634593 = score(doc=3131,freq=2.0), product of:
            0.1548489 = queryWeight, product of:
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.051022716 = queryNorm
            0.23471867 = fieldWeight in 3131, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3131)
        0.062539384 = weight(_text_:retrieval in 3131) [ClassicSimilarity], result of:
          0.062539384 = score(doc=3131,freq=6.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.40520695 = fieldWeight in 3131, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3131)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)

Abstract: Outlines a fully integrated retrieval engine that processes documents and queries at the multiple, complex linguistic levels that humans use to construe meaning. Currently undergoing beta site trials, the DR-LINK natural language text retrieval system allows searchers to state queries as fully formed, natural sentences. The meaning and matching of both queries and documents is accomplished at the conceptual level of human expression, not by the simple concurrence of keywords. Furthermore, the natural browsing behaviour of information searchers is accommodated by allowing documents identified as potentially relevant by the explicit semantics of the system to be used as relevance feedback queries which provide an appropriate implicit semantic representation of the information seeker's need.
Source: Proceedings of the 16th National Online Meeting 1995, New York, 2-4 May 1995. Ed.: M.E. Williams

Robertson, S.E.: OKAPI at TREC-3 (1995) 0.02

0.021974515 = product of:
  0.06592354 = sum of:
    0.06592354 = product of:
      0.09888531 = sum of:
        0.03634593 = weight(_text_:online in 5694) [ClassicSimilarity], result of:
          0.03634593 = score(doc=5694,freq=2.0), product of:
            0.1548489 = queryWeight, product of:
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.051022716 = queryNorm
            0.23471867 = fieldWeight in 5694, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5694)
        0.062539384 = weight(_text_:retrieval in 5694) [ClassicSimilarity], result of:
          0.062539384 = score(doc=5694,freq=6.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.40520695 = fieldWeight in 5694, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5694)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)

Abstract: Reports text information retrieval experiments performed as part of the 3rd round of Text Retrieval Conferences (TREC) using the Okapi online catalogue system at City University, UK. The emphasis in TREC-3 was: further refinement of term weighting functions; an investigation of run time passage determination and searching; expansion of ad hoc queries by terms extracted from the top documents retrieved by a trial search; new methods for choosing query expansion terms after relevance feedback, now split into methods of ranking terms prior to selection and subsequent selection procedures; and the development of a user interface procedure within the new TREC interactive search framework.
Theme: Semantisches Umfeld in Indexierung u. Retrieval

Campos, L.M. de; Fernández-Luna, J.M.; Huete, J.F.: Implementing relevance feedback in the Bayesian network retrieval model (2003) 0.02

0.02112943 = product of:
  0.06338829 = sum of:
    0.06338829 = product of:
      0.09508243 = sum of:
        0.05360519 = weight(_text_:retrieval in 825) [ClassicSimilarity], result of:
          0.05360519 = score(doc=825,freq=6.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.34732026 = fieldWeight in 825, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=825)
        0.04147724 = weight(_text_:22 in 825) [ClassicSimilarity], result of:
          0.04147724 = score(doc=825,freq=2.0), product of:
            0.17867287 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051022716 = queryNorm
            0.23214069 = fieldWeight in 825, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=825)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)

Abstract: Relevance Feedback consists in automatically formulating a new query according to the relevance judgments provided by the user after evaluating a set of retrieved documents. In this article, we introduce several relevance feedback methods for the Bayesian Network Retrieval Model. The theoretical frame on which our methods are based uses the concept of partial evidences, which summarize the new pieces of information gathered after evaluating the results obtained by the original query. These partial evidences are inserted into the underlying Bayesian network and a new inference process (probabilities propagation) is run to compute the posterior relevance probabilities of the documents in the collection given the new query. The quality of the proposed methods is tested using a preliminary experimentation with different standard document collections.
Date: 22. 3.2003 19:30:19
Footnote: Contribution to a special issue: Mathematical, logical, and formal methods in information retrieval

Ravana, S.D.; Rajagopal, P.; Balakrishnan, V.: Ranking retrieval systems using pseudo relevance judgments (2015) 0.02
```
0.020789422 = product of:
  0.062368266 = sum of:
    0.062368266 = product of:
      0.093552396 = sum of:
        0.04467099 = weight(_text_:retrieval in 2591) [ClassicSimilarity], result of:
          0.04467099 = score(doc=2591,freq=6.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.28943354 = fieldWeight in 2591, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2591)
        0.0488814 = weight(_text_:22 in 2591) [ClassicSimilarity], result of:
          0.0488814 = score(doc=2591,freq=4.0), product of:
            0.17867287 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.051022716 = queryNorm
            0.27358043 = fieldWeight in 2591, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2591)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)
```
Abstract: Purpose: In a system-based approach, replicating the web would require large test collections, and judging the relevancy of all documents per topic in creating relevance judgment through human assessors is infeasible. Due to the large amount of documents that requires judgment, there are possible errors introduced by human assessors because of disagreements. The paper aims to discuss these issues. Design/methodology/approach: This study explores exponential variation and document ranking methods that generate a reliable set of relevance judgments (pseudo relevance judgments) to reduce human efforts. These methods overcome problems with large amounts of documents for judgment while avoiding human disagreement errors during the judgment process. This study utilizes two key factors: number of occurrences of each document per topic from all the system runs; and document rankings to generate the alternate methods. Findings: The effectiveness of the proposed method is evaluated using the correlation coefficient of ranked systems using mean average precision scores between the original Text REtrieval Conference (TREC) relevance judgments and pseudo relevance judgments. The results suggest that the proposed document ranking method with a pool depth of 100 could be a reliable alternative to reduce human effort and disagreement errors involved in generating TREC-like relevance judgments. Originality/value: Simple methods proposed in this study show improvement in the correlation coefficient in generating alternate relevance judgment without human assessors while contributing to information retrieval evaluation.

Date: 20. 1.2015 18:30:22; 18. 9.2018 18:22:56

Jacucci, G.; Barral, O.; Daee, P.; Wenzel, M.; Serim, B.; Ruotsalo, T.; Pluchino, P.; Freeman, J.; Gamberini, L.; Kaski, S.; Blankertz, B.: Integrating neurophysiologic relevance feedback in intent modeling for information retrieval (2019) 0.02
```
0.019621456 = product of:
  0.058864366 = sum of:
    0.058864366 = product of:
      0.08829655 = sum of:
        0.036714934 = weight(_text_:online in 5356) [ClassicSimilarity], result of:
          0.036714934 = score(doc=5356,freq=4.0), product of:
            0.1548489 = queryWeight, product of:
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.051022716 = queryNorm
            0.23710167 = fieldWeight in 5356, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0349014 = idf(docFreq=5778, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5356)
        0.051581617 = weight(_text_:retrieval in 5356) [ClassicSimilarity], result of:
          0.051581617 = score(doc=5356,freq=8.0), product of:
            0.15433937 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.051022716 = queryNorm
            0.33420905 = fieldWeight in 5356, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5356)
      0.6666667 = coord(2/3)
  0.33333334 = coord(1/3)
```
Abstract: The use of implicit relevance feedback from neurophysiology could deliver effortless information retrieval. However, both computing neurophysiologic responses and retrieving documents are characterized by uncertainty because of noisy signals and incomplete or inconsistent representations of the data. We present the first-of-its-kind, fully integrated information retrieval system that makes use of online implicit relevance feedback generated from brain activity as measured through electroencephalography (EEG), and eye movements. The findings of the evaluation experiment (N = 16) show that we are able to compute online neurophysiology-based relevance feedback with performance significantly better than chance in complex data domains and realistic search tasks. We contribute by demonstrating how to integrate in interactive intent modeling this inherently noisy implicit relevance feedback combined with scarce explicit feedback. Although experimental measures of task performance did not allow us to demonstrate how the classification outcomes translated into search task performance, the experiment proved that our approach is able to generate relevance feedback from brain signals and eye movements in a realistic scenario, thus providing promising implications for future work in neuroadaptive information retrieval (IR).
