Search (90 results, page 1 of 5)

  • × theme_ss:"Retrievalalgorithmen"
  • × year_i:[1990 TO 2000}
  1. Burgin, R.: ¬The retrieval effectiveness of 5 clustering algorithms as a function of indexing exhaustivity (1995) 0.06
    0.060737852 = product of:
      0.0809838 = sum of:
        0.008582841 = weight(_text_:information in 3365) [ClassicSimilarity], result of:
          0.008582841 = score(doc=3365,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.09697737 = fieldWeight in 3365, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3365)
        0.0553244 = weight(_text_:standards in 3365) [ClassicSimilarity], result of:
          0.0553244 = score(doc=3365,freq=2.0), product of:
            0.22470023 = queryWeight, product of:
              4.4569545 = idf(docFreq=1393, maxDocs=44218)
              0.050415643 = queryNorm
            0.24621427 = fieldWeight in 3365, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4569545 = idf(docFreq=1393, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3365)
        0.01707656 = product of:
          0.03415312 = sum of:
            0.03415312 = weight(_text_:22 in 3365) [ClassicSimilarity], result of:
              0.03415312 = score(doc=3365,freq=2.0), product of:
                0.17654699 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050415643 = queryNorm
                0.19345059 = fieldWeight in 3365, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3365)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
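
    The explain tree above can be checked by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function and variable names are illustrative and not part of the search system. It recomputes the score of doc 3365 from the values printed in the tree.

```python
import math

def idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree
    tf = math.sqrt(freq)
    w = idf(doc_freq, max_docs)
    query_weight = w * query_norm        # idf * queryNorm
    field_weight = tf * w * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explain output for doc 3365
MAX_DOCS = 44218
QUERY_NORM = 0.050415643
FIELD_NORM = 0.0390625

information = term_score(2.0, 20772, MAX_DOCS, QUERY_NORM, FIELD_NORM)
standards = term_score(2.0, 1393, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# The "22" clause sits in a nested query, so it carries its own coord(1/2)
term_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM) * 0.5

# Outer coord(3/4): three of the four top-level clauses matched
total = (information + standards + term_22) * 0.75
print(round(total, 6))
```

    Lucene computes these values in single precision, so the last digits may differ slightly from the figures shown in the tree.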
    
    Abstract
    The retrieval effectiveness of 5 hierarchical clustering methods (single link, complete link, group average, Ward's method, and weighted average) is examined as a function of indexing exhaustivity with 4 test collections (CR, Cranfield, Medlars, and Time). Evaluations of retrieval effectiveness, based on 3 measures of optimal retrieval performance, confirm earlier findings that the performance of a retrieval system based on single link clustering varies as a function of indexing exhaustivity but fail to find similar patterns for other clustering methods. The data also confirm earlier findings regarding the poor performance of single link clustering in a retrieval environment. The poor performance of single link clustering appears to derive from that method's tendency to produce a small number of large, ill-defined document clusters. By contrast, the data examined here show the retrieval performance of the other clustering methods to be generally comparable. The data presented also provide an opportunity to examine the theoretical limits of cluster-based retrieval and to compare these theoretical limits to the effectiveness of operational implementations. Performance standards of the 4 document collections examined were found to vary widely, and the effectiveness of operational implementations was found to be in the range defined as unacceptable. Further improvements in search strategies and document representations warrant investigation
    Date
    22. 2.1996 11:20:06
    Source
    Journal of the American Society for Information Science. 46(1995) no.8, pp.562-572
  2. Clarke, C.L.A.; Cormack, G.V.; Burkowski, F.J.: Shortest substring ranking : multitext experiments for TREC-4 (1996) 0.03
    0.033194643 = product of:
      0.13277857 = sum of:
        0.13277857 = weight(_text_:standards in 549) [ClassicSimilarity], result of:
          0.13277857 = score(doc=549,freq=2.0), product of:
            0.22470023 = queryWeight, product of:
              4.4569545 = idf(docFreq=1393, maxDocs=44218)
              0.050415643 = queryNorm
            0.59091425 = fieldWeight in 549, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4569545 = idf(docFreq=1393, maxDocs=44218)
              0.09375 = fieldNorm(doc=549)
      0.25 = coord(1/4)
    
    Imprint
    Gaithersburg, MD : National Institute of Standards and Technology
  3. Savoy, J.; Ndarugendamwo, M.; Vrajitoru, D.: Report on the TREC-4 experiment : combining probabilistic and vector-space schemes (1996) 0.03
    0.033194643 = product of:
      0.13277857 = sum of:
        0.13277857 = weight(_text_:standards in 7574) [ClassicSimilarity], result of:
          0.13277857 = score(doc=7574,freq=2.0), product of:
            0.22470023 = queryWeight, product of:
              4.4569545 = idf(docFreq=1393, maxDocs=44218)
              0.050415643 = queryNorm
            0.59091425 = fieldWeight in 7574, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4569545 = idf(docFreq=1393, maxDocs=44218)
              0.09375 = fieldNorm(doc=7574)
      0.25 = coord(1/4)
    
    Imprint
    Gaithersburg, MD : National Institute of Standards and Technology
  4. Belkin, N.J.; Cool, C.; Koenemann, J.; Ng, K.B.; Park, S.: Using relevance feedback and ranking in interactive searching (1996) 0.03
    0.033194643 = product of:
      0.13277857 = sum of:
        0.13277857 = weight(_text_:standards in 7588) [ClassicSimilarity], result of:
          0.13277857 = score(doc=7588,freq=2.0), product of:
            0.22470023 = queryWeight, product of:
              4.4569545 = idf(docFreq=1393, maxDocs=44218)
              0.050415643 = queryNorm
            0.59091425 = fieldWeight in 7588, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4569545 = idf(docFreq=1393, maxDocs=44218)
              0.09375 = fieldNorm(doc=7588)
      0.25 = coord(1/4)
    
    Imprint
    Gaithersburg, MD : National Institute of Standards and Technology
  5. Chang, C.-H.; Hsu, C.-C.: Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval (1998) 0.02
    0.023969568 = product of:
      0.047939137 = sum of:
        0.024031956 = weight(_text_:information in 1319) [ClassicSimilarity], result of:
          0.024031956 = score(doc=1319,freq=8.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.27153665 = fieldWeight in 1319, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1319)
        0.023907183 = product of:
          0.047814365 = sum of:
            0.047814365 = weight(_text_:22 in 1319) [ClassicSimilarity], result of:
              0.047814365 = score(doc=1319,freq=2.0), product of:
                0.17654699 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050415643 = queryNorm
                0.2708308 = fieldWeight in 1319, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1319)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Keyword-based querying has been an immediate and efficient way to specify and retrieve the related information that the user seeks. However, conventional document ranking based on an automatic assessment of document relevance to the query may not be the best approach when little information is given. Proposes integrating 2 existing techniques, query expansion and relevance feedback, to achieve a concept-based information search for the Web
    Date
    1. 8.1996 22:08:06
  6. Carpineto, C.; Romano, G.: Information retrieval through hybrid navigation of lattice representations (1996) 0.02
    0.022797368 = product of:
      0.045594737 = sum of:
        0.020812286 = weight(_text_:information in 7434) [ClassicSimilarity], result of:
          0.020812286 = score(doc=7434,freq=6.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.23515764 = fieldWeight in 7434, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7434)
        0.024782453 = product of:
          0.049564905 = sum of:
            0.049564905 = weight(_text_:organization in 7434) [ClassicSimilarity], result of:
              0.049564905 = score(doc=7434,freq=2.0), product of:
                0.17974974 = queryWeight, product of:
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.050415643 = queryNorm
                0.27574396 = fieldWeight in 7434, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7434)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Presents a comprehensive approach to automatic organization and hybrid navigation of text databases. An organizing stage builds a particular lattice representation of the data, through text indexing followed by lattice clustering of the indexed texts. The lattice representation supports the navigation state of the system, a visual retrieval interface that combines 3 main retrieval strategies: browsing, querying, and bounding. Such a hybrid paradigm permits high flexibility in trading off information exploration and retrieval, and has good retrieval performance. Compares information retrieval using lattice-based hybrid navigation with conventional Boolean querying. Experiments conducted on 2 medium-sized bibliographic databases showed that the performance of lattice retrieval was comparable to or better than Boolean retrieval
  7. Schamber, L.; Bateman, J.: Relevance criteria uses and importance : progress in development of a measurement scale (1999) 0.02
    0.02213614 = product of:
      0.04427228 = sum of:
        0.02303018 = weight(_text_:information in 6691) [ClassicSimilarity], result of:
          0.02303018 = score(doc=6691,freq=10.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.2602176 = fieldWeight in 6691, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=6691)
        0.021242103 = product of:
          0.042484205 = sum of:
            0.042484205 = weight(_text_:organization in 6691) [ClassicSimilarity], result of:
              0.042484205 = score(doc=6691,freq=2.0), product of:
                0.17974974 = queryWeight, product of:
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.050415643 = queryNorm
                0.23635197 = fieldWeight in 6691, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6691)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The criteria employed by end-users in making relevance judgments can be powerful and useful indicators of the values users ascribe to a variety of factors in their information seeking and use situations. This paper describes intermediate results in a long-term project intended to develop a measurement scale based on users' relevance criteria. The five tests that are reported here have involved 350 users in an effort to progressively refine and validate the scale content. The range of research questions and types of users and information environments has gradually been expanded to assess the adaptability and transferability of the instrument. The instrument provides quantitative data, notably criterion importance ratings that can be analyzed using several techniques. The substantive findings confirm those of previous studies on relevance evaluation behavior
    Imprint
    Medford, NJ : Information Today
    Series
    Proceedings of the American Society for Information Science; vol.36
    Source
    Knowledge: creation, organization and use. Proceedings of the 62nd Annual Meeting of the American Society for Information Science, 31.10.-4.11.1999. Ed.: L. Woods
  8. Faloutsos, C.: Signature files (1992) 0.02
    0.02052752 = product of:
      0.04105504 = sum of:
        0.013732546 = weight(_text_:information in 3499) [ClassicSimilarity], result of:
          0.013732546 = score(doc=3499,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.1551638 = fieldWeight in 3499, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
        0.027322493 = product of:
          0.054644987 = sum of:
            0.054644987 = weight(_text_:22 in 3499) [ClassicSimilarity], result of:
              0.054644987 = score(doc=3499,freq=2.0), product of:
                0.17654699 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050415643 = queryNorm
                0.30952093 = fieldWeight in 3499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3499)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Date
    7. 5.1999 15:22:48
    Source
    Information retrieval: data structures and algorithms. Ed.: W.B. Frakes and R. Baeza-Yates
  9. Information retrieval : data structures and algorithms (1992) 0.02
    0.020204907 = product of:
      0.040409815 = sum of:
        0.022708062 = weight(_text_:information in 3495) [ClassicSimilarity], result of:
          0.022708062 = score(doc=3495,freq=14.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.256578 = fieldWeight in 3495, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3495)
        0.017701752 = product of:
          0.035403505 = sum of:
            0.035403505 = weight(_text_:organization in 3495) [ClassicSimilarity], result of:
              0.035403505 = score(doc=3495,freq=2.0), product of:
                0.17974974 = queryWeight, product of:
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.050415643 = queryNorm
                0.19695997 = fieldWeight in 3495, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3495)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The book consists of separate chapters by some 20 different authors. It covers many of the information retrieval algorithms, including methods of file organization, file search and access, and query processing
    Content
    An edited volume containing data structures and algorithms for information retrieval, including a disk with examples written in C. For programmers and students interested in parsing text and automated indexing, it is the first collection in book form of the basic data structures and algorithms that are critical to the storage and retrieval of documents. Contains the chapters: FRAKES, W.B.: Introduction to information storage and retrieval systems; BAEZA-YATES, R.S.: Introduction to data structures and algorithms related to information retrieval; HARMAN, D. et al.: Inverted files; FALOUTSOS, C.: Signature files; GONNET, G.H. et al.: New indices for text: PAT trees and PAT arrays; FORD, D.A. and S. CHRISTODOULAKIS: File organizations for optical disks; FOX, C.: Lexical analysis and stoplists; FRAKES, W.B.: Stemming algorithms; SRINIVASAN, P.: Thesaurus construction; BAEZA-YATES, R.A.: String searching algorithms; HARMAN, D.: Relevance feedback and other query modification techniques; WARTIK, S.: Boolean operators; WARTIK, S. et al.: Hashing algorithms; HARMAN, D.: Ranking algorithms; FOX, E. et al.: Extended Boolean models; RASMUSSEN, E.: Clustering algorithms; HOLLAAR, L.: Special-purpose hardware for information retrieval; STANFILL, C.: Parallel information retrieval algorithms
  10. Lee, D.L.; Ren, L.: Document ranking on weight-partitioned signature files (1996) 0.02
    0.018399216 = product of:
      0.036798432 = sum of:
        0.012015978 = weight(_text_:information in 2417) [ClassicSimilarity], result of:
          0.012015978 = score(doc=2417,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.13576832 = fieldWeight in 2417, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2417)
        0.024782453 = product of:
          0.049564905 = sum of:
            0.049564905 = weight(_text_:organization in 2417) [ClassicSimilarity], result of:
              0.049564905 = score(doc=2417,freq=2.0), product of:
                0.17974974 = queryWeight, product of:
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.050415643 = queryNorm
                0.27574396 = fieldWeight in 2417, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2417)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Proposes the weight partitioned signature file, a signature file organization for supporting document ranking. It uses multiple signature files, each corresponding to one term frequency, to represent terms with different term frequencies. Words with the same term frequency in a document are grouped together and hashed into the signature file corresponding to that term frequency. Investigates the effect of false drops on retrieval effectiveness. Analyses the performance of the weight partitioned signature file under different search strategies and configurations. Obtains an optimal formula for storage allocation to minimise the effect of false drops on document ranks. Analytical results are supported by experiments on document collections
    Source
    ACM transactions on information systems. 14(1996) no.2, pp.109-137
  11. Savoy, J.: Ranking schemes in hybrid Boolean systems : a new approach (1997) 0.02
    0.017903835 = product of:
      0.03580767 = sum of:
        0.014565565 = weight(_text_:information in 393) [ClassicSimilarity], result of:
          0.014565565 = score(doc=393,freq=4.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.16457605 = fieldWeight in 393, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=393)
        0.021242103 = product of:
          0.042484205 = sum of:
            0.042484205 = weight(_text_:organization in 393) [ClassicSimilarity], result of:
              0.042484205 = score(doc=393,freq=2.0), product of:
                0.17974974 = queryWeight, product of:
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.050415643 = queryNorm
                0.23635197 = fieldWeight in 393, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.046875 = fieldNorm(doc=393)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    In most commercial online systems, the retrieval system is based on the Boolean model and its inverted file organization. Since the investment in these systems is so great and changing them could be economically unfeasible, this article suggests a new ranking scheme especially adapted for hypertext environments in order to produce more effective retrieval results and yet maintain the effectiveness of the investment made to date in the Boolean model. To select the retrieved documents, the suggested ranking strategy uses multiple sources of document content evidence. The proposed scheme integrates both the information provided by the index and query terms, and the inherent relationships between documents such as bibliographic references or hypertext links. We will demonstrate that our scheme represents an integration of both subject and citation indexing, and results in a significant improvement over classical ranking schemes used in hybrid Boolean systems, while preserving its efficiency. Moreover, through knowing the nearest neighbor and the hypertext links which constitute additional sources of evidence, our strategy will take them into account in order to further improve retrieval effectiveness and to provide 'good' starting points for browsing in a hypertext or hypermedia environment
    Source
    Journal of the American Society for Information Science. 48(1997) no.3, pp.235-253
  12. Kelledy, F.; Smeaton, A.F.: Signature files and beyond (1996) 0.02
    0.017528716 = product of:
      0.035057433 = sum of:
        0.014565565 = weight(_text_:information in 6973) [ClassicSimilarity], result of:
          0.014565565 = score(doc=6973,freq=4.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.16457605 = fieldWeight in 6973, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=6973)
        0.02049187 = product of:
          0.04098374 = sum of:
            0.04098374 = weight(_text_:22 in 6973) [ClassicSimilarity], result of:
              0.04098374 = score(doc=6973,freq=2.0), product of:
                0.17654699 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050415643 = queryNorm
                0.23214069 = fieldWeight in 6973, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6973)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Source
    Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
  13. Kantor, P.; Kim, M.H.; Ibraev, U.; Atasoy, K.: Estimating the number of relevant documents in enormous collections (1999) 0.02
    0.017433718 = product of:
      0.034867436 = sum of:
        0.017165681 = weight(_text_:information in 6690) [ClassicSimilarity], result of:
          0.017165681 = score(doc=6690,freq=8.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.19395474 = fieldWeight in 6690, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6690)
        0.017701752 = product of:
          0.035403505 = sum of:
            0.035403505 = weight(_text_:organization in 6690) [ClassicSimilarity], result of:
              0.035403505 = score(doc=6690,freq=2.0), product of:
                0.17974974 = queryWeight, product of:
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.050415643 = queryNorm
                0.19695997 = fieldWeight in 6690, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5653565 = idf(docFreq=3399, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=6690)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    In assessing information retrieval systems, it is important to know not only the precision of the retrieved set, but also to compare the number of retrieved relevant items to the total number of relevant items. For large collections, such as the TREC test collections, or the World Wide Web, it is not possible to enumerate the entire set of relevant documents. If the retrieved documents are evaluated, a variant of the statistical "capture-recapture" method can be used to estimate the total number of relevant documents, provided that the several retrieval systems used are sufficiently independent. We show that the underlying signal detection model supporting such an analysis can be extended in two ways. First, assuming that there are two distinct performance characteristics (corresponding to the chance of retrieving a relevant document and the chance of retrieving a given non-relevant document), we show that if there are three or more independent systems available it is possible to estimate the number of relevant documents without actually having to decide whether each individual document is relevant. We report applications of this 3-system method to the TREC data, leading to the conclusion that the independence assumptions are not satisfied. We then extend the model to a multi-system, multi-problem model, and show that it is possible to include statistical dependencies of all orders in the model, and determine the number of relevant documents for each of the problems in the set. Application to the TREC setting will be presented
    Imprint
    Medford, NJ : Information Today
    Series
    Proceedings of the American Society for Information Science; vol.36
    Source
    Knowledge: creation, organization and use. Proceedings of the 62nd Annual Meeting of the American Society for Information Science, 31.10.-4.11.1999. Ed.: L. Woods
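
    The capture-recapture idea mentioned in the abstract of entry 13 can be illustrated with the classical two-sample (Lincoln-Petersen) estimator. This is only a sketch of the textbook estimator, not the variant or the 3-system method the paper develops. It assumes two independent retrieval systems whose retrieved documents have been judged for relevance; the function name and the document IDs are hypothetical.

```python
def lincoln_petersen(relevant_a, relevant_b):
    """Estimate the total number of relevant documents from two
    independent systems, given the relevant documents each one found."""
    found_a = set(relevant_a)
    found_b = set(relevant_b)
    overlap = len(found_a & found_b)
    if overlap == 0:
        raise ValueError("no overlap: the estimate is undefined")
    # N is estimated as |A| * |B| / |A and B|
    return len(found_a) * len(found_b) / overlap

# Hypothetical example: system A finds 40 relevant documents,
# system B finds 50, and 10 are found by both.
system_a = range(0, 40)
system_b = range(30, 80)
estimate = lincoln_petersen(system_a, system_b)
print(estimate)  # 200.0
```

    The estimate is only as good as the independence assumption, which the abstract reports is not satisfied for the TREC data.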
  14. Joss, M.W.; Wszola, S.: ¬The engines that can : text search and retrieval software, their strategies, and vendors (1996) 0.02
    0.015395639 = product of:
      0.030791279 = sum of:
        0.01029941 = weight(_text_:information in 5123) [ClassicSimilarity], result of:
          0.01029941 = score(doc=5123,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.116372846 = fieldWeight in 5123, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=5123)
        0.02049187 = product of:
          0.04098374 = sum of:
            0.04098374 = weight(_text_:22 in 5123) [ClassicSimilarity], result of:
              0.04098374 = score(doc=5123,freq=2.0), product of:
                0.17654699 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050415643 = queryNorm
                0.23214069 = fieldWeight in 5123, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5123)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Traces the development of text searching and retrieval software designed to cope with the increasing demands made by the storage and handling of large amounts of data, recorded on high data storage media, from CD-ROM to multi-gigabyte storage media and online information services, with particular reference to the need to cope with graphics as well as conventional ASCII text. Includes details of: Boolean searching, fuzzy searching and matching; relevance ranking; proximity searching and improved strategies for dealing with text searching in very large databases. Concludes that the best searching tools for CD-ROM publishers are those optimized for searching and retrieval on CD-ROM. CD-ROM drives have relatively slower random seek times than hard discs and so the software most appropriate to the medium is that which can effectively arrange the indexes and text on the CD-ROM to avoid continuous random access searching. Lists and reviews a selection of software packages designed to achieve the sort of results required for rapid CD-ROM searching
    Date
    12. 9.1996 13:56:22
  15. Efthimiadis, E.N.: User choices : a new yardstick for the evaluation of ranking algorithms for interactive query expansion (1995) 0.01
    0.0128297005 = product of:
      0.025659401 = sum of:
        0.008582841 = weight(_text_:information in 5697) [ClassicSimilarity], result of:
          0.008582841 = score(doc=5697,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.09697737 = fieldWeight in 5697, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5697)
        0.01707656 = product of:
          0.03415312 = sum of:
            0.03415312 = weight(_text_:22 in 5697) [ClassicSimilarity], result of:
              0.03415312 = score(doc=5697,freq=2.0), product of:
                0.17654699 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050415643 = queryNorm
                0.19345059 = fieldWeight in 5697, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5697)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Date
    22. 2.1996 13:14:10
    Source
    Information processing and management. 31(1995) no.4, pp.605-620
  16. Cole, C.: Intelligent information retrieval: diagnosing information need : Part I: the theoretical framework for developing an intelligent IR tool (1998) 0.01
    0.008919551 = product of:
      0.035678204 = sum of:
        0.035678204 = weight(_text_:information in 6431) [ClassicSimilarity], result of:
          0.035678204 = score(doc=6431,freq=6.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.40312737 = fieldWeight in 6431, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=6431)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 34(1998) no.6, S.709-720
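
The leaf values in these trees are consistent with the classic TF-IDF formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The sketch below recomputes entry 16 from its raw statistics under that assumption. queryNorm and fieldNorm are taken as given from the output rather than derived, and the function names are hypothetical.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency, assumed form: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Term frequency factor: square root of the raw count.
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm            # idf * queryNorm
    field_weight = tf(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight

# Statistics printed for doc 6431 (entry 16), term "information".
score = term_score(freq=6.0, doc_freq=20772, max_docs=44218,
                   query_norm=0.050415643, field_norm=0.09375)
final = score * (1 / 4)  # coord(1/4): 1 of 4 query clauses matched

print(f"idf={idf(20772, 44218):.7f} score={score:.9f} final={final:.9f}")
```

Because idf enters both queryWeight and fieldWeight, a rare term such as "standards" (docFreq=1393) contributes far more to a score than a common one such as "information" (docFreq=20772).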
  17. Cole, C.: Intelligent information retrieval: diagnosing information need : Part II: uncertainty expansion in a prototype of a diagnostic IR tool (1998) 0.01
    0.008919551 = product of:
      0.035678204 = sum of:
        0.035678204 = weight(_text_:information in 6432) [ClassicSimilarity], result of:
          0.035678204 = score(doc=6432,freq=6.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.40312737 = fieldWeight in 6432, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=6432)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 34(1998) no.6, S.721-731
  18. Baeza-Yates, R.A.: Introduction to data structures and algorithms related to information retrieval (1992) 0.01
    0.008582841 = product of:
      0.034331363 = sum of:
        0.034331363 = weight(_text_:information in 3082) [ClassicSimilarity], result of:
          0.034331363 = score(doc=3082,freq=8.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.38790947 = fieldWeight in 3082, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.078125 = fieldNorm(doc=3082)
      0.25 = coord(1/4)
    
    Abstract
    In this chapter we review the main concepts and data structures used in information retrieval, and we classify algorithms related to information retrieval.
    Source
    Information retrieval: data structures and algorithms. Ed.: W.B. Frakes and R. Baeza-Yates
  19. Lee, J.H.: Combining the evidence of different relevance feedback methods for information retrieval (1998) 0.01
    0.00849658 = product of:
      0.03398632 = sum of:
        0.03398632 = weight(_text_:information in 6469) [ClassicSimilarity], result of:
          0.03398632 = score(doc=6469,freq=4.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.3840108 = fieldWeight in 6469, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.109375 = fieldNorm(doc=6469)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 34(1998) no.6, S.681-691
  20. Hofferer, M.: Heuristic search in information retrieval (1994) 0.01
    0.0076767267 = product of:
      0.030706907 = sum of:
        0.030706907 = weight(_text_:information in 1070) [ClassicSimilarity], result of:
          0.030706907 = score(doc=1070,freq=10.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.3469568 = fieldWeight in 1070, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=1070)
      0.25 = coord(1/4)
    
    Abstract
    Describes an adaptive information retrieval system, the Information Retrieval Algorithm System (IRAS), which uses heuristic searching to sample a document space and retrieve relevant documents according to users' requests. It also includes a learning module, based on a knowledge representation system and an approximate probabilistic characterization of relevant documents, that reproduces a user's classification of relevant documents and provides a rule-controlled ranking.
    Source
    Information retrieval: new systems and current research. Proceedings of the 15th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Glasgow 1993. Ed.: Ruben Leon

Languages

  • e 87
  • d 2
  • chi 1

Types

  • a 80
  • m 4
  • s 3
  • p 2
  • r 2
  • el 1