Search (93 results, page 1 of 5)

  • × theme_ss:"Retrievalalgorithmen"
  • × year_i:[1990 TO 2000}
  1. Burgin, R.: ¬The retrieval effectiveness of 5 clustering algorithms as a function of indexing exhaustivity (1995) 0.06
    0.056283943 = product of:
      0.08442591 = sum of:
        0.029067779 = weight(_text_:of in 3365) [ClassicSimilarity], result of:
          0.029067779 = score(doc=3365,freq=34.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.35617945 = fieldWeight in 3365, product of:
              5.8309517 = tf(freq=34.0), with freq of:
                34.0 = termFreq=34.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3365)
        0.055358134 = sum of:
          0.020004123 = weight(_text_:science in 3365) [ClassicSimilarity], result of:
            0.020004123 = score(doc=3365,freq=2.0), product of:
              0.13747036 = queryWeight, product of:
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.05218836 = queryNorm
              0.1455159 = fieldWeight in 3365, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.6341193 = idf(docFreq=8627, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3365)
          0.03535401 = weight(_text_:22 in 3365) [ClassicSimilarity], result of:
            0.03535401 = score(doc=3365,freq=2.0), product of:
              0.18275474 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.05218836 = queryNorm
              0.19345059 = fieldWeight in 3365, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3365)
      0.6666667 = coord(2/3)
    
    Abstract
The retrieval effectiveness of 5 hierarchical clustering methods (single link, complete link, group average, Ward's method, and weighted average) is examined as a function of indexing exhaustivity with 4 test collections (CR, Cranfield, Medlars, and Time). Evaluations of retrieval effectiveness, based on 3 measures of optimal retrieval performance, confirm earlier findings that the performance of a retrieval system based on single link clustering varies as a function of indexing exhaustivity, but fail to find similar patterns for the other clustering methods. The data also confirm earlier findings regarding the poor performance of single link clustering in a retrieval environment. The poor performance of single link clustering appears to derive from that method's tendency to produce a small number of large, ill-defined document clusters. By contrast, the retrieval performance of the other clustering methods examined here was found to be generally comparable. The data presented also provide an opportunity to examine the theoretical limits of cluster-based retrieval and to compare these theoretical limits to the effectiveness of operational implementations. Performance standards of the 4 document collections examined were found to vary widely, and the effectiveness of operational implementations was found to be in the range defined as unacceptable. Further improvements in search strategies and document representations warrant investigation.
    Date
    22. 2.1996 11:20:06
    Source
    Journal of the American Society for Information Science. 46(1995) no.8, S.562-572
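The nested figures above are Lucene's ClassicSimilarity (TF-IDF) explain output for this record. As a rough check, the top-level arithmetic can be reproduced from the numbers shown; the sketch below assumes the standard ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which is an assumption about the scorer rather than something stated on this page.

```python
import math

# Rough reconstruction of the ClassicSimilarity arithmetic shown for result 1
# (doc 3365). Assumed formulas:
#   tf  = sqrt(termFreq)
#   idf = 1 + ln(maxDocs / (docFreq + 1))
#   score(term) = queryWeight * fieldWeight
#               = (idf * queryNorm) * (tf * idf * fieldNorm)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    tf = math.sqrt(freq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))
    query_weight = idf * query_norm
    field_weight = tf * idf * field_norm
    return query_weight * field_weight

query_norm = 0.05218836
field_norm = 0.0390625
max_docs = 44218

s_of      = term_score(34.0, 25162, max_docs, query_norm, field_norm)  # ~0.029068
s_science = term_score(2.0,   8627, max_docs, query_norm, field_norm)  # ~0.020004
s_22      = term_score(2.0,   3622, max_docs, query_norm, field_norm)  # ~0.035354

coord = 2.0 / 3.0  # coord(2/3): 2 of 3 query clauses matched
print(coord * (s_of + (s_science + s_22)))  # ~0.0563, the displayed 0.06 score
```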
  2. Davis, C.H.: Beyond Boole : the next logical step (1995) 0.04
    0.036377676 = product of:
      0.05456651 = sum of:
        0.02255991 = weight(_text_:of in 3550) [ClassicSimilarity], result of:
          0.02255991 = score(doc=3550,freq=2.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.27643585 = fieldWeight in 3550, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.125 = fieldNorm(doc=3550)
        0.0320066 = product of:
          0.0640132 = sum of:
            0.0640132 = weight(_text_:science in 3550) [ClassicSimilarity], result of:
              0.0640132 = score(doc=3550,freq=2.0), product of:
                0.13747036 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.05218836 = queryNorm
                0.4656509 = fieldWeight in 3550, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.125 = fieldNorm(doc=3550)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Source
    Bulletin of the American Society for Information Science. 21(1995), S.17-20
  3. Kelledy, F.; Smeaton, A.F.: Signature files and beyond (1996) 0.03
    0.03447683 = product of:
      0.051715247 = sum of:
        0.03050284 = weight(_text_:of in 6973) [ClassicSimilarity], result of:
          0.03050284 = score(doc=6973,freq=26.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.37376386 = fieldWeight in 6973, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=6973)
        0.021212406 = product of:
          0.042424813 = sum of:
            0.042424813 = weight(_text_:22 in 6973) [ClassicSimilarity], result of:
              0.042424813 = score(doc=6973,freq=2.0), product of:
                0.18275474 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05218836 = queryNorm
                0.23214069 = fieldWeight in 6973, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6973)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
Proposes that signature files be used as a viable alternative to other indexing strategies, such as inverted files, for searching through large volumes of text. Demonstrates through simulation that search times can be further reduced by enhancing the basic signature file concept using deterministic partitioning algorithms, which eliminate the need for an exhaustive search of the entire signature file. Reports research to evaluate the performance of some deterministic partitioning algorithms in a non-simulated environment using 276 MB of raw newspaper text (taken from the Wall Street Journal) and real user queries. Presents a selection of results to illustrate trends and highlight important aspects of the performance of these methods under realistic rather than simulated operating conditions. As a result of the research reported here, certain aspects of this approach to signature files are shown to be wanting and to require improvement. Suggests lines of future research on the partitioning of signature files
    Source
    Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
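This entry and Faloutsos (no. 6 below) both concern signature files. The sketch below shows only the basic superimposed-coding idea, with a hashed-bit-position scheme and signature width chosen for illustration; it performs the exhaustive signature scan that the deterministic partitioning algorithms evaluated in the paper are designed to avoid, and does not reproduce those algorithms.

```python
import hashlib

# Minimal superimposed-coding signature file (illustrative assumptions).
SIG_BITS = 256       # signature width in bits
BITS_PER_WORD = 3    # bit positions set per indexed word

def word_bits(word):
    # Derive BITS_PER_WORD pseudo-random bit positions from the word.
    h = hashlib.md5(word.encode()).digest()
    return {int.from_bytes(h[i * 2:i * 2 + 2], "big") % SIG_BITS
            for i in range(BITS_PER_WORD)}

def signature(text):
    sig = 0
    for word in text.lower().split():
        for b in word_bits(word):
            sig |= 1 << b
    return sig

def candidates(query, doc_sigs):
    qsig = signature(query)
    # A document qualifies if its signature contains every query bit;
    # qualifying documents may still be false drops and must be verified
    # against the actual text.
    return [d for d, s in doc_sigs.items() if s & qsig == qsig]

doc_sigs = {name: signature(text) for name, text in {
    "d1": "signature files for text retrieval",
    "d2": "inverted files and indexing",
}.items()}
print(candidates("signature files", doc_sigs))
```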
  4. Spink, A.; Losee, R.M.: Feedback in information retrieval (1996) 0.03
    0.033228777 = product of:
      0.049843162 = sum of:
        0.033839863 = weight(_text_:of in 7441) [ClassicSimilarity], result of:
          0.033839863 = score(doc=7441,freq=18.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.41465375 = fieldWeight in 7441, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=7441)
        0.0160033 = product of:
          0.0320066 = sum of:
            0.0320066 = weight(_text_:science in 7441) [ClassicSimilarity], result of:
              0.0320066 = score(doc=7441,freq=2.0), product of:
                0.13747036 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.05218836 = queryNorm
                0.23282544 = fieldWeight in 7441, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7441)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
State-of-the-art review of the mechanisms of feedback in information retrieval (IR) in terms of feedback concepts and models in cybernetics and the social sciences. Critically evaluates feedback research based on the traditional IR models, comparing the different approaches to automatic relevance feedback techniques, and reviews feedback research within the framework of interactive IR models. Calls for an extension of the concept of feedback beyond relevance feedback to interactive feedback. Cites specific examples of feedback models used within IR research and presents 6 challenges for future research
    Source
    Annual review of information science and technology. 31(1996), S.33-78
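Automatic relevance feedback of the kind surveyed in this review is often illustrated with Rocchio query reformulation. The sketch below is that classic technique with conventional alpha/beta/gamma weights assumed; it is an illustration of the general mechanism, not a model taken from the review itself.

```python
import numpy as np

# Rocchio query reformulation: move the query vector toward the centroid of
# relevant documents and away from the centroid of non-relevant ones.
def rocchio(query_vec, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    q = alpha * query_vec
    if len(relevant):
        q = q + beta * np.mean(relevant, axis=0)
    if len(nonrelevant):
        q = q - gamma * np.mean(nonrelevant, axis=0)
    return np.clip(q, 0.0, None)   # negative term weights are usually dropped

# Toy 4-term vocabulary: [retrieval, feedback, clustering, cybernetics]
q0   = np.array([1.0, 0.0, 0.0, 0.0])
rel  = np.array([[1.0, 1.0, 0.0, 0.0], [0.8, 0.9, 0.0, 0.1]])
nrel = np.array([[0.0, 0.0, 1.0, 0.0]])
print(rocchio(q0, rel, nrel))
```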
  5. Wilbur, W.J.: ¬A retrieval system based on automatic relevance weighting of search terms (1992) 0.03
    0.031938553 = product of:
      0.04790783 = sum of:
        0.03190453 = weight(_text_:of in 5269) [ClassicSimilarity], result of:
          0.03190453 = score(doc=5269,freq=16.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.39093933 = fieldWeight in 5269, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=5269)
        0.0160033 = product of:
          0.0320066 = sum of:
            0.0320066 = weight(_text_:science in 5269) [ClassicSimilarity], result of:
              0.0320066 = score(doc=5269,freq=2.0), product of:
                0.13747036 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.05218836 = queryNorm
                0.23282544 = fieldWeight in 5269, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5269)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
Describes the development of a retrieval system based on automatic relevance weighting of search terms and founded on the Bayesian formulation of the probability of relevance as a function of term occurrence, where the contributions from individual terms are assumed to be independent. The relevance pair (RP) model and the vector cosine (VC) model were compared, and in the test environment the RP model gave improved retrieval relative to the VC model
    Source
    Proceedings of the 55th Annual Meeting of the American Society for Information Science, Pittsburgh, 26.-29.10.92. Ed.: D. Shaw
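A hedged sketch of probabilistic term relevance weighting under the independence assumption mentioned above, using Robertson/Sparck Jones-style weights with the usual 0.5 smoothing; Wilbur's RP model is not reproduced here and may differ in detail.

```python
import math

def relevance_weight(r, R, n, N):
    # r: relevant docs containing the term, R: total relevant docs,
    # n: docs containing the term, N: collection size (0.5 smoothing).
    return math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                    ((R - r + 0.5) * (n - r + 0.5)))

def score(doc_terms, weights):
    # Term independence: a document's score is the sum of the weights of
    # the query terms it contains.
    return sum(weights.get(t, 0.0) for t in doc_terms)

weights = {"relevance": relevance_weight(8, 10, 50, 1000),
           "weighting": relevance_weight(5, 10, 120, 1000)}
print(score({"relevance", "weighting", "terms"}, weights))
```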
  6. Faloutsos, C.: Signature files (1992) 0.03
    0.031880446 = product of:
      0.047820665 = sum of:
        0.019537456 = weight(_text_:of in 3499) [ClassicSimilarity], result of:
          0.019537456 = score(doc=3499,freq=6.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.23940048 = fieldWeight in 3499, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
        0.028283209 = product of:
          0.056566417 = sum of:
            0.056566417 = weight(_text_:22 in 3499) [ClassicSimilarity], result of:
              0.056566417 = score(doc=3499,freq=2.0), product of:
                0.18275474 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05218836 = queryNorm
                0.30952093 = fieldWeight in 3499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3499)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
Presents a survey and discussion of signature-based text retrieval methods. It describes the main idea behind the signature approach and its advantages over other text retrieval methods; provides a classification of the signature methods that have appeared in the literature; describes the main representatives of each class, together with their relative advantages and drawbacks; and gives a list of applications as well as commercial and university prototypes that use the signature approach
    Date
    7. 5.1999 15:22:48
  7. Nakkouzi, Z.S.; Eastman, C.M.: Query formulation for handling negation in information retrieval systems (1990) 0.03
    0.030564837 = product of:
      0.045847256 = sum of:
        0.029843956 = weight(_text_:of in 3531) [ClassicSimilarity], result of:
          0.029843956 = score(doc=3531,freq=14.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.36569026 = fieldWeight in 3531, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3531)
        0.0160033 = product of:
          0.0320066 = sum of:
            0.0320066 = weight(_text_:science in 3531) [ClassicSimilarity], result of:
              0.0320066 = score(doc=3531,freq=2.0), product of:
                0.13747036 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.05218836 = queryNorm
                0.23282544 = fieldWeight in 3531, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3531)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Queries containing negation are widely recognised as presenting problems for both users and systems. In information retrieval systems such problems usually manifest themselves in the use of the NOT operator. Describes an algorithm to transform Boolean queries with negated terms into queries without negation; the transformation process is based on the use of a hierarchical thesaurus. Examines a set of user requests submitted to the Thomas Cooper Library at the University of South Carolina to determine the pattern and frequency of use of negation.
    Source
    Journal of the American Society for Information Science. 41(1990) no.3, S.171-182
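The transformation described above can be sketched as follows, assuming a toy hierarchical thesaurus: a negated term is replaced by the disjunction of its sibling (narrower) terms under the same broader term. This illustrates the general idea only; the thesaurus data and the function below are hypothetical, not the authors' algorithm.

```python
# Hypothetical broader-term -> narrower-terms thesaurus.
thesaurus = {"animals": ["cats", "dogs", "birds"]}

def siblings(term):
    for children in thesaurus.values():
        if term in children:
            return [c for c in children if c != term]
    return []

def eliminate_not(positive_terms, negated_term):
    # "animals AND NOT cats"  ->  "animals AND (dogs OR birds)"
    return {"AND": positive_terms, "OR": siblings(negated_term)}

print(eliminate_not(["animals"], "cats"))
# {'AND': ['animals'], 'OR': ['dogs', 'birds']}
```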
  8. Schamber, L.; Bateman, J.: Relevance criteria uses and importance : progress in development of a measurement scale (1999) 0.03
    0.029151216 = product of:
      0.043726824 = sum of:
        0.02675276 = weight(_text_:of in 6691) [ClassicSimilarity], result of:
          0.02675276 = score(doc=6691,freq=20.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.32781258 = fieldWeight in 6691, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=6691)
        0.016974064 = product of:
          0.033948127 = sum of:
            0.033948127 = weight(_text_:science in 6691) [ClassicSimilarity], result of:
              0.033948127 = score(doc=6691,freq=4.0), product of:
                0.13747036 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.05218836 = queryNorm
                0.24694869 = fieldWeight in 6691, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6691)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The criteria employed by end-users in making relevance judgments can be powerful and useful indicators of the values users ascribe to a variety of factors in their information seeking and use situations. This paper describes intermediate results in a long-term project intended to develop a measurement scale based on users' relevance criteria. The five tests that are reported here have involved 350 users in an effort to progressively refine and validate the scale content. The range of research questions and types of users and information environments have gradually been expanded to assess the adaptability and transferability of the instrument. The instrument provides quantitative data, notably criterion importance ratings that can be analyzed using several techniques. The substantive findings confirm those of previous studies on relevance evaluation behavior
    Series
    Proceedings of the American Society for Information Science; vol.36
    Source
    Knowledge: creation, organization and use. Proceedings of the 62nd Annual Meeting of the American Society for Information Science, 31.10.-4.11.1999. Ed.: L. Woods
  9. Frants, V.I.; Shapiro, J.: Control and feedback in a documentary information retrieval system (1991) 0.03
    0.029088955 = product of:
      0.04363343 = sum of:
        0.027630134 = weight(_text_:of in 416) [ClassicSimilarity], result of:
          0.027630134 = score(doc=416,freq=12.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.33856338 = fieldWeight in 416, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=416)
        0.0160033 = product of:
          0.0320066 = sum of:
            0.0320066 = weight(_text_:science in 416) [ClassicSimilarity], result of:
              0.0320066 = score(doc=416,freq=2.0), product of:
                0.13747036 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.05218836 = queryNorm
                0.23282544 = fieldWeight in 416, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0625 = fieldNorm(doc=416)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
The problem of control in documentary information retrieval systems is analysed, and it is shown why an IR system has to be regarded as an adaptive system. Feedback algorithms are proposed, and it is shown how they depend on the type of document collection: static (no change in the collection between searches) or dynamic (changes occur between searches). The proposed algorithms are the basis for the development of fully automated information retrieval systems
    Source
    Journal of the American Society for Information Science. 42(1991) no.9, S.623-634
  10. Chen, H.; Zhang, Y.; Houston, A.L.: Semantic indexing and searching using a Hopfield net (1998) 0.03
    0.028235972 = product of:
      0.042353958 = sum of:
        0.025379896 = weight(_text_:of in 5704) [ClassicSimilarity], result of:
          0.025379896 = score(doc=5704,freq=18.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.3109903 = fieldWeight in 5704, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=5704)
        0.016974064 = product of:
          0.033948127 = sum of:
            0.033948127 = weight(_text_:science in 5704) [ClassicSimilarity], result of:
              0.033948127 = score(doc=5704,freq=4.0), product of:
                0.13747036 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.05218836 = queryNorm
                0.24694869 = fieldWeight in 5704, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5704)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
Presents a neural network approach to document semantic indexing. Reports results of a study applying a Hopfield net algorithm to simulate human associative memory for concept exploration in the domain of computer science and engineering. The INSPEC database, consisting of 320,000 abstracts from leading periodical articles, was used as the document test bed. Benchmark tests confirmed that 3 parameters (maximum number of activated nodes, maximum allowable error, and maximum number of iterations) were useful in positively influencing network convergence behaviour without negatively impacting central processing unit performance. Another series of benchmark tests was performed to determine the effectiveness of various filtering techniques in reducing the negative impact of noisy input terms. Preliminary user tests confirmed expectations that the Hopfield net is potentially useful as an associative memory technique for improving document recall and precision by resolving discrepancies between indexer vocabularies and end-user vocabularies
    Source
    Journal of information science. 24(1998) no.1, S.3-18
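A simplified sketch of Hopfield-style spreading activation over a term association network, parameterised by the three quantities named in the abstract (maximum activated nodes, maximum allowable error, maximum iterations). The association weights, sigmoid transfer function, and thresholds below are toy assumptions, not values from the study.

```python
import numpy as np

def hopfield_search(W, seed, max_nodes=5, max_error=1e-3, max_iters=50):
    mu = seed.copy()                                   # query terms switched on
    for _ in range(max_iters):
        net = W.T @ mu                                 # spread activation
        new = 1.0 / (1.0 + np.exp(-(net - 0.5) / 0.1)) # sigmoid transfer
        new[np.argsort(new)[:-max_nodes]] = 0.0        # keep strongest nodes
        if np.sum((new - mu) ** 2) < max_error:        # converged?
            return new
        mu = new
    return mu

# Toy symmetric association weights among 4 terms.
W = np.array([[0.0, 0.8, 0.1, 0.0],
              [0.8, 0.0, 0.6, 0.2],
              [0.1, 0.6, 0.0, 0.7],
              [0.0, 0.2, 0.7, 0.0]])
seed = np.array([1.0, 0.0, 0.0, 0.0])  # activate the first query term
print(hopfield_search(W, seed))        # activation levels of associated terms
```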
  11. Kantor, P.; Kim, M.H.; Ibraev, U.; Atasoy, K.: Estimating the number of relevant documents in enormous collections (1999) 0.03
    0.02822996 = product of:
      0.04234494 = sum of:
        0.028199887 = weight(_text_:of in 6690) [ClassicSimilarity], result of:
          0.028199887 = score(doc=6690,freq=32.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.34554482 = fieldWeight in 6690, product of:
              5.656854 = tf(freq=32.0), with freq of:
                32.0 = termFreq=32.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6690)
        0.014145052 = product of:
          0.028290104 = sum of:
            0.028290104 = weight(_text_:science in 6690) [ClassicSimilarity], result of:
              0.028290104 = score(doc=6690,freq=4.0), product of:
                0.13747036 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.05218836 = queryNorm
                0.20579056 = fieldWeight in 6690, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=6690)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
In assessing information retrieval systems, it is important to know not only the precision of the retrieved set, but also to compare the number of retrieved relevant items to the total number of relevant items. For large collections, such as the TREC test collections or the World Wide Web, it is not possible to enumerate the entire set of relevant documents. If the retrieved documents are evaluated, a variant of the statistical "capture-recapture" method can be used to estimate the total number of relevant documents, provided that the several retrieval systems used are sufficiently independent. We show that the underlying signal detection model supporting such an analysis can be extended in two ways. First, assuming that there are two distinct performance characteristics (corresponding to the chance of retrieving a given relevant document and of retrieving a given non-relevant document), we show that if there are three or more independent systems available it is possible to estimate the number of relevant documents without actually having to decide whether each individual document is relevant. We report applications of this 3-system method to the TREC data, leading to the conclusion that the independence assumptions are not satisfied. We then extend the model to a multi-system, multi-problem model, and show that it is possible to include statistical dependencies of all orders in the model and to determine the number of relevant documents for each of the problems in the set. Application to the TREC setting will be presented
    Series
    Proceedings of the American Society for Information Science; vol.36
    Source
    Knowledge: creation, organization and use. Proceedings of the 62nd Annual Meeting of the American Society for Information Science, 31.10.-4.11.1999. Ed.: L. Woods
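The capture-recapture idea referred to above can be illustrated with the two-system Lincoln-Petersen estimator; the paper's 3-system extension and its treatment of statistical dependencies are not reproduced here.

```python
# Capture-recapture estimate of the total number of relevant documents,
# assuming the two systems retrieve relevant documents independently.
def lincoln_petersen(rel_a, rel_b):
    # rel_a, rel_b: sets of relevant documents retrieved by two systems.
    overlap = len(rel_a & rel_b)
    if overlap == 0:
        raise ValueError("no overlap: estimate is undefined")
    return len(rel_a) * len(rel_b) / overlap

rel_a = {"d1", "d2", "d3", "d4", "d5"}
rel_b = {"d3", "d4", "d5", "d6"}
print(lincoln_petersen(rel_a, rel_b))  # 5 * 4 / 3 ~ 6.7 relevant docs in total
```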
  12. Efthimiadis, E.N.: User choices : a new yardstick for the evaluation of ranking algorithms for interactive query expansion (1995) 0.03
    0.028065886 = product of:
      0.042098828 = sum of:
        0.02442182 = weight(_text_:of in 5697) [ClassicSimilarity], result of:
          0.02442182 = score(doc=5697,freq=24.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.2992506 = fieldWeight in 5697, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5697)
        0.017677005 = product of:
          0.03535401 = sum of:
            0.03535401 = weight(_text_:22 in 5697) [ClassicSimilarity], result of:
              0.03535401 = score(doc=5697,freq=2.0), product of:
                0.18275474 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05218836 = queryNorm
                0.19345059 = fieldWeight in 5697, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5697)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
The performance of 8 ranking algorithms was evaluated with respect to their effectiveness in ranking terms for query expansion. The evaluation was conducted within an investigation of interactive query expansion and relevance feedback in a real operational environment. Focuses on the identification of algorithms that most effectively take cognizance of user preferences. User choices (i.e. the terms selected by the searchers for the query expansion search) provided the yardstick for the evaluation of the 8 ranking algorithms. This methodology introduces a user-oriented approach to evaluating ranking algorithms for query expansion, in contrast to the standard, system-oriented approaches. Similarities in the performance of the 8 algorithms and the ways these algorithms rank terms were the main focus of this evaluation. The findings demonstrate that the r-lohi, wpq, emim, and porter algorithms have similar performance in bringing good terms to the top of a ranked list of terms for query expansion. However, further evaluation of the algorithms in different (e.g. full-text) environments is needed before these results can be generalized beyond the context of the present study
    Date
    22. 2.1996 13:14:10
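One of the algorithms named above, wpq, is commonly stated as wpq = rw * (r/R - (n - r)/(N - R)), with rw the Robertson/Sparck Jones relevance weight; the sketch below uses that common form, which may differ in detail from the variant evaluated in this study, and the candidate statistics are invented.

```python
import math

def rsj_weight(r, R, n, N):
    return math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                    ((R - r + 0.5) * (n - r + 0.5)))

def wpq(r, R, n, N):
    return rsj_weight(r, R, n, N) * (r / R - (n - r) / (N - R))

# candidate term -> (r: relevant docs containing it, n: docs containing it)
candidates = {"feedback": (6, 40), "expansion": (5, 25), "system": (7, 400)}
R, N = 10, 1000  # judged relevant docs, collection size
ranked = sorted(candidates,
                key=lambda t: wpq(candidates[t][0], R, candidates[t][1], N),
                reverse=True)
print(ranked)    # terms offered to the user for interactive query expansion
```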
  13. Joss, M.W.; Wszola, S.: ¬The engines that can : text search and retrieval software, their strategies, and vendors (1996) 0.03
    0.02795667 = product of:
      0.041935004 = sum of:
        0.020722598 = weight(_text_:of in 5123) [ClassicSimilarity], result of:
          0.020722598 = score(doc=5123,freq=12.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.25392252 = fieldWeight in 5123, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=5123)
        0.021212406 = product of:
          0.042424813 = sum of:
            0.042424813 = weight(_text_:22 in 5123) [ClassicSimilarity], result of:
              0.042424813 = score(doc=5123,freq=2.0), product of:
                0.18275474 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05218836 = queryNorm
                0.23214069 = fieldWeight in 5123, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5123)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
Traces the development of text searching and retrieval software designed to cope with the increasing demands made by the storage and handling of large amounts of data, recorded on high-capacity storage media, from CD-ROM to multi-gigabyte storage media and online information services, with particular reference to the need to cope with graphics as well as conventional ASCII text. Includes details of: Boolean searching; fuzzy searching and matching; relevance ranking; proximity searching; and improved strategies for dealing with text searching in very large databases. Concludes that the best searching tools for CD-ROM publishers are those optimized for searching and retrieval on CD-ROM. CD-ROM drives have relatively slow random seek times compared with hard discs, so the software most appropriate to the medium is that which can effectively arrange the indexes and text on the CD-ROM to avoid continuous random-access searching. Lists and reviews a selection of software packages designed to achieve the sort of results required for rapid CD-ROM searching
    Date
    12. 9.1996 13:56:22
  14. Couvreur, T.R.; Benzel, R.N.; Miller, S.F.; Zeitler, D.N.; Lee, D.L.; Singhal, M.; Shivaratri, N.; Wong, W.Y.P.: ¬An analysis of performance and cost factors in searching large text databases using parallel search systems (1994) 0.03
    0.027946234 = product of:
      0.04191935 = sum of:
        0.027916465 = weight(_text_:of in 7657) [ClassicSimilarity], result of:
          0.027916465 = score(doc=7657,freq=16.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.34207192 = fieldWeight in 7657, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7657)
        0.0140028875 = product of:
          0.028005775 = sum of:
            0.028005775 = weight(_text_:science in 7657) [ClassicSimilarity], result of:
              0.028005775 = score(doc=7657,freq=2.0), product of:
                0.13747036 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.05218836 = queryNorm
                0.20372227 = fieldWeight in 7657, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7657)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The results of modelling the performance of searching large text databases (>10 GBytes) via various parallel hardware architectures and search algorithms are discussed. The performance under load and the cost of each configuration are compared. Strengths, weaknesses, performance sensitivities, and search features supported for each configuration are also addressed. In addition, a common search workload used in the modelling is described. The search workload is derived from a set of searches run against the Chemical Abstracts file of bibliographic and abstract text available on STN International. This common workload is applied to all configurations modelled to provide a common basis of comparison
    Source
    Journal of the American Society for Information Science. 45(1994) no.7, S.443-464
  15. Wollf, J.G.: ¬A scalable technique for best-match retrieval of sequential information using metrics-guided search (1994) 0.03
    0.027946234 = product of:
      0.04191935 = sum of:
        0.027916465 = weight(_text_:of in 5334) [ClassicSimilarity], result of:
          0.027916465 = score(doc=5334,freq=16.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.34207192 = fieldWeight in 5334, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5334)
        0.0140028875 = product of:
          0.028005775 = sum of:
            0.028005775 = weight(_text_:science in 5334) [ClassicSimilarity], result of:
              0.028005775 = score(doc=5334,freq=2.0), product of:
                0.13747036 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.05218836 = queryNorm
                0.20372227 = fieldWeight in 5334, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5334)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
Describes a new technique for retrieving information by finding the best match or matches between a textual query and a textual database. The technique uses principles of beam search, with a measure of probability to guide the search and prune the search tree. Unlike many methods for comparing strings, the method gives a set of alternative matches, graded by the quality of the matching. The new technique is embodied in a software simulation, SP21, which runs on a conventional computer. Presents examples showing best-match retrieval of information from a textual database, together with analytic and empirical evidence on the performance of the technique, which lends itself well to parallel processing. Discusses planned developments
    Source
    Journal of information science. 20(1994) no.1, S.16-28
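A rough sketch of best-match retrieval by metrics-guided beam search: partial alignments of the query against each text are extended token by token and pruned to a fixed beam width, so several alternative matches survive, graded by score. The match/skip scoring below is an invented toy metric, not the probability measure used in SP21.

```python
def best_match(query, text, beam_width=3, skip_penalty=-0.25):
    # A state is (score, position reached in text) after consuming a prefix
    # of the query; only the best beam_width states are kept at each step.
    beam = [(0.0, 0)]
    for q_tok in query:
        nxt = []
        for score, pos in beam:
            for j in range(pos, len(text)):
                if text[j] == q_tok:
                    gap = j - pos
                    nxt.append((score + 1.0 + gap * skip_penalty, j + 1))
            nxt.append((score - 1.0, pos))                # query token unmatched
        beam = sorted(nxt, reverse=True)[:beam_width]     # prune to the beam
    return beam[0][0]                                     # best match quality

texts = {"t1": "best match retrieval of sequential information".split(),
         "t2": "signature files for text retrieval".split()}
query = "best retrieval".split()
print(sorted(texts, key=lambda k: best_match(query, texts[k]), reverse=True))
```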
  16. Wong, S.K.M.; Yao, Y.Y.: Query formulation in linear retrieval models (1990) 0.03
    0.02748403 = product of:
      0.041226044 = sum of:
        0.025222747 = weight(_text_:of in 3571) [ClassicSimilarity], result of:
          0.025222747 = score(doc=3571,freq=10.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.3090647 = fieldWeight in 3571, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3571)
        0.0160033 = product of:
          0.0320066 = sum of:
            0.0320066 = weight(_text_:science in 3571) [ClassicSimilarity], result of:
              0.0320066 = score(doc=3571,freq=2.0), product of:
                0.13747036 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.05218836 = queryNorm
                0.23282544 = fieldWeight in 3571, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3571)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The subject of query formulation is analysed within the framework of adaptive linear models. The study is based on the notions of user preference and an acceptable ranking strategy. A gradient descent algorithm is used to formulate the query vector by an inductive process. Presents a critical analysis of the existing relevance feedback and probabilistic approaches.
    Source
    Journal of the American Society for Information Science. 41(1990) no.5, S.334-341
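The inductive formulation of a query vector by gradient descent can be sketched as follows, assuming pairwise user preferences and a simple perceptron/hinge-style update when the acceptable ranking is violated; this illustrates the adaptive linear model idea rather than the authors' exact procedure.

```python
import numpy as np

def learn_query(pref_pairs, dim, lr=0.1, epochs=50):
    q = np.zeros(dim)
    for _ in range(epochs):
        for preferred, other in pref_pairs:
            # The acceptable ranking requires q.preferred > q.other; when the
            # constraint is violated, nudge q toward the difference vector.
            if q @ preferred <= q @ other:
                q += lr * (preferred - other)
    return q

# Toy 3-term document vectors; the user prefers d1 over d2 and d3 over d2.
d1, d2, d3 = (np.array([1.0, 0.0, 1.0]),
              np.array([0.0, 1.0, 0.0]),
              np.array([1.0, 1.0, 0.0]))
q = learn_query([(d1, d2), (d3, d2)], dim=3)
print(q, q @ d1 > q @ d2, q @ d3 > q @ d2)
```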
  17. Green, R.: Topical relevance relationships : 2: an exploratory study and preliminary typology (1995) 0.03
    0.02670734 = product of:
      0.04006101 = sum of:
        0.028058534 = weight(_text_:of in 3724) [ClassicSimilarity], result of:
          0.028058534 = score(doc=3724,freq=22.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.34381276 = fieldWeight in 3724, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=3724)
        0.012002475 = product of:
          0.02400495 = sum of:
            0.02400495 = weight(_text_:science in 3724) [ClassicSimilarity], result of:
              0.02400495 = score(doc=3724,freq=2.0), product of:
                0.13747036 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.05218836 = queryNorm
                0.17461908 = fieldWeight in 3724, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3724)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
The assumption of topic matching between user needs and texts topically relevant to those needs is often erroneous. Reports an empirical investigation of the question: what relationship types actually account for topical relevance? In order to avoid the bias of topic-matching search strategies, user needs are back-generated from a randomly selected subset of the subject headings employed in a user-oriented topical concordance. The corresponding relevant texts are those indicated in the concordance under the subject heading. Compares the topics of the user needs with the topics of the relevant texts to determine the relationships between them. Topical relevance relationships include a large variety of relationships, only some of which are matching relationships. Others are examples of paradigmatic or syntagmatic relationships. There appear to be no constraints on the kinds of relationships that can function as topical relevance relationships; they are distinguishable from other types of relationships only on functional grounds
    Source
    Journal of the American Society for Information Science. 46(1995) no.9, S.654-662
  18. Ruthven, I.; Lalmas, M.: Selective relevance feedback using term characteristics (1999) 0.03
    0.026629638 = product of:
      0.039944455 = sum of:
        0.019940332 = weight(_text_:of in 3824) [ClassicSimilarity], result of:
          0.019940332 = score(doc=3824,freq=4.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.24433708 = fieldWeight in 3824, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=3824)
        0.020004123 = product of:
          0.040008247 = sum of:
            0.040008247 = weight(_text_:science in 3824) [ClassicSimilarity], result of:
              0.040008247 = score(doc=3824,freq=2.0), product of:
                0.13747036 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.05218836 = queryNorm
                0.2910318 = fieldWeight in 3824, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3824)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Source
Vocabulary as a central concept in digital libraries: interdisciplinary concepts, challenges, and opportunities : proceedings of the Third International Conference on Conceptions of Library and Information Science (COLIS3), Dubrovnik, Croatia, 23-26 May 1999. Ed. by T. Arpanac et al
  19. Aigrain, P.; Longueville, V.: ¬A model for the evaluation of expansion techniques in information retrieval systems (1994) 0.03
    0.025836824 = product of:
      0.038755234 = sum of:
        0.02675276 = weight(_text_:of in 5331) [ClassicSimilarity], result of:
          0.02675276 = score(doc=5331,freq=20.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.32781258 = fieldWeight in 5331, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=5331)
        0.012002475 = product of:
          0.02400495 = sum of:
            0.02400495 = weight(_text_:science in 5331) [ClassicSimilarity], result of:
              0.02400495 = score(doc=5331,freq=2.0), product of:
                0.13747036 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.05218836 = queryNorm
                0.17461908 = fieldWeight in 5331, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5331)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
We describe an evaluation model for expansion systems in information retrieval, that is, systems expanding a user selection of documents in order to provide the user with a larger set of documents sharing the same or related characteristics. Our model leads to a test protocol and practical estimates of the efficiency of an expansion system, provided that it is possible for a sample of users to exhaustively scan the content of a subset of the database in order to decide which documents would have been selected by an 'ideal' expansion system. This condition is met only by databases whose unit contents can be quickly apprehended, such as still image databases or synthetic bibliographical references. We compare our model with other types of possible indicators, and discuss the precision to which our measure can be estimated, using data from experimentation with an image database system developed by our research team
    Source
    Journal of the American Society for Information Science. 45(1994) no.4, S.225-234
  20. Chang, C.-H.; Hsu, C.-C.: Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval (1998) 0.03
    0.025804028 = product of:
      0.038706042 = sum of:
        0.0139582325 = weight(_text_:of in 1319) [ClassicSimilarity], result of:
          0.0139582325 = score(doc=1319,freq=4.0), product of:
            0.08160993 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.05218836 = queryNorm
            0.17103596 = fieldWeight in 1319, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1319)
        0.024747808 = product of:
          0.049495615 = sum of:
            0.049495615 = weight(_text_:22 in 1319) [ClassicSimilarity], result of:
              0.049495615 = score(doc=1319,freq=2.0), product of:
                0.18275474 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.05218836 = queryNorm
                0.2708308 = fieldWeight in 1319, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1319)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
Keyword-based querying has been an immediate and efficient way to specify and retrieve the information the user is seeking. However, conventional document ranking based on an automatic assessment of document relevance to the query may not be the best approach when little information is given. Proposes an approach that integrates 2 existing techniques, query expansion and relevance feedback, to achieve a concept-based information search for the Web
    Date
    1. 8.1996 22:08:06
    Footnote
    Contribution to a special issue devoted to the Proceedings of the 7th International World Wide Web Conference, held 14-18 April 1998, Brisbane, Australia

Languages

  • e 91
  • chi 2

Types

  • a 84
  • m 3
  • s 3
  • p 2
  • r 2
  • el 1