Search (94 results, page 1 of 5)

  • theme_ss:"Retrievalalgorithmen"
  • year_i:[1990 TO 2000}
  1. Joss, M.W.; Wszola, S.: ¬The engines that can : text search and retrieval software, their strategies, and vendors (1996) 0.08
    0.084378034 = product of:
      0.2531341 = sum of:
        0.015556021 = weight(_text_:of in 5123) [ClassicSimilarity], result of:
          0.015556021 = score(doc=5123,freq=12.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.25392252 = fieldWeight in 5123, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=5123)
        0.08174701 = weight(_text_:software in 5123) [ClassicSimilarity], result of:
          0.08174701 = score(doc=5123,freq=8.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.525975 = fieldWeight in 5123, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.046875 = fieldNorm(doc=5123)
        0.15583107 = sum of:
          0.123983644 = weight(_text_:packages in 5123) [ClassicSimilarity], result of:
            0.123983644 = score(doc=5123,freq=2.0), product of:
              0.2706874 = queryWeight, product of:
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.03917671 = queryNorm
              0.45803255 = fieldWeight in 5123, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.046875 = fieldNorm(doc=5123)
          0.031847417 = weight(_text_:22 in 5123) [ClassicSimilarity], result of:
            0.031847417 = score(doc=5123,freq=2.0), product of:
              0.13719016 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03917671 = queryNorm
              0.23214069 = fieldWeight in 5123, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=5123)
      0.33333334 = coord(3/9)
    
    Abstract
    Traces the development of text searching and retrieval software designed to cope with the increasing demands made by the storage and handling of large amounts of data, recorded on high-capacity storage media, from CD-ROM to multi-gigabyte storage media and online information services, with particular reference to the need to cope with graphics as well as conventional ASCII text. Includes details of: Boolean searching; fuzzy searching and matching; relevance ranking; proximity searching; and improved strategies for dealing with text searching in very large databases. Concludes that the best searching tools for CD-ROM publishers are those optimized for searching and retrieval on CD-ROM. CD-ROM drives have slower random seek times than hard discs, so the software most appropriate to the medium is that which can effectively arrange the indexes and text on the CD-ROM to avoid continuous random-access searching. Lists and reviews a selection of software packages designed to achieve the sort of results required for rapid CD-ROM searching.
    Date
    12. 9.1996 13:56:22
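The score breakdown above follows Lucene's ClassicSimilarity (TF-IDF): tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), fieldWeight = tf · idf · fieldNorm, and each term contributes queryWeight · fieldWeight with queryWeight = idf · queryNorm; the per-document score is the sum of these contributions times the coord factor. A minimal sketch reproducing the `weight(_text_:of in 5123)` node from the explain tree above:

```python
import math

def classic_similarity_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Reproduce one term's score from a Lucene ClassicSimilarity explain tree."""
    tf = math.sqrt(freq)                             # 3.4641016 for freq=12
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 1.5637573 for docFreq=25162
    query_weight = idf * query_norm                  # 0.061262865
    field_weight = tf * idf * field_norm             # 0.25392252
    return query_weight * field_weight

# Values taken from the "weight(_text_:of in 5123)" node above.
score = classic_similarity_term_score(
    freq=12.0, doc_freq=25162, max_docs=44218,
    query_norm=0.03917671, field_norm=0.046875)
print(score)  # ≈ 0.015556021, matching the explain tree
```

The same formula, with the term's own docFreq and the document's fieldNorm, reproduces every leaf node in the trees on this page.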
  2. Wong, S.K.M.: On modelling information retrieval with probabilistic inference (1995) 0.04
    0.038161904 = product of:
      0.11448571 = sum of:
        0.06711562 = weight(_text_:applications in 1938) [ClassicSimilarity], result of:
          0.06711562 = score(doc=1938,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.38913056 = fieldWeight in 1938, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0625 = fieldNorm(doc=1938)
        0.014666359 = weight(_text_:of in 1938) [ClassicSimilarity], result of:
          0.014666359 = score(doc=1938,freq=6.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.23940048 = fieldWeight in 1938, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=1938)
        0.03270373 = weight(_text_:systems in 1938) [ClassicSimilarity], result of:
          0.03270373 = score(doc=1938,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.2716328 = fieldWeight in 1938, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=1938)
      0.33333334 = coord(3/9)
    
    Abstract
    Examines and extends the logical models of information retrieval in the context of probability theory and extends the applications of these fundamental ideas to term weighting and relevance. Develops a unified framework for modelling the retrieval process with probabilistic inference to provide a common conceptual and mathematical basis for many retrieval models, such as Boolean, fuzzy sets, vector space, and conventional probabilistic models. Employs this framework to identify the underlying assumptions made by each model and analyzes the inherent relationships between them. Although the treatment is primarily theoretical, practical methods for estimating the required probabilities are provided by simple examples
    Source
    ACM transactions on information systems. 13(1995) no.1, S.38-68
  3. Faloutsos, C.: Signature files (1992) 0.03
    0.034337863 = product of:
      0.10301359 = sum of:
        0.06711562 = weight(_text_:applications in 3499) [ClassicSimilarity], result of:
          0.06711562 = score(doc=3499,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.38913056 = fieldWeight in 3499, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
        0.014666359 = weight(_text_:of in 3499) [ClassicSimilarity], result of:
          0.014666359 = score(doc=3499,freq=6.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.23940048 = fieldWeight in 3499, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
        0.021231614 = product of:
          0.042463228 = sum of:
            0.042463228 = weight(_text_:22 in 3499) [ClassicSimilarity], result of:
              0.042463228 = score(doc=3499,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.30952093 = fieldWeight in 3499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3499)
          0.5 = coord(1/2)
      0.33333334 = coord(3/9)
    
    Abstract
    Presents a survey and discussion on signature-based text retrieval methods. It describes the main idea behind the signature approach and its advantages over other text retrieval methods; provides a classification of the signature methods that have appeared in the literature; describes the main representatives of each class, together with their relative advantages and drawbacks; and gives a list of applications as well as commercial and university prototypes that use the signature approach
    Date
    7. 5.1999 15:22:48
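The main idea behind the signature approach can be sketched in a few lines: superimposed coding hashes each word to a few bit positions in a fixed-width bit string, a text block's signature is the OR of its words' signatures, and a block qualifies as a candidate when every bit of the query signature is set in it. Candidates must still be verified against the text, because unrelated words can jointly set the query's bits ("false drops"). A toy sketch; the signature width and bits-per-word below are arbitrary illustrative choices, not values from the paper:

```python
import hashlib

F, M = 64, 3  # signature width in bits, bit positions set per word (illustrative)

def word_signature(word: str) -> int:
    """Superimposed coding: hash the word to M bit positions in an F-bit string."""
    sig = 0
    for i in range(M):
        h = hashlib.md5(f"{word}:{i}".encode()).digest()
        sig |= 1 << (int.from_bytes(h[:4], "big") % F)
    return sig

def block_signature(text: str) -> int:
    """A text block's signature is the OR of its words' signatures."""
    sig = 0
    for w in text.lower().split():
        sig |= word_signature(w)
    return sig

def candidates(blocks, query):
    """Blocks whose signature contains every query bit (may include false drops)."""
    q = block_signature(query)
    return [b for b in blocks if block_signature(b) & q == q]

blocks = ["signature files for text retrieval",
          "inverted files and b-trees",
          "text retrieval with signatures"]
print(candidates(blocks, "text retrieval"))
```

True matches always qualify; the point of the method is that the signature test is a cheap bitwise filter applied before any expensive text comparison.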
  4. Kantor, P.; Kim, M.H.; Ibraev, U.; Atasoy, K.: Estimating the number of relevant documents in enormous collections (1999) 0.03
    0.032839723 = product of:
      0.09851916 = sum of:
        0.041947264 = weight(_text_:applications in 6690) [ClassicSimilarity], result of:
          0.041947264 = score(doc=6690,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.2432066 = fieldWeight in 6690, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6690)
        0.021169065 = weight(_text_:of in 6690) [ClassicSimilarity], result of:
          0.021169065 = score(doc=6690,freq=32.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.34554482 = fieldWeight in 6690, product of:
              5.656854 = tf(freq=32.0), with freq of:
                32.0 = termFreq=32.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6690)
        0.03540283 = weight(_text_:systems in 6690) [ClassicSimilarity], result of:
          0.03540283 = score(doc=6690,freq=6.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.29405114 = fieldWeight in 6690, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6690)
      0.33333334 = coord(3/9)
    
    Abstract
    In assessing information retrieval systems, it is important not only to know the precision of the retrieved set, but also to compare the number of retrieved relevant items to the total number of relevant items. For large collections, such as the TREC test collections, or the World Wide Web, it is not possible to enumerate the entire set of relevant documents. If the retrieved documents are evaluated, a variant of the statistical "capture-recapture" method can be used to estimate the total number of relevant documents, provided the several retrieval systems used are sufficiently independent. We show that the underlying signal detection model supporting such an analysis can be extended in two ways. First, assuming that there are two distinct performance characteristics (corresponding to the chance of retrieving a relevant, and retrieving a given non-relevant document), we show that if there are three or more independent systems available it is possible to estimate the number of relevant documents without actually having to decide whether each individual document is relevant. We report applications of this 3-system method to the TREC data, leading to the conclusion that the independence assumptions are not satisfied. We then extend the model to a multi-system, multi-problem model, and show that it is possible to include statistical dependencies of all orders in the model, and determine the number of relevant documents for each of the problems in the set. Application to the TREC setting will be presented
    Series
    Proceedings of the American Society for Information Science; vol.36
    Source
    Knowledge: creation, organization and use. Proceedings of the 62nd Annual Meeting of the American Society for Information Science, 31.10.-4.11.1999. Ed.: L. Woods
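The two-system form of the "capture-recapture" idea mentioned in the abstract is the classical Lincoln-Petersen estimator: if two sufficiently independent systems find n1 and n2 relevant documents, with m relevant documents in common, the total number of relevant documents is estimated as n1·n2/m. A minimal sketch with made-up runs (the paper's contribution is the 3-or-more-system extension, which this does not implement):

```python
def capture_recapture_estimate(rel_a: set, rel_b: set) -> float:
    """Lincoln-Petersen estimate of the total number of relevant documents,
    given the relevant documents found by two independent retrieval runs."""
    overlap = len(rel_a & rel_b)
    if overlap == 0:
        raise ValueError("no overlap between runs: estimator undefined")
    return len(rel_a) * len(rel_b) / overlap

# Hypothetical runs: system A finds 40 relevant docs, B finds 30, 12 in common.
a = {f"d{i}" for i in range(40)}
b = {f"d{i}" for i in range(28, 58)}  # overlap d28..d39 = 12 docs
print(capture_recapture_estimate(a, b))  # 40 * 30 / 12 = 100.0
```

The estimate is only as good as the independence assumption, which is exactly what the authors find violated on the TREC data.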
  5. Keen, M.: Query reformulation in ranked output interaction (1994) 0.03
    0.031968117 = product of:
      0.09590435 = sum of:
        0.01960283 = weight(_text_:of in 1065) [ClassicSimilarity], result of:
          0.01960283 = score(doc=1065,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.31997898 = fieldWeight in 1065, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1065)
        0.028615767 = weight(_text_:systems in 1065) [ClassicSimilarity], result of:
          0.028615767 = score(doc=1065,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.23767869 = fieldWeight in 1065, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1065)
        0.047685754 = weight(_text_:software in 1065) [ClassicSimilarity], result of:
          0.047685754 = score(doc=1065,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.30681872 = fieldWeight in 1065, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1065)
      0.33333334 = coord(3/9)
    
    Abstract
    Reports on a research project to evaluate and compare Boolean searching and methods of query reformulation using ranked output retrieval. Illustrates the design and operating features of the ranked output system, called ROSE (Ranked Output Search Engine), by means of typical results obtained by searching a database of 1239 records on the subject of cystic fibrosis. Concludes that further work is needed to determine the best reformulation tactics needed to harness the professional searcher's intelligence with the much more limited intelligence provided by the search software
    Source
    Information retrieval: new systems and current research. Proceedings of the 15th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Glasgow 1993. Ed.: Ruben Leon
  6. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (1999) 0.03
    0.028135002 = product of:
      0.12660751 = sum of:
        0.010999769 = weight(_text_:of in 5777) [ClassicSimilarity], result of:
          0.010999769 = score(doc=5777,freq=6.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.17955035 = fieldWeight in 5777, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
        0.11560774 = weight(_text_:software in 5777) [ClassicSimilarity], result of:
          0.11560774 = score(doc=5777,freq=16.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.743841 = fieldWeight in 5777, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
      0.22222222 = coord(2/9)
    
    Abstract
    This book discusses many of the key design issues for building search engines and emphasizes the important role that applied mathematics can play in improving information retrieval. The authors discuss not only important data structures, algorithms, and software but also user-centered issues such as interfaces, manual indexing, and document preparation. They also present some of the current problems in information retrieval that may not be familiar to applied mathematicians and computer scientists and some of the driving computational methods (SVD, SDD) for automated conceptual indexing
    Classification
    ST 230 [Informatik # Monographien # Software und -entwicklung # Software allgemein, (Einführung, Lehrbücher, Methoden der Programmierung) Software engineering, Programmentwicklungssysteme, Softwarewerkzeuge]
    Series
    Software, environments, tools; 8
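The "driving computational methods (SVD, SDD)" mentioned in the abstract underlie latent semantic (conceptual) indexing: factor the term-document matrix A and keep only the top k singular directions, so that documents and queries are compared in a reduced "concept" space. A minimal numpy sketch; the matrix, terms, and k are invented for illustration:

```python
import numpy as np

# Toy term-document matrix (rows = terms, columns = documents; counts invented).
A = np.array([[2., 1., 0., 0.],   # "search"
              [1., 0., 1., 0.],   # "engine"
              [0., 1., 1., 0.],   # "retrieval"
              [0., 0., 0., 1.]])  # "matrix"

k = 2                                     # latent ("concept") dimensions kept
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]      # best rank-k approximation of A

# Eckart-Young: the Frobenius error of A_k equals the discarded singular values.
err = np.linalg.norm(A - A_k)
print(err, np.sqrt((s[k:] ** 2).sum()))
```

Truncation is what gives the "conceptual" behavior: documents sharing no literal terms can still be close in the k-dimensional space if they co-occur with the same vocabulary.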
  7. Kelledy, F.; Smeaton, A.F.: Signature files and beyond (1996) 0.02
    0.02111645 = product of:
      0.06334935 = sum of:
        0.022897845 = weight(_text_:of in 6973) [ClassicSimilarity], result of:
          0.022897845 = score(doc=6973,freq=26.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.37376386 = fieldWeight in 6973, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=6973)
        0.0245278 = weight(_text_:systems in 6973) [ClassicSimilarity], result of:
          0.0245278 = score(doc=6973,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.2037246 = fieldWeight in 6973, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=6973)
        0.015923709 = product of:
          0.031847417 = sum of:
            0.031847417 = weight(_text_:22 in 6973) [ClassicSimilarity], result of:
              0.031847417 = score(doc=6973,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.23214069 = fieldWeight in 6973, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6973)
          0.5 = coord(1/2)
      0.33333334 = coord(3/9)
    
    Abstract
    Proposes that signature files be used as a viable alternative to other indexing strategies such as inverted files for searching through large volumes of text. Demonstrates through simulation that search times can be further reduced by enhancing the basic signature file concept using deterministic partitioning algorithms which eliminate the need for an exhaustive search of the entire signature file. Reports research to evaluate the performance of some deterministic partitioning algorithms in a non-simulated environment using 276 MB of raw newspaper text (taken from the Wall Street Journal) and real user queries. Presents a selection of results to illustrate trends and highlight important aspects of the performance of these methods under realistic rather than simulated operating conditions. As a result of the research reported here, certain aspects of this approach to signature files are found wanting and require improvement. Suggests lines of future research on the partitioning of signature files
    Source
    Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
  8. Quint, B.: Check out the new RANK command on DIALOG (1993) 0.02
    0.020995347 = product of:
      0.09447906 = sum of:
        0.08389453 = weight(_text_:applications in 6640) [ClassicSimilarity], result of:
          0.08389453 = score(doc=6640,freq=2.0), product of:
            0.17247584 = queryWeight, product of:
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.03917671 = queryNorm
            0.4864132 = fieldWeight in 6640, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4025097 = idf(docFreq=1471, maxDocs=44218)
              0.078125 = fieldNorm(doc=6640)
        0.010584532 = weight(_text_:of in 6640) [ClassicSimilarity], result of:
          0.010584532 = score(doc=6640,freq=2.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.17277241 = fieldWeight in 6640, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=6640)
      0.22222222 = coord(2/9)
    
    Abstract
    Describes the RANK command on DIALOG online information service. RANK conducts statistical analysis on an existing set of search results for fields specified by searchers. Details how to use RANK and applications. Points out drawbacks to its use
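Functionally, a command like RANK tallies the values of a chosen field across the records of an existing result set and lists them by frequency. A minimal sketch of that idea; the record layout and field tags below are invented, not DIALOG's actual format:

```python
from collections import Counter

# Hypothetical result set: each record maps a field tag to its value(s).
results = [
    {"AU": ["Smith, J.", "Lee, K."], "JN": "JASIS"},
    {"AU": ["Smith, J."], "JN": "J. Doc."},
    {"AU": ["Park, H."], "JN": "JASIS"},
]

def rank(records, field):
    """Count each value of `field` over the result set, most frequent first."""
    counts = Counter()
    for rec in records:
        values = rec.get(field, [])
        counts.update(values if isinstance(values, list) else [values])
    return counts.most_common()

print(rank(results, "AU"))  # [('Smith, J.', 2), ('Lee, K.', 1), ('Park, H.', 1)]
```

The drawback the abstract alludes to is inherent to this design: the analysis is only over the records already retrieved, so it inherits any bias of the original search.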
  9. Willett, P.: Best-match text retrieval (1993) 0.02
    0.019808581 = product of:
      0.08913861 = sum of:
        0.018332949 = weight(_text_:of in 7818) [ClassicSimilarity], result of:
          0.018332949 = score(doc=7818,freq=6.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.2992506 = fieldWeight in 7818, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=7818)
        0.07080566 = weight(_text_:systems in 7818) [ClassicSimilarity], result of:
          0.07080566 = score(doc=7818,freq=6.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.5881023 = fieldWeight in 7818, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.078125 = fieldNorm(doc=7818)
      0.22222222 = coord(2/9)
    
    Abstract
    Provides an introduction to the computational techniques that underlie best match searching retrieval systems. Discusses: problems of traditional Boolean systems; characteristics of best-match searching; automatic indexing; term conflation; matching of documents and queries (dealing with similarity measures, initial weights, relevance weights, and the matching algorithm); and describes operational best-match systems
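The matching pipeline sketched in the abstract (automatic indexing, term weights, a similarity measure, a ranking step) can be illustrated with a basic tf-idf/cosine best-match search. The weighting scheme below is one common choice used for illustration, not necessarily the one Willett discusses:

```python
import math
from collections import Counter

docs = ["best match text retrieval",
        "boolean retrieval systems",
        "text searching software"]

def tfidf_vectors(texts):
    """One simple weighting: w = tf * log(N / df), over whitespace tokens."""
    tokenized = [t.split() for t in texts]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    return [{t: c * math.log(n / df[t]) for t, c in Counter(toks).items()}
            for toks in tokenized]

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def best_match(query, texts):
    """Rank all documents by similarity to the query (no Boolean filtering)."""
    vecs = tfidf_vectors(texts + [query])   # weight query with the same scheme
    qv, dvs = vecs[-1], vecs[:-1]
    return sorted(range(len(texts)), key=lambda i: cosine(qv, dvs[i]), reverse=True)

print(best_match("text retrieval", docs))  # document 0 ranks first
```

Unlike a Boolean system, every document gets a score, so partial matches are ranked rather than discarded.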
  10. Chang, C.-H.; Hsu, C.-C.: Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval (1998) 0.02
    0.019223861 = product of:
      0.05767158 = sum of:
        0.010478153 = weight(_text_:of in 1319) [ClassicSimilarity], result of:
          0.010478153 = score(doc=1319,freq=4.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.17103596 = fieldWeight in 1319, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1319)
        0.028615767 = weight(_text_:systems in 1319) [ClassicSimilarity], result of:
          0.028615767 = score(doc=1319,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.23767869 = fieldWeight in 1319, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1319)
        0.018577661 = product of:
          0.037155323 = sum of:
            0.037155323 = weight(_text_:22 in 1319) [ClassicSimilarity], result of:
              0.037155323 = score(doc=1319,freq=2.0), product of:
                0.13719016 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03917671 = queryNorm
                0.2708308 = fieldWeight in 1319, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1319)
          0.5 = coord(1/2)
      0.33333334 = coord(3/9)
    
    Abstract
    Keyword-based querying has been an immediate and efficient way to specify and retrieve the information a user requires. However, conventional document ranking based on an automatic assessment of document relevance to the query may not be the best approach when little information is given. Proposes integrating two existing techniques, query expansion and relevance feedback, to achieve a concept-based information search for the Web
    Date
    1. 8.1996 22:08:06
    Footnote
    Contribution to a special issue devoted to the Proceedings of the 7th International World Wide Web Conference, held 14-18 April 1998, Brisbane, Australia
    Source
    Computer networks and ISDN systems. 30(1998) nos.1/7, S.621-623
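The two techniques being integrated here, query expansion and relevance feedback, are classically combined in Rocchio's formula: q' = α·q + β·mean(relevant) − γ·mean(non-relevant), where terms that gain positive weight expand the query. A sketch of that standard method, with conventional default parameters; this is not the authors' conceptual-feedback algorithm:

```python
from collections import defaultdict

def rocchio(query_vec, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio feedback: move the query toward relevant docs, away from
    non-relevant ones; terms with positive final weight expand the query."""
    new_q = defaultdict(float)
    for t, w in query_vec.items():
        new_q[t] += alpha * w
    for docs, coef in ((relevant, beta), (nonrelevant, -gamma)):
        if not docs:
            continue
        for d in docs:
            for t, w in d.items():
                new_q[t] += coef * w / len(docs)
    return {t: w for t, w in new_q.items() if w > 0}  # drop negative weights

q = {"web": 1.0, "search": 1.0}
rel = [{"web": 1.0, "search": 1.0, "personalized": 1.0}]  # user-judged relevant
nonrel = [{"web": 1.0, "spam": 2.0}]                      # judged non-relevant
print(rocchio(q, rel, nonrel))  # "personalized" enters the query, "spam" does not
```

The conceptual variant in the paper replaces raw term vectors with concept-level representations, but the feedback arithmetic is of this general shape.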
  11. Nakkouzi, Z.S.; Eastman, C.M.: Query formulation for handling negation in information retrieval systems (1990) 0.02
    0.017566169 = product of:
      0.07904776 = sum of:
        0.022403233 = weight(_text_:of in 3531) [ClassicSimilarity], result of:
          0.022403233 = score(doc=3531,freq=14.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.36569026 = fieldWeight in 3531, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=3531)
        0.05664453 = weight(_text_:systems in 3531) [ClassicSimilarity], result of:
          0.05664453 = score(doc=3531,freq=6.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.4704818 = fieldWeight in 3531, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=3531)
      0.22222222 = coord(2/9)
    
    Abstract
    Queries containing negation are widely recognised as presenting problems for both users and systems. In information retrieval systems such problems usually manifest themselves in the use of the NOT operator. Describes an algorithm to transform Boolean queries with negated terms into queries without negation; the transformation process is based on the use of a hierarchical thesaurus. Examines a set of user requests submitted to the Thomas Cooper Library at the University of South Carolina to determine the pattern and frequency of use of negation.
    Source
    Journal of the American Society for Information Science. 41(1990) no.3, S.171-182
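One way to realize the transformation described above: within a hierarchical thesaurus, a query like `parent AND NOT child` can be rewritten without negation as the disjunction of the child's siblings, assuming the children partition the parent concept. A sketch of that idea; the thesaurus and the exact rewriting rule are illustrative, not the authors' algorithm:

```python
# Toy hierarchical thesaurus: each parent lists children that partition it.
thesaurus = {
    "pets": ["dogs", "cats", "birds"],
}
parent_of = {c: p for p, cs in thesaurus.items() for c in cs}

def remove_negation(positive_term: str, negated_term: str):
    """Rewrite `positive AND NOT negated` without NOT, using the siblings of
    the negated term in the thesaurus. Returns None if the rule doesn't apply."""
    parent = parent_of.get(negated_term)
    if parent != positive_term:
        return None
    siblings = [t for t in thesaurus[parent] if t != negated_term]
    return " OR ".join(siblings)

print(remove_negation("pets", "dogs"))  # "cats OR birds"
```

The rewrite is only sound when the sibling set really exhausts the parent, which is why the paper grounds the process in a hierarchical thesaurus rather than free vocabulary.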
  12. Chang, R.: ¬The development of indexing technology (1993) 0.02
    0.0153698595 = product of:
      0.069164366 = sum of:
        0.014666359 = weight(_text_:of in 7024) [ClassicSimilarity], result of:
          0.014666359 = score(doc=7024,freq=6.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.23940048 = fieldWeight in 7024, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=7024)
        0.054498006 = weight(_text_:software in 7024) [ClassicSimilarity], result of:
          0.054498006 = score(doc=7024,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.35064998 = fieldWeight in 7024, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0625 = fieldNorm(doc=7024)
      0.22222222 = coord(2/9)
    
    Abstract
    Reviews the basic techniques of computerized indexing, including various file accessing methods such as: Sequential Access Method (SAM); Direct Access Method (DAM); Indexed Sequential Access Method (ISAM), and Virtual Indexed Sequential Access Method (VSAM); and various B-tree (balanced tree) structures. Illustrates how records are stored and accessed, and how B-trees are used for improving the operations of information retrieval and maintenance
    Source
    Library software review. 12(1993) no.3, S.30-35
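The access methods reviewed above differ mainly in how a key locates a record. A compact stand-in for indexed sequential access: keep records in key order, hold a sparse index of each block's first key, binary-search the index, then scan one block. Python's `bisect` plays the role of the index search here; a real ISAM or B-tree implementation pages these structures to disk:

```python
from bisect import bisect_right

class ISAMTable:
    """Toy indexed sequential file: sorted blocks plus a sparse index of
    first keys (the 'index' part of Indexed Sequential Access Method)."""
    def __init__(self, records, block_size=4):
        records = sorted(records)                       # sequential part: key order
        self.blocks = [records[i:i + block_size]
                       for i in range(0, len(records), block_size)]
        self.index = [blk[0][0] for blk in self.blocks]  # first key of each block

    def get(self, key):
        if not self.blocks:
            return None
        i = max(bisect_right(self.index, key) - 1, 0)   # index lookup -> block
        for k, v in self.blocks[i]:                     # short scan inside block
            if k == key:
                return v
        return None

t = ISAMTable([(k, f"rec{k}") for k in range(0, 20, 2)])  # keys 0, 2, ..., 18
print(t.get(6), t.get(7))  # "rec6" and None
```

A B-tree generalizes this by making the index itself multi-level and self-balancing under insertions, which is what makes it suit index maintenance as well as retrieval.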
  13. Wolff, J.G.: ¬A scalable technique for best-match retrieval of sequential information using metrics-guided search (1994) 0.02
    0.015253791 = product of:
      0.06864206 = sum of:
        0.020956306 = weight(_text_:of in 5334) [ClassicSimilarity], result of:
          0.020956306 = score(doc=5334,freq=16.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.34207192 = fieldWeight in 5334, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5334)
        0.047685754 = weight(_text_:software in 5334) [ClassicSimilarity], result of:
          0.047685754 = score(doc=5334,freq=2.0), product of:
            0.15541996 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03917671 = queryNorm
            0.30681872 = fieldWeight in 5334, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5334)
      0.22222222 = coord(2/9)
    
    Abstract
    Describes a new technique for retrieving information by finding the best match or matches between a textual query and a textual database. The technique uses principles of beam search, with a measure of probability to guide the search and prune the search tree. Unlike many methods for comparing strings, the method gives a set of alternative matches, graded by the quality of the matching. The new technique is embodied in a software simulation, SP21, which runs on a conventional computer. Presents examples showing best-match retrieval of information from a textual database. Presents analytic and empirical evidence on the performance of the technique. It lends itself well to parallel processing. Discusses planned developments
    Source
    Journal of information science. 20(1994) no.1, S.16-28
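The metrics-guided beam search described in the abstract can be pictured with a toy sketch (this is an illustration of the general technique, not Wolff's SP21): partial alignments between query and text are expanded step by step, and at each step only the highest-scoring candidates are kept, so the result is a set of alternative matches graded by quality. All function names are assumptions for the example.

```python
def beam_match(query, text, beam_width=5):
    """Score how well `query` aligns within `text` via beam search.

    Each state is (query index, text index, chars matched); at every
    step the beam is pruned to the `beam_width` highest-scoring partial
    alignments, so memory stays bounded regardless of text length.
    """
    beam = [(0, 0, 0)]
    best = 0
    while beam:
        successors = []
        for qi, ti, matched in beam:
            if qi == len(query) or ti == len(text):
                best = max(best, matched)   # a complete alignment
                continue
            if query[qi] == text[ti]:       # extend a match
                successors.append((qi + 1, ti + 1, matched + 1))
            successors.append((qi, ti + 1, matched))  # skip a text char
            successors.append((qi + 1, ti, matched))  # skip a query char
        # prune the search tree: keep only the most promising alignments
        successors.sort(key=lambda s: s[2], reverse=True)
        beam = successors[:beam_width]
    return best

def best_matches(query, database, top_n=3):
    """Return database entries graded by alignment quality, best first."""
    scored = [(beam_match(query, t), t) for t in database]
    scored.sort(key=lambda s: s[0], reverse=True)
    return scored[:top_n]
```

Because pruning is heuristic, a very narrow beam can miss the optimal alignment; widening `beam_width` trades time for match quality.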
  14. Frants, V.I.; Shapiro, J.: Control and feedback in a documentary information retrieval system (1991) 0.01
    0.014886984 = product of:
      0.066991426 = sum of:
        0.020741362 = weight(_text_:of in 416) [ClassicSimilarity], result of:
          0.020741362 = score(doc=416,freq=12.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.33856338 = fieldWeight in 416, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=416)
        0.046250064 = weight(_text_:systems in 416) [ClassicSimilarity], result of:
          0.046250064 = score(doc=416,freq=4.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.38414678 = fieldWeight in 416, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=416)
      0.22222222 = coord(2/9)
    
    Abstract
    Analyses the problem of control in documentary information retrieval systems and shows why an IR system has to be regarded as an adaptive system. Proposes feedback algorithms and shows how they depend on the type of document collection: static (no change in the collection between searches) or dynamic (the collection changes between searches). The proposed algorithms form the basis for the development of fully automated information retrieval systems
    Source
    Journal of the American Society for Information Science. 42(1991) no.9, S.623-634
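A feedback loop of the kind the abstract describes can be made concrete with a Rocchio-style reformulation step. This is a generic illustration of relevance feedback, not the authors' algorithms: user judgements pull the query vector toward relevant documents and away from non-relevant ones, and the reformulated query drives the next search.

```python
from collections import Counter

def rocchio_feedback(query_vec, relevant, nonrelevant,
                     alpha=1.0, beta=0.75, gamma=0.15):
    """One feedback iteration (Rocchio-style sketch): move the query
    vector toward the centroid of judged-relevant documents and away
    from the centroid of judged-non-relevant ones.

    All vectors are {term: weight} dicts; the weights alpha/beta/gamma
    are conventional defaults, not values from the paper.
    """
    new = Counter()
    for term, weight in query_vec.items():
        new[term] += alpha * weight
    for doc in relevant:
        for term, weight in doc.items():
            new[term] += beta * weight / len(relevant)
    for doc in nonrelevant:
        for term, weight in doc.items():
            new[term] -= gamma * weight / len(nonrelevant)
    # negative weights are dropped, as is customary
    return {term: w for term, w in new.items() if w > 0}
```

In a dynamic collection, the loop would simply be re-run after each search as new documents (and new judgements) arrive.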
  15. Loughran, H.: ¬A review of nearest neighbour information retrieval (1994) 0.01
    0.01484586 = product of:
      0.06680637 = sum of:
        0.025926704 = weight(_text_:of in 616) [ClassicSimilarity], result of:
          0.025926704 = score(doc=616,freq=12.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.42320424 = fieldWeight in 616, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=616)
        0.040879667 = weight(_text_:systems in 616) [ClassicSimilarity], result of:
          0.040879667 = score(doc=616,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.339541 = fieldWeight in 616, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.078125 = fieldNorm(doc=616)
      0.22222222 = coord(2/9)
    
    Abstract
    Explains the concept of 'nearest neighbour' searching, also known as best match or ranked output, which it is claimed can overcome many of the inadequacies of traditional Boolean methods. Also points to some of the limitations. Identifies a number of commercial information retrieval systems which feature this search technique
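A minimal sketch of nearest-neighbour (ranked-output) searching, assuming simple term-frequency vectors and cosine similarity: every document receives a graded score against the query, instead of the all-or-nothing partition a Boolean search produces. The representation and similarity measure are illustrative choices, not taken from the review.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def nearest_neighbours(query, docs):
    """Rank documents by similarity to the query, best match first."""
    qv = Counter(query.lower().split())
    ranked = [(cosine(qv, Counter(d.lower().split())), d) for d in docs]
    ranked.sort(key=lambda p: (-p[0], p[1]))
    return ranked
```

One limitation the review points to survives even in this toy: documents sharing no terms with the query score zero, however relevant they may be.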
  16. Smith, M.; Smith, M.P.; Wade, S.J.: Applying genetic programming to the problem of term weight algorithms (1995) 0.01
    0.014485389 = product of:
      0.06518425 = sum of:
        0.018934188 = weight(_text_:of in 5803) [ClassicSimilarity], result of:
          0.018934188 = score(doc=5803,freq=10.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.3090647 = fieldWeight in 5803, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=5803)
        0.046250064 = weight(_text_:systems in 5803) [ClassicSimilarity], result of:
          0.046250064 = score(doc=5803,freq=4.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.38414678 = fieldWeight in 5803, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=5803)
      0.22222222 = coord(2/9)
    
    Abstract
    Presents the results of an initial study on the application of Genetic Programming (GP) to the production of term weighting algorithms in relevance feedback systems within information retrieval systems. Compares the Porter, wpq and GP algorithms with user rankings. Offers a background to term weighting algorithms and Genetic Programming
    Source
    New review of document and text management. 1995, no.1, S.101-110
  17. Pfeifer, U.; Pennekamp, S.: Incremental processing of vague queries in interactive retrieval systems (1997) 0.01
    0.014485389 = product of:
      0.06518425 = sum of:
        0.018934188 = weight(_text_:of in 735) [ClassicSimilarity], result of:
          0.018934188 = score(doc=735,freq=10.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.3090647 = fieldWeight in 735, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.0625 = fieldNorm(doc=735)
        0.046250064 = weight(_text_:systems in 735) [ClassicSimilarity], result of:
          0.046250064 = score(doc=735,freq=4.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.38414678 = fieldWeight in 735, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.0625 = fieldNorm(doc=735)
      0.22222222 = coord(2/9)
    
    Abstract
    The application of information retrieval techniques in interactive environments requires systems capable of efficiently processing vague queries. To reach reasonable response times, new data structures and algorithms have to be developed. In this paper we describe an approach that takes advantage of the conditions of interactive usage and of special access paths. As a reference we investigated text queries and compared our algorithms to the well-known 'Buckley/Lewit' algorithm, achieving significant improvements in response times
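The incremental idea can be illustrated with a term-at-a-time sketch (an illustration of the general strategy, not the authors' data structures): after each query term is merged into the score accumulators, the current preliminary ranking is yielded, so an interactive front end can display results before evaluation is complete.

```python
def incremental_ranking(query_terms, postings):
    """Yield a refined ranking after each query term is processed.

    `query_terms` is {term: weight}; `postings` is an assumed inverted
    file of the form {term: {doc_id: tf_weight}}. Terms are taken in
    decreasing weight order so early rankings are already informative.
    """
    acc = {}
    for term, weight in sorted(query_terms.items(), key=lambda p: -p[1]):
        for doc, tf in postings.get(term, {}).items():
            acc[doc] = acc.get(doc, 0.0) + weight * tf
        # hand the interactive front end a preliminary ranking
        yield sorted(acc.items(), key=lambda p: (-p[1], p[0]))
```

A production system would add early-termination bounds (as in the Buckley/Lewit family of optimisations) rather than always scanning every posting list.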
  18. Savoy, J.: Ranking schemes in hybrid Boolean systems : a new approach (1997) 0.01
    0.014056942 = product of:
      0.06325624 = sum of:
        0.014200641 = weight(_text_:of in 393) [ClassicSimilarity], result of:
          0.014200641 = score(doc=393,freq=10.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.23179851 = fieldWeight in 393, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=393)
        0.0490556 = weight(_text_:systems in 393) [ClassicSimilarity], result of:
          0.0490556 = score(doc=393,freq=8.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.4074492 = fieldWeight in 393, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=393)
      0.22222222 = coord(2/9)
    
    Abstract
    In most commercial online systems, the retrieval system is based on the Boolean model and its inverted file organization. Since the investment in these systems is so great and changing them could be economically unfeasible, this article suggests a new ranking scheme especially adapted for hypertext environments in order to produce more effective retrieval results while maintaining the value of the investment made to date in the Boolean model. To select the retrieved documents, the suggested ranking strategy uses multiple sources of document content evidence. The proposed scheme integrates both the information provided by the index and query terms, and the inherent relationships between documents such as bibliographic references or hypertext links. We demonstrate that our scheme represents an integration of both subject and citation indexing, and results in a significant improvement over classical ranking schemes used in hybrid Boolean systems, while preserving its efficiency. Moreover, knowing the nearest neighbours and the hypertext links which constitute additional sources of evidence, our strategy takes them into account in order to further improve retrieval effectiveness and to provide 'good' starting points for browsing in a hypertext or hypermedia environment
    Source
    Journal of the American Society for Information Science. 48(1997) no.3, S.235-253
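One way to picture such a hybrid scheme (an illustrative sketch, not Savoy's actual formula): the Boolean filter still selects the candidate set, and the ranking then combines index-term evidence with link evidence, a document inheriting part of the score of documents that cite or link to it. The `link_weight` parameter and the inheritance rule are assumptions for the example.

```python
def hybrid_rank(boolean_hits, term_scores, links, link_weight=0.5):
    """Rank only the documents admitted by the Boolean filter.

    `boolean_hits`: set of doc ids passing the Boolean query;
    `term_scores`: {doc_id: index-term evidence score};
    `links`: {doc_id: [cited/linked doc ids]}.
    For simplicity, link evidence is propagated once, from each
    source's pre-link score.
    """
    base = {d: term_scores.get(d, 0.0) for d in boolean_hits}
    combined = dict(base)
    for src, targets in links.items():
        for dst in targets:
            if dst in combined:            # evidence flows along the link
                combined[dst] += link_weight * base.get(src, 0.0)
    return sorted(combined.items(), key=lambda p: (-p[1], p[0]))
```

The point of the design is visible even in the toy: `d3` may have the best term score, but if it fails the Boolean filter it is never ranked, preserving the semantics of the installed system.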
  19. Aigrain, P.; Longueville, V.: ¬A model for the evaluation of expansion techniques in information retrieval systems (1994) 0.01
    0.013903585 = product of:
      0.06256613 = sum of:
        0.020082738 = weight(_text_:of in 5331) [ClassicSimilarity], result of:
          0.020082738 = score(doc=5331,freq=20.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.32781258 = fieldWeight in 5331, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.046875 = fieldNorm(doc=5331)
        0.042483397 = weight(_text_:systems in 5331) [ClassicSimilarity], result of:
          0.042483397 = score(doc=5331,freq=6.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.35286134 = fieldWeight in 5331, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.046875 = fieldNorm(doc=5331)
      0.22222222 = coord(2/9)
    
    Abstract
    We describe an evaluation model for expansion systems in information retrieval, that is, systems expanding a user selection of documents in order to provide the user with a larger set of documents sharing the same or related characteristics. Our model leads to a test protocol and practical estimates of the efficiency of an expansion system, provided that it is possible for a sample of users to exhaustively scan the content of a subset of the database in order to decide which documents would have been selected by an 'ideal' expansion system. This condition is met only by databases whose unit contents can be quickly apprehended, such as still image databases or synthetic bibliographical references. We compare our model with other types of possible indicators, and discuss the precision to which our measure can be estimated, using data from experimentation with an image database system developed by our research team
    Source
    Journal of the American Society for Information Science. 45(1994) no.4, S.225-234
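Once users have exhaustively scanned a subset and fixed the 'ideal' expansion, comparing a real expansion system against it reduces to set overlap; a minimal sketch of such an estimate (the function name and the precision/recall framing are assumptions, not the paper's exact indicator):

```python
def expansion_efficiency(expanded, ideal):
    """Precision and recall of an expansion system's output against the
    'ideal' expansion established by exhaustive user scanning."""
    expanded, ideal = set(expanded), set(ideal)
    hits = len(expanded & ideal)
    precision = hits / len(expanded) if expanded else 0.0
    recall = hits / len(ideal) if ideal else 0.0
    return precision, recall
```

The protocol's restriction to quickly apprehended items (still images, short references) is what makes the exhaustive scan, and hence the gold standard, feasible at all.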
  20. Zhang, W.; Korf, R.E.: Performance of linear-space search algorithms (1995) 0.01
    0.012410768 = product of:
      0.055848457 = sum of:
        0.014968789 = weight(_text_:of in 4744) [ClassicSimilarity], result of:
          0.014968789 = score(doc=4744,freq=4.0), product of:
            0.061262865 = queryWeight, product of:
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.03917671 = queryNorm
            0.24433708 = fieldWeight in 4744, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.5637573 = idf(docFreq=25162, maxDocs=44218)
              0.078125 = fieldNorm(doc=4744)
        0.040879667 = weight(_text_:systems in 4744) [ClassicSimilarity], result of:
          0.040879667 = score(doc=4744,freq=2.0), product of:
            0.12039685 = queryWeight, product of:
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.03917671 = queryNorm
            0.339541 = fieldWeight in 4744, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.0731742 = idf(docFreq=5561, maxDocs=44218)
              0.078125 = fieldNorm(doc=4744)
      0.22222222 = coord(2/9)
    
    Abstract
    Search algorithms in artificial intelligence systems that use space linear in the search depth are employed in practice to solve difficult problems optimally, such as planning and scheduling. Studies the average-case performance of linear-space search algorithms, including depth-first branch-and-bound, iterative-deepening, and recursive best-first search
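Depth-first branch-and-bound, one of the linear-space algorithms the paper studies, can be sketched as a recursive tree search whose memory use is proportional to the search depth: a branch is abandoned as soon as its accumulated cost can no longer beat the best complete solution found so far. This toy uses an explicit tree in place of a planning or scheduling problem.

```python
def dfbnb(node, children, cost, best=float("inf"), path_cost=0.0):
    """Depth-first branch-and-bound for the cheapest root-to-leaf path.

    `children(node)` returns the successors of a node; `cost(node)`
    its (non-negative) cost. Space is O(depth): only the current path
    and the incumbent `best` are kept.
    """
    path_cost += cost(node)
    if path_cost >= best:          # prune: cannot improve the incumbent
        return best
    kids = children(node)
    if not kids:                   # leaf: a complete solution
        return path_cost
    for child in kids:
        best = dfbnb(child, children, cost, best, path_cost)
    return best
```

Iterative-deepening variants achieve the same linear-space bound by repeated bounded depth-first passes instead of an incumbent solution.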

Languages

  • e 91
  • chi 2
  • d 1

Types

  • a 85
  • m 3
  • s 3
  • p 2
  • r 2
  • el 1
  • More… Less…