Search (111 results, page 1 of 6)

  • × theme_ss:"Retrievalalgorithmen"
  1. Soulier, L.; Jabeur, L.B.; Tamine, L.; Bahsoun, W.: On ranking relevant entities in heterogeneous networks using a language-based model (2013) 0.05
    0.05242471 = product of:
      0.13106178 = sum of:
        0.06534132 = weight(_text_:bibliographic in 664) [ClassicSimilarity], result of:
          0.06534132 = score(doc=664,freq=6.0), product of:
            0.17541347 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.04505818 = queryNorm
            0.3724989 = fieldWeight in 664, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0390625 = fieldNorm(doc=664)
        0.065720454 = sum of:
          0.035196647 = weight(_text_:data in 664) [ClassicSimilarity], result of:
            0.035196647 = score(doc=664,freq=4.0), product of:
              0.14247625 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.04505818 = queryNorm
              0.24703519 = fieldWeight in 664, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0390625 = fieldNorm(doc=664)
          0.030523809 = weight(_text_:22 in 664) [ClassicSimilarity], result of:
            0.030523809 = score(doc=664,freq=2.0), product of:
              0.15778607 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04505818 = queryNorm
              0.19345059 = fieldWeight in 664, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=664)
      0.4 = coord(2/5)
    
    Abstract
    A new challenge, accessing multiple relevant entities, arises from the availability of linked heterogeneous data. In this article, we address more specifically the problem of accessing relevant entities, such as publications and authors within a bibliographic network, given an information need. We propose a novel algorithm, called BibRank, that estimates a joint relevance of documents and authors within a bibliographic network. This model ranks each type of entity using a score propagation algorithm with respect to the query topic and the structure of the underlying bi-type information entity network. Evidence sources, namely content-based and network-based scores, are both used to estimate the topical similarity between connected entities. For this purpose, authorship relationships are analyzed through a language model-based score on the one hand and on the other hand, non topically related entities of the same type are detected through marginal citations. The article reports the results of experiments using the Bibrank algorithm for an information retrieval task. The CiteSeerX bibliographic data set forms the basis for the topical query automatic generation and evaluation. We show that a statistically significant improvement over closely related ranking models is achieved.
    Date
    22. 3.2013 19:34:49
  2. Carpineto, C.; Romano, G.: Information retrieval through hybrid navigation of lattice representations (1996) 0.03
    0.028094485 = product of:
      0.07023621 = sum of:
        0.052814763 = weight(_text_:bibliographic in 7434) [ClassicSimilarity], result of:
          0.052814763 = score(doc=7434,freq=2.0), product of:
            0.17541347 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.04505818 = queryNorm
            0.30108726 = fieldWeight in 7434, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7434)
        0.01742145 = product of:
          0.0348429 = sum of:
            0.0348429 = weight(_text_:data in 7434) [ClassicSimilarity], result of:
              0.0348429 = score(doc=7434,freq=2.0), product of:
                0.14247625 = queryWeight, product of:
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.04505818 = queryNorm
                0.24455236 = fieldWeight in 7434, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7434)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Presents a comprehensive approach to automatic organization and hybrid navigation of text databases. An organizing stage builds a particular lattice representation of the data, through text indexing followed by lattice clustering of the indexed texts. The lattice representation supports the navigation state of the system, a visual retrieval interface that combines 3 main retrieval strategies: browsing, querying, and bounding. Such a hybrid paradigm permits high flexibility in trading off information exploration and retrieval, and had good retrieval performance. Compares information retrieval using lattice-based hybrid navigation with conventional Boolean querying. Experiments conducted on 2 medium-sized bibliographic databases showed that the performance of lattice retrieval was comparable to or better than Boolean retrieval
  3. Schiminovich, S.: Automatic classification and retrieval of documents by means of a bibliographic pattern discovery algorithm (1971) 0.02
    0.021125905 = product of:
      0.105629526 = sum of:
        0.105629526 = weight(_text_:bibliographic in 4846) [ClassicSimilarity], result of:
          0.105629526 = score(doc=4846,freq=2.0), product of:
            0.17541347 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.04505818 = queryNorm
            0.6021745 = fieldWeight in 4846, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.109375 = fieldNorm(doc=4846)
      0.2 = coord(1/5)
    
  4. Aho, A.; Corasick, M.: Efficient string matching : an aid to bibliographic search (1975) 0.02
    0.021125905 = product of:
      0.105629526 = sum of:
        0.105629526 = weight(_text_:bibliographic in 3506) [ClassicSimilarity], result of:
          0.105629526 = score(doc=3506,freq=2.0), product of:
            0.17541347 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.04505818 = queryNorm
            0.6021745 = fieldWeight in 3506, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.109375 = fieldNorm(doc=3506)
      0.2 = coord(1/5)
    
  5. Lewandowski, D.: How can library materials be ranked in the OPAC? (2009) 0.02
    0.020067489 = product of:
      0.050168723 = sum of:
        0.03772483 = weight(_text_:bibliographic in 2810) [ClassicSimilarity], result of:
          0.03772483 = score(doc=2810,freq=2.0), product of:
            0.17541347 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.04505818 = queryNorm
            0.21506234 = fieldWeight in 2810, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2810)
        0.012443894 = product of:
          0.024887787 = sum of:
            0.024887787 = weight(_text_:data in 2810) [ClassicSimilarity], result of:
              0.024887787 = score(doc=2810,freq=2.0), product of:
                0.14247625 = queryWeight, product of:
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.04505818 = queryNorm
                0.17468026 = fieldWeight in 2810, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2810)
          0.5 = coord(1/2)
      0.4 = coord(2/5)
    
    Abstract
    Some Online Public Access Catalogues offer a ranking component. However, ranking there is merely text-based and is doomed to fail due to limited text in bibliographic data. The main assumption for the talk is that we are in a situation where the appropriate ranking factors for OPACs should be defined, while the implementation is no major problem. We must define what we want, and not so much focus on the technical work. Some deep thinking is necessary on the "perfect results set" and how we can achieve it through ranking. The talk presents a set of potential ranking factors and clustering possibilities for further discussion. A look at commercial Web search engines could provide us with ideas how ranking can be improved with additional factors. Search engines are way beyond pure text-based ranking and apply ranking factors in the groups like popularity, freshness, personalisation, etc. The talk describes the main factors used in search engines and how derivatives of these could be used for libraries' purposes. The goal of ranking is to provide the user with the best-suitable results on top of the results list. How can this goal be achieved with the library catalogue and also concerning the library's different collections and databases? The assumption is that ranking of such materials is a complex problem and is yet nowhere near solved. Libraries should focus on ranking to improve user experience.
  6. Faloutsos, C.: Signature files (1992) 0.02
    0.01773171 = product of:
      0.08865855 = sum of:
        0.08865855 = sum of:
          0.03982046 = weight(_text_:data in 3499) [ClassicSimilarity], result of:
            0.03982046 = score(doc=3499,freq=2.0), product of:
              0.14247625 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.04505818 = queryNorm
              0.2794884 = fieldWeight in 3499, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0625 = fieldNorm(doc=3499)
          0.04883809 = weight(_text_:22 in 3499) [ClassicSimilarity], result of:
            0.04883809 = score(doc=3499,freq=2.0), product of:
              0.15778607 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04505818 = queryNorm
              0.30952093 = fieldWeight in 3499, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=3499)
      0.2 = coord(1/5)
    
    Date
    7. 5.1999 15:22:48
    Source
    Information retrieval: data structures and algorithms. Ed.: W.B. Frakes u. R. Baeza-Yates
  7. Joss, M.W.; Wszola, S.: ¬The engines that can : text search and retrieval software, their strategies, and vendors (1996) 0.02
    0.015772909 = product of:
      0.078864545 = sum of:
        0.078864545 = sum of:
          0.042235978 = weight(_text_:data in 5123) [ClassicSimilarity], result of:
            0.042235978 = score(doc=5123,freq=4.0), product of:
              0.14247625 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.04505818 = queryNorm
              0.29644224 = fieldWeight in 5123, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.046875 = fieldNorm(doc=5123)
          0.036628567 = weight(_text_:22 in 5123) [ClassicSimilarity], result of:
            0.036628567 = score(doc=5123,freq=2.0), product of:
              0.15778607 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04505818 = queryNorm
              0.23214069 = fieldWeight in 5123, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=5123)
      0.2 = coord(1/5)
    
    Abstract
    Traces the development of text searching and retrieval software designed to cope with the increasing demands made by the storage and handling of large amounts of data, recorded on high data storage media, from CD-ROM to multi gigabyte storage media and online information services, with particular reference to the need to cope with graphics as well as conventional ASCII text. Includes details of: Boolean searching, fuzzy searching and matching; relevance ranking; proximity searching and improved strategies for dealing with text searching in very large databases. Concludes that the best searching tools for CD-ROM publishers are those optimized for searching and retrieval on CD-ROM. CD-ROM drives have relatively lower random seek times than hard discs and so the software most appropriate to the medium is that which can effectively arrange the indexes and text on the CD-ROM to avoid continuous random access searching. Lists and reviews a selection of software packages designed to achieve the sort of results required for rapid CD-ROM searching
    Date
    12. 9.1996 13:56:22
  8. Burgin, R.: ¬The retrieval effectiveness of 5 clustering algorithms as a function of indexing exhaustivity (1995) 0.01
    0.014726144 = product of:
      0.07363072 = sum of:
        0.07363072 = sum of:
          0.043106914 = weight(_text_:data in 3365) [ClassicSimilarity], result of:
            0.043106914 = score(doc=3365,freq=6.0), product of:
              0.14247625 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.04505818 = queryNorm
              0.30255508 = fieldWeight in 3365, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3365)
          0.030523809 = weight(_text_:22 in 3365) [ClassicSimilarity], result of:
            0.030523809 = score(doc=3365,freq=2.0), product of:
              0.15778607 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04505818 = queryNorm
              0.19345059 = fieldWeight in 3365, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=3365)
      0.2 = coord(1/5)
    
    Abstract
    The retrieval effectiveness of 5 hierarchical clustering methods (single link, complete link, group average, Ward's method, and weighted average) is examined as a function of indexing exhaustivity with 4 test collections (CR, Cranfield, Medlars, and Time). Evaluations of retrieval effectiveness, based on 3 measures of optimal retrieval performance, confirm earlier findings that the performance of a retrieval system based on single link clustering varies as a function of indexing exhaustivity but fail ti find similar patterns for other clustering methods. The data also confirm earlier findings regarding the poor performance of single link clustering is a retrieval environment. The poor performance of single link clustering appears to derive from that method's tendency to produce a small number of large, ill defined document clusters. By contrast, the data examined here found the retrieval performance of the other clustering methods to be general comparable. The data presented also provides an opportunity to examine the theoretical limits of cluster based retrieval and to compare these theoretical limits to the effectiveness of operational implementations. Performance standards of the 4 document collections examined were found to vary widely, and the effectiveness of operational implementations were found to be in the range defined as unacceptable. Further improvements in search strategies and document representations warrant investigations
    Date
    22. 2.1996 11:20:06
  9. Klas, C.-P.; Fuhr, N.; Schaefer, A.: Evaluating strategic support for information access in the DAFFODIL system (2004) 0.01
    0.0132987825 = product of:
      0.06649391 = sum of:
        0.06649391 = sum of:
          0.029865343 = weight(_text_:data in 2419) [ClassicSimilarity], result of:
            0.029865343 = score(doc=2419,freq=2.0), product of:
              0.14247625 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.04505818 = queryNorm
              0.2096163 = fieldWeight in 2419, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.046875 = fieldNorm(doc=2419)
          0.036628567 = weight(_text_:22 in 2419) [ClassicSimilarity], result of:
            0.036628567 = score(doc=2419,freq=2.0), product of:
              0.15778607 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04505818 = queryNorm
              0.23214069 = fieldWeight in 2419, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2419)
      0.2 = coord(1/5)
    
    Abstract
    The digital library system Daffodil is targeted at strategic support of users during the information search process. For searching, exploring and managing digital library objects it provides user-customisable information seeking patterns over a federation of heterogeneous digital libraries. In this paper evaluation results with respect to retrieval effectiveness, efficiency and user satisfaction are presented. The analysis focuses on strategic support for the scientific work-flow. Daffodil supports the whole work-flow, from data source selection over information seeking to the representation, organisation and reuse of information. By embedding high level search functionality into the scientific work-flow, the user experiences better strategic system support due to a more systematic work process. These ideas have been implemented in Daffodil followed by a qualitative evaluation. The evaluation has been conducted with 28 participants, ranging from information seeking novices to experts. The results are promising, as they support the chosen model.
    Date
    16.11.2008 16:22:48
  10. Shiri, A.A.; Revie, C.: Query expansion behavior within a thesaurus-enhanced search environment : a user-centered evaluation (2006) 0.01
    0.013144091 = product of:
      0.065720454 = sum of:
        0.065720454 = sum of:
          0.035196647 = weight(_text_:data in 56) [ClassicSimilarity], result of:
            0.035196647 = score(doc=56,freq=4.0), product of:
              0.14247625 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.04505818 = queryNorm
              0.24703519 = fieldWeight in 56, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0390625 = fieldNorm(doc=56)
          0.030523809 = weight(_text_:22 in 56) [ClassicSimilarity], result of:
            0.030523809 = score(doc=56,freq=2.0), product of:
              0.15778607 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04505818 = queryNorm
              0.19345059 = fieldWeight in 56, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=56)
      0.2 = coord(1/5)
    
    Abstract
    The study reported here investigated the query expansion behavior of end-users interacting with a thesaurus-enhanced search system on the Web. Two groups, namely academic staff and postgraduate students, were recruited into this study. Data were collected from 90 searches performed by 30 users using the OVID interface to the CAB abstracts database. Data-gathering techniques included questionnaires, screen capturing software, and interviews. The results presented here relate to issues of search-topic and search-term characteristics, number and types of expanded queries, usefulness of thesaurus terms, and behavioral differences between academic staff and postgraduate students in their interaction. The key conclusions drawn were that (a) academic staff chose more narrow and synonymous terms than did postgraduate students, who generally selected broader and related terms; (b) topic complexity affected users' interaction with the thesaurus in that complex topics required more query expansion and search term selection; (c) users' prior topic-search experience appeared to have a significant effect on their selection and evaluation of thesaurus terms; (d) in 50% of the searches where additional terms were suggested from the thesaurus, users stated that they had not been aware of the terms at the beginning of the search; this observation was particularly noticeable in the case of postgraduate students.
    Date
    22. 7.2006 16:32:43
  11. Guerrero-Bote, V.P.; Moya Anegón, F. de; Herrero Solana, V.: Document organization using Kohonen's algorithm (2002) 0.01
    0.012071946 = product of:
      0.060359728 = sum of:
        0.060359728 = weight(_text_:bibliographic in 2564) [ClassicSimilarity], result of:
          0.060359728 = score(doc=2564,freq=2.0), product of:
            0.17541347 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.04505818 = queryNorm
            0.34409973 = fieldWeight in 2564, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0625 = fieldNorm(doc=2564)
      0.2 = coord(1/5)
    
    Abstract
    The classification of documents from a bibliographic database is a task that is linked to processes of information retrieval based on partial matching. A method is described of vectorizing reference documents from LISA which permits their topological organization using Kohonen's algorithm. As an example a map is generated of 202 documents from LISA, and an analysis is made of the possibilities of this type of neural network with respect to the development of information retrieval systems based on graphical browsing.
  12. Baloh, P.; Desouza, K.C.; Hackney, R.: Contextualizing organizational interventions of knowledge management systems : a design science perspectiveA domain analysis (2012) 0.01
    0.01108232 = product of:
      0.055411596 = sum of:
        0.055411596 = sum of:
          0.024887787 = weight(_text_:data in 241) [ClassicSimilarity], result of:
            0.024887787 = score(doc=241,freq=2.0), product of:
              0.14247625 = queryWeight, product of:
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.04505818 = queryNorm
              0.17468026 = fieldWeight in 241, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1620505 = idf(docFreq=5088, maxDocs=44218)
                0.0390625 = fieldNorm(doc=241)
          0.030523809 = weight(_text_:22 in 241) [ClassicSimilarity], result of:
            0.030523809 = score(doc=241,freq=2.0), product of:
              0.15778607 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.04505818 = queryNorm
              0.19345059 = fieldWeight in 241, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=241)
      0.2 = coord(1/5)
    
    Abstract
    We address how individuals' (workers) knowledge needs influence the design of knowledge management systems (KMS), enabling knowledge creation and utilization. It is evident that KMS technologies and activities are indiscriminately deployed in most organizations with little regard to the actual context of their adoption. Moreover, it is apparent that the extant literature pertaining to knowledge management projects is frequently deficient in identifying the variety of factors indicative for successful KMS. This presents an obvious business practice and research gap that requires a critical analysis of the necessary intervention that will actually improve how workers can leverage and form organization-wide knowledge. This research involved an extensive review of the literature, a grounded theory methodological approach and rigorous data collection and synthesis through an empirical case analysis (Parsons Brinckerhoff and Samsung). The contribution of this study is the formulation of a model for designing KMS based upon the design science paradigm, which aspires to create artifacts that are interdependent of people and organizations. The essential proposition is that KMS design and implementation must be contextualized in relation to knowledge needs and that these will differ for various organizational settings. The findings present valuable insights and further understanding of the way in which KMS design efforts should be focused.
    Date
    11. 6.2012 14:22:34
  13. Couvreur, T.R.; Benzel, R.N.; Miller, S.F.; Zeitler, D.N.; Lee, D.L.; Singhal, M.; Shivaratri, N.; Wong, W.Y.P.: ¬An analysis of performance and cost factors in searching large text databases using parallel search systems (1994) 0.01
    0.010562953 = product of:
      0.052814763 = sum of:
        0.052814763 = weight(_text_:bibliographic in 7657) [ClassicSimilarity], result of:
          0.052814763 = score(doc=7657,freq=2.0), product of:
            0.17541347 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.04505818 = queryNorm
            0.30108726 = fieldWeight in 7657, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0546875 = fieldNorm(doc=7657)
      0.2 = coord(1/5)
    
    Abstract
    The results of modelling the performance of searching large text databases (>10 GBytes) via various parallel hardware architectures and search algorithms are discussed. The performance under load and the cost of each configuration are compared. Strengths, weaknesses, performance sensitivities, and search features supported for each configuration are also addressed. In addition, a common search workload used in the modelling is described. The search workload is derived from a set of searches run against the Chemical Abstracts file of bibliographic and abstract text available on STN International. This common workload is applied to all configurations modelled to provide a common basis of comparison
  14. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.01
    0.009767618 = product of:
      0.04883809 = sum of:
        0.04883809 = product of:
          0.09767618 = sum of:
            0.09767618 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.09767618 = score(doc=402,freq=2.0), product of:
                0.15778607 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04505818 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  15. Rada, R.; Barlow, J.; Potharst, J.; Zanstra, P.; Bijstra, D.: Document ranking using an enriched thesaurus (1991) 0.01
    0.0090539595 = product of:
      0.045269795 = sum of:
        0.045269795 = weight(_text_:bibliographic in 6626) [ClassicSimilarity], result of:
          0.045269795 = score(doc=6626,freq=2.0), product of:
            0.17541347 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.04505818 = queryNorm
            0.2580748 = fieldWeight in 6626, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.046875 = fieldNorm(doc=6626)
      0.2 = coord(1/5)
    
    Abstract
    A thesaurus may be viewed as a graph, and document retrieval algorithms can exploit this graph when both the documents and the query are represented by thesaurus terms. These retrieval algorithms measure the distance between the query and documents by using the path lengths in the graph. Previous work witj such strategies has shown that the hierarchical relations in the thesaurus are useful but the non-hierarchical are not. This paper shows that when the query explicitly mentions a particular non-hierarchical relation, the retrieval algorithm benefits from the presence of such relations in the thesaurus. Our algorithms were applied to the Excerpta Medica bibliographic citation database whose citations are indexed with terms from the EMTREE thesaurus. We also created an enriched EMTREE by systematically adding non-hierarchical relations from a medical knowledge base. Our algorithms used at one time EMTREE and, at another time, the enriched EMTREE in the course of ranking documents from Excerpta Medica against queries. When, and only when, the query specifically mentioned a particular non-hierarchical relation type, did EMTREE enriched with that relation type lead to a ranking that better corresponded to an expert's ranking
  16. Savoy, J.: Ranking schemes in hybrid Boolean systems : a new approach (1997) 0.01
    0.0090539595 = product of:
      0.045269795 = sum of:
        0.045269795 = weight(_text_:bibliographic in 393) [ClassicSimilarity], result of:
          0.045269795 = score(doc=393,freq=2.0), product of:
            0.17541347 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.04505818 = queryNorm
            0.2580748 = fieldWeight in 393, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.046875 = fieldNorm(doc=393)
      0.2 = coord(1/5)
    
    Abstract
    In most commercial online systems, the retrieval system is based on the Boolean model and its inverted file organization. Since the investment in these systems is so great and changing them could be economically unfeasible, this article suggests a new ranking scheme especially adapted for hypertext environments in order to produce more effective retrieval results and yet maintain the effectiveness of the investment made to date in the Boolean model. To select the retrieved documents, the suggested ranking strategy uses multiple sources of document content evidence. The proposed scheme integrates both the information provided by the index and query terms, and the inherent relationships between documents such as bibliographic references or hypertext links. We will demonstrate that our scheme represents an integration of both subject and citation indexing, and results in a significant imporvement over classical ranking schemes uses in hybrid Boolean systems, while preserving its efficiency. Moreover, through knowing the nearest neighbor and the hypertext links which constitute additional sources of evidence, our strategy will take them into account in order to further improve retrieval effectiveness and to provide 'good' starting points for browsing in a hypertext or hypermedia environement
  17. Stanfill, C.: Parallel information retrieval algorithms (1992) 0.01
    0.008904126 = product of:
      0.044520628 = sum of:
        0.044520628 = product of:
          0.089041255 = sum of:
            0.089041255 = weight(_text_:data in 3515) [ClassicSimilarity], result of:
              0.089041255 = score(doc=3515,freq=10.0), product of:
                0.14247625 = queryWeight, product of:
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.04505818 = queryNorm
                0.6249551 = fieldWeight in 3515, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3515)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    Data Parallel computers, such as the connection Machine CM-2, can provide interactive access to text databases containign tens, hundreds or even thousands of Gigabytes of data. Starts by presenting a brief overview of data parallel computing, a performance model of the CM-2, and a model of the workload involved in searching text databases. Discusses various algorithms used in information retrieval and gives performance estimates based on the data and procssing models presented
    Source
    Information retrieval: data structures and algorithms. Ed.: W.B. Frakes u. R. Baeza-Yates
  18. Baeza-Yates, R.A.: Introduction to data structures and algorithms related to information retrieval (1992) 0.01
    0.0086213825 = product of:
      0.043106914 = sum of:
        0.043106914 = product of:
          0.08621383 = sum of:
            0.08621383 = weight(_text_:data in 3082) [ClassicSimilarity], result of:
              0.08621383 = score(doc=3082,freq=6.0), product of:
                0.14247625 = queryWeight, product of:
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.04505818 = queryNorm
                0.60511017 = fieldWeight in 3082, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1620505 = idf(docFreq=5088, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3082)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Abstract
    In this chapter we review the main concepts and data structures used in information retrieval, and we classify information retrieval related algorithms
    Source
    Information retrieval: data structures and algorithms. Ed.: W.B. Frakes u. R. Baeza-Yates
  19. Smeaton, A.F.; Rijsbergen, C.J. van: ¬The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.01
    0.008546666 = product of:
      0.04273333 = sum of:
        0.04273333 = product of:
          0.08546666 = sum of:
            0.08546666 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
              0.08546666 = score(doc=2134,freq=2.0), product of:
                0.15778607 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04505818 = queryNorm
                0.5416616 = fieldWeight in 2134, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2134)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Date
    30. 3.2001 13:32:22
  20. Back, J.: ¬An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.01
    0.008546666 = product of:
      0.04273333 = sum of:
        0.04273333 = product of:
          0.08546666 = sum of:
            0.08546666 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
              0.08546666 = score(doc=3445,freq=2.0), product of:
                0.15778607 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04505818 = queryNorm
                0.5416616 = fieldWeight in 3445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3445)
          0.5 = coord(1/2)
      0.2 = coord(1/5)
    
    Date
    25. 8.2005 17:42:22

Years

Languages

  • e 106
  • d 5
  • More… Less…

Types

  • a 99
  • m 8
  • el 3
  • s 3
  • r 1
  • More… Less…