Search (117 results, page 1 of 6)

  • × theme_ss:"Retrievalalgorithmen"
  • × type_ss:"a"
  • × year_i:[2000 TO 2010}
  1. Klas, C.-P.; Fuhr, N.; Schaefer, A.: Evaluating strategic support for information access in the DAFFODIL system (2004) 0.01
    0.012472611 = product of:
      0.043654136 = sum of:
        0.028931338 = product of:
          0.072328344 = sum of:
            0.03107218 = weight(_text_:retrieval in 2419) [ClassicSimilarity], result of:
              0.03107218 = score(doc=2419,freq=4.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.2835858 = fieldWeight in 2419, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2419)
            0.041256163 = weight(_text_:system in 2419) [ClassicSimilarity], result of:
              0.041256163 = score(doc=2419,freq=6.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.36163113 = fieldWeight in 2419, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2419)
          0.4 = coord(2/5)
        0.0147228 = product of:
          0.0294456 = sum of:
            0.0294456 = weight(_text_:22 in 2419) [ClassicSimilarity], result of:
              0.0294456 = score(doc=2419,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.23214069 = fieldWeight in 2419, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2419)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    The digital library system Daffodil is targeted at strategic support of users during the information search process. For searching, exploring and managing digital library objects it provides user-customisable information seeking patterns over a federation of heterogeneous digital libraries. In this paper evaluation results with respect to retrieval effectiveness, efficiency and user satisfaction are presented. The analysis focuses on strategic support for the scientific work-flow. Daffodil supports the whole work-flow, from data source selection over information seeking to the representation, organisation and reuse of information. By embedding high level search functionality into the scientific work-flow, the user experiences better strategic system support due to a more systematic work process. These ideas have been implemented in Daffodil followed by a qualitative evaluation. The evaluation has been conducted with 28 participants, ranging from information seeking novices to experts. The results are promising, as they support the chosen model.
    Date
    16.11.2008 16:22:48
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  2. Losada, D.E.; Barreiro, A.: Emebedding term similarity and inverse document frequency into a logical model of information retrieval (2003) 0.01
    0.008508152 = product of:
      0.029778533 = sum of:
        0.010148132 = product of:
          0.05074066 = sum of:
            0.05074066 = weight(_text_:retrieval in 1422) [ClassicSimilarity], result of:
              0.05074066 = score(doc=1422,freq=6.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.46309367 = fieldWeight in 1422, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1422)
          0.2 = coord(1/5)
        0.0196304 = product of:
          0.0392608 = sum of:
            0.0392608 = weight(_text_:22 in 1422) [ClassicSimilarity], result of:
              0.0392608 = score(doc=1422,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.30952093 = fieldWeight in 1422, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1422)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    We propose a novel approach to incorporate term similarity and inverse document frequency into a logical model of information retrieval. The ability of the logic to handle expressive representations along with the use of such classical notions are promising characteristics for IR systems. The approach proposed here has been efficiently implemented and experiments against test collections are presented.
    Date
    22. 3.2003 19:27:23
    Footnote
    Beitrag eines Themenheftes: Mathematical, logical, and formal methods in information retrieval
  3. MacFarlane, A.; Robertson, S.E.; McCann, J.A.: Parallel computing for passage retrieval (2004) 0.01
    0.007976091 = product of:
      0.027916316 = sum of:
        0.008285915 = product of:
          0.04142957 = sum of:
            0.04142957 = weight(_text_:retrieval in 5108) [ClassicSimilarity], result of:
              0.04142957 = score(doc=5108,freq=4.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.37811437 = fieldWeight in 5108, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5108)
          0.2 = coord(1/5)
        0.0196304 = product of:
          0.0392608 = sum of:
            0.0392608 = weight(_text_:22 in 5108) [ClassicSimilarity], result of:
              0.0392608 = score(doc=5108,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.30952093 = fieldWeight in 5108, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5108)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    In this paper methods for both speeding up passage processing and examining more passages using parallel computers are explored. The number of passages processed are varied in order to examine the effect on retrieval effectiveness and efficiency. The particular algorithm applied has previously been used to good effect in Okapi experiments at TREC. This algorithm and the mechanism for applying parallel computing to speed up processing are described.
    Date
    20. 1.2007 18:30:22
  4. Shiri, A.A.; Revie, C.: Query expansion behavior within a thesaurus-enhanced search environment : a user-centered evaluation (2006) 0.01
    0.007866439 = product of:
      0.027532537 = sum of:
        0.015263537 = product of:
          0.03815884 = sum of:
            0.01830946 = weight(_text_:retrieval in 56) [ClassicSimilarity], result of:
              0.01830946 = score(doc=56,freq=2.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.16710453 = fieldWeight in 56, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=56)
            0.01984938 = weight(_text_:system in 56) [ClassicSimilarity], result of:
              0.01984938 = score(doc=56,freq=2.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.17398985 = fieldWeight in 56, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=56)
          0.4 = coord(2/5)
        0.0122690005 = product of:
          0.024538001 = sum of:
            0.024538001 = weight(_text_:22 in 56) [ClassicSimilarity], result of:
              0.024538001 = score(doc=56,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.19345059 = fieldWeight in 56, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=56)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    The study reported here investigated the query expansion behavior of end-users interacting with a thesaurus-enhanced search system on the Web. Two groups, namely academic staff and postgraduate students, were recruited into this study. Data were collected from 90 searches performed by 30 users using the OVID interface to the CAB abstracts database. Data-gathering techniques included questionnaires, screen capturing software, and interviews. The results presented here relate to issues of search-topic and search-term characteristics, number and types of expanded queries, usefulness of thesaurus terms, and behavioral differences between academic staff and postgraduate students in their interaction. The key conclusions drawn were that (a) academic staff chose more narrow and synonymous terms than did postgraduate students, who generally selected broader and related terms; (b) topic complexity affected users' interaction with the thesaurus in that complex topics required more query expansion and search term selection; (c) users' prior topic-search experience appeared to have a significant effect on their selection and evaluation of thesaurus terms; (d) in 50% of the searches where additional terms were suggested from the thesaurus, users stated that they had not been aware of the terms at the beginning of the search; this observation was particularly noticeable in the case of postgraduate students.
    Date
    22. 7.2006 16:32:43
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  5. Crestani, F.; Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue (2003) 0.01
    0.0072818627 = product of:
      0.025486518 = sum of:
        0.010763719 = product of:
          0.053818595 = sum of:
            0.053818595 = weight(_text_:retrieval in 1451) [ClassicSimilarity], result of:
              0.053818595 = score(doc=1451,freq=12.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.49118498 = fieldWeight in 1451, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1451)
          0.2 = coord(1/5)
        0.0147228 = product of:
          0.0294456 = sum of:
            0.0294456 = weight(_text_:22 in 1451) [ClassicSimilarity], result of:
              0.0294456 = score(doc=1451,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.23214069 = fieldWeight in 1451, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1451)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Research an the use of mathematical, logical, and formal methods, has been central to Information Retrieval research for a long time. Research in this area is important not only because it helps enhancing retrieval effectiveness, but also because it helps clarifying the underlying concepts of Information Retrieval. In this article we outline some of the major aspects of the subject, and summarize the papers of this special issue with respect to how they relate to these aspects. We conclude by highlighting some directions of future research, which are needed to better understand the formal characteristics of Information Retrieval.
    Date
    22. 3.2003 19:27:36
    Footnote
    Einführung zu den Beiträgen eines Themenheftes: Mathematical, logical, and formal methods in information retrieval
  6. Witschel, H.F.: Global term weights in distributed environments (2008) 0.01
    0.006717526 = product of:
      0.023511339 = sum of:
        0.00878854 = product of:
          0.0439427 = sum of:
            0.0439427 = weight(_text_:retrieval in 2096) [ClassicSimilarity], result of:
              0.0439427 = score(doc=2096,freq=8.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.40105087 = fieldWeight in 2096, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2096)
          0.2 = coord(1/5)
        0.0147228 = product of:
          0.0294456 = sum of:
            0.0294456 = weight(_text_:22 in 2096) [ClassicSimilarity], result of:
              0.0294456 = score(doc=2096,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.23214069 = fieldWeight in 2096, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2096)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    This paper examines the estimation of global term weights (such as IDF) in information retrieval scenarios where a global view on the collection is not available. In particular, the two options of either sampling documents or of using a reference corpus independent of the target retrieval collection are compared using standard IR test collections. In addition, the possibility of pruning term lists based on frequency is evaluated. The results show that very good retrieval performance can be reached when just the most frequent terms of a collection - an "extended stop word list" - are known and all terms which are not in that list are treated equally. However, the list cannot always be fully estimated from a general-purpose reference corpus, but some "domain-specific stop words" need to be added. A good solution for achieving this is to mix estimates from small samples of the target retrieval collection with ones derived from a reference corpus.
    Date
    1. 8.2008 9:44:22
  7. Khoo, C.S.G.; Wan, K.-W.: ¬A simple relevancy-ranking strategy for an interface to Boolean OPACs (2004) 0.01
    0.006668968 = product of:
      0.023341388 = sum of:
        0.014753087 = product of:
          0.036882717 = sum of:
            0.012816621 = weight(_text_:retrieval in 2509) [ClassicSimilarity], result of:
              0.012816621 = score(doc=2509,freq=2.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.11697317 = fieldWeight in 2509, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=2509)
            0.024066096 = weight(_text_:system in 2509) [ClassicSimilarity], result of:
              0.024066096 = score(doc=2509,freq=6.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.21095149 = fieldWeight in 2509, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=2509)
          0.4 = coord(2/5)
        0.0085883 = product of:
          0.0171766 = sum of:
            0.0171766 = weight(_text_:22 in 2509) [ClassicSimilarity], result of:
              0.0171766 = score(doc=2509,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.1354154 = fieldWeight in 2509, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=2509)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Content
    "Most Web search engines accept natural language queries, perform some kind of fuzzy matching and produce ranked output, displaying first the documents that are most likely to be relevant. On the other hand, most library online public access catalogs (OPACs) an the Web are still Boolean retrieval systems that perform exact matching, and require users to express their search requests precisely in a Boolean search language and to refine their search statements to improve the search results. It is well-documented that users have difficulty searching Boolean OPACs effectively (e.g. Borgman, 1996; Ensor, 1992; Wallace, 1993). One approach to making OPACs easier to use is to develop a natural language search interface that acts as a middleware between the user's Web browser and the OPAC system. The search interface can accept a natural language query from the user and reformulate it as a series of Boolean search statements that are then submitted to the OPAC. The records retrieved by the OPAC are ranked by the search interface before forwarding them to the user's Web browser. The user, then, does not need to interact directly with the Boolean OPAC but with the natural language search interface or search intermediary. The search interface interacts with the OPAC system an the user's behalf. The advantage of this approach is that no modification to the OPAC or library system is required. Furthermore, the search interface can access multiple OPACs, acting as a meta search engine, and integrate search results from various OPACs before sending them to the user. The search interface needs to incorporate a method for converting the user's natural language query into a series of Boolean search statements, and for ranking the OPAC records retrieved. The purpose of this study was to develop a relevancyranking algorithm for a search interface to Boolean OPAC systems. This is part of an on-going effort to develop a knowledge-based search interface to OPACs called the E-Referencer (Khoo et al., 1998, 1999; Poo et al., 2000). E-Referencer v. 2 that has been implemented applies a repertoire of initial search strategies and reformulation strategies to retrieve records from OPACs using the Z39.50 protocol, and also assists users in mapping query keywords to the Library of Congress subject headings."
    Source
    Electronic library. 22(2004) no.2, S.112-120
  8. Campos, L.M. de; Fernández-Luna, J.M.; Huete, J.F.: Implementing relevance feedback in the Bayesian network retrieval model (2003) 0.01
    0.0063811145 = product of:
      0.0223339 = sum of:
        0.0076110996 = product of:
          0.0380555 = sum of:
            0.0380555 = weight(_text_:retrieval in 825) [ClassicSimilarity], result of:
              0.0380555 = score(doc=825,freq=6.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.34732026 = fieldWeight in 825, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=825)
          0.2 = coord(1/5)
        0.0147228 = product of:
          0.0294456 = sum of:
            0.0294456 = weight(_text_:22 in 825) [ClassicSimilarity], result of:
              0.0294456 = score(doc=825,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.23214069 = fieldWeight in 825, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=825)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Relevance Feedback consists in automatically formulating a new query according to the relevance judgments provided by the user after evaluating a set of retrieved documents. In this article, we introduce several relevance feedback methods for the Bayesian Network Retrieval ModeL The theoretical frame an which our methods are based uses the concept of partial evidences, which summarize the new pieces of information gathered after evaluating the results obtained by the original query. These partial evidences are inserted into the underlying Bayesian network and a new inference process (probabilities propagation) is run to compute the posterior relevance probabilities of the documents in the collection given the new query. The quality of the proposed methods is tested using a preliminary experimentation with different standard document collections.
    Date
    22. 3.2003 19:30:19
    Footnote
    Beitrag eines Themenheftes: Mathematical, logical, and formal methods in information retrieval
  9. Kanaeva, Z.: Ranking: Google und CiteSeer (2005) 0.01
    0.0063723573 = product of:
      0.02230325 = sum of:
        0.0051266486 = product of:
          0.025633242 = sum of:
            0.025633242 = weight(_text_:retrieval in 3276) [ClassicSimilarity], result of:
              0.025633242 = score(doc=3276,freq=2.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.23394634 = fieldWeight in 3276, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3276)
          0.2 = coord(1/5)
        0.0171766 = product of:
          0.0343532 = sum of:
            0.0343532 = weight(_text_:22 in 3276) [ClassicSimilarity], result of:
              0.0343532 = score(doc=3276,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.2708308 = fieldWeight in 3276, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3276)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Im Rahmen des klassischen Information Retrieval wurden verschiedene Verfahren für das Ranking sowie die Suche in einer homogenen strukturlosen Dokumentenmenge entwickelt. Die Erfolge der Suchmaschine Google haben gezeigt dass die Suche in einer zwar inhomogenen aber zusammenhängenden Dokumentenmenge wie dem Internet unter Berücksichtigung der Dokumentenverbindungen (Links) sehr effektiv sein kann. Unter den von der Suchmaschine Google realisierten Konzepten ist ein Verfahren zum Ranking von Suchergebnissen (PageRank), das in diesem Artikel kurz erklärt wird. Darüber hinaus wird auf die Konzepte eines Systems namens CiteSeer eingegangen, welches automatisch bibliographische Angaben indexiert (engl. Autonomous Citation Indexing, ACI). Letzteres erzeugt aus einer Menge von nicht vernetzten wissenschaftlichen Dokumenten eine zusammenhängende Dokumentenmenge und ermöglicht den Einsatz von Banking-Verfahren, die auf den von Google genutzten Verfahren basieren.
    Date
    20. 3.2005 16:23:22
  10. Beitzel, S.M.; Jensen, E.C.; Chowdhury, A.; Grossman, D.; Frieder, O; Goharian, N.: Fusion of effective retrieval strategies in the same information retrieval system (2004) 0.01
    0.006327753 = product of:
      0.044294268 = sum of:
        0.044294268 = product of:
          0.11073567 = sum of:
            0.0694795 = weight(_text_:retrieval in 2502) [ClassicSimilarity], result of:
              0.0694795 = score(doc=2502,freq=20.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.63411707 = fieldWeight in 2502, product of:
                  4.472136 = tf(freq=20.0), with freq of:
                    20.0 = termFreq=20.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2502)
            0.041256163 = weight(_text_:system in 2502) [ClassicSimilarity], result of:
              0.041256163 = score(doc=2502,freq=6.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.36163113 = fieldWeight in 2502, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2502)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    Prior efforts have shown that under certain situations retrieval effectiveness may be improved via the use of data fusion techniques. Although these improvements have been observed from the fusion of result sets from several distinct information retrieval systems, it has often been thought that fusing different document retrieval strategies in a single information retrieval system will lead to similar improvements. In this study, we show that this is not the case. We hold constant systemic differences such as parsing, stemming, phrase processing, and relevance feedback, and fuse result sets generated from highly effective retrieval strategies in the same information retrieval system. From this, we show that data fusion of highly effective retrieval strategies alone shows little or no improvement in retrieval effectiveness. Furthermore, we present a detailed analysis of the performance of modern data fusion approaches, and demonstrate the reasons why they do not perform weIl when applied to this problem. Detailed results and analyses are included to support our conclusions.
  11. Furner, J.: ¬A unifying model of document relatedness for hybrid search engines (2003) 0.01
    0.0061314013 = product of:
      0.021459904 = sum of:
        0.0067371028 = product of:
          0.033685513 = sum of:
            0.033685513 = weight(_text_:system in 2717) [ClassicSimilarity], result of:
              0.033685513 = score(doc=2717,freq=4.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.29527056 = fieldWeight in 2717, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2717)
          0.2 = coord(1/5)
        0.0147228 = product of:
          0.0294456 = sum of:
            0.0294456 = weight(_text_:22 in 2717) [ClassicSimilarity], result of:
              0.0294456 = score(doc=2717,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.23214069 = fieldWeight in 2717, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2717)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Previous work an search-engine design has indicated that information-seekers may benefit from being given the opportunity to exploit multiple sources of evidence of document relatedness. Few existing systems, however, give users more than minimal control over the selections that may be made among methods of exploitation. By applying the methods of "document network analysis" (DNA), a unifying, graph-theoretic model of content-, collaboration-, and context-based systems (CCC) may be developed in which the nature of the similarities between types of document relatedness and document ranking are clarified. The usefulness of the approach to system design suggested by this model may be tested by constructing and evaluating a prototype system (UCXtra) that allows searchers to maintain control over the multiple ways in which document collections may be ranked and re-ranked.
    Date
    11. 9.2004 17:32:22
  12. Zhu, B.; Chen, H.: Validating a geographical image retrieval system (2000) 0.01
    0.0061188615 = product of:
      0.042832028 = sum of:
        0.042832028 = product of:
          0.10708007 = sum of:
            0.053818595 = weight(_text_:retrieval in 4769) [ClassicSimilarity], result of:
              0.053818595 = score(doc=4769,freq=12.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.49118498 = fieldWeight in 4769, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4769)
            0.053261478 = weight(_text_:system in 4769) [ClassicSimilarity], result of:
              0.053261478 = score(doc=4769,freq=10.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.46686378 = fieldWeight in 4769, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4769)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    This paper summarizes a prototype geographical image retrieval system that demonstrates how to integrate image processing and information analysis techniques to support large-scale content-based image retrieval. By using an image as its interface, the prototype system addresses a troublesome aspect of traditional retrieval models, which require users to have complete knowledge of the low-level features of an image. In addition we describe an experiment to validate against that of human subjects in an effort to address the scarcity of research evaluating performance of an algorithm against that of human beings. The results of the experiment indicate that the system could do as well as human subjects in accomplishing the tasks of similarity analysis and image categorization. We also found that under some circumstances texture features of an image are insufficient to represent an geographic image. We believe, however, that our image retrieval system provides a promising approach to integrating image processing techniques and information retrieval algorithms
  13. Fan, W.; Fox, E.A.; Pathak, P.; Wu, H.: ¬The effects of fitness functions an genetic programming-based ranking discovery for Web search (2004) 0.01
    0.005982068 = product of:
      0.020937236 = sum of:
        0.006214436 = product of:
          0.03107218 = sum of:
            0.03107218 = weight(_text_:retrieval in 2239) [ClassicSimilarity], result of:
              0.03107218 = score(doc=2239,freq=4.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.2835858 = fieldWeight in 2239, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2239)
          0.2 = coord(1/5)
        0.0147228 = product of:
          0.0294456 = sum of:
            0.0294456 = weight(_text_:22 in 2239) [ClassicSimilarity], result of:
              0.0294456 = score(doc=2239,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.23214069 = fieldWeight in 2239, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2239)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Genetic-based evolutionary learning algorithms, such as genetic algorithms (GAs) and genetic programming (GP), have been applied to information retrieval (IR) since the 1980s. Recently, GP has been applied to a new IR taskdiscovery of ranking functions for Web search-and has achieved very promising results. However, in our prior research, only one fitness function has been used for GP-based learning. It is unclear how other fitness functions may affect ranking function discovery for Web search, especially since it is weIl known that choosing a proper fitness function is very important for the effectiveness and efficiency of evolutionary algorithms. In this article, we report our experience in contrasting different fitness function designs an GP-based learning using a very large Web corpus. Our results indicate that the design of fitness functions is instrumental in performance improvement. We also give recommendations an the design of fitness functions for genetic-based information retrieval experiments.
    Date
    31. 5.2004 19:22:06
  14. Song, D.; Bruza, P.D.: Towards context sensitive information inference (2003) 0.01
    0.005317596 = product of:
      0.018611584 = sum of:
        0.0063425824 = product of:
          0.031712912 = sum of:
            0.031712912 = weight(_text_:retrieval in 1428) [ClassicSimilarity], result of:
              0.031712912 = score(doc=1428,freq=6.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.28943354 = fieldWeight in 1428, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1428)
          0.2 = coord(1/5)
        0.0122690005 = product of:
          0.024538001 = sum of:
            0.024538001 = weight(_text_:22 in 1428) [ClassicSimilarity], result of:
              0.024538001 = score(doc=1428,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.19345059 = fieldWeight in 1428, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1428)
          0.5 = coord(1/2)
      0.2857143 = coord(2/7)
    
    Abstract
    Humans can make hasty, but generally robust judgements about what a text fragment is, or is not, about. Such judgements are termed information inference. This article furnishes an account of information inference from a psychologistic stance. By drawing an theories from nonclassical logic and applied cognition, an information inference mechanism is proposed that makes inferences via computations of information flow through an approximation of a conceptual space. Within a conceptual space information is represented geometrically. In this article, geometric representations of words are realized as vectors in a high dimensional semantic space, which is automatically constructed from a text corpus. Two approaches were presented for priming vector representations according to context. The first approach uses a concept combination heuristic to adjust the vector representation of a concept in the light of the representation of another concept. The second approach computes a prototypical concept an the basis of exemplar trace texts and moves it in the dimensional space according to the context. Information inference is evaluated by measuring the effectiveness of query models derived by information flow computations. Results show that information flow contributes significantly to query model effectiveness, particularly with respect to precision. Moreover, retrieval effectiveness compares favorably with two probabilistic query models, and another based an semantic association. More generally, this article can be seen as a contribution towards realizing operational systems that mimic text-based human reasoning.
    Date
    22. 3.2003 19:35:46
    Footnote
    Beitrag eines Themenheftes: Mathematical, logical, and formal methods in information retrieval
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  15. Back, J.: ¬An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.00
    0.0049076 = product of:
      0.0343532 = sum of:
        0.0343532 = product of:
          0.0687064 = sum of:
            0.0687064 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
              0.0687064 = score(doc=3445,freq=2.0), product of:
                0.12684377 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03622214 = queryNorm
                0.5416616 = fieldWeight in 3445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3445)
          0.5 = coord(1/2)
      0.14285715 = coord(1/7)
    
    Date
    25. 8.2005 17:42:22
  16. Lee, C.; Lee, G.G.: Probabilistic information retrieval model for a dependence structured indexing system (2005) 0.00
    0.004782734 = product of:
      0.033479135 = sum of:
        0.033479135 = product of:
          0.08369784 = sum of:
            0.044398077 = weight(_text_:retrieval in 1004) [ClassicSimilarity], result of:
              0.044398077 = score(doc=1004,freq=6.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.40520695 = fieldWeight in 1004, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1004)
            0.039299767 = weight(_text_:system in 1004) [ClassicSimilarity], result of:
              0.039299767 = score(doc=1004,freq=4.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.34448233 = fieldWeight in 1004, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1004)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    Most previous information retrieval (IR) models assume that terms of queries and documents are statistically independent from each other. However, conditional independence assumption is obviously and openly understood to be wrong, so we present a new method of incorporating term dependence into a probabilistic retrieval model by adapting a dependency structured indexing system using a dependency parse tree and Chow Expansion to compensate the weakness of the assumption. In this paper, we describe a theoretic process to apply the Chow Expansion to the general probabilistic models and the state-of-the-art 2-Poisson model. Through experiments on document collections in English and Korean, we demonstrate that the incorporation of term dependences using Chow Expansion contributes to the improvement of performance in probabilistic IR systems.
  17. Herrera-Viedma, E.; Cordón, O.; Herrera, J.C.; Luqe, M.: ¬An IRS based on multi-granular lnguistic information (2003) 0.00
    0.004497754 = product of:
      0.031484276 = sum of:
        0.031484276 = product of:
          0.07871069 = sum of:
            0.03107218 = weight(_text_:retrieval in 2740) [ClassicSimilarity], result of:
              0.03107218 = score(doc=2740,freq=4.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.2835858 = fieldWeight in 2740, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2740)
            0.047638513 = weight(_text_:system in 2740) [ClassicSimilarity], result of:
              0.047638513 = score(doc=2740,freq=8.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.41757566 = fieldWeight in 2740, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2740)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    An information retrieval system (IRS) based on fuzzy multi-granular linguistic information is proposed. The system has an evaluation method to process multi-granular linguistic information, in such a way that the inputs to the IRS are represented in a different linguistic domain than the outputs. The system accepts Boolean queries whose terms are weighted by means of the ordinal linguistic values represented by the linguistic variable "Importance" assessed an a label set S. The system evaluates the weighted queries according to a threshold semantic and obtains the linguistic retrieval status values (RSV) of documents represented by a linguistic variable "Relevance" expressed in a different label set S'. The advantage of this linguistic IRS with respect to others is that the use of the multi-granular linguistic information facilitates and improves the IRS-user interaction
  18. Singh, S.; Dey, L.: ¬A rough-fuzzy document grading system for customized text information retrieval (2005) 0.00
    0.004497754 = product of:
      0.031484276 = sum of:
        0.031484276 = product of:
          0.07871069 = sum of:
            0.03107218 = weight(_text_:retrieval in 1007) [ClassicSimilarity], result of:
              0.03107218 = score(doc=1007,freq=4.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.2835858 = fieldWeight in 1007, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1007)
            0.047638513 = weight(_text_:system in 1007) [ClassicSimilarity], result of:
              0.047638513 = score(doc=1007,freq=8.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.41757566 = fieldWeight in 1007, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1007)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    Due to the large repository of documents available on the web, users are usually inundated by a large volume of information, most of which is found to be irrelevant. Since user perspectives vary, a client-side text filtering system that learns the user's perspective can reduce the problem of irrelevant retrieval. In this paper, we have provided the design of a customized text information filtering system which learns user preferences and modifies the initial query to fetch better documents. It uses a rough-fuzzy reasoning scheme. The rough-set based reasoning takes care of natural language nuances, like synonym handling, very elegantly. The fuzzy decider provides qualitative grading to the documents for the user's perusal. We have provided the detailed design of the various modules and some results related to the performance analysis of the system.
  19. Shah, B.; Raghavan, V.; Dhatric, P.; Zhao, X.: ¬A cluster-based approach for efficient content-based image retrieval using a similarity-preserving space transformation method (2006) 0.00
    0.0043722023 = product of:
      0.030605415 = sum of:
        0.030605415 = product of:
          0.076513536 = sum of:
            0.04844227 = weight(_text_:retrieval in 6118) [ClassicSimilarity], result of:
              0.04844227 = score(doc=6118,freq=14.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.442117 = fieldWeight in 6118, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=6118)
            0.028071264 = weight(_text_:system in 6118) [ClassicSimilarity], result of:
              0.028071264 = score(doc=6118,freq=4.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.24605882 = fieldWeight in 6118, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=6118)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    The techniques of clustering and space transformation have been successfully used in the past to solve a number of pattern recognition problems. In this article, the authors propose a new approach to content-based image retrieval (CBIR) that uses (a) a newly proposed similarity-preserving space transformation method to transform the original low-level image space into a highlevel vector space that enables efficient query processing, and (b) a clustering scheme that further improves the efficiency of our retrieval system. This combination is unique and the resulting system provides synergistic advantages of using both clustering and space transformation. The proposed space transformation method is shown to preserve the order of the distances in the transformed feature space. This strategy makes this approach to retrieval generic as it can be applied to object types, other than images, and feature spaces more general than metric spaces. The CBIR approach uses the inexpensive "estimated" distance in the transformed space, as opposed to the computationally inefficient "real" distance in the original space, to retrieve the desired results for a given query image. The authors also provide a theoretical analysis of the complexity of their CBIR approach when used for color-based retrieval, which shows that it is computationally more efficient than other comparable approaches. An extensive set of experiments to test the efficiency and effectiveness of the proposed approach has been performed. The results show that the approach offers superior response time (improvement of 1-2 orders of magnitude compared to retrieval approaches that either use pruning techniques like indexing, clustering, etc., or space transformation, but not both) with sufficiently high retrieval accuracy.
  20. Hubert, G.; Mothe, J.: ¬An adaptable search engine for multimodal information retrieval (2009) 0.00
    0.0041822046 = product of:
      0.029275432 = sum of:
        0.029275432 = product of:
          0.07318858 = sum of:
            0.04142957 = weight(_text_:retrieval in 2951) [ClassicSimilarity], result of:
              0.04142957 = score(doc=2951,freq=4.0), product of:
                0.109568894 = queryWeight, product of:
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.03622214 = queryNorm
                0.37811437 = fieldWeight in 2951, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.024915 = idf(docFreq=5836, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2951)
            0.03175901 = weight(_text_:system in 2951) [ClassicSimilarity], result of:
              0.03175901 = score(doc=2951,freq=2.0), product of:
                0.11408355 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.03622214 = queryNorm
                0.27838376 = fieldWeight in 2951, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2951)
          0.4 = coord(2/5)
      0.14285715 = coord(1/7)
    
    Abstract
    This article describes an information retrieval approach according to the two different search modes that exist: browsing an ontology (via categories) or defining a query in free language (via keywords). Various proposals offer approaches adapted to one of these two modes. We present a proposal leading to a system allowing the integration of both modes using the same search engine. This engine is adapted according to each possible search mode.

Languages

  • e 107
  • d 8
  • pt 1
  • sp 1
  • More… Less…