Search (24 results, page 1 of 2)

  • × year_i:[2000 TO 2010}
  • × theme_ss:"Retrievalalgorithmen"
  1. Klas, C.-P.; Fuhr, N.; Schaefer, A.: Evaluating strategic support for information access in the DAFFODIL system (2004) 0.05
    0.046260543 = product of:
      0.09252109 = sum of:
        0.09252109 = sum of:
          0.05574795 = weight(_text_:n in 2419) [ClassicSimilarity], result of:
            0.05574795 = score(doc=2419,freq=2.0), product of:
              0.19504215 = queryWeight, product of:
                4.3116565 = idf(docFreq=1611, maxDocs=44218)
                0.045236014 = queryNorm
              0.28582513 = fieldWeight in 2419, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3116565 = idf(docFreq=1611, maxDocs=44218)
                0.046875 = fieldNorm(doc=2419)
          0.036773134 = weight(_text_:22 in 2419) [ClassicSimilarity], result of:
            0.036773134 = score(doc=2419,freq=2.0), product of:
              0.15840882 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045236014 = queryNorm
              0.23214069 = fieldWeight in 2419, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2419)
      0.5 = coord(1/2)
    
    Date
    16.11.2008 16:22:48
  2. Chen, Z.; Fu, B.: On the complexity of Rocchio's similarity-based relevance feedback algorithm (2007) 0.03
    0.028448759 = product of:
      0.056897517 = sum of:
        0.056897517 = product of:
          0.113795035 = sum of:
            0.113795035 = weight(_text_:n in 578) [ClassicSimilarity], result of:
              0.113795035 = score(doc=578,freq=12.0), product of:
                0.19504215 = queryWeight, product of:
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.045236014 = queryNorm
                0.58343816 = fieldWeight in 578, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=578)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Rocchio's similarity-based relevance feedback algorithm, one of the most important query reformation methods in information retrieval, is essentially an adaptive learning algorithm from examples in searching for documents represented by a linear classifier. Despite its popularity in various applications, there is little rigorous analysis of its learning complexity in literature. In this article, the authors prove for the first time that the learning complexity of Rocchio's algorithm is O(d + d**2(log d + log n)) over the discretized vector space {0, ... , n - 1 }**d when the inner product similarity measure is used. The upper bound on the learning complexity for searching for documents represented by a monotone linear classifier (q, 0) over {0, ... , n - 1 }d can be improved to, at most, 1 + 2k (n - 1) (log d + log(n - 1)), where k is the number of nonzero components in q. Several lower bounds on the learning complexity are also obtained for Rocchio's algorithm. For example, the authors prove that Rocchio's algorithm has a lower bound Omega((d über 2)log n) on its learning complexity over the Boolean vector space {0,1}**d.
  3. Cheng, C.-S.; Chung, C.-P.; Shann, J.J.-J.: Fast query evaluation through document identifier assignment for inverted file-based information retrieval systems (2006) 0.03
    0.025970044 = product of:
      0.051940087 = sum of:
        0.051940087 = product of:
          0.103880174 = sum of:
            0.103880174 = weight(_text_:n in 979) [ClassicSimilarity], result of:
              0.103880174 = score(doc=979,freq=10.0), product of:
                0.19504215 = queryWeight, product of:
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.045236014 = queryNorm
                0.53260374 = fieldWeight in 979, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=979)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Compressing an inverted file can greatly improve query performance of an information retrieval system (IRS) by reducing disk I/Os. We observe that a good document identifier assignment (DIA) can make the document identifiers in the posting lists more clustered, and result in better compression as well as shorter query processing time. In this paper, we tackle the NP-complete problem of finding an optimal DIA to minimize the average query processing time in an IRS when the probability distribution of query terms is given. We indicate that the greedy nearest neighbor (Greedy-NN) algorithm can provide excellent performance for this problem. However, the Greedy-NN algorithm is inappropriate if used in large-scale IRSs, due to its high complexity O(N2 × n), where N denotes the number of documents and n denotes the number of distinct terms. In real-world IRSs, the distribution of query terms is skewed. Based on this fact, we propose a fast O(N × n) heuristic, called partition-based document identifier assignment (PBDIA) algorithm, which can efficiently assign consecutive document identifiers to those documents containing frequently used query terms, and improve compression efficiency of the posting lists for those terms. This can result in reduced query processing time. The experimental results show that the PBDIA algorithm can yield a competitive performance versus the Greedy-NN for the DIA problem, and that this optimization problem has significant advantages for both long queries and parallel information retrieval (IR).
  4. Back, J.: ¬An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.02
    0.021450995 = product of:
      0.04290199 = sum of:
        0.04290199 = product of:
          0.08580398 = sum of:
            0.08580398 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
              0.08580398 = score(doc=3445,freq=2.0), product of:
                0.15840882 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045236014 = queryNorm
                0.5416616 = fieldWeight in 3445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3445)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    25. 8.2005 17:42:22
  5. Dannenberg, R.B.; Birmingham, W.P.; Pardo, B.; Hu, N.; Meek, C.; Tzanetakis, G.: ¬A comparative evaluation of search techniques for query-by-humming using the MUSART testbed (2007) 0.02
    0.020116309 = product of:
      0.040232617 = sum of:
        0.040232617 = product of:
          0.080465235 = sum of:
            0.080465235 = weight(_text_:n in 269) [ClassicSimilarity], result of:
              0.080465235 = score(doc=269,freq=6.0), product of:
                0.19504215 = queryWeight, product of:
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.045236014 = queryNorm
                0.41255307 = fieldWeight in 269, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=269)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Query-by-humming systems offer content-based searching for melodies and require no special musical training or knowledge. Many such systems have been built, but there has not been much useful evaluation and comparison in the literature due to the lack of shared databases and queries. The MUSART project testbed allows various search algorithms to be compared using a shared framework that automatically runs experiments and summarizes results. Using this testbed, the authors compared algorithms based on string alignment, melodic contour matching, a hidden Markov model, n-grams, and CubyHum. Retrieval performance is very sensitive to distance functions and the representation of pitch and rhythm, which raises questions about some previously published conclusions. Some algorithms are particularly sensitive to the quality of queries. Our queries, which are taken from human subjects in a realistic setting, are quite difficult, especially for n-gram models. Finally, simulations on query-by-humming performance as a function of database size indicate that retrieval performance falls only slowly as the database size increases.
  6. Losee, R.M.; Church Jr., L.: Are two document clusters better than one? : the cluster performance question for information retrieval (2005) 0.02
    0.016259817 = product of:
      0.032519635 = sum of:
        0.032519635 = product of:
          0.06503927 = sum of:
            0.06503927 = weight(_text_:n in 3270) [ClassicSimilarity], result of:
              0.06503927 = score(doc=3270,freq=2.0), product of:
                0.19504215 = queryWeight, product of:
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.045236014 = queryNorm
                0.33346266 = fieldWeight in 3270, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3270)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    When do information retrieval systems using two document clusters provide better retrieval performance than systems using no clustering? We answer this question for one set of assumptions and suggest how this may be studied with other assumptions. The "Cluster Hypothesis" asks an empirical question about the relationships between documents and user-supplied relevance judgments, while the "Cluster Performance Question" proposed here focuses an the when and why of information retrieval or digital library performance for clustered and unclustered text databases. This may be generalized to study the relative performance of m versus n clusters.
  7. Bidoki, A.M.Z.; Yazdani, N.: an intelligent ranking algorithm for web pages : DistanceRank (2008) 0.02
    0.016259817 = product of:
      0.032519635 = sum of:
        0.032519635 = product of:
          0.06503927 = sum of:
            0.06503927 = weight(_text_:n in 2068) [ClassicSimilarity], result of:
              0.06503927 = score(doc=2068,freq=2.0), product of:
                0.19504215 = queryWeight, product of:
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.045236014 = queryNorm
                0.33346266 = fieldWeight in 2068, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2068)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  8. Beitzel, S.M.; Jensen, E.C.; Chowdhury, A.; Grossman, D.; Frieder, O; Goharian, N.: Fusion of effective retrieval strategies in the same information retrieval system (2004) 0.01
    0.013936987 = product of:
      0.027873974 = sum of:
        0.027873974 = product of:
          0.05574795 = sum of:
            0.05574795 = weight(_text_:n in 2502) [ClassicSimilarity], result of:
              0.05574795 = score(doc=2502,freq=2.0), product of:
                0.19504215 = queryWeight, product of:
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.045236014 = queryNorm
                0.28582513 = fieldWeight in 2502, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2502)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  9. MacFarlane, A.; Robertson, S.E.; McCann, J.A.: Parallel computing for passage retrieval (2004) 0.01
    0.012257711 = product of:
      0.024515422 = sum of:
        0.024515422 = product of:
          0.049030844 = sum of:
            0.049030844 = weight(_text_:22 in 5108) [ClassicSimilarity], result of:
              0.049030844 = score(doc=5108,freq=2.0), product of:
                0.15840882 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045236014 = queryNorm
                0.30952093 = fieldWeight in 5108, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5108)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    20. 1.2007 18:30:22
  10. Losada, D.E.; Barreiro, A.: Emebedding term similarity and inverse document frequency into a logical model of information retrieval (2003) 0.01
    0.012257711 = product of:
      0.024515422 = sum of:
        0.024515422 = product of:
          0.049030844 = sum of:
            0.049030844 = weight(_text_:22 in 1422) [ClassicSimilarity], result of:
              0.049030844 = score(doc=1422,freq=2.0), product of:
                0.15840882 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045236014 = queryNorm
                0.30952093 = fieldWeight in 1422, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1422)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2003 19:27:23
  11. Fuhr, N.: Theorie des Information Retrieval I : Modelle (2004) 0.01
    0.011614156 = product of:
      0.023228312 = sum of:
        0.023228312 = product of:
          0.046456624 = sum of:
            0.046456624 = weight(_text_:n in 2912) [ClassicSimilarity], result of:
              0.046456624 = score(doc=2912,freq=2.0), product of:
                0.19504215 = queryWeight, product of:
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.045236014 = queryNorm
                0.23818761 = fieldWeight in 2912, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2912)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  12. Urbain, J.; Goharian, N.; Frieder, O.: Probabilistic passage models for semantic search of genomics literature (2008) 0.01
    0.011614156 = product of:
      0.023228312 = sum of:
        0.023228312 = product of:
          0.046456624 = sum of:
            0.046456624 = weight(_text_:n in 2380) [ClassicSimilarity], result of:
              0.046456624 = score(doc=2380,freq=2.0), product of:
                0.19504215 = queryWeight, product of:
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.045236014 = queryNorm
                0.23818761 = fieldWeight in 2380, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2380)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  13. Schaefer, A.; Jordan, M.; Klas, C.-P.; Fuhr, N.: Active support for query formulation in virtual digital libraries : a case study with DAFFODIL (2005) 0.01
    0.011614156 = product of:
      0.023228312 = sum of:
        0.023228312 = product of:
          0.046456624 = sum of:
            0.046456624 = weight(_text_:n in 4296) [ClassicSimilarity], result of:
              0.046456624 = score(doc=4296,freq=2.0), product of:
                0.19504215 = queryWeight, product of:
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.045236014 = queryNorm
                0.23818761 = fieldWeight in 4296, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4296)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  14. Kanaeva, Z.: Ranking: Google und CiteSeer (2005) 0.01
    0.010725497 = product of:
      0.021450995 = sum of:
        0.021450995 = product of:
          0.04290199 = sum of:
            0.04290199 = weight(_text_:22 in 3276) [ClassicSimilarity], result of:
              0.04290199 = score(doc=3276,freq=2.0), product of:
                0.15840882 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045236014 = queryNorm
                0.2708308 = fieldWeight in 3276, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3276)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    20. 3.2005 16:23:22
  15. White, R.W.; Marchionini, G.: Examining the effectiveness of real-time query expansion (2007) 0.01
    0.009291325 = product of:
      0.01858265 = sum of:
        0.01858265 = product of:
          0.0371653 = sum of:
            0.0371653 = weight(_text_:n in 913) [ClassicSimilarity], result of:
              0.0371653 = score(doc=913,freq=2.0), product of:
                0.19504215 = queryWeight, product of:
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.045236014 = queryNorm
                0.19055009 = fieldWeight in 913, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.3116565 = idf(docFreq=1611, maxDocs=44218)
                  0.03125 = fieldNorm(doc=913)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Interactive query expansion (IQE) (c.f. [Efthimiadis, E. N. (1996). Query expansion. Annual Review of Information Systems and Technology, 31, 121-187]) is a potentially useful technique to help searchers formulate improved query statements, and ultimately retrieve better search results. However, IQE is seldom used in operational settings. Two possible explanations for this are that IQE is generally not integrated into searchers' established information-seeking behaviors (e.g., examining lists of documents), and it may not be offered at a time in the search when it is needed most (i.e., during the initial query formulation). These challenges can be addressed by coupling IQE more closely with familiar search activities, rather than as a separate functionality that searchers must learn. In this article we introduce and evaluate a variant of IQE known as Real-Time Query Expansion (RTQE). As a searcher enters their query in a text box at the interface, RTQE provides a list of suggested additional query terms, in effect offering query expansion options while the query is formulated. To investigate how the technique is used - and when it may be useful - we conducted a user study comparing three search interfaces: a baseline interface with no query expansion support; an interface that provides expansion options during query entry, and a third interface that provides options after queries have been submitted to a search system. The results show that offering RTQE leads to better quality initial queries, more engagement in the search, and an increase in the uptake of query expansion. However, the results also imply that care must be taken when implementing RTQE interactively. Our findings have broad implications for how IQE should be offered, and form part of our research on the development of techniques to support the increased use of query expansion.
  16. Crestani, F.; Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue (2003) 0.01
    0.0091932835 = product of:
      0.018386567 = sum of:
        0.018386567 = product of:
          0.036773134 = sum of:
            0.036773134 = weight(_text_:22 in 1451) [ClassicSimilarity], result of:
              0.036773134 = score(doc=1451,freq=2.0), product of:
                0.15840882 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045236014 = queryNorm
                0.23214069 = fieldWeight in 1451, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1451)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2003 19:27:36
  17. Fan, W.; Fox, E.A.; Pathak, P.; Wu, H.: ¬The effects of fitness functions an genetic programming-based ranking discovery for Web search (2004) 0.01
    0.0091932835 = product of:
      0.018386567 = sum of:
        0.018386567 = product of:
          0.036773134 = sum of:
            0.036773134 = weight(_text_:22 in 2239) [ClassicSimilarity], result of:
              0.036773134 = score(doc=2239,freq=2.0), product of:
                0.15840882 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045236014 = queryNorm
                0.23214069 = fieldWeight in 2239, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2239)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    31. 5.2004 19:22:06
  18. Furner, J.: ¬A unifying model of document relatedness for hybrid search engines (2003) 0.01
    0.0091932835 = product of:
      0.018386567 = sum of:
        0.018386567 = product of:
          0.036773134 = sum of:
            0.036773134 = weight(_text_:22 in 2717) [ClassicSimilarity], result of:
              0.036773134 = score(doc=2717,freq=2.0), product of:
                0.15840882 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045236014 = queryNorm
                0.23214069 = fieldWeight in 2717, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2717)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    11. 9.2004 17:32:22
  19. Witschel, H.F.: Global term weights in distributed environments (2008) 0.01
    0.0091932835 = product of:
      0.018386567 = sum of:
        0.018386567 = product of:
          0.036773134 = sum of:
            0.036773134 = weight(_text_:22 in 2096) [ClassicSimilarity], result of:
              0.036773134 = score(doc=2096,freq=2.0), product of:
                0.15840882 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045236014 = queryNorm
                0.23214069 = fieldWeight in 2096, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2096)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    1. 8.2008 9:44:22
  20. Campos, L.M. de; Fernández-Luna, J.M.; Huete, J.F.: Implementing relevance feedback in the Bayesian network retrieval model (2003) 0.01
    0.0091932835 = product of:
      0.018386567 = sum of:
        0.018386567 = product of:
          0.036773134 = sum of:
            0.036773134 = weight(_text_:22 in 825) [ClassicSimilarity], result of:
              0.036773134 = score(doc=825,freq=2.0), product of:
                0.15840882 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.045236014 = queryNorm
                0.23214069 = fieldWeight in 825, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=825)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 3.2003 19:30:19