Search (74 results, page 1 of 4)

  • theme_ss:"Retrievalalgorithmen"
  1. Shah, B.; Raghavan, V.; Dhatric, P.; Zhao, X.: A cluster-based approach for efficient content-based image retrieval using a similarity-preserving space transformation method (2006) 0.05
    0.046808347 = sum of:
      0.021199638 = product of:
        0.08479855 = sum of:
          0.08479855 = weight(_text_:authors in 6118) [ClassicSimilarity], result of:
            0.08479855 = score(doc=6118,freq=4.0), product of:
              0.23809293 = queryWeight, product of:
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.052226946 = queryNorm
              0.35615736 = fieldWeight in 6118, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.0390625 = fieldNorm(doc=6118)
        0.25 = coord(1/4)
      0.025608707 = product of:
        0.051217414 = sum of:
          0.051217414 = weight(_text_:b in 6118) [ClassicSimilarity], result of:
            0.051217414 = score(doc=6118,freq=4.0), product of:
              0.18503809 = queryWeight, product of:
                3.542962 = idf(docFreq=3476, maxDocs=44218)
                0.052226946 = queryNorm
              0.2767939 = fieldWeight in 6118, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.542962 = idf(docFreq=3476, maxDocs=44218)
                0.0390625 = fieldNorm(doc=6118)
        0.5 = coord(1/2)
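
    The score breakdown above is Lucene ClassicSimilarity (tf-idf) explain output. Its ingredients are tf = sqrt(freq), idf = 1 + ln(maxDocs/(docFreq+1)), and a stored per-field length norm; a clause's score is queryWeight x fieldWeight, and the outer coord() factors rescale for partially matched queries. A minimal sketch recomputing the _text_:authors clause of this first result (plain Python; assumes no query-time boost):

      import math

      # ClassicSimilarity ingredients (natural log, no query-time boost assumed):
      def tf(freq):
          return math.sqrt(freq)                    # 2.0 = tf(freq=4.0)

      def idf(doc_freq, max_docs):
          return 1.0 + math.log(max_docs / (doc_freq + 1))

      i = idf(1258, 44218)                          # 4.558814 = idf(...)
      query_norm = 0.052226946
      field_norm = 0.0390625                        # encodes field length
      query_weight = i * query_norm                 # 0.23809293 = queryWeight
      field_weight = tf(4.0) * i * field_norm       # 0.35615736 = fieldWeight
      print(query_weight * field_weight)            # 0.08479855 = weight(_text_:authors)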
    
    Abstract
    The techniques of clustering and space transformation have been successfully used in the past to solve a number of pattern recognition problems. In this article, the authors propose a new approach to content-based image retrieval (CBIR) that uses (a) a newly proposed similarity-preserving space transformation method to transform the original low-level image space into a high-level vector space that enables efficient query processing, and (b) a clustering scheme that further improves the efficiency of the retrieval system. This combination is unique, and the resulting system provides the synergistic advantages of both clustering and space transformation. The proposed space transformation method is shown to preserve the order of the distances in the transformed feature space. This property makes the approach to retrieval generic, as it can be applied to object types other than images and to feature spaces more general than metric spaces. The CBIR approach uses the inexpensive "estimated" distance in the transformed space, as opposed to the computationally expensive "real" distance in the original space, to retrieve the desired results for a given query image. The authors also provide a theoretical analysis of the complexity of their CBIR approach when used for color-based retrieval, which shows that it is computationally more efficient than other comparable approaches. An extensive set of experiments to test the efficiency and effectiveness of the proposed approach has been performed. The results show that the approach offers superior response time (an improvement of 1-2 orders of magnitude over retrieval approaches that use either pruning techniques such as indexing or clustering, or space transformation, but not both) with sufficiently high retrieval accuracy.
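
    The abstract leaves the transform itself to the paper. One classic way to build an order-preserving transform of this kind is a pivot-based embedding: map each object to its vector of distances to a few fixed reference objects and rank by a cheap distance in that space. The sketch below illustrates that general idea only, not the authors' method; real_distance and the pivot set are caller-supplied placeholders:

      def embed(obj, pivots, real_distance):
          # Map an object to its vector of distances to a few fixed pivot objects.
          return [real_distance(obj, p) for p in pivots]

      def estimated_distance(u, v):
          # L-infinity distance between pivot vectors; if real_distance obeys the
          # triangle inequality this never exceeds the real distance, so it is a
          # cheap lower bound to rank by.
          return max(abs(a - b) for a, b in zip(u, v))

      def retrieve(query, database, pivots, real_distance, k=10):
          # Precompute embeddings once; query time then avoids real_distance.
          table = [(obj, embed(obj, pivots, real_distance)) for obj in database]
          q = embed(query, pivots, real_distance)
          ranked = sorted(table, key=lambda ov: estimated_distance(q, ov[1]))
          return [obj for obj, _ in ranked[:k]]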
  2. Lin, J.; Katz, B.: Building a reusable test collection for question answering (2006) 0.04
    0.039718196 = sum of:
      0.017988488 = product of:
        0.07195395 = sum of:
          0.07195395 = weight(_text_:authors in 5045) [ClassicSimilarity], result of:
            0.07195395 = score(doc=5045,freq=2.0), product of:
              0.23809293 = queryWeight, product of:
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.052226946 = queryNorm
              0.30220953 = fieldWeight in 5045, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.046875 = fieldNorm(doc=5045)
        0.25 = coord(1/4)
      0.021729708 = product of:
        0.043459415 = sum of:
          0.043459415 = weight(_text_:b in 5045) [ClassicSimilarity], result of:
            0.043459415 = score(doc=5045,freq=2.0), product of:
              0.18503809 = queryWeight, product of:
                3.542962 = idf(docFreq=3476, maxDocs=44218)
                0.052226946 = queryNorm
              0.23486741 = fieldWeight in 5045, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.542962 = idf(docFreq=3476, maxDocs=44218)
                0.046875 = fieldNorm(doc=5045)
        0.5 = coord(1/2)
    
    Abstract
    In contrast to traditional information retrieval systems, which return ranked lists of documents that users must manually browse through, a question answering system attempts to directly answer natural language questions posed by the user. Although such systems possess language-processing capabilities, they still rely on traditional document retrieval techniques to generate an initial candidate set of documents. In this article, the authors argue that document retrieval for question answering represents a task different from retrieving documents in response to more general retrospective information needs. Thus, to guide future system development, specialized question answering test collections must be constructed. They show that the current evaluation resources have major shortcomings; to remedy the situation, they have manually created a small, reusable question answering test collection for research purposes. In this article they describe their methodology for building this test collection and discuss issues they encountered regarding the notion of "answer correctness."
  3. Chen, Z.; Fu, B.: On the complexity of Rocchio's similarity-based relevance feedback algorithm (2007) 0.04
    0.03930773 = sum of:
      0.021199638 = product of:
        0.08479855 = sum of:
          0.08479855 = weight(_text_:authors in 578) [ClassicSimilarity], result of:
            0.08479855 = score(doc=578,freq=4.0), product of:
              0.23809293 = queryWeight, product of:
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.052226946 = queryNorm
              0.35615736 = fieldWeight in 578, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.0390625 = fieldNorm(doc=578)
        0.25 = coord(1/4)
      0.01810809 = product of:
        0.03621618 = sum of:
          0.03621618 = weight(_text_:b in 578) [ClassicSimilarity], result of:
            0.03621618 = score(doc=578,freq=2.0), product of:
              0.18503809 = queryWeight, product of:
                3.542962 = idf(docFreq=3476, maxDocs=44218)
                0.052226946 = queryNorm
              0.19572285 = fieldWeight in 578, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.542962 = idf(docFreq=3476, maxDocs=44218)
                0.0390625 = fieldNorm(doc=578)
        0.5 = coord(1/2)
    
    Abstract
    Rocchio's similarity-based relevance feedback algorithm, one of the most important query reformulation methods in information retrieval, is essentially an adaptive algorithm that learns from examples to search for documents represented by a linear classifier. Despite its popularity in various applications, there is little rigorous analysis of its learning complexity in the literature. In this article, the authors prove for the first time that the learning complexity of Rocchio's algorithm is O(d + d**2 (log d + log n)) over the discretized vector space {0, ..., n-1}**d when the inner product similarity measure is used. The upper bound on the learning complexity for searching for documents represented by a monotone linear classifier (q, 0) over {0, ..., n-1}**d can be improved to, at most, 1 + 2k(n-1)(log d + log(n-1)), where k is the number of nonzero components in q. Several lower bounds on the learning complexity are also obtained for Rocchio's algorithm. For example, the authors prove that Rocchio's algorithm has a lower bound of Omega((d choose 2) log n) on its learning complexity over the Boolean vector space {0,1}**d.
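
    The update rule behind the algorithm analyzed here is the standard Rocchio reformulation: the new query vector is a weighted combination of the old query, the centroid of judged-relevant documents, and (negatively) the centroid of judged-nonrelevant ones. A minimal sketch with the common SMART weights (the specific weights are not from this article):

      import numpy as np

      def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
          # Standard Rocchio reformulation over a term-weight vector space:
          # q' = alpha*q + beta*centroid(relevant) - gamma*centroid(nonrelevant).
          # alpha/beta/gamma are the common SMART defaults, not values from
          # the article, whose analysis concerns the inner-product similarity.
          q = alpha * np.asarray(query, dtype=float)
          if len(relevant):
              q = q + beta * np.mean(np.asarray(relevant, dtype=float), axis=0)
          if len(nonrelevant):
              q = q - gamma * np.mean(np.asarray(nonrelevant, dtype=float), axis=0)
          return np.maximum(q, 0.0)   # negative weights are conventionally clipped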
  4. Soulier, L.; Jabeur, L.B.; Tamine, L.; Bahsoun, W.: On ranking relevant entities in heterogeneous networks using a language-based model (2013) 0.04
    0.038889714 = sum of:
      0.021199638 = product of:
        0.08479855 = sum of:
          0.08479855 = weight(_text_:authors in 664) [ClassicSimilarity], result of:
            0.08479855 = score(doc=664,freq=4.0), product of:
              0.23809293 = queryWeight, product of:
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.052226946 = queryNorm
              0.35615736 = fieldWeight in 664, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.0390625 = fieldNorm(doc=664)
        0.25 = coord(1/4)
      0.017690076 = product of:
        0.03538015 = sum of:
          0.03538015 = weight(_text_:22 in 664) [ClassicSimilarity], result of:
            0.03538015 = score(doc=664,freq=2.0), product of:
              0.18288986 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052226946 = queryNorm
              0.19345059 = fieldWeight in 664, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=664)
        0.5 = coord(1/2)
    
    Abstract
    A new challenge, accessing multiple relevant entities, arises from the availability of linked heterogeneous data. In this article, we address more specifically the problem of accessing relevant entities, such as publications and authors within a bibliographic network, given an information need. We propose a novel algorithm, called BibRank, that estimates a joint relevance of documents and authors within a bibliographic network. This model ranks each type of entity using a score propagation algorithm with respect to the query topic and the structure of the underlying bi-type information entity network (a generic sketch of this propagation pattern follows the record below). Evidence sources, namely content-based and network-based scores, are both used to estimate the topical similarity between connected entities. For this purpose, authorship relationships are analyzed through a language model-based score on the one hand; on the other hand, non-topically related entities of the same type are detected through marginal citations. The article reports the results of experiments using the BibRank algorithm for an information retrieval task. The CiteSeerX bibliographic data set forms the basis for the automatic generation and evaluation of topical queries. We show that a statistically significant improvement over closely related ranking models is achieved.
    Date
    22. 3.2013 19:34:49
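
    The published BibRank equations are not reproduced in this record, but the general shape of bi-type score propagation can be sketched: documents and authors repeatedly exchange scores over authorship edges, each mixed with its own query-dependent content prior. The sketch below is illustrative only, with a made-up mixing parameter lam and plain dict inputs, not the published model:

      def propagate(doc_prior, auth_prior, wrote, lam=0.5, iters=20):
          # Alternating score propagation on a bipartite document-author graph.
          # doc_prior / auth_prior: query-dependent content scores (dicts).
          # wrote: author -> list of document ids (authorship edges).
          # Assumes every author wrote >= 1 document and vice versa.
          authored_by = {}
          for a, ds in wrote.items():
              for d in ds:
                  authored_by.setdefault(d, []).append(a)
          doc, auth = dict(doc_prior), dict(auth_prior)
          for _ in range(iters):
              # Authors inherit from their documents, mixed with their own prior ...
              auth = {a: (1 - lam) * auth_prior[a]
                         + lam * sum(doc[d] for d in wrote[a]) / len(wrote[a])
                      for a in wrote}
              # ... and documents inherit from their authors symmetrically.
              doc = {d: (1 - lam) * doc_prior[d]
                        + lam * sum(auth[a] for a in authored_by[d]) / len(authored_by[d])
                     for d in doc_prior}
          return doc, auth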
  5. Henzinger, M.R.: Link analysis in Web information retrieval (2000) 0.04
    0.037446678 = sum of:
      0.01695971 = product of:
        0.06783884 = sum of:
          0.06783884 = weight(_text_:authors in 801) [ClassicSimilarity], result of:
            0.06783884 = score(doc=801,freq=4.0), product of:
              0.23809293 = queryWeight, product of:
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.052226946 = queryNorm
              0.28492588 = fieldWeight in 801, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.03125 = fieldNorm(doc=801)
        0.25 = coord(1/4)
      0.020486966 = product of:
        0.04097393 = sum of:
          0.04097393 = weight(_text_:b in 801) [ClassicSimilarity], result of:
            0.04097393 = score(doc=801,freq=4.0), product of:
              0.18503809 = queryWeight, product of:
                3.542962 = idf(docFreq=3476, maxDocs=44218)
                0.052226946 = queryNorm
              0.22143513 = fieldWeight in 801, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.542962 = idf(docFreq=3476, maxDocs=44218)
                0.03125 = fieldNorm(doc=801)
        0.5 = coord(1/2)
    
    Content
    The goal of information retrieval is to find all documents relevant to a user query in a collection of documents. Decades of research in information retrieval were successful in developing and refining techniques that are solely word-based (see, e.g., [2]). With the advent of the web, new sources of information became available, among them the hyperlinks between documents and records of user behavior. To be precise, hypertexts (i.e., collections of documents connected by hyperlinks) have existed and have been studied for a long time. What was new was the large number of hyperlinks created by independent individuals. Hyperlinks provide a valuable source of information for web information retrieval, as we will show in this article. This area of information retrieval is commonly called link analysis. Why would one expect hyperlinks to be useful? A hyperlink is a reference to a web page B that is contained in a web page A. When the hyperlink is clicked on in a web browser, the browser displays page B. This functionality alone is not helpful for web information retrieval. However, the way hyperlinks are typically used by authors of web pages can give them valuable information content. Typically, authors create links because they think they will be useful for the readers of the pages. Thus, links are usually either navigational aids that, for example, bring the reader back to the homepage of the site, or links that point to pages whose content augments the content of the current page. The second kind of link tends to point to high-quality pages that might be on the same topic as the page containing the link.
  6. Shiri, A.A.; Revie, C.: Query expansion behavior within a thesaurus-enhanced search environment : a user-centered evaluation (2006) 0.04
    0.035798166 = product of:
      0.07159633 = sum of:
        0.07159633 = sum of:
          0.03621618 = weight(_text_:b in 56) [ClassicSimilarity], result of:
            0.03621618 = score(doc=56,freq=2.0), product of:
              0.18503809 = queryWeight, product of:
                3.542962 = idf(docFreq=3476, maxDocs=44218)
                0.052226946 = queryNorm
              0.19572285 = fieldWeight in 56, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.542962 = idf(docFreq=3476, maxDocs=44218)
                0.0390625 = fieldNorm(doc=56)
          0.03538015 = weight(_text_:22 in 56) [ClassicSimilarity], result of:
            0.03538015 = score(doc=56,freq=2.0), product of:
              0.18288986 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052226946 = queryNorm
              0.19345059 = fieldWeight in 56, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=56)
      0.5 = coord(1/2)
    
    Abstract
    The study reported here investigated the query expansion behavior of end-users interacting with a thesaurus-enhanced search system on the Web. Two groups, namely academic staff and postgraduate students, were recruited into this study. Data were collected from 90 searches performed by 30 users using the OVID interface to the CAB abstracts database. Data-gathering techniques included questionnaires, screen capturing software, and interviews. The results presented here relate to issues of search-topic and search-term characteristics, number and types of expanded queries, usefulness of thesaurus terms, and behavioral differences between academic staff and postgraduate students in their interaction. The key conclusions drawn were that (a) academic staff chose more narrow and synonymous terms than did postgraduate students, who generally selected broader and related terms; (b) topic complexity affected users' interaction with the thesaurus in that complex topics required more query expansion and search term selection; (c) users' prior topic-search experience appeared to have a significant effect on their selection and evaluation of thesaurus terms; (d) in 50% of the searches where additional terms were suggested from the thesaurus, users stated that they had not been aware of the terms at the beginning of the search; this observation was particularly noticeable in the case of postgraduate students.
    Date
    22. 7.2006 16:32:43
  7. Dannenberg, R.B.; Birmingham, W.P.; Pardo, B.; Hu, N.; Meek, C.; Tzanetakis, G.: A comparative evaluation of search techniques for query-by-humming using the MUSART testbed (2007) 0.03
    0.033098496 = sum of:
      0.014990407 = product of:
        0.05996163 = sum of:
          0.05996163 = weight(_text_:authors in 269) [ClassicSimilarity], result of:
            0.05996163 = score(doc=269,freq=2.0), product of:
              0.23809293 = queryWeight, product of:
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.052226946 = queryNorm
              0.25184128 = fieldWeight in 269, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.0390625 = fieldNorm(doc=269)
        0.25 = coord(1/4)
      0.01810809 = product of:
        0.03621618 = sum of:
          0.03621618 = weight(_text_:b in 269) [ClassicSimilarity], result of:
            0.03621618 = score(doc=269,freq=2.0), product of:
              0.18503809 = queryWeight, product of:
                3.542962 = idf(docFreq=3476, maxDocs=44218)
                0.052226946 = queryNorm
              0.19572285 = fieldWeight in 269, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.542962 = idf(docFreq=3476, maxDocs=44218)
                0.0390625 = fieldNorm(doc=269)
        0.5 = coord(1/2)
    
    Abstract
    Query-by-humming systems offer content-based searching for melodies and require no special musical training or knowledge. Many such systems have been built, but there has not been much useful evaluation and comparison in the literature due to the lack of shared databases and queries. The MUSART project testbed allows various search algorithms to be compared using a shared framework that automatically runs experiments and summarizes results. Using this testbed, the authors compared algorithms based on string alignment, melodic contour matching, a hidden Markov model, n-grams, and CubyHum. Retrieval performance is very sensitive to distance functions and the representation of pitch and rhythm, which raises questions about some previously published conclusions. Some algorithms are particularly sensitive to the quality of queries. Our queries, which are taken from human subjects in a realistic setting, are quite difficult, especially for n-gram models. Finally, simulations on query-by-humming performance as a function of database size indicate that retrieval performance falls only slowly as the database size increases.
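
    Of the techniques compared, string alignment over melodic contours is the simplest to state: reduce a melody to a string of pitch-interval directions and compare query and target by edit distance. A minimal sketch of that general technique (not the MUSART implementation):

      def contour(pitches):
          # Reduce a pitch sequence to a U/D/S (up/down/same) contour string.
          return "".join("U" if b > a else "D" if b < a else "S"
                         for a, b in zip(pitches, pitches[1:]))

      def edit_distance(s, t):
          # Classic Levenshtein alignment between two contour strings.
          prev = list(range(len(t) + 1))
          for i, cs in enumerate(s, 1):
              cur = [i]
              for j, ct in enumerate(t, 1):
                  cur.append(min(prev[j] + 1,                 # deletion
                                 cur[j - 1] + 1,              # insertion
                                 prev[j - 1] + (cs != ct)))   # substitution
              prev = cur
          return prev[-1]

      # Rank candidate melodies by contour distance to the hummed query:
      # sorted(db, key=lambda m: edit_distance(contour(query_pitches), contour(m)))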
  8. Henzinger, M.R.: Hyperlink analysis for the Web (2001) 0.03
    0.03144618 = sum of:
      0.01695971 = product of:
        0.06783884 = sum of:
          0.06783884 = weight(_text_:authors in 8) [ClassicSimilarity], result of:
            0.06783884 = score(doc=8,freq=4.0), product of:
              0.23809293 = queryWeight, product of:
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.052226946 = queryNorm
              0.28492588 = fieldWeight in 8, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.03125 = fieldNorm(doc=8)
        0.25 = coord(1/4)
      0.014486472 = product of:
        0.028972944 = sum of:
          0.028972944 = weight(_text_:b in 8) [ClassicSimilarity], result of:
            0.028972944 = score(doc=8,freq=2.0), product of:
              0.18503809 = queryWeight, product of:
                3.542962 = idf(docFreq=3476, maxDocs=44218)
                0.052226946 = queryNorm
              0.15657827 = fieldWeight in 8, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.542962 = idf(docFreq=3476, maxDocs=44218)
                0.03125 = fieldNorm(doc=8)
        0.5 = coord(1/2)
    
    Content
    Information retrieval is a computer science subfield whose goal is to find all documents relevant to a user query in a given collection of documents. As such, information retrieval should really be called document retrieval. Before the advent of the Web, IR systems were typically installed in libraries for use mostly by reference librarians. The retrieval algorithm for these systems was usually based exclusively on analysis of the words in the document. The Web changed all this. Now each Web user has access to various search engines whose retrieval algorithms often use not only the words in the documents but also information like the hyperlink structure of the Web or markup language tags. How are hyperlinks useful? The hyperlink functionality alone, that is, the hyperlink to Web page B that is contained in Web page A, is not directly useful in information retrieval. However, the way Web page authors use hyperlinks can give them valuable information content. Authors usually create hyperlinks they think will be useful to readers. Some may be navigational aids that, for example, take the reader back to the site's home page; others provide access to documents that augment the content of the current page. The latter tend to point to high-quality pages that might be on the same topic as the page containing the hyperlink. Web information retrieval systems can exploit this information to refine searches for relevant documents. Hyperlink analysis significantly improves the relevance of the search results, so much so that all major Web search engines claim to use some type of hyperlink analysis. However, the search engines do not disclose details about the type of hyperlink analysis they perform, mostly to avoid manipulation of search results by Web-positioning companies. In this article, I discuss how hyperlink analysis can be applied to ranking algorithms, and survey other ways Web search engines can use this analysis.
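
    One well-known hyperlink-analysis algorithm of the kind surveyed in articles like this is Kleinberg's HITS, which iterates two mutually recursive scores: a page is a good authority if good hubs link to it, and a good hub if it links to good authorities. A minimal sketch on a toy adjacency-list graph (illustrative; as noted above, commercial engines do not disclose their variants):

      import math

      def hits(links, iters=50):
          # Kleinberg's HITS on a directed graph {page: [pages it links to]}.
          pages = set(links) | {q for out in links.values() for q in out}
          hub = {p: 1.0 for p in pages}
          auth = {p: 1.0 for p in pages}
          for _ in range(iters):
              # Authority of p: sum of hub scores of the pages linking to p.
              auth = {p: sum(hub[q] for q in links if p in links[q]) for p in pages}
              # Hub of p: sum of authority scores of the pages p links to.
              hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}
              for scores in (auth, hub):   # normalize so scores stay bounded
                  norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
                  for p in scores:
                      scores[p] /= norm
          return hub, auth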
  9. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.03
    0.02830412 = product of:
      0.05660824 = sum of:
        0.05660824 = product of:
          0.11321648 = sum of:
            0.11321648 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.11321648 = score(doc=402,freq=2.0), product of:
                0.18288986 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.052226946 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  10. Ziegler, B.: ESS: ein schneller Algorithmus zur Mustersuche in Zeichenfolgen (1996) 0.03
    0.025351325 = product of:
      0.05070265 = sum of:
        0.05070265 = product of:
          0.1014053 = sum of:
            0.1014053 = weight(_text_:b in 7543) [ClassicSimilarity], result of:
              0.1014053 = score(doc=7543,freq=2.0), product of:
                0.18503809 = queryWeight, product of:
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.052226946 = queryNorm
                0.54802394 = fieldWeight in 7543, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.109375 = fieldNorm(doc=7543)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  11. Smeaton, A.F.; Rijsbergen, C.J. van: The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.02
    0.024766104 = product of:
      0.04953221 = sum of:
        0.04953221 = product of:
          0.09906442 = sum of:
            0.09906442 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
              0.09906442 = score(doc=2134,freq=2.0), product of:
                0.18288986 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.052226946 = queryNorm
                0.5416616 = fieldWeight in 2134, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2134)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    30. 3.2001 13:32:22
  12. Back, J.: An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.02
    0.024766104 = product of:
      0.04953221 = sum of:
        0.04953221 = product of:
          0.09906442 = sum of:
            0.09906442 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
              0.09906442 = score(doc=3445,freq=2.0), product of:
                0.18288986 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.052226946 = queryNorm
                0.5416616 = fieldWeight in 3445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3445)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    25. 8.2005 17:42:22
  13. Khoo, C.S.G.; Wan, K.-W.: A simple relevancy-ranking strategy for an interface to Boolean OPACs (2004) 0.02
    0.022876337 = sum of:
      0.010493284 = product of:
        0.041973136 = sum of:
          0.041973136 = weight(_text_:authors in 2509) [ClassicSimilarity], result of:
            0.041973136 = score(doc=2509,freq=2.0), product of:
              0.23809293 = queryWeight, product of:
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.052226946 = queryNorm
              0.17628889 = fieldWeight in 2509, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.558814 = idf(docFreq=1258, maxDocs=44218)
                0.02734375 = fieldNorm(doc=2509)
        0.25 = coord(1/4)
      0.012383052 = product of:
        0.024766104 = sum of:
          0.024766104 = weight(_text_:22 in 2509) [ClassicSimilarity], result of:
            0.024766104 = score(doc=2509,freq=2.0), product of:
              0.18288986 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.052226946 = queryNorm
              0.1354154 = fieldWeight in 2509, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.02734375 = fieldNorm(doc=2509)
        0.5 = coord(1/2)
    
    Abstract
    A relevancy-ranking algorithm for a natural language interface to Boolean online public access catalogs (OPACs) was formulated and compared with that currently used in a knowledge-based search interface called the E-Referencer, being developed by the authors. The algorithm makes use of seven well-known ranking criteria: breadth of match, section weighting, proximity of query words, variant word forms (stemming), document frequency, term frequency, and document length. The algorithm converts a natural language query into a series of increasingly broader Boolean search statements (sketched after this record). In a small experiment with ten subjects in which the algorithm was simulated by hand, the algorithm obtained good results with a mean overall precision of 0.42 and a mean average precision of 0.62, representing a 27 percent improvement in precision and a 41 percent improvement in average precision compared to the E-Referencer. The usefulness of each step in the algorithm was analyzed and suggestions are made for improving the algorithm.
    Source
    Electronic library. 22(2004) no.2, S.112-120
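
    The central device, converting a natural-language query into increasingly broader Boolean statements, can be sketched directly: require all query terms first, then any k of them for decreasing k. Ranking by the strictest statement a document satisfies reduces to ranking by breadth of match. This sketch covers only that one step, not the full seven-criteria algorithm:

      from itertools import combinations

      def broadening_statements(terms):
          # Strictest first: all terms ANDed; then OR-of-(k-term ANDs) for
          # decreasing k; broadest last: any single term.
          for k in range(len(terms), 0, -1):
              yield [set(c) for c in combinations(terms, k)]

      def rank(docs, terms):
          # A document satisfies statement k iff it contains >= k query terms,
          # so ranking by the strictest satisfied statement reduces to ranking
          # by breadth of match (number of query terms present).
          terms = set(terms)
          scored = [(len(terms & words), doc_id) for doc_id, words in docs.items()]
          return [d for n, d in sorted(scored, reverse=True) if n > 0]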
  14. Ding, Y.; Yan, E.; Frazho, A.; Caverlee, J.: PageRank for ranking authors in co-citation networks (2009) 0.02
    0.022031307 = product of:
      0.044062614 = sum of:
        0.044062614 = product of:
          0.17625046 = sum of:
            0.17625046 = weight(_text_:authors in 3161) [ClassicSimilarity], result of:
              0.17625046 = score(doc=3161,freq=12.0), product of:
                0.23809293 = queryWeight, product of:
                  4.558814 = idf(docFreq=1258, maxDocs=44218)
                  0.052226946 = queryNorm
                0.7402591 = fieldWeight in 3161, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  4.558814 = idf(docFreq=1258, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3161)
          0.25 = coord(1/4)
      0.5 = coord(1/2)
    
    Abstract
    This paper studies how varied damping factors in the PageRank algorithm influence the ranking of authors and proposes weighted PageRank algorithms. We selected the 108 most highly cited authors in the information retrieval (IR) area from the 1970s to 2008 to form the author co-citation network. We calculated the ranks of these 108 authors based on PageRank with the damping factor ranging from 0.05 to 0.95. In order to test the relationship between different measures, we compared PageRank and weighted PageRank results with the citation ranking, h-index, and centrality measures. We found that in our author co-citation network, citation rank is highly correlated with PageRank with different damping factors and also with different weighted PageRank algorithms; citation rank and PageRank are not significantly correlated with centrality measures; and h-index rank does not significantly correlate with centrality measures but does significantly correlate with other measures. The key factor that has an impact on the PageRank of authors in the author co-citation network is being co-cited with important authors.
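
    The damping factor the study varies appears directly in PageRank's update rule: with probability d the random surfer follows a graph edge, with probability 1-d it jumps to a random node. A minimal sketch of generic unweighted PageRank (the authors' weighted variants differ):

      def pagerank(graph, d=0.85, iters=50):
          # Generic PageRank on {node: [nodes it points to]}; d is the damping
          # factor the study varies from 0.05 to 0.95.
          nodes = set(graph) | {m for out in graph.values() for m in out}
          n = len(nodes)
          rank = {v: 1.0 / n for v in nodes}
          for _ in range(iters):
              new = {v: (1.0 - d) / n for v in nodes}
              for v in nodes:
                  out = graph.get(v, [])
                  share = rank[v] / (len(out) or n)
                  for m in (out or nodes):   # dangling nodes spread uniformly
                      new[m] += d * share
              rank = new
          return rank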
  15. Silveira, M.; Ribeiro-Neto, B.: Concept-based ranking : a case study in the juridical domain (2004) 0.02
    0.021729708 = product of:
      0.043459415 = sum of:
        0.043459415 = product of:
          0.08691883 = sum of:
            0.08691883 = weight(_text_:b in 2339) [ClassicSimilarity], result of:
              0.08691883 = score(doc=2339,freq=2.0), product of:
                0.18503809 = queryWeight, product of:
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.052226946 = queryNorm
                0.46973482 = fieldWeight in 2339, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.09375 = fieldNorm(doc=2339)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  16. Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.02
    0.02122809 = product of:
      0.04245618 = sum of:
        0.04245618 = product of:
          0.08491236 = sum of:
            0.08491236 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
              0.08491236 = score(doc=58,freq=2.0), product of:
                0.18288986 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.052226946 = queryNorm
                0.46428138 = fieldWeight in 58, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=58)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    14. 6.2015 22:12:44
  17. Fuhr, N.: Rankingexperimente mit gewichteter Indexierung (1986) 0.02
    0.02122809 = product of:
      0.04245618 = sum of:
        0.04245618 = product of:
          0.08491236 = sum of:
            0.08491236 = weight(_text_:22 in 2051) [ClassicSimilarity], result of:
              0.08491236 = score(doc=2051,freq=2.0), product of:
                0.18288986 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.052226946 = queryNorm
                0.46428138 = fieldWeight in 2051, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=2051)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    14. 6.2015 22:12:56
  18. Chang, R.: The development of indexing technology (1993) 0.02
    0.020486966 = product of:
      0.04097393 = sum of:
        0.04097393 = product of:
          0.08194786 = sum of:
            0.08194786 = weight(_text_:b in 7024) [ClassicSimilarity], result of:
              0.08194786 = score(doc=7024,freq=4.0), product of:
                0.18503809 = queryWeight, product of:
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.052226946 = queryNorm
                0.44287026 = fieldWeight in 7024, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7024)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Reviews the basic techniques of computerized indexing, including various file access methods such as the Sequential Access Method (SAM), Direct Access Method (DAM), Indexed Sequential Access Method (ISAM), and Virtual Storage Access Method (VSAM), as well as various B-tree (balanced tree) structures. Illustrates how records are stored and accessed, and how B-trees are used to improve information retrieval and maintenance operations.
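
    The B-tree lookup underlying these structures is a short recursion: binary-search the sorted keys within a node, and if the key is absent, descend into the child between the two neighboring keys. A minimal sketch of search only (insertion with node splitting is omitted; the node layout is a simplified illustration):

      from bisect import bisect_left

      class BTreeNode:
          def __init__(self, keys, children=None):
              self.keys = keys                  # sorted keys within the node
              self.children = children or []    # empty for leaf nodes

      def search(node, key):
          # Binary-search this node's keys, then descend into the child
          # between the two neighboring keys if the key is not found here.
          i = bisect_left(node.keys, key)
          if i < len(node.keys) and node.keys[i] == key:
              return True
          if not node.children:                 # leaf: nowhere left to descend
              return False
          return search(node.children[i], key)

      # A two-level example holding 1..9:
      root = BTreeNode([4, 7], [BTreeNode([1, 2, 3]),
                                BTreeNode([5, 6]),
                                BTreeNode([8, 9])])
      assert search(root, 6) and not search(root, 10)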
  19. Chang, R.: Keyword searching and indexing (1993) 0.02
    0.020486966 = product of:
      0.04097393 = sum of:
        0.04097393 = product of:
          0.08194786 = sum of:
            0.08194786 = weight(_text_:b in 7223) [ClassicSimilarity], result of:
              0.08194786 = score(doc=7223,freq=4.0), product of:
                0.18503809 = queryWeight, product of:
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.052226946 = queryNorm
                0.44287026 = fieldWeight in 7223, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.0625 = fieldNorm(doc=7223)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Explains how a computer indexing system works. Reviews fundamentals of how data are stored and retrieved by computers. Describes B-Tree and B+-Tree indexing structures. Gives basic keyword searching techniques that the user must apply to make use of the indexing programs. The demand for keyword retrieval is increasing and librarians should expect to see the keyword-indexing feature become commonly available
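
    Keyword searching over such structures reduces to an inverted index: each term maps to the list of documents containing it, and a multi-word query intersects those lists. A minimal sketch (illustrative; production systems keep the postings in disk-based structures such as the B+-Trees described above):

      from collections import defaultdict

      def build_index(docs):
          # docs: {doc_id: text}. Returns term -> sorted list of doc ids.
          index = defaultdict(set)
          for doc_id, text in docs.items():
              for term in text.lower().split():
                  index[term].add(doc_id)
          return {t: sorted(ids) for t, ids in index.items()}

      def keyword_search(index, query):
          # AND semantics: documents containing every query term.
          postings = [set(index.get(t, [])) for t in query.lower().split()]
          return sorted(set.intersection(*postings)) if postings else []

      docs = {1: "B-tree indexing structures", 2: "keyword searching and indexing"}
      print(keyword_search(build_index(docs), "indexing"))   # [1, 2]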
  20. Quint, B.: Check out the new RANK command on DIALOG (1993) 0.02
    0.01810809 = product of:
      0.03621618 = sum of:
        0.03621618 = product of:
          0.07243236 = sum of:
            0.07243236 = weight(_text_:b in 6640) [ClassicSimilarity], result of:
              0.07243236 = score(doc=6640,freq=2.0), product of:
                0.18503809 = queryWeight, product of:
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.052226946 = queryNorm
                0.3914457 = fieldWeight in 6640, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.542962 = idf(docFreq=3476, maxDocs=44218)
                  0.078125 = fieldNorm(doc=6640)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    

Languages

  • e 63
  • d 11

Types

  • a 66
  • m 5
  • s 2
  • r 1
  • x 1