Search (54 results, page 1 of 3)

  • theme_ss:"Retrievalalgorithmen"
  1. Ruthven, I.; Lalmas, M.: Selective relevance feedback using term characteristics (1999) 0.04
    0.043355502 = product of:
      0.1300665 = sum of:
        0.08805987 = weight(_text_:et in 3824) [ClassicSimilarity], result of:
          0.08805987 = score(doc=3824,freq=2.0), product of:
            0.16986917 = queryWeight, product of:
              4.692005 = idf(docFreq=1101, maxDocs=44218)
              0.03620396 = queryNorm
            0.5183982 = fieldWeight in 3824, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.692005 = idf(docFreq=1101, maxDocs=44218)
              0.078125 = fieldNorm(doc=3824)
        0.042006623 = product of:
          0.084013246 = sum of:
            0.084013246 = weight(_text_:al in 3824) [ClassicSimilarity], result of:
              0.084013246 = score(doc=3824,freq=2.0), product of:
                0.16592026 = queryWeight, product of:
                  4.582931 = idf(docFreq=1228, maxDocs=44218)
                  0.03620396 = queryNorm
                0.5063471 = fieldWeight in 3824, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.582931 = idf(docFreq=1228, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3824)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Source
    Vocabulary as a central concept in digital libraries: interdisciplinary concepts, challenges, and opportunities : proceedings of the Third International Conference on Conceptions of Library and Information Science (COLIS3), Dubrovnik, Croatia, 23-26 May 1999. Ed. by T. Aparac et al
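    The score breakdowns printed for each hit are Lucene ClassicSimilarity explain output: a term's contribution is queryWeight × fieldWeight, where queryWeight = idf × queryNorm, fieldWeight = tf × idf × fieldNorm, tf = sqrt(termFreq), and idf = 1 + ln(maxDocs / (docFreq + 1)). A minimal sketch that recomputes the '_text_:et' contribution of this first hit from the numbers shown above (nothing beyond those printed values is assumed):

      import math

      def classic_similarity(term_freq, doc_freq, max_docs, query_norm, field_norm):
          """Recompute one term's contribution as shown in Lucene's explain output."""
          tf = math.sqrt(term_freq)                        # 1.4142135 for termFreq=2.0
          idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 4.692005 for docFreq=1101
          query_weight = idf * query_norm                  # 0.16986917
          field_weight = tf * idf * field_norm             # 0.5183982
          return query_weight * field_weight               # 0.08805987

      score = classic_similarity(term_freq=2.0, doc_freq=1101, max_docs=44218,
                                 query_norm=0.03620396, field_norm=0.078125)
      print(round(score, 8))   # ~0.08805987, matching the weight(_text_:et ...) line

    The document score printed next to each entry is the sum of such per-term contributions scaled by the coord() factors, e.g. 0.1300665 × 2/6 ≈ 0.0434 for this hit.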
  2. Khoo, C.S.G.; Wan, K.-W.: ¬A simple relevancy-ranking strategy for an interface to Boolean OPACs (2004) 0.03
    0.03411328 = product of:
      0.102339834 = sum of:
        0.043587416 = weight(_text_:et in 2509) [ClassicSimilarity], result of:
          0.043587416 = score(doc=2509,freq=4.0), product of:
            0.16986917 = queryWeight, product of:
              4.692005 = idf(docFreq=1101, maxDocs=44218)
              0.03620396 = queryNorm
            0.25659403 = fieldWeight in 2509, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.692005 = idf(docFreq=1101, maxDocs=44218)
              0.02734375 = fieldNorm(doc=2509)
        0.058752418 = sum of:
          0.041584436 = weight(_text_:al in 2509) [ClassicSimilarity], result of:
            0.041584436 = score(doc=2509,freq=4.0), product of:
              0.16592026 = queryWeight, product of:
                4.582931 = idf(docFreq=1228, maxDocs=44218)
                0.03620396 = queryNorm
              0.25062904 = fieldWeight in 2509, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.582931 = idf(docFreq=1228, maxDocs=44218)
                0.02734375 = fieldNorm(doc=2509)
          0.01716798 = weight(_text_:22 in 2509) [ClassicSimilarity], result of:
            0.01716798 = score(doc=2509,freq=2.0), product of:
              0.12678011 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03620396 = queryNorm
              0.1354154 = fieldWeight in 2509, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.02734375 = fieldNorm(doc=2509)
      0.33333334 = coord(2/6)
    
    Content
    "Most Web search engines accept natural language queries, perform some kind of fuzzy matching and produce ranked output, displaying first the documents that are most likely to be relevant. On the other hand, most library online public access catalogs (OPACs) an the Web are still Boolean retrieval systems that perform exact matching, and require users to express their search requests precisely in a Boolean search language and to refine their search statements to improve the search results. It is well-documented that users have difficulty searching Boolean OPACs effectively (e.g. Borgman, 1996; Ensor, 1992; Wallace, 1993). One approach to making OPACs easier to use is to develop a natural language search interface that acts as a middleware between the user's Web browser and the OPAC system. The search interface can accept a natural language query from the user and reformulate it as a series of Boolean search statements that are then submitted to the OPAC. The records retrieved by the OPAC are ranked by the search interface before forwarding them to the user's Web browser. The user, then, does not need to interact directly with the Boolean OPAC but with the natural language search interface or search intermediary. The search interface interacts with the OPAC system an the user's behalf. The advantage of this approach is that no modification to the OPAC or library system is required. Furthermore, the search interface can access multiple OPACs, acting as a meta search engine, and integrate search results from various OPACs before sending them to the user. The search interface needs to incorporate a method for converting the user's natural language query into a series of Boolean search statements, and for ranking the OPAC records retrieved. The purpose of this study was to develop a relevancyranking algorithm for a search interface to Boolean OPAC systems. This is part of an on-going effort to develop a knowledge-based search interface to OPACs called the E-Referencer (Khoo et al., 1998, 1999; Poo et al., 2000). E-Referencer v. 2 that has been implemented applies a repertoire of initial search strategies and reformulation strategies to retrieve records from OPACs using the Z39.50 protocol, and also assists users in mapping query keywords to the Library of Congress subject headings."
    Source
    Electronic library. 22(2004) no.2, S.112-120
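    The content excerpt above describes turning a free-text query into a series of Boolean search statements before submitting them to the OPAC. A minimal sketch of that idea (the stopword list, the three-step broadening strategy, and the Z39.50 submission are illustrative assumptions, not the E-Referencer's actual rules):

      STOPWORDS = {"a", "an", "the", "of", "for", "in", "on", "to", "and", "or"}

      def boolean_reformulations(query: str) -> list[str]:
          """Turn a natural language query into progressively broader Boolean statements."""
          terms = [t for t in query.lower().split() if t not in STOPWORDS]
          strategies = ["(" + " AND ".join(terms) + ")"]            # strict: all terms
          if len(terms) > 2:
              pairs = [f"({a} AND {b})" for a, b in zip(terms, terms[1:])]
              strategies.append(" OR ".join(pairs))                  # looser: adjacent pairs
          strategies.append("(" + " OR ".join(terms) + ")")          # broadest: any term
          return strategies

      for statement in boolean_reformulations("relevance ranking for online catalogs"):
          print(statement)   # each statement would be sent to the OPAC (e.g. via Z39.50) in turn

    Records returned by the broader statements would then be ranked by the interface before being shown to the user, as the excerpt describes.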
  3. Ding, Y.; Chowdhury, G.; Foo, S.: Organising keywords in a Web search environment : a methodology based on co-word analysis (2000) 0.03
    0.0260133 = product of:
      0.0780399 = sum of:
        0.052835923 = weight(_text_:et in 105) [ClassicSimilarity], result of:
          0.052835923 = score(doc=105,freq=2.0), product of:
            0.16986917 = queryWeight, product of:
              4.692005 = idf(docFreq=1101, maxDocs=44218)
              0.03620396 = queryNorm
            0.3110389 = fieldWeight in 105, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.692005 = idf(docFreq=1101, maxDocs=44218)
              0.046875 = fieldNorm(doc=105)
        0.025203973 = product of:
          0.050407946 = sum of:
            0.050407946 = weight(_text_:al in 105) [ClassicSimilarity], result of:
              0.050407946 = score(doc=105,freq=2.0), product of:
                0.16592026 = queryWeight, product of:
                  4.582931 = idf(docFreq=1228, maxDocs=44218)
                  0.03620396 = queryNorm
                0.30380827 = fieldWeight in 105, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.582931 = idf(docFreq=1228, maxDocs=44218)
                  0.046875 = fieldNorm(doc=105)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Source
    Dynamism and stability in knowledge organization: Proceedings of the 6th International ISKO-Conference, 10-13 July 2000, Toronto, Canada. Ed.: C. Beghtol et al
  4. Bodoff, D.; Robertson, S.: ¬A new unified probabilistic model (2004) 0.03
    0.0260133 = product of:
      0.0780399 = sum of:
        0.052835923 = weight(_text_:et in 2129) [ClassicSimilarity], result of:
          0.052835923 = score(doc=2129,freq=2.0), product of:
            0.16986917 = queryWeight, product of:
              4.692005 = idf(docFreq=1101, maxDocs=44218)
              0.03620396 = queryNorm
            0.3110389 = fieldWeight in 2129, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.692005 = idf(docFreq=1101, maxDocs=44218)
              0.046875 = fieldNorm(doc=2129)
        0.025203973 = product of:
          0.050407946 = sum of:
            0.050407946 = weight(_text_:al in 2129) [ClassicSimilarity], result of:
              0.050407946 = score(doc=2129,freq=2.0), product of:
                0.16592026 = queryWeight, product of:
                  4.582931 = idf(docFreq=1228, maxDocs=44218)
                  0.03620396 = queryNorm
                0.30380827 = fieldWeight in 2129, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.582931 = idf(docFreq=1228, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2129)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    This paper proposes a new unified probabilistic model. Two previous models, Robertson et al.'s "Model 0" and "Model 3," each have strengths and weaknesses. The strength of Model 0, not found in Model 3, is that it does not require relevance data about the particular document or query, and, related to that, its probability estimates are straightforward. The strength of Model 3, not found in Model 0, is that it can utilize feedback information about the particular document and query in question. In this paper we introduce a new unified probabilistic model that combines these strengths: the expression of its probabilities is straightforward, it does not require that data be available for the particular document or query in question, but it can utilize such specific data if it is available. The model is one way to resolve the difficulty of combining two marginal views in probabilistic retrieval.
  5. Cross-language information retrieval (1998) 0.02
    0.024983555 = product of:
      0.074950665 = sum of:
        0.04922697 = weight(_text_:et in 6299) [ClassicSimilarity], result of:
          0.04922697 = score(doc=6299,freq=10.0), product of:
            0.16986917 = queryWeight, product of:
              4.692005 = idf(docFreq=1101, maxDocs=44218)
              0.03620396 = queryNorm
            0.28979343 = fieldWeight in 6299, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.692005 = idf(docFreq=1101, maxDocs=44218)
              0.01953125 = fieldNorm(doc=6299)
        0.025723698 = product of:
          0.051447395 = sum of:
            0.051447395 = weight(_text_:al in 6299) [ClassicSimilarity], result of:
              0.051447395 = score(doc=6299,freq=12.0), product of:
                0.16592026 = queryWeight, product of:
                  4.582931 = idf(docFreq=1228, maxDocs=44218)
                  0.03620396 = queryNorm
                0.31007302 = fieldWeight in 6299, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  4.582931 = idf(docFreq=1228, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=6299)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Footnote
    Christian Fluhr et al (DIST/SMTI, France) outline the EMIR (European Multilingual Information Retrieval) and ESPRIT projects. They found that using SYSTRAN to machine translate queries and to access material from various multilingual databases produced less relevant results than a method referred to as 'multilingual reformulation' (the mechanics of which are only hinted at). An interesting technique is Latent Semantic Indexing (LSI), described by Michael Littman et al (Brown University) and, most clearly, by David Evans et al (Carnegie Mellon University). LSI involves creating matrices of documents and the terms they contain and 'fitting' related documents into a reduced matrix space. This effectively allows queries to be mapped onto a common semantic representation of the documents. Eugenio Picchi and Carol Peters (Pisa) report on a procedure to create links between translation equivalents in an Italian-English parallel corpus. The links are used to construct parallel linguistic contexts in real-time for any term or combination of terms that is being searched for in either language. Their interest is primarily lexicographic but they plan to apply the same procedure to comparable corpora, i.e. to texts which are not translations of each other but which share the same domain. Kiyoshi Yamabana et al (NEC, Japan) address the issue of how to disambiguate between alternative translations of query terms. Their DMAX (double maximise) method looks at co-occurrence frequencies between both source language words and target language words in order to arrive at the most probable translation. The statistical data for the decision are derived, not from the translation texts but independently from monolingual corpora in each language. An interactive user interface allows the user to influence the selection of terms during the matching process. Denis Gachot et al (SYSTRAN) describe the SYSTRAN NLP browser, a prototype tool which collects parsing information derived from a text or corpus previously translated with SYSTRAN. The user enters queries into the browser in either a structured or free form and receives grammatical and lexical information about the source text and/or its translation.
    The retrieved output from a query including the phrase 'big rockets' may be, for instance, a sentence containing 'giant rocket' which is semantically ranked above 'military rocket'. David Hull (Xerox Research Centre, Grenoble) describes an implementation of a weighted Boolean model for Spanish-English CLIR. Users construct Boolean-type queries, weighting each term in the query, which is then translated by an on-line dictionary before being applied to the database. Comparisons with the performance of unweighted free-form queries ('vector space' models) proved encouraging. Two contributions consider the evaluation of CLIR systems. In order to by-pass the time-consuming and expensive process of assembling a standard collection of documents and of user queries against which the performance of a CLIR system is manually assessed, Páraic Sheridan et al (ETH Zurich) propose a method based on retrieving 'seed documents'. This involves identifying a unique document in a database (the 'seed document') and, for a number of queries, measuring how fast it is retrieved. The authors have also assembled a large database of multilingual news documents for testing purposes. By storing the (fairly short) documents in a structured form tagged with descriptor codes (e.g. for topic, country and area), the test suite is easily expanded while remaining consistent for the purposes of testing. Douglas Oard and Bonnie Dorr (University of Maryland) describe an evaluation methodology which appears to apply LSI techniques in order to filter and rank incoming documents designed for testing CLIR systems. The volume provides the reader with an excellent overview of several projects in CLIR. It is well supported with references and is intended as a secondary text for researchers and practitioners. It highlights the need for a good, general tutorial introduction to the field."
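    The footnote's description of LSI, building a term-document matrix and 'fitting' documents into a reduced matrix space, corresponds to a truncated singular value decomposition. A minimal sketch of that idea (the toy documents, the rank k=2, and the query are illustrative assumptions, not the systems reviewed above):

      import numpy as np

      docs = ["giant rocket launch", "military rocket defence", "library catalog search"]
      vocab = sorted({w for d in docs for w in d.split()})

      # term-document matrix of raw counts (rows = terms, columns = documents)
      A = np.array([[d.split().count(w) for d in docs] for w in vocab], dtype=float)

      # truncated SVD: keep only the k largest singular values ("reduced matrix space")
      U, s, Vt = np.linalg.svd(A, full_matrices=False)
      k = 2
      doc_vecs = (U[:, :k].T @ A).T            # each document projected onto k latent dimensions

      def fold_in(query: str) -> np.ndarray:
          """Project a query into the same reduced space so it can be compared to documents."""
          q = np.array([query.split().count(w) for w in vocab], dtype=float)
          return U[:, :k].T @ q

      q_vec = fold_in("giant rocket")
      sims = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
      print(np.round(sims, 3))                 # cosine similarity per document in latent space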
  6. Agosti, M.; Pretto, L.: ¬A theoretical study of a generalized version of Kleinberg's HITS algorithm (2005) 0.02
    0.021677751 = product of:
      0.06503325 = sum of:
        0.044029936 = weight(_text_:et in 4) [ClassicSimilarity], result of:
          0.044029936 = score(doc=4,freq=2.0), product of:
            0.16986917 = queryWeight, product of:
              4.692005 = idf(docFreq=1101, maxDocs=44218)
              0.03620396 = queryNorm
            0.2591991 = fieldWeight in 4, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.692005 = idf(docFreq=1101, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4)
        0.021003311 = product of:
          0.042006623 = sum of:
            0.042006623 = weight(_text_:al in 4) [ClassicSimilarity], result of:
              0.042006623 = score(doc=4,freq=2.0), product of:
                0.16592026 = queryWeight, product of:
                  4.582931 = idf(docFreq=1228, maxDocs=44218)
                  0.03620396 = queryNorm
                0.25317356 = fieldWeight in 4, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.582931 = idf(docFreq=1228, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    Kleinberg's HITS (Hyperlink-Induced Topic Search) algorithm (Kleinberg 1999), which was originally developed in a Web context, tries to infer the authoritativeness of a Web page in relation to a specific query using the structure of a subgraph of the Web graph, which is obtained considering this specific query. Recent applications of this algorithm in contexts far removed from that of Web searching (Bacchin, Ferro and Melucci 2002, Ng et al. 2001) inspired us to study the algorithm in the abstract, independently of its particular applications, trying to mathematically illuminate its behaviour. In the present paper we detail this theoretical analysis. The original work starts from the definition of a revised and more general version of the algorithm, which includes the classic one as a particular case. We perform an analysis of the structure of two particular matrices, essential to studying the behaviour of the algorithm, and we prove the convergence of the algorithm in the most general case, finding the analytic expression of the vectors to which it converges. Then we study the symmetry of the algorithm and prove the equivalence between the existence of symmetry and the independence from the order of execution of some basic operations on initial vectors. Finally, we expound some interesting consequences of our theoretical results.
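    As a companion to the abstract above, a minimal sketch of the classic (ungeneralized) HITS iteration on a tiny link graph; the graph and the fixed iteration count are illustrative assumptions. The two matrices whose structure the paper analyses correspond to L^T L (authorities) and L L^T (hubs), to whose principal eigenvectors the iteration converges:

      import numpy as np

      # adjacency matrix of a tiny directed link graph: L[i, j] = 1 if page i links to page j
      L = np.array([[0, 1, 1, 0],
                    [0, 0, 1, 0],
                    [1, 0, 0, 0],
                    [0, 0, 1, 0]], dtype=float)

      def hits(L: np.ndarray, iterations: int = 50):
          """Power iteration for hub and authority scores (Kleinberg 1999)."""
          n = L.shape[0]
          hubs = np.ones(n)
          auths = np.ones(n)
          for _ in range(iterations):
              auths = L.T @ hubs               # authoritative pages are pointed to by good hubs
              auths /= np.linalg.norm(auths)
              hubs = L @ auths                 # good hubs point to authoritative pages
              hubs /= np.linalg.norm(hubs)
          return hubs, auths

      hubs, auths = hits(L)
      print("authorities:", np.round(auths, 3))   # page index 2, with the most in-links, ranks highest
      print("hubs:", np.round(hubs, 3))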
  7. Courtois, M.P.; Berry, M.W.: Results ranking in Web search engines (1999) 0.02
    0.020755913 = product of:
      0.12453547 = sum of:
        0.12453547 = weight(_text_:et in 3726) [ClassicSimilarity], result of:
          0.12453547 = score(doc=3726,freq=4.0), product of:
            0.16986917 = queryWeight, product of:
              4.692005 = idf(docFreq=1101, maxDocs=44218)
              0.03620396 = queryNorm
            0.7331258 = fieldWeight in 3726, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.692005 = idf(docFreq=1101, maxDocs=44218)
              0.078125 = fieldNorm(doc=3726)
      0.16666667 = coord(1/6)
    
    Abstract
    Comparison of the ranking methods of five search engines (AltaVista, HotBot, Excite, Infoseek, and Lycos). Tested are the presence of all query terms, term proximity, and term location.
  8. Grossman, D.A.; Frieder, O.: Information retrieval : algorithms and heuristics (1998) 0.02
    0.020138597 = product of:
      0.12083158 = sum of:
        0.12083158 = weight(_text_:o in 2182) [ClassicSimilarity], result of:
          0.12083158 = score(doc=2182,freq=2.0), product of:
            0.1816457 = queryWeight, product of:
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.03620396 = queryNorm
            0.6652047 = fieldWeight in 2182, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.09375 = fieldNorm(doc=2182)
      0.16666667 = coord(1/6)
    
  9. Wartik, S.; Fox, E.; Heath, L.; Chen, Q.-F.: Hashing algorithms (1992) 0.01
    0.01342573 = product of:
      0.08055438 = sum of:
        0.08055438 = weight(_text_:o in 3510) [ClassicSimilarity], result of:
          0.08055438 = score(doc=3510,freq=2.0), product of:
            0.1816457 = queryWeight, product of:
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.03620396 = queryNorm
            0.4434698 = fieldWeight in 3510, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.0625 = fieldNorm(doc=3510)
      0.16666667 = coord(1/6)
    
    Abstract
    Discusses hashing, an information storage and retrieval technique useful for implementing many of the other structures in this book. The concepts underlying hashing are presented, along with 2 implementation strategies. The chapter also contains an extensive discussion of perfect hashing, an important optimization in information retrieval, and an O(n) algorithm to find minimal perfect hash functions for a set of keys
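    A minimal sketch of the chapter's core idea, a hash table with separate chaining used as a term dictionary; the hash function and bucket count are illustrative, and the chapter's perfect-hashing construction is not reproduced here:

      class ChainedHashTable:
          """Hash table with separate chaining, e.g. mapping index terms to posting lists."""

          def __init__(self, buckets: int = 101):
              self.buckets = [[] for _ in range(buckets)]

          def _index(self, key: str) -> int:
              # simple polynomial rolling hash; a perfect hash function would guarantee no collisions
              h = 0
              for ch in key:
                  h = (h * 31 + ord(ch)) % len(self.buckets)
              return h

          def put(self, key: str, value) -> None:
              bucket = self.buckets[self._index(key)]
              for i, (k, _) in enumerate(bucket):
                  if k == key:
                      bucket[i] = (key, value)      # overwrite an existing entry
                      return
              bucket.append((key, value))

          def get(self, key: str):
              for k, v in self.buckets[self._index(key)]:
                  if k == key:
                      return v
              return None

      table = ChainedHashTable()
      table.put("retrieval", [3824, 2509, 105])     # a term mapped to a toy posting list
      print(table.get("retrieval"))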
  10. Vechtomova, O.; Karamuftuoglu, M.: Lexical cohesion and term proximity in document ranking (2008) 0.01
    0.01342573 = product of:
      0.08055438 = sum of:
        0.08055438 = weight(_text_:o in 2101) [ClassicSimilarity], result of:
          0.08055438 = score(doc=2101,freq=2.0), product of:
            0.1816457 = queryWeight, product of:
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.03620396 = queryNorm
            0.4434698 = fieldWeight in 2101, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.0625 = fieldNorm(doc=2101)
      0.16666667 = coord(1/6)
    
  11. Cheng, C.-S.; Chung, C.-P.; Shann, J.J.-J.: Fast query evaluation through document identifier assignment for inverted file-based information retrieval systems (2006) 0.01
    0.011866782 = product of:
      0.07120069 = sum of:
        0.07120069 = weight(_text_:o in 979) [ClassicSimilarity], result of:
          0.07120069 = score(doc=979,freq=4.0), product of:
            0.1816457 = queryWeight, product of:
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.03620396 = queryNorm
            0.39197564 = fieldWeight in 979, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.0390625 = fieldNorm(doc=979)
      0.16666667 = coord(1/6)
    
    Abstract
    Compressing an inverted file can greatly improve query performance of an information retrieval system (IRS) by reducing disk I/Os. We observe that a good document identifier assignment (DIA) can make the document identifiers in the posting lists more clustered, and result in better compression as well as shorter query processing time. In this paper, we tackle the NP-complete problem of finding an optimal DIA to minimize the average query processing time in an IRS when the probability distribution of query terms is given. We indicate that the greedy nearest neighbor (Greedy-NN) algorithm can provide excellent performance for this problem. However, the Greedy-NN algorithm is inappropriate if used in large-scale IRSs, due to its high complexity O(N² × n), where N denotes the number of documents and n denotes the number of distinct terms. In real-world IRSs, the distribution of query terms is skewed. Based on this fact, we propose a fast O(N × n) heuristic, called partition-based document identifier assignment (PBDIA) algorithm, which can efficiently assign consecutive document identifiers to those documents containing frequently used query terms, and improve compression efficiency of the posting lists for those terms. This can result in reduced query processing time. The experimental results show that the PBDIA algorithm can yield a competitive performance versus the Greedy-NN for the DIA problem, and that this optimization problem has significant advantages for both long queries and parallel information retrieval (IR).
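    The abstract's point, that clustering a term's document identifiers yields smaller d-gaps and therefore better posting-list compression, can be illustrated with a small sketch; the toy posting lists and the Elias-gamma-style cost model are assumptions for illustration, not the PBDIA algorithm itself:

      import math

      def gap_encoding_bits(postings: list[int]) -> int:
          """Rough size of a posting list under d-gap + Elias-gamma coding (2*floor(log2 g)+1 bits per gap)."""
          postings = sorted(postings)
          gaps = [postings[0]] + [b - a for a, b in zip(postings, postings[1:])]
          return sum(2 * int(math.log2(g)) + 1 for g in gaps)

      scattered = [3, 480, 9001, 15200, 31007]   # identifiers spread across the collection
      clustered = [3, 4, 5, 7, 9]                # the same postings after a clustering reassignment

      print(gap_encoding_bits(scattered))        # many bits: large gaps are expensive to encode
      print(gap_encoding_bits(clustered))        # few bits: small gaps compress well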
  12. Oberhauser, O.; Labner, J.: Relevance Ranking in Online-Katalogen : Informationsstand und Perspektiven (2003) 0.01
    0.011747515 = product of:
      0.070485085 = sum of:
        0.070485085 = weight(_text_:o in 2188) [ClassicSimilarity], result of:
          0.070485085 = score(doc=2188,freq=2.0), product of:
            0.1816457 = queryWeight, product of:
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.03620396 = queryNorm
            0.38803607 = fieldWeight in 2188, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2188)
      0.16666667 = coord(1/6)
    
  13. Vechtomova, O.; Karamuftuoglu, M.: Elicitation and use of relevance feedback information (2006) 0.01
    0.011747515 = product of:
      0.070485085 = sum of:
        0.070485085 = weight(_text_:o in 966) [ClassicSimilarity], result of:
          0.070485085 = score(doc=966,freq=2.0), product of:
            0.1816457 = queryWeight, product of:
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.03620396 = queryNorm
            0.38803607 = fieldWeight in 966, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.0546875 = fieldNorm(doc=966)
      0.16666667 = coord(1/6)
    
  14. Beitzel, S.M.; Jensen, E.C.; Chowdhury, A.; Grossman, D.; Frieder, O.; Goharian, N.: Fusion of effective retrieval strategies in the same information retrieval system (2004) 0.01
    0.010069299 = product of:
      0.06041579 = sum of:
        0.06041579 = weight(_text_:o in 2502) [ClassicSimilarity], result of:
          0.06041579 = score(doc=2502,freq=2.0), product of:
            0.1816457 = queryWeight, product of:
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.03620396 = queryNorm
            0.33260235 = fieldWeight in 2502, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.046875 = fieldNorm(doc=2502)
      0.16666667 = coord(1/6)
    
  15. Herrera-Viedma, E.; Cordón, O.; Herrera, J.C.; Luque, M.: ¬An IRS based on multi-granular linguistic information (2003) 0.01
    0.010069299 = product of:
      0.06041579 = sum of:
        0.06041579 = weight(_text_:o in 2740) [ClassicSimilarity], result of:
          0.06041579 = score(doc=2740,freq=2.0), product of:
            0.1816457 = queryWeight, product of:
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.03620396 = queryNorm
            0.33260235 = fieldWeight in 2740, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.046875 = fieldNorm(doc=2740)
      0.16666667 = coord(1/6)
    
  16. Oberhauser, O.: Relevance Ranking in den Online-Katalogen der "nächsten Generation" (2010) 0.01
    0.010069299 = product of:
      0.06041579 = sum of:
        0.06041579 = weight(_text_:o in 4308) [ClassicSimilarity], result of:
          0.06041579 = score(doc=4308,freq=2.0), product of:
            0.1816457 = queryWeight, product of:
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.03620396 = queryNorm
            0.33260235 = fieldWeight in 4308, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.046875 = fieldNorm(doc=4308)
      0.16666667 = coord(1/6)
    
  17. Habernal, I.; Konopík, M.; Rohlík, O.: Question answering (2012) 0.01
    0.010069299 = product of:
      0.06041579 = sum of:
        0.06041579 = weight(_text_:o in 101) [ClassicSimilarity], result of:
          0.06041579 = score(doc=101,freq=2.0), product of:
            0.1816457 = queryWeight, product of:
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.03620396 = queryNorm
            0.33260235 = fieldWeight in 101, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.046875 = fieldNorm(doc=101)
      0.16666667 = coord(1/6)
    
  18. Chen, Z.; Fu, B.: On the complexity of Rocchio's similarity-based relevance feedback algorithm (2007) 0.01
    0.008391082 = product of:
      0.05034649 = sum of:
        0.05034649 = weight(_text_:o in 578) [ClassicSimilarity], result of:
          0.05034649 = score(doc=578,freq=2.0), product of:
            0.1816457 = queryWeight, product of:
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.03620396 = queryNorm
            0.27716863 = fieldWeight in 578, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.0390625 = fieldNorm(doc=578)
      0.16666667 = coord(1/6)
    
    Abstract
    Rocchio's similarity-based relevance feedback algorithm, one of the most important query reformulation methods in information retrieval, is essentially an adaptive learning algorithm that learns from examples while searching for documents represented by a linear classifier. Despite its popularity in various applications, there is little rigorous analysis of its learning complexity in the literature. In this article, the authors prove for the first time that the learning complexity of Rocchio's algorithm is O(d + d²(log d + log n)) over the discretized vector space {0, ..., n-1}^d when the inner product similarity measure is used. The upper bound on the learning complexity for searching for documents represented by a monotone linear classifier (q, 0) over {0, ..., n-1}^d can be improved to, at most, 1 + 2k(n-1)(log d + log(n-1)), where k is the number of nonzero components in q. Several lower bounds on the learning complexity are also obtained for Rocchio's algorithm. For example, the authors prove that Rocchio's algorithm has a lower bound of Ω((d choose 2) log n) on its learning complexity over the Boolean vector space {0,1}^d.
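    For context on the algorithm whose complexity is analysed above, a minimal sketch of the classic Rocchio update, which moves the query vector toward relevant and away from non-relevant documents; the alpha/beta/gamma weights and the toy vectors are illustrative assumptions, not values from the article:

      import numpy as np

      def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
          """Classic Rocchio feedback: q' = alpha*q + beta*mean(rel) - gamma*mean(nonrel)."""
          q = alpha * np.asarray(query, dtype=float)
          if relevant:
              q += beta * np.mean(np.asarray(relevant, dtype=float), axis=0)
          if nonrelevant:
              q -= gamma * np.mean(np.asarray(nonrelevant, dtype=float), axis=0)
          return np.maximum(q, 0.0)            # negative term weights are usually clipped to zero

      q0 = [1, 0, 0, 1]                        # initial query over a 4-term vocabulary
      rel = [[1, 1, 0, 1], [1, 1, 0, 0]]       # documents judged relevant
      nonrel = [[0, 0, 1, 0]]                  # a document judged non-relevant
      print(rocchio(q0, rel, nonrel))          # reweighted query used for the next search iteration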
  19. Urbain, J.; Goharian, N.; Frieder, O.: Probabilistic passage models for semantic search of genomics literature (2008) 0.01
    0.008391082 = product of:
      0.05034649 = sum of:
        0.05034649 = weight(_text_:o in 2380) [ClassicSimilarity], result of:
          0.05034649 = score(doc=2380,freq=2.0), product of:
            0.1816457 = queryWeight, product of:
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.03620396 = queryNorm
            0.27716863 = fieldWeight in 2380, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2380)
      0.16666667 = coord(1/6)
    
  20. Calegari, S.; Sanchez, E.: Object-fuzzy concept network : an enrichment of ontologies in semantic information retrieval (2008) 0.01
    0.008391082 = product of:
      0.05034649 = sum of:
        0.05034649 = weight(_text_:o in 2393) [ClassicSimilarity], result of:
          0.05034649 = score(doc=2393,freq=2.0), product of:
            0.1816457 = queryWeight, product of:
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.03620396 = queryNorm
            0.27716863 = fieldWeight in 2393, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.017288 = idf(docFreq=795, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2393)
      0.16666667 = coord(1/6)
    
    Abstract
    This article shows how a fuzzy ontology-based approach can improve semantic document retrieval. After formally defining a fuzzy ontology and a fuzzy knowledge base, a special type of new fuzzy relationship called (semantic) correlation, which links the concepts or entities in a fuzzy ontology, is discussed. These correlations, first assigned by experts, are updated after querying or when a document has been inserted into a database. Moreover, in order to define a dynamic knowledge of a domain adapting itself to the context, it is shown how to handle a tradeoff between the correct definition of an object, taken in the ontology structure, and the actual meaning assigned by individuals. The notion of a fuzzy concept network is extended, incorporating database objects so that entities and documents can similarly be represented in the network. An information retrieval (IR) algorithm using an object-fuzzy concept network (O-FCN) is introduced and described. This algorithm allows us to derive a unique path among the entities involved in the query to obtain maximal semantic associations in the knowledge domain. Finally, the study has been validated by querying a database using fuzzy recall, fuzzy precision, and coefficient of variation measures in the crisp and fuzzy cases.

Languages

  • e 46
  • d 8

Types

  • a 50
  • m 3
  • r 1
  • s 1