Search (157 results, page 1 of 8)

  • Active filter: theme_ss:"Retrievalalgorithmen"
  1. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.11
    0.10588467 = product of:
      0.158827 = sum of:
        0.10315535 = weight(_text_:management in 402) [ClassicSimilarity], result of:
          0.10315535 = score(doc=402,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.5958457 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
        0.05567166 = product of:
          0.11134332 = sum of:
            0.11134332 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.11134332 = score(doc=402,freq=2.0), product of:
                0.17986396 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051362853 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
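     The score breakdowns shown with each result are Lucene "explain" trees for the ClassicSimilarity (TF-IDF) ranking formula: each matching term contributes queryWeight × fieldWeight, the clause scores are summed, and coord() factors scale the sum by the fraction of query clauses that matched. As a minimal sketch (plain Python, constants copied from the explain tree of result no. 1 above; not Lucene itself), the arithmetic can be reproduced like this:

       from math import sqrt

       def term_weight(freq, idf, query_norm, field_norm):
           # weight = queryWeight * fieldWeight
           #        = (idf * queryNorm) * (sqrt(termFreq) * idf * fieldNorm)
           tf = sqrt(freq)
           return (idf * query_norm) * (tf * idf * field_norm)

       query_norm = 0.051362853
       w_management = term_weight(2.0, 3.3706124, query_norm, 0.125)  # ~0.10315535
       w_22 = term_weight(2.0, 3.5018296, query_norm, 0.125)          # ~0.11134332

       # The second clause matched 1 of its 2 sub-queries (coord(1/2) = 0.5),
       # and the whole query matched 2 of 3 top-level clauses (coord(2/3)).
       score = (w_management + 0.5 * w_22) * (2.0 / 3.0)
       print(round(score, 8))                                         # ~0.10588467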
  2. Ravana, S.D.; Rajagopal, P.; Balakrishnan, V.: Ranking retrieval systems using pseudo relevance judgments (2015) 0.08
    0.08083217 = product of:
      0.121248245 = sum of:
        0.032236047 = weight(_text_:management in 2591) [ClassicSimilarity], result of:
          0.032236047 = score(doc=2591,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.18620178 = fieldWeight in 2591, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2591)
        0.0890122 = sum of:
          0.03980494 = weight(_text_:system in 2591) [ClassicSimilarity], result of:
            0.03980494 = score(doc=2591,freq=4.0), product of:
              0.16177002 = queryWeight, product of:
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.051362853 = queryNorm
              0.24605882 = fieldWeight in 2591, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2591)
          0.04920726 = weight(_text_:22 in 2591) [ClassicSimilarity], result of:
            0.04920726 = score(doc=2591,freq=4.0), product of:
              0.17986396 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051362853 = queryNorm
              0.27358043 = fieldWeight in 2591, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2591)
      0.6666667 = coord(2/3)
    
    Abstract
     Purpose: In a system-based approach, replicating the web would require large test collections, and judging the relevance of all documents per topic when creating relevance judgments through human assessors is infeasible. Because of the large number of documents that require judgment, disagreements among human assessors may also introduce errors. The paper aims to discuss these issues.
     Design/methodology/approach: This study explores exponential variation and document ranking methods that generate a reliable set of relevance judgments (pseudo relevance judgments) to reduce human effort. These methods cope with the large number of documents to be judged while avoiding errors caused by human disagreement during the judgment process. The study uses two key factors to generate the alternative methods: the number of occurrences of each document per topic across all system runs, and the document rankings.
     Findings: The effectiveness of the proposed methods is evaluated using the correlation coefficient between systems ranked by mean average precision under the original Text REtrieval Conference (TREC) relevance judgments and under the pseudo relevance judgments. The results suggest that the proposed document ranking method with a pool depth of 100 could be a reliable alternative that reduces the human effort and disagreement errors involved in generating TREC-like relevance judgments.
     Originality/value: The simple methods proposed in this study improve the correlation coefficient when generating alternative relevance judgments without human assessors, and thereby contribute to information retrieval evaluation.
    Date
    20. 1.2015 18:30:22
    18. 9.2018 18:22:56
    Source
    Aslib journal of information management. 67(2015) no.6, S.700-714
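     The evaluation described in the abstract above — ranking systems by mean average precision (MAP) under official versus pseudo relevance judgments and then correlating the two system rankings — can be illustrated with a small Python sketch. The runs, judgment sets and the simple tau-a correlation below are illustrative assumptions, not the authors' data or code:

       def average_precision(ranked_doc_ids, relevant):
           # AP of one ranked list against a set of relevant document ids.
           hits, precision_sum = 0, 0.0
           for rank, doc in enumerate(ranked_doc_ids, start=1):
               if doc in relevant:
                   hits += 1
                   precision_sum += hits / rank
           return precision_sum / max(len(relevant), 1)

       def kendall_tau(a, b):
           # Kendall tau-a between two lists of scores in the same system order.
           n = len(a)
           concordant = discordant = 0
           for i in range(n):
               for j in range(i + 1, n):
                   s = (a[i] - a[j]) * (b[i] - b[j])
                   concordant += s > 0
                   discordant += s < 0
           return (concordant - discordant) / (n * (n - 1) / 2)

       # Toy runs (one topic) and two judgment sets: official vs. pseudo.
       runs = {"sysA": ["d1", "d3", "d2"], "sysB": ["d2", "d1", "d4"], "sysC": ["d4", "d2", "d3"]}
       official = {"d1", "d2"}
       pseudo = {"d1", "d4"}

       map_official = [average_precision(r, official) for r in runs.values()]
       map_pseudo = [average_precision(r, pseudo) for r in runs.values()]
       print(kendall_tau(map_official, map_pseudo))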
  3. Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.08
    0.0794135 = product of:
      0.119120255 = sum of:
        0.077366516 = weight(_text_:management in 58) [ClassicSimilarity], result of:
          0.077366516 = score(doc=58,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.44688427 = fieldWeight in 58, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.09375 = fieldNorm(doc=58)
        0.041753743 = product of:
          0.083507486 = sum of:
            0.083507486 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
              0.083507486 = score(doc=58,freq=2.0), product of:
                0.17986396 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051362853 = queryNorm
                0.46428138 = fieldWeight in 58, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=58)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Date
    14. 6.2015 22:12:44
    Source
    Deutscher Dokumentartag 1985, Nürnberg, 1.-4.10.1985: Fachinformation: Methodik - Management - Markt; neue Entwicklungen, Berufe, Produkte. Bearb.: H. Strohl-Goebel
  4. Efthimiadis, E.N.: User choices : a new yardstick for the evaluation of ranking algorithms for interactive query expansion (1995) 0.06
    0.063451454 = product of:
      0.09517717 = sum of:
        0.032236047 = weight(_text_:management in 5697) [ClassicSimilarity], result of:
          0.032236047 = score(doc=5697,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.18620178 = fieldWeight in 5697, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5697)
        0.06294113 = sum of:
          0.02814634 = weight(_text_:system in 5697) [ClassicSimilarity], result of:
            0.02814634 = score(doc=5697,freq=2.0), product of:
              0.16177002 = queryWeight, product of:
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.051362853 = queryNorm
              0.17398985 = fieldWeight in 5697, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5697)
          0.03479479 = weight(_text_:22 in 5697) [ClassicSimilarity], result of:
            0.03479479 = score(doc=5697,freq=2.0), product of:
              0.17986396 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051362853 = queryNorm
              0.19345059 = fieldWeight in 5697, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5697)
      0.6666667 = coord(2/3)
    
    Abstract
     The performance of 8 ranking algorithms was evaluated with respect to their effectiveness in ranking terms for query expansion. The evaluation was conducted within an investigation of interactive query expansion and relevance feedback in a real operational environment, and focuses on identifying the algorithms that most effectively take cognizance of user preferences. User choices (i.e. the terms selected by the searchers for the query expansion search) provided the yardstick for the evaluation of the 8 ranking algorithms. This methodology introduces a user-oriented approach to evaluating ranking algorithms for query expansion, in contrast to the standard, system-oriented approaches. Similarities in the performance of the 8 algorithms and in the ways these algorithms rank terms were the main focus of this evaluation. The findings demonstrate that the r-lohi, wpq, enim, and porter algorithms perform similarly in bringing good terms to the top of a ranked list of terms for query expansion. However, further evaluation of the algorithms in different (e.g. full-text) environments is needed before these results can be generalized beyond the context of the present study.
    Date
    22. 2.1996 13:14:10
    Source
    Information processing and management. 31(1995) no.4, S.605-620
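     Of the term-ranking algorithms named in the abstract above, wpq is the one most often written out in the query-expansion literature; the form below is the commonly cited Robertson formulation and is given here as an assumption rather than as the paper's exact definition. For each candidate term, r is the number of judged-relevant documents containing it, n its document frequency, R the number of relevant documents and N the collection size:

       from math import log

       def wpq(r, n, R, N):
           # Relevance weight (Robertson/Sparck Jones form with 0.5 smoothing) times the
           # difference between the term's relative frequency in relevant and non-relevant documents.
           rw = log(((r + 0.5) * (N - n - R + r + 0.5)) /
                    ((n - r + 0.5) * (R - r + 0.5)))
           return (r / R - (n - r) / (N - R)) * rw

       # Hypothetical candidate terms: (term, r, n)
       candidates = [("clustering", 8, 120), ("retrieval", 9, 4000), ("fuzzy", 3, 60)]
       R, N = 10, 44218

       for term, r, n in sorted(candidates, key=lambda c: wpq(c[1], c[2], R, N), reverse=True):
           print(f"{term:12s} wpq = {wpq(r, n, R, N):.3f}")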
  5. Smeaton, A.F.; Rijsbergen, C.J. van: ¬The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.06
    0.058745053 = product of:
      0.17623515 = sum of:
        0.17623515 = sum of:
          0.07880975 = weight(_text_:system in 2134) [ClassicSimilarity], result of:
            0.07880975 = score(doc=2134,freq=2.0), product of:
              0.16177002 = queryWeight, product of:
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.051362853 = queryNorm
              0.4871716 = fieldWeight in 2134, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.109375 = fieldNorm(doc=2134)
          0.0974254 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
            0.0974254 = score(doc=2134,freq=2.0), product of:
              0.17986396 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051362853 = queryNorm
              0.5416616 = fieldWeight in 2134, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.109375 = fieldNorm(doc=2134)
      0.33333334 = coord(1/3)
    
    Date
    30. 3.2001 13:32:22
  6. Srinivasan, P.: Query expansion and MEDLINE (1996) 0.05
    0.0493965 = product of:
      0.07409475 = sum of:
        0.051577676 = weight(_text_:management in 8453) [ClassicSimilarity], result of:
          0.051577676 = score(doc=8453,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.29792285 = fieldWeight in 8453, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0625 = fieldNorm(doc=8453)
        0.022517072 = product of:
          0.045034144 = sum of:
            0.045034144 = weight(_text_:system in 8453) [ClassicSimilarity], result of:
              0.045034144 = score(doc=8453,freq=2.0), product of:
                0.16177002 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.051362853 = queryNorm
                0.27838376 = fieldWeight in 8453, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0625 = fieldNorm(doc=8453)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
     Evaluates the retrieval effectiveness of query expansion strategies on a test collection of the medical database MEDLINE using Cornell University's SMART retrieval system. Tests 3 expansion strategies for their ability to identify appropriate MeSH terms for user queries. Compares retrieval effectiveness using the original unexpanded and the alternative expanded user queries on a collection of 75 queries and 2,334 MEDLINE citations. Recommends query expansion using retrieval feedback for adding MeSH search terms to a user's initial query.
    Source
    Information processing and management. 32(1996) no.4, S.431-443
  7. Baloh, P.; Desouza, K.C.; Hackney, R.: Contextualizing organizational interventions of knowledge management systems : a design science perspective (2012) 0.05
    0.048821248 = product of:
      0.07323187 = sum of:
        0.055834472 = weight(_text_:management in 241) [ClassicSimilarity], result of:
          0.055834472 = score(doc=241,freq=6.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.32251096 = fieldWeight in 241, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=241)
        0.017397394 = product of:
          0.03479479 = sum of:
            0.03479479 = weight(_text_:22 in 241) [ClassicSimilarity], result of:
              0.03479479 = score(doc=241,freq=2.0), product of:
                0.17986396 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051362853 = queryNorm
                0.19345059 = fieldWeight in 241, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=241)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    We address how individuals' (workers) knowledge needs influence the design of knowledge management systems (KMS), enabling knowledge creation and utilization. It is evident that KMS technologies and activities are indiscriminately deployed in most organizations with little regard to the actual context of their adoption. Moreover, it is apparent that the extant literature pertaining to knowledge management projects is frequently deficient in identifying the variety of factors indicative for successful KMS. This presents an obvious business practice and research gap that requires a critical analysis of the necessary intervention that will actually improve how workers can leverage and form organization-wide knowledge. This research involved an extensive review of the literature, a grounded theory methodological approach and rigorous data collection and synthesis through an empirical case analysis (Parsons Brinckerhoff and Samsung). The contribution of this study is the formulation of a model for designing KMS based upon the design science paradigm, which aspires to create artifacts that are interdependent of people and organizations. The essential proposition is that KMS design and implementation must be contextualized in relation to knowledge needs and that these will differ for various organizational settings. The findings present valuable insights and further understanding of the way in which KMS design efforts should be focused.
    Date
    11. 6.2012 14:22:34
  8. Lee, C.; Lee, G.G.: Probabilistic information retrieval model for a dependence structured indexing system (2005) 0.05
    0.048662614 = product of:
      0.07299392 = sum of:
        0.045130465 = weight(_text_:management in 1004) [ClassicSimilarity], result of:
          0.045130465 = score(doc=1004,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.2606825 = fieldWeight in 1004, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1004)
        0.027863456 = product of:
          0.055726912 = sum of:
            0.055726912 = weight(_text_:system in 1004) [ClassicSimilarity], result of:
              0.055726912 = score(doc=1004,freq=4.0), product of:
                0.16177002 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.051362853 = queryNorm
                0.34448233 = fieldWeight in 1004, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1004)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
     Most previous information retrieval (IR) models assume that the terms of queries and documents are statistically independent of each other. However, this conditional independence assumption is widely understood to be wrong, so we present a new method of incorporating term dependence into a probabilistic retrieval model by adapting a dependency-structured indexing system, using a dependency parse tree and the Chow Expansion to compensate for the weakness of the assumption. In this paper, we describe a theoretical process for applying the Chow Expansion to general probabilistic models and to the state-of-the-art 2-Poisson model. Through experiments on document collections in English and Korean, we demonstrate that incorporating term dependences using the Chow Expansion improves the performance of probabilistic IR systems.
    Source
    Information processing and management. 41(2005) no.2, S.161-176
  9. Torra, V.; Miyamoto, S.; Lanau, S.: Exploration of textual document archives using a fuzzy hierarchical clustering algorithm in the GAMBAL system (2005) 0.05
    0.048662614 = product of:
      0.07299392 = sum of:
        0.045130465 = weight(_text_:management in 1028) [ClassicSimilarity], result of:
          0.045130465 = score(doc=1028,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.2606825 = fieldWeight in 1028, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1028)
        0.027863456 = product of:
          0.055726912 = sum of:
            0.055726912 = weight(_text_:system in 1028) [ClassicSimilarity], result of:
              0.055726912 = score(doc=1028,freq=4.0), product of:
                0.16177002 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.051362853 = queryNorm
                0.34448233 = fieldWeight in 1028, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1028)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
     The Internet, together with the large amount of textual information available in document archives, has increased the relevance of information retrieval-related tools. In this work we present an extension of the Gambal system for clustering and visualization of documents based on fuzzy clustering techniques. The tool allows the user to structure the set of documents hierarchically (using a fuzzy hierarchical structure) and to represent this structure in a graphical interface (a 3D sphere) over which the user can navigate. Gambal supports the analysis of the documents and the computation of their similarity not only on the basis of the syntactic similarity between words but also using a dictionary (WordNet 1.7) and latent semantic analysis.
    Source
    Information processing and management. 41(2005) no.3, S.587-598
  10. Singh, S.; Dey, L.: ¬A rough-fuzzy document grading system for customized text information retrieval (2005) 0.05
    0.04830591 = product of:
      0.07245886 = sum of:
        0.038683258 = weight(_text_:management in 1007) [ClassicSimilarity], result of:
          0.038683258 = score(doc=1007,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.22344214 = fieldWeight in 1007, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.046875 = fieldNorm(doc=1007)
        0.03377561 = product of:
          0.06755122 = sum of:
            0.06755122 = weight(_text_:system in 1007) [ClassicSimilarity], result of:
              0.06755122 = score(doc=1007,freq=8.0), product of:
                0.16177002 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.051362853 = queryNorm
                0.41757566 = fieldWeight in 1007, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1007)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Due to the large repository of documents available on the web, users are usually inundated by a large volume of information, most of which is found to be irrelevant. Since user perspectives vary, a client-side text filtering system that learns the user's perspective can reduce the problem of irrelevant retrieval. In this paper, we have provided the design of a customized text information filtering system which learns user preferences and modifies the initial query to fetch better documents. It uses a rough-fuzzy reasoning scheme. The rough-set based reasoning takes care of natural language nuances, like synonym handling, very elegantly. The fuzzy decider provides qualitative grading to the documents for the user's perusal. We have provided the detailed design of the various modules and some results related to the performance analysis of the system.
    Source
    Information processing and management. 41(2005) no.2, S.195-216
  11. Desai, M.; Spink, A.: ¬An algorithm to cluster documents based on relevance (2005) 0.05
    0.0452892 = product of:
      0.0679338 = sum of:
        0.038683258 = weight(_text_:management in 1035) [ClassicSimilarity], result of:
          0.038683258 = score(doc=1035,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.22344214 = fieldWeight in 1035, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.046875 = fieldNorm(doc=1035)
        0.029250536 = product of:
          0.058501072 = sum of:
            0.058501072 = weight(_text_:system in 1035) [ClassicSimilarity], result of:
              0.058501072 = score(doc=1035,freq=6.0), product of:
                0.16177002 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.051362853 = queryNorm
                0.36163113 = fieldWeight in 1035, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1035)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Search engines fail to make a clear distinction between items of varying relevance when presenting search results to users. Instead, they rely on the user of the system to estimate which items are relevant, partially relevant, or not relevant. The user of the system is given the task of distinguishing between documents that are relevant to different degrees. This process often hinders the accessibility of relevant or partially relevant documents, particularly when the results set is large and documents of varying relevance are scattered throughout the set. In this paper, we present a clustering scheme that groups documents within relevant, partially relevant, and not relevant regions for a given search. A clustering algorithm accomplishes the task of clustering documents based on relevance. The clusters were evaluated by end-users issuing categorical, interval, and descriptive relevance judgments for the documents returned from a search. The degree of overlap between users and the system for each of the clustered regions was measured to determine the overall effectiveness of the algorithm. This research showed that clustering documents on the Web by regions of relevance is highly necessary and quite feasible.
    Source
    Information processing and management. 41(2005) no.5, S.1035-1050
  12. García Cumbreras, M.A.; Perea-Ortega, J.M.; García Vega, M.; Ureña López, L.A.: Information retrieval with geographical references : relevant documents filtering vs. query expansion (2009) 0.04
    0.043221936 = product of:
      0.0648329 = sum of:
        0.045130465 = weight(_text_:management in 4222) [ClassicSimilarity], result of:
          0.045130465 = score(doc=4222,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.2606825 = fieldWeight in 4222, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4222)
        0.019702438 = product of:
          0.039404877 = sum of:
            0.039404877 = weight(_text_:system in 4222) [ClassicSimilarity], result of:
              0.039404877 = score(doc=4222,freq=2.0), product of:
                0.16177002 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.051362853 = queryNorm
                0.2435858 = fieldWeight in 4222, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=4222)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    This is a thorough analysis of two techniques applied to Geographic Information Retrieval (GIR). Previous studies have researched the application of query expansion to improve the selection process of information retrieval systems. This paper emphasizes the effectiveness of the filtering of relevant documents applied to a GIR system, instead of query expansion. Based on the CLEF (Cross Language Evaluation Forum) framework available, several experiments have been run. Some based on query expansion, some on the filtering of relevant documents. The results show that filtering works better in a GIR environment, because relevant documents are not reordered in the final list.
    Source
    Information processing and management. 45(2009) no.5, S.605-614
  13. Zhao, L.; Wu, L.; Huang, X.: Using query expansion in graph-based approach for query-focused multi-document summarization (2009) 0.04
    0.041710816 = product of:
      0.06256622 = sum of:
        0.038683258 = weight(_text_:management in 2449) [ClassicSimilarity], result of:
          0.038683258 = score(doc=2449,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.22344214 = fieldWeight in 2449, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.046875 = fieldNorm(doc=2449)
        0.02388296 = product of:
          0.04776592 = sum of:
            0.04776592 = weight(_text_:system in 2449) [ClassicSimilarity], result of:
              0.04776592 = score(doc=2449,freq=4.0), product of:
                0.16177002 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.051362853 = queryNorm
                0.29527056 = fieldWeight in 2449, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2449)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
     This paper presents a novel query expansion method, which is combined with a graph-based algorithm for query-focused multi-document summarization in order to resolve the problem of limited information in the original query. Our approach makes use of both the sentence-to-sentence relations and the sentence-to-word relations to select query-biased informative words from the document set and use them as query expansions to improve the sentence ranking result. Compared to previous query expansion approaches, our approach can capture more relevant information with less noise. We performed experiments on the data of the Document Understanding Conference (DUC) 2005 and DUC 2006, and the evaluation results show that the proposed query expansion method can significantly improve system performance and make our system comparable to the state-of-the-art systems.
    Source
    Information processing and management. 45(2009) no.1, S.35-41
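     The paper's own model is not reproduced here; as a generic stand-in for the kind of graph-based, query-biased sentence ranking the abstract describes, the sketch below scores sentences by power iteration over a cosine-similarity graph with a restart distribution biased toward the (expanded) query. The tokenisation, weights and damping factor are illustrative assumptions:

       from math import sqrt

       def bow(text):
           # Bag-of-words term frequencies.
           words = text.lower().split()
           return {w: words.count(w) for w in set(words)}

       def cosine(a, b):
           num = sum(a[t] * b[t] for t in set(a) & set(b))
           den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
           return num / den if den else 0.0

       def rank_sentences(sentences, query_terms, damping=0.85, iters=50):
           bows = [bow(s) for s in sentences]
           qbow = bow(" ".join(query_terms))
           n = len(sentences)
           sim = [[cosine(bows[i], bows[j]) if i != j else 0.0 for j in range(n)] for i in range(n)]
           row_sums = [sum(row) or 1.0 for row in sim]
           bias = [cosine(b, qbow) for b in bows]
           bias_total = sum(bias) or 1.0
           bias = [b / bias_total for b in bias]
           scores = [1.0 / n] * n
           for _ in range(iters):
               scores = [(1 - damping) * bias[i]
                         + damping * sum(sim[j][i] / row_sums[j] * scores[j] for j in range(n))
                         for i in range(n)]
           return sorted(zip(scores, sentences), reverse=True)

       sentences = [
           "Query expansion adds related informative words to the original query.",
           "Graph based ranking scores sentences by their mutual similarity.",
           "The weather was pleasant during the conference.",
       ]
       for score, s in rank_sentences(sentences, ["query", "expansion", "words"]):
           print(f"{score:.3f}  {s}")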
  14. Witschel, H.F.: Global term weights in distributed environments (2008) 0.04
    0.03970675 = product of:
      0.059560128 = sum of:
        0.038683258 = weight(_text_:management in 2096) [ClassicSimilarity], result of:
          0.038683258 = score(doc=2096,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.22344214 = fieldWeight in 2096, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.046875 = fieldNorm(doc=2096)
        0.020876871 = product of:
          0.041753743 = sum of:
            0.041753743 = weight(_text_:22 in 2096) [ClassicSimilarity], result of:
              0.041753743 = score(doc=2096,freq=2.0), product of:
                0.17986396 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051362853 = queryNorm
                0.23214069 = fieldWeight in 2096, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2096)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Date
    1. 8.2008 9:44:22
    Source
    Information processing and management. 44(2008) no.3, S.1049-1061
  15. Sakai, T.: On the reliability of information retrieval metrics based on graded relevance (2007) 0.04
    0.03704738 = product of:
      0.055571064 = sum of:
        0.038683258 = weight(_text_:management in 910) [ClassicSimilarity], result of:
          0.038683258 = score(doc=910,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.22344214 = fieldWeight in 910, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.046875 = fieldNorm(doc=910)
        0.016887804 = product of:
          0.03377561 = sum of:
            0.03377561 = weight(_text_:system in 910) [ClassicSimilarity], result of:
              0.03377561 = score(doc=910,freq=2.0), product of:
                0.16177002 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.051362853 = queryNorm
                0.20878783 = fieldWeight in 910, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.046875 = fieldNorm(doc=910)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    This paper compares 14 information retrieval metrics based on graded relevance, together with 10 traditional metrics based on binary relevance, in terms of stability, sensitivity and resemblance of system rankings. More specifically, we compare these metrics using the Buckley/Voorhees stability method, the Voorhees/Buckley swap method and Kendall's rank correlation, with three data sets comprising test collections and submitted runs from NTCIR. Our experiments show that (Average) Normalised Discounted Cumulative Gain at document cut-off l are the best among the rank-based graded-relevance metrics, provided that l is large. On the other hand, if one requires a recall-based graded-relevance metric that is highly correlated with Average Precision, then Q-measure is the best choice. Moreover, these best graded-relevance metrics are at least as stable and sensitive as Average Precision, and are fairly robust to the choice of gain values.
    Source
    Information processing and management. 43(2007) no.2, S.531-548
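     Of the graded-relevance metrics compared in the abstract above, (normalised) discounted cumulative gain at a document cut-off is simple to state concretely. A minimal sketch, using the widely used log2(rank + 1) discount and made-up gain values rather than the paper's NTCIR data:

       from math import log2

       def dcg(gains, cutoff):
           # Discounted cumulative gain at the given cut-off.
           return sum(g / log2(rank + 1) for rank, g in enumerate(gains[:cutoff], start=1))

       def ndcg(gains, cutoff):
           ideal = dcg(sorted(gains, reverse=True), cutoff)
           return dcg(gains, cutoff) / ideal if ideal else 0.0

       # Hypothetical graded judgments down a ranked list (3 = highly, 1 = partially, 0 = not relevant).
       run_gains = [3, 0, 1, 2, 0, 0, 1]
       print(ndcg(run_gains, cutoff=5))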
  16. Quiroga, L.M.; Mostafa, J.: ¬An experiment in building profiles in information filtering : the role of context of user relevance feedback (2002) 0.03
    0.03475901 = product of:
      0.052138515 = sum of:
        0.032236047 = weight(_text_:management in 2579) [ClassicSimilarity], result of:
          0.032236047 = score(doc=2579,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.18620178 = fieldWeight in 2579, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2579)
        0.01990247 = product of:
          0.03980494 = sum of:
            0.03980494 = weight(_text_:system in 2579) [ClassicSimilarity], result of:
              0.03980494 = score(doc=2579,freq=4.0), product of:
                0.16177002 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.051362853 = queryNorm
                0.24605882 = fieldWeight in 2579, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2579)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
     An experiment was conducted to see how relevance feedback could be used to build and adjust profiles to improve the performance of filtering systems. Data was collected during the system interaction of 18 graduate students with SIFTER (Smart Information Filtering Technology for Electronic Resources), a filtering system that ranks incoming information based on users' profiles. The data set came from a collection of 6000 records concerning consumer health. In the first phase of the study, three different modes of profile acquisition were compared. The explicit mode allowed users to directly specify the profile; the implicit mode utilized relevance feedback to create and refine the profile; and the combined mode allowed users to initialize the profile and to continuously refine it using relevance feedback. Filtering performance, measured in terms of Normalized Precision, showed that the three approaches were significantly different (α = 0.05 and p = 0.012). The explicit mode of profile acquisition consistently produced superior results. Exclusive reliance on relevance feedback in the implicit mode resulted in inferior performance. The low performance obtained by the implicit acquisition mode motivated the second phase of the study, which aimed to clarify the role of context in relevance feedback judgments. An inductive content analysis of thinking aloud protocols showed dimensions that were highly situational, establishing the importance context plays in feedback relevance assessments. Results suggest the need for better representation of documents, profiles, and relevance feedback mechanisms that incorporate dimensions identified in this research.
    Source
    Information processing and management. 38(2002) no.5, S.671-694
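     The profile mechanics of SIFTER are not spelled out in the entry above; as a generic stand-in, relevance feedback is often folded into a term-weight profile with a Rocchio-style update, sketched below. The vector representation and the alpha/beta/gamma weights are assumptions for illustration, not the system's actual parameters:

       def rocchio_update(profile, relevant_docs, nonrelevant_docs, alpha=1.0, beta=0.75, gamma=0.15):
           # Move the profile toward judged-relevant documents and away from non-relevant ones.
           terms = set(profile) | {t for d in relevant_docs + nonrelevant_docs for t in d}
           updated = {}
           for t in terms:
               rel = sum(d.get(t, 0.0) for d in relevant_docs) / max(len(relevant_docs), 1)
               nonrel = sum(d.get(t, 0.0) for d in nonrelevant_docs) / max(len(nonrelevant_docs), 1)
               weight = alpha * profile.get(t, 0.0) + beta * rel - gamma * nonrel
               if weight > 0:
                   updated[t] = weight
           return updated

       profile = {"health": 1.0, "nutrition": 0.8}
       liked = [{"health": 0.6, "diabetes": 0.9}]
       disliked = [{"celebrity": 0.7, "health": 0.1}]
       print(rocchio_update(profile, liked, disliked))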
  17. Salton, G.: ¬A simple blueprint for automatic Boolean query processing (1988) 0.03
    0.03438512 = product of:
      0.10315535 = sum of:
        0.10315535 = weight(_text_:management in 6774) [ClassicSimilarity], result of:
          0.10315535 = score(doc=6774,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.5958457 = fieldWeight in 6774, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.125 = fieldNorm(doc=6774)
      0.33333334 = coord(1/3)
    
    Source
    Information processing and management. 24(1988) no.3, S.269-280
  18. Reddaway, S.: High speed text retrieval from large databases on a massively parallel processor (1991) 0.03
    0.03438512 = product of:
      0.10315535 = sum of:
        0.10315535 = weight(_text_:management in 7745) [ClassicSimilarity], result of:
          0.10315535 = score(doc=7745,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.5958457 = fieldWeight in 7745, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.125 = fieldNorm(doc=7745)
      0.33333334 = coord(1/3)
    
    Source
    Information processing and management. 27(1991), S.311-316
  19. Klas, C.-P.; Fuhr, N.; Schaefer, A.: Evaluating strategic support for information access in the DAFFODIL system (2004) 0.03
    0.033418275 = product of:
      0.10025482 = sum of:
        0.10025482 = sum of:
          0.058501072 = weight(_text_:system in 2419) [ClassicSimilarity], result of:
            0.058501072 = score(doc=2419,freq=6.0), product of:
              0.16177002 = queryWeight, product of:
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.051362853 = queryNorm
              0.36163113 = fieldWeight in 2419, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.1495528 = idf(docFreq=5152, maxDocs=44218)
                0.046875 = fieldNorm(doc=2419)
          0.041753743 = weight(_text_:22 in 2419) [ClassicSimilarity], result of:
            0.041753743 = score(doc=2419,freq=2.0), product of:
              0.17986396 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.051362853 = queryNorm
              0.23214069 = fieldWeight in 2419, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2419)
      0.33333334 = coord(1/3)
    
    Abstract
     The digital library system Daffodil is targeted at strategic support of users during the information search process. For searching, exploring and managing digital library objects it provides user-customisable information-seeking patterns over a federation of heterogeneous digital libraries. In this paper, evaluation results with respect to retrieval effectiveness, efficiency and user satisfaction are presented. The analysis focuses on strategic support for the scientific work-flow. Daffodil supports the whole work-flow, from data source selection through information seeking to the representation, organisation and reuse of information. By embedding high-level search functionality into the scientific work-flow, the user experiences better strategic system support due to a more systematic work process. These ideas have been implemented in Daffodil and followed by a qualitative evaluation. The evaluation was conducted with 28 participants, ranging from information-seeking novices to experts. The results are promising, as they support the chosen model.
    Date
    16.11.2008 16:22:48
  20. Cheng, C.-S.; Chung, C.-P.; Shann, J.J.-J.: Fast query evaluation through document identifier assignment for inverted file-based information retrieval systems (2006) 0.03
    0.030872812 = product of:
      0.046309218 = sum of:
        0.032236047 = weight(_text_:management in 979) [ClassicSimilarity], result of:
          0.032236047 = score(doc=979,freq=2.0), product of:
            0.17312427 = queryWeight, product of:
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.051362853 = queryNorm
            0.18620178 = fieldWeight in 979, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.3706124 = idf(docFreq=4130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=979)
        0.01407317 = product of:
          0.02814634 = sum of:
            0.02814634 = weight(_text_:system in 979) [ClassicSimilarity], result of:
              0.02814634 = score(doc=979,freq=2.0), product of:
                0.16177002 = queryWeight, product of:
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.051362853 = queryNorm
                0.17398985 = fieldWeight in 979, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.1495528 = idf(docFreq=5152, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=979)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
     Compressing an inverted file can greatly improve query performance of an information retrieval system (IRS) by reducing disk I/Os. We observe that a good document identifier assignment (DIA) can make the document identifiers in the posting lists more clustered, and result in better compression as well as shorter query processing time. In this paper, we tackle the NP-complete problem of finding an optimal DIA to minimize the average query processing time in an IRS when the probability distribution of query terms is given. We indicate that the greedy nearest neighbor (Greedy-NN) algorithm can provide excellent performance for this problem. However, the Greedy-NN algorithm is inappropriate if used in large-scale IRSs, due to its high complexity O(N² × n), where N denotes the number of documents and n denotes the number of distinct terms. In real-world IRSs, the distribution of query terms is skewed. Based on this fact, we propose a fast O(N × n) heuristic, called the partition-based document identifier assignment (PBDIA) algorithm, which can efficiently assign consecutive document identifiers to those documents containing frequently used query terms, and improve compression efficiency of the posting lists for those terms. This can result in reduced query processing time. The experimental results show that the PBDIA algorithm can yield a competitive performance versus the Greedy-NN for the DIA problem, and that this optimization problem has significant advantages for both long queries and parallel information retrieval (IR).
    Source
    Information processing and management. 42(2006) no.3, S.729-750
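     The effect the abstract above relies on — clustered document identifiers producing smaller d-gaps and therefore shorter compressed posting lists — can be illustrated with a small sketch. The posting lists and the variable-byte coder below are illustrative; the PBDIA algorithm itself is not reproduced:

       def vbyte_encode(numbers):
           # Standard variable-byte encoding: 7 payload bits per byte, high bit set on the last byte.
           out = bytearray()
           for n in numbers:
               chunk = []
               while True:
                   chunk.insert(0, n & 0x7F)
                   n >>= 7
                   if n == 0:
                       break
               chunk[-1] |= 0x80
               out.extend(chunk)
           return bytes(out)

       def d_gaps(sorted_doc_ids):
           return [b - a for a, b in zip([0] + sorted_doc_ids[:-1], sorted_doc_ids)]

       # The same five postings under a scattered and a clustered identifier assignment.
       scattered = [7, 4093, 9001, 152833, 600004]
       clustered = [7, 9, 12, 15, 21]
       for ids in (scattered, clustered):
           print(len(vbyte_encode(d_gaps(ids))), "bytes")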

Languages

  • e 145
  • d 9
  • chi 1
  • m 1

Types

  • a 147
  • m 6
  • el 2
  • p 2
  • s 2
  • r 1