Search (40 results, page 1 of 2)

  • × theme_ss:"Retrievalalgorithmen"
  1. Van der Veer Martens, B.; Fleet, C. van: Opening the black box of "relevance work" : a domain analysis (2012) 0.04
    0.040099498 = product of:
      0.080198996 = sum of:
        0.04883048 = weight(_text_:social in 247) [ClassicSimilarity], result of:
          0.04883048 = score(doc=247,freq=2.0), product of:
            0.1847249 = queryWeight, product of:
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.046325076 = queryNorm
            0.26434162 = fieldWeight in 247, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.046875 = fieldNorm(doc=247)
        0.031368516 = product of:
          0.06273703 = sum of:
            0.06273703 = weight(_text_:aspects in 247) [ClassicSimilarity], result of:
              0.06273703 = score(doc=247,freq=2.0), product of:
                0.20938325 = queryWeight, product of:
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046325076 = queryNorm
                0.29962775 = fieldWeight in 247, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046875 = fieldNorm(doc=247)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    In response to Hjørland's recent call for a reconceptualization of the foundations of relevance, we suggest that the sociocognitive aspects of intermediation by information agencies, such as archives and libraries, are a necessary and unexplored part of the infrastructure of the subject knowledge domains central to his recommended "view of relevance informed by a social paradigm" (2010, p. 217). From a comparative analysis of documents from 39 graduate-level introductory courses in archives, reference, and strategic/competitive intelligence taught in 13 American Library Association-accredited library and information science (LIS) programs, we identify four defining sociocognitive dimensions of "relevance work" in information agencies within Hjørland's proposed framework for relevance: tasks, time, systems, and assessors. This study is intended to supply sociocognitive content from within the relevance work domain to support further domain analytic research, and to emphasize the importance of intermediary relevance work for all subject knowledge domains.
  2. Kelledy, F.; Smeaton, A.F.: Signature files and beyond (1996) 0.03
    0.031595502 = product of:
      0.12638201 = sum of:
        0.12638201 = sum of:
          0.08872356 = weight(_text_:aspects in 6973) [ClassicSimilarity], result of:
            0.08872356 = score(doc=6973,freq=4.0), product of:
              0.20938325 = queryWeight, product of:
                4.5198684 = idf(docFreq=1308, maxDocs=44218)
                0.046325076 = queryNorm
              0.42373765 = fieldWeight in 6973, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5198684 = idf(docFreq=1308, maxDocs=44218)
                0.046875 = fieldNorm(doc=6973)
          0.03765845 = weight(_text_:22 in 6973) [ClassicSimilarity], result of:
            0.03765845 = score(doc=6973,freq=2.0), product of:
              0.16222252 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046325076 = queryNorm
              0.23214069 = fieldWeight in 6973, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=6973)
      0.25 = coord(1/4)
    
    Abstract
    Proposes that signature files be used as a viable alternative to other indexing strategies such as inverted files for searching through large volumes of text. Demonstrates through simulation that search times can be further reduced by enhancing the basic signature file concept using deterministic partitioning algorithms which eliminate the need for an exhaustive search of the entire signature file. Reports research to evaluate the performance of some deterministic partitioning algorithms in a non-simulated environment using 276 MB of raw newspaper text (taken from the Wall Street Journal) and real user queries. Presents a selection of results to illustrate trends and highlight important aspects of the performance of these methods under realistic rather than simulated operating conditions. As a result of the research reported here, certain aspects of this approach to signature files are shown to be wanting and to require improvement. Suggests lines of future research on the partitioning of signature files.
    Source
    Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
  3. Crestani, F.; Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue (2003) 0.03
    0.031595502 = product of:
      0.12638201 = sum of:
        0.12638201 = sum of:
          0.08872356 = weight(_text_:aspects in 1451) [ClassicSimilarity], result of:
            0.08872356 = score(doc=1451,freq=4.0), product of:
              0.20938325 = queryWeight, product of:
                4.5198684 = idf(docFreq=1308, maxDocs=44218)
                0.046325076 = queryNorm
              0.42373765 = fieldWeight in 1451, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5198684 = idf(docFreq=1308, maxDocs=44218)
                0.046875 = fieldNorm(doc=1451)
          0.03765845 = weight(_text_:22 in 1451) [ClassicSimilarity], result of:
            0.03765845 = score(doc=1451,freq=2.0), product of:
              0.16222252 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046325076 = queryNorm
              0.23214069 = fieldWeight in 1451, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=1451)
      0.25 = coord(1/4)
    
    Abstract
    Research on the use of mathematical, logical, and formal methods has been central to Information Retrieval research for a long time. Research in this area is important not only because it helps enhance retrieval effectiveness, but also because it helps clarify the underlying concepts of Information Retrieval. In this article we outline some of the major aspects of the subject, and summarize the papers of this special issue with respect to how they relate to these aspects. We conclude by highlighting some directions of future research, which are needed to better understand the formal characteristics of Information Retrieval.
    Date
    22. 3.2003 19:27:36
  4. Spink, A.; Losee, R.M.: Feedback in information retrieval (1996) 0.02
    0.016276827 = product of:
      0.06510731 = sum of:
        0.06510731 = weight(_text_:social in 7441) [ClassicSimilarity], result of:
          0.06510731 = score(doc=7441,freq=2.0), product of:
            0.1847249 = queryWeight, product of:
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.046325076 = queryNorm
            0.3524555 = fieldWeight in 7441, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.0625 = fieldNorm(doc=7441)
      0.25 = coord(1/4)
    
    Abstract
    State-of-the-art review of the mechanisms of feedback in information retrieval (IR) in terms of feedback concepts and models in cybernetics and social sciences. Critically evaluates feedback research based on the traditional IR models, comparing the different approaches to automatic relevance feedback techniques, and feedback research within the framework of interactive IR models. Calls for an extension of the concept of feedback beyond relevance feedback to interactive feedback. Cites specific examples of feedback models used within IR research and presents 6 challenges to future research.
  5. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.01
    0.012552816 = product of:
      0.050211266 = sum of:
        0.050211266 = product of:
          0.10042253 = sum of:
            0.10042253 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
              0.10042253 = score(doc=402,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.61904186 = fieldWeight in 402, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=402)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  6. Liu, X.; Turtle, H.: Real-time user interest modeling for real-time ranking (2013) 0.01
    0.01220762 = product of:
      0.04883048 = sum of:
        0.04883048 = weight(_text_:social in 1035) [ClassicSimilarity], result of:
          0.04883048 = score(doc=1035,freq=2.0), product of:
            0.1847249 = queryWeight, product of:
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.046325076 = queryNorm
            0.26434162 = fieldWeight in 1035, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.046875 = fieldNorm(doc=1035)
      0.25 = coord(1/4)
    
    Abstract
    User interest as a very dynamic information need is often ignored in most existing information retrieval systems. In this research, we present the results of experiments designed to evaluate the performance of a real-time interest model (RIM) that attempts to identify the dynamic and changing query level interests regarding social media outputs. Unlike most existing ranking methods, our ranking approach targets calculation of the probability that user interest in the content of the document is subject to very dynamic user interest change. We describe 2 formulations of the model (real-time interest vector space and real-time interest language model) stemming from classical relevance ranking methods and develop a novel methodology for evaluating the performance of RIM using Amazon Mechanical Turk to collect (interest-based) relevance judgments on a daily basis. Our results show that the model usually, although not always, performs better than baseline results obtained from commercial web search engines. We identify factors that affect RIM performance and outline plans for future research.
  7. Efron, M.; Winget, M.: Query polyrepresentation for ranking retrieval systems without relevance judgments (2010) 0.01
    0.011090445 = product of:
      0.04436178 = sum of:
        0.04436178 = product of:
          0.08872356 = sum of:
            0.08872356 = weight(_text_:aspects in 3469) [ClassicSimilarity], result of:
              0.08872356 = score(doc=3469,freq=4.0), product of:
                0.20938325 = queryWeight, product of:
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046325076 = queryNorm
                0.42373765 = fieldWeight in 3469, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3469)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Ranking information retrieval (IR) systems with respect to their effectiveness is a crucial operation during IR evaluation, as well as during data fusion. This article offers a novel method of approaching the system-ranking problem, based on the widely studied idea of polyrepresentation. The principle of polyrepresentation suggests that a single information need can be represented by many query articulations, which we call query aspects. By skimming the top k (where k is small) documents retrieved by a single system for multiple query aspects, we collect a set of documents that are likely to be relevant to a given test topic. Labeling these skimmed documents as putatively relevant lets us build pseudorelevance judgments without undue human intervention. We report experiments where using these pseudorelevance judgments delivers a rank ordering of IR systems that correlates highly with rankings based on human relevance judgments.
  8. Smeaton, A.F.; Rijsbergen, C.J. van: The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.01
    0.010983714 = product of:
      0.043934856 = sum of:
        0.043934856 = product of:
          0.08786971 = sum of:
            0.08786971 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
              0.08786971 = score(doc=2134,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.5416616 = fieldWeight in 2134, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=2134)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    30. 3.2001 13:32:22
  9. Back, J.: An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.01
    0.010983714 = product of:
      0.043934856 = sum of:
        0.043934856 = product of:
          0.08786971 = sum of:
            0.08786971 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
              0.08786971 = score(doc=3445,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.5416616 = fieldWeight in 3445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3445)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    25. 8.2005 17:42:22
  10. Nunes, S.; Ribeiro, C.; David, G.: Term weighting based on document revision history (2011) 0.01
    0.010173016 = product of:
      0.040692065 = sum of:
        0.040692065 = weight(_text_:social in 4946) [ClassicSimilarity], result of:
          0.040692065 = score(doc=4946,freq=2.0), product of:
            0.1847249 = queryWeight, product of:
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.046325076 = queryNorm
            0.22028469 = fieldWeight in 4946, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4946)
      0.25 = coord(1/4)
    
    Abstract
    In real-world information retrieval systems, the underlying document collection is rarely stable or definitive. This work is focused on the study of signals extracted from the content of documents at different points in time for the purpose of weighting individual terms in a document. The basic idea behind our proposals is that terms that have existed for a longer time in a document should have a greater weight. We propose 4 term weighting functions that use each document's history to estimate a current term score. To evaluate this thesis, we conduct 3 independent experiments using a collection of documents sampled from Wikipedia. In the first experiment, we use data from Wikipedia to judge each set of terms. In a second experiment, we use an external collection of tags from a popular social bookmarking service as a gold standard. In the third experiment, we crowdsource user judgments to collect feedback on term preference. Across all experiments results consistently support our thesis. We show that temporally aware measures, specifically the proposed revision term frequency and revision term frequency span, outperform a term-weighting measure based on raw term frequency alone.
  11. Hoenkamp, E.; Bruza, P.: How everyday language can and will boost effective information retrieval (2015) 0.01
    0.010173016 = product of:
      0.040692065 = sum of:
        0.040692065 = weight(_text_:social in 2123) [ClassicSimilarity], result of:
          0.040692065 = score(doc=2123,freq=2.0), product of:
            0.1847249 = queryWeight, product of:
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.046325076 = queryNorm
            0.22028469 = fieldWeight in 2123, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9875789 = idf(docFreq=2228, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2123)
      0.25 = coord(1/4)
    
    Abstract
    Typing 2 or 3 keywords into a browser has become an easy and efficient way to find information. Yet, typing even short queries becomes tedious on ever shrinking (virtual) keyboards. Meanwhile, speech processing is maturing rapidly, facilitating everyday language input. Also, wearable technology can inform users proactively by listening in on their conversations or processing their social media interactions. Given these developments, everyday language may soon become the new input of choice. We present an information retrieval (IR) algorithm specifically designed to accept everyday language. It integrates two paradigms of information retrieval, previously studied in isolation; one directed mainly at the surface structure of language, the other primarily at the underlying meaning. The integration was achieved by a Markov machine that encodes meaning by its transition graph, and surface structure by the language it generates. A rigorous evaluation of the approach showed, first, that it can compete with the quality of existing language models, second, that it is more effective the more verbose the input, and third, as a consequence, that it is promising for an imminent transition from keyword input, where the onus is on the user to formulate concise queries, to a modality where users can express their need for information more freely, more informally, and more naturally in everyday language.
  12. Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.01
    0.009414612 = product of:
      0.03765845 = sum of:
        0.03765845 = product of:
          0.0753169 = sum of:
            0.0753169 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
              0.0753169 = score(doc=58,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.46428138 = fieldWeight in 58, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=58)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    14. 6.2015 22:12:44
  13. Fuhr, N.: Rankingexperimente mit gewichteter Indexierung (1986) 0.01
    0.009414612 = product of:
      0.03765845 = sum of:
        0.03765845 = product of:
          0.0753169 = sum of:
            0.0753169 = weight(_text_:22 in 2051) [ClassicSimilarity], result of:
              0.0753169 = score(doc=2051,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.46428138 = fieldWeight in 2051, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=2051)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    14. 6.2015 22:12:56
  14. Sormunen, E.; Kekäläinen, J.; Koivisto, J.; Järvelin, K.: Document text characteristics affect the ranking of the most relevant documents by expanded structured queries (2001) 0.01
    0.009242038 = product of:
      0.036968153 = sum of:
        0.036968153 = product of:
          0.073936306 = sum of:
            0.073936306 = weight(_text_:aspects in 4487) [ClassicSimilarity], result of:
              0.073936306 = score(doc=4487,freq=4.0), product of:
                0.20938325 = queryWeight, product of:
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046325076 = queryNorm
                0.35311472 = fieldWeight in 4487, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4487)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    The increasing flood of documentary information through the Internet and other information sources challenges the developers of information retrieval systems. It is not enough that an IR system is able to make a distinction between relevant and non-relevant documents. The reduction of information overload requires that IR systems provide the capability of screening the most valuable documents out of the mass of potentially or marginally relevant documents. This paper introduces a new concept-based method to analyse the text characteristics of documents at varying relevance levels. The results of the document analysis were applied in an experiment on query expansion (QE) in a probabilistic IR system. Statistical differences in textual characteristics of highly relevant and less relevant documents were investigated by applying a facet analysis technique. In highly relevant documents a larger number of aspects of the request were discussed, searchable expressions for the aspects were distributed over a larger set of text paragraphs, and a larger set of unique expressions were used per aspect than in marginally relevant documents. A query expansion experiment verified that the findings of the text analysis can be exploited in formulating more effective queries for best match retrieval in the search for highly relevant documents. The results revealed that expanded queries with concept-based structures performed better than unexpanded queries or "natural language" queries. Further, it was shown that highly relevant documents benefit essentially more from the concept-based QE in ranking than marginally relevant documents.
  15. Ozdemiray, A.M.; Altingovde, I.S.: Explicit search result diversification using score and rank aggregation methods (2015) 0.01
    0.009242038 = product of:
      0.036968153 = sum of:
        0.036968153 = product of:
          0.073936306 = sum of:
            0.073936306 = weight(_text_:aspects in 1856) [ClassicSimilarity], result of:
              0.073936306 = score(doc=1856,freq=4.0), product of:
                0.20938325 = queryWeight, product of:
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046325076 = queryNorm
                0.35311472 = fieldWeight in 1856, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1856)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Search result diversification is one of the key techniques to cope with the ambiguous and underspecified information needs of web users. In the last few years, strategies that are based on the explicit knowledge of query aspects emerged as highly effective ways of diversifying search results. Our contributions in this article are two-fold. First, we extensively evaluate the performance of a state-of-the-art explicit diversification strategy and pin-point its potential weaknesses. We propose basic yet novel optimizations to remedy these weaknesses and boost the performance of this algorithm. As a second contribution, inspired by the success of the current diversification strategies that exploit the relevance of the candidate documents to individual query aspects, we cast the diversification problem into the problem of ranking aggregation. To this end, we propose to materialize the re-rankings of the candidate documents for each query aspect and then merge these rankings by adapting the score(-based) and rank(-based) aggregation methods. Our extensive experimental evaluations show that certain ranking aggregation methods are superior to existing explicit diversification strategies in terms of diversification effectiveness. Furthermore, these ranking aggregation methods have lower computational complexity than the state-of-the-art diversification strategies.
  16. Liu, J.; Liu, C.: Personalization in text information retrieval : a survey (2020) 0.01
    0.007842129 = product of:
      0.031368516 = sum of:
        0.031368516 = product of:
          0.06273703 = sum of:
            0.06273703 = weight(_text_:aspects in 5761) [ClassicSimilarity], result of:
              0.06273703 = score(doc=5761,freq=2.0), product of:
                0.20938325 = queryWeight, product of:
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046325076 = queryNorm
                0.29962775 = fieldWeight in 5761, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5761)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Personalization of information retrieval (PIR) is aimed at tailoring a search toward individual users and user groups by taking account of additional information about users besides their queries. In the past two decades or so, PIR has received extensive attention in both academia and industry. This article surveys the literature of personalization in text retrieval, following a framework for aspects or factors that can be used for personalization. The framework consists of additional information about users that can be explicitly obtained by asking users for their preferences, or implicitly inferred from users' search behaviors. Users' characteristics and contextual factors such as tasks, time, location, etc., can be helpful for personalization. This article also addresses various issues including when to personalize, the evaluation of PIR, privacy, usability, etc. Based on the extensive review, challenges are discussed and directions for future effort are suggested.
  17. Wei, F.; Li, W.; Lu, Q.; He, Y.: Applying two-level reinforcement ranking in query-oriented multidocument summarization (2009) 0.01
    0.0065351077 = product of:
      0.026140431 = sum of:
        0.026140431 = product of:
          0.052280862 = sum of:
            0.052280862 = weight(_text_:aspects in 3120) [ClassicSimilarity], result of:
              0.052280862 = score(doc=3120,freq=2.0), product of:
                0.20938325 = queryWeight, product of:
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046325076 = queryNorm
                0.2496898 = fieldWeight in 3120, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3120)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Sentence ranking is the issue of most concern in document summarization today. While traditional feature-based approaches evaluate sentence significance and rank the sentences relying on the features that are particularly designed to characterize the different aspects of the individual sentences, the newly emerging graph-based ranking algorithms (such as the PageRank-like algorithms) recursively compute sentence significance using the global information in a text graph that links sentences together. In general, the existing PageRank-like algorithms can model well the phenomena that a sentence is important if it is linked by many other important sentences. Or they are capable of modeling the mutual reinforcement among the sentences in the text graph. However, when dealing with multidocument summarization these algorithms often assemble a set of documents into one large file. The document dimension is totally ignored. In this article we present a framework to model the two-level mutual reinforcement among sentences as well as documents. Under this framework we design and develop a novel ranking algorithm such that the document reinforcement is taken into account in the process of sentence ranking. The convergence issue is examined. We also explore an interesting and important property of the proposed algorithm. When evaluated on the DUC 2005 and 2006 query-oriented multidocument summarization datasets, significant results are achieved.
  18. González-Ibáñez, R.; Esparza-Villamán, A.; Vargas-Godoy, J.C.; Shah, C.: A comparison of unimodal and multimodal models for implicit detection of relevance in interactive IR (2019) 0.01
    0.0065351077 = product of:
      0.026140431 = sum of:
        0.026140431 = product of:
          0.052280862 = sum of:
            0.052280862 = weight(_text_:aspects in 5417) [ClassicSimilarity], result of:
              0.052280862 = score(doc=5417,freq=2.0), product of:
                0.20938325 = queryWeight, product of:
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.046325076 = queryNorm
                0.2496898 = fieldWeight in 5417, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.5198684 = idf(docFreq=1308, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5417)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Implicit detection of relevance has been approached by many during the last decade. From the use of individual measures to the use of multiple features from different sources (multimodality), studies have shown the feasibility to automatically detect whether a document is relevant. Despite promising results, it is not clear yet to what extent multimodality constitutes an effective approach compared to unimodality. In this article, we hypothesize that it is possible to build unimodal models capable of outperforming multimodal models in the detection of perceived relevance. To test this hypothesis, we conducted three experiments to compare unimodal and multimodal classification models built using a combination of 24 features. Our classification experiments showed that a univariate unimodal model based on the left-click feature supports our hypothesis. On the other hand, our prediction experiment suggests that multimodality slightly improves early classification compared to the best unimodal models. Based on our results, we argue that the feasibility for practical applications of state-of-the-art multimodal approaches may be strongly constrained by technology, cultural, ethical, and legal aspects, in which case unimodality may offer a better alternative today for supporting relevance detection in interactive information retrieval systems.
  19. MacFarlane, A.; Robertson, S.E.; McCann, J.A.: Parallel computing for passage retrieval (2004) 0.01
    0.006276408 = product of:
      0.025105633 = sum of:
        0.025105633 = product of:
          0.050211266 = sum of:
            0.050211266 = weight(_text_:22 in 5108) [ClassicSimilarity], result of:
              0.050211266 = score(doc=5108,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.30952093 = fieldWeight in 5108, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5108)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    20. 1.2007 18:30:22
  20. Faloutsos, C.: Signature files (1992) 0.01
    0.006276408 = product of:
      0.025105633 = sum of:
        0.025105633 = product of:
          0.050211266 = sum of:
            0.050211266 = weight(_text_:22 in 3499) [ClassicSimilarity], result of:
              0.050211266 = score(doc=3499,freq=2.0), product of:
                0.16222252 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046325076 = queryNorm
                0.30952093 = fieldWeight in 3499, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3499)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    7. 5.1999 15:22:48
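
The score breakdowns printed under each hit follow Lucene's ClassicSimilarity (TF-IDF) arithmetic. As a minimal sketch, the Python below recomputes the top-level score of result 1 (doc 247) from the numbers shown in its breakdown. The function names are illustrative and not part of any library; queryNorm, fieldNorm, and the coord factors are copied from the listing as constants rather than derived, and double precision is used where Lucene uses single precision, so the last digits may differ slightly.

```python
import math

def idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # tf is the square root of the raw term frequency
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm        # "queryWeight" in the listing
    field_weight = tf * term_idf * field_norm   # "fieldWeight" in the listing
    return query_weight * field_weight

# Constants copied from the breakdown of result 1 (doc 247).
MAX_DOCS = 44218
QUERY_NORM = 0.046325076
FIELD_NORM = 0.046875

social = term_score(2.0, 2228, MAX_DOCS, QUERY_NORM, FIELD_NORM)
aspects = term_score(2.0, 1308, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The "aspects" clause is nested and scaled by coord(1/2);
# the outer sum is then scaled by coord(2/4).
score = (social + aspects * 0.5) * 0.5
print(f"social={social:.8f} aspects={aspects:.8f} score={score:.9f}")
```

The result should agree with the listed 0.040099498 to within rounding, which is displayed as 0.04 next to the title of result 1.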

Languages

  • e 36
  • d 4

Types

  • a 38
  • m 1
  • r 1