Search (23 results, page 1 of 2)

  • × year_i:[2010 TO 2020}
  • × theme_ss:"Retrievalalgorithmen"
  1. Habernal, I.; Konopík, M.; Rohlík, O.: Question answering (2012) 0.03
    0.034050807 = product of:
      0.10215242 = sum of:
        0.057803504 = weight(_text_:wide in 101) [ClassicSimilarity], result of:
          0.057803504 = score(doc=101,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.29372054 = fieldWeight in 101, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.046875 = fieldNorm(doc=101)
        0.04434892 = weight(_text_:web in 101) [ClassicSimilarity], result of:
          0.04434892 = score(doc=101,freq=4.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.3059541 = fieldWeight in 101, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=101)
      0.33333334 = coord(2/6)
    
    Abstract
    Question Answering is an area of information retrieval with the added challenge of applying sophisticated techniques to identify the complex syntactic and semantic relationships present in text in order to provide a more sophisticated and satisfactory response to the user's information needs. For this reason, the authors see question answering as the next step beyond standard information retrieval. In this chapter state of the art question answering is covered focusing on providing an overview of systems, techniques and approaches that are likely to be employed in the next generations of search engines. Special attention is paid to question answering using the World Wide Web as the data source and to question answering exploiting the possibilities of Semantic Web. Considerations about the current issues and prospects for promising future research are also provided.
  2. Bhansali, D.; Desai, H.; Deulkar, K.: ¬A study of different ranking approaches for semantic search (2015) 0.03
    0.026011107 = product of:
      0.07803332 = sum of:
        0.045263432 = weight(_text_:web in 2696) [ClassicSimilarity], result of:
          0.045263432 = score(doc=2696,freq=6.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.3122631 = fieldWeight in 2696, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2696)
        0.03276989 = weight(_text_:computer in 2696) [ClassicSimilarity], result of:
          0.03276989 = score(doc=2696,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.20188503 = fieldWeight in 2696, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2696)
      0.33333334 = coord(2/6)
    
    Abstract
    Search Engines have become an integral part of our day to day life. Our reliance on search engines increases with every passing day. With the amount of data available on Internet increasing exponentially, it becomes important to develop new methods and tools that help to return results relevant to the queries and reduce the time spent on searching. The results should be diverse but at the same time should return results focused on the queries asked. Relation Based Page Rank [4] algorithms are considered to be the next frontier in improvement of Semantic Web Search. The probability of finding relevance in the search results as posited by the user while entering the query is used to measure the relevance. However, its application is limited by the complexity of determining relation between the terms and assigning explicit meaning to each term. Trust Rank is one of the most widely used ranking algorithms for semantic web search. Few other ranking algorithms like HITS algorithm, PageRank algorithm are also used for Semantic Web Searching. In this paper, we will provide a comparison of few ranking approaches.
    Source
    International journal of computer applications. 129(2015) no.5, S12-15
  3. Symonds, M.; Bruza, P.; Zuccon, G.; Koopman, B.; Sitbon, L.; Turner, I.: Automatic query expansion : a structural linguistic perspective (2014) 0.02
    0.02476748 = product of:
      0.07430244 = sum of:
        0.04816959 = weight(_text_:wide in 1338) [ClassicSimilarity], result of:
          0.04816959 = score(doc=1338,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.24476713 = fieldWeight in 1338, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1338)
        0.026132854 = weight(_text_:web in 1338) [ClassicSimilarity], result of:
          0.026132854 = score(doc=1338,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.18028519 = fieldWeight in 1338, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1338)
      0.33333334 = coord(2/6)
    
    Abstract
    A user's query is considered to be an imprecise description of their information need. Automatic query expansion is the process of reformulating the original query with the goal of improving retrieval effectiveness. Many successful query expansion techniques model syntagmatic associations that infer two terms co-occur more often than by chance in natural language. However, structural linguistics relies on both syntagmatic and paradigmatic associations to deduce the meaning of a word. Given the success of dependency-based approaches to query expansion and the reliance on word meanings in the query formulation process, we argue that modeling both syntagmatic and paradigmatic information in the query expansion process improves retrieval effectiveness. This article develops and evaluates a new query expansion technique that is based on a formal, corpus-based model of word meaning that models syntagmatic and paradigmatic associations. We demonstrate that when sufficient statistical information exists, as in the case of longer queries, including paradigmatic information alone provides significant improvements in retrieval effectiveness across a wide variety of data sets. More generally, when our new query expansion approach is applied to large-scale web retrieval it demonstrates significant improvements in retrieval effectiveness over a strong baseline system, based on a commercial search engine.
  4. Baloh, P.; Desouza, K.C.; Hackney, R.: Contextualizing organizational interventions of knowledge management systems : a design science perspectiveA domain analysis (2012) 0.02
    0.021071352 = product of:
      0.063214056 = sum of:
        0.04816959 = weight(_text_:wide in 241) [ClassicSimilarity], result of:
          0.04816959 = score(doc=241,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.24476713 = fieldWeight in 241, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=241)
        0.0150444675 = product of:
          0.030088935 = sum of:
            0.030088935 = weight(_text_:22 in 241) [ClassicSimilarity], result of:
              0.030088935 = score(doc=241,freq=2.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.19345059 = fieldWeight in 241, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=241)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    We address how individuals' (workers) knowledge needs influence the design of knowledge management systems (KMS), enabling knowledge creation and utilization. It is evident that KMS technologies and activities are indiscriminately deployed in most organizations with little regard to the actual context of their adoption. Moreover, it is apparent that the extant literature pertaining to knowledge management projects is frequently deficient in identifying the variety of factors indicative for successful KMS. This presents an obvious business practice and research gap that requires a critical analysis of the necessary intervention that will actually improve how workers can leverage and form organization-wide knowledge. This research involved an extensive review of the literature, a grounded theory methodological approach and rigorous data collection and synthesis through an empirical case analysis (Parsons Brinckerhoff and Samsung). The contribution of this study is the formulation of a model for designing KMS based upon the design science paradigm, which aspires to create artifacts that are interdependent of people and organizations. The essential proposition is that KMS design and implementation must be contextualized in relation to knowledge needs and that these will differ for various organizational settings. The findings present valuable insights and further understanding of the way in which KMS design efforts should be focused.
    Date
    11. 6.2012 14:22:34
  5. Ravana, S.D.; Rajagopal, P.; Balakrishnan, V.: Ranking retrieval systems using pseudo relevance judgments (2015) 0.02
    0.015802983 = product of:
      0.047408946 = sum of:
        0.026132854 = weight(_text_:web in 2591) [ClassicSimilarity], result of:
          0.026132854 = score(doc=2591,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.18028519 = fieldWeight in 2591, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2591)
        0.02127609 = product of:
          0.04255218 = sum of:
            0.04255218 = weight(_text_:22 in 2591) [ClassicSimilarity], result of:
              0.04255218 = score(doc=2591,freq=4.0), product of:
                0.1555381 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044416238 = queryNorm
                0.27358043 = fieldWeight in 2591, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2591)
          0.5 = coord(1/2)
      0.33333334 = coord(2/6)
    
    Abstract
    Purpose In a system-based approach, replicating the web would require large test collections, and judging the relevancy of all documents per topic in creating relevance judgment through human assessors is infeasible. Due to the large amount of documents that requires judgment, there are possible errors introduced by human assessors because of disagreements. The paper aims to discuss these issues. Design/methodology/approach This study explores exponential variation and document ranking methods that generate a reliable set of relevance judgments (pseudo relevance judgments) to reduce human efforts. These methods overcome problems with large amounts of documents for judgment while avoiding human disagreement errors during the judgment process. This study utilizes two key factors: number of occurrences of each document per topic from all the system runs; and document rankings to generate the alternate methods. Findings The effectiveness of the proposed method is evaluated using the correlation coefficient of ranked systems using mean average precision scores between the original Text REtrieval Conference (TREC) relevance judgments and pseudo relevance judgments. The results suggest that the proposed document ranking method with a pool depth of 100 could be a reliable alternative to reduce human effort and disagreement errors involved in generating TREC-like relevance judgments. Originality/value Simple methods proposed in this study show improvement in the correlation coefficient in generating alternate relevance judgment without human assessors while contributing to information retrieval evaluation.
    Date
    20. 1.2015 18:30:22
    18. 9.2018 18:22:56
  6. Fu, X.: Towards a model of implicit feedback for Web search (2010) 0.01
    0.009052687 = product of:
      0.054316122 = sum of:
        0.054316122 = weight(_text_:web in 3310) [ClassicSimilarity], result of:
          0.054316122 = score(doc=3310,freq=6.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.37471575 = fieldWeight in 3310, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3310)
      0.16666667 = coord(1/6)
    
    Abstract
    This research investigated several important issues in using implicit feedback techniques to assist searchers with difficulties in formulating effective search strategies. It focused on examining the relationship between types of behavioral evidence that can be captured from Web searches and searchers' interests. A carefully crafted observation study was conducted to capture, examine, and elucidate the analytical processes and work practices of human analysts when they simulated the role of an implicit feedback system by trying to infer searchers' interests from behavioral traces. Findings provided rare insight into the complexities and nuances in using behavioral evidence for implicit feedback and led to the proposal of an implicit feedback model for Web search that bridged previous studies on behavioral evidence and implicit feedback measures. A new level of analysis termed an analytical lens emerged from the data and provides a road map for future research on this topic.
  7. Moura, E.S. de; Fernandes, D.; Ribeiro-Neto, B.; Silva, A.S. da; Gonçalves, M.A.: Using structural information to improve search in Web collections (2010) 0.01
    0.009052687 = product of:
      0.054316122 = sum of:
        0.054316122 = weight(_text_:web in 4119) [ClassicSimilarity], result of:
          0.054316122 = score(doc=4119,freq=6.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.37471575 = fieldWeight in 4119, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4119)
      0.16666667 = coord(1/6)
    
    Abstract
    In this work, we investigate the problem of using the block structure of Web pages to improve ranking results. Starting with basic intuitions provided by the concepts of term frequency (TF) and inverse document frequency (IDF), we propose nine block-weight functions to distinguish the impact of term occurrences inside page blocks, instead of inside whole pages. These are then used to compute a modified BM25 ranking function. Using four distinct Web collections, we ran extensive experiments to compare our block-weight ranking formulas with two other baselines: (a) a BM25 ranking applied to full pages, and (b) a BM25 ranking that takes into account best blocks. Our methods suggest that our block-weighting ranking method is superior to all baselines across all collections we used and that average gain in precision figures from 5 to 20% are generated.
  8. Walz, J.: Analyse der Übertragbarkeit allgemeiner Rankingfaktoren von Web-Suchmaschinen auf Discovery-Systeme (2018) 0.01
    0.009052687 = product of:
      0.054316122 = sum of:
        0.054316122 = weight(_text_:web in 5744) [ClassicSimilarity], result of:
          0.054316122 = score(doc=5744,freq=6.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.37471575 = fieldWeight in 5744, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5744)
      0.16666667 = coord(1/6)
    
    Abstract
    Ziel: Ziel dieser Bachelorarbeit war es, die Übertragbarkeit der allgemeinen Rankingfaktoren, wie sie von Web-Suchmaschinen verwendet werden, auf Discovery-Systeme zu analysieren. Dadurch könnte das bisher hauptsächlich auf dem textuellen Abgleich zwischen Suchanfrage und Dokumenten basierende bibliothekarische Ranking verbessert werden. Methode: Hierfür wurden Faktoren aus den Gruppen Popularität, Aktualität, Lokalität, Technische Faktoren, sowie dem personalisierten Ranking diskutiert. Die entsprechenden Rankingfaktoren wurden nach ihrer Vorkommenshäufigkeit in der analysierten Literatur und der daraus abgeleiteten Wichtigkeit, ausgewählt. Ergebnis: Von den 23 untersuchten Rankingfaktoren sind 14 (61 %) direkt vom Ranking der Web-Suchmaschinen auf das Ranking der Discovery-Systeme übertragbar. Zu diesen zählen unter anderem das Klickverhalten, das Erstellungsdatum, der Nutzerstandort, sowie die Sprache. Sechs (26%) der untersuchten Faktoren sind dagegen nicht übertragbar (z.B. Aktualisierungsfrequenz und Ladegeschwindigkeit). Die Linktopologie, die Nutzungshäufigkeit, sowie die Aktualisierungsfrequenz sind mit entsprechenden Modifikationen übertragbar.
  9. Bar-Ilan, J.; Levene, M.: ¬The hw-rank : an h-index variant for ranking web pages (2015) 0.01
    0.008710952 = product of:
      0.052265707 = sum of:
        0.052265707 = weight(_text_:web in 1694) [ClassicSimilarity], result of:
          0.052265707 = score(doc=1694,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.36057037 = fieldWeight in 1694, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.078125 = fieldNorm(doc=1694)
      0.16666667 = coord(1/6)
    
  10. Van der Veer Martens, B.; Fleet, C. van: Opening the black box of "relevance work" : a domain analysis (2012) 0.01
    0.008245593 = product of:
      0.049473554 = sum of:
        0.049473554 = product of:
          0.09894711 = sum of:
            0.09894711 = weight(_text_:programs in 247) [ClassicSimilarity], result of:
              0.09894711 = score(doc=247,freq=2.0), product of:
                0.25748047 = queryWeight, product of:
                  5.79699 = idf(docFreq=364, maxDocs=44218)
                  0.044416238 = queryNorm
                0.38428974 = fieldWeight in 247, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.79699 = idf(docFreq=364, maxDocs=44218)
                  0.046875 = fieldNorm(doc=247)
          0.5 = coord(1/2)
      0.16666667 = coord(1/6)
    
    Abstract
    In response to Hjørland's recent call for a reconceptualization of the foundations of relevance, we suggest that the sociocognitive aspects of intermediation by information agencies, such as archives and libraries, are a necessary and unexplored part of the infrastructure of the subject knowledge domains central to his recommended "view of relevance informed by a social paradigm" (2010, p. 217). From a comparative analysis of documents from 39 graduate-level introductory courses in archives, reference, and strategic/competitive intelligence taught in 13 American Library Association-accredited library and information science (LIS) programs, we identify four defining sociocognitive dimensions of "relevance work" in information agencies within Hjørland's proposed framework for relevance: tasks, time, systems, and assessors. This study is intended to supply sociocognitive content from within the relevance work domain to support further domain analytic research, and to emphasize the importance of intermediary relevance work for all subject knowledge domains.
  11. Tsai, C.-F.; Hu, Y.-H.; Chen, Z.-Y.: Factors affecting rocchio-based pseudorelevance feedback in image retrieval (2015) 0.01
    0.008028265 = product of:
      0.04816959 = sum of:
        0.04816959 = weight(_text_:wide in 1607) [ClassicSimilarity], result of:
          0.04816959 = score(doc=1607,freq=2.0), product of:
            0.19679762 = queryWeight, product of:
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.044416238 = queryNorm
            0.24476713 = fieldWeight in 1607, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.4307585 = idf(docFreq=1430, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1607)
      0.16666667 = coord(1/6)
    
    Abstract
    Pseudorelevance feedback (PRF) was proposed to solve the limitation of relevance feedback (RF), which is based on the user-in-the-loop process. In PRF, the top-k retrieved images are regarded as PRF. Although the PRF set contains noise, PRF has proven effective for automatically improving the overall retrieval result. To implement PRF, the Rocchio algorithm has been considered as a reasonable and well-established baseline. However, the performance of Rocchio-based PRF is subject to various representation choices (or factors). In this article, we examine these factors that affect the performance of Rocchio-based PRF, including image-feature representation, the number of top-ranked images, the weighting parameters of Rocchio, and similarity measure. We offer practical insights on how to optimize the performance of Rocchio-based PRF by choosing appropriate representation choices. Our extensive experiments on NUS-WIDE-LITE and Caltech 101 + Corel 5000 data sets show that the optimal feature representation is color moment + wavelet texture in terms of retrieval efficiency and effectiveness. Other representation choices are that using top-20 ranked images as pseudopositive and pseudonegative feedback sets with the equal weight (i.e., 0.5) by the correlation and cosine distance functions can produce the optimal retrieval result.
  12. Koumenides, C.L.; Shadbolt, N.R.: Ranking methods for entity-oriented semantic web search (2014) 0.01
    0.0073914872 = product of:
      0.04434892 = sum of:
        0.04434892 = weight(_text_:web in 1280) [ClassicSimilarity], result of:
          0.04434892 = score(doc=1280,freq=4.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.3059541 = fieldWeight in 1280, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=1280)
      0.16666667 = coord(1/6)
    
    Abstract
    This article provides a technical review of semantic search methods used to support text-based search over formal Semantic Web knowledge bases. Our focus is on ranking methods and auxiliary processes explored by existing semantic search systems, outlined within broad areas of classification. We present reflective examples from the literature in some detail, which should appeal to readers interested in a deeper perspective on the various methods and systems implemented in the outlined literature. The presentation covers graph exploration and propagation methods, adaptations of classic probabilistic retrieval models, and query-independent link analysis via flexible extensions to the PageRank algorithm. Future research directions are discussed, including development of more cohesive retrieval models to unlock further potentials and uses, data indexing schemes, integration with user interfaces, and building community consensus for more systematic evaluation and gradual development.
  13. Jindal, V.; Bawa, S.; Batra, S.: ¬A review of ranking approaches for semantic search on Web (2014) 0.01
    0.0073914872 = product of:
      0.04434892 = sum of:
        0.04434892 = weight(_text_:web in 2799) [ClassicSimilarity], result of:
          0.04434892 = score(doc=2799,freq=4.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.3059541 = fieldWeight in 2799, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=2799)
      0.16666667 = coord(1/6)
    
    Abstract
    With ever increasing information being available to the end users, search engines have become the most powerful tools for obtaining useful information scattered on the Web. However, it is very common that even most renowned search engines return result sets with not so useful pages to the user. Research on semantic search aims to improve traditional information search and retrieval methods where the basic relevance criteria rely primarily on the presence of query keywords within the returned pages. This work is an attempt to explore different relevancy ranking approaches based on semantics which are considered appropriate for the retrieval of relevant information. In this paper, various pilot projects and their corresponding outcomes have been investigated based on methodologies adopted and their most distinctive characteristics towards ranking. An overview of selected approaches and their comparison by means of the classification criteria has been presented. With the help of this comparison, some common concepts and outstanding features have been identified.
  14. Lee, J.; Min, J.-K.; Oh, A.; Chung, C.-W.: Effective ranking and search techniques for Web resources considering semantic relationships (2014) 0.01
    0.006159573 = product of:
      0.036957435 = sum of:
        0.036957435 = weight(_text_:web in 2670) [ClassicSimilarity], result of:
          0.036957435 = score(doc=2670,freq=4.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.25496176 = fieldWeight in 2670, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2670)
      0.16666667 = coord(1/6)
    
    Abstract
    On the Semantic Web, the types of resources and the semantic relationships between resources are defined in an ontology. By using that information, the accuracy of information retrieval can be improved. In this paper, we present effective ranking and search techniques considering the semantic relationships in an ontology. Our technique retrieves top-k resources which are the most relevant to query keywords through the semantic relationships. To do this, we propose a weighting measure for the semantic relationship. Based on this measure, we propose a novel ranking method which considers the number of meaningful semantic relationships between a resource and keywords as well as the coverage and discriminating power of keywords. In order to improve the efficiency of the search, we prune the unnecessary search space using the length and weight thresholds of the semantic relationship path. In addition, we exploit Threshold Algorithm based on an extended inverted index to answer top-k results efficiently. The experimental results using real data sets demonstrate that our retrieval method using the semantic information generates accurate results efficiently compared to the traditional methods.
  15. Behnert, C.; Plassmeier, K.; Borst, T.; Lewandowski, D.: Evaluierung von Rankingverfahren für bibliothekarische Informationssysteme (2019) 0.01
    0.0060976665 = product of:
      0.036585998 = sum of:
        0.036585998 = weight(_text_:web in 5023) [ClassicSimilarity], result of:
          0.036585998 = score(doc=5023,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.25239927 = fieldWeight in 5023, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5023)
      0.16666667 = coord(1/6)
    
    Abstract
    Dieser Beitrag beschreibt eine Studie zur Entwicklung und Evaluierung von Rankingverfahren für bibliothekarische Informationssysteme. Dazu wurden mögliche Faktoren für das Relevanzranking ausgehend von den Verfahren in Websuchmaschinen identifiziert, auf den Bibliothekskontext übertragen und systematisch evaluiert. Mithilfe eines Testsystems, das auf dem ZBW-Informationsportal EconBiz und einer web-basierten Software zur Evaluierung von Suchsystemen aufsetzt, wurden verschiedene Relevanzfaktoren (z. B. Popularität in Verbindung mit Aktualität) getestet. Obwohl die getesteten Rankingverfahren auf einer theoretischen Ebene divers sind, konnten keine einheitlichen Verbesserungen gegenüber den Baseline-Rankings gemessen werden. Die Ergebnisse deuten darauf hin, dass eine Adaptierung des Rankings auf individuelle Nutzer bzw. Nutzungskontexte notwendig sein könnte, um eine höhere Performance zu erzielen.
  16. Jiang, X.; Sun, X.; Yang, Z.; Zhuge, H.; Lapshinova-Koltunski, E.; Yao, J.: Exploiting heterogeneous scientific literature networks to combat ranking bias : evidence from the computational linguistics area (2016) 0.01
    0.005461648 = product of:
      0.03276989 = sum of:
        0.03276989 = weight(_text_:computer in 3017) [ClassicSimilarity], result of:
          0.03276989 = score(doc=3017,freq=2.0), product of:
            0.16231956 = queryWeight, product of:
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.044416238 = queryNorm
            0.20188503 = fieldWeight in 3017, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.6545093 = idf(docFreq=3109, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3017)
      0.16666667 = coord(1/6)
    
    Abstract
    It is important to help researchers find valuable papers from a large literature collection. To this end, many graph-based ranking algorithms have been proposed. However, most of these algorithms suffer from the problem of ranking bias. Ranking bias hurts the usefulness of a ranking algorithm because it returns a ranking list with an undesirable time distribution. This paper is a focused study on how to alleviate ranking bias by leveraging the heterogeneous network structure of the literature collection. We propose a new graph-based ranking algorithm, MutualRank, that integrates mutual reinforcement relationships among networks of papers, researchers, and venues to achieve a more synthetic, accurate, and less-biased ranking than previous methods. MutualRank provides a unified model that involves both intra- and inter-network information for ranking papers, researchers, and venues simultaneously. We use the ACL Anthology Network as the benchmark data set and construct the gold standard from computer linguistics course websites of well-known universities and two well-known textbooks. The experimental results show that MutualRank greatly outperforms the state-of-the-art competitors, including PageRank, HITS, CoRank, Future Rank, and P-Rank, in ranking papers in both improving ranking effectiveness and alleviating ranking bias. Rankings of researchers and venues by MutualRank are also quite reasonable.
  17. Liu, X.; Turtle, H.: Real-time user interest modeling for real-time ranking (2013) 0.01
    0.0052265706 = product of:
      0.031359423 = sum of:
        0.031359423 = weight(_text_:web in 1035) [ClassicSimilarity], result of:
          0.031359423 = score(doc=1035,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.21634221 = fieldWeight in 1035, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=1035)
      0.16666667 = coord(1/6)
    
    Abstract
    User interest as a very dynamic information need is often ignored in most existing information retrieval systems. In this research, we present the results of experiments designed to evaluate the performance of a real-time interest model (RIM) that attempts to identify the dynamic and changing query level interests regarding social media outputs. Unlike most existing ranking methods, our ranking approach targets calculation of the probability that user interest in the content of the document is subject to very dynamic user interest change. We describe 2 formulations of the model (real-time interest vector space and real-time interest language model) stemming from classical relevance ranking methods and develop a novel methodology for evaluating the performance of RIM using Amazon Mechanical Turk to collect (interest-based) relevance judgments on a daily basis. Our results show that the model usually, although not always, performs better than baseline results obtained from commercial web search engines. We identify factors that affect RIM performance and outline plans for future research.
  18. Ozdemiray, A.M.; Altingovde, I.S.: Explicit search result diversification using score and rank aggregation methods (2015) 0.00
    0.004355476 = product of:
      0.026132854 = sum of:
        0.026132854 = weight(_text_:web in 1856) [ClassicSimilarity], result of:
          0.026132854 = score(doc=1856,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.18028519 = fieldWeight in 1856, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1856)
      0.16666667 = coord(1/6)
    
    Abstract
    Search result diversification is one of the key techniques to cope with the ambiguous and underspecified information needs of web users. In the last few years, strategies that are based on the explicit knowledge of query aspects emerged as highly effective ways of diversifying search results. Our contributions in this article are two-fold. First, we extensively evaluate the performance of a state-of-the-art explicit diversification strategy and pin-point its potential weaknesses. We propose basic yet novel optimizations to remedy these weaknesses and boost the performance of this algorithm. As a second contribution, inspired by the success of the current diversification strategies that exploit the relevance of the candidate documents to individual query aspects, we cast the diversification problem into the problem of ranking aggregation. To this end, we propose to materialize the re-rankings of the candidate documents for each query aspect and then merge these rankings by adapting the score(-based) and rank(-based) aggregation methods. Our extensive experimental evaluations show that certain ranking aggregation methods are superior to existing explicit diversification strategies in terms of diversification effectiveness. Furthermore, these ranking aggregation methods have lower computational complexity than the state-of-the-art diversification strategies.
  19. Hubert, G.; Pitarch, Y.; Pinel-Sauvagnat, K.; Tournier, R.; Laporte, L.: TournaRank : when retrieval becomes document competition (2018) 0.00
    0.004355476 = product of:
      0.026132854 = sum of:
        0.026132854 = weight(_text_:web in 5087) [ClassicSimilarity], result of:
          0.026132854 = score(doc=5087,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.18028519 = fieldWeight in 5087, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5087)
      0.16666667 = coord(1/6)
    
    Abstract
    Numerous feature-based models have been recently proposed by the information retrieval community. The capability of features to express different relevance facets (query- or document-dependent) can explain such a success story. Such models are most of the time supervised, thus requiring a learning phase. To leverage the advantages of feature-based representations of documents, we propose TournaRank, an unsupervised approach inspired by real-life game and sport competition principles. Documents compete against each other in tournaments using features as evidences of relevance. Tournaments are modeled as a sequence of matches, which involve pairs of documents playing in turn their features. Once a tournament is ended, documents are ranked according to their number of won matches during the tournament. This principle is generic since it can be applied to any collection type. It also provides great flexibility since different alternatives can be considered by changing the tournament type, the match rules, the feature set, or the strategies adopted by documents during matches. TournaRank was experimented on several collections to evaluate our model in different contexts and to compare it with related approaches such as Learning To Rank and fusion ones: the TREC Robust2004 collection for homogeneous documents, the TREC Web2014 (ClueWeb12) collection for heterogeneous web documents, and the LETOR3.0 collection for comparison with supervised feature-based models.
  20. Jiang, J.-D.; Jiang, J.-Y.; Cheng, P.-J.: Cocluster hypothesis and ranking consistency for relevance ranking in web search (2019) 0.00
    0.004355476 = product of:
      0.026132854 = sum of:
        0.026132854 = weight(_text_:web in 5247) [ClassicSimilarity], result of:
          0.026132854 = score(doc=5247,freq=2.0), product of:
            0.14495286 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.044416238 = queryNorm
            0.18028519 = fieldWeight in 5247, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5247)
      0.16666667 = coord(1/6)