Search (153 results, page 1 of 8)

  • × theme_ss:"Retrievalalgorithmen"
  1. Kanaeva, Z.: Ranking: Google und CiteSeer (2005) 0.14
    0.14489293 = product of:
      0.28978586 = sum of:
        0.13105242 = weight(_text_:suchmaschine in 3276) [ClassicSimilarity], result of:
          0.13105242 = score(doc=3276,freq=4.0), product of:
            0.21191008 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.03747799 = queryNorm
            0.6184341 = fieldWeight in 3276, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3276)
        0.14688537 = weight(_text_:ranking in 3276) [ClassicSimilarity], result of:
          0.14688537 = score(doc=3276,freq=6.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.7245744 = fieldWeight in 3276, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3276)
        0.011848084 = product of:
          0.03554425 = sum of:
            0.03554425 = weight(_text_:22 in 3276) [ClassicSimilarity], result of:
              0.03554425 = score(doc=3276,freq=2.0), product of:
                0.13124153 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03747799 = queryNorm
                0.2708308 = fieldWeight in 3276, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3276)
          0.33333334 = coord(1/3)
      0.5 = coord(3/6)
    
    Abstract
    Im Rahmen des klassischen Information Retrieval wurden verschiedene Verfahren für das Ranking sowie die Suche in einer homogenen strukturlosen Dokumentenmenge entwickelt. Die Erfolge der Suchmaschine Google haben gezeigt dass die Suche in einer zwar inhomogenen aber zusammenhängenden Dokumentenmenge wie dem Internet unter Berücksichtigung der Dokumentenverbindungen (Links) sehr effektiv sein kann. Unter den von der Suchmaschine Google realisierten Konzepten ist ein Verfahren zum Ranking von Suchergebnissen (PageRank), das in diesem Artikel kurz erklärt wird. Darüber hinaus wird auf die Konzepte eines Systems namens CiteSeer eingegangen, welches automatisch bibliographische Angaben indexiert (engl. Autonomous Citation Indexing, ACI). Letzteres erzeugt aus einer Menge von nicht vernetzten wissenschaftlichen Dokumenten eine zusammenhängende Dokumentenmenge und ermöglicht den Einsatz von Banking-Verfahren, die auf den von Google genutzten Verfahren basieren.
    Date
    20. 3.2005 16:23:22
  2. Kaszkiel, M.; Zobel, J.: Effective ranking with arbitrary passages (2001) 0.07
    0.071948126 = product of:
      0.21584436 = sum of:
        0.20559667 = weight(_text_:ranking in 5764) [ClassicSimilarity], result of:
          0.20559667 = score(doc=5764,freq=16.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            1.0141928 = fieldWeight in 5764, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.046875 = fieldNorm(doc=5764)
        0.010247685 = product of:
          0.030743055 = sum of:
            0.030743055 = weight(_text_:29 in 5764) [ClassicSimilarity], result of:
              0.030743055 = score(doc=5764,freq=2.0), product of:
                0.13183585 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03747799 = queryNorm
                0.23319192 = fieldWeight in 5764, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5764)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    Text retrieval systems store a great variety of documents, from abstracts, newspaper articles, and Web pages to journal articles, books, court transcripts, and legislation. Collections of diverse types of documents expose shortcomings in current approaches to ranking. Use of short fragments of documents, called passages, instead of whole documents can overcome these shortcomings: passage ranking provides convenient units of text to return to the user, can avoid the difficulties of comparing documents of different length, and enables identification of short blocks of relevant material among otherwise irrelevant text. In this article, we compare several kinds of passage in an extensive series of experiments. We introduce a new type of passage, overlapping fragments of either fixed or variable length. We show that ranking with these arbitrary passages gives substantial improvements in retrieval effectiveness over traditional document ranking schemes, particularly for queries on collections of long documents. Ranking with arbitrary passages shows consistent improvements compared to ranking with whole documents, and to ranking with previous passage types that depend on document structure or topic shifts in documents
    Date
    29. 9.2001 14:00:39
  3. Back, J.: ¬An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.06
    0.06443493 = product of:
      0.19330478 = sum of:
        0.16960861 = weight(_text_:ranking in 3445) [ClassicSimilarity], result of:
          0.16960861 = score(doc=3445,freq=2.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.8366664 = fieldWeight in 3445, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.109375 = fieldNorm(doc=3445)
        0.023696167 = product of:
          0.0710885 = sum of:
            0.0710885 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
              0.0710885 = score(doc=3445,freq=2.0), product of:
                0.13124153 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03747799 = queryNorm
                0.5416616 = fieldWeight in 3445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3445)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Date
    25. 8.2005 17:42:22
  4. Vechtomova, O.; Karamuftuoglu, M.: Lexical cohesion and term proximity in document ranking (2008) 0.06
    0.060510855 = product of:
      0.18153256 = sum of:
        0.16786899 = weight(_text_:ranking in 2101) [ClassicSimilarity], result of:
          0.16786899 = score(doc=2101,freq=6.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.828085 = fieldWeight in 2101, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0625 = fieldNorm(doc=2101)
        0.013663581 = product of:
          0.04099074 = sum of:
            0.04099074 = weight(_text_:29 in 2101) [ClassicSimilarity], result of:
              0.04099074 = score(doc=2101,freq=2.0), product of:
                0.13183585 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03747799 = queryNorm
                0.31092256 = fieldWeight in 2101, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2101)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    We demonstrate effective new methods of document ranking based on lexical cohesive relationships between query terms. The proposed methods rely solely on the lexical relationships between original query terms, and do not involve query expansion or relevance feedback. Two types of lexical cohesive relationship information between query terms are used in document ranking: short-distance collocation relationship between query terms, and long-distance relationship, determined by the collocation of query terms with other words. The methods are evaluated on TREC corpora, and show improvements over baseline systems.
    Date
    1. 8.2008 12:29:05
  5. Tober, M.; Hennig, L.; Furch, D.: SEO Ranking-Faktoren und Rang-Korrelationen 2014 : Google Deutschland (2014) 0.06
    0.06046989 = product of:
      0.18140966 = sum of:
        0.16786899 = weight(_text_:ranking in 1484) [ClassicSimilarity], result of:
          0.16786899 = score(doc=1484,freq=6.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.828085 = fieldWeight in 1484, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0625 = fieldNorm(doc=1484)
        0.013540667 = product of:
          0.040622 = sum of:
            0.040622 = weight(_text_:22 in 1484) [ClassicSimilarity], result of:
              0.040622 = score(doc=1484,freq=2.0), product of:
                0.13124153 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03747799 = queryNorm
                0.30952093 = fieldWeight in 1484, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1484)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    Dieses Whitepaper beschäftigt sich mit der Definition und Bewertung von Faktoren, die eine hohe Rangkorrelation-Koeffizienz mit organischen Suchergebnissen aufweisen und dient dem Zweck der tieferen Analyse von Suchmaschinen-Algorithmen. Die Datenerhebung samt Auswertung bezieht sich auf Ranking-Faktoren für Google-Deutschland im Jahr 2014. Zusätzlich wurden die Korrelationen und Faktoren unter anderem anhand von Durchschnitts- und Medianwerten sowie Entwicklungstendenzen zu den Vorjahren hinsichtlich ihrer Relevanz für vordere Suchergebnis-Positionen interpretiert.
    Date
    13. 9.2014 14:45:22
    Source
    http://www.searchmetrics.com/media/documents/knowledge-base/searchmetrics-ranking-faktoren-studie-2014.pdf
  6. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 0.06
    0.05814619 = product of:
      0.17443857 = sum of:
        0.105906345 = weight(_text_:suchmaschine in 7) [ClassicSimilarity], result of:
          0.105906345 = score(doc=7,freq=8.0), product of:
            0.21191008 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.03747799 = queryNorm
            0.4997702 = fieldWeight in 7, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
        0.06853223 = weight(_text_:ranking in 7) [ClassicSimilarity], result of:
          0.06853223 = score(doc=7,freq=4.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.33806428 = fieldWeight in 7, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
      0.33333334 = coord(2/6)
    
    Content
    Inhalt: Introduction Document File Preparation - Manual Indexing - Information Extraction - Vector Space Modeling - Matrix Decompositions - Query Representations - Ranking and Relevance Feedback - Searching by Link Structure - User Interface - Book Format Document File Preparation Document Purification and Analysis - Text Formatting - Validation - Manual Indexing - Automatic Indexing - Item Normalization - Inverted File Structures - Document File - Dictionary List - Inversion List - Other File Structures Vector Space Models Construction - Term-by-Document Matrices - Simple Query Matching - Design Issues - Term Weighting - Sparse Matrix Storage - Low-Rank Approximations Matrix Decompositions QR Factorization - Singular Value Decomposition - Low-Rank Approximations - Query Matching - Software - Semidiscrete Decomposition - Updating Techniques Query Management Query Binding - Types of Queries - Boolean Queries - Natural Language Queries - Thesaurus Queries - Fuzzy Queries - Term Searches - Probabilistic Queries Ranking and Relevance Feedback Performance Evaluation - Precision - Recall - Average Precision - Genetic Algorithms - Relevance Feedback Searching by Link Structure HITS Method - HITS Implementation - HITS Summary - PageRank Method - PageRank Adjustments - PageRank Implementation - PageRank Summary User Interface Considerations General Guidelines - Search Engine Interfaces - Form Fill-in - Display Considerations - Progress Indication - No Penalties for Error - Results - Test and Retest - Final Considerations Further Reading
    RSWK
    Suchmaschine / Information Retrieval
    Suchmaschine / Information Retrieval / Mathematisches Modell (HEBIS)
    Subject
    Suchmaschine / Information Retrieval
    Suchmaschine / Information Retrieval / Mathematisches Modell (HEBIS)
  7. Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.06
    0.055229932 = product of:
      0.1656898 = sum of:
        0.1453788 = weight(_text_:ranking in 58) [ClassicSimilarity], result of:
          0.1453788 = score(doc=58,freq=2.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.71714264 = fieldWeight in 58, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.09375 = fieldNorm(doc=58)
        0.020311 = product of:
          0.060932998 = sum of:
            0.060932998 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
              0.060932998 = score(doc=58,freq=2.0), product of:
                0.13124153 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03747799 = queryNorm
                0.46428138 = fieldWeight in 58, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=58)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Date
    14. 6.2015 22:12:44
  8. Stock, M.; Stock, W.G.: Internet-Suchwerkzeuge im Vergleich (IV) : Relevance Ranking nach "Popularität" von Webseiten: Google (2001) 0.05
    0.050706387 = product of:
      0.15211916 = sum of:
        0.07942976 = weight(_text_:suchmaschine in 5771) [ClassicSimilarity], result of:
          0.07942976 = score(doc=5771,freq=2.0), product of:
            0.21191008 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.03747799 = queryNorm
            0.37482765 = fieldWeight in 5771, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.046875 = fieldNorm(doc=5771)
        0.0726894 = weight(_text_:ranking in 5771) [ClassicSimilarity], result of:
          0.0726894 = score(doc=5771,freq=2.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.35857132 = fieldWeight in 5771, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.046875 = fieldNorm(doc=5771)
      0.33333334 = coord(2/6)
    
    Abstract
    In unserem Retrievaltest von Suchwerkzeugen im World Wide Web (Password 11/2000) schnitt die Suchmaschine Google am besten ab. Im Vergleich zu anderen Search Engines setzt Google kaum auf Informationslinguistik, sondern auf Algorithmen, die sich aus den Besonderheiten der Web-Dokumente ableiten lassen. Kernstück der informationsstatistischen Technik ist das "PageRank"- Verfahren (benannt nach dem Entwickler Larry Page), das aus der Hypertextstruktur des Web die "Popularität" von Seiten anhand ihrer ein- und ausgehenden Links berechnet. Google besticht durch das Angebot intuitiv verstehbarer Suchbildschirme sowie durch einige sehr nützliche "Kleinigkeiten" wie die Angabe des Rangs einer Seite, Highlighting, Suchen in der Seite, Suchen innerhalb eines Suchergebnisses usw., alles verstaut in einer eigenen Befehlsleiste innerhalb des Browsers. Ähnlich wie RealNames bietet Google mit dem Produkt "AdWords" den Aufkauf von Suchtermen an. Nach einer Reihe von nunmehr vier Password-Artikeln über InternetSuchwerkzeugen im Vergleich wollen wir abschließend zu einer Bewertung kommen. Wie ist der Stand der Technik bei Directories und Search Engines aus informationswissenschaftlicher Sicht einzuschätzen? Werden die "typischen" Internetnutzer, die ja in der Regel keine Information Professionals sind, adäquat bedient? Und können auch Informationsfachleute von den Suchwerkzeugen profitieren?
  9. Maylein, L.; Langenstein, A.: Neues vom Relevanz-Ranking im HEIDI-Katalog der Universitätsbibliothek Heidelberg : Perspektiven für bibliothekarische Dienstleistungen (2013) 0.05
    0.050242677 = product of:
      0.15072803 = sum of:
        0.13706446 = weight(_text_:ranking in 775) [ClassicSimilarity], result of:
          0.13706446 = score(doc=775,freq=4.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.67612857 = fieldWeight in 775, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0625 = fieldNorm(doc=775)
        0.013663581 = product of:
          0.04099074 = sum of:
            0.04099074 = weight(_text_:29 in 775) [ClassicSimilarity], result of:
              0.04099074 = score(doc=775,freq=2.0), product of:
                0.13183585 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03747799 = queryNorm
                0.31092256 = fieldWeight in 775, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=775)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    Das Relevanz-Ranking im Katalog der Universitätsbibliothek Heidelberg HEIDI, bereits 2009 in einem Beitrag in dieser Zeitschrift beschrieben, wurde in den letzten Jahren durch neue Entwicklungen und Methoden stark verbessert. Der Aufsatz beschreibt die Realisierung der bisherigen Rankingmaßnahmen unter der neu eingesetzten Suchmaschinenplattform SOLR. Weiter werden verschiedene neue Möglichkeiten für Rankinganpassungen unter SOLR sowie deren Einsatz im HEIDI-Katalog dargestellt.
    Date
    29. 6.2013 18:06:23
  10. Langville, A.N.; Meyer, C.D.: Google's PageRank and beyond : the science of search engine rankings (2006) 0.05
    0.0483971 = product of:
      0.1451913 = sum of:
        0.05616532 = weight(_text_:suchmaschine in 6) [ClassicSimilarity], result of:
          0.05616532 = score(doc=6,freq=4.0), product of:
            0.21191008 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.03747799 = queryNorm
            0.26504317 = fieldWeight in 6, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.0234375 = fieldNorm(doc=6)
        0.089025974 = weight(_text_:ranking in 6) [ClassicSimilarity], result of:
          0.089025974 = score(doc=6,freq=12.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.43915838 = fieldWeight in 6, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0234375 = fieldNorm(doc=6)
      0.33333334 = coord(2/6)
    
    Content
    Inhalt: Chapter 1. Introduction to Web Search Engines: 1.1 A Short History of Information Retrieval - 1.2 An Overview of Traditional Information Retrieval - 1.3 Web Information Retrieval Chapter 2. Crawling, Indexing, and Query Processing: 2.1 Crawling - 2.2 The Content Index - 2.3 Query Processing Chapter 3. Ranking Webpages by Popularity: 3.1 The Scene in 1998 - 3.2 Two Theses - 3.3 Query-Independence Chapter 4. The Mathematics of Google's PageRank: 4.1 The Original Summation Formula for PageRank - 4.2 Matrix Representation of the Summation Equations - 4.3 Problems with the Iterative Process - 4.4 A Little Markov Chain Theory - 4.5 Early Adjustments to the Basic Model - 4.6 Computation of the PageRank Vector - 4.7 Theorem and Proof for Spectrum of the Google Matrix Chapter 5. Parameters in the PageRank Model: 5.1 The a Factor - 5.2 The Hyperlink Matrix H - 5.3 The Teleportation Matrix E Chapter 6. The Sensitivity of PageRank; 6.1 Sensitivity with respect to alpha - 6.2 Sensitivity with respect to H - 6.3 Sensitivity with respect to vT - 6.4 Other Analyses of Sensitivity - 6.5 Sensitivity Theorems and Proofs Chapter 7. The PageRank Problem as a Linear System: 7.1 Properties of (I - alphaS) - 7.2 Properties of (I - alphaH) - 7.3 Proof of the PageRank Sparse Linear System Chapter 8. Issues in Large-Scale Implementation of PageRank: 8.1 Storage Issues - 8.2 Convergence Criterion - 8.3 Accuracy - 8.4 Dangling Nodes - 8.5 Back Button Modeling
    Chapter 9. Accelerating the Computation of PageRank: 9.1 An Adaptive Power Method - 9.2 Extrapolation - 9.3 Aggregation - 9.4 Other Numerical Methods Chapter 10. Updating the PageRank Vector: 10.1 The Two Updating Problems and their History - 10.2 Restarting the Power Method - 10.3 Approximate Updating Using Approximate Aggregation - 10.4 Exact Aggregation - 10.5 Exact vs. Approximate Aggregation - 10.6 Updating with Iterative Aggregation - 10.7 Determining the Partition - 10.8 Conclusions Chapter 11. The HITS Method for Ranking Webpages: 11.1 The HITS Algorithm - 11.2 HITS Implementation - 11.3 HITS Convergence - 11.4 HITS Example - 11.5 Strengths and Weaknesses of HITS - 11.6 HITS's Relationship to Bibliometrics - 11.7 Query-Independent HITS - 11.8 Accelerating HITS - 11.9 HITS Sensitivity Chapter 12. Other Link Methods for Ranking Webpages: 12.1 SALSA - 12.2 Hybrid Ranking Methods - 12.3 Rankings based on Traffic Flow Chapter 13. The Future of Web Information Retrieval: 13.1 Spam - 13.2 Personalization - 13.3 Clustering - 13.4 Intelligent Agents - 13.5 Trends and Time-Sensitive Search - 13.6 Privacy and Censorship - 13.7 Library Classification Schemes - 13.8 Data Fusion Chapter 14. Resources for Web Information Retrieval: 14.1 Resources for Getting Started - 14.2 Resources for Serious Study Chapter 15. The Mathematics Guide: 15.1 Linear Algebra - 15.2 Perron-Frobenius Theory - 15.3 Markov Chains - 15.4 Perron Complementation - 15.5 Stochastic Complementation - 15.6 Censoring - 15.7 Aggregation - 15.8 Disaggregation
    RSWK
    Google / Suchmaschine / Ranking (BVB)
    Subject
    Google / Suchmaschine / Ranking (BVB)
  11. Efthimiadis, E.N.: User choices : a new yardstick for the evaluation of ranking algorithms for interactive query expansion (1995) 0.05
    0.047970545 = product of:
      0.14391163 = sum of:
        0.13544871 = weight(_text_:ranking in 5697) [ClassicSimilarity], result of:
          0.13544871 = score(doc=5697,freq=10.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.66815823 = fieldWeight in 5697, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5697)
        0.008462917 = product of:
          0.025388751 = sum of:
            0.025388751 = weight(_text_:22 in 5697) [ClassicSimilarity], result of:
              0.025388751 = score(doc=5697,freq=2.0), product of:
                0.13124153 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03747799 = queryNorm
                0.19345059 = fieldWeight in 5697, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5697)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    The performance of 8 ranking algorithms was evaluated with respect to their effectiveness in ranking terms for query expansion. The evaluation was conducted within an investigation of interactive query expansion and relevance feedback in a real operational environment. Focuses on the identification of algorithms that most effectively take cognizance of user preferences. user choices (i.e. the terms selected by the searchers for the query expansion search) provided the yardstick for the evaluation of the 8 ranking algorithms. This methodology introduces a user oriented approach in evaluating ranking algorithms for query expansion in contrast to the standard, system oriented approaches. Similarities in the performance of the 8 algorithms and the ways these algorithms rank terms were the main focus of this evaluation. The findings demonstrate that the r-lohi, wpq, enim, and porter algorithms have similar performance in bringing good terms to the top of a ranked list of terms for query expansion. However, further evaluation of the algorithms in different (e.g. full text) environments is needed before these results can be generalized beyond the context of the present study
    Date
    22. 2.1996 13:14:10
  12. Fan, W.; Fox, E.A.; Pathak, P.; Wu, H.: ¬The effects of fitness functions an genetic programming-based ranking discovery for Web search (2004) 0.05
    0.045352414 = product of:
      0.13605724 = sum of:
        0.12590174 = weight(_text_:ranking in 2239) [ClassicSimilarity], result of:
          0.12590174 = score(doc=2239,freq=6.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.62106377 = fieldWeight in 2239, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.046875 = fieldNorm(doc=2239)
        0.0101555 = product of:
          0.030466499 = sum of:
            0.030466499 = weight(_text_:22 in 2239) [ClassicSimilarity], result of:
              0.030466499 = score(doc=2239,freq=2.0), product of:
                0.13124153 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03747799 = queryNorm
                0.23214069 = fieldWeight in 2239, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2239)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    Genetic-based evolutionary learning algorithms, such as genetic algorithms (GAs) and genetic programming (GP), have been applied to information retrieval (IR) since the 1980s. Recently, GP has been applied to a new IR taskdiscovery of ranking functions for Web search-and has achieved very promising results. However, in our prior research, only one fitness function has been used for GP-based learning. It is unclear how other fitness functions may affect ranking function discovery for Web search, especially since it is weIl known that choosing a proper fitness function is very important for the effectiveness and efficiency of evolutionary algorithms. In this article, we report our experience in contrasting different fitness function designs an GP-based learning using a very large Web corpus. Our results indicate that the design of fitness functions is instrumental in performance improvement. We also give recommendations an the design of fitness functions for genetic-based information retrieval experiments.
    Date
    31. 5.2004 19:22:06
  13. Costa Carvalho, A. da; Rossi, C.; Moura, E.S. de; Silva, A.S. da; Fernandes, D.: LePrEF: Learn to precompute evidence fusion for efficient query evaluation (2012) 0.04
    0.04322958 = product of:
      0.12968874 = sum of:
        0.121149 = weight(_text_:ranking in 278) [ClassicSimilarity], result of:
          0.121149 = score(doc=278,freq=8.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.5976189 = fieldWeight in 278, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0390625 = fieldNorm(doc=278)
        0.008539738 = product of:
          0.025619213 = sum of:
            0.025619213 = weight(_text_:29 in 278) [ClassicSimilarity], result of:
              0.025619213 = score(doc=278,freq=2.0), product of:
                0.13183585 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03747799 = queryNorm
                0.19432661 = fieldWeight in 278, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=278)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    State-of-the-art search engine ranking methods combine several distinct sources of relevance evidence to produce a high-quality ranking of results for each query. The fusion of information is currently done at query-processing time, which has a direct effect on the response time of search systems. Previous research also shows that an alternative to improve search efficiency in textual databases is to precompute term impacts at indexing time. In this article, we propose a novel alternative to precompute term impacts, providing a generic framework for combining any distinct set of sources of evidence by using a machine-learning technique. This method retains the advantages of producing high-quality results, but avoids the costs of combining evidence at query-processing time. Our method, called Learn to Precompute Evidence Fusion (LePrEF), uses genetic programming to compute a unified precomputed impact value for each term found in each document prior to query processing, at indexing time. Compared with previous research on precomputing term impacts, our method offers the advantage of providing a generic framework to precompute impact using any set of relevance evidence at any text collection, whereas previous research articles do not. The precomputed impact values are indexed and used later for computing document ranking at query-processing time. By doing so, our method effectively reduces the query processing to simple additions of such impacts. We show that this approach, while leading to results comparable to state-of-the-art ranking methods, also can lead to a significant decrease in computational costs during query processing.
    Date
    24. 6.2012 14:29:10
  14. Ravana, S.D.; Rajagopal, P.; Balakrishnan, V.: Ranking retrieval systems using pseudo relevance judgments (2015) 0.04
    0.038962167 = product of:
      0.1168865 = sum of:
        0.10491812 = weight(_text_:ranking in 2591) [ClassicSimilarity], result of:
          0.10491812 = score(doc=2591,freq=6.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.51755315 = fieldWeight in 2591, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2591)
        0.011968372 = product of:
          0.035905115 = sum of:
            0.035905115 = weight(_text_:22 in 2591) [ClassicSimilarity], result of:
              0.035905115 = score(doc=2591,freq=4.0), product of:
                0.13124153 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03747799 = queryNorm
                0.27358043 = fieldWeight in 2591, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2591)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    Purpose In a system-based approach, replicating the web would require large test collections, and judging the relevancy of all documents per topic in creating relevance judgment through human assessors is infeasible. Due to the large amount of documents that requires judgment, there are possible errors introduced by human assessors because of disagreements. The paper aims to discuss these issues. Design/methodology/approach This study explores exponential variation and document ranking methods that generate a reliable set of relevance judgments (pseudo relevance judgments) to reduce human efforts. These methods overcome problems with large amounts of documents for judgment while avoiding human disagreement errors during the judgment process. This study utilizes two key factors: number of occurrences of each document per topic from all the system runs; and document rankings to generate the alternate methods. Findings The effectiveness of the proposed method is evaluated using the correlation coefficient of ranked systems using mean average precision scores between the original Text REtrieval Conference (TREC) relevance judgments and pseudo relevance judgments. The results suggest that the proposed document ranking method with a pool depth of 100 could be a reliable alternative to reduce human effort and disagreement errors involved in generating TREC-like relevance judgments. Originality/value Simple methods proposed in this study show improvement in the correlation coefficient in generating alternate relevance judgment without human assessors while contributing to information retrieval evaluation.
    Date
    20. 1.2015 18:30:22
    18. 9.2018 18:22:56
  15. Mayr, P.: Bradfordizing als Re-Ranking-Ansatz in Literaturinformationssystemen (2011) 0.04
    0.037682008 = product of:
      0.11304602 = sum of:
        0.102798335 = weight(_text_:ranking in 4292) [ClassicSimilarity], result of:
          0.102798335 = score(doc=4292,freq=4.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.5070964 = fieldWeight in 4292, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.046875 = fieldNorm(doc=4292)
        0.010247685 = product of:
          0.030743055 = sum of:
            0.030743055 = weight(_text_:29 in 4292) [ClassicSimilarity], result of:
              0.030743055 = score(doc=4292,freq=2.0), product of:
                0.13183585 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03747799 = queryNorm
                0.23319192 = fieldWeight in 4292, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4292)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    In diesem Artikel wird ein Re-Ranking-Ansatz für Suchsysteme vorgestellt, der die Recherche nach wissenschaftlicher Literatur messbar verbessern kann. Das nichttextorientierte Rankingverfahren Bradfordizing wird eingeführt und anschließend im empirischen Teil des Artikels bzgl. der Effektivität für typische fachbezogene Recherche-Topics evaluiert. Dem Bradford Law of Scattering (BLS), auf dem Bradfordizing basiert, liegt zugrunde, dass sich die Literatur zu einem beliebigen Fachgebiet bzw. -thema in Zonen unterschiedlicher Dokumentenkonzentration verteilt. Dem Kernbereich mit hoher Konzentration der Literatur folgen Bereiche mit mittlerer und geringer Konzentration. Bradfordizing sortiert bzw. rankt eine Dokumentmenge damit nach den sogenannten Kernzeitschriften. Der Retrievaltest mit 164 intellektuell bewerteten Fragestellungen in Fachdatenbanken aus den Bereichen Sozial- und Politikwissenschaften, Wirtschaftswissenschaften, Psychologie und Medizin zeigt, dass die Dokumente der Kernzeitschriften signifikant häufiger relevant bewertet werden als Dokumente der zweiten Dokumentzone bzw. den Peripherie-Zeitschriften. Die Implementierung von Bradfordizing und weiteren Re-Rankingverfahren liefert unmittelbare Mehrwerte für den Nutzer.
    Date
    9. 2.2011 17:47:29
  16. Wei, F.; Li, W.; Liu, S.: iRANK: a rank-learn-combine framework for unsupervised ensemble ranking (2010) 0.04
    0.03640075 = product of:
      0.21840449 = sum of:
        0.21840449 = weight(_text_:ranking in 3472) [ClassicSimilarity], result of:
          0.21840449 = score(doc=3472,freq=26.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            1.0773728 = fieldWeight in 3472, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3472)
      0.16666667 = coord(1/6)
    
    Abstract
    The authors address the problem of unsupervised ensemble ranking. Traditional approaches either combine multiple ranking criteria into a unified representation to obtain an overall ranking score or to utilize certain rank fusion or aggregation techniques to combine the ranking results. Beyond the aforementioned combine-then-rank and rank-then-combine approaches, the authors propose a novel rank-learn-combine ranking framework, called Interactive Ranking (iRANK), which allows two base rankers to teach each other before combination during the ranking process by providing their own ranking results as feedback to the others to boost the ranking performance. This mutual ranking refinement process continues until the two base rankers cannot learn from each other any more. The overall performance is improved by the enhancement of the base rankers through the mutual learning mechanism. The authors further design two ranking refinement strategies to efficiently and effectively use the feedback based on reasonable assumptions and rational analysis. Although iRANK is applicable to many applications, as a case study, they apply this framework to the sentence ranking problem in query-focused summarization and evaluate its effectiveness on the DUC 2005 and 2006 data sets. The results are encouraging with consistent and promising improvements.
  17. Jiang, X.; Sun, X.; Yang, Z.; Zhuge, H.; Lapshinova-Koltunski, E.; Yao, J.: Exploiting heterogeneous scientific literature networks to combat ranking bias : evidence from the computational linguistics area (2016) 0.04
    0.03640075 = product of:
      0.21840449 = sum of:
        0.21840449 = weight(_text_:ranking in 3017) [ClassicSimilarity], result of:
          0.21840449 = score(doc=3017,freq=26.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            1.0773728 = fieldWeight in 3017, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3017)
      0.16666667 = coord(1/6)
    
    Abstract
    It is important to help researchers find valuable papers from a large literature collection. To this end, many graph-based ranking algorithms have been proposed. However, most of these algorithms suffer from the problem of ranking bias. Ranking bias hurts the usefulness of a ranking algorithm because it returns a ranking list with an undesirable time distribution. This paper is a focused study on how to alleviate ranking bias by leveraging the heterogeneous network structure of the literature collection. We propose a new graph-based ranking algorithm, MutualRank, that integrates mutual reinforcement relationships among networks of papers, researchers, and venues to achieve a more synthetic, accurate, and less-biased ranking than previous methods. MutualRank provides a unified model that involves both intra- and inter-network information for ranking papers, researchers, and venues simultaneously. We use the ACL Anthology Network as the benchmark data set and construct the gold standard from computer linguistics course websites of well-known universities and two well-known textbooks. The experimental results show that MutualRank greatly outperforms the state-of-the-art competitors, including PageRank, HITS, CoRank, Future Rank, and P-Rank, in ranking papers in both improving ranking effectiveness and alleviating ranking bias. Rankings of researchers and venues by MutualRank are also quite reasonable.
  18. Lewandowski, D.: How can library materials be ranked in the OPAC? (2009) 0.03
    0.03348382 = product of:
      0.2009029 = sum of:
        0.2009029 = weight(_text_:ranking in 2810) [ClassicSimilarity], result of:
          0.2009029 = score(doc=2810,freq=22.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.9910388 = fieldWeight in 2810, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2810)
      0.16666667 = coord(1/6)
    
    Abstract
    Some Online Public Access Catalogues offer a ranking component. However, ranking there is merely text-based and is doomed to fail due to limited text in bibliographic data. The main assumption for the talk is that we are in a situation where the appropriate ranking factors for OPACs should be defined, while the implementation is no major problem. We must define what we want, and not so much focus on the technical work. Some deep thinking is necessary on the "perfect results set" and how we can achieve it through ranking. The talk presents a set of potential ranking factors and clustering possibilities for further discussion. A look at commercial Web search engines could provide us with ideas how ranking can be improved with additional factors. Search engines are way beyond pure text-based ranking and apply ranking factors in the groups like popularity, freshness, personalisation, etc. The talk describes the main factors used in search engines and how derivatives of these could be used for libraries' purposes. The goal of ranking is to provide the user with the best-suitable results on top of the results list. How can this goal be achieved with the library catalogue and also concerning the library's different collections and databases? The assumption is that ranking of such materials is a complex problem and is yet nowhere near solved. Libraries should focus on ranking to improve user experience.
  19. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (1999) 0.03
    0.032427065 = product of:
      0.19456238 = sum of:
        0.19456238 = weight(_text_:suchmaschine in 5777) [ClassicSimilarity], result of:
          0.19456238 = score(doc=5777,freq=12.0), product of:
            0.21191008 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.03747799 = queryNorm
            0.9181365 = fieldWeight in 5777, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
      0.16666667 = coord(1/6)
    
    RSWK
    Suchmaschine / Information Retrieval
    World Wide Web / Suchmaschine / Mathematisches Modell (BVB)
    Suchmaschine / Information Retrieval / Mathematisches Modell (HEBIS)
    Subject
    Suchmaschine / Information Retrieval
    World Wide Web / Suchmaschine / Mathematisches Modell (BVB)
    Suchmaschine / Information Retrieval / Mathematisches Modell (HEBIS)
  20. Robertson, S.E.; Belkin, N.J.: Ranking in principle (1978) 0.03
    0.032306403 = product of:
      0.1938384 = sum of:
        0.1938384 = weight(_text_:ranking in 1143) [ClassicSimilarity], result of:
          0.1938384 = score(doc=1143,freq=2.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.95619017 = fieldWeight in 1143, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.125 = fieldNorm(doc=1143)
      0.16666667 = coord(1/6)
    

Years

Languages

  • e 132
  • d 20
  • pt 1
  • More… Less…

Types

  • a 137
  • m 7
  • el 3
  • x 3
  • r 2
  • s 2
  • p 1
  • More… Less…