Search (167 results, page 1 of 9)

  • theme_ss:"Retrievalalgorithmen"
  1. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (1999) 0.06
    0.06046185 = product of:
      0.30230924 = sum of:
        0.16425991 = weight(_text_:suchmaschine in 5777) [ClassicSimilarity], result of:
          0.16425991 = score(doc=5777,freq=12.0), product of:
            0.17890577 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.031640913 = queryNorm
            0.9181365 = fieldWeight in 5777, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
        0.09337013 = weight(_text_:software in 5777) [ClassicSimilarity], result of:
          0.09337013 = score(doc=5777,freq=16.0), product of:
            0.12552431 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.031640913 = queryNorm
            0.743841 = fieldWeight in 5777, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
        0.044679187 = weight(_text_:web in 5777) [ClassicSimilarity], result of:
          0.044679187 = score(doc=5777,freq=8.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.43268442 = fieldWeight in 5777, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
      0.2 = coord(3/15)
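    The score breakdown above is Lucene ClassicSimilarity (tf-idf) explain output. As a check, the per-term weight for "suchmaschine" in document 5777 can be reproduced from the factors shown; the sketch below uses only the numbers from the explanation, and the variable names are ours, not Lucene API:

```python
import math

# Factors copied from the explain output above (doc 5777, term "suchmaschine").
freq = 12.0              # term frequency in the field
doc_freq = 420           # documents containing the term
max_docs = 44218         # documents in the index
query_norm = 0.031640913
field_norm = 0.046875    # length normalization stored for the field

tf = math.sqrt(freq)                              # 3.4641016
idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # 5.6542544
query_weight = idf * query_norm                   # 0.17890577 = queryWeight
field_weight = tf * idf * field_norm              # 0.9181365  = fieldWeight
term_score = query_weight * field_weight          # 0.16425991 = weight(_text_:suchmaschine)

# The document score sums the matching term weights (0.30230924) and applies
# the coordination factor coord(3/15) = 0.2, giving 0.06046185.
print(round(term_score, 8))
```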
    
    Abstract
    This book discusses many of the key design issues for building search engines and emphasizes the important role that applied mathematics can play in improving information retrieval. The authors discuss not only important data structures, algorithms, and software but also user-centered issues such as interfaces, manual indexing, and document preparation. They also present some of the current problems in information retrieval that may not be familiar to applied mathematicians and computer scientists and some of the driving computational methods (SVD, SDD) for automated conceptual indexing
    Classification
    ST 230 [Informatik # Monographien # Software und -entwicklung # Software allgemein, (Einführung, Lehrbücher, Methoden der Programmierung) Software engineering, Programmentwicklungssysteme, Softwarewerkzeuge]
    LCSH
    Web search engines
    RSWK
    Suchmaschine / Information Retrieval
    World Wide Web / Suchmaschine / Mathematisches Modell (BVB)
    Suchmaschine / Information Retrieval / Mathematisches Modell (HEBIS)
    RVK
    ST 230 [Informatik # Monographien # Software und -entwicklung # Software allgemein, (Einführung, Lehrbücher, Methoden der Programmierung) Software engineering, Programmentwicklungssysteme, Softwarewerkzeuge]
    Series
    Software, environments, tools; 8
    Subject
    Suchmaschine / Information Retrieval
    World Wide Web / Suchmaschine / Mathematisches Modell (BVB)
    Suchmaschine / Information Retrieval / Mathematisches Modell (HEBIS)
    Web search engines
  2. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 0.05
    0.04618574 = product of:
      0.17319651 = sum of:
        0.08941177 = weight(_text_:suchmaschine in 7) [ClassicSimilarity], result of:
          0.08941177 = score(doc=7,freq=8.0), product of:
            0.17890577 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.031640913 = queryNorm
            0.4997702 = fieldWeight in 7, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
        0.038118195 = weight(_text_:software in 7) [ClassicSimilarity], result of:
          0.038118195 = score(doc=7,freq=6.0), product of:
            0.12552431 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.031640913 = queryNorm
            0.3036718 = fieldWeight in 7, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
        0.024604581 = weight(_text_:evaluation in 7) [ClassicSimilarity], result of:
          0.024604581 = score(doc=7,freq=2.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.18538132 = fieldWeight in 7, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
        0.021061972 = weight(_text_:web in 7) [ClassicSimilarity], result of:
          0.021061972 = score(doc=7,freq=4.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.2039694 = fieldWeight in 7, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
      0.26666668 = coord(4/15)
    
    Abstract
    The second edition of Understanding Search Engines: Mathematical Modeling and Text Retrieval follows the basic premise of the first edition by discussing many of the key design issues for building search engines and emphasizing the important role that applied mathematics can play in improving information retrieval. The authors discuss important data structures, algorithms, and software as well as user-centered issues such as interfaces, manual indexing, and document preparation. Significant changes bring the text up to date on current information retrieval methods: for example the addition of a new chapter on link-structure algorithms used in search engines such as Google. The chapter on user interface has been rewritten to specifically focus on search engine usability. In addition the authors have added new recommendations for further reading and expanded the bibliography, and have updated and streamlined the index to make it more reader friendly.
    Content
    Contents: Introduction Document File Preparation - Manual Indexing - Information Extraction - Vector Space Modeling - Matrix Decompositions - Query Representations - Ranking and Relevance Feedback - Searching by Link Structure - User Interface - Book Format Document File Preparation Document Purification and Analysis - Text Formatting - Validation - Manual Indexing - Automatic Indexing - Item Normalization - Inverted File Structures - Document File - Dictionary List - Inversion List - Other File Structures Vector Space Models Construction - Term-by-Document Matrices - Simple Query Matching - Design Issues - Term Weighting - Sparse Matrix Storage - Low-Rank Approximations Matrix Decompositions QR Factorization - Singular Value Decomposition - Low-Rank Approximations - Query Matching - Software - Semidiscrete Decomposition - Updating Techniques Query Management Query Binding - Types of Queries - Boolean Queries - Natural Language Queries - Thesaurus Queries - Fuzzy Queries - Term Searches - Probabilistic Queries Ranking and Relevance Feedback Performance Evaluation - Precision - Recall - Average Precision - Genetic Algorithms - Relevance Feedback Searching by Link Structure HITS Method - HITS Implementation - HITS Summary - PageRank Method - PageRank Adjustments - PageRank Implementation - PageRank Summary User Interface Considerations General Guidelines - Search Engine Interfaces - Form Fill-in - Display Considerations - Progress Indication - No Penalties for Error - Results - Test and Retest - Final Considerations Further Reading
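    The vector-space and matrix-decomposition chapters listed above (term-by-document matrices, SVD, low-rank approximations, query matching) can be illustrated with a small sketch. The toy matrix, vocabulary, query, and rank k below are invented for illustration and are not code from the book:

```python
import numpy as np

# Toy term-by-document matrix A (rows = terms, columns = documents).
A = np.array([[1.0, 0.0, 1.0],    # "search"
              [1.0, 0.0, 0.0],    # "engine"
              [0.0, 1.0, 1.0],    # "matrix"
              [1.0, 1.0, 0.0]])   # "ranking"

# Rank-k approximation via the singular value decomposition.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

# Fold the query "search ranking" into the reduced space and rank documents
# by cosine similarity, as in latent semantic indexing style query matching.
q = np.array([1.0, 0.0, 0.0, 1.0])
q_k = np.diag(1.0 / sk) @ Uk.T @ q
doc_k = Vtk.T
scores = doc_k @ q_k / (np.linalg.norm(doc_k, axis=1) * np.linalg.norm(q_k))
print(np.argsort(-scores))        # document indices, best match first
```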
    LCSH
    Web search engines
    RSWK
    Suchmaschine / Information Retrieval
    Suchmaschine / Information Retrieval / Mathematisches Modell (HEBIS)
    Series
    Software, environments, tools; 17
    Subject
    Web search engines
    Suchmaschine / Information Retrieval
    Suchmaschine / Information Retrieval / Mathematisches Modell (HEBIS)
  3. Shiri, A.A.; Revie, C.: Query expansion behavior within a thesaurus-enhanced search environment : a user-centered evaluation (2006) 0.03
    0.026756855 = product of:
      0.1003382 = sum of:
        0.027509436 = weight(_text_:software in 56) [ClassicSimilarity], result of:
          0.027509436 = score(doc=56,freq=2.0), product of:
            0.12552431 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.031640913 = queryNorm
            0.21915624 = fieldWeight in 56, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.0390625 = fieldNorm(doc=56)
        0.043495167 = weight(_text_:evaluation in 56) [ClassicSimilarity], result of:
          0.043495167 = score(doc=56,freq=4.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.327711 = fieldWeight in 56, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.0390625 = fieldNorm(doc=56)
        0.01861633 = weight(_text_:web in 56) [ClassicSimilarity], result of:
          0.01861633 = score(doc=56,freq=2.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.18028519 = fieldWeight in 56, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=56)
        0.010717267 = product of:
          0.021434534 = sum of:
            0.021434534 = weight(_text_:22 in 56) [ClassicSimilarity], result of:
              0.021434534 = score(doc=56,freq=2.0), product of:
                0.110801086 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.031640913 = queryNorm
                0.19345059 = fieldWeight in 56, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=56)
          0.5 = coord(1/2)
      0.26666668 = coord(4/15)
    
    Abstract
    The study reported here investigated the query expansion behavior of end-users interacting with a thesaurus-enhanced search system on the Web. Two groups, namely academic staff and postgraduate students, were recruited into this study. Data were collected from 90 searches performed by 30 users using the OVID interface to the CAB abstracts database. Data-gathering techniques included questionnaires, screen capturing software, and interviews. The results presented here relate to issues of search-topic and search-term characteristics, number and types of expanded queries, usefulness of thesaurus terms, and behavioral differences between academic staff and postgraduate students in their interaction. The key conclusions drawn were that (a) academic staff chose more narrow and synonymous terms than did postgraduate students, who generally selected broader and related terms; (b) topic complexity affected users' interaction with the thesaurus in that complex topics required more query expansion and search term selection; (c) users' prior topic-search experience appeared to have a significant effect on their selection and evaluation of thesaurus terms; (d) in 50% of the searches where additional terms were suggested from the thesaurus, users stated that they had not been aware of the terms at the beginning of the search; this observation was particularly noticeable in the case of postgraduate students.
    Date
    22. 7.2006 16:32:43
  4. Courtois, M.P.; Berry, M.W.: Results ranking in Web search engines (1999) 0.02
    0.022951564 = product of:
      0.17213672 = sum of:
        0.13490407 = sum of:
          0.03219906 = weight(_text_:online in 3726) [ClassicSimilarity], result of:
            0.03219906 = score(doc=3726,freq=2.0), product of:
              0.096027054 = queryWeight, product of:
                3.0349014 = idf(docFreq=5778, maxDocs=44218)
                0.031640913 = queryNorm
              0.33531237 = fieldWeight in 3726, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.0349014 = idf(docFreq=5778, maxDocs=44218)
                0.078125 = fieldNorm(doc=3726)
          0.10270501 = weight(_text_:recherche in 3726) [ClassicSimilarity], result of:
            0.10270501 = score(doc=3726,freq=2.0), product of:
              0.17150146 = queryWeight, product of:
                5.4202437 = idf(docFreq=531, maxDocs=44218)
                0.031640913 = queryNorm
              0.59885794 = fieldWeight in 3726, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.4202437 = idf(docFreq=531, maxDocs=44218)
                0.078125 = fieldNorm(doc=3726)
        0.03723266 = weight(_text_:web in 3726) [ClassicSimilarity], result of:
          0.03723266 = score(doc=3726,freq=2.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.36057037 = fieldWeight in 3726, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.078125 = fieldNorm(doc=3726)
      0.13333334 = coord(2/15)
    
    Abstract
    Comparison of the ranking methods of five search engines (AltaVista, HotBot, Excite, Infoseek, and Lycos). Tested are the presence of all query words, term proximity, and term location
    Source
    Online. 23(1999) no.3, S.39-46
  5. Thelwall, M.; Vaughan, L.: New versions of PageRank employing alternative Web document models (2004) 0.02
    0.020364448 = product of:
      0.15273336 = sum of:
        0.063185915 = weight(_text_:web in 674) [ClassicSimilarity], result of:
          0.063185915 = score(doc=674,freq=16.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.6119082 = fieldWeight in 674, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=674)
        0.08954745 = weight(_text_:site in 674) [ClassicSimilarity], result of:
          0.08954745 = score(doc=674,freq=4.0), product of:
            0.1738463 = queryWeight, product of:
              5.494352 = idf(docFreq=493, maxDocs=44218)
              0.031640913 = queryNorm
            0.5150955 = fieldWeight in 674, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.494352 = idf(docFreq=493, maxDocs=44218)
              0.046875 = fieldNorm(doc=674)
      0.13333334 = coord(2/15)
    
    Abstract
    Introduces several new versions of PageRank (the link-based Web page ranking algorithm), based on an information science perspective on the concept of the Web document. Although the Web page is the typical indivisible unit of information in search engine results and most Web information retrieval algorithms, other research has suggested that aggregating pages based on directories and domains gives promising alternatives, particularly when Web links are the object of study. The new algorithms introduced based on these alternatives were used to rank four sets of Web pages. The ranking results were compared with human subjects' rankings. The results of the tests were somewhat inconclusive: the new approach worked well for the set that includes pages from different Web sites; however, it did not work well in ranking pages that are from the same site. It seems that the new algorithms may be effective for some tasks but not for others, especially when only low numbers of links are involved or the pages to be ranked are from the same site or directory.
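    One way to picture the aggregation idea described above is to collapse the page-level link graph to site-level (or directory-level) nodes before ranking. The sketch below is illustrative only; the URLs and the domain-based grouping rule are assumptions, not the authors' implementation:

```python
from collections import defaultdict
from urllib.parse import urlparse

# Toy page-level link graph; the URLs are invented for illustration.
page_links = {
    "http://a.example/dept/x.html": ["http://b.example/index.html",
                                     "http://a.example/dept/y.html"],
    "http://a.example/dept/y.html": ["http://b.example/papers/p1.html"],
    "http://b.example/index.html":  ["http://a.example/dept/x.html"],
}

def site_of(url):
    # Domain-level document model; a directory-level model would also keep
    # the first path segment.
    return urlparse(url).netloc

# Collapse the graph to site-level nodes, dropping links internal to a site.
site_links = defaultdict(set)
for page, targets in page_links.items():
    for target in targets:
        if site_of(page) != site_of(target):
            site_links[site_of(page)].add(site_of(target))

print(dict(site_links))   # PageRank (or a variant) is then computed on this graph
```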
  6. Joss, M.W.; Wszola, S.: ¬The engines that can : text search and retrieval software, their strategies, and vendors (1996) 0.02
    0.017708618 = product of:
      0.08854309 = sum of:
        0.009659718 = product of:
          0.019319436 = sum of:
            0.019319436 = weight(_text_:online in 5123) [ClassicSimilarity], result of:
              0.019319436 = score(doc=5123,freq=2.0), product of:
                0.096027054 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.031640913 = queryNorm
                0.20118743 = fieldWeight in 5123, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5123)
          0.5 = coord(1/2)
        0.06602265 = weight(_text_:software in 5123) [ClassicSimilarity], result of:
          0.06602265 = score(doc=5123,freq=8.0), product of:
            0.12552431 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.031640913 = queryNorm
            0.525975 = fieldWeight in 5123, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.046875 = fieldNorm(doc=5123)
        0.01286072 = product of:
          0.02572144 = sum of:
            0.02572144 = weight(_text_:22 in 5123) [ClassicSimilarity], result of:
              0.02572144 = score(doc=5123,freq=2.0), product of:
                0.110801086 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.031640913 = queryNorm
                0.23214069 = fieldWeight in 5123, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5123)
          0.5 = coord(1/2)
      0.2 = coord(3/15)
    
    Abstract
    Traces the development of text searching and retrieval software designed to cope with the increasing demands made by the storage and handling of large amounts of data, recorded on high-capacity data storage media, from CD-ROM to multi-gigabyte storage media and online information services, with particular reference to the need to cope with graphics as well as conventional ASCII text. Includes details of: Boolean searching, fuzzy searching and matching; relevance ranking; proximity searching and improved strategies for dealing with text searching in very large databases. Concludes that the best searching tools for CD-ROM publishers are those optimized for searching and retrieval on CD-ROM. CD-ROM drives have relatively slow random seek times compared with hard discs, and so the software most appropriate to the medium is that which can effectively arrange the indexes and text on the CD-ROM to avoid continuous random access searching. Lists and reviews a selection of software packages designed to achieve the sort of results required for rapid CD-ROM searching
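    A minimal sketch of the inverted-file and Boolean-searching machinery the article surveys; the documents and query are invented, and real CD-ROM engines add compression, proximity operators, and relevance ranking on top of this:

```python
from collections import defaultdict

# Toy document collection; the texts and the query are invented.
docs = {
    1: "boolean search on cdrom media",
    2: "fuzzy matching and proximity search",
    3: "relevance ranking for large databases",
}

# Build an inverted file: term -> sorted posting list of document ids.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)
postings = {term: sorted(ids) for term, ids in index.items()}

def boolean_and(*terms):
    """Intersect posting lists for a conjunctive (AND) query."""
    result = None
    for term in terms:
        ids = set(postings.get(term, []))
        result = ids if result is None else result & ids
    return sorted(result or [])

print(boolean_and("search", "proximity"))   # -> [2]
```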
    Date
    12. 9.1996 13:56:22
  7. Kanaeva, Z.: Ranking: Google und CiteSeer (2005) 0.02
    0.016752748 = product of:
      0.12564561 = sum of:
        0.11064143 = weight(_text_:suchmaschine in 3276) [ClassicSimilarity], result of:
          0.11064143 = score(doc=3276,freq=4.0), product of:
            0.17890577 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.031640913 = queryNorm
            0.6184341 = fieldWeight in 3276, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3276)
        0.015004174 = product of:
          0.030008348 = sum of:
            0.030008348 = weight(_text_:22 in 3276) [ClassicSimilarity], result of:
              0.030008348 = score(doc=3276,freq=2.0), product of:
                0.110801086 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.031640913 = queryNorm
                0.2708308 = fieldWeight in 3276, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3276)
          0.5 = coord(1/2)
      0.13333334 = coord(2/15)
    
    Abstract
    Within classical information retrieval, various methods have been developed for ranking and for searching in a homogeneous, unstructured document collection. The success of the Google search engine has shown that searching an inhomogeneous but interlinked document collection such as the Internet can be very effective when the links between documents are taken into account. Among the concepts realized by the Google search engine is a method for ranking search results (PageRank), which is briefly explained in this article. In addition, the article discusses the concepts of a system called CiteSeer, which automatically indexes bibliographic references (Autonomous Citation Indexing, ACI). The latter turns a set of unconnected scientific documents into an interlinked document collection and makes it possible to apply ranking methods based on those used by Google.
    Date
    20. 3.2005 16:23:22
  8. Austin, D.: How Google finds your needle in the Web's haystack : as we'll see, the trick is to ask the web itself to rank the importance of pages... (2006) 0.02
    0.015528813 = product of:
      0.077644065 = sum of:
        0.019256605 = weight(_text_:software in 93) [ClassicSimilarity], result of:
          0.019256605 = score(doc=93,freq=2.0), product of:
            0.12552431 = queryWeight, product of:
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.031640913 = queryNorm
            0.15340936 = fieldWeight in 93, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.9671519 = idf(docFreq=2274, maxDocs=44218)
              0.02734375 = fieldNorm(doc=93)
        0.02152901 = weight(_text_:evaluation in 93) [ClassicSimilarity], result of:
          0.02152901 = score(doc=93,freq=2.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.16220866 = fieldWeight in 93, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.02734375 = fieldNorm(doc=93)
        0.036858454 = weight(_text_:web in 93) [ClassicSimilarity], result of:
          0.036858454 = score(doc=93,freq=16.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.35694647 = fieldWeight in 93, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=93)
      0.2 = coord(3/15)
    
    Abstract
    Imagine a library containing 25 billion documents but with no centralized organization and no librarians. In addition, anyone may add a document at any time without telling anyone. You may feel sure that one of the documents contained in the collection has a piece of information that is vitally important to you, and, being impatient like most of us, you'd like to find it in a matter of seconds. How would you go about doing it? Posed in this way, the problem seems impossible. Yet this description is not too different from the World Wide Web, a huge, highly-disorganized collection of documents in many different formats. Of course, we're all familiar with search engines (perhaps you found this article using one) so we know that there is a solution. This article will describe Google's PageRank algorithm and how it returns pages from the web's collection of 25 billion documents that match search criteria so well that "google" has become a widely used verb. Most search engines, including Google, continually run an army of computer programs that retrieve pages from the web, index the words in each document, and store this information in an efficient format. Each time a user asks for a web search using a search phrase, such as "search engine," the search engine determines all the pages on the web that contain the words in the search phrase. (Perhaps additional information such as the distance between the words "search" and "engine" will be noted as well.) Here is the problem: Google now claims to index 25 billion pages. Roughly 95% of the text in web pages is composed of a mere 10,000 words. This means that, for most searches, there will be a huge number of pages containing the words in the search phrase. What is needed is a means of ranking the importance of the pages that fit the search criteria so that the pages can be sorted with the most important pages at the top of the list. One way to determine the importance of pages is to use a human-generated ranking. For instance, you may have seen pages that consist mainly of a large number of links to other resources in a particular area of interest. Assuming the person maintaining this page is reliable, the pages referenced are likely to be useful. Of course, the list may quickly fall out of date, and the person maintaining the list may miss some important pages, either unintentionally or as a result of an unstated bias. Google's PageRank algorithm assesses the importance of web pages without human evaluation of the content. In fact, Google feels that the value of its service is largely in its ability to provide unbiased results to search queries; Google claims, "the heart of our software is PageRank." As we'll see, the trick is to ask the web itself to rank the importance of pages.
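    A compact sketch of the PageRank idea the article describes: rank flows along links, and a damping ("teleportation") factor keeps the iteration well behaved. The four-page link graph and alpha = 0.85 are illustrative assumptions, not Google's data:

```python
import numpy as np

# Toy link graph: page i links to the listed pages (indices are illustrative).
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
n = 4

# Column-stochastic matrix: column j spreads page j's rank over its outlinks.
M = np.zeros((n, n))
for src, targets in links.items():
    for dst in targets:
        M[dst, src] = 1.0 / len(targets)

# Power iteration with damping factor alpha (the random-surfer teleportation).
alpha = 0.85
rank = np.full(n, 1.0 / n)
for _ in range(100):
    rank = alpha * (M @ rank) + (1.0 - alpha) / n

print(np.round(rank, 4))   # sorting pages by this vector gives the PageRank order
```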
  9. Back, J.: ¬An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.02
    0.015483252 = product of:
      0.116124384 = sum of:
        0.08611604 = weight(_text_:evaluation in 3445) [ClassicSimilarity], result of:
          0.08611604 = score(doc=3445,freq=2.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.64883465 = fieldWeight in 3445, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.109375 = fieldNorm(doc=3445)
        0.030008348 = product of:
          0.060016695 = sum of:
            0.060016695 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
              0.060016695 = score(doc=3445,freq=2.0), product of:
                0.110801086 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.031640913 = queryNorm
                0.5416616 = fieldWeight in 3445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3445)
          0.5 = coord(1/2)
      0.13333334 = coord(2/15)
    
    Date
    25. 8.2005 17:42:22
  10. Chen, H.; Lally, A.M.; Zhu, B.; Chau, M.: HelpfulMed : Intelligent searching for medical information over the Internet (2003) 0.02
    0.015207631 = product of:
      0.07603815 = sum of:
        0.008049765 = product of:
          0.01609953 = sum of:
            0.01609953 = weight(_text_:online in 1615) [ClassicSimilarity], result of:
              0.01609953 = score(doc=1615,freq=2.0), product of:
                0.096027054 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.031640913 = queryNorm
                0.16765618 = fieldWeight in 1615, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1615)
          0.5 = coord(1/2)
        0.030755727 = weight(_text_:evaluation in 1615) [ClassicSimilarity], result of:
          0.030755727 = score(doc=1615,freq=2.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.23172665 = fieldWeight in 1615, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1615)
        0.03723266 = weight(_text_:web in 1615) [ClassicSimilarity], result of:
          0.03723266 = score(doc=1615,freq=8.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.36057037 = fieldWeight in 1615, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1615)
      0.2 = coord(3/15)
    
    Abstract
    Medical professionals and researchers need information from reputable sources to accomplish their work. Unfortunately, the Web has a large number of documents that are irrelevant to their work, even those documents that purport to be "medically-related." This paper describes an architecture designed to integrate advanced searching and indexing algorithms, an automatic thesaurus, or "concept space," and Kohonen-based Self-Organizing Map (SOM) technologies to provide searchers with fine-grained results. Initial results indicate that these systems provide complementary retrieval functionalities. HelpfulMed not only allows users to search Web pages and other online databases, but also allows them to build searches through the use of an automatic thesaurus and browse a graphical display of medical-related topics. Evaluation results for each of the different components are included. Our spidering algorithm outperformed both breadth-first search and PageRank spiders on a test collection of 100,000 Web pages. The automatically generated thesaurus performed as well as both MeSH and UMLS, systems which require human mediation for currency. Lastly, a variant of the Kohonen SOM was comparable to MeSH terms in perceived cluster precision and significantly better at perceived cluster recall.
    Footnote
    Teil eines Themenheftes: "Web retrieval and mining: A machine learning perspective"
  11. Thelwall, M.: Can Google's PageRank be used to find the most important academic Web pages? (2003) 0.02
    0.0151029965 = product of:
      0.113272466 = sum of:
        0.049952857 = weight(_text_:web in 4457) [ClassicSimilarity], result of:
          0.049952857 = score(doc=4457,freq=10.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.48375595 = fieldWeight in 4457, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4457)
        0.06331961 = weight(_text_:site in 4457) [ClassicSimilarity], result of:
          0.06331961 = score(doc=4457,freq=2.0), product of:
            0.1738463 = queryWeight, product of:
              5.494352 = idf(docFreq=493, maxDocs=44218)
              0.031640913 = queryNorm
            0.3642275 = fieldWeight in 4457, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.494352 = idf(docFreq=493, maxDocs=44218)
              0.046875 = fieldNorm(doc=4457)
      0.13333334 = coord(2/15)
    
    Abstract
    Google's PageRank is an influential algorithm that uses a model of Web use that is dominated by its link structure in order to rank pages by their estimated value to the Web community. This paper reports on the outcome of applying the algorithm to the Web sites of three national university systems in order to test whether it is capable of identifying the most important Web pages. The results are also compared with simple inlink counts. It was discovered that the highest inlinked pages do not always have the highest PageRank, indicating that the two metrics are genuinely different, even for the top pages. More significantly, however, internal links dominated external links for the high ranks in either method and superficial reasons accounted for high scores in both cases. It is concluded that PageRank is not useful for identifying the top pages in a site and that it must be combined with powerful text matching techniques in order to get the quality of information retrieval results provided by Google.
  12. Dominich, S.: Mathematical foundations of information retrieval (2001) 0.01
    0.014147849 = product of:
      0.10610886 = sum of:
        0.095391594 = sum of:
          0.022768175 = weight(_text_:online in 1753) [ClassicSimilarity], result of:
            0.022768175 = score(doc=1753,freq=4.0), product of:
              0.096027054 = queryWeight, product of:
                3.0349014 = idf(docFreq=5778, maxDocs=44218)
                0.031640913 = queryNorm
              0.23710167 = fieldWeight in 1753, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.0349014 = idf(docFreq=5778, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1753)
          0.07262342 = weight(_text_:recherche in 1753) [ClassicSimilarity], result of:
            0.07262342 = score(doc=1753,freq=4.0), product of:
              0.17150146 = queryWeight, product of:
                5.4202437 = idf(docFreq=531, maxDocs=44218)
                0.031640913 = queryNorm
              0.42345655 = fieldWeight in 1753, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.4202437 = idf(docFreq=531, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1753)
        0.010717267 = product of:
          0.021434534 = sum of:
            0.021434534 = weight(_text_:22 in 1753) [ClassicSimilarity], result of:
              0.021434534 = score(doc=1753,freq=2.0), product of:
                0.110801086 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.031640913 = queryNorm
                0.19345059 = fieldWeight in 1753, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1753)
          0.5 = coord(1/2)
      0.13333334 = coord(2/15)
    
    Date
    22. 3.2008 12:26:32
    RSWK
    Online-Recherche / Mathematische Methode
    Subject
    Online-Recherche / Mathematische Methode
  13. Stock, M.; Stock, W.G.: Internet-Suchwerkzeuge im Vergleich (IV) : Relevance Ranking nach "Popularität" von Webseiten: Google (2001) 0.01
    0.014100287 = product of:
      0.10575215 = sum of:
        0.06705883 = weight(_text_:suchmaschine in 5771) [ClassicSimilarity], result of:
          0.06705883 = score(doc=5771,freq=2.0), product of:
            0.17890577 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.031640913 = queryNorm
            0.37482765 = fieldWeight in 5771, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.046875 = fieldNorm(doc=5771)
        0.038693316 = weight(_text_:web in 5771) [ClassicSimilarity], result of:
          0.038693316 = score(doc=5771,freq=6.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.37471575 = fieldWeight in 5771, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5771)
      0.13333334 = coord(2/15)
    
    Abstract
    In our retrieval test of search tools on the World Wide Web (Password 11/2000), the Google search engine performed best. Compared with other search engines, Google relies little on information linguistics and instead on algorithms derived from the particular properties of Web documents. The core of this information-statistical technique is the "PageRank" method (named after its developer Larry Page), which computes the "popularity" of pages from the hypertext structure of the Web on the basis of their incoming and outgoing links. Google also stands out for its intuitively understandable search screens and a number of very useful details, such as displaying a page's rank, highlighting, searching within a page, searching within a result set, and so on, all packed into its own toolbar inside the browser. Like RealNames, Google sells search terms with its "AdWords" product. After a series of four Password articles comparing Internet search tools, we conclude with an assessment. What is the state of the art of directories and search engines from an information science point of view? Are "typical" Internet users, who are generally not information professionals, served adequately? And can information professionals also profit from these search tools?
  14. Ravana, S.D.; Rajagopal, P.; Balakrishnan, V.: Ranking retrieval systems using pseudo relevance judgments (2015) 0.01
    0.012905712 = product of:
      0.06452856 = sum of:
        0.030755727 = weight(_text_:evaluation in 2591) [ClassicSimilarity], result of:
          0.030755727 = score(doc=2591,freq=2.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.23172665 = fieldWeight in 2591, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2591)
        0.01861633 = weight(_text_:web in 2591) [ClassicSimilarity], result of:
          0.01861633 = score(doc=2591,freq=2.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.18028519 = fieldWeight in 2591, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2591)
        0.015156505 = product of:
          0.03031301 = sum of:
            0.03031301 = weight(_text_:22 in 2591) [ClassicSimilarity], result of:
              0.03031301 = score(doc=2591,freq=4.0), product of:
                0.110801086 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.031640913 = queryNorm
                0.27358043 = fieldWeight in 2591, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2591)
          0.5 = coord(1/2)
      0.2 = coord(3/15)
    
    Abstract
    Purpose: In a system-based approach, replicating the web would require large test collections, and judging the relevancy of all documents per topic in creating relevance judgment through human assessors is infeasible. Due to the large number of documents that require judgment, there are possible errors introduced by human assessors because of disagreements. The paper aims to discuss these issues. Design/methodology/approach: This study explores exponential variation and document ranking methods that generate a reliable set of relevance judgments (pseudo relevance judgments) to reduce human effort. These methods overcome problems with large amounts of documents for judgment while avoiding human disagreement errors during the judgment process. This study utilizes two key factors: number of occurrences of each document per topic from all the system runs; and document rankings to generate the alternate methods. Findings: The effectiveness of the proposed method is evaluated using the correlation coefficient of ranked systems using mean average precision scores between the original Text REtrieval Conference (TREC) relevance judgments and pseudo relevance judgments. The results suggest that the proposed document ranking method with a pool depth of 100 could be a reliable alternative to reduce human effort and disagreement errors involved in generating TREC-like relevance judgments. Originality/value: Simple methods proposed in this study show improvement in the correlation coefficient in generating alternate relevance judgment without human assessors while contributing to information retrieval evaluation.
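    The occurrence-pooling idea can be sketched as follows: documents returned by many system runs for a topic are treated as pseudo-relevant, and each run is then scored by average precision against that pseudo qrel. The runs, the majority threshold, and the pool size below are invented for illustration and do not reproduce the authors' exact method:

```python
from collections import Counter

# Toy ranked results of three system runs for one topic; all ids are invented.
runs = {
    "runA": ["d1", "d2", "d3", "d4"],
    "runB": ["d2", "d1", "d5", "d3"],
    "runC": ["d2", "d6", "d1", "d7"],
}

# Pseudo relevance judgments: documents retrieved by a majority of the runs.
occurrences = Counter(doc for ranking in runs.values() for doc in ranking)
pseudo_qrels = {doc for doc, count in occurrences.items() if count >= 2}

def average_precision(ranking, relevant):
    """Average precision of one ranked list against a set of relevant docs."""
    hits, precisions = 0, []
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant) if relevant else 0.0

# Systems would then be ranked by mean average precision over all topics and
# that ranking correlated with one based on human (TREC) judgments.
for run, ranking in runs.items():
    print(run, round(average_precision(ranking, pseudo_qrels), 3))
```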
    Date
    20. 1.2015 18:30:22
    18. 9.2018 18:22:56
  15. Henzinger, M.R.: Link analysis in Web information retrieval (2000) 0.01
    0.011907877 = product of:
      0.089309074 = sum of:
        0.047096003 = weight(_text_:web in 801) [ClassicSimilarity], result of:
          0.047096003 = score(doc=801,freq=20.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.45608947 = fieldWeight in 801, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=801)
        0.04221307 = weight(_text_:site in 801) [ClassicSimilarity], result of:
          0.04221307 = score(doc=801,freq=2.0), product of:
            0.1738463 = queryWeight, product of:
              5.494352 = idf(docFreq=493, maxDocs=44218)
              0.031640913 = queryNorm
            0.24281834 = fieldWeight in 801, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.494352 = idf(docFreq=493, maxDocs=44218)
              0.03125 = fieldNorm(doc=801)
      0.13333334 = coord(2/15)
    
    Abstract
    The analysis of the hyperlink structure of the web has led to significant improvements in web information retrieval. This survey describes two successful link analysis algorithms and the state of the art of the field.
    Content
    The goal of information retrieval is to find all documents relevant for a user query in a collection of documents. Decades of research in information retrieval were successful in developing and refining techniques that are solely word-based (see e.g., [2]). With the advent of the web new sources of information became available, one of them being the hyperlinks between documents and records of user behavior. To be precise, hypertexts (i.e., collections of documents connected by hyperlinks) have existed and have been studied for a long time. What was new was the large number of hyperlinks created by independent individuals. Hyperlinks provide a valuable source of information for web information retrieval as we will show in this article. This area of information retrieval is commonly called link analysis. Why would one expect hyperlinks to be useful? A hyperlink is a reference to a web page B that is contained in a web page A. When the hyperlink is clicked on in a web browser, the browser displays page B. This functionality alone is not helpful for web information retrieval. However, the way hyperlinks are typically used by authors of web pages can give them valuable information content. Typically, authors create links because they think they will be useful for the readers of the pages. Thus, links are usually either navigational aids that, for example, bring the reader back to the homepage of the site, or links that point to pages whose content augments the content of the current page. The second kind of links tend to point to high-quality pages that might be on the same topic as the page containing the link.
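    Alongside PageRank, the survey covers the HITS algorithm, which computes mutually reinforcing hub and authority scores from the link graph. A minimal sketch on an invented four-page adjacency matrix:

```python
import numpy as np

# Toy adjacency matrix: A[i, j] = 1 if page i links to page j (links invented).
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 0]], dtype=float)

# HITS: a page's authority score comes from the hubs pointing at it; its hub
# score comes from the authorities it points to. Normalise and iterate.
hubs = np.ones(4)
auth = np.ones(4)
for _ in range(50):
    auth = A.T @ hubs
    auth /= np.linalg.norm(auth)
    hubs = A @ auth
    hubs /= np.linalg.norm(hubs)

print("authorities:", np.round(auth, 3))
print("hubs:       ", np.round(hubs, 3))
```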
  16. Liddy, E.D.; Paik, W.; McKenna, M.; Yu, E.S.: ¬A natural language text retrieval system with relevance feedback (1995) 0.01
    0.01135234 = product of:
      0.085142545 = sum of:
        0.011269671 = product of:
          0.022539342 = sum of:
            0.022539342 = weight(_text_:online in 3131) [ClassicSimilarity], result of:
              0.022539342 = score(doc=3131,freq=2.0), product of:
                0.096027054 = queryWeight, product of:
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.031640913 = queryNorm
                0.23471867 = fieldWeight in 3131, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0349014 = idf(docFreq=5778, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3131)
          0.5 = coord(1/2)
        0.07387287 = weight(_text_:site in 3131) [ClassicSimilarity], result of:
          0.07387287 = score(doc=3131,freq=2.0), product of:
            0.1738463 = queryWeight, product of:
              5.494352 = idf(docFreq=493, maxDocs=44218)
              0.031640913 = queryNorm
            0.4249321 = fieldWeight in 3131, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.494352 = idf(docFreq=493, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3131)
      0.13333334 = coord(2/15)
    
    Abstract
    Outlines a fully integrated retrieval engine that processes documents and queries at the multiple, complex linguistic levels that humans use to construe meaning. Currently undergoing beta site trials, the DR-LINK natural language text retrieval system allows searchers to state queries as fully formed, natural sentences. The meaning and matching of both queries and documents is accomplished at the conceptual level of human expression, not by the simple concurrence of keywords. Furthermore, the natural browsing behaviour of information searchers is accommodated by allowing documents identified as potentially relevant by the explicit semantics of the system to be used as relevance feedback queries which provide an appropriate implicit semantic representation of the information seeker's need
    Source
    Proceedings of the 16th National Online Meeting 1995, New York, 2-4 May 1995. Ed.: M.E. Williams
  17. Langville, A.N.; Meyer, C.D.: Google's PageRank and beyond : the science of search engine rankings (2006) 0.01
    0.010790287 = product of:
      0.08092715 = sum of:
        0.047417752 = weight(_text_:suchmaschine in 6) [ClassicSimilarity], result of:
          0.047417752 = score(doc=6,freq=4.0), product of:
            0.17890577 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.031640913 = queryNorm
            0.26504317 = fieldWeight in 6, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.0234375 = fieldNorm(doc=6)
        0.033509392 = weight(_text_:web in 6) [ClassicSimilarity], result of:
          0.033509392 = score(doc=6,freq=18.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.32451332 = fieldWeight in 6, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0234375 = fieldNorm(doc=6)
      0.13333334 = coord(2/15)
    
    Abstract
    Why doesn't your home page appear on the first page of search results, even when you query your own name? How do other Web pages always appear at the top? What creates these powerful rankings? And how? The first book ever about the science of Web page rankings, "Google's PageRank and Beyond" supplies the answers to these and other questions and more. The book serves two very different audiences: the curious science reader and the technical computational reader. The chapters build in mathematical sophistication, so that the first five are accessible to the general academic reader. While other chapters are much more mathematical in nature, each one contains something for both audiences. For example, the authors include entertaining asides such as how search engines make money and how the Great Firewall of China influences research. The book includes an extensive background chapter designed to help readers learn more about the mathematics of search engines, and it contains several MATLAB codes and links to sample Web data sets. The philosophy throughout is to encourage readers to experiment with the ideas and algorithms in the text. Any business seriously interested in improving its rankings in the major search engines can benefit from the clear examples, sample code, and list of resources provided. It includes: many illustrative examples and entertaining asides; MATLAB code; an accessible and informal style; and a complete and self-contained section for mathematics review.
    Content
    Contents: Chapter 1. Introduction to Web Search Engines: 1.1 A Short History of Information Retrieval - 1.2 An Overview of Traditional Information Retrieval - 1.3 Web Information Retrieval Chapter 2. Crawling, Indexing, and Query Processing: 2.1 Crawling - 2.2 The Content Index - 2.3 Query Processing Chapter 3. Ranking Webpages by Popularity: 3.1 The Scene in 1998 - 3.2 Two Theses - 3.3 Query-Independence Chapter 4. The Mathematics of Google's PageRank: 4.1 The Original Summation Formula for PageRank - 4.2 Matrix Representation of the Summation Equations - 4.3 Problems with the Iterative Process - 4.4 A Little Markov Chain Theory - 4.5 Early Adjustments to the Basic Model - 4.6 Computation of the PageRank Vector - 4.7 Theorem and Proof for Spectrum of the Google Matrix Chapter 5. Parameters in the PageRank Model: 5.1 The a Factor - 5.2 The Hyperlink Matrix H - 5.3 The Teleportation Matrix E Chapter 6. The Sensitivity of PageRank: 6.1 Sensitivity with respect to alpha - 6.2 Sensitivity with respect to H - 6.3 Sensitivity with respect to vT - 6.4 Other Analyses of Sensitivity - 6.5 Sensitivity Theorems and Proofs Chapter 7. The PageRank Problem as a Linear System: 7.1 Properties of (I - alphaS) - 7.2 Properties of (I - alphaH) - 7.3 Proof of the PageRank Sparse Linear System Chapter 8. Issues in Large-Scale Implementation of PageRank: 8.1 Storage Issues - 8.2 Convergence Criterion - 8.3 Accuracy - 8.4 Dangling Nodes - 8.5 Back Button Modeling
    Chapter 9. Accelerating the Computation of PageRank: 9.1 An Adaptive Power Method - 9.2 Extrapolation - 9.3 Aggregation - 9.4 Other Numerical Methods Chapter 10. Updating the PageRank Vector: 10.1 The Two Updating Problems and their History - 10.2 Restarting the Power Method - 10.3 Approximate Updating Using Approximate Aggregation - 10.4 Exact Aggregation - 10.5 Exact vs. Approximate Aggregation - 10.6 Updating with Iterative Aggregation - 10.7 Determining the Partition - 10.8 Conclusions Chapter 11. The HITS Method for Ranking Webpages: 11.1 The HITS Algorithm - 11.2 HITS Implementation - 11.3 HITS Convergence - 11.4 HITS Example - 11.5 Strengths and Weaknesses of HITS - 11.6 HITS's Relationship to Bibliometrics - 11.7 Query-Independent HITS - 11.8 Accelerating HITS - 11.9 HITS Sensitivity Chapter 12. Other Link Methods for Ranking Webpages: 12.1 SALSA - 12.2 Hybrid Ranking Methods - 12.3 Rankings based on Traffic Flow Chapter 13. The Future of Web Information Retrieval: 13.1 Spam - 13.2 Personalization - 13.3 Clustering - 13.4 Intelligent Agents - 13.5 Trends and Time-Sensitive Search - 13.6 Privacy and Censorship - 13.7 Library Classification Schemes - 13.8 Data Fusion Chapter 14. Resources for Web Information Retrieval: 14.1 Resources for Getting Started - 14.2 Resources for Serious Study Chapter 15. The Mathematics Guide: 15.1 Linear Algebra - 15.2 Perron-Frobenius Theory - 15.3 Markov Chains - 15.4 Perron Complementation - 15.5 Stochastic Complementation - 15.6 Censoring - 15.7 Aggregation - 15.8 Disaggregation
    RSWK
    Google / Web-Seite / Rangstatistik (HEBIS)
    Google / Suchmaschine / Ranking (BVB)
    Subject
    Google / Web-Seite / Rangstatistik (HEBIS)
    Google / Suchmaschine / Ranking (BVB)
  18. Efthimiadis, E.N.: User choices : a new yardstick for the evaluation of ranking algorithms for interactive query expansion (1995) 0.01
    0.010598556 = product of:
      0.079489164 = sum of:
        0.0687719 = weight(_text_:evaluation in 5697) [ClassicSimilarity], result of:
          0.0687719 = score(doc=5697,freq=10.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.5181566 = fieldWeight in 5697, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5697)
        0.010717267 = product of:
          0.021434534 = sum of:
            0.021434534 = weight(_text_:22 in 5697) [ClassicSimilarity], result of:
              0.021434534 = score(doc=5697,freq=2.0), product of:
                0.110801086 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.031640913 = queryNorm
                0.19345059 = fieldWeight in 5697, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5697)
          0.5 = coord(1/2)
      0.13333334 = coord(2/15)
    
    Abstract
    The performance of 8 ranking algorithms was evaluated with respect to their effectiveness in ranking terms for query expansion. The evaluation was conducted within an investigation of interactive query expansion and relevance feedback in a real operational environment. Focuses on the identification of algorithms that most effectively take cognizance of user preferences. User choices (i.e. the terms selected by the searchers for the query expansion search) provided the yardstick for the evaluation of the 8 ranking algorithms. This methodology introduces a user-oriented approach in evaluating ranking algorithms for query expansion in contrast to the standard, system-oriented approaches. Similarities in the performance of the 8 algorithms and the ways these algorithms rank terms were the main focus of this evaluation. The findings demonstrate that the r-lohi, wpq, enim, and porter algorithms have similar performance in bringing good terms to the top of a ranked list of terms for query expansion. However, further evaluation of the algorithms in different (e.g. full text) environments is needed before these results can be generalized beyond the context of the present study
    Date
    22. 2.1996 13:14:10
  19. Stock, W.G.: On relevance distributions (2006) 0.01
    0.010532705 = product of:
      0.07899529 = sum of:
        0.049209163 = weight(_text_:evaluation in 5116) [ClassicSimilarity], result of:
          0.049209163 = score(doc=5116,freq=2.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.37076265 = fieldWeight in 5116, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.0625 = fieldNorm(doc=5116)
        0.029786127 = weight(_text_:web in 5116) [ClassicSimilarity], result of:
          0.029786127 = score(doc=5116,freq=2.0), product of:
            0.10326045 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.031640913 = queryNorm
            0.2884563 = fieldWeight in 5116, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0625 = fieldNorm(doc=5116)
      0.13333334 = coord(2/15)
    
    Abstract
    There are at least three possible ways that documents are distributed by relevance: informetric (power law), inverse logistic, and dichotomous. The nature of the type of distribution has implications for the construction of relevance ranking algorithms for search engines, for automated (blind) relevance feedback, for user behavior when using Web search engines, for combining of outputs of search engines for metasearch, for topic detection and tracking, and for the methodology of evaluation of information retrieval systems.
  20. Klas, C.-P.; Fuhr, N.; Schaefer, A.: Evaluating strategic support for information access in the DAFFODIL system (2004) 0.01
    0.010238041 = product of:
      0.0767853 = sum of:
        0.06392458 = weight(_text_:evaluation in 2419) [ClassicSimilarity], result of:
          0.06392458 = score(doc=2419,freq=6.0), product of:
            0.13272417 = queryWeight, product of:
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.031640913 = queryNorm
            0.48163486 = fieldWeight in 2419, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.1947007 = idf(docFreq=1811, maxDocs=44218)
              0.046875 = fieldNorm(doc=2419)
        0.01286072 = product of:
          0.02572144 = sum of:
            0.02572144 = weight(_text_:22 in 2419) [ClassicSimilarity], result of:
              0.02572144 = score(doc=2419,freq=2.0), product of:
                0.110801086 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.031640913 = queryNorm
                0.23214069 = fieldWeight in 2419, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2419)
          0.5 = coord(1/2)
      0.13333334 = coord(2/15)
    
    Abstract
    The digital library system Daffodil is targeted at strategic support of users during the information search process. For searching, exploring and managing digital library objects it provides user-customisable information seeking patterns over a federation of heterogeneous digital libraries. In this paper evaluation results with respect to retrieval effectiveness, efficiency and user satisfaction are presented. The analysis focuses on strategic support for the scientific work-flow. Daffodil supports the whole work-flow, from data source selection over information seeking to the representation, organisation and reuse of information. By embedding high level search functionality into the scientific work-flow, the user experiences better strategic system support due to a more systematic work process. These ideas have been implemented in Daffodil followed by a qualitative evaluation. The evaluation has been conducted with 28 participants, ranging from information seeking novices to experts. The results are promising, as they support the chosen model.
    Date
    16.11.2008 16:22:48

Languages

  • e 142
  • d 21
  • chi 2
  • m 1
  • pt 1

Types

  • a 148
  • m 10
  • s 4
  • x 4
  • el 2
  • r 2
  • p 1