Search (66 results, page 1 of 4)

  • × theme_ss:"Suchmaschinen"
  • × type_ss:"el"
  1. Birmingham, J.: Internet search engines (1996) 0.03
    0.028477818 = product of:
      0.08543345 = sum of:
        0.08543345 = sum of:
          0.014203937 = weight(_text_:of in 5664) [ClassicSimilarity], result of:
            0.014203937 = score(doc=5664,freq=2.0), product of:
              0.06850986 = queryWeight, product of:
                1.5637573 = idf(docFreq=25162, maxDocs=44218)
                0.043811057 = queryNorm
              0.20732689 = fieldWeight in 5664, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                1.5637573 = idf(docFreq=25162, maxDocs=44218)
                0.09375 = fieldNorm(doc=5664)
          0.07122952 = weight(_text_:22 in 5664) [ClassicSimilarity], result of:
            0.07122952 = score(doc=5664,freq=2.0), product of:
              0.15341885 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.043811057 = queryNorm
              0.46428138 = fieldWeight in 5664, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.09375 = fieldNorm(doc=5664)
      0.33333334 = coord(1/3)
    
    Abstract
    Basically a good listing in table format of features from the major search engines
    Date
    10.11.1996 16:36:22
  2. Bensman, S.J.: Eugene Garfield, Francis Narin, and PageRank : the theoretical bases of the Google search engine (2013) 0.02
    0.024179911 = product of:
      0.07253973 = sum of:
        0.07253973 = sum of:
          0.02505339 = weight(_text_:of in 1149) [ClassicSimilarity], result of:
            0.02505339 = score(doc=1149,freq=14.0), product of:
              0.06850986 = queryWeight, product of:
                1.5637573 = idf(docFreq=25162, maxDocs=44218)
                0.043811057 = queryNorm
              0.36569026 = fieldWeight in 1149, product of:
                3.7416575 = tf(freq=14.0), with freq of:
                  14.0 = termFreq=14.0
                1.5637573 = idf(docFreq=25162, maxDocs=44218)
                0.0625 = fieldNorm(doc=1149)
          0.047486346 = weight(_text_:22 in 1149) [ClassicSimilarity], result of:
            0.047486346 = score(doc=1149,freq=2.0), product of:
              0.15341885 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.043811057 = queryNorm
              0.30952093 = fieldWeight in 1149, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0625 = fieldNorm(doc=1149)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper presents a test of the validity of using Google Scholar to evaluate the publications of researchers by comparing the premises on which its search engine, PageRank, is based, to those of Garfield's theory of citation indexing. It finds that the premises are identical and that PageRank and Garfield's theory of citation indexing validate each other.
    Date
    17.12.2013 11:02:22
  3. Boldi, P.; Santini, M.; Vigna, S.: PageRank as a function of the damping factor (2005) 0.02
    0.017533492 = product of:
      0.052600473 = sum of:
        0.052600473 = sum of:
          0.022921504 = weight(_text_:of in 2564) [ClassicSimilarity], result of:
            0.022921504 = score(doc=2564,freq=30.0), product of:
              0.06850986 = queryWeight, product of:
                1.5637573 = idf(docFreq=25162, maxDocs=44218)
                0.043811057 = queryNorm
              0.33457235 = fieldWeight in 2564, product of:
                5.477226 = tf(freq=30.0), with freq of:
                  30.0 = termFreq=30.0
                1.5637573 = idf(docFreq=25162, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2564)
          0.029678967 = weight(_text_:22 in 2564) [ClassicSimilarity], result of:
            0.029678967 = score(doc=2564,freq=2.0), product of:
              0.15341885 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.043811057 = queryNorm
              0.19345059 = fieldWeight in 2564, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2564)
      0.33333334 = coord(1/3)
    
    Abstract
    PageRank is defined as the stationary state of a Markov chain. The chain is obtained by perturbing the transition matrix induced by a web graph with a damping factor alpha that spreads uniformly part of the rank. The choice of alpha is eminently empirical, and in most cases the original suggestion alpha=0.85 by Brin and Page is still used. Recently, however, the behaviour of PageRank with respect to changes in alpha was discovered to be useful in link-spam detection. Moreover, an analytical justification of the value chosen for alpha is still missing. In this paper, we give the first mathematical analysis of PageRank when alpha changes. In particular, we show that, contrarily to popular belief, for real-world graphs values of alpha close to 1 do not give a more meaningful ranking. Then, we give closed-form formulae for PageRank derivatives of any order, and an extension of the Power Method that approximates them with convergence O(t**k*alpha**t) for the k-th derivative. Finally, we show a tight connection between iterated computation and analytical behaviour by proving that the k-th iteration of the Power Method gives exactly the PageRank value obtained using a Maclaurin polynomial of degree k. The latter result paves the way towards the application of analytical methods to the study of PageRank.
    Date
    16. 1.2016 10:22:28
    Source
    http://vigna.di.unimi.it/ftp/papers/PageRankAsFunction.pdf [Proceedings of the ACM World Wide Web Conference (WWW), 2005]
  4. Baeza-Yates, R.; Boldi, P.; Castillo, C.: Generalizing PageRank : damping functions for linkbased ranking algorithms (2006) 0.02
    0.016131433 = product of:
      0.048394296 = sum of:
        0.048394296 = sum of:
          0.01871533 = weight(_text_:of in 2565) [ClassicSimilarity], result of:
            0.01871533 = score(doc=2565,freq=20.0), product of:
              0.06850986 = queryWeight, product of:
                1.5637573 = idf(docFreq=25162, maxDocs=44218)
                0.043811057 = queryNorm
              0.27317715 = fieldWeight in 2565, product of:
                4.472136 = tf(freq=20.0), with freq of:
                  20.0 = termFreq=20.0
                1.5637573 = idf(docFreq=25162, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2565)
          0.029678967 = weight(_text_:22 in 2565) [ClassicSimilarity], result of:
            0.029678967 = score(doc=2565,freq=2.0), product of:
              0.15341885 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.043811057 = queryNorm
              0.19345059 = fieldWeight in 2565, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2565)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper introduces a family of link-based ranking algorithms that propagate page importance through links. In these algorithms there is a damping function that decreases with distance, so a direct link implies more endorsement than a link through a long path. PageRank is the most widely known ranking function of this family. The main objective of this paper is to determine whether this family of ranking techniques has some interest per se, and how different choices for the damping function impact on rank quality and on convergence speed. Even though our results suggest that PageRank can be approximated with other simpler forms of rankings that may be computed more efficiently, our focus is of more speculative nature, in that it aims at separating the kernel of PageRank, that is, link-based importance propagation, from the way propagation decays over paths. We focus on three damping functions, having linear, exponential, and hyperbolic decay on the lengths of the paths. The exponential decay corresponds to PageRank, and the other functions are new. Our presentation includes algorithms, analysis, comparisons and experiments that study their behavior under different parameters in real Web graph data. Among other results, we show how to calculate a linear approximation that induces a page ordering that is almost identical to PageRank's using a fixed small number of iterations; comparisons were performed using Kendall's tau on large domain datasets.
    Date
    16. 1.2016 10:22:28
    Source
    http://chato.cl/papers/baeza06_general_pagerank_damping_functions_link_ranking.pdf [Proceedings of the ACM Special Interest Group on Information Retrieval (SIGIR) Conference, SIGIR'06, August 6-10, 2006, Seattle, Washington, USA]
  5. Dunning, A.: Do we still need search engines? (1999) 0.01
    0.013850185 = product of:
      0.041550554 = sum of:
        0.041550554 = product of:
          0.08310111 = sum of:
            0.08310111 = weight(_text_:22 in 6021) [ClassicSimilarity], result of:
              0.08310111 = score(doc=6021,freq=2.0), product of:
                0.15341885 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043811057 = queryNorm
                0.5416616 = fieldWeight in 6021, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6021)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Source
    Ariadne. 1999, no.22
  6. Schaat, S.: Von der automatisierten Manipulation zur Manipulation der Automatisierung (2019) 0.01
    0.007914391 = product of:
      0.023743173 = sum of:
        0.023743173 = product of:
          0.047486346 = sum of:
            0.047486346 = weight(_text_:22 in 4996) [ClassicSimilarity], result of:
              0.047486346 = score(doc=4996,freq=2.0), product of:
                0.15341885 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043811057 = queryNorm
                0.30952093 = fieldWeight in 4996, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4996)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    19. 2.2019 17:22:00
  7. Place, E.: Internationale Zusammenarbeit bei Internet Subject Gateways (1999) 0.01
    0.0059357933 = product of:
      0.01780738 = sum of:
        0.01780738 = product of:
          0.03561476 = sum of:
            0.03561476 = weight(_text_:22 in 4189) [ClassicSimilarity], result of:
              0.03561476 = score(doc=4189,freq=2.0), product of:
                0.15341885 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043811057 = queryNorm
                0.23214069 = fieldWeight in 4189, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4189)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 6.2002 19:35:09
  8. Smith, A.G.: Search features of digital libraries (2000) 0.00
    0.004428855 = product of:
      0.013286565 = sum of:
        0.013286565 = product of:
          0.02657313 = sum of:
            0.02657313 = weight(_text_:of in 940) [ClassicSimilarity], result of:
              0.02657313 = score(doc=940,freq=28.0), product of:
                0.06850986 = queryWeight, product of:
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.043811057 = queryNorm
                0.38787308 = fieldWeight in 940, product of:
                  5.2915025 = tf(freq=28.0), with freq of:
                    28.0 = termFreq=28.0
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.046875 = fieldNorm(doc=940)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Traditional on-line search services such as Dialog, DataStar and Lexis provide a wide range of search features (boolean and proximity operators, truncation, etc). This paper discusses the use of these features for effective searching, and argues that these features are required, regardless of advances in search engine technology. The literature on on-line searching is reviewed, identifying features that searchers find desirable for effective searching. A selective survey of current digital libraries available on the Web was undertaken, identifying which search features are present. The survey indicates that current digital libraries do not implement a wide range of search features. For instance: under half of the examples included controlled vocabulary, under half had proximity searching, only one enabled browsing of term indexes, and none of the digital libraries enable searchers to refine an initial search. Suggestions are made for enhancing the search effectiveness of digital libraries; for instance, by providing a full range of search operators, enabling browsing of search terms, enhancement of records with controlled vocabulary, enabling the refining of initial searches, etc.
  9. Page, A.: ¬The search is over : the search-engines secrets of the pros (1996) 0.00
    0.0044112457 = product of:
      0.013233736 = sum of:
        0.013233736 = product of:
          0.026467472 = sum of:
            0.026467472 = weight(_text_:of in 5670) [ClassicSimilarity], result of:
              0.026467472 = score(doc=5670,freq=10.0), product of:
                0.06850986 = queryWeight, product of:
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.043811057 = queryNorm
                0.38633084 = fieldWeight in 5670, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.078125 = fieldNorm(doc=5670)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Covers 8 of the most popular search engines. Gives a summary of each and has a nice table of features that also briefly lists the pros and cons. Includes a short explanation of Boolean operators too
  10. Sirapyan, N.: In Search of... (2001) 0.00
    0.004267752 = product of:
      0.012803256 = sum of:
        0.012803256 = product of:
          0.025606511 = sum of:
            0.025606511 = weight(_text_:of in 5661) [ClassicSimilarity], result of:
              0.025606511 = score(doc=5661,freq=26.0), product of:
                0.06850986 = queryWeight, product of:
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.043811057 = queryNorm
                0.37376386 = fieldWeight in 5661, product of:
                  5.0990195 = tf(freq=26.0), with freq of:
                    26.0 = termFreq=26.0
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5661)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    In a series of capsule reviews of 20 search engines Sirapyan gives a good overview of the state of Internet search tools. She starts out with a clear discussion of the types of search tools available, the availability of advanced features such as Boolean queries and differences between directories, regular search engines and metasearch engines. It is unclear from the article whether the author and other testers used the same searches across all of the 20 tools but each review clearly outlines perceived strengths and weaknesses, gives tips on the advanced features, if any, of the search tool in question and suggests the types of searches that are most successful. The tools which receive top honors are Google, Northern Light, HotBot and Oingo. Finally, there is an extra sidebar the discusses meta and specialized search tools such as Infozoid and FirstGov. I can't help thinking that the usefulness of this article is related to the fact that Sirapyan is PC Magazine's librarian and goes into greater depth on those features that are of interest to information professionals
  11. Ogden, J.; Summers, E.; Walker, S.: Know(ing) Infrastructure : the wayback machine as object and instrument of digital research (2023) 0.00
    0.0040669674 = product of:
      0.012200902 = sum of:
        0.012200902 = product of:
          0.024401804 = sum of:
            0.024401804 = weight(_text_:of in 1084) [ClassicSimilarity], result of:
              0.024401804 = score(doc=1084,freq=34.0), product of:
                0.06850986 = queryWeight, product of:
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.043811057 = queryNorm
                0.35617945 = fieldWeight in 1084, product of:
                  5.8309517 = tf(freq=34.0), with freq of:
                    34.0 = termFreq=34.0
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1084)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    From documenting human rights abuses to studying online advertising, web archives are increasingly positioned as critical resources for a broad range of scholarly Internet research agendas. In this article, we reflect on the motivations and methodological challenges of investigating the world's largest web archive, the Internet Archive's Wayback Machine (IAWM). Using a mixed methods approach, we report on a pilot project centred around documenting the inner workings of 'Save Page Now' (SPN) - an Internet Archive tool that allows users to initiate the creation and storage of 'snapshots' of web resources. By improving our understanding of SPN and its role in shaping the IAWM, this work examines how the public tool is being used to 'save the Web' and highlights the challenges of operationalising a study of the dynamic sociotechnical processes supporting this knowledge infrastructure. Inspired by existing Science and Technology Studies (STS) approaches, the paper charts our development of methodological interventions to support an interdisciplinary investigation of SPN, including: ethnographic methods, 'experimental blackbox tactics', data tracing, modelling and documentary research. We discuss the opportunities and limitations of our methodology when interfacing with issues associated with temporality, scale and visibility, as well as critically engage with our own positionality in the research process (in terms of expertise and access). We conclude with reflections on the implications of digital STS approaches for 'knowing infrastructure', where the use of these infrastructures is unavoidably intertwined with our ability to study the situated and material arrangements of their creation.
    Source
    Convergence: The International Journal of Research into New Media Technologies [https://www.researchgate.net/publication/369660337_Knowing_Infrastructure_The_Wayback_Machine_as_object_and_instrument_of_digital_research]
  12. Gillitzer, B.: Yewno (2017) 0.00
    0.0039571957 = product of:
      0.011871587 = sum of:
        0.011871587 = product of:
          0.023743173 = sum of:
            0.023743173 = weight(_text_:22 in 3447) [ClassicSimilarity], result of:
              0.023743173 = score(doc=3447,freq=2.0), product of:
                0.15341885 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043811057 = queryNorm
                0.15476047 = fieldWeight in 3447, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=3447)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 2.2017 10:16:49
  13. Matrix of WWW indices : a comparison of Internet indexing tools (1995) 0.00
    0.003945538 = product of:
      0.0118366135 = sum of:
        0.0118366135 = product of:
          0.023673227 = sum of:
            0.023673227 = weight(_text_:of in 3165) [ClassicSimilarity], result of:
              0.023673227 = score(doc=3165,freq=8.0), product of:
                0.06850986 = queryWeight, product of:
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.043811057 = queryNorm
                0.34554482 = fieldWeight in 3165, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3165)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Imprint
    Ann Arbor : University of Michigan School of Information and Library Studies
  14. Brin, S.; Page, L.: ¬The anatomy of a large-scale hypertextual Web search engine (1998) 0.00
    0.003945538 = product of:
      0.0118366135 = sum of:
        0.0118366135 = product of:
          0.023673227 = sum of:
            0.023673227 = weight(_text_:of in 947) [ClassicSimilarity], result of:
              0.023673227 = score(doc=947,freq=32.0), product of:
                0.06850986 = queryWeight, product of:
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.043811057 = queryNorm
                0.34554482 = fieldWeight in 947, product of:
                  5.656854 = tf(freq=32.0), with freq of:
                    32.0 = termFreq=32.0
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=947)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want
  15. Schomburg, S.; Prante, J.: Search Engine Federation in Libraries - Suchmaschinenföderation in Bibliotheken (2009) 0.00
    0.003925761 = product of:
      0.011777283 = sum of:
        0.011777283 = product of:
          0.023554565 = sum of:
            0.023554565 = weight(_text_:of in 2809) [ClassicSimilarity], result of:
              0.023554565 = score(doc=2809,freq=22.0), product of:
                0.06850986 = queryWeight, product of:
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.043811057 = queryNorm
                0.34381276 = fieldWeight in 2809, product of:
                  4.690416 = tf(freq=22.0), with freq of:
                    22.0 = termFreq=22.0
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2809)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    The hbz (Academic Library Center, Cologne) has a strong focus on search engine applications: Beyond the projected integration of respective technologies into the new release of the Digital Library portal solution (DigiBib6), vascoda background services also apply and take advantage of search engine technology. Experience since 2003 has given proof that building and updating of search engine indexes involves a vast amount of resources. The use of search engine federations, however, pledges major improvements: The total amount of data records held in linked indexes can be almost unlimited but also allow for a joint output of all hits retrieved. A federation also comes with excellent response times - hits retrieved can also refer to or link into the original system's layout. Nonetheless, the major challenge these days is different search engine technologies, e.g. Lucene and FAST, the variations in terms of ranking, and the implementation or non-implementation of so-called drill-downs. The lecture is designed to give a brief insight into the hbz search engine workshop with an introduction to the special project state of play.
  16. Austin, D.: How Google finds your needle in the Web's haystack : as we'll see, the trick is to ask the web itself to rank the importance of pages... (2006) 0.00
    0.0035207155 = product of:
      0.010562146 = sum of:
        0.010562146 = product of:
          0.021124292 = sum of:
            0.021124292 = weight(_text_:of in 93) [ClassicSimilarity], result of:
              0.021124292 = score(doc=93,freq=52.0), product of:
                0.06850986 = queryWeight, product of:
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.043811057 = queryNorm
                0.30833945 = fieldWeight in 93, product of:
                  7.2111025 = tf(freq=52.0), with freq of:
                    52.0 = termFreq=52.0
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=93)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Imagine a library containing 25 billion documents but with no centralized organization and no librarians. In addition, anyone may add a document at any time without telling anyone. You may feel sure that one of the documents contained in the collection has a piece of information that is vitally important to you, and, being impatient like most of us, you'd like to find it in a matter of seconds. How would you go about doing it? Posed in this way, the problem seems impossible. Yet this description is not too different from the World Wide Web, a huge, highly-disorganized collection of documents in many different formats. Of course, we're all familiar with search engines (perhaps you found this article using one) so we know that there is a solution. This article will describe Google's PageRank algorithm and how it returns pages from the web's collection of 25 billion documents that match search criteria so well that "google" has become a widely used verb. Most search engines, including Google, continually run an army of computer programs that retrieve pages from the web, index the words in each document, and store this information in an efficient format. Each time a user asks for a web search using a search phrase, such as "search engine," the search engine determines all the pages on the web that contains the words in the search phrase. (Perhaps additional information such as the distance between the words "search" and "engine" will be noted as well.) Here is the problem: Google now claims to index 25 billion pages. Roughly 95% of the text in web pages is composed from a mere 10,000 words. This means that, for most searches, there will be a huge number of pages containing the words in the search phrase. What is needed is a means of ranking the importance of the pages that fit the search criteria so that the pages can be sorted with the most important pages at the top of the list. One way to determine the importance of pages is to use a human-generated ranking. For instance, you may have seen pages that consist mainly of a large number of links to other resources in a particular area of interest. Assuming the person maintaining this page is reliable, the pages referenced are likely to be useful. Of course, the list may quickly fall out of date, and the person maintaining the list may miss some important pages, either unintentionally or as a result of an unstated bias. Google's PageRank algorithm assesses the importance of web pages without human evaluation of the content. In fact, Google feels that the value of its service is largely in its ability to provide unbiased results to search queries; Google claims, "the heart of our software is PageRank." As we'll see, the trick is to ask the web itself to rank the importance of pages.
  17. Tomaiuolo, N.G.; Packer, J.G.: Quantitative analysis of five WWW 'search engines' (1996) 0.00
    0.0034169364 = product of:
      0.010250809 = sum of:
        0.010250809 = product of:
          0.020501617 = sum of:
            0.020501617 = weight(_text_:of in 5675) [ClassicSimilarity], result of:
              0.020501617 = score(doc=5675,freq=6.0), product of:
                0.06850986 = queryWeight, product of:
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.043811057 = queryNorm
                0.2992506 = fieldWeight in 5675, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.078125 = fieldNorm(doc=5675)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Provides a table of the results from over 100 questions actually asked at a library reference desk: The summary notes the average number of relevant 'hits' for all investigated search engines are: AltaVista: 9.3; InfoSeek: 8.3; Lycos: 8.1; Magellan: 7.8; Point: 2.1
  18. TASI: ¬A review of image search engines (2003) 0.00
    0.0034169364 = product of:
      0.010250809 = sum of:
        0.010250809 = product of:
          0.020501617 = sum of:
            0.020501617 = weight(_text_:of in 6757) [ClassicSimilarity], result of:
              0.020501617 = score(doc=6757,freq=6.0), product of:
                0.06850986 = queryWeight, product of:
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.043811057 = queryNorm
                0.2992506 = fieldWeight in 6757, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.078125 = fieldNorm(doc=6757)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Replacing an earlier review, TASI's report outlines the different types of image search engines available and suggests the things to look out for when using one to find images. It includes TASI's own critical evaluation of the most popular engines.
  19. Zhao, Y.; Ma, F.; Xia, X.: Evaluating the coverage of entities in knowledge graphs behind general web search engines : Poster (2017) 0.00
    0.0034169364 = product of:
      0.010250809 = sum of:
        0.010250809 = product of:
          0.020501617 = sum of:
            0.020501617 = weight(_text_:of in 3854) [ClassicSimilarity], result of:
              0.020501617 = score(doc=3854,freq=24.0), product of:
                0.06850986 = queryWeight, product of:
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.043811057 = queryNorm
                0.2992506 = fieldWeight in 3854, product of:
                  4.8989797 = tf(freq=24.0), with freq of:
                    24.0 = termFreq=24.0
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3854)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Web search engines, such as Google and Bing, are constantly employing results from knowledge organization and various visualization features to improve their search services. Knowledge graph, a large repository of structured knowledge represented by formal languages such as RDF (Resource Description Framework), is used to support entity search feature of Google and Bing (Demartini, 2016). When a user searchs for an entity, such as a person, an organization, or a place in Google or Bing, it is likely that a knowledge cardwill be presented on the right side bar of the search engine result pages (SERPs). For example, when a user searches the entity Benedict Cumberbatch on Google, the knowledge card will show the basic structured information about this person, including his date of birth, height, spouse, parents, and his movies, etc. The knowledge card, which is used to present the result of entity search, is generated from knowledge graphs. Therefore, the quality of knowledge graphs is essential to the performance of entity search. However, studies on the quality of knowledge graphs from the angle of entity coverage are scant in the literature. This study aims to investigate the coverage of entities of knowledge graphs behind Google and Bing.
  20. Warnick, W.L.; Leberman, A.; Scott, R.L.; Spence, K.J.; Johnsom, L.A.; Allen, V.S.: Searching the deep Web : directed query engine applications at the Department of Energy (2001) 0.00
    0.0033478998 = product of:
      0.010043699 = sum of:
        0.010043699 = product of:
          0.020087399 = sum of:
            0.020087399 = weight(_text_:of in 1215) [ClassicSimilarity], result of:
              0.020087399 = score(doc=1215,freq=16.0), product of:
                0.06850986 = queryWeight, product of:
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.043811057 = queryNorm
                0.2932045 = fieldWeight in 1215, product of:
                  4.0 = tf(freq=16.0), with freq of:
                    16.0 = termFreq=16.0
                  1.5637573 = idf(docFreq=25162, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1215)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Directed Query Engines, an emerging class of search engine specifically designed to access distributed resources on the deep web, offer the opportunity to create inexpensive digital libraries. Already, one such engine, Distributed Explorer, has been used to select and assemble high quality information resources and incorporate them into publicly available systems for the physical sciences. By nesting Directed Query Engines so that one query launches several other engines in a cascading fashion, enormous virtual collections may soon be assembled to form a comprehensive information infrastructure for the physical sciences. Once a Directed Query Engine has been configured for a set of information resources, distributed alerts tools can provide patrons with personalized, profile-based notices of recent additions to any of the selected resources. Due to the potentially enormous size and scope of Directed Query Engine applications, consideration must be given to issues surrounding the representation of large quantities of information from multiple, heterogeneous sources.

Years

Languages

  • e 53
  • d 12

Types

  • a 29
  • r 1
  • x 1
  • More… Less…