Search (59 results, page 1 of 3)

Boldi, P.; Santini, M.; Vigna, S.: PageRank as a function of the damping factor (2005) 0.04
```
0.039321437 = product of:
  0.078642875 = sum of:
    0.078642875 = sum of:
      0.04334968 = weight(_text_:web in 2564) [ClassicSimilarity], result of:
        0.04334968 = score(doc=2564,freq=4.0), product of:
          0.17002425 = queryWeight, product of:
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.052098576 = queryNorm
          0.25496176 = fieldWeight in 2564, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2564)
      0.03529319 = weight(_text_:22 in 2564) [ClassicSimilarity], result of:
        0.03529319 = score(doc=2564,freq=2.0), product of:
          0.18244034 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052098576 = queryNorm
          0.19345059 = fieldWeight in 2564, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2564)
  0.5 = coord(1/2)
```
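
The score tree above can be recomputed by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions `tf = sqrt(freq)` and `idf = 1 + ln(maxDocs / (docFreq + 1))`. The function names are illustrative only. `queryNorm`, `fieldNorm` and the `coord` factor are copied from the listing rather than derived.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency as used by ClassicSimilarity.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight for a single query term.
    tf = math.sqrt(freq)
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm
    field_weight = tf * idf * field_norm
    return query_weight * field_weight

# Values copied from the explain output for doc 2564.
QUERY_NORM = 0.052098576
MAX_DOCS = 44218
FIELD_NORM = 0.0390625

web_score = term_score(4.0, 4597, MAX_DOCS, QUERY_NORM, FIELD_NORM)
t22_score = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# The outer coord(1/2) factor halves the sum of the two term scores.
total = 0.5 * (web_score + t22_score)
print(web_score, t22_score, total)
```

Lucene computes in single precision, so the last digits of a double-precision recomputation may differ slightly from the listing.
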
Abstract

PageRank is defined as the stationary state of a Markov chain. The chain is obtained by perturbing the transition matrix induced by a web graph with a damping factor alpha that spreads uniformly part of the rank. The choice of alpha is eminently empirical, and in most cases the original suggestion alpha=0.85 by Brin and Page is still used. Recently, however, the behaviour of PageRank with respect to changes in alpha was discovered to be useful in link-spam detection. Moreover, an analytical justification of the value chosen for alpha is still missing. In this paper, we give the first mathematical analysis of PageRank when alpha changes. In particular, we show that, contrary to popular belief, for real-world graphs values of alpha close to 1 do not give a more meaningful ranking. Then, we give closed-form formulae for PageRank derivatives of any order, and an extension of the Power Method that approximates them with convergence O(t**k*alpha**t) for the k-th derivative. Finally, we show a tight connection between iterated computation and analytical behaviour by proving that the k-th iteration of the Power Method gives exactly the PageRank value obtained using a Maclaurin polynomial of degree k. The latter result paves the way towards the application of analytical methods to the study of PageRank.
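
The Power Method mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The graph representation (a dict of out-links), the function name `pagerank`, the three-page example graph and the uniform handling of dangling pages are all assumptions made for illustration.

```python
def pagerank(links, alpha=0.85, iterations=100):
    # links maps each page to the list of pages it links to;
    # every link target must also appear as a key.
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # The (1 - alpha) share of the rank is spread uniformly.
        nxt = {v: (1.0 - alpha) / n for v in nodes}
        for v in nodes:
            targets = links[v]
            if targets:
                share = alpha * rank[v] / len(targets)
                for w in targets:
                    nxt[w] += share
            else:
                # Assumption: a dangling page spreads its rank uniformly.
                share = alpha * rank[v] / n
                for w in nodes:
                    nxt[w] += share
        rank = nxt
    return rank

# Hypothetical three-page graph.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph, alpha=0.85)
print(ranks)
```

Calling `pagerank(graph, alpha=x)` for several values of `x` is a simple way to observe how the ranking depends on the damping factor.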

Date

16. 1.2016 10:22:28

Source

http://vigna.di.unimi.it/ftp/papers/PageRankAsFunction.pdf [Proceedings of the ACM World Wide Web Conference (WWW), 2005]

Baeza-Yates, R.; Boldi, P.; Castillo, C.: Generalizing PageRank : damping functions for link-based ranking algorithms (2006) 0.03
```
0.03297302 = product of:
  0.06594604 = sum of:
    0.06594604 = sum of:
      0.030652853 = weight(_text_:web in 2565) [ClassicSimilarity], result of:
        0.030652853 = score(doc=2565,freq=2.0), product of:
          0.17002425 = queryWeight, product of:
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.052098576 = queryNorm
          0.18028519 = fieldWeight in 2565, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.2635105 = idf(docFreq=4597, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2565)
      0.03529319 = weight(_text_:22 in 2565) [ClassicSimilarity], result of:
        0.03529319 = score(doc=2565,freq=2.0), product of:
          0.18244034 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.052098576 = queryNorm
          0.19345059 = fieldWeight in 2565, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2565)
  0.5 = coord(1/2)
```
Abstract

This paper introduces a family of link-based ranking algorithms that propagate page importance through links. In these algorithms there is a damping function that decreases with distance, so a direct link implies more endorsement than a link through a long path. PageRank is the most widely known ranking function of this family. The main objective of this paper is to determine whether this family of ranking techniques has some interest per se, and how different choices for the damping function impact on rank quality and on convergence speed. Even though our results suggest that PageRank can be approximated with other simpler forms of rankings that may be computed more efficiently, our focus is of more speculative nature, in that it aims at separating the kernel of PageRank, that is, link-based importance propagation, from the way propagation decays over paths. We focus on three damping functions, having linear, exponential, and hyperbolic decay on the lengths of the paths. The exponential decay corresponds to PageRank, and the other functions are new. Our presentation includes algorithms, analysis, comparisons and experiments that study their behavior under different parameters in real Web graph data. Among other results, we show how to calculate a linear approximation that induces a page ordering that is almost identical to PageRank's using a fixed small number of iterations; comparisons were performed using Kendall's tau on large domain datasets.
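
The idea of a damping function over path lengths can be sketched as follows. This is an illustration, not the paper's algorithm: the rank of a page is taken as the sum, over path lengths t, of `damping(t)` times the random-walk mass that reaches the page in t steps from a uniform start. The exponential form `(1 - alpha) * alpha**t` reproduces PageRank. The linear form shown is an assumed decay normalised to sum to 1, and the paper's exact definition may differ. Every page is assumed to have at least one out-link.

```python
def functional_rank(links, damping, max_len):
    # links maps each page to a non-empty list of out-links.
    nodes = sorted(links)
    n = len(nodes)
    # walk[v]: mass reaching v over paths of the current length t.
    walk = {v: 1.0 / n for v in nodes}
    rank = {v: damping(0) * walk[v] for v in nodes}
    for t in range(1, max_len + 1):
        nxt = {v: 0.0 for v in nodes}
        for v in nodes:
            for w in links[v]:
                nxt[w] += walk[v] / len(links[v])
        walk = nxt
        for v in nodes:
            rank[v] += damping(t) * walk[v]
    return rank

def exponential(alpha):
    # Exponential decay; corresponds to PageRank with damping factor alpha.
    return lambda t: (1.0 - alpha) * alpha ** t

def linear(length):
    # Assumed linear decay, zero for paths of length >= `length`.
    return lambda t: 2.0 * (length - t) / (length * (length + 1)) if t < length else 0.0

# Hypothetical three-page graph.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
exp_rank = functional_rank(graph, exponential(0.85), max_len=200)
lin_rank = functional_rank(graph, linear(10), max_len=10)
print(exp_rank, lin_rank)
```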

Date

16. 1.2016 10:22:28

Broder, A.; Kumar, R.; Maghoul, F.; Raghavan, P.; Rajagopalan, S.; Stata, R.; Tomkins, A.; Wiener, J.: Graph structure in the Web (2000) 0.03

```
0.027416745 = product of:
  0.05483349 = sum of:
    0.05483349 = product of:
      0.10966698 = sum of:
        0.10966698 = weight(_text_:web in 5595) [ClassicSimilarity], result of:
          0.10966698 = score(doc=5595,freq=10.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.6450079 = fieldWeight in 5595, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0625 = fieldNorm(doc=5595)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Abstract: The study of the web as a graph is not only fascinating in its own right, but also yields valuable insight into web algorithms for crawling, searching and community discovery, and the sociological phenomena which characterize its evolution. We report on experiments on local and global properties of the web graph using two Altavista crawls each with over 200M pages and 1.5 billion links. Our study indicates that the macroscopic structure of the web is considerably more intricate than suggested by earlier experiments on a smaller scale.

Hogan, A.; Harth, A.; Umbrich, J.; Kinsella, S.; Polleres, A.; Decker, S.: Searching and browsing Linked Data with SWSE : the Semantic Web Search Engine (2011) 0.03
```
0.026546149 = product of:
  0.053092297 = sum of:
    0.053092297 = product of:
      0.106184594 = sum of:
        0.106184594 = weight(_text_:web in 438) [ClassicSimilarity], result of:
          0.106184594 = score(doc=438,freq=24.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.6245262 = fieldWeight in 438, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=438)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

In this paper, we discuss the architecture and implementation of the Semantic Web Search Engine (SWSE). Following traditional search engine architecture, SWSE consists of crawling, data enhancing, indexing and a user interface for search, browsing and retrieval of information; unlike traditional search engines, SWSE operates over RDF Web data - loosely also known as Linked Data - which implies unique challenges for the system design, architecture, algorithms, implementation and user interface. In particular, many challenges exist in adopting Semantic Web technologies for Web data: the unique challenges of the Web - in terms of scale, unreliability, inconsistency and noise - are largely overlooked by the current Semantic Web standards. Herein, we describe the current SWSE system, initially detailing the architecture and later elaborating upon the function, design, implementation and performance of each individual component. In so doing, we also give an insight into how current Semantic Web standards can be tailored, in a best-effort manner, for use on Web data. Throughout, we offer evaluation and complementary argumentation to support our design choices, and also offer discussion on future directions and open research questions. Later, we also provide candid discussion relating to the difficulties currently faced in bringing such a search engine into the mainstream, and lessons learnt from roughly six years working on the Semantic Web Search Engine project.

Object

Semantic Web Search Engine

Theme

Semantic Web

Koch, T.: Searching the Web : systematic overview over indexes (1995) 0.03

```
0.026009807 = product of:
  0.052019615 = sum of:
    0.052019615 = product of:
      0.10403923 = sum of:
        0.10403923 = weight(_text_:web in 3169) [ClassicSimilarity], result of:
          0.10403923 = score(doc=3169,freq=4.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.6119082 = fieldWeight in 3169, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.09375 = fieldNorm(doc=3169)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Object: Nordic Web Index

Radhakrishnan, A.: Swoogle : an engine for the Semantic Web (2007) 0.03
```
0.025276989 = product of:
  0.050553977 = sum of:
    0.050553977 = product of:
      0.101107955 = sum of:
        0.101107955 = weight(_text_:web in 4709) [ClassicSimilarity], result of:
          0.101107955 = score(doc=4709,freq=34.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.59466785 = fieldWeight in 4709, product of:
              5.8309517 = tf(freq=34.0), with freq of:
                34.0 = termFreq=34.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=4709)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Content

"Swoogle, the Semantic web search engine, is a research project carried out by the ebiquity research group in the Computer Science and Electrical Engineering Department at the University of Maryland. It's an engine tailored towards finding documents on the semantic web. The whole research paper is available here. Semantic web is touted as the next generation of online content representation where the web documents are represented in a language that is not only easy for humans but is machine readable (easing the integration of data as never thought possible) as well. And the main elements of the semantic web include data model description formats such as Resource Description Framework (RDF), a variety of data interchange formats (e.g. RDF/XML, Turtle, N-Triples), and notations such as RDF Schema (RDFS), the Web Ontology Language (OWL), all of which are intended to provide a formal description of concepts, terms, and relationships within a given knowledge domain (Wikipedia). And Swoogle is an attempt to mine and index this new set of web documents. The engine performs crawling of semantic documents like most web search engines and the search is available as web service too. The engine is primarily written in Java with the PHP used for the front-end and MySQL for database. Swoogle is capable of searching over 10,000 ontologies and indexes more than 1.3 million web documents. It also computes the importance of a Semantic Web document. The techniques used for indexing are the more google-type page ranking and also mining the documents for inter-relationships that are the basis for the semantic web. For more information on how the RDF framework can be used to relate documents, read the link here. Being a research project, and with a non-commercial motive, there is not much hype around Swoogle. However, the approach to indexing of Semantic web documents is an approach that most engines will have to take at some point of time.
When the Internet debuted, there were no specific engines available for indexing or searching. The Search domain only picked up as more and more content became available. One fundamental question that I've always wondered about is - provided that the search engines return very relevant results for a query - how to ascertain that the documents are indeed the most relevant ones available. There is always an inherent delay in indexing of documents. It's here that the new semantic documents search engines can close the delay. Experimenting with the concept of Search in the semantic web can only bode well for the future of search technology."

Source

http://www.searchenginejournal.com/swoogle-an-engine-for-the-semantic-web/5469/

Theme

Semantic Web

Dunning, A.: Do we still need search engines? (1999) 0.02

```
0.024705233 = product of:
  0.049410466 = sum of:
    0.049410466 = product of:
      0.09882093 = sum of:
        0.09882093 = weight(_text_:22 in 6021) [ClassicSimilarity], result of:
          0.09882093 = score(doc=6021,freq=2.0), product of:
            0.18244034 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052098576 = queryNorm
            0.5416616 = fieldWeight in 6021, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6021)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Source: Ariadne. 1999, no.22

Spink, A.; Gunar, O.: E-Commerce Web queries : Excite and AskJeeves study (2001) 0.02

```
0.024522282 = product of:
  0.049044564 = sum of:
    0.049044564 = product of:
      0.09808913 = sum of:
        0.09808913 = weight(_text_:web in 910) [ClassicSimilarity], result of:
          0.09808913 = score(doc=910,freq=2.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.5769126 = fieldWeight in 910, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.125 = fieldNorm(doc=910)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Sullivan, D.: How search engines rank web pages (1998) 0.02

```
0.024522282 = product of:
  0.049044564 = sum of:
    0.049044564 = product of:
      0.09808913 = sum of:
        0.09808913 = weight(_text_:web in 5808) [ClassicSimilarity], result of:
          0.09808913 = score(doc=5808,freq=2.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.5769126 = fieldWeight in 5808, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.125 = fieldNorm(doc=5808)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Barlow, L.: ¬The spider's apprentice : how to use Web search engines (1997) 0.02

```
0.024522282 = product of:
  0.049044564 = sum of:
    0.049044564 = product of:
      0.09808913 = sum of:
        0.09808913 = weight(_text_:web in 7534) [ClassicSimilarity], result of:
          0.09808913 = score(doc=7534,freq=2.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.5769126 = fieldWeight in 7534, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.125 = fieldNorm(doc=7534)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Gerhart, S.L.: Do Web search engines suppress controversy? : Simulating the exchange process (2004) 0.02

```
0.024522282 = product of:
  0.049044564 = sum of:
    0.049044564 = product of:
      0.09808913 = sum of:
        0.09808913 = weight(_text_:web in 8164) [ClassicSimilarity], result of:
          0.09808913 = score(doc=8164,freq=2.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.5769126 = fieldWeight in 8164, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.125 = fieldNorm(doc=8164)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Ding, L.; Finin, T.; Joshi, A.; Peng, Y.; Cost, R.S.; Sachs, J.; Pan, R.; Reddivari, P.; Doshi, V.: Swoogle : a Semantic Web search and metadata engine (2004) 0.02
```
0.02432995 = product of:
  0.0486599 = sum of:
    0.0486599 = product of:
      0.0973198 = sum of:
        0.0973198 = weight(_text_:web in 4704) [ClassicSimilarity], result of:
          0.0973198 = score(doc=4704,freq=14.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.57238775 = fieldWeight in 4704, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4704)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Swoogle is a crawler-based indexing and retrieval system for the Semantic Web, i.e., for Web documents in RDF or OWL. It extracts metadata for each discovered document, and computes relations between documents. Discovered documents are also indexed by an information retrieval system which can use either character N-Gram or URIrefs as keywords to find relevant documents and to compute the similarity among a set of documents. One of the interesting properties we compute is rank, a measure of the importance of a Semantic Web document.

Content

See: http://www.dblab.ntua.gr/~bikakis/LD/5.pdf See also: http://swoogle.umbc.edu/. See also: http://ebiquity.umbc.edu/paper/html/id/183/. See also: Radhakrishnan, A.: Swoogle : An Engine for the Semantic Web at: http://www.searchenginejournal.com/swoogle-an-engine-for-the-semantic-web/5469/.

Theme

Semantic Web

Li, Z.: ¬A domain specific search engine with explicit document relations (2013) 0.02
```
0.022989638 = product of:
  0.045979276 = sum of:
    0.045979276 = product of:
      0.09195855 = sum of:
        0.09195855 = weight(_text_:web in 1210) [ClassicSimilarity], result of:
          0.09195855 = score(doc=1210,freq=18.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.5408555 = fieldWeight in 1210, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1210)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The current web consists of documents that are highly heterogeneous and hard for machines to understand. The Semantic Web is a progressive movement of the World Wide Web, aiming at converting the current web of unstructured documents to the web of data. In the Semantic Web, web documents are annotated with metadata using a standardized ontology language. These annotated documents are directly processable by machines, which greatly improves their usability and usefulness. In Ericsson, similar problems occur. There are massive documents being created with well-defined structures. Though these documents are about domain specific knowledge and can have rich relations, they are currently managed by a traditional search engine, which ignores the rich domain specific information and presents few data to users. Motivated by the Semantic Web, we aim to find standard ways to process these documents, extract rich domain specific information and annotate these data to documents with formal markup languages. We propose this project to develop a domain specific search engine for processing different documents and building explicit relations for them. This research project consists of three main focuses: examining different domain specific documents and finding ways to extract their metadata; integrating a text search engine with an ontology server; exploring novel ways to build relations for documents. We implement this system and demonstrate its functions. As a prototype, the system provides required features and will be extended in the future.

Theme

Semantic Web

Dambeck, H.: Wie Google mit Milliarden Unbekannten rechnet : Teil 2: Ausgerechnet: Der Page Rank für ein Mini-Web aus drei Seiten (2009) 0.02

```
0.02167484 = product of:
  0.04334968 = sum of:
    0.04334968 = product of:
      0.08669936 = sum of:
        0.08669936 = weight(_text_:web in 3080) [ClassicSimilarity], result of:
          0.08669936 = score(doc=3080,freq=4.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.5099235 = fieldWeight in 3080, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.078125 = fieldNorm(doc=3080)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Abstract: A simple example of a mini-internet consisting of three web pages illustrates how this ranking system works in practice.

Bradley, P.: ¬The relevance of underpants to searching the Web (2000) 0.02

```
0.021456998 = product of:
  0.042913996 = sum of:
    0.042913996 = product of:
      0.08582799 = sum of:
        0.08582799 = weight(_text_:web in 3961) [ClassicSimilarity], result of:
          0.08582799 = score(doc=3961,freq=2.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.50479853 = fieldWeight in 3961, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.109375 = fieldNorm(doc=3961)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Page, L.; Brin, S.; Motwani, R.; Winograd, T.: ¬The PageRank citation ranking : Bringing order to the Web (1999) 0.02

```
0.021456998 = product of:
  0.042913996 = sum of:
    0.042913996 = product of:
      0.08582799 = sum of:
        0.08582799 = weight(_text_:web in 496) [ClassicSimilarity], result of:
          0.08582799 = score(doc=496,freq=2.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.50479853 = fieldWeight in 496, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.109375 = fieldNorm(doc=496)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Birmingham, J.: Internet search engines (1996) 0.02

```
0.021175914 = product of:
  0.042351827 = sum of:
    0.042351827 = product of:
      0.084703654 = sum of:
        0.084703654 = weight(_text_:22 in 5664) [ClassicSimilarity], result of:
          0.084703654 = score(doc=5664,freq=2.0), product of:
            0.18244034 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.052098576 = queryNorm
            0.46428138 = fieldWeight in 5664, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=5664)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 10.11.1996 16:36:22

Entlich, R.: FAQ: Image Search Engines (2001) 0.02
```
0.02056256 = product of:
  0.04112512 = sum of:
    0.04112512 = product of:
      0.08225024 = sum of:
        0.08225024 = weight(_text_:web in 155) [ClassicSimilarity], result of:
          0.08225024 = score(doc=155,freq=10.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.48375595 = fieldWeight in 155, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=155)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Everyone loves images. The web wasn't anything until images came along, then it was an overnight success. So how does one find a specific image on the web? By using one of a burgeoning number of image-focused search engines. These search engines are simply optimized versions of typical web indexes, with crawlers that go around sucking down web content and indexing it. But with image search engines, they focus on images only, and the web page text that may describe them. As information professionals, we know that this is a clumsy approach at best, but as the author puts it, until more sophisticated methods become available, the tools profiled here will "have to suffice." Seven search engines are thoroughly tested in this review article, with Google's Image Search (http://www.google.com/imghp?hl=en) being the highest rated.

Brin, S.; Page, L.: ¬The anatomy of a large-scale hypertextual Web search engine (1998) 0.02
```
0.020274958 = product of:
  0.040549915 = sum of:
    0.040549915 = product of:
      0.08109983 = sum of:
        0.08109983 = weight(_text_:web in 947) [ClassicSimilarity], result of:
          0.08109983 = score(doc=947,freq=14.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.47698978 = fieldWeight in 947, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=947)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.

Leighton, H.V.: Performance of four World Wide Web (WWW) index services : Infoseek, Lycos, WebCrawler and WWWWorm (1995) 0.02

```
0.01839171 = product of:
  0.03678342 = sum of:
    0.03678342 = product of:
      0.07356684 = sum of:
        0.07356684 = weight(_text_:web in 3168) [ClassicSimilarity], result of:
          0.07356684 = score(doc=3168,freq=2.0), product of:
            0.17002425 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.052098576 = queryNorm
            0.43268442 = fieldWeight in 3168, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.09375 = fieldNorm(doc=3168)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
