Search (46 results, page 1 of 3)

  • theme_ss:"Retrievalalgorithmen"
  1. Burgin, R.: The retrieval effectiveness of 5 clustering algorithms as a function of indexing exhaustivity (1995) 0.09
    0.08518513 = product of:
      0.21296284 = sum of:
        0.17849004 = weight(_text_:link in 3365) [ClassicSimilarity], result of:
          0.17849004 = score(doc=3365,freq=10.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.65823555 = fieldWeight in 3365, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3365)
        0.03447279 = weight(_text_:22 in 3365) [ClassicSimilarity], result of:
          0.03447279 = score(doc=3365,freq=2.0), product of:
            0.17819946 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05088753 = queryNorm
            0.19345059 = fieldWeight in 3365, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3365)
      0.4 = coord(2/5)
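The explain tree above is Lucene ClassicSimilarity output, so the final 0.08518513 can be reproduced from the standard tf-idf components: tf = sqrt(termFreq), idf = ln(maxDocs/(docFreq+1)) + 1, queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and coord = matching terms / query terms. A minimal sketch recomputing result 1's score from the values reported above:

```python
import math

# Components reported in the explain tree for doc 3365.
tf_link = math.sqrt(10.0)                   # 3.1622777 = sqrt(termFreq)
idf_link = math.log(44218 / (582 + 1)) + 1  # 5.3287
tf_22 = math.sqrt(2.0)                      # 1.4142135
idf_22 = math.log(44218 / (3622 + 1)) + 1   # 3.5018296
query_norm = 0.05088753                     # shared query normalizer
field_norm = 0.0390625                      # encoded 1/sqrt(field length)

def term_score(tf, idf):
    query_weight = idf * query_norm         # e.g. 0.2711644 for "link"
    field_weight = tf * idf * field_norm    # e.g. 0.65823555 for "link"
    return query_weight * field_weight

total = term_score(tf_link, idf_link) + term_score(tf_22, idf_22)
score = total * (2 / 5)  # coord(2/5): 2 of the 5 query terms match
print(score)             # ≈ 0.085185, matching the reported 0.08518513
```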
    
    Abstract
    The retrieval effectiveness of 5 hierarchical clustering methods (single link, complete link, group average, Ward's method, and weighted average) is examined as a function of indexing exhaustivity with 4 test collections (CR, Cranfield, Medlars, and Time). Evaluations of retrieval effectiveness, based on 3 measures of optimal retrieval performance, confirm earlier findings that the performance of a retrieval system based on single link clustering varies as a function of indexing exhaustivity, but fail to find similar patterns for the other clustering methods. The data also confirm earlier findings regarding the poor performance of single link clustering in a retrieval environment. The poor performance of single link clustering appears to derive from that method's tendency to produce a small number of large, ill-defined document clusters. By contrast, the data examined here found the retrieval performance of the other clustering methods to be generally comparable. The data presented also provide an opportunity to examine the theoretical limits of cluster-based retrieval and to compare these theoretical limits to the effectiveness of operational implementations. Performance standards of the 4 document collections examined were found to vary widely, and the effectiveness of operational implementations was found to be in the range defined as unacceptable. Further improvements in search strategies and document representations warrant investigation. (The chaining behaviour of single link clustering is sketched in code after this entry.)
    Date
    22. 2.1996 11:20:06
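The chaining referred to in the abstract above is easy to exhibit in code: single-link clustering merges two clusters as soon as any one cross-pair of documents is sufficiently similar, which is what yields a few large, ill-defined clusters. A minimal threshold-graph sketch via union-find; `sim` stands for any document similarity function (an assumption for illustration, not Burgin's experimental setup):

```python
def single_link_clusters(docs, sim, threshold):
    """Single-link clustering as connected components of a threshold
    graph: clusters merge when ANY cross-pair is similar enough."""
    parent = {d: d for d in docs}

    def find(x):  # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(docs):
        for b in docs[i + 1:]:
            if sim(a, b) >= threshold:  # one similar pair links two clusters
                parent[find(a)] = find(b)

    clusters = {}
    for d in docs:
        clusters.setdefault(find(d), []).append(d)
    return list(clusters.values())
```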
  2. Jones, K.: Linguistic searching versus relevance ranking : DR-LINK and TARGET (1999) 0.06
    0.06321672 = product of:
      0.3160836 = sum of:
        0.3160836 = weight(_text_:link in 6423) [ClassicSimilarity], result of:
          0.3160836 = score(doc=6423,freq=4.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            1.1656531 = fieldWeight in 6423, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.109375 = fieldNorm(doc=6423)
      0.2 = coord(1/5)
    
    Object
    DR-LINK
  3. Liddy, E.D.; Diamond, T.; McKenna, M.: DR-LINK in TIPSTER (2000) 0.05
    0.051086824 = product of:
      0.25543413 = sum of:
        0.25543413 = weight(_text_:link in 3907) [ClassicSimilarity], result of:
          0.25543413 = score(doc=3907,freq=2.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.94198996 = fieldWeight in 3907, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.125 = fieldNorm(doc=3907)
      0.2 = coord(1/5)
    
  4. Kleinberg, J.M.: Authoritative sources in a hyperlinked environment (1998) 0.04
    0.03831512 = product of:
      0.1915756 = sum of:
        0.1915756 = weight(_text_:link in 5) [ClassicSimilarity], result of:
          0.1915756 = score(doc=5,freq=8.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.7064925 = fieldWeight in 5, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.046875 = fieldNorm(doc=5)
      0.2 = coord(1/5)
    
    Abstract
    The network structure of a hyperlinked environment can be a rich source of information about the content of the environment, provided we have effective means for understanding it. We develop a set of algorithmic tools for extracting information from the link structures of such environments, and report on experiments that demonstrate their effectiveness in a variety of contexts on the World Wide Web. The central issue we address within our framework is the distillation of broad search topics, through the discovery of "authoritative" information sources on such topics. We propose and test an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of "hub pages" that join them together in the link structure. Our formulation has connections to the eigenvectors of certain matrices associated with the link graph; these connections in turn motivate additional heuristics for link-based analysis.
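The mutual reinforcement described in this abstract fits in a few lines. The following is an illustrative sketch of the HITS iteration, not Kleinberg's code; `graph` maps each page to the pages it links to:

```python
import math

def hits(graph, iters=50):
    """Hub/authority scores via mutual reinforcement.
    Returns (hubs, authorities) as dicts over all pages."""
    pages = set(graph) | {q for outs in graph.values() for q in outs}
    hubs = {p: 1.0 for p in pages}
    auths = {p: 1.0 for p in pages}
    for _ in range(iters):
        auths = {p: 0.0 for p in pages}
        for p, outs in graph.items():  # authority <- sum of linking hubs
            for q in outs:
                auths[q] += hubs[p]
        # hub <- sum of authorities it points to
        hubs = {p: sum(auths[q] for q in graph.get(p, ())) for p in pages}
        for vec in (hubs, auths):      # L2-normalize so the iteration converges
            norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
            for p in vec:
                vec[p] /= norm
    return hubs, auths
```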
  5. Lempel, R.; Moran, S.: SALSA: the stochastic approach for link-structure analysis (2001) 0.04
    0.035698008 = product of:
      0.17849004 = sum of:
        0.17849004 = weight(_text_:link in 10) [ClassicSimilarity], result of:
          0.17849004 = score(doc=10,freq=10.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.65823555 = fieldWeight in 10, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.0390625 = fieldNorm(doc=10)
      0.2 = coord(1/5)
    
    Abstract
    Today, when searching for information on the WWW, one usually performs a query through a term-based search engine. These engines return, as the query's result, a list of Web pages whose contents match the query. For broad-topic queries, such searches often result in a huge set of retrieved documents, many of which are irrelevant to the user. However, much information is contained in the link-structure of the WWW. Information such as which pages are linked to others can be used to augment search algorithms. In this context, Jon Kleinberg introduced the notion of two distinct types of Web pages: hubs and authorities. Kleinberg argued that hubs and authorities exhibit a mutually reinforcing relationship: a good hub will point to many authorities, and a good authority will be pointed at by many hubs. In light of this, he devised an algorithm aimed at finding authoritative pages. We present SALSA, a new stochastic approach for link-structure analysis, which examines random walks on graphs derived from the link-structure. We show that both SALSA and Kleinberg's Mutual Reinforcement approach employ the same meta-algorithm. We then prove that SALSA is equivalent to a weighted in-degree analysis of the link-structure of WWW subgraphs, making it computationally more efficient than the Mutual Reinforcement approach. We compare the results of applying SALSA to the results derived through Kleinberg's approach. These comparisons reveal a topological phenomenon called the TKC effect which, in certain cases, prevents the Mutual Reinforcement approach from identifying meaningful authorities.
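The equivalence proved above makes the authority side of SALSA trivial to sketch: authority weights reduce to in-degree normalized by the total link count. A sketch of that equivalence result, not the authors' code, and it glosses over the per-component bookkeeping of the full theorem:

```python
from collections import Counter

def salsa_authorities(out_links):
    """SALSA authority weights via the weighted in-degree equivalence:
    the stationary distribution of the authority-side random walk is
    proportional to each page's in-degree."""
    indeg = Counter(q for outs in out_links.values() for q in outs)
    total = sum(indeg.values())
    return {q: d / total for q, d in indeg.items()}
```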
  6. Brenner, E.H.: Beyond Boolean : new approaches in information retrieval; the quest for intuitive online search systems past, present & future (1995) 0.03
    0.03160836 = product of:
      0.1580418 = sum of:
        0.1580418 = weight(_text_:link in 2547) [ClassicSimilarity], result of:
          0.1580418 = score(doc=2547,freq=4.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.58282655 = fieldWeight in 2547, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2547)
      0.2 = coord(1/5)
    
    Content
    (1) The Boolean world; (2) The Non-Boolean picture; (3) The commercial search engines: Personal Librarian, CLARIT, ConQuest, DR-LINK, InQuizit, InTEXT, TOPIC, WIN, TARGET, FREESTYLE, InfoSeek; (4) Reproduction of 8 articles from 'Monitor'
    Object
    DR-LINK
  7. Weiß, B.: Verwandte Seiten finden : "Ähnliche Seiten" oder "What's Related" (2005) 0.03
    0.027651558 = product of:
      0.13825779 = sum of:
        0.13825779 = weight(_text_:link in 868) [ClassicSimilarity], result of:
          0.13825779 = score(doc=868,freq=6.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.5098671 = fieldWeight in 868, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.0390625 = fieldNorm(doc=868)
      0.2 = coord(1/5)
    
    Abstract
    Link structure analysis (LSA) is one of the most important analysis techniques not only in crawling, web page ranking, the delimitation of geographic regions, the prediction of link usage, the discovery of mirror sites, the categorization of web pages, and the generation of web page statistics, but also in the search for related pages. According to the prevailing view, it forms the main component in identifying similar pages within topic-specific graphs of interlinked documents when the goal is to find related pages of high quality. Two assumptions are always made: links between two documents imply that the contents of both documents are related, and if the documents come from different sources (different authors, hosts, domains, ...), then one source is recommending the other via a link. Building on this idea, Kleinberg developed the HITS algorithm in 1998 to determine related pages through link structure analysis. This approach was refined by Bharat and Henzinger and later pursued further in algorithms such as the Companion and Cocitation algorithms, which search for related pages based on a single query URL. This seminar paper explains the algorithms behind these ideas and then presents more recent research approaches in this field.
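Of the algorithms named above, co-citation is the simplest to sketch: two pages count as related when many of the same parents link to both. An illustrative sketch; the function and argument names are mine, not from the Companion/Cocitation papers:

```python
from collections import Counter

def cocitation_related(out_links, url, k=10):
    """Related pages by co-citation: pages linked alongside `url`
    by many of the same parent pages."""
    parents = [p for p, outs in out_links.items() if url in outs]
    siblings = Counter()
    for p in parents:
        for child in out_links[p]:
            if child != url:         # count co-cited siblings, not url itself
                siblings[child] += 1
    return siblings.most_common(k)   # (page, shared-parent count) pairs
```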
  8. Henzinger, M.R.: Link analysis in Web information retrieval (2000) 0.03
    0.025543412 = product of:
      0.12771706 = sum of:
        0.12771706 = weight(_text_:link in 801) [ClassicSimilarity], result of:
          0.12771706 = score(doc=801,freq=8.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.47099498 = fieldWeight in 801, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.03125 = fieldNorm(doc=801)
      0.2 = coord(1/5)
    
    Abstract
    The analysis of the hyperlink structure of the web has led to significant improvements in web information retrieval. This survey describes two successful link analysis algorithms and the state of the art of the field.
    Content
    The goal of information retrieval is to find all documents relevant for a user query in a collection of documents. Decades of research in information retrieval were successful in developing and refining techniques that are solely word-based (see, e.g., [2]). With the advent of the web, new sources of information became available, one of them being the hyperlinks between documents and records of user behavior. To be precise, hypertexts (i.e., collections of documents connected by hyperlinks) have existed and have been studied for a long time. What was new was the large number of hyperlinks created by independent individuals. Hyperlinks provide a valuable source of information for web information retrieval, as we will show in this article. This area of information retrieval is commonly called link analysis. Why would one expect hyperlinks to be useful? A hyperlink is a reference to a web page B that is contained in a web page A. When the hyperlink is clicked on in a web browser, the browser displays page B. This functionality alone is not helpful for web information retrieval. However, the way hyperlinks are typically used by authors of web pages can give them valuable information content. Typically, authors create links because they think they will be useful for the readers of the pages. Thus, links are usually either navigational aids that, for example, bring the reader back to the homepage of the site, or links that point to pages whose content augments the content of the current page. The second kind of link tends to point to high-quality pages that might be on the same topic as the page containing the link.
  9. Liddy, E.D.; Paik, W.; McKenna, M.; Yu, E.S.: A natural language text retrieval system with relevance feedback (1995) 0.02
    0.022350488 = product of:
      0.111752436 = sum of:
        0.111752436 = weight(_text_:link in 3131) [ClassicSimilarity], result of:
          0.111752436 = score(doc=3131,freq=2.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.4121206 = fieldWeight in 3131, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3131)
      0.2 = coord(1/5)
    
    Abstract
    Outlines a fully integrated retrieval engine that processes documents and queries at the multiple, complex linguistic levels that humans use to construe meaning. Currently undergoing beta site trials, the DR-LINK natural language text retrieval system allows searchers to state queries as fully formed, natural sentences. The meaning and matching of both queries and documents is accomplished at the conceptual level of human expression, not by the simple concurrence of keywords. Furthermore, the natural browsing behaviour of information searchers is accommodated by allowing documents identified as potentially relevant by the explicit semantics of the system to be used as relevance feedback queries which provide an appropriate implicit semantic representation of the information seeker's need
  10. Bidoki, A.M.Z.; Yazdani, N.: An intelligent ranking algorithm for web pages : DistanceRank (2008) 0.02
    0.022350488 = product of:
      0.111752436 = sum of:
        0.111752436 = weight(_text_:link in 2068) [ClassicSimilarity], result of:
          0.111752436 = score(doc=2068,freq=2.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.4121206 = fieldWeight in 2068, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2068)
      0.2 = coord(1/5)
    
    Abstract
    A fast and efficient page ranking mechanism for web crawling and retrieval remains a challenging issue. Recently, several link-based ranking algorithms like PageRank, HITS and OPIC have been proposed. In this paper, we propose a novel recursive method based on reinforcement learning, called "DistanceRank", which considers the distance between pages as punishment in order to compute the ranks of web pages. The distance is defined as the number of "average clicks" between two pages. The objective is to minimize punishment, or distance, so that a page with a smaller distance receives a higher rank. Experimental results indicate that DistanceRank outperforms other ranking algorithms in page ranking and crawl scheduling. Furthermore, the complexity of DistanceRank is low. We have used the University of California at Berkeley's web for our experiments.
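As a rough intuition only: if the paper's learned "average click" distance is replaced with plain click counts (a deliberate simplification; the actual DistanceRank recursion is a reinforcement-learning update over average clicks), ranking by distance looks like multi-source BFS over the link graph:

```python
from collections import deque

def click_distances(out_links, seeds):
    """Fewest clicks from any seed page to each reachable page.
    Simplified stand-in for DistanceRank's distance: every click
    costs 1 regardless of out-degree, and nothing is learned."""
    dist = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        p = queue.popleft()
        for q in out_links.get(p, ()):
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)
    return dist  # rank pages by ascending distance: closer => higher rank
```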
  11. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 0.02
    0.022121247 = product of:
      0.11060623 = sum of:
        0.11060623 = weight(_text_:link in 7) [ClassicSimilarity], result of:
          0.11060623 = score(doc=7,freq=6.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.40789366 = fieldWeight in 7, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
      0.2 = coord(1/5)
    
    Abstract
    The second edition of Understanding Search Engines: Mathematical Modeling and Text Retrieval follows the basic premise of the first edition by discussing many of the key design issues for building search engines and emphasizing the important role that applied mathematics can play in improving information retrieval. The authors discuss important data structures, algorithms, and software, as well as user-centered issues such as interfaces, manual indexing, and document preparation. Significant changes bring the text up to date on current information retrieval methods: for example, the addition of a new chapter on link-structure algorithms used in search engines such as Google. The chapter on user interfaces has been rewritten to specifically focus on search engine usability. In addition, the authors have added new recommendations for further reading, expanded the bibliography, and updated and streamlined the index to make it more reader-friendly.
    Content
    Contents: (1) Introduction: Document File Preparation - Manual Indexing - Information Extraction - Vector Space Modeling - Matrix Decompositions - Query Representations - Ranking and Relevance Feedback - Searching by Link Structure - User Interface - Book Format; (2) Document File Preparation: Document Purification and Analysis - Text Formatting - Validation - Manual Indexing - Automatic Indexing - Item Normalization - Inverted File Structures - Document File - Dictionary List - Inversion List - Other File Structures; (3) Vector Space Models: Construction - Term-by-Document Matrices - Simple Query Matching - Design Issues - Term Weighting - Sparse Matrix Storage - Low-Rank Approximations; (4) Matrix Decompositions: QR Factorization - Singular Value Decomposition - Low-Rank Approximations - Query Matching - Software - Semidiscrete Decomposition - Updating Techniques; (5) Query Management: Query Binding - Types of Queries - Boolean Queries - Natural Language Queries - Thesaurus Queries - Fuzzy Queries - Term Searches - Probabilistic Queries; (6) Ranking and Relevance Feedback: Performance Evaluation - Precision - Recall - Average Precision - Genetic Algorithms - Relevance Feedback; (7) Searching by Link Structure: HITS Method - HITS Implementation - HITS Summary - PageRank Method - PageRank Adjustments - PageRank Implementation - PageRank Summary; (8) User Interface Considerations: General Guidelines - Search Engine Interfaces - Form Fill-in - Display Considerations - Progress Indication - No Penalties for Error - Results - Test and Retest - Final Considerations; (9) Further Reading
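Chapters (3) and (5) above hinge on two operations: tf-idf term weighting over a term-by-document matrix, and cosine similarity for simple query matching. A toy sketch under common conventions (log idf weighting is one standard choice, not necessarily the book's):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Term-by-document matrix with tf-idf weighting; each document
    is a list of tokens, each vector a {term: weight} dict."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    idf = {t: math.log(n / df[t]) for t in df}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]

def cosine(u, v):
    """Cosine similarity for simple query matching."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```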
  12. Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.02
    0.022062585 = product of:
      0.110312924 = sum of:
        0.110312924 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.110312924 = score(doc=402,freq=2.0), product of:
            0.17819946 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05088753 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.2 = coord(1/5)
    
    Source
    Information processing and management. 22(1986) no.6, S.465-476
  13. Smeaton, A.F.; Rijsbergen, C.J. van: The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.02
    0.019304762 = product of:
      0.09652381 = sum of:
        0.09652381 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
          0.09652381 = score(doc=2134,freq=2.0), product of:
            0.17819946 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05088753 = queryNorm
            0.5416616 = fieldWeight in 2134, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=2134)
      0.2 = coord(1/5)
    
    Date
    30. 3.2001 13:32:22
  14. Back, J.: An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.02
    0.019304762 = product of:
      0.09652381 = sum of:
        0.09652381 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
          0.09652381 = score(doc=3445,freq=2.0), product of:
            0.17819946 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05088753 = queryNorm
            0.5416616 = fieldWeight in 3445, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=3445)
      0.2 = coord(1/5)
    
    Date
    25. 8.2005 17:42:22
  15. Nie, J.-Y.: Query expansion and query translation as logical inference (2003) 0.02
    0.01915756 = product of:
      0.0957878 = sum of:
        0.0957878 = weight(_text_:link in 1425) [ClassicSimilarity], result of:
          0.0957878 = score(doc=1425,freq=2.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.35324624 = fieldWeight in 1425, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.046875 = fieldNorm(doc=1425)
      0.2 = coord(1/5)
    
    Abstract
    A number of studies have examined the problems of query expansion in monolingual Information Retrieval (IR), and query translation for cross-language IR. However, no link has been made between them. This article first shows that query translation is a special case of query expansion. There is also another set of studies on inferential IR. Again, there is no relationship established with query translation or query expansion. The second claim of this article is that logical inference is a general form that covers query expansion and query translation. This analysis provides a unified view of different subareas of IR. We further develop the inferential IR approach in two particular contexts: using fuzzy logic and probability theory. The evaluation formulas obtained are shown to strongly correspond to those used in other IR models. This indicates that inference is indeed the core of advanced IR.
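The claim that query translation is a special case of query expansion can be made concrete: both map query terms through a weighted term-term relation, differing only in whether that relation is a thesaurus or a bilingual dictionary. A hypothetical sketch, not the article's formalism (which is logical inference):

```python
def expand(query_terms, relation):
    """Map each query term through a weighted term-term relation,
    keeping the strongest weight per target term."""
    out = {}
    for t in query_terms:
        for u, w in relation.get(t, {}).items():
            out[u] = max(out.get(u, 0.0), w)
    return out
```

With relation as a bilingual dictionary, e.g. {"computer": {"ordinateur": 0.9}}, the same call performs translation; with a thesaurus, e.g. {"computer": {"pc": 0.7}}, it performs expansion.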
  16. Dominich, S.; Skrop, A.: PageRank and interaction information retrieval (2005) 0.02
    0.01915756 = product of:
      0.0957878 = sum of:
        0.0957878 = weight(_text_:link in 3268) [ClassicSimilarity], result of:
          0.0957878 = score(doc=3268,freq=2.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.35324624 = fieldWeight in 3268, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.046875 = fieldNorm(doc=3268)
      0.2 = coord(1/5)
    
    Abstract
    The PageRank method is used by the Google Web search engine to compute the importance of Web pages. Two different views have been developed for the interpretation of the PageRank method and values: (a) stochastic (random surfer): the PageRank values can be conceived as the steady-state distribution of a Markov chain, and (b) algebraic: the PageRank values form the eigenvector corresponding to eigenvalue 1 of the Web link matrix. The Interaction Information Retrieval (I²R) method is a nonclassical information retrieval paradigm, which represents a connectionist approach based on dynamic systems. In the present paper, a different interpretation of PageRank is proposed, namely, a dynamic systems viewpoint, by showing that the PageRank method can be formally interpreted as a particular case of the Interaction Information Retrieval method; and thus, the PageRank values may be interpreted as neutral equilibrium points of the Web.
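Both interpretations, (a) and (b), are visible in a plain power-iteration sketch: the update simulates the random surfer, and its fixed point is the dominant eigenvector of the damped link matrix. The damping factor d = 0.85 is an assumption here (the conventional value), not taken from this entry:

```python
def pagerank(out_links, d=0.85, iters=50):
    """Power iteration for PageRank over a dict page -> out-linked pages."""
    pages = set(out_links) | {q for outs in out_links.values() for q in outs}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - d) / n for p in pages}  # teleportation share
        for p in pages:
            outs = out_links.get(p, [])
            if outs:
                share = d * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:  # dangling page: spread its rank uniformly
                for q in pages:
                    new[q] += d * rank[p] / n
        rank = new
    return rank
```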
  17. Thelwall, M.: Can Google's PageRank be used to find the most important academic Web pages? (2003) 0.02
    0.01915756 = product of:
      0.0957878 = sum of:
        0.0957878 = weight(_text_:link in 4457) [ClassicSimilarity], result of:
          0.0957878 = score(doc=4457,freq=2.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.35324624 = fieldWeight in 4457, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.046875 = fieldNorm(doc=4457)
      0.2 = coord(1/5)
    
    Abstract
    Google's PageRank is an influential algorithm that uses a model of Web use that is dominated by its link structure in order to rank pages by their estimated value to the Web community. This paper reports on the outcome of applying the algorithm to the Web sites of three national university systems in order to test whether it is capable of identifying the most important Web pages. The results are also compared with simple inlink counts. It was discovered that the most highly inlinked pages do not always have the highest PageRank, indicating that the two metrics are genuinely different, even for the top pages. More significantly, however, internal links dominated external links for the high ranks in either method, and superficial reasons accounted for high scores in both cases. It is concluded that PageRank is not useful for identifying the top pages in a site and that it must be combined with powerful text matching techniques in order to get the quality of information retrieval results provided by Google.
  18. Thelwall, M.; Vaughan, L.: New versions of PageRank employing alternative Web document models (2004) 0.02
    0.01915756 = product of:
      0.0957878 = sum of:
        0.0957878 = weight(_text_:link in 674) [ClassicSimilarity], result of:
          0.0957878 = score(doc=674,freq=2.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.35324624 = fieldWeight in 674, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.046875 = fieldNorm(doc=674)
      0.2 = coord(1/5)
    
    Abstract
    Introduces several new versions of PageRank (the link-based Web page ranking algorithm), based on an information science perspective on the concept of the Web document. Although the Web page is the typical indivisible unit of information in search engine results and most Web information retrieval algorithms, other research has suggested that aggregating pages based on directories and domains gives promising alternatives, particularly when Web links are the object of study. The new algorithms introduced based on these alternatives were used to rank four sets of Web pages. The ranking results were compared with human subjects' rankings. The results of the tests were somewhat inconclusive: the new approach worked well for the set that included pages from different Web sites; however, it did not work well in ranking pages from the same site. It seems that the new algorithms may be effective for some tasks but not for others, especially when only low numbers of links are involved or the pages to be ranked are from the same site or directory.
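One of the alternative document models described above, aggregation by domain, amounts to collapsing the page graph before ranking. A hypothetical sketch (the paper's exact aggregation and link-counting rules may differ); the result can be fed to any PageRank implementation, e.g. the sketch under entry 16:

```python
from urllib.parse import urlsplit

def collapse_by_domain(out_links):
    """Aggregate a page-level link graph into a domain-level graph,
    dropping links that become internal to a domain."""
    dom = lambda u: urlsplit(u).netloc
    g = {}
    for p, outs in out_links.items():
        targets = g.setdefault(dom(p), set())
        for q in outs:
            if dom(q) != dom(p):
                targets.add(dom(q))
    return g
```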
  19. Koumenides, C.L.; Shadbolt, N.R.: Ranking methods for entity-oriented semantic web search (2014) 0.02
    0.01915756 = product of:
      0.0957878 = sum of:
        0.0957878 = weight(_text_:link in 1280) [ClassicSimilarity], result of:
          0.0957878 = score(doc=1280,freq=2.0), product of:
            0.2711644 = queryWeight, product of:
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.05088753 = queryNorm
            0.35324624 = fieldWeight in 1280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3287 = idf(docFreq=582, maxDocs=44218)
              0.046875 = fieldNorm(doc=1280)
      0.2 = coord(1/5)
    
    Abstract
    This article provides a technical review of semantic search methods used to support text-based search over formal Semantic Web knowledge bases. Our focus is on ranking methods and auxiliary processes explored by existing semantic search systems, outlined within broad areas of classification. We present reflective examples from the literature in some detail, which should appeal to readers interested in a deeper perspective on the various methods and systems implemented in the outlined literature. The presentation covers graph exploration and propagation methods, adaptations of classic probabilistic retrieval models, and query-independent link analysis via flexible extensions to the PageRank algorithm. Future research directions are discussed, including development of more cohesive retrieval models to unlock further potentials and uses, data indexing schemes, integration with user interfaces, and building community consensus for more systematic evaluation and gradual development.
  20. Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.02
    0.016546939 = product of:
      0.08273469 = sum of:
        0.08273469 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
          0.08273469 = score(doc=58,freq=2.0), product of:
            0.17819946 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05088753 = queryNorm
            0.46428138 = fieldWeight in 58, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=58)
      0.2 = coord(1/5)
    
    Date
    14. 6.2015 22:12:44

Languages

  • e 41
  • d 5

Types

  • a 40
  • m 4
  • r 1
  • s 1
  • x 1