Search (94 results, page 1 of 5)

  • theme_ss:"Retrievalalgorithmen"
  1. Chang, C.-H.; Hsu, C.-C.: Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval (1998) 0.06
    0.061383378 = product of:
      0.092075065 = sum of:
        0.06910593 = weight(_text_:web in 1319) [ClassicSimilarity], result of:
          0.06910593 = score(doc=1319,freq=6.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.43716836 = fieldWeight in 1319, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1319)
        0.022969136 = product of:
          0.045938272 = sum of:
            0.045938272 = weight(_text_:22 in 1319) [ClassicSimilarity], result of:
              0.045938272 = score(doc=1319,freq=2.0), product of:
                0.16961981 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.048437484 = queryNorm
                0.2708308 = fieldWeight in 1319, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1319)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
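    The tree above is Lucene's ClassicSimilarity (TF-IDF) score explanation. Below is a minimal Python sketch that reproduces the arithmetic of this first entry from the values shown (fieldWeight = tf * idf * fieldNorm, queryWeight = idf * queryNorm, clause score = queryWeight * fieldWeight, combined with the coord factors); the variable names are ours.

    import math

    # Values copied from the explain tree above (entry 1, clause _text_:web)
    term_freq  = 6.0
    idf        = 3.2635105          # idf(docFreq=4597, maxDocs=44218)
    query_norm = 0.048437484
    field_norm = 0.0546875

    tf           = math.sqrt(term_freq)           # 2.4494898
    field_weight = tf * idf * field_norm          # 0.43716836
    query_weight = idf * query_norm               # 0.15807624
    score_web    = query_weight * field_weight    # 0.06910593

    # The _text_:22 clause contributes 0.045938272, halved by coord(1/2)
    score_22 = 0.045938272 * 0.5                  # 0.022969136

    # Two of the three query clauses matched, hence coord(2/3)
    total = (score_web + score_22) * (2.0 / 3.0)  # 0.061383378
    print(round(total, 6))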
    
    Abstract
    Keyword-based querying is an immediate and efficient way to specify and retrieve the information a user is looking for. However, conventional document ranking based on an automatic assessment of document relevance to the query may not be the best approach when little information is given. Proposes an idea to integrate two existing techniques, query expansion and relevance feedback, to achieve a concept-based information search for the Web
    Date
    1. 8.1996 22:08:06
    Footnote
    Contribution to a special issue devoted to the Proceedings of the 7th International World Wide Web Conference, held 14-18 April 1998, Brisbane, Australia
  2. Fan, W.; Fox, E.A.; Pathak, P.; Wu, H.: ¬The effects of fitness functions on genetic programming-based ranking discovery for Web search (2004) 0.06
    0.05872331 = product of:
      0.08808496 = sum of:
        0.06839713 = weight(_text_:web in 2239) [ClassicSimilarity], result of:
          0.06839713 = score(doc=2239,freq=8.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.43268442 = fieldWeight in 2239, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=2239)
        0.01968783 = product of:
          0.03937566 = sum of:
            0.03937566 = weight(_text_:22 in 2239) [ClassicSimilarity], result of:
              0.03937566 = score(doc=2239,freq=2.0), product of:
                0.16961981 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.048437484 = queryNorm
                0.23214069 = fieldWeight in 2239, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2239)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Genetic-based evolutionary learning algorithms, such as genetic algorithms (GAs) and genetic programming (GP), have been applied to information retrieval (IR) since the 1980s. Recently, GP has been applied to a new IR task, the discovery of ranking functions for Web search, and has achieved very promising results. However, in our prior research, only one fitness function has been used for GP-based learning. It is unclear how other fitness functions may affect ranking function discovery for Web search, especially since it is well known that choosing a proper fitness function is very important for the effectiveness and efficiency of evolutionary algorithms. In this article, we report our experience in contrasting different fitness function designs on GP-based learning using a very large Web corpus. Our results indicate that the design of fitness functions is instrumental in performance improvement. We also give recommendations on the design of fitness functions for genetic-based information retrieval experiments.
    Date
    31. 5.2004 19:22:06
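    The abstract above turns on how the choice of fitness function steers GP-based discovery of ranking functions. Below is a minimal, hypothetical sketch of the evaluation step only: a candidate ranking function (here an invented weighted combination of term statistics) is scored against relevance judgments with average precision as its fitness. The data, weights, and field names are placeholders, not the authors' setup.

    def average_precision(ranked_ids, relevant):
        """Fitness of one ranking: average precision over the relevant documents."""
        hits, precisions = 0, []
        for rank, doc_id in enumerate(ranked_ids, start=1):
            if doc_id in relevant:
                hits += 1
                precisions.append(hits / rank)
        return sum(precisions) / max(len(relevant), 1)

    def fitness(candidate_fn, queries, judgments):
        """Mean fitness of a candidate ranking function over a set of training queries."""
        scores = []
        for q, docs in queries.items():
            ranked = sorted(docs, key=lambda d: candidate_fn(q, d), reverse=True)
            scores.append(average_precision([d["id"] for d in ranked], judgments[q]))
        return sum(scores) / len(scores)

    # Hypothetical candidate "evolved" by GP: a weighted combination of term statistics
    candidate = lambda q, d: 1.4 * d["tf"] * d["idf"] + 0.3 * d["length_norm"]

    queries = {"q1": [{"id": "d1", "tf": 3, "idf": 1.2, "length_norm": 0.5},
                      {"id": "d2", "tf": 1, "idf": 2.0, "length_norm": 0.9}]}
    judgments = {"q1": {"d2"}}
    print(fitness(candidate, queries, judgments))   # 0.5: the one relevant document sits at rank 2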
  3. Meghabghab, G.: Google's Web page ranking applied to different topological Web graph structures (2001) 0.05
    0.04555849 = product of:
      0.13667546 = sum of:
        0.13667546 = weight(_text_:web in 6028) [ClassicSimilarity], result of:
          0.13667546 = score(doc=6028,freq=46.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.86461735 = fieldWeight in 6028, product of:
              6.78233 = tf(freq=46.0), with freq of:
                46.0 = termFreq=46.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6028)
      0.33333334 = coord(1/3)
    
    Abstract
    This research is part of an ongoing study to better understand web page ranking on the web. It looks at a web page as a graph structure, or web graph, and tries to classify different web graphs in the new coordinate space (out-degree, in-degree). The out-degree coordinate od is defined as the number of outgoing web pages from a given web page. The in-degree coordinate id is the number of web pages that point to a given web page. In this new coordinate space a metric is built to classify how close or far different web graphs are. Google's algorithm for ranking web pages (Brin & Page, 1998) is applied in this new coordinate space. The results of the algorithm were modified to fit different topological web graph structures. The algorithm was also not successful in the case of general web graphs, and new web ranking algorithms have to be considered. This study does not look at enhancing web ranking by adding any contextual information. It only considers web links as a source for web page ranking. The author believes that understanding the underlying web page as a graph will help design better ranking algorithms, enhance retrieval and web performance, and recommends using graphs as part of a visual aid for browsing engine designers
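    The (out-degree, in-degree) coordinate space described above can be computed directly from a link graph. A minimal sketch, with a made-up adjacency list standing in for a crawled web graph:

    def degree_coordinates(graph):
        """Map each page to its (out-degree, in-degree) coordinate."""
        in_deg = {}
        for page, targets in graph.items():
            in_deg.setdefault(page, 0)
            for t in targets:
                in_deg[t] = in_deg.get(t, 0) + 1
        return {p: (len(graph.get(p, [])), in_deg.get(p, 0)) for p in in_deg}

    links = {                      # hypothetical web graph: page -> pages it links to
        "A": ["B", "C"],
        "B": ["C"],
        "C": ["A"],
        "D": ["C"],
    }
    print(degree_coordinates(links))   # e.g. "C": (1, 3), one outgoing link and three incoming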
  4. Ravana, S.D.; Rajagopal, P.; Balakrishnan, V.: Ranking retrieval systems using pseudo relevance judgments (2015) 0.03
    0.034467425 = product of:
      0.051701136 = sum of:
        0.028498804 = weight(_text_:web in 2591) [ClassicSimilarity], result of:
          0.028498804 = score(doc=2591,freq=2.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.18028519 = fieldWeight in 2591, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2591)
        0.023202332 = product of:
          0.046404663 = sum of:
            0.046404663 = weight(_text_:22 in 2591) [ClassicSimilarity], result of:
              0.046404663 = score(doc=2591,freq=4.0), product of:
                0.16961981 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.048437484 = queryNorm
                0.27358043 = fieldWeight in 2591, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2591)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Purpose
    In a system-based approach, replicating the web would require large test collections, and judging the relevance of all documents per topic through human assessors to create relevance judgments is infeasible. Because of the large number of documents that require judgment, human assessors may also introduce errors through disagreements. The paper aims to discuss these issues.
    Design/methodology/approach
    This study explores exponential variation and document ranking methods that generate a reliable set of relevance judgments (pseudo relevance judgments) to reduce human effort. These methods overcome the problem of judging large numbers of documents while avoiding errors from human disagreement during the judgment process. The study uses two key factors to generate the alternate methods: the number of occurrences of each document per topic across all system runs, and document rankings.
    Findings
    The effectiveness of the proposed method is evaluated using the correlation coefficient between systems ranked by mean average precision under the original Text REtrieval Conference (TREC) relevance judgments and under the pseudo relevance judgments. The results suggest that the proposed document ranking method with a pool depth of 100 could be a reliable alternative that reduces the human effort and disagreement errors involved in generating TREC-like relevance judgments.
    Originality/value
    The simple methods proposed in this study improve the correlation coefficient when generating alternate relevance judgments without human assessors, contributing to information retrieval evaluation.
    Date
    20. 1.2015 18:30:22
    18. 9.2018 18:22:56
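    The evaluation sketched in the abstract above, comparing how retrieval systems rank under original TREC judgments versus pseudo relevance judgments, amounts to a rank correlation between two orderings by mean average precision. Below is a minimal sketch using Kendall's tau as one common choice of coefficient (the abstract does not name a specific one); the MAP scores are invented placeholders.

    from itertools import combinations

    def kendall_tau(rank_a, rank_b):
        """Rank correlation between two orderings of the same systems (+1 = identical order)."""
        pos_a = {s: i for i, s in enumerate(rank_a)}
        pos_b = {s: i for i, s in enumerate(rank_b)}
        concordant = discordant = 0
        for x, y in combinations(rank_a, 2):
            sign = (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
        n = len(rank_a)
        return (concordant - discordant) / (n * (n - 1) / 2)

    # Hypothetical mean average precision scores under the two judgment sets
    map_original = {"sysA": 0.31, "sysB": 0.28, "sysC": 0.22, "sysD": 0.19}
    map_pseudo   = {"sysA": 0.27, "sysB": 0.29, "sysC": 0.21, "sysD": 0.18}

    by_map = lambda scores: sorted(scores, key=scores.get, reverse=True)
    print(kendall_tau(by_map(map_original), by_map(map_pseudo)))   # about 0.67: one swapped pair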
  5. Khoo, C.S.G.; Wan, K.-W.: ¬A simple relevancy-ranking strategy for an interface to Boolean OPACs (2004) 0.03
    0.034255266 = product of:
      0.051382896 = sum of:
        0.03989833 = weight(_text_:web in 2509) [ClassicSimilarity], result of:
          0.03989833 = score(doc=2509,freq=8.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.25239927 = fieldWeight in 2509, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.02734375 = fieldNorm(doc=2509)
        0.011484568 = product of:
          0.022969136 = sum of:
            0.022969136 = weight(_text_:22 in 2509) [ClassicSimilarity], result of:
              0.022969136 = score(doc=2509,freq=2.0), product of:
                0.16961981 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.048437484 = queryNorm
                0.1354154 = fieldWeight in 2509, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=2509)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Content
    "Most Web search engines accept natural language queries, perform some kind of fuzzy matching and produce ranked output, displaying first the documents that are most likely to be relevant. On the other hand, most library online public access catalogs (OPACs) an the Web are still Boolean retrieval systems that perform exact matching, and require users to express their search requests precisely in a Boolean search language and to refine their search statements to improve the search results. It is well-documented that users have difficulty searching Boolean OPACs effectively (e.g. Borgman, 1996; Ensor, 1992; Wallace, 1993). One approach to making OPACs easier to use is to develop a natural language search interface that acts as a middleware between the user's Web browser and the OPAC system. The search interface can accept a natural language query from the user and reformulate it as a series of Boolean search statements that are then submitted to the OPAC. The records retrieved by the OPAC are ranked by the search interface before forwarding them to the user's Web browser. The user, then, does not need to interact directly with the Boolean OPAC but with the natural language search interface or search intermediary. The search interface interacts with the OPAC system an the user's behalf. The advantage of this approach is that no modification to the OPAC or library system is required. Furthermore, the search interface can access multiple OPACs, acting as a meta search engine, and integrate search results from various OPACs before sending them to the user. The search interface needs to incorporate a method for converting the user's natural language query into a series of Boolean search statements, and for ranking the OPAC records retrieved. The purpose of this study was to develop a relevancyranking algorithm for a search interface to Boolean OPAC systems. This is part of an on-going effort to develop a knowledge-based search interface to OPACs called the E-Referencer (Khoo et al., 1998, 1999; Poo et al., 2000). E-Referencer v. 2 that has been implemented applies a repertoire of initial search strategies and reformulation strategies to retrieve records from OPACs using the Z39.50 protocol, and also assists users in mapping query keywords to the Library of Congress subject headings."
    Source
    Electronic library. 22(2004) no.2, S.112-120
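    The interface described above reformulates a natural language query as Boolean statements before submitting them to the OPAC. Below is a deliberately naive sketch of that first conversion step (stopword removal, AND-conjunction, OR fallback); the stopword list and fallback policy are our assumptions, not the E-Referencer strategy repertoire.

    STOPWORDS = {"a", "an", "the", "of", "for", "on", "in", "and", "or", "to", "about"}

    def to_boolean(query: str):
        """Turn a natural language query into a strict and a relaxed Boolean statement."""
        terms = [t for t in query.lower().split() if t.isalpha() and t not in STOPWORDS]
        strict  = " AND ".join(terms)     # submitted first
        relaxed = " OR ".join(terms)      # fallback if the strict statement retrieves nothing
        return strict, relaxed

    print(to_boolean("books about ranking algorithms for the Web"))
    # ('books AND ranking AND algorithms AND web', 'books OR ranking OR algorithms OR web')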
  6. Thelwall, M.; Vaughan, L.: New versions of PageRank employing alternative Web document models (2004) 0.03
    0.032242715 = product of:
      0.096728146 = sum of:
        0.096728146 = weight(_text_:web in 674) [ClassicSimilarity], result of:
          0.096728146 = score(doc=674,freq=16.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.6119082 = fieldWeight in 674, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=674)
      0.33333334 = coord(1/3)
    
    Abstract
    Introduces several new versions of PageRank (the link-based Web page ranking algorithm), based on an information science perspective on the concept of the Web document. Although the Web page is the typical indivisible unit of information in search engine results and most Web information retrieval algorithms, other research has suggested that aggregating pages based on directories and domains gives promising alternatives, particularly when Web links are the object of study. The new algorithms introduced based on these alternatives were used to rank four sets of Web pages. The ranking results were compared with human subjects' rankings. The results of the tests were somewhat inconclusive: the new approach worked well for the set that includes pages from different Web sites; however, it did not work well in ranking pages that are from the same site. It seems that the new algorithms may be effective for some tasks but not for others, especially when only low numbers of links are involved or the pages to be ranked are from the same site or directory.
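    The alternative document models above aggregate pages by directory or domain before the link analysis is run. Below is a minimal sketch of that aggregation step, collapsing a page-level link graph into a domain-level graph on which PageRank could then be computed; the URLs are invented, and the paper's actual models may differ in detail.

    from collections import defaultdict
    from urllib.parse import urlparse

    def to_domain_graph(page_links):
        """Collapse a page-level link graph into a domain-level graph, dropping in-domain links."""
        graph = defaultdict(set)
        for source, targets in page_links.items():
            src_dom = urlparse(source).netloc
            for target in targets:
                dst_dom = urlparse(target).netloc
                if src_dom != dst_dom:
                    graph[src_dom].add(dst_dom)
        return dict(graph)

    pages = {   # hypothetical crawl data
        "http://a.example/1": ["http://a.example/2", "http://b.example/x"],
        "http://a.example/2": ["http://b.example/y"],
        "http://b.example/x": ["http://a.example/1"],
    }
    print(to_domain_graph(pages))   # {'a.example': {'b.example'}, 'b.example': {'a.example'}}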
  7. Shiri, A.A.; Revie, C.: Query expansion behavior within a thesaurus-enhanced search environment : a user-centered evaluation (2006) 0.03
    0.029936887 = product of:
      0.04490533 = sum of:
        0.028498804 = weight(_text_:web in 56) [ClassicSimilarity], result of:
          0.028498804 = score(doc=56,freq=2.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.18028519 = fieldWeight in 56, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=56)
        0.016406527 = product of:
          0.032813054 = sum of:
            0.032813054 = weight(_text_:22 in 56) [ClassicSimilarity], result of:
              0.032813054 = score(doc=56,freq=2.0), product of:
                0.16961981 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.048437484 = queryNorm
                0.19345059 = fieldWeight in 56, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=56)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The study reported here investigated the query expansion behavior of end-users interacting with a thesaurus-enhanced search system on the Web. Two groups, namely academic staff and postgraduate students, were recruited into this study. Data were collected from 90 searches performed by 30 users using the OVID interface to the CAB abstracts database. Data-gathering techniques included questionnaires, screen capturing software, and interviews. The results presented here relate to issues of search-topic and search-term characteristics, number and types of expanded queries, usefulness of thesaurus terms, and behavioral differences between academic staff and postgraduate students in their interaction. The key conclusions drawn were that (a) academic staff chose more narrow and synonymous terms than did postgraduate students, who generally selected broader and related terms; (b) topic complexity affected users' interaction with the thesaurus in that complex topics required more query expansion and search term selection; (c) users' prior topic-search experience appeared to have a significant effect on their selection and evaluation of thesaurus terms; (d) in 50% of the searches where additional terms were suggested from the thesaurus, users stated that they had not been aware of the terms at the beginning of the search; this observation was particularly noticeable in the case of postgraduate students.
    Date
    22. 7.2006 16:32:43
  8. Henzinger, M.R.: Hyperlink analysis for the Web (2001) 0.03
    0.027401041 = product of:
      0.08220312 = sum of:
        0.08220312 = weight(_text_:web in 8) [ClassicSimilarity], result of:
          0.08220312 = score(doc=8,freq=26.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.520022 = fieldWeight in 8, product of:
              5.0990195 = tf(freq=26.0), with freq of:
                26.0 = termFreq=26.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=8)
      0.33333334 = coord(1/3)
    
    Abstract
    Hyperlink analysis algorithms allow search engines to deliver focused results to user queries. This article surveys ranking algorithms used to retrieve information on the Web.
    Content
    Information retrieval is a computer science subfield whose goal is to find all documents relevant to a user query in a given collection of documents. As such, information retrieval should really be called document retrieval. Before the advent of the Web, IR systems were typically installed in libraries for use mostly by reference librarians. The retrieval algorithm for these systems was usually based exclusively on analysis of the words in the document. The Web changed all this. Now each Web user has access to various search engines whose retrieval algorithms often use not only the words in the documents but also information like the hyperlink structure of the Web or markup language tags. How are hyperlinks useful? The hyperlink functionality alone (that is, the hyperlink to Web page B that is contained in Web page A) is not directly useful in information retrieval. However, the way Web page authors use hyperlinks can give them valuable information content. Authors usually create hyperlinks they think will be useful to readers. Some may be navigational aids that, for example, take the reader back to the site's home page; others provide access to documents that augment the content of the current page. The latter tend to point to high-quality pages that might be on the same topic as the page containing the hyperlink. Web information retrieval systems can exploit this information to refine searches for relevant documents. Hyperlink analysis significantly improves the relevance of the search results, so much so that all major Web search engines claim to use some type of hyperlink analysis. However, the search engines do not disclose details about the type of hyperlink analysis they perform, mostly to avoid manipulation of search results by Web-positioning companies. In this article, I discuss how hyperlink analysis can be applied to ranking algorithms, and survey other ways Web search engines can use this analysis.
  9. Ning, X.; Jin, H.; Wu, H.: RSS: a framework enabling ranked search on the semantic web (2008) 0.03
    0.026868932 = product of:
      0.080606796 = sum of:
        0.080606796 = weight(_text_:web in 2069) [ClassicSimilarity], result of:
          0.080606796 = score(doc=2069,freq=16.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.5099235 = fieldWeight in 2069, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2069)
      0.33333334 = coord(1/3)
    
    Abstract
    The semantic web not only contains resources but also includes the heterogeneous relationships among them, which sharply distinguishes it from the current web. With the growth of the semantic web, specialized search techniques are of increasing significance. In this paper, we present RSS, a framework for enabling ranked semantic search on the semantic web. In this framework, the heterogeneity of relationships is fully exploited to determine the global importance of resources. In addition, the search results can be greatly expanded with entities most semantically related to the query, so the framework is able to provide users with properly ordered semantic search results by combining global ranking values and the relevance between the resources and the query. The proposed semantic search model, which supports inference, is very different from traditional keyword-based search methods. Moreover, RSS also distinguishes itself from many current methods of accessing the semantic web data in that it applies novel ranking strategies to prevent returning search results in disorder. The experimental results show that the framework is feasible and can produce better ordering of semantic search results than directly applying the standard PageRank algorithm on the semantic web.
    Theme
    Semantic Web
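    The ordering described above combines a query-independent, link-derived importance value with query-dependent relevance. Below is a generic sketch of such a linear combination; the weight alpha, the scores, and the resource names are placeholders, not the RSS framework's actual ranking strategy.

    def combined_score(global_rank, relevance, alpha=0.5):
        """Blend a query-independent importance value with query-dependent relevance."""
        return alpha * global_rank + (1 - alpha) * relevance

    results = {"resource1": (0.8, 0.3), "resource2": (0.4, 0.9)}   # (global rank, relevance)
    ranked = sorted(results, key=lambda r: combined_score(*results[r]), reverse=True)
    print(ranked)   # ['resource2', 'resource1']: relevance outweighs global importance here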
  10. Zhang, D.; Dong, Y.: ¬An effective algorithm to rank Web resources (2000) 0.03
    0.026598886 = product of:
      0.07979666 = sum of:
        0.07979666 = weight(_text_:web in 3662) [ClassicSimilarity], result of:
          0.07979666 = score(doc=3662,freq=2.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.50479853 = fieldWeight in 3662, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.109375 = fieldNorm(doc=3662)
      0.33333334 = coord(1/3)
    
  11. Finding anything in the billion page Web : are algorithms the key? (1999) 0.03
    0.026598886 = product of:
      0.07979666 = sum of:
        0.07979666 = weight(_text_:web in 6248) [ClassicSimilarity], result of:
          0.07979666 = score(doc=6248,freq=2.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.50479853 = fieldWeight in 6248, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.109375 = fieldNorm(doc=6248)
      0.33333334 = coord(1/3)
    
  12. Bidoki, A.M.Z.; Yazdani, N.: An intelligent ranking algorithm for web pages : DistanceRank (2008) 0.03
    0.026598886 = product of:
      0.07979666 = sum of:
        0.07979666 = weight(_text_:web in 2068) [ClassicSimilarity], result of:
          0.07979666 = score(doc=2068,freq=8.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.50479853 = fieldWeight in 2068, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2068)
      0.33333334 = coord(1/3)
    
    Abstract
    A fast and efficient page ranking mechanism for web crawling and retrieval remains a challenging issue. Recently, several link-based ranking algorithms like PageRank, HITS and OPIC have been proposed. In this paper, we propose a novel recursive method based on reinforcement learning which considers distance between pages as punishment, called "DistanceRank", to compute ranks of web pages. The distance is defined as the number of "average clicks" between two pages. The objective is to minimize punishment or distance so that a page with less distance has a higher rank. Experimental results indicate that DistanceRank outperforms other ranking algorithms in page ranking and crawling scheduling. Furthermore, the complexity of DistanceRank is low. We have used University of California at Berkeley's web for our experiments.
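    As a rough illustration of the intuition above (pages reachable in fewer clicks from the rest of the web rank higher), the sketch below orders pages by their average incoming click distance. This is only the intuition: it is not the paper's reinforcement-learning formulation or its "average clicks" definition.

    from collections import deque

    def click_distance(graph, source):
        """Shortest number of clicks from source to every reachable page (BFS)."""
        dist, queue = {source: 0}, deque([source])
        while queue:
            page = queue.popleft()
            for nxt in graph.get(page, []):
                if nxt not in dist:
                    dist[nxt] = dist[page] + 1
                    queue.append(nxt)
        return dist

    def distance_rank(graph):
        """Order pages so that a lower average incoming click distance means a higher rank."""
        pages = set(graph) | {t for ts in graph.values() for t in ts}
        avg = {}
        for target in pages:
            dists = [click_distance(graph, s).get(target) for s in pages if s != target]
            reached = [d for d in dists if d is not None]
            avg[target] = sum(reached) / len(reached) if reached else float("inf")
        return sorted(pages, key=lambda p: avg[p])

    graph = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]}
    print(distance_rank(graph))   # ['A', 'B', 'C', 'D']: A is closest to the rest of the graph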
  13. Thelwall, M.: Can Google's PageRank be used to find the most important academic Web pages? (2003) 0.03
    0.025490109 = product of:
      0.07647032 = sum of:
        0.07647032 = weight(_text_:web in 4457) [ClassicSimilarity], result of:
          0.07647032 = score(doc=4457,freq=10.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.48375595 = fieldWeight in 4457, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=4457)
      0.33333334 = coord(1/3)
    
    Abstract
    Google's PageRank is an influential algorithm that uses a model of Web use that is dominated by its link structure in order to rank pages by their estimated value to the Web community. This paper reports on the outcome of applying the algorithm to the Web sites of three national university systems in order to test whether it is capable of identifying the most important Web pages. The results are also compared with simple inlink counts. It was discovered that the highest inlinked pages do not always have the highest PageRank, indicating that the two metrics are genuinely different, even for the top pages. More significantly, however, internal links dominated external links for the high ranks in either method and superficial reasons accounted for high scores in both cases. It is concluded that PageRank is not useful for identifying the top pages in a site and that it must be combined with powerful text matching techniques in order to get the quality of information retrieval results provided by Google.
  14. Henzinger, M.R.: Link analysis in Web information retrieval (2000) 0.02
    0.024032302 = product of:
      0.07209691 = sum of:
        0.07209691 = weight(_text_:web in 801) [ClassicSimilarity], result of:
          0.07209691 = score(doc=801,freq=20.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.45608947 = fieldWeight in 801, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=801)
      0.33333334 = coord(1/3)
    
    Abstract
    The analysis of the hyperlink structure of the web has led to significant improvements in web information retrieval. This survey describes two successful link analysis algorithms and the state-of-the art of the field.
    Content
    The goal of information retrieval is to find all documents relevant for a user query in a collection of documents. Decades of research in information retrieval were successful in developing and refining techniques that are solely word-based (see e.g., [2]). With the advent of the web, new sources of information became available, one of them being the hyperlinks between documents and records of user behavior. To be precise, hypertexts (i.e., collections of documents connected by hyperlinks) have existed and have been studied for a long time. What was new was the large number of hyperlinks created by independent individuals. Hyperlinks provide a valuable source of information for web information retrieval as we will show in this article. This area of information retrieval is commonly called link analysis. Why would one expect hyperlinks to be useful? A hyperlink is a reference to a web page B that is contained in a web page A. When the hyperlink is clicked on in a web browser, the browser displays page B. This functionality alone is not helpful for web information retrieval. However, the way hyperlinks are typically used by authors of web pages can give them valuable information content. Typically, authors create links because they think they will be useful for the readers of the pages. Thus, links are usually either navigational aids that, for example, bring the reader back to the homepage of the site, or links that point to pages whose content augments the content of the current page. The second kind of links tend to point to high-quality pages that might be on the same topic as the page containing the link.
  15. Jascó, P.: Mapping algorithms to translate natural language questions into search queries for Web databases (1997) 0.02
    0.022799043 = product of:
      0.06839713 = sum of:
        0.06839713 = weight(_text_:web in 314) [ClassicSimilarity], result of:
          0.06839713 = score(doc=314,freq=2.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.43268442 = fieldWeight in 314, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.09375 = fieldNorm(doc=314)
      0.33333334 = coord(1/3)
    
  16. Dominich, S.; Skrop, A.: PageRank and interaction information retrieval (2005) 0.02
    0.022799043 = product of:
      0.06839713 = sum of:
        0.06839713 = weight(_text_:web in 3268) [ClassicSimilarity], result of:
          0.06839713 = score(doc=3268,freq=8.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.43268442 = fieldWeight in 3268, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3268)
      0.33333334 = coord(1/3)
    
    Abstract
    The PageRank method is used by the Google Web search engine to compute the importance of Web pages. Two different views have been developed for the interpretation of the PageRank method and values: (a) stochastic (random surfer): the PageRank values can be conceived as the steady-state distribution of a Markov chain, and (b) algebraic: the PageRank values form the eigenvector corresponding to eigenvalue 1 of the Web link matrix. The Interaction Information Retrieval (I²R) method is a nonclassical information retrieval paradigm, which represents a connectionist approach based on dynamic systems. In the present paper, a different interpretation of PageRank is proposed, namely, a dynamic systems viewpoint, by showing that the PageRank method can be formally interpreted as a particular case of the Interaction Information Retrieval method; and thus, the PageRank values may be interpreted as neutral equilibrium points of the Web.
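    The stochastic reading above (PageRank as the steady-state distribution of a random surfer) can be made concrete with a few lines of power iteration. The damping factor and toy graph below are illustrative; this is the standard algorithm, not the paper's dynamic-systems reformulation.

    def pagerank(graph, damping=0.85, iterations=50):
        """Power iteration: repeatedly redistribute rank along links until it stabilizes."""
        pages = set(graph) | {t for ts in graph.values() for t in ts}
        n = len(pages)
        rank = {p: 1.0 / n for p in pages}
        for _ in range(iterations):
            new = {p: (1 - damping) / n for p in pages}
            for page in pages:
                targets = graph.get(page, [])
                if targets:
                    share = damping * rank[page] / len(targets)
                    for t in targets:
                        new[t] += share
                else:                               # dangling page: spread its rank evenly
                    for t in pages:
                        new[t] += damping * rank[page] / n
            rank = new
        return rank

    toy_graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    print(sorted(pagerank(toy_graph).items(), key=lambda kv: -kv[1]))   # C collects the most rank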
  17. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (1999) 0.02
    0.022799043 = product of:
      0.06839713 = sum of:
        0.06839713 = weight(_text_:web in 5777) [ClassicSimilarity], result of:
          0.06839713 = score(doc=5777,freq=8.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.43268442 = fieldWeight in 5777, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
      0.33333334 = coord(1/3)
    
    LCSH
    Web search engines
    RSWK
    World Wide Web / Suchmaschine / Mathematisches Modell (BVB)
    Subject
    World Wide Web / Suchmaschine / Mathematisches Modell (BVB)
    Web search engines
  18. Stock, M.; Stock, W.G.: Internet-Suchwerkzeuge im Vergleich (IV) : Relevance Ranking nach "Popularität" von Webseiten: Google (2001) 0.02
    0.019744553 = product of:
      0.059233658 = sum of:
        0.059233658 = weight(_text_:web in 5771) [ClassicSimilarity], result of:
          0.059233658 = score(doc=5771,freq=6.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.37471575 = fieldWeight in 5771, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=5771)
      0.33333334 = coord(1/3)
    
    Abstract
    In our retrieval test of search tools on the World Wide Web (Password 11/2000), the search engine Google performed best. Compared with other search engines, Google relies hardly at all on information linguistics, but rather on algorithms that can be derived from the particular characteristics of Web documents. The core of its information-statistical technique is the "PageRank" method (named after its developer Larry Page), which computes the "popularity" of pages from the hypertext structure of the Web on the basis of their incoming and outgoing links. Google stands out through intuitively understandable search screens as well as several very useful "little extras", such as displaying a page's rank, highlighting, searching within a page, searching within a result set, and so on, all housed in its own toolbar inside the browser. Much like RealNames, Google offers the purchase of search terms through its "AdWords" product. After a series of now four Password articles comparing Internet search tools, we want to close with an assessment. How should the state of the art of directories and search engines be judged from an information science perspective? Are "typical" Internet users, who as a rule are not information professionals, adequately served? And can information professionals also profit from these search tools?
  19. Radev, D.; Fan, W.; Qu, H.; Wu, H.; Grewal, A.: Probabilistic question answering on the Web (2005) 0.02
    0.019744553 = product of:
      0.059233658 = sum of:
        0.059233658 = weight(_text_:web in 3455) [ClassicSimilarity], result of:
          0.059233658 = score(doc=3455,freq=6.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.37471575 = fieldWeight in 3455, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3455)
      0.33333334 = coord(1/3)
    
    Abstract
    Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this article, we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), uses proximity and question type features and achieves a total reciprocal document rank of .20 on the TREC8 corpus. Our techniques have been implemented as a Web-accessible system, called NSIR.
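    The figure quoted above (a total reciprocal document rank of .20) rewards systems for placing answer-bearing documents near the top of the ranking. Below is a sketch of the metric under its usual definition (sum of reciprocal ranks of the answer-bearing documents, averaged over questions); the run data is invented.

    def total_reciprocal_document_rank(ranked_docs, answer_docs):
        """Sum of 1/rank over the retrieved documents that contain a correct answer."""
        return sum(1.0 / rank
                   for rank, doc in enumerate(ranked_docs, start=1)
                   if doc in answer_docs)

    runs = {   # hypothetical runs: one ranked list and one set of answer-bearing documents per question
        "q1": (["d3", "d7", "d1", "d9"], {"d7", "d9"}),
        "q2": (["d2", "d4", "d8"],       {"d2"}),
    }
    scores = [total_reciprocal_document_rank(docs, answers) for docs, answers in runs.values()]
    print(sum(scores) / len(scores))   # 0.875, the mean over the two questions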
  20. Fu, X.: Towards a model of implicit feedback for Web search (2010) 0.02
    0.019744553 = product of:
      0.059233658 = sum of:
        0.059233658 = weight(_text_:web in 3310) [ClassicSimilarity], result of:
          0.059233658 = score(doc=3310,freq=6.0), product of:
            0.15807624 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.048437484 = queryNorm
            0.37471575 = fieldWeight in 3310, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.046875 = fieldNorm(doc=3310)
      0.33333334 = coord(1/3)
    
    Abstract
    This research investigated several important issues in using implicit feedback techniques to assist searchers with difficulties in formulating effective search strategies. It focused on examining the relationship between types of behavioral evidence that can be captured from Web searches and searchers' interests. A carefully crafted observation study was conducted to capture, examine, and elucidate the analytical processes and work practices of human analysts when they simulated the role of an implicit feedback system by trying to infer searchers' interests from behavioral traces. Findings provided rare insight into the complexities and nuances in using behavioral evidence for implicit feedback and led to the proposal of an implicit feedback model for Web search that bridged previous studies on behavioral evidence and implicit feedback measures. A new level of analysis termed an analytical lens emerged from the data and provides a road map for future research on this topic.

Languages

  • e 83
  • d 10
  • m 1

Types

  • a 84
  • m 6
  • el 2
  • s 2
  • r 1
  • x 1