Search (18 results, page 1 of 1)

  • × theme_ss:"Retrievalalgorithmen"
  • × theme_ss:"Suchmaschinen"
  • × type_ss:"a"
  1. Kanaeva, Z.: Ranking: Google und CiteSeer (2005) 0.14
    0.14489293 = product of:
      0.28978586 = sum of:
        0.13105242 = weight(_text_:suchmaschine in 3276) [ClassicSimilarity], result of:
          0.13105242 = score(doc=3276,freq=4.0), product of:
            0.21191008 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.03747799 = queryNorm
            0.6184341 = fieldWeight in 3276, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3276)
        0.14688537 = weight(_text_:ranking in 3276) [ClassicSimilarity], result of:
          0.14688537 = score(doc=3276,freq=6.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.7245744 = fieldWeight in 3276, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3276)
        0.011848084 = product of:
          0.03554425 = sum of:
            0.03554425 = weight(_text_:22 in 3276) [ClassicSimilarity], result of:
              0.03554425 = score(doc=3276,freq=2.0), product of:
                0.13124153 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03747799 = queryNorm
                0.2708308 = fieldWeight in 3276, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3276)
          0.33333334 = coord(1/3)
      0.5 = coord(3/6)
    
    Abstract
    Classical information retrieval has produced a variety of methods for ranking and for searching a homogeneous, unstructured document collection. The success of the Google search engine has shown that searching a heterogeneous but interconnected document collection such as the Internet can be very effective when the connections between documents (links) are taken into account. Among the concepts implemented by Google is a method for ranking search results (PageRank), which this article explains briefly. The article also covers the concepts behind a system called CiteSeer, which indexes bibliographic references automatically (Autonomous Citation Indexing, ACI). CiteSeer turns a set of unlinked scientific documents into an interconnected document collection and so makes it possible to apply ranking methods based on those used by Google.
    Date
    20. 3.2005 16:23:22
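The score breakdown under this first hit follows the explain format of Lucene's ClassicSimilarity (TF-IDF). As a rough sketch, the printed figures can be reproduced from the values in the tree, assuming the classic formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function and variable names below are illustrative only and are not part of any Lucene or Solr API.

```python
import math

QUERY_NORM = 0.03747799  # queryNorm as printed in the explain output
MAX_DOCS = 44218         # maxDocs as printed in the explain output


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency, assuming the ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """queryWeight * fieldWeight for one term, mirroring the explain tree."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# Values copied from the explain tree of hit 1 (doc 3276).
suchmaschine = term_score(freq=4.0, doc_freq=420, field_norm=0.0546875)
ranking = term_score(freq=6.0, doc_freq=537, field_norm=0.0546875)
# The "22" clause sits in a nested query and is scaled by coord(1/3).
nested_22 = term_score(freq=2.0, doc_freq=3622, field_norm=0.0546875) * (1 / 3)

# Three of six top-level clauses matched, hence coord(3/6).
total = (suchmaschine + ranking + nested_22) * (3 / 6)
print(f"{total:.4f}")
```

The same arithmetic accounts for hits 4 and 5 sharing a score despite different term frequencies: sqrt(16) * 0.0390625 and sqrt(4) * 0.078125 both equal 0.15625, so the longer document's higher frequency is offset by its smaller fieldNorm.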
  2. Back, J.: ¬An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.06
    0.06443493 = product of:
      0.19330478 = sum of:
        0.16960861 = weight(_text_:ranking in 3445) [ClassicSimilarity], result of:
          0.16960861 = score(doc=3445,freq=2.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.8366664 = fieldWeight in 3445, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.109375 = fieldNorm(doc=3445)
        0.023696167 = product of:
          0.0710885 = sum of:
            0.0710885 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
              0.0710885 = score(doc=3445,freq=2.0), product of:
                0.13124153 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03747799 = queryNorm
                0.5416616 = fieldWeight in 3445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3445)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Date
    25. 8.2005 17:42:22
  3. Stock, M.; Stock, W.G.: Internet-Suchwerkzeuge im Vergleich (IV) : Relevance Ranking nach "Popularität" von Webseiten: Google (2001) 0.05
    0.050706387 = product of:
      0.15211916 = sum of:
        0.07942976 = weight(_text_:suchmaschine in 5771) [ClassicSimilarity], result of:
          0.07942976 = score(doc=5771,freq=2.0), product of:
            0.21191008 = queryWeight, product of:
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.03747799 = queryNorm
            0.37482765 = fieldWeight in 5771, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.6542544 = idf(docFreq=420, maxDocs=44218)
              0.046875 = fieldNorm(doc=5771)
        0.0726894 = weight(_text_:ranking in 5771) [ClassicSimilarity], result of:
          0.0726894 = score(doc=5771,freq=2.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.35857132 = fieldWeight in 5771, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.046875 = fieldNorm(doc=5771)
      0.33333334 = coord(2/6)
    
    Abstract
    In our retrieval test of search tools on the World Wide Web (Password 11/2000), the search engine Google performed best. Compared with other search engines, Google relies hardly at all on linguistic processing and instead on algorithms derived from the particular characteristics of Web documents. The core of its statistical technique is the "PageRank" method (named after its developer Larry Page), which uses the hypertext structure of the Web to compute the "popularity" of pages from their incoming and outgoing links. Google impresses with intuitively understandable search screens and with several very useful "small touches" such as displaying a page's rank, highlighting, searching within the page, and searching within a result set, all housed in a dedicated toolbar inside the browser. Like RealNames, Google sells search terms through its "AdWords" product. After a series of what are now four Password articles comparing Internet search tools, we conclude with an assessment. How should the state of the art in directories and search engines be judged from an information science perspective? Are "typical" Internet users, who as a rule are not information professionals, adequately served? And can information specialists also benefit from the search tools?
  4. Meghabghab, G.: Google's Web page ranking applied to different topological Web graph structures (2001) 0.03
    0.028555095 = product of:
      0.17133057 = sum of:
        0.17133057 = weight(_text_:ranking in 6028) [ClassicSimilarity], result of:
          0.17133057 = score(doc=6028,freq=16.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.8451607 = fieldWeight in 6028, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6028)
      0.16666667 = coord(1/6)
    
    Abstract
    This research is part of an ongoing study to better understand web page ranking on the web. It looks at a web page as a graph structure, or web graph, and tries to classify different web graphs in a new coordinate space: (out-degree, in-degree). The out-degree coordinate od is defined as the number of outgoing web pages from a given web page. The in-degree coordinate id is the number of web pages that point to a given web page. In this new coordinate space a metric is built to classify how close or far apart different web graphs are. Google's web ranking algorithm (Brin & Page, 1998) for ranking web pages is applied in this new coordinate space. The results of the algorithm have been modified to fit different topological web graph structures. The algorithm was also not successful in the case of general web graphs, and new web ranking algorithms have to be considered. This study does not look at enhancing web ranking by adding any contextual information; it considers only web links as a source for web page ranking. The author believes that understanding the underlying web page as a graph will help design better web ranking algorithms and enhance retrieval and web performance, and recommends using graphs as a visual aid for browsing engine designers.
  5. Weinstein, A.: Hochprozentig : Tipps and tricks für ein Top-Ranking (2002) 0.03
    0.028555095 = product of:
      0.17133057 = sum of:
        0.17133057 = weight(_text_:ranking in 1083) [ClassicSimilarity], result of:
          0.17133057 = score(doc=1083,freq=4.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.8451607 = fieldWeight in 1083, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.078125 = fieldNorm(doc=1083)
      0.16666667 = coord(1/6)
    
    Abstract
    In recent months the search engines have refined their ranking algorithms to make life harder for spammers. Internet Pro examines the trends in search engine marketing.
  6. Furner, J.: ¬A unifying model of document relatedness for hybrid search engines (2003) 0.03
    0.027614966 = product of:
      0.0828449 = sum of:
        0.0726894 = weight(_text_:ranking in 2717) [ClassicSimilarity], result of:
          0.0726894 = score(doc=2717,freq=2.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.35857132 = fieldWeight in 2717, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.046875 = fieldNorm(doc=2717)
        0.0101555 = product of:
          0.030466499 = sum of:
            0.030466499 = weight(_text_:22 in 2717) [ClassicSimilarity], result of:
              0.030466499 = score(doc=2717,freq=2.0), product of:
                0.13124153 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03747799 = queryNorm
                0.23214069 = fieldWeight in 2717, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2717)
          0.33333334 = coord(1/3)
      0.33333334 = coord(2/6)
    
    Abstract
    Previous work on search-engine design has indicated that information-seekers may benefit from being given the opportunity to exploit multiple sources of evidence of document relatedness. Few existing systems, however, give users more than minimal control over the selections that may be made among methods of exploitation. By applying the methods of "document network analysis" (DNA), a unifying, graph-theoretic model of content-, collaboration-, and context-based systems (CCC) may be developed in which the nature of the similarities between types of document relatedness and document ranking is clarified. The usefulness of the approach to system design suggested by this model may be tested by constructing and evaluating a prototype system (UCXtra) that allows searchers to maintain control over the multiple ways in which document collections may be ranked and re-ranked.
    Date
    11. 9.2004 17:32:22
  7. Thelwall, M.; Vaughan, L.: New versions of PageRank employing alternative Web document models (2004) 0.02
    0.020983625 = product of:
      0.12590174 = sum of:
        0.12590174 = weight(_text_:ranking in 674) [ClassicSimilarity], result of:
          0.12590174 = score(doc=674,freq=6.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.62106377 = fieldWeight in 674, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.046875 = fieldNorm(doc=674)
      0.16666667 = coord(1/6)
    
    Abstract
    Introduces several new versions of PageRank (the link based Web page ranking algorithm), based on an information science perspective on the concept of the Web document. Although the Web page is the typical indivisible unit of information in search engine results and most Web information retrieval algorithms, other research has suggested that aggregating pages based on directories and domains gives promising alternatives, particularly when Web links are the object of study. The new algorithms introduced based on these alternatives were used to rank four sets of Web pages. The ranking results were compared with human subjects' rankings. The results of the tests were somewhat inconclusive: the new approach worked well for the set that includes pages from different Web sites; however, it does not work well in ranking pages that are from the same site. It seems that the new algorithms may be effective for some tasks but not for others, especially when only low numbers of links are involved or the pages to be ranked are from the same site or directory.
  8. Ding, Y.; Yan, E.; Frazho, A.; Caverlee, J.: PageRank for ranking authors in co-citation networks (2009) 0.02
    0.020983625 = product of:
      0.12590174 = sum of:
        0.12590174 = weight(_text_:ranking in 3161) [ClassicSimilarity], result of:
          0.12590174 = score(doc=3161,freq=6.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.62106377 = fieldWeight in 3161, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.046875 = fieldNorm(doc=3161)
      0.16666667 = coord(1/6)
    
    Abstract
    This paper studies how varied damping factors in the PageRank algorithm influence the ranking of authors and proposes weighted PageRank algorithms. We selected the 108 most highly cited authors in the information retrieval (IR) area from the 1970s to 2008 to form the author co-citation network. We calculated the ranks of these 108 authors based on PageRank with the damping factor ranging from 0.05 to 0.95. In order to test the relationship between different measures, we compared PageRank and weighted PageRank results with the citation ranking, h-index, and centrality measures. We found that in our author co-citation network, citation rank is highly correlated with PageRank with different damping factors and also with different weighted PageRank algorithms; citation rank and PageRank are not significantly correlated with centrality measures; and h-index rank does not significantly correlate with centrality measures but does significantly correlate with other measures. The key factor affecting an author's PageRank in the author co-citation network is being co-cited with important authors.
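To illustrate the role of the damping factor mentioned in this abstract, here is a minimal power-iteration sketch of standard PageRank on a toy graph. The graph, the function name, and the iteration count are assumptions made for illustration; this is neither the weighted variant nor the co-citation data used by the authors.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 100) -> dict[str, float]:
    """Plain power-iteration PageRank; dangling nodes spread rank uniformly."""
    nodes = list(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is shared by all nodes.
        dangling = sum(rank[node] for node in nodes if not links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n
                    for node in nodes}
        for node in nodes:
            for target in links[node]:
                new_rank[target] += damping * rank[node] / len(links[node])
        rank = new_rank
    return rank


# Hypothetical toy graph: each key links to the nodes in its list.
toy_graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

for d in (0.05, 0.5, 0.95):
    scores = pagerank(toy_graph, damping=d)
    ordering = sorted(scores, key=scores.get, reverse=True)
    print(d, ordering)
```

A small damping factor pulls all scores toward the uniform value 1/n, while a large one lets the link structure dominate, which is why sweeping it over a range is a natural sensitivity check.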
  9. Jindal, V.; Bawa, S.; Batra, S.: ¬A review of ranking approaches for semantic search on Web (2014) 0.02
    0.020983625 = product of:
      0.12590174 = sum of:
        0.12590174 = weight(_text_:ranking in 2799) [ClassicSimilarity], result of:
          0.12590174 = score(doc=2799,freq=6.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.62106377 = fieldWeight in 2799, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.046875 = fieldNorm(doc=2799)
      0.16666667 = coord(1/6)
    
    Abstract
    With ever-increasing information being available to end users, search engines have become the most powerful tools for obtaining useful information scattered on the Web. However, it is very common that even the most renowned search engines return result sets containing pages that are not very useful to the user. Research on semantic search aims to improve traditional information search and retrieval methods where the basic relevance criteria rely primarily on the presence of query keywords within the returned pages. This work is an attempt to explore different relevancy ranking approaches based on semantics which are considered appropriate for the retrieval of relevant information. In this paper, various pilot projects and their corresponding outcomes have been investigated based on the methodologies adopted and their most distinctive characteristics with respect to ranking. An overview of selected approaches and their comparison by means of the classification criteria has been presented. With the help of this comparison, some common concepts and outstanding features have been identified.
  10. Courtois, M.P.; Berry, M.W.: Results ranking in Web search engines (1999) 0.02
    0.020191502 = product of:
      0.121149 = sum of:
        0.121149 = weight(_text_:ranking in 3726) [ClassicSimilarity], result of:
          0.121149 = score(doc=3726,freq=2.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.5976189 = fieldWeight in 3726, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.078125 = fieldNorm(doc=3726)
      0.16666667 = coord(1/6)
    
  11. Bhansali, D.; Desai, H.; Deulkar, K.: ¬A study of different ranking approaches for semantic search (2015) 0.02
    0.020191502 = product of:
      0.121149 = sum of:
        0.121149 = weight(_text_:ranking in 2696) [ClassicSimilarity], result of:
          0.121149 = score(doc=2696,freq=8.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.5976189 = fieldWeight in 2696, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2696)
      0.16666667 = coord(1/6)
    
    Abstract
    Search engines have become an integral part of our day-to-day life. Our reliance on search engines increases with every passing day. With the amount of data available on the Internet increasing exponentially, it becomes important to develop new methods and tools that help to return results relevant to the queries and reduce the time spent on searching. The results should be diverse but at the same time focused on the queries asked. Relation Based Page Rank [4] algorithms are considered to be the next frontier in the improvement of Semantic Web search. The probability of finding relevance in the search results as posited by the user while entering the query is used to measure the relevance. However, its application is limited by the complexity of determining the relation between the terms and assigning explicit meaning to each term. Trust Rank is one of the most widely used ranking algorithms for Semantic Web search. A few other ranking algorithms, such as the HITS algorithm and the PageRank algorithm, are also used for Semantic Web searching. In this paper, we provide a comparison of a few ranking approaches.
  12. White, R.W.; Jose, J.M.; Ruthven, I.: Using top-ranking sentences to facilitate effective information access (2005) 0.01
    0.014277548 = product of:
      0.085665286 = sum of:
        0.085665286 = weight(_text_:ranking in 3881) [ClassicSimilarity], result of:
          0.085665286 = score(doc=3881,freq=4.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.42258036 = fieldWeight in 3881, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3881)
      0.16666667 = coord(1/6)
    
    Abstract
    Web searchers typically fail to view search results beyond the first page or to fully examine the results presented to them. In this article we describe an approach that encourages a deeper examination of the contents of the document set retrieved in response to a searcher's query. The approach shifts the focus of perusal and interaction away from potentially uninformative document surrogates (such as titles, sentence fragments, and URLs) to actual document content, and uses this content to drive the information seeking process. Current search interfaces assume searchers examine results document-by-document. In contrast, our approach extracts, ranks, and presents the contents of the top-ranked document set. We use query-relevant top-ranking sentences extracted from the top documents at retrieval time as fine-grained representations of top-ranked document content and, when combined in a ranked list, an overview of these documents. The interaction of the searcher provides implicit evidence that is used to reorder the sentences where appropriate. We evaluate our approach in three separate user studies, each applying these sentences in a different way. The findings of these studies show that top-ranking sentences can facilitate effective information access.
  13. Watters, C.; Amoudi, A.: Geosearcher : location-based ranking of search engine results (2003) 0.01
    0.014277548 = product of:
      0.085665286 = sum of:
        0.085665286 = weight(_text_:ranking in 5152) [ClassicSimilarity], result of:
          0.085665286 = score(doc=5152,freq=4.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.42258036 = fieldWeight in 5152, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5152)
      0.16666667 = coord(1/6)
    
    Abstract
    Watters and Amoudi describe GeoSearcher, a prototype ranking program that arranges search engine results along a geo-spatial dimension without the provision of geo-spatial meta-tags or the use of geo-spatial feature extraction. GeoSearcher uses URL analysis, IptoLL, Whois, and the Getty Thesaurus of Geographic Names to determine site location. It accepts the first 200 sites returned by a search engine, identifies the coordinates, calculates their distance from a reference point, and ranks them in ascending order by this value. For any retrieved site the system checks whether it has already been located in the current session, then sends the domain name to Whois to generate a return of a two-letter country code and an area code. If this is unsuccessful, the name is stripped one level and resent. If this fails, the top-level domain is tested for being a country code. Any remaining unmatched names go to IptoLL. Distance is calculated using the center point of the geographic area and a provided reference location. A test run on a set of 100 URLs from a search was successful in locating 90 sites. Eighty-three pages could be manually found and 68 had sufficient information to verify location determination. Of these, 65 (95%) had been assigned reasonably correct geographic locations. A random set of URLs, used instead of a search result, yielded 80% success.
  14. Bilal, D.: Ranking, relevance judgment, and precision of information retrieval on children's queries : evaluation of Google, Yahoo!, Bing, Yahoo! Kids, and ask Kids (2012) 0.01
    0.013989083 = product of:
      0.08393449 = sum of:
        0.08393449 = weight(_text_:ranking in 393) [ClassicSimilarity], result of:
          0.08393449 = score(doc=393,freq=6.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.4140425 = fieldWeight in 393, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03125 = fieldNorm(doc=393)
      0.16666667 = coord(1/6)
    
    Abstract
    This study employed benchmarking and intellectual relevance judgment in evaluating Google, Yahoo!, Bing, Yahoo! Kids, and Ask Kids on 30 queries that children formulated to find information for specific tasks. Retrieved hits on given queries were benchmarked to Google's and Yahoo! Kids' top-five ranked hits retrieved. Relevancy of hits was judged on a graded scale; precision was calculated using the precision-at-ten metric (P@10). Yahoo! and Bing produced a similar percentage in hit overlap with Google (nearly 30%), but differed in the ranking of hits. Ask Kids retrieved 11% in hit overlap with Google versus 3% by Yahoo! Kids. The engines retrieved 26 hits across query clusters that overlapped with Yahoo! Kids' top-five ranked hits. Precision (P) that the engines produced across the queries was P = 0.48 for relevant hits, and P = 0.28 for partially relevant hits. Precision by Ask Kids was P = 0.44 for relevant hits versus P = 0.21 by Yahoo! Kids. Bing produced the highest total precision (TP) of relevant hits (TP = 0.86) across the queries, and Yahoo! Kids yielded the lowest (TP = 0.47). Average precision (AP) of relevant hits was AP = 0.56 by leading engines versus AP = 0.29 by small engines. In contrast, average precision of partially relevant hits was AP = 0.83 by small engines versus AP = 0.33 by leading engines. Average precision of relevant hits across the engines was highest on two-word queries and lowest on one-word queries. Google performed best on natural language queries; Bing did the same (P = 0.69) on two-word queries. The findings have implications for search engine ranking algorithms, relevance theory, search engine design, research design, and information literacy.
  15. Radev, D.; Fan, W.; Qu, H.; Wu, H.; Grewal, A.: Probabilistic question answering on the Web (2005) 0.01
    0.0121149 = product of:
      0.0726894 = sum of:
        0.0726894 = weight(_text_:ranking in 3455) [ClassicSimilarity], result of:
          0.0726894 = score(doc=3455,freq=2.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.35857132 = fieldWeight in 3455, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.046875 = fieldNorm(doc=3455)
      0.16666667 = coord(1/6)
    
    Abstract
    Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this article, we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), uses proximity and question type features and achieves a total reciprocal document rank of .20 on the TREC8 corpus. Our techniques have been implemented as a Web-accessible system, called NSIR.
  16. Bar-Ilan, J.; Levene, M.; Mat-Hassan, M.: Methods for evaluating dynamic changes in search engine rankings : a case study (2006) 0.01
    0.011422038 = product of:
      0.06853223 = sum of:
        0.06853223 = weight(_text_:ranking in 616) [ClassicSimilarity], result of:
          0.06853223 = score(doc=616,freq=4.0), product of:
            0.20271951 = queryWeight, product of:
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03747799 = queryNorm
            0.33806428 = fieldWeight in 616, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.4090285 = idf(docFreq=537, maxDocs=44218)
              0.03125 = fieldNorm(doc=616)
      0.16666667 = coord(1/6)
    
    Abstract
    Purpose - The objective of this paper is to characterize the changes in the rankings of the top ten results of major search engines over time and to compare the rankings between these engines. Design/methodology/approach - The paper compares rankings of the top-ten results of the search engines Google and AlltheWeb on ten identical queries over a period of three weeks. Only the top-ten results were considered, since users do not normally inspect more than the first results page returned by a search engine. The experiment was repeated twice, in October 2003 and in January 2004, in order to assess changes to the top-ten results of some of the queries during the three-month interval. In order to assess the changes in the rankings, three measures were computed for each data collection point and each search engine. Findings - The findings in this paper show that the rankings of AlltheWeb were highly stable over each period, while the rankings of Google underwent constant yet minor changes, with occasional major ones. Changes over time can be explained by the dynamic nature of the web or by fluctuations in the search engines' indexes. The top-ten results of the two search engines had surprisingly low overlap. With such small overlap, the task of comparing the rankings of the two engines becomes extremely challenging. Originality/value - The paper shows that because of the abundance of information on the web, ranking search results is of extreme importance. The paper compares several measures for computing the similarity between rankings of search tools, and shows that none of the measures is fully satisfactory as a standalone measure. It also demonstrates the apparent differences in the ranking algorithms of two widely used search engines.
  17. Chakrabarti, S.; Dom, B.; Kumar, S.R.; Raghavan, P.; Rajagopalan, S.; Tomkins, A.; Kleinberg, J.M.; Gibson, D.: Neue Pfade durch den Internet-Dschungel : Die zweite Generation von Web-Suchmaschinen (1999) 0.00
    0.0022772634 = product of:
      0.013663581 = sum of:
        0.013663581 = product of:
          0.04099074 = sum of:
            0.04099074 = weight(_text_:29 in 3) [ClassicSimilarity], result of:
              0.04099074 = score(doc=3,freq=2.0), product of:
                0.13183585 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03747799 = queryNorm
                0.31092256 = fieldWeight in 3, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=3)
          0.33333334 = coord(1/3)
      0.16666667 = coord(1/6)
    
    Date
    31.12.1996 19:29:41
  18. Agosti, M.; Pretto, L.: ¬A theoretical study of a generalized version of Kleinberg's HITS algorithm (2005) 0.00
    0.0014232898 = product of:
      0.008539738 = sum of:
        0.008539738 = product of:
          0.025619213 = sum of:
            0.025619213 = weight(_text_:29 in 4) [ClassicSimilarity], result of:
              0.025619213 = score(doc=4,freq=2.0), product of:
                0.13183585 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.03747799 = queryNorm
                0.19432661 = fieldWeight in 4, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4)
          0.33333334 = coord(1/3)
      0.16666667 = coord(1/6)
    
    Date
    31.12.1996 19:29:41