Search (18 results, page 1 of 1)

  • × theme_ss:"Suchmaschinen"
  • × theme_ss:"Retrievalalgorithmen"
  1. Back, J.: ¬An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.11
    0.11066668 = product of:
      0.22133335 = sum of:
        0.17885299 = weight(_text_:engines in 3445) [ClassicSimilarity], result of:
          0.17885299 = score(doc=3445,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.7858995 = fieldWeight in 3445, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.109375 = fieldNorm(doc=3445)
        0.042480372 = product of:
          0.084960744 = sum of:
            0.084960744 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
              0.084960744 = score(doc=3445,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.5416616 = fieldWeight in 3445, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=3445)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Date
    25. 8.2005 17:42:22
  2. Furner, J.: ¬A unifying model of document relatedness for hybrid search engines (2003) 0.05
    0.04742858 = product of:
      0.09485716 = sum of:
        0.07665128 = weight(_text_:engines in 2717) [ClassicSimilarity], result of:
          0.07665128 = score(doc=2717,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.33681408 = fieldWeight in 2717, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=2717)
        0.018205874 = product of:
          0.036411747 = sum of:
            0.036411747 = weight(_text_:22 in 2717) [ClassicSimilarity], result of:
              0.036411747 = score(doc=2717,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.23214069 = fieldWeight in 2717, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2717)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Date
    11. 9.2004 17:32:22
  3. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (1999) 0.04
    0.03832564 = product of:
      0.15330257 = sum of:
        0.15330257 = weight(_text_:engines in 5777) [ClassicSimilarity], result of:
          0.15330257 = score(doc=5777,freq=8.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.67362815 = fieldWeight in 5777, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=5777)
      0.25 = coord(1/4)
    
    Abstract
    This book discusses many of the key design issues for building search engines and emphazises the important role that applied mathematics can play in improving information retrieval. The authors discuss not only important data structures, algorithms, and software but also user-centered issues such as interfaces, manual indexing, and document preparation. They also present some of the current problems in information retrieval that many not be familiar to applied mathematicians and computer scientists and some of the driving computational methods (SVD, SDD) for automated conceptual indexing
    LCSH
    Web search engines
    Subject
    Web search engines
  4. Bar-Ilan, J.; Levene, M.; Mat-Hassan, M.: Methods for evaluating dynamic changes in search engine rankings : a case study (2006) 0.03
    0.033800043 = product of:
      0.13520017 = sum of:
        0.13520017 = weight(_text_:engines in 616) [ClassicSimilarity], result of:
          0.13520017 = score(doc=616,freq=14.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.59408426 = fieldWeight in 616, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.03125 = fieldNorm(doc=616)
      0.25 = coord(1/4)
    
    Abstract
    Purpose - The objective of this paper is to characterize the changes in the rankings of the top ten results of major search engines over time and to compare the rankings between these engines. Design/methodology/approach - The papers compare rankings of the top-ten results of the search engines Google and AlltheWeb on ten identical queries over a period of three weeks. Only the top-ten results were considered, since users do not normally inspect more than the first results page returned by a search engine. The experiment was repeated twice, in October 2003 and in January 2004, in order to assess changes to the top-ten results of some of the queries during the three months interval. In order to assess the changes in the rankings, three measures were computed for each data collection point and each search engine. Findings - The findings in this paper show that the rankings of AlltheWeb were highly stable over each period, while the rankings of Google underwent constant yet minor changes, with occasional major ones. Changes over time can be explained by the dynamic nature of the web or by fluctuations in the search engines' indexes. The top-ten results of the two search engines had surprisingly low overlap. With such small overlap, the task of comparing the rankings of the two engines becomes extremely challenging. Originality/value - The paper shows that because of the abundance of information on the web, ranking search results is of extreme importance. The paper compares several measures for computing the similarity between rankings of search tools, and shows that none of the measures is fully satisfactory as a standalone measure. It also demonstrates the apparent differences in the ranking algorithms of two widely used search engines.
  5. Bilal, D.: Ranking, relevance judgment, and precision of information retrieval on children's queries : evaluation of Google, Yahoo!, Bing, Yahoo! Kids, and ask Kids (2012) 0.03
    0.033800043 = product of:
      0.13520017 = sum of:
        0.13520017 = weight(_text_:engines in 393) [ClassicSimilarity], result of:
          0.13520017 = score(doc=393,freq=14.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.59408426 = fieldWeight in 393, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.03125 = fieldNorm(doc=393)
      0.25 = coord(1/4)
    
    Abstract
    This study employed benchmarking and intellectual relevance judgment in evaluating Google, Yahoo!, Bing, Yahoo! Kids, and Ask Kids on 30 queries that children formulated to find information for specific tasks. Retrieved hits on given queries were benchmarked to Google's and Yahoo! Kids' top-five ranked hits retrieved. Relevancy of hits was judged on a graded scale; precision was calculated using the precision-at-ten metric (P@10). Yahoo! and Bing produced a similar percentage in hit overlap with Google (nearly 30%), but differed in the ranking of hits. Ask Kids retrieved 11% in hit overlap with Google versus 3% by Yahoo! Kids. The engines retrieved 26 hits across query clusters that overlapped with Yahoo! Kids' top-five ranked hits. Precision (P) that the engines produced across the queries was P = 0.48 for relevant hits, and P = 0.28 for partially relevant hits. Precision by Ask Kids was P = 0.44 for relevant hits versus P = 0.21 by Yahoo! Kids. Bing produced the highest total precision (TP) of relevant hits (TP = 0.86) across the queries, and Yahoo! Kids yielded the lowest (TP = 0.47). Average precision (AP) of relevant hits was AP = 0.56 by leading engines versus AP = 0.29 by small engines. In contrast, average precision of partially relevant hits was AP = 0.83 by small engines versus AP = 0.33 by leading engines. Average precision of relevant hits across the engines was highest on two-word queries and lowest on one-word queries. Google performed best on natural language queries; Bing did the same (P = 0.69) on two-word queries. The findings have implications for search engine ranking algorithms, relevance theory, search engine design, research design, and information literacy.
  6. Radev, D.; Fan, W.; Qu, H.; Wu, H.; Grewal, A.: Probabilistic question answering on the Web (2005) 0.03
    0.03319098 = product of:
      0.13276392 = sum of:
        0.13276392 = weight(_text_:engines in 3455) [ClassicSimilarity], result of:
          0.13276392 = score(doc=3455,freq=6.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.58337915 = fieldWeight in 3455, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=3455)
      0.25 = coord(1/4)
    
    Abstract
    Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this article, we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), uses proximity and question type features and achieves a total reciprocal document rank of .20 an the TREC8 corpus. Our techniques have been implemented as a Web-accessible system, called NSIR.
  7. Courtois, M.P.; Berry, M.W.: Results ranking in Web search engines (1999) 0.03
    0.031938035 = product of:
      0.12775214 = sum of:
        0.12775214 = weight(_text_:engines in 3726) [ClassicSimilarity], result of:
          0.12775214 = score(doc=3726,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.5613568 = fieldWeight in 3726, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.078125 = fieldNorm(doc=3726)
      0.25 = coord(1/4)
    
  8. Berry, M.W.; Browne, M.: Understanding search engines : mathematical modeling and text retrieval (2005) 0.03
    0.031292755 = product of:
      0.12517102 = sum of:
        0.12517102 = weight(_text_:engines in 7) [ClassicSimilarity], result of:
          0.12517102 = score(doc=7,freq=12.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.5500151 = fieldWeight in 7, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.03125 = fieldNorm(doc=7)
      0.25 = coord(1/4)
    
    Abstract
    The second edition of Understanding Search Engines: Mathematical Modeling and Text Retrieval follows the basic premise of the first edition by discussing many of the key design issues for building search engines and emphasizing the important role that applied mathematics can play in improving information retrieval. The authors discuss important data structures, algorithms, and software as well as user-centered issues such as interfaces, manual indexing, and document preparation. Significant changes bring the text up to date on current information retrieval methods: for example the addition of a new chapter on link-structure algorithms used in search engines such as Google. The chapter on user interface has been rewritten to specifically focus on search engine usability. In addition the authors have added new recommendations for further reading and expanded the bibliography, and have updated and streamlined the index to make it more reader friendly.
    LCSH
    Web search engines
    Subject
    Web search engines
  9. Wills, R.S.: Google's PageRank : the math behind the search engine (2006) 0.03
    0.02856625 = product of:
      0.114265 = sum of:
        0.114265 = weight(_text_:engines in 5954) [ClassicSimilarity], result of:
          0.114265 = score(doc=5954,freq=10.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.50209284 = fieldWeight in 5954, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.03125 = fieldNorm(doc=5954)
      0.25 = coord(1/4)
    
    Abstract
    Approximately 91 million American adults use the Internet on a typical day The number-one Internet activity is reading and writing e-mail. Search engine use is next in line and continues to increase in popularity. In fact, survey findings indicate that nearly 60 million American adults use search engines on a given day. Even though there are many Internet search engines, Google, Yahoo!, and MSN receive over 81% of all search requests. Despite claims that the quality of search provided by Yahoo! and MSN now equals that of Google, Google continues to thrive as the search engine of choice, receiving over 46% of all search requests, nearly double the volume of Yahoo! and over four times that of MSN. I use Google's search engine on a daily basis and rarely request information from other search engines. One day, I decided to visit the homepages of Google. Yahoo!, and MSN to compare the quality of search results. Coffee was on my mind that day, so I entered the simple query "coffee" in the search box at each homepage. Table 1 shows the top ten (unsponsored) results returned by each search engine. Although ordered differently, two webpages, www.peets.com and www.coffeegeek.com, appear in all three top ten lists. In addition, each pairing of top ten lists has two additional results in common. Depending on the information I hoped to obtain about coffee by using the search engines, I could argue that any one of the three returned better results: however, I was not looking for a particular webpage, so all three listings of search results seemed of equal quality. Thus, I plan to continue using Google. My decision is indicative of the problem Yahoo!, MSN, and other search engine companies face in the quest to obtain a larger percentage of Internet search volume. Search engine users are loyal to one or a few search engines and are generally happy with search results. Thus, as long as Google continues to provide results deemed high in quality, Google likely will remain the top search engine. But what set Google apart from its competitors in the first place? The answer is PageRank. In this article I explain this simple mathematical algorithm that revolutionized Web search.
  10. Stock, M.; Stock, W.G.: Internet-Suchwerkzeuge im Vergleich (IV) : Relevance Ranking nach "Popularität" von Webseiten: Google (2001) 0.03
    0.027100323 = product of:
      0.10840129 = sum of:
        0.10840129 = weight(_text_:engines in 5771) [ClassicSimilarity], result of:
          0.10840129 = score(doc=5771,freq=4.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.47632706 = fieldWeight in 5771, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=5771)
      0.25 = coord(1/4)
    
    Abstract
    In unserem Retrievaltest von Suchwerkzeugen im World Wide Web (Password 11/2000) schnitt die Suchmaschine Google am besten ab. Im Vergleich zu anderen Search Engines setzt Google kaum auf Informationslinguistik, sondern auf Algorithmen, die sich aus den Besonderheiten der Web-Dokumente ableiten lassen. Kernstück der informationsstatistischen Technik ist das "PageRank"- Verfahren (benannt nach dem Entwickler Larry Page), das aus der Hypertextstruktur des Web die "Popularität" von Seiten anhand ihrer ein- und ausgehenden Links berechnet. Google besticht durch das Angebot intuitiv verstehbarer Suchbildschirme sowie durch einige sehr nützliche "Kleinigkeiten" wie die Angabe des Rangs einer Seite, Highlighting, Suchen in der Seite, Suchen innerhalb eines Suchergebnisses usw., alles verstaut in einer eigenen Befehlsleiste innerhalb des Browsers. Ähnlich wie RealNames bietet Google mit dem Produkt "AdWords" den Aufkauf von Suchtermen an. Nach einer Reihe von nunmehr vier Password-Artikeln über InternetSuchwerkzeugen im Vergleich wollen wir abschließend zu einer Bewertung kommen. Wie ist der Stand der Technik bei Directories und Search Engines aus informationswissenschaftlicher Sicht einzuschätzen? Werden die "typischen" Internetnutzer, die ja in der Regel keine Information Professionals sind, adäquat bedient? Und können auch Informationsfachleute von den Suchwerkzeugen profitieren?
  11. Jindal, V.; Bawa, S.; Batra, S.: ¬A review of ranking approaches for semantic search on Web (2014) 0.03
    0.027100323 = product of:
      0.10840129 = sum of:
        0.10840129 = weight(_text_:engines in 2799) [ClassicSimilarity], result of:
          0.10840129 = score(doc=2799,freq=4.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.47632706 = fieldWeight in 2799, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.046875 = fieldNorm(doc=2799)
      0.25 = coord(1/4)
    
    Abstract
    With ever increasing information being available to the end users, search engines have become the most powerful tools for obtaining useful information scattered on the Web. However, it is very common that even most renowned search engines return result sets with not so useful pages to the user. Research on semantic search aims to improve traditional information search and retrieval methods where the basic relevance criteria rely primarily on the presence of query keywords within the returned pages. This work is an attempt to explore different relevancy ranking approaches based on semantics which are considered appropriate for the retrieval of relevant information. In this paper, various pilot projects and their corresponding outcomes have been investigated based on methodologies adopted and their most distinctive characteristics towards ranking. An overview of selected approaches and their comparison by means of the classification criteria has been presented. With the help of this comparison, some common concepts and outstanding features have been identified.
  12. Bhansali, D.; Desai, H.; Deulkar, K.: ¬A study of different ranking approaches for semantic search (2015) 0.02
    0.022583602 = product of:
      0.09033441 = sum of:
        0.09033441 = weight(_text_:engines in 2696) [ClassicSimilarity], result of:
          0.09033441 = score(doc=2696,freq=4.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.39693922 = fieldWeight in 2696, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2696)
      0.25 = coord(1/4)
    
    Abstract
    Search Engines have become an integral part of our day to day life. Our reliance on search engines increases with every passing day. With the amount of data available on Internet increasing exponentially, it becomes important to develop new methods and tools that help to return results relevant to the queries and reduce the time spent on searching. The results should be diverse but at the same time should return results focused on the queries asked. Relation Based Page Rank [4] algorithms are considered to be the next frontier in improvement of Semantic Web Search. The probability of finding relevance in the search results as posited by the user while entering the query is used to measure the relevance. However, its application is limited by the complexity of determining relation between the terms and assigning explicit meaning to each term. Trust Rank is one of the most widely used ranking algorithms for semantic web search. Few other ranking algorithms like HITS algorithm, PageRank algorithm are also used for Semantic Web Searching. In this paper, we will provide a comparison of few ranking approaches.
  13. Langville, A.N.; Meyer, C.D.: Google's PageRank and beyond : the science of search engine rankings (2006) 0.02
    0.01916282 = product of:
      0.07665128 = sum of:
        0.07665128 = weight(_text_:engines in 6) [ClassicSimilarity], result of:
          0.07665128 = score(doc=6,freq=8.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.33681408 = fieldWeight in 6, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0234375 = fieldNorm(doc=6)
      0.25 = coord(1/4)
    
    Abstract
    Why doesn't your home page appear on the first page of search results, even when you query your own name? How do other Web pages always appear at the top? What creates these powerful rankings? And how? The first book ever about the science of Web page rankings, "Google's PageRank and Beyond" supplies the answers to these and other questions and more. The book serves two very different audiences: the curious science reader and the technical computational reader. The chapters build in mathematical sophistication, so that the first five are accessible to the general academic reader. While other chapters are much more mathematical in nature, each one contains something for both audiences. For example, the authors include entertaining asides such as how search engines make money and how the Great Firewall of China influences research. The book includes an extensive background chapter designed to help readers learn more about the mathematics of search engines, and it contains several MATLAB codes and links to sample Web data sets. The philosophy throughout is to encourage readers to experiment with the ideas and algorithms in the text. Any business seriously interested in improving its rankings in the major search engines can benefit from the clear examples, sample code, and list of resources provided. It includes: many illustrative examples and entertaining asides; MATLAB code; accessible and informal style; and complete and self-contained section for mathematics review.
    Content
    Inhalt: Chapter 1. Introduction to Web Search Engines: 1.1 A Short History of Information Retrieval - 1.2 An Overview of Traditional Information Retrieval - 1.3 Web Information Retrieval Chapter 2. Crawling, Indexing, and Query Processing: 2.1 Crawling - 2.2 The Content Index - 2.3 Query Processing Chapter 3. Ranking Webpages by Popularity: 3.1 The Scene in 1998 - 3.2 Two Theses - 3.3 Query-Independence Chapter 4. The Mathematics of Google's PageRank: 4.1 The Original Summation Formula for PageRank - 4.2 Matrix Representation of the Summation Equations - 4.3 Problems with the Iterative Process - 4.4 A Little Markov Chain Theory - 4.5 Early Adjustments to the Basic Model - 4.6 Computation of the PageRank Vector - 4.7 Theorem and Proof for Spectrum of the Google Matrix Chapter 5. Parameters in the PageRank Model: 5.1 The a Factor - 5.2 The Hyperlink Matrix H - 5.3 The Teleportation Matrix E Chapter 6. The Sensitivity of PageRank; 6.1 Sensitivity with respect to alpha - 6.2 Sensitivity with respect to H - 6.3 Sensitivity with respect to vT - 6.4 Other Analyses of Sensitivity - 6.5 Sensitivity Theorems and Proofs Chapter 7. The PageRank Problem as a Linear System: 7.1 Properties of (I - alphaS) - 7.2 Properties of (I - alphaH) - 7.3 Proof of the PageRank Sparse Linear System Chapter 8. Issues in Large-Scale Implementation of PageRank: 8.1 Storage Issues - 8.2 Convergence Criterion - 8.3 Accuracy - 8.4 Dangling Nodes - 8.5 Back Button Modeling
  14. Lempel, R.; Moran, S.: SALSA: the stochastic approach for link-structure analysis (2001) 0.02
    0.015969018 = product of:
      0.06387607 = sum of:
        0.06387607 = weight(_text_:engines in 10) [ClassicSimilarity], result of:
          0.06387607 = score(doc=10,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.2806784 = fieldWeight in 10, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.0390625 = fieldNorm(doc=10)
      0.25 = coord(1/4)
    
    Abstract
    Today, when searching for information on the WWW, one usually performs a query through a term-based search engine. These engines return, as the query's result, a list of Web pages whose contents matches the query. For broad-topic queries, such searches often result in a huge set of retrieved documents, many of which are irrelevant to the user. However, much information is contained in the link-structure of the WWW. Information such as which pages are linked to others can be used to augment search algorithms. In this context, Jon Kleinberg introduced the notion of two distinct types of Web pages: hubs and authorities. Kleinberg argued that hubs and authorities exhibit a mutually reinforcing relationship: a good hub will point to many authorities, and a good authority will be pointed at by many hubs. In light of this, he dervised an algoirthm aimed at finding authoritative pages. We present SALSA, a new stochastic approach for link-structure analysis, which examines random walks on graphs derived from the link-structure. We show that both SALSA and Kleinberg's Mutual Reinforcement approach employ the same metaalgorithm. We then prove that SALSA is quivalent to a weighted in degree analysis of the link-sturcutre of WWW subgraphs, making it computationally more efficient than the Mutual reinforcement approach. We compare that results of applying SALSA to the results derived through Kleinberg's approach. These comparisions reveal a topological Phenomenon called the TKC effectwhich, in certain cases, prevents the Mutual reinforcement approach from identifying meaningful authorities.
  15. Austin, D.: How Google finds your needle in the Web's haystack : as we'll see, the trick is to ask the web itself to rank the importance of pages... (2006) 0.02
    0.01580852 = product of:
      0.06323408 = sum of:
        0.06323408 = weight(_text_:engines in 93) [ClassicSimilarity], result of:
          0.06323408 = score(doc=93,freq=4.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.27785745 = fieldWeight in 93, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.02734375 = fieldNorm(doc=93)
      0.25 = coord(1/4)
    
    Abstract
    Imagine a library containing 25 billion documents but with no centralized organization and no librarians. In addition, anyone may add a document at any time without telling anyone. You may feel sure that one of the documents contained in the collection has a piece of information that is vitally important to you, and, being impatient like most of us, you'd like to find it in a matter of seconds. How would you go about doing it? Posed in this way, the problem seems impossible. Yet this description is not too different from the World Wide Web, a huge, highly-disorganized collection of documents in many different formats. Of course, we're all familiar with search engines (perhaps you found this article using one) so we know that there is a solution. This article will describe Google's PageRank algorithm and how it returns pages from the web's collection of 25 billion documents that match search criteria so well that "google" has become a widely used verb. Most search engines, including Google, continually run an army of computer programs that retrieve pages from the web, index the words in each document, and store this information in an efficient format. Each time a user asks for a web search using a search phrase, such as "search engine," the search engine determines all the pages on the web that contains the words in the search phrase. (Perhaps additional information such as the distance between the words "search" and "engine" will be noted as well.) Here is the problem: Google now claims to index 25 billion pages. Roughly 95% of the text in web pages is composed from a mere 10,000 words. This means that, for most searches, there will be a huge number of pages containing the words in the search phrase. What is needed is a means of ranking the importance of the pages that fit the search criteria so that the pages can be sorted with the most important pages at the top of the list. One way to determine the importance of pages is to use a human-generated ranking. For instance, you may have seen pages that consist mainly of a large number of links to other resources in a particular area of interest. Assuming the person maintaining this page is reliable, the pages referenced are likely to be useful. Of course, the list may quickly fall out of date, and the person maintaining the list may miss some important pages, either unintentionally or as a result of an unstated bias. Google's PageRank algorithm assesses the importance of web pages without human evaluation of the content. In fact, Google feels that the value of its service is largely in its ability to provide unbiased results to search queries; Google claims, "the heart of our software is PageRank." As we'll see, the trick is to ask the web itself to rank the importance of pages.
  16. Lanvent, A.: Licht im Daten Chaos (2004) 0.01
    0.012775214 = product of:
      0.051100858 = sum of:
        0.051100858 = weight(_text_:engines in 2806) [ClassicSimilarity], result of:
          0.051100858 = score(doc=2806,freq=2.0), product of:
            0.22757743 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.04479146 = queryNorm
            0.22454272 = fieldWeight in 2806, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.03125 = fieldNorm(doc=2806)
      0.25 = coord(1/4)
    
    Content
    "Bitte suchen Sie alle Unterlagen, die im PC zum Ibelshäuser-Vertrag in Sprockhövel gespeichert sind. Finden Sie alles, was wir haben - Dokumente, Tabellen, Präsentationen, Scans, E-Mails. Und erledigen Sie das gleich! « Wer diese Aufgabe an das Windows-eigene Suchmodul vergibt, wird zwangsläufig enttäuscht. Denn das Betriebssystem beherrscht weder die formatübergreifende Recherche noch die Kontextsuche, die für solche komplexen Aufträge nötig sind. Professionelle Desktop-Suchmaschinen erledigen Aufgaben dieser Art jedoch im Handumdrehen - genauer gesagt in einer einzigen Sekunde. Spitzenprogramme wie Global Brain benötigen dafür nicht einmal umfangreiche Abfrageformulare. Es genügt, einen Satz im Eingabefeld zu formulieren, der das Thema der gewünschten Dokumente eingrenzt. Dabei suchen die Programme über alle Laufwerke, die sich auf dem System einbinden lassen - also auch im Netzwerk-Ordner (Shared Folder), sofern dieser freigegeben wurde. Allen Testkandidaten - mit Ausnahme von Search 32 - gemeinsam ist, dass sie weitaus bessere Rechercheergebnisse abliefern als Windows, deutlich schneller arbeiten und meist auch in den Online-Postfächern stöbern. Wer schon öfter vergeblich über die Windows-Suche nach wichtigen Dokumenten gefahndet hat, kommt angesichts der Qualität der Search-Engines kaum mehr um die Anschaffung eines Desktop-Suchtools herum. Aber Microsoft will nachbessern. Für den Windows-XP-Nachfolger Longhorn wirbt der Hersteller vor allem mit dem Hinweis auf das neue Dateisystem WinFS, das sämtliche Files auf der Festplatte über Meta-Tags indiziert und dem Anwender damit lange Suchläufe erspart. So sollen sich anders als bei Windows XP alle Dateien zu bestimmten Themen in wenigen Sekunden auflisten lassen - unabhängig vom Format und vom physikalischen Speicherort der Files. Für die Recherche selbst ist dann weder der Dateiname noch das Erstelldatum ausschlaggebend. Anhand der kontextsensitiven Suche von WinFS kann der Anwender einfach einen Suchbefehl wie »Vertragsabschluss mit Firma XYZ, Neunkirchen/Saar« eingeben, der dann ohne Umwege zum Ziel führt."
  17. Tober, M.; Hennig, L.; Furch, D.: SEO Ranking-Faktoren und Rang-Korrelationen 2014 : Google Deutschland (2014) 0.01
    0.0060686246 = product of:
      0.024274498 = sum of:
        0.024274498 = product of:
          0.048548996 = sum of:
            0.048548996 = weight(_text_:22 in 1484) [ClassicSimilarity], result of:
              0.048548996 = score(doc=1484,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.30952093 = fieldWeight in 1484, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1484)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    13. 9.2014 14:45:22
  18. Kanaeva, Z.: Ranking: Google und CiteSeer (2005) 0.01
    0.0053100465 = product of:
      0.021240186 = sum of:
        0.021240186 = product of:
          0.042480372 = sum of:
            0.042480372 = weight(_text_:22 in 3276) [ClassicSimilarity], result of:
              0.042480372 = score(doc=3276,freq=2.0), product of:
                0.15685207 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04479146 = queryNorm
                0.2708308 = fieldWeight in 3276, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=3276)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    20. 3.2005 16:23:22

Languages

  • e 14
  • d 4

Types