Search (8 results, page 1 of 1)

  • × theme_ss:"Retrievalalgorithmen"
  • × theme_ss:"Suchmaschinen"
  1. Back, J.: ¬An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.02
    0.015257841 = product of:
      0.09154704 = sum of:
        0.09154704 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
          0.09154704 = score(doc=3445,freq=2.0), product of:
            0.1690115 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04826377 = queryNorm
            0.5416616 = fieldWeight in 3445, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=3445)
      0.16666667 = coord(1/6)
    
    Date
    25. 8.2005 17:42:22
  2. Ding, Y.; Chowdhury, G.; Foo, S.: Organsising keywords in a Web search environment : a methodology based on co-word analysis (2000) 0.01
    0.009606745 = product of:
      0.05764047 = sum of:
        0.05764047 = weight(_text_:problem in 105) [ClassicSimilarity], result of:
          0.05764047 = score(doc=105,freq=2.0), product of:
            0.20485485 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04826377 = queryNorm
            0.28137225 = fieldWeight in 105, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.046875 = fieldNorm(doc=105)
      0.16666667 = coord(1/6)
    
    Abstract
    The rapid development of the Internet and World Wide Web has caused some critical problem for information retrieval. Researchers have made several attempts to solve these problems. Thesauri and subject heading lists as traditional information retrieval tools have been criticised for their efficiency to tackle these newly emerging problems. This paper proposes an information retrieval tool generated by cocitation analysis, comprising keyword clusters with relationships based on the co-occurrences of keywords in the literature. Such a tool can play the role of an associative thesaurus that can provide information about the keywords in a domain that might be useful for information searching and query expansion
  3. Tober, M.; Hennig, L.; Furch, D.: SEO Ranking-Faktoren und Rang-Korrelationen 2014 : Google Deutschland (2014) 0.01
    0.008718766 = product of:
      0.052312598 = sum of:
        0.052312598 = weight(_text_:22 in 1484) [ClassicSimilarity], result of:
          0.052312598 = score(doc=1484,freq=2.0), product of:
            0.1690115 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04826377 = queryNorm
            0.30952093 = fieldWeight in 1484, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1484)
      0.16666667 = coord(1/6)
    
    Date
    13. 9.2014 14:45:22
  4. Austin, D.: How Google finds your needle in the Web's haystack : as we'll see, the trick is to ask the web itself to rank the importance of pages... (2006) 0.01
    0.007925161 = product of:
      0.047550965 = sum of:
        0.047550965 = weight(_text_:problem in 93) [ClassicSimilarity], result of:
          0.047550965 = score(doc=93,freq=4.0), product of:
            0.20485485 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04826377 = queryNorm
            0.23212028 = fieldWeight in 93, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.02734375 = fieldNorm(doc=93)
      0.16666667 = coord(1/6)
    
    Abstract
    Imagine a library containing 25 billion documents but with no centralized organization and no librarians. In addition, anyone may add a document at any time without telling anyone. You may feel sure that one of the documents contained in the collection has a piece of information that is vitally important to you, and, being impatient like most of us, you'd like to find it in a matter of seconds. How would you go about doing it? Posed in this way, the problem seems impossible. Yet this description is not too different from the World Wide Web, a huge, highly-disorganized collection of documents in many different formats. Of course, we're all familiar with search engines (perhaps you found this article using one) so we know that there is a solution. This article will describe Google's PageRank algorithm and how it returns pages from the web's collection of 25 billion documents that match search criteria so well that "google" has become a widely used verb. Most search engines, including Google, continually run an army of computer programs that retrieve pages from the web, index the words in each document, and store this information in an efficient format. Each time a user asks for a web search using a search phrase, such as "search engine," the search engine determines all the pages on the web that contains the words in the search phrase. (Perhaps additional information such as the distance between the words "search" and "engine" will be noted as well.) Here is the problem: Google now claims to index 25 billion pages. Roughly 95% of the text in web pages is composed from a mere 10,000 words. This means that, for most searches, there will be a huge number of pages containing the words in the search phrase. What is needed is a means of ranking the importance of the pages that fit the search criteria so that the pages can be sorted with the most important pages at the top of the list. One way to determine the importance of pages is to use a human-generated ranking. For instance, you may have seen pages that consist mainly of a large number of links to other resources in a particular area of interest. Assuming the person maintaining this page is reliable, the pages referenced are likely to be useful. Of course, the list may quickly fall out of date, and the person maintaining the list may miss some important pages, either unintentionally or as a result of an unstated bias. Google's PageRank algorithm assesses the importance of web pages without human evaluation of the content. In fact, Google feels that the value of its service is largely in its ability to provide unbiased results to search queries; Google claims, "the heart of our software is PageRank." As we'll see, the trick is to ask the web itself to rank the importance of pages.
  5. Kanaeva, Z.: Ranking: Google und CiteSeer (2005) 0.01
    0.0076289205 = product of:
      0.04577352 = sum of:
        0.04577352 = weight(_text_:22 in 3276) [ClassicSimilarity], result of:
          0.04577352 = score(doc=3276,freq=2.0), product of:
            0.1690115 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04826377 = queryNorm
            0.2708308 = fieldWeight in 3276, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3276)
      0.16666667 = coord(1/6)
    
    Date
    20. 3.2005 16:23:22
  6. Furner, J.: ¬A unifying model of document relatedness for hybrid search engines (2003) 0.01
    0.0065390747 = product of:
      0.03923445 = sum of:
        0.03923445 = weight(_text_:22 in 2717) [ClassicSimilarity], result of:
          0.03923445 = score(doc=2717,freq=2.0), product of:
            0.1690115 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04826377 = queryNorm
            0.23214069 = fieldWeight in 2717, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2717)
      0.16666667 = coord(1/6)
    
    Date
    11. 9.2004 17:32:22
  7. Wills, R.S.: Google's PageRank : the math behind the search engine (2006) 0.01
    0.0064044963 = product of:
      0.038426977 = sum of:
        0.038426977 = weight(_text_:problem in 5954) [ClassicSimilarity], result of:
          0.038426977 = score(doc=5954,freq=2.0), product of:
            0.20485485 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04826377 = queryNorm
            0.1875815 = fieldWeight in 5954, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.03125 = fieldNorm(doc=5954)
      0.16666667 = coord(1/6)
    
    Abstract
    Approximately 91 million American adults use the Internet on a typical day The number-one Internet activity is reading and writing e-mail. Search engine use is next in line and continues to increase in popularity. In fact, survey findings indicate that nearly 60 million American adults use search engines on a given day. Even though there are many Internet search engines, Google, Yahoo!, and MSN receive over 81% of all search requests. Despite claims that the quality of search provided by Yahoo! and MSN now equals that of Google, Google continues to thrive as the search engine of choice, receiving over 46% of all search requests, nearly double the volume of Yahoo! and over four times that of MSN. I use Google's search engine on a daily basis and rarely request information from other search engines. One day, I decided to visit the homepages of Google. Yahoo!, and MSN to compare the quality of search results. Coffee was on my mind that day, so I entered the simple query "coffee" in the search box at each homepage. Table 1 shows the top ten (unsponsored) results returned by each search engine. Although ordered differently, two webpages, www.peets.com and www.coffeegeek.com, appear in all three top ten lists. In addition, each pairing of top ten lists has two additional results in common. Depending on the information I hoped to obtain about coffee by using the search engines, I could argue that any one of the three returned better results: however, I was not looking for a particular webpage, so all three listings of search results seemed of equal quality. Thus, I plan to continue using Google. My decision is indicative of the problem Yahoo!, MSN, and other search engine companies face in the quest to obtain a larger percentage of Internet search volume. Search engine users are loyal to one or a few search engines and are generally happy with search results. Thus, as long as Google continues to provide results deemed high in quality, Google likely will remain the top search engine. But what set Google apart from its competitors in the first place? The answer is PageRank. In this article I explain this simple mathematical algorithm that revolutionized Web search.
  8. Langville, A.N.; Meyer, C.D.: Google's PageRank and beyond : the science of search engine rankings (2006) 0.00
    0.0048033725 = product of:
      0.028820235 = sum of:
        0.028820235 = weight(_text_:problem in 6) [ClassicSimilarity], result of:
          0.028820235 = score(doc=6,freq=2.0), product of:
            0.20485485 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04826377 = queryNorm
            0.14068612 = fieldWeight in 6, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0234375 = fieldNorm(doc=6)
      0.16666667 = coord(1/6)
    
    Content
    Inhalt: Chapter 1. Introduction to Web Search Engines: 1.1 A Short History of Information Retrieval - 1.2 An Overview of Traditional Information Retrieval - 1.3 Web Information Retrieval Chapter 2. Crawling, Indexing, and Query Processing: 2.1 Crawling - 2.2 The Content Index - 2.3 Query Processing Chapter 3. Ranking Webpages by Popularity: 3.1 The Scene in 1998 - 3.2 Two Theses - 3.3 Query-Independence Chapter 4. The Mathematics of Google's PageRank: 4.1 The Original Summation Formula for PageRank - 4.2 Matrix Representation of the Summation Equations - 4.3 Problems with the Iterative Process - 4.4 A Little Markov Chain Theory - 4.5 Early Adjustments to the Basic Model - 4.6 Computation of the PageRank Vector - 4.7 Theorem and Proof for Spectrum of the Google Matrix Chapter 5. Parameters in the PageRank Model: 5.1 The a Factor - 5.2 The Hyperlink Matrix H - 5.3 The Teleportation Matrix E Chapter 6. The Sensitivity of PageRank; 6.1 Sensitivity with respect to alpha - 6.2 Sensitivity with respect to H - 6.3 Sensitivity with respect to vT - 6.4 Other Analyses of Sensitivity - 6.5 Sensitivity Theorems and Proofs Chapter 7. The PageRank Problem as a Linear System: 7.1 Properties of (I - alphaS) - 7.2 Properties of (I - alphaH) - 7.3 Proof of the PageRank Sparse Linear System Chapter 8. Issues in Large-Scale Implementation of PageRank: 8.1 Storage Issues - 8.2 Convergence Criterion - 8.3 Accuracy - 8.4 Dangling Nodes - 8.5 Back Button Modeling