Document (#31928)

Author
Henzinger, M.R.
Title
Link analysis in Web information retrieval
Source
IEEE data engineering bulletin. 23(2000) no.3, pp.3-8
Year
2000
Abstract
The analysis of the hyperlink structure of the web has led to significant improvements in web information retrieval. This survey describes two successful link analysis algorithms and the state of the art of the field.
Content
The goal of information retrieval is to find all documents relevant to a user query in a collection of documents. Decades of research in information retrieval succeeded in developing and refining techniques that are solely word-based (see, e.g., [2]). With the advent of the web, new sources of information became available, among them the hyperlinks between documents and records of user behavior. To be precise, hypertexts (i.e., collections of documents connected by hyperlinks) have existed and have been studied for a long time. What was new was the large number of hyperlinks created by independent individuals. Hyperlinks provide a valuable source of information for web information retrieval, as we will show in this article. This area of information retrieval is commonly called link analysis. Why would one expect hyperlinks to be useful? A hyperlink is a reference to a web page B that is contained in a web page A. When the hyperlink is clicked on in a web browser, the browser displays page B. This functionality alone is not helpful for web information retrieval. However, the way authors of web pages typically use hyperlinks can give them valuable information content. Typically, authors create links because they think the links will be useful to readers of their pages. Thus, links are usually either navigational aids that, for example, bring the reader back to the homepage of the site, or links that point to pages whose content augments the content of the current page. Links of the second kind tend to point to high-quality pages that might be on the same topic as the page containing the link.
Theme
Retrieval algorithms
Object
Google

Similar documents (author)

  1. Henzinger, M.R.: Hyperlink analysis for the Web (2001) 6.17
    6.169457 = sum of:
      6.169457 = weight(author_txt:henzinger in 2009) [ClassicSimilarity], result of:
        6.169457 = fieldWeight in 2009, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.871131 = idf(docFreq=5, maxDocs=42740)
          0.625 = fieldNorm(doc=2009)
    
  2. Dean, J.; Henzinger, M.R.: Finding related pages in the World Wide Web (1999) 4.94
    4.9355655 = sum of:
      4.9355655 = weight(author_txt:henzinger in 285) [ClassicSimilarity], result of:
        4.9355655 = fieldWeight in 285, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.871131 = idf(docFreq=5, maxDocs=42740)
          0.5 = fieldNorm(doc=285)
    
  3. Henzinger, M.; Pöppe, C.: "Qualität der Suchergebnisse ist unser höchstes Ziel" : Suchmaschine Google (2002) 4.94
    4.9355655 = sum of:
      4.9355655 = weight(author_txt:henzinger in 1852) [ClassicSimilarity], result of:
        4.9355655 = fieldWeight in 1852, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.871131 = idf(docFreq=5, maxDocs=42740)
          0.5 = fieldNorm(doc=1852)
    
  4. Henzinger, M.; Wiesemann, M.: Google-Forschungschefin Monika Henzinger beklagt Manipulationen von Suchmaschinen : "Tricks der Porno-Branche" (2002) 4.94
    4.9355655 = sum of:
      4.9355655 = weight(author_txt:henzinger in 2138) [ClassicSimilarity], result of:
        4.9355655 = fieldWeight in 2138, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.871131 = idf(docFreq=5, maxDocs=42740)
          0.5 = fieldNorm(doc=2138)
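
The author-match scores above can be checked by hand. The following is a minimal sketch, assuming the classic TF-IDF definitions that reproduce the numbers shown in these trees: tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The function names are illustrative and not part of any library. The fieldNorm values are taken from the listing as given, since they are stored per document and cannot be derived from the other figures.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed formula; it matches idf(docFreq=5, maxDocs=42740) = 9.871131 above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    tf = math.sqrt(freq)
    return tf * classic_idf(doc_freq, max_docs) * field_norm

# Entry 1 (doc 2009): fieldNorm 0.625; entries 2-4: fieldNorm 0.5
score_entry_1 = field_weight(1.0, 5, 42740, 0.625)
score_entries_2_to_4 = field_weight(1.0, 5, 42740, 0.5)

print(f"{score_entry_1:.4f} {score_entries_2_to_4:.4f}")
```

The only difference between the first entry and the other three is the fieldNorm, which is larger for the document with the shorter author field.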
    

Similar documents (content)

  1. Rasmussen, E.: Clustering algorithms (1992) 0.38
    0.38162428 = sum of:
      0.38162428 = product of:
        0.8177663 = sum of:
          0.04014714 = weight(abstract_txt:structure in 4514) [ClassicSimilarity], result of:
            0.04014714 = score(doc=4514,freq=1.0), product of:
              0.1469045 = queryWeight, product of:
                1.7898505 = boost
                4.3725977 = idf(docFreq=1465, maxDocs=42740)
                0.01877063 = queryNorm
              0.27328736 = fieldWeight in 4514, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3725977 = idf(docFreq=1465, maxDocs=42740)
                0.0625 = fieldNorm(doc=4514)
          0.06291501 = weight(abstract_txt:field in 4514) [ClassicSimilarity], result of:
            0.06291501 = score(doc=4514,freq=2.0), product of:
              0.15731068 = queryWeight, product of:
                1.8521591 = boost
                4.5248175 = idf(docFreq=1258, maxDocs=42740)
                0.01877063 = queryNorm
              0.39994115 = fieldWeight in 4514, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5248175 = idf(docFreq=1258, maxDocs=42740)
                0.0625 = fieldNorm(doc=4514)
          0.013782903 = weight(abstract_txt:information in 4514) [ClassicSimilarity], result of:
            0.013782903 = score(doc=4514,freq=1.0), product of:
              0.09074774 = queryWeight, product of:
                1.9894457 = boost
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.01877063 = queryNorm
              0.1518815 = fieldWeight in 4514, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.0625 = fieldNorm(doc=4514)
          0.12961623 = weight(abstract_txt:algorithms in 4514) [ClassicSimilarity], result of:
            0.12961623 = score(doc=4514,freq=2.0), product of:
              0.25469962 = queryWeight, product of:
                2.356749 = boost
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.01877063 = queryNorm
              0.50889844 = fieldWeight in 4514, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.0625 = fieldNorm(doc=4514)
          0.06919293 = weight(abstract_txt:retrieval in 4514) [ClassicSimilarity], result of:
            0.06919293 = score(doc=4514,freq=3.0), product of:
              0.18447722 = queryWeight, product of:
                2.8365183 = boost
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.01877063 = queryNorm
              0.37507573 = fieldWeight in 4514, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4648013 = idf(docFreq=3633, maxDocs=42740)
                0.0625 = fieldNorm(doc=4514)
          0.14457273 = weight(abstract_txt:analysis in 4514) [ClassicSimilarity], result of:
            0.14457273 = score(doc=4514,freq=4.0), product of:
              0.3135764 = queryWeight, product of:
                4.5293045 = boost
                3.6883576 = idf(docFreq=2905, maxDocs=42740)
                0.01877063 = queryNorm
              0.4610447 = fieldWeight in 4514, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.6883576 = idf(docFreq=2905, maxDocs=42740)
                0.0625 = fieldNorm(doc=4514)
          0.3575394 = weight(abstract_txt:link in 4514) [ClassicSimilarity], result of:
            0.3575394 = score(doc=4514,freq=4.0), product of:
              0.5009618 = queryWeight, product of:
                4.6742992 = boost
                5.709647 = idf(docFreq=384, maxDocs=42740)
                0.01877063 = queryNorm
              0.7137059 = fieldWeight in 4514, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.709647 = idf(docFreq=384, maxDocs=42740)
                0.0625 = fieldNorm(doc=4514)
        0.46666667 = coord(7/15)
    
  2. Henzinger, M.R.: Hyperlink analysis for the Web (2001) 0.36
    0.35882932 = sum of:
      0.35882932 = product of:
        1.3456099 = sum of:
          0.041348707 = weight(abstract_txt:information in 2009) [ClassicSimilarity], result of:
            0.041348707 = score(doc=2009,freq=1.0), product of:
              0.09074774 = queryWeight, product of:
                1.9894457 = boost
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.01877063 = queryNorm
              0.4556445 = fieldWeight in 2009, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.1875 = fieldNorm(doc=2009)
          0.3888487 = weight(abstract_txt:algorithms in 2009) [ClassicSimilarity], result of:
            0.3888487 = score(doc=2009,freq=2.0), product of:
              0.25469962 = queryWeight, product of:
                2.356749 = boost
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.01877063 = queryNorm
              1.5266953 = fieldWeight in 2009, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.757529 = idf(docFreq=366, maxDocs=42740)
                0.1875 = fieldNorm(doc=2009)
          0.6985534 = weight(abstract_txt:hyperlink in 2009) [ClassicSimilarity], result of:
            0.6985534 = score(doc=2009,freq=1.0), product of:
              0.47422478 = queryWeight, product of:
                3.215817 = boost
                7.856228 = idf(docFreq=44, maxDocs=42740)
                0.01877063 = queryNorm
              1.4730427 = fieldWeight in 2009, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.856228 = idf(docFreq=44, maxDocs=42740)
                0.1875 = fieldNorm(doc=2009)
          0.21685912 = weight(abstract_txt:analysis in 2009) [ClassicSimilarity], result of:
            0.21685912 = score(doc=2009,freq=1.0), product of:
              0.3135764 = queryWeight, product of:
                4.5293045 = boost
                3.6883576 = idf(docFreq=2905, maxDocs=42740)
                0.01877063 = queryNorm
              0.69156706 = fieldWeight in 2009, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6883576 = idf(docFreq=2905, maxDocs=42740)
                0.1875 = fieldNorm(doc=2009)
        0.26666668 = coord(4/15)
    
  3. Yang, P.; Gao, W.; Tan, Q.; Wong, K.-F.: A link-bridged topic model for cross-domain document classification (2013) 0.30
    0.30165946 = sum of:
      0.30165946 = product of:
        0.75414866 = sum of:
          0.007001714 = weight(abstract_txt:this in 4707) [ClassicSimilarity], result of:
            0.007001714 = score(doc=4707,freq=1.0), product of:
              0.045856573 = queryWeight, product of:
                2.442996 = idf(docFreq=10095, maxDocs=42740)
                0.01877063 = queryNorm
              0.15268725 = fieldWeight in 4707, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.442996 = idf(docFreq=10095, maxDocs=42740)
                0.0625 = fieldNorm(doc=4707)
          0.04014714 = weight(abstract_txt:structure in 4707) [ClassicSimilarity], result of:
            0.04014714 = score(doc=4707,freq=1.0), product of:
              0.1469045 = queryWeight, product of:
                1.7898505 = boost
                4.3725977 = idf(docFreq=1465, maxDocs=42740)
                0.01877063 = queryNorm
              0.27328736 = fieldWeight in 4707, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3725977 = idf(docFreq=1465, maxDocs=42740)
                0.0625 = fieldNorm(doc=4707)
          0.05491557 = weight(abstract_txt:state in 4707) [ClassicSimilarity], result of:
            0.05491557 = score(doc=4707,freq=1.0), product of:
              0.18102102 = queryWeight, product of:
                1.9868437 = boost
                4.8538513 = idf(docFreq=905, maxDocs=42740)
                0.01877063 = queryNorm
              0.3033657 = fieldWeight in 4707, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.8538513 = idf(docFreq=905, maxDocs=42740)
                0.0625 = fieldNorm(doc=4707)
          0.019491967 = weight(abstract_txt:information in 4707) [ClassicSimilarity], result of:
            0.019491967 = score(doc=4707,freq=2.0), product of:
              0.09074774 = queryWeight, product of:
                1.9894457 = boost
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.01877063 = queryNorm
              0.21479288 = fieldWeight in 4707, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.0625 = fieldNorm(doc=4707)
          0.23285112 = weight(abstract_txt:hyperlink in 4707) [ClassicSimilarity], result of:
            0.23285112 = score(doc=4707,freq=1.0), product of:
              0.47422478 = queryWeight, product of:
                3.215817 = boost
                7.856228 = idf(docFreq=44, maxDocs=42740)
                0.01877063 = queryNorm
              0.49101424 = fieldWeight in 4707, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.856228 = idf(docFreq=44, maxDocs=42740)
                0.0625 = fieldNorm(doc=4707)
          0.39974117 = weight(abstract_txt:link in 4707) [ClassicSimilarity], result of:
            0.39974117 = score(doc=4707,freq=5.0), product of:
              0.5009618 = queryWeight, product of:
                4.6742992 = boost
                5.709647 = idf(docFreq=384, maxDocs=42740)
                0.01877063 = queryNorm
              0.79794747 = fieldWeight in 4707, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.709647 = idf(docFreq=384, maxDocs=42740)
                0.0625 = fieldNorm(doc=4707)
        0.4 = coord(6/15)
    
  4. Thelwall, M.: A comparison of link and URL citation counting (2011) 0.30
    0.29859626 = sum of:
      0.29859626 = product of:
        0.8957887 = sum of:
          0.009901919 = weight(abstract_txt:this in 1534) [ClassicSimilarity], result of:
            0.009901919 = score(doc=1534,freq=2.0), product of:
              0.045856573 = queryWeight, product of:
                2.442996 = idf(docFreq=10095, maxDocs=42740)
                0.01877063 = queryNorm
              0.21593238 = fieldWeight in 1534, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.442996 = idf(docFreq=10095, maxDocs=42740)
                0.0625 = fieldNorm(doc=1534)
          0.07511226 = weight(abstract_txt:significant in 1534) [ClassicSimilarity], result of:
            0.07511226 = score(doc=1534,freq=2.0), product of:
              0.1770364 = queryWeight, product of:
                1.964855 = boost
                4.8001328 = idf(docFreq=955, maxDocs=42740)
                0.01877063 = queryNorm
              0.4242758 = fieldWeight in 1534, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.8001328 = idf(docFreq=955, maxDocs=42740)
                0.0625 = fieldNorm(doc=1534)
          0.23285112 = weight(abstract_txt:hyperlink in 1534) [ClassicSimilarity], result of:
            0.23285112 = score(doc=1534,freq=1.0), product of:
              0.47422478 = queryWeight, product of:
                3.215817 = boost
                7.856228 = idf(docFreq=44, maxDocs=42740)
                0.01877063 = queryNorm
              0.49101424 = fieldWeight in 1534, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.856228 = idf(docFreq=44, maxDocs=42740)
                0.0625 = fieldNorm(doc=1534)
          0.07228637 = weight(abstract_txt:analysis in 1534) [ClassicSimilarity], result of:
            0.07228637 = score(doc=1534,freq=1.0), product of:
              0.3135764 = queryWeight, product of:
                4.5293045 = boost
                3.6883576 = idf(docFreq=2905, maxDocs=42740)
                0.01877063 = queryNorm
              0.23052235 = fieldWeight in 1534, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6883576 = idf(docFreq=2905, maxDocs=42740)
                0.0625 = fieldNorm(doc=1534)
          0.50563705 = weight(abstract_txt:link in 1534) [ClassicSimilarity], result of:
            0.50563705 = score(doc=1534,freq=8.0), product of:
              0.5009618 = queryWeight, product of:
                4.6742992 = boost
                5.709647 = idf(docFreq=384, maxDocs=42740)
                0.01877063 = queryNorm
              1.0093325 = fieldWeight in 1534, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.709647 = idf(docFreq=384, maxDocs=42740)
                0.0625 = fieldNorm(doc=1534)
        0.33333334 = coord(5/15)
    
  5. Thelwall, M.; Li, X.; Barjak, F.; Robinson, S.: Assessing the international web connectivity of research groups (2008) 0.30
    0.29729256 = sum of:
      0.29729256 = product of:
        0.74323136 = sum of:
          0.012127325 = weight(abstract_txt:this in 3402) [ClassicSimilarity], result of:
            0.012127325 = score(doc=3402,freq=3.0), product of:
              0.045856573 = queryWeight, product of:
                2.442996 = idf(docFreq=10095, maxDocs=42740)
                0.01877063 = queryNorm
              0.26446208 = fieldWeight in 3402, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.442996 = idf(docFreq=10095, maxDocs=42740)
                0.0625 = fieldNorm(doc=3402)
          0.06291501 = weight(abstract_txt:field in 3402) [ClassicSimilarity], result of:
            0.06291501 = score(doc=3402,freq=2.0), product of:
              0.15731068 = queryWeight, product of:
                1.8521591 = boost
                4.5248175 = idf(docFreq=1258, maxDocs=42740)
                0.01877063 = queryNorm
              0.39994115 = fieldWeight in 3402, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.5248175 = idf(docFreq=1258, maxDocs=42740)
                0.0625 = fieldNorm(doc=3402)
          0.013782903 = weight(abstract_txt:information in 3402) [ClassicSimilarity], result of:
            0.013782903 = score(doc=3402,freq=1.0), product of:
              0.09074774 = queryWeight, product of:
                1.9894457 = boost
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.01877063 = queryNorm
              0.1518815 = fieldWeight in 3402, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.430104 = idf(docFreq=10226, maxDocs=42740)
                0.0625 = fieldNorm(doc=3402)
          0.3293012 = weight(abstract_txt:hyperlink in 3402) [ClassicSimilarity], result of:
            0.3293012 = score(doc=3402,freq=2.0), product of:
              0.47422478 = queryWeight, product of:
                3.215817 = boost
                7.856228 = idf(docFreq=44, maxDocs=42740)
                0.01877063 = queryNorm
              0.694399 = fieldWeight in 3402, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.856228 = idf(docFreq=44, maxDocs=42740)
                0.0625 = fieldNorm(doc=3402)
          0.07228637 = weight(abstract_txt:analysis in 3402) [ClassicSimilarity], result of:
            0.07228637 = score(doc=3402,freq=1.0), product of:
              0.3135764 = queryWeight, product of:
                4.5293045 = boost
                3.6883576 = idf(docFreq=2905, maxDocs=42740)
                0.01877063 = queryNorm
              0.23052235 = fieldWeight in 3402, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6883576 = idf(docFreq=2905, maxDocs=42740)
                0.0625 = fieldNorm(doc=3402)
          0.25281852 = weight(abstract_txt:link in 3402) [ClassicSimilarity], result of:
            0.25281852 = score(doc=3402,freq=2.0), product of:
              0.5009618 = queryWeight, product of:
                4.6742992 = boost
                5.709647 = idf(docFreq=384, maxDocs=42740)
                0.01877063 = queryNorm
              0.50466627 = fieldWeight in 3402, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.709647 = idf(docFreq=384, maxDocs=42740)
                0.0625 = fieldNorm(doc=3402)
        0.4 = coord(6/15)
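
The content-based scores combine several terms. As a worked check, the sketch below recomputes the score of entry 2 (doc 2009) from the values printed in its tree. It assumes that each term contributes queryWeight * fieldWeight, with queryWeight = boost * idf * queryNorm and fieldWeight = sqrt(freq) * idf * fieldNorm, and that the sum is scaled by coord = matched terms / query terms. The variable and function names are illustrative only.

```python
import math

QUERY_NORM = 0.01877063   # queryNorm as printed in the trees above
MAX_DOCS = 42740
FIELD_NORM = 0.1875       # fieldNorm(doc=2009) as printed above
QUERY_TERMS = 15          # denominator of coord(4/15)

def idf(doc_freq: int) -> float:
    # Assumed formula; it reproduces the idf values shown in the listing.
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))

def term_score(boost: float, doc_freq: int, freq: float) -> float:
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM
    return query_weight * field_weight

# (term, boost, docFreq, freq) copied from the tree for doc 2009
matched_terms = [
    ("information", 1.9894457, 10226, 1.0),
    ("algorithms", 2.356749, 366, 2.0),
    ("hyperlink", 3.215817, 44, 1.0),
    ("analysis", 4.5293045, 2905, 1.0),
]

raw_sum = sum(term_score(b, df, f) for _, b, df, f in matched_terms)
coord = len(matched_terms) / QUERY_TERMS
final_score = raw_sum * coord

print(f"sum={raw_sum:.4f} coord={coord:.4f} score={final_score:.4f}")
```

Because coord penalizes documents that match few of the 15 query terms, entry 2 ranks below entry 1 even though its unscaled sum is larger (1.3456 against 0.8178).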