Document (#32011)

Author
Lempel, R.
Moran, S.
Title
SALSA: the stochastic approach for link-structure analysis
Source
ACM transactions on information systems. 19(2001) no.2, pp. 131-160
Year
2001
Abstract
Today, when searching for information on the WWW, one usually performs a query through a term-based search engine. These engines return, as the query's result, a list of Web pages whose contents match the query. For broad-topic queries, such searches often result in a huge set of retrieved documents, many of which are irrelevant to the user. However, much information is contained in the link-structure of the WWW. Information such as which pages are linked to others can be used to augment search algorithms. In this context, Jon Kleinberg introduced the notion of two distinct types of Web pages: hubs and authorities. Kleinberg argued that hubs and authorities exhibit a mutually reinforcing relationship: a good hub will point to many authorities, and a good authority will be pointed at by many hubs. In light of this, he devised an algorithm aimed at finding authoritative pages. We present SALSA, a new stochastic approach for link-structure analysis, which examines random walks on graphs derived from the link-structure. We show that both SALSA and Kleinberg's Mutual Reinforcement approach employ the same meta-algorithm. We then prove that SALSA is equivalent to a weighted in-degree analysis of the link-structure of WWW subgraphs, making it computationally more efficient than the Mutual Reinforcement approach. We compare the results of applying SALSA to the results derived through Kleinberg's approach. These comparisons reveal a topological phenomenon called the TKC effect, which, in certain cases, prevents the Mutual Reinforcement approach from identifying meaningful authorities.
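
The equivalence claimed in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names and the toy graph are hypothetical, and it assumes a connected link graph with no duplicate links. The walk steps backward along a random in-link and then forward along a random out-link; its long-run authority scores are compared with in-degree divided by the number of links.

```python
from collections import defaultdict


def salsa_authority_scores(edges, iterations=200):
    """Power-iterate a SALSA-style authority walk over (hub, authority) links.

    Assumption: the graph is connected and contains no duplicate links.
    """
    out_links = defaultdict(list)  # hub -> pages it points to
    in_links = defaultdict(list)   # authority -> pages pointing to it
    for src, dst in edges:
        out_links[src].append(dst)
        in_links[dst].append(src)

    authorities = sorted(in_links)
    score = {a: 1.0 / len(authorities) for a in authorities}
    for _ in range(iterations):
        nxt = dict.fromkeys(authorities, 0.0)
        for a in authorities:
            # Backward over a uniformly chosen in-link, then forward
            # over a uniformly chosen out-link of that hub.
            for hub in in_links[a]:
                share = score[a] / len(in_links[a]) / len(out_links[hub])
                for target in out_links[hub]:
                    nxt[target] += share
        score = nxt
    return score


def indegree_scores(edges):
    """In-degree of each authority divided by the total number of links."""
    indeg = defaultdict(int)
    for _, dst in edges:
        indeg[dst] += 1
    return {a: d / len(edges) for a, d in indeg.items()}


# Hypothetical toy graph: three hubs, three authorities, five links.
edges = [("h1", "a1"), ("h1", "a2"), ("h2", "a2"), ("h2", "a3"), ("h3", "a2")]
walk = salsa_authority_scores(edges)
degree = indegree_scores(edges)
for page in sorted(degree):
    print(page, round(walk[page], 6), round(degree[page], 6))
```

On this toy graph the two score vectors agree up to floating-point error, which is the sense in which the in-degree computation can replace the iterative walk. Graphs with several connected components need the per-component weighting described in the paper, which this sketch does not cover.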
Theme
Search engines
Retrieval algorithms

Similar documents (author)

  1. Moran, J.M.: Influencia dos meios de comunicacao no conhecimento (1994) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:moran in 2376) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 2376, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=2376)
    
  2. Moran, D.B.: Multimodal user interfaces in the Open Agent Architecture (1998) 5.94
    5.937289 = sum of:
      5.937289 = weight(author_txt:moran in 3838) [ClassicSimilarity], result of:
        5.937289 = fieldWeight in 3838, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.625 = fieldNorm(doc=3838)
    
  3. Kilgour, F.G.; Moran, B.B.: Surname plus recallable title word searches for known items by scholars (2000) 4.75
    4.749831 = sum of:
      4.749831 = weight(author_txt:moran in 4296) [ClassicSimilarity], result of:
        4.749831 = fieldWeight in 4296, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.5 = fieldNorm(doc=4296)
    
  4. Olson, G.M.; Moran, T.P.: Introduction to this special issue on experimental comparisons of usability evaluation methods (1998) 4.75
    4.749831 = sum of:
      4.749831 = weight(author_txt:moran in 5102) [ClassicSimilarity], result of:
        4.749831 = fieldWeight in 5102, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.5 = fieldNorm(doc=5102)
    
  5. Kilgour, F.G.; Moran, B.B.; Barden, J.R.: Retrieval effectiveness of surname-title-word searches for known items by academic library users (1999) 3.56
    3.5623734 = sum of:
      3.5623734 = weight(author_txt:moran in 3061) [ClassicSimilarity], result of:
        3.5623734 = fieldWeight in 3061, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.499662 = idf(docFreq=8, maxDocs=44218)
          0.375 = fieldNorm(doc=3061)
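
The author scores above can be re-derived by hand. The sketch below assumes the classic Lucene TF-IDF definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which appear to reproduce the idf value shown; fieldNorm is taken as given from the listing, and the function names are illustrative only.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed ClassicSimilarity-style inverse document frequency.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq).
    return math.sqrt(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the listing: the term "moran" occurs in 8 of 44218 records.
for norm in (0.625, 0.5, 0.375):
    print(norm, round(field_weight(1.0, 8, 44218, norm), 4))
```

Since tf and idf are identical for all five hits, the ranking here is driven entirely by fieldNorm, which is larger for records with fewer author names.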
    

Similar documents (content)

  1. Haider, J.: The structuring of information through search : sorting waste with Google (2016) 0.14
    0.13574222 = sum of:
      0.13574222 = product of:
        0.42419443 = sum of:
          0.020753402 = weight(abstract_txt:through in 3073) [ClassicSimilarity], result of:
            0.020753402 = score(doc=3073,freq=3.0), product of:
              0.054622054 = queryWeight, product of:
                1.0079343 = boost
                4.011184 = idf(docFreq=2176, maxDocs=44218)
                0.013510244 = queryNorm
              0.3799455 = fieldWeight in 3073, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.011184 = idf(docFreq=2176, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3073)
          0.011968634 = weight(abstract_txt:which in 3073) [ClassicSimilarity], result of:
            0.011968634 = score(doc=3073,freq=3.0), product of:
              0.04332135 = queryWeight, product of:
                1.0993724 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.013510244 = queryNorm
              0.27627566 = fieldWeight in 3073, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3073)
          0.008555747 = weight(abstract_txt:that in 3073) [ClassicSimilarity], result of:
            0.008555747 = score(doc=3073,freq=3.0), product of:
              0.03812037 = queryWeight, product of:
                1.190808 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.013510244 = queryNorm
              0.22444029 = fieldWeight in 3073, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3073)
          0.028205829 = weight(abstract_txt:query in 3073) [ClassicSimilarity], result of:
            0.028205829 = score(doc=3073,freq=2.0), product of:
              0.07671815 = queryWeight, product of:
                1.1945306 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.013510244 = queryNorm
              0.36765522 = fieldWeight in 3073, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3073)
          0.027162865 = weight(abstract_txt:analysis in 3073) [ClassicSimilarity], result of:
            0.027162865 = score(doc=3073,freq=4.0), product of:
              0.06797403 = queryWeight, product of:
                1.3770996 = boost
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.013510244 = queryNorm
              0.3996065 = fieldWeight in 3073, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3073)
          0.05852369 = weight(abstract_txt:approach in 3073) [ClassicSimilarity], result of:
            0.05852369 = score(doc=3073,freq=4.0), product of:
              0.14286432 = queryWeight, product of:
                2.823389 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.013510244 = queryNorm
              0.40964526 = fieldWeight in 3073, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3073)
          0.18270934 = weight(abstract_txt:authorities in 3073) [ClassicSimilarity], result of:
            0.18270934 = score(doc=3073,freq=2.0), product of:
              0.33588406 = queryWeight, product of:
                3.5347435 = boost
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.013510244 = queryNorm
              0.54396546 = fieldWeight in 3073, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3073)
          0.08631492 = weight(abstract_txt:link in 3073) [ClassicSimilarity], result of:
            0.08631492 = score(doc=3073,freq=1.0), product of:
              0.27651548 = queryWeight, product of:
                3.585733 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.013510244 = queryNorm
              0.3121522 = fieldWeight in 3073, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3073)
        0.32 = coord(8/25)
    
  2. Wei, F.; Li, W.; Lu, Q.; He, Y.: Applying two-level reinforcement ranking in query-oriented multidocument summarization (2009) 0.12
    0.122118525 = sum of:
      0.122118525 = product of:
        0.6105926 = sum of:
          0.011290658 = weight(abstract_txt:that in 3120) [ClassicSimilarity], result of:
            0.011290658 = score(doc=3120,freq=4.0), product of:
              0.03812037 = queryWeight, product of:
                1.190808 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.013510244 = queryNorm
              0.2961844 = fieldWeight in 3120, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=3120)
          0.022793753 = weight(abstract_txt:query in 3120) [ClassicSimilarity], result of:
            0.022793753 = score(doc=3120,freq=1.0), product of:
              0.07671815 = queryWeight, product of:
                1.1945306 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.013510244 = queryNorm
              0.2971103 = fieldWeight in 3120, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0625 = fieldNorm(doc=3120)
          0.02163339 = weight(abstract_txt:many in 3120) [ClassicSimilarity], result of:
            0.02163339 = score(doc=3120,freq=1.0), product of:
              0.084814034 = queryWeight, product of:
                1.5382527 = boost
                4.081096 = idf(docFreq=2029, maxDocs=44218)
                0.013510244 = queryNorm
              0.2550685 = fieldWeight in 3120, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.081096 = idf(docFreq=2029, maxDocs=44218)
                0.0625 = fieldNorm(doc=3120)
          0.17099436 = weight(abstract_txt:mutual in 3120) [ClassicSimilarity], result of:
            0.17099436 = score(doc=3120,freq=2.0), product of:
              0.2671135 = queryWeight, product of:
                2.7298687 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.013510244 = queryNorm
              0.64015615 = fieldWeight in 3120, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.0625 = fieldNorm(doc=3120)
          0.38388047 = weight(abstract_txt:reinforcement in 3120) [ClassicSimilarity], result of:
            0.38388047 = score(doc=3120,freq=3.0), product of:
              0.40007517 = queryWeight, product of:
                3.340909 = boost
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.013510244 = queryNorm
              0.9595209 = fieldWeight in 3120, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.0625 = fieldNorm(doc=3120)
        0.2 = coord(5/25)
    
  3. Kleinberg, J.M.: Authoritative sources in a hyperlinked environment (1998) 0.12
    0.11696108 = sum of:
      0.11696108 = product of:
        0.48733783 = sum of:
          0.017117118 = weight(abstract_txt:through in 5) [ClassicSimilarity], result of:
            0.017117118 = score(doc=5,freq=1.0), product of:
              0.054622054 = queryWeight, product of:
                1.0079343 = boost
                4.011184 = idf(docFreq=2176, maxDocs=44218)
                0.013510244 = queryNorm
              0.31337377 = fieldWeight in 5, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.011184 = idf(docFreq=2176, maxDocs=44218)
                0.078125 = fieldNorm(doc=5)
          0.009979626 = weight(abstract_txt:that in 5) [ClassicSimilarity], result of:
            0.009979626 = score(doc=5,freq=2.0), product of:
              0.03812037 = queryWeight, product of:
                1.190808 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.013510244 = queryNorm
              0.26179248 = fieldWeight in 5, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=5)
          0.019402046 = weight(abstract_txt:analysis in 5) [ClassicSimilarity], result of:
            0.019402046 = score(doc=5,freq=1.0), product of:
              0.06797403 = queryWeight, product of:
                1.3770996 = boost
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.013510244 = queryNorm
              0.2854332 = fieldWeight in 5, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.078125 = fieldNorm(doc=5)
          0.06208967 = weight(abstract_txt:structure in 5) [ClassicSimilarity], result of:
            0.06208967 = score(doc=5,freq=2.0), product of:
              0.12895173 = queryWeight, product of:
                2.1901648 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.013510244 = queryNorm
              0.48149544 = fieldWeight in 5, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.078125 = fieldNorm(doc=5)
          0.13213532 = weight(abstract_txt:pages in 5) [ClassicSimilarity], result of:
            0.13213532 = score(doc=5,freq=2.0), product of:
              0.21335043 = queryWeight, product of:
                2.81715 = boost
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.013510244 = queryNorm
              0.61933464 = fieldWeight in 5, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.078125 = fieldNorm(doc=5)
          0.24661404 = weight(abstract_txt:link in 5) [ClassicSimilarity], result of:
            0.24661404 = score(doc=5,freq=4.0), product of:
              0.27651548 = queryWeight, product of:
                3.585733 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.013510244 = queryNorm
              0.8918634 = fieldWeight in 5, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.078125 = fieldNorm(doc=5)
        0.24 = coord(6/25)
    
  4. Bidoki, A.M.Z.; Yazdani, N.: an intelligent ranking algorithm for web pages : DistanceRank (2008) 0.12
    0.11640643 = sum of:
      0.11640643 = product of:
        0.58203214 = sum of:
          0.009871564 = weight(abstract_txt:which in 2068) [ClassicSimilarity], result of:
            0.009871564 = score(doc=2068,freq=1.0), product of:
              0.04332135 = queryWeight, product of:
                1.0993724 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.013510244 = queryNorm
              0.22786833 = fieldWeight in 2068, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.078125 = fieldNorm(doc=2068)
          0.009979626 = weight(abstract_txt:that in 2068) [ClassicSimilarity], result of:
            0.009979626 = score(doc=2068,freq=2.0), product of:
              0.03812037 = queryWeight, product of:
                1.190808 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.013510244 = queryNorm
              0.26179248 = fieldWeight in 2068, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=2068)
          0.16183205 = weight(abstract_txt:pages in 2068) [ClassicSimilarity], result of:
            0.16183205 = score(doc=2068,freq=3.0), product of:
              0.21335043 = queryWeight, product of:
                2.81715 = boost
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.013510244 = queryNorm
              0.7585269 = fieldWeight in 2068, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.078125 = fieldNorm(doc=2068)
          0.27704188 = weight(abstract_txt:reinforcement in 2068) [ClassicSimilarity], result of:
            0.27704188 = score(doc=2068,freq=1.0), product of:
              0.40007517 = queryWeight, product of:
                3.340909 = boost
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.013510244 = queryNorm
              0.69247454 = fieldWeight in 2068, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.078125 = fieldNorm(doc=2068)
          0.12330702 = weight(abstract_txt:link in 2068) [ClassicSimilarity], result of:
            0.12330702 = score(doc=2068,freq=1.0), product of:
              0.27651548 = queryWeight, product of:
                3.585733 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.013510244 = queryNorm
              0.4459317 = fieldWeight in 2068, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.078125 = fieldNorm(doc=2068)
        0.2 = coord(5/25)
    
  5. Picard, J.; Savoy, J.: Enhancing retrieval with hyperlinks : a general model based on propositional argumentation systems (2003) 0.11
    0.11061962 = sum of:
      0.11061962 = product of:
        0.4609151 = sum of:
          0.007897251 = weight(abstract_txt:which in 1427) [ClassicSimilarity], result of:
            0.007897251 = score(doc=1427,freq=1.0), product of:
              0.04332135 = queryWeight, product of:
                1.0993724 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.013510244 = queryNorm
              0.18229467 = fieldWeight in 1427, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.0625 = fieldNorm(doc=1427)
          0.005645329 = weight(abstract_txt:that in 1427) [ClassicSimilarity], result of:
            0.005645329 = score(doc=1427,freq=1.0), product of:
              0.03812037 = queryWeight, product of:
                1.190808 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.013510244 = queryNorm
              0.1480922 = fieldWeight in 1427, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=1427)
          0.03512322 = weight(abstract_txt:structure in 1427) [ClassicSimilarity], result of:
            0.03512322 = score(doc=1427,freq=1.0), product of:
              0.12895173 = queryWeight, product of:
                2.1901648 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.013510244 = queryNorm
              0.27237496 = fieldWeight in 1427, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.0625 = fieldNorm(doc=1427)
          0.033442106 = weight(abstract_txt:approach in 1427) [ClassicSimilarity], result of:
            0.033442106 = score(doc=1427,freq=1.0), product of:
              0.14286432 = queryWeight, product of:
                2.823389 = boost
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.013510244 = queryNorm
              0.234083 = fieldWeight in 1427, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.745328 = idf(docFreq=2839, maxDocs=44218)
                0.0625 = fieldNorm(doc=1427)
          0.23115571 = weight(abstract_txt:hubs in 1427) [ClassicSimilarity], result of:
            0.23115571 = score(doc=1427,freq=1.0), product of:
              0.4114538 = queryWeight, product of:
                3.3880856 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.013510244 = queryNorm
              0.5618023 = fieldWeight in 1427, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.0625 = fieldNorm(doc=1427)
          0.14765145 = weight(abstract_txt:authorities in 1427) [ClassicSimilarity], result of:
            0.14765145 = score(doc=1427,freq=1.0), product of:
              0.33588406 = queryWeight, product of:
                3.5347435 = boost
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.013510244 = queryNorm
              0.4395905 = fieldWeight in 1427, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.033448 = idf(docFreq=105, maxDocs=44218)
                0.0625 = fieldNorm(doc=1427)
        0.24 = coord(6/25)
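
The content scores combine per-term weights with a coordination factor. As a worked check, the sketch below recomputes the total for hit 2 (doc 3120) from the boosts, frequencies, queryNorm and fieldNorm printed above. It assumes the same classic TF-IDF definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the helper names are illustrative only.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.013510244  # copied from the listing


def classic_idf(doc_freq):
    # Assumed ClassicSimilarity-style inverse document frequency.
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))


def term_score(boost, doc_freq, freq, field_norm):
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM           # queryWeight
    field_weight = math.sqrt(freq) * idf * field_norm  # fieldWeight
    return query_weight * field_weight


# (boost, docFreq, freq) for the five matching terms of doc 3120.
terms = {
    "that": (1.190808, 11241, 4.0),
    "query": (1.1945306, 1035, 1.0),
    "many": (1.5382527, 2029, 1.0),
    "mutual": (2.7298687, 85, 2.0),
    "reinforcement": (3.340909, 16, 3.0),
}
FIELD_NORM = 0.0625
coord = 5 / 25  # 5 of the 25 query terms matched

total = coord * sum(term_score(b, df, f, FIELD_NORM) for b, df, f in terms.values())
print(round(total, 4))
```

The rare term "reinforcement" contributes most of the sum, while coord scales the result down because only a fifth of the query terms matched.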