Document (#32005)

Author
Agosti, M.
Pretto, L.
Title
A theoretical study of a generalized version of Kleinberg's HITS algorithm
Source
Advances in mathematical/formal methods in information retrieval. 8(2005) no.2, pp.219-243
Year
2005
Abstract
Kleinberg's HITS (Hyperlink-Induced Topic Search) algorithm (Kleinberg 1999), which was originally developed in a Web context, tries to infer the authoritativeness of a Web page in relation to a specific query using the structure of a subgraph of the Web graph, which is obtained considering this specific query. Recent applications of this algorithm in contexts far removed from that of Web searching (Bacchin, Ferro and Melucci 2002, Ng et al. 2001) inspired us to study the algorithm in the abstract, independently of its particular applications, trying to mathematically illuminate its behaviour. In the present paper we detail this theoretical analysis. The original work starts from the definition of a revised and more general version of the algorithm, which includes the classic one as a particular case. We perform an analysis of the structure of two particular matrices, essential to studying the behaviour of the algorithm, and we prove the convergence of the algorithm in the most general case, finding the analytic expression of the vectors to which it converges. Then we study the symmetry of the algorithm and prove the equivalence between the existence of symmetry and the independence from the order of execution of some basic operations on initial vectors. Finally, we expound some interesting consequences of our theoretical results.
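For orientation, the classic algorithm that the abstract takes as its special case can be sketched as a pair of mutually reinforcing updates. This is a minimal illustration of standard HITS only, not the generalized version analysed in the paper; the function names `hits` and `normalize`, the edge-list input format, and the fixed iteration count are assumptions made for this sketch.

```python
import math

def normalize(scores):
    # Scale a score dictionary to unit Euclidean length.
    norm = math.sqrt(sum(v * v for v in scores.values()))
    return {n: (v / norm if norm else 0.0) for n, v in scores.items()}

def hits(edges, iterations=50):
    """Classic HITS on a directed graph given as (source, target) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of the nodes linking in.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Hub update: sum the fresh authority scores of the nodes linked to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth

# Hypothetical toy graph: a links to c and d, b links to c.
hub, auth = hits([("a", "c"), ("b", "c"), ("a", "d")])
print(sorted(auth, key=auth.get, reverse=True))
```

In the toy graph, node c is linked from both a and b and so should rank above d as an authority, while a, which links to both, should rank above b as a hub.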
Theme
Search engines
Retrieval algorithms
Object
HITS algorithm

Similar documents (author)

  1. Agosti, M.: Hypertext and information retrieval (1993) 5.81
    5.81187 = sum of:
      5.81187 = weight(author_txt:agosti in 4708) [ClassicSimilarity], result of:
        5.81187 = fieldWeight in 4708, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.625 = fieldNorm(doc=4708)
    
  2. Agosti, M.; Allan, J.: Introduction to the special issue on methods and tools for the automatic construction of hypertext (1997) 4.65
    4.649496 = sum of:
      4.649496 = weight(author_txt:agosti in 149) [ClassicSimilarity], result of:
        4.649496 = fieldWeight in 149, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.5 = fieldNorm(doc=149)
    
  3. Agosti, M.; Smeaton, A.F.: Information retrieval and hypertext (1996) 4.65
    4.649496 = sum of:
      4.649496 = weight(author_txt:agosti in 497) [ClassicSimilarity], result of:
        4.649496 = fieldWeight in 497, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.5 = fieldNorm(doc=497)
    
  4. Agosti, M.; Melucci, M.: Information retrieval techniques for the automatic construction of hypertext (2000) 4.65
    4.649496 = sum of:
      4.649496 = weight(author_txt:agosti in 4671) [ClassicSimilarity], result of:
        4.649496 = fieldWeight in 4671, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.5 = fieldNorm(doc=4671)
    
  5. Agosti, M.; Colotti, R.; Gradenigo, G.: Issues of data modelling in information retrieval (1991) 3.49
    3.487122 = sum of:
      3.487122 = weight(author_txt:agosti in 5094) [ClassicSimilarity], result of:
        3.487122 = fieldWeight in 5094, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.298992 = idf(docFreq=10, maxDocs=44218)
          0.375 = fieldNorm(doc=5094)
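
The author-match scores above are each a single fieldWeight, the product tf × idf × fieldNorm. The sketch below recomputes them, assuming the usual Lucene ClassicSimilarity definitions (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))). The fieldNorm values are copied from the listing rather than derived, because they are stored in the index in a lossy encoded form. The function names are illustrative.

```python
import math

def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # tf is the square root of the raw term frequency.
    return math.sqrt(freq) * classic_idf(doc_freq, max_docs) * field_norm

# fieldNorm values copied from entries 1, 2 and 5 of the author list.
for norm in (0.625, 0.5, 0.375):
    print(norm, field_weight(1.0, 10, 44218, norm))
```

Up to single-precision rounding, the three results should agree with the listed scores 5.81187, 4.649496 and 3.487122. The ranking within the author list is therefore driven entirely by fieldNorm, since tf and idf are identical for every entry.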
    

Similar documents (content)

  1. Quirin, A.; Cordón, O.; Guerrero-Bote, V.P.; Vargas-Quesada, B.; Moya-Anegón, F.: A quick MST-based algorithm to obtain Pathfinder networks (∞, n - 1) (2008) 0.15
    0.15317851 = sum of:
      0.15317851 = product of:
        0.6382438 = sum of:
          0.017597597 = weight(abstract_txt:from in 2371) [ClassicSimilarity], result of:
            0.017597597 = score(doc=2371,freq=4.0), product of:
              0.050935872 = queryWeight, product of:
                1.0073497 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.01829464 = queryNorm
              0.34548533 = fieldWeight in 2371, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=2371)
          0.022307774 = weight(abstract_txt:specific in 2371) [ClassicSimilarity], result of:
            0.022307774 = score(doc=2371,freq=1.0), product of:
              0.082733594 = queryWeight, product of:
                1.0482472 = boost
                4.314141 = idf(docFreq=1607, maxDocs=44218)
                0.01829464 = queryNorm
              0.2696338 = fieldWeight in 2371, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.314141 = idf(docFreq=1607, maxDocs=44218)
                0.0625 = fieldNorm(doc=2371)
          0.029629627 = weight(abstract_txt:applications in 2371) [ClassicSimilarity], result of:
            0.029629627 = score(doc=2371,freq=1.0), product of:
              0.099968195 = queryWeight, product of:
                1.1522685 = boost
                4.7422485 = idf(docFreq=1047, maxDocs=44218)
                0.01829464 = queryNorm
              0.29639053 = fieldWeight in 2371, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7422485 = idf(docFreq=1047, maxDocs=44218)
                0.0625 = fieldNorm(doc=2371)
          0.043650657 = weight(abstract_txt:case in 2371) [ClassicSimilarity], result of:
            0.043650657 = score(doc=2371,freq=2.0), product of:
              0.10272944 = queryWeight, product of:
                1.1680737 = boost
                4.807296 = idf(docFreq=981, maxDocs=44218)
                0.01829464 = queryNorm
              0.42490894 = fieldWeight in 2371, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.807296 = idf(docFreq=981, maxDocs=44218)
                0.0625 = fieldNorm(doc=2371)
          0.01949845 = weight(abstract_txt:which in 2371) [ClassicSimilarity], result of:
            0.01949845 = score(doc=2371,freq=2.0), product of:
              0.07563297 = queryWeight, product of:
                1.417403 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.01829464 = queryNorm
              0.2578036 = fieldWeight in 2371, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.0625 = fieldNorm(doc=2371)
          0.5055597 = weight(abstract_txt:algorithm in 2371) [ClassicSimilarity], result of:
            0.5055597 = score(doc=2371,freq=6.0), product of:
              0.57880056 = queryWeight, product of:
                5.5452003 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.01829464 = queryNorm
              0.87346095 = fieldWeight in 2371, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=2371)
        0.24 = coord(6/25)
    
  2. Chen, Z.; Fu, B.: On the complexity of Rocchio's similarity-based relevance feedback algorithm (2007) 0.13
    0.13374963 = sum of:
      0.13374963 = product of:
        0.66874814 = sum of:
          0.008798799 = weight(abstract_txt:from in 578) [ClassicSimilarity], result of:
            0.008798799 = score(doc=578,freq=1.0), product of:
              0.050935872 = queryWeight, product of:
                1.0073497 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.01829464 = queryNorm
              0.17274266 = fieldWeight in 578, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=578)
          0.029629627 = weight(abstract_txt:applications in 578) [ClassicSimilarity], result of:
            0.029629627 = score(doc=578,freq=1.0), product of:
              0.099968195 = queryWeight, product of:
                1.1522685 = boost
                4.7422485 = idf(docFreq=1047, maxDocs=44218)
                0.01829464 = queryNorm
              0.29639053 = fieldWeight in 578, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7422485 = idf(docFreq=1047, maxDocs=44218)
                0.0625 = fieldNorm(doc=578)
          0.02984601 = weight(abstract_txt:query in 578) [ClassicSimilarity], result of:
            0.02984601 = score(doc=578,freq=1.0), product of:
              0.100454316 = queryWeight, product of:
                1.1550667 = boost
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.01829464 = queryNorm
              0.2971103 = fieldWeight in 578, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7537646 = idf(docFreq=1035, maxDocs=44218)
                0.0625 = fieldNorm(doc=578)
          0.13896301 = weight(abstract_txt:prove in 578) [ClassicSimilarity], result of:
            0.13896301 = score(doc=578,freq=2.0), product of:
              0.22231421 = queryWeight, product of:
                1.7183292 = boost
                7.071914 = idf(docFreq=101, maxDocs=44218)
                0.01829464 = queryNorm
              0.6250748 = fieldWeight in 578, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.071914 = idf(docFreq=101, maxDocs=44218)
                0.0625 = fieldNorm(doc=578)
          0.46151072 = weight(abstract_txt:algorithm in 578) [ClassicSimilarity], result of:
            0.46151072 = score(doc=578,freq=5.0), product of:
              0.57880056 = queryWeight, product of:
                5.5452003 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.01829464 = queryNorm
              0.7973571 = fieldWeight in 578, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=578)
        0.2 = coord(5/25)
    
  3. Chandrasekar, R.; Srinivas, B.: Automatic induction of rules for text simplification (1997) 0.13
    0.12619375 = sum of:
      0.12619375 = product of:
        0.6309687 = sum of:
          0.11620273 = weight(abstract_txt:induced in 2873) [ClassicSimilarity], result of:
            0.11620273 = score(doc=2873,freq=1.0), product of:
              0.15058595 = queryWeight, product of:
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.01829464 = queryNorm
              0.77167046 = fieldWeight in 2873, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.09375 = fieldNorm(doc=2873)
          0.013198198 = weight(abstract_txt:from in 2873) [ClassicSimilarity], result of:
            0.013198198 = score(doc=2873,freq=1.0), product of:
              0.050935872 = queryWeight, product of:
                1.0073497 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.01829464 = queryNorm
              0.259114 = fieldWeight in 2873, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.09375 = fieldNorm(doc=2873)
          0.03449261 = weight(abstract_txt:structure in 2873) [ClassicSimilarity], result of:
            0.03449261 = score(doc=2873,freq=1.0), product of:
              0.084424324 = queryWeight, product of:
                1.0589039 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.01829464 = queryNorm
              0.40856242 = fieldWeight in 2873, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.09375 = fieldNorm(doc=2873)
          0.029247677 = weight(abstract_txt:which in 2873) [ClassicSimilarity], result of:
            0.029247677 = score(doc=2873,freq=2.0), product of:
              0.07563297 = queryWeight, product of:
                1.417403 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.01829464 = queryNorm
              0.3867054 = fieldWeight in 2873, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.09375 = fieldNorm(doc=2873)
          0.4378275 = weight(abstract_txt:algorithm in 2873) [ClassicSimilarity], result of:
            0.4378275 = score(doc=2873,freq=2.0), product of:
              0.57880056 = queryWeight, product of:
                5.5452003 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.01829464 = queryNorm
              0.7564393 = fieldWeight in 2873, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.09375 = fieldNorm(doc=2873)
        0.2 = coord(5/25)
    
  4. Marshall, B.; Chen, H.; Kaza, S.: Using importance flooding to identify interesting networks of criminal activity (2008) 0.12
    0.1240179 = sum of:
      0.1240179 = product of:
        0.44292107 = sum of:
          0.008798799 = weight(abstract_txt:from in 2386) [ClassicSimilarity], result of:
            0.008798799 = score(doc=2386,freq=1.0), product of:
              0.050935872 = queryWeight, product of:
                1.0073497 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.01829464 = queryNorm
              0.17274266 = fieldWeight in 2386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=2386)
          0.022307774 = weight(abstract_txt:specific in 2386) [ClassicSimilarity], result of:
            0.022307774 = score(doc=2386,freq=1.0), product of:
              0.082733594 = queryWeight, product of:
                1.0482472 = boost
                4.314141 = idf(docFreq=1607, maxDocs=44218)
                0.01829464 = queryNorm
              0.2696338 = fieldWeight in 2386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.314141 = idf(docFreq=1607, maxDocs=44218)
                0.0625 = fieldNorm(doc=2386)
          0.022995071 = weight(abstract_txt:structure in 2386) [ClassicSimilarity], result of:
            0.022995071 = score(doc=2386,freq=1.0), product of:
              0.084424324 = queryWeight, product of:
                1.0589039 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.01829464 = queryNorm
              0.27237496 = fieldWeight in 2386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.0625 = fieldNorm(doc=2386)
          0.029629627 = weight(abstract_txt:applications in 2386) [ClassicSimilarity], result of:
            0.029629627 = score(doc=2386,freq=1.0), product of:
              0.099968195 = queryWeight, product of:
                1.1522685 = boost
                4.7422485 = idf(docFreq=1047, maxDocs=44218)
                0.01829464 = queryNorm
              0.29639053 = fieldWeight in 2386, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7422485 = idf(docFreq=1047, maxDocs=44218)
                0.0625 = fieldNorm(doc=2386)
          0.043650657 = weight(abstract_txt:case in 2386) [ClassicSimilarity], result of:
            0.043650657 = score(doc=2386,freq=2.0), product of:
              0.10272944 = queryWeight, product of:
                1.1680737 = boost
                4.807296 = idf(docFreq=981, maxDocs=44218)
                0.01829464 = queryNorm
              0.42490894 = fieldWeight in 2386, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.807296 = idf(docFreq=981, maxDocs=44218)
                0.0625 = fieldNorm(doc=2386)
          0.023654127 = weight(abstract_txt:study in 2386) [ClassicSimilarity], result of:
            0.023654127 = score(doc=2386,freq=2.0), product of:
              0.07816328 = queryWeight, product of:
                1.2478713 = boost
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.01829464 = queryNorm
              0.30262455 = fieldWeight in 2386, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.423806 = idf(docFreq=3916, maxDocs=44218)
                0.0625 = fieldNorm(doc=2386)
          0.29188502 = weight(abstract_txt:algorithm in 2386) [ClassicSimilarity], result of:
            0.29188502 = score(doc=2386,freq=2.0), product of:
              0.57880056 = queryWeight, product of:
                5.5452003 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.01829464 = queryNorm
              0.5042929 = fieldWeight in 2386, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=2386)
        0.28 = coord(7/25)
    
  5. Cathey, R.J.; Jensen, E.C.; Beitzel, S.M.; Frieder, O.; Grossman, D.: Exploiting parallelism to support scalable hierarchical clustering (2007) 0.12
    0.12234441 = sum of:
      0.12234441 = product of:
        0.61172205 = sum of:
          0.030865675 = weight(abstract_txt:case in 448) [ClassicSimilarity], result of:
            0.030865675 = score(doc=448,freq=1.0), product of:
              0.10272944 = queryWeight, product of:
                1.1680737 = boost
                4.807296 = idf(docFreq=981, maxDocs=44218)
                0.01829464 = queryNorm
              0.300456 = fieldWeight in 448, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.807296 = idf(docFreq=981, maxDocs=44218)
                0.0625 = fieldNorm(doc=448)
          0.05811 = weight(abstract_txt:version in 448) [ClassicSimilarity], result of:
            0.05811 = score(doc=448,freq=2.0), product of:
              0.12431828 = queryWeight, product of:
                1.2849619 = boost
                5.288358 = idf(docFreq=606, maxDocs=44218)
                0.01829464 = queryNorm
              0.46742925 = fieldWeight in 448, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.288358 = idf(docFreq=606, maxDocs=44218)
                0.0625 = fieldNorm(doc=448)
          0.013787487 = weight(abstract_txt:which in 448) [ClassicSimilarity], result of:
            0.013787487 = score(doc=448,freq=1.0), product of:
              0.07563297 = queryWeight, product of:
                1.417403 = boost
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.01829464 = queryNorm
              0.18229467 = fieldWeight in 448, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.9167147 = idf(docFreq=6503, maxDocs=44218)
                0.0625 = fieldNorm(doc=448)
          0.047448162 = weight(abstract_txt:theoretical in 448) [ClassicSimilarity], result of:
            0.047448162 = score(doc=448,freq=1.0), product of:
              0.15663461 = queryWeight, product of:
                1.7664944 = boost
                4.846761 = idf(docFreq=943, maxDocs=44218)
                0.01829464 = queryNorm
              0.30292258 = fieldWeight in 448, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.846761 = idf(docFreq=943, maxDocs=44218)
                0.0625 = fieldNorm(doc=448)
          0.46151072 = weight(abstract_txt:algorithm in 448) [ClassicSimilarity], result of:
            0.46151072 = score(doc=448,freq=5.0), product of:
              0.57880056 = queryWeight, product of:
                5.5452003 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.01829464 = queryNorm
              0.7973571 = fieldWeight in 448, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=448)
        0.2 = coord(5/25)
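
The content-based scores combine several terms: each matching term contributes queryWeight × fieldWeight, the contributions are summed, and the sum is scaled by coord (matched terms over query terms). As a worked check, the sketch below recomputes the score of the first content-based entry (doc 2371). It assumes the standard ClassicSimilarity formulas; the boosts, document frequencies, term frequencies, queryNorm and fieldNorm are copied from the listing, and the variable names are illustrative.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.01829464   # queryNorm, copied from the listing
FIELD_NORM = 0.0625       # fieldNorm(doc=2371), copied from the listing
QUERY_TERM_COUNT = 25     # denominator of coord(6/25)

# (term, boost, docFreq, freq in doc 2371), copied from the listing
TERMS = [
    ("from", 1.0073497, 7577, 4.0),
    ("specific", 1.0482472, 1607, 1.0),
    ("applications", 1.1522685, 1047, 1.0),
    ("case", 1.1680737, 981, 2.0),
    ("which", 1.417403, 6503, 2.0),
    ("algorithm", 5.5452003, 399, 6.0),
]

def idf(doc_freq, max_docs=MAX_DOCS):
    # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(boost, doc_freq, freq):
    # Query side: boost * idf * queryNorm; document side: tf * idf * fieldNorm.
    query_weight = boost * idf(doc_freq) * QUERY_NORM
    field_weight = math.sqrt(freq) * idf(doc_freq) * FIELD_NORM
    return query_weight * field_weight

raw_sum = sum(term_score(b, df, f) for _, b, df, f in TERMS)
coord = len(TERMS) / QUERY_TERM_COUNT
total = coord * raw_sum
print(raw_sum, coord, total)
```

Up to single-precision rounding, `raw_sum` should agree with the listed 0.6382438 and `total` with 0.15317851. Note that idf enters each term score twice, once on the query side and once on the document side, which is why the rare, heavily boosted term "algorithm" dominates every entry in this list.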