Document (#36627)

Author
Hsu, M.-H.
Chen, H.-H.
Title
Efficient and effective prediction of social tags to enhance Web search
Source
Journal of the American Society for Information Science and Technology. 62(2011) no.8, p.1473-1487
Year
2011
Abstract
As the web has grown into an integral part of daily life, social annotation has become a popular manner for web users to manage resources. This method of management has many potential applications, but it is limited in applicability by the cold-start problem, especially for new resources on the web. In this article, we study automatic tag prediction for web pages comprehensively and utilize the predicted tags to improve search performance. First, we explore the stabilizing phenomenon of tag usage in a social bookmarking system. Then, we propose a two-stage tag prediction approach, which is efficient and is effective in making use of early annotations from users. In the first stage, content-based ranking, candidate tags are selected and ranked to generate an initial tag list. In the second stage, random-walk re-ranking, we adopt a random-walk model that utilizes tag co-occurrence information to re-rank the initial list. The experimental results show that our algorithm effectively proposes appropriate tags for target web pages. In addition, we present a framework to incorporate tag prediction in a general web search. The experimental results of the web search validate the hypothesis that the proposed framework significantly enhances the typical retrieval model.
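The second stage described in the abstract can be illustrated with a generic random walk with restart over a tag co-occurrence graph. This is only a minimal sketch under stated assumptions: the abstract does not give the authors' transition weights, damping factor, or stopping rule, so the function name `random_walk_rerank`, the `alpha` parameter, the fixed iteration count, and the toy tags and counts below are all hypothetical and not taken from the article.

```python
def random_walk_rerank(initial_scores, cooccurrence, alpha=0.85, iterations=100):
    """Re-rank candidate tags with a random walk with restart.

    initial_scores: tag -> positive score from a first-stage (content-based) ranker.
    cooccurrence:   tag -> {other tag -> co-occurrence count}.
    alpha and iterations are illustrative choices, not values from the article.
    """
    tags = list(initial_scores)
    total = sum(initial_scores.values())  # assumed to be positive
    prior = {t: initial_scores[t] / total for t in tags}

    # Row-normalise co-occurrence counts into transition probabilities.
    transition = {}
    for src in tags:
        row = {dst: cooccurrence.get(src, {}).get(dst, 0.0)
               for dst in tags if dst != src}
        row_sum = sum(row.values())
        transition[src] = ({dst: w / row_sum for dst, w in row.items()}
                           if row_sum else {})

    scores = dict(prior)
    for _ in range(iterations):
        updated = {}
        for dst in tags:
            walked = sum(scores[src] * transition[src].get(dst, 0.0)
                         for src in tags)
            # Follow a co-occurrence edge with probability alpha,
            # otherwise restart from the first-stage distribution.
            updated[dst] = alpha * walked + (1 - alpha) * prior[dst]
        # Tags without neighbours lose mass, so renormalise each round.
        norm = sum(updated.values())
        scores = {t: s / norm for t, s in updated.items()}

    ranking = sorted(tags, key=lambda t: scores[t], reverse=True)
    return ranking, scores


# Toy data: "search" co-occurs strongly with "web", "misc" only weakly.
ranking, scores = random_walk_rerank(
    {"web": 0.4, "misc": 0.35, "search": 0.25},
    {"web": {"search": 8, "misc": 1}, "search": {"web": 8}, "misc": {"web": 1}},
)
print(ranking)
```

In this toy input the first-stage list places "misc" above "search"; the walk moves "search" up because it co-occurs far more often with the top tag.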
Theme
Social tagging

Similar documents (author)

  1. Chen, Y.N.; Chen, S.J.: ¬A metadata practice of the IFLA FRBR model : a case study for the National Palace Museum in Taipei (2004) 4.36
    4.3627276 = sum of:
      4.3627276 = weight(author_txt:chen in 4385) [ClassicSimilarity], result of:
        4.3627276 = fieldWeight in 4385, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.169829 = idf(docFreq=242, maxDocs=42740)
          0.5 = fieldNorm(doc=4385)
    
  2. Chen, C.C.; Chen, H.H.; Chen, K.H.: ¬The design of the XML/Metadata management system (2000) 4.01
    4.0074215 = sum of:
      4.0074215 = weight(author_txt:chen in 5634) [ClassicSimilarity], result of:
        4.0074215 = fieldWeight in 5634, product of:
          1.7320508 = tf(freq=3.0), with freq of:
            3.0 = termFreq=3.0
          6.169829 = idf(docFreq=242, maxDocs=42740)
          0.375 = fieldNorm(doc=5634)
    
  3. Chen, W.Y.: Observations on cataloguing and classification (1991) 3.86
    3.856143 = sum of:
      3.856143 = weight(author_txt:chen in 4184) [ClassicSimilarity], result of:
        3.856143 = fieldWeight in 4184, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.169829 = idf(docFreq=242, maxDocs=42740)
          0.625 = fieldNorm(doc=4184)
    
  4. Chen, H.: Knowledge-based document retrieval : framework and design (1992) 3.86
    3.856143 = sum of:
      3.856143 = weight(author_txt:chen in 5283) [ClassicSimilarity], result of:
        3.856143 = fieldWeight in 5283, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.169829 = idf(docFreq=242, maxDocs=42740)
          0.625 = fieldNorm(doc=5283)
    
  5. Chen, P.S.: On inference rules of logic-based information retrieval systems (1994) 3.86
    3.856143 = sum of:
      3.856143 = weight(author_txt:chen in 6731) [ClassicSimilarity], result of:
        3.856143 = fieldWeight in 6731, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.169829 = idf(docFreq=242, maxDocs=42740)
          0.625 = fieldNorm(doc=6731)
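
The author-match scores above all follow the same arithmetic. As a minimal sketch, assuming the ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), the fieldWeight values can be recomputed from the numbers shown. The function names are illustrative and not part of any Lucene API; fieldNorm is taken as given from the output rather than derived.

```python
import math


def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw term frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explain output.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from the explain output for author_txt:chen.
score_4385 = field_weight(2.0, 242, 42740, 0.5)    # result 1
score_5634 = field_weight(3.0, 242, 42740, 0.375)  # result 2
score_4184 = field_weight(1.0, 242, 42740, 0.625)  # result 3

print(round(score_4385, 4), round(score_5634, 4), round(score_4184, 4))
```

Small differences in the last digits are to be expected, because the listed values look like single-precision results while Python computes in double precision.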
    

Similar documents (content)

  1. Torres, S.D.; Hiemstra, D.; Weber, I.; Serdyukov, P.: Query recommendation in the information domain of children (2014) 0.25
    0.2548742 = sum of:
      0.2548742 = product of:
        0.79648185 = sum of:
          0.031455386 = weight(abstract_txt:resources in 3301) [ClassicSimilarity], result of:
            0.031455386 = score(doc=3301,freq=2.0), product of:
              0.084205896 = queryWeight, product of:
                1.1310424 = boost
                4.226273 = idf(docFreq=1696, maxDocs=42740)
                0.017615952 = queryNorm
              0.37355328 = fieldWeight in 3301, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.226273 = idf(docFreq=1696, maxDocs=42740)
                0.0625 = fieldNorm(doc=3301)
          0.03342182 = weight(abstract_txt:effective in 3301) [ClassicSimilarity], result of:
            0.03342182 = score(doc=3301,freq=1.0), product of:
              0.110469535 = queryWeight, product of:
                1.2954745 = boost
                4.840693 = idf(docFreq=917, maxDocs=42740)
                0.017615952 = queryNorm
              0.3025433 = fieldWeight in 3301, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.840693 = idf(docFreq=917, maxDocs=42740)
                0.0625 = fieldNorm(doc=3301)
          0.05172057 = weight(abstract_txt:ranking in 3301) [ClassicSimilarity], result of:
            0.05172057 = score(doc=3301,freq=1.0), product of:
              0.14779668 = queryWeight, product of:
                1.498442 = boost
                5.5991054 = idf(docFreq=429, maxDocs=42740)
                0.017615952 = queryNorm
              0.34994408 = fieldWeight in 3301, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.5991054 = idf(docFreq=429, maxDocs=42740)
                0.0625 = fieldNorm(doc=3301)
          0.03482264 = weight(abstract_txt:social in 3301) [ClassicSimilarity], result of:
            0.03482264 = score(doc=3301,freq=1.0), product of:
              0.12996528 = queryWeight, product of:
                1.7209446 = boost
                4.2870083 = idf(docFreq=1596, maxDocs=42740)
                0.017615952 = queryNorm
              0.26793802 = fieldWeight in 3301, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2870083 = idf(docFreq=1596, maxDocs=42740)
                0.0625 = fieldNorm(doc=3301)
          0.142688 = weight(abstract_txt:random in 3301) [ClassicSimilarity], result of:
            0.142688 = score(doc=3301,freq=3.0), product of:
              0.20157671 = queryWeight, product of:
                1.7499586 = boost
                6.5389266 = idf(docFreq=167, maxDocs=42740)
                0.017615952 = queryNorm
              0.7078596 = fieldWeight in 3301, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5389266 = idf(docFreq=167, maxDocs=42740)
                0.0625 = fieldNorm(doc=3301)
          0.040576912 = weight(abstract_txt:search in 3301) [ClassicSimilarity], result of:
            0.040576912 = score(doc=3301,freq=2.0), product of:
              0.12572119 = queryWeight, product of:
                1.9544603 = boost
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.017615952 = queryNorm
              0.3227532 = fieldWeight in 3301, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.6515355 = idf(docFreq=3014, maxDocs=42740)
                0.0625 = fieldNorm(doc=3301)
          0.25088194 = weight(abstract_txt:walk in 3301) [ClassicSimilarity], result of:
            0.25088194 = score(doc=3301,freq=2.0), product of:
              0.33614403 = queryWeight, product of:
                2.2598016 = boost
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.017615952 = queryNorm
              0.7463525 = fieldWeight in 3301, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.0625 = fieldNorm(doc=3301)
          0.2109146 = weight(abstract_txt:tags in 3301) [ClassicSimilarity], result of:
            0.2109146 = score(doc=3301,freq=2.0), product of:
              0.37724796 = queryWeight, product of:
                3.3856032 = boost
                6.3253527 = idf(docFreq=207, maxDocs=42740)
                0.017615952 = queryNorm
              0.55908746 = fieldWeight in 3301, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.3253527 = idf(docFreq=207, maxDocs=42740)
                0.0625 = fieldNorm(doc=3301)
        0.32 = coord(8/25)
    
  2. Xiong, S.; Ji, D.: Query-focused multi-document summarization using hypergraph-based ranking (2016) 0.18
    0.1763888 = sum of:
      0.1763888 = product of:
        0.73495334 = sum of:
          0.023968264 = weight(abstract_txt:model in 4973) [ClassicSimilarity], result of:
            0.023968264 = score(doc=4973,freq=1.0), product of:
              0.07627347 = queryWeight, product of:
                1.0764513 = boost
                4.022287 = idf(docFreq=2080, maxDocs=42740)
                0.017615952 = queryNorm
              0.31424117 = fieldWeight in 4973, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.022287 = idf(docFreq=2080, maxDocs=42740)
                0.078125 = fieldNorm(doc=4973)
          0.02731197 = weight(abstract_txt:first in 4973) [ClassicSimilarity], result of:
            0.02731197 = score(doc=4973,freq=1.0), product of:
              0.08321171 = queryWeight, product of:
                1.1243457 = boost
                4.20125 = idf(docFreq=1739, maxDocs=42740)
                0.017615952 = queryNorm
              0.32822266 = fieldWeight in 4973, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.20125 = idf(docFreq=1739, maxDocs=42740)
                0.078125 = fieldNorm(doc=4973)
          0.0627544 = weight(abstract_txt:framework in 4973) [ClassicSimilarity], result of:
            0.0627544 = score(doc=4973,freq=3.0), product of:
              0.10046269 = queryWeight, product of:
                1.2354069 = boost
                4.6162434 = idf(docFreq=1148, maxDocs=42740)
                0.017615952 = queryNorm
              0.62465376 = fieldWeight in 4973, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.6162434 = idf(docFreq=1148, maxDocs=42740)
                0.078125 = fieldNorm(doc=4973)
          0.05847581 = weight(abstract_txt:experimental in 4973) [ClassicSimilarity], result of:
            0.05847581 = score(doc=4973,freq=1.0), product of:
              0.13822925 = queryWeight, product of:
                1.4491308 = boost
                5.414848 = idf(docFreq=516, maxDocs=42740)
                0.017615952 = queryNorm
              0.423035 = fieldWeight in 4973, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.414848 = idf(docFreq=516, maxDocs=42740)
                0.078125 = fieldNorm(doc=4973)
          0.17836 = weight(abstract_txt:random in 4973) [ClassicSimilarity], result of:
            0.17836 = score(doc=4973,freq=3.0), product of:
              0.20157671 = queryWeight, product of:
                1.7499586 = boost
                6.5389266 = idf(docFreq=167, maxDocs=42740)
                0.017615952 = queryNorm
              0.88482445 = fieldWeight in 4973, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5389266 = idf(docFreq=167, maxDocs=42740)
                0.078125 = fieldNorm(doc=4973)
          0.3840829 = weight(abstract_txt:walk in 4973) [ClassicSimilarity], result of:
            0.3840829 = score(doc=4973,freq=3.0), product of:
              0.33614403 = queryWeight, product of:
                2.2598016 = boost
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.017615952 = queryNorm
              1.1426141 = fieldWeight in 4973, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.078125 = fieldNorm(doc=4973)
        0.24 = coord(6/25)
    
  3. Chang, Y.-W.: Influence of human behavior and the principle of least effort on library and information science research (2016) 0.18
    0.1763888 = sum of:
      0.1763888 = product of:
        0.73495334 = sum of:
          0.023968264 = weight(abstract_txt:model in 4974) [ClassicSimilarity], result of:
            0.023968264 = score(doc=4974,freq=1.0), product of:
              0.07627347 = queryWeight, product of:
                1.0764513 = boost
                4.022287 = idf(docFreq=2080, maxDocs=42740)
                0.017615952 = queryNorm
              0.31424117 = fieldWeight in 4974, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.022287 = idf(docFreq=2080, maxDocs=42740)
                0.078125 = fieldNorm(doc=4974)
          0.02731197 = weight(abstract_txt:first in 4974) [ClassicSimilarity], result of:
            0.02731197 = score(doc=4974,freq=1.0), product of:
              0.08321171 = queryWeight, product of:
                1.1243457 = boost
                4.20125 = idf(docFreq=1739, maxDocs=42740)
                0.017615952 = queryNorm
              0.32822266 = fieldWeight in 4974, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.20125 = idf(docFreq=1739, maxDocs=42740)
                0.078125 = fieldNorm(doc=4974)
          0.0627544 = weight(abstract_txt:framework in 4974) [ClassicSimilarity], result of:
            0.0627544 = score(doc=4974,freq=3.0), product of:
              0.10046269 = queryWeight, product of:
                1.2354069 = boost
                4.6162434 = idf(docFreq=1148, maxDocs=42740)
                0.017615952 = queryNorm
              0.62465376 = fieldWeight in 4974, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.6162434 = idf(docFreq=1148, maxDocs=42740)
                0.078125 = fieldNorm(doc=4974)
          0.05847581 = weight(abstract_txt:experimental in 4974) [ClassicSimilarity], result of:
            0.05847581 = score(doc=4974,freq=1.0), product of:
              0.13822925 = queryWeight, product of:
                1.4491308 = boost
                5.414848 = idf(docFreq=516, maxDocs=42740)
                0.017615952 = queryNorm
              0.423035 = fieldWeight in 4974, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.414848 = idf(docFreq=516, maxDocs=42740)
                0.078125 = fieldNorm(doc=4974)
          0.17836 = weight(abstract_txt:random in 4974) [ClassicSimilarity], result of:
            0.17836 = score(doc=4974,freq=3.0), product of:
              0.20157671 = queryWeight, product of:
                1.7499586 = boost
                6.5389266 = idf(docFreq=167, maxDocs=42740)
                0.017615952 = queryNorm
              0.88482445 = fieldWeight in 4974, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5389266 = idf(docFreq=167, maxDocs=42740)
                0.078125 = fieldNorm(doc=4974)
          0.3840829 = weight(abstract_txt:walk in 4974) [ClassicSimilarity], result of:
            0.3840829 = score(doc=4974,freq=3.0), product of:
              0.33614403 = queryWeight, product of:
                2.2598016 = boost
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.017615952 = queryNorm
              1.1426141 = fieldWeight in 4974, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.078125 = fieldNorm(doc=4974)
        0.24 = coord(6/25)
    
  4. Vilares, J.; Alonso, M.A.; Doval, Y.; Vilares, M.: Studying the effect and treatment of misspelled queries in Cross-Language Information Retrieval (2016) 0.18
    0.1763888 = sum of:
      0.1763888 = product of:
        0.73495334 = sum of:
          0.023968264 = weight(abstract_txt:model in 4975) [ClassicSimilarity], result of:
            0.023968264 = score(doc=4975,freq=1.0), product of:
              0.07627347 = queryWeight, product of:
                1.0764513 = boost
                4.022287 = idf(docFreq=2080, maxDocs=42740)
                0.017615952 = queryNorm
              0.31424117 = fieldWeight in 4975, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.022287 = idf(docFreq=2080, maxDocs=42740)
                0.078125 = fieldNorm(doc=4975)
          0.02731197 = weight(abstract_txt:first in 4975) [ClassicSimilarity], result of:
            0.02731197 = score(doc=4975,freq=1.0), product of:
              0.08321171 = queryWeight, product of:
                1.1243457 = boost
                4.20125 = idf(docFreq=1739, maxDocs=42740)
                0.017615952 = queryNorm
              0.32822266 = fieldWeight in 4975, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.20125 = idf(docFreq=1739, maxDocs=42740)
                0.078125 = fieldNorm(doc=4975)
          0.0627544 = weight(abstract_txt:framework in 4975) [ClassicSimilarity], result of:
            0.0627544 = score(doc=4975,freq=3.0), product of:
              0.10046269 = queryWeight, product of:
                1.2354069 = boost
                4.6162434 = idf(docFreq=1148, maxDocs=42740)
                0.017615952 = queryNorm
              0.62465376 = fieldWeight in 4975, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.6162434 = idf(docFreq=1148, maxDocs=42740)
                0.078125 = fieldNorm(doc=4975)
          0.05847581 = weight(abstract_txt:experimental in 4975) [ClassicSimilarity], result of:
            0.05847581 = score(doc=4975,freq=1.0), product of:
              0.13822925 = queryWeight, product of:
                1.4491308 = boost
                5.414848 = idf(docFreq=516, maxDocs=42740)
                0.017615952 = queryNorm
              0.423035 = fieldWeight in 4975, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.414848 = idf(docFreq=516, maxDocs=42740)
                0.078125 = fieldNorm(doc=4975)
          0.17836 = weight(abstract_txt:random in 4975) [ClassicSimilarity], result of:
            0.17836 = score(doc=4975,freq=3.0), product of:
              0.20157671 = queryWeight, product of:
                1.7499586 = boost
                6.5389266 = idf(docFreq=167, maxDocs=42740)
                0.017615952 = queryNorm
              0.88482445 = fieldWeight in 4975, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5389266 = idf(docFreq=167, maxDocs=42740)
                0.078125 = fieldNorm(doc=4975)
          0.3840829 = weight(abstract_txt:walk in 4975) [ClassicSimilarity], result of:
            0.3840829 = score(doc=4975,freq=3.0), product of:
              0.33614403 = queryWeight, product of:
                2.2598016 = boost
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.017615952 = queryNorm
              1.1426141 = fieldWeight in 4975, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.078125 = fieldNorm(doc=4975)
        0.24 = coord(6/25)
    
  5. Pandey, S.; Khanna, P.; Yokota, H.: ¬A semantics and image retrieval system for hierarchical image databases (2016) 0.18
    0.1763888 = sum of:
      0.1763888 = product of:
        0.73495334 = sum of:
          0.023968264 = weight(abstract_txt:model in 5185) [ClassicSimilarity], result of:
            0.023968264 = score(doc=5185,freq=1.0), product of:
              0.07627347 = queryWeight, product of:
                1.0764513 = boost
                4.022287 = idf(docFreq=2080, maxDocs=42740)
                0.017615952 = queryNorm
              0.31424117 = fieldWeight in 5185, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.022287 = idf(docFreq=2080, maxDocs=42740)
                0.078125 = fieldNorm(doc=5185)
          0.02731197 = weight(abstract_txt:first in 5185) [ClassicSimilarity], result of:
            0.02731197 = score(doc=5185,freq=1.0), product of:
              0.08321171 = queryWeight, product of:
                1.1243457 = boost
                4.20125 = idf(docFreq=1739, maxDocs=42740)
                0.017615952 = queryNorm
              0.32822266 = fieldWeight in 5185, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.20125 = idf(docFreq=1739, maxDocs=42740)
                0.078125 = fieldNorm(doc=5185)
          0.0627544 = weight(abstract_txt:framework in 5185) [ClassicSimilarity], result of:
            0.0627544 = score(doc=5185,freq=3.0), product of:
              0.10046269 = queryWeight, product of:
                1.2354069 = boost
                4.6162434 = idf(docFreq=1148, maxDocs=42740)
                0.017615952 = queryNorm
              0.62465376 = fieldWeight in 5185, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.6162434 = idf(docFreq=1148, maxDocs=42740)
                0.078125 = fieldNorm(doc=5185)
          0.05847581 = weight(abstract_txt:experimental in 5185) [ClassicSimilarity], result of:
            0.05847581 = score(doc=5185,freq=1.0), product of:
              0.13822925 = queryWeight, product of:
                1.4491308 = boost
                5.414848 = idf(docFreq=516, maxDocs=42740)
                0.017615952 = queryNorm
              0.423035 = fieldWeight in 5185, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.414848 = idf(docFreq=516, maxDocs=42740)
                0.078125 = fieldNorm(doc=5185)
          0.17836 = weight(abstract_txt:random in 5185) [ClassicSimilarity], result of:
            0.17836 = score(doc=5185,freq=3.0), product of:
              0.20157671 = queryWeight, product of:
                1.7499586 = boost
                6.5389266 = idf(docFreq=167, maxDocs=42740)
                0.017615952 = queryNorm
              0.88482445 = fieldWeight in 5185, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5389266 = idf(docFreq=167, maxDocs=42740)
                0.078125 = fieldNorm(doc=5185)
          0.3840829 = weight(abstract_txt:walk in 5185) [ClassicSimilarity], result of:
            0.3840829 = score(doc=5185,freq=3.0), product of:
              0.33614403 = queryWeight, product of:
                2.2598016 = boost
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.017615952 = queryNorm
              1.1426141 = fieldWeight in 5185, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.444015 = idf(docFreq=24, maxDocs=42740)
                0.078125 = fieldNorm(doc=5185)
        0.24 = coord(6/25)
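
The content-match scores combine more factors than the author matches: each matching term contributes queryWeight * fieldWeight, and the sum is scaled by the coord factor (matching terms divided by query terms). As a sketch under the same assumed ClassicSimilarity formulas, the score of the first hit (doc 3301) can be recomputed from the boost, freq, and docFreq values listed above. The helper names are hypothetical; queryNorm and fieldNorm are copied from the output, not derived.

```python
import math

QUERY_NORM = 0.017615952  # copied from the explain output
MAX_DOCS = 42740


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, freq: float, doc_freq: int, field_norm: float) -> float:
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM           # queryWeight
    field_weight = math.sqrt(freq) * term_idf * field_norm  # fieldWeight
    return query_weight * field_weight


def document_score(terms, field_norm: float, total_query_terms: int) -> float:
    raw = sum(term_score(boost, freq, doc_freq, field_norm)
              for _, boost, freq, doc_freq in terms)
    coord = len(terms) / total_query_terms  # e.g. 8/25 = 0.32
    return coord * raw


# (term, boost, freq, docFreq) for doc 3301, copied from the output.
TERMS_3301 = [
    ("resources", 1.1310424, 2.0, 1696),
    ("effective", 1.2954745, 1.0, 917),
    ("ranking", 1.498442, 1.0, 429),
    ("social", 1.7209446, 1.0, 1596),
    ("random", 1.7499586, 3.0, 167),
    ("search", 1.9544603, 2.0, 3014),
    ("walk", 2.2598016, 2.0, 24),
    ("tags", 3.3856032, 2.0, 207),
]

score_3301 = document_score(TERMS_3301, field_norm=0.0625, total_query_terms=25)
print(round(score_3301, 4))
```

The rare terms "walk" and "tags" dominate the sum because idf enters twice, once through queryWeight and once through fieldWeight.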