Document (#33345)

Author
Li, Q.
Wu, Y.-f.B.
Title
People search : searching people sharing similar interests from the Web
Source
Journal of the American Society for Information Science and Technology. 59(2008) no.1, pp.111-125
Year
2008
Abstract
On the Web, there are limited ways of finding people who share similar interests with a given person. The current methods are either ineffective or time-consuming. In this paper, we present a new approach for searching the Web for people sharing similar interests. Finding people similar to a given person on the Web raises two major research issues: person representation and person matching. In this study, we propose a person representation method that uses a person's website to represent that person. Our design of the matching process takes person representation into consideration, so that the same representation can be used when composing the query. Under this person representation method, the proposed algorithm integrates the textual content and hyperlink information of all the pages belonging to a personal website to represent a person and to match persons. Other algorithms are also explored and compared to the proposed algorithm. Experimental results are presented.

Similar documents (content)

  1. Dumitrescu, A.; Santini, S.: Full coverage of a reader's interests in context-based information filtering (2021) 0.22
    0.22223057 = sum of:
      0.22223057 = product of:
        0.9259607 = sum of:
          0.07812582 = weight(abstract_txt:person's in 327) [ClassicSimilarity], result of:
            0.07812582 = score(doc=327,freq=1.0), product of:
              0.14744179 = queryWeight, product of:
                1.2069961 = boost
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.014408565 = queryNorm
              0.5298757 = fieldWeight in 327, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.0625 = fieldNorm(doc=327)
          0.026640777 = weight(abstract_txt:given in 327) [ClassicSimilarity], result of:
            0.026640777 = score(doc=327,freq=1.0), product of:
              0.090670384 = queryWeight, product of:
                1.3385768 = boost
                4.701121 = idf(docFreq=1091, maxDocs=44218)
                0.014408565 = queryNorm
              0.29382005 = fieldWeight in 327, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.701121 = idf(docFreq=1091, maxDocs=44218)
                0.0625 = fieldNorm(doc=327)
          0.047621835 = weight(abstract_txt:algorithm in 327) [ClassicSimilarity], result of:
            0.047621835 = score(doc=327,freq=1.0), product of:
              0.13354827 = queryWeight, product of:
                1.6245375 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.014408565 = queryNorm
              0.35658893 = fieldWeight in 327, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=327)
          0.17411146 = weight(abstract_txt:interests in 327) [ClassicSimilarity], result of:
            0.17411146 = score(doc=327,freq=3.0), product of:
              0.2515605 = queryWeight, product of:
                2.7307217 = boost
                6.3935823 = idf(docFreq=200, maxDocs=44218)
                0.014408565 = queryNorm
              0.69212556 = fieldWeight in 327, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.3935823 = idf(docFreq=200, maxDocs=44218)
                0.0625 = fieldNorm(doc=327)
          0.076628745 = weight(abstract_txt:representation in 327) [ClassicSimilarity], result of:
            0.076628745 = score(doc=327,freq=1.0), product of:
              0.2488907 = queryWeight, product of:
                3.5065894 = boost
                4.926098 = idf(docFreq=871, maxDocs=44218)
                0.014408565 = queryNorm
              0.30788112 = fieldWeight in 327, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.926098 = idf(docFreq=871, maxDocs=44218)
                0.0625 = fieldNorm(doc=327)
          0.5228321 = weight(abstract_txt:person in 327) [ClassicSimilarity], result of:
            0.5228321 = score(doc=327,freq=4.0), product of:
              0.6596937 = queryWeight, product of:
                7.2212396 = boost
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.014408565 = queryNorm
              0.7925376 = fieldWeight in 327, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.0625 = fieldNorm(doc=327)
        0.24 = coord(6/25)
    
  2. Brown, S.A.; Dennis, A.R.; Burley, D.; Arling, P.: Knowledge sharing and knowledge management system avoidance : the role of knowledge type and the social network in bypassing an organizational knowledge management system (2013) 0.17
    0.17184138 = sum of:
      0.17184138 = product of:
        1.0740087 = sum of:
          0.022082208 = weight(abstract_txt:there in 1099) [ClassicSimilarity], result of:
            0.022082208 = score(doc=1099,freq=1.0), product of:
              0.068948135 = queryWeight, product of:
                1.1672714 = boost
                4.099491 = idf(docFreq=1992, maxDocs=44218)
                0.014408565 = queryNorm
              0.32027274 = fieldWeight in 1099, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.099491 = idf(docFreq=1992, maxDocs=44218)
                0.078125 = fieldNorm(doc=1099)
          0.01273735 = weight(abstract_txt:this in 1099) [ClassicSimilarity], result of:
            0.01273735 = score(doc=1099,freq=2.0), product of:
              0.047776416 = queryWeight, product of:
                1.3741443 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.014408565 = queryNorm
              0.2666033 = fieldWeight in 1099, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.078125 = fieldNorm(doc=1099)
          0.11494393 = weight(abstract_txt:sharing in 1099) [ClassicSimilarity], result of:
            0.11494393 = score(doc=1099,freq=2.0), product of:
              0.18815047 = queryWeight, product of:
                2.3616135 = boost
                5.529371 = idf(docFreq=476, maxDocs=44218)
                0.014408565 = queryNorm
              0.61091495 = fieldWeight in 1099, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.529371 = idf(docFreq=476, maxDocs=44218)
                0.078125 = fieldNorm(doc=1099)
          0.92424524 = weight(abstract_txt:person in 1099) [ClassicSimilarity], result of:
            0.92424524 = score(doc=1099,freq=8.0), product of:
              0.6596937 = queryWeight, product of:
                7.2212396 = boost
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.014408565 = queryNorm
              1.4010217 = fieldWeight in 1099, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.078125 = fieldNorm(doc=1099)
        0.16 = coord(4/25)
    
  3. Zhou, Q.; Lee, C.S.; Sin, S.-C.J.; Lin, S.; Hu, H.; Ismail, M.F.F. Bin: Understanding the use of YouTube as a learning resource : a social cognitive perspective (2020) 0.14
    0.14487368 = sum of:
      0.14487368 = product of:
        0.6036403 = sum of:
          0.011484384 = weight(abstract_txt:from in 174) [ClassicSimilarity], result of:
            0.011484384 = score(doc=174,freq=2.0), product of:
              0.047010306 = queryWeight, product of:
                1.180464 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.014408565 = queryNorm
              0.24429502 = fieldWeight in 174, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=174)
          0.023380574 = weight(abstract_txt:method in 174) [ClassicSimilarity], result of:
            0.023380574 = score(doc=174,freq=1.0), product of:
              0.08311339 = queryWeight, product of:
                1.281581 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.014408565 = queryNorm
              0.28130937 = fieldWeight in 174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=174)
          0.025110269 = weight(abstract_txt:proposed in 174) [ClassicSimilarity], result of:
            0.025110269 = score(doc=174,freq=1.0), product of:
              0.08716359 = queryWeight, product of:
                1.312436 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.014408565 = queryNorm
              0.2880821 = fieldWeight in 174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.0625 = fieldNorm(doc=174)
          0.014410666 = weight(abstract_txt:this in 174) [ClassicSimilarity], result of:
            0.014410666 = score(doc=174,freq=4.0), product of:
              0.047776416 = queryWeight, product of:
                1.3741443 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.014408565 = queryNorm
              0.3016272 = fieldWeight in 174, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=174)
          0.07646858 = weight(abstract_txt:people in 174) [ClassicSimilarity], result of:
            0.07646858 = score(doc=174,freq=1.0), product of:
              0.24854377 = queryWeight, product of:
                3.5041444 = boost
                4.922663 = idf(docFreq=874, maxDocs=44218)
                0.014408565 = queryNorm
              0.30766645 = fieldWeight in 174, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.922663 = idf(docFreq=874, maxDocs=44218)
                0.0625 = fieldNorm(doc=174)
          0.45278588 = weight(abstract_txt:person in 174) [ClassicSimilarity], result of:
            0.45278588 = score(doc=174,freq=3.0), product of:
              0.6596937 = queryWeight, product of:
                7.2212396 = boost
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.014408565 = queryNorm
              0.68635774 = fieldWeight in 174, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.0625 = fieldNorm(doc=174)
        0.24 = coord(6/25)
    
  4. Lihui, C.; Lian, C.W.: Using Web structure and summarisation techniques for Web content mining (2005) 0.14
    0.13514711 = sum of:
      0.13514711 = product of:
        0.42233473 = sum of:
          0.04894267 = weight(abstract_txt:consuming in 1046) [ClassicSimilarity], result of:
            0.04894267 = score(doc=1046,freq=1.0), product of:
              0.10794834 = queryWeight, product of:
                1.0327699 = boost
                7.2542357 = idf(docFreq=84, maxDocs=44218)
                0.014408565 = queryNorm
              0.45338973 = fieldWeight in 1046, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2542357 = idf(docFreq=84, maxDocs=44218)
                0.0625 = fieldNorm(doc=1046)
          0.04349226 = weight(abstract_txt:proposed in 1046) [ClassicSimilarity], result of:
            0.04349226 = score(doc=1046,freq=3.0), product of:
              0.08716359 = queryWeight, product of:
                1.312436 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.014408565 = queryNorm
              0.4989728 = fieldWeight in 1046, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.0625 = fieldNorm(doc=1046)
          0.026640777 = weight(abstract_txt:given in 1046) [ClassicSimilarity], result of:
            0.026640777 = score(doc=1046,freq=1.0), product of:
              0.090670384 = queryWeight, product of:
                1.3385768 = boost
                4.701121 = idf(docFreq=1091, maxDocs=44218)
                0.014408565 = queryNorm
              0.29382005 = fieldWeight in 1046, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.701121 = idf(docFreq=1091, maxDocs=44218)
                0.0625 = fieldNorm(doc=1046)
          0.007205333 = weight(abstract_txt:this in 1046) [ClassicSimilarity], result of:
            0.007205333 = score(doc=1046,freq=1.0), product of:
              0.047776416 = queryWeight, product of:
                1.3741443 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.014408565 = queryNorm
              0.1508136 = fieldWeight in 1046, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=1046)
          0.043151967 = weight(abstract_txt:represent in 1046) [ClassicSimilarity], result of:
            0.043151967 = score(doc=1046,freq=1.0), product of:
              0.12505506 = queryWeight, product of:
                1.5720314 = boost
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.014408565 = queryNorm
              0.34506375 = fieldWeight in 1046, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.0625 = fieldNorm(doc=1046)
          0.047621835 = weight(abstract_txt:algorithm in 1046) [ClassicSimilarity], result of:
            0.047621835 = score(doc=1046,freq=1.0), product of:
              0.13354827 = queryWeight, product of:
                1.6245375 = boost
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.014408565 = queryNorm
              0.35658893 = fieldWeight in 1046, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.705423 = idf(docFreq=399, maxDocs=44218)
                0.0625 = fieldNorm(doc=1046)
          0.07255501 = weight(abstract_txt:similar in 1046) [ClassicSimilarity], result of:
            0.07255501 = score(doc=1046,freq=1.0), product of:
              0.22278664 = queryWeight, product of:
                2.9673593 = boost
                5.2107263 = idf(docFreq=655, maxDocs=44218)
                0.014408565 = queryNorm
              0.3256704 = fieldWeight in 1046, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2107263 = idf(docFreq=655, maxDocs=44218)
                0.0625 = fieldNorm(doc=1046)
          0.13272488 = weight(abstract_txt:representation in 1046) [ClassicSimilarity], result of:
            0.13272488 = score(doc=1046,freq=3.0), product of:
              0.2488907 = queryWeight, product of:
                3.5065894 = boost
                4.926098 = idf(docFreq=871, maxDocs=44218)
                0.014408565 = queryNorm
              0.5332657 = fieldWeight in 1046, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.926098 = idf(docFreq=871, maxDocs=44218)
                0.0625 = fieldNorm(doc=1046)
        0.32 = coord(8/25)
    
  5. Elsweiler, D.; Harvey, M.: Engaging and maintaining a sense of being informed : understanding the tasks motivating twitter search (2015) 0.13
    0.12740189 = sum of:
      0.12740189 = product of:
        0.53084123 = sum of:
          0.01406544 = weight(abstract_txt:from in 1635) [ClassicSimilarity], result of:
            0.01406544 = score(doc=1635,freq=3.0), product of:
              0.047010306 = queryWeight, product of:
                1.180464 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.014408565 = queryNorm
              0.29919907 = fieldWeight in 1635, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=1635)
          0.007205333 = weight(abstract_txt:this in 1635) [ClassicSimilarity], result of:
            0.007205333 = score(doc=1635,freq=1.0), product of:
              0.047776416 = queryWeight, product of:
                1.3741443 = boost
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.014408565 = queryNorm
              0.1508136 = fieldWeight in 1635, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=1635)
          0.043151967 = weight(abstract_txt:represent in 1635) [ClassicSimilarity], result of:
            0.043151967 = score(doc=1635,freq=1.0), product of:
              0.12505506 = queryWeight, product of:
                1.5720314 = boost
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.014408565 = queryNorm
              0.34506375 = fieldWeight in 1635, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.52102 = idf(docFreq=480, maxDocs=44218)
                0.0625 = fieldNorm(doc=1635)
          0.07255501 = weight(abstract_txt:similar in 1635) [ClassicSimilarity], result of:
            0.07255501 = score(doc=1635,freq=1.0), product of:
              0.22278664 = queryWeight, product of:
                2.9673593 = boost
                5.2107263 = idf(docFreq=655, maxDocs=44218)
                0.014408565 = queryNorm
              0.3256704 = fieldWeight in 1635, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.2107263 = idf(docFreq=655, maxDocs=44218)
                0.0625 = fieldNorm(doc=1635)
          0.13244745 = weight(abstract_txt:people in 1635) [ClassicSimilarity], result of:
            0.13244745 = score(doc=1635,freq=3.0), product of:
              0.24854377 = queryWeight, product of:
                3.5041444 = boost
                4.922663 = idf(docFreq=874, maxDocs=44218)
                0.014408565 = queryNorm
              0.5328939 = fieldWeight in 1635, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.922663 = idf(docFreq=874, maxDocs=44218)
                0.0625 = fieldNorm(doc=1635)
          0.26141605 = weight(abstract_txt:person in 1635) [ClassicSimilarity], result of:
            0.26141605 = score(doc=1635,freq=1.0), product of:
              0.6596937 = queryWeight, product of:
                7.2212396 = boost
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.014408565 = queryNorm
              0.3962688 = fieldWeight in 1635, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.340301 = idf(docFreq=211, maxDocs=44218)
                0.0625 = fieldNorm(doc=1635)
        0.24 = coord(6/25)
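
The score trees above follow the ClassicSimilarity (TF-IDF) arithmetic: each term contributes queryWeight * fieldWeight, the contributions are summed, and the sum is scaled by the coord factor. The sketch below recomputes the first entry (doc 327) from the values printed in its tree. It is a minimal illustration and not the search engine's own code: the function and variable names are hypothetical, and the formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)) are assumed from the classic TF-IDF definitions, which reproduce the listed idf values. Python computes in double precision, so results can differ from the listed single-precision figures in the last digits.

```python
import math

# Constants copied from the explain tree of entry 1 (doc 327).
MAX_DOCS = 44218
QUERY_NORM = 0.014408565
FIELD_NORM = 0.0625

def idf(doc_freq, max_docs=MAX_DOCS):
    """Assumed classic idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))

def term_score(freq, doc_freq, boost):
    """One term's contribution: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM           # boost * idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * FIELD_NORM  # tf * idf * fieldNorm
    return query_weight * field_weight

# (term, termFreq, docFreq, boost) as listed for doc 327.
matched_terms = [
    ("person's",       1.0,   24, 1.2069961),
    ("given",          1.0, 1091, 1.3385768),
    ("algorithm",      1.0,  399, 1.6245375),
    ("interests",      3.0,  200, 2.7307217),
    ("representation", 1.0,  871, 3.5065894),
    ("person",         4.0,  211, 7.2212396),
]

term_scores = {t: term_score(f, df, b) for t, f, df, b in matched_terms}
raw_sum = sum(term_scores.values())

# coord(6/25): 6 of the 25 query terms matched this document.
coord = len(matched_terms) / 25
final_score = raw_sum * coord

for term, score in term_scores.items():
    print(f"{term:15s} {score:.6f}")
print(f"sum = {raw_sum:.6f}, coord = {coord:.2f}, score = {final_score:.6f}")
```

The same helper can be reused for the other entries by substituting their fieldNorm, term list, and coord numerator. The tree also shows why entry 2 ranks below entry 1 despite a larger raw sum: it matches only 4 of 25 query terms, so its coord factor is 0.16 rather than 0.24.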