Document (#42918)

Author
Tay, W.
Zhang, X.
Karimi, S.
Title
Beyond mean rating : probabilistic aggregation of star ratings based on helpfulness
Source
Journal of the Association for Information Science and Technology. 71(2020) no.7, pp.784-799
Year
2020
Abstract
The star-rating mechanism of customer reviews is used universally by the online population to compare and select merchants, movies, products, and services. The consensus opinion from aggregation of star ratings is used as a proxy for item quality. Online reviews are noisy and effective aggregation of star ratings to accurately reflect the "true quality" of products and services is challenging. The mean-rating aggregation model is widely used and other aggregation models are also proposed. These existing aggregation models rely on a large number of reviews to tolerate noise. However, many products rarely have reviews. We propose probabilistic aggregation models for review ratings based on the Dirichlet distribution to combat data sparsity in reviews. We further propose to exploit the "helpfulness" social information and time to filter noisy reviews and effectively aggregate ratings to compute the consensus opinion. Our experiments on an Amazon data set show that our probabilistic aggregation models based on "helpfulness" achieve better performance than the statistical and heuristic baseline approaches.
Content
https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24297
Theme
Informetrics
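
The abstract's central idea, smoothing sparse star ratings with a Dirichlet prior and weighting reviews by helpfulness, can be illustrated with a minimal sketch. This is not the authors' model: the symmetric prior, the treatment of a helpful-vote fraction as a fractional count, and the function name `dirichlet_expected_rating` are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def dirichlet_expected_rating(
    ratings: Sequence[int],
    weights: Optional[Sequence[float]] = None,
    prior: float = 1.0,
    levels: int = 5,
) -> float:
    """Posterior-mean star rating under a symmetric Dirichlet prior.

    ratings: observed star values in 1..levels.
    weights: hypothetical per-review weights (e.g. helpful votes divided
             by total votes); defaults to 1.0 for every review.
    prior:   pseudo-count added to each star level (an assumed value).
    """
    if weights is None:
        weights = [1.0] * len(ratings)
    if len(weights) != len(ratings):
        raise ValueError("ratings and weights must have the same length")

    # Start from the prior pseudo-counts, then add weighted observations.
    alpha = [prior] * levels
    for star, weight in zip(ratings, weights):
        if not 1 <= star <= levels:
            raise ValueError(f"rating {star} is outside 1..{levels}")
        alpha[star - 1] += weight

    # The posterior mean of a Dirichlet is alpha_k / sum(alpha); the
    # expected rating is the star value averaged under that distribution.
    total = sum(alpha)
    return sum((k + 1) * a for k, a in enumerate(alpha)) / total
```

With a single five-star review, the plain mean is 5.0, whereas this sketch returns about 3.33 because the prior pulls the estimate toward the scale midpoint until more evidence arrives. Lowering a review's weight reduces its influence in the same way.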

Similar documents (author)

  1. Zhang, M.; Zhang, Y.: Professional organizations in Twittersphere : an empirical study of U.S. library and information science professional organizations-related Tweets (2020) 4.54
    4.5423746 = sum of:
      4.5423746 = weight(author_txt:zhang in 5775) [ClassicSimilarity], result of:
        4.5423746 = fieldWeight in 5775, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.4238877 = idf(docFreq=194, maxDocs=44218)
          0.5 = fieldNorm(doc=5775)
    
  2. Zhang, Y.; Zhang, C.: Enhancing keyphrase extraction from microblogs using human reading time (2021) 4.54
    4.5423746 = sum of:
      4.5423746 = weight(author_txt:zhang in 237) [ClassicSimilarity], result of:
        4.5423746 = fieldWeight in 237, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          6.4238877 = idf(docFreq=194, maxDocs=44218)
          0.5 = fieldNorm(doc=237)
    
  3. Zhang, J.: TOFIR: A tool of facilitating information retrieval : introduce a visual retrieval model (2001) 4.01
    4.01493 = sum of:
      4.01493 = weight(author_txt:zhang in 7711) [ClassicSimilarity], result of:
        4.01493 = fieldWeight in 7711, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.4238877 = idf(docFreq=194, maxDocs=44218)
          0.625 = fieldNorm(doc=7711)
    
  4. Zhang, A.: Multimedia file formats on the Internet : a beginner's guide for PC users (1995) 4.01
    4.01493 = sum of:
      4.01493 = weight(author_txt:zhang in 3212) [ClassicSimilarity], result of:
        4.01493 = fieldWeight in 3212, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.4238877 = idf(docFreq=194, maxDocs=44218)
          0.625 = fieldNorm(doc=3212)
    
  5. Zhang, J.: A representational analysis of relational information displays (1996) 4.01
    4.01493 = sum of:
      4.01493 = weight(author_txt:zhang in 6403) [ClassicSimilarity], result of:
        4.01493 = fieldWeight in 6403, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          6.4238877 = idf(docFreq=194, maxDocs=44218)
          0.625 = fieldNorm(doc=6403)
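
The author-match scores above follow Lucene's ClassicSimilarity (TF-IDF) breakdown. The figures shown are reproduced by tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), and fieldWeight = tf * idf * fieldNorm. The sketch below recomputes entries 1 and 3. The function names are illustrative, and the idf form is simply the one that matches these numbers; other Lucene versions may use a slightly different variant.

```python
import math


def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw term count.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency in the form that reproduces the
    # idf values listed in the explain output above.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the "product of" lines.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Entry 1: "zhang" occurs twice in the author field, fieldNorm 0.5.
score_two_matches = field_weight(2.0, 194, 44218, 0.5)

# Entry 3: "zhang" occurs once, fieldNorm 0.625.
score_one_match = field_weight(1.0, 194, 44218, 0.625)
```

The two results agree with the listed 4.5423746 and 4.01493 up to single-precision rounding, which shows why a record with two authors named Zhang outranks a single-author record despite its lower fieldNorm.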
    

Similar documents (content)

  1. Xiao, D.; Ji, Y.; Li, Y.; Zhuang, F.; Shi, C.: Coupled matrix factorization and topic modeling for aspect mining (2018) 0.18
    0.17529944 = sum of:
      0.17529944 = product of:
        0.6260694 = sum of:
          0.016627217 = weight(abstract_txt:quality in 5042) [ClassicSimilarity], result of:
            0.016627217 = score(doc=5042,freq=1.0), product of:
              0.05717683 = queryWeight, product of:
                1.1620504 = boost
                4.6528544 = idf(docFreq=1145, maxDocs=44218)
                0.010574885 = queryNorm
              0.2908034 = fieldWeight in 5042, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6528544 = idf(docFreq=1145, maxDocs=44218)
                0.0625 = fieldNorm(doc=5042)
          0.009386388 = weight(abstract_txt:used in 5042) [ClassicSimilarity], result of:
            0.009386388 = score(doc=5042,freq=1.0), product of:
              0.044706408 = queryWeight, product of:
                1.2584776 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.010574885 = queryNorm
              0.2099562 = fieldWeight in 5042, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0625 = fieldNorm(doc=5042)
          0.022529205 = weight(abstract_txt:propose in 5042) [ClassicSimilarity], result of:
            0.022529205 = score(doc=5042,freq=1.0), product of:
              0.07001175 = queryWeight, product of:
                1.2858799 = boost
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.010574885 = queryNorm
              0.32179177 = fieldWeight in 5042, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.0625 = fieldNorm(doc=5042)
          0.053676415 = weight(abstract_txt:opinion in 5042) [ClassicSimilarity], result of:
            0.053676415 = score(doc=5042,freq=1.0), product of:
              0.12489049 = queryWeight, product of:
                1.7174323 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.010574885 = queryNorm
              0.42978784 = fieldWeight in 5042, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.0625 = fieldNorm(doc=5042)
          0.29785502 = weight(abstract_txt:rating in 5042) [ClassicSimilarity], result of:
            0.29785502 = score(doc=5042,freq=7.0), product of:
              0.23424737 = queryWeight, product of:
                2.8806992 = boost
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.010574885 = queryNorm
              1.2715405 = fieldWeight in 5042, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.0625 = fieldNorm(doc=5042)
          0.06016476 = weight(abstract_txt:reviews in 5042) [ClassicSimilarity], result of:
            0.06016476 = score(doc=5042,freq=1.0), product of:
              0.19436091 = queryWeight, product of:
                3.7109065 = boost
                4.952828 = idf(docFreq=848, maxDocs=44218)
                0.010574885 = queryNorm
              0.30955175 = fieldWeight in 5042, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.952828 = idf(docFreq=848, maxDocs=44218)
                0.0625 = fieldNorm(doc=5042)
          0.16583042 = weight(abstract_txt:ratings in 5042) [ClassicSimilarity], result of:
            0.16583042 = score(doc=5042,freq=1.0), product of:
              0.35955322 = queryWeight, product of:
                4.6075125 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.010574885 = queryNorm
              0.46121246 = fieldWeight in 5042, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.0625 = fieldNorm(doc=5042)
        0.28 = coord(7/25)
    
  2. Zhu, J.; Han, L.; Gou, Z.; Yuan, X.: A fuzzy clustering-based denoising model for evaluating uncertainty in collaborative filtering recommender systems (2018) 0.15
    0.15456389 = sum of:
      0.15456389 = product of:
        0.5520139 = sum of:
          0.04830362 = weight(abstract_txt:movies in 4460) [ClassicSimilarity], result of:
            0.04830362 = score(doc=4460,freq=1.0), product of:
              0.09239536 = queryWeight, product of:
                1.0445398 = boost
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.010574885 = queryNorm
              0.5227927 = fieldWeight in 4460, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.364683 = idf(docFreq=27, maxDocs=44218)
                0.0625 = fieldNorm(doc=4460)
          0.016627217 = weight(abstract_txt:quality in 4460) [ClassicSimilarity], result of:
            0.016627217 = score(doc=4460,freq=1.0), product of:
              0.05717683 = queryWeight, product of:
                1.1620504 = boost
                4.6528544 = idf(docFreq=1145, maxDocs=44218)
                0.010574885 = queryNorm
              0.2908034 = fieldWeight in 4460, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6528544 = idf(docFreq=1145, maxDocs=44218)
                0.0625 = fieldNorm(doc=4460)
          0.011344695 = weight(abstract_txt:based in 4460) [ClassicSimilarity], result of:
            0.011344695 = score(doc=4460,freq=2.0), product of:
              0.040261447 = queryWeight, product of:
                1.1942775 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.010574885 = queryNorm
              0.28177565 = fieldWeight in 4460, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=4460)
          0.009386388 = weight(abstract_txt:used in 4460) [ClassicSimilarity], result of:
            0.009386388 = score(doc=4460,freq=1.0), product of:
              0.044706408 = queryWeight, product of:
                1.2584776 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.010574885 = queryNorm
              0.2099562 = fieldWeight in 4460, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0625 = fieldNorm(doc=4460)
          0.09102548 = weight(abstract_txt:noisy in 4460) [ClassicSimilarity], result of:
            0.09102548 = score(doc=4460,freq=1.0), product of:
              0.17760247 = queryWeight, product of:
                2.0480447 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.010574885 = queryNorm
              0.5125237 = fieldWeight in 4460, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.0625 = fieldNorm(doc=4460)
          0.043665644 = weight(abstract_txt:products in 4460) [ClassicSimilarity], result of:
            0.043665644 = score(doc=4460,freq=1.0), product of:
              0.12458451 = queryWeight, product of:
                2.1008382 = boost
                5.6078424 = idf(docFreq=440, maxDocs=44218)
                0.010574885 = queryNorm
              0.35049015 = fieldWeight in 4460, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.6078424 = idf(docFreq=440, maxDocs=44218)
                0.0625 = fieldNorm(doc=4460)
          0.33166084 = weight(abstract_txt:ratings in 4460) [ClassicSimilarity], result of:
            0.33166084 = score(doc=4460,freq=4.0), product of:
              0.35955322 = queryWeight, product of:
                4.6075125 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.010574885 = queryNorm
              0.9224249 = fieldWeight in 4460, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.0625 = fieldNorm(doc=4460)
        0.28 = coord(7/25)
    
  3. Chua, A.Y.K.; Banerjee, S.: Understanding review helpfulness as a function of reviewer reputation, review rating, and review depth (2015) 0.14
    0.14084816 = sum of:
      0.14084816 = product of:
        0.880301 = sum of:
          0.0635764 = weight(abstract_txt:amazon in 1641) [ClassicSimilarity], result of:
            0.0635764 = score(doc=1641,freq=1.0), product of:
              0.08468376 = queryWeight, product of:
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.010574885 = queryNorm
              0.7507508 = fieldWeight in 1641, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.09375 = fieldNorm(doc=1641)
          0.23881531 = weight(abstract_txt:rating in 1641) [ClassicSimilarity], result of:
            0.23881531 = score(doc=1641,freq=2.0), product of:
              0.23424737 = queryWeight, product of:
                2.8806992 = boost
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.010574885 = queryNorm
              1.0195005 = fieldWeight in 1641, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.09375 = fieldNorm(doc=1641)
          0.45028055 = weight(abstract_txt:helpfulness in 1641) [ClassicSimilarity], result of:
            0.45028055 = score(doc=1641,freq=2.0), product of:
              0.35751045 = queryWeight, product of:
                3.558811 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.010574885 = queryNorm
              1.2594892 = fieldWeight in 1641, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.09375 = fieldNorm(doc=1641)
          0.12762873 = weight(abstract_txt:reviews in 1641) [ClassicSimilarity], result of:
            0.12762873 = score(doc=1641,freq=2.0), product of:
              0.19436091 = queryWeight, product of:
                3.7109065 = boost
                4.952828 = idf(docFreq=848, maxDocs=44218)
                0.010574885 = queryNorm
              0.6566584 = fieldWeight in 1641, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.952828 = idf(docFreq=848, maxDocs=44218)
                0.09375 = fieldNorm(doc=1641)
        0.16 = coord(4/25)
    
  4. Malik, M.S.I.; Hussain, A.: An analysis of review content and reviewer variables that contribute to review helpfulness (2018) 0.13
    0.12742843 = sum of:
      0.12742843 = product of:
        0.6371421 = sum of:
          0.0599404 = weight(abstract_txt:amazon in 5091) [ClassicSimilarity], result of:
            0.0599404 = score(doc=5091,freq=2.0), product of:
              0.08468376 = queryWeight, product of:
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.010574885 = queryNorm
              0.7078146 = fieldWeight in 5091, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.008008 = idf(docFreq=39, maxDocs=44218)
                0.0625 = fieldNorm(doc=5091)
          0.009386388 = weight(abstract_txt:used in 5091) [ClassicSimilarity], result of:
            0.009386388 = score(doc=5091,freq=1.0), product of:
              0.044706408 = queryWeight, product of:
                1.2584776 = boost
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.010574885 = queryNorm
              0.2099562 = fieldWeight in 5091, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.3592992 = idf(docFreq=4177, maxDocs=44218)
                0.0625 = fieldNorm(doc=5091)
          0.03301316 = weight(abstract_txt:models in 5091) [ClassicSimilarity], result of:
            0.03301316 = score(doc=5091,freq=1.0), product of:
              0.11379987 = queryWeight, product of:
                2.3184664 = boost
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.010574885 = queryNorm
              0.2900984 = fieldWeight in 5091, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6415744 = idf(docFreq=1158, maxDocs=44218)
                0.0625 = fieldNorm(doc=5091)
          0.4746374 = weight(abstract_txt:helpfulness in 5091) [ClassicSimilarity], result of:
            0.4746374 = score(doc=5091,freq=5.0), product of:
              0.35751045 = queryWeight, product of:
                3.558811 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.010574885 = queryNorm
              1.3276182 = fieldWeight in 5091, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.0625 = fieldNorm(doc=5091)
          0.06016476 = weight(abstract_txt:reviews in 5091) [ClassicSimilarity], result of:
            0.06016476 = score(doc=5091,freq=1.0), product of:
              0.19436091 = queryWeight, product of:
                3.7109065 = boost
                4.952828 = idf(docFreq=848, maxDocs=44218)
                0.010574885 = queryNorm
              0.30955175 = fieldWeight in 5091, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.952828 = idf(docFreq=848, maxDocs=44218)
                0.0625 = fieldNorm(doc=5091)
        0.2 = coord(5/25)
    
  5. Li, H.; Bhowmick, S.S.; Sun, A.: AffRank: affinity-driven ranking of products in online social rating networks (2011) 0.12
    0.11887966 = sum of:
      0.11887966 = product of:
        0.49533194 = sum of:
          0.016627217 = weight(abstract_txt:quality in 4483) [ClassicSimilarity], result of:
            0.016627217 = score(doc=4483,freq=1.0), product of:
              0.05717683 = queryWeight, product of:
                1.1620504 = boost
                4.6528544 = idf(docFreq=1145, maxDocs=44218)
                0.010574885 = queryNorm
              0.2908034 = fieldWeight in 4483, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6528544 = idf(docFreq=1145, maxDocs=44218)
                0.0625 = fieldNorm(doc=4483)
          0.008021912 = weight(abstract_txt:based in 4483) [ClassicSimilarity], result of:
            0.008021912 = score(doc=4483,freq=1.0), product of:
              0.040261447 = queryWeight, product of:
                1.1942775 = boost
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.010574885 = queryNorm
              0.19924548 = fieldWeight in 4483, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1879277 = idf(docFreq=4958, maxDocs=44218)
                0.0625 = fieldNorm(doc=4483)
          0.022529205 = weight(abstract_txt:propose in 4483) [ClassicSimilarity], result of:
            0.022529205 = score(doc=4483,freq=1.0), product of:
              0.07001175 = queryWeight, product of:
                1.2858799 = boost
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.010574885 = queryNorm
              0.32179177 = fieldWeight in 4483, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1486683 = idf(docFreq=697, maxDocs=44218)
                0.0625 = fieldNorm(doc=4483)
          0.08733129 = weight(abstract_txt:products in 4483) [ClassicSimilarity], result of:
            0.08733129 = score(doc=4483,freq=4.0), product of:
              0.12458451 = queryWeight, product of:
                2.1008382 = boost
                5.6078424 = idf(docFreq=440, maxDocs=44218)
                0.010574885 = queryNorm
              0.7009803 = fieldWeight in 4483, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.6078424 = idf(docFreq=440, maxDocs=44218)
                0.0625 = fieldNorm(doc=4483)
          0.19499187 = weight(abstract_txt:rating in 4483) [ClassicSimilarity], result of:
            0.19499187 = score(doc=4483,freq=3.0), product of:
              0.23424737 = queryWeight, product of:
                2.8806992 = boost
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.010574885 = queryNorm
              0.8324186 = fieldWeight in 4483, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.0625 = fieldNorm(doc=4483)
          0.16583042 = weight(abstract_txt:ratings in 4483) [ClassicSimilarity], result of:
            0.16583042 = score(doc=4483,freq=1.0), product of:
              0.35955322 = queryWeight, product of:
                4.6075125 = boost
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.010574885 = queryNorm
              0.46121246 = fieldWeight in 4483, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3793993 = idf(docFreq=74, maxDocs=44218)
                0.0625 = fieldNorm(doc=4483)
        0.24 = coord(6/25)
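
For the content matches above, each term's score is queryWeight * fieldWeight, where queryWeight = boost * idf * queryNorm and fieldWeight = tf * idf * fieldNorm; the term scores are summed and then scaled by coord (matched query terms divided by total query terms). The sketch below recomputes entry 3 (doc 1641) from the values in its explain tree. The helper names are illustrative, and the boost of 1.0 for "amazon" is an assumption, since the output lists no boost for that term.

```python
import math

QUERY_NORM = 0.010574885  # queryNorm, copied from the explain output
FIELD_NORM = 0.09375      # fieldNorm(doc=1641)
COORD = 4 / 25            # 4 of 25 query terms matched

# (term, boost, idf, term frequency) as listed for doc 1641.
TERMS = [
    ("amazon", 1.0, 8.008008, 1.0),  # no boost shown; 1.0 assumed
    ("rating", 2.8806992, 7.689554, 2.0),
    ("helpfulness", 3.558811, 9.499662, 2.0),
    ("reviews", 3.7109065, 4.952828, 2.0),
]


def term_score(boost: float, idf: float, freq: float) -> float:
    # queryWeight depends only on the query; fieldWeight on the document.
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight


def document_score() -> float:
    # Sum the per-term scores, then apply the coordination factor.
    return COORD * sum(term_score(b, idf, f) for _, b, idf, f in TERMS)


total = document_score()
```

The result matches the listed 0.14084816 up to rounding. Because idf enters both queryWeight and fieldWeight, rare terms such as "helpfulness" (docFreq=8) dominate the total.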