Document (#42043)

Author
Xiao, D.
Ji, Y.
Li, Y.
Zhuang, F.
Shi, C.
Title
Coupled matrix factorization and topic modeling for aspect mining
Source
Information processing and management. 54(2018) no.6, pp.861-873
Year
2018
Abstract
Aspect mining, which aims to extract ad hoc aspects from online reviews and predict a rating or opinion on each aspect, can satisfy personalized needs for evaluating specific aspects of product quality. With the growth of related research, how to effectively integrate rating and review information has become the key issue in addressing this problem. Since matrix factorization is an effective tool for rating prediction and topic modeling is widely used for review processing, it is natural to combine the two for aspect mining (also called aspect rating prediction). However, this idea faces several challenges: choosing suitable sharing factors, handling scale mismatch, and modeling the dependency relation between rating and review information. In this paper, we propose a novel model that effectively integrates Matrix factorization and Topic modeling for Aspect rating prediction (MaToAsp). To overcome these challenges, MaToAsp employs items as the sharing factors to combine matrix factorization and topic modeling, and introduces an interpretive preference probability to eliminate scale mismatch. In the hybrid model, we establish a dependency relation from ratings to sentiment terms in phrases. Experiments on two real datasets, Chinese Dianping and English Tripadvisor, show that MaToAsp not only obtains reasonable aspect identification but also achieves the best aspect rating prediction performance compared to recent representative baselines.
Content
Cf.: https://doi.org/10.1016/j.ipm.2018.05.002.

Similar documents (author)

  1. Xiao, Y.: Modern development of classification : research and practice in the People's Republic of China (1992) 5.54
    5.5397964 = sum of:
      5.5397964 = weight(author_txt:xiao in 1909) [ClassicSimilarity], result of:
        5.5397964 = fieldWeight in 1909, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.863674 = idf(docFreq=16, maxDocs=44218)
          0.625 = fieldNorm(doc=1909)
    
  2. Xiao, Y.: Faceted classification : a consideration of its features as a paradigm of knowledge organization (1994) 5.54
    5.5397964 = sum of:
      5.5397964 = weight(author_txt:xiao in 7547) [ClassicSimilarity], result of:
        5.5397964 = fieldWeight in 7547, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.863674 = idf(docFreq=16, maxDocs=44218)
          0.625 = fieldNorm(doc=7547)
    
  3. Xiao, G.: A knowledge classification model based on the relationship between science and human needs (2013) 5.54
    5.5397964 = sum of:
      5.5397964 = weight(author_txt:xiao in 138) [ClassicSimilarity], result of:
        5.5397964 = fieldWeight in 138, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.863674 = idf(docFreq=16, maxDocs=44218)
          0.625 = fieldNorm(doc=138)
    
  4. Xiao, L.: Effects of rationale awareness in online ideation crowdsourcing tasks (2014) 5.54
    5.5397964 = sum of:
      5.5397964 = weight(author_txt:xiao in 1329) [ClassicSimilarity], result of:
        5.5397964 = fieldWeight in 1329, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.863674 = idf(docFreq=16, maxDocs=44218)
          0.625 = fieldNorm(doc=1329)
    
  5. Xiao, L.; Askin, A.: What influences online deliberation? : A wikipedia study (2014) 4.43
    4.431837 = sum of:
      4.431837 = weight(author_txt:xiao in 1254) [ClassicSimilarity], result of:
        4.431837 = fieldWeight in 1254, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.863674 = idf(docFreq=16, maxDocs=44218)
          0.5 = fieldNorm(doc=1254)
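
The author scores above can be reproduced by hand. The sketch below assumes the standard ClassicSimilarity definitions, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), and takes fieldNorm as given from the listing. The function names are illustrative and not part of any library.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming the ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq: float) -> float:
    """Term frequency factor, assuming the ClassicSimilarity formula."""
    return math.sqrt(freq)

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as shown in the explain trees."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values copied from the listing: author term "xiao", docFreq=16, maxDocs=44218.
idf_xiao = classic_idf(16, 44218)
score_1909 = field_weight(1.0, 16, 44218, 0.625)  # entry 1
score_1254 = field_weight(1.0, 16, 44218, 0.5)    # entry 5

print(round(idf_xiao, 4), round(score_1909, 4), round(score_1254, 4))
```

Entries 1 to 4 share the same fieldNorm (0.625) and therefore the same score. Entry 5 has two authors, so its author field is longer, its fieldNorm is lower (0.5), and its score drops accordingly.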
    

Similar documents (content)

  1. Su, L.T.; Chen, H.L.: Evaluation of Web search engines by undergraduate students (1999) 0.12
    0.12436842 = sum of:
      0.12436842 = product of:
        0.51820177 = sum of:
          0.017786335 = weight(abstract_txt:performance in 6546) [ClassicSimilarity], result of:
            0.017786335 = score(doc=6546,freq=2.0), product of:
              0.04966644 = queryWeight, product of:
                1.1528028 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.009304383 = queryNorm
              0.3581158 = fieldWeight in 6546, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6546)
          0.015468428 = weight(abstract_txt:factors in 6546) [ClassicSimilarity], result of:
            0.015468428 = score(doc=6546,freq=1.0), product of:
              0.05701373 = queryWeight, product of:
                1.2351317 = boost
                4.9611073 = idf(docFreq=841, maxDocs=44218)
                0.009304383 = queryNorm
              0.27131057 = fieldWeight in 6546, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9611073 = idf(docFreq=841, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6546)
          0.019863963 = weight(abstract_txt:relation in 6546) [ClassicSimilarity], result of:
            0.019863963 = score(doc=6546,freq=1.0), product of:
              0.06735853 = queryWeight, product of:
                1.3425171 = boost
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.009304383 = queryNorm
              0.294899 = fieldWeight in 6546, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.3924384 = idf(docFreq=546, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6546)
          0.041084886 = weight(abstract_txt:topic in 6546) [ClassicSimilarity], result of:
            0.041084886 = score(doc=6546,freq=1.0), product of:
              0.14840554 = queryWeight, product of:
                3.1507838 = boost
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.009304383 = queryNorm
              0.276842 = fieldWeight in 6546, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.062254 = idf(docFreq=760, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6546)
          0.28509974 = weight(abstract_txt:rating in 6546) [ClassicSimilarity], result of:
            0.28509974 = score(doc=6546,freq=2.0), product of:
              0.479394 = queryWeight, product of:
                6.7004485 = boost
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.009304383 = queryNorm
              0.5947086 = fieldWeight in 6546, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.689554 = idf(docFreq=54, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6546)
          0.13889839 = weight(abstract_txt:aspect in 6546) [ClassicSimilarity], result of:
            0.13889839 = score(doc=6546,freq=1.0), product of:
              0.4066471 = queryWeight, product of:
                6.997431 = boost
                6.2458487 = idf(docFreq=232, maxDocs=44218)
                0.009304383 = queryNorm
              0.34156984 = fieldWeight in 6546, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2458487 = idf(docFreq=232, maxDocs=44218)
                0.0546875 = fieldNorm(doc=6546)
        0.24 = coord(6/25)
    
  2. Su, Z.; Li, D.; Li, H.; Luo, X.: Boosting attribute recognition with latent topics by matrix factorization (2017) 0.12
    0.118006006 = sum of:
      0.118006006 = product of:
        0.73753756 = sum of:
          0.029721672 = weight(abstract_txt:scale in 3693) [ClassicSimilarity], result of:
            0.029721672 = score(doc=3693,freq=1.0), product of:
              0.069469824 = queryWeight, product of:
                1.3633949 = boost
                5.476297 = idf(docFreq=502, maxDocs=44218)
                0.009304383 = queryNorm
              0.4278357 = fieldWeight in 3693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.476297 = idf(docFreq=502, maxDocs=44218)
                0.078125 = fieldNorm(doc=3693)
          0.14559196 = weight(abstract_txt:matrix in 3693) [ClassicSimilarity], result of:
            0.14559196 = score(doc=3693,freq=1.0), product of:
              0.27194786 = queryWeight, product of:
                4.26517 = boost
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.009304383 = queryNorm
              0.5353672 = fieldWeight in 3693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.078125 = fieldNorm(doc=3693)
          0.36379766 = weight(abstract_txt:factorization in 3693) [ClassicSimilarity], result of:
            0.36379766 = score(doc=3693,freq=1.0), product of:
              0.500765 = queryWeight, product of:
                5.7877603 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.009304383 = queryNorm
              0.72648376 = fieldWeight in 3693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.078125 = fieldNorm(doc=3693)
          0.19842626 = weight(abstract_txt:aspect in 3693) [ClassicSimilarity], result of:
            0.19842626 = score(doc=3693,freq=1.0), product of:
              0.4066471 = queryWeight, product of:
                6.997431 = boost
                6.2458487 = idf(docFreq=232, maxDocs=44218)
                0.009304383 = queryNorm
              0.48795694 = fieldWeight in 3693, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2458487 = idf(docFreq=232, maxDocs=44218)
                0.078125 = fieldNorm(doc=3693)
        0.16 = coord(4/25)
    
  3. Greenstein-Messica, A.; Rokach, L.; Shabtai, A.: Personal-discount sensitivity prediction for mobile coupon conversion optimization (2017) 0.10
    0.10210684 = sum of:
      0.10210684 = product of:
        0.63816774 = sum of:
          0.1519546 = weight(abstract_txt:prediction in 3751) [ClassicSimilarity], result of:
            0.1519546 = score(doc=3751,freq=2.0), product of:
              0.23923789 = queryWeight, product of:
                3.5781085 = boost
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.009304383 = queryNorm
              0.6351611 = fieldWeight in 3751, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.0625 = fieldNorm(doc=3751)
          0.07870141 = weight(abstract_txt:modeling in 3751) [ClassicSimilarity], result of:
            0.07870141 = score(doc=3751,freq=1.0), product of:
              0.20940597 = queryWeight, product of:
                3.7427263 = boost
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.009304383 = queryNorm
              0.37583172 = fieldWeight in 3751, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.0625 = fieldNorm(doc=3751)
          0.11647357 = weight(abstract_txt:matrix in 3751) [ClassicSimilarity], result of:
            0.11647357 = score(doc=3751,freq=1.0), product of:
              0.27194786 = queryWeight, product of:
                4.26517 = boost
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.009304383 = queryNorm
              0.42829376 = fieldWeight in 3751, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.0625 = fieldNorm(doc=3751)
          0.29103813 = weight(abstract_txt:factorization in 3751) [ClassicSimilarity], result of:
            0.29103813 = score(doc=3751,freq=1.0), product of:
              0.500765 = queryWeight, product of:
                5.7877603 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.009304383 = queryNorm
              0.581187 = fieldWeight in 3751, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.0625 = fieldNorm(doc=3751)
        0.16 = coord(4/25)
    
  4. Greenstein-Messica, A.; Rokach, L.; Shabtai, A.: Personal-discount sensitivity prediction for mobile coupon conversion optimization (2017) 0.10
    0.10210684 = sum of:
      0.10210684 = product of:
        0.63816774 = sum of:
          0.1519546 = weight(abstract_txt:prediction in 3761) [ClassicSimilarity], result of:
            0.1519546 = score(doc=3761,freq=2.0), product of:
              0.23923789 = queryWeight, product of:
                3.5781085 = boost
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.009304383 = queryNorm
              0.6351611 = fieldWeight in 3761, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.0625 = fieldNorm(doc=3761)
          0.07870141 = weight(abstract_txt:modeling in 3761) [ClassicSimilarity], result of:
            0.07870141 = score(doc=3761,freq=1.0), product of:
              0.20940597 = queryWeight, product of:
                3.7427263 = boost
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.009304383 = queryNorm
              0.37583172 = fieldWeight in 3761, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0133076 = idf(docFreq=293, maxDocs=44218)
                0.0625 = fieldNorm(doc=3761)
          0.11647357 = weight(abstract_txt:matrix in 3761) [ClassicSimilarity], result of:
            0.11647357 = score(doc=3761,freq=1.0), product of:
              0.27194786 = queryWeight, product of:
                4.26517 = boost
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.009304383 = queryNorm
              0.42829376 = fieldWeight in 3761, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.0625 = fieldNorm(doc=3761)
          0.29103813 = weight(abstract_txt:factorization in 3761) [ClassicSimilarity], result of:
            0.29103813 = score(doc=3761,freq=1.0), product of:
              0.500765 = queryWeight, product of:
                5.7877603 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.009304383 = queryNorm
              0.581187 = fieldWeight in 3761, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.0625 = fieldNorm(doc=3761)
        0.16 = coord(4/25)
    
  5. Ferreira, R.S.; Graça Pimentel, M. de; Cristo, M.: A wikification prediction model based on the combination of latent, dyadic, and monadic features (2018) 0.08
    0.084693335 = sum of:
      0.084693335 = product of:
        0.52933335 = sum of:
          0.01437353 = weight(abstract_txt:performance in 4119) [ClassicSimilarity], result of:
            0.01437353 = score(doc=4119,freq=1.0), product of:
              0.04966644 = queryWeight, product of:
                1.1528028 = boost
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.009304383 = queryNorm
              0.28940126 = fieldWeight in 4119, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.63042 = idf(docFreq=1171, maxDocs=44218)
                0.0625 = fieldNorm(doc=4119)
          0.10744813 = weight(abstract_txt:prediction in 4119) [ClassicSimilarity], result of:
            0.10744813 = score(doc=4119,freq=1.0), product of:
              0.23923789 = queryWeight, product of:
                3.5781085 = boost
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.009304383 = queryNorm
              0.44912672 = fieldWeight in 4119, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.0625 = fieldNorm(doc=4119)
          0.11647357 = weight(abstract_txt:matrix in 4119) [ClassicSimilarity], result of:
            0.11647357 = score(doc=4119,freq=1.0), product of:
              0.27194786 = queryWeight, product of:
                4.26517 = boost
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.009304383 = queryNorm
              0.42829376 = fieldWeight in 4119, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8527 = idf(docFreq=126, maxDocs=44218)
                0.0625 = fieldNorm(doc=4119)
          0.29103813 = weight(abstract_txt:factorization in 4119) [ClassicSimilarity], result of:
            0.29103813 = score(doc=4119,freq=1.0), product of:
              0.500765 = queryWeight, product of:
                5.7877603 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.009304383 = queryNorm
              0.581187 = fieldWeight in 4119, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.0625 = fieldNorm(doc=4119)
        0.16 = coord(4/25)
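
The content scores combine several term matches. As a minimal sketch, again assuming the ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), each term contributes queryWeight * fieldWeight, and the sum is scaled by the coord factor (matched terms / query terms). The example recomputes entry 1 (doc 6546) from the boost, docFreq, and freq values in the listing. The helper names are illustrative only.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.009304383   # queryNorm from the listing
FIELD_NORM = 0.0546875     # fieldNorm(doc=6546) from the listing

def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency, assuming the ClassicSimilarity formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(boost: float, doc_freq: int, freq: float) -> float:
    """One term's contribution: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight

# (term, boost, docFreq, freq) copied from the explain tree for doc 6546.
matched_terms = [
    ("performance", 1.1528028, 1171, 2.0),
    ("factors",     1.2351317,  841, 1.0),
    ("relation",    1.3425171,  546, 1.0),
    ("topic",       3.1507838,  760, 1.0),
    ("rating",      6.7004485,   54, 2.0),
    ("aspect",      6.997431,   232, 1.0),
]

term_sum = sum(term_score(b, df, f) for _, b, df, f in matched_terms)
coord = len(matched_terms) / 25   # 6 of the 25 query terms matched
total = term_sum * coord

print(round(term_sum, 5), round(total, 5))
```

The coord factor explains why entry 1 ranks first with only a moderate term sum: it matches 6 of the 25 query terms (coord 0.24), while the other entries match 4 (coord 0.16).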