Document (#42041)

Author
Xiao, D.
Ji, Y.
Li, Y.
Zhuang, F.
Shi, C.
Title
Coupled matrix factorization and topic modeling for aspect mining
Source
Information processing and management. 54(2018) no.6, pp.861-873
Year
2018
Abstract
Aspect mining aims to extract ad hoc aspects from online reviews and to predict the rating or opinion on each aspect, so that users can evaluate the specific aspects of product quality that matter to them. As related research has grown, the key issue has become how to integrate rating and review information effectively. Because matrix factorization is an effective tool for rating prediction and topic modeling is widely used for review processing, it is natural to combine the two for aspect mining (also called aspect rating prediction). However, this idea faces several challenges: choosing suitable sharing factors, handling scale mismatch, and modeling the dependency relation between rating and review information. In this paper, we propose a novel model that integrates Matrix factorization and Topic modeling for Aspect rating prediction (MaToAsp). To overcome these challenges, MaToAsp employs items as the sharing factors that couple matrix factorization with topic modeling, and introduces an interpretive preference probability to eliminate scale mismatch. In the hybrid model, we establish a dependency relation from ratings to sentiment terms in phrases. Experiments on two real datasets, Chinese Dianping and English Tripadvisor, show that MaToAsp not only identifies reasonable aspects but also achieves the best aspect rating prediction performance compared with recent representative baselines.
Content
Cf.: https://doi.org/10.1016/j.ipm.2018.05.002.

Similar documents (author)

  1. Xiao, Y.: Modern development of classification : research and practice in the People's Republic of China (1992) 5.65
    5.651716 = sum of:
      5.651716 = weight(author_txt:xiao in 1909) [ClassicSimilarity], result of:
        5.651716 = fieldWeight in 1909, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.042746 = idf(docFreq=13, maxDocs=43556)
          0.625 = fieldNorm(doc=1909)
    
  2. Xiao, Y.: Faceted classification : a consideration of its features as a paradigm of knowledge organization (1994) 5.65
    5.651716 = sum of:
      5.651716 = weight(author_txt:xiao in 7544) [ClassicSimilarity], result of:
        5.651716 = fieldWeight in 7544, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.042746 = idf(docFreq=13, maxDocs=43556)
          0.625 = fieldNorm(doc=7544)
    
  3. Xiao, G.: A knowledge classification model based on the relationship between science and human needs (2013) 5.65
    5.651716 = sum of:
      5.651716 = weight(author_txt:xiao in 2136) [ClassicSimilarity], result of:
        5.651716 = fieldWeight in 2136, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.042746 = idf(docFreq=13, maxDocs=43556)
          0.625 = fieldNorm(doc=2136)
    
  4. Xiao, L.: Effects of rationale awareness in online ideation crowdsourcing tasks (2014) 5.65
    5.651716 = sum of:
      5.651716 = weight(author_txt:xiao in 3327) [ClassicSimilarity], result of:
        5.651716 = fieldWeight in 3327, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.042746 = idf(docFreq=13, maxDocs=43556)
          0.625 = fieldNorm(doc=3327)
    
  5. Xiao, L.; Askin, A.: What influences online deliberation? : A wikipedia study (2014) 4.52
    4.521373 = sum of:
      4.521373 = weight(author_txt:xiao in 3252) [ClassicSimilarity], result of:
        4.521373 = fieldWeight in 3252, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.042746 = idf(docFreq=13, maxDocs=43556)
          0.5 = fieldNorm(doc=3252)
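
The author scores above can be reproduced by hand. The following is a minimal sketch, assuming the formulas documented for Lucene's ClassicSimilarity (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative, and the fieldNorm values are copied directly from the listing rather than derived.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq: float) -> float:
    """Term frequency factor: square root of the raw term count."""
    return math.sqrt(freq)

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as in the listing above."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values taken from the listing: author term "xiao", docFreq=13, maxDocs=43556.
score_entries_1_to_4 = field_weight(1.0, 13, 43556, 0.625)  # fieldNorm of entries 1-4
score_entry_5 = field_weight(1.0, 13, 43556, 0.5)           # fieldNorm of entry 5

print(f"{score_entries_1_to_4:.4f}")
print(f"{score_entry_5:.4f}")
```

Under these assumptions the only factor that differs between entries 1 to 4 and entry 5 is fieldNorm (0.625 versus 0.5), which accounts for the lower score of the two-author record.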
    

Similar documents (content)

  1. Su, L.T.; Chen, H.L.: Evaluation of Web search engines by undergraduate students (1999) 0.13
    0.12502424 = sum of:
      0.12502424 = product of:
        0.52093434 = sum of:
          0.017796587 = weight(abstract_txt:performance in 544) [ClassicSimilarity], result of:
            0.017796587 = score(doc=544,freq=2.0), product of:
              0.049625646 = queryWeight, product of:
                1.1490432 = boost
                4.6368976 = idf(docFreq=1146, maxDocs=43556)
                0.00931413 = queryNorm
              0.35861674 = fieldWeight in 544, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6368976 = idf(docFreq=1146, maxDocs=43556)
                0.0546875 = fieldNorm(doc=544)
          0.01556481 = weight(abstract_txt:factors in 544) [ClassicSimilarity], result of:
            0.01556481 = score(doc=544,freq=1.0), product of:
              0.05718132 = queryWeight, product of:
                1.2334182 = boost
                4.9773884 = idf(docFreq=815, maxDocs=43556)
                0.00931413 = queryNorm
              0.27220094 = fieldWeight in 544, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.9773884 = idf(docFreq=815, maxDocs=43556)
                0.0546875 = fieldNorm(doc=544)
          0.02005838 = weight(abstract_txt:relation in 544) [ClassicSimilarity], result of:
            0.02005838 = score(doc=544,freq=1.0), product of:
              0.06771563 = queryWeight, product of:
                1.3422325 = boost
                5.4165015 = idf(docFreq=525, maxDocs=43556)
                0.00931413 = queryNorm
              0.29621494 = fieldWeight in 544, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4165015 = idf(docFreq=525, maxDocs=43556)
                0.0546875 = fieldNorm(doc=544)
          0.041283224 = weight(abstract_txt:topic in 544) [ClassicSimilarity], result of:
            0.041283224 = score(doc=544,freq=1.0), product of:
              0.14870334 = queryWeight, product of:
                3.1449492 = boost
                5.0765047 = idf(docFreq=738, maxDocs=43556)
                0.00931413 = queryNorm
              0.27762136 = fieldWeight in 544, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.0765047 = idf(docFreq=738, maxDocs=43556)
                0.0546875 = fieldNorm(doc=544)
          0.2865106 = weight(abstract_txt:rating in 544) [ClassicSimilarity], result of:
            0.2865106 = score(doc=544,freq=2.0), product of:
              0.48039463 = queryWeight, product of:
                6.6883097 = boost
                7.7115107 = idf(docFreq=52, maxDocs=43556)
                0.00931413 = queryNorm
              0.5964067 = fieldWeight in 544, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.7115107 = idf(docFreq=52, maxDocs=43556)
                0.0546875 = fieldNorm(doc=544)
          0.13972077 = weight(abstract_txt:aspect in 544) [ClassicSimilarity], result of:
            0.13972077 = score(doc=544,freq=1.0), product of:
              0.4077586 = queryWeight, product of:
                6.9870057 = boost
                6.2657022 = idf(docFreq=224, maxDocs=43556)
                0.00931413 = queryNorm
              0.3426556 = fieldWeight in 544, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2657022 = idf(docFreq=224, maxDocs=43556)
                0.0546875 = fieldNorm(doc=544)
        0.24 = coord(6/25)
    
  2. Su, Z.; Li, D.; Li, H.; Luo, X.: Boosting attribute recognition with latent topics by matrix factorization (2017) 0.12
    0.11766979 = sum of:
      0.11766979 = product of:
        0.7354362 = sum of:
          0.029995931 = weight(abstract_txt:scale in 691) [ClassicSimilarity], result of:
            0.029995931 = score(doc=691,freq=1.0), product of:
              0.0698123 = queryWeight, product of:
                1.3628538 = boost
                5.4997177 = idf(docFreq=483, maxDocs=43556)
                0.00931413 = queryNorm
              0.42966545 = fieldWeight in 691, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4997177 = idf(docFreq=483, maxDocs=43556)
                0.078125 = fieldNorm(doc=691)
          0.14511634 = weight(abstract_txt:matrix in 691) [ClassicSimilarity], result of:
            0.14511634 = score(doc=691,freq=1.0), product of:
              0.27102825 = queryWeight, product of:
                4.24581 = boost
                6.853489 = idf(docFreq=124, maxDocs=43556)
                0.00931413 = queryNorm
              0.5354288 = fieldWeight in 691, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.853489 = idf(docFreq=124, maxDocs=43556)
                0.078125 = fieldNorm(doc=691)
          0.36072287 = weight(abstract_txt:factorization in 691) [ClassicSimilarity], result of:
            0.36072287 = score(doc=691,freq=1.0), product of:
              0.49733934 = queryWeight, product of:
                5.751481 = boost
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.00931413 = queryNorm
              0.7253053 = fieldWeight in 691, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.078125 = fieldNorm(doc=691)
          0.19960108 = weight(abstract_txt:aspect in 691) [ClassicSimilarity], result of:
            0.19960108 = score(doc=691,freq=1.0), product of:
              0.4077586 = queryWeight, product of:
                6.9870057 = boost
                6.2657022 = idf(docFreq=224, maxDocs=43556)
                0.00931413 = queryNorm
              0.48950797 = fieldWeight in 691, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2657022 = idf(docFreq=224, maxDocs=43556)
                0.078125 = fieldNorm(doc=691)
        0.16 = coord(4/25)
    
  3. Greenstein-Messica, A.; Rokach, L.; Shabtai, A.: Personal-discount sensitivity prediction for mobile coupon conversion optimization (2017) 0.10
    0.102227524 = sum of:
      0.102227524 = product of:
        0.63892204 = sum of:
          0.15478867 = weight(abstract_txt:prediction in 37) [ClassicSimilarity], result of:
            0.15478867 = score(doc=37,freq=2.0), product of:
              0.24191149 = queryWeight, product of:
                3.587786 = boost
                7.2391515 = idf(docFreq=84, maxDocs=43556)
                0.00931413 = queryNorm
              0.63985664 = fieldWeight in 37, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2391515 = idf(docFreq=84, maxDocs=43556)
                0.0625 = fieldNorm(doc=37)
          0.07946199 = weight(abstract_txt:modeling in 37) [ClassicSimilarity], result of:
            0.07946199 = score(doc=37,freq=1.0), product of:
              0.21049899 = queryWeight, product of:
                3.7417803 = boost
                6.0398955 = idf(docFreq=281, maxDocs=43556)
                0.00931413 = queryNorm
              0.37749347 = fieldWeight in 37, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0398955 = idf(docFreq=281, maxDocs=43556)
                0.0625 = fieldNorm(doc=37)
          0.11609307 = weight(abstract_txt:matrix in 37) [ClassicSimilarity], result of:
            0.11609307 = score(doc=37,freq=1.0), product of:
              0.27102825 = queryWeight, product of:
                4.24581 = boost
                6.853489 = idf(docFreq=124, maxDocs=43556)
                0.00931413 = queryNorm
              0.42834306 = fieldWeight in 37, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.853489 = idf(docFreq=124, maxDocs=43556)
                0.0625 = fieldNorm(doc=37)
          0.2885783 = weight(abstract_txt:factorization in 37) [ClassicSimilarity], result of:
            0.2885783 = score(doc=37,freq=1.0), product of:
              0.49733934 = queryWeight, product of:
                5.751481 = boost
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.00931413 = queryNorm
              0.58024424 = fieldWeight in 37, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.0625 = fieldNorm(doc=37)
        0.16 = coord(4/25)
    
  4. Greenstein-Messica, A.; Rokach, L.; Shabtai, A.: Personal-discount sensitivity prediction for mobile coupon conversion optimization (2017) 0.10
    0.102227524 = sum of:
      0.102227524 = product of:
        0.63892204 = sum of:
          0.15478867 = weight(abstract_txt:prediction in 47) [ClassicSimilarity], result of:
            0.15478867 = score(doc=47,freq=2.0), product of:
              0.24191149 = queryWeight, product of:
                3.587786 = boost
                7.2391515 = idf(docFreq=84, maxDocs=43556)
                0.00931413 = queryNorm
              0.63985664 = fieldWeight in 47, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.2391515 = idf(docFreq=84, maxDocs=43556)
                0.0625 = fieldNorm(doc=47)
          0.07946199 = weight(abstract_txt:modeling in 47) [ClassicSimilarity], result of:
            0.07946199 = score(doc=47,freq=1.0), product of:
              0.21049899 = queryWeight, product of:
                3.7417803 = boost
                6.0398955 = idf(docFreq=281, maxDocs=43556)
                0.00931413 = queryNorm
              0.37749347 = fieldWeight in 47, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.0398955 = idf(docFreq=281, maxDocs=43556)
                0.0625 = fieldNorm(doc=47)
          0.11609307 = weight(abstract_txt:matrix in 47) [ClassicSimilarity], result of:
            0.11609307 = score(doc=47,freq=1.0), product of:
              0.27102825 = queryWeight, product of:
                4.24581 = boost
                6.853489 = idf(docFreq=124, maxDocs=43556)
                0.00931413 = queryNorm
              0.42834306 = fieldWeight in 47, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.853489 = idf(docFreq=124, maxDocs=43556)
                0.0625 = fieldNorm(doc=47)
          0.2885783 = weight(abstract_txt:factorization in 47) [ClassicSimilarity], result of:
            0.2885783 = score(doc=47,freq=1.0), product of:
              0.49733934 = queryWeight, product of:
                5.751481 = boost
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.00931413 = queryNorm
              0.58024424 = fieldWeight in 47, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.0625 = fieldNorm(doc=47)
        0.16 = coord(4/25)
    
  5. Ferreira, R.S.; Graça Pimentel, M. de; Cristo, M.: A wikification prediction model based on the combination of latent, dyadic, and monadic features (2018) 0.08
    0.08456085 = sum of:
      0.08456085 = product of:
        0.5285053 = sum of:
          0.014381815 = weight(abstract_txt:performance in 405) [ClassicSimilarity], result of:
            0.014381815 = score(doc=405,freq=1.0), product of:
              0.049625646 = queryWeight, product of:
                1.1490432 = boost
                4.6368976 = idf(docFreq=1146, maxDocs=43556)
                0.00931413 = queryNorm
              0.2898061 = fieldWeight in 405, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6368976 = idf(docFreq=1146, maxDocs=43556)
                0.0625 = fieldNorm(doc=405)
          0.10945212 = weight(abstract_txt:prediction in 405) [ClassicSimilarity], result of:
            0.10945212 = score(doc=405,freq=1.0), product of:
              0.24191149 = queryWeight, product of:
                3.587786 = boost
                7.2391515 = idf(docFreq=84, maxDocs=43556)
                0.00931413 = queryNorm
              0.45244697 = fieldWeight in 405, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.2391515 = idf(docFreq=84, maxDocs=43556)
                0.0625 = fieldNorm(doc=405)
          0.11609307 = weight(abstract_txt:matrix in 405) [ClassicSimilarity], result of:
            0.11609307 = score(doc=405,freq=1.0), product of:
              0.27102825 = queryWeight, product of:
                4.24581 = boost
                6.853489 = idf(docFreq=124, maxDocs=43556)
                0.00931413 = queryNorm
              0.42834306 = fieldWeight in 405, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.853489 = idf(docFreq=124, maxDocs=43556)
                0.0625 = fieldNorm(doc=405)
          0.2885783 = weight(abstract_txt:factorization in 405) [ClassicSimilarity], result of:
            0.2885783 = score(doc=405,freq=1.0), product of:
              0.49733934 = queryWeight, product of:
                5.751481 = boost
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.00931413 = queryNorm
              0.58024424 = fieldWeight in 405, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.283908 = idf(docFreq=10, maxDocs=43556)
                0.0625 = fieldNorm(doc=405)
        0.16 = coord(4/25)
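
The content scores combine several matched abstract terms. As a rough sketch, assuming the ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), the score of entry 2 (doc=691) can be recomputed from the boost, docFreq, queryNorm, and fieldNorm values shown in its listing. The helper names are illustrative and not part of any library.

```python
import math

MAX_DOCS = 43556          # maxDocs from the listing
QUERY_NORM = 0.00931413   # queryNorm from the listing
FIELD_NORM = 0.078125     # fieldNorm(doc=691) from the listing
QUERY_TERMS = 25          # denominator of coord(4/25)

def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(boost: float, doc_freq: int, freq: float, field_norm: float) -> float:
    """Per-term score = queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

# (term, boost, docFreq, freq) for the four abstract terms matched in doc 691.
matched_terms = [
    ("scale", 1.3628538, 483, 1.0),
    ("matrix", 4.24581, 124, 1.0),
    ("factorization", 5.751481, 10, 1.0),
    ("aspect", 6.9870057, 224, 1.0),
]

raw_sum = sum(term_score(b, df, f, FIELD_NORM) for _, b, df, f in matched_terms)
coord = len(matched_terms) / QUERY_TERMS   # fraction of query terms matched
final_score = coord * raw_sum

print(f"raw sum = {raw_sum:.4f}, coord = {coord:.2f}, score = {final_score:.4f}")
```

The coord factor explains the ranking in the list: entry 2 has a larger raw sum than entry 1 (0.7354362 versus 0.52093434) but matches only 4 of the 25 query terms instead of 6, so its final score is lower.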