Document (#42005)

Author
Xu, L.
Qiu, J.
Title
Unsupervised multi-class sentiment classification approach
Source
Knowledge organization. 46(2019) no.1, p.15-32
Year
2019
Abstract
Real-time and accurate multi-class sentiment classification serves as a tool to gauge public user experiences and provides a decision-making basis for timely analysis. In the field of sentiment classification, there is an urgent need for an accurate and efficient multi-class sentiment classification method. To overcome the drawbacks of existing methods, we propose a novel, unsupervised multi-class sentiment classification method called Gaussian mixture model of multi-class sentiment classification (GMSC). Based on the Gaussian mixture model (GMM), the GMSC consists of the following essential phases: first, combining a dictionary with microblog texts to calculate and construct the sentiment feature matrix for each sample; second, introducing a dimension reduction method to avoid the influence of a sparse feature matrix on the results; third, modeling the multi-class sentiment classification procedure based on GMM; and lastly, computing the probability distribution of the different sentiment categories by using GMM to partition sentiments in microblogs into distinct components and classifying them via Gaussian process regression. The results indicate that the GMSC approach is more accurate and requires less manual tagging time than semi-supervised and unsupervised sentiment classification methods under the same parameters.
Content
DOI:10.5771/0943-7444-2019-1-15.
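
The phases named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the lexicon, the example posts, the function names, and the one-dimensional polarity feature are all hypothetical and chosen only to keep the example self-contained. Because the feature is already one-dimensional, the dimension reduction phase is omitted, and the final Gaussian process regression step is replaced by a plain maximum-posterior assignment.

```python
import math

# Hypothetical sentiment lexicon (word -> polarity weight); not from the paper.
LEXICON = {"great": 1.0, "love": 1.0, "good": 0.5,
           "bad": -0.5, "awful": -1.0, "hate": -1.0}

def polarity_feature(text):
    """Mean lexicon weight of the words of `text` found in LEXICON (0.0 if none)."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component given the value x."""
    joint = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]

def fit_gmm_1d(xs, k, iters=50, min_var=0.01):
    """Fit a k-component 1-D Gaussian mixture with EM (deterministic quantile init)."""
    n = len(xs)
    ordered = sorted(xs)
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [0.1] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibilities of each component for each sample.
        resp = [posteriors(x, weights, means, variances) for x in xs]
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, min_var)  # floor keeps the variance away from zero
    return weights, means, variances

# Hypothetical microblog-style posts, invented for illustration.
posts = [
    "love this phone great camera",
    "great battery good screen",
    "good value love it",
    "awful service hate waiting",
    "bad update awful bugs",
    "hate the bad design",
    "the parcel arrived today",
    "meeting moved to friday",
    "good and bad points",
]
LABELS = ["negative", "neutral", "positive"]

features = [polarity_feature(p) for p in posts]
weights, means, variances = fit_gmm_1d(features, k=3)

# Name the components by ascending mean polarity, then assign each post
# to the component with the highest posterior probability.
order = sorted(range(3), key=lambda j: means[j])
rank = {comp: i for i, comp in enumerate(order)}
predicted = []
for x in features:
    post = posteriors(x, weights, means, variances)
    best = max(range(3), key=lambda j: post[j])
    predicted.append(LABELS[rank[best]])

for text, label in zip(posts, predicted):
    print(f"{label:8s} {text}")
```

No labeled training data is used: the component names are attached after fitting, from the ordering of the component means alone, which is the sense in which such a pipeline is unsupervised.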

Similar documents (content)

  1. Chen, Z.; Huang, Y.; Tian, J.; Liu, X.; Fu, K.; Huang, T.: Joint model for subsentence-level sentiment analysis with Markov logic (2015) 0.23
    0.2304676 = sum of:
      0.2304676 = product of:
        1.152338 = sum of:
          0.00992517 = weight(abstract_txt:model in 3675) [ClassicSimilarity], result of:
            0.00992517 = score(doc=3675,freq=1.0), product of:
              0.039624535 = queryWeight, product of:
                1.0573673 = boost
                4.0076866 = idf(docFreq=2136, maxDocs=43254)
                0.009350709 = queryNorm
              0.2504804 = fieldWeight in 3675, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0076866 = idf(docFreq=2136, maxDocs=43254)
                0.0625 = fieldNorm(doc=3675)
          0.07860557 = weight(abstract_txt:sentiments in 3675) [ClassicSimilarity], result of:
            0.07860557 = score(doc=3675,freq=2.0), product of:
              0.09917931 = queryWeight, product of:
                1.1828763 = boost
                8.966795 = idf(docFreq=14, maxDocs=43254)
                0.009350709 = queryNorm
              0.79256016 = fieldWeight in 3675, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.966795 = idf(docFreq=14, maxDocs=43254)
                0.0625 = fieldNorm(doc=3675)
          0.032027405 = weight(abstract_txt:feature in 3675) [ClassicSimilarity], result of:
            0.032027405 = score(doc=3675,freq=1.0), product of:
              0.08652735 = queryWeight, product of:
                1.5625017 = boost
                5.922272 = idf(docFreq=314, maxDocs=43254)
                0.009350709 = queryNorm
              0.370142 = fieldWeight in 3675, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.922272 = idf(docFreq=314, maxDocs=43254)
                0.0625 = fieldNorm(doc=3675)
          0.055735253 = weight(abstract_txt:classification in 3675) [ClassicSimilarity], result of:
            0.055735253 = score(doc=3675,freq=2.0), product of:
              0.1577256 = queryWeight, product of:
                4.219149 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.009350709 = queryNorm
              0.35336846 = fieldWeight in 3675, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.0625 = fieldNorm(doc=3675)
          0.9760446 = weight(abstract_txt:sentiment in 3675) [ClassicSimilarity], result of:
            0.9760446 = score(doc=3675,freq=8.0), product of:
              0.72179186 = queryWeight, product of:
                10.091014 = boost
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.009350709 = queryNorm
              1.3522521 = fieldWeight in 3675, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.0625 = fieldNorm(doc=3675)
        0.2 = coord(5/25)
    
  2. Melo, P.F.; Dalip, D.H.; Junior, M.M.; Gonçalves, M.A.; Benevenuto, F.: 10SENT : a stable sentiment analysis method based on the combination of off-the-shelf approaches (2019) 0.21
    0.21403961 = sum of:
      0.21403961 = product of:
        0.8918317 = sum of:
          0.033583064 = weight(abstract_txt:supervised in 991) [ClassicSimilarity], result of:
            0.033583064 = score(doc=991,freq=1.0), product of:
              0.07088305 = queryWeight, product of:
                7.5805006 = idf(docFreq=59, maxDocs=43254)
                0.009350709 = queryNorm
              0.4737813 = fieldWeight in 991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5805006 = idf(docFreq=59, maxDocs=43254)
                0.0625 = fieldNorm(doc=991)
          0.019302864 = weight(abstract_txt:methods in 991) [ClassicSimilarity], result of:
            0.019302864 = score(doc=991,freq=3.0), product of:
              0.04280682 = queryWeight, product of:
                1.0990065 = boost
                4.1655097 = idf(docFreq=1824, maxDocs=43254)
                0.009350709 = queryNorm
              0.45092964 = fieldWeight in 991, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1655097 = idf(docFreq=1824, maxDocs=43254)
                0.0625 = fieldNorm(doc=991)
          0.021251243 = weight(abstract_txt:method in 991) [ClassicSimilarity], result of:
            0.021251243 = score(doc=991,freq=1.0), product of:
              0.07535154 = queryWeight, product of:
                1.7858111 = boost
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.009350709 = queryNorm
              0.28202796 = fieldWeight in 991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.0625 = fieldNorm(doc=991)
          0.18058096 = weight(abstract_txt:unsupervised in 991) [ClassicSimilarity], result of:
            0.18058096 = score(doc=991,freq=3.0), product of:
              0.21755889 = queryWeight, product of:
                3.034435 = boost
                7.667512 = idf(docFreq=54, maxDocs=43254)
                0.009350709 = queryNorm
              0.8300325 = fieldWeight in 991, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.667512 = idf(docFreq=54, maxDocs=43254)
                0.0625 = fieldNorm(doc=991)
          0.039410777 = weight(abstract_txt:classification in 991) [ClassicSimilarity], result of:
            0.039410777 = score(doc=991,freq=1.0), product of:
              0.1577256 = queryWeight, product of:
                4.219149 = boost
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.009350709 = queryNorm
              0.24986924 = fieldWeight in 991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9979079 = idf(docFreq=2157, maxDocs=43254)
                0.0625 = fieldNorm(doc=991)
          0.5977028 = weight(abstract_txt:sentiment in 991) [ClassicSimilarity], result of:
            0.5977028 = score(doc=991,freq=3.0), product of:
              0.72179186 = queryWeight, product of:
                10.091014 = boost
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.009350709 = queryNorm
              0.8280819 = fieldWeight in 991, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.0625 = fieldNorm(doc=991)
        0.24 = coord(6/25)
    
  3. Abdi, A.; Shamsuddin, S.M.; Aliguliyev, R.M.: QMOS: Query-based multi-documents opinion-oriented summarization (2018) 0.16
    0.15861294 = sum of:
      0.15861294 = product of:
        0.9913309 = sum of:
          0.013790633 = weight(abstract_txt:methods in 90) [ClassicSimilarity], result of:
            0.013790633 = score(doc=90,freq=2.0), product of:
              0.04280682 = queryWeight, product of:
                1.0990065 = boost
                4.1655097 = idf(docFreq=1824, maxDocs=43254)
                0.009350709 = queryNorm
              0.3221597 = fieldWeight in 90, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1655097 = idf(docFreq=1824, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
          0.037189674 = weight(abstract_txt:method in 90) [ClassicSimilarity], result of:
            0.037189674 = score(doc=90,freq=4.0), product of:
              0.07535154 = queryWeight, product of:
                1.7858111 = boost
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.009350709 = queryNorm
              0.49354893 = fieldWeight in 90, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
          0.0863116 = weight(abstract_txt:multi in 90) [ClassicSimilarity], result of:
            0.0863116 = score(doc=90,freq=1.0), product of:
              0.26417196 = queryWeight, product of:
                4.7287655 = boost
                5.9744015 = idf(docFreq=298, maxDocs=43254)
                0.009350709 = queryNorm
              0.32672507 = fieldWeight in 90, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9744015 = idf(docFreq=298, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
          0.854039 = weight(abstract_txt:sentiment in 90) [ClassicSimilarity], result of:
            0.854039 = score(doc=90,freq=8.0), product of:
              0.72179186 = queryWeight, product of:
                10.091014 = boost
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.009350709 = queryNorm
              1.1832206 = fieldWeight in 90, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
        0.16 = coord(4/25)
    
  4. Jansen, B.J.; Zhang, M.; Sobel, K.; Chowdury, A.: Twitter power : tweets as electronic word of mouth (2009) 0.16
    0.15786622 = sum of:
      0.15786622 = product of:
        0.7893311 = sum of:
          0.011144514 = weight(abstract_txt:methods in 158) [ClassicSimilarity], result of:
            0.011144514 = score(doc=158,freq=1.0), product of:
              0.04280682 = queryWeight, product of:
                1.0990065 = boost
                4.1655097 = idf(docFreq=1824, maxDocs=43254)
                0.009350709 = queryNorm
              0.26034436 = fieldWeight in 158, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1655097 = idf(docFreq=1824, maxDocs=43254)
                0.0625 = fieldNorm(doc=158)
          0.07860557 = weight(abstract_txt:sentiments in 158) [ClassicSimilarity], result of:
            0.07860557 = score(doc=158,freq=2.0), product of:
              0.09917931 = queryWeight, product of:
                1.1828763 = boost
                8.966795 = idf(docFreq=14, maxDocs=43254)
                0.009350709 = queryNorm
              0.79256016 = fieldWeight in 158, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.966795 = idf(docFreq=14, maxDocs=43254)
                0.0625 = fieldNorm(doc=158)
          0.084621266 = weight(abstract_txt:microblog in 158) [ClassicSimilarity], result of:
            0.084621266 = score(doc=158,freq=2.0), product of:
              0.10417701 = queryWeight, product of:
                1.2123129 = boost
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.009350709 = queryNorm
              0.81228346 = fieldWeight in 158, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.189939 = idf(docFreq=11, maxDocs=43254)
                0.0625 = fieldNorm(doc=158)
          0.1269374 = weight(abstract_txt:microblogs in 158) [ClassicSimilarity], result of:
            0.1269374 = score(doc=158,freq=4.0), product of:
              0.10835159 = queryWeight, product of:
                1.2363642 = boost
                9.37226 = idf(docFreq=9, maxDocs=43254)
                0.009350709 = queryNorm
              1.1715325 = fieldWeight in 158, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.37226 = idf(docFreq=9, maxDocs=43254)
                0.0625 = fieldNorm(doc=158)
          0.4880223 = weight(abstract_txt:sentiment in 158) [ClassicSimilarity], result of:
            0.4880223 = score(doc=158,freq=2.0), product of:
              0.72179186 = queryWeight, product of:
                10.091014 = boost
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.009350709 = queryNorm
              0.67612606 = fieldWeight in 158, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.0625 = fieldNorm(doc=158)
        0.2 = coord(5/25)
    
  5. Zhang, C.; Zeng, D.; Li, J.; Wang, F.-Y.; Zuo, W.: Sentiment analysis of Chinese documents : from sentence to document level (2009) 0.16
    0.15562922 = sum of:
      0.15562922 = product of:
        0.9726826 = sum of:
          0.013930643 = weight(abstract_txt:methods in 297) [ClassicSimilarity], result of:
            0.013930643 = score(doc=297,freq=1.0), product of:
              0.04280682 = queryWeight, product of:
                1.0990065 = boost
                4.1655097 = idf(docFreq=1824, maxDocs=43254)
                0.009350709 = queryNorm
              0.32543045 = fieldWeight in 297, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1655097 = idf(docFreq=1824, maxDocs=43254)
                0.078125 = fieldNorm(doc=297)
          0.06947817 = weight(abstract_txt:sentiments in 297) [ClassicSimilarity], result of:
            0.06947817 = score(doc=297,freq=1.0), product of:
              0.09917931 = queryWeight, product of:
                1.1828763 = boost
                8.966795 = idf(docFreq=14, maxDocs=43254)
                0.009350709 = queryNorm
              0.7005309 = fieldWeight in 297, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.966795 = idf(docFreq=14, maxDocs=43254)
                0.078125 = fieldNorm(doc=297)
          0.026564052 = weight(abstract_txt:method in 297) [ClassicSimilarity], result of:
            0.026564052 = score(doc=297,freq=1.0), product of:
              0.07535154 = queryWeight, product of:
                1.7858111 = boost
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.009350709 = queryNorm
              0.35253495 = fieldWeight in 297, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5124474 = idf(docFreq=1289, maxDocs=43254)
                0.078125 = fieldNorm(doc=297)
          0.8627097 = weight(abstract_txt:sentiment in 297) [ClassicSimilarity], result of:
            0.8627097 = score(doc=297,freq=4.0), product of:
              0.72179186 = queryWeight, product of:
                10.091014 = boost
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.009350709 = queryNorm
              1.1952333 = fieldWeight in 297, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.078125 = fieldNorm(doc=297)
        0.16 = coord(4/25)
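
The score trees above follow Lucene's ClassicSimilarity (TF-IDF) arithmetic: each term contributes queryWeight × fieldWeight, where queryWeight = boost × idf × queryNorm and fieldWeight = tf × idf × fieldNorm, with tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the sum over matching terms is then scaled by the coord factor. The sketch below recomputes the score of the first similar document (doc 3675) from the values printed in its tree. The function and variable names are illustrative only and are not Lucene API calls.

```python
import math

def idf(doc_freq, max_docs):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

# Constants copied from the explain tree of doc 3675.
MAX_DOCS = 43254
QUERY_NORM = 0.009350709
FIELD_NORM = 0.0625
QUERY_TERMS = 25  # denominator of coord(5/25)

# (term, termFreq, docFreq, boost) for the five matching terms.
terms = [
    ("model", 1.0, 2136, 1.0573673),
    ("sentiments", 2.0, 14, 1.1828763),
    ("feature", 1.0, 314, 1.5625017),
    ("classification", 2.0, 2157, 4.219149),
    ("sentiment", 8.0, 55, 10.091014),
]

matched = sum(
    term_score(freq, doc_freq, MAX_DOCS, boost, QUERY_NORM, FIELD_NORM)
    for _, freq, doc_freq, boost in terms
)
coord = len(terms) / QUERY_TERMS
score = coord * matched

print(f"sum of term scores: {matched:.6f}")
print(f"coord: {coord}")
print(f"final score: {score:.6f}")
```

The result should agree with the listed 0.2304676 to about four decimal places; small differences are expected because Lucene computes these values in single precision.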