Document (#42005)

Author
Xu, L.
Qiu, J.
Title
Unsupervised multi-class sentiment classification approach
Source
Knowledge organization. 46(2019) no.1, S.15-32
Year
2019
Abstract
Real-time and accurate multi-class sentiment classification serves as a tool to gauge public user experiences and provide a decision-making basis for timely analysis. In the field of sentiment classification, there is an urgent need for an accurate and efficient multi-class sentiment classification method. With the aim to overcome the drawbacks of the existing methods, we propose a novel, unsupervised multi-class sentiment classification method called Gaussian mixture model of multi-class sentiment classification (GMSC). Based on the Gaussian mixture model (GMM), the GMSC consists of the following essential phases: first, combining a dictionary with microblog texts to calculate and construct the feature matrix of sentiment for each sample; second, introducing a dimension reduction method to avoid the influence of a sparse feature matrix on the results; third, modeling the multi-class sentiment classification procedure based on GMM; and lastly, computing the probability distribution of different categories of sentiment by using GMM to partition sentiments in microblogs into distinct components and classify them via a Gaussian process regression. The results indicate the GMSC approach's accuracy is better and manual tagging time is reduced when compared to semi-supervised and unsupervised sentiment classification methods within the same parameters.
Content
DOI:10.5771/0943-7444-2019-1-15.
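
The GMM partitioning step named in the abstract can be illustrated with a small expectation-maximization loop. This is only a minimal sketch, not the authors' GMSC implementation: it assumes each microblog has already been reduced to a single hypothetical sentiment score, and the data, the seed means, and the function names are all invented for illustration.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(xs, init_means, iters=50):
    """Fit a 1-D Gaussian mixture with EM; returns weights, means, variances, responsibilities."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: posterior probability of each component for each sample
        resp = []
        for x in xs:
            p = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate the parameters from the soft assignments
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances, resp

# Hypothetical 1-D sentiment scores (negative, neutral, positive microblogs)
scores = [-2.1, -1.9, -2.0, 0.1, -0.1, 0.0, 2.0, 1.9, 2.2]
# Seed means are an assumption: rough guesses for the three polarity classes
weights, means, variances, resp = fit_gmm_1d(scores, init_means=[-2.0, 0.0, 2.0])
# Assign each sample to its most probable component
labels = [max(range(len(means)), key=lambda j: r[j]) for r in resp]
```

Each row of `resp` is the probability distribution over sentiment components for one sample, which is the quantity the abstract describes computing before the final classification step.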

Similar documents (content)

  1. Chen, Z.; Huang, Y.; Tian, J.; Liu, X.; Fu, K.; Huang, T.: Joint model for subsentence-level sentiment analysis with Markov logic (2015) 0.23
    0.23193577 = sum of:
      0.23193577 = product of:
        1.1596788 = sum of:
          0.009998497 = weight(abstract_txt:model in 4211) [ClassicSimilarity], result of:
            0.009998497 = score(doc=4211,freq=1.0), product of:
              0.039772388 = queryWeight, product of:
                1.0557406 = boost
                4.022287 = idf(docFreq=2080, maxDocs=42740)
                0.0093659405 = queryNorm
              0.25139293 = fieldWeight in 4211, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.022287 = idf(docFreq=2080, maxDocs=42740)
                0.0625 = fieldNorm(doc=4211)
          0.079831414 = weight(abstract_txt:sentiments in 4211) [ClassicSimilarity], result of:
            0.079831414 = score(doc=4211,freq=2.0), product of:
              0.10008932 = queryWeight, product of:
                1.1842551 = boost
                9.023833 = idf(docFreq=13, maxDocs=42740)
                0.0093659405 = queryNorm
              0.7976017 = fieldWeight in 4211, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.023833 = idf(docFreq=13, maxDocs=42740)
                0.0625 = fieldNorm(doc=4211)
          0.032296855 = weight(abstract_txt:feature in 4211) [ClassicSimilarity], result of:
            0.032296855 = score(doc=4211,freq=1.0), product of:
              0.086909115 = queryWeight, product of:
                1.5606269 = boost
                5.945863 = idf(docFreq=303, maxDocs=42740)
                0.0093659405 = queryNorm
              0.37161642 = fieldWeight in 4211, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.945863 = idf(docFreq=303, maxDocs=42740)
                0.0625 = fieldNorm(doc=4211)
          0.055623095 = weight(abstract_txt:classification in 4211) [ClassicSimilarity], result of:
            0.055623095 = score(doc=4211,freq=2.0), product of:
              0.15732773 = queryWeight, product of:
                4.199514 = boost
                3.9999528 = idf(docFreq=2127, maxDocs=42740)
                0.0093659405 = queryNorm
              0.3535492 = fieldWeight in 4211, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9999528 = idf(docFreq=2127, maxDocs=42740)
                0.0625 = fieldNorm(doc=4211)
          0.98192894 = weight(abstract_txt:sentiment in 4211) [ClassicSimilarity], result of:
            0.98192894 = score(doc=4211,freq=8.0), product of:
              0.7238333 = queryWeight, product of:
                10.070955 = boost
                7.6739063 = idf(docFreq=53, maxDocs=42740)
                0.0093659405 = queryNorm
              1.3565677 = fieldWeight in 4211, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                7.6739063 = idf(docFreq=53, maxDocs=42740)
                0.0625 = fieldNorm(doc=4211)
        0.2 = coord(5/25)
    
  2. Melo, P.F.; Dalip, D.H.; Junior, M.M.; Gonçalves, M.A.; Benevenuto, F.: 10SENT : a stable sentiment analysis method based on the combination of off-the-shelf approaches (2019) 0.21
    0.21464682 = sum of:
      0.21464682 = product of:
        0.89436173 = sum of:
          0.033987798 = weight(abstract_txt:supervised in 991) [ClassicSimilarity], result of:
            0.033987798 = score(doc=991,freq=1.0), product of:
              0.07136696 = queryWeight, product of:
                7.619839 = idf(docFreq=56, maxDocs=42740)
                0.0093659405 = queryNorm
              0.47623995 = fieldWeight in 991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.619839 = idf(docFreq=56, maxDocs=42740)
                0.0625 = fieldNorm(doc=991)
          0.019329553 = weight(abstract_txt:methods in 991) [ClassicSimilarity], result of:
            0.019329553 = score(doc=991,freq=3.0), product of:
              0.04279562 = queryWeight, product of:
                1.0951309 = boost
                4.172361 = idf(docFreq=1790, maxDocs=42740)
                0.0093659405 = queryNorm
              0.4516713 = fieldWeight in 991, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.172361 = idf(docFreq=1790, maxDocs=42740)
                0.0625 = fieldNorm(doc=991)
          0.021305729 = weight(abstract_txt:method in 991) [ClassicSimilarity], result of:
            0.021305729 = score(doc=991,freq=1.0), product of:
              0.07539106 = queryWeight, product of:
                1.780213 = boost
                4.5216455 = idf(docFreq=1262, maxDocs=42740)
                0.0093659405 = queryNorm
              0.28260285 = fieldWeight in 991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5216455 = idf(docFreq=1262, maxDocs=42740)
                0.0625 = fieldNorm(doc=991)
          0.17910095 = weight(abstract_txt:unsupervised in 991) [ClassicSimilarity], result of:
            0.17910095 = score(doc=991,freq=3.0), product of:
              0.21611278 = queryWeight, product of:
                3.0140624 = boost
                7.655557 = idf(docFreq=54, maxDocs=42740)
                0.0093659405 = queryNorm
              0.82873833 = fieldWeight in 991, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.655557 = idf(docFreq=54, maxDocs=42740)
                0.0625 = fieldNorm(doc=991)
          0.039331466 = weight(abstract_txt:classification in 991) [ClassicSimilarity], result of:
            0.039331466 = score(doc=991,freq=1.0), product of:
              0.15732773 = queryWeight, product of:
                4.199514 = boost
                3.9999528 = idf(docFreq=2127, maxDocs=42740)
                0.0093659405 = queryNorm
              0.24999705 = fieldWeight in 991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9999528 = idf(docFreq=2127, maxDocs=42740)
                0.0625 = fieldNorm(doc=991)
          0.60130626 = weight(abstract_txt:sentiment in 991) [ClassicSimilarity], result of:
            0.60130626 = score(doc=991,freq=3.0), product of:
              0.7238333 = queryWeight, product of:
                10.070955 = boost
                7.6739063 = idf(docFreq=53, maxDocs=42740)
                0.0093659405 = queryNorm
              0.8307247 = fieldWeight in 991, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.6739063 = idf(docFreq=53, maxDocs=42740)
                0.0625 = fieldNorm(doc=991)
        0.24 = coord(6/25)
    
  3. Jansen, B.J.; Zhang, M.; Sobel, K.; Chowdury, A.: Twitter power : tweets as electronic word of mouth (2009) 0.16
    0.15941177 = sum of:
      0.15941177 = product of:
        0.7970588 = sum of:
          0.011159924 = weight(abstract_txt:methods in 158) [ClassicSimilarity], result of:
            0.011159924 = score(doc=158,freq=1.0), product of:
              0.04279562 = queryWeight, product of:
                1.0951309 = boost
                4.172361 = idf(docFreq=1790, maxDocs=42740)
                0.0093659405 = queryNorm
              0.26077256 = fieldWeight in 158, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.172361 = idf(docFreq=1790, maxDocs=42740)
                0.0625 = fieldNorm(doc=158)
          0.079831414 = weight(abstract_txt:sentiments in 158) [ClassicSimilarity], result of:
            0.079831414 = score(doc=158,freq=2.0), product of:
              0.10008932 = queryWeight, product of:
                1.1842551 = boost
                9.023833 = idf(docFreq=13, maxDocs=42740)
                0.0093659405 = queryNorm
              0.7976017 = fieldWeight in 158, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.023833 = idf(docFreq=13, maxDocs=42740)
                0.0625 = fieldNorm(doc=158)
          0.1260044 = weight(abstract_txt:microblogs in 158) [ClassicSimilarity], result of:
            0.1260044 = score(doc=158,freq=4.0), product of:
              0.10769255 = queryWeight, product of:
                1.2284125 = boost
                9.360306 = idf(docFreq=9, maxDocs=42740)
                0.0093659405 = queryNorm
              1.1700382 = fieldWeight in 158, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.360306 = idf(docFreq=9, maxDocs=42740)
                0.0625 = fieldNorm(doc=158)
          0.089098565 = weight(abstract_txt:microblog in 158) [ClassicSimilarity], result of:
            0.089098565 = score(doc=158,freq=2.0), product of:
              0.10769255 = queryWeight, product of:
                1.2284125 = boost
                9.360306 = idf(docFreq=9, maxDocs=42740)
                0.0093659405 = queryNorm
              0.827342 = fieldWeight in 158, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.360306 = idf(docFreq=9, maxDocs=42740)
                0.0625 = fieldNorm(doc=158)
          0.49096447 = weight(abstract_txt:sentiment in 158) [ClassicSimilarity], result of:
            0.49096447 = score(doc=158,freq=2.0), product of:
              0.7238333 = queryWeight, product of:
                10.070955 = boost
                7.6739063 = idf(docFreq=53, maxDocs=42740)
                0.0093659405 = queryNorm
              0.6782839 = fieldWeight in 158, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.6739063 = idf(docFreq=53, maxDocs=42740)
                0.0625 = fieldNorm(doc=158)
        0.2 = coord(5/25)
    
  4. Abdi, A.; Shamsuddin, S.M.; Aliguliyev, R.M.: QMOS: Query-based multi-documents opinion-oriented summarization (2018) 0.16
    0.15939324 = sum of:
      0.15939324 = product of:
        0.9962077 = sum of:
          0.013809701 = weight(abstract_txt:methods in 1090) [ClassicSimilarity], result of:
            0.013809701 = score(doc=1090,freq=2.0), product of:
              0.04279562 = queryWeight, product of:
                1.0951309 = boost
                4.172361 = idf(docFreq=1790, maxDocs=42740)
                0.0093659405 = queryNorm
              0.3226896 = fieldWeight in 1090, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.172361 = idf(docFreq=1790, maxDocs=42740)
                0.0546875 = fieldNorm(doc=1090)
          0.037285026 = weight(abstract_txt:method in 1090) [ClassicSimilarity], result of:
            0.037285026 = score(doc=1090,freq=4.0), product of:
              0.07539106 = queryWeight, product of:
                1.780213 = boost
                4.5216455 = idf(docFreq=1262, maxDocs=42740)
                0.0093659405 = queryNorm
              0.494555 = fieldWeight in 1090, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.5216455 = idf(docFreq=1262, maxDocs=42740)
                0.0546875 = fieldNorm(doc=1090)
          0.08592512 = weight(abstract_txt:multi in 1090) [ClassicSimilarity], result of:
            0.08592512 = score(doc=1090,freq=1.0), product of:
              0.26307142 = queryWeight, product of:
                4.7028794 = boost
                5.972531 = idf(docFreq=295, maxDocs=42740)
                0.0093659405 = queryNorm
              0.32662278 = fieldWeight in 1090, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.972531 = idf(docFreq=295, maxDocs=42740)
                0.0546875 = fieldNorm(doc=1090)
          0.85918784 = weight(abstract_txt:sentiment in 1090) [ClassicSimilarity], result of:
            0.85918784 = score(doc=1090,freq=8.0), product of:
              0.7238333 = queryWeight, product of:
                10.070955 = boost
                7.6739063 = idf(docFreq=53, maxDocs=42740)
                0.0093659405 = queryNorm
              1.1869968 = fieldWeight in 1090, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                7.6739063 = idf(docFreq=53, maxDocs=42740)
                0.0546875 = fieldNorm(doc=1090)
        0.16 = coord(4/25)
    
  5. Zhang, C.; Zeng, D.; Li, J.; Wang, F.-Y.; Zuo, W.: Sentiment analysis of Chinese documents : from sentence to document level (2009) 0.16
    0.15664871 = sum of:
      0.15664871 = product of:
        0.97905445 = sum of:
          0.013949905 = weight(abstract_txt:methods in 297) [ClassicSimilarity], result of:
            0.013949905 = score(doc=297,freq=1.0), product of:
              0.04279562 = queryWeight, product of:
                1.0951309 = boost
                4.172361 = idf(docFreq=1790, maxDocs=42740)
                0.0093659405 = queryNorm
              0.3259657 = fieldWeight in 297, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.172361 = idf(docFreq=1790, maxDocs=42740)
                0.078125 = fieldNorm(doc=297)
          0.07056167 = weight(abstract_txt:sentiments in 297) [ClassicSimilarity], result of:
            0.07056167 = score(doc=297,freq=1.0), product of:
              0.10008932 = queryWeight, product of:
                1.1842551 = boost
                9.023833 = idf(docFreq=13, maxDocs=42740)
                0.0093659405 = queryNorm
              0.704987 = fieldWeight in 297, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.023833 = idf(docFreq=13, maxDocs=42740)
                0.078125 = fieldNorm(doc=297)
          0.02663216 = weight(abstract_txt:method in 297) [ClassicSimilarity], result of:
            0.02663216 = score(doc=297,freq=1.0), product of:
              0.07539106 = queryWeight, product of:
                1.780213 = boost
                4.5216455 = idf(docFreq=1262, maxDocs=42740)
                0.0093659405 = queryNorm
              0.35325354 = fieldWeight in 297, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5216455 = idf(docFreq=1262, maxDocs=42740)
                0.078125 = fieldNorm(doc=297)
          0.86791074 = weight(abstract_txt:sentiment in 297) [ClassicSimilarity], result of:
            0.86791074 = score(doc=297,freq=4.0), product of:
              0.7238333 = queryWeight, product of:
                10.070955 = boost
                7.6739063 = idf(docFreq=53, maxDocs=42740)
                0.0093659405 = queryNorm
              1.1990478 = fieldWeight in 297, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.6739063 = idf(docFreq=53, maxDocs=42740)
                0.078125 = fieldNorm(doc=297)
        0.16 = coord(4/25)
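
The score trees above follow Lucene's ClassicSimilarity arithmetic and can be recomputed by hand. The sketch below does so for entry 1 (doc 4211), using only the values printed in its tree; the function and variable names are illustrative. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which match the numbers listed.

```python
import math

def idf(doc_freq, max_docs):
    """Inverse document frequency as used by ClassicSimilarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight

MAX_DOCS = 42740
QUERY_NORM = 0.0093659405
FIELD_NORM = 0.0625  # fieldNorm(doc=4211)

# (term, freq, docFreq, boost) copied from the tree for doc 4211
terms = [
    ("model", 1.0, 2080, 1.0557406),
    ("sentiments", 2.0, 13, 1.1842551),
    ("feature", 1.0, 303, 1.5606269),
    ("classification", 2.0, 2127, 4.199514),
    ("sentiment", 8.0, 53, 10.070955),
]

raw_sum = sum(
    term_score(freq, doc_freq, MAX_DOCS, boost, QUERY_NORM, FIELD_NORM)
    for _, freq, doc_freq, boost in terms
)
coord = 5 / 25  # 5 of the 25 query terms matched
total = raw_sum * coord
print(round(total, 4))
```

The result agrees with the listed score of 0.23193577 up to floating-point rounding. The same function applies to the other entries once their own `fieldNorm` (0.0546875 for doc 1090, 0.078125 for doc 297) and `coord` values are substituted.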