Document (#42004)

Author
Xu, L.
Qiu, J.
Title
Unsupervised multi-class sentiment classification approach
Source
Knowledge organization. 46(2019) no.1, pp.15-32
Year
2019
Abstract
Real-time and accurate multi-class sentiment classification serves as a tool to gauge public user experiences and provides a decision-making basis for timely analysis. In the field of sentiment classification, there is an urgent need for an accurate and efficient multi-class sentiment classification method. To overcome the drawbacks of the existing methods, we propose a novel, unsupervised multi-class sentiment classification method called Gaussian mixture model of multi-class sentiment classification (GMSC). Based on the Gaussian mixture model (GMM), the GMSC consists of the following essential phases: first, combining a dictionary with microblog texts to calculate and construct the feature matrix of sentiment for each sample; second, introducing a dimension reduction method to avoid the influence of a sparse feature matrix on the results; third, modeling the multi-class sentiment classification procedure based on GMM; and lastly, computing the probability distribution of different categories of sentiment by using GMM to partition sentiments in microblogs into distinct components and classify them via a Gaussian process regression. The results indicate that the GMSC approach achieves higher accuracy and requires less manual tagging time than semi-supervised and unsupervised sentiment classification methods under the same parameters.
Content
DOI:10.5771/0943-7444-2019-1-15.

Similar documents (content)

  1. Chen, Z.; Huang, Y.; Tian, J.; Liu, X.; Fu, K.; Huang, T.: Joint model for subsentence-level sentiment analysis with Markov logic (2015) 0.23
    0.2265137 = sum of:
      0.2265137 = product of:
        1.1325685 = sum of:
          0.009940882 = weight(abstract_txt:model in 2210) [ClassicSimilarity], result of:
            0.009940882 = score(doc=2210,freq=1.0), product of:
              0.039900847 = queryWeight, product of:
                1.0682971 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.009369734 = queryNorm
              0.24913962 = fieldWeight in 2210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.07727925 = weight(abstract_txt:sentiments in 2210) [ClassicSimilarity], result of:
            0.07727925 = score(doc=2210,freq=2.0), product of:
              0.09864024 = queryWeight, product of:
                1.1877173 = boost
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.009369734 = queryNorm
              0.7834455 = fieldWeight in 2210, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.03224599 = weight(abstract_txt:feature in 2210) [ClassicSimilarity], result of:
            0.03224599 = score(doc=2210,freq=1.0), product of:
              0.08743446 = queryWeight, product of:
                1.5814023 = boost
                5.9008293 = idf(docFreq=328, maxDocs=44218)
                0.009369734 = queryNorm
              0.36880183 = fieldWeight in 2210, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9008293 = idf(docFreq=328, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.056481685 = weight(abstract_txt:classification in 2210) [ClassicSimilarity], result of:
            0.056481685 = score(doc=2210,freq=2.0), product of:
              0.16007148 = queryWeight, product of:
                4.2794504 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.009369734 = queryNorm
              0.3528529 = fieldWeight in 2210, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
          0.9566207 = weight(abstract_txt:sentiment in 2210) [ClassicSimilarity], result of:
            0.9566207 = score(doc=2210,freq=8.0), product of:
              0.7163941 = queryWeight, product of:
                10.1219 = boost
                7.5537524 = idf(docFreq=62, maxDocs=44218)
                0.009369734 = queryNorm
              1.3353274 = fieldWeight in 2210, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                7.5537524 = idf(docFreq=62, maxDocs=44218)
                0.0625 = fieldNorm(doc=2210)
        0.2 = coord(5/25)
    
  2. Melo, P.F.; Dalip, D.H.; Junior, M.M.; Gonçalves, M.A.; Benevenuto, F.: 10SENT : a stable sentiment analysis method based on the combination of off-the-shelf approaches (2019) 0.21
    0.21109688 = sum of:
      0.21109688 = product of:
        0.87957036 = sum of:
          0.032614347 = weight(abstract_txt:supervised in 4990) [ClassicSimilarity], result of:
            0.032614347 = score(doc=4990,freq=1.0), product of:
              0.06992427 = queryWeight, product of:
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.009369734 = queryNorm
              0.4664238 = fieldWeight in 4990, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.0625 = fieldNorm(doc=4990)
          0.019383017 = weight(abstract_txt:methods in 4990) [ClassicSimilarity], result of:
            0.019383017 = score(doc=4990,freq=3.0), product of:
              0.04317901 = queryWeight, product of:
                1.1113155 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.009369734 = queryNorm
              0.44889906 = fieldWeight in 4990, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.0625 = fieldNorm(doc=4990)
          0.02146547 = weight(abstract_txt:method in 4990) [ClassicSimilarity], result of:
            0.02146547 = score(doc=4990,freq=1.0), product of:
              0.07630556 = queryWeight, product of:
                1.809359 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.009369734 = queryNorm
              0.28130937 = fieldWeight in 4990, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=4990)
          0.18036081 = weight(abstract_txt:unsupervised in 4990) [ClassicSimilarity], result of:
            0.18036081 = score(doc=4990,freq=3.0), product of:
              0.2186672 = queryWeight, product of:
                3.06294 = boost
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.009369734 = queryNorm
              0.8248188 = fieldWeight in 4990, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.61935 = idf(docFreq=58, maxDocs=44218)
                0.0625 = fieldNorm(doc=4990)
          0.039938588 = weight(abstract_txt:classification in 4990) [ClassicSimilarity], result of:
            0.039938588 = score(doc=4990,freq=1.0), product of:
              0.16007148 = queryWeight, product of:
                4.2794504 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.009369734 = queryNorm
              0.2495047 = fieldWeight in 4990, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0625 = fieldNorm(doc=4990)
          0.58580816 = weight(abstract_txt:sentiment in 4990) [ClassicSimilarity], result of:
            0.58580816 = score(doc=4990,freq=3.0), product of:
              0.7163941 = queryWeight, product of:
                10.1219 = boost
                7.5537524 = idf(docFreq=62, maxDocs=44218)
                0.009369734 = queryNorm
              0.8177177 = fieldWeight in 4990, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.5537524 = idf(docFreq=62, maxDocs=44218)
                0.0625 = fieldNorm(doc=4990)
        0.24 = coord(6/25)
    
  3. Jansen, B.J.; Zhang, M.; Sobel, K.; Chowdury, A.: Twitter power : tweets as electronic word of mouth (2009) 0.16
    0.1567297 = sum of:
      0.1567297 = product of:
        0.7836485 = sum of:
          0.01119079 = weight(abstract_txt:methods in 3157) [ClassicSimilarity], result of:
            0.01119079 = score(doc=3157,freq=1.0), product of:
              0.04317901 = queryWeight, product of:
                1.1113155 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.009369734 = queryNorm
              0.259172 = fieldWeight in 3157, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.0625 = fieldNorm(doc=3157)
          0.07727925 = weight(abstract_txt:sentiments in 3157) [ClassicSimilarity], result of:
            0.07727925 = score(doc=3157,freq=2.0), product of:
              0.09864024 = queryWeight, product of:
                1.1877173 = boost
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.009369734 = queryNorm
              0.7834455 = fieldWeight in 3157, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.863674 = idf(docFreq=16, maxDocs=44218)
                0.0625 = fieldNorm(doc=3157)
          0.08675223 = weight(abstract_txt:microblog in 3157) [ClassicSimilarity], result of:
            0.08675223 = score(doc=3157,freq=2.0), product of:
              0.10654488 = queryWeight, product of:
                1.2343898 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.009369734 = queryNorm
              0.81423175 = fieldWeight in 3157, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.0625 = fieldNorm(doc=3157)
          0.13011585 = weight(abstract_txt:microblogs in 3157) [ClassicSimilarity], result of:
            0.13011585 = score(doc=3157,freq=4.0), product of:
              0.11080405 = queryWeight, product of:
                1.2588207 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.009369734 = queryNorm
              1.1742878 = fieldWeight in 3157, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.0625 = fieldNorm(doc=3157)
          0.47831035 = weight(abstract_txt:sentiment in 3157) [ClassicSimilarity], result of:
            0.47831035 = score(doc=3157,freq=2.0), product of:
              0.7163941 = queryWeight, product of:
                10.1219 = boost
                7.5537524 = idf(docFreq=62, maxDocs=44218)
                0.009369734 = queryNorm
              0.6676637 = fieldWeight in 3157, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.5537524 = idf(docFreq=62, maxDocs=44218)
                0.0625 = fieldNorm(doc=3157)
        0.2 = coord(5/25)
    
  4. Abdi, A.; Shamsuddin, S.M.; Aliguliyev, R.M.: QMOS: Query-based multi-documents opinion-oriented summarization (2018) 0.16
    0.15599783 = sum of:
      0.15599783 = product of:
        0.97498643 = sum of:
          0.013847897 = weight(abstract_txt:methods in 5089) [ClassicSimilarity], result of:
            0.013847897 = score(doc=5089,freq=2.0), product of:
              0.04317901 = queryWeight, product of:
                1.1113155 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.009369734 = queryNorm
              0.320709 = fieldWeight in 5089, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5089)
          0.037564572 = weight(abstract_txt:method in 5089) [ClassicSimilarity], result of:
            0.037564572 = score(doc=5089,freq=4.0), product of:
              0.07630556 = queryWeight, product of:
                1.809359 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.009369734 = queryNorm
              0.4922914 = fieldWeight in 5089, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5089)
          0.086530894 = weight(abstract_txt:multi in 5089) [ClassicSimilarity], result of:
            0.086530894 = score(doc=5089,freq=1.0), product of:
              0.2661836 = queryWeight, product of:
                4.779168 = boost
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.009369734 = queryNorm
              0.3250797 = fieldWeight in 5089, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5089)
          0.8370431 = weight(abstract_txt:sentiment in 5089) [ClassicSimilarity], result of:
            0.8370431 = score(doc=5089,freq=8.0), product of:
              0.7163941 = queryWeight, product of:
                10.1219 = boost
                7.5537524 = idf(docFreq=62, maxDocs=44218)
                0.009369734 = queryNorm
              1.1684115 = fieldWeight in 5089, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                7.5537524 = idf(docFreq=62, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5089)
        0.16 = coord(4/25)
    
  5. Billal, B.; Fonseca, A.; Sadat, F.; Lounis, H.: Semi-supervised learning and social media text analysis towards multi-labeling categorization (2017) 0.16
    0.15516149 = sum of:
      0.15516149 = product of:
        0.48487964 = sum of:
          0.06381191 = weight(abstract_txt:supervised in 4095) [ClassicSimilarity], result of:
            0.06381191 = score(doc=4095,freq=5.0), product of:
              0.06992427 = queryWeight, product of:
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.009369734 = queryNorm
              0.912586 = fieldWeight in 4095, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.462781 = idf(docFreq=68, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.008698272 = weight(abstract_txt:model in 4095) [ClassicSimilarity], result of:
            0.008698272 = score(doc=4095,freq=1.0), product of:
              0.039900847 = queryWeight, product of:
                1.0682971 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.009369734 = queryNorm
              0.21799716 = fieldWeight in 4095, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.013847897 = weight(abstract_txt:methods in 4095) [ClassicSimilarity], result of:
            0.013847897 = score(doc=4095,freq=2.0), product of:
              0.04317901 = queryWeight, product of:
                1.1113155 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.009369734 = queryNorm
              0.320709 = fieldWeight in 4095, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.009803133 = weight(abstract_txt:time in 4095) [ClassicSimilarity], result of:
            0.009803133 = score(doc=4095,freq=1.0), product of:
              0.043211903 = queryWeight, product of:
                1.1117387 = boost
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.009369734 = queryNorm
              0.22686186 = fieldWeight in 4095, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.148331 = idf(docFreq=1897, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.026562162 = weight(abstract_txt:method in 4095) [ClassicSimilarity], result of:
            0.026562162 = score(doc=4095,freq=2.0), product of:
              0.07630556 = queryWeight, product of:
                1.809359 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.009369734 = queryNorm
              0.34810257 = fieldWeight in 4095, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.10483879 = weight(abstract_txt:classification in 4095) [ClassicSimilarity], result of:
            0.10483879 = score(doc=4095,freq=9.0), product of:
              0.16007148 = queryWeight, product of:
                4.2794504 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.009369734 = queryNorm
              0.65494984 = fieldWeight in 4095, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.084255695 = weight(abstract_txt:class in 4095) [ClassicSimilarity], result of:
            0.084255695 = score(doc=4095,freq=1.0), product of:
              0.26149702 = queryWeight, product of:
                4.736909 = boost
                5.8917522 = idf(docFreq=331, maxDocs=44218)
                0.009369734 = queryNorm
              0.3222052 = fieldWeight in 4095, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8917522 = idf(docFreq=331, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
          0.17306179 = weight(abstract_txt:multi in 4095) [ClassicSimilarity], result of:
            0.17306179 = score(doc=4095,freq=4.0), product of:
              0.2661836 = queryWeight, product of:
                4.779168 = boost
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.009369734 = queryNorm
              0.6501594 = fieldWeight in 4095, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.9443145 = idf(docFreq=314, maxDocs=44218)
                0.0546875 = fieldNorm(doc=4095)
        0.32 = coord(8/25)
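
The score trees above follow the usual Lucene ClassicSimilarity (TF-IDF) layout: each matching term contributes queryWeight x fieldWeight, the contributions are summed, and the sum is scaled by the coord factor (matched query terms / total query terms). As a rough sketch only, the following Python reproduces the arithmetic for entry 1 (doc 2210) using the numbers printed in the listing. The function and variable names are illustrative assumptions, not part of the catalog software, and small differences from the listed values are expected because Lucene computes in single precision.

```python
import math

# Values copied from the listing for entry 1 (doc 2210).
MAX_DOCS = 44218
QUERY_NORM = 0.009369734
FIELD_NORM = 0.0625


def idf(doc_freq, max_docs=MAX_DOCS):
    """ClassicSimilarity inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


def term_score(freq, doc_freq, boost, field_norm=FIELD_NORM, query_norm=QUERY_NORM):
    """Score of one term: queryWeight * fieldWeight, as in the explain tree."""
    term_idf = idf(doc_freq)
    query_weight = boost * term_idf * query_norm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf = sqrt(termFreq)
    return query_weight * field_weight


# (term, termFreq, docFreq, boost) as printed for doc 2210.
terms = [
    ("model", 1.0, 2231, 1.0682971),
    ("sentiments", 2.0, 16, 1.1877173),
    ("feature", 1.0, 328, 1.5814023),
    ("classification", 2.0, 2218, 4.2794504),
    ("sentiment", 8.0, 62, 10.1219),
]

coord = 5 / 25  # 5 of the 25 query terms matched
raw_sum = sum(term_score(freq, doc_freq, boost) for _, freq, doc_freq, boost in terms)
total = coord * raw_sum

print(f"sum of term scores: {raw_sum:.6f}")
print(f"final score:        {total:.6f}")
```

The listing reports 1.1325685 for the sum and 0.2265137 for the final score of this entry; the sketch should agree with those figures to within rounding. The other entries can be checked the same way by substituting their term frequencies, boosts, fieldNorm, and coord values.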