Document (#38505)

Author
Yu, N.
Title
Exploring co-training strategies for opinion detection
Source
Journal of the Association for Information Science and Technology. 65(2014) no.10, pp. 2098-2110
Year
2014
Abstract
For the last decade or so, sentiment analysis, which aims to automatically identify opinions, polarities, or emotions from user-generated content (e.g., blogs, tweets), has attracted interest from both academic and industrial communities. Most sentiment analysis strategies fall into two categories: lexicon-based and corpus-based approaches. While the latter often requires sentiment-labeled data to build a machine learning model, both approaches need sentiment-labeled data for evaluation. Unfortunately, most data domains lack sufficient quantities of labeled data, especially at the subdocument level. Semisupervised learning (SSL), a machine learning technique that requires only a few labeled examples and can automatically label unlabeled data, is a promising strategy for dealing with insufficient labeled data. Although previous studies have shown promising results from applying various SSL algorithms to sentiment-analysis problems, co-training, an SSL algorithm, has not attracted much attention for sentiment analysis, largely because of its restrictive assumptions. Therefore, this study revisits co-training in depth and discusses several co-training strategies for sentiment analysis under a looser assumption. Results suggest that co-training can be more effective than other currently adopted SSL methods for sentiment analysis.
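
The co-training idea named in the abstract can be illustrated with a minimal sketch. This is not the study's implementation: the nearest-centroid learner, the function names (`train`, `predict`, `co_train`), the two one-dimensional "views", and the toy data are all assumptions chosen to keep the example short. It shows only the general loop, in which two classifiers, each trained on its own view of the data, take turns labeling the unlabeled example they are most confident about and adding it to the shared labeled pool.

```python
# Generic two-view co-training loop (illustrative only, not the paper's method).
# Each example is a pair of feature tuples: (view_0_features, view_1_features).

def centroid(rows):
    """Component-wise mean of a list of equal-length feature tuples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(examples, view):
    """Fit one centroid per label, using only the given view (0 or 1)."""
    by_label = {}
    for views, label in examples:
        by_label.setdefault(label, []).append(views[view])
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict(model, x):
    """Return (label, confidence); confidence is the distance margin
    between the nearest and second-nearest centroid."""
    ranked = sorted((dist2(c, x), label) for label, c in model.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else 0.0
    return ranked[0][1], margin

def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """Alternate between the two views; each view's classifier labels its
    most confident unlabeled examples and moves them into the labeled pool."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                break
            model = train(labeled, view)
            scored = sorted(pool, key=lambda v: predict(model, v[view])[1],
                            reverse=True)
            for views in scored[:per_round]:
                label, _ = predict(model, views[view])
                labeled.append((views, label))
                pool.remove(views)
    return train(labeled, 0), train(labeled, 1), labeled

# Hypothetical toy data: view 0 and view 1 are made-up one-dimensional features.
seed = [(((3.0,), (2.0,)), "opinion"), (((0.0,), (0.0,)), "fact")]
unlabeled = [((2.5,), (1.8,)), ((0.2,), (0.1,)), ((2.8,), (2.2,)), ((0.1,), (0.3,))]

model_a, model_b, grown = co_train(seed, unlabeled)
print(len(grown), predict(model_a, (2.6,))[0])
```

In practice the two views would be distinct feature sets over the same text, and the base learner would be a stronger classifier. The sketch leaves out stopping criteria and class balancing, which real co-training variants typically add.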

Similar documents (content)

  1. Thelwall, M.; Buckley, K.; Paltoglou, G.: Sentiment strength detection for the social web (2012) 0.40
    0.4019631 = sum of:
      0.4019631 = product of:
        1.2561347 = sum of:
          0.010530992 = weight(abstract_txt:both in 1437) [ClassicSimilarity], result of:
            0.010530992 = score(doc=1437,freq=1.0), product of:
              0.044130377 = queryWeight, product of:
                1.0160078 = boost
                3.8181381 = idf(docFreq=2582, maxDocs=43254)
                0.011375983 = queryNorm
              0.23863363 = fieldWeight in 1437, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8181381 = idf(docFreq=2582, maxDocs=43254)
                0.0625 = fieldNorm(doc=1437)
          0.016544957 = weight(abstract_txt:most in 1437) [ClassicSimilarity], result of:
            0.016544957 = score(doc=1437,freq=2.0), product of:
              0.047336034 = queryWeight, product of:
                1.0522627 = boost
                3.9543834 = idf(docFreq=2253, maxDocs=43254)
                0.011375983 = queryNorm
              0.3495214 = fieldWeight in 1437, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9543834 = idf(docFreq=2253, maxDocs=43254)
                0.0625 = fieldNorm(doc=1437)
          0.01890587 = weight(abstract_txt:approaches in 1437) [ClassicSimilarity], result of:
            0.01890587 = score(doc=1437,freq=1.0), product of:
              0.06518623 = queryWeight, product of:
                1.2348272 = boost
                4.640457 = idf(docFreq=1134, maxDocs=43254)
                0.011375983 = queryNorm
              0.29002857 = fieldWeight in 1437, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.640457 = idf(docFreq=1134, maxDocs=43254)
                0.0625 = fieldNorm(doc=1437)
          0.028494801 = weight(abstract_txt:machine in 1437) [ClassicSimilarity], result of:
            0.028494801 = score(doc=1437,freq=1.0), product of:
              0.085691 = queryWeight, product of:
                1.4157802 = boost
                5.320475 = idf(docFreq=574, maxDocs=43254)
                0.011375983 = queryNorm
              0.3325297 = fieldWeight in 1437, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.320475 = idf(docFreq=574, maxDocs=43254)
                0.0625 = fieldNorm(doc=1437)
          0.031297125 = weight(abstract_txt:learning in 1437) [ClassicSimilarity], result of:
            0.031297125 = score(doc=1437,freq=1.0), product of:
              0.10442188 = queryWeight, product of:
                1.914122 = boost
                4.7954893 = idf(docFreq=971, maxDocs=43254)
                0.011375983 = queryNorm
              0.29971808 = fieldWeight in 1437, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7954893 = idf(docFreq=971, maxDocs=43254)
                0.0625 = fieldNorm(doc=1437)
          0.03042242 = weight(abstract_txt:data in 1437) [ClassicSimilarity], result of:
            0.03042242 = score(doc=1437,freq=2.0), product of:
              0.10246708 = queryWeight, product of:
                2.68152 = boost
                3.3590338 = idf(docFreq=4087, maxDocs=43254)
                0.011375983 = queryNorm
              0.29689944 = fieldWeight in 1437, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3590338 = idf(docFreq=4087, maxDocs=43254)
                0.0625 = fieldNorm(doc=1437)
          0.048734087 = weight(abstract_txt:analysis in 1437) [ClassicSimilarity], result of:
            0.048734087 = score(doc=1437,freq=3.0), product of:
              0.12255 = queryWeight, product of:
                2.9325507 = boost
                3.67349 = idf(docFreq=2984, maxDocs=43254)
                0.011375983 = queryNorm
              0.39766696 = fieldWeight in 1437, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.67349 = idf(docFreq=2984, maxDocs=43254)
                0.0625 = fieldNorm(doc=1437)
          1.0712045 = weight(abstract_txt:sentiment in 1437) [ClassicSimilarity], result of:
            1.0712045 = score(doc=1437,freq=10.0), product of:
              0.7085324 = queryWeight, product of:
                8.142131 = boost
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.011375983 = queryNorm
              1.5118638 = fieldWeight in 1437, product of:
                3.1622777 = tf(freq=10.0), with freq of:
                  10.0 = termFreq=10.0
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.0625 = fieldNorm(doc=1437)
        0.32 = coord(8/25)
    
  2. Saif, H.; He, Y.; Fernandez, M.; Alani, H.: Contextual semantics for sentiment analysis of Twitter (2016) 0.40
    0.39869446 = sum of:
      0.39869446 = product of:
        1.2459202 = sum of:
          0.014893072 = weight(abstract_txt:both in 4132) [ClassicSimilarity], result of:
            0.014893072 = score(doc=4132,freq=2.0), product of:
              0.044130377 = queryWeight, product of:
                1.0160078 = boost
                3.8181381 = idf(docFreq=2582, maxDocs=43254)
                0.011375983 = queryNorm
              0.3374789 = fieldWeight in 4132, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.8181381 = idf(docFreq=2582, maxDocs=43254)
                0.0625 = fieldNorm(doc=4132)
          0.042343076 = weight(abstract_txt:tweets in 4132) [ClassicSimilarity], result of:
            0.042343076 = score(doc=4132,freq=1.0), product of:
              0.08856655 = queryWeight, product of:
                1.0177664 = boost
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.011375983 = queryNorm
              0.47809333 = fieldWeight in 4132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.0625 = fieldNorm(doc=4132)
          0.06210555 = weight(abstract_txt:lexicon in 4132) [ClassicSimilarity], result of:
            0.06210555 = score(doc=4132,freq=2.0), product of:
              0.090745494 = queryWeight, product of:
                1.03021 = boost
                7.7430196 = idf(docFreq=50, maxDocs=43254)
                0.011375983 = queryNorm
              0.6843927 = fieldWeight in 4132, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.7430196 = idf(docFreq=50, maxDocs=43254)
                0.0625 = fieldNorm(doc=4132)
          0.01890587 = weight(abstract_txt:approaches in 4132) [ClassicSimilarity], result of:
            0.01890587 = score(doc=4132,freq=1.0), product of:
              0.06518623 = queryWeight, product of:
                1.2348272 = boost
                4.640457 = idf(docFreq=1134, maxDocs=43254)
                0.011375983 = queryNorm
              0.29002857 = fieldWeight in 4132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.640457 = idf(docFreq=1134, maxDocs=43254)
                0.0625 = fieldNorm(doc=4132)
          0.09131918 = weight(abstract_txt:polarities in 4132) [ClassicSimilarity], result of:
            0.09131918 = score(doc=4132,freq=1.0), product of:
              0.14783914 = queryWeight, product of:
                1.3149462 = boost
                9.883085 = idf(docFreq=5, maxDocs=43254)
                0.011375983 = queryNorm
              0.6176928 = fieldWeight in 4132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.883085 = idf(docFreq=5, maxDocs=43254)
                0.0625 = fieldNorm(doc=4132)
          0.08032822 = weight(abstract_txt:attracted in 4132) [ClassicSimilarity], result of:
            0.08032822 = score(doc=4132,freq=1.0), product of:
              0.17100292 = queryWeight, product of:
                2.0 = boost
                7.515962 = idf(docFreq=63, maxDocs=43254)
                0.011375983 = queryNorm
              0.46974763 = fieldWeight in 4132, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.515962 = idf(docFreq=63, maxDocs=43254)
                0.0625 = fieldNorm(doc=4132)
          0.039791215 = weight(abstract_txt:analysis in 4132) [ClassicSimilarity], result of:
            0.039791215 = score(doc=4132,freq=2.0), product of:
              0.12255 = queryWeight, product of:
                2.9325507 = boost
                3.67349 = idf(docFreq=2984, maxDocs=43254)
                0.011375983 = queryNorm
              0.3246937 = fieldWeight in 4132, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.67349 = idf(docFreq=2984, maxDocs=43254)
                0.0625 = fieldNorm(doc=4132)
          0.896234 = weight(abstract_txt:sentiment in 4132) [ClassicSimilarity], result of:
            0.896234 = score(doc=4132,freq=7.0), product of:
              0.7085324 = queryWeight, product of:
                8.142131 = boost
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.011375983 = queryNorm
              1.2649161 = fieldWeight in 4132, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.0625 = fieldNorm(doc=4132)
        0.32 = coord(8/25)
    
  3. Melo, P.F.; Dalip, D.H.; Junior, M.M.; Gonçalves, M.A.; Benevenuto, F.: 10SENT : a stable sentiment analysis method based on the combination of off-the-shelf approaches (2019) 0.34
    0.3367965 = sum of:
      0.3367965 = product of:
        1.052489 = sum of:
          0.01890587 = weight(abstract_txt:approaches in 991) [ClassicSimilarity], result of:
            0.01890587 = score(doc=991,freq=1.0), product of:
              0.06518623 = queryWeight, product of:
                1.2348272 = boost
                4.640457 = idf(docFreq=1134, maxDocs=43254)
                0.011375983 = queryNorm
              0.29002857 = fieldWeight in 991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.640457 = idf(docFreq=1134, maxDocs=43254)
                0.0625 = fieldNorm(doc=991)
          0.031297125 = weight(abstract_txt:learning in 991) [ClassicSimilarity], result of:
            0.031297125 = score(doc=991,freq=1.0), product of:
              0.10442188 = queryWeight, product of:
                1.914122 = boost
                4.7954893 = idf(docFreq=971, maxDocs=43254)
                0.011375983 = queryNorm
              0.29971808 = fieldWeight in 991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.7954893 = idf(docFreq=971, maxDocs=43254)
                0.0625 = fieldNorm(doc=991)
          0.038562275 = weight(abstract_txt:strategies in 991) [ClassicSimilarity], result of:
            0.038562275 = score(doc=991,freq=1.0), product of:
              0.12001356 = queryWeight, product of:
                2.0520551 = boost
                5.141056 = idf(docFreq=687, maxDocs=43254)
                0.011375983 = queryNorm
              0.321316 = fieldWeight in 991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.141056 = idf(docFreq=687, maxDocs=43254)
                0.0625 = fieldNorm(doc=991)
          0.0372597 = weight(abstract_txt:data in 991) [ClassicSimilarity], result of:
            0.0372597 = score(doc=991,freq=3.0), product of:
              0.10246708 = queryWeight, product of:
                2.68152 = boost
                3.3590338 = idf(docFreq=4087, maxDocs=43254)
                0.011375983 = queryNorm
              0.36362606 = fieldWeight in 991, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.3590338 = idf(docFreq=4087, maxDocs=43254)
                0.0625 = fieldNorm(doc=991)
          0.056273278 = weight(abstract_txt:analysis in 991) [ClassicSimilarity], result of:
            0.056273278 = score(doc=991,freq=4.0), product of:
              0.12255 = queryWeight, product of:
                2.9325507 = boost
                3.67349 = idf(docFreq=2984, maxDocs=43254)
                0.011375983 = queryNorm
              0.45918626 = fieldWeight in 991, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.67349 = idf(docFreq=2984, maxDocs=43254)
                0.0625 = fieldNorm(doc=991)
          0.06389156 = weight(abstract_txt:training in 991) [ClassicSimilarity], result of:
            0.06389156 = score(doc=991,freq=1.0), product of:
              0.19923568 = queryWeight, product of:
                3.4133577 = boost
                5.1309333 = idf(docFreq=694, maxDocs=43254)
                0.011375983 = queryNorm
              0.32068333 = fieldWeight in 991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.1309333 = idf(docFreq=694, maxDocs=43254)
                0.0625 = fieldNorm(doc=991)
          0.21957631 = weight(abstract_txt:labeled in 991) [ClassicSimilarity], result of:
            0.21957631 = score(doc=991,freq=1.0), product of:
              0.4537275 = queryWeight, product of:
                5.1510506 = boost
                7.7430196 = idf(docFreq=50, maxDocs=43254)
                0.011375983 = queryNorm
              0.48393872 = fieldWeight in 991, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7430196 = idf(docFreq=50, maxDocs=43254)
                0.0625 = fieldNorm(doc=991)
          0.58672285 = weight(abstract_txt:sentiment in 991) [ClassicSimilarity], result of:
            0.58672285 = score(doc=991,freq=3.0), product of:
              0.7085324 = queryWeight, product of:
                8.142131 = boost
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.011375983 = queryNorm
              0.8280819 = fieldWeight in 991, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.0625 = fieldNorm(doc=991)
        0.32 = coord(8/25)
    
  4. Xing, F.Z.; Pallucchini, F.; Cambria, E.: Cognitive-inspired domain adaptation of sentiment lexicons (2019) 0.29
    0.29098353 = sum of:
      0.29098353 = product of:
        1.2124314 = sum of:
          0.06210555 = weight(abstract_txt:lexicon in 105) [ClassicSimilarity], result of:
            0.06210555 = score(doc=105,freq=2.0), product of:
              0.090745494 = queryWeight, product of:
                1.03021 = boost
                7.7430196 = idf(docFreq=50, maxDocs=43254)
                0.011375983 = queryNorm
              0.6843927 = fieldWeight in 105, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.7430196 = idf(docFreq=50, maxDocs=43254)
                0.0625 = fieldNorm(doc=105)
          0.09131918 = weight(abstract_txt:polarities in 105) [ClassicSimilarity], result of:
            0.09131918 = score(doc=105,freq=1.0), product of:
              0.14783914 = queryWeight, product of:
                1.3149462 = boost
                9.883085 = idf(docFreq=5, maxDocs=43254)
                0.011375983 = queryNorm
              0.6176928 = fieldWeight in 105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.883085 = idf(docFreq=5, maxDocs=43254)
                0.0625 = fieldNorm(doc=105)
          0.028494801 = weight(abstract_txt:machine in 105) [ClassicSimilarity], result of:
            0.028494801 = score(doc=105,freq=1.0), product of:
              0.085691 = queryWeight, product of:
                1.4157802 = boost
                5.320475 = idf(docFreq=574, maxDocs=43254)
                0.011375983 = queryNorm
              0.3325297 = fieldWeight in 105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.320475 = idf(docFreq=574, maxDocs=43254)
                0.0625 = fieldNorm(doc=105)
          0.04426082 = weight(abstract_txt:learning in 105) [ClassicSimilarity], result of:
            0.04426082 = score(doc=105,freq=2.0), product of:
              0.10442188 = queryWeight, product of:
                1.914122 = boost
                4.7954893 = idf(docFreq=971, maxDocs=43254)
                0.011375983 = queryNorm
              0.42386538 = fieldWeight in 105, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.7954893 = idf(docFreq=971, maxDocs=43254)
                0.0625 = fieldNorm(doc=105)
          0.028136639 = weight(abstract_txt:analysis in 105) [ClassicSimilarity], result of:
            0.028136639 = score(doc=105,freq=1.0), product of:
              0.12255 = queryWeight, product of:
                2.9325507 = boost
                3.67349 = idf(docFreq=2984, maxDocs=43254)
                0.011375983 = queryNorm
              0.22959313 = fieldWeight in 105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.67349 = idf(docFreq=2984, maxDocs=43254)
                0.0625 = fieldNorm(doc=105)
          0.95811445 = weight(abstract_txt:sentiment in 105) [ClassicSimilarity], result of:
            0.95811445 = score(doc=105,freq=8.0), product of:
              0.7085324 = queryWeight, product of:
                8.142131 = boost
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.011375983 = queryNorm
              1.3522521 = fieldWeight in 105, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.0625 = fieldNorm(doc=105)
        0.24 = coord(6/25)
    
  5. Abdi, A.; Shamsuddin, S.M.; Aliguliyev, R.M.: QMOS: Query-based multi-documents opinion-oriented summarization (2018) 0.29
    0.2877224 = sum of:
      0.2877224 = product of:
        1.02758 = sum of:
          0.009214618 = weight(abstract_txt:both in 90) [ClassicSimilarity], result of:
            0.009214618 = score(doc=90,freq=1.0), product of:
              0.044130377 = queryWeight, product of:
                1.0160078 = boost
                3.8181381 = idf(docFreq=2582, maxDocs=43254)
                0.011375983 = queryNorm
              0.20880443 = fieldWeight in 90, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.8181381 = idf(docFreq=2582, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
          0.0768517 = weight(abstract_txt:lexicon in 90) [ClassicSimilarity], result of:
            0.0768517 = score(doc=90,freq=4.0), product of:
              0.090745494 = queryWeight, product of:
                1.03021 = boost
                7.7430196 = idf(docFreq=50, maxDocs=43254)
                0.011375983 = queryNorm
              0.8468928 = fieldWeight in 90, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.7430196 = idf(docFreq=50, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
          0.01023667 = weight(abstract_txt:most in 90) [ClassicSimilarity], result of:
            0.01023667 = score(doc=90,freq=1.0), product of:
              0.047336034 = queryWeight, product of:
                1.0522627 = boost
                3.9543834 = idf(docFreq=2253, maxDocs=43254)
                0.011375983 = queryNorm
              0.21625534 = fieldWeight in 90, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.9543834 = idf(docFreq=2253, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
          0.016542636 = weight(abstract_txt:approaches in 90) [ClassicSimilarity], result of:
            0.016542636 = score(doc=90,freq=1.0), product of:
              0.06518623 = queryWeight, product of:
                1.2348272 = boost
                4.640457 = idf(docFreq=1134, maxDocs=43254)
                0.011375983 = queryNorm
              0.253775 = fieldWeight in 90, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.640457 = idf(docFreq=1134, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
          0.033741992 = weight(abstract_txt:strategies in 90) [ClassicSimilarity], result of:
            0.033741992 = score(doc=90,freq=1.0), product of:
              0.12001356 = queryWeight, product of:
                2.0520551 = boost
                5.141056 = idf(docFreq=687, maxDocs=43254)
                0.011375983 = queryNorm
              0.2811515 = fieldWeight in 90, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.141056 = idf(docFreq=687, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
          0.042642325 = weight(abstract_txt:analysis in 90) [ClassicSimilarity], result of:
            0.042642325 = score(doc=90,freq=3.0), product of:
              0.12255 = queryWeight, product of:
                2.9325507 = boost
                3.67349 = idf(docFreq=2984, maxDocs=43254)
                0.011375983 = queryNorm
              0.3479586 = fieldWeight in 90, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.67349 = idf(docFreq=2984, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
          0.8383501 = weight(abstract_txt:sentiment in 90) [ClassicSimilarity], result of:
            0.8383501 = score(doc=90,freq=8.0), product of:
              0.7085324 = queryWeight, product of:
                8.142131 = boost
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.011375983 = queryNorm
              1.1832206 = fieldWeight in 90, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                7.649493 = idf(docFreq=55, maxDocs=43254)
                0.0546875 = fieldNorm(doc=90)
        0.28 = coord(7/25)
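
The score breakdowns above follow the TF-IDF scheme that the listing labels as ClassicSimilarity. As a rough check, the sketch below recomputes the score of the first similar document (doc 1437) from the values printed in its tree. The function names are made up for illustration. The formulas (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1)), and a final coord factor) are inferred from the printed numbers rather than taken from the catalog's own code, and small differences in the last digits are expected because the listing appears to use single-precision floats.

```python
import math

MAX_DOCS = 43254          # maxDocs, as printed in the listing
QUERY_NORM = 0.011375983  # queryNorm, as printed in the listing

def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, boost, field_norm):
    """Score of one query term in one document: queryWeight * fieldWeight."""
    i = idf(doc_freq)
    query_weight = boost * i * QUERY_NORM
    field_weight = math.sqrt(freq) * i * field_norm  # tf = sqrt(freq)
    return query_weight * field_weight

# (term, freq, docFreq, boost) copied from the tree for doc 1437
terms_1437 = [
    ("both",       1.0,  2582, 1.0160078),
    ("most",       2.0,  2253, 1.0522627),
    ("approaches", 1.0,  1134, 1.2348272),
    ("machine",    1.0,   574, 1.4157802),
    ("learning",   1.0,   971, 1.914122),
    ("data",       2.0,  4087, 2.68152),
    ("analysis",   3.0,  2984, 2.9325507),
    ("sentiment", 10.0,    55, 8.142131),
]

FIELD_NORM = 0.0625  # fieldNorm(doc=1437)
COORD = 8 / 25       # 8 of the 25 query terms matched

per_term = {t: term_score(f, df, b, FIELD_NORM) for t, f, df, b in terms_1437}
total = COORD * sum(per_term.values())

print(round(per_term["sentiment"], 4), round(total, 4))
```

The breakdown makes clear why every listed document ranks highly: the term "sentiment" combines a large boost, a high idf (it occurs in only 55 of 43254 records), and a high in-document frequency, so it contributes most of each sum.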