Document (#38173)

Author
Pang, B.
Lee, L.
Title
Opinion mining and sentiment analysis
Imprint
Boston, MA : Now Publ.
Year
2008
Pages
IX, 137 p.
ISBN
978-1-60198-150-9
Series
Foundations and trends® in information retrieval; 2,1/2
Abstract
An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object. Opinion Mining and Sentiment Analysis covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. The focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. The survey includes an enumeration of the various applications, a look at general challenges, and a discussion of categorization, extraction and summarization. Finally, it moves beyond just the technical issues, devoting significant attention to the broader implications that the development of opinion-oriented information-access services has: questions of privacy, vulnerability to manipulation, and whether or not reviews can have measurable economic impact. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided. Opinion Mining and Sentiment Analysis is the first such comprehensive survey of this vibrant and important research area and will be of interest to anyone working on opinion-oriented information-seeking systems.
Content
Table of contents
1. Introduction
2. Applications
3. General Challenges
4. Classification and Extraction
5. Summarization
6. Broader Implications
7. Publicly Available Resources
8. Concluding Remarks
References
LCSH
Information behavior
Research
Information retrieval
Public opinion
Text processing (Computer science)
RSWK
World Wide Web / Meinungsäußerung / Data Mining
Data Mining / Psycholinguistik (BVB)
BK
54.72 (Künstliche Intelligenz)
LCC
Z3075
RVK
ST 530

Similar documents (content)

  1. Varathan, K.D.; Giachanou, A.; Crestani, F.: Comparative opinion mining : a review (2017) 0.60
    0.60473865 = sum of:
      0.60473865 = product of:
        1.6798295 = sum of:
          0.021575993 = weight(abstract_txt:survey in 459) [ClassicSimilarity], result of:
            0.021575993 = score(doc=459,freq=1.0), product of:
              0.06900998 = queryWeight, product of:
                1.0810413 = boost
                5.002405 = idf(docFreq=772, maxDocs=42306)
                0.0127611775 = queryNorm
              0.31265032 = fieldWeight in 459, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.002405 = idf(docFreq=772, maxDocs=42306)
                0.0625 = fieldNorm(doc=459)
          0.021847768 = weight(abstract_txt:area in 459) [ClassicSimilarity], result of:
            0.021847768 = score(doc=459,freq=1.0), product of:
              0.06958828 = queryWeight, product of:
                1.0855614 = boost
                5.023321 = idf(docFreq=756, maxDocs=42306)
                0.0127611775 = queryNorm
              0.31395757 = fieldWeight in 459, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.023321 = idf(docFreq=756, maxDocs=42306)
                0.0625 = fieldNorm(doc=459)
          0.007862964 = weight(abstract_txt:with in 459) [ClassicSimilarity], result of:
            0.007862964 = score(doc=459,freq=2.0), product of:
              0.035209153 = queryWeight, product of:
                1.0920163 = boost
                2.5265954 = idf(docFreq=9191, maxDocs=42306)
                0.0127611775 = queryNorm
              0.22332159 = fieldWeight in 459, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.5265954 = idf(docFreq=9191, maxDocs=42306)
                0.0625 = fieldNorm(doc=459)
          0.011985915 = weight(abstract_txt:that in 459) [ClassicSimilarity], result of:
            0.011985915 = score(doc=459,freq=4.0), product of:
              0.03987238 = queryWeight, product of:
                1.2992492 = boost
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.0127611775 = queryNorm
              0.30060694 = fieldWeight in 459, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.0625 = fieldNorm(doc=459)
          0.0123303905 = weight(abstract_txt:information in 459) [ClassicSimilarity], result of:
            0.0123303905 = score(doc=459,freq=2.0), product of:
              0.05727019 = queryWeight, product of:
                1.8424033 = boost
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.0127611775 = queryNorm
              0.21530207 = fieldWeight in 459, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.0625 = fieldNorm(doc=459)
          0.03086873 = weight(abstract_txt:analysis in 459) [ClassicSimilarity], result of:
            0.03086873 = score(doc=459,freq=2.0), product of:
              0.094387375 = queryWeight, product of:
                1.9990023 = boost
                3.7000692 = idf(docFreq=2842, maxDocs=42306)
                0.0127611775 = queryNorm
              0.327043 = fieldWeight in 459, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.7000692 = idf(docFreq=2842, maxDocs=42306)
                0.0625 = fieldNorm(doc=459)
          0.37468553 = weight(abstract_txt:mining in 459) [ClassicSimilarity], result of:
            0.37468553 = score(doc=459,freq=9.0), product of:
              0.32087478 = queryWeight, product of:
                4.037521 = boost
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.0127611775 = queryNorm
              1.1677002 = fieldWeight in 459, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.0625 = fieldNorm(doc=459)
          0.23444712 = weight(abstract_txt:sentiment in 459) [ClassicSimilarity], result of:
            0.23444712 = score(doc=459,freq=1.0), product of:
              0.48827943 = queryWeight, product of:
                4.9805946 = boost
                7.682392 = idf(docFreq=52, maxDocs=42306)
                0.0127611775 = queryNorm
              0.4801495 = fieldWeight in 459, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.682392 = idf(docFreq=52, maxDocs=42306)
                0.0625 = fieldNorm(doc=459)
          0.96422505 = weight(abstract_txt:opinion in 459) [ClassicSimilarity], result of:
            0.96422505 = score(doc=459,freq=11.0), product of:
              0.66820455 = queryWeight, product of:
                7.521874 = boost
                6.961336 = idf(docFreq=108, maxDocs=42306)
                0.0127611775 = queryNorm
              1.4430088 = fieldWeight in 459, product of:
                3.3166249 = tf(freq=11.0), with freq of:
                  11.0 = termFreq=11.0
                6.961336 = idf(docFreq=108, maxDocs=42306)
                0.0625 = fieldNorm(doc=459)
        0.36 = coord(9/25)
    
  2. Ku, L.-W.; Chen, H.-H.: Mining opinions from the Web : beyond relevance retrieval (2007) 0.40
    0.40332097 = sum of:
      0.40332097 = product of:
        1.4404321 = sum of:
          0.007862964 = weight(abstract_txt:with in 2606) [ClassicSimilarity], result of:
            0.007862964 = score(doc=2606,freq=2.0), product of:
              0.035209153 = queryWeight, product of:
                1.0920163 = boost
                2.5265954 = idf(docFreq=9191, maxDocs=42306)
                0.0127611775 = queryNorm
              0.22332159 = fieldWeight in 2606, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.5265954 = idf(docFreq=9191, maxDocs=42306)
                0.0625 = fieldNorm(doc=2606)
          0.0103177205 = weight(abstract_txt:systems in 2606) [ClassicSimilarity], result of:
            0.0103177205 = score(doc=2606,freq=1.0), product of:
              0.048307844 = queryWeight, product of:
                1.1077476 = boost
                3.4173236 = idf(docFreq=3771, maxDocs=42306)
                0.0127611775 = queryNorm
              0.21358272 = fieldWeight in 2606, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4173236 = idf(docFreq=3771, maxDocs=42306)
                0.0625 = fieldNorm(doc=2606)
          0.08568676 = weight(abstract_txt:opinions in 2606) [ClassicSimilarity], result of:
            0.08568676 = score(doc=2606,freq=2.0), product of:
              0.13736114 = queryWeight, product of:
                1.5251702 = boost
                7.0575643 = idf(docFreq=98, maxDocs=42306)
                0.0127611775 = queryNorm
              0.6238064 = fieldWeight in 2606, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.0575643 = idf(docFreq=98, maxDocs=42306)
                0.0625 = fieldNorm(doc=2606)
          0.01949606 = weight(abstract_txt:information in 2606) [ClassicSimilarity], result of:
            0.01949606 = score(doc=2606,freq=5.0), product of:
              0.05727019 = queryWeight, product of:
                1.8424033 = boost
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.0127611775 = queryNorm
              0.34042248 = fieldWeight in 2606, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.0625 = fieldNorm(doc=2606)
          0.21632479 = weight(abstract_txt:mining in 2606) [ClassicSimilarity], result of:
            0.21632479 = score(doc=2606,freq=3.0), product of:
              0.32087478 = queryWeight, product of:
                4.037521 = boost
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.0127611775 = queryNorm
              0.674172 = fieldWeight in 2606, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.0625 = fieldNorm(doc=2606)
          0.33155832 = weight(abstract_txt:sentiment in 2606) [ClassicSimilarity], result of:
            0.33155832 = score(doc=2606,freq=2.0), product of:
              0.48827943 = queryWeight, product of:
                4.9805946 = boost
                7.682392 = idf(docFreq=52, maxDocs=42306)
                0.0127611775 = queryNorm
              0.67903394 = fieldWeight in 2606, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.682392 = idf(docFreq=52, maxDocs=42306)
                0.0625 = fieldNorm(doc=2606)
          0.7691854 = weight(abstract_txt:opinion in 2606) [ClassicSimilarity], result of:
            0.7691854 = score(doc=2606,freq=7.0), product of:
              0.66820455 = queryWeight, product of:
                7.521874 = boost
                6.961336 = idf(docFreq=108, maxDocs=42306)
                0.0127611775 = queryNorm
              1.1511227 = fieldWeight in 2606, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.961336 = idf(docFreq=108, maxDocs=42306)
                0.0625 = fieldNorm(doc=2606)
        0.28 = coord(7/25)
    
  3. Huang, H.-H.; Wang, J.-J.; Chen, H.-H.: Implicit opinion analysis : extraction and polarity labelling (2017) 0.34
    0.34345025 = sum of:
      0.34345025 = product of:
        1.4310427 = sum of:
          0.09088454 = weight(abstract_txt:opinions in 739) [ClassicSimilarity], result of:
            0.09088454 = score(doc=739,freq=1.0), product of:
              0.13736114 = queryWeight, product of:
                1.5251702 = boost
                7.0575643 = idf(docFreq=98, maxDocs=42306)
                0.0127611775 = queryNorm
              0.66164666 = fieldWeight in 739, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.0575643 = idf(docFreq=98, maxDocs=42306)
                0.09375 = fieldNorm(doc=739)
          0.013078355 = weight(abstract_txt:information in 739) [ClassicSimilarity], result of:
            0.013078355 = score(doc=739,freq=1.0), product of:
              0.05727019 = queryWeight, product of:
                1.8424033 = boost
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.0127611775 = queryNorm
              0.22836234 = fieldWeight in 739, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.09375 = fieldNorm(doc=739)
          0.032741234 = weight(abstract_txt:analysis in 739) [ClassicSimilarity], result of:
            0.032741234 = score(doc=739,freq=1.0), product of:
              0.094387375 = queryWeight, product of:
                1.9990023 = boost
                3.7000692 = idf(docFreq=2842, maxDocs=42306)
                0.0127611775 = queryNorm
              0.34688148 = fieldWeight in 739, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7000692 = idf(docFreq=2842, maxDocs=42306)
                0.09375 = fieldNorm(doc=739)
          0.18734276 = weight(abstract_txt:mining in 739) [ClassicSimilarity], result of:
            0.18734276 = score(doc=739,freq=1.0), product of:
              0.32087478 = queryWeight, product of:
                4.037521 = boost
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.0127611775 = queryNorm
              0.5838501 = fieldWeight in 739, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.09375 = fieldNorm(doc=739)
          0.35167068 = weight(abstract_txt:sentiment in 739) [ClassicSimilarity], result of:
            0.35167068 = score(doc=739,freq=1.0), product of:
              0.48827943 = queryWeight, product of:
                4.9805946 = boost
                7.682392 = idf(docFreq=52, maxDocs=42306)
                0.0127611775 = queryNorm
              0.72022426 = fieldWeight in 739, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.682392 = idf(docFreq=52, maxDocs=42306)
                0.09375 = fieldNorm(doc=739)
          0.7553251 = weight(abstract_txt:opinion in 739) [ClassicSimilarity], result of:
            0.7553251 = score(doc=739,freq=3.0), product of:
              0.66820455 = queryWeight, product of:
                7.521874 = boost
                6.961336 = idf(docFreq=108, maxDocs=42306)
                0.0127611775 = queryNorm
              1.13038 = fieldWeight in 739, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.961336 = idf(docFreq=108, maxDocs=42306)
                0.09375 = fieldNorm(doc=739)
        0.24 = coord(6/25)
    
  4. Miao, Q.; Li, Q.; Zeng, D.: Fine-grained opinion mining by integrating multiple review sources (2010) 0.33
    0.33100635 = sum of:
      0.33100635 = product of:
        1.3791932 = sum of:
          0.008339933 = weight(abstract_txt:with in 1105) [ClassicSimilarity], result of:
            0.008339933 = score(doc=1105,freq=1.0), product of:
              0.035209153 = queryWeight, product of:
                1.0920163 = boost
                2.5265954 = idf(docFreq=9191, maxDocs=42306)
                0.0127611775 = queryNorm
              0.23686832 = fieldWeight in 1105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.5265954 = idf(docFreq=9191, maxDocs=42306)
                0.09375 = fieldNorm(doc=1105)
          0.008989436 = weight(abstract_txt:that in 1105) [ClassicSimilarity], result of:
            0.008989436 = score(doc=1105,freq=1.0), product of:
              0.03987238 = queryWeight, product of:
                1.2992492 = boost
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.0127611775 = queryNorm
              0.2254552 = fieldWeight in 1105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.09375 = fieldNorm(doc=1105)
          0.12853013 = weight(abstract_txt:opinions in 1105) [ClassicSimilarity], result of:
            0.12853013 = score(doc=1105,freq=2.0), product of:
              0.13736114 = queryWeight, product of:
                1.5251702 = boost
                7.0575643 = idf(docFreq=98, maxDocs=42306)
                0.0127611775 = queryNorm
              0.9357096 = fieldWeight in 1105, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.0575643 = idf(docFreq=98, maxDocs=42306)
                0.09375 = fieldNorm(doc=1105)
          0.26494265 = weight(abstract_txt:mining in 1105) [ClassicSimilarity], result of:
            0.26494265 = score(doc=1105,freq=2.0), product of:
              0.32087478 = queryWeight, product of:
                4.037521 = boost
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.0127611775 = queryNorm
              0.8256886 = fieldWeight in 1105, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.09375 = fieldNorm(doc=1105)
          0.35167068 = weight(abstract_txt:sentiment in 1105) [ClassicSimilarity], result of:
            0.35167068 = score(doc=1105,freq=1.0), product of:
              0.48827943 = queryWeight, product of:
                4.9805946 = boost
                7.682392 = idf(docFreq=52, maxDocs=42306)
                0.0127611775 = queryNorm
              0.72022426 = fieldWeight in 1105, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.682392 = idf(docFreq=52, maxDocs=42306)
                0.09375 = fieldNorm(doc=1105)
          0.6167204 = weight(abstract_txt:opinion in 1105) [ClassicSimilarity], result of:
            0.6167204 = score(doc=1105,freq=2.0), product of:
              0.66820455 = queryWeight, product of:
                7.521874 = boost
                6.961336 = idf(docFreq=108, maxDocs=42306)
                0.0127611775 = queryNorm
              0.9229515 = fieldWeight in 1105, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.961336 = idf(docFreq=108, maxDocs=42306)
                0.09375 = fieldNorm(doc=1105)
        0.24 = coord(6/25)
    
  5. Ku, L.-W.; Ho, H.-W.; Chen, H.-H.: Opinion mining and relationship discovery using CopeOpi opinion analysis system (2009) 0.32
    0.31663758 = sum of:
      0.31663758 = product of:
        1.1308485 = sum of:
          0.0055599553 = weight(abstract_txt:with in 758) [ClassicSimilarity], result of:
            0.0055599553 = score(doc=758,freq=1.0), product of:
              0.035209153 = queryWeight, product of:
                1.0920163 = boost
                2.5265954 = idf(docFreq=9191, maxDocs=42306)
                0.0127611775 = queryNorm
              0.15791221 = fieldWeight in 758, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.5265954 = idf(docFreq=9191, maxDocs=42306)
                0.0625 = fieldNorm(doc=758)
          0.011985915 = weight(abstract_txt:that in 758) [ClassicSimilarity], result of:
            0.011985915 = score(doc=758,freq=4.0), product of:
              0.03987238 = queryWeight, product of:
                1.2992492 = boost
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.0127611775 = queryNorm
              0.30060694 = fieldWeight in 758, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.4048555 = idf(docFreq=10381, maxDocs=42306)
                0.0625 = fieldNorm(doc=758)
          0.08568676 = weight(abstract_txt:opinions in 758) [ClassicSimilarity], result of:
            0.08568676 = score(doc=758,freq=2.0), product of:
              0.13736114 = queryWeight, product of:
                1.5251702 = boost
                7.0575643 = idf(docFreq=98, maxDocs=42306)
                0.0127611775 = queryNorm
              0.6238064 = fieldWeight in 758, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.0575643 = idf(docFreq=98, maxDocs=42306)
                0.0625 = fieldNorm(doc=758)
          0.008718903 = weight(abstract_txt:information in 758) [ClassicSimilarity], result of:
            0.008718903 = score(doc=758,freq=1.0), product of:
              0.05727019 = queryWeight, product of:
                1.8424033 = boost
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.0127611775 = queryNorm
              0.15224156 = fieldWeight in 758, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.435865 = idf(docFreq=10064, maxDocs=42306)
                0.0625 = fieldNorm(doc=758)
          0.02182749 = weight(abstract_txt:analysis in 758) [ClassicSimilarity], result of:
            0.02182749 = score(doc=758,freq=1.0), product of:
              0.094387375 = queryWeight, product of:
                1.9990023 = boost
                3.7000692 = idf(docFreq=2842, maxDocs=42306)
                0.0127611775 = queryNorm
              0.23125432 = fieldWeight in 758, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7000692 = idf(docFreq=2842, maxDocs=42306)
                0.0625 = fieldNorm(doc=758)
          0.12489518 = weight(abstract_txt:mining in 758) [ClassicSimilarity], result of:
            0.12489518 = score(doc=758,freq=1.0), product of:
              0.32087478 = queryWeight, product of:
                4.037521 = boost
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.0127611775 = queryNorm
              0.38923338 = fieldWeight in 758, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.227734 = idf(docFreq=226, maxDocs=42306)
                0.0625 = fieldNorm(doc=758)
          0.8721743 = weight(abstract_txt:opinion in 758) [ClassicSimilarity], result of:
            0.8721743 = score(doc=758,freq=9.0), product of:
              0.66820455 = queryWeight, product of:
                7.521874 = boost
                6.961336 = idf(docFreq=108, maxDocs=42306)
                0.0127611775 = queryNorm
              1.3052505 = fieldWeight in 758, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.961336 = idf(docFreq=108, maxDocs=42306)
                0.0625 = fieldNorm(doc=758)
        0.28 = coord(7/25)
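
The score breakdowns above follow a TF-IDF pattern that can be checked by hand. The sketch below is a minimal reconstruction inferred from the printed figures, not the retrieval system's actual code. It assumes that each term score is queryWeight (boost × idf × queryNorm) times fieldWeight (sqrt(freq) × idf × fieldNorm), that idf is 1 + ln(maxDocs / (docFreq + 1)), and that the document score is the sum of term scores multiplied by coord (matched terms / query terms). The function names are hypothetical.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, as inferred from the listing (an assumption)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, boost: float, idf_value: float,
               query_norm: float, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    query_weight = boost * idf_value * query_norm          # query side
    field_weight = math.sqrt(freq) * idf_value * field_norm  # document side
    return query_weight * field_weight


def document_score(term_scores: list[float], matched: int, total: int) -> float:
    """Sum of term scores scaled by the coordination factor matched/total."""
    return sum(term_scores) * (matched / total)


# Values copied from the first entry (doc=459), term "survey".
QUERY_NORM = 0.0127611775
survey = term_score(freq=1.0, boost=1.0810413,
                    idf_value=idf(772, 42306),
                    query_norm=QUERY_NORM, field_norm=0.0625)

# The listing gives 1.6798295 as the sum of the nine term scores for doc=459.
overall = document_score([1.6798295], matched=9, total=25)

print(f"survey term score: {survey:.9f}")
print(f"overall score:     {overall:.8f}")
```

Results may differ from the listing in the last digits, since the listing appears to use single-precision arithmetic while Python uses double precision.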