Document (#36447)

Author
Seki, K.
Uehara, K.
Title
Opinionated document retrieval using subjective triggers
Source
Journal of the American Society for Information Science and Technology. 62(2011) no.5, p.861-876
Year
2011
Abstract
This article proposes a novel application of a statistical language model to opinionated document retrieval targeting weblogs (blogs). In particular, we explore the use of the trigger model, originally developed for incorporating distant word dependencies, to model the characteristics of personal opinions that cannot be properly modeled by standard n-grams. Our primary assumption is that a subjective opinion has two constituents: the subject of the opinion (the object that the opinion is about) and a subjective expression. The former is regarded as a triggering word and the latter as a triggered word. We automatically identify these subjective trigger patterns to build a language model from a corpus of product customer reviews. Experimental results on the Text Retrieval Conference Blog track test collections show that, when used for reranking initial search results, our proposed model significantly improves opinionated document retrieval. In addition, we report on an experiment on dynamic adaptation of the model to a given query, which proves effective for most of the difficult queries categorized under politics and organizations. We also demonstrate that, without any modification to the proposed model itself, it can be effectively applied to polarized opinion retrieval.
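
The triggering/triggered word pairing described in the abstract can be illustrated with a small sketch. The abstract does not say how the authors identify or score trigger patterns, so everything below is an assumption made for illustration: pairs are scored by pointwise mutual information over sentence-level co-occurrence, and the function name `trigger_pair_scores`, the toy sentences, and the two word sets are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def trigger_pair_scores(sentences, subjects, expressions):
    """Score (triggering word, triggered word) pairs by sentence-level PMI.

    Assumption for illustration only: the source abstract does not specify
    the scoring function used to select subjective trigger patterns.
    """
    n = len(sentences)
    word_count = Counter()   # number of sentences containing each word
    pair_count = Counter()   # number of sentences containing both words
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        word_count.update(tokens)
        # Pair every candidate subject with every candidate subjective
        # expression that occurs in the same sentence.
        for s, e in product(tokens & subjects, tokens & expressions):
            pair_count[(s, e)] += 1
    # PMI = log( P(s, e) / (P(s) * P(e)) ), with sentence-level estimates.
    return {
        (s, e): math.log((c * n) / (word_count[s] * word_count[e]))
        for (s, e), c in pair_count.items()
    }


# Hypothetical toy review corpus.
reviews = [
    "the camera is great",
    "the battery is terrible",
    "great camera overall",
    "the screen is fine",
]
scores = trigger_pair_scores(
    reviews,
    subjects={"camera", "battery", "screen"},
    expressions={"great", "terrible", "fine"},
)
```

Pairs that never co-occur in a sentence, such as ("camera", "terrible") here, receive no score at all; a real language model would need smoothing to handle them.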

Similar documents (content)

  1. Belbachir, F.; Boughanem, M.: Using language models to improve opinion detection (2018) 0.48
    0.4827958 = sum of:
      0.4827958 = product of:
        1.508737 = sum of:
          0.049255174 = weight(abstract_txt:blog in 5044) [ClassicSimilarity], result of:
            0.049255174 = score(doc=5044,freq=1.0), product of:
              0.11598956 = queryWeight, product of:
                1.0364201 = boost
                7.7650614 = idf(docFreq=50, maxDocs=44218)
                0.014412462 = queryNorm
              0.4246518 = fieldWeight in 5044, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7650614 = idf(docFreq=50, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5044)
          0.034412123 = weight(abstract_txt:language in 5044) [ClassicSimilarity], result of:
            0.034412123 = score(doc=5044,freq=5.0), product of:
              0.06728919 = queryWeight, product of:
                1.1163851 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.014412462 = queryNorm
              0.5114064 = fieldWeight in 5044, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5044)
          0.009895977 = weight(abstract_txt:that in 5044) [ClassicSimilarity], result of:
            0.009895977 = score(doc=5044,freq=2.0), product of:
              0.054001205 = queryWeight, product of:
                1.5812958 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.014412462 = queryNorm
              0.18325473 = fieldWeight in 5044, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5044)
          0.055819664 = weight(abstract_txt:document in 5044) [ClassicSimilarity], result of:
            0.055819664 = score(doc=5044,freq=5.0), product of:
              0.106339075 = queryWeight, product of:
                1.7188321 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.014412462 = queryNorm
              0.5249215 = fieldWeight in 5044, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5044)
          0.054072883 = weight(abstract_txt:retrieval in 5044) [ClassicSimilarity], result of:
            0.054072883 = score(doc=5044,freq=6.0), product of:
              0.11615652 = queryWeight, product of:
                2.319173 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.014412462 = queryNorm
              0.46551743 = fieldWeight in 5044, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5044)
          0.1685353 = weight(abstract_txt:subjective in 5044) [ClassicSimilarity], result of:
            0.1685353 = score(doc=5044,freq=2.0), product of:
              0.33183455 = queryWeight, product of:
                3.5060427 = boost
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.014412462 = queryNorm
              0.50788957 = fieldWeight in 5044, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5044)
          0.4740129 = weight(abstract_txt:opinion in 5044) [ClassicSimilarity], result of:
            0.4740129 = score(doc=5044,freq=12.0), product of:
              0.36386254 = queryWeight, product of:
                3.6713438 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.014412462 = queryNorm
              1.3027252 = fieldWeight in 5044, product of:
                3.4641016 = tf(freq=12.0), with freq of:
                  12.0 = termFreq=12.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5044)
          0.66273296 = weight(abstract_txt:opinionated in 5044) [ClassicSimilarity], result of:
            0.66273296 = score(doc=5044,freq=6.0), product of:
              0.52079487 = queryWeight, product of:
                3.8038237 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.014412462 = queryNorm
              1.2725413 = fieldWeight in 5044, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5044)
        0.32 = coord(8/25)
    
  2. Ku, L.-W.; Chen, H.-H.: Mining opinions from the Web : beyond relevance retrieval (2007) 0.19
    0.18837069 = sum of:
      0.18837069 = product of:
        0.7848779 = sum of:
          0.023547666 = weight(abstract_txt:proposed in 605) [ClassicSimilarity], result of:
            0.023547666 = score(doc=605,freq=1.0), product of:
              0.08173943 = queryWeight, product of:
                1.2304308 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.014412462 = queryNorm
              0.2880821 = fieldWeight in 605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.0625 = fieldNorm(doc=605)
          0.028529499 = weight(abstract_txt:document in 605) [ClassicSimilarity], result of:
            0.028529499 = score(doc=605,freq=1.0), product of:
              0.106339075 = queryWeight, product of:
                1.7188321 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.014412462 = queryNorm
              0.26828802 = fieldWeight in 605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=605)
          0.05791938 = weight(abstract_txt:word in 605) [ClassicSimilarity], result of:
            0.05791938 = score(doc=605,freq=1.0), product of:
              0.17049542 = queryWeight, product of:
                2.1764233 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.014412462 = queryNorm
              0.33971223 = fieldWeight in 605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=605)
          0.025228756 = weight(abstract_txt:retrieval in 605) [ClassicSimilarity], result of:
            0.025228756 = score(doc=605,freq=1.0), product of:
              0.11615652 = queryWeight, product of:
                2.319173 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.014412462 = queryNorm
              0.21719621 = fieldWeight in 605, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=605)
          0.2359003 = weight(abstract_txt:subjective in 605) [ClassicSimilarity], result of:
            0.2359003 = score(doc=605,freq=3.0), product of:
              0.33183455 = queryWeight, product of:
                3.5060427 = boost
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.014412462 = queryNorm
              0.7108973 = fieldWeight in 605, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.0625 = fieldNorm(doc=605)
          0.41375235 = weight(abstract_txt:opinion in 605) [ClassicSimilarity], result of:
            0.41375235 = score(doc=605,freq=7.0), product of:
              0.36386254 = queryWeight, product of:
                3.6713438 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.014412462 = queryNorm
              1.1371117 = fieldWeight in 605, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.0625 = fieldNorm(doc=605)
        0.24 = coord(6/25)
    
  3. Guo, L.; Wan, X.: Exploiting syntactic and semantic relationships between terms for opinion retrieval (2012) 0.18
    0.17546849 = sum of:
      0.17546849 = product of:
        0.73111874 = sum of:
          0.029434584 = weight(abstract_txt:proposed in 492) [ClassicSimilarity], result of:
            0.029434584 = score(doc=492,freq=1.0), product of:
              0.08173943 = queryWeight, product of:
                1.2304308 = boost
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.014412462 = queryNorm
              0.36010262 = fieldWeight in 492, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6093135 = idf(docFreq=1196, maxDocs=44218)
                0.078125 = fieldNorm(doc=492)
          0.009996447 = weight(abstract_txt:that in 492) [ClassicSimilarity], result of:
            0.009996447 = score(doc=492,freq=1.0), product of:
              0.054001205 = queryWeight, product of:
                1.5812958 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.014412462 = queryNorm
              0.18511525 = fieldWeight in 492, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.078125 = fieldNorm(doc=492)
          0.035661872 = weight(abstract_txt:document in 492) [ClassicSimilarity], result of:
            0.035661872 = score(doc=492,freq=1.0), product of:
              0.106339075 = queryWeight, product of:
                1.7188321 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.014412462 = queryNorm
              0.33536002 = fieldWeight in 492, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.078125 = fieldNorm(doc=492)
          0.044598557 = weight(abstract_txt:retrieval in 492) [ClassicSimilarity], result of:
            0.044598557 = score(doc=492,freq=2.0), product of:
              0.11615652 = queryWeight, product of:
                2.319173 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.014412462 = queryNorm
              0.38395226 = fieldWeight in 492, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.078125 = fieldNorm(doc=492)
          0.51719046 = weight(abstract_txt:opinion in 492) [ClassicSimilarity], result of:
            0.51719046 = score(doc=492,freq=7.0), product of:
              0.36386254 = queryWeight, product of:
                3.6713438 = boost
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.014412462 = queryNorm
              1.4213896 = fieldWeight in 492, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.8766055 = idf(docFreq=123, maxDocs=44218)
                0.078125 = fieldNorm(doc=492)
          0.0942368 = weight(abstract_txt:model in 492) [ClassicSimilarity], result of:
            0.0942368 = score(doc=492,freq=2.0), product of:
              0.21396992 = queryWeight, product of:
                3.724361 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.014412462 = queryNorm
              0.44042078 = fieldWeight in 492, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.078125 = fieldNorm(doc=492)
        0.24 = coord(6/25)
    
  4. Lhadj, L.S.; Boughanem, M.; Amrouche, K.: Enhancing information retrieval through concept-based language modeling and semantic smoothing (2016) 0.14
    0.14143632 = sum of:
      0.14143632 = product of:
        0.4419885 = sum of:
          0.058086723 = weight(abstract_txt:dependencies in 3221) [ClassicSimilarity], result of:
            0.058086723 = score(doc=3221,freq=1.0), product of:
              0.11844251 = queryWeight, product of:
                1.0473219 = boost
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.014412462 = queryNorm
              0.49042124 = fieldWeight in 3221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.84674 = idf(docFreq=46, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.06629961 = weight(abstract_txt:grams in 3221) [ClassicSimilarity], result of:
            0.06629961 = score(doc=3221,freq=1.0), product of:
              0.12935911 = queryWeight, product of:
                1.094523 = boost
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.014412462 = queryNorm
              0.5125237 = fieldWeight in 3221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.200379 = idf(docFreq=32, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.030463448 = weight(abstract_txt:language in 3221) [ClassicSimilarity], result of:
            0.030463448 = score(doc=3221,freq=3.0), product of:
              0.06728919 = queryWeight, product of:
                1.1163851 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.014412462 = queryNorm
              0.45272425 = fieldWeight in 3221, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.015994314 = weight(abstract_txt:that in 3221) [ClassicSimilarity], result of:
            0.015994314 = score(doc=3221,freq=4.0), product of:
              0.054001205 = queryWeight, product of:
                1.5812958 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.014412462 = queryNorm
              0.2961844 = fieldWeight in 3221, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.028529499 = weight(abstract_txt:document in 3221) [ClassicSimilarity], result of:
            0.028529499 = score(doc=3221,freq=1.0), product of:
              0.106339075 = queryWeight, product of:
                1.7188321 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.014412462 = queryNorm
              0.26828802 = fieldWeight in 3221, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.1003193 = weight(abstract_txt:word in 3221) [ClassicSimilarity], result of:
            0.1003193 = score(doc=3221,freq=3.0), product of:
              0.17049542 = queryWeight, product of:
                2.1764233 = boost
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.014412462 = queryNorm
              0.5883988 = fieldWeight in 3221, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.4353957 = idf(docFreq=523, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.035678845 = weight(abstract_txt:retrieval in 3221) [ClassicSimilarity], result of:
            0.035678845 = score(doc=3221,freq=2.0), product of:
              0.11615652 = queryWeight, product of:
                2.319173 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.014412462 = queryNorm
              0.3071618 = fieldWeight in 3221, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
          0.106616765 = weight(abstract_txt:model in 3221) [ClassicSimilarity], result of:
            0.106616765 = score(doc=3221,freq=4.0), product of:
              0.21396992 = queryWeight, product of:
                3.724361 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.014412462 = queryNorm
              0.49827924 = fieldWeight in 3221, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0625 = fieldNorm(doc=3221)
        0.32 = coord(8/25)
    
  5. Ye, Z.; He, B.; Wang, L.; Luo, T.: Utilizing term proximity for blog post retrieval (2013) 0.13
    0.13289557 = sum of:
      0.13289557 = product of:
        0.6644778 = sum of:
          0.05629163 = weight(abstract_txt:blog in 1126) [ClassicSimilarity], result of:
            0.05629163 = score(doc=1126,freq=1.0), product of:
              0.11598956 = queryWeight, product of:
                1.0364201 = boost
                7.7650614 = idf(docFreq=50, maxDocs=44218)
                0.014412462 = queryNorm
              0.48531634 = fieldWeight in 1126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.7650614 = idf(docFreq=50, maxDocs=44218)
                0.0625 = fieldNorm(doc=1126)
          0.007997157 = weight(abstract_txt:that in 1126) [ClassicSimilarity], result of:
            0.007997157 = score(doc=1126,freq=1.0), product of:
              0.054001205 = queryWeight, product of:
                1.5812958 = boost
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.014412462 = queryNorm
              0.1480922 = fieldWeight in 1126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.3694751 = idf(docFreq=11241, maxDocs=44218)
                0.0625 = fieldNorm(doc=1126)
          0.043697488 = weight(abstract_txt:retrieval in 1126) [ClassicSimilarity], result of:
            0.043697488 = score(doc=1126,freq=3.0), product of:
              0.11615652 = queryWeight, product of:
                2.319173 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.014412462 = queryNorm
              0.37619486 = fieldWeight in 1126, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=1126)
          0.119201176 = weight(abstract_txt:model in 1126) [ClassicSimilarity], result of:
            0.119201176 = score(doc=1126,freq=5.0), product of:
              0.21396992 = queryWeight, product of:
                3.724361 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.014412462 = queryNorm
              0.55709314 = fieldWeight in 1126, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0625 = fieldNorm(doc=1126)
          0.43729034 = weight(abstract_txt:opinionated in 1126) [ClassicSimilarity], result of:
            0.43729034 = score(doc=1126,freq=2.0), product of:
              0.52079487 = queryWeight, product of:
                3.8038237 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.014412462 = queryNorm
              0.83965945 = fieldWeight in 1126, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.0625 = fieldNorm(doc=1126)
        0.2 = coord(5/25)
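
The score breakdowns above follow the explain format of Lucene's ClassicSimilarity (TF-IDF). As a rough sketch of how each listed number is derived, the following reproduces the "blog" term score and the final score of the first similar document. The function names are hypothetical; the formulas (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1)), and coord as the fraction of query terms matched) are those of ClassicSimilarity. Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math


def classic_idf(doc_freq, max_docs):
    # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm            # "queryWeight" line
    field_weight = math.sqrt(freq) * idf * field_norm  # "fieldWeight" line
    return query_weight * field_weight


def coord(matched_terms, query_terms):
    # Fraction of query terms found in the document, e.g. coord(8/25).
    return matched_terms / query_terms


# Values copied from the "blog" entry of the first similar document (doc 5044).
blog_score = classic_term_score(
    freq=1.0, doc_freq=50, max_docs=44218,
    boost=1.0364201, query_norm=0.014412462, field_norm=0.0546875,
)

# Final score = (sum of matching term scores) * coord.
final_score = 1.508737 * coord(8, 25)

print(round(blog_score, 6))   # close to the listed 0.049255174
print(round(final_score, 4))  # close to the listed 0.4827958
```

The coord factor explains why document 4, which matches 8 of 25 query terms, is penalized less than document 5, which matches only 5.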