Search (8 results, page 1 of 1)

Ku, L.-W.; Ho, H.-W.; Chen, H.-H.: Opinion mining and relationship discovery using CopeOpi opinion analysis system (2009) 0.08
```
0.080165744 = product of:
  0.16033149 = sum of:
    0.16033149 = sum of:
      0.12601131 = weight(_text_:mining in 2938) [ClassicSimilarity], result of:
        0.12601131 = score(doc=2938,freq=4.0), product of:
          0.28585905 = queryWeight, product of:
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.05066224 = queryNorm
          0.44081625 = fieldWeight in 2938, product of:
            2.0 = tf(freq=4.0), with freq of:
              4.0 = termFreq=4.0
            5.642448 = idf(docFreq=425, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2938)
      0.034320172 = weight(_text_:22 in 2938) [ClassicSimilarity], result of:
        0.034320172 = score(doc=2938,freq=2.0), product of:
          0.17741053 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.05066224 = queryNorm
          0.19345059 = fieldWeight in 2938, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.0390625 = fieldNorm(doc=2938)
  0.5 = coord(1/2)
```
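The figures in the explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function and variable names are illustrative and not part of the catalog's output. Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math

def idf(doc_freq, max_docs):
    # ClassicSimilarity inverse document frequency (assumed formula).
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree.
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = term_idf * query_norm
    field_weight = tf * term_idf * field_norm
    return query_weight * field_weight

# Values copied from the explain output for doc 2938.
QUERY_NORM = 0.05066224
MAX_DOCS = 44218
FIELD_NORM = 0.0390625

mining = term_score(4.0, 425, MAX_DOCS, QUERY_NORM, FIELD_NORM)
term_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)

# coord(1/2) scales the sum of the matching term scores by 0.5.
total = 0.5 * (mining + term_22)
print(round(mining, 6), round(term_22, 6), round(total, 6))
```

The same arithmetic applies to the other results on this page; those with two `coord(1/2)` lines multiply the summed term scores by 0.5 twice.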
Abstract

We present CopeOpi, an opinion-analysis system, which extracts from the Web opinions about specific targets, summarizes the polarity and strength of these opinions, and tracks opinion variations over time. Objects that yield similar opinion tendencies over a certain time period may be correlated due to the latent causal events. CopeOpi discovers relationships among objects based on their opinion-tracking plots and collocations. Event bursts are detected from the tracking plots, and the strength of opinion relationships is determined by the coverage of these plots. To evaluate opinion mining, we use the NTCIR corpus annotated with opinion information at sentence and document levels. CopeOpi achieves sentence- and document-level f-measures of 62% and 74%. For relationship discovery, we collected 1.3M economics-related documents from 93 Web sources over 22 months, and analyzed collocation-based, opinion-based, and hybrid models. We consider as correlated company pairs that demonstrate similar stock-price variations, and selected these as the gold standard for evaluation. Results show that opinion-based and collocation-based models complement each other, and that integrated models perform the best. The top 25, 50, and 100 pairs discovered achieve precision rates of 1, 0.92, and 0.79, respectively.
Ku, L.-W.; Chen, H.-H.: Mining opinions from the Web : beyond relevance retrieval (2007) 0.05
```
0.0545645 = product of:
  0.109129 = sum of:
    0.109129 = product of:
      0.218258 = sum of:
        0.218258 = weight(_text_:mining in 605) [ClassicSimilarity], result of:
          0.218258 = score(doc=605,freq=12.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.7635161 = fieldWeight in 605, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=605)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Documents discussing public affairs, common themes, interesting products, and so on, are reported and distributed on the Web. Positive and negative opinions embedded in documents are useful references and feedbacks for governments to improve their services, for companies to market their products, and for customers to purchase their objects. Web opinion mining aims to extract, summarize, and track various aspects of subjective information on the Web. Mining subjective information enables traditional information retrieval (IR) systems to retrieve more data from human viewpoints and provide information with finer granularity. Opinion extraction identifies opinion holders, extracts the relevant opinion sentences, and decides their polarities. Opinion summarization recognizes the major events embedded in documents and summarizes the supportive and the nonsupportive evidence. Opinion tracking captures subjective information from various genres and monitors the developments of opinions from spatial and temporal dimensions. To demonstrate and evaluate the proposed opinion mining algorithms, news and bloggers' articles are adopted. Documents in the evaluation corpora are tagged in different granularities from words, sentences to documents. In the experiments, positive and negative sentiment words and their weights are mined on the basis of Chinese word structures. The f-measure is 73.18% and 63.75% for verbs and nouns, respectively. Utilizing the sentiment words mined together with topical words, we achieve f-measure 62.16% at the sentence level and 74.37% at the document level.

Footnote

Contribution to a special topic section "Mining Web resources for enhancing information retrieval"

Theme

Data Mining
Lee, L.-H.; Chen, H.-H.: Mining search intents for collaborative cyberporn filtering (2012) 0.03
```
0.031502828 = product of:
  0.063005656 = sum of:
    0.063005656 = product of:
      0.12601131 = sum of:
        0.12601131 = weight(_text_:mining in 4988) [ClassicSimilarity], result of:
          0.12601131 = score(doc=4988,freq=4.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.44081625 = fieldWeight in 4988, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4988)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This article presents a search-intent-based method to generate pornographic blacklists for collaborative cyberporn filtering. A novel porn-detection framework that can find newly appearing pornographic web pages by mining search query logs is proposed. First, suspected queries are identified along with their clicked URLs by an automatically constructed lexicon. Then, a candidate URL is determined if the number of clicks satisfies majority voting rules. Finally, a candidate whose URL contains at least one categorical keyword will be included in a blacklist. Several experiments are conducted on an MSN search porn dataset to demonstrate the effectiveness of our method. The resulting blacklist generated by our search-intent-based method achieves high precision (0.701) while maintaining a favorably low false-positive rate (0.086). The experiments of a real-life filtering simulation reveal that our proposed method with its accumulative update strategy can achieve 44.15% of a macro-averaging blocking rate, when the update frequency is set to 1 day. In addition, the overblocking rates are less than 9% with time change due to the strong advantages of our search-intent-based method. This user-behavior-oriented method can be easily applied to search engines for incorporating only implicit collective intelligence from query logs without other efforts. In practice, it is complementary to intelligent content analysis for keeping up with the changing trails of objectionable websites from users' perspectives.
Huang, H.-H.; Wang, J.-J.; Chen, H.-H.: Implicit opinion analysis : extraction and polarity labelling (2017) 0.03
```
0.031186208 = product of:
  0.062372416 = sum of:
    0.062372416 = product of:
      0.12474483 = sum of:
        0.12474483 = weight(_text_:mining in 3820) [ClassicSimilarity], result of:
          0.12474483 = score(doc=3820,freq=2.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.4363858 = fieldWeight in 3820, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3820)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Opinion words are crucial information for sentiment analysis. In some text, however, opinion words are absent or highly ambiguous. The resulting implicit opinions are more difficult to extract and label than explicit ones. In this paper, cutting-edge machine-learning approaches - deep neural network and word-embedding - are adopted for implicit opinion mining at the snippet and clause levels. Hotel reviews written in Chinese are collected and annotated as the experimental data set. Results show the convolutional neural network models not only outperform traditional support vector machine models, but also capture hidden knowledge within the raw text. The strength of word-embedding is also analyzed.

Lee, L.-H.; Juan, Y.-C.; Tseng, W.-L.; Chen, H.-H.; Tseng, Y.-H.: Mining browsing behaviors for objectionable content filtering (2015) 0.02

0.022275863 = product of:
  0.044551726 = sum of:
    0.044551726 = product of:
      0.08910345 = sum of:
        0.08910345 = weight(_text_:mining in 1818) [ClassicSimilarity], result of:
          0.08910345 = score(doc=1818,freq=2.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.31170416 = fieldWeight in 1818, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1818)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Tsai, M.-F.; Chen, H.-H.; Wang, Y.-T.: Learning a merge model for multilingual information retrieval (2011) 0.02

0.022275863 = product of:
  0.044551726 = sum of:
    0.044551726 = product of:
      0.08910345 = sum of:
        0.08910345 = weight(_text_:mining in 2750) [ClassicSimilarity], result of:
          0.08910345 = score(doc=2750,freq=2.0), product of:
            0.28585905 = queryWeight, product of:
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.05066224 = queryNorm
            0.31170416 = fieldWeight in 2750, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.642448 = idf(docFreq=425, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2750)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Content: Contribution to a special topic section "Managing and Mining Multilingual Documents". Cf.: 10.1016/j.ipm.2009.12.002.

Chen, H.-H.; Lin, W.-C.; Yang, C.; Lin, W.-H.: Translating-transliterating named entities for multilingual information access (2006) 0.01

0.012012059 = product of:
  0.024024118 = sum of:
    0.024024118 = product of:
      0.048048235 = sum of:
        0.048048235 = weight(_text_:22 in 1080) [ClassicSimilarity], result of:
          0.048048235 = score(doc=1080,freq=2.0), product of:
            0.17741053 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05066224 = queryNorm
            0.2708308 = fieldWeight in 1080, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1080)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 4.6.2006 19:52:22

Bian, G.-W.; Chen, H.-H.: Cross-language information access to multilingual collections on the Internet (2000) 0.01

0.01029605 = product of:
  0.0205921 = sum of:
    0.0205921 = product of:
      0.0411842 = sum of:
        0.0411842 = weight(_text_:22 in 4436) [ClassicSimilarity], result of:
          0.0411842 = score(doc=4436,freq=2.0), product of:
            0.17741053 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05066224 = queryNorm
            0.23214069 = fieldWeight in 4436, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=4436)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Date: 16.2.2000 14:22:39
