Search (2 results, page 1 of 1)

Malo, P.; Sinha, A.; Wallenius, J.; Korhonen, P.: Concept-based document classification using Wikipedia and value function (2011) 0.00
```
0.0030444188 = product of:
  0.0060888375 = sum of:
    0.0060888375 = product of:
      0.012177675 = sum of:
        0.012177675 = weight(_text_:a in 4948) [ClassicSimilarity], result of:
          0.012177675 = score(doc=4948,freq=18.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.22931081 = fieldWeight in 4948, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=4948)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

In this article, we propose a new concept-based method for document classification. The conceptual knowledge associated with the words is drawn from Wikipedia. The purpose is to utilize the abundant semantic relatedness information available in Wikipedia in an efficient value function-based query learning algorithm. The procedure learns the value function by solving a simple linear programming problem formulated using the training documents. The learning involves a step-wise iterative process that helps in generating a value function with an appropriate set of concepts (dimensions) chosen from a collection of concepts. Once the value function is formulated, it is utilized to make a decision between relevance and irrelevance. The value assigned to a particular document from the value function can be further used to rank the documents according to their relevance. Reuters newswire documents have been used to evaluate the efficacy of the procedure. An extensive comparison with other frameworks has been performed. The results are promising.

Type

a
Malo, P.; Sinha, A.; Korhonen, P.; Wallenius, J.; Takala, P.: Good debt or bad debt : detecting semantic orientations in economic texts (2014) 0.00
```
0.0023919214 = product of:
  0.0047838427 = sum of:
    0.0047838427 = product of:
      0.009567685 = sum of:
        0.009567685 = weight(_text_:a in 1226) [ClassicSimilarity], result of:
          0.009567685 = score(doc=1226,freq=16.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.18016359 = fieldWeight in 1226, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1226)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The use of robo-readers to analyze news texts is an emerging technology trend in computational finance. Recent research has developed sophisticated financial polarity lexicons for investigating how financial sentiments relate to future company performance. However, based on experience from fields that commonly analyze sentiment, it is well known that the overall semantic orientation of a sentence may differ from that of individual words. This article investigates how semantic orientations can be better detected in financial and economic news by accommodating the overall phrase-structure information and domain-specific use of language. Our three main contributions are the following: (a) a human-annotated finance phrase bank that can be used for training and evaluating alternative models; (b) a technique to enhance financial lexicons with attributes that help to identify expected direction of events that affect sentiment; and (c) a linearized phrase-structure model for detecting contextual semantic orientations in economic texts. The relevance of the newly added lexicon features and the benefit of using the proposed learning algorithm are demonstrated in a comparative study against general sentiment models as well as the popular word frequency models used in recent financial studies. The proposed framework is parsimonious and avoids the explosion in feature space caused by the use of conventional n-gram features.

Type

a

Search (2 results, page 1 of 1)

Authors

Themes