Search (7 results, page 1 of 1)

  • author_ss:"Srinivasan, P."
  1. Qiu, X.Y.; Srinivasan, P.; Hu, Y.: Supervised learning models to predict firm performance with annual reports : an empirical study (2014) 0.05
    0.050719745 = product of:
      0.15215923 = sum of:
        0.15215923 = weight(_text_:systematic in 1205) [ClassicSimilarity], result of:
          0.15215923 = score(doc=1205,freq=4.0), product of:
            0.28397155 = queryWeight, product of:
              5.715473 = idf(docFreq=395, maxDocs=44218)
              0.049684696 = queryNorm
            0.5358256 = fieldWeight in 1205, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.715473 = idf(docFreq=395, maxDocs=44218)
              0.046875 = fieldNorm(doc=1205)
      0.33333334 = coord(1/3)
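    The explain tree above is plain TF-IDF arithmetic. A minimal sketch that reproduces its numbers, assuming Lucene's ClassicSimilarity formulas (tf = √freq, idf = 1 + ln(maxDocs/(docFreq+1)); the function name is illustrative):

    ```python
    import math

    def classic_similarity(freq, doc_freq, max_docs, field_norm, query_norm, coord):
        # Lucene ClassicSimilarity (TF-IDF); values mirror result 1's
        # "systematic" term in the explain tree above.
        idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 5.715473
        tf = math.sqrt(freq)                             # 2.0 for freq=4
        query_weight = idf * query_norm                  # 0.28397155
        field_weight = tf * idf * field_norm             # 0.5358256
        return query_weight * field_weight * coord

    score = classic_similarity(freq=4.0, doc_freq=395, max_docs=44218,
                               field_norm=0.046875, query_norm=0.049684696,
                               coord=1/3)
    print(score)  # ≈ 0.050719745, the document's displayed score
    ```

    The coord(1/3) factor downweights the score because only one of the three query clauses matched this document.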
    
    Abstract
    Text mining and machine learning methodologies have been applied toward knowledge discovery in several domains, such as biomedicine and business. Interestingly, in the business domain, the text mining and machine learning community has minimally explored company annual reports with their mandatory disclosures. In this study, we explore the question "How can annual reports be used to predict change in company performance from one year to the next?" from a text mining perspective. Our article contributes a systematic study of the potential of company mandatory disclosures from a computational viewpoint in the following aspects: (a) we characterize our research problem along distinct dimensions to gain a reasonably comprehensive understanding of the capacity of supervised learning methods in predicting change in company performance using annual reports, and (b) our findings from unbiased systematic experiments provide further evidence about the economic incentives faced by analysts in their stock recommendations and about speculation that analysts have access to more information when producing earnings forecasts.
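    The prediction task in this abstract is standard supervised text classification. A toy sketch, not the paper's pipeline: bag-of-words features from report text, labels for whether performance went up or down, and a trivial nearest-centroid scorer standing in for the learner (all data invented):

    ```python
    from collections import Counter

    # Hypothetical training pairs: (annual-report snippet, performance change)
    train = [
        ("revenue grew strong demand expanded margins", "up"),
        ("record profit growth new markets", "up"),
        ("losses widened restructuring charges weak demand", "down"),
        ("impairment declining sales litigation risk", "down"),
    ]

    centroids = {}
    for text, label in train:
        centroids.setdefault(label, Counter()).update(text.split())

    def predict(text):
        # Score each class by word overlap with its centroid.
        words = text.split()
        return max(centroids, key=lambda lbl: sum(centroids[lbl][w] for w in words))

    print(predict("profit growth and strong demand"))  # 'up'
    ```

    A real study would use richer features and a proper learner, but the shape of the problem — text in, direction-of-change label out — is the same.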
  2. Srinivasan, P.: Optimal document-indexing vocabulary for MEDLINE (1996) 0.02
    0.020983277 = product of:
      0.06294983 = sum of:
        0.06294983 = product of:
          0.12589966 = sum of:
            0.12589966 = weight(_text_:indexing in 6634) [ClassicSimilarity], result of:
              0.12589966 = score(doc=6634,freq=10.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.6619802 = fieldWeight in 6634, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=6634)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    The indexing vocabulary is an important determinant of success in text retrieval. Researchers have compared the effectiveness of indexing using free text and controlled vocabularies in a variety of text contexts. A number of studies have investigated the relative merits of free-text, MeSH and UMLS Metathesaurus indexing vocabularies for MEDLINE document indexing; these suggest that controlled vocabularies offer no advantages in retrieval performance over free text. Offers a detailed analysis of prior results and their underlying experimental designs. Offers results from a new experiment assessing 8 different retrieval strategies. Results indicate that MeSH does have an important role in text retrieval.
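    The free-text versus controlled-vocabulary distinction the abstract compares can be illustrated with a toy phrase-to-heading lookup (the mapping below is invented for illustration and is not the UMLS or MeSH API):

    ```python
    # Hypothetical MeSH-like mapping from free-text phrases to headings.
    mesh_map = {
        "heart attack": "Myocardial Infarction",
        "cancer": "Neoplasms",
    }

    def controlled_terms(text):
        # Controlled-vocabulary indexing: map surface phrases to headings.
        return [heading for phrase, heading in mesh_map.items()
                if phrase in text.lower()]

    def free_text_terms(text):
        # Free-text indexing: just the document's own words.
        return text.lower().split()

    print(controlled_terms("New therapy after a heart attack"))
    # ['Myocardial Infarction']
    ```

    The controlled index collapses synonymous phrasings onto one heading; the free-text index keeps every surface form, which is the trade-off the cited studies measure.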
  3. Srinivasan, P.; Ruiz, M.E.; Lam, W.: An investigation of indexing on the WWW (1996) 0.02
    0.018768014 = product of:
      0.05630404 = sum of:
        0.05630404 = product of:
          0.11260808 = sum of:
            0.11260808 = weight(_text_:indexing in 7424) [ClassicSimilarity], result of:
              0.11260808 = score(doc=7424,freq=8.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.5920931 = fieldWeight in 7424, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=7424)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Proposes a model that assists in understanding indexing on the WWW. It specifies key features of indexing strategies that are currently being used. Presents an experiment assessing the validity of Inverse Document Frequency (IDF) as a term weighting strategy for WWW documents. The experiment indicates that IDF scores are not stable in the heterogeneous and dynamic context of the WWW. Recommends further investigation to clarify the effectiveness of alternative indexing strategies for the WWW.
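    The instability claim is easy to see arithmetically: IDF depends on collection composition, so the same term's weight shifts as a crawl changes. A toy illustration with invented snapshot counts:

    ```python
    import math

    def idf(doc_freq, num_docs):
        # Classic inverse document frequency.
        return math.log(num_docs / doc_freq)

    # Same term, two crawls of a fast-changing web collection:
    early = idf(100, 10_000)    # term is rare in the earlier snapshot
    late = idf(4_000, 12_000)   # many newly crawled pages use the term
    print(early, late)
    ```

    On a stable corpus like MEDLINE these two values would be close; on the WWW the term's weight collapses between snapshots, which is the instability the experiment reports.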
  4. Ruiz, M.E.; Srinivasan, P.: Combining machine learning and hierarchical indexing structures for text categorization (2001) 0.01
    0.013270989 = product of:
      0.039812967 = sum of:
        0.039812967 = product of:
          0.079625934 = sum of:
            0.079625934 = weight(_text_:indexing in 1595) [ClassicSimilarity], result of:
              0.079625934 = score(doc=1595,freq=4.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.41867304 = fieldWeight in 1595, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1595)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper presents a method that exploits the hierarchical structure of an indexing vocabulary to guide the development and training of machine learning methods for automatic text categorization. We present the design of a hierarchical classifier based on the divide-and-conquer principle. The method is evaluated using backpropagation neural networks as the machine learning algorithm, which learn to assign MeSH categories to a subset of MEDLINE records. Comparisons with the traditional Rocchio algorithm adapted for text categorization, as well as with flat neural network classifiers, are provided. The results indicate that the use of hierarchical structures improves performance significantly.
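    The divide-and-conquer idea can be sketched without the neural networks: each internal node of the category tree has its own decision rule and only the matching subtrees are descended. A hypothetical sketch (the keyword rules stand in for trained per-node classifiers; the tree and labels are invented):

    ```python
    def classify(node, doc, labels):
        # Descend only into children whose node-level classifier fires;
        # leaves emit category labels (e.g. MeSH-like headings).
        for child in node["children"]:
            if child["match"](doc):
                if child["children"]:
                    classify(child, doc, labels)
                else:
                    labels.append(child["label"])
        return labels

    tree = {"children": [
        {"label": "Diseases", "match": lambda d: "disease" in d,
         "children": [
             {"label": "Virus Diseases", "match": lambda d: "virus" in d,
              "children": []},
         ]},
        {"label": "Chemicals", "match": lambda d: "chemical" in d,
         "children": []},
    ]}

    print(classify(tree, "a virus disease study", []))  # ['Virus Diseases']
    ```

    The payoff of the hierarchy is that each node's classifier sees a narrower, easier decision than one flat classifier over all categories.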
  5. Srinivasan, P.: Thesaurus construction (1992) 0.01
    0.011375135 = product of:
      0.034125403 = sum of:
        0.034125403 = product of:
          0.068250805 = sum of:
            0.068250805 = weight(_text_:indexing in 3504) [ClassicSimilarity], result of:
              0.068250805 = score(doc=3504,freq=4.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.3588626 = fieldWeight in 3504, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3504)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Thesauri are valuable structures for Information Retrieval systems. A thesaurus provides a precise and controlled vocabulary which serves to coordinate document indexing and document retrieval. In both indexing and retrieval, a thesaurus may be used to select the most appropriate terms. Additionally, the thesaurus can assist the searcher in reformulating search strategies if required. Examines the important features of thesauri; this should allow the reader to differentiate between thesauri. Next, a brief overview of the manual thesaurus construction process is given. 2 major approaches for automatic thesaurus construction have been selected for detailed examination: the first is thesaurus construction from collections of documents, and the 2nd is thesaurus construction by merging existing thesauri. These 2 methods were selected since they rely on statistical techniques alone and are also significantly different from each other. Programs written in the C language accompany the discussion of these approaches.
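    Statistical thesaurus construction from a document collection typically starts from term co-occurrence. A minimal sketch of one common association measure, the Dice coefficient, over an invented three-document collection:

    ```python
    from collections import Counter
    from itertools import combinations

    docs = [
        ["retrieval", "indexing", "thesaurus"],
        ["retrieval", "indexing", "vocabulary"],
        ["thesaurus", "vocabulary", "indexing"],
    ]

    term_counts = Counter()   # documents containing each term
    pair_counts = Counter()   # documents containing both terms of a pair
    for doc in docs:
        terms = set(doc)
        term_counts.update(terms)
        pair_counts.update(combinations(sorted(terms), 2))

    def dice(a, b):
        # Dice association: 2 * co-occurrences / (freq(a) + freq(b)).
        pair = tuple(sorted((a, b)))
        return 2 * pair_counts[pair] / (term_counts[a] + term_counts[b])

    print(dice("retrieval", "indexing"))  # 2*2 / (2+3) = 0.8
    ```

    Term pairs whose association exceeds a threshold become candidate thesaurus relations; merging existing thesauri, the chapter's second approach, works on the vocabularies themselves rather than on document statistics.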
  6. Srinivasan, P.: On generalizing the Two-Poisson Model (1990) 0.01
    0.009384007 = product of:
      0.02815202 = sum of:
        0.02815202 = product of:
          0.05630404 = sum of:
            0.05630404 = weight(_text_:indexing in 2880) [ClassicSimilarity], result of:
              0.05630404 = score(doc=2880,freq=2.0), product of:
                0.19018644 = queryWeight, product of:
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.049684696 = queryNorm
                0.29604656 = fieldWeight in 2880, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.8278677 = idf(docFreq=2614, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2880)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Automatic indexing is one of the important functions of a modern document retrieval system. Numerous techniques for this function have been proposed in the literature ranging from purely statistical to linguistically complex mechanisms. Most result from examining properties of terms. Examines term distribution within the framework of the Poisson models. Specifically examines the effectiveness of the Two-Poisson and the Three-Poisson model to see if generalisation results in increased effectiveness. The results show that the Two-Poisson model is only moderately effective in identifying index terms. In addition, generalisation to the Three-Poisson does not give any additional power. The only Poisson model which consistently works well is the basic One-Poisson model. Also discusses term distribution information.
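    The Two-Poisson model the abstract tests is a two-component mixture: a term's within-document frequency is drawn from an "elite" class (documents about the term's topic) with some probability, else from a non-elite class. A minimal sketch with invented parameters:

    ```python
    import math

    def poisson(lam, k):
        return math.exp(-lam) * lam ** k / math.factorial(k)

    def two_poisson(k, alpha, lam_elite, lam_non):
        # Mixture of two Poissons: elite documents use the term often
        # (lam_elite), non-elite documents rarely (lam_non).
        return alpha * poisson(lam_elite, k) + (1 - alpha) * poisson(lam_non, k)

    # Probability of observing the term 3 times in a document:
    print(two_poisson(3, alpha=0.2, lam_elite=4.0, lam_non=0.5))  # ≈ 0.0492
    ```

    Generalizing to a Three-Poisson model simply adds a third component to the sum; the abstract's finding is that the extra component buys no additional indexing power.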
  7. Bhattacharya, S.; Yang, C.; Srinivasan, P.; Boynton, B.: Perceptions of presidential candidates' personalities in twitter (2016) 0.01
    0.005609659 = product of:
      0.016828977 = sum of:
        0.016828977 = product of:
          0.033657953 = sum of:
            0.033657953 = weight(_text_:22 in 2635) [ClassicSimilarity], result of:
              0.033657953 = score(doc=2635,freq=2.0), product of:
                0.17398734 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.049684696 = queryNorm
                0.19345059 = fieldWeight in 2635, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2635)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 1.2016 11:25:47