Search (14 results, page 1 of 1)

Melo, P.F.; Dalip, D.H.; Junior, M.M.; Gonçalves, M.A.; Benevenuto, F.: 10SENT : a stable sentiment analysis method based on the combination of off-the-shelf approaches (2019) 0.05

0.053216215 = product of:
  0.088693686 = sum of:
    0.022488397 = weight(_text_:technology in 4990) [ClassicSimilarity], result of:
      0.022488397 = score(doc=4990,freq=2.0), product of:
        0.13667917 = queryWeight, product of:
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.04589033 = queryNorm
        0.16453418 = fieldWeight in 4990, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4990)
    0.04031018 = weight(_text_:social in 4990) [ClassicSimilarity], result of:
      0.04031018 = score(doc=4990,freq=2.0), product of:
        0.18299131 = queryWeight, product of:
          3.9875789 = idf(docFreq=2228, maxDocs=44218)
          0.04589033 = queryNorm
        0.22028469 = fieldWeight in 4990, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.9875789 = idf(docFreq=2228, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4990)
    0.025895113 = product of:
      0.051790226 = sum of:
        0.051790226 = weight(_text_:aspects in 4990) [ClassicSimilarity], result of:
          0.051790226 = score(doc=4990,freq=2.0), product of:
            0.20741826 = queryWeight, product of:
              4.5198684 = idf(docFreq=1308, maxDocs=44218)
              0.04589033 = queryNorm
            0.2496898 = fieldWeight in 4990, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.5198684 = idf(docFreq=1308, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4990)
      0.5 = coord(1/2)
  0.6 = coord(3/5)

Abstract: Sentiment analysis has become a very important tool for analysis of social media data. There are several methods developed, covering distinct aspects of the problem and disparate strategies. However, no single technique fits well in all cases or for all data sources. Supervised approaches may be able to adapt to specific situations, but require manually labeled training, which is very cumbersome and expensive to acquire, mainly for a new application. In this context, we propose to combine several popular and effective state-of-the-practice sentiment analysis methods by means of an unsupervised bootstrapped strategy. One of our main goals is to reduce the large variability (low stability) of the unsupervised methods across different domains. The experimental results demonstrate that our combined method (aka, 10SENT) improves the effectiveness of the classification task, considering thirteen different data sets. Also, it tackles the key problem of cross-domain low stability and produces the best (or close to best) results in almost all considered contexts, without any additional costs (e.g., manual labeling). Finally, we also investigate a transfer learning approach for sentiment analysis to gather additional (unsupervised) information for the proposed approach, and we show the potential of this technique to improve our results.
Source: Journal of the Association for Information Science and Technology. 70(2019) no.3, S.242-255
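
Note on the score explanation: the breakdown for this record (doc 4990) can be reproduced by hand. The sketch below assumes the usual ClassicSimilarity (TF-IDF) definitions: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), queryWeight = idf * queryNorm, and fieldWeight = tf * idf * fieldNorm. The function and constant names are illustrative and are not part of the catalog. queryNorm and fieldNorm are copied from the listing rather than derived. The search engine works in single precision, so the last digits may differ slightly.

```python
import math

# Values copied from the explain output for doc 4990.
MAX_DOCS = 44218
QUERY_NORM = 0.04589033
FIELD_NORM = 0.0390625


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency, assuming the classic TF-IDF formula."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int) -> float:
    """Score of one query term: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM
    field_weight = tf * term_idf * FIELD_NORM
    return query_weight * field_weight


# Each matching term occurs twice in the record (freq=2.0).
technology = term_score(2.0, doc_freq=6114)
social = term_score(2.0, doc_freq=2228)
# "aspects" sits in a nested clause where 1 of 2 sub-clauses matched: coord(1/2).
aspects = term_score(2.0, doc_freq=1308) * 0.5

# 3 of the 5 top-level clauses matched: coord(3/5).
total = (technology + social + aspects) * (3 / 5)

print(f"technology={technology:.9f}")
print(f"social={social:.9f}")
print(f"aspects={aspects:.9f}")
print(f"total={total:.9f}")
```

The same arithmetic applies to the other records. Only docFreq, freq, fieldNorm and the coord factors change from one record to the next.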

Cortez, E.; Silva, A.S. da; Gonçalves, M.A.; Mesquita, F.; Moura, E.S. de: ¬A flexible approach for extracting metadata from bibliographic citations (2009) 0.03

0.02511943 = product of:
  0.062798575 = sum of:
    0.022488397 = weight(_text_:technology in 2848) [ClassicSimilarity], result of:
      0.022488397 = score(doc=2848,freq=2.0), product of:
        0.13667917 = queryWeight, product of:
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.04589033 = queryNorm
        0.16453418 = fieldWeight in 2848, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2848)
    0.04031018 = weight(_text_:social in 2848) [ClassicSimilarity], result of:
      0.04031018 = score(doc=2848,freq=2.0), product of:
        0.18299131 = queryWeight, product of:
          3.9875789 = idf(docFreq=2228, maxDocs=44218)
          0.04589033 = queryNorm
        0.22028469 = fieldWeight in 2848, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.9875789 = idf(docFreq=2228, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2848)
  0.4 = coord(2/5)

Abstract: In this article we present FLUX-CiM, a novel method for extracting components (e.g., author names, article titles, venues, page numbers) from bibliographic citations. Our method does not rely on patterns encoding specific delimiters used in a particular citation style. This feature yields a high degree of automation and flexibility, and allows FLUX-CiM to extract from citations in any given format. Differently from previous methods that are based on models learned from user-driven training, our method relies on a knowledge base automatically constructed from an existing set of sample metadata records from a given field (e.g., computer science, health sciences, social sciences, etc.). These records are usually available on the Web or other public data repositories. To demonstrate the effectiveness and applicability of our proposed method, we present a series of experiments in which we apply it to extract bibliographic data from citations in articles of different fields. Results of these experiments exhibit precision and recall levels above 94% for all fields, and perfect extraction for the large majority of citations tested. In addition, in a comparison against a state-of-the-art information-extraction method, ours produced superior results without the training phase required by that method. Finally, we present a strategy for using bibliographic data resulting from the extraction process with FLUX-CiM to automatically update and expand the knowledge base of a given domain. We show that this strategy can be used to achieve good extraction results even if only a very small initial sample of bibliographic records is available for building the knowledge base.
Source: Journal of the American Society for Information Science and Technology. 60(2009) no.6, S.1144-1158

Cavalcante Dourado, Í.; Galante, R.; Gonçalves, M.A.; Silva Torres, R. de: Bag of textual graphs (BoTG) : a general graph-based text representation model (2019) 0.02

0.023643848 = product of:
  0.059109617 = sum of:
    0.022488397 = weight(_text_:technology in 5291) [ClassicSimilarity], result of:
      0.022488397 = score(doc=5291,freq=2.0), product of:
        0.13667917 = queryWeight, product of:
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.04589033 = queryNorm
        0.16453418 = fieldWeight in 5291, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5291)
    0.03662122 = product of:
      0.07324244 = sum of:
        0.07324244 = weight(_text_:aspects in 5291) [ClassicSimilarity], result of:
          0.07324244 = score(doc=5291,freq=4.0), product of:
            0.20741826 = queryWeight, product of:
              4.5198684 = idf(docFreq=1308, maxDocs=44218)
              0.04589033 = queryNorm
            0.35311472 = fieldWeight in 5291, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.5198684 = idf(docFreq=1308, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5291)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract: Text representation models are the fundamental basis for information retrieval and text mining tasks. Although different text models have been proposed, they typically target specific task aspects in isolation, such as time efficiency, accuracy, or applicability for different scenarios. Here we present Bag of Textual Graphs (BoTG), a general text representation model that addresses these three requirements at the same time. The proposed textual representation is based on a graph-based scheme that encodes term proximity and term ordering, and represents text documents into an efficient vector space that addresses all these aspects as well as provides discriminative textual patterns. Extensive experiments are conducted in two experimental scenarios (classification and retrieval), considering multiple well-known text collections. We also compare our model against several methods from the literature. Experimental results demonstrate that our model is generic enough to handle different tasks and collections. It is also more efficient than the widely used state-of-the-art methods in textual classification and retrieval tasks, with a competitive effectiveness, sometimes with gains by large margins.
Source: Journal of the Association for Information Science and Technology. 70(2019) no.8, S.817-829

Dalip, D.H.; Gonçalves, M.A.; Cristo, M.; Calado, P.: ¬A general multiview framework for assessing the quality of collaboratively created content on web 2.0 (2017) 0.02

0.015212866 = product of:
  0.038032163 = sum of:
    0.022488397 = weight(_text_:technology in 3343) [ClassicSimilarity], result of:
      0.022488397 = score(doc=3343,freq=2.0), product of:
        0.13667917 = queryWeight, product of:
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.04589033 = queryNorm
        0.16453418 = fieldWeight in 3343, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3343)
    0.015543767 = product of:
      0.031087535 = sum of:
        0.031087535 = weight(_text_:22 in 3343) [ClassicSimilarity], result of:
          0.031087535 = score(doc=3343,freq=2.0), product of:
            0.16070013 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04589033 = queryNorm
            0.19345059 = fieldWeight in 3343, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3343)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Date: 16.11.2017 13:04:22
Source: Journal of the Association for Information Science and Technology. 68(2017) no.2, S.286-308

Belém, F.M.; Almeida, J.M.; Gonçalves, M.A.: ¬A survey on tag recommendation methods : a review (2017) 0.02

0.015212866 = product of:
  0.038032163 = sum of:
    0.022488397 = weight(_text_:technology in 3524) [ClassicSimilarity], result of:
      0.022488397 = score(doc=3524,freq=2.0), product of:
        0.13667917 = queryWeight, product of:
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.04589033 = queryNorm
        0.16453418 = fieldWeight in 3524, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3524)
    0.015543767 = product of:
      0.031087535 = sum of:
        0.031087535 = weight(_text_:22 in 3524) [ClassicSimilarity], result of:
          0.031087535 = score(doc=3524,freq=2.0), product of:
            0.16070013 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04589033 = queryNorm
            0.19345059 = fieldWeight in 3524, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3524)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Date: 16.11.2017 13:30:22
Source: Journal of the Association for Information Science and Technology. 68(2017) no.4, S.830-844

Moura, E.S. de; Fernandes, D.; Ribeiro-Neto, B.; Silva, A.S. da; Gonçalves, M.A.: Using structural information to improve search in Web collections (2010) 0.01

0.005397215 = product of:
  0.026986076 = sum of:
    0.026986076 = weight(_text_:technology in 4119) [ClassicSimilarity], result of:
      0.026986076 = score(doc=4119,freq=2.0), product of:
        0.13667917 = queryWeight, product of:
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.04589033 = queryNorm
        0.19744103 = fieldWeight in 4119, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.046875 = fieldNorm(doc=4119)
  0.2 = coord(1/5)

Source: Journal of the American Society for Information Science and Technology. 61(2010) no.12, S.2503-2513

Santana, A.F.; Gonçalves, M.A.; Laender, A.H.F.; Ferreira, A.A.: Incremental author name disambiguation by exploiting domain-specific heuristics (2017) 0.01

0.005397215 = product of:
  0.026986076 = sum of:
    0.026986076 = weight(_text_:technology in 3587) [ClassicSimilarity], result of:
      0.026986076 = score(doc=3587,freq=2.0), product of:
        0.13667917 = queryWeight, product of:
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.04589033 = queryNorm
        0.19744103 = fieldWeight in 3587, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.046875 = fieldNorm(doc=3587)
  0.2 = coord(1/5)

Source: Journal of the Association for Information Science and Technology. 68(2017) no.4, S.931-945

Calado, P.; Cristo, M.; Gonçalves, M.A.; Moura, E.S. de; Ribeiro-Neto, B.; Ziviani, N.: Link-based similarity measures for the classification of Web documents (2006) 0.00

0.0044976794 = product of:
  0.022488397 = sum of:
    0.022488397 = weight(_text_:technology in 4921) [ClassicSimilarity], result of:
      0.022488397 = score(doc=4921,freq=2.0), product of:
        0.13667917 = queryWeight, product of:
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.04589033 = queryNorm
        0.16453418 = fieldWeight in 4921, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4921)
  0.2 = coord(1/5)

Source: Journal of the American Society for Information Science and Technology. 57(2006) no.2, S.208-221

Cota, R.G.; Ferreira, A.A.; Nascimento, C.; Gonçalves, M.A.; Laender, A.H.F.: ¬An unsupervised heuristic-based hierarchical method for name disambiguation in bibliographic citations (2010) 0.00

0.0044976794 = product of:
  0.022488397 = sum of:
    0.022488397 = weight(_text_:technology in 3986) [ClassicSimilarity], result of:
      0.022488397 = score(doc=3986,freq=2.0), product of:
        0.13667917 = queryWeight, product of:
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.04589033 = queryNorm
        0.16453418 = fieldWeight in 3986, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3986)
  0.2 = coord(1/5)

Source: Journal of the American Society for Information Science and Technology. 61(2010) no.9, S.1853-1870

Pereira, D.A.; Ribeiro-Neto, B.; Ziviani, N.; Laender, A.H.F.; Gonçalves, M.A.: ¬A generic Web-based entity resolution framework (2011) 0.00

0.0044976794 = product of:
  0.022488397 = sum of:
    0.022488397 = weight(_text_:technology in 4450) [ClassicSimilarity], result of:
      0.022488397 = score(doc=4450,freq=2.0), product of:
        0.13667917 = queryWeight, product of:
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.04589033 = queryNorm
        0.16453418 = fieldWeight in 4450, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4450)
  0.2 = coord(1/5)

Source: Journal of the American Society for Information Science and Technology. 62(2011) no.5, S.919-932

Silva, R.M.; Gonçalves, M.A.; Veloso, A.: ¬A Two-stage active learning method for learning to rank (2014) 0.00

0.0044976794 = product of:
  0.022488397 = sum of:
    0.022488397 = weight(_text_:technology in 1184) [ClassicSimilarity], result of:
      0.022488397 = score(doc=1184,freq=2.0), product of:
        0.13667917 = queryWeight, product of:
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.04589033 = queryNorm
        0.16453418 = fieldWeight in 1184, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1184)
  0.2 = coord(1/5)

Source: Journal of the Association for Information Science and Technology. 65(2014) no.1, S.109-128

Ferreira, A.A.; Veloso, A.; Gonçalves, M.A.; Laender, A.H.F.: Self-training author name disambiguation for information scarce scenarios (2014) 0.00

0.0044976794 = product of:
  0.022488397 = sum of:
    0.022488397 = weight(_text_:technology in 1292) [ClassicSimilarity], result of:
      0.022488397 = score(doc=1292,freq=2.0), product of:
        0.13667917 = queryWeight, product of:
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.04589033 = queryNorm
        0.16453418 = fieldWeight in 1292, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1292)
  0.2 = coord(1/5)

Source: Journal of the Association for Information Science and Technology. 65(2014) no.6, S.1257-1278

Martins, E.F.; Belém, F.M.; Almeida, J.M.; Gonçalves, M.A.: On cold start for associative tag recommendation (2016) 0.00

0.0044976794 = product of:
  0.022488397 = sum of:
    0.022488397 = weight(_text_:technology in 2494) [ClassicSimilarity], result of:
      0.022488397 = score(doc=2494,freq=2.0), product of:
        0.13667917 = queryWeight, product of:
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.04589033 = queryNorm
        0.16453418 = fieldWeight in 2494, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2494)
  0.2 = coord(1/5)

Source: Journal of the Association for Information Science and Technology. 67(2016) no.1, S.83-105

Salles, T.; Rocha, L.; Gonçalves, M.A.; Almeida, J.M.; Mourão, F.; Meira Jr., W.; Viegas, F.: ¬A quantitative analysis of the temporal effects on automatic text classification (2016) 0.00

0.0044976794 = product of:
  0.022488397 = sum of:
    0.022488397 = weight(_text_:technology in 3014) [ClassicSimilarity], result of:
      0.022488397 = score(doc=3014,freq=2.0), product of:
        0.13667917 = queryWeight, product of:
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.04589033 = queryNorm
        0.16453418 = fieldWeight in 3014, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.978387 = idf(docFreq=6114, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3014)
  0.2 = coord(1/5)

Source: Journal of the Association for Information Science and Technology. 67(2016) no.7, S.1639-1667
