Search (12 results, page 1 of 1)

Mesquita, L.A.P.; Souza, R.R.; Baracho Porto, R.M.A.: Noun phrases in automatic indexing : a structural analysis of the distribution of relevant terms in doctoral theses (2014) 0.02
```
0.017720532 = product of:
  0.035441063 = sum of:
    0.035441063 = sum of:
      0.010480874 = weight(_text_:a in 1442) [ClassicSimilarity], result of:
        0.010480874 = score(doc=1442,freq=30.0), product of:
          0.053105544 = queryWeight, product of:
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.046056706 = queryNorm
          0.19735932 = fieldWeight in 1442, product of:
            5.477226 = tf(freq=30.0), with freq of:
              30.0 = termFreq=30.0
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.03125 = fieldNorm(doc=1442)
      0.02496019 = weight(_text_:22 in 1442) [ClassicSimilarity], result of:
        0.02496019 = score(doc=1442,freq=2.0), product of:
          0.16128273 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046056706 = queryNorm
          0.15476047 = fieldWeight in 1442, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.03125 = fieldNorm(doc=1442)
  0.5 = coord(1/2)
```
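The breakdown above can be checked by hand. The following is a minimal sketch, assuming the listing's `ClassicSimilarity` follows Lucene's classic TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names `classic_idf` and `term_score` are illustrative and do not come from the listing.

```python
import math


def classic_idf(doc_freq, max_docs):
    # Assumed classic Lucene idf: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # score = queryWeight * fieldWeight, as in the explain tree.
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                    # idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Values copied from the explain output for doc 1442.
QUERY_NORM = 0.046056706
MAX_DOCS = 44218
FIELD_NORM = 0.03125

score_a = term_score(30.0, 37942, MAX_DOCS, QUERY_NORM, FIELD_NORM)  # term "a"
score_22 = term_score(2.0, 3622, MAX_DOCS, QUERY_NORM, FIELD_NORM)   # term "22"

# The outer product applies coord(1/2) = 0.5 to the sum of the term scores.
total = 0.5 * (score_a + score_22)

print(score_a, score_22, total)
```

The printed values should closely match the per-term scores and the final score in the listing; small differences in the last digits are expected because Lucene computes in single precision.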
Abstract

The main objective of this research was to analyze whether relevant terms show a characteristic distribution over a scientific text that could serve as a criterion for automatic indexing. The terms considered in this study were only full noun phrases contained in the texts themselves. The corpus comprised 98 doctoral theses from the eight areas of knowledge at a single university. Initially, 20 full noun phrases were automatically extracted from each text as candidates for the most relevant terms, and the author of each text assigned a relevance value from 0 (not relevant) to 6 (highly relevant) to each of the 20 noun phrases sent. Only 22.1% of the noun phrases were considered not relevant. The relevance values assigned by the authors were associated with the terms' positions in the text. Each full noun phrase found in the text was considered a valid linear position. The results show the values resulting from this distribution for two types of position: linear, with values consolidated into ten equal consecutive parts; and structural, considering parts of the text (such as introduction, development and conclusion). Notably, all areas of knowledge related to the Natural Sciences showed one characteristic behavior in the distribution of relevant terms, while all areas related to the Social Sciences shared a characteristic behavior of their own, distinct from that of the Natural Sciences. The difference in distribution behavior between the Natural and Social Sciences can be clearly visualized through graphs. All behaviors, including the general behavior of all areas of knowledge together, were characterized by polynomial equations and can be applied in the future as criteria for automatic indexing.
To date, this work is unprecedented for two reasons: it presents a method for characterizing the distribution of relevant terms in a scientific text and, through this method, it points out a quantitative difference between the Natural and Social Sciences.

Source

Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik

Type

a

Martins, A.L.; Souza, R.R.; Ribeiro de Mello, H.: ¬The use of noun phrases in information retrieval : proposing a mechanism for automatic classification (2014) 0.02
```
0.016758895 = product of:
  0.03351779 = sum of:
    0.03351779 = sum of:
      0.008557598 = weight(_text_:a in 1441) [ClassicSimilarity], result of:
        0.008557598 = score(doc=1441,freq=20.0), product of:
          0.053105544 = queryWeight, product of:
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.046056706 = queryNorm
          0.16114321 = fieldWeight in 1441, product of:
            4.472136 = tf(freq=20.0), with freq of:
              20.0 = termFreq=20.0
            1.153047 = idf(docFreq=37942, maxDocs=44218)
            0.03125 = fieldNorm(doc=1441)
      0.02496019 = weight(_text_:22 in 1441) [ClassicSimilarity], result of:
        0.02496019 = score(doc=1441,freq=2.0), product of:
          0.16128273 = queryWeight, product of:
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.046056706 = queryNorm
          0.15476047 = fieldWeight in 1441, product of:
            1.4142135 = tf(freq=2.0), with freq of:
              2.0 = termFreq=2.0
            3.5018296 = idf(docFreq=3622, maxDocs=44218)
            0.03125 = fieldNorm(doc=1441)
  0.5 = coord(1/2)
```
Abstract

This paper presents research on syntactic structures known as noun phrases (NPs), applied to increase the effectiveness and efficiency of document classification mechanisms. Our hypothesis is that NPs can be used instead of single words as semantic aggregators, reducing the number of words used by the classification system without losing semantic coverage and thereby increasing its efficiency. The experiment divided the document classification process into three phases: a) NP preprocessing; b) system training; and c) classification experiments. In the first phase, a corpus of digitized texts was submitted to a natural language processing platform in which part-of-speech tagging was performed, and then Perl scripts from the PALAVRAS package were used to extract the noun phrases. The preprocessing also involved a) removing low-meaning NP pre-modifiers, such as quantifiers; b) identifying synonyms and replacing them with common hyperonyms; and c) stemming the relevant words contained in each NP, for similarity checking against other NPs. The first tests with the resulting documents demonstrated the effectiveness of this preprocessing. We compared the structural similarity of the documents before and after the preprocessing steps of phase one. The texts remained consistent with the originals and stayed readable. The second phase involves submitting the modified documents to an SVM algorithm to identify clusters and classify the documents. The classification rules are to be established using a machine learning approach. Finally, tests will be conducted to check the effectiveness of the whole process.

Source

Knowledge organization in the 21st century: between historical patterns and future prospects. Proceedings of the Thirteenth International ISKO Conference 19-22 May 2014, Kraków, Poland. Ed.: Wieslaw Babik

Type

a

Souza, R.R.; Tudhope, D.; Almeida, M.B.: Towards a taxonomy of KOS (2012) 0.00

0.0031324127 = product of:
  0.0062648254 = sum of:
    0.0062648254 = product of:
      0.012529651 = sum of:
        0.012529651 = weight(_text_:a in 139) [ClassicSimilarity], result of:
          0.012529651 = score(doc=139,freq=14.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.23593865 = fieldWeight in 139, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=139)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: This paper analyzes previous work on the classification of Knowledge Organization Systems (KOS), discusses strengths and weaknesses, and proposes a new and integrative framework. It argues that current analyses of the KOS tend to be idiosyncratic and incomplete, relying on a limited number of dimensions of analysis. The paper discusses why and how KOS should be classified on a new basis. Based on the available literature and previous work, the authors propose a wider set of dimensions for the analysis of KOS. These are represented in a taxonomy of KOS. Issues arising are discussed.
Type: a

Souza, R.R.; Gil-Leiva, I.: Automatic indexing of scientific texts : a methodological comparison (2016) 0.00

0.00270615 = product of:
  0.0054123 = sum of:
    0.0054123 = product of:
      0.0108246 = sum of:
        0.0108246 = weight(_text_:a in 4913) [ClassicSimilarity], result of:
          0.0108246 = score(doc=4913,freq=8.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.20383182 = fieldWeight in 4913, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=4913)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Knowledge organization for a sustainable world: challenges and perspectives for cultural, scientific, and technological sharing in a connected society : proceedings of the Fourteenth International ISKO Conference 27-29 September 2016, Rio de Janeiro, Brazil / organized by International Society for Knowledge Organization (ISKO), ISKO-Brazil, São Paulo State University ; edited by José Augusto Chaves Guimarães, Suellen Oliveira Milani, Vera Dodebei
Type: a

Coelho, F.C.; Souza, R.R.; Codeço, C.T.: Towards an ontology for mathematical modeling with application to epidemiology (2012) 0.00
```
0.0026849252 = product of:
  0.0053698504 = sum of:
    0.0053698504 = product of:
      0.010739701 = sum of:
        0.010739701 = weight(_text_:a in 838) [ClassicSimilarity], result of:
          0.010739701 = score(doc=838,freq=14.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.20223314 = fieldWeight in 838, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=838)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Mathematical modelling is a field of applied mathematics with applications in other disciplines. The availability of a formal ontology and derived benefits, such as the possibility of conducting automated reasoning about the ontological classes of the domains, greatly reduce the barrier of entry in the field for non-experts, while helping the establishment of a more precise and controlled vocabulary among the domain experts involved in mathematical modelling. This work focuses on Mathematical Models applied to the natural sciences and as a case study the field of mathematical epidemiology has been chosen for this ontology. We propose the development of an ontology of mathematical models which is general enough and not restricted in its applicability, yet is developed considering the specific needs of a particular application domain.

Source

Categories, contexts and relations in knowledge organization: Proceedings of the Twelfth International ISKO Conference 6-9 August 2012, Mysore, India. Eds.: Neelameghan, A. and K.S. Raghavan

Type

a

Café, L.M.A.; Souza, R.R.: Sentiment analysis and knowledge organization : an overview of the international literature (2017) 0.00
```
0.0026742492 = product of:
  0.0053484985 = sum of:
    0.0053484985 = product of:
      0.010696997 = sum of:
        0.010696997 = weight(_text_:a in 3625) [ClassicSimilarity], result of:
          0.010696997 = score(doc=3625,freq=20.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.20142901 = fieldWeight in 3625, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3625)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Knowledge organization (KO) as an activity is, among other meanings, a process for conceptual modeling of knowledge domains that produces a consensual abstraction model of a domain for a particular purpose. It adopts a myriad of techniques to analyze and build efficient knowledge organization systems, and one of these techniques is sentiment analysis (SA), or opinion mining, which is emerging as promising and useful in a variety of ways. It is based on NLP and AI algorithms and aims at identifying opinions and emotions toward any person, organization or subject, evaluating them as positive or negative in both binary and graded fashions. This study sought to show various aspects of the implementation of SA for knowledge organization tasks as registered in the scientific literature. We began with exploratory bibliographic research and built a corpus of 91 scientific papers, written in English, selected from the LISA database and published between 2000 and 2016. We analyzed these papers and extracted title, year of publication, author(s) and institution(s), title of the journal where they were published, keywords, the LISA classification code, methods/techniques adopted and their application areas. Our main findings are that theoretical papers still prevail, which may indicate a field in its early stages. We found many institutions and authors from Asia, which points to a new shift in world expertise. We concluded that SA is still a novelty in the KO field, being slowly adopted as an aid to its main tasks, such as document classification.

Type

a

Souza, R.R.; Tudhope, D.; Almeida, M.B.: ¬The KOS spectra : a tentative typology of knowledge organization systems (2010) 0.00

0.0026473717 = product of:
  0.0052947435 = sum of:
    0.0052947435 = product of:
      0.010589487 = sum of:
        0.010589487 = weight(_text_:a in 3523) [ClassicSimilarity], result of:
          0.010589487 = score(doc=3523,freq=10.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.19940455 = fieldWeight in 3523, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3523)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Abstract: This work attempts to propose a set of evaluation dimensions for the analysis of knowledge organization systems (KOS), building on previous research and the available literature on the subject. It presents a compiled taxonomy of KOSs, a set of tentative characteristics proposed in the literature and the authors' spectra proposal. The full details of the typology are not covered in the scope of the article, but will be available as an ontology in the near future.
Type: a

Oliveira Machado, L.M.; Souza, R.R.; Simões, M. da Graça: Semantic web or web of data? : a diachronic study (1999 to 2017) of the publications of Tim Berners-Lee and the World Wide Web Consortium (2019) 0.00
```
0.0023919214 = product of:
  0.0047838427 = sum of:
    0.0047838427 = product of:
      0.009567685 = sum of:
        0.009567685 = weight(_text_:a in 5300) [ClassicSimilarity], result of:
          0.009567685 = score(doc=5300,freq=16.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.18016359 = fieldWeight in 5300, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5300)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

The web has been, in the last decades, the place where information retrieval achieved its maximum importance, given its ubiquity and the sheer volume of information. However, its exponential growth made the retrieval task increasingly hard, with its effectiveness relying on idiosyncratic and somewhat biased ranking algorithms. To deal with this problem, a "new" web, called the Semantic Web (SW), was proposed, bringing along concepts like "Web of Data" and "Linked Data," although the definitions and connections among these concepts are often unclear. Based on a qualitative approach built on a literature review, a definition of SW is presented, and the related concepts sometimes used as synonyms are discussed. The study concludes that the SW is a comprehensive and ambitious construct that includes the great purpose of making the web a global database. It also follows the specifications developed and/or associated with its operationalization and the necessary procedures for the connection of data in an open format on the web. The goals of this comprehensive SW are the union of two outcomes still tenuously connected: the virtually unlimited possibility of connections between data (the web domain) and the potential for automated inference by "intelligent" systems (the semantic component).

Type

a

Coelho, F.C.; Souza, R.R.; Chada, D.M.; Cerdeira, P. de Camargo: Information mining and visualization of data from the Brazilian Supreme Court (STF) : a case study (2012) 0.00
```
0.0023678814 = product of:
  0.0047357627 = sum of:
    0.0047357627 = product of:
      0.009471525 = sum of:
        0.009471525 = weight(_text_:a in 867) [ClassicSimilarity], result of:
          0.009471525 = score(doc=867,freq=8.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.17835285 = fieldWeight in 867, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=867)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

This paper describes a joint research project of the Law School (Direito Rio) and the Applied Math School (EMAp) of the Getulio Vargas Foundation (FGV), Brazil, to analyze information from judicial activities in some of the Brazilian courts. The data for the study included the entire collection of judicial decisions from 1988 to the present. The idea was to identify bottlenecks in the judicial processes at the STF.

Source

Categories, contexts and relations in knowledge organization: Proceedings of the Twelfth International ISKO Conference 6-9 August 2012, Mysore, India. Eds.: Neelameghan, A. and K.S. Raghavan

Type

a

Souza, R.R.; Coelho, F.C.; Higuchi, S.; Silva, D.L da: ¬The CPDOC semantic portal : applying semantic and knowledge organization systems to the Brazilian contemporary history domain (2012) 0.00

0.001913537 = product of:
  0.003827074 = sum of:
    0.003827074 = product of:
      0.007654148 = sum of:
        0.007654148 = weight(_text_:a in 859) [ClassicSimilarity], result of:
          0.007654148 = score(doc=859,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.14413087 = fieldWeight in 859, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0625 = fieldNorm(doc=859)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Source: Categories, contexts and relations in knowledge organization: Proceedings of the Twelfth International ISKO Conference 6-9 August 2012, Mysore, India. Eds.: Neelameghan, A. and K.S. Raghavan
Type: a

Simões, M. da Graça; Machado, L.M.; Souza, R.R.; Almeida, M.B.; Tavares Lopes, A.: Automatic indexing and ontologies : the consistency of research chronology and authoring in the context of Information Science (2018) 0.00

0.001674345 = product of:
  0.00334869 = sum of:
    0.00334869 = product of:
      0.00669738 = sum of:
        0.00669738 = weight(_text_:a in 5909) [ClassicSimilarity], result of:
          0.00669738 = score(doc=5909,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.12611452 = fieldWeight in 5909, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5909)
      0.5 = coord(1/2)
  0.5 = coord(1/2)

Type: a

Almeida, M.B.; Souza, R.R.; Porto, R.B.: Looking for the identity of information science in the age of big data, computing clouds and social networks (2015) 0.00
```
0.0014351527 = product of:
  0.0028703054 = sum of:
    0.0028703054 = product of:
      0.005740611 = sum of:
        0.005740611 = weight(_text_:a in 3453) [ClassicSimilarity], result of:
          0.005740611 = score(doc=3453,freq=4.0), product of:
            0.053105544 = queryWeight, product of:
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046056706 = queryNorm
            0.10809815 = fieldWeight in 3453, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.153047 = idf(docFreq=37942, maxDocs=44218)
              0.046875 = fieldNorm(doc=3453)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

In this paper we discuss, from a critical point of view, the current Information Science landscape and some future prospects regarding contemporary information phenomena. We present thoughts about the process of thematic deflation of Information Science, through the analysis of the research objects currently under development in this field. In addition, we look at the process of absorption of these and other relevant objects into distinct knowledge fields. We seek to challenge the emphasis and the volume of interdisciplinary research within the field, and present some comments about what might be the results of such processes for the future of Information Science. Subsequently, we analyze the impact on the Information Science field of phenomena such as the information boom, the consolidation of social networks as interactive spaces, cloud computing, as well as other key elements.

Type

a
