Search (79 results, page 1 of 4)

  • Active filter: theme_ss:"Wissensrepräsentation" (knowledge representation)
  1. Xiong, C.: Knowledge based text representations for information retrieval (2016) 0.11
    [Score detail: 0.11339873 = weight(_text_:3a, tf=2.0, coord 1/3) + weight(_text_:word, tf=6.0, coord 1/2); Lucene ClassicSimilarity]
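    The score breakdowns on this page follow Lucene's documented ClassicSimilarity (TF-IDF) defaults. A minimal Python sketch that reproduces this result's score from the values above (tf = sqrt(freq), idf = 1 + ln(maxDocs/(docFreq+1)), score = queryWeight * fieldWeight * coord):

    import math

    def idf(doc_freq: int, max_docs: int) -> float:
        # ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
        return 1.0 + math.log(max_docs / (doc_freq + 1))

    def term_score(freq: float, doc_freq: int, max_docs: int,
                   query_norm: float, field_norm: float) -> float:
        tf = math.sqrt(freq)                  # tf(freq) = sqrt(freq)
        w = idf(doc_freq, max_docs)
        query_weight = w * query_norm         # queryWeight
        field_weight = tf * w * field_norm    # fieldWeight
        return query_weight * field_weight

    QUERY_NORM = 0.05371688                   # queryNorm from the explanation
    s_3a   = term_score(2.0, 24, 44218, QUERY_NORM, 0.03125) / 3   # coord(1/3)
    s_word = term_score(6.0, 634, 44218, QUERY_NORM, 0.03125) / 2  # coord(1/2)
    # ~0.11339873, the displayed score (Lucene computes in float32,
    # so the last digits may differ slightly)
    print(round(s_3a + s_word, 8))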
    
    Abstract
    The successes of information retrieval (IR) in recent decades were built upon bag-of-words representations. Effective as it is, bag-of-words is only a shallow text understanding; there is a limited amount of information for document ranking in the word space. This dissertation goes beyond words and builds knowledge-based text representations, which embed external, carefully curated information from knowledge bases and provide richer, structured evidence for more advanced information retrieval systems. This thesis research first builds query representations with entities associated with the query. Entities' descriptions are used by query expansion techniques that enrich the query with explanation terms. Then we present a general framework that represents a query with entities that appear in the query, are retrieved by the query, or frequently show up in the top retrieved documents. A latent space model is developed to jointly learn the connections from query to entities and the ranking of documents, modeling the external evidence from knowledge bases and internal ranking features cooperatively. To further improve the quality of relevant entities, a defining factor of our query representations, we introduce learning to rank into entity search and retrieve better entities from knowledge bases. In the document representation part, this thesis research also moves one step forward with a bag-of-entities model, in which documents are represented by their automatic entity annotations, and the ranking is performed in the entity space.
    This proposal includes plans to improve the quality of relevant entities with a co-learning framework that learns from both entity labels and document labels. We also plan to develop a hybrid ranking system that combines word-based and entity-based representations, with their uncertainties taken into account. Finally, we plan to enrich the text representations with connections between entities. We propose several ways to infer entity graph representations for texts and to rank documents using these structured representations. This dissertation overcomes the limitations of word-based representations with external, carefully curated information from knowledge bases. We believe this thesis research is a solid start towards a new generation of intelligent, semantic, and structured information retrieval.
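    A minimal sketch of the bag-of-entities idea from this abstract: documents are represented by their automatic entity annotations and scored by overlap in the entity space. The entity IDs and the linker are illustrative assumptions, not taken from the dissertation:

    from collections import Counter

    def bag_of_entities(annotations: list) -> Counter:
        # annotations: entity IDs emitted by an (assumed) entity linker
        return Counter(annotations)

    def entity_score(query_entities: set, doc: Counter) -> int:
        # rank in the entity space: total frequency of query entities in the doc
        return sum(doc[e] for e in query_entities)

    doc = bag_of_entities(["Q180711", "Q11660", "Q180711"])  # hypothetical IDs
    print(entity_score({"Q180711"}, doc))                    # -> 2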
    Content
    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Language and Information Technologies. Cf.: https://www.cs.cmu.edu/~cx/papers/knowledge_based_text_representation.pdf.
  2. Deokattey, S.; Dixit, D.K.; Bhanumurthy, K.: Co-word and facet analysis as tools for conceptualization in ontologies : a preliminary study of a micro-domain (2012) 0.05
    [Score detail: 0.04560516 from weight(_text_:word, tf=10.0); Lucene ClassicSimilarity]
    
    Abstract
    Conceptualization is at the core of developing domain ontologies. This paper reports a study for developing an ontology for a micro-domain: the Test Blanket Module (TBM), an integral part of thermonuclear (fusion) reactors. Sample data downloaded from the INIS database yielded 1115 unique DEI (indexer-assigned) descriptors assigned to 548 records on TBM. The frequencies of occurrence of all the unique descriptors and the corresponding co-word DEI descriptors (AN numbers) were identified. On the basis of their research linkages, the descriptors were grouped into four quadrants. It was found that the descriptors in the 2nd and 3rd quadrants were at the core of the selected subject. A total of 31 core descriptors were selected for conceptualization, and for each the co-occurring descriptors and their frequencies of co-occurrence with the selected descriptor were noted. Only descriptor pairs that co-occurred 10 times or more were considered. Comparison of Co-Word Word Blocks (CWWBs) with word blocks from the INIS thesaurus (INISWBs) showed differences. Co-words were used to semantically enrich descriptors, transforming them into more comprehensive concepts; these served as building blocks for conceptualization and for the domain ontology. This method could be replicated to generate semantic networks (which could form an ontological layer on any subject of study) and could also support query expansion during search and retrieval in interdisciplinary subject domains.
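    A minimal sketch of the co-word counting step described above, assuming each record is a set of indexer-assigned descriptors; the cutoff of 10 mirrors the study:

    from collections import Counter
    from itertools import combinations

    MIN_COOCCURRENCE = 10  # the study kept only pairs co-occurring 10+ times

    def coword_pairs(records: list) -> dict:
        pairs = Counter()
        for descriptors in records:
            # count each unordered descriptor pair once per record
            for a, b in combinations(sorted(descriptors), 2):
                pairs[(a, b)] += 1
        return {p: n for p, n in pairs.items() if n >= MIN_COOCCURRENCE}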
  3. Zeng, Q.; Yu, M.; Yu, W.; Xiong, J.; Shi, Y.; Jiang, M.: Faceted hierarchy : a new graph type to organize scientific concepts and a construction method (2019) 0.04
    [Score detail: 0.04265832 from weight(_text_:3a, tf=2.0); Lucene ClassicSimilarity]
    
    Content
    Cf.: https://aclanthology.org/D19-5317.pdf.
  4. Yu, L.-C.; Wu, C.-H.; Chang, R.-Y.; Liu, C.-H.; Hovy, E.H.: Annotation and verification of sense pools in OntoNotes (2010) 0.04
    [Score detail: 0.04079049 from weight(_text_:word, tf=8.0); Lucene ClassicSimilarity]
    
    Abstract
    The paper describes OntoNotes, a multilingual (English, Chinese, and Arabic) corpus with large-scale semantic annotations, including predicate-argument structure, word senses, ontology linking, and coreference. The underlying semantic model of OntoNotes involves word senses that are grouped into so-called sense pools, i.e., sets of near-synonymous senses of words. Such information is useful for many applications, including query expansion for information retrieval (IR) systems, (near-)duplicate detection for text summarization systems, and alternative word selection for writing support systems. Although a sense pool provides a set of near-synonymous senses of words, there is still no knowledge about whether two words in a pool are interchangeable in practical use. Therefore, this paper devises an unsupervised algorithm that incorporates Google n-grams and a statistical test to determine whether a word in a pool can be substituted by other words in the same pool. The n-gram features are used to measure the degree of context mismatch for a substitution. The statistical test is then applied to determine whether the substitution is adequate based on the degree of mismatch. The proposed method is compared with a supervised method, namely Linear Discriminant Analysis (LDA). Experimental results show that the proposed unsupervised method achieves performance comparable to the supervised method.
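    A sketch of the unsupervised substitution test described above. Here ngram_count stands in for a lookup into the Google n-grams data, and the smoothed ratio is a stand-in for the paper's mismatch statistic:

    def substitution_mismatch(ngram_count, context: str,
                              original: str, substitute: str) -> float:
        # context contains "_" where the target word sits, e.g. "a _ of bread"
        c_orig = ngram_count(context.replace("_", original))
        c_sub  = ngram_count(context.replace("_", substitute))
        # add-one smoothing; a large ratio means strong context mismatch,
        # i.e. the pool member is probably not interchangeable here
        return (c_orig + 1) / (c_sub + 1)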
  5. Green, R.: WordNet (2009) 0.04
    [Score detail: 0.040380526 from weight(_text_:word, tf=4.0); Lucene ClassicSimilarity]
    
    Abstract
    WordNet, a lexical database for English, is organized around semantic and lexical relationships between synsets, concepts represented by sets of synonymous word senses. Offering reasonably comprehensive coverage of the nouns, verbs, adjectives, and adverbs of general English, WordNet is a widely used resource for dealing with the ambiguity that arises from homonymy, polysemy, and synonymy. WordNet is used in many information-related tasks and applications (e.g., word sense disambiguation, semantic similarity, lexical chaining, alignment of parallel corpora, text segmentation, sentiment and subjectivity analysis, text classification, information retrieval, text summarization, question answering, information extraction, and machine translation).
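    WordNet's synset structure can be explored through NLTK's real interface; note that the sense numbering below is an assumption that may differ across WordNet versions:

    from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet")

    for s in wn.synsets("mouse"):      # homonymy/polysemy: one string, many synsets
        print(s.name(), "-", s.definition())

    device = wn.synset("mouse.n.04")   # sense number assumed; verify per version
    print(device.lemma_names())        # synonymous word senses in the synset
    print(device.hypernyms())          # a semantic (IS-A) relation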
  6. Jorge-Botana, G.; León, J.A.; Olmos, R.; Hassan-Montero, Y.: Visualizing polysemy using LSA and the predication algorithm (2010) 0.04
    [Score detail: 0.035325605 from weight(_text_:word, tf=6.0); Lucene ClassicSimilarity]
    
    Abstract
    Context is a determining factor in language and plays a decisive role in polysemic words. Several psycholinguistically motivated algorithms have been proposed to emulate human management of context, under the assumption that the value of a word is evanescent and takes on meaning only in interaction with other structures. The predication algorithm (Kintsch, 2001), for example, uses a vector representation of the words produced by LSA (Latent Semantic Analysis) to dynamically simulate the comprehension of predications and even of predicative metaphors. The objective of this study was to predict some unwanted effects that could be present in vector-space models when extracting different meanings of a polysemic word (predominant-meaning inundation, lack of precision, and low-level definition), and to propose ideas based on the predication algorithm for avoiding them. Our first step was to visualize such unwanted phenomena and also the effect of solutions. We use different methods to extract the meanings of a polysemic word (without context, vector sum, and predication algorithm). Our second step was to conduct an analysis of variance to compare such methods and measure the impact of potential solutions. Results support the idea that a human-based computational algorithm like the predication algorithm can take into account features that ensure more accurate representations of the structures we seek to extract. Theoretical assumptions and their repercussions are discussed.
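    A toy sketch of the contrast tested above, assuming word_vecs maps words to LSA-style numpy vectors: a predication-style method (after Kintsch, 2001) keeps only those vocabulary neighbors that are close to both the predicate and its argument, so the argument's context selects the relevant meaning of a polysemic predicate:

    import numpy as np

    def cosine(u: np.ndarray, v: np.ndarray) -> float:
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    def predication_neighbors(word_vecs: dict, predicate: str,
                              argument: str, k: int = 3) -> list:
        # score each word by closeness to BOTH predicate and argument,
        # instead of using the predicate vector alone (where the
        # predominant meaning would dominate)
        p, a = word_vecs[predicate], word_vecs[argument]
        scored = [(w, cosine(v, p) + cosine(v, a))
                  for w, v in word_vecs.items() if w not in (predicate, argument)]
        return sorted(scored, key=lambda t: -t[1])[:k]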
  7. Köhler, J.; Philippi, S.; Specht, M.; Rüegg, A.: Ontology based text indexing and querying for the semantic web (2006) 0.03
    [Score detail: 0.028843235 from weight(_text_:word, tf=4.0); Lucene ClassicSimilarity]
    
    Abstract
    This publication shows how the gap between the HTML-based internet and the RDF-based vision of the Semantic Web might be bridged by linking words in texts to concepts of ontologies. Most current search engines use indexes that are built at the syntactical level and return hits based on simple string comparisons. However, the indexes do not contain synonyms, cannot differentiate between homonyms ('mouse' as a pointing device vs. 'mouse' as an animal), and users receive different search results when they use different conjugation forms of the same word. In this publication, we present a system that uses ontologies and Natural Language Processing techniques to index texts, and thus supports word sense disambiguation and the retrieval of texts that contain equivalent words, by indexing them to concepts of ontologies. For this purpose, we developed fully automated methods for mapping equivalent concepts of imported RDF ontologies (for this prototype WordNet, SUMO, and OpenCyc). These methods allow the seamless integration of domain-specific ontologies for concept-based information retrieval in different domains. To demonstrate the practical workability of this approach, a set of web pages that contain synonyms and homonyms were indexed and can be queried via a search-engine-like query frontend. However, the ontology-based indexing approach can also be used for other data mining applications, such as text clustering, relation mining, and searching free-text fields in biological databases. The ontology alignment methods and some of the text mining principles described in this publication are now incorporated into the ONDEX system, http://ondex.sourceforge.net/.
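    A minimal sketch of indexing to concepts rather than strings, as described above; the concept inventory is illustrative, not the paper's WordNet/SUMO/OpenCyc mapping:

    # both surface forms of a concept map to one ID; an ambiguous string
    # maps to several IDs (homonymy: 'mouse' the animal vs. the device)
    CONCEPTS = {
        "mouse":    ["c:RODENT", "c:POINTING_DEVICE"],
        "mice":     ["c:RODENT", "c:POINTING_DEVICE"],
        "keyboard": ["c:KEYBOARD"],
    }

    def index_tokens(tokens: list) -> set:
        posting = set()
        for t in tokens:
            # fall back to the raw string when no concept is known
            posting.update(CONCEPTS.get(t.lower(), [f"term:{t.lower()}"]))
        return posting

    print(index_tokens(["Mice", "keyboard"]))  # same postings as "mouse keyboard"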
  8. Iorio, A. di; Peroni, S.; Vitali, F.: A Semantic Web approach to everyday overlapping markup (2011) 0.03
    [Score detail: 0.028843235 from weight(_text_:word, tf=4.0); Lucene ClassicSimilarity]
    
    Abstract
    Overlapping structures in XML are not symptoms of a misunderstanding of the intrinsic characteristics of a text document nor evidence of extreme scholarly requirements far beyond those needed by the most common XML-based applications. On the contrary, overlaps have started to appear in a large number of incredibly popular applications hidden under the guise of syntactical tricks to the basic hierarchy of the XML data format. Unfortunately, syntactical tricks have the drawback that the affected structures require complicated workarounds to support even the simplest query or usage. In this article, we present Extremely Annotational Resource Description Framework (RDF) Markup (EARMARK), an approach to overlapping markup that simplifies and streamlines the management of multiple hierarchies on the same content, and provides an approach to sophisticated queries and usages over such structures without the need of ad-hoc applications, simply by using Semantic Web tools and languages. We compare how relevant tasks (e.g., the identification of the contribution of an author in a word processor document) are of some substantial complexity when using the original data format and become more or less trivial when using EARMARK. We finally evaluate positively the memory and disk requirements of EARMARK documents in comparison to Open Office and Microsoft Word XML-based formats.
  9. Maheswari, J.U.; Karpagam, G.R.: A conceptual framework for ontology based information retrieval (2010) 0.03
    [Score detail: 0.028843235 from weight(_text_:word, tf=4.0); Lucene ClassicSimilarity]
    
    Abstract
    Improving information retrieval by employing ontologies to overcome the limitations of syntactic search has been one of its inspirations since its emergence. This paper proposes a conceptual framework for ontology-based information retrieval. The framework consists of five phases, namely query parsing, word stemming, ontology matching, weight assignment, and ranking/information retrieval. In the first phase, the user query is parsed into a sequence of words. In the stemming phase, the parsed content is pruned to the significant words by ignoring superfluous terms such as "to", "is", "ed", "about", and the like. The objective of the stemming phase is to reduce feature descriptors to root words, which in turn increases efficiency; this avoids the time spent searching superfluous terms that would not significantly influence the effectiveness of the retrieval process. In the third phase, ontology matching is carried out by matching the parsed words with the relevant terms in the existing ontology; if no ontology exists, it is recommended to generate the required one. In the fourth phase, weights are assigned based on the distance between the stemmed words and the terms in the ontology, using an improved matchmaking algorithm. The weights range from 0 to 1 depending on the level of distance in the ontology (superclass-subclass). Aggregate weights are calculated for all combinations of stemmed words; the combination with the highest score is ranked best, and the corresponding information is retrieved. The conceptual workflow is illustrated with an e-governance case study, an Academic Information System.
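    A compressed sketch of the phases described above. The toy stemmer, stopword list, and ontology distances are illustrative assumptions, not the paper's improved matchmaking algorithm:

    STOPWORDS = {"to", "is", "ed", "about", "the", "a"}

    def stem(word: str) -> str:
        # toy stemmer; a real system would use Porter or similar
        for suffix in ("ing", "ed", "s"):
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[: -len(suffix)]
        return word

    def weight(term: str, depth: dict) -> float:
        # distance in the ontology (superclass -> subclass): deeper = lower weight
        return 1.0 / (1 + depth[term]) if term in depth else 0.0

    def score(query: str, depth: dict) -> float:
        terms = [stem(w) for w in query.lower().split() if w not in STOPWORDS]
        return sum(weight(t, depth) for t in terms)

    # hypothetical distances for an academic-information-system ontology
    print(score("searching about courses results", {"course": 0, "result": 1}))  # 1.5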
  10. Vlachidis, A.; Tudhope, D.: ¬A knowledge-based approach to information extraction for semantic interoperability in the archaeology domain (2016) 0.03
    [Score detail: 0.028843235 from weight(_text_:word, tf=4.0); Lucene ClassicSimilarity]
    
    Abstract
    The article presents a method for automatic semantic indexing of archaeological grey-literature reports using empirical (rule-based) Information Extraction techniques in combination with domain-specific knowledge organization systems. The semantic annotation system (OPTIMA) performs the tasks of Named Entity Recognition, Relation Extraction, Negation Detection, and Word-Sense Disambiguation using hand-crafted rules and terminological resources for associating contextual abstractions with classes of the standard ontology CIDOC Conceptual Reference Model (CRM) for cultural heritage and its archaeological extension, CRM-EH. Relation Extraction (RE) performance benefits from a syntactic-based definition of RE patterns derived from domain-oriented corpus analysis. The evaluation also shows clear benefit in the use of assistive natural language processing (NLP) modules relating to Word-Sense Disambiguation, Negation Detection, and Noun Phrase Validation, together with controlled thesaurus expansion. The semantic indexing results demonstrate the capacity of rule-based Information Extraction techniques to deliver interoperable semantic abstractions (semantic annotations) with respect to the CIDOC CRM and archaeological thesauri. Major contributions include recognition of relevant entities using shallow parsing NLP techniques driven by a complementary use of ontological and terminological domain resources, and empirical derivation of context-driven RE rules for the recognition of semantic relationships from phrases of unstructured text.
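    A single hand-crafted rule in the spirit of OPTIMA's rule base; the pattern below is illustrative and not taken from the paper:

    import re

    # entity phrase (e.g. an archaeological cut feature) followed by a
    # deposition context introduced by "in"
    RULE = re.compile(
        r"(?P<entity>\w+ (?:pit|ditch|wall))\b.*?\bin (?P<context>the \w+ \w+)"
    )

    m = RULE.search("A medieval ditch was recorded in the northern trench.")
    if m:
        print(m.group("entity"), "->", m.group("context"))
        # medieval ditch -> the northern trench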
  11. Loehrlein, A.; Jacob, E.K.; Lee, S.; Yang, K.: Development of heuristics in a hybrid approach to faceted classification (2006) 0.03
    [Score detail: 0.028553344 from weight(_text_:word, tf=2.0); Lucene ClassicSimilarity]
    
    Abstract
    This paper describes work in progress to identify automated methods to complement and streamline the intellectual process in the generation of faceted schemes. It reports on the development of the word pair heuristic, the suffix heuristic, and the WordNet heuristic, and how the three heuristics integrate to produce an initial organization of terms from which a classificationist can more efficiently construct a faceted vocabulary.
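    A sketch of the suffix heuristic mentioned above, under the assumption that characteristic suffixes signal facet membership; the suffix-to-facet table is illustrative:

    from collections import defaultdict

    FACET_SUFFIXES = {"tion": "process", "ment": "process", "er": "agent"}

    def group_by_suffix(terms: list) -> dict:
        groups = defaultdict(list)
        for term in terms:
            for suffix, facet in FACET_SUFFIXES.items():
                if term.endswith(suffix):
                    groups[facet].append(term)
                    break
            else:
                groups["unassigned"].append(term)  # left for intellectual review
        return dict(groups)

    print(group_by_suffix(["classification", "indexer", "management", "ontology"]))
    # {'process': ['classification', 'management'], 'agent': ['indexer'],
    #  'unassigned': ['ontology']}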
  12. Stojanovic, N.: Ontology-based Information Retrieval : methods and tools for cooperative query answering (2005) 0.03
    [Score detail: 0.028438881 from weight(_text_:3a, tf=2.0); Lucene ClassicSimilarity]
    
    Content
    Cf.: http://digbib.ubka.uni-karlsruhe.de/volltexte/documents/1627.
  13. Li, J.; Zhang, Z.; Li, X.; Chen, H.: Kernel-based learning for biomedical relation extraction (2008) 0.02
    [Score detail: 0.024474295 from weight(_text_:word, tf=2.0); Lucene ClassicSimilarity]
    
    Abstract
    Relation extraction is the process of scanning text for relationships between named entities. Recently, significant studies have focused on automatically extracting relations from biomedical corpora. Most existing biomedical relation extractors require manual creation of biomedical lexicons or parsing templates based on domain knowledge. In this study, we propose to use kernel-based learning methods to automatically extract biomedical relations from literature text. We develop a framework of kernel-based learning for biomedical relation extraction. In particular, we modified the standard tree kernel function by incorporating a trace kernel to capture richer contextual information. In our experiments on a biomedical corpus, we compare different kernel functions for biomedical relation detection and classification. The experimental results show that a tree kernel outperforms word and sequence kernels for relation detection, our trace-tree kernel outperforms the standard tree kernel, and a composite kernel outperforms individual kernels for relation extraction.
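    The combination step can be sketched with toy kernels. The unigram-overlap kernel is a stand-in for the paper's word/sequence kernels, and the convex combination illustrates the composite kernel:

    def unigram_kernel(a: str, b: str) -> float:
        # toy stand-in for a word/sequence kernel: shared-unigram count
        return float(len(set(a.split()) & set(b.split())))

    def composite_kernel(k1, k2, alpha: float = 0.5):
        # a convex combination of two valid kernels is again a valid kernel
        return lambda x, y: alpha * k1(x, y) + (1 - alpha) * k2(x, y)

    k = composite_kernel(unigram_kernel, unigram_kernel)
    print(k("protein A binds receptor B", "protein A inhibits receptor B"))  # 4.0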
  14. Schmitz-Esser, W.: Formalizing terminology-based knowledge for an ontology independently of a particular language (2008) 0.02
    [Score detail: 0.024474295 from weight(_text_:word, tf=2.0); Lucene ClassicSimilarity]
    
    Abstract
    State-of-the-art ontological thought and practice are exemplified by an axiomatic framework [a model for an Integrative Cross-Language Ontology (ICLO); cf. Poli, R., Schmitz-Esser, W., forthcoming 2007] that is highly general, based on natural language, and multilingual, can be implemented as topic maps, and may be openly enhanced by software available for particular languages. Basics of ontological modelling, conditions for construction and maintenance, and the most salient points in application are addressed, such as cross-language text mining and knowledge generation. The rationale is to open readers' eyes to the tremendous potential of terminology-based ontologies for principled knowledge organization and for the interchange and reuse of formalized knowledge.
  15. Quillian, M.R.: Word concepts : a theory and simulation of some basic semantic capabilities. (1967) 0.02
    [Score detail: 0.024474295 from weight(_text_:word, tf=2.0); Lucene ClassicSimilarity]
    
  16. Wang, Y.; Tai, Y.; Yang, Y.: Determination of semantic types of tags in social tagging systems (2018) 0.02
    [Score detail: 0.024474295 from weight(_text_:word, tf=2.0); Lucene ClassicSimilarity]
    
    Abstract
    The purpose of this paper is to determine semantic types for tags in social tagging systems. In social tagging systems, the determination of the semantic type of tags plays an important role in tag classification, increasing the semantic information of tags, and establishing mapping relations between tagged resources and a normed ontology. The research reported in this paper constructs the needed semantic type library based on the Unified Medical Language System (UMLS) and FrameNet, and determines the semantic type of selected, pretreated tags via direct matching using the Semantic Navigator tool, the Semantic Type Word Sense Disambiguation (STWSD) tools in UMLS, and manual matching. Finally, we verify the feasibility of the determination of semantic types for tags by empirical analysis.
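    The direct-matching step can be sketched as a lookup against the semantic type library; the entries below are hypothetical, not taken from UMLS:

    # hypothetical fragment of a semantic type library built from UMLS/FrameNet
    SEMANTIC_TYPES = {
        "aspirin":  "Pharmacologic Substance",
        "headache": "Sign or Symptom",
    }

    def semantic_type(tag):
        # direct matching; unmatched tags would go on to the STWSD tools
        # and, failing that, to manual matching
        return SEMANTIC_TYPES.get(tag.strip().lower())

    print(semantic_type("Aspirin"))   # Pharmacologic Substance
    print(semantic_type("fatigue"))   # None -> needs WSD or manual matching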
  17. Wei, W.; Liu, Y.-P.; Wei, L-R.: Feature-level sentiment analysis based on rules and fine-grained domain ontology (2020) 0.02
    [Score detail: 0.024474295 from weight(_text_:word, tf=2.0); Lucene ClassicSimilarity]
    
    Abstract
    Mining product reviews and sentiment analysis are of great significance, whether for academic research or for optimizing business strategies. We propose a feature-level sentiment analysis framework based on rule parsing and a fine-grained domain ontology for Chinese reviews. The fine-grained ontology describes synonymous expressions of product features, which surface as word variation in online reviews. First, a semiautomatic construction method based on Word2Vec is developed for the fine-grained ontology. Then, feature-level sentiment analysis that combines rule parsing with the fine-grained domain ontology is conducted to extract explicit and implicit features from product reviews. Finally, a domain sentiment dictionary and a context sentiment dictionary are established to identify sentiment polarities for the extracted feature-sentiment combinations. An experiment is conducted on product reviews crawled from Chinese e-commerce websites. The results demonstrate the effectiveness of our approach.
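    The semiautomatic construction step can be sketched with gensim's Word2Vec (a real library interface; the toy corpus and parameters are illustrative):

    from gensim.models import Word2Vec  # assumes gensim >= 4.0

    # candidate synonymous feature expressions are mined from review text,
    # then curated into the fine-grained ontology
    sentences = [
        ["screen", "resolution", "is", "sharp"],
        ["display", "resolution", "is", "sharp"],
        ["battery", "drains", "fast"],
    ]
    model = Word2Vec(sentences, vector_size=32, window=2, min_count=1, seed=1)
    print(model.wv.most_similar("screen", topn=2))  # synonym candidates for "screen"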
  18. Hjoerland, B.: Semantics and knowledge organization (2007) 0.02
    [Score detail: 0.020395245 from weight(_text_:word, tf=2.0); Lucene ClassicSimilarity]
    
    Abstract
    The aim of this chapter is to demonstrate that semantic issues underlie all research questions within Library and Information Science (LIS, or, as hereafter, IS) and, in particular, the subfield known as Knowledge Organization (KO). Further, it seeks to show that semantics is a field influenced by conflicting views and discusses why it is important to argue for the most fruitful one of these. Moreover, the chapter demonstrates that IS has not yet addressed semantic problems in systematic fashion and examines why the field is very fragmented and without a proper theoretical basis. The focus here is on broad interdisciplinary issues and the long-term perspective. The theoretical problems involving semantics and concepts are very complicated. Therefore, this chapter starts by considering tools developed in KO for information retrieval (IR) as basically semantic tools. In this way, it establishes a specific IS focus on the relation between KO and semantics. It is well known that thesauri consist of a selection of concepts supplemented with information about their semantic relations (such as generic relations or "associative relations"). Some words in thesauri are "preferred terms" (descriptors), whereas others are "lead-in terms." The descriptors represent concepts. The difference between "a word" and "a concept" is that different words may have the same meaning and similar words may have different meanings, whereas one concept expresses one meaning.
  19. Jiang, X.; Tan, A.-H.: CRCTOL: a semantic-based domain ontology learning system (2009) 0.02
    [Score detail: 0.020395245 from weight(_text_:word, tf=2.0); Lucene ClassicSimilarity]
    
    Abstract
    Domain ontologies play an important role in supporting knowledge-based applications in the Semantic Web. To facilitate the building of ontologies, text mining techniques have been used to perform ontology learning from texts. However, traditional systems employ shallow natural language processing techniques and focus only on concept and taxonomic relation extraction. In this paper we present a system, known as Concept-Relation-Concept Tuple-based Ontology Learning (CRCTOL), for mining ontologies automatically from domain-specific documents. Specifically, CRCTOL adopts a full text parsing technique and employs a combination of statistical and lexico-syntactic methods, including a statistical algorithm that extracts key concepts from a document collection, a word sense disambiguation algorithm that disambiguates words in the key concepts, a rule-based algorithm that extracts relations between the key concepts, and a modified generalized association rule mining algorithm that prunes unimportant relations for ontology learning. As a result, the ontologies learned by CRCTOL are more concise and contain a richer semantics in terms of the range and number of semantic relations compared with alternative systems. We present two case studies where CRCTOL is used to build a terrorism domain ontology and a sport event domain ontology. At the component level, quantitative evaluation by comparing with Text-To-Onto and its successor Text2Onto has shown that CRCTOL is able to extract concepts and semantic relations with a significantly higher level of accuracy. At the ontology level, the quality of the learned ontologies is evaluated by either employing a set of quantitative and qualitative methods including analyzing the graph structural property, comparison to WordNet, and expert rating, or directly comparing with a human-edited benchmark ontology, demonstrating the high quality of the ontologies learned.
  20. Saruladha, K.; Aghila, G.; Penchala, S.K.: Design of new indexing techniques based on ontology for information retrieval systems (2010) 0.02
    [Score detail: 0.020395245 from weight(_text_:word, tf=2.0); Lucene ClassicSimilarity]
    
    Abstract
    Information retrieval (IR) is the science of searching for documents, for information within documents, and for metadata about documents, as well as searching relational databases and the World Wide Web. This paper describes a document representation method that uses ontological descriptors instead of keywords. The purpose of this paper is to propose a system for content-based querying of texts, based on the availability of an ontology for the concepts in the text domain, and to develop new indexing methods that improve the RSV (retrieval status value). There is a need to query ontologies at various granularities so as to retrieve information from various sources, to suit the requirements of the Semantic Web, and to eliminate the mismatch between user request and the response from the information retrieval system. Most search engines use indexes that are built at the syntactical level and return hits based on simple string comparisons. Such indexes do not contain synonyms, cannot differentiate between homonyms, and users receive different search results when they use different conjugation forms of the same word.

Languages

  • e (English): 64
  • d (German): 13

Types

  • a 62
  • el 13
  • x 6
  • m 4
  • n 1
  • r 1
  • s 1