Search (11 results, page 1 of 1)

Xiong, C.: Knowledge based text representations for information retrieval (2016) 0.22

0.21602866 = product of:
  0.5040669 = sum of:
    0.031791996 = product of:
      0.095375985 = sum of:
        0.095375985 = weight(_text_:3a in 5820) [ClassicSimilarity], result of:
          0.095375985 = score(doc=5820,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.3746787 = fieldWeight in 5820, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=5820)
      0.33333334 = coord(1/3)
    0.13488202 = weight(_text_:2f in 5820) [ClassicSimilarity], result of:
      0.13488202 = score(doc=5820,freq=4.0), product of:
        0.25455406 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03002521 = queryNorm
        0.5298757 = fieldWeight in 5820, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03125 = fieldNorm(doc=5820)
    0.047687992 = product of:
      0.095375985 = sum of:
        0.095375985 = weight(_text_:3a in 5820) [ClassicSimilarity], result of:
          0.095375985 = score(doc=5820,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.3746787 = fieldWeight in 5820, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=5820)
      0.5 = coord(1/2)
    0.13488202 = weight(_text_:2f in 5820) [ClassicSimilarity], result of:
      0.13488202 = score(doc=5820,freq=4.0), product of:
        0.25455406 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03002521 = queryNorm
        0.5298757 = fieldWeight in 5820, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03125 = fieldNorm(doc=5820)
    0.13488202 = weight(_text_:2f in 5820) [ClassicSimilarity], result of:
      0.13488202 = score(doc=5820,freq=4.0), product of:
        0.25455406 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03002521 = queryNorm
        0.5298757 = fieldWeight in 5820, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03125 = fieldNorm(doc=5820)
    0.019940836 = product of:
      0.039881673 = sum of:
        0.039881673 = weight(_text_:texts in 5820) [ClassicSimilarity], result of:
          0.039881673 = score(doc=5820,freq=2.0), product of:
            0.16460659 = queryWeight, product of:
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.03002521 = queryNorm
            0.2422848 = fieldWeight in 5820, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.03125 = fieldNorm(doc=5820)
      0.5 = coord(1/2)
  0.42857143 = coord(6/14)
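
The leaf values in the explanation above can be reproduced by hand. The following is a minimal sketch, assuming the classic TF-IDF definitions that the printed figures appear to follow (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The function name `term_score` is hypothetical; the inputs are copied from the `_text_:3a` and `_text_:2f` clauses for doc 5820.

```python
import math

def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Recompute one weight(...) leaf of a ClassicSimilarity-style explanation."""
    tf = math.sqrt(freq)                              # tf(freq) = sqrt(termFreq)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # assumed classic idf formula
    query_weight = idf * query_norm                   # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm              # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight

# Values taken from the explanation for doc 5820.
score_3a = term_score(freq=2.0, doc_freq=24, max_docs=44218,
                      query_norm=0.03002521, field_norm=0.03125)
score_2f = term_score(freq=4.0, doc_freq=24, max_docs=44218,
                      query_norm=0.03002521, field_norm=0.03125)
print(score_3a, score_2f)
```

Up to floating-point rounding, the two results agree with the 0.095375985 and 0.13488202 reported in the tree.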

Abstract: This proposal includes plans to improve the quality of relevant entities with a co-learning framework that learns from both entity labels and document labels. We also plan to develop a hybrid ranking system that combines word based and entity based representations together with their uncertainties considered. Finally, we plan to enrich the text representations with connections between entities. We propose several ways to infer entity graph representations for texts, and to rank documents using their structure representations. This dissertation overcomes the limitation of word based representations with external and carefully curated information from knowledge bases. We believe this thesis research is a solid start towards the new generation of intelligent, semantic, and structured information retrieval.
Content: Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Language and Information Technologies. Cf.: https%3A%2F%2Fwww.cs.cmu.edu%2F~cx%2Fpapers%2Fknowledge_based_text_representation.pdf&usg=AOvVaw0SaTSvhWLTh__Uz_HtOtl3.

Stojanovic, N.: Ontology-based Information Retrieval : methods and tools for cooperative query answering (2005) 0.13

0.13057427 = product of:
  0.36560795 = sum of:
    0.031791996 = product of:
      0.095375985 = sum of:
        0.095375985 = weight(_text_:3a in 701) [ClassicSimilarity], result of:
          0.095375985 = score(doc=701,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.3746787 = fieldWeight in 701, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=701)
      0.33333334 = coord(1/3)
    0.095375985 = weight(_text_:2f in 701) [ClassicSimilarity], result of:
      0.095375985 = score(doc=701,freq=2.0), product of:
        0.25455406 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03002521 = queryNorm
        0.3746787 = fieldWeight in 701, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03125 = fieldNorm(doc=701)
    0.047687992 = product of:
      0.095375985 = sum of:
        0.095375985 = weight(_text_:3a in 701) [ClassicSimilarity], result of:
          0.095375985 = score(doc=701,freq=2.0), product of:
            0.25455406 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03002521 = queryNorm
            0.3746787 = fieldWeight in 701, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=701)
      0.5 = coord(1/2)
    0.095375985 = weight(_text_:2f in 701) [ClassicSimilarity], result of:
      0.095375985 = score(doc=701,freq=2.0), product of:
        0.25455406 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03002521 = queryNorm
        0.3746787 = fieldWeight in 701, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03125 = fieldNorm(doc=701)
    0.095375985 = weight(_text_:2f in 701) [ClassicSimilarity], result of:
      0.095375985 = score(doc=701,freq=2.0), product of:
        0.25455406 = queryWeight, product of:
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03002521 = queryNorm
        0.3746787 = fieldWeight in 701, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          8.478011 = idf(docFreq=24, maxDocs=44218)
          0.03125 = fieldNorm(doc=701)
  0.35714287 = coord(5/14)
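
The top-level figure combines the clause scores with a coordination factor. Below is a minimal sketch, assuming the behavior the tree itself shows: matching clause scores are summed and then multiplied by coord(matched/total). The helper `combine` is hypothetical; the clause values are copied from the explanation for doc 701.

```python
def combine(clause_scores, matched, total):
    """Sum the matching clause scores and scale by coord(matched/total)."""
    return sum(clause_scores) * (matched / total)

leaf = 0.095375985  # every leaf weight in the doc 701 tree has this value

clauses = [
    leaf * (1 / 3),  # nested _text_:3a clause scaled by coord(1/3)
    leaf,            # _text_:2f
    leaf * (1 / 2),  # nested _text_:3a clause scaled by coord(1/2)
    leaf,            # _text_:2f
    leaf,            # _text_:2f
]

# Five of the fourteen query clauses matched, hence coord(5/14).
total_701 = combine(clauses, matched=5, total=14)
print(total_701)
```

The result agrees with the 0.13057427 shown at the root of the tree, which is the value rounded to 0.13 in the result heading.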

Content: Cf.: http%3A%2F%2Fdigbib.ubka.uni-karlsruhe.de%2Fvolltexte%2Fdocuments%2F1627&ei=tAtYUYrBNoHKtQb3l4GYBw&usg=AFQjCNHeaxKkKU3-u54LWxMNYGXaaDLCGw&sig2=8WykXWQoDKjDSdGtAakH2Q&bvm=bv.44442042,d.Yms.

Tzitzikas, Y.: Collaborative ontology-based information indexing and retrieval (2002) 0.01
```
0.009839063 = product of:
  0.045915626 = sum of:
    0.013458292 = weight(_text_:classification in 2281) [ClassicSimilarity], result of:
      0.013458292 = score(doc=2281,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.14074548 = fieldWeight in 2281, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03125 = fieldNorm(doc=2281)
    0.01899904 = product of:
      0.03799808 = sum of:
        0.03799808 = weight(_text_:schemes in 2281) [ClassicSimilarity], result of:
          0.03799808 = score(doc=2281,freq=2.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.2364941 = fieldWeight in 2281, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03125 = fieldNorm(doc=2281)
      0.5 = coord(1/2)
    0.013458292 = weight(_text_:classification in 2281) [ClassicSimilarity], result of:
      0.013458292 = score(doc=2281,freq=2.0), product of:
        0.09562149 = queryWeight, product of:
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03002521 = queryNorm
        0.14074548 = fieldWeight in 2281, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1847067 = idf(docFreq=4974, maxDocs=44218)
          0.03125 = fieldNorm(doc=2281)
  0.21428572 = coord(3/14)
```
Abstract

An information system like the Web is a continuously evolving system consisting of multiple heterogeneous information sources, covering a wide domain of discourse, and a huge number of users (human or software) with diverse characteristics and needs, that produce and consume information. The challenge nowadays is to build a scalable information infrastructure enabling the effective, accurate, content based retrieval of information, in a way that adapts to the characteristics and interests of the users. The aim of this work is to propose formally sound methods for building such an information network based on ontologies which are widely used and are easy to grasp by ordinary Web users. The main results of this work are:

- A novel scheme for indexing and retrieving objects according to multiple aspects or facets. The proposed scheme is a faceted scheme enriched with a method for specifying the combinations of terms that are valid. We give a model-theoretic interpretation to this model and we provide mechanisms for inferring the valid combinations of terms. This inference service can be exploited for preventing errors during the indexing process, which is very important especially in the case where the indexing is done collaboratively by many users, and for deriving "complete" navigation trees suitable for browsing through the Web. The proposed scheme has several advantages over the hierarchical classification schemes currently employed by Web catalogs, namely, conceptual clarity (it is easier to understand), compactness (it takes less space), and scalability (the update operations can be formulated more easily and be performed more efficiently).
- A flexible and efficient model for building mediators over ontology based information sources. The proposed mediators support several modes of query translation and evaluation which can accommodate various application needs and levels of answer quality.
The proposed model can be used for providing users with customized views of Web catalogs. It can also complement the techniques for building mediators over relational sources so as to support approximate translation of partially ordered domain values.
Sebastian, Y.: Literature-based discovery by learning heterogeneous bibliographic information networks (2017) 0.00
```
0.0032120824 = product of:
  0.044969153 = sum of:
    0.044969153 = weight(_text_:bibliographic in 535) [ClassicSimilarity], result of:
      0.044969153 = score(doc=535,freq=10.0), product of:
        0.11688946 = queryWeight, product of:
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03002521 = queryNorm
        0.3847152 = fieldWeight in 535, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.893044 = idf(docFreq=2449, maxDocs=44218)
          0.03125 = fieldNorm(doc=535)
  0.071428575 = coord(1/14)
```
Abstract

Literature-based discovery (LBD) research aims at finding effective computational methods for predicting previously unknown connections between clusters of research papers from disparate research areas. Existing methods encompass two general approaches. The first approach searches for these unknown connections by examining the textual contents of research papers. In addition to the existing textual features, the second approach incorporates structural features of scientific literatures, such as citation structures. These approaches, however, have not considered research papers' latent bibliographic metadata structures as important features that can be used for predicting previously unknown relationships between them. This thesis investigates a new graph-based LBD method that exploits the latent bibliographic metadata connections between pairs of research papers. The heterogeneous bibliographic information network is proposed as an efficient graph-based data structure for modeling the complex relationships between these metadata. In contrast to previous approaches, this method seamlessly combines textual and citation information in the form of path-based metadata features for predicting future co-citation links between research papers from disparate research fields. The results reported in this thesis provide evidence that the method is effective for reconstructing the historical literature-based discovery hypotheses. This thesis also investigates the effects of semantic modeling and topic modeling on the performance of the proposed method. For semantic modeling, a general-purpose word sense disambiguation technique is proposed to reduce the lexical ambiguity in the title and abstract of research papers. The experimental results suggest that the reduced lexical ambiguity did not necessarily lead to a better performance of the method. This thesis discusses some of the possible contributing factors to these results.
Finally, topic modeling is used for learning the latent topical relations between research papers. The learned topic model is incorporated into the heterogeneous bibliographic information network graph and allows new predictive features to be learned. The results in this thesis suggest that topic modeling improves the performance of the proposed method by increasing the overall accuracy for predicting the future co-citation links between disparate research papers.
Ziemba, L.: Information retrieval with concept discovery in digital collections for agriculture and natural resources (2011) 0.00
```
0.0020143287 = product of:
  0.028200602 = sum of:
    0.028200602 = product of:
      0.056401204 = sum of:
        0.056401204 = weight(_text_:texts in 4728) [ClassicSimilarity], result of:
          0.056401204 = score(doc=4728,freq=4.0), product of:
            0.16460659 = queryWeight, product of:
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.03002521 = queryNorm
            0.34264246 = fieldWeight in 4728, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.03125 = fieldNorm(doc=4728)
      0.5 = coord(1/2)
  0.071428575 = coord(1/14)
```
Abstract

The amount and complexity of information available in a digital form is already huge and new information is being produced every day. Retrieving information relevant to address a particular need becomes a significant issue. This work utilizes knowledge organization systems (KOS), such as thesauri and ontologies and applies information extraction (IE) and computational linguistics (CL) techniques to organize, manage and retrieve information stored in digital collections in the agricultural domain. Two real world applications of the approach have been developed and are available and actively used by the public. An ontology is used to manage the Water Conservation Digital Library holding a dynamic collection of various types of digital resources in the domain of urban water conservation in Florida, USA. The ontology based back-end powers a fully operational web interface, available at http://library.conservefloridawater.org. The system has demonstrated numerous benefits of the ontology application, including accurate retrieval of resources, information sharing and reuse, and has proved to effectively facilitate information management. The major difficulty encountered with the approach is that large and dynamic number of concepts makes it difficult to keep the ontology consistent and to accurately catalog resources manually. To address the aforementioned issues, a combination of IE and CL techniques, such as Vector Space Model and probabilistic parsing, with the use of Agricultural Thesaurus were adapted to automatically extract concepts important for each of the texts in the Best Management Practices (BMP) Publication Library--a collection of documents in the domain of agricultural BMPs in Florida available at http://lyra.ifas.ufl.edu/LIB. A new approach of domain-specific concept discovery with the use of Internet search engine was developed. Initial evaluation of the results indicates significant improvement in precision of information extraction. 
The approach presented in this work focuses on problems unique to agriculture and natural resources domain, such as domain specific concepts and vocabularies, but should be applicable to any collection of texts in digital format. It may be of potential interest for anyone who needs to effectively manage a collection of digital resources.
Schwarz, K.: Domain model enhanced search : a comparison of taxonomy, thesaurus and ontology (2005) 0.00
```
0.001919193 = product of:
  0.026868701 = sum of:
    0.026868701 = product of:
      0.053737402 = sum of:
        0.053737402 = weight(_text_:schemes in 4569) [ClassicSimilarity], result of:
          0.053737402 = score(doc=4569,freq=4.0), product of:
            0.16067243 = queryWeight, product of:
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03002521 = queryNorm
            0.33445317 = fieldWeight in 4569, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.3512506 = idf(docFreq=569, maxDocs=44218)
              0.03125 = fieldNorm(doc=4569)
      0.5 = coord(1/2)
  0.071428575 = coord(1/14)
```
Abstract

The results of this thesis are intended to support the information architect in designing a solution for improved search in a corporate environment. Specifically we have examined the type of search problems that require a domain model to enhance the search process. There are several approaches to modeling a domain. We have considered 3 different types of domain modeling schemes: taxonomy, thesaurus and ontology. The intention is to support the information architect in making an informed choice between one or more of these schemes. In our opinion the main criteria for this choice are the modeling characteristics of a scheme and the suitability for application in the search process. The second chapter is a discussion of modeling characteristics of each scheme, followed by a comparison between them. This should give an information architect an idea of which aspects of a domain can be modeled with each scheme. What is missing here is an indication of the effort required to model a domain with each scheme. There are too many factors that influence the amount of required effort, ranging from measurable factors like domain size and resource characteristics to cultural matters such as the willingness to share knowledge and the existence of a project champion in the team to keep the project running. The third chapter shows what role domain models can play in each part of the search process. This gives an idea of the problems that domain models can solve. We have split the search process into individual parts to show that domain models can be applied very differently in the process. The fourth chapter makes recommendations about the suitability of each individual domain modeling scheme for improving search. Each scheme has particular characteristics that make it especially suitable for a domain or a search problem. In the appendix each case study is described in detail. These descriptions are intended to serve as a benchmark.
The current problem of the enterprise can be compared to those described to see which case study is most similar, which solution was chosen, which problems arose and how they were dealt with. An important issue that we have not touched upon in this thesis is that of maintenance. The real problems of a domain model are revealed when it is applied in a search system and its deficits and wrong assumptions become clear. Adaptation and maintenance are always required. Unfortunately we have not been able to glean sufficient information about maintenance issues from our case studies to draw any meaningful conclusions.
Onofri, A.: Concepts in context (2013) 0.00
```
0.0015003269 = product of:
  0.021004576 = sum of:
    0.021004576 = weight(_text_:subject in 1077) [ClassicSimilarity], result of:
      0.021004576 = score(doc=1077,freq=4.0), product of:
        0.10738805 = queryWeight, product of:
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.03002521 = queryNorm
        0.1955951 = fieldWeight in 1077, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.576596 = idf(docFreq=3361, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1077)
  0.071428575 = coord(1/14)
```
Abstract

My thesis discusses two related problems that have taken center stage in the recent literature on concepts: 1) What are the individuation conditions of concepts? Under what conditions is a concept C1 the same concept as a concept C2? 2) What are the possession conditions of concepts? What conditions must be satisfied for a thinker to have a concept C? The thesis defends a novel account of concepts, which I call "pluralist-contextualist": 1) Pluralism: Different concepts have different kinds of individuation and possession conditions: some concepts are individuated more "coarsely", have less demanding possession conditions and are widely shared, while other concepts are individuated more "finely" and not shared. 2) Contextualism: When a speaker ascribes a propositional attitude to a subject S, or uses his ascription to explain/predict S's behavior, the speaker's intentions in the relevant context determine the correct individuation conditions for the concepts involved in his report. In chapters 1-3 I defend a contextualist, non-Millian theory of propositional attitude ascriptions. Then, I show how contextualism can be used to offer a novel perspective on the problem of concept individuation/possession. More specifically, I employ contextualism to provide a new, more effective argument for Fodor's "publicity principle": if contextualism is true, then certain specific concepts must be shared in order for interpersonally applicable psychological generalizations to be possible. In chapters 4-5 I raise a tension between publicity and another widely endorsed principle, the "Fregean constraint" (FC): subjects who are unaware of certain identity facts and find themselves in so-called "Frege cases" must have distinct concepts for the relevant object x. For instance: the ancient astronomers had distinct concepts (HESPERUS/PHOSPHORUS) for the same object (the planet Venus).
First, I examine some leading theories of concepts and argue that they cannot meet both of our constraints at the same time. Then, I offer principled reasons to think that no theory can satisfy (FC) while also respecting publicity. (FC) appears to require a form of holism, on which a concept is individuated by its global inferential role in a subject S and can thus only be shared by someone who has exactly the same inferential dispositions as S. This explains the tension between publicity and (FC), since holism is clearly incompatible with concept shareability. To solve the tension, I suggest adopting my pluralist-contextualist proposal: concepts involved in Frege cases are holistically individuated and not public, while other concepts are more coarsely individuated and widely shared; given this "plurality" of concepts, we will then need contextual factors (speakers' intentions) to "select" the specific concepts to be employed in our intentional generalizations in the relevant contexts. In chapter 6 I develop the view further by contrasting it with some rival accounts. First, I examine a very different kind of pluralism about concepts, which has been recently defended by Daniel Weiskopf, and argue that it is insufficiently radical. Then, I consider the inferentialist accounts defended by authors like Peacocke, Rey and Jackson. Such views, I argue, are committed to an implausible picture of reference determination, on which our inferential dispositions fix the reference of our concepts: this leads to wrong predictions in all those cases of scientific disagreement where two parties have very different inferential dispositions and yet seem to refer to the same natural kind.

Haller, S.H.M.: Mappingverfahren zur Wissensorganisation (2002) 0.00

0.0014528577 = product of:
  0.020340007 = sum of:
    0.020340007 = product of:
      0.040680014 = sum of:
        0.040680014 = weight(_text_:22 in 3406) [ClassicSimilarity], result of:
          0.040680014 = score(doc=3406,freq=2.0), product of:
            0.10514317 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03002521 = queryNorm
            0.38690117 = fieldWeight in 3406, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.078125 = fieldNorm(doc=3406)
      0.5 = coord(1/2)
  0.071428575 = coord(1/14)

Date: 30. 5.2010 16:22:35

Styltsvig, H.B.: Ontology-based information retrieval (2006) 0.00
```
0.0014243455 = product of:
  0.019940836 = sum of:
    0.019940836 = product of:
      0.039881673 = sum of:
        0.039881673 = weight(_text_:texts in 1154) [ClassicSimilarity], result of:
          0.039881673 = score(doc=1154,freq=2.0), product of:
            0.16460659 = queryWeight, product of:
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.03002521 = queryNorm
            0.2422848 = fieldWeight in 1154, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.4822793 = idf(docFreq=499, maxDocs=44218)
              0.03125 = fieldNorm(doc=1154)
      0.5 = coord(1/2)
  0.071428575 = coord(1/14)
```
Abstract

In this thesis, we will present methods for introducing ontologies in information retrieval. The main hypothesis is that the inclusion of conceptual knowledge such as ontologies in the information retrieval process can contribute to the solution of major problems currently found in information retrieval. This utilization of ontologies has a number of challenges. Our focus is on the use of similarity measures derived from the knowledge about relations between concepts in ontologies, the recognition of semantic information in texts and the mapping of this knowledge into the ontologies in use, as well as how to fuse together the ideas of ontological similarity and ontological indexing into a realistic information retrieval scenario. To achieve the recognition of semantic knowledge in a text, shallow natural language processing is used during indexing that reveals knowledge to the level of noun phrases. Furthermore, we briefly cover the identification of semantic relations inside and between noun phrases, as well as discuss which kind of problems are caused by an increase in compoundness with respect to the structure of concepts in the evaluation of queries. Measuring similarity between concepts based on distances in the structure of the ontology is discussed. In addition, a shared nodes measure is introduced and, based on a set of intuitive similarity properties, compared to a number of different measures. In this comparison the shared nodes measure appears to be superior, though more computationally complex. Some of the major problems of shared nodes which relate to the way relations differ with respect to the degree they bring the concepts they connect closer are discussed. A generalized measure called weighted shared nodes is introduced to deal with these problems. Finally, the utilization of concept similarity in query evaluation is discussed. 
A semantic expansion approach that incorporates concept similarity is introduced and a generalized fuzzy set retrieval model that applies expansion during query evaluation is presented. While not commonly used in present information retrieval systems, it appears that the fuzzy set model comprises the flexibility needed when generalizing to an ontology-based retrieval model and, with the introduction of a hierarchical fuzzy aggregation principle, compound concepts can be handled in a straightforward and natural manner.

Müller, T.: Wissensrepräsentation mit semantischen Netzen im Bereich Luftfahrt (2006) 0.00

7.264289E-4 = product of:
  0.010170003 = sum of:
    0.010170003 = product of:
      0.020340007 = sum of:
        0.020340007 = weight(_text_:22 in 1670) [ClassicSimilarity], result of:
          0.020340007 = score(doc=1670,freq=2.0), product of:
            0.10514317 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03002521 = queryNorm
            0.19345059 = fieldWeight in 1670, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1670)
      0.5 = coord(1/2)
  0.071428575 = coord(1/14)

Date: 26. 9.2006 21:00:22

Kiren, T.: ¬A clustering based indexing technique of modularized ontologies for information retrieval (2017) 0.00

5.811431E-4 = product of:
  0.008136002 = sum of:
    0.008136002 = product of:
      0.016272005 = sum of:
        0.016272005 = weight(_text_:22 in 4399) [ClassicSimilarity], result of:
          0.016272005 = score(doc=4399,freq=2.0), product of:
            0.10514317 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03002521 = queryNorm
            0.15476047 = fieldWeight in 4399, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=4399)
      0.5 = coord(1/2)
  0.071428575 = coord(1/14)

Date: 20. 1.2015 18:30:22
