Search (9 results, page 1 of 1)

Farazi, M.: Faceted lightweight ontologies : a formalization and some experiments (2010) 0.07
```
0.066725716 = product of:
  0.10008857 = sum of:
    0.06653893 = product of:
      0.19961677 = sum of:
        0.19961677 = weight(_text_:3a in 4997) [ClassicSimilarity], result of:
          0.19961677 = score(doc=4997,freq=2.0), product of:
            0.4262143 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.05027291 = queryNorm
            0.46834838 = fieldWeight in 4997, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4997)
      0.33333334 = coord(1/3)
    0.03354964 = weight(_text_:search in 4997) [ClassicSimilarity], result of:
      0.03354964 = score(doc=4997,freq=2.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.19200584 = fieldWeight in 4997, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4997)
  0.6666667 = coord(2/3)
```
Abstract

While classifications are heavily used to categorize web content, the evolution of the web foresees a more formal structure - ontology - which can serve this purpose. Ontologies are core artifacts of the Semantic Web which enable machines to use inference rules to conduct automated reasoning on data. Lightweight ontologies bridge the gap between classifications and ontologies. A lightweight ontology (LO) is an ontology representing a backbone taxonomy where the concept of the child node is more specific than the concept of the parent node. Formal lightweight ontologies can be generated from their informal ones. The key applications of formal lightweight ontologies are document classification, semantic search, and data integration. However, these applications suffer from the following problems: the disambiguation accuracy of the state of the art NLP tools used in generating formal lightweight ontologies from their informal ones; the lack of background knowledge needed for the formal lightweight ontologies; and the limitation of ontology reuse. In this dissertation, we propose a novel solution to these problems in formal lightweight ontologies; namely, faceted lightweight ontology (FLO). FLO is a lightweight ontology in which terms, present in each node label, and their concepts, are available in the background knowledge (BK), which is organized as a set of facets. A facet can be defined as a distinctive property of the groups of concepts that can help in differentiating one group from another. Background knowledge can be defined as a subset of a knowledge base, such as WordNet, and often represents a specific domain.

Content

PhD Dissertation at International Doctorate School in Information and Communication Technology. Vgl.: https%3A%2F%2Fcore.ac.uk%2Fdownload%2Fpdf%2F150083013.pdf&usg=AOvVaw2n-qisNagpyT0lli_6QbAQ.
Vocht, L. De: Exploring semantic relationships in the Web of Data : Semantische relaties verkennen in data op het web (2017) 0.06
```
0.060695343 = product of:
  0.09104301 = sum of:
    0.07311975 = weight(_text_:search in 4232) [ClassicSimilarity], result of:
      0.07311975 = score(doc=4232,freq=38.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.41846704 = fieldWeight in 4232, product of:
          6.164414 = tf(freq=38.0), with freq of:
            38.0 = termFreq=38.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.01953125 = fieldNorm(doc=4232)
    0.01792326 = product of:
      0.03584652 = sum of:
        0.03584652 = weight(_text_:engines in 4232) [ClassicSimilarity], result of:
          0.03584652 = score(doc=4232,freq=2.0), product of:
            0.25542772 = queryWeight, product of:
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.05027291 = queryNorm
            0.1403392 = fieldWeight in 4232, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.080822 = idf(docFreq=746, maxDocs=44218)
              0.01953125 = fieldNorm(doc=4232)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

After the launch of the World Wide Web, it became clear that searching documentson the Web would not be trivial. Well-known engines to search the web, like Google, focus on search in web documents using keywords. The documents are structured and indexed to ensure keywords match documents as accurately as possible. However, searching by keywords does not always suice. It is oen the case that users do not know exactly how to formulate the search query or which keywords guarantee retrieving the most relevant documents. Besides that, it occurs that users rather want to browse information than looking up something specific. It turned out that there is need for systems that enable more interactivity and facilitate the gradual refinement of search queries to explore the Web. Users expect more from the Web because the short keyword-based queries they pose during search, do not suffice for all cases. On top of that, the Web is changing structurally. The Web comprises, apart from a collection of documents, more and more linked data, pieces of information structured so they can be processed by machines. The consequently applied semantics allow users to exactly indicate machines their search intentions. This is made possible by describing data following controlled vocabularies, concept lists composed by experts, published uniquely identifiable on the Web. Even so, it is still not trivial to explore data on the Web. There is a large variety of vocabularies and various data sources use different terms to identify the same concepts.
This PhD-thesis describes how to effectively explore linked data on the Web. The main focus is on scenarios where users want to discover relationships between resources rather than finding out more about something specific. Searching for a specific document or piece of information fits in the theoretical framework of information retrieval and is associated with exploratory search. Exploratory search goes beyond 'looking up something' when users are seeking more detailed understanding, further investigation or navigation of the initial search results. The ideas behind exploratory search and querying linked data merge when it comes to the way knowledge is represented and indexed by machines - how data is structured and stored for optimal searchability. Queries and information should be aligned to facilitate that searches also reveal connections between results. This implies that they take into account the same semantic entities, relevant at that moment. To realize this, we research three techniques that are evaluated one by one in an experimental set-up to assess how well they succeed in their goals. In the end, the techniques are applied to a practical use case that focuses on forming a bridge between the Web and the use of digital libraries in scientific research. Our first technique focuses on the interactive visualization of search results. Linked data resources can be brought in relation with each other at will. This leads to complex and diverse graphs structures. Our technique facilitates navigation and supports a workflow starting from a broad overview on the data and allows narrowing down until the desired level of detail to then broaden again. To validate the flow, two visualizations where implemented and presented to test-users. The users judged the usability of the visualizations, how the visualizations fit in the workflow and to which degree their features seemed useful for the exploration of linked data.
The ideas behind exploratory search and querying linked data merge when it comes to the way knowledge is represented and indexed by machines - how data is structured and stored for optimal searchability. eries and information should be aligned to facilitate that searches also reveal connections between results. This implies that they take into account the same semantic entities, relevant at that moment. To realize this, we research three techniques that are evaluated one by one in an experimental set-up to assess how well they succeed in their goals. In the end, the techniques are applied to a practical use case that focuses on forming a bridge between the Web and the use of digital libraries in scientific research.
Our first technique focuses on the interactive visualization of search results. Linked data resources can be brought in relation with each other at will. This leads to complex and diverse graphs structures. Our technique facilitates navigation and supports a workflow starting from a broad overview on the data and allows narrowing down until the desired level of detail to then broaden again. To validate the flow, two visualizations where implemented and presented to test-users. The users judged the usability of the visualizations, how the visualizations fit in the workflow and to which degree their features seemed useful for the exploration of linked data. There is a difference in the way users interact with resources, visually or textually, and how resources are represented for machines to be processed by algorithms. This difference complicates bridging the users' intents and machine executable queries. It is important to implement this 'translation' mechanism to impact the search as favorable as possible in terms of performance, complexity and accuracy. To do this, we explain a second technique, that supports such a bridging component. Our second technique is developed around three features that support the search process: looking up, relating and ranking resources. The main goal is to ensure that resources in the results are as precise and relevant as possible. During the evaluation of this technique, we did not only look at the precision of the search results but also investigated how the effectiveness of the search evolved while the user executed certain actions sequentially.
When we speak about finding relationships between resources, it is necessary to dive deeper in the structure. The graph structure of linked data where the semantics give meaning to the relationships between resources enable the execution of pathfinding algorithms. The assigned weights and heuristics are base components of such algorithms and ultimately define (the order) which resources are included in a path. These paths explain indirect connections between resources. Our third technique proposes an algorithm that optimizes the choice of resources in terms of serendipity. Some optimizations guard the consistence of candidate-paths where the coherence of consecutive connections is maximized to avoid trivial and too arbitrary paths. The implementation uses the A* algorithm, the de-facto reference when it comes to heuristically optimized minimal cost paths. The effectiveness of paths was measured based on common automatic metrics and surveys where the users could indicate their preference for paths, generated each time in a different way. Finally, all our techniques are applied to a use case about publications in digital libraries where they are aligned with information about scientific conferences and researchers. The application to this use case is a practical example because the different aspects of exploratory search come together. In fact, the techniques also evolved from the experiences when implementing the use case. Practical details about the semantic model are explained and the implementation of the search system is clarified module by module. The evaluation positions the result, a prototype of a tool to explore scientific publications, researchers and conferences next to some important alternatives.
Xiong, C.: Knowledge based text representations for information retrieval (2016) 0.05
```
0.05338057 = product of:
  0.08007085 = sum of:
    0.053231142 = product of:
      0.15969342 = sum of:
        0.15969342 = weight(_text_:3a in 5820) [ClassicSimilarity], result of:
          0.15969342 = score(doc=5820,freq=2.0), product of:
            0.4262143 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.05027291 = queryNorm
            0.3746787 = fieldWeight in 5820, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03125 = fieldNorm(doc=5820)
      0.33333334 = coord(1/3)
    0.026839713 = weight(_text_:search in 5820) [ClassicSimilarity], result of:
      0.026839713 = score(doc=5820,freq=2.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.15360467 = fieldWeight in 5820, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.03125 = fieldNorm(doc=5820)
  0.6666667 = coord(2/3)
```
Abstract

The successes of information retrieval (IR) in recent decades were built upon bag-of-words representations. Effective as it is, bag-of-words is only a shallow text understanding; there is a limited amount of information for document ranking in the word space. This dissertation goes beyond words and builds knowledge based text representations, which embed the external and carefully curated information from knowledge bases, and provide richer and structured evidence for more advanced information retrieval systems. This thesis research first builds query representations with entities associated with the query. Entities' descriptions are used by query expansion techniques that enrich the query with explanation terms. Then we present a general framework that represents a query with entities that appear in the query, are retrieved by the query, or frequently show up in the top retrieved documents. A latent space model is developed to jointly learn the connections from query to entities and the ranking of documents, modeling the external evidence from knowledge bases and internal ranking features cooperatively. To further improve the quality of relevant entities, a defining factor of our query representations, we introduce learning to rank to entity search and retrieve better entities from knowledge bases. In the document representation part, this thesis research also moves one step forward with a bag-of-entities model, in which documents are represented by their automatic entity annotations, and the ranking is performed in the entity space.

Content

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Language and Information Technologies. Vgl.: https%3A%2F%2Fwww.cs.cmu.edu%2F~cx%2Fpapers%2Fknowledge_based_text_representation.pdf&usg=AOvVaw0SaTSvhWLTh__Uz_HtOtl3.
Kiren, T.: ¬A clustering based indexing technique of modularized ontologies for information retrieval (2017) 0.03
```
0.03438644 = product of:
  0.051579658 = sum of:
    0.037957087 = weight(_text_:search in 4399) [ClassicSimilarity], result of:
      0.037957087 = score(doc=4399,freq=4.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.21722981 = fieldWeight in 4399, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.03125 = fieldNorm(doc=4399)
    0.013622572 = product of:
      0.027245143 = sum of:
        0.027245143 = weight(_text_:22 in 4399) [ClassicSimilarity], result of:
          0.027245143 = score(doc=4399,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.15476047 = fieldWeight in 4399, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=4399)
      0.5 = coord(1/2)
  0.6666667 = coord(2/3)
```
Abstract

Modular ontologies are built in modular manner by combining modules from multiple relevant ontologies. Ontology heterogeneity also arises during modular ontology construction because multiple ontologies are being dealt with, during this process. Ontologies need to be aligned before using them for modular ontology construction. The existing approaches for ontology alignment compare all the concepts of each ontology to be aligned, hence not optimized in terms of time and search space utilization. A new indexing technique is proposed based on modular ontology. An efficient ontology alignment technique is proposed to solve the heterogeneity problem during the construction of modular ontology. Results are satisfactory as Precision and Recall are improved by (8%) and (10%) respectively. The value of Pearsons Correlation Coefficient for degree of similarity, time, search space requirement, precision and recall are close to 1 which shows that the results are significant. Further research can be carried out for using modular ontology based indexing technique for Multimedia Information Retrieval and Bio-Medical information retrieval.

Date

20. 1.2015 18:30:22
Líska, M.: Evaluation of mathematics retrieval (2013) 0.02
```
0.015656501 = product of:
  0.0469695 = sum of:
    0.0469695 = weight(_text_:search in 1653) [ClassicSimilarity], result of:
      0.0469695 = score(doc=1653,freq=2.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.2688082 = fieldWeight in 1653, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1653)
  0.33333334 = coord(1/3)
```
Abstract

The thesis deals with the evaluation of mathematics information retrieval (IR). It gives an overview of the history of regular IR evaluation, initiatives that are engaged in this field of research as well as most common methods and measures used for evaluation. The findings are applied to the specifics of mathematics retrieval. This thesis also summarizes the state-of-the-art of MIaS math search system, which is already being used in an international web portal. Latest developments aiming towards the second version of the system are described. In addition to participating in the international evaluation conference and workshop, MIaS is tested for effectiveness and efficiency in this work. Measured performance indicators are evaluated and future work is suggested accordingly.
Kara, S.: ¬An ontology-based retrieval system using semantic indexing (2012) 0.01
```
0.013419857 = product of:
  0.04025957 = sum of:
    0.04025957 = weight(_text_:search in 3829) [ClassicSimilarity], result of:
      0.04025957 = score(doc=3829,freq=2.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.230407 = fieldWeight in 3829, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.046875 = fieldNorm(doc=3829)
  0.33333334 = coord(1/3)
```
Abstract

In this thesis, we present an ontology-based information extraction and retrieval system and its application to soccer domain. In general, we deal with three issues in semantic search, namely, usability, scalability and retrieval performance. We propose a keyword-based semantic retrieval approach. The performance of the system is improved considerably using domain-specific information extraction, inference and rules. Scalability is achieved by adapting a semantic indexing approach. The system is implemented using the state-of-the-art technologies in SemanticWeb and its performance is evaluated against traditional systems as well as the query expansion methods. Furthermore, a detailed evaluation is provided to observe the performance gain due to domain-specific information extraction and inference. Finally, we show how we use semantic indexing to solve simple structural ambiguities.
Ziemba, L.: Information retrieval with concept discovery in digital collections for agriculture and natural resources (2011) 0.01
```
0.0089465715 = product of:
  0.026839713 = sum of:
    0.026839713 = weight(_text_:search in 4728) [ClassicSimilarity], result of:
      0.026839713 = score(doc=4728,freq=2.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.15360467 = fieldWeight in 4728, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.03125 = fieldNorm(doc=4728)
  0.33333334 = coord(1/3)
```
Abstract

The amount and complexity of information available in a digital form is already huge and new information is being produced every day. Retrieving information relevant to address a particular need becomes a significant issue. This work utilizes knowledge organization systems (KOS), such as thesauri and ontologies and applies information extraction (IE) and computational linguistics (CL) techniques to organize, manage and retrieve information stored in digital collections in the agricultural domain. Two real world applications of the approach have been developed and are available and actively used by the public. An ontology is used to manage the Water Conservation Digital Library holding a dynamic collection of various types of digital resources in the domain of urban water conservation in Florida, USA. The ontology based back-end powers a fully operational web interface, available at http://library.conservefloridawater.org. The system has demonstrated numerous benefits of the ontology application, including accurate retrieval of resources, information sharing and reuse, and has proved to effectively facilitate information management. The major difficulty encountered with the approach is that large and dynamic number of concepts makes it difficult to keep the ontology consistent and to accurately catalog resources manually. To address the aforementioned issues, a combination of IE and CL techniques, such as Vector Space Model and probabilistic parsing, with the use of Agricultural Thesaurus were adapted to automatically extract concepts important for each of the texts in the Best Management Practices (BMP) Publication Library--a collection of documents in the domain of agricultural BMPs in Florida available at http://lyra.ifas.ufl.edu/LIB. A new approach of domain-specific concept discovery with the use of Internet search engine was developed. Initial evaluation of the results indicates significant improvement in precision of information extraction. The approach presented in this work focuses on problems unique to agriculture and natural resources domain, such as domain specific concepts and vocabularies, but should be applicable to any collection of texts in digital format. It may be of potential interest for anyone who needs to effectively manage a collection of digital resources.

Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.01

0.0068112854 = product of:
  0.020433856 = sum of:
    0.020433856 = product of:
      0.040867712 = sum of:
        0.040867712 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
          0.040867712 = score(doc=563,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.23214069 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 10. 1.2013 19:22:47

Geisriegler, E.: Enriching electronic texts with semantic metadata : a use case for the historical Newspaper Collection ANNO (Austrian Newspapers Online) of the Austrian National Libraryhek (2012) 0.01

0.0056760716 = product of:
  0.017028214 = sum of:
    0.017028214 = product of:
      0.03405643 = sum of:
        0.03405643 = weight(_text_:22 in 595) [ClassicSimilarity], result of:
          0.03405643 = score(doc=595,freq=2.0), product of:
            0.17604718 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.05027291 = queryNorm
            0.19345059 = fieldWeight in 595, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=595)
      0.5 = coord(1/2)
  0.33333334 = coord(1/3)

Date: 3. 2.2013 18:00:22

Search (9 results, page 1 of 1)

Authors

Types

Themes