Search (39 results, page 2 of 2)

  • × language_ss:"e"
  • × type_ss:"x"
  1. Tzitzikas, Y.: Collaborative ontology-based information indexing and retrieval (2002) 0.00
    0.0021393995 = product of:
      0.004278799 = sum of:
        0.004278799 = product of:
          0.008557598 = sum of:
            0.008557598 = weight(_text_:a in 2281) [ClassicSimilarity], result of:
              0.008557598 = score(doc=2281,freq=20.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.16114321 = fieldWeight in 2281, product of:
                  4.472136 = tf(freq=20.0), with freq of:
                    20.0 = termFreq=20.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2281)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    An information system like the Web is a continuously evolving system consisting of multiple heterogeneous information sources, covering a wide domain of discourse, and a huge number of users (human or software) with diverse characteristics and needs, that produce and consume information. The challenge nowadays is to build a scalable information infrastructure enabling the effective, accurate, content based retrieval of information, in a way that adapts to the characteristics and interests of the users. The aim of this work is to propose formally sound methods for building such an information network based on ontologies which are widely used and are easy to grasp by ordinary Web users. The main results of this work are: - A novel scheme for indexing and retrieving objects according to multiple aspects or facets. The proposed scheme is a faceted scheme enriched with a method for specifying the combinations of terms that are valid. We give a model-theoretic interpretation to this model and we provide mechanisms for inferring the valid combinations of terms. This inference service can be exploited for preventing errors during the indexing process, which is very important especially in the case where the indexing is done collaboratively by many users, and for deriving "complete" navigation trees suitable for browsing through the Web. The proposed scheme has several advantages over the hierarchical classification schemes currently employed by Web catalogs, namely, conceptual clarity (it is easier to understand), compactness (it takes less space), and scalability (the update operations can be formulated more easily and be performed more effciently). - A exible and effecient model for building mediators over ontology based information sources. The proposed mediators support several modes of query translation and evaluation which can accommodate various application needs and levels of answer quality. The proposed model can be used for providing users with customized views of Web catalogs. It can also complement the techniques for building mediators over relational sources so as to support approximate translation of partially ordered domain values.
  2. Kara, S.: ¬An ontology-based retrieval system using semantic indexing (2012) 0.00
    0.0020296127 = product of:
      0.0040592253 = sum of:
        0.0040592253 = product of:
          0.008118451 = sum of:
            0.008118451 = weight(_text_:a in 3829) [ClassicSimilarity], result of:
              0.008118451 = score(doc=3829,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.15287387 = fieldWeight in 3829, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3829)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In this thesis, we present an ontology-based information extraction and retrieval system and its application to soccer domain. In general, we deal with three issues in semantic search, namely, usability, scalability and retrieval performance. We propose a keyword-based semantic retrieval approach. The performance of the system is improved considerably using domain-specific information extraction, inference and rules. Scalability is achieved by adapting a semantic indexing approach. The system is implemented using the state-of-the-art technologies in SemanticWeb and its performance is evaluated against traditional systems as well as the query expansion methods. Furthermore, a detailed evaluation is provided to observe the performance gain due to domain-specific information extraction and inference. Finally, we show how we use semantic indexing to solve simple structural ambiguities.
    Type
    a
  3. Ziemba, L.: Information retrieval with concept discovery in digital collections for agriculture and natural resources (2011) 0.00
    0.0020296127 = product of:
      0.0040592253 = sum of:
        0.0040592253 = product of:
          0.008118451 = sum of:
            0.008118451 = weight(_text_:a in 4728) [ClassicSimilarity], result of:
              0.008118451 = score(doc=4728,freq=18.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.15287387 = fieldWeight in 4728, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4728)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The amount and complexity of information available in a digital form is already huge and new information is being produced every day. Retrieving information relevant to address a particular need becomes a significant issue. This work utilizes knowledge organization systems (KOS), such as thesauri and ontologies and applies information extraction (IE) and computational linguistics (CL) techniques to organize, manage and retrieve information stored in digital collections in the agricultural domain. Two real world applications of the approach have been developed and are available and actively used by the public. An ontology is used to manage the Water Conservation Digital Library holding a dynamic collection of various types of digital resources in the domain of urban water conservation in Florida, USA. The ontology based back-end powers a fully operational web interface, available at http://library.conservefloridawater.org. The system has demonstrated numerous benefits of the ontology application, including accurate retrieval of resources, information sharing and reuse, and has proved to effectively facilitate information management. The major difficulty encountered with the approach is that large and dynamic number of concepts makes it difficult to keep the ontology consistent and to accurately catalog resources manually. To address the aforementioned issues, a combination of IE and CL techniques, such as Vector Space Model and probabilistic parsing, with the use of Agricultural Thesaurus were adapted to automatically extract concepts important for each of the texts in the Best Management Practices (BMP) Publication Library--a collection of documents in the domain of agricultural BMPs in Florida available at http://lyra.ifas.ufl.edu/LIB. A new approach of domain-specific concept discovery with the use of Internet search engine was developed. Initial evaluation of the results indicates significant improvement in precision of information extraction. The approach presented in this work focuses on problems unique to agriculture and natural resources domain, such as domain specific concepts and vocabularies, but should be applicable to any collection of texts in digital format. It may be of potential interest for anyone who needs to effectively manage a collection of digital resources.
  4. Nagy T., I.: Detecting multiword expressions and named entities in natural language texts (2014) 0.00
    0.0019633435 = product of:
      0.003926687 = sum of:
        0.003926687 = product of:
          0.007853374 = sum of:
            0.007853374 = weight(_text_:a in 1536) [ClassicSimilarity], result of:
              0.007853374 = score(doc=1536,freq=22.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.14788237 = fieldWeight in 1536, product of:
                  4.690416 = tf(freq=22.0), with freq of:
                    22.0 = termFreq=22.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1536)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Multiword expressions (MWEs) are lexical items that can be decomposed into single words and display lexical, syntactic, semantic, pragmatic and/or statistical idiosyncrasy (Sag et al., 2002; Kim, 2008; Calzolari et al., 2002). The proper treatment of multiword expressions such as rock 'n' roll and make a decision is essential for many natural language processing (NLP) applications like information extraction and retrieval, terminology extraction and machine translation, and it is important to identify multiword expressions in context. For example, in machine translation we must know that MWEs form one semantic unit, hence their parts should not be translated separately. For this, multiword expressions should be identified first in the text to be translated. The chief aim of this thesis is to develop machine learning-based approaches for the automatic detection of different types of multiword expressions in English and Hungarian natural language texts. In our investigations, we pay attention to the characteristics of different types of multiword expressions such as nominal compounds, multiword named entities and light verb constructions, and we apply novel methods to identify MWEs in raw texts. In the thesis it will be demonstrated that nominal compounds and multiword amed entities may require a similar approach for their automatic detection as they behave in the same way from a linguistic point of view. Furthermore, it will be shown that the automatic detection of light verb constructions can be carried out using two effective machine learning-based approaches.
    In this thesis, we focused on the automatic detection of multiword expressions in natural language texts. On the basis of the main contributions, we can argue that: - Supervised machine learning methods can be successfully applied for the automatic detection of different types of multiword expressions in natural language texts. - Machine learning-based multiword expression detection can be successfully carried out for English as well as for Hungarian. - Our supervised machine learning-based model was successfully applied to the automatic detection of nominal compounds from English raw texts. - We developed a Wikipedia-based dictionary labeling method to automatically detect English nominal compounds. - A prior knowledge of nominal compounds can enhance Named Entity Recognition, while previously identified named entities can assist the nominal compound identification process. - The machine learning-based method can also provide acceptable results when it was trained on an automatically generated silver standard corpus. - As named entities form one semantic unit and may consist of more than one word and function as a noun, we can treat them in a similar way to nominal compounds. - Our sequence labelling-based tool can be successfully applied for identifying verbal light verb constructions in two typologically different languages, namely English and Hungarian. - Domain adaptation techniques may help diminish the distance between domains in the automatic detection of light verb constructions. - Our syntax-based method can be successfully applied for the full-coverage identification of light verb constructions. As a first step, a data-driven candidate extraction method can be utilized. After, a machine learning approach that makes use of an extended and rich feature set selects LVCs among extracted candidates. - When a precise syntactic parser is available for the actual domain, the full-coverage identification can be performed better. In other cases, the usage of the sequence labeling method is recommended.
  5. Haslhofer, B.: ¬A Web-based mapping technique for establishing metadata interoperability (2008) 0.00
    0.0019376777 = product of:
      0.0038753555 = sum of:
        0.0038753555 = product of:
          0.007750711 = sum of:
            0.007750711 = weight(_text_:a in 3173) [ClassicSimilarity], result of:
              0.007750711 = score(doc=3173,freq=42.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.14594918 = fieldWeight in 3173, product of:
                  6.4807405 = tf(freq=42.0), with freq of:
                    42.0 = termFreq=42.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=3173)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The integration of metadata from distinct, heterogeneous data sources requires metadata interoperability, which is a qualitative property of metadata information objects that is not given by default. The technique of metadata mapping allows domain experts to establish metadata interoperability in a certain integration scenario. Mapping solutions, as a technical manifestation of this technique, are already available for the intensively studied domain of database system interoperability, but they rarely exist for the Web. If we consider the amount of steadily increasing structured metadata and corresponding metadata schemes on theWeb, we can observe a clear need for a mapping solution that can operate in aWeb-based environment. To achieve that, we first need to build its technical core, which is a mapping model that provides the language primitives to define mapping relationships. Existing SemanticWeb languages such as RDFS and OWL define some basic mapping elements (e.g., owl:equivalentProperty, owl:sameAs), but do not address the full spectrum of semantic and structural heterogeneities that can occur among distinct, incompatible metadata information objects. Furthermore, it is still unclear how to process defined mapping relationships during run-time in order to deliver metadata to the client in a uniform way. As the main contribution of this thesis, we present an abstract mapping model, which reflects the mapping problem on a generic level and provides the means for reconciling incompatible metadata. Instance transformation functions and URIs take a central role in that model. The former cover a broad spectrum of possible structural and semantic heterogeneities, while the latter bind the complete mapping model to the architecture of the Word Wide Web. On the concrete, language-specific level we present a binding of the abstract mapping model for the RDF Vocabulary Description Language (RDFS), which allows us to create mapping specifications among incompatible metadata schemes expressed in RDFS. The mapping model is embedded in a cyclic process that categorises the requirements a mapping solution should fulfil into four subsequent phases: mapping discovery, mapping representation, mapping execution, and mapping maintenance. In this thesis, we mainly focus on mapping representation and on the transformation of mapping specifications into executable SPARQL queries. For mapping discovery support, the model provides an interface for plugging-in schema and ontology matching algorithms. For mapping maintenance we introduce the concept of a simple, but effective mapping registry. Based on the mapping model, we propose aWeb-based mediator wrapper-architecture that allows domain experts to set up mediation endpoints that provide a uniform SPARQL query interface to a set of distributed metadata sources. The involved data sources are encapsulated by wrapper components that expose the contained metadata and the schema definitions on the Web and provide a SPARQL query interface to these metadata. In this thesis, we present the OAI2LOD Server, a wrapper component for integrating metadata that are accessible via the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). In a case study, we demonstrate how mappings can be created in aWeb environment and how our mediator wrapper architecture can easily be configured in order to integrate metadata from various heterogeneous data sources without the need to install any mapping solution or metadata integration solution in a local system environment.
  6. Vocht, L. De: Exploring semantic relationships in the Web of Data : Semantische relaties verkennen in data op het web (2017) 0.00
    0.0019376777 = product of:
      0.0038753555 = sum of:
        0.0038753555 = product of:
          0.007750711 = sum of:
            0.007750711 = weight(_text_:a in 4232) [ClassicSimilarity], result of:
              0.007750711 = score(doc=4232,freq=42.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.14594918 = fieldWeight in 4232, product of:
                  6.4807405 = tf(freq=42.0), with freq of:
                    42.0 = termFreq=42.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=4232)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    After the launch of the World Wide Web, it became clear that searching documentson the Web would not be trivial. Well-known engines to search the web, like Google, focus on search in web documents using keywords. The documents are structured and indexed to ensure keywords match documents as accurately as possible. However, searching by keywords does not always suice. It is oen the case that users do not know exactly how to formulate the search query or which keywords guarantee retrieving the most relevant documents. Besides that, it occurs that users rather want to browse information than looking up something specific. It turned out that there is need for systems that enable more interactivity and facilitate the gradual refinement of search queries to explore the Web. Users expect more from the Web because the short keyword-based queries they pose during search, do not suffice for all cases. On top of that, the Web is changing structurally. The Web comprises, apart from a collection of documents, more and more linked data, pieces of information structured so they can be processed by machines. The consequently applied semantics allow users to exactly indicate machines their search intentions. This is made possible by describing data following controlled vocabularies, concept lists composed by experts, published uniquely identifiable on the Web. Even so, it is still not trivial to explore data on the Web. There is a large variety of vocabularies and various data sources use different terms to identify the same concepts.
    This PhD-thesis describes how to effectively explore linked data on the Web. The main focus is on scenarios where users want to discover relationships between resources rather than finding out more about something specific. Searching for a specific document or piece of information fits in the theoretical framework of information retrieval and is associated with exploratory search. Exploratory search goes beyond 'looking up something' when users are seeking more detailed understanding, further investigation or navigation of the initial search results. The ideas behind exploratory search and querying linked data merge when it comes to the way knowledge is represented and indexed by machines - how data is structured and stored for optimal searchability. Queries and information should be aligned to facilitate that searches also reveal connections between results. This implies that they take into account the same semantic entities, relevant at that moment. To realize this, we research three techniques that are evaluated one by one in an experimental set-up to assess how well they succeed in their goals. In the end, the techniques are applied to a practical use case that focuses on forming a bridge between the Web and the use of digital libraries in scientific research. Our first technique focuses on the interactive visualization of search results. Linked data resources can be brought in relation with each other at will. This leads to complex and diverse graphs structures. Our technique facilitates navigation and supports a workflow starting from a broad overview on the data and allows narrowing down until the desired level of detail to then broaden again. To validate the flow, two visualizations where implemented and presented to test-users. The users judged the usability of the visualizations, how the visualizations fit in the workflow and to which degree their features seemed useful for the exploration of linked data.
    The ideas behind exploratory search and querying linked data merge when it comes to the way knowledge is represented and indexed by machines - how data is structured and stored for optimal searchability. eries and information should be aligned to facilitate that searches also reveal connections between results. This implies that they take into account the same semantic entities, relevant at that moment. To realize this, we research three techniques that are evaluated one by one in an experimental set-up to assess how well they succeed in their goals. In the end, the techniques are applied to a practical use case that focuses on forming a bridge between the Web and the use of digital libraries in scientific research.
    Our first technique focuses on the interactive visualization of search results. Linked data resources can be brought in relation with each other at will. This leads to complex and diverse graphs structures. Our technique facilitates navigation and supports a workflow starting from a broad overview on the data and allows narrowing down until the desired level of detail to then broaden again. To validate the flow, two visualizations where implemented and presented to test-users. The users judged the usability of the visualizations, how the visualizations fit in the workflow and to which degree their features seemed useful for the exploration of linked data. There is a difference in the way users interact with resources, visually or textually, and how resources are represented for machines to be processed by algorithms. This difference complicates bridging the users' intents and machine executable queries. It is important to implement this 'translation' mechanism to impact the search as favorable as possible in terms of performance, complexity and accuracy. To do this, we explain a second technique, that supports such a bridging component. Our second technique is developed around three features that support the search process: looking up, relating and ranking resources. The main goal is to ensure that resources in the results are as precise and relevant as possible. During the evaluation of this technique, we did not only look at the precision of the search results but also investigated how the effectiveness of the search evolved while the user executed certain actions sequentially.
    When we speak about finding relationships between resources, it is necessary to dive deeper in the structure. The graph structure of linked data where the semantics give meaning to the relationships between resources enable the execution of pathfinding algorithms. The assigned weights and heuristics are base components of such algorithms and ultimately define (the order) which resources are included in a path. These paths explain indirect connections between resources. Our third technique proposes an algorithm that optimizes the choice of resources in terms of serendipity. Some optimizations guard the consistence of candidate-paths where the coherence of consecutive connections is maximized to avoid trivial and too arbitrary paths. The implementation uses the A* algorithm, the de-facto reference when it comes to heuristically optimized minimal cost paths. The effectiveness of paths was measured based on common automatic metrics and surveys where the users could indicate their preference for paths, generated each time in a different way. Finally, all our techniques are applied to a use case about publications in digital libraries where they are aligned with information about scientific conferences and researchers. The application to this use case is a practical example because the different aspects of exploratory search come together. In fact, the techniques also evolved from the experiences when implementing the use case. Practical details about the semantic model are explained and the implementation of the search system is clarified module by module. The evaluation positions the result, a prototype of a tool to explore scientific publications, researchers and conferences next to some important alternatives.
  7. Slavic-Overfield, A.: Classification management and use in a networked environment : the case of the Universal Decimal Classification (2005) 0.00
    0.0017899501 = product of:
      0.0035799001 = sum of:
        0.0035799001 = product of:
          0.0071598003 = sum of:
            0.0071598003 = weight(_text_:a in 2191) [ClassicSimilarity], result of:
              0.0071598003 = score(doc=2191,freq=14.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.13482209 = fieldWeight in 2191, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2191)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    In the Internet information space, advanced information retrieval (IR) methods and automatic text processing are used in conjunction with traditional knowledge organization systems (KOS). New information technology provides a platform for better KOS publishing, exploitation and sharing both for human and machine use. Networked KOS services are now being planned and developed as powerful tools for resource discovery. They will enable automatic contextualisation, interpretation and query matching to different indexing languages. The Semantic Web promises to be an environment in which the quality of semantic relationships in bibliographic classification systems can be fully exploited. Their use in the networked environment is, however, limited by the fact that they are not prepared or made available for advanced machine processing. The UDC was chosen for this research because of its widespread use and its long-term presence in online information retrieval systems. It was also the first system to be used for the automatic classification of Internet resources, and the first to be made available as a classification tool on the Web. The objective of this research is to establish the advantages of using UDC for information retrieval in a networked environment, to highlight the problems of automation and classification exchange, and to offer possible solutions. The first research question was is there enough evidence of the use of classification on the Internet to justify further development with this particular environment in mind? The second question is what are the automation requirements for the full exploitation of UDC and its exchange? The third question is which areas are in need of improvement and what specific recommendations can be made for implementing the UDC in a networked environment? A summary of changes required in the management and development of the UDC to facilitate its full adaptation for future use is drawn from this analysis.
  8. Haveliwala, T.: Context-Sensitive Web search (2005) 0.00
    0.0017899501 = product of:
      0.0035799001 = sum of:
        0.0035799001 = product of:
          0.0071598003 = sum of:
            0.0071598003 = weight(_text_:a in 2567) [ClassicSimilarity], result of:
              0.0071598003 = score(doc=2567,freq=14.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.13482209 = fieldWeight in 2567, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2567)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    As the Web continues to grow and encompass broader and more diverse sources of information, providing effective search facilities to users becomes an increasingly challenging problem. To help users deal with the deluge of Web-accessible information, we propose a search system which makes use of context to improve search results in a scalable way. By context, we mean any sources of information, in addition to any search query, that provide clues about the user's true information need. For instance, a user's bookmarks and search history can be considered a part of the search context. We consider two types of context-based search. The first type of functionality we consider is "similarity search." In this case, as the user is browsing Web pages, URLs for pages similar to the current page are retrieved and displayed in a side panel. No query is explicitly issued; context alone (i.e., the page currently being viewed) is used to provide the user with useful related information. The second type of functionality involves taking search context into account when ranking results to standard search queries. Web search differs from traditional information retrieval tasks in several major ways, making effective context-sensitive Web search challenging. First, scalability is of critical importance. With billions of publicly accessible documents, the Web is much larger than traditional datasets. Similarly, with millions of search queries issued each day, the query load is much higher than for traditional information retrieval systems. Second, there are no guarantees on the quality ofWeb pages, with Web-authors taking an adversarial, rather than cooperative, approach in attempts to inflate the rankings of their pages. Third, there is a significant amount of metadata embodied in the link structure corresponding to the hyperlinks between Web pages that can be exploitedduring the retrieval process. In this thesis, we design a search system, using the Stanford WebBase platform, that exploits the link structure of the Web to provide scalable, context-sensitive search.
  9. Smith, D.A.: Exploratory and faceted browsing over heterogeneous and cross-domain data sources. (2011) 0.00
    0.001757696 = product of:
      0.003515392 = sum of:
        0.003515392 = product of:
          0.007030784 = sum of:
            0.007030784 = weight(_text_:a in 4839) [ClassicSimilarity], result of:
              0.007030784 = score(doc=4839,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.13239266 = fieldWeight in 4839, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4839)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Exploration of heterogeneous data sources increases the value of information by allowing users to answer questions through exploration across multiple sources; Users can use information that has been posted across the Web to answer questions and learn about new domains. We have conducted research that lowers the interrogation time of faceted data, by combining related information from different sources. The work contributes methodologies in combining heterogenous sources, and how to deliver that data to a user interface scalably, with enough performance to support rapid interrogation of the knowledge by the user. The work also contributes how to combine linked data sources so that users can create faceted browsers that target the information facets of their needs. The work is grounded and proven in a number of experiments and test cases that study the contributions in domain research work.
    Footnote
    A thesis submitted in partial fulfillment for the degree of Doctor of Philosophy. June 2011.
  10. Schmolz, H.: Anaphora resolution and text retrieval : a lnguistic analysis of hypertexts (2015) 0.00
    0.0016913437 = product of:
      0.0033826875 = sum of:
        0.0033826875 = product of:
          0.006765375 = sum of:
            0.006765375 = weight(_text_:a in 1172) [ClassicSimilarity], result of:
              0.006765375 = score(doc=1172,freq=2.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12739488 = fieldWeight in 1172, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1172)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  11. Schmolz, H.: Anaphora resolution and text retrieval : a lnguistic analysis of hypertexts (2013) 0.00
    0.0016913437 = product of:
      0.0033826875 = sum of:
        0.0033826875 = product of:
          0.006765375 = sum of:
            0.006765375 = weight(_text_:a in 1810) [ClassicSimilarity], result of:
              0.006765375 = score(doc=1810,freq=2.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12739488 = fieldWeight in 1810, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.078125 = fieldNorm(doc=1810)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
  12. Kirk, J.: Theorising information use : managers and their work (2002) 0.00
    0.001674345 = product of:
      0.00334869 = sum of:
        0.00334869 = product of:
          0.00669738 = sum of:
            0.00669738 = weight(_text_:a in 560) [ClassicSimilarity], result of:
              0.00669738 = score(doc=560,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.12611452 = fieldWeight in 560, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=560)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The focus of this thesis is information use. Although a key concept in information behaviour, information use has received little attention from information science researchers. Studies of other key concepts such as information need and information seeking are dominant in information behaviour research. Information use is an area of interest to information professionals who rely on research outcomes to shape their practice. There are few empirical studies of how people actually use information that might guide and refine the development of information systems, products and services.
    Content
    A thesis submitted to the University of Technology, Sydney in fulfilment of the requirements for the degree of Doctor of Philosophy. - Vgl. unter: http://epress.lib.uts.edu.au/dspace/bitstream/2100/309/2/02whole.pdf.
  13. Garfield, E.: ¬An algorithm for translating chemical names to molecular formulas (1961) 0.00
    0.0014647468 = product of:
      0.0029294936 = sum of:
        0.0029294936 = product of:
          0.005858987 = sum of:
            0.005858987 = weight(_text_:a in 3465) [ClassicSimilarity], result of:
              0.005858987 = score(doc=3465,freq=6.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.11032722 = fieldWeight in 3465, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3465)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This dissertation discusses, explains, and demonstrates a new algorithm for translating chemica l nomenclature into molecular formulas. In order to place the study in its proper context and perspective the historical development of nomenclature is first discussed, aa well as other related aspects of the chemical information problem. The relationship of nomenclature to modern linguistic studies is then introduced. Tire relevance of structural linguistic procedures to the study of chemical nomenclature is shown. The methods of the linguist are illustrated by examples from chemical discourse. The algorithm is then explained, first for the human translator and then for use by a computer. Flow diagrams for the computer syntactic analysis, dictionary Iook-up routine, and furmula calculation routine are included. The sampling procedure for testing the algorithm is explained and finalIy, conclusions are drawn with respect to the general validity of the method and the dirsction that might be taken for future research. A summary of modern chemical nomenclature practice is appened primarily for use by the reader who is not familiar with chemical nomenclature.
  14. Noy, N.F.: Knowledge representation for intelligent information retrieval in experimental sciences (1997) 0.00
    0.001353075 = product of:
      0.00270615 = sum of:
        0.00270615 = product of:
          0.0054123 = sum of:
            0.0054123 = weight(_text_:a in 694) [ClassicSimilarity], result of:
              0.0054123 = score(doc=694,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.10191591 = fieldWeight in 694, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=694)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    More and more information is available on-line every day. The greater the amount of on-line information, the greater the demand for tools that process and disseminate this information. Processing electronic information in the form of text and answering users' queries about that information intelligently is one of the great challenges in natural language processing and information retrieval. The research presented in this talk is centered on the latter of these two tasks: intelligent information retrieval. In order for information to be retrieved, it first needs to be formalized in a database or knowledge base. The ontology for this formalization and assumptions it is based on are crucial to successful intelligent information retrieval. We have concentrated our effort on developing an ontology for representing knowledge in the domains of experimental sciences, molecular biology in particular. We show that existing ontological models cannot be readily applied to represent this domain adequately. For example, the fundamental notion of ontology design that every "real" object is defined as an instance of a category seems incompatible with the universe where objects can change their category as a result of experimental procedures. Another important problem is representing complex structures such as DNA, mixtures, populations of molecules, etc., that are very common in molecular biology. We present extensions that need to be made to an ontology to cover these issues: the representation of transformations that change the structure and/or category of their participants, and the component relations and spatial structures of complex objects. We demonstrate examples of how the proposed representations can be used to improve the quality and completeness of answers to user queries; discuss techniques for evaluating ontologies and show a prototype of an Information Retrieval System that we developed.
  15. Sebastian, Y.: Literature-based discovery by learning heterogeneous bibliographic information networks (2017) 0.00
    0.001353075 = product of:
      0.00270615 = sum of:
        0.00270615 = product of:
          0.0054123 = sum of:
            0.0054123 = weight(_text_:a in 535) [ClassicSimilarity], result of:
              0.0054123 = score(doc=535,freq=8.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.10191591 = fieldWeight in 535, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=535)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    Literature-based discovery (LBD) research aims at finding effective computational methods for predicting previously unknown connections between clusters of research papers from disparate research areas. Existing methods encompass two general approaches. The first approach searches for these unknown connections by examining the textual contents of research papers. In addition to the existing textual features, the second approach incorporates structural features of scientific literatures, such as citation structures. These approaches, however, have not considered research papers' latent bibliographic metadata structures as important features that can be used for predicting previously unknown relationships between them. This thesis investigates a new graph-based LBD method that exploits the latent bibliographic metadata connections between pairs of research papers. The heterogeneous bibliographic information network is proposed as an efficient graph-based data structure for modeling the complex relationships between these metadata. In contrast to previous approaches, this method seamlessly combines textual and citation information in the form of pathbased metadata features for predicting future co-citation links between research papers from disparate research fields. The results reported in this thesis provide evidence that the method is effective for reconstructing the historical literature-based discovery hypotheses. This thesis also investigates the effects of semantic modeling and topic modeling on the performance of the proposed method. For semantic modeling, a general-purpose word sense disambiguation technique is proposed to reduce the lexical ambiguity in the title and abstract of research papers. The experimental results suggest that the reduced lexical ambiguity did not necessarily lead to a better performance of the method. This thesis discusses some of the possible contributing factors to these results. Finally, topic modeling is used for learning the latent topical relations between research papers. The learned topic model is incorporated into the heterogeneous bibliographic information network graph and allows new predictive features to be learned. The results in this thesis suggest that topic modeling improves the performance of the proposed method by increasing the overall accuracy for predicting the future co-citation links between disparate research papers.
    Footnote
    A thesis submitted in ful llment of the requirements for the degree of Doctor of Philosophy Monash University, Faculty of Information Technology.
  16. Seidlmayer, E.: ¬An ontology of digital objects in philosophy : an approach for practical use in research (2018) 0.00
    0.0011839407 = product of:
      0.0023678814 = sum of:
        0.0023678814 = product of:
          0.0047357627 = sum of:
            0.0047357627 = weight(_text_:a in 5496) [ClassicSimilarity], result of:
              0.0047357627 = score(doc=5496,freq=2.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.089176424 = fieldWeight in 5496, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5496)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    The digitalization of research enables new scientific insights and methods, especially in the humanities. Nonetheless, electronic book editions, encyclopedias, mobile applications or web sites presenting research projects are not in broad use in academic philosophy. This is contradictory to the large amount of helpful tools facilitating research also bearing new scientific subjects and approaches. A possible solution to this dilemma is the systematization and promotion of these tools in order to improve their accessibility and fully exploit the potential of digitalization for philosophy.
  17. Mao, M.: Ontology mapping : towards semantic interoperability in distributed and heterogeneous environments (2008) 0.00
    9.567685E-4 = product of:
      0.001913537 = sum of:
        0.001913537 = product of:
          0.003827074 = sum of:
            0.003827074 = weight(_text_:a in 4659) [ClassicSimilarity], result of:
              0.003827074 = score(doc=4659,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.072065435 = fieldWeight in 4659, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.03125 = fieldNorm(doc=4659)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    This dissertation studies ontology mapping: the problem of finding semantic correspondences between similar elements of different ontologies. In the dissertation, elements denote classes or properties of ontologies. The goal of this research is to use ontology mapping to make heterogeneous information more accessible. The World Wide Web (WWW) now is widely used as a universal medium for information exchange. Semantic interoperability among different information systems in the WWW is limited due to information heterogeneity, and the non semantic nature of HTML and URLs. Ontologies have been suggested as a way to solve the problem of information heterogeneity by providing formal, explicit definitions of data and reasoning ability over related concepts. Given that no universal ontology exists for the WWW, work has focused on finding semantic correspondences between similar elements of different ontologies, i.e., ontology mapping. Ontology mapping can be done either by hand or using automated tools. Manual mapping becomes impractical as the size and complexity of ontologies increases. Full or semi-automated mapping approaches have been examined by several research studies. Previous full or semiautomated mapping approaches include analyzing linguistic information of elements in ontologies, treating ontologies as structural graphs, applying heuristic rules and machine learning techniques, and using probabilistic and reasoning methods etc. In this paper, two generic ontology mapping approaches are proposed. One is the PRIOR+ approach, which utilizes both information retrieval and artificial intelligence techniques in the context of ontology mapping. The other is the non-instance learning based approach, which experimentally explores machine learning algorithms to solve ontology mapping problem without requesting any instance. The results of the PRIOR+ on different tests at OAEI ontology matching campaign 2007 are encouraging. The non-instance learning based approach has shown potential for solving ontology mapping problem on OAEI benchmark tests.
  18. Knitel, M.: ¬The application of linked data principles to library data : opportunities and challenges (2012) 0.00
    8.4567186E-4 = product of:
      0.0016913437 = sum of:
        0.0016913437 = product of:
          0.0033826875 = sum of:
            0.0033826875 = weight(_text_:a in 599) [ClassicSimilarity], result of:
              0.0033826875 = score(doc=599,freq=2.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.06369744 = fieldWeight in 599, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=599)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Location
    A
  19. Oberhauser, O.: Card-Image Public Access Catalogues (CIPACs) : a critical consideration of a cost-effective alternative to full retrospective catalogue conversion (2002) 0.00
    8.371725E-4 = product of:
      0.001674345 = sum of:
        0.001674345 = product of:
          0.00334869 = sum of:
            0.00334869 = weight(_text_:a in 1703) [ClassicSimilarity], result of:
              0.00334869 = score(doc=1703,freq=4.0), product of:
                0.053105544 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046056706 = queryNorm
                0.06305726 = fieldWeight in 1703, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1703)
          0.5 = coord(1/2)
      0.5 = coord(1/2)