Search (24 results, page 1 of 2)

  • Filter: type_ss:"x"
  • Filter: language_ss:"e"
  1. Farazi, M.: Faceted lightweight ontologies : a formalization and some experiments (2010) 0.07
    0.07158818 = product of:
      0.10738227 = sum of:
        0.06752829 = product of:
          0.20258486 = sum of:
            0.20258486 = weight(_text_:3a in 4997) [ClassicSimilarity], result of:
              0.20258486 = score(doc=4997,freq=2.0), product of:
                0.43255165 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.051020417 = queryNorm
                0.46834838 = fieldWeight in 4997, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4997)
          0.33333334 = coord(1/3)
        0.039853975 = weight(_text_:data in 4997) [ClassicSimilarity], result of:
          0.039853975 = score(doc=4997,freq=4.0), product of:
            0.16132914 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.051020417 = queryNorm
            0.24703519 = fieldWeight in 4997, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4997)
      0.6666667 = coord(2/3)
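    The breakdown above is Lucene's ClassicSimilarity (TF-IDF) explain output: each query term contributes queryWeight x fieldWeight, where queryWeight = idf x queryNorm and fieldWeight = sqrt(termFreq) x idf x fieldNorm; the per-term contributions are summed and coord factors scale the result by the fraction of query clauses that matched. The short Python sketch below recomputes the 0.07158818 shown for this record from the factors listed in the explanation; it is an illustrative recomputation, not code from the search system.
      # Recompute the ClassicSimilarity explanation for result 1 (doc 4997).
      from math import sqrt, isclose

      def term_score(freq, idf, query_norm, field_norm):
          """One term's contribution: queryWeight * fieldWeight."""
          query_weight = idf * query_norm                 # e.g. 8.478011 * 0.051020417 = 0.43255165
          field_weight = sqrt(freq) * idf * field_norm    # tf(freq) = sqrt(freq) in ClassicSimilarity
          return query_weight * field_weight

      QUERY_NORM = 0.051020417
      FIELD_NORM = 0.0390625

      s_3a   = term_score(freq=2.0, idf=8.478011,  query_norm=QUERY_NORM, field_norm=FIELD_NORM)
      s_data = term_score(freq=4.0, idf=3.1620505, query_norm=QUERY_NORM, field_norm=FIELD_NORM)

      # The "_text_:3a" clause sits inside a nested query where one of three clauses
      # matched, hence the inner coord(1/3); at the top level two of three clauses
      # matched, hence the outer coord(2/3).
      total = (s_3a * (1 / 3) + s_data) * (2 / 3)
      print(round(total, 8))                              # ~0.07158818, the score shown above
      assert isclose(total, 0.07158818, rel_tol=1e-5)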
    
    Abstract
    While classifications are heavily used to categorize web content, the evolution of the web foresees a more formal structure - ontology - which can serve this purpose. Ontologies are core artifacts of the Semantic Web which enable machines to use inference rules to conduct automated reasoning on data. Lightweight ontologies bridge the gap between classifications and ontologies. A lightweight ontology (LO) is an ontology representing a backbone taxonomy where the concept of the child node is more specific than the concept of the parent node. Formal lightweight ontologies can be generated from their informal ones. The key applications of formal lightweight ontologies are document classification, semantic search, and data integration. However, these applications suffer from the following problems: the limited disambiguation accuracy of the state-of-the-art NLP tools used in generating formal lightweight ontologies from their informal ones; the lack of background knowledge needed for the formal lightweight ontologies; and the limited possibilities for ontology reuse. In this dissertation, we propose a novel solution to these problems in formal lightweight ontologies; namely, the faceted lightweight ontology (FLO). An FLO is a lightweight ontology in which the terms present in each node label, and their concepts, are available in the background knowledge (BK), which is organized as a set of facets. A facet can be defined as a distinctive property of the groups of concepts that can help in differentiating one group from another. Background knowledge can be defined as a subset of a knowledge base, such as WordNet, and often represents a specific domain.
    Content
    PhD Dissertation at International Doctorate School in Information and Communication Technology. Cf.: https://core.ac.uk/download/pdf/150083013.pdf.
  2. Makewita, S.M.: Investigating the generic information-seeking function of organisational decision-makers : perspectives on improving organisational information systems (2002) 0.05
    0.05353072 = product of:
      0.08029608 = sum of:
        0.06301467 = weight(_text_:data in 642) [ClassicSimilarity], result of:
          0.06301467 = score(doc=642,freq=10.0), product of:
            0.16132914 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.051020417 = queryNorm
            0.39059696 = fieldWeight in 642, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=642)
        0.017281406 = product of:
          0.03456281 = sum of:
            0.03456281 = weight(_text_:22 in 642) [ClassicSimilarity], result of:
              0.03456281 = score(doc=642,freq=2.0), product of:
                0.1786648 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051020417 = queryNorm
                0.19345059 = fieldWeight in 642, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=642)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    The past decade has seen the emergence of a new paradigm in the corporate world where organisations emphasised connectivity as a means of exposing decision-makers to wider resources of information within and outside the organisation. Many organisations followed the initiatives of enhancing infrastructures, manipulating cultural shifts and emphasising managerial commitment for creating pools and networks of knowledge. However, the concept of connectivity is not merely about presenting people with data but, more importantly, about creating environments where people can seek information efficiently. This paradigm has therefore caused a shift in the function of information systems in organisations. They now have to be assessed in relation to how they underpin people's information-seeking activities within the context of their organisational environment. This research project used interpretative research methods to investigate the nature of people's information-seeking activities at two culturally contrasting organisations. Outcomes of this research project provide insights into phenomena associated with people's information-seeking function, and show how they depend on the organisational context that is defined partly by information systems. It suggests that information-seeking is not just searching for data. The inefficiencies inherent in both people and their environments can bring opaqueness into people's data, which they need to avoid or eliminate as part of seeking information. This seems to have made information-seeking a two-tier process consisting of a primary process of searching and interpreting data and an auxiliary process of avoiding and eliminating opaqueness in data. Based on this view, this research suggests that organisational information systems operate naturally as implicit dual-mechanisms to underpin the above two-tier process, and that improvements to information systems should concern maintaining the balance in these dual-mechanisms.
    Date
    22. 7.2022 12:16:58
  3. Huo, W.: Automatic multi-word term extraction and its application to Web-page summarization (2012) 0.04
    0.036369935 = product of:
      0.054554902 = sum of:
        0.033817217 = weight(_text_:data in 563) [ClassicSimilarity], result of:
          0.033817217 = score(doc=563,freq=2.0), product of:
            0.16132914 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.051020417 = queryNorm
            0.2096163 = fieldWeight in 563, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=563)
        0.020737685 = product of:
          0.04147537 = sum of:
            0.04147537 = weight(_text_:22 in 563) [ClassicSimilarity], result of:
              0.04147537 = score(doc=563,freq=2.0), product of:
                0.1786648 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051020417 = queryNorm
                0.23214069 = fieldWeight in 563, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=563)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    In this thesis we propose three new word association measures for multi-word term extraction. We combine these association measures with LocalMaxs algorithm in our extraction model and compare the results of different multi-word term extraction methods. Our approach is language and domain independent and requires no training data. It can be applied to such tasks as text summarization, information retrieval, and document classification. We further explore the potential of using multi-word terms as an effective representation for general web-page summarization. We extract multi-word terms from human written summaries in a large collection of web-pages, and generate the summaries by aligning document words with these multi-word terms. Our system applies machine translation technology to learn the aligning process from a training set and focuses on selecting high quality multi-word terms from human written summaries to generate suitable results for web-page summarization.
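    The LocalMaxs-style selection mentioned above keeps an n-gram as a term candidate when its association ("glue") score is a local maximum relative to the shorter n-grams it contains and the longer n-grams that contain it. The sketch below illustrates that idea with invented scores and a deliberately simplified acceptance rule; it is not the thesis's exact algorithm or its association measures.
      # Simplified local-maximum filter over n-gram association scores (illustrative only;
      # the glue scores are invented and the rule is a simplification of LocalMaxs).
      glue = {
          ("machine",): 1.0,
          ("translation",): 1.0,
          ("machine", "translation"): 4.2,
          ("machine", "translation", "technology"): 2.9,
          ("statistical", "machine", "translation"): 3.1,
      }

      def sub_ngrams(ngram):
          """The two (n-1)-grams obtained by dropping the first or the last word."""
          return [ngram[1:], ngram[:-1]] if len(ngram) > 1 else []

      def super_ngrams(ngram, table):
          """All (n+1)-grams in the table that contain this n-gram contiguously."""
          n = len(ngram)
          return [g for g in table
                  if len(g) == n + 1 and (g[:n] == ngram or g[-n:] == ngram)]

      def is_term_candidate(ngram, table):
          score = table[ngram]
          subs = [table[s] for s in sub_ngrams(ngram) if s in table]
          sups = [table[s] for s in super_ngrams(ngram, table)]
          return all(score >= s for s in subs) and all(score > s for s in sups)

      print(is_term_candidate(("machine", "translation"), glue))                # True
      print(is_term_candidate(("machine", "translation", "technology"), glue))  # False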
    Date
    10. 1.2013 19:22:47
  4. Smith, D.A.: Exploratory and faceted browsing over heterogeneous and cross-domain data sources (2011) 0.03
    0.025205866 = product of:
      0.0756176 = sum of:
        0.0756176 = weight(_text_:data in 4839) [ClassicSimilarity], result of:
          0.0756176 = score(doc=4839,freq=10.0), product of:
            0.16132914 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.051020417 = queryNorm
            0.46871632 = fieldWeight in 4839, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=4839)
      0.33333334 = coord(1/3)
    
    Abstract
    Exploration of heterogeneous data sources increases the value of information by allowing users to answer questions through exploration across multiple sources; users can use information that has been posted across the Web to answer questions and learn about new domains. We have conducted research that lowers the interrogation time of faceted data by combining related information from different sources. The work contributes methodologies for combining heterogeneous sources and for delivering that data to a user interface scalably, with enough performance to support rapid interrogation of the knowledge by the user. The work also contributes methods for combining linked data sources so that users can create faceted browsers that target the information facets of their needs. The work is grounded and proven in a number of experiments and test cases that study the contributions in domain research work.
  5. Nagy T., I.: Detecting multiword expressions and named entities in natural language texts (2014) 0.02
    0.023928402 = product of:
      0.035892602 = sum of:
        0.01972671 = weight(_text_:data in 1536) [ClassicSimilarity], result of:
          0.01972671 = score(doc=1536,freq=2.0), product of:
            0.16132914 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.051020417 = queryNorm
            0.12227618 = fieldWeight in 1536, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1536)
        0.016165892 = product of:
          0.032331783 = sum of:
            0.032331783 = weight(_text_:processing in 1536) [ClassicSimilarity], result of:
              0.032331783 = score(doc=1536,freq=2.0), product of:
                0.20653816 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.051020417 = queryNorm
                0.15654145 = fieldWeight in 1536, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1536)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Multiword expressions (MWEs) are lexical items that can be decomposed into single words and display lexical, syntactic, semantic, pragmatic and/or statistical idiosyncrasy (Sag et al., 2002; Kim, 2008; Calzolari et al., 2002). The proper treatment of multiword expressions such as rock 'n' roll and make a decision is essential for many natural language processing (NLP) applications like information extraction and retrieval, terminology extraction and machine translation, and it is important to identify multiword expressions in context. For example, in machine translation we must know that MWEs form one semantic unit, hence their parts should not be translated separately. For this, multiword expressions should be identified first in the text to be translated. The chief aim of this thesis is to develop machine learning-based approaches for the automatic detection of different types of multiword expressions in English and Hungarian natural language texts. In our investigations, we pay attention to the characteristics of different types of multiword expressions such as nominal compounds, multiword named entities and light verb constructions, and we apply novel methods to identify MWEs in raw texts. In the thesis it will be demonstrated that nominal compounds and multiword named entities may require a similar approach for their automatic detection as they behave in the same way from a linguistic point of view. Furthermore, it will be shown that the automatic detection of light verb constructions can be carried out using two effective machine learning-based approaches.
    In this thesis, we focused on the automatic detection of multiword expressions in natural language texts. On the basis of the main contributions, we can argue that:
    - Supervised machine learning methods can be successfully applied for the automatic detection of different types of multiword expressions in natural language texts.
    - Machine learning-based multiword expression detection can be successfully carried out for English as well as for Hungarian.
    - Our supervised machine learning-based model was successfully applied to the automatic detection of nominal compounds from English raw texts.
    - We developed a Wikipedia-based dictionary labeling method to automatically detect English nominal compounds.
    - A prior knowledge of nominal compounds can enhance Named Entity Recognition, while previously identified named entities can assist the nominal compound identification process.
    - The machine learning-based method can also provide acceptable results when it was trained on an automatically generated silver standard corpus.
    - As named entities form one semantic unit and may consist of more than one word and function as a noun, we can treat them in a similar way to nominal compounds.
    - Our sequence labelling-based tool can be successfully applied for identifying verbal light verb constructions in two typologically different languages, namely English and Hungarian.
    - Domain adaptation techniques may help diminish the distance between domains in the automatic detection of light verb constructions.
    - Our syntax-based method can be successfully applied for the full-coverage identification of light verb constructions. As a first step, a data-driven candidate extraction method can be utilized. Afterwards, a machine learning approach that makes use of an extended and rich feature set selects LVCs among the extracted candidates.
    - When a precise syntactic parser is available for the actual domain, the full-coverage identification can be performed better. In other cases, the usage of the sequence labeling method is recommended.
  6. Knitel, M.: ¬The application of linked data principles to library data : opportunities and challenges (2012) 0.02
    0.023009703 = product of:
      0.06902911 = sum of:
        0.06902911 = weight(_text_:data in 599) [ClassicSimilarity], result of:
          0.06902911 = score(doc=599,freq=12.0), product of:
            0.16132914 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.051020417 = queryNorm
            0.4278775 = fieldWeight in 599, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=599)
      0.33333334 = coord(1/3)
    
    Abstract
    Over the past few years, Linked Data has developed into a dominant topic in library science. As a standard for recording and exchanging data, it has numerous points of contact with traditional library techniques. The first part of this thesis introduces the basic technologies of this new paradigm and then examines their application to library data. Following the central principles of the Linked Data initiative, it looks more closely at the addressing of entities through URIs, the application of the RDF data model, and the linking of heterogeneous data sets. Particular attention is paid to the challenges that emerge in the process: ensuring high-quality information, persistently addressing content on the World Wide Web, and the interoperability of metadata standards. The final part of the thesis sketches a program that represents a possible extension of the search engine of the Austrian library network. Its prototypical implementation allows a realistic assessment of the current possibilities of Linked Data and underlines many of the topics previously worked out in theory. It becomes apparent that many hurdles still have to be overcome before Linked Data can be used fully in production. In particular, many projects are currently still at an early stage of maturity. On the other hand, the possibilities that would result from a consistent use of RDF are promising. RDF thus qualifies as a candidate for replacing obsolescent bibliographic data formats such as MAB or MARC.
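    A minimal sketch of what these principles look like in practice, using the rdflib Python library: a bibliographic record is given a URI, described with RDF triples, and linked to an external data set. The namespace, record identifier and vocabulary choices below are illustrative assumptions, not taken from the thesis or from the Austrian library network.
      # Minimal linked-data sketch with rdflib (namespace and identifiers are hypothetical).
      from rdflib import Graph, URIRef, Literal, Namespace
      from rdflib.namespace import DCTERMS

      EX = Namespace("http://example.org/bib/")        # hypothetical library namespace

      g = Graph()
      g.bind("dcterms", DCTERMS)

      record = URIRef(EX["record/599"])                # the entity gets a dereferenceable URI
      g.add((record, DCTERMS.title,
             Literal("The application of linked data principles to library data")))
      g.add((record, DCTERMS.creator, URIRef(EX["person/knitel-m"])))
      g.add((record, DCTERMS.issued, Literal("2012")))

      # Linking the record to a heterogeneous external data set, here a DBpedia resource.
      g.add((record, DCTERMS.subject, URIRef("http://dbpedia.org/resource/Linked_data")))

      print(g.serialize(format="turtle"))              # rdflib >= 6 returns a str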
  7. Vocht, L. De: Exploring semantic relationships in the Web of Data : Semantische relaties verkennen in data op het web (2017) 0.02
    0.019926988 = product of:
      0.059780963 = sum of:
        0.059780963 = weight(_text_:data in 4232) [ClassicSimilarity], result of:
          0.059780963 = score(doc=4232,freq=36.0), product of:
            0.16132914 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.051020417 = queryNorm
            0.3705528 = fieldWeight in 4232, product of:
              6.0 = tf(freq=36.0), with freq of:
                36.0 = termFreq=36.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.01953125 = fieldNorm(doc=4232)
      0.33333334 = coord(1/3)
    
    Abstract
    After the launch of the World Wide Web, it became clear that searching documents on the Web would not be trivial. Well-known engines to search the web, like Google, focus on search in web documents using keywords. The documents are structured and indexed to ensure keywords match documents as accurately as possible. However, searching by keywords does not always suffice. It is often the case that users do not know exactly how to formulate the search query or which keywords guarantee retrieving the most relevant documents. Besides that, users often rather want to browse information than look up something specific. It turned out that there is a need for systems that enable more interactivity and facilitate the gradual refinement of search queries to explore the Web. Users expect more from the Web because the short keyword-based queries they pose during search do not suffice for all cases. On top of that, the Web is changing structurally. The Web comprises, apart from a collection of documents, more and more linked data: pieces of information structured so they can be processed by machines. The semantics applied in this way allow users to indicate their search intentions to machines exactly. This is made possible by describing data following controlled vocabularies, concept lists composed by experts, published on the Web in a uniquely identifiable way. Even so, it is still not trivial to explore data on the Web. There is a large variety of vocabularies and various data sources use different terms to identify the same concepts.
    This PhD thesis describes how to effectively explore linked data on the Web. The main focus is on scenarios where users want to discover relationships between resources rather than finding out more about something specific. Searching for a specific document or piece of information fits in the theoretical framework of information retrieval and is associated with exploratory search. Exploratory search goes beyond 'looking up something' when users are seeking more detailed understanding, further investigation or navigation of the initial search results. The ideas behind exploratory search and querying linked data merge when it comes to the way knowledge is represented and indexed by machines - how data is structured and stored for optimal searchability. Queries and information should be aligned to facilitate that searches also reveal connections between results. This implies that they take into account the same semantic entities, relevant at that moment. To realize this, we research three techniques that are evaluated one by one in an experimental set-up to assess how well they succeed in their goals. In the end, the techniques are applied to a practical use case that focuses on forming a bridge between the Web and the use of digital libraries in scientific research. Our first technique focuses on the interactive visualization of search results. Linked data resources can be brought in relation with each other at will. This leads to complex and diverse graph structures. Our technique facilitates navigation and supports a workflow starting from a broad overview of the data and allows narrowing down to the desired level of detail and then broadening again. To validate the flow, two visualizations were implemented and presented to test users. The users judged the usability of the visualizations, how the visualizations fit in the workflow and to which degree their features seemed useful for the exploration of linked data.
    There is a difference in the way users interact with resources, visually or textually, and how resources are represented for machines to be processed by algorithms. This difference complicates bridging users' intents and machine-executable queries. It is important to implement this 'translation' mechanism so that it affects the search as favorably as possible in terms of performance, complexity and accuracy. To do this, we explain a second technique that supports such a bridging component. Our second technique is developed around three features that support the search process: looking up, relating and ranking resources. The main goal is to ensure that resources in the results are as precise and relevant as possible. During the evaluation of this technique, we did not only look at the precision of the search results but also investigated how the effectiveness of the search evolved while the user executed certain actions sequentially.
    When we speak about finding relationships between resources, it is necessary to dive deeper into the structure. The graph structure of linked data, where the semantics give meaning to the relationships between resources, enables the execution of pathfinding algorithms. The assigned weights and heuristics are base components of such algorithms and ultimately define which resources are included in a path, and in which order. These paths explain indirect connections between resources. Our third technique proposes an algorithm that optimizes the choice of resources in terms of serendipity. Some optimizations guard the consistency of candidate paths, where the coherence of consecutive connections is maximized to avoid trivial and too arbitrary paths. The implementation uses the A* algorithm, the de facto reference when it comes to heuristically optimized minimal-cost paths. The effectiveness of paths was measured with common automatic metrics and with surveys in which users could indicate their preference among paths that were each generated in a different way. Finally, all our techniques are applied to a use case about publications in digital libraries, where they are aligned with information about scientific conferences and researchers. The application to this use case is a practical example because the different aspects of exploratory search come together. In fact, the techniques also evolved from the experience of implementing the use case. Practical details about the semantic model are explained and the implementation of the search system is clarified module by module. The evaluation positions the result, a prototype of a tool to explore scientific publications, researchers and conferences, next to some important alternatives.
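    The path-finding technique sketched above builds on the A* algorithm, which repeatedly expands the partial path with the lowest estimated total cost (cost accumulated so far plus a heuristic estimate of the remaining cost). The sketch below is a generic A* over a small weighted graph of resources; the graph, the edge weights and the zero heuristic are placeholders and do not reproduce the thesis's serendipity-oriented weighting.
      # Generic A* search over a weighted resource graph (illustrative; the graph,
      # edge weights and heuristic are placeholders, not the thesis's own weighting).
      import heapq

      def a_star(graph, heuristic, start, goal):
          """graph: dict node -> list of (neighbour, edge_cost); returns a node path or None."""
          frontier = [(heuristic(start, goal), 0.0, start, [start])]
          best_cost = {start: 0.0}
          while frontier:
              _, cost, node, path = heapq.heappop(frontier)
              if node == goal:
                  return path
              for neighbour, edge_cost in graph.get(node, []):
                  new_cost = cost + edge_cost
                  if new_cost < best_cost.get(neighbour, float("inf")):
                      best_cost[neighbour] = new_cost
                      estimate = new_cost + heuristic(neighbour, goal)
                      heapq.heappush(frontier, (estimate, new_cost, neighbour, path + [neighbour]))
          return None

      # Toy graph of linked-data resources (URIs shortened to labels).
      graph = {
          "paper:A": [("author:X", 1.0), ("conference:C", 2.0)],
          "author:X": [("paper:B", 1.0)],
          "conference:C": [("paper:B", 0.5)],
          "paper:B": [],
      }
      print(a_star(graph, heuristic=lambda n, g: 0.0, start="paper:A", goal="paper:B"))
      # ['paper:A', 'author:X', 'paper:B'] - with a zero heuristic A* behaves like Dijkstra.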
  8. Stojanovic, N.: Ontology-based Information Retrieval : methods and tools for cooperative query answering (2005) 0.02
    0.018007545 = product of:
      0.054022633 = sum of:
        0.054022633 = product of:
          0.16206789 = sum of:
            0.16206789 = weight(_text_:3a in 701) [ClassicSimilarity], result of:
              0.16206789 = score(doc=701,freq=2.0), product of:
                0.43255165 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.051020417 = queryNorm
                0.3746787 = fieldWeight in 701, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03125 = fieldNorm(doc=701)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Content
    Cf.: http://digbib.ubka.uni-karlsruhe.de/volltexte/documents/1627.
  9. Xiong, C.: Knowledge based text representations for information retrieval (2016) 0.02
    0.018007545 = product of:
      0.054022633 = sum of:
        0.054022633 = product of:
          0.16206789 = sum of:
            0.16206789 = weight(_text_:3a in 5820) [ClassicSimilarity], result of:
              0.16206789 = score(doc=5820,freq=2.0), product of:
                0.43255165 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.051020417 = queryNorm
                0.3746787 = fieldWeight in 5820, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.03125 = fieldNorm(doc=5820)
          0.33333334 = coord(1/3)
      0.33333334 = coord(1/3)
    
    Content
    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Language and Information Technologies. Cf.: https://www.cs.cmu.edu/~cx/papers/knowledge_based_text_representation.pdf.
  10. Gordon, T.J.; Helmer-Hirschberg, O.: Report on a long-range forecasting study (1964) 0.01
    0.013034452 = product of:
      0.039103355 = sum of:
        0.039103355 = product of:
          0.07820671 = sum of:
            0.07820671 = weight(_text_:22 in 4204) [ClassicSimilarity], result of:
              0.07820671 = score(doc=4204,freq=4.0), product of:
                0.1786648 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.051020417 = queryNorm
                0.4377287 = fieldWeight in 4204, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=4204)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    22. 6.2018 13:24:08
    22. 6.2018 13:54:52
  11. Mair, M.: Increasing the value of meta data by using associative semantic networks (2002) 0.01
    0.011272406 = product of:
      0.033817217 = sum of:
        0.033817217 = weight(_text_:data in 4972) [ClassicSimilarity], result of:
          0.033817217 = score(doc=4972,freq=2.0), product of:
            0.16132914 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.051020417 = queryNorm
            0.2096163 = fieldWeight in 4972, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=4972)
      0.33333334 = coord(1/3)
    
  12. Thornton, K.: Powerful structure : inspecting infrastructures of information organization in Wikimedia Foundation projects (2016) 0.01
    0.011272406 = product of:
      0.033817217 = sum of:
        0.033817217 = weight(_text_:data in 3288) [ClassicSimilarity], result of:
          0.033817217 = score(doc=3288,freq=2.0), product of:
            0.16132914 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.051020417 = queryNorm
            0.2096163 = fieldWeight in 3288, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=3288)
      0.33333334 = coord(1/3)
    
    Abstract
    This dissertation investigates the social and technological factors of collaboratively organizing information in commons-based peer production systems. To do so, it analyzes the diverse strategies that members of Wikimedia Foundation (WMF) project communities use to organize information. Key findings from this dissertation show that conceptual structures of information organization are encoded into the infrastructure of WMF projects. The fact that WMF projects are commons-based peer production systems means that we can inspect the code that enables these systems, but a specific type of technical literacy is required to do so. I use three methods in this dissertation. I conduct a qualitative content analysis of the discussions surrounding the design, implementation and evaluation of the category system; a quantitative analysis using descriptive statistics of patterns of editing among editors who contributed to the code of templates for information boxes; and a close reading of the infrastructure used to create the category system, the infobox templates, and the knowledge base of structured data.
  13. Markó, K.G.: Foundation, implementation and evaluation of the MorphoSaurus system (2008) 0.01
    0.009333383 = product of:
      0.028000148 = sum of:
        0.028000148 = product of:
          0.056000296 = sum of:
            0.056000296 = weight(_text_:processing in 4415) [ClassicSimilarity], result of:
              0.056000296 = score(doc=4415,freq=6.0), product of:
                0.20653816 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.051020417 = queryNorm
                0.27113777 = fieldWeight in 4415, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=4415)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    This work proposes an approach which is intended to meet the particular challenges of Medical Language Processing, in particular medical information retrieval. At its core lies a new type of dictionary, in which the entries are equivalence classes of subwords, i.e., semantically minimal units. These equivalence classes capture intralingual as well as interlingual synonymy. As equivalence classes abstract away from subtle particularities within and between languages and reference to them is realized via a language-independent conceptual system, they form an interlingua. In this work, the theoretical foundations of this approach are elaborated on. Furthermore, design considerations of applications based on the subword methodology are drawn up and showcase implementations are evaluated in detail. Starting with the introduction of Medical Linguistics as a field of active research in Chapter two, its consideration as a domain separated from general linguistics is motivated. In particular, morphological phenomena inherent to medical language are considered in more detail, which leads to an alternative view on medical terms and the introduction of the notion of subwords. Chapter three describes the formal foundation of subwords and the underlying linguistic declarative as well as procedural knowledge. An implementation of the subword model for the medical domain, the MorphoSaurus system, is presented in Chapter four. Emphasis is given to the multilingual aspect of the proposed approach, including English, German, and Portuguese. The automatic acquisition of (medical) subwords for other languages (Spanish, French, and Swedish), and their integration in already available resources, is described in the fifth Chapter.
    The proper handling of acronyms plays a crucial role in medical texts, e.g. in patient records, as well as in scientific literature. Chapter six presents an approach, in which acronyms are automatically acquired from (bio-) medical literature. Furthermore, acronyms and their definitions in different languages are linked to each other using the MorphoSaurus text processing system. Automatic word sense disambiguation is still one of the most challenging tasks in Natural Language Processing. In Chapter seven, cross-lingual considerations lead to a new methodology for automatic disambiguation applied to subwords. Beginning with Chapter eight, a series of applications based on MorphoSaurus are introduced. Firstly, the implementation of the subword approach within a cross-language information retrieval setting for the medical domain is described and evaluated on standard test document collections. In Chapter nine, this methodology is extended to multilingual information retrieval in the Web, for which user queries are translated into target languages based on the segmentation into subwords and their interlingual mappings. The cross-lingual, automatic assignment of document descriptors to documents is the topic of Chapter ten. A large-scale evaluation of a heuristic, as well as a statistical algorithm is carried out using a prominent medical thesaurus as a controlled vocabulary. In Chapter eleven, it will be shown how MorphoSaurus can be used to map monolingual, lexical resources across different languages. As a result, a large multilingual medical lexicon with high coverage and complete lexical information is built and evaluated against a comparable, already available and commonly used lexical repository for the medical domain. Chapter twelve sketches a few applications based on MorphoSaurus. The generality and applicability of the subword approach to other domains is outlined, and proof-of-concepts in real-world scenarios are presented. Finally, Chapter thirteen recapitulates the most important aspects of MorphoSaurus and the potential benefit of its employment in medical information systems is carefully assessed, both for medical experts in their everyday life, but also with regard to health care consumers and their existential information needs.
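    A very small sketch of the subword idea: terms are segmented into semantically minimal units by greedy longest-match against a subword lexicon, and each subword is mapped to a language-independent equivalence-class identifier, so that an English and a German term for the same condition end up with the same class sequence. The lexicon and the class identifiers below are invented for illustration and are not MorphoSaurus data.
      # Toy subword segmentation with mapping to interlingual equivalence classes
      # (lexicon and class identifiers are invented, not the MorphoSaurus dictionary).
      LEXICON = {
          "gastr": "#stomach", "o": None,              # 'o' is a linking vowel, carries no meaning
          "enter": "#intestine", "itis": "#inflammation",
          "magen": "#stomach", "darm": "#intestine", "entzuendung": "#inflammation",
      }

      def segment(term):
          """Greedy longest-match segmentation; returns the list of matched subwords."""
          term, pieces, i = term.lower(), [], 0
          while i < len(term):
              for j in range(len(term), i, -1):        # try the longest candidate first
                  if term[i:j] in LEXICON:
                      pieces.append(term[i:j])
                      i = j
                      break
              else:                                    # no subword matches at position i
                  raise ValueError(f"cannot segment {term!r} at position {i}")
          return pieces

      def to_classes(term):
          """Map a term to its sequence of equivalence-class identifiers."""
          return [LEXICON[p] for p in segment(term) if LEXICON[p] is not None]

      print(to_classes("Gastroenteritis"))             # ['#stomach', '#intestine', '#inflammation']
      print(to_classes("Magendarmentzuendung"))        # the German term yields the same classes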
  14. Noy, N.F.: Knowledge representation for intelligent information retrieval in experimental sciences (1997) 0.01
    0.008709343 = product of:
      0.026128028 = sum of:
        0.026128028 = product of:
          0.052256055 = sum of:
            0.052256055 = weight(_text_:processing in 694) [ClassicSimilarity], result of:
              0.052256055 = score(doc=694,freq=4.0), product of:
                0.20653816 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.051020417 = queryNorm
                0.2530092 = fieldWeight in 694, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03125 = fieldNorm(doc=694)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    More and more information is available on-line every day. The greater the amount of on-line information, the greater the demand for tools that process and disseminate this information. Processing electronic information in the form of text and answering users' queries about that information intelligently is one of the great challenges in natural language processing and information retrieval. The research presented in this talk is centered on the latter of these two tasks: intelligent information retrieval. In order for information to be retrieved, it first needs to be formalized in a database or knowledge base. The ontology for this formalization and assumptions it is based on are crucial to successful intelligent information retrieval. We have concentrated our effort on developing an ontology for representing knowledge in the domains of experimental sciences, molecular biology in particular. We show that existing ontological models cannot be readily applied to represent this domain adequately. For example, the fundamental notion of ontology design that every "real" object is defined as an instance of a category seems incompatible with the universe where objects can change their category as a result of experimental procedures. Another important problem is representing complex structures such as DNA, mixtures, populations of molecules, etc., that are very common in molecular biology. We present extensions that need to be made to an ontology to cover these issues: the representation of transformations that change the structure and/or category of their participants, and the component relations and spatial structures of complex objects. We demonstrate examples of how the proposed representations can be used to improve the quality and completeness of answers to user queries; discuss techniques for evaluating ontologies and show a prototype of an Information Retrieval System that we developed.
  15. Slavic-Overfield, A.: Classification management and use in a networked environment : the case of the Universal Decimal Classification (2005) 0.01
    0.008709343 = product of:
      0.026128028 = sum of:
        0.026128028 = product of:
          0.052256055 = sum of:
            0.052256055 = weight(_text_:processing in 2191) [ClassicSimilarity], result of:
              0.052256055 = score(doc=2191,freq=4.0), product of:
                0.20653816 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.051020417 = queryNorm
                0.2530092 = fieldWeight in 2191, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2191)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    In the Internet information space, advanced information retrieval (IR) methods and automatic text processing are used in conjunction with traditional knowledge organization systems (KOS). New information technology provides a platform for better KOS publishing, exploitation and sharing both for human and machine use. Networked KOS services are now being planned and developed as powerful tools for resource discovery. They will enable automatic contextualisation, interpretation and query matching to different indexing languages. The Semantic Web promises to be an environment in which the quality of semantic relationships in bibliographic classification systems can be fully exploited. Their use in the networked environment is, however, limited by the fact that they are not prepared or made available for advanced machine processing. The UDC was chosen for this research because of its widespread use and its long-term presence in online information retrieval systems. It was also the first system to be used for the automatic classification of Internet resources, and the first to be made available as a classification tool on the Web. The objective of this research is to establish the advantages of using UDC for information retrieval in a networked environment, to highlight the problems of automation and classification exchange, and to offer possible solutions. The first research question was: is there enough evidence of the use of classification on the Internet to justify further development with this particular environment in mind? The second question is: what are the automation requirements for the full exploitation of UDC and its exchange? The third question is: which areas are in need of improvement and what specific recommendations can be made for implementing the UDC in a networked environment? A summary of changes required in the management and development of the UDC to facilitate its full adaptation for future use is drawn from this analysis.
  16. Haslhofer, B.: ¬A Web-based mapping technique for establishing metadata interoperability (2008) 0.01
    0.008135159 = product of:
      0.024405476 = sum of:
        0.024405476 = weight(_text_:data in 3173) [ClassicSimilarity], result of:
          0.024405476 = score(doc=3173,freq=6.0), product of:
            0.16132914 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.051020417 = queryNorm
            0.15127754 = fieldWeight in 3173, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.01953125 = fieldNorm(doc=3173)
      0.33333334 = coord(1/3)
    
    Abstract
    The integration of metadata from distinct, heterogeneous data sources requires metadata interoperability, which is a qualitative property of metadata information objects that is not given by default. The technique of metadata mapping allows domain experts to establish metadata interoperability in a certain integration scenario. Mapping solutions, as a technical manifestation of this technique, are already available for the intensively studied domain of database system interoperability, but they rarely exist for the Web. If we consider the amount of steadily increasing structured metadata and corresponding metadata schemes on the Web, we can observe a clear need for a mapping solution that can operate in a Web-based environment. To achieve that, we first need to build its technical core, which is a mapping model that provides the language primitives to define mapping relationships. Existing Semantic Web languages such as RDFS and OWL define some basic mapping elements (e.g., owl:equivalentProperty, owl:sameAs), but do not address the full spectrum of semantic and structural heterogeneities that can occur among distinct, incompatible metadata information objects. Furthermore, it is still unclear how to process defined mapping relationships during run-time in order to deliver metadata to the client in a uniform way. As the main contribution of this thesis, we present an abstract mapping model, which reflects the mapping problem on a generic level and provides the means for reconciling incompatible metadata. Instance transformation functions and URIs take a central role in that model. The former cover a broad spectrum of possible structural and semantic heterogeneities, while the latter bind the complete mapping model to the architecture of the World Wide Web. On the concrete, language-specific level we present a binding of the abstract mapping model for the RDF Vocabulary Description Language (RDFS), which allows us to create mapping specifications among incompatible metadata schemes expressed in RDFS. The mapping model is embedded in a cyclic process that categorises the requirements a mapping solution should fulfil into four subsequent phases: mapping discovery, mapping representation, mapping execution, and mapping maintenance. In this thesis, we mainly focus on mapping representation and on the transformation of mapping specifications into executable SPARQL queries. For mapping discovery support, the model provides an interface for plugging-in schema and ontology matching algorithms. For mapping maintenance we introduce the concept of a simple, but effective mapping registry. Based on the mapping model, we propose a Web-based mediator-wrapper architecture that allows domain experts to set up mediation endpoints that provide a uniform SPARQL query interface to a set of distributed metadata sources. The involved data sources are encapsulated by wrapper components that expose the contained metadata and the schema definitions on the Web and provide a SPARQL query interface to these metadata. In this thesis, we present the OAI2LOD Server, a wrapper component for integrating metadata that are accessible via the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).
    In a case study, we demonstrate how mappings can be created in a Web environment and how our mediator-wrapper architecture can easily be configured in order to integrate metadata from various heterogeneous data sources without the need to install any mapping solution or metadata integration solution in a local system environment.
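    A small sketch of what "transforming mapping specifications into executable SPARQL queries" can look like at its simplest: a mapping between two properties of incompatible schemes is expressed as a SPARQL CONSTRUCT query that rewrites source triples into the target vocabulary. The source scheme, the data and the single property mapping below are invented for illustration; the thesis's abstract mapping model and its RDFS binding cover far more kinds of heterogeneity.
      # Executing a property mapping as a SPARQL CONSTRUCT query with rdflib
      # (source scheme, data and mapping are invented for illustration).
      from rdflib import Graph, Namespace, URIRef, Literal

      SRC = Namespace("http://example.org/source-schema#")   # hypothetical source scheme

      source = Graph()
      source.add((URIRef("http://example.org/item/1"), SRC.mainTitle, Literal("A test record")))

      # The mapping "src:mainTitle corresponds to dcterms:title", written as CONSTRUCT.
      mapping_query = """
      PREFIX src:     <http://example.org/source-schema#>
      PREFIX dcterms: <http://purl.org/dc/terms/>
      CONSTRUCT { ?s dcterms:title ?t }
      WHERE     { ?s src:mainTitle ?t }
      """

      target = Graph()
      for triple in source.query(mapping_query):   # a CONSTRUCT result iterates as triples
          target.add(triple)

      for s, p, o in target:
          print(s, p, o)                           # the rewritten triple in the target vocabulary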
  17. Mao, M.: Ontology mapping : towards semantic interoperability in distributed and heterogeneous environments (2008) 0.01
    0.007514938 = product of:
      0.022544812 = sum of:
        0.022544812 = weight(_text_:data in 4659) [ClassicSimilarity], result of:
          0.022544812 = score(doc=4659,freq=2.0), product of:
            0.16132914 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.051020417 = queryNorm
            0.1397442 = fieldWeight in 4659, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=4659)
      0.33333334 = coord(1/3)
    
    Abstract
    This dissertation studies ontology mapping: the problem of finding semantic correspondences between similar elements of different ontologies. In the dissertation, elements denote classes or properties of ontologies. The goal of this research is to use ontology mapping to make heterogeneous information more accessible. The World Wide Web (WWW) now is widely used as a universal medium for information exchange. Semantic interoperability among different information systems in the WWW is limited due to information heterogeneity, and the non semantic nature of HTML and URLs. Ontologies have been suggested as a way to solve the problem of information heterogeneity by providing formal, explicit definitions of data and reasoning ability over related concepts. Given that no universal ontology exists for the WWW, work has focused on finding semantic correspondences between similar elements of different ontologies, i.e., ontology mapping. Ontology mapping can be done either by hand or using automated tools. Manual mapping becomes impractical as the size and complexity of ontologies increases. Full or semi-automated mapping approaches have been examined by several research studies. Previous full or semiautomated mapping approaches include analyzing linguistic information of elements in ontologies, treating ontologies as structural graphs, applying heuristic rules and machine learning techniques, and using probabilistic and reasoning methods etc. In this paper, two generic ontology mapping approaches are proposed. One is the PRIOR+ approach, which utilizes both information retrieval and artificial intelligence techniques in the context of ontology mapping. The other is the non-instance learning based approach, which experimentally explores machine learning algorithms to solve ontology mapping problem without requesting any instance. The results of the PRIOR+ on different tests at OAEI ontology matching campaign 2007 are encouraging. The non-instance learning based approach has shown potential for solving ontology mapping problem on OAEI benchmark tests.
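    As a point of reference for what "finding correspondences between similar elements" means at its simplest, the sketch below matches class names from two toy ontologies by plain string similarity. It is a naive lexical baseline for illustration only, not the PRIOR+ approach or the non-instance learning method proposed in the dissertation, and the ontologies are invented.
      # Naive label-similarity baseline for ontology mapping (illustration only).
      from difflib import SequenceMatcher

      onto_a = ["Person", "Publication", "ConferencePaper", "Organization"]
      onto_b = ["Agent", "Document", "Conference_Article", "Organisation"]

      def normalise(label):
          return label.replace("_", " ").lower()

      def similarity(a, b):
          return SequenceMatcher(None, normalise(a), normalise(b)).ratio()

      # Keep the best-scoring counterpart for each class in onto_a, above a threshold.
      THRESHOLD = 0.6
      for a in onto_a:
          b, score = max(((b, similarity(a, b)) for b in onto_b), key=lambda pair: pair[1])
          if score >= THRESHOLD:
              print(f"{a:16s} <-> {b:20s} ({score:.2f})")
      # 'Organization <-> Organisation' scores highly, while 'Person <-> Agent' is missed,
      # which is exactly why purely lexical matching can only serve as a baseline.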
  18. Castellanos Ardila, J.P.: Investigation of an OSLC-domain targeting ISO 26262 : focus on the left side of the software V-model (2016) 0.01
    0.007514938 = product of:
      0.022544812 = sum of:
        0.022544812 = weight(_text_:data in 5819) [ClassicSimilarity], result of:
          0.022544812 = score(doc=5819,freq=2.0), product of:
            0.16132914 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.051020417 = queryNorm
            0.1397442 = fieldWeight in 5819, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=5819)
      0.33333334 = coord(1/3)
    
    Abstract
    Industries have adopted a standardized set of practices for developing their products. In the automotive domain, the provision of safety-compliant systems is guided by ISO 26262, a standard that specifies a set of requirements and recommendations for developing automotive safety-critical systems. For being in compliance with ISO 26262, the safety lifecycle proposed by the standard must be included in the development process of a vehicle. Besides, a safety case that shows that the system is acceptably safe has to be provided. The provision of a safety case implies the execution of a precise documentation process. This process makes sure that the work products are available and traceable. Further, the documentation management is defined in the standard as a mandatory activity and guidelines are proposed/imposed for its elaboration. It would be appropriate to point out that a well-documented safety lifecycle will provide the necessary inputs for the generation of an ISO 26262-compliant safety case. The OSLC (Open Services for Lifecycle Collaboration) standard and the maturing stack of semantic web technologies represent a promising integration platform for enabling semantic interoperability between the tools involved in the safety lifecycle. Tools for requirements, architecture, development management, among others, are expected to interact and share data with the help of domain specifications created in OSLC. This thesis proposes the creation of an OSLC tool-chain infrastructure for sharing safety-related information, where fragments of safety information can be generated. The steps carried out during the elaboration of this master thesis consist of the identification, representation, and shaping of the RDF resources needed for the creation of a safety case. The focus of the thesis is limited to a tiny portion of the ISO 26262 left-hand side of the V-model, more exactly part 6 clause 8 of the standard: Software unit design and implementation. Regardless of the use of a restricted portion of the standard during the execution of this thesis, the findings can be extended to other parts, and the conclusions can be generalized. This master thesis is considered one of the first steps towards the provision of an OSLC-based and ISO 26262-compliant methodological approach for representing and shaping the work products resulting from the execution of the safety lifecycle, documentation required in the conformation of an ISO-compliant safety case.
  19. Pepper, S.: ¬The typology and semantics of binominal lexemes : noun-noun compounds and their functional equivalents (2020) 0.01
    0.007514938 = product of:
      0.022544812 = sum of:
        0.022544812 = weight(_text_:data in 104) [ClassicSimilarity], result of:
          0.022544812 = score(doc=104,freq=2.0), product of:
            0.16132914 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.051020417 = queryNorm
            0.1397442 = fieldWeight in 104, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=104)
      0.33333334 = coord(1/3)
    
    Abstract
    The dissertation establishes 'binominal lexeme' as a comparative concept and discusses its cross-linguistic typology and semantics. Informally, a binominal lexeme is a noun-noun compound or functional equivalent; more precisely, it is a lexical item that consists primarily of two thing-morphs between which there exists an unstated semantic relation. Examples of binominals include Mandarin Chinese 铁路 (tielù) [iron road], French chemin de fer [way of iron] and Russian железная дорога (zeleznaja doroga) [iron:adjz road]. All of these combine a word denoting 'iron' and a word denoting 'road' or 'way' to denote the meaning railway. In each case, the unstated semantic relation is one of composition: a railway is conceptualized as a road that is composed (or made) of iron. However, three different morphosyntactic strategies are employed: compounding, prepositional phrase and relational adjective. This study explores the range of such strategies used by a worldwide sample of 106 languages to express a set of 100 meanings from various semantic domains, resulting in a classification consisting of nine different morphosyntactic types. The semantic relations found in the data are also explored and a classification called the Hatcher-Bourque system is developed that operates at two levels of granularity, together with a tool for classifying binominals, the Bourquifier. The classification is extended to other subfields of language, including metonymy and lexical semantics, and beyond language to the domain of knowledge representation, resulting in a proposal for a general model of associative relations called the PHAB model. The many findings of the research include universals concerning the recruitment of anchoring nominal modification strategies, a method for comparing non-binary typologies, the non-universality (despite its predominance) of compounding, and a scale of frequencies for semantic relations which may provide insights into the associative nature of human thought.
  20. Sebastian, Y.: Literature-based discovery by learning heterogeneous bibliographic information networks (2017) 0.01
    0.007514938 = product of:
      0.022544812 = sum of:
        0.022544812 = weight(_text_:data in 535) [ClassicSimilarity], result of:
          0.022544812 = score(doc=535,freq=2.0), product of:
            0.16132914 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.051020417 = queryNorm
            0.1397442 = fieldWeight in 535, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=535)
      0.33333334 = coord(1/3)
    
    Abstract
    Literature-based discovery (LBD) research aims at finding effective computational methods for predicting previously unknown connections between clusters of research papers from disparate research areas. Existing methods encompass two general approaches. The first approach searches for these unknown connections by examining the textual contents of research papers. In addition to the existing textual features, the second approach incorporates structural features of scientific literatures, such as citation structures. These approaches, however, have not considered research papers' latent bibliographic metadata structures as important features that can be used for predicting previously unknown relationships between them. This thesis investigates a new graph-based LBD method that exploits the latent bibliographic metadata connections between pairs of research papers. The heterogeneous bibliographic information network is proposed as an efficient graph-based data structure for modeling the complex relationships between these metadata. In contrast to previous approaches, this method seamlessly combines textual and citation information in the form of path-based metadata features for predicting future co-citation links between research papers from disparate research fields. The results reported in this thesis provide evidence that the method is effective for reconstructing the historical literature-based discovery hypotheses. This thesis also investigates the effects of semantic modeling and topic modeling on the performance of the proposed method. For semantic modeling, a general-purpose word sense disambiguation technique is proposed to reduce the lexical ambiguity in the title and abstract of research papers. The experimental results suggest that the reduced lexical ambiguity did not necessarily lead to a better performance of the method. This thesis discusses some of the possible contributing factors to these results. Finally, topic modeling is used for learning the latent topical relations between research papers. The learned topic model is incorporated into the heterogeneous bibliographic information network graph and allows new predictive features to be learned. The results in this thesis suggest that topic modeling improves the performance of the proposed method by increasing the overall accuracy for predicting the future co-citation links between disparate research papers.
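    To make the notion of path-based metadata features concrete, the sketch below counts, for a pair of papers, the connecting paths that follow a given meta-path (for example paper-author-paper or paper-venue-paper) in a tiny, made-up bibliographic network. The data and the feature definition are illustrative only; the thesis's heterogeneous bibliographic information networks and its learning setup are far richer.
      # Counting meta-path instances between two papers in a toy heterogeneous
      # bibliographic network (data and meta-paths are invented for illustration).
      edges = {                                  # paper -> {relation: set of neighbouring entities}
          "P1": {"written_by": {"A1", "A2"}, "published_in": {"V1"}, "cites": {"P3"}},
          "P2": {"written_by": {"A2"},       "published_in": {"V1"}, "cites": {"P3"}},
      }

      def metapath_count(paper_a, paper_b, relation):
          """Number of entities reachable from both papers over one relation, i.e. the
          number of instances of the meta-path  paper -relation-> entity <-relation- paper."""
          return len(edges[paper_a].get(relation, set()) & edges[paper_b].get(relation, set()))

      features = {
          "shared_authors":    metapath_count("P1", "P2", "written_by"),    # paper-author-paper
          "shared_venues":     metapath_count("P1", "P2", "published_in"),  # paper-venue-paper
          "shared_references": metapath_count("P1", "P2", "cites"),         # bibliographic coupling
      }
      print(features)   # {'shared_authors': 1, 'shared_venues': 1, 'shared_references': 1}
      # Such counts form a feature vector from which a classifier could predict a future
      # co-citation link between P1 and P2.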