This database contains over 40,000 documents on topics from the fields of descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 28 April 2022)
1. Daquino, M. ; Peroni, S. ; Shotton, D. ; Colavizza, G. ; Ghavimi, B. ; Lauscher, A. ; Mayr, P. ; Romanello, M. ; Zumstein, P.: The OpenCitations Data Model.
Abstract: A variety of schemas and ontologies are currently used for the machine-readable description of bibliographic entities and citations. This diversity, and the reuse of the same ontology terms with different nuances, generates inconsistencies in data. Adoption of a single data model would facilitate data integration tasks regardless of the data supplier or context application. In this paper we present the OpenCitations Data Model (OCDM), a generic data model for describing bibliographic entities and citations, developed using Semantic Web technologies. We also evaluate the effective reusability of OCDM according to ontology evaluation practices, mention existing users of OCDM, and discuss the use and impact of OCDM in the wider open science community.
Content: Published in: The Semantic Web - ISWC 2020, 19th International Semantic Web Conference, Athens, Greece, November 2-6, 2020, Proceedings, Part II. Cf.: DOI: 10.1007/978-3-030-62466-8_28.
Subject area: Citation indexing
2. Peroni, S. ; Dutton, A. ; Gray, T. ; Shotton, D.: Setting our bibliographic references free : towards open citation data.
In: Journal of documentation. 71(2015) no.2, pp. 253-277.
Abstract: Purpose - Citation data needs to be recognised as a part of the Commons - those works that are freely and legally available for sharing - and placed in an open repository. The paper aims to discuss this issue. Design/methodology/approach - The Open Citation Corpus is a new open repository of scholarly citation data, made available under a Creative Commons CC0 1.0 public domain dedication and encoded as Open Linked Data using the SPAR Ontologies. Findings - The Open Citation Corpus presently provides open access (OA) to reference lists from 204,637 articles from the OA Subset of PubMed Central, containing 6,325,178 individual references to 3,373,961 unique papers. Originality/value - Scholars, publishers and institutions may freely build upon, enhance and reuse the open citation data for any purpose, without restriction under copyright or database law.
3. Iorio, A.D. ; Peroni, S. ; Poggi, F. ; Vitali, F.: Dealing with structural patterns of XML documents.
In: Journal of the Association for Information Science and Technology. 65(2014) no.9, pp. 1884-1900.
Abstract: Evaluating collections of XML documents without paying attention to the schema they were written in may give interesting insights into the expected characteristics of a markup language, as well as any regularity that may span vocabularies and languages, and that are more fundamental and frequent than plain content models. In this paper we explore the idea of structural patterns in XML vocabularies, by examining the characteristics of elements as they are used, rather than as they are defined. We introduce from the ground up a formal theory of 8 plus 3 structural patterns for XML elements, and verify their identifiability in a number of different XML vocabularies. The results allowed the creation of visualization and content extraction tools that are completely independent of the schema and without any previous knowledge of the semantics and organization of the XML vocabulary of the documents.
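Note: The idea of examining elements "as they are used, rather than as they are defined" can be illustrated with a minimal sketch. It does not reproduce the paper's 8 plus 3 patterns; the four content kinds and the function name `observed_content_kinds` are assumptions made purely for illustration. It only shows how element usage might be summarised without consulting any schema.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def observed_content_kinds(xml_text):
    """Map each element name to the set of content kinds seen in actual use.

    The kinds ("empty", "text-only", "elements-only", "mixed") are a
    hypothetical simplification, not the patterns defined in the paper.
    """
    root = ET.fromstring(xml_text)
    kinds = defaultdict(set)
    for elem in root.iter():
        has_children = len(elem) > 0
        # Text directly inside the element, or text following one of its children.
        has_text = bool((elem.text or "").strip()) or any(
            (child.tail or "").strip() for child in elem
        )
        if has_children and has_text:
            kind = "mixed"
        elif has_children:
            kind = "elements-only"
        elif has_text:
            kind = "text-only"
        else:
            kind = "empty"
        kinds[elem.tag].add(kind)
    return dict(kinds)


sample = "<doc><p>Some <b>bold</b> text</p><p>Plain</p><br/><sec><p>x</p></sec></doc>"
result = observed_content_kinds(sample)
for tag, seen in sorted(result.items()):
    print(tag, sorted(seen))
```

An element name that shows up with more than one kind (here `p`, used both with and without inline children) is the sort of regularity that can only be found by looking at usage rather than at the schema.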
4. Iorio, A. di ; Peroni, S. ; Vitali, F.: A Semantic Web approach to everyday overlapping markup.
In: Journal of the American Society for Information Science and Technology. 62(2011) no.9, pp. 1696-1716.
Abstract: Overlapping structures in XML are not symptoms of a misunderstanding of the intrinsic characteristics of a text document nor evidence of extreme scholarly requirements far beyond those needed by the most common XML-based applications. On the contrary, overlaps have started to appear in a large number of incredibly popular applications hidden under the guise of syntactical tricks to the basic hierarchy of the XML data format. Unfortunately, syntactical tricks have the drawback that the affected structures require complicated workarounds to support even the simplest query or usage. In this article, we present Extremely Annotational Resource Description Framework (RDF) Markup (EARMARK), an approach to overlapping markup that simplifies and streamlines the management of multiple hierarchies on the same content, and provides an approach to sophisticated queries and usages over such structures without the need of ad-hoc applications, simply by using Semantic Web tools and languages. We compare how relevant tasks (e.g., the identification of the contribution of an author in a word processor document) are of some substantial complexity when using the original data format and become more or less trivial when using EARMARK. We finally evaluate positively the memory and disk requirements of EARMARK documents in comparison to Open Office and Microsoft Word XML-based formats.
Subject area: Semantic Web ; Knowledge representation
Object: RDF ; EARMARK
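Note: The overlap problem described in this abstract can be sketched with stand-off ranges over a shared text, where two hierarchies annotate the same content without having to nest inside each other. This is a plain-Python illustration only; it does not use RDF or the EARMARK ontology, and the names `Range`, `text_of`, and `overlaps` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A stand-off annotation: a labelled span over shared content (hypothetical model)."""
    name: str       # label of the markup item
    hierarchy: str  # which hierarchy the item belongs to
    begin: int      # inclusive character offset
    end: int        # exclusive character offset


def text_of(r, content):
    """Return the characters of the shared content covered by a range."""
    return content[r.begin:r.end]


def overlaps(a, b):
    """True if two ranges share characters but neither contains the other."""
    intersect = a.begin < b.end and b.begin < a.end
    a_in_b = b.begin <= a.begin and a.end <= b.end
    b_in_a = a.begin <= b.begin and b.end <= a.end
    return intersect and not (a_in_b or b_in_a)


content = "The quick brown fox jumps"
line = Range("line", "physical", 0, 15)         # covers "The quick brown"
phrase = Range("phrase", "linguistic", 10, 19)  # covers "brown fox"
word = Range("word", "linguistic", 4, 9)        # covers "quick"

print(text_of(line, content), "|", text_of(phrase, content))
print(overlaps(line, phrase))  # the two spans cross each other
print(overlaps(line, word))    # the word nests inside the line
```

A single XML tree cannot hold `line` and `phrase` as ordinary nested elements, because each starts inside or ends outside the other; keeping the ranges separate from the text avoids that constraint.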