Search (13 results, page 1 of 1)

Candela, G.: ¬An automatic data quality approach to assess semantic data from cultural heritage institutions (2023) 0.05

0.045274492 = product of:
  0.090548985 = sum of:
    0.072418936 = weight(_text_:data in 997) [ClassicSimilarity], result of:
      0.072418936 = score(doc=997,freq=12.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.59902847 = fieldWeight in 997, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=997)
    0.01813005 = product of:
      0.0362601 = sum of:
        0.0362601 = weight(_text_:22 in 997) [ClassicSimilarity], result of:
          0.0362601 = score(doc=997,freq=2.0), product of:
            0.13388468 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03823278 = queryNorm
            0.2708308 = fieldWeight in 997, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=997)
      0.5 = coord(1/2)
  0.5 = coord(2/4)

Abstract: In recent years, cultural heritage institutions have been exploring the benefits of applying Linked Open Data to their catalogs and digital materials. Innovative and creative methods have emerged to publish and reuse digital contents to promote computational access, such as the concepts of Labs and Collections as Data. Data quality has become a requirement for researchers and training methods based on artificial intelligence and machine learning. This article explores how the quality of Linked Open Data made available by cultural heritage institutions can be automatically assessed. The results obtained can be useful for other institutions who wish to publish and assess their collections.
Date: 22. 6.2023 18:23:31

Gabler, S.: Vergabe von DDC-Sachgruppen mittels eines Schlagwort-Thesaurus (2021) 0.03
```
0.025739845 = product of:
  0.05147969 = sum of:
    0.030361896 = product of:
      0.15180948 = sum of:
        0.15180948 = weight(_text_:3a in 1000) [ClassicSimilarity], result of:
          0.15180948 = score(doc=1000,freq=2.0), product of:
            0.32413796 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.03823278 = queryNorm
            0.46834838 = fieldWeight in 1000, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1000)
      0.2 = coord(1/5)
    0.021117793 = weight(_text_:data in 1000) [ClassicSimilarity], result of:
      0.021117793 = score(doc=1000,freq=2.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.17468026 = fieldWeight in 1000, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1000)
  0.5 = coord(2/4)
```
Abstract

Vorgestellt wird die Konstruktion eines thematisch geordneten Thesaurus auf Basis der Sachschlagwörter der Gemeinsamen Normdatei (GND) unter Nutzung der darin enthaltenen DDC-Notationen. Oberste Ordnungsebene dieses Thesaurus werden die DDC-Sachgruppen der Deutschen Nationalbibliothek. Die Konstruktion des Thesaurus erfolgt regelbasiert unter der Nutzung von Linked Data Prinzipien in einem SPARQL Prozessor. Der Thesaurus dient der automatisierten Gewinnung von Metadaten aus wissenschaftlichen Publikationen mittels eines computerlinguistischen Extraktors. Hierzu werden digitale Volltexte verarbeitet. Dieser ermittelt die gefundenen Schlagwörter über Vergleich der Zeichenfolgen Benennungen im Thesaurus, ordnet die Treffer nach Relevanz im Text und gibt die zugeordne-ten Sachgruppen rangordnend zurück. Die grundlegende Annahme dabei ist, dass die gesuchte Sachgruppe unter den oberen Rängen zurückgegeben wird. In einem dreistufigen Verfahren wird die Leistungsfähigkeit des Verfahrens validiert. Hierzu wird zunächst anhand von Metadaten und Erkenntnissen einer Kurzautopsie ein Goldstandard aus Dokumenten erstellt, die im Online-Katalog der DNB abrufbar sind. Die Dokumente vertei-len sich über 14 der Sachgruppen mit einer Losgröße von jeweils 50 Dokumenten. Sämtliche Dokumente werden mit dem Extraktor erschlossen und die Ergebnisse der Kategorisierung do-kumentiert. Schließlich wird die sich daraus ergebende Retrievalleistung sowohl für eine harte (binäre) Kategorisierung als auch eine rangordnende Rückgabe der Sachgruppen beurteilt.

Content

Master thesis Master of Science (Library and Information Studies) (MSc), Universität Wien. Advisor: Christoph Steiner. Vgl.: https://www.researchgate.net/publication/371680244_Vergabe_von_DDC-Sachgruppen_mittels_eines_Schlagwort-Thesaurus. DOI: 10.25365/thesis.70030. Vgl. dazu die Präsentation unter: https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=web&cd=&ved=0CAIQw7AJahcKEwjwoZzzytz_AhUAAAAAHQAAAAAQAg&url=https%3A%2F%2Fwiki.dnb.de%2Fdownload%2Fattachments%2F252121510%2FDA3%2520Workshop-Gabler.pdf%3Fversion%3D1%26modificationDate%3D1671093170000%26api%3Dv2&psig=AOvVaw0szwENK1or3HevgvIDOfjx&ust=1687719410889597&opi=89978449.

Folsom, S.M.: Using the Program for Cooperative Cataloging's past and present to project a Linked Data future (2020) 0.02

0.018888328 = product of:
  0.07555331 = sum of:
    0.07555331 = weight(_text_:data in 5747) [ClassicSimilarity], result of:
      0.07555331 = score(doc=5747,freq=10.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.6249551 = fieldWeight in 5747, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0625 = fieldNorm(doc=5747)
  0.25 = coord(1/4)

Abstract: Drawing on the PCC's history with linked data and related work this article identifies and gives context to pressing areas PCC will need to focus on moving forward. These areas include defining plausible data targets, tractable implementation models and data flows, engaging in related tool development, and participating in the broader linked data community.

Naun, C.C.: Expanding the use of Linked Data value vocabularies in PCC cataloging (2020) 0.02

0.016527288 = product of:
  0.06610915 = sum of:
    0.06610915 = weight(_text_:data in 123) [ClassicSimilarity], result of:
      0.06610915 = score(doc=123,freq=10.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.5468357 = fieldWeight in 123, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=123)
  0.25 = coord(1/4)

Abstract: In 2015, the PCC Task Group on URIs in MARC was tasked to identify and address linked data identifiers deployment in the current MARC format. By way of a pilot test, a survey, MARC Discussion papers, Proposals, etc., the Task Group initiated and introduced changes to MARC encoding. The Task Group succeeded in laying the ground work for preparing library data transition from MARC data to a linked data, RDF environment.

Schreur, P.E.: ¬The use of Linked Data and artificial intelligence as key elements in the transformation of technical services (2020) 0.01
```
0.014782455 = product of:
  0.05912982 = sum of:
    0.05912982 = weight(_text_:data in 125) [ClassicSimilarity], result of:
      0.05912982 = score(doc=125,freq=8.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.48910472 = fieldWeight in 125, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=125)
  0.25 = coord(1/4)
```
Abstract

Library Technical Services have benefited from numerous stimuli. Although initially looked at with suspicion, transitions such as the move from catalog cards to the MARC formats have proven enormously helpful to libraries and their patrons. Linked data and Artificial Intelligence (AI) hold the same promise. Through the conversion of metadata surrogates (cataloging) to linked open data, libraries can represent their resources on the Semantic Web. But in order to provide some form of controlled access to unstructured data, libraries must reach beyond traditional cataloging to new tools such as AI to provide consistent access to a growing world of full-text resources.
Isaac, A.; Raemy, J.A.; Meijers, E.; Valk, S. De; Freire, N.: Metadata aggregation via linked data : results of the Europeana Common Culture project (2020) 0.01
```
0.014166246 = product of:
  0.056664985 = sum of:
    0.056664985 = weight(_text_:data in 39) [ClassicSimilarity], result of:
      0.056664985 = score(doc=39,freq=10.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.46871632 = fieldWeight in 39, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=39)
  0.25 = coord(1/4)
```
Abstract

Digital cultural heritage resources are widely available on the web through the digital libraries of heritage institutions. To address the difficulties of discoverability in cultural heritage, the common practice is metadata aggregation, where centralized efforts like Europeana facilitate discoverability by collecting the resources' metadata. We present the results of the linked data aggregation task conducted within the Europeana Common Culture project, which attempted an innovative approach to aggregation based on linked data made available by cultural heritage institutions. This task ran for one year with participation of eleven organizations, involving the three member roles of the Europeana network: data providers, intermediary aggregators, and the central aggregation hub, Europeana. We report on the challenges that were faced by data providers, the standards and specifications applied, and the resulting aggregated metadata.
Kahlawi, A,: ¬An ontology driven ESCO LOD quality enhancement (2020) 0.01
```
0.010973128 = product of:
  0.04389251 = sum of:
    0.04389251 = weight(_text_:data in 5959) [ClassicSimilarity], result of:
      0.04389251 = score(doc=5959,freq=6.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.3630661 = fieldWeight in 5959, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=5959)
  0.25 = coord(1/4)
```
Abstract

The labor market is a system that is complex and difficult to manage. To overcome this challenge, the European Union has launched the ESCO project which is a language that aims to describe this labor market. In order to support the spread of this project, its dataset was presented as linked open data (LOD). Since LOD is usable and reusable, a set of conditions have to be met. First, LOD must be feasible and high quality. In addition, it must provide the user with the right answers, and it has to be built according to a clear and correct structure. This study investigates the LOD of ESCO, focusing on data quality and data structure. The former is evaluated through applying a set of SPARQL queries. This provides solutions to improve its quality via a set of rules built in first order logic. This process was conducted based on a new proposed ESCO ontology.
Smith, A.: Simple Knowledge Organization System (SKOS) (2022) 0.01
```
0.008959521 = product of:
  0.035838082 = sum of:
    0.035838082 = weight(_text_:data in 1094) [ClassicSimilarity], result of:
      0.035838082 = score(doc=1094,freq=4.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.29644224 = fieldWeight in 1094, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=1094)
  0.25 = coord(1/4)
```
Abstract

SKOS (Simple Knowledge Organization System) is a recommendation from the World Wide Web Consortium (W3C) for representing controlled vocabularies, taxonomies, thesauri, classifications, and similar systems for organizing and indexing information as linked data elements in the Semantic Web, using the Resource Description Framework (RDF). The SKOS data model is centered on "concepts", which can have preferred and alternate labels in any language as well as other metadata, and which are identified by addresses on the World Wide Web (URIs). Concepts are grouped into hierarchies through "broader" and "narrower" relations, with "top concepts" at the broadest conceptual level. Concepts are also organized into "concept schemes", also identified by URIs. Other relations, mappings, and groupings are also supported. This article discusses the history of the development of SKOS and provides notes on adoption, uses, and limitations.
Ahmed, M.; Mukhopadhyay, M.; Mukhopadhyay, P.: Automated knowledge organization : AI ML based subject indexing system for libraries (2023) 0.01
```
0.0074662673 = product of:
  0.02986507 = sum of:
    0.02986507 = weight(_text_:data in 977) [ClassicSimilarity], result of:
      0.02986507 = score(doc=977,freq=4.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.24703519 = fieldWeight in 977, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=977)
  0.25 = coord(1/4)
```
Abstract

The research study as reported here is an attempt to explore the possibilities of an AI/ML-based semi-automated indexing system in a library setup to handle large volumes of documents. It uses the Python virtual environment to install and configure an open source AI environment (named Annif) to feed the LOD (Linked Open Data) dataset of Library of Congress Subject Headings (LCSH) as a standard KOS (Knowledge Organisation System). The framework deployed the Turtle format of LCSH after cleaning the file with Skosify, applied an array of backend algorithms (namely TF-IDF, Omikuji, and NN-Ensemble) to measure relative performance, and selected Snowball as an analyser. The training of Annif was conducted with a large set of bibliographic records populated with subject descriptors (MARC tag 650$a) and indexed by trained LIS professionals. The training dataset is first treated with MarcEdit to export it in a format suitable for OpenRefine, and then in OpenRefine it undergoes many steps to produce a bibliographic record set suitable to train Annif. The framework, after training, has been tested with a bibliographic dataset to measure indexing efficiencies, and finally, the automated indexing framework is integrated with data wrangling software (OpenRefine) to produce suggested headings on a mass scale. The entire framework is based on open-source software, open datasets, and open standards.
Peponakis, M.; Mastora, A.; Kapidakis, S.; Doerr, M.: Expressiveness and machine processability of Knowledge Organization Systems (KOS) : an analysis of concepts and relations (2020) 0.01
```
0.0052794483 = product of:
  0.021117793 = sum of:
    0.021117793 = weight(_text_:data in 5787) [ClassicSimilarity], result of:
      0.021117793 = score(doc=5787,freq=2.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.17468026 = fieldWeight in 5787, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=5787)
  0.25 = coord(1/4)
```
Abstract

This study considers the expressiveness (that is the expressive power or expressivity) of different types of Knowledge Organization Systems (KOS) and discusses its potential to be machine-processable in the context of the Semantic Web. For this purpose, the theoretical foundations of KOS are reviewed based on conceptualizations introduced by the Functional Requirements for Subject Authority Data (FRSAD) and the Simple Knowledge Organization System (SKOS); natural language processing techniques are also implemented. Applying a comparative analysis, the dataset comprises a thesaurus (Eurovoc), a subject headings system (LCSH) and a classification scheme (DDC). These are compared with an ontology (CIDOC-CRM) by focusing on how they define and handle concepts and relations. It was observed that LCSH and DDC focus on the formalism of character strings (nomens) rather than on the modelling of semantics; their definition of what constitutes a concept is quite fuzzy, and they comprise a large number of complex concepts. By contrast, thesauri have a coherent definition of what constitutes a concept, and apply a systematic approach to the modelling of relations. Ontologies explicitly define diverse types of relations, and are by their nature machine-processable. The paper concludes that the potential of both the expressiveness and machine processability of each KOS is extensively regulated by its structural rules. It is harder to represent subject headings and classification schemes as semantic networks with nodes and arcs, while thesauri are more suitable for such a representation. In addition, a paradigm shift is revealed which focuses on the modelling of relations between concepts, rather than the concepts themselves.
Menzel, S.; Schnaitter, H.; Zinck, J.; Petras, V.; Neudecker, C.; Labusch, K.; Leitner, E.; Rehm, G.: Named Entity Linking mit Wikidata und GND : das Potenzial handkuratierter und strukturierter Datenquellen für die semantische Anreicherung von Volltexten (2021) 0.01
```
0.0052794483 = product of:
  0.021117793 = sum of:
    0.021117793 = weight(_text_:data in 373) [ClassicSimilarity], result of:
      0.021117793 = score(doc=373,freq=2.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.17468026 = fieldWeight in 373, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=373)
  0.25 = coord(1/4)
```
Abstract

Named Entities (benannte Entitäten) - wie Personen, Organisationen, Orte, Ereignisse und Werke - sind wichtige inhaltstragende Komponenten eines Dokuments und sind daher maßgeblich für eine gute inhaltliche Erschließung. Die Erkennung von Named Entities, deren Auszeichnung (Annotation) und Verfügbarmachung für die Suche sind wichtige Instrumente, um Anwendungen wie z. B. die inhaltliche oder semantische Suche in Texten, dokumentübergreifende Kontextualisierung oder das automatische Textzusammenfassen zu verbessern. Inhaltlich präzise und nachhaltig erschlossen werden die erkannten Named Entities eines Dokuments allerdings erst, wenn sie mit einer oder mehreren Quellen verknüpft werden (Grundprinzip von Linked Data, Berners-Lee 2006), die die Entität eindeutig identifizieren und gegenüber gleichlautenden Entitäten disambiguieren (vergleiche z. B. Berlin als Hauptstadt Deutschlands mit dem Komponisten Irving Berlin). Dazu wird die im Dokument erkannte Entität mit dem Entitätseintrag einer Normdatei oder einer anderen zuvor festgelegten Wissensbasis (z. B. Gazetteer für geografische Entitäten) verknüpft, gewöhnlich über den persistenten Identifikator der jeweiligen Wissensbasis oder Normdatei. Durch die Verknüpfung mit einer Normdatei erfolgt nicht nur die Disambiguierung und Identifikation der Entität, sondern es wird dadurch auch Interoperabilität zu anderen Systemen hergestellt, in denen die gleiche Normdatei benutzt wird, z. B. die Suche nach der Hauptstadt Berlin in verschiedenen Datenbanken bzw. Portalen. Die Entitätenverknüpfung (Named Entity Linking, NEL) hat zudem den Vorteil, dass die Normdateien oftmals Relationen zwischen Entitäten enthalten, sodass Dokumente, in denen Named Entities erkannt wurden, zusätzlich auch im Kontext einer größeren Netzwerkstruktur von Entitäten verortet und suchbar gemacht werden können

Rölke, H.; Weichselbraun, A.: Ontologien und Linked Open Data (2023) 0.01

0.0052794483 = product of:
  0.021117793 = sum of:
    0.021117793 = weight(_text_:data in 788) [ClassicSimilarity], result of:
      0.021117793 = score(doc=788,freq=2.0), product of:
        0.120893985 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03823278 = queryNorm
        0.17468026 = fieldWeight in 788, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0390625 = fieldNorm(doc=788)
  0.25 = coord(1/4)

Marcondes, C.H.: Towards a vocabulary to implement culturally relevant relationships between digital collections in heritage institutions (2020) 0.00

0.0032375087 = product of:
  0.012950035 = sum of:
    0.012950035 = product of:
      0.02590007 = sum of:
        0.02590007 = weight(_text_:22 in 5757) [ClassicSimilarity], result of:
          0.02590007 = score(doc=5757,freq=2.0), product of:
            0.13388468 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03823278 = queryNorm
            0.19345059 = fieldWeight in 5757, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5757)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 4. 3.2020 14:22:41

Search (13 results, page 1 of 1)

Authors

Languages

Types

Themes