Search (25 results, page 1 of 2)

  • year_i:[2020 TO 2030}
  • theme_ss:"Semantische Interoperabilität"
  1. Menzel, S.; Schnaitter, H.; Zinck, J.; Petras, V.; Neudecker, C.; Labusch, K.; Leitner, E.; Rehm, G.: Named Entity Linking mit Wikidata und GND : das Potenzial handkuratierter und strukturierter Datenquellen für die semantische Anreicherung von Volltexten (2021) 0.07
    0.07041071 = product of:
      0.14082143 = sum of:
        0.03775026 = weight(_text_:von in 373) [ClassicSimilarity], result of:
          0.03775026 = score(doc=373,freq=8.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.29476947 = fieldWeight in 373, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0390625 = fieldNorm(doc=373)
        0.103071176 = product of:
          0.15460676 = sum of:
            0.003525567 = weight(_text_:a in 373) [ClassicSimilarity], result of:
              0.003525567 = score(doc=373,freq=2.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.06369744 = fieldWeight in 373, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=373)
            0.15108119 = weight(_text_:z in 373) [ClassicSimilarity], result of:
              0.15108119 = score(doc=373,freq=8.0), product of:
                0.2562021 = queryWeight, product of:
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.04800207 = queryNorm
                0.58969533 = fieldWeight in 373, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=373)
          0.6666667 = coord(2/3)
      0.5 = coord(2/4)
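    The explain tree above follows the TF-IDF arithmetic of Lucene's ClassicSimilarity. As a minimal sketch, the `von` clause for doc 373 can be recomputed from the printed inputs. The function names are illustrative only, and `queryNorm` and `fieldNorm` are copied from the listing rather than derived.

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity: idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq: float) -> float:
    # ClassicSimilarity: tf = sqrt(termFreq)
    return math.sqrt(freq)

def term_score(freq: float, doc_freq: int, max_docs: int,
               query_norm: float, field_norm: float) -> float:
    i = idf(doc_freq, max_docs)
    query_weight = i * query_norm            # "queryWeight" line in the tree
    field_weight = tf(freq) * i * field_norm  # "fieldWeight" line in the tree
    return query_weight * field_weight

# Inputs copied from the `von` clause of doc 373 in the listing.
von_373 = term_score(freq=8.0, doc_freq=8340, max_docs=44218,
                     query_norm=0.04800207, field_norm=0.0390625)
print(von_373)
```

    The result should agree with the listed 0.03775026 up to single-precision rounding, since Lucene computes these values in 32-bit floats.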
    
    Abstract
    Named entities, such as persons, organizations, places, events, and works, are important content-bearing components of a document and are therefore essential for good subject indexing. Recognizing named entities, marking them up (annotation), and making them available for search are important instruments for improving applications such as content-based or semantic search in texts, cross-document contextualization, or automatic text summarization. However, the named entities recognized in a document are indexed precisely and sustainably only once they are linked to one or more sources (the basic principle of Linked Data, Berners-Lee 2006) that uniquely identify the entity and disambiguate it from identically named entities (compare, for example, Berlin as the capital of Germany with the composer Irving Berlin). To this end, the entity recognized in the document is linked to the entity entry of an authority file or another previously defined knowledge base (e.g., a gazetteer for geographic entities), usually via the persistent identifier of the respective knowledge base or authority file. Linking to an authority file not only disambiguates and identifies the entity, but also establishes interoperability with other systems that use the same authority file, e.g., when searching for the capital Berlin in different databases or portals. Entity linking (Named Entity Linking, NEL) has the further advantage that authority files often contain relations between entities, so that documents in which named entities have been recognized can additionally be situated, and made searchable, in the context of a larger network structure of entities.
    Type
    a
  2. Steeg, F.; Pohl, A.: ¬Ein Protokoll für den Datenabgleich im Web am Beispiel von OpenRefine und der Gemeinsamen Normdatei (GND) (2021) 0.06
    0.061621696 = product of:
      0.12324339 = sum of:
        0.032692686 = weight(_text_:von in 367) [ClassicSimilarity], result of:
          0.032692686 = score(doc=367,freq=6.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.25527787 = fieldWeight in 367, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0390625 = fieldNorm(doc=367)
        0.090550706 = product of:
          0.13582605 = sum of:
            0.0049859053 = weight(_text_:a in 367) [ClassicSimilarity], result of:
              0.0049859053 = score(doc=367,freq=4.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.090081796 = fieldWeight in 367, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=367)
            0.13084015 = weight(_text_:z in 367) [ClassicSimilarity], result of:
              0.13084015 = score(doc=367,freq=6.0), product of:
                0.2562021 = queryWeight, product of:
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.04800207 = queryNorm
                0.51069117 = fieldWeight in 367, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.337313 = idf(docFreq=577, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=367)
          0.6666667 = coord(2/3)
      0.5 = coord(2/4)
    
    Abstract
    Authority data play an important role, particularly with regard to the quality of subject indexing of bibliographic and archival resources. One concrete goal of subject indexing is, for example, that all works about Hermann Hesse can be found in a uniform way. Authority data offer a solution here, for example by consistently using the GND number 11855042X for Hermann Hesse during indexing. The result is higher-quality subject indexing, above all in terms of uniformity and unambiguity, and, as a consequence, better findability. If such entities are linked to one another, e.g., Hermann Hesse with one of his works, a knowledge graph emerges, such as the one Google uses for indexing the content of the Web (Singhal 2012). The development of the Google Knowledge Graph and the protocol presented here are historically connected: OpenRefine was originally developed as Google Refine, and the functionality for matching against external data sources (reconciliation) was originally developed to integrate Freebase, one of the data sources of the Google Knowledge Graph. Freebase was later integrated into Wikidata. Google Refine was already used for matching against authority data, such as the Library of Congress Subject Headings (Hooland et al. 2013).
    Type
    a
  3. Gabler, S.: Vergabe von DDC-Sachgruppen mittels eines Schlagwort-Thesaurus (2021) 0.05
    0.05286971 = product of:
      0.10573942 = sum of:
        0.06353335 = product of:
          0.19060004 = sum of:
            0.19060004 = weight(_text_:3a in 1000) [ClassicSimilarity], result of:
              0.19060004 = score(doc=1000,freq=2.0), product of:
                0.4069621 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.04800207 = queryNorm
                0.46834838 = fieldWeight in 1000, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1000)
          0.33333334 = coord(1/3)
        0.04220607 = weight(_text_:von in 1000) [ClassicSimilarity], result of:
          0.04220607 = score(doc=1000,freq=10.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.32956228 = fieldWeight in 1000, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1000)
      0.5 = coord(2/4)
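    The nested sums and `coord` factors in this tree can be reproduced in the same way. The sketch below assumes ClassicSimilarity's formulas and uses hypothetical helper names. It rebuilds the total for doc 1000: the `3a` clause is scaled by coord(1/3) inside its sub-query, added to the `von` clause, and the sum is scaled by coord(2/4).

```python
import math

QUERY_NORM = 0.04800207  # copied from the listing, not derived here
MAX_DOCS = 44218         # maxDocs as printed in the listing

def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    # queryWeight * fieldWeight, with idf = 1 + ln(maxDocs / (docFreq + 1))
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    return (idf * QUERY_NORM) * (math.sqrt(freq) * idf * field_norm)

def coord(matched: int, total: int) -> float:
    # Fraction of clauses in a boolean query that matched the document.
    return matched / total

score_3a = term_score(freq=2.0, doc_freq=24, field_norm=0.0390625)
score_von = term_score(freq=10.0, doc_freq=8340, field_norm=0.0390625)

inner = score_3a * coord(1, 3)                # sub-query: 1 of 3 clauses matched
doc_1000 = (inner + score_von) * coord(2, 4)  # top level: 2 of 4 clauses matched
print(doc_1000)
```

    The total should agree with the listed 0.05286971 up to single-precision rounding.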
    
    Abstract
    This work presents the construction of a thematically ordered thesaurus based on the subject headings of the Gemeinsame Normdatei (GND), using the DDC notations they contain. The top level of this thesaurus consists of the DDC subject groups (DDC-Sachgruppen) of the Deutsche Nationalbibliothek. The thesaurus is constructed in a rule-based manner, applying Linked Data principles in a SPARQL processor. It serves the automated extraction of metadata from scholarly publications by means of a computational-linguistic extractor, which processes digital full texts. The extractor determines the matching subject headings by comparing character strings against the terms in the thesaurus, orders the hits by relevance in the text, and returns the assigned subject groups in ranked order. The underlying assumption is that the subject group sought is returned among the top ranks. The performance of the method is validated in a three-stage procedure. First, a gold standard is created from documents available in the online catalog of the DNB, based on metadata and the findings of a brief autopsy. The documents are distributed across 14 of the subject groups, with a batch size of 50 documents each. All documents are then indexed with the extractor, and the categorization results are documented. Finally, the resulting retrieval performance is assessed both for hard (binary) categorization and for a ranked return of the subject groups.
    Content
    Master's thesis, Master of Science (Library and Information Studies) (MSc), Universität Wien. Advisor: Christoph Steiner. Cf.: https://www.researchgate.net/publication/371680244_Vergabe_von_DDC-Sachgruppen_mittels_eines_Schlagwort-Thesaurus. DOI: 10.25365/thesis.70030. Cf. also the presentation at: https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=web&cd=&ved=0CAIQw7AJahcKEwjwoZzzytz_AhUAAAAAHQAAAAAQAg&url=https%3A%2F%2Fwiki.dnb.de%2Fdownload%2Fattachments%2F252121510%2FDA3%2520Workshop-Gabler.pdf%3Fversion%3D1%26modificationDate%3D1671093170000%26api%3Dv2&psig=AOvVaw0szwENK1or3HevgvIDOfjx&ust=1687719410889597&opi=89978449.
  4. Balakrishnan, U.; Peters, S.; Voß, J.: Coli-conc : eine Infrastruktur zur Nutzung und Erstellung von Konkordanzen (2021) 0.03
    0.027247813 = product of:
      0.054495625 = sum of:
        0.052850362 = weight(_text_:von in 368) [ClassicSimilarity], result of:
          0.052850362 = score(doc=368,freq=8.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.41267726 = fieldWeight in 368, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0546875 = fieldNorm(doc=368)
        0.0016452647 = product of:
          0.004935794 = sum of:
            0.004935794 = weight(_text_:a in 368) [ClassicSimilarity], result of:
              0.004935794 = score(doc=368,freq=2.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.089176424 = fieldWeight in 368, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=368)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    coli-conc is a service of the Verbundzentrale des Gemeinsamen Bibliotheksverbundes (VZG). It provides web-based services for a more effective exchange of knowledge organization systems and for the efficient creation and maintenance of mappings. The focus is on the library classifications and authority files widely used in German-speaking countries, above all the major universal classifications such as the Dewey Decimal Classification (DDC), the Regensburger Verbundklassifikation (RVK), the Basisklassifikation (BK), and the subject groups of the Deutsche Nationalbibliografie (SDNB). This report describes the background, architecture, and functionality of coli-conc as well as the centerpiece of the infrastructure, the mapping tool Cocoda. It also addresses quality assurance measures and offers a glimpse of the new mapping procedure using the concept hub (Konzept-Hub).
    Type
    a
  5. Rölke, H.; Weichselbraun, A.: Ontologien und Linked Open Data (2023) 0.02
    0.019706113 = product of:
      0.039412227 = sum of:
        0.03775026 = weight(_text_:von in 788) [ClassicSimilarity], result of:
          0.03775026 = score(doc=788,freq=8.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.29476947 = fieldWeight in 788, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0390625 = fieldNorm(doc=788)
        0.0016619684 = product of:
          0.0049859053 = sum of:
            0.0049859053 = weight(_text_:a in 788) [ClassicSimilarity], result of:
              0.0049859053 = score(doc=788,freq=4.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.090081796 = fieldWeight in 788, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=788)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    The term ontology originally stems from metaphysics, a branch of philosophy concerned with understanding the fundamental structure and principles of reality. In that context, ontologies deal with the question of which things exist at the most fundamental level, how they can be structured, and how they relate to one another. In information science, by contrast, ontologies are used to formalize the vocabulary for describing domains of knowledge. The aim is for all actors working in these domains to use the same concepts and terminology, enabling smooth collaboration without misunderstandings. For example, the Dublin Core Metadata Initiative defined 15 core elements that can be used to describe electronic resources and media. Each element is described by a unique label (for example, identifier) and an associated definition that specifies the meaning of that label as precisely as possible. According to the Dublin Core ontology, for example, an identifier must uniquely identify a document with respect to an associated catalog. Depending on the catalog, an ISBN (catalog of books), ISSN (catalog of journals), URL (Web), DOI (publication database), etc. could therefore serve as the identifier.
    Type
    a
  6. Gabler, S.: Thesauri - a Toolbox for Information Retrieval (2023) 0.02
    0.016429678 = product of:
      0.032859355 = sum of:
        0.030200208 = weight(_text_:von in 114) [ClassicSimilarity], result of:
          0.030200208 = score(doc=114,freq=2.0), product of:
            0.12806706 = queryWeight, product of:
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.04800207 = queryNorm
            0.23581557 = fieldWeight in 114, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.6679487 = idf(docFreq=8340, maxDocs=44218)
              0.0625 = fieldNorm(doc=114)
        0.0026591495 = product of:
          0.007977448 = sum of:
            0.007977448 = weight(_text_:a in 114) [ClassicSimilarity], result of:
              0.007977448 = score(doc=114,freq=4.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.14413087 = fieldWeight in 114, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=114)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
    
    Abstract
    Thesauri are established instruments of subject indexing in libraries. With recent technological developments and the rise of artificial intelligence, they have gained importance because they can deliver explainable results for computer-assisted indexing and concordance work with other datasets and models, as well as for data validation. Building on the author's existing research for a master's thesis, the aspect of quality assurance in library catalogs is explored in greater depth using selected examples.
    Type
    a
  7. Candela, G.: ¬An automatic data quality approach to assess semantic data from cultural heritage institutions (2023) 0.01
    0.008750932 = product of:
      0.03500373 = sum of:
        0.03500373 = product of:
          0.052505594 = sum of:
            0.0069802674 = weight(_text_:a in 997) [ClassicSimilarity], result of:
              0.0069802674 = score(doc=997,freq=4.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.12611452 = fieldWeight in 997, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=997)
            0.045525327 = weight(_text_:22 in 997) [ClassicSimilarity], result of:
              0.045525327 = score(doc=997,freq=2.0), product of:
                0.16809508 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04800207 = queryNorm
                0.2708308 = fieldWeight in 997, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=997)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    In recent years, cultural heritage institutions have been exploring the benefits of applying Linked Open Data to their catalogs and digital materials. Innovative and creative methods have emerged to publish and reuse digital contents to promote computational access, such as the concepts of Labs and Collections as Data. Data quality has become a requirement for researchers and training methods based on artificial intelligence and machine learning. This article explores how the quality of Linked Open Data made available by cultural heritage institutions can be automatically assessed. The results obtained can be useful for other institutions who wish to publish and assess their collections.
    Date
    22. 6.2023 18:23:31
    Type
    a
  8. Marcondes, C.H.: Towards a vocabulary to implement culturally relevant relationships between digital collections in heritage institutions (2020) 0.01
    0.0073685125 = product of:
      0.02947405 = sum of:
        0.02947405 = product of:
          0.044211075 = sum of:
            0.011692984 = weight(_text_:a in 5757) [ClassicSimilarity], result of:
              0.011692984 = score(doc=5757,freq=22.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.21126054 = fieldWeight in 5757, product of:
                  4.690416 = tf(freq=22.0), with freq of:
                    22.0 = termFreq=22.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5757)
            0.032518093 = weight(_text_:22 in 5757) [ClassicSimilarity], result of:
              0.032518093 = score(doc=5757,freq=2.0), product of:
                0.16809508 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04800207 = queryNorm
                0.19345059 = fieldWeight in 5757, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5757)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    Cultural heritage institutions are publishing their digital collections over the web as LOD. This is a new step in the patrimonialization and curatorial processes developed by such institutions. Many of these collections are thematically superimposed and complementary. Frequently, objects in these collections present culturally relevant relationships, such as a book about a painting, or a draft or sketch of a famous painting, etc. LOD technology enables such heritage records to be interlinked, achieving interoperability and adding value to digital collections, thus empowering heritage institutions. An aim of this research is characterizing such culturally relevant relationships and organizing them in a vocabulary. Use cases or examples of relationships between objects suggested by curators or mentioned in literature and in conceptual models such as FRBR/LRM, CIDOC CRM and RiC-CM were collected and used as examples or inspiration of culturally relevant relationships. Relationships identified are collated and compared for identifying those with the same or similar meaning, synthesized and normalized. A set of thirty-three culturally relevant relationships is identified and formalized as a LOD property vocabulary to be used by digital curators to interlink digital collections. The results presented are provisional and a starting point to be discussed, tested, and enhanced.
    Date
    4. 3.2020 14:22:41
    Type
    a
  9. Kahlawi, A.: ¬An ontology driven ESCO LOD quality enhancement (2020) 0.00
    0.0010576702 = product of:
      0.004230681 = sum of:
        0.004230681 = product of:
          0.012692042 = sum of:
            0.012692042 = weight(_text_:a in 5959) [ClassicSimilarity], result of:
              0.012692042 = score(doc=5959,freq=18.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.22931081 = fieldWeight in 5959, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5959)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Abstract
    The labor market is a system that is complex and difficult to manage. To overcome this challenge, the European Union has launched the ESCO project which is a language that aims to describe this labor market. In order to support the spread of this project, its dataset was presented as linked open data (LOD). Since LOD is usable and reusable, a set of conditions have to be met. First, LOD must be feasible and high quality. In addition, it must provide the user with the right answers, and it has to be built according to a clear and correct structure. This study investigates the LOD of ESCO, focusing on data quality and data structure. The former is evaluated through applying a set of SPARQL queries. This provides solutions to improve its quality via a set of rules built in first order logic. This process was conducted based on a new proposed ESCO ontology.
    Type
    a
  10. Peponakis, M.; Mastora, A.; Kapidakis, S.; Doerr, M.: Expressiveness and machine processability of Knowledge Organization Systems (KOS) : an analysis of concepts and relations (2020) 0.00
    0.0010177437 = product of:
      0.004070975 = sum of:
        0.004070975 = product of:
          0.012212924 = sum of:
            0.012212924 = weight(_text_:a in 5787) [ClassicSimilarity], result of:
              0.012212924 = score(doc=5787,freq=24.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.22065444 = fieldWeight in 5787, product of:
                  4.8989797 = tf(freq=24.0), with freq of:
                    24.0 = termFreq=24.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5787)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Abstract
    This study considers the expressiveness (that is the expressive power or expressivity) of different types of Knowledge Organization Systems (KOS) and discusses its potential to be machine-processable in the context of the Semantic Web. For this purpose, the theoretical foundations of KOS are reviewed based on conceptualizations introduced by the Functional Requirements for Subject Authority Data (FRSAD) and the Simple Knowledge Organization System (SKOS); natural language processing techniques are also implemented. Applying a comparative analysis, the dataset comprises a thesaurus (Eurovoc), a subject headings system (LCSH) and a classification scheme (DDC). These are compared with an ontology (CIDOC-CRM) by focusing on how they define and handle concepts and relations. It was observed that LCSH and DDC focus on the formalism of character strings (nomens) rather than on the modelling of semantics; their definition of what constitutes a concept is quite fuzzy, and they comprise a large number of complex concepts. By contrast, thesauri have a coherent definition of what constitutes a concept, and apply a systematic approach to the modelling of relations. Ontologies explicitly define diverse types of relations, and are by their nature machine-processable. The paper concludes that the potential of both the expressiveness and machine processability of each KOS is extensively regulated by its structural rules. It is harder to represent subject headings and classification schemes as semantic networks with nodes and arcs, while thesauri are more suitable for such a representation. In addition, a paradigm shift is revealed which focuses on the modelling of relations between concepts, rather than the concepts themselves.
  11. Cheng, Y.-Y.; Xia, Y.: ¬A systematic review of methods for aligning, mapping, merging taxonomies in information sciences (2023) 0.00
    9.2906854E-4 = product of:
      0.0037162742 = sum of:
        0.0037162742 = product of:
          0.0111488225 = sum of:
            0.0111488225 = weight(_text_:a in 1029) [ClassicSimilarity], result of:
              0.0111488225 = score(doc=1029,freq=20.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.20142901 = fieldWeight in 1029, product of:
                  4.472136 = tf(freq=20.0), with freq of:
                    20.0 = termFreq=20.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1029)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Abstract
    The purpose of this study is to provide a systematic literature review on taxonomy alignment methods in information science to explore the common research pipeline and characteristics. Design/methodology/approach: The authors implement a five-step systematic literature review process relating to taxonomy alignment. They take on a knowledge organization system (KOS) perspective, and specifically examine the level of KOS on "taxonomies." Findings: They synthesize the matching dimensions of 28 taxonomy alignment studies in terms of the taxonomy input, approach and output. In the input dimension, they develop three characteristics: tree shapes, variable names and symmetry; for approach: methodology, unit of matching, comparison type and relation type; for output: the number of merged solutions and whether original taxonomies are preserved in the solutions. Research limitations/implications: The main research implications of this study are threefold: (1) to enhance the understanding of the characteristics of a taxonomy alignment work; (2) to provide a novel categorization of taxonomy alignment approaches into natural language processing approach, logic-based approach and heuristic-based approach; (3) to provide a methodological guideline on the must-include characteristics for future taxonomy alignment research. Originality/value: There is no existing comprehensive review on the alignment of "taxonomies". Further, no other mapping survey research has discussed the comparison from a KOS perspective. Using a KOS lens is critical in understanding the broader picture of what other similar systems of organizations are, and enables us to define taxonomies more precisely.
    Type
    a
  12. Ahmed, M.; Mukhopadhyay, M.; Mukhopadhyay, P.: Automated knowledge organization : AI ML based subject indexing system for libraries (2023) 0.00
    8.813918E-4 = product of:
      0.0035255672 = sum of:
        0.0035255672 = product of:
          0.010576702 = sum of:
            0.010576702 = weight(_text_:a in 977) [ClassicSimilarity], result of:
              0.010576702 = score(doc=977,freq=18.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.19109234 = fieldWeight in 977, product of:
                  4.2426405 = tf(freq=18.0), with freq of:
                    18.0 = termFreq=18.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=977)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Abstract
    The research study as reported here is an attempt to explore the possibilities of an AI/ML-based semi-automated indexing system in a library setup to handle large volumes of documents. It uses the Python virtual environment to install and configure an open source AI environment (named Annif) to feed the LOD (Linked Open Data) dataset of Library of Congress Subject Headings (LCSH) as a standard KOS (Knowledge Organisation System). The framework deployed the Turtle format of LCSH after cleaning the file with Skosify, applied an array of backend algorithms (namely TF-IDF, Omikuji, and NN-Ensemble) to measure relative performance, and selected Snowball as an analyser. The training of Annif was conducted with a large set of bibliographic records populated with subject descriptors (MARC tag 650$a) and indexed by trained LIS professionals. The training dataset is first treated with MarcEdit to export it in a format suitable for OpenRefine, and then in OpenRefine it undergoes many steps to produce a bibliographic record set suitable to train Annif. The framework, after training, has been tested with a bibliographic dataset to measure indexing efficiencies, and finally, the automated indexing framework is integrated with data wrangling software (OpenRefine) to produce suggested headings on a mass scale. The entire framework is based on open-source software, open datasets, and open standards.
    Type
    a
  13. Sartini, B.; Erp, M. van; Gangemi, A.: Marriage is a peach and a chalice : modelling cultural symbolism on the Semantic Web (2021) 0.00
    8.63584E-4 = product of:
      0.003454336 = sum of:
        0.003454336 = product of:
          0.010363008 = sum of:
            0.010363008 = weight(_text_:a in 557) [ClassicSimilarity], result of:
              0.010363008 = score(doc=557,freq=12.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.18723148 = fieldWeight in 557, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=557)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
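
The arithmetic in the explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity decomposition visible in the tree (tf as the square root of the term frequency, queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm). The function name `classic_term_score` is hypothetical; all numeric inputs are copied from the tree for doc 557.

```python
import math

def classic_term_score(freq, idf, query_norm, field_norm):
    """Recompute one weight(...) node of a ClassicSimilarity explain tree."""
    tf = math.sqrt(freq)                   # 3.4641016 for freq=12.0
    query_weight = idf * query_norm        # 0.055348642 in the tree
    field_weight = tf * idf * field_norm   # 0.18723148 in the tree
    return query_weight * field_weight

# Values taken from the explain tree for doc 557.
term_score = classic_term_score(
    freq=12.0, idf=1.153047, query_norm=0.04800207, field_norm=0.046875
)

# Two coordination factors scale the term score: coord(1/3), then coord(1/4).
final_score = term_score * (1 / 3) * (1 / 4)

print(term_score, final_score)
```

Up to rounding, `term_score` corresponds to the 0.010363008 node and `final_score` to the 8.63584E-4 shown at the top of the tree.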
    
    Abstract
    In this work, we fill the gap in the Semantic Web in the context of Cultural Symbolism. Building upon earlier work in \cite{sartini_towards_2021}, we introduce the Simulation Ontology, an ontology that models the background knowledge of symbolic meanings, developed by combining the concepts taken from the authoritative theory of Simulacra and Simulations of Jean Baudrillard with symbolic structures and content taken from "Symbolism: a Comprehensive Dictionary" by Steven Olderr. We re-engineered the symbolic knowledge already present in heterogeneous resources by converting it into our ontology schema to create HyperReal, the first knowledge graph completely dedicated to cultural symbolism. A first experiment run on the knowledge graph is presented to show the potential of quantitative research on symbolism.
    Type
    a
  14. Naun, C.C.: Expanding the use of Linked Data value vocabularies in PCC cataloging (2020) 0.00
    8.2263234E-4 = product of:
      0.0032905294 = sum of:
        0.0032905294 = product of:
          0.009871588 = sum of:
            0.009871588 = weight(_text_:a in 123) [ClassicSimilarity], result of:
              0.009871588 = score(doc=123,freq=8.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.17835285 = fieldWeight in 123, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=123)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Abstract
    In 2015, the PCC Task Group on URIs in MARC was charged with identifying and addressing the deployment of linked data identifiers in the current MARC format. Through a pilot test, a survey, MARC discussion papers, proposals, etc., the Task Group initiated and introduced changes to MARC encoding. The Task Group succeeded in laying the groundwork for the transition of library data from MARC to a linked data, RDF environment.
    Type
    a
  15. Rodrigues Barbosa, E.; Godoy Viera, A.F.: Relações semânticas e interoperabilidade em tesauros representados em SKOS : uma revisao sistematica da literatura (2022) 0.00
    7.883408E-4 = product of:
      0.0031533632 = sum of:
        0.0031533632 = product of:
          0.00946009 = sum of:
            0.00946009 = weight(_text_:a in 254) [ClassicSimilarity], result of:
              0.00946009 = score(doc=254,freq=10.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.1709182 = fieldWeight in 254, product of:
                  3.1622777 = tf(freq=10.0), with freq of:
                    10.0 = termFreq=10.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=254)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Abstract
    Objective: This study aims to understand how the Simple Knowledge Organization System data model and its extension models have been used to promote interoperability with other vocabularies and to refine semantic relations in thesauri on the web. Methodology: It uses documentary research on the reference guides of the data models used to represent thesauri on the web. Results: The data models have been used to represent terms and their linguistic variations, the relationships between groups and subgroups of concepts from an intra-vocabulary perspective, and the relationships between concepts of different vocabularies from an inter-vocabulary perspective. Conclusions: The use of the Simple Knowledge Organization System and its extension models contributes to a better structuring of concepts in thesauri. The extension models are suitable for representing compound equivalence relationships or for structuring groups and subgroups of concepts in thesauri.
    Type
    a
  16. Lee, S.: Pidgin metadata framework as a mediator for metadata interoperability (2021) 0.00
    7.196534E-4 = product of:
      0.0028786135 = sum of:
        0.0028786135 = product of:
          0.00863584 = sum of:
            0.00863584 = weight(_text_:a in 654) [ClassicSimilarity], result of:
              0.00863584 = score(doc=654,freq=12.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.15602624 = fieldWeight in 654, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=654)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Abstract
    A pidgin metadata framework based on the concept of pidgin metadata is proposed to complement the limitations of existing approaches to metadata interoperability and to achieve more reliable metadata interoperability. The framework consists of three layers, with a hierarchical structure, and reflects the semantic and structural characteristics of various metadata. Layer 1 performs both an external function, serving as an anchor for semantic association between metadata elements, and an internal function, providing semantic categories that can encompass detailed elements. Layer 2 is an arbitrary layer composed of substantial elements from existing metadata and performs a function in which different metadata elements describing the same or similar aspects of information resources are associated with the semantic categories of Layer 1. Layer 3 implements the semantic relationships between Layer 1 and Layer 2 through the Resource Description Framework syntax. With this structure, the pidgin metadata framework can establish the criteria for semantic connection between different elements and fully reflect the complexity and heterogeneity among various metadata. Additionally, it is expected to provide a bibliographic environment that can achieve more reliable metadata interoperability than existing approaches by securing the communication between metadata.
    Type
    a
  17. Rocha Souza, R.; Lemos, D.: a comparative analysis : Knowledge organization systems for the representation of multimedia resources on the Web (2020) 0.00
    7.051135E-4 = product of:
      0.002820454 = sum of:
        0.002820454 = product of:
          0.008461362 = sum of:
            0.008461362 = weight(_text_:a in 5993) [ClassicSimilarity], result of:
              0.008461362 = score(doc=5993,freq=8.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.15287387 = fieldWeight in 5993, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5993)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Abstract
    The lack of standardization in the production, organization and dissemination of information in documentation centers and similar institutions, resulting from the digitization of collections and their availability on the internet, has called for integration efforts. The sheer availability of multimedia content has fostered the development of many distinct and, most of the time, independent metadata standards for its description. This study presents and compares the existing metadata standards, vocabularies and ontologies for multimedia annotation and offers a synthetic overview of their main strengths and weaknesses, aiding efforts for semantic integration and enhancing the findability of available multimedia resources on the web. We also aim to reveal the characteristics that could and should be highlighted in the characterization of multimedia resources but perhaps are not.
    Type
    a
  18. Folsom, S.M.: Using the Program for Cooperative Cataloging's past and present to project a Linked Data future (2020) 0.00
    6.6478737E-4 = product of:
      0.0026591495 = sum of:
        0.0026591495 = product of:
          0.007977448 = sum of:
            0.007977448 = weight(_text_:a in 5747) [ClassicSimilarity], result of:
              0.007977448 = score(doc=5747,freq=4.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.14413087 = fieldWeight in 5747, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.0625 = fieldNorm(doc=5747)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Type
    a
  19. Smith, A.: Simple Knowledge Organization System (SKOS) (2022) 0.00
    6.106462E-4 = product of:
      0.0024425848 = sum of:
        0.0024425848 = product of:
          0.007327754 = sum of:
            0.007327754 = weight(_text_:a in 1094) [ClassicSimilarity], result of:
              0.007327754 = score(doc=1094,freq=6.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.13239266 = fieldWeight in 1094, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1094)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Abstract
    SKOS (Simple Knowledge Organization System) is a recommendation from the World Wide Web Consortium (W3C) for representing controlled vocabularies, taxonomies, thesauri, classifications, and similar systems for organizing and indexing information as linked data elements in the Semantic Web, using the Resource Description Framework (RDF). The SKOS data model is centered on "concepts", which can have preferred and alternate labels in any language as well as other metadata, and which are identified by addresses on the World Wide Web (URIs). Concepts are grouped into hierarchies through "broader" and "narrower" relations, with "top concepts" at the broadest conceptual level. Concepts are also organized into "concept schemes", also identified by URIs. Other relations, mappings, and groupings are also supported. This article discusses the history of the development of SKOS and provides notes on adoption, uses, and limitations.
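
The data model described in this abstract can be illustrated with a small in-memory sketch. This is not the SKOS vocabulary itself or any library's API: the class names `Concept` and `ConceptScheme`, their fields, and the example.org URIs are hypothetical, and top concepts are derived here from the absence of a broader concept, whereas SKOS declares them explicitly.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    uri: str                                         # concepts are identified by URIs
    pref_label: dict                                 # language tag -> preferred label
    alt_labels: dict = field(default_factory=dict)   # language tag -> list of alternate labels
    broader: list = field(default_factory=list)      # URIs of broader concepts

@dataclass
class ConceptScheme:
    uri: str
    concepts: dict = field(default_factory=dict)     # URI -> Concept

    def add(self, concept):
        self.concepts[concept.uri] = concept

    def top_concepts(self):
        # Simplification: treat concepts without a broader concept as top concepts.
        return [c for c in self.concepts.values() if not c.broader]

    def narrower(self, uri):
        # "narrower" is the inverse of "broader".
        return [c for c in self.concepts.values() if uri in c.broader]

scheme = ConceptScheme("http://example.org/scheme")
animals = Concept("http://example.org/animals", {"en": "Animals", "de": "Tiere"})
cats = Concept(
    "http://example.org/cats",
    {"en": "Cats"},
    alt_labels={"en": ["Felines"]},
    broader=[animals.uri],
)
scheme.add(animals)
scheme.add(cats)

print([c.uri for c in scheme.top_concepts()])
print([c.uri for c in scheme.narrower(animals.uri)])
```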
    Type
    a
  20. Balakrishnan, U.; Soergel, D.; Helfer, O.: Representing concepts through description logic expressions for knowledge organization system (KOS) mapping (2020) 0.00
    5.875945E-4 = product of:
      0.002350378 = sum of:
        0.002350378 = product of:
          0.007051134 = sum of:
            0.007051134 = weight(_text_:a in 144) [ClassicSimilarity], result of:
              0.007051134 = score(doc=144,freq=2.0), product of:
                0.055348642 = queryWeight, product of:
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.04800207 = queryNorm
                0.12739488 = fieldWeight in 144, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  1.153047 = idf(docFreq=37942, maxDocs=44218)
                  0.078125 = fieldNorm(doc=144)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Type
    a