Search (25 results, page 1 of 2)

Panskus, E.J.: Metadaten zur Identifizierung von Falschmeldungen im digitalen Raum : eine praktische Annäherung (2019) 0.02

```
0.016880836 = product of:
  0.09565807 = sum of:
    0.026553465 = weight(_text_:und in 5452) [ClassicSimilarity], result of:
      0.026553465 = score(doc=5452,freq=12.0), product of:
        0.055336144 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.024967048 = queryNorm
        0.47985753 = fieldWeight in 5452, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.0625 = fieldNorm(doc=5452)
    0.06333006 = weight(_text_:informationswissenschaft in 5452) [ClassicSimilarity], result of:
      0.06333006 = score(doc=5452,freq=4.0), product of:
        0.11246919 = queryWeight, product of:
          4.504705 = idf(docFreq=1328, maxDocs=44218)
          0.024967048 = queryNorm
        0.5630881 = fieldWeight in 5452, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.504705 = idf(docFreq=1328, maxDocs=44218)
          0.0625 = fieldNorm(doc=5452)
    0.0057745427 = weight(_text_:in in 5452) [ClassicSimilarity], result of:
      0.0057745427 = score(doc=5452,freq=4.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.17003182 = fieldWeight in 5452, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0625 = fieldNorm(doc=5452)
  0.1764706 = coord(3/17)
```
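
The explain tree above can be reproduced by hand. The following is a minimal sketch, assuming the classic Lucene TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))), which are consistent with the numbers shown. The function and variable names are illustrative only and are not part of any Lucene API.

```python
import math

MAX_DOCS = 44218          # maxDocs, copied from the explain output
QUERY_NORM = 0.024967048  # queryNorm, copied from the explain output
TOTAL_CLAUSES = 17        # denominator of coord(3/17)


def idf(doc_freq: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(MAX_DOCS / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, field_norm: float) -> float:
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM
    field_weight = math.sqrt(freq) * term_idf * field_norm
    return query_weight * field_weight


# (term, freq, docFreq) for doc 5452, copied from the tree; fieldNorm is 0.0625.
matches_5452 = [
    ("und", 12.0, 13101),
    ("informationswissenschaft", 4.0, 1328),
    ("in", 4.0, 30841),
]
term_scores = {t: term_score(f, df, 0.0625) for t, f, df in matches_5452}

# coord rewards documents that match more of the query clauses.
coord = len(matches_5452) / TOTAL_CLAUSES
score_5452 = coord * sum(term_scores.values())

for t, s in term_scores.items():
    print(f"{t:26s} {s:.9f}")
print(f"score(doc=5452) = {score_5452:.9f}")
```

The total should agree with the 0.016880836 in the listing up to floating-point rounding, since the listed values are single-precision.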

Abstract: In many countries, populist and racist forces are gaining strength. With Poland and Hungary, even member states of the European Union are weakening institutions of the rule of law.[1] Turkey is turning ever further away from the EU and drifting to the brink of dictatorship. In Austria, a right-wing populist was only narrowly prevented from becoming Federal President. All of these events take or took place in part because of discontent with and distrust of state and established institutions such as traditional media, governments, and business.
Series: Zukunft der Informationswissenschaft / Hat die Informationswissenschaft eine Zukunft?

Suominen, O.; Hyvönen, N.: From MARC silos to Linked Data silos? (2017) 0.01

```
0.009553809 = product of:
  0.05413825 = sum of:
    0.008130305 = weight(_text_:und in 3732) [ClassicSimilarity], result of:
      0.008130305 = score(doc=3732,freq=2.0), product of:
        0.055336144 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.024967048 = queryNorm
        0.14692576 = fieldWeight in 3732, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.046875 = fieldNorm(doc=3732)
    0.005304256 = weight(_text_:in in 3732) [ClassicSimilarity], result of:
      0.005304256 = score(doc=3732,freq=6.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.1561842 = fieldWeight in 3732, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=3732)
    0.04070369 = weight(_text_:bibliotheken in 3732) [ClassicSimilarity], result of:
      0.04070369 = score(doc=3732,freq=6.0), product of:
        0.09407886 = queryWeight, product of:
          3.768121 = idf(docFreq=2775, maxDocs=44218)
          0.024967048 = queryNorm
        0.43265504 = fieldWeight in 3732, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.768121 = idf(docFreq=2775, maxDocs=44218)
          0.046875 = fieldNorm(doc=3732)
  0.1764706 = coord(3/17)
```

Abstract: For some time now, libraries have increasingly been making their bibliographic metadata openly available as Linked Data. However, they use quite different models for structuring the bibliographic data. Some libraries use a FRBR-based model with several layers of entities, while others use flat, record-oriented models. This proliferation of data models makes the bibliographic data harder to reuse. As a result, libraries have merely swapped the former MARC silos for mutually incompatible Linked Data silos. It is therefore often difficult to combine and reuse datasets. Minor differences in data modelling can be handled with schema mappings, but it seems doubtful whether interoperability has increased overall. The article presents the results of a study of various published sets of bibliographic data. It also examines the different models for representing bibliographic data as RDF, as well as tools for generating such data from the MARC format. Finally, it discusses the approach taken by the National Library of Finland.

Bohne-Lang, A.: Semantische Metadaten für den Webauftritt einer Bibliothek (2016) 0.01
```
0.008498429 = product of:
  0.048157766 = sum of:
    0.02347017 = weight(_text_:und in 3337) [ClassicSimilarity], result of:
      0.02347017 = score(doc=3337,freq=24.0), product of:
        0.055336144 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.024967048 = queryNorm
        0.42413816 = fieldWeight in 3337, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3337)
    0.005104023 = weight(_text_:in in 3337) [ClassicSimilarity], result of:
      0.005104023 = score(doc=3337,freq=8.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.15028831 = fieldWeight in 3337, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3337)
    0.019583572 = weight(_text_:bibliotheken in 3337) [ClassicSimilarity], result of:
      0.019583572 = score(doc=3337,freq=2.0), product of:
        0.09407886 = queryWeight, product of:
          3.768121 = idf(docFreq=2775, maxDocs=44218)
          0.024967048 = queryNorm
        0.20816123 = fieldWeight in 3337, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.768121 = idf(docFreq=2775, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3337)
  0.1764706 = coord(3/17)
```
Abstract

The Semantic Web has attracted much attention for more than 10 years and, with the availability of the Resource Description Framework (RDF) and the corresponding ontologies, has made a great leap into practice. In everyday work, however, representatives of small libraries and librarians with little affinity for technology face major hurdles, for example the question of how to integrate this technology into their own website in concrete terms: one feels like Don Quixote trying to defeat the windmills. For non-computer-scientists, RDF with its ontologies is almost incomprehensibly complex and therefore not directly usable for broad practical deployment on library websites. Schema.org was originally developed by the three largest search engines in the world, Google, Bing and Yahoo, as a simple and effective semantic description of entities. Schema.org is currently sponsored by Google, Microsoft, Yahoo and Yandex and is understood by many other search engines. Against this background, the library of the Medical Faculty Mannheim has embedded various machine-readable semantic metadata in its homepage (http://www.umm.uni-heidelberg.de/bibl/). A very interesting and forward-looking feature is the latest development of Schema.org, which makes it possible to model a 'Library' (https://schema.org/Library) with opening hours and much more. We have also embedded semantic metadata in Open Graph and Dublin Core format in order to provide older standards and Facebook-compliant information in machine-readable form.

Content

Contribution to the AGMB annual conference in Göttingen, 2016.

Söhler, M.: Schluss mit Schema F (2011) 0.00
```
0.0026893357 = product of:
  0.022859354 = sum of:
    0.018776136 = weight(_text_:und in 4439) [ClassicSimilarity], result of:
      0.018776136 = score(doc=4439,freq=24.0), product of:
        0.055336144 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.024967048 = queryNorm
        0.33931053 = fieldWeight in 4439, product of:
          4.8989797 = tf(freq=24.0), with freq of:
            24.0 = termFreq=24.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.03125 = fieldNorm(doc=4439)
    0.004083218 = weight(_text_:in in 4439) [ClassicSimilarity], result of:
      0.004083218 = score(doc=4439,freq=8.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.120230645 = fieldWeight in 4439, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.03125 = fieldNorm(doc=4439)
  0.11764706 = coord(2/17)
```
Abstract

With Schema.org and the Semantic Web, search engines are supposed to learn to understand.

Content

"Words often have several meanings. Some know a "Kanal" as an artificial waterway (canal), others from television (channel). A "Waage" can be useful for measuring weight (scales) or for orientation on the horoscope page (Libra). Casablanca is both a city and a film. Where people learn over time to distinguish and process meanings, search engines cannot do this by themselves. They always dully list, one after another, everything they find on a topic. To change that, Google, Yahoo and the Microsoft-owned search engine Bing have now joined forces to give web search more understanding. This is also referred to as "semantic search". The result is called Schema.org. Anyone who visits the website, clicks a little way into its substructures, and has no prior knowledge of programming or of the Semantic Web will turn away again overwhelmed and bored. Yet what could emerge here has the potential to change parts of the web, and especially the functions of search engines, in the medium or long term. "Big players are in the process of agreeing on standards," says Daniel Bahls, specialist for semantic technologies at the ZBW Leibniz-Informationszentrum Wirtschaft in Hamburg. "Semantic technologies have been around for years and have so far been used only in smaller contexts." For Schema.org invites developers, researchers, the Semantic Web community and ultimately all website operators to take part in reshaping web search. Website content is to be marked up and prepared for the crawlers, the search engines' analysis programs, using a special but uniform vocabulary.
By embedding keywords, so-called tags, in the part of a website's code that is not visible to ordinary users, search engines no longer depend so heavily on natural language analysis to capture the content of texts. The blog ZBW Mediatalk calls this "Semantic Web light", a semantic web at the lowest level. But even that will "already achieve a lot", says Bahls. "The Semantic Web will continue to evolve over the coming decades." There will never be a "completion", "since a uniform formalisation of concepts at a fine-grained level is hardly possible". The results from Schema.org would be integrated into the search engine "promptly", as there is "no timetable", according to Stefan Keuchel, press spokesman for Google Germany. Until then, Daniel Bahls's pointer to the already existing semantic search engine Sig.ma helps. Speed and number of results after a query play no role here. Sig.ma gathers its information solely from the Semantic Web and, after a query, lists everything it knows in structured form.

Söhler, M.: "Dumm wie Google" war gestern : semantische Suche im Netz (2011) 0.00
```
0.0025576479 = product of:
  0.021740006 = sum of:
    0.017745476 = weight(_text_:und in 4440) [ClassicSimilarity], result of:
      0.017745476 = score(doc=4440,freq=28.0), product of:
        0.055336144 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.024967048 = queryNorm
        0.3206851 = fieldWeight in 4440, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.02734375 = fieldNorm(doc=4440)
    0.00399453 = weight(_text_:in in 4440) [ClassicSimilarity], result of:
      0.00399453 = score(doc=4440,freq=10.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.11761922 = fieldWeight in 4440, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.02734375 = fieldNorm(doc=4440)
  0.11764706 = coord(2/17)
```
Abstract

"Casablanca" returns millions of results in a Google search. Is the city meant, or the film? Search engines are dumb and fast. Schema.org wants to change that.

Content

"Understanding 6,500 individual languages in such a way that even the dumbest machines can not only capture but also process them in all their sentences, words and meanings: that is a complex undertaking at which large parts of the internet, including almost all search engines, have so far failed. For anyone who finds even the sentence just read too complex, here it is put more simply: first of all, it is about "Teekesselchen" (a German guessing game based on homonyms). Words often have several meanings. Some know a "Kanal" as an artificial waterway (canal), others know it from zapping on the television set (channel). A "Waage" can be useful for measuring weight (scales) or for orientation on the horoscope page of a newspaper (Libra). Casablanca is both a city and a film. Where people learn over time to tell the difference, search engines do not learn this by themselves. After a query, they dully list, one after another, everything they can find on the topic. "Dumb as Google", one could say, "daft as Yahoo" or "stupid as Bing". To change that, Google, Yahoo and the Microsoft-owned search engine Bing have now joined forces to give web search more understanding. This is also referred to as "semantic search". The result is called Schema.org. Anyone who visits the website, clicks a little way into its substructures, and has no prior knowledge of programming or of the Semantic Web will turn away again overwhelmed and bored.
- New standards: Yet what could emerge here has the potential to change parts of the web, and especially the functions of search engines, in the medium or long term. "Big players are in the process of agreeing on standards," says Daniel Bahls, specialist for semantic technologies at the ZBW Leibniz-Informationszentrum Wirtschaft in Hamburg. "Semantic technologies have been around for years and have so far been used only in smaller contexts." For Schema.org invites developers, researchers, the Semantic Web community and ultimately all website operators to take part in reshaping web search. "With this, Google, Bing and Yahoo! want to put an end to the information chaos on the WWW," writes André Vatter in the blog ZBW Mediatalk. Website content is to be marked up and prepared for the search engines' crawlers using a special but uniform vocabulary. By embedding keywords, so-called tags, in the code of websites, search engines no longer depend so heavily on natural language analysis to capture the content of texts. The blog calls this "Semantic Web light", a semantic web at the lowest level. But even that will "already achieve a lot", says Bahls. "The Semantic Web will continue to evolve over the coming decades." There will never be a "completion", "since a uniform formalisation of concepts at a fine-grained level is hardly possible."
- "Common format for structured data": But why should Google, Yahoo and Bing suddenly cooperate, when competition has so far shaped their relationship? Stefan Keuchel, press spokesman for Google Germany, stresses that all the companies involved want to "send a clear signal to improve the quality of search". They are developing "a common format for structured data that enables things that are not yet possible today, keyword: semantic search". The results from Schema.org would be integrated into the search engine "promptly", as there is "no timetable". "Only by agreeing on a common language can search engines generate added value from semantic technologies," replies Daniel Bahls when asked about cooperation and competition among the search engines. He also points out that the semantic search engine Sig.ma already exists. Speed and number of results after a query play no role here. Sig.ma gathers its information solely from the Semantic Web and, after a query, lists everything it knows in structured form."

Roy, W.; Gray, C.: Preparing existing metadata for repository batch import : a recipe for a fickle food (2018) 0.00
```
0.0016662586 = product of:
  0.014163198 = sum of:
    0.005706471 = weight(_text_:in in 4550) [ClassicSimilarity], result of:
      0.005706471 = score(doc=4550,freq=10.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.16802745 = fieldWeight in 4550, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=4550)
    0.008456727 = product of:
      0.016913453 = sum of:
        0.016913453 = weight(_text_:22 in 4550) [ClassicSimilarity], result of:
          0.016913453 = score(doc=4550,freq=2.0), product of:
            0.08743035 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.024967048 = queryNorm
            0.19345059 = fieldWeight in 4550, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4550)
      0.5 = coord(1/2)
  0.11764706 = coord(2/17)
```
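
In this record the term "22" sits inside a nested clause with its own coord(1/2), so the tree has two levels of coordination. The sketch below evaluates that nesting under the same assumed classic TF-IDF formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))). The helper names `term` and `boolean` are hypothetical and only mirror the shape of the explain output.

```python
import math

MAX_DOCS = 44218          # maxDocs, copied from the explain output
QUERY_NORM = 0.024967048  # queryNorm, copied from the explain output


def term(freq: float, doc_freq: int, field_norm: float) -> float:
    """Leaf node: queryWeight * fieldWeight for one matching term."""
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    return (idf * QUERY_NORM) * (math.sqrt(freq) * idf * field_norm)


def boolean(clause_scores: list[float], total_clauses: int) -> float:
    """Inner node: sum of matching clauses times coord(matched/total)."""
    matched = len(clause_scores)
    return sum(clause_scores) * matched / total_clauses


# Nested clause: one of two sub-clauses matched (the term "22").
inner = boolean([term(2.0, 3622, 0.0390625)], total_clauses=2)

# Top level: the term "in" plus the nested clause, two of 17 clauses matched.
score_4550 = boolean([term(10.0, 30841, 0.0390625), inner], total_clauses=17)

print(f"nested clause   = {inner:.9f}")
print(f"score(doc=4550) = {score_4550:.10f}")
```

Because the nested clause is scaled by its own coord before it enters the outer sum, a match on "22" contributes only half of its raw term score to the final ranking.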
Abstract

In 2016, the University of Waterloo began offering a mediated copyright review and deposit service to support the growth of our institutional repository UWSpace. This resulted in the need to batch import large lists of published works into the institutional repository quickly and accurately. A range of methods have been proposed for harvesting publications metadata en masse, but many technological solutions can easily become detached from a workflow that is both reproducible for support staff and applicable to a range of situations. Many repositories offer the capacity for batch upload via CSV, so our method provides a template Python script that leverages the Habanero library for populating CSV files with existing metadata retrieved from the CrossRef API. In our case, we have combined this with useful metadata contained in a TSV file downloaded from Web of Science in order to enrich our metadata as well. The appeal of this 'low-maintenance' method is that it provides more robust options for gathering metadata semi-automatically, and only requires the user's ability to access Web of Science and the Python program, while still remaining flexible enough for local customizations.

Date

10.11.2018 16:27:22

Strobel, S.: Englischsprachige Erweiterung des TIB / AV-Portals : Ein GND/DBpedia-Mapping zur Gewinnung eines englischen Begriffssystems (2014) 0.00
```
0.0012216875 = product of:
  0.010384344 = sum of:
    0.0067752544 = weight(_text_:und in 2876) [ClassicSimilarity], result of:
      0.0067752544 = score(doc=2876,freq=2.0), product of:
        0.055336144 = queryWeight, product of:
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.024967048 = queryNorm
        0.12243814 = fieldWeight in 2876, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.216367 = idf(docFreq=13101, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2876)
    0.003609089 = weight(_text_:in in 2876) [ClassicSimilarity], result of:
      0.003609089 = score(doc=2876,freq=4.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.10626988 = fieldWeight in 2876, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=2876)
  0.11764706 = coord(2/17)
```
Abstract

The videos in the TIB / AV-Portal are automatically indexed with a total of 63,356 GND subject terms from science and technology. In addition to German-language videos, the TIB / AV-Portal also holds numerous English-language videos. The GND contains only very few English labels for the subject terms used in the TIB / AV-Portal knowledge base. An English indexing vocabulary with which the English-language videos could be indexed automatically is therefore lacking. The solution to this problem is as follows: the English labels are to be obtained by mapping the GND subject terms to other datasets that contain an English translation of the terms. The mapping strategies used draw on DBpedia, LCSH, MACS results, and the WTI thesaurus. In the end, 35,025 GND subject terms were assigned (at least) one English label. These English labels can be used directly for the automatic indexing of the English-language videos. A further 11,694 GND subject terms could not be "translated" into English, but could at least be associated with a broader term that has an English translation. This association serves to expand the search results.

Content

Contribution developed from a presentation given at the 103rd Deutscher Bibliothekartag in Bremen. See: https://www.o-bib.de/article/view/2014H1S197-204.

Husevag, A.-S.R.: Named entities in indexing : a case study of TV subtitles and metadata records (2016) 0.00
```
4.7471584E-4 = product of:
  0.008070169 = sum of:
    0.008070169 = weight(_text_:in in 3105) [ClassicSimilarity], result of:
      0.008070169 = score(doc=3105,freq=20.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.2376267 = fieldWeight in 3105, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3105)
  0.05882353 = coord(1/17)
```
Abstract

This paper explores the possible role of named entities in an automatic indexing process, based on text in subtitles. This is done by analyzing entity types, name density and name frequencies in subtitles and metadata records from different TV programs. The name density in metadata records is much higher than the name density in subtitles, and named entities with high frequencies in the subtitles are more likely to be mentioned in the metadata records. Personal names, geographical names and names of organizations were the most prominent entity types in both the news subtitles and news metadata, while persons, works and locations are the most prominent in culture programs.

Ruhl, M.: Do we need metadata? : an on-line survey in German archives (2012) 0.00
```
4.4125595E-4 = product of:
  0.007501351 = sum of:
    0.007501351 = weight(_text_:in in 471) [ClassicSimilarity], result of:
      0.007501351 = score(doc=471,freq=12.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.22087781 = fieldWeight in 471, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=471)
  0.05882353 = coord(1/17)
```
Abstract

The paper summarizes the results of an on-line survey conducted in 2010 in German archives of all branches. The survey focused on metadata and on the metadata standards used for annotating audiovisual media such as pictures, audio and video files (analog and digital). The findings raise the question of whether archives are able to collaborate in projects like Europeana if they do not use accepted standards for their orientation. Archives need more resources, and archival staff need more training, to carry out more complex tasks in a digital and semantic environment.

Source

Proceedings of the 2nd International Workshop on Semantic Digital Archives held in conjunction with the 16th Int. Conference on Theory and Practice of Digital Libraries (TPDL) on September 27, 2012 in Paphos, Cyprus [http://ceur-ws.org/Vol-912/proceedings.pdf]. Eds.: A. Mitschik et al

Bartczak, J.; Glendon, I.: Python, Google Sheets, and the Thesaurus for Graphic Materials for efficient metadata project workflows (2017) 0.00
```
4.4125595E-4 = product of:
  0.007501351 = sum of:
    0.007501351 = weight(_text_:in in 3893) [ClassicSimilarity], result of:
      0.007501351 = score(doc=3893,freq=12.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.22087781 = fieldWeight in 3893, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=3893)
  0.05882353 = coord(1/17)
```
Abstract

In 2017, the University of Virginia (U.Va.) will launch a two-year initiative to celebrate the bicentennial anniversary of the University's founding in 1819. The U.Va. Library is participating in this event by digitizing some 20,000 photographs and negatives that document student life on the U.Va. grounds in the 1960s and 1970s. Metadata librarians and archivists are well-versed in the challenges associated with generating digital content and accompanying description within the context of limited resources. This paper describes how technology and new approaches to metadata design have enabled the University of Virginia's Metadata Analysis and Design Department to rapidly and successfully generate accurate description for these digital objects. Python's pandas module improves efficiency by cleaning and repurposing data recorded at digitization, while the lxml module builds MODS XML programmatically from CSV tables. A simplified technique for subject heading selection and assignment in Google Sheets provides a collaborative environment for streamlined metadata creation and data quality control.

Hardesty, J.L.; Young, J.B.: The semantics of metadata : Avalon Media System and the move to RDF (2017) 0.00
```
4.0280976E-4 = product of:
  0.0068477658 = sum of:
    0.0068477658 = weight(_text_:in in 3896) [ClassicSimilarity], result of:
      0.0068477658 = score(doc=3896,freq=10.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.20163295 = fieldWeight in 3896, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=3896)
  0.05882353 = coord(1/17)
```
Abstract

The Avalon Media System (Avalon) provides access and management for digital audio and video collections in libraries and archives. The open source project is led by the libraries of Indiana University Bloomington and Northwestern University and is funded in part by grants from The Andrew W. Mellon Foundation and Institute of Museum and Library Services. Avalon is based on the Samvera Community (formerly Hydra Project) software stack and uses Fedora as the digital repository back end. The Avalon project team is in the process of migrating digital repositories from Fedora 3 to Fedora 4 and incorporating metadata statements using the Resource Description Framework (RDF) instead of XML files accompanying the digital objects in the repository. The Avalon team has worked on the migration path for technical metadata and is now working on the migration paths for structural metadata (PCDM) and descriptive metadata (from MODS XML to RDF). This paper covers the decisions made to begin using RDF for software development and offers a window into how Semantic Web technology functions in the real world.

DC-2013: International Conference on Dublin Core and Metadata Applications : Online Proceedings (2013) 0.00
```
3.7977265E-4 = product of:
  0.0064561353 = sum of:
    0.0064561353 = weight(_text_:in in 1076) [ClassicSimilarity], result of:
      0.0064561353 = score(doc=1076,freq=20.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.19010136 = fieldWeight in 1076, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.03125 = fieldNorm(doc=1076)
  0.05882353 = coord(1/17)
```
Abstract

The collocated conferences for DC-2013 and iPRES-2013 in Lisbon attracted 392 participants from over 37 countries. In addition to the Tuesday through Thursday conference days comprised of peer-reviewed paper and special sessions, 223 participants attended pre-conference tutorials and 246 participated in post-conference workshops for the collocated events. The peer-reviewed papers and presentations are available on the conference website Presentation page (URLs above). In sum, it was a great conference. In addition to links to PDFs of papers, project reports and posters (and their associated presentations), the published proceedings include presentation PDFs for the following: KEYNOTES -- Darling, we need to talk - Gildas Illien TUTORIALS -- Ivan Herman: "Introduction to Linked Open Data (LOD)" -- Steven Miller: "Introduction to Ontology Concepts and Terminology" -- Kai Eckert: "Metadata Provenance" -- Daniel Garijo: "The W3C Provenance Ontology" SPECIAL SESSIONS -- "Application Profiles as an Alternative to OWL Ontologies" -- "Long-term Preservation and Governance of RDF Vocabularies (W3C Sponsored)" -- "Data Enrichment and Transformation in the LOD Context: Poor & Popular vs Rich & Lonely--Can't we achieve both?" -- "Why Schema.org?"

Content

FULL PAPERS

- Provenance and Annotations for Linked Data - Kai Eckert
- How Portable Are the Metadata Standards for Scientific Data? A Proposal for a Metadata Infrastructure - Jian Qin, Kai Li
- Lessons Learned in Implementing the Extended Date/Time Format in a Large Digital Library - Hannah Tarver, Mark Phillips
- Towards the Representation of Chinese Traditional Music: A State of the Art Review of Music Metadata Standards - Mi Tian, György Fazekas, Dawn Black, Mark Sandler
- Maps and Gaps: Strategies for Vocabulary Design and Development - Diane Ileana Hillmann, Gordon Dunsire, Jon Phipps
- A Method for the Development of Dublin Core Application Profiles (Me4DCAP V0.1): A Description - Mariana Curado Malta, Ana Alice Baptista
- Find and Combine Vocabularies to Design Metadata Application Profiles using Schema Registries and LOD Resources - Tsunagu Honma, Mitsuharu Nagamori, Shigeo Sugimoto
- Achieving Interoperability between the CARARE Schema for Monuments and Sites and the Europeana Data Model - Antoine Isaac, Valentine Charles, Kate Fernie, Costis Dallas, Dimitris Gavrilis, Stavros Angelis
- With a Focused Intent: Evolution of DCMI as a Research Community - Jihee Beak, Richard P. Smiraglia
- Metadata Capital in a Data Repository - Jane Greenberg, Shea Swauger, Elena Feinstein
- DC Metadata is Alive and Well - A New Standard for Education - Liddy Nevile
- Representation of the UNIMARC Bibliographic Data Format in Resource Description Framework - Gordon Dunsire, Mirna Willer, Predrag Perozic

Wolfe, EW.: Natural Language Processing in the humanities : a case study in automated metadata enhancement (2019) 0.00
```
3.640176E-4 = product of:
  0.006188299 = sum of:
    0.006188299 = weight(_text_:in in 5236) [ClassicSimilarity], result of:
      0.006188299 = score(doc=5236,freq=6.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.1822149 = fieldWeight in 5236, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5236)
  0.05882353 = coord(1/17)
```
Abstract

The Black Book Interactive Project at the University of Kansas (KU) is developing an expanded corpus of novels by African American authors, with an emphasis on lesser known writers and a goal of expanding research in this field. Using a custom metadata schema with an emphasis on race-related elements, each novel is analyzed for a variety of elements such as literary style, targeted content analysis, historical context, and other areas. Librarians at KU have worked to develop a variety of computational text analysis processes designed to assist with specific aspects of this metadata collection, including text mining and natural language processing, automated subject extraction based on word sense disambiguation, harvesting data from Wikidata, and other actions.
What is Schema.org? (2011) 0.00
```
3.6028394E-4 = product of:
  0.006124827 = sum of:
    0.006124827 = weight(_text_:in in 4437) [ClassicSimilarity], result of:
      0.006124827 = score(doc=4437,freq=8.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.18034597 = fieldWeight in 4437, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=4437)
  0.05882353 = coord(1/17)
```
Abstract

This site provides a collection of schemas, i.e., html tags, that webmasters can use to mark up their pages in ways recognized by major search providers. Search engines including Bing, Google and Yahoo! rely on this markup to improve the display of search results, making it easier for people to find the right web pages. Many sites are generated from structured data, which is often stored in databases. When this data is formatted into HTML, it becomes very difficult to recover the original structured data. Many applications, especially search engines, can benefit greatly from direct access to this structured data. On-page markup enables search engines to understand the information on web pages and provide richer search results in order to make it easier for users to find relevant information on the web. Markup can also enable new tools and applications that make use of the structure. A shared markup vocabulary makes it easier for webmasters to decide on a markup schema and get the maximum benefit for their efforts. So, in the spirit of sitemaps.org, Bing, Google and Yahoo! have come together to provide a shared collection of schemas that webmasters can use.
Stevens, G.: New metadata recipes for old cookbooks : creating and analyzing a digital collection using the HathiTrust Research Center Portal (2017) 0.00
```
3.3567476E-4 = product of:
  0.005706471 = sum of:
    0.005706471 = weight(_text_:in in 3897) [ClassicSimilarity], result of:
      0.005706471 = score(doc=3897,freq=10.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.16802745 = fieldWeight in 3897, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3897)
  0.05882353 = coord(1/17)
```
Abstract

The Early American Cookbooks digital project is a case study in analyzing collections as data using HathiTrust and the HathiTrust Research Center (HTRC) Portal. The purposes of the project are to create a freely available, searchable collection of full-text early American cookbooks within the HathiTrust Digital Library, to offer an overview of the scope and contents of the collection, and to analyze trends and patterns in the metadata and the full text of the collection. The digital project has two basic components: a collection of 1450 full-text cookbooks published in the United States between 1800 and 1920 and a website to present a guide to the collection and the results of the analysis. This article will focus on the workflow for analyzing the metadata and the full-text of the collection. The workflow will cover: 1) creating a searchable public collection of full-text titles within the HathiTrust Digital Library and uploading it to the HTRC Portal, 2) analyzing and visualizing legacy MARC data for the collection using MarcEdit, OpenRefine and Tableau, and 3) using the text analysis tools in the HTRC Portal to look for trends and patterns in the full text of the collection.
Edmunds, J.: Roadmap to nowhere : BIBFLOW, BIBFRAME, and linked data for libraries (2017) 0.00
```
3.1201506E-4 = product of:
  0.005304256 = sum of:
    0.005304256 = weight(_text_:in in 3523) [ClassicSimilarity], result of:
      0.005304256 = score(doc=3523,freq=6.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.1561842 = fieldWeight in 3523, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=3523)
  0.05882353 = coord(1/17)
```
Abstract

On December 12, 2016, Carl Stahmer and MacKenzie Smith presented at the CNI Members Fall Meeting about the BIBFLOW project, self-described on Twitter as "a two-year project of the UC Davis University Library and Zepheira investigating the future of library technical services." In her opening remarks, Ms. Smith, University Librarian at UC Davis, stated that one of the goals of the project was to devise a roadmap "to get from where we are today, which is kind of the 1970s with a little lipstick on it, to 2020, which is where we're going to be very soon." The notion that where libraries are today is somehow behind the times is one of the commonly heard rationales behind a move to linked data. Stated more precisely:

- Libraries devote considerable time and resources to producing high-quality bibliographic metadata
- This metadata is stored in unconnected silos
- This metadata is in a format (MARC) that is incompatible with technologies of the emerging Semantic Web
- The visibility of library metadata is diminished as a result of the two points above

Are these assertions true? If yes, is linked data the solution?

The Dublin Core Metadata Element Set (2012) 0.00

```
3.0023666E-4 = product of:
  0.005104023 = sum of:
    0.005104023 = weight(_text_:in in 4790) [ClassicSimilarity], result of:
      0.005104023 = score(doc=4790,freq=2.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.15028831 = fieldWeight in 4790, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.078125 = fieldNorm(doc=4790)
  0.05882353 = coord(1/17)
```

Abstract: Defines fifteen metadata elements for resource description in a cross-disciplinary information environment.

Riley, J.: Understanding metadata : what is metadata, and what is it for? (2017) 0.00

```
3.0023666E-4 = product of:
  0.005104023 = sum of:
    0.005104023 = weight(_text_:in in 2005) [ClassicSimilarity], result of:
      0.005104023 = score(doc=2005,freq=2.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.15028831 = fieldWeight in 2005, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.078125 = fieldNorm(doc=2005)
  0.05882353 = coord(1/17)
```

Footnote: Reviewed in: Cataloging and classification quarterly 55(2017) no.7/8, pp.669-670 (Liz Woolcott).

Hook, P.A.; Gantchev, A.: Using combined metadata sources to visualize a small library (OBL's English Language Books) (2017) 0.00
```
3.0023666E-4 = product of:
  0.005104023 = sum of:
    0.005104023 = weight(_text_:in in 3870) [ClassicSimilarity], result of:
      0.005104023 = score(doc=3870,freq=8.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.15028831 = fieldWeight in 3870, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3870)
  0.05882353 = coord(1/17)
```
Abstract

Data from multiple knowledge organization systems are combined to provide a global overview of the content holdings of a small personal library. Subject headings and classification data are used to effectively map the combined book and topic space of the library. While harvested and manipulated by hand, the work reveals issues and potential solutions when using automated techniques to produce topic maps of much larger libraries. The small library visualized consists of the thirty-nine, digital, English language books found in the Osama Bin Laden (OBL) compound in Abbottabad, Pakistan upon his death. As this list of books has garnered considerable media attention, it is worth providing a visual overview of the subject content of these books - some of which is not readily apparent from the titles. Metadata from subject headings and classification numbers was combined to create book-subject maps. Tree maps of the classification data were also produced. The books contain 328 subject headings. In order to enhance the base map with meaningful thematic overlay, library holding count data was also harvested (and aggregated from duplicates). This additional data revealed the relative scarcity or popularity of individual books.

Content

Contribution to: NASKO 2017: Visualizing Knowledge Organization: Bringing Focus to Abstract Realities. The sixth North American Symposium on Knowledge Organization (NASKO 2017), June 15-16, 2017, in Champaign, IL, USA.
Neumann, M.; Steinberg, J.; Schaer, P.: Web-scraping for non-programmers : introducing OXPath for digital library metadata harvesting (2017) 0.00
```
2.6001257E-4 = product of:
  0.004420214 = sum of:
    0.004420214 = weight(_text_:in in 3895) [ClassicSimilarity], result of:
      0.004420214 = score(doc=3895,freq=6.0), product of:
        0.033961542 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.024967048 = queryNorm
        0.1301535 = fieldWeight in 3895, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3895)
  0.05882353 = coord(1/17)
```
Abstract

Building up new collections for digital libraries is a demanding task. Available data sets have to be extracted, which is usually done with the help of software developers, as it involves custom data handlers or conversion scripts. In cases where the desired data is only available on the data provider's website, custom web scrapers are needed. This may be the case for small to medium-size publishers, research institutes or funding agencies. As data curation is a typical task that is done by people with a library and information science background, these people are usually proficient with XML technologies but are not full-stack programmers. Therefore, we would like to present a web scraping tool that does not require digital library curators to program custom web scrapers from scratch. We present the open-source tool OXPath, an extension of XPath, that allows the user to define data to be extracted from websites in a declarative way. By taking one of our own use cases as an example, we guide you in more detail through the process of creating an OXPath wrapper for metadata harvesting. We also point out some practical things to consider when creating a web scraper (with OXPath). On top of that, we also present a syntax highlighting plugin for the popular text editor Atom that we developed to further support OXPath users and to simplify the authoring process.
