Search (13 results, page 1 of 1)

  • year_i:[2020 TO 2030}
  • theme_ss:"Automatisches Indexieren"
  1. Qualität in der Inhaltserschließung (2021) 0.01
    0.007843386 = product of:
      0.036602467 = sum of:
        0.013948122 = weight(_text_:web in 753) [ClassicSimilarity], result of:
          0.013948122 = score(doc=753,freq=2.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.14422815 = fieldWeight in 753, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.03125 = fieldNorm(doc=753)
        0.005707573 = weight(_text_:information in 753) [ClassicSimilarity], result of:
          0.005707573 = score(doc=753,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.10971737 = fieldWeight in 753, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=753)
        0.016946774 = weight(_text_:retrieval in 753) [ClassicSimilarity], result of:
          0.016946774 = score(doc=753,freq=4.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.18905719 = fieldWeight in 753, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=753)
      0.21428572 = coord(3/14)
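    For readers unfamiliar with Lucene's explain output: the tree above is the ClassicSimilarity (TF-IDF) breakdown. Each matching query term contributes queryWeight × fieldWeight, where queryWeight = idf × queryNorm and fieldWeight = sqrt(termFreq) × idf × fieldNorm; the sum over the matching terms is then scaled by the coordination factor coord = matching clauses / total clauses. The Python sketch below merely recomputes the 0.007843386 score of this first hit from the factors printed above, as a sanity check of that arithmetic; it is not part of the search engine.

      import math

      QUERY_NORM = 0.029633347          # same query normalisation factor for every clause
      COORD = 3 / 14                    # 3 of the 14 query clauses match in doc 753

      def term_weight(idf, freq, field_norm):
          # ClassicSimilarity per-term contribution:
          #   queryWeight = idf * queryNorm
          #   fieldWeight = sqrt(freq) * idf * fieldNorm
          return (idf * QUERY_NORM) * (math.sqrt(freq) * idf * field_norm)

      score = COORD * (
          term_weight(3.2635105, 2.0, 0.03125)     # _text_:web         ~ 0.013948
          + term_weight(1.7554779, 4.0, 0.03125)   # _text_:information ~ 0.005708
          + term_weight(3.024915, 4.0, 0.03125)    # _text_:retrieval   ~ 0.016947
      )
      print(round(score, 9))            # ~ 0.007843386, the score shown above

    The same pattern accounts for every score in this result list; only the idf, termFreq, fieldNorm, and coord values differ from record to record.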
    
    Content
    Contents:
    Editorial - Michael Franke-Maier, Anna Kasprzik, Andreas Ledl und Hans Schürmann
    Qualität in der Inhaltserschließung - Ein Überblick aus 50 Jahren (1970-2020) - Andreas Ledl
    Fit for Purpose - Standardisierung von inhaltserschließenden Informationen durch Richtlinien für Metadaten - Joachim Laczny
    Neue Wege und Qualitäten - Die Inhaltserschließungspolitik der Deutschen Nationalbibliothek - Ulrike Junger und Frank Scholze
    Wissensbasen für die automatische Erschließung und ihre Qualität am Beispiel von Wikidata - Lydia Pintscher, Peter Bourgonje, Julián Moreno Schneider, Malte Ostendorff und Georg Rehm
    Qualitätssicherung in der GND - Esther Scheven
    Qualitätskriterien und Qualitätssicherung in der inhaltlichen Erschließung - Thesenpapier des Expertenteams RDA-Anwendungsprofil für die verbale Inhaltserschließung (ET RAVI)
    Coli-conc - Eine Infrastruktur zur Nutzung und Erstellung von Konkordanzen - Uma Balakrishnan, Stefan Peters und Jakob Voß
    Methoden und Metriken zur Messung von OCR-Qualität für die Kuratierung von Daten und Metadaten - Clemens Neudecker, Karolina Zaczynska, Konstantin Baierer, Georg Rehm, Mike Gerber und Julián Moreno Schneider
    Datenqualität als Grundlage qualitativer Inhaltserschließung - Jakob Voß
    Bemerkungen zu der Qualitätsbewertung von MARC-21-Datensätzen - Rudolf Ungváry und Péter Király
    Named Entity Linking mit Wikidata und GND - Das Potenzial handkuratierter und strukturierter Datenquellen für die semantische Anreicherung von Volltexten - Sina Menzel, Hannes Schnaitter, Josefine Zinck, Vivien Petras, Clemens Neudecker, Kai Labusch, Elena Leitner und Georg Rehm
    Ein Protokoll für den Datenabgleich im Web am Beispiel von OpenRefine und der Gemeinsamen Normdatei (GND) - Fabian Steeg und Adrian Pohl
    Verbale Erschließung in Katalogen und Discovery-Systemen - Überlegungen zur Qualität - Heidrun Wiesenmüller
    Inhaltserschließung für Discovery-Systeme gestalten - Jan Frederik Maas
    Evaluierung von Verschlagwortung im Kontext des Information Retrievals - Christian Wartena und Koraljka Golub
    Die Qualität der Fremddatenanreicherung FRED - Cyrus Beck
    Quantität als Qualität - Was die Verbünde zur Verbesserung der Inhaltserschließung beitragen können - Rita Albrecht, Barbara Block, Mathias Kratzer und Peter Thiessen
    Hybride Künstliche Intelligenz in der automatisierten Inhaltserschließung - Harald Sack
    Footnote
    Cf.: https://www.degruyter.com/document/doi/10.1515/9783110691597/html. DOI: https://doi.org/10.1515/9783110691597. Reviewed in: Information - Wissenschaft und Praxis 73(2022) H.2-3, S.131-132 (B. Lorenz and V. Steyer). A further review in: o-bib 9(2022) Nr.3 (Martin Völkl) [https://www.o-bib.de/bib/article/view/5843/8714].
    Theme
    Verbale Doksprachen im Online-Retrieval
    Klassifikationssysteme im Online-Retrieval
  2. Golub, K.: Automated subject indexing : an overview (2021) 0.00
    0.004004761 = product of:
      0.028033325 = sum of:
        0.0070627616 = weight(_text_:information in 718) [ClassicSimilarity], result of:
          0.0070627616 = score(doc=718,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.13576832 = fieldWeight in 718, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=718)
        0.020970564 = weight(_text_:retrieval in 718) [ClassicSimilarity], result of:
          0.020970564 = score(doc=718,freq=2.0), product of:
            0.08963835 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.029633347 = queryNorm
            0.23394634 = fieldWeight in 718, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=718)
      0.14285715 = coord(2/14)
    
    Abstract
    In the face of the ever-increasing document volume, libraries around the globe are more and more exploring (semi-) automated approaches to subject indexing. This helps sustain bibliographic objectives, enrich metadata, and establish more connections across documents from various collections, effectively leading to improved information retrieval and access. However, generally accepted automated approaches that are functional in operative systems are lacking. This article aims to provide an overview of basic principles used for automated subject indexing, major approaches in relation to their possible application in actual library systems, existing working examples, as well as related challenges calling for further research.
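    The article surveys principles and approaches rather than prescribing a single method. Purely to illustrate the simplest lexical principle it covers - matching controlled-vocabulary terms against the document text and ranking candidate headings by frequency - a toy sketch might look like the following; the vocabulary, its mapping to headings, and the sample text are invented for the example.

      import re
      from collections import Counter

      # Invented mini-vocabulary: surface phrase -> preferred subject heading.
      VOCAB = {
          "automatic indexing": "Automatisches Indexieren",
          "information retrieval": "Information Retrieval",
          "metadata": "Metadaten",
      }

      def suggest_subjects(text, vocab=VOCAB, top_n=3):
          # Count how often each vocabulary phrase occurs and rank the headings.
          text = text.lower()
          counts = Counter()
          for phrase, heading in vocab.items():
              counts[heading] += len(re.findall(r"\b" + re.escape(phrase) + r"\b", text))
          return [h for h, n in counts.most_common(top_n) if n > 0]

      print(suggest_subjects("Automatic indexing can improve information retrieval "
                             "by assigning metadata at scale."))

    Real systems add far more on top of this (normalisation, synonym handling, training data, statistical or neural rankers); those approaches and their operational readiness are what the overview discusses.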
  3. Franke-Maier, M.; Beck, C.; Kasprzik, A.; Maas, J.F.; Pielmeier, S.; Wiesenmüller, H.: ¬Ein Feuerwerk an Algorithmen und der Startschuss zur Bildung eines Kompetenznetzwerks für maschinelle Erschließung : Bericht zur Fachtagung Netzwerk maschinelle Erschließung an der Deutschen Nationalbibliothek am 10. und 11. Oktober 2019 (2020) 0.00
    0.002365089 = product of:
      0.033111244 = sum of:
        0.033111244 = weight(_text_:bibliothek in 5851) [ClassicSimilarity], result of:
          0.033111244 = score(doc=5851,freq=2.0), product of:
            0.121660605 = queryWeight, product of:
              4.1055303 = idf(docFreq=1980, maxDocs=44218)
              0.029633347 = queryNorm
            0.27216077 = fieldWeight in 5851, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.1055303 = idf(docFreq=1980, maxDocs=44218)
              0.046875 = fieldNorm(doc=5851)
      0.071428575 = coord(1/14)
    
    Abstract
    On 10 and 11 October 2019, around 100 representatives from libraries, research, and industry met at the Deutsche Nationalbibliothek (DNB) in Frankfurt am Main for a specialist conference on the current trend topic of "maschinelle Erschließung" (automated subject indexing). The aim of the event was to "look at different areas of application of automated text analysis" and to start a dialogue about technologies for automated text analysis, about tasks and experiences, and about the challenges that automated methods bring with them. The background is the mandate given to the DNB by the Standardisierungsausschuss to hold relevant conferences on a regular basis, out of which "a competence network for automated subject indexing [will] emerge" in the longer term.
  4. Pintscher, L.; Bourgonje, P.; Moreno Schneider, J.; Ostendorff, M.; Rehm, G.: Wissensbasen für die automatische Erschließung und ihre Qualität am Beispiel von Wikidata : die Inhaltserschließungspolitik der Deutschen Nationalbibliothek (2021) 0.00
    0.001245368 = product of:
      0.017435152 = sum of:
        0.017435152 = weight(_text_:web in 366) [ClassicSimilarity], result of:
          0.017435152 = score(doc=366,freq=2.0), product of:
            0.09670874 = queryWeight, product of:
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.029633347 = queryNorm
            0.18028519 = fieldWeight in 366, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.2635105 = idf(docFreq=4597, maxDocs=44218)
              0.0390625 = fieldNorm(doc=366)
      0.071428575 = coord(1/14)
    
    Abstract
    Wikidata is a free knowledge base that provides general data about the world. It is developed and operated by Wikimedia, as is its sister project Wikipedia. The data in Wikidata are collected and maintained by a large community of volunteers, and both the data and the underlying ontology are used by many projects, institutions, and companies as a basis for applications and visualisations, but also for training machine-learning methods. Wikidata uses MediaWiki and its Wikibase extension as the technical foundation for collaborative work on a knowledge base that makes linked open data accessible to humans and machines. At the end of 2020, Wikidata described over 90 million entities using more than 8,000 properties, yielding a total of more than 1.15 billion statements about the described entities. The data objects for these entities are linked to equivalent entries in more than 5,500 external databases, catalogues, and websites, which makes Wikidata one of the central hubs of the linked data web. More than 11,500 active editors add new data to the knowledge base and maintain it; they are organised in wiki projects, each addressing particular subject areas or tasks. The data are used in more than half of all content pages of the Wikimedia projects and are, among other things, queried more than 6.5 million times a day via the SPARQL endpoint in order to embed them in external applications and visualisations.
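    The SPARQL endpoint mentioned at the end of the abstract is the usual route by which external applications pull Wikidata statements, including authority links such as GND identifiers. As a minimal sketch (not taken from the article), the following Python snippet asks the public endpoint for a few persons that carry a GND ID; the identifiers used are standard Wikidata ones (P31 = instance of, Q5 = human, P227 = GND ID), and the User-Agent string is a placeholder.

      import requests

      ENDPOINT = "https://query.wikidata.org/sparql"
      QUERY = """
      SELECT ?item ?itemLabel ?gnd WHERE {
        ?item wdt:P31 wd:Q5 ;          # instances of 'human'
              wdt:P227 ?gnd .          # ... that have a GND identifier
        SERVICE wikibase:label { bd:serviceParam wikibase:language "de,en". }
      }
      LIMIT 5
      """

      resp = requests.get(ENDPOINT,
                          params={"query": QUERY, "format": "json"},
                          headers={"User-Agent": "example-sketch/0.1 (demo)"})
      for row in resp.json()["results"]["bindings"]:
          print(row["item"]["value"], row["itemLabel"]["value"], row["gnd"]["value"])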
  5. Moulaison-Sandy, H.; Adkins, D.; Bossaller, J.; Cho, H.: ¬An automated approach to describing fiction : a methodology to use book reviews to identify affect (2021) 0.00
    7.134467E-4 = product of:
      0.009988253 = sum of:
        0.009988253 = weight(_text_:information in 710) [ClassicSimilarity], result of:
          0.009988253 = score(doc=710,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.1920054 = fieldWeight in 710, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=710)
      0.071428575 = coord(1/14)
    
    Abstract
    Subject headings and genre terms are notoriously difficult to apply, yet are important for fiction. The current project functions as a proof of concept, using a text-mining methodology to identify affective information (emotion and tone) about fiction titles from professional book reviews as a potential first step in automating the subject analysis process. Findings are presented and discussed, comparing results to the range of aboutness and isness information in library cataloging records. The methodology is likewise presented, and how future work might expand on the current project to enhance catalog records through text-mining is explored.
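    The abstract does not spell out the text-mining pipeline in detail. As a rough, invented illustration of what lexicon-based affect tagging of a review can look like (the emotion lexicon below is a tiny made-up stand-in for a real affect word list), consider:

      import re
      from collections import Counter

      # Made-up mini-lexicon: affect category -> cue words one might find in reviews.
      AFFECT_LEXICON = {
          "joy": {"heartwarming", "funny", "delightful", "uplifting"},
          "fear": {"chilling", "terrifying", "creepy", "suspenseful"},
          "sadness": {"heartbreaking", "melancholy", "tragic"},
      }

      def affect_profile(review_text):
          # Count affect-bearing cue words per category.
          tokens = re.findall(r"[a-z']+", review_text.lower())
          return Counter({emotion: sum(tok in words for tok in tokens)
                          for emotion, words in AFFECT_LEXICON.items()})

      review = "A chilling, suspenseful debut with a heartbreaking final act."
      print(affect_profile(review).most_common())   # fear: 2, sadness: 1, joy: 0

    Aggregated over many reviews of the same title, such counts could supply candidate affect terms alongside the conventional aboutness and isness data in catalogue records - which is the comparison the article explores.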
  6. Yang, T.-H.; Hsieh, Y.-L.; Liu, S.-H.; Chang, Y.-C.; Hsu, W.-L.: ¬A flexible template generation and matching method with applications for publication reference metadata extraction (2021) 0.00
    5.0960475E-4 = product of:
      0.0071344664 = sum of:
        0.0071344664 = weight(_text_:information in 63) [ClassicSimilarity], result of:
          0.0071344664 = score(doc=63,freq=4.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.13714671 = fieldWeight in 63, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=63)
      0.071428575 = coord(1/14)
    
    Abstract
    Conventional rule-based approaches use exact template matching to capture linguistic information and necessarily need to enumerate all variations. We propose a novel flexible template generation and matching scheme called the principle-based approach (PBA) based on sequence alignment, and employ it for reference metadata extraction (RME) to demonstrate its effectiveness. The main contributions of this research are threefold. First, we propose an automatic template generation that can capture prominent patterns using the dominating set algorithm. Second, we devise an alignment-based template-matching technique that uses a logistic regression model, which makes it more general and flexible than pure rule-based approaches. Last, we apply PBA to RME on extensive cross-domain corpora and demonstrate its robustness and generality. Experiments reveal that the same set of templates produced by the PBA framework not only deliver consistent performance on various unseen domains, but also surpass hand-crafted knowledge (templates). We use four independent journal style test sets and one conference style test set in the experiments. When compared to renowned machine learning methods, such as conditional random fields (CRF), as well as recent deep learning methods (i.e., bi-directional long short-term memory with a CRF layer, Bi-LSTM-CRF), PBA has the best performance for all datasets.
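    The PBA pipeline itself (dominating-set template induction plus a logistic-regression alignment model) is too involved to reproduce here. The toy sketch below only illustrates the underlying idea of alignment-based label transfer: map tokens to coarse shapes, align a new reference against a labelled template with a generic sequence aligner, and copy field labels across the aligned spans. The tokens, shapes, and labels are invented for the example and are not the authors' templates.

      from difflib import SequenceMatcher

      def shape(tok):
          # Coarse token shape so the alignment tolerates surface variation.
          if tok.isdigit():
              return "<NUM>"
          if tok[:1].isupper():
              return "<Cap>"
          if tok.isalpha():
              return "<low>"
          return tok                      # punctuation kept literally

      template_tokens = ["Smith", ",", "J.", "(", "2020", ")", ".", "Indexing", "at", "scale", "."]
      template_labels = ["AUTHOR", "AUTHOR", "AUTHOR", "O", "YEAR", "O", "O",
                         "TITLE", "TITLE", "TITLE", "O"]
      new_tokens = ["Meier", ",", "A.", "(", "2021", ")", ".", "Automatic", "subject", "indexing", "."]

      sm = SequenceMatcher(a=[shape(t) for t in template_tokens],
                           b=[shape(t) for t in new_tokens])
      labels = ["O"] * len(new_tokens)
      for op, a1, a2, b1, b2 in sm.get_opcodes():
          if op == "equal":               # aligned spans inherit the template's labels
              labels[b1:b2] = template_labels[a1:a2]

      print(list(zip(new_tokens, labels)))   # Meier/AUTHOR ... 2021/YEAR ... indexing/TITLE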
    Source
    Journal of the Association for Information Science and Technology. 72(2021) no.1, S.32-45
  7. Pielmeier, S.; Voß, V.; Carstensen, H.; Kahl, B.: Online-Workshop "Computerunterstützte Inhaltserschließung" 2020 (2021) 0.00
    4.32414E-4 = product of:
      0.0060537956 = sum of:
        0.0060537956 = weight(_text_:information in 4409) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=4409,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 4409, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=4409)
      0.071428575 = coord(1/14)
    
    Abstract
    For the first time in digital form, and with 230 participants, the 4th workshop "Computerunterstützte Inhaltserschließung" (computer-assisted subject indexing) took place on 11 and 12 November 2020, organised by the Deutsche Nationalbibliothek (DNB), the company Eurospider Information Technology, the Staatsbibliothek zu Berlin - Preußischer Kulturbesitz (SBB), the UB Stuttgart, and the Bibliotheksservice-Zentrum Baden-Württemberg (BSZ). The focus was the "Digitaler Assistent DA-3": eleven presentations covered application scenarios and experiences with the system, which is intended to support libraries and other research and cultural institutions in subject indexing. The welcome and the introduction to the two workshop days were given by Frank Scholze (Director General of the DNB). He sees the DA-3 as a building block for interlinking intellectual and automated subject indexing.
  8. Matthews, P.; Glitre, K.: Genre analysis of movies using a topic model of plot summaries (2021) 0.00
    4.32414E-4 = product of:
      0.0060537956 = sum of:
        0.0060537956 = weight(_text_:information in 412) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=412,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 412, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=412)
      0.071428575 = coord(1/14)
    
    Source
    Journal of the Association for Information Science and Technology. 72(2021) no.12, S.1511-1527
  9. Asula, M.; Makke, J.; Freienthal, L.; Kuulmets, H.-A.; Sirel, R.: Kratt: developing an automatic subject indexing tool for the National Library of Estonia : how to transfer metadata information among work cluster members (2021) 0.00
    4.32414E-4 = product of:
      0.0060537956 = sum of:
        0.0060537956 = weight(_text_:information in 723) [ClassicSimilarity], result of:
          0.0060537956 = score(doc=723,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.116372846 = fieldWeight in 723, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=723)
      0.071428575 = coord(1/14)
    
  10. Zhang, Y.; Zhang, C.; Li, J.: Joint modeling of characters, words, and conversation contexts for microblog keyphrase extraction (2020) 0.00
    3.6034497E-4 = product of:
      0.0050448296 = sum of:
        0.0050448296 = weight(_text_:information in 5816) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=5816,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 5816, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5816)
      0.071428575 = coord(1/14)
    
    Source
    Journal of the Association for Information Science and Technology. 71(2020) no.5, S.553-567
  11. Giesselbach, S.; Estler-Ziegler, T.: Dokumente schneller analysieren mit Künstlicher Intelligenz (2021) 0.00
    3.6034497E-4 = product of:
      0.0050448296 = sum of:
        0.0050448296 = weight(_text_:information in 128) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=128,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 128, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=128)
      0.071428575 = coord(1/14)
    
    Footnote
    Talk given within the Berliner Arbeitskreis Information (BAK) on 25 February 2021.
  12. Villaespesa, E.; Crider, S.: ¬A critical comparison analysis between human and machine-generated tags for the Metropolitan Museum of Art's collection (2021) 0.00
    3.6034497E-4 = product of:
      0.0050448296 = sum of:
        0.0050448296 = weight(_text_:information in 341) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=341,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 341, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=341)
      0.071428575 = coord(1/14)
    
    Abstract
    Purpose: Based on the highlights of The Metropolitan Museum of Art's collection, the purpose of this paper is to examine the similarities and differences between the subject keyword tags assigned by the museum and those produced by three computer vision systems.
    Design/methodology/approach: This paper uses computer vision tools to generate the data and the Getty Research Institute's Art and Architecture Thesaurus (AAT) to compare the subject keyword tags.
    Findings: This paper finds that there are clear opportunities to use computer vision technologies to automatically generate tags that expand the terms used by the museum. This brings a new perspective to the collection that is different from the traditional art historical one. However, the study also surfaces challenges about the accuracy and lack of context within the computer vision results.
    Practical implications: This finding has important implications for how these machine-generated tags complement the current taxonomies and vocabularies inputted in the collection database. In consequence, the museum needs to consider the selection process for choosing which computer vision system to apply to its collection. Furthermore, it also needs to think critically about the kind of tags it wishes to use, such as colors, materials or objects.
    Originality/value: The study results add to the rapidly evolving field of computer vision within the art information context and provide recommendations of aspects to consider before selecting and implementing these technologies.
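    In the study, both tag sets were reconciled against the Getty AAT before being compared. The sketch below uses invented tag sets and only a crude lowercase/singularise step in place of that AAT reconciliation, but it shows the basic overlap measurement such a comparison rests on.

      # Invented example tag sets for one artwork.
      human_tags = {"Portraits", "Oil paint", "Women"}
      machine_tags = {"portrait", "painting", "woman", "picture frame"}

      def normalize(tags):
          # Stand-in for vocabulary reconciliation (the study used the Getty AAT).
          return {t.lower().rstrip("s") for t in tags}

      h, m = normalize(human_tags), normalize(machine_tags)
      overlap = h & m
      jaccard = len(overlap) / len(h | m)
      print(sorted(overlap), round(jaccard, 2))   # shared terms and their Jaccard score

    Even this toy case shows the pattern the article reports: some machine tags confirm the museum's terms, others add new access points ("picture frame"), and near-misses such as "woman" vs. "women" are exactly where vocabulary control and context matter.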
  13. Ahmed, M.: Automatic indexing for agriculture : designing a framework by deploying Agrovoc, Agris and Annif (2023) 0.00
    3.6034497E-4 = product of:
      0.0050448296 = sum of:
        0.0050448296 = weight(_text_:information in 1024) [ClassicSimilarity], result of:
          0.0050448296 = score(doc=1024,freq=2.0), product of:
            0.052020688 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.029633347 = queryNorm
            0.09697737 = fieldWeight in 1024, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1024)
      0.071428575 = coord(1/14)
    
    Source
    ¬SRELS Journal of Information Management. 60(2023) no.2, S.85-95