Search (30 results, page 1 of 2)

  • × theme_ss:"Metadaten"
  • × type_ss:"el"
  1. Wolfe, EW.: a case study in automated metadata enhancement : Natural Language Processing in the humanities (2019) 0.04
    0.03908867 = product of:
      0.07817734 = sum of:
        0.036211025 = weight(_text_:data in 5236) [ClassicSimilarity], result of:
          0.036211025 = score(doc=5236,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.24455236 = fieldWeight in 5236, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5236)
        0.041966315 = product of:
          0.08393263 = sum of:
            0.08393263 = weight(_text_:processing in 5236) [ClassicSimilarity], result of:
              0.08393263 = score(doc=5236,freq=4.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.4427661 = fieldWeight in 5236, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5236)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The Black Book Interactive Project at the University of Kansas (KU) is developing an expanded corpus of novels by African American authors, with an emphasis on lesser known writers and a goal of expanding research in this field. Using a custom metadata schema with an emphasis on race-related elements, each novel is analyzed for a variety of elements such as literary style, targeted content analysis, historical context, and other areas. Librarians at KU have worked to develop a variety of computational text analysis processes designed to assist with specific aspects of this metadata collection, including text mining and natural language processing, automated subject extraction based on word sense disambiguation, harvesting data from Wikidata, and other actions.
  2. Cranefield, S.: Networked knowledge representation and exchange using UML and RDF (2001) 0.03
    0.032942846 = product of:
      0.06588569 = sum of:
        0.036211025 = weight(_text_:data in 5896) [ClassicSimilarity], result of:
          0.036211025 = score(doc=5896,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.24455236 = fieldWeight in 5896, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5896)
        0.029674664 = product of:
          0.05934933 = sum of:
            0.05934933 = weight(_text_:processing in 5896) [ClassicSimilarity], result of:
              0.05934933 = score(doc=5896,freq=2.0), product of:
                0.18956426 = queryWeight, product of:
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.046827413 = queryNorm
                0.3130829 = fieldWeight in 5896, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  4.048147 = idf(docFreq=2097, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=5896)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    This paper proposes the use of the Unified Modeling Language (UML) as a language for modelling ontologies for Web resources and the knowledge contained within them. To provide a mechanism for serialising and processing object diagrams representing knowledge, a pair of XSI-T stylesheets have been developed to map from XML Metadata Interchange (XMI) encodings of class diagrams to corresponding RDF schemas and to Java classes representing the concepts in the ontologies. The Java code includes methods for marshalling and unmarshalling object-oriented information between in-memory data structures and RDF serialisations of that information. This provides a convenient mechanism for Java applications to share knowledge on the Web
  3. Sewing, S.: Bestandserhaltung und Archivierung : Koordinierung auf der Basis eines gemeinsamen Metadatenformates in den deutschen und österreichischen Bibliotheksverbünden (2021) 0.03
    0.025035713 = product of:
      0.050071426 = sum of:
        0.031038022 = weight(_text_:data in 266) [ClassicSimilarity], result of:
          0.031038022 = score(doc=266,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.2096163 = fieldWeight in 266, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=266)
        0.019033402 = product of:
          0.038066804 = sum of:
            0.038066804 = weight(_text_:22 in 266) [ClassicSimilarity], result of:
              0.038066804 = score(doc=266,freq=2.0), product of:
                0.16398162 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046827413 = queryNorm
                0.23214069 = fieldWeight in 266, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=266)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Date
    22. 5.2021 12:43:05
    Source
    Open Password. 2021, Nr.928 vom 31.05.2021 [https://www.password-online.de/?mailpoet_router&endpoint=view_in_browser&action=view&data=WzI5OSwiMjc2N2ZlZjQwMDUwIiwwLDAsMjY4LDFd]
  4. Daniel Jr., R.; Lagoze, C.: Extending the Warwick framework : from metadata containers to active digital objects (1997) 0.02
    0.020742472 = product of:
      0.08296989 = sum of:
        0.08296989 = weight(_text_:data in 1264) [ClassicSimilarity], result of:
          0.08296989 = score(doc=1264,freq=42.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.56033987 = fieldWeight in 1264, product of:
              6.4807405 = tf(freq=42.0), with freq of:
                42.0 = termFreq=42.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1264)
      0.25 = coord(1/4)
    
    Abstract
    Defining metadata as "data about data" provokes more questions than it answers. What are the forms of the data and metadata? Can we be more specific about the manner in which the metadata is "about" the data? Are data and metadata distinguished only in the context of their relationship? Is the nature of the relationship between the datasets declarative or procedural? Can the metadata itself be described by other data? Over the past several years, we have been engaged in a number of efforts examining the role, format, composition, and architecture of metadata for networked resources. During this time, we have noticed the tendency to be led astray by comfortable, but somewhat inappropriate, models in the non-digital information environment. Rather than pursuing familiar models, there is the need for a new model that fully exploits the unique combination of computation and connectivity that characterizes the digital library. In this paper, we describe an extension of the Warwick Framework that we call Distributed Active Relationships (DARs). DARs provide a powerful model for representing data and metadata in digital library objects. They explicitly express the relationships between networked resources, and even allow those relationships to be dynamically downloadable and executable. The DAR model is based on the following principles, which our examination of the "data about data" definition has led us to regard as axiomatic: * There is no essential distinction between data and metadata. We can only make such a distinction in terms of a particular "about" relationship. As a result, what is metadata in the context of one "about" relationship may be data in another. * There is no single "about" relationship. There are many different and important relationships between data resources. * Resources can be related without regard for their location. The connectivity in networked information architectures makes it possible to have data in one repository describe data in another repository. * The computational power of the networked information environment makes it possible to consider active or dynamic relationships between data sets. This adds considerable power to the "data about data" definition. First, data about another data set may not physically exist, but may be automatically derived. Second, the "about" relationship may be an executable object -- in a sense interpretable metadata. As will be shown, this provides useful mechanisms for handling complex metadata problems such as rights management of digital objects. The remainder of this paper describes the development and consequences of the DAR model. Section 2 reviews the Warwick Framework, which is the basis for the model described in this paper. Section 3 examines the concept of the Warwick Framework Catalog, which provides a mechanism for expressing the relationships between the packages in a Warwick Framework container. With that background established, section 4 generalizes the Warwick Framework by removing the restriction that it only contains "metadata". This allows us to consider digital library objects that are aggregations of (possibly distributed) data sets, with the relationships between the data sets expressed using a Warwick Framework Catalog. Section 5 further extends the model by describing Distributed Active Relationships (DARs). DARs are the explicit relationships that have the potential to be executable, as alluded to earlier. Finally, section 6 describes two possible implementations of these concepts.
  5. Baker, T.: ¬A grammar of Dublin Core (2000) 0.02
    0.016690476 = product of:
      0.03338095 = sum of:
        0.020692015 = weight(_text_:data in 1236) [ClassicSimilarity], result of:
          0.020692015 = score(doc=1236,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.1397442 = fieldWeight in 1236, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=1236)
        0.012688936 = product of:
          0.025377871 = sum of:
            0.025377871 = weight(_text_:22 in 1236) [ClassicSimilarity], result of:
              0.025377871 = score(doc=1236,freq=2.0), product of:
                0.16398162 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046827413 = queryNorm
                0.15476047 = fieldWeight in 1236, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1236)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    Dublin Core is often presented as a modern form of catalog card -- a set of elements (and now qualifiers) that describe resources in a complete package. Sometimes it is proposed as an exchange format for sharing records among multiple collections. The founding principle that "every element is optional and repeatable" reinforces the notion that a Dublin Core description is to be taken as a whole. This paper, in contrast, is based on a much different premise: Dublin Core is a language. More precisely, it is a small language for making a particular class of statements about resources. Like natural languages, it has a vocabulary of word-like terms, the two classes of which -- elements and qualifiers -- function within statements like nouns and adjectives; and it has a syntax for arranging elements and qualifiers into statements according to a simple pattern. Whenever tourists order a meal or ask directions in an unfamiliar language, considerate native speakers will spontaneously limit themselves to basic words and simple sentence patterns along the lines of "I am so-and-so" or "This is such-and-such". Linguists call this pidginization. In such situations, a small phrase book or translated menu can be most helpful. By analogy, today's Web has been called an Internet Commons where users and information providers from a wide range of scientific, commercial, and social domains present their information in a variety of incompatible data models and description languages. In this context, Dublin Core presents itself as a metadata pidgin for digital tourists who must find their way in this linguistically diverse landscape. Its vocabulary is small enough to learn quickly, and its basic pattern is easily grasped. It is well-suited to serve as an auxiliary language for digital libraries. This grammar starts by defining terms. It then follows a 200-year-old tradition of English grammar teaching by focusing on the structure of single statements. It concludes by looking at the growing dictionary of Dublin Core vocabulary terms -- its registry, and at how statements can be used to build the metadata equivalent of paragraphs and compositions -- the application profile.
    Date
    26.12.2011 14:01:22
  6. Neumann, M.; Steinberg, J.; Schaer, P.: Web-ccraping for non-programmers : introducing OXPath for digital library metadata harvesting (2017) 0.02
    0.015839024 = product of:
      0.063356094 = sum of:
        0.063356094 = weight(_text_:data in 3895) [ClassicSimilarity], result of:
          0.063356094 = score(doc=3895,freq=12.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.4278775 = fieldWeight in 3895, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3895)
      0.25 = coord(1/4)
    
    Abstract
    Building up new collections for digital libraries is a demanding task. Available data sets have to be extracted which is usually done with the help of software developers as it involves custom data handlers or conversion scripts. In cases where the desired data is only available on the data provider's website custom web scrapers are needed. This may be the case for small to medium-size publishers, research institutes or funding agencies. As data curation is a typical task that is done by people with a library and information science background, these people are usually proficient with XML technologies but are not full-stack programmers. Therefore we would like to present a web scraping tool that does not demand the digital library curators to program custom web scrapers from scratch. We present the open-source tool OXPath, an extension of XPath, that allows the user to define data to be extracted from websites in a declarative way. By taking one of our own use cases as an example, we guide you in more detail through the process of creating an OXPath wrapper for metadata harvesting. We also point out some practical things to consider when creating a web scraper (with OXPath). On top of that, we also present a syntax highlighting plugin for the popular text editor Atom that we developed to further support OXPath users and to simplify the authoring process.
  7. Baca, M.; O'Keefe, E.: Sharing standards and expertise in the early 21st century : Moving toward a collaborative, "cross-community" model for metadata creation (2008) 0.02
    0.015519011 = product of:
      0.062076043 = sum of:
        0.062076043 = weight(_text_:data in 2321) [ClassicSimilarity], result of:
          0.062076043 = score(doc=2321,freq=8.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.4192326 = fieldWeight in 2321, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=2321)
      0.25 = coord(1/4)
    
    Abstract
    This paper provides a brief overview of the evolving descriptive metadata landscape, one phenomenon of which can be characterized as "cross-community" metadata as manifested in records that are the result of a combination of carefully considered data value and data content standards. he online catalog of the Morgan Library & Museum provides a real-life illustration of how diverse data content standards and vocabulary tools can be integrated within the classic data structure/technical interchange format of MARC21 to better describe unique, museum-type objects, and to provide better end-user access and understanding. The Morgan experience also shows the value of developing a collaborative model for metadata creation that combines the subject expertise of curators and scholars with the cataloging expertise and knowledge of standards possessed by librarians.
  8. What is Schema.org? (2011) 0.02
    0.015519011 = product of:
      0.062076043 = sum of:
        0.062076043 = weight(_text_:data in 4437) [ClassicSimilarity], result of:
          0.062076043 = score(doc=4437,freq=8.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.4192326 = fieldWeight in 4437, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=4437)
      0.25 = coord(1/4)
    
    Abstract
    This site provides a collection of schemas, i.e., html tags, that webmasters can use to markup their pages in ways recognized by major search providers. Search engines including Bing, Google and Yahoo! rely on this markup to improve the display of search results, making it easier for people to find the right web pages. Many sites are generated from structured data, which is often stored in databases. When this data is formatted into HTML, it becomes very difficult to recover the original structured data. Many applications, especially search engines, can benefit greatly from direct access to this structured data. On-page markup enables search engines to understand the information on web pages and provide richer search results in order to make it easier for users to find relevant information on the web. Markup can also enable new tools and applications that make use of the structure. A shared markup vocabulary makes easier for webmasters to decide on a markup schema and get the maximum benefit for their efforts. So, in the spirit of sitemaps.org, Bing, Google and Yahoo! have come together to provide a shared collection of schemas that webmasters can use.
  9. Hook, P.A.; Gantchev, A.: Using combined metadata sources to visualize a small library (OBL's English Language Books) (2017) 0.01
    0.014458986 = product of:
      0.057835944 = sum of:
        0.057835944 = weight(_text_:data in 3870) [ClassicSimilarity], result of:
          0.057835944 = score(doc=3870,freq=10.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.39059696 = fieldWeight in 3870, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3870)
      0.25 = coord(1/4)
    
    Abstract
    Data from multiple knowledge organization systems are combined to provide a global overview of the content holdings of a small personal library. Subject headings and classification data are used to effectively map the combined book and topic space of the library. While harvested and manipulated by hand, the work reveals issues and potential solutions when using automated techniques to produce topic maps of much larger libraries. The small library visualized consists of the thirty-nine, digital, English language books found in the Osama Bin Laden (OBL) compound in Abbottabad, Pakistan upon his death. As this list of books has garnered considerable media attention, it is worth providing a visual overview of the subject content of these books - some of which is not readily apparent from the titles. Metadata from subject headings and classification numbers was combined to create book-subject maps. Tree maps of the classification data were also produced. The books contain 328 subject headings. In order to enhance the base map with meaningful thematic overlay, library holding count data was also harvested (and aggregated from duplicates). This additional data revealed the relative scarcity or popularity of individual books.
  10. DC-2013: International Conference on Dublin Core and Metadata Applications : Online Proceedings (2013) 0.01
    0.013686483 = product of:
      0.05474593 = sum of:
        0.05474593 = weight(_text_:data in 1076) [ClassicSimilarity], result of:
          0.05474593 = score(doc=1076,freq=14.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.36972845 = fieldWeight in 1076, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=1076)
      0.25 = coord(1/4)
    
    Abstract
    The collocated conferences for DC-2013 and iPRES-2013 in Lisbon attracted 392 participants from over 37 countries. In addition to the Tuesday through Thursday conference days comprised of peer-reviewed paper and special sessions, 223 participants attended pre-conference tutorials and 246 participated in post-conference workshops for the collocated events. The peer-reviewed papers and presentations are available on the conference website Presentation page (URLs above). In sum, it was a great conference. In addition to links to PDFs of papers, project reports and posters (and their associated presentations), the published proceedings include presentation PDFs for the following: KEYNOTES Darling, we need to talk - Gildas Illien TUTORIALS -- Ivan Herman: "Introduction to Linked Open Data (LOD)" -- Steven Miller: "Introduction to Ontology Concepts and Terminology" -- Kai Eckert: "Metadata Provenance" -- Daniel Garjio: "The W3C Provenance Ontology" SPECIAL SESSIONS -- "Application Profiles as an Alternative to OWL Ontologies" -- "Long-term Preservation and Governance of RDF Vocabularies (W3C Sponsored)" -- "Data Enrichment and Transformation in the LOD Context: Poor & Popular vs Rich & Lonely--Can't we achieve both?" -- "Why Schema.org?"
    Content
    FULL PAPERS Provenance and Annotations for Linked Data - Kai Eckert How Portable Are the Metadata Standards for Scientific Data? A Proposal for a Metadata Infrastructure - Jian Qin, Kai Li Lessons Learned in Implementing the Extended Date/Time Format in a Large Digital Library - Hannah Tarver, Mark Phillips Towards the Representation of Chinese Traditional Music: A State of the Art Review of Music Metadata Standards - Mi Tian, György Fazekas, Dawn Black, Mark Sandler Maps and Gaps: Strategies for Vocabulary Design and Development - Diane Ileana Hillmann, Gordon Dunsire, Jon Phipps A Method for the Development of Dublin Core Application Profiles (Me4DCAP V0.1): Aescription - Mariana Curado Malta, Ana Alice Baptista Find and Combine Vocabularies to Design Metadata Application Profiles using Schema Registries and LOD Resources - Tsunagu Honma, Mitsuharu Nagamori, Shigeo Sugimoto Achieving Interoperability between the CARARE Schema for Monuments and Sites and the Europeana Data Model - Antoine Isaac, Valentine Charles, Kate Fernie, Costis Dallas, Dimitris Gavrilis, Stavros Angelis With a Focused Intent: Evolution of DCMI as a Research Community - Jihee Beak, Richard P. Smiraglia Metadata Capital in a Data Repository - Jane Greenberg, Shea Swauger, Elena Feinstein DC Metadata is Alive and Well - A New Standard for Education - Liddy Nevile Representation of the UNIMARC Bibliographic Data Format in Resource Description Framework - Gordon Dunsire, Mirna Willer, Predrag Perozic
  11. Edmunds, J.: Roadmap to nowhere : BIBFLOW, BIBFRAME, and linked data for libraries (2017) 0.01
    0.013439858 = product of:
      0.053759433 = sum of:
        0.053759433 = weight(_text_:data in 3523) [ClassicSimilarity], result of:
          0.053759433 = score(doc=3523,freq=6.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.3630661 = fieldWeight in 3523, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=3523)
      0.25 = coord(1/4)
    
    Abstract
    On December 12, 2016, Carl Stahmer and MacKenzie Smith presented at the CNI Members Fall Meeting about the BIBFLOW project, self-described on Twitter as "a two-year project of the UC Davis University Library and Zepheira investigating the future of library technical services." In her opening remarks, Ms. Smith, University Librarian at UC Davis, stated that one of the goals of the project was to devise a roadmap "to get from where we are today, which is kind of the 1970s with a little lipstick on it, to 2020, which is where we're going to be very soon." The notion that where libraries are today is somehow behind the times is one of the commonly heard rationales behind a move to linked data. Stated more precisely: - Libraries devote considerable time and resources to producing high-quality bibliographic metadata - This metadata is stored in unconnected silos - This metadata is in a format (MARC) that is incompatible with technologies of the emerging Semantic Web - The visibility of library metadata is diminished as a result of the two points above Are these assertions true? If yes, is linked data the solution?
  12. Suominen, O.; Hyvönen, N.: From MARC silos to Linked Data silos? (2017) 0.01
    0.013439858 = product of:
      0.053759433 = sum of:
        0.053759433 = weight(_text_:data in 3732) [ClassicSimilarity], result of:
          0.053759433 = score(doc=3732,freq=6.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.3630661 = fieldWeight in 3732, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=3732)
      0.25 = coord(1/4)
    
    Abstract
    Seit einiger Zeit stellen Bibliotheken ihre bibliografischen Metadadaten verstärkt offen in Form von Linked Data zur Verfügung. Dabei kommen jedoch ganz unterschiedliche Modelle für die Strukturierung der bibliografischen Daten zur Anwendung. Manche Bibliotheken verwenden ein auf FRBR basierendes Modell mit mehreren Schichten von Entitäten, während andere flache, am Datensatz orientierte Modelle nutzen. Der Wildwuchs bei den Datenmodellen erschwert die Nachnutzung der bibliografischen Daten. Im Ergebnis haben die Bibliotheken die früheren MARC-Silos nur mit zueinander inkompatiblen Linked-Data-Silos vertauscht. Deshalb ist es häufig schwierig, Datensets miteinander zu kombinieren und nachzunutzen. Kleinere Unterschiede in der Datenmodellierung lassen sich zwar durch Schema Mappings in den Griff bekommen, doch erscheint es fraglich, ob die Interoperabilität insgesamt zugenommen hat. Der Beitrag stellt die Ergebnisse einer Studie zu verschiedenen veröffentlichten Sets von bibliografischen Daten vor. Dabei werden auch die unterschiedlichen Modelle betrachtet, um bibliografische Daten als RDF darzustellen, sowie Werkzeuge zur Erzeugung von entsprechenden Daten aus dem MARC-Format. Abschließend wird der von der Finnischen Nationalbibliothek verfolgte Ansatz behandelt.
  13. Hodges, D.W.; Schlottmann, K.: better archival migration outcomes with Python and the Google Sheets API : Reporting from the archives (2019) 0.01
    0.01293251 = product of:
      0.05173004 = sum of:
        0.05173004 = weight(_text_:data in 5444) [ClassicSimilarity], result of:
          0.05173004 = score(doc=5444,freq=8.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.34936053 = fieldWeight in 5444, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5444)
      0.25 = coord(1/4)
    
    Abstract
    Columbia University Libraries recently embarked on a multi-phase project to migrate nearly 4,000 records describing over 70,000 linear feet of archival material from disparate sources and formats into ArchivesSpace. This paper discusses tools and methods brought to bear in Phase 2 of this project, which required us to look closely at how to integrate a large number of legacy finding aids into the new system and merge descriptive data that had diverged in myriad ways. Using Python, XSLT, and a widely available if underappreciated resource-the Google Sheets API-archival and technical library staff devised ways to efficiently report data from different sources, and present it in an accessible, user-friendly way,. Responses were then fed back into automated data remediation processes to keep the migration project on track and minimize manual intervention. The scripts and processes developed proved very effective, and moreover, show promise well beyond the ArchivesSpace migration. This paper describes the Python/XSLT/Sheets API processes developed and how they opened a path to move beyond CSV-based reporting with flexible, ad-hoc data interfaces easily adaptable to meet a variety of purposes.
  14. Bartczak, J.; Glendon, I.: Python, Google Sheets, and the Thesaurus for Graphic Materials for efficient metadata project workflows (2017) 0.01
    0.010973599 = product of:
      0.043894395 = sum of:
        0.043894395 = weight(_text_:data in 3893) [ClassicSimilarity], result of:
          0.043894395 = score(doc=3893,freq=4.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.29644224 = fieldWeight in 3893, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=3893)
      0.25 = coord(1/4)
    
    Abstract
    In 2017, the University of Virginia (U.Va.) will launch a two year initiative to celebrate the bicentennial anniversary of the University's founding in 1819. The U.Va. Library is participating in this event by digitizing some 20,000 photographs and negatives that document student life on the U.Va. grounds in the 1960s and 1970s. Metadata librarians and archivists are well-versed in the challenges associated with generating digital content and accompanying description within the context of limited resources. This paper describes how technology and new approaches to metadata design have enabled the University of Virginia's Metadata Analysis and Design Department to rapidly and successfully generate accurate description for these digital objects. Python's pandas module improves efficiency by cleaning and repurposing data recorded at digitization, while the lxml module builds MODS XML programmatically from CSV tables. A simplified technique for subject heading selection and assignment in Google Sheets provides a collaborative environment for streamlined metadata creation and data quality control.
  15. Miller, S.: Introduction to ontology concepts and terminology : DC-2013 Tutorial, September 2, 2013. (2013) 0.01
    0.0103460075 = product of:
      0.04138403 = sum of:
        0.04138403 = weight(_text_:data in 1075) [ClassicSimilarity], result of:
          0.04138403 = score(doc=1075,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.2794884 = fieldWeight in 1075, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0625 = fieldNorm(doc=1075)
      0.25 = coord(1/4)
    
    Content
    Tutorial topics and outline 1. Tutorial Background Overview The Semantic Web, Linked Data, and the Resource Description Framework 2. Ontology Basics and RDFS Tutorial Semantic modeling, domain ontologies, and RDF Vocabulary Description Language (RDFS) concepts and terminology Examples: domain ontologies, models, and schemas Exercises 3. OWL Overview Tutorial Web Ontology Language (OWL): selected concepts and terminology Exercises
  16. Broughton, V.: Automatic metadata generation : Digital resource description without human intervention (2007) 0.01
    0.009516701 = product of:
      0.038066804 = sum of:
        0.038066804 = product of:
          0.07613361 = sum of:
            0.07613361 = weight(_text_:22 in 6048) [ClassicSimilarity], result of:
              0.07613361 = score(doc=6048,freq=2.0), product of:
                0.16398162 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046827413 = queryNorm
                0.46428138 = fieldWeight in 6048, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6048)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    22. 9.2007 15:41:14
  17. Wallis, R.; Isaac, A.; Charles, V.; Manguinhas, H.: Recommendations for the application of Schema.org to aggregated cultural heritage metadata to increase relevance and visibility to search engines : the case of Europeana (2017) 0.01
    0.009144665 = product of:
      0.03657866 = sum of:
        0.03657866 = weight(_text_:data in 3372) [ClassicSimilarity], result of:
          0.03657866 = score(doc=3372,freq=4.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.24703519 = fieldWeight in 3372, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3372)
      0.25 = coord(1/4)
    
    Abstract
    Europeana provides access to more than 54 million cultural heritage objects through its portal Europeana Collections. It is crucial for Europeana to be recognized by search engines as a trusted authoritative repository of cultural heritage objects. Indeed, even though its portal is the main entry point, most Europeana users come to it via search engines. Europeana Collections is fuelled by metadata describing cultural objects, represented in the Europeana Data Model (EDM). This paper presents the research and consequent recommendations for publishing Europeana metadata using the Schema.org vocabulary and best practices. Schema.org html embedded metadata to be consumed by search engines to power rich services (such as Google Knowledge Graph). Schema.org is an open and widely adopted initiative (used by over 12 million domains) backed by Google, Bing, Yahoo!, and Yandex, for sharing metadata across the web It underpins the emergence of new web techniques, such as so called Semantic SEO. Our research addressed the representation of the embedded metadata as part of the Europeana HTML pages and sitemaps so that the re-use of this data can be optimized. The practical objective of our work is to produce a Schema.org representation of Europeana resources described in EDM, being the richest as possible and tailored to Europeana's realities and user needs as well the search engines and their users.
  18. Stevens, G.: New metadata recipes for old cookbooks : creating and analyzing a digital collection using the HathiTrust Research Center Portal (2017) 0.01
    0.009144665 = product of:
      0.03657866 = sum of:
        0.03657866 = weight(_text_:data in 3897) [ClassicSimilarity], result of:
          0.03657866 = score(doc=3897,freq=4.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.24703519 = fieldWeight in 3897, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3897)
      0.25 = coord(1/4)
    
    Abstract
    The Early American Cookbooks digital project is a case study in analyzing collections as data using HathiTrust and the HathiTrust Research Center (HTRC) Portal. The purposes of the project are to create a freely available, searchable collection of full-text early American cookbooks within the HathiTrust Digital Library, to offer an overview of the scope and contents of the collection, and to analyze trends and patterns in the metadata and the full text of the collection. The digital project has two basic components: a collection of 1450 full-text cookbooks published in the United States between 1800 and 1920 and a website to present a guide to the collection and the results of the analysis. This article will focus on the workflow for analyzing the metadata and the full-text of the collection. The workflow will cover: 1) creating a searchable public collection of full-text titles within the HathiTrust Digital Library and uploading it to the HTRC Portal, 2) analyzing and visualizing legacy MARC data for the collection using MarcEdit, OpenRefine and Tableau, and 3) using the text analysis tools in the HTRC Portal to look for trends and patterns in the full text of the collection.
  19. McClelland, M.; McArthur, D.; Giersch, S.; Geisler, G.: Challenges for service providers when importing metadata in digital libraries (2002) 0.01
    0.009052756 = product of:
      0.036211025 = sum of:
        0.036211025 = weight(_text_:data in 565) [ClassicSimilarity], result of:
          0.036211025 = score(doc=565,freq=2.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.24455236 = fieldWeight in 565, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=565)
      0.25 = coord(1/4)
    
    Abstract
    Much of the usefulness of digital libraries lies in their ability to provide services for data from distributed repositories, and many research projects are investigating frameworks for interoperability. In this paper, we report on the experiences and lessons learned by iLumina after importing IMS metadata. iLumina utilizes the IMS metadata specification, which allows for a rich set of metadata (Dublin Core has a simpler metadata scheme that can be mapped onto a subset of the IMS metadata). Our experiences identify questions regarding intellectual property rights for metadata, protocols for enriched metadata, and tips for designing metadata services.
  20. Heery, R.; Wagner, H.: ¬A metadata registry for the Semantic Web (2002) 0.01
    0.009052756 = product of:
      0.036211025 = sum of:
        0.036211025 = weight(_text_:data in 1210) [ClassicSimilarity], result of:
          0.036211025 = score(doc=1210,freq=8.0), product of:
            0.14807065 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046827413 = queryNorm
            0.24455236 = fieldWeight in 1210, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1210)
      0.25 = coord(1/4)
    
    Abstract
    The Semantic Web activity is a W3C project whose goal is to enable a 'cooperative' Web where machines and humans can exchange electronic content that has clear-cut, unambiguous meaning. This vision is based on the automated sharing of metadata terms across Web applications. The declaration of schemas in metadata registries advance this vision by providing a common approach for the discovery, understanding, and exchange of semantics. However, many of the issues regarding registries are not clear, and ideas vary regarding their scope and purpose. Additionally, registry issues are often difficult to describe and comprehend without a working example. This article will explore the role of metadata registries and will describe three prototypes, written by the Dublin Core Metadata Initiative. The article will outline how the prototypes are being used to demonstrate and evaluate application scope, functional requirements, and technology solutions for metadata registries. Metadata schema registries are, in effect, databases of schemas that can trace an historical line back to shared data dictionaries and the registration process encouraged by the ISO/IEC 11179 community. New impetus for the development of registries has come with the development activities surrounding creation of the Semantic Web. The motivation for establishing registries arises from domain and standardization communities, and from the knowledge management community. Examples of current registry activity include:
    * Agencies maintaining directories of data elements in a domain area in accordance with ISO/IEC 11179 (This standard specifies good practice for data element definition as well as the registration process. Example implementations are the National Health Information Knowledgebase hosted by the Australian Institute of Health and Welfare and the Environmental Data Registry hosted by the US Environmental Protection Agency.); * The xml.org directory of the Extended Markup Language (XML) document specifications facilitating re-use of Document Type Definition (DTD), hosted by the Organization for the Advancement of Structured Information Standards (OASIS); * The MetaForm database of Dublin Core usage and mappings maintained at the State and University Library in Goettingen; * The Semantic Web Agreement Group Dictionary, a database of terms for the Semantic Web that can be referred to by humans and software agents; * LEXML, a multi-lingual and multi-jurisdictional RDF Dictionary for the legal world; * The SCHEMAS registry maintained by the European Commission funded SCHEMAS project, which indexes several metadata element sets as well as a large number of activity reports describing metadata related activities and initiatives. Metadata registries essentially provide an index of terms. Given the distributed nature of the Web, there are a number of ways this can be accomplished. For example, the registry could link to terms and definitions in schemas published by implementers and stored locally by the schema maintainer. Alternatively, the registry might harvest various metadata schemas from their maintainers. Registries provide 'added value' to users by indexing schemas relevant to a particular 'domain' or 'community of use' and by simplifying the navigation of terms by enabling multiple schemas to be accessed from one view. An important benefit of this approach is an increase in the reuse of existing terms, rather than users having to reinvent them. Merging schemas to one view leads to harmonization between applications and helps avoid duplication of effort. Additionally, the establishment of registries to index terms actively being used in local implementations facilitates the metadata standards activity by providing implementation experience transferable to the standards-making process.