Search (25 results, page 1 of 2)

  • theme_ss:"Metadaten"
  • type_ss:"a"
  • type_ss:"el"
  1. Edmunds, J.: Roadmap to nowhere : BIBFLOW, BIBFRAME, and linked data for libraries (2017) 0.02
    0.017613193 = product of:
      0.079259366 = sum of:
        0.036990993 = weight(_text_:bibliographic in 3523) [ClassicSimilarity], result of:
          0.036990993 = score(doc=3523,freq=2.0), product of:
            0.14333439 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.036818076 = queryNorm
            0.2580748 = fieldWeight in 3523, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.046875 = fieldNorm(doc=3523)
        0.042268377 = weight(_text_:data in 3523) [ClassicSimilarity], result of:
          0.042268377 = score(doc=3523,freq=6.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.3630661 = fieldWeight in 3523, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=3523)
      0.22222222 = coord(2/9)
    
    Abstract
    On December 12, 2016, Carl Stahmer and MacKenzie Smith presented at the CNI Members Fall Meeting about the BIBFLOW project, self-described on Twitter as "a two-year project of the UC Davis University Library and Zepheira investigating the future of library technical services." In her opening remarks, Ms. Smith, University Librarian at UC Davis, stated that one of the goals of the project was to devise a roadmap "to get from where we are today, which is kind of the 1970s with a little lipstick on it, to 2020, which is where we're going to be very soon." The notion that where libraries are today is somehow behind the times is one of the commonly heard rationales behind a move to linked data. Stated more precisely: - Libraries devote considerable time and resources to producing high-quality bibliographic metadata - This metadata is stored in unconnected silos - This metadata is in a format (MARC) that is incompatible with technologies of the emerging Semantic Web - The visibility of library metadata is diminished as a result of the two points above Are these assertions true? If yes, is linked data the solution?
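    The indented breakdown above each hit is Lucene/Solr "explain" output for the classic TF-IDF similarity. As a rough sketch of how the figures fit together (assuming standard Lucene ClassicSimilarity semantics, nothing specific to this catalogue), the score of result 1 can be re-derived from the numbers shown for the term "bibliographic":
      import math

      # Figures copied from the score breakdown of result 1 above.
      idf        = 3.893044     # idf = 1 + ln(maxDocs / (docFreq + 1)) = 1 + ln(44218 / 2450)
      query_norm = 0.036818076  # normalisation factor shared by every term of the query
      field_norm = 0.046875     # length norm stored for the matched field
      freq       = 2.0          # "bibliographic" occurs twice in the field

      tf           = math.sqrt(freq)              # 1.4142135
      query_weight = idf * query_norm             # 0.14333439
      field_weight = tf * idf * field_norm        # 0.2580748
      term_score   = query_weight * field_weight  # 0.036990993

      # Document score = sum of the term scores, scaled by the coordination factor
      # (2 of the 9 query terms matched, hence coord = 2/9 = 0.22222222).
      doc_score = (term_score + 0.042268377) * (2 / 9)
      print(doc_score)  # ~0.0176, matching the 0.017613193 reported above up to rounding of the printed inputs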
  2. Godby, C.J.; Young, J.A.; Childress, E.: ¬A repository of metadata crosswalks (2004) 0.02
    0.016889969 = product of:
      0.15200971 = sum of:
        0.15200971 = weight(_text_:readable in 1155) [ClassicSimilarity], result of:
          0.15200971 = score(doc=1155,freq=4.0), product of:
            0.2262076 = queryWeight, product of:
              6.1439276 = idf(docFreq=257, maxDocs=44218)
              0.036818076 = queryNorm
            0.67199206 = fieldWeight in 1155, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              6.1439276 = idf(docFreq=257, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1155)
      0.11111111 = coord(1/9)
    
    Abstract
    This paper proposes a model for metadata crosswalks that associates three pieces of information: the crosswalk, the source metadata standard, and the target metadata standard, each of which may have a machine-readable encoding and human-readable description. The crosswalks are encoded as METS records that are made available to a repository for processing by search engines, OAI harvesters, and custom-designed Web services. The METS object brings together all of the information required to access and interpret crosswalks and represents a significant improvement over previously available formats. But it raises questions about how best to describe these complex objects and exposes gaps that must eventually be filled in by the digital library community.
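    The three-way association described here (a crosswalk plus its source and target standards, each with a machine-readable encoding and a human-readable description) can be sketched as a plain data structure. This is an illustrative Python model only, not the METS encoding the paper actually defines:
      from dataclasses import dataclass
      from typing import Optional

      @dataclass
      class MetadataStandard:
          name: str                           # e.g. "MARC21", "Dublin Core"
          description: str                    # human-readable description
          schema_url: Optional[str] = None    # machine-readable encoding, if any

      @dataclass
      class Crosswalk:
          source: MetadataStandard
          target: MetadataStandard
          description: str                    # human-readable notes on the mapping
          encoding_url: Optional[str] = None  # e.g. an XSLT implementing the mapping

      marc_to_dc = Crosswalk(
          source=MetadataStandard("MARC21", "Machine-Readable Cataloging format"),
          target=MetadataStandard("Dublin Core", "Fifteen-element descriptive set"),
          description="Maps MARC bibliographic fields to unqualified Dublin Core",
      )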
  3. Miller, E.: ¬An introduction to the Resource Description Framework (1998) 0.01
    0.011943011 = product of:
      0.1074871 = sum of:
        0.1074871 = weight(_text_:readable in 1231) [ClassicSimilarity], result of:
          0.1074871 = score(doc=1231,freq=2.0), product of:
            0.2262076 = queryWeight, product of:
              6.1439276 = idf(docFreq=257, maxDocs=44218)
              0.036818076 = queryNorm
            0.47517014 = fieldWeight in 1231, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.1439276 = idf(docFreq=257, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1231)
      0.11111111 = coord(1/9)
    
    Abstract
    The Resource Description Framework (RDF) is an infrastructure that enables the encoding, exchange and reuse of structured metadata. RDF is an application of XML that imposes needed structural constraints to provide unambiguous methods of expressing semantics. RDF additionally provides a means for publishing both human-readable and machine-processable vocabularies designed to encourage the reuse and extension of metadata semantics among disparate information communities. The structural constraints RDF imposes to support the consistent encoding and exchange of standardized metadata provide for the interchangeability of separate packages of metadata defined by different resource description communities.
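    As a minimal illustration of the idea, structured statements about a resource can be written as subject-predicate-object triples and serialised for machine processing. The sketch below uses the Python rdflib library (an assumption of this example; the article itself predates it) and a hypothetical resource URI:
      from rdflib import Graph, Literal, URIRef
      from rdflib.namespace import DC

      g = Graph()
      doc = URIRef("http://example.org/docs/rdf-intro")  # hypothetical resource

      # Three metadata statements (triples) about the resource.
      g.add((doc, DC.title, Literal("An introduction to the Resource Description Framework")))
      g.add((doc, DC.creator, Literal("Miller, E.")))
      g.add((doc, DC.date, Literal("1998")))

      print(g.serialize(format="xml"))  # RDF/XML, one of several interchange syntaxes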
  4. Bearman, D.; Miller, E.; Rust, G.; Trant, J.; Weibel, S.: ¬A common model to support interoperable metadata : progress report on reconciling metadata requirements from the Dublin Core and INDECS/DOI communities (1999) 0.01
    0.011369381 = product of:
      0.051162213 = sum of:
        0.03082583 = weight(_text_:bibliographic in 1249) [ClassicSimilarity], result of:
          0.03082583 = score(doc=1249,freq=2.0), product of:
            0.14333439 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.036818076 = queryNorm
            0.21506234 = fieldWeight in 1249, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1249)
        0.020336384 = weight(_text_:data in 1249) [ClassicSimilarity], result of:
          0.020336384 = score(doc=1249,freq=2.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.17468026 = fieldWeight in 1249, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1249)
      0.22222222 = coord(2/9)
    
    Abstract
    The Dublin Core metadata community and the INDECS/DOI community of authors, rights holders, and publishers are seeking common ground in the expression of metadata for information resources. Recent meetings at the 6th Dublin Core Workshop in Washington DC sketched out common models for semantics (informed by the requirements articulated in the IFLA Functional Requirements for Bibliographic Records) and conventions for knowledge representation (based on the Resource Description Framework under development by the W3C). Further development of detailed requirements is planned by both communities in the coming months with the aim of fully representing the metadata needs of each. An open "Schema Harmonization" working group has been established to identify a common framework to support interoperability among these communities. The present document represents a starting point identifying historical developments and common requirements of these perspectives on metadata and charts a path for harmonizing their respective conceptual models. It is hoped that collaboration over the coming year will result in agreed semantic and syntactic conventions that will support a high degree of interoperability among these communities, ideally expressed in a single data model and using common, standard tools.
  5. Sewing, S.: Bestandserhaltung und Archivierung : Koordinierung auf der Basis eines gemeinsamen Metadatenformates in den deutschen und österreichischen Bibliotheksverbünden (2021) 0.01
    0.008748596 = product of:
      0.03936868 = sum of:
        0.024403658 = weight(_text_:data in 266) [ClassicSimilarity], result of:
          0.024403658 = score(doc=266,freq=2.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.2096163 = fieldWeight in 266, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=266)
        0.014965023 = product of:
          0.029930046 = sum of:
            0.029930046 = weight(_text_:22 in 266) [ClassicSimilarity], result of:
              0.029930046 = score(doc=266,freq=2.0), product of:
                0.12893063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.036818076 = queryNorm
                0.23214069 = fieldWeight in 266, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=266)
          0.5 = coord(1/2)
      0.22222222 = coord(2/9)
    
    Date
    22. 5.2021 12:43:05
    Source
    Open Password. 2021, Nr.928 vom 31.05.2021 [https://www.password-online.de/?mailpoet_router&endpoint=view_in_browser&action=view&data=WzI5OSwiMjc2N2ZlZjQwMDUwIiwwLDAsMjY4LDFd]
  6. Husevag, A.-S.R.: Named entities in indexing : a case study of TV subtitles and metadata records (2016) 0.01
    0.008037162 = product of:
      0.07233446 = sum of:
        0.07233446 = weight(_text_:germany in 3105) [ClassicSimilarity], result of:
          0.07233446 = score(doc=3105,freq=2.0), product of:
            0.21956629 = queryWeight, product of:
              5.963546 = idf(docFreq=308, maxDocs=44218)
              0.036818076 = queryNorm
            0.32944247 = fieldWeight in 3105, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.963546 = idf(docFreq=308, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3105)
      0.11111111 = coord(1/9)
    
    Source
    Proceedings of the 15th European Networked Knowledge Organization Systems Workshop (NKOS 2016) co-located with the 20th International Conference on Theory and Practice of Digital Libraries 2016 (TPDL 2016), Hannover, Germany, September 9, 2016. Ed. by Philipp Mayr et al. [http://ceur-ws.org/Vol-1676/=urn:nbn:de:0074-1676-5]
  7. Daniel Jr., R.; Lagoze, C.: Extending the Warwick framework : from metadata containers to active digital objects (1997) 0.01
    0.0072483453 = product of:
      0.06523511 = sum of:
        0.06523511 = weight(_text_:data in 1264) [ClassicSimilarity], result of:
          0.06523511 = score(doc=1264,freq=42.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.56033987 = fieldWeight in 1264, product of:
              6.4807405 = tf(freq=42.0), with freq of:
                42.0 = termFreq=42.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1264)
      0.11111111 = coord(1/9)
    
    Abstract
    Defining metadata as "data about data" provokes more questions than it answers. What are the forms of the data and metadata? Can we be more specific about the manner in which the metadata is "about" the data? Are data and metadata distinguished only in the context of their relationship? Is the nature of the relationship between the datasets declarative or procedural? Can the metadata itself be described by other data? Over the past several years, we have been engaged in a number of efforts examining the role, format, composition, and architecture of metadata for networked resources. During this time, we have noticed the tendency to be led astray by comfortable, but somewhat inappropriate, models in the non-digital information environment. Rather than pursuing familiar models, there is the need for a new model that fully exploits the unique combination of computation and connectivity that characterizes the digital library. In this paper, we describe an extension of the Warwick Framework that we call Distributed Active Relationships (DARs). DARs provide a powerful model for representing data and metadata in digital library objects. They explicitly express the relationships between networked resources, and even allow those relationships to be dynamically downloadable and executable. The DAR model is based on the following principles, which our examination of the "data about data" definition has led us to regard as axiomatic: * There is no essential distinction between data and metadata. We can only make such a distinction in terms of a particular "about" relationship. As a result, what is metadata in the context of one "about" relationship may be data in another. * There is no single "about" relationship. There are many different and important relationships between data resources. * Resources can be related without regard for their location. The connectivity in networked information architectures makes it possible to have data in one repository describe data in another repository. * The computational power of the networked information environment makes it possible to consider active or dynamic relationships between data sets. This adds considerable power to the "data about data" definition. First, data about another data set may not physically exist, but may be automatically derived. Second, the "about" relationship may be an executable object -- in a sense interpretable metadata. As will be shown, this provides useful mechanisms for handling complex metadata problems such as rights management of digital objects. The remainder of this paper describes the development and consequences of the DAR model. Section 2 reviews the Warwick Framework, which is the basis for the model described in this paper. Section 3 examines the concept of the Warwick Framework Catalog, which provides a mechanism for expressing the relationships between the packages in a Warwick Framework container. With that background established, section 4 generalizes the Warwick Framework by removing the restriction that it only contains "metadata". This allows us to consider digital library objects that are aggregations of (possibly distributed) data sets, with the relationships between the data sets expressed using a Warwick Framework Catalog. Section 5 further extends the model by describing Distributed Active Relationships (DARs). DARs are the explicit relationships that have the potential to be executable, as alluded to earlier. Finally, section 6 describes two possible implementations of these concepts.
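    The central claim - that "metadata" exists only relative to an explicit "about" relationship, and that such relationships may themselves be computed - can be sketched with a small data model. The names below are illustrative and are not the DAR syntax defined in the paper:
      from dataclasses import dataclass
      from typing import Callable, Union

      @dataclass
      class Resource:
          uri: str
          content: dict

      @dataclass
      class Relationship:
          kind: str                                  # e.g. "describes", "rights-for"
          source: Resource
          # The target may be stored data or a callable that derives it on demand,
          # mirroring the "active" relationships of the DAR model.
          target: Union[Resource, Callable[[Resource], Resource]]

      def resolve(rel: Relationship) -> Resource:
          return rel.target(rel.source) if callable(rel.target) else rel.target

      # Derived metadata: a description that does not exist until it is computed.
      report = Relationship(
          kind="describes",
          source=Resource("urn:example:dataset-1", {"rows": 1200}),
          target=lambda r: Resource(r.uri + "#summary", {"row_count": r.content["rows"]}),
      )
      print(resolve(report).content)  # {'row_count': 1200}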
  8. Baker, T.: Languages for Dublin Core (1998) 0.01
    0.0059715053 = product of:
      0.05374355 = sum of:
        0.05374355 = weight(_text_:readable in 1257) [ClassicSimilarity], result of:
          0.05374355 = score(doc=1257,freq=2.0), product of:
            0.2262076 = queryWeight, product of:
              6.1439276 = idf(docFreq=257, maxDocs=44218)
              0.036818076 = queryNorm
            0.23758507 = fieldWeight in 1257, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.1439276 = idf(docFreq=257, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1257)
      0.11111111 = coord(1/9)
    
    Abstract
    Over the past three years, the Dublin Core Metadata Initiative has achieved a broad international consensus on the semantics of a simple element set for describing electronic resources. Since the first workshop in March 1995, which was reported in the very first issue of D-Lib Magazine, Dublin Core has been the topic of perhaps a dozen articles here. Originally intended to be simple and intuitive enough for authors to tag Web pages without special training, Dublin Core is being adapted now for more specialized uses, from government information and legal deposit to museum informatics and electronic commerce. To meet such specialized requirements, Dublin Core can be customized with additional elements or qualifiers. However, these refinements can compromise interoperability across applications. There are tradeoffs between using specific terms that precisely meet local needs versus general terms that are understood more widely. We can better understand this inevitable tension between simplicity and complexity if we recognize that metadata is a form of human language. With Dublin Core, as with a natural language, people are inclined to stretch definitions, make general terms more specific, specific terms more general, misunderstand intended meanings, and coin new terms. One goal of this paper, therefore, will be to examine the experience of some related ways to seek semantic interoperability through simplicity: planned languages, interlingua constructs, and pidgins. The problem of semantic interoperability is compounded when we consider Dublin Core in translation. All of the workshops, documents, mailing lists, user guides, and working group outputs of the Dublin Core Initiative have been in English. But in many countries and for many applications, people need a metadata standard in their own language. In principle, the broad elements of Dublin Core can be defined equally well in Bulgarian or Hindi. Since Dublin Core is a controlled standard, however, any parallel definitions need to be kept in sync as the standard evolves. Another goal of the paper, then, will be to define the conceptual and organizational problem of maintaining a metadata standard in multiple languages. In addition to a name and definition, which are meant for human consumption, each Dublin Core element has a label, or indexing token, meant for harvesting by search engines. For practical reasons, these machine-readable tokens are English-looking strings such as Creator and Subject (just as HTML tags are called HEAD, BODY, or TITLE). These tokens, which are shared by Dublin Cores in every language, ensure that metadata fields created in any particular language are indexed together across repositories. As symbols of underlying universal semantics, these tokens form the basis of semantic interoperability among the multiple Dublin Cores. As long as we limit ourselves to sharing these indexing tokens among exact translations of a simple set of fifteen broad elements, the definitions of which fit easily onto two pages, the problem of Dublin Core in multiple languages is straightforward. But nothing having to do with human language is ever so simple. Just as speakers of various languages must learn the language of Dublin Core in their own tongues, we must find the right words to talk about a metadata language that is expressible in many discipline-specific jargons and natural languages and that inevitably will evolve and change over time.
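    The role of the shared indexing tokens can be illustrated with a small mapping: labels and definitions differ per language, but every translation points at the same token, so fields created in different languages index together. The translations below are purely illustrative:
      # Shared machine-readable tokens with per-language, human-readable labels.
      DC_LABELS = {
          "Creator": {"en": "Creator", "de": "Urheber", "fr": "Créateur"},
          "Subject": {"en": "Subject", "de": "Thema",   "fr": "Sujet"},
          "Title":   {"en": "Title",   "de": "Titel",   "fr": "Titre"},
      }

      def token_for(label: str) -> str:
          """Map a localised label back to its shared indexing token."""
          for token, labels in DC_LABELS.items():
              if label in labels.values():
                  return token
          raise KeyError(label)

      # Records tagged "Urheber" and records tagged "Creator" index together.
      print(token_for("Urheber"))  # -> "Creator"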
  9. Baker, T.: ¬A grammar of Dublin Core (2000) 0.01
    0.0058323974 = product of:
      0.026245788 = sum of:
        0.016269106 = weight(_text_:data in 1236) [ClassicSimilarity], result of:
          0.016269106 = score(doc=1236,freq=2.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.1397442 = fieldWeight in 1236, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=1236)
        0.009976682 = product of:
          0.019953365 = sum of:
            0.019953365 = weight(_text_:22 in 1236) [ClassicSimilarity], result of:
              0.019953365 = score(doc=1236,freq=2.0), product of:
                0.12893063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.036818076 = queryNorm
                0.15476047 = fieldWeight in 1236, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1236)
          0.5 = coord(1/2)
      0.22222222 = coord(2/9)
    
    Abstract
    Dublin Core is often presented as a modern form of catalog card -- a set of elements (and now qualifiers) that describe resources in a complete package. Sometimes it is proposed as an exchange format for sharing records among multiple collections. The founding principle that "every element is optional and repeatable" reinforces the notion that a Dublin Core description is to be taken as a whole. This paper, in contrast, is based on a much different premise: Dublin Core is a language. More precisely, it is a small language for making a particular class of statements about resources. Like natural languages, it has a vocabulary of word-like terms, the two classes of which -- elements and qualifiers -- function within statements like nouns and adjectives; and it has a syntax for arranging elements and qualifiers into statements according to a simple pattern. Whenever tourists order a meal or ask directions in an unfamiliar language, considerate native speakers will spontaneously limit themselves to basic words and simple sentence patterns along the lines of "I am so-and-so" or "This is such-and-such". Linguists call this pidginization. In such situations, a small phrase book or translated menu can be most helpful. By analogy, today's Web has been called an Internet Commons where users and information providers from a wide range of scientific, commercial, and social domains present their information in a variety of incompatible data models and description languages. In this context, Dublin Core presents itself as a metadata pidgin for digital tourists who must find their way in this linguistically diverse landscape. Its vocabulary is small enough to learn quickly, and its basic pattern is easily grasped. It is well-suited to serve as an auxiliary language for digital libraries. This grammar starts by defining terms. It then follows a 200-year-old tradition of English grammar teaching by focusing on the structure of single statements. It concludes by looking at the growing dictionary of Dublin Core vocabulary terms -- its registry, and at how statements can be used to build the metadata equivalent of paragraphs and compositions -- the application profile.
    Date
    26.12.2011 14:01:22
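    Read as a grammar, a Dublin Core description is simply a set of statements of the pattern resource - element ("noun") - qualifier ("adjective") - value. A minimal sketch of that pattern (the field names and values are illustrative):
      from dataclasses import dataclass
      from typing import Optional

      @dataclass
      class Statement:
          resource: str                     # what the statement is about
          element: str                      # the "noun", e.g. Title, Date, Subject
          value: str
          qualifier: Optional[str] = None   # the "adjective" refining the element

      description = [
          Statement("http://example.org/paper", "Title", "A grammar of Dublin Core"),
          Statement("http://example.org/paper", "Date", "2000-10", qualifier="Issued"),
          Statement("http://example.org/paper", "Subject", "Metadata", qualifier="LCSH"),
      ]

      # "Every element is optional and repeatable": a description is whatever
      # statements happen to have been made about the resource.
      for s in description:
          print(s.element, f"({s.qualifier})" if s.qualifier else "", "=", s.value)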
  10. Neumann, M.; Steinberg, J.; Schaer, P.: Web scraping for non-programmers : introducing OXPath for digital library metadata harvesting (2017) 0.01
    0.005534862 = product of:
      0.04981376 = sum of:
        0.04981376 = weight(_text_:data in 3895) [ClassicSimilarity], result of:
          0.04981376 = score(doc=3895,freq=12.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.4278775 = fieldWeight in 3895, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3895)
      0.11111111 = coord(1/9)
    
    Abstract
    Building up new collections for digital libraries is a demanding task. Available data sets have to be extracted which is usually done with the help of software developers as it involves custom data handlers or conversion scripts. In cases where the desired data is only available on the data provider's website custom web scrapers are needed. This may be the case for small to medium-size publishers, research institutes or funding agencies. As data curation is a typical task that is done by people with a library and information science background, these people are usually proficient with XML technologies but are not full-stack programmers. Therefore we would like to present a web scraping tool that does not demand the digital library curators to program custom web scrapers from scratch. We present the open-source tool OXPath, an extension of XPath, that allows the user to define data to be extracted from websites in a declarative way. By taking one of our own use cases as an example, we guide you in more detail through the process of creating an OXPath wrapper for metadata harvesting. We also point out some practical things to consider when creating a web scraper (with OXPath). On top of that, we also present a syntax highlighting plugin for the popular text editor Atom that we developed to further support OXPath users and to simplify the authoring process.
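    OXPath extends XPath with interaction and extraction markers; as a rough stand-in for the declarative idea - point an expression at a page and pull out fields - the sketch below uses plain XPath via requests and lxml. The URL and path expressions are placeholders, not the wrapper built in the paper:
      import requests
      from lxml import html

      URL = "https://example.org/publications"   # placeholder listing page

      page = html.fromstring(requests.get(URL, timeout=30).text)

      # Declarative extraction: each record is located by an XPath expression,
      # and the wanted fields are relative paths inside it.
      records = []
      for entry in page.xpath("//div[@class='publication']"):
          records.append({
              "title":   "".join(entry.xpath(".//h2/text()")).strip(),
              "authors": [a.strip() for a in entry.xpath(".//span[@class='author']/text()")],
              "year":    "".join(entry.xpath(".//span[@class='year']/text()")).strip(),
          })

      print(records)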
  11. Hook, P.A.; Gantchev, A.: Using combined metadata sources to visualize a small library (OBL's English Language Books) (2017) 0.01
    0.0050526154 = product of:
      0.04547354 = sum of:
        0.04547354 = weight(_text_:data in 3870) [ClassicSimilarity], result of:
          0.04547354 = score(doc=3870,freq=10.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.39059696 = fieldWeight in 3870, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3870)
      0.11111111 = coord(1/9)
    
    Abstract
    Data from multiple knowledge organization systems are combined to provide a global overview of the content holdings of a small personal library. Subject headings and classification data are used to effectively map the combined book and topic space of the library. While harvested and manipulated by hand, the work reveals issues and potential solutions when using automated techniques to produce topic maps of much larger libraries. The small library visualized consists of the thirty-nine digital, English-language books found in the Osama Bin Laden (OBL) compound in Abbottabad, Pakistan upon his death. As this list of books has garnered considerable media attention, it is worth providing a visual overview of the subject content of these books - some of which is not readily apparent from the titles. Metadata from subject headings and classification numbers was combined to create book-subject maps. Tree maps of the classification data were also produced. The books contain 328 subject headings. In order to enhance the base map with meaningful thematic overlay, library holding count data was also harvested (and aggregated from duplicates). This additional data revealed the relative scarcity or popularity of individual books.
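    A hedged sketch of the kind of aggregation described (tallying subject headings and attaching holding counts before mapping); the titles, headings and counts below are invented placeholders, not data from the OBL collection:
      from collections import Counter

      # (title, subject headings, aggregated holding count) - invented examples
      books = [
          ("Book A", ["World politics", "History"],       1800),
          ("Book B", ["World politics", "Intelligence"],  2300),
          ("Book C", ["Guerrilla warfare"],                 40),
      ]

      heading_counts = Counter(h for _, headings, _ in books for h in headings)
      print(heading_counts.most_common(3))          # basis for the book-subject map

      # Thematic overlay: relative scarcity or popularity of each book.
      for title, _, holdings in sorted(books, key=lambda b: b[2]):
          print(f"{holdings:>6}  {title}")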
  12. Suominen, O.; Hyvönen, N.: From MARC silos to Linked Data silos? (2017) 0.00
    0.0046964865 = product of:
      0.042268377 = sum of:
        0.042268377 = weight(_text_:data in 3732) [ClassicSimilarity], result of:
          0.042268377 = score(doc=3732,freq=6.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.3630661 = fieldWeight in 3732, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=3732)
      0.11111111 = coord(1/9)
    
    Abstract
    For some time now, libraries have increasingly been publishing their bibliographic metadata openly as Linked Data. However, quite different models are used to structure this bibliographic data. Some libraries use an FRBR-based model with several layers of entities, while others use flat, record-oriented models. This proliferation of data models makes reuse of the bibliographic data difficult. In effect, libraries have merely traded their earlier MARC silos for mutually incompatible Linked Data silos, so it is often hard to combine and reuse data sets. Minor differences in data modelling can be handled with schema mappings, but it is questionable whether interoperability has improved overall. This paper presents the results of a study of several published sets of bibliographic data. It also examines the different models for representing bibliographic data as RDF, as well as tools for generating such data from the MARC format. Finally, it discusses the approach taken by the National Library of Finland.
  13. Hodges, D.W.; Schlottmann, K.: Better archival migration outcomes with Python and the Google Sheets API : Reporting from the archives (2019) 0.00
    0.0045191967 = product of:
      0.040672768 = sum of:
        0.040672768 = weight(_text_:data in 5444) [ClassicSimilarity], result of:
          0.040672768 = score(doc=5444,freq=8.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.34936053 = fieldWeight in 5444, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5444)
      0.11111111 = coord(1/9)
    
    Abstract
    Columbia University Libraries recently embarked on a multi-phase project to migrate nearly 4,000 records describing over 70,000 linear feet of archival material from disparate sources and formats into ArchivesSpace. This paper discusses tools and methods brought to bear in Phase 2 of this project, which required us to look closely at how to integrate a large number of legacy finding aids into the new system and merge descriptive data that had diverged in myriad ways. Using Python, XSLT, and a widely available if underappreciated resource, the Google Sheets API, archival and technical library staff devised ways to efficiently report data from different sources and present it in an accessible, user-friendly way. Responses were then fed back into automated data remediation processes to keep the migration project on track and minimize manual intervention. The scripts and processes developed proved very effective and, moreover, show promise well beyond the ArchivesSpace migration. This paper describes the Python/XSLT/Sheets API processes developed and how they opened a path to move beyond CSV-based reporting with flexible, ad-hoc data interfaces easily adaptable to meet a variety of purposes.
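    A minimal sketch of the reporting pattern described (pushing rows gathered by a script into a shared spreadsheet through the Sheets API). The spreadsheet ID, range, credentials file and row contents are placeholders, and this is not the authors' actual code:
      from google.oauth2.service_account import Credentials
      from googleapiclient.discovery import build

      SPREADSHEET_ID = "YOUR-SPREADSHEET-ID"   # placeholder
      creds = Credentials.from_service_account_file(
          "service-account.json",
          scopes=["https://www.googleapis.com/auth/spreadsheets"],
      )
      sheets = build("sheets", "v4", credentials=creds)

      # Rows produced by some reporting step, e.g. validation results per finding aid.
      rows = [
          ["finding_aid_id", "issue", "status"],
          ["aid-0001", "missing <unitdate>", "needs review"],
          ["aid-0002", "ok", "migrated"],
      ]

      sheets.spreadsheets().values().update(
          spreadsheetId=SPREADSHEET_ID,
          range="Report!A1",
          valueInputOption="RAW",
          body={"values": rows},
      ).execute()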
  14. Suranofsky, M.; McColl, L.: ¬A Google Sheets add-on that uses the WorldCat Search API : MatchMarc (2019) 0.00
    0.0041101105 = product of:
      0.036990993 = sum of:
        0.036990993 = weight(_text_:bibliographic in 5442) [ClassicSimilarity], result of:
          0.036990993 = score(doc=5442,freq=2.0), product of:
            0.14333439 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.036818076 = queryNorm
            0.2580748 = fieldWeight in 5442, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.046875 = fieldNorm(doc=5442)
      0.11111111 = coord(1/9)
    
    Abstract
    Lehigh University Libraries has developed a new tool for querying WorldCat using the WorldCat Search API. The tool is a Google Sheet Add-on and is available now via the Google Sheets Add-ons menu under the name "MatchMarc." The add-on is easily customizable, with no knowledge of coding needed. The tool will return a single "best" OCLC record number, and its bibliographic information for a given ISBN or LCCN, allowing the user to set up and define "best." Because all of the information, the input, the criteria, and the results exist in the Google Sheets environment, efficient workflows can be developed from this flexible starting point. This article will discuss the development of the add-on, how it works, and future plans for development.
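    A hedged sketch of the lookup pattern (query an ISBN and keep the record WorldCat ranks best). The endpoint URL is an assumption based on the classic WorldCat Search API and should be checked against current OCLC documentation; the key is a placeholder:
      import requests

      WSKEY = "YOUR-OCLC-WSKEY"  # placeholder API key
      # Assumed classic WorldCat Search API endpoint; verify before use.
      ENDPOINT = "http://www.worldcat.org/webservices/catalog/content/isbn/{isbn}"

      def best_record_xml(isbn: str) -> str:
          """Return the MARC XML WorldCat serves as the top match for an ISBN."""
          resp = requests.get(ENDPOINT.format(isbn=isbn),
                              params={"wskey": WSKEY}, timeout=30)
          resp.raise_for_status()
          return resp.text

      print(best_record_xml("9780134685991")[:200])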
  15. Bartczak, J.; Glendon, I.: Python, Google Sheets, and the Thesaurus for Graphic Materials for efficient metadata project workflows (2017) 0.00
    0.0038346653 = product of:
      0.034511987 = sum of:
        0.034511987 = weight(_text_:data in 3893) [ClassicSimilarity], result of:
          0.034511987 = score(doc=3893,freq=4.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.29644224 = fieldWeight in 3893, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=3893)
      0.11111111 = coord(1/9)
    
    Abstract
    In 2017, the University of Virginia (U.Va.) will launch a two year initiative to celebrate the bicentennial anniversary of the University's founding in 1819. The U.Va. Library is participating in this event by digitizing some 20,000 photographs and negatives that document student life on the U.Va. grounds in the 1960s and 1970s. Metadata librarians and archivists are well-versed in the challenges associated with generating digital content and accompanying description within the context of limited resources. This paper describes how technology and new approaches to metadata design have enabled the University of Virginia's Metadata Analysis and Design Department to rapidly and successfully generate accurate description for these digital objects. Python's pandas module improves efficiency by cleaning and repurposing data recorded at digitization, while the lxml module builds MODS XML programmatically from CSV tables. A simplified technique for subject heading selection and assignment in Google Sheets provides a collaborative environment for streamlined metadata creation and data quality control.
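    A condensed sketch of the pipeline described - pandas to tidy the data recorded at digitization, lxml to emit MODS XML. Column names and values are invented for illustration:
      import pandas as pd
      from lxml import etree

      MODS_NS = "http://www.loc.gov/mods/v3"

      # Invented stand-in for the spreadsheet produced at digitization time.
      df = pd.DataFrame({
          "filename": ["uva_0001.tif", "uva_0002.tif"],
          "title":    [" Students on the Lawn ", "Commencement, 1969"],
          "date":     ["1968", "1969"],
      })
      df["title"] = df["title"].str.strip()   # pandas clean-up step

      def to_mods(row):
          mods = etree.Element("{%s}mods" % MODS_NS, nsmap={None: MODS_NS})
          title_info = etree.SubElement(mods, "{%s}titleInfo" % MODS_NS)
          etree.SubElement(title_info, "{%s}title" % MODS_NS).text = row.title
          origin = etree.SubElement(mods, "{%s}originInfo" % MODS_NS)
          etree.SubElement(origin, "{%s}dateCreated" % MODS_NS).text = row.date
          return mods

      for row in df.itertuples():
          print(etree.tostring(to_mods(row), pretty_print=True).decode())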
  16. Wallis, R.; Isaac, A.; Charles, V.; Manguinhas, H.: Recommendations for the application of Schema.org to aggregated cultural heritage metadata to increase relevance and visibility to search engines : the case of Europeana (2017) 0.00
    0.0031955543 = product of:
      0.028759988 = sum of:
        0.028759988 = weight(_text_:data in 3372) [ClassicSimilarity], result of:
          0.028759988 = score(doc=3372,freq=4.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.24703519 = fieldWeight in 3372, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3372)
      0.11111111 = coord(1/9)
    
    Abstract
    Europeana provides access to more than 54 million cultural heritage objects through its portal Europeana Collections. It is crucial for Europeana to be recognized by search engines as a trusted authoritative repository of cultural heritage objects. Indeed, even though its portal is the main entry point, most Europeana users come to it via search engines. Europeana Collections is fuelled by metadata describing cultural objects, represented in the Europeana Data Model (EDM). This paper presents the research and consequent recommendations for publishing Europeana metadata using the Schema.org vocabulary and best practices. Schema.org metadata is embedded in HTML so that it can be consumed by search engines to power rich services (such as the Google Knowledge Graph). Schema.org is an open and widely adopted initiative (used by over 12 million domains) backed by Google, Bing, Yahoo!, and Yandex for sharing metadata across the web. It underpins the emergence of new web techniques, such as so-called Semantic SEO. Our research addressed the representation of the embedded metadata as part of the Europeana HTML pages and sitemaps so that the re-use of this data can be optimized. The practical objective of our work is to produce a Schema.org representation of Europeana resources described in EDM that is as rich as possible and tailored to Europeana's realities and user needs, as well as to the search engines and their users.
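    The markup recommended here is Schema.org data embedded in the HTML of each object page. A rough sketch of what one such description might look like, generated as JSON-LD from Python; the object, property selection and values are illustrative, not Europeana's actual EDM-to-Schema.org mapping:
      import json

      # Minimal Schema.org description of a cultural heritage object.
      item = {
          "@context": "https://schema.org",
          "@type": "VisualArtwork",
          "name": "Girl with a Pearl Earring",
          "creator": {"@type": "Person", "name": "Johannes Vermeer"},
          "dateCreated": "1665",
          "image": "https://example.org/iiif/pearl-earring/full.jpg",  # placeholder
          "isPartOf": {"@type": "Collection", "name": "Example aggregated collection"},
      }

      # Embedded in a page as: <script type="application/ld+json"> ... </script>
      print(json.dumps(item, indent=2, ensure_ascii=False))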
  17. Stevens, G.: New metadata recipes for old cookbooks : creating and analyzing a digital collection using the HathiTrust Research Center Portal (2017) 0.00
    0.0031955543 = product of:
      0.028759988 = sum of:
        0.028759988 = weight(_text_:data in 3897) [ClassicSimilarity], result of:
          0.028759988 = score(doc=3897,freq=4.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.24703519 = fieldWeight in 3897, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3897)
      0.11111111 = coord(1/9)
    
    Abstract
    The Early American Cookbooks digital project is a case study in analyzing collections as data using HathiTrust and the HathiTrust Research Center (HTRC) Portal. The purposes of the project are to create a freely available, searchable collection of full-text early American cookbooks within the HathiTrust Digital Library, to offer an overview of the scope and contents of the collection, and to analyze trends and patterns in the metadata and the full text of the collection. The digital project has two basic components: a collection of 1450 full-text cookbooks published in the United States between 1800 and 1920 and a website to present a guide to the collection and the results of the analysis. This article will focus on the workflow for analyzing the metadata and the full-text of the collection. The workflow will cover: 1) creating a searchable public collection of full-text titles within the HathiTrust Digital Library and uploading it to the HTRC Portal, 2) analyzing and visualizing legacy MARC data for the collection using MarcEdit, OpenRefine and Tableau, and 3) using the text analysis tools in the HTRC Portal to look for trends and patterns in the full text of the collection.
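    The article does the MARC analysis with MarcEdit, OpenRefine and Tableau; a hedged Python equivalent of the same idea (tallying fields across a batch of MARC records) might look like this, using the pymarc library and a placeholder file name:
      from collections import Counter
      from pymarc import MARCReader

      subjects = Counter()
      pub_dates = Counter()

      with open("cookbooks.mrc", "rb") as fh:          # placeholder MARC file
          for record in MARCReader(fh):
              for field in record.get_fields("650"):   # topical subject headings
                  subjects[field.value()] += 1
              for field in record.get_fields("260", "264"):
                  for sub in field.get_subfields("c"): # date of publication
                      pub_dates[sub.strip(" .,")] += 1

      print(subjects.most_common(10))
      print(sorted(pub_dates.items())[:10])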
  18. Cranefield, S.: Networked knowledge representation and exchange using UML and RDF (2001) 0.00
    0.0031634374 = product of:
      0.028470935 = sum of:
        0.028470935 = weight(_text_:data in 5896) [ClassicSimilarity], result of:
          0.028470935 = score(doc=5896,freq=2.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.24455236 = fieldWeight in 5896, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5896)
      0.11111111 = coord(1/9)
    
    Abstract
    This paper proposes the use of the Unified Modeling Language (UML) as a language for modelling ontologies for Web resources and the knowledge contained within them. To provide a mechanism for serialising and processing object diagrams representing knowledge, a pair of XSLT stylesheets have been developed to map from XML Metadata Interchange (XMI) encodings of class diagrams to corresponding RDF schemas and to Java classes representing the concepts in the ontologies. The Java code includes methods for marshalling and unmarshalling object-oriented information between in-memory data structures and RDF serialisations of that information. This provides a convenient mechanism for Java applications to share knowledge on the Web.
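    The mechanism described - an XSLT stylesheet mapping XMI-encoded class diagrams to an RDF schema - can be exercised in a few lines with lxml. The file names are placeholders, not the paper's actual stylesheets:
      from lxml import etree

      xmi_doc   = etree.parse("ontology.xmi")             # XMI export of a UML model
      transform = etree.XSLT(etree.parse("xmi-to-rdfs.xsl"))

      rdf_schema = transform(xmi_doc)   # apply the stylesheet
      print(str(rdf_schema)[:500])      # serialised RDF Schema output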
  19. McClelland, M.; McArthur, D.; Giersch, S.; Geisler, G.: Challenges for service providers when importing metadata in digital libraries (2002) 0.00
    0.0031634374 = product of:
      0.028470935 = sum of:
        0.028470935 = weight(_text_:data in 565) [ClassicSimilarity], result of:
          0.028470935 = score(doc=565,freq=2.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.24455236 = fieldWeight in 565, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0546875 = fieldNorm(doc=565)
      0.11111111 = coord(1/9)
    
    Abstract
    Much of the usefulness of digital libraries lies in their ability to provide services for data from distributed repositories, and many research projects are investigating frameworks for interoperability. In this paper, we report on the experiences and lessons learned by iLumina after importing IMS metadata. iLumina utilizes the IMS metadata specification, which allows for a rich set of metadata (Dublin Core has a simpler metadata scheme that can be mapped onto a subset of the IMS metadata). Our experiences identify questions regarding intellectual property rights for metadata, protocols for enriched metadata, and tips for designing metadata services.
  20. Heery, R.; Wagner, H.: ¬A metadata registry for the Semantic Web (2002) 0.00
    0.0031634374 = product of:
      0.028470935 = sum of:
        0.028470935 = weight(_text_:data in 1210) [ClassicSimilarity], result of:
          0.028470935 = score(doc=1210,freq=8.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.24455236 = fieldWeight in 1210, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1210)
      0.11111111 = coord(1/9)
    
    Abstract
    The Semantic Web activity is a W3C project whose goal is to enable a 'cooperative' Web where machines and humans can exchange electronic content that has clear-cut, unambiguous meaning. This vision is based on the automated sharing of metadata terms across Web applications. The declaration of schemas in metadata registries advance this vision by providing a common approach for the discovery, understanding, and exchange of semantics. However, many of the issues regarding registries are not clear, and ideas vary regarding their scope and purpose. Additionally, registry issues are often difficult to describe and comprehend without a working example. This article will explore the role of metadata registries and will describe three prototypes, written by the Dublin Core Metadata Initiative. The article will outline how the prototypes are being used to demonstrate and evaluate application scope, functional requirements, and technology solutions for metadata registries. Metadata schema registries are, in effect, databases of schemas that can trace an historical line back to shared data dictionaries and the registration process encouraged by the ISO/IEC 11179 community. New impetus for the development of registries has come with the development activities surrounding creation of the Semantic Web. The motivation for establishing registries arises from domain and standardization communities, and from the knowledge management community. Examples of current registry activity include:
    * Agencies maintaining directories of data elements in a domain area in accordance with ISO/IEC 11179 (This standard specifies good practice for data element definition as well as the registration process. Example implementations are the National Health Information Knowledgebase hosted by the Australian Institute of Health and Welfare and the Environmental Data Registry hosted by the US Environmental Protection Agency.); * The xml.org directory of the Extended Markup Language (XML) document specifications facilitating re-use of Document Type Definition (DTD), hosted by the Organization for the Advancement of Structured Information Standards (OASIS); * The MetaForm database of Dublin Core usage and mappings maintained at the State and University Library in Goettingen; * The Semantic Web Agreement Group Dictionary, a database of terms for the Semantic Web that can be referred to by humans and software agents; * LEXML, a multi-lingual and multi-jurisdictional RDF Dictionary for the legal world; * The SCHEMAS registry maintained by the European Commission funded SCHEMAS project, which indexes several metadata element sets as well as a large number of activity reports describing metadata related activities and initiatives. Metadata registries essentially provide an index of terms. Given the distributed nature of the Web, there are a number of ways this can be accomplished. For example, the registry could link to terms and definitions in schemas published by implementers and stored locally by the schema maintainer. Alternatively, the registry might harvest various metadata schemas from their maintainers. Registries provide 'added value' to users by indexing schemas relevant to a particular 'domain' or 'community of use' and by simplifying the navigation of terms by enabling multiple schemas to be accessed from one view. An important benefit of this approach is an increase in the reuse of existing terms, rather than users having to reinvent them. Merging schemas to one view leads to harmonization between applications and helps avoid duplication of effort. Additionally, the establishment of registries to index terms actively being used in local implementations facilitates the metadata standards activity by providing implementation experience transferable to the standards-making process.
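    At its simplest, the "index of terms" a registry provides is a lookup from terms to the schemas that define them, merged from several harvested element sets into one view. A toy sketch of that idea (entries invented, prefixes abbreviated):
      # Toy registry: term -> the element sets in which it is defined.
      REGISTRY = {
          "dc:title":       ["Dublin Core Element Set 1.1"],
          "dc:creator":     ["Dublin Core Element Set 1.1"],
          "dcterms:issued": ["DCMI Metadata Terms"],
      }

      def schemas_for(term: str):
          """One view over several schemas: where is this term already defined?"""
          return REGISTRY.get(term, [])

      # Encourages reuse of existing terms instead of coining new ones.
      print(schemas_for("dc:creator"))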