Search (52 results, page 1 of 3)

Lynch, J.D.; Gibson, J.; Han, M.-J.: Analyzing and normalizing type metadata for a large aggregated digital library (2020) 0.14

0.14340337 = product of:
  0.19120449 = sum of:
    0.10446788 = weight(_text_:digital in 5720) [ClassicSimilarity], result of:
      0.10446788 = score(doc=5720,freq=6.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.5283983 = fieldWeight in 5720, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5720)
    0.03790077 = weight(_text_:library in 5720) [ClassicSimilarity], result of:
      0.03790077 = score(doc=5720,freq=4.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.28758827 = fieldWeight in 5720, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.0546875 = fieldNorm(doc=5720)
    0.048835836 = product of:
      0.09767167 = sum of:
        0.09767167 = weight(_text_:project in 5720) [ClassicSimilarity], result of:
          0.09767167 = score(doc=5720,freq=4.0), product of:
            0.21156175 = queryWeight, product of:
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.050121464 = queryNorm
            0.4616698 = fieldWeight in 5720, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5720)
      0.5 = coord(1/2)
  0.75 = coord(3/4)
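
The tree above can be checked by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), term score = queryWeight × fieldWeight); the function and variable names are illustrative and not part of the search system. Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Score of one term under ClassicSimilarity (TF-IDF), as in the explain tree."""
    tf = math.sqrt(freq)                               # e.g. sqrt(6) = 2.4494898
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))    # e.g. 3.944552 for "digital"
    query_weight = idf * query_norm                    # queryWeight line
    field_weight = tf * idf * field_norm               # fieldWeight line
    return query_weight * field_weight

# Values copied from the explain output for doc 5720.
QUERY_NORM = 0.050121464
FIELD_NORM = 0.0546875
MAX_DOCS = 44218

digital = classic_term_score(6.0, 2326, MAX_DOCS, QUERY_NORM, FIELD_NORM)
library = classic_term_score(4.0, 8668, MAX_DOCS, QUERY_NORM, FIELD_NORM)
# "project" sits in a nested clause where 1 of 2 sub-clauses matched: coord(1/2).
project = classic_term_score(4.0, 1764, MAX_DOCS, QUERY_NORM, FIELD_NORM) * 0.5

# 3 of 4 top-level clauses matched: coord(3/4).
total = (digital + library + project) * 0.75
print(f"{total:.4f}")  # close to the 0.14340337 reported above
```

The same function applies to the other hits; only `freq`, `field_norm`, and the coord factors change from document to document.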

Abstract: The Illinois Digital Heritage Hub (IDHH) gathers and enhances metadata from contributing institutions around the state of Illinois and provides this metadata to the Digital Public Library of America (DPLA) for greater access. The IDHH helps contributors shape their metadata to the standards recommended and required by the DPLA in part by analyzing and enhancing aggregated metadata. In late 2018, the IDHH undertook a project to address a particularly problematic field, Type metadata. This paper walks through the project, detailing the process of gathering and analyzing metadata using the DPLA API and OpenRefine, data remediation through XSL transformations in conjunction with local improvements by contributing institutions, and the DPLA ingestion system's quality controls.

Hardesty, J.L.; Young, J.B.: ¬The semantics of metadata : Avalon Media System and the move to RDF (2017) 0.13

0.133226 = product of:
  0.17763469 = sum of:
    0.10339639 = weight(_text_:digital in 3896) [ClassicSimilarity], result of:
      0.10339639 = score(doc=3896,freq=8.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.52297866 = fieldWeight in 3896, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.046875 = fieldNorm(doc=3896)
    0.022971334 = weight(_text_:library in 3896) [ClassicSimilarity], result of:
      0.022971334 = score(doc=3896,freq=2.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.17430481 = fieldWeight in 3896, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.046875 = fieldNorm(doc=3896)
    0.051266953 = product of:
      0.10253391 = sum of:
        0.10253391 = weight(_text_:project in 3896) [ClassicSimilarity], result of:
          0.10253391 = score(doc=3896,freq=6.0), product of:
            0.21156175 = queryWeight, product of:
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.050121464 = queryNorm
            0.48465237 = fieldWeight in 3896, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.046875 = fieldNorm(doc=3896)
      0.5 = coord(1/2)
  0.75 = coord(3/4)

Abstract: The Avalon Media System (Avalon) provides access and management for digital audio and video collections in libraries and archives. The open source project is led by the libraries of Indiana University Bloomington and Northwestern University and is funded in part by grants from The Andrew W. Mellon Foundation and Institute of Museum and Library Services. Avalon is based on the Samvera Community (formerly Hydra Project) software stack and uses Fedora as the digital repository back end. The Avalon project team is in the process of migrating digital repositories from Fedora 3 to Fedora 4 and incorporating metadata statements using the Resource Description Framework (RDF) instead of XML files accompanying the digital objects in the repository. The Avalon team has worked on the migration path for technical metadata and is now working on the migration paths for structural metadata (PCDM) and descriptive metadata (from MODS XML to RDF). This paper covers the decisions made to begin using RDF for software development and offers a window into how Semantic Web technology functions in the real world.

Stevens, G.: New metadata recipes for old cookbooks : creating and analyzing a digital collection using the HathiTrust Research Center Portal (2017) 0.12

0.12459625 = product of:
  0.16612834 = sum of:
    0.0963339 = weight(_text_:digital in 3897) [ClassicSimilarity], result of:
      0.0963339 = score(doc=3897,freq=10.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.4872566 = fieldWeight in 3897, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3897)
    0.027071979 = weight(_text_:library in 3897) [ClassicSimilarity], result of:
      0.027071979 = score(doc=3897,freq=4.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.2054202 = fieldWeight in 3897, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3897)
    0.04272246 = product of:
      0.08544492 = sum of:
        0.08544492 = weight(_text_:project in 3897) [ClassicSimilarity], result of:
          0.08544492 = score(doc=3897,freq=6.0), product of:
            0.21156175 = queryWeight, product of:
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.050121464 = queryNorm
            0.40387696 = fieldWeight in 3897, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3897)
      0.5 = coord(1/2)
  0.75 = coord(3/4)

Abstract: The Early American Cookbooks digital project is a case study in analyzing collections as data using HathiTrust and the HathiTrust Research Center (HTRC) Portal. The purposes of the project are to create a freely available, searchable collection of full-text early American cookbooks within the HathiTrust Digital Library, to offer an overview of the scope and contents of the collection, and to analyze trends and patterns in the metadata and the full text of the collection. The digital project has two basic components: a collection of 1450 full-text cookbooks published in the United States between 1800 and 1920 and a website to present a guide to the collection and the results of the analysis. This article will focus on the workflow for analyzing the metadata and the full-text of the collection. The workflow will cover: 1) creating a searchable public collection of full-text titles within the HathiTrust Digital Library and uploading it to the HTRC Portal, 2) analyzing and visualizing legacy MARC data for the collection using MarcEdit, OpenRefine and Tableau, and 3) using the text analysis tools in the HTRC Portal to look for trends and patterns in the full text of the collection.

Bartczak, J.; Glendon, I.: Python, Google Sheets, and the Thesaurus for Graphic Materials for efficient metadata project workflows (2017) 0.09

0.09426196 = product of:
  0.1256826 = sum of:
    0.073112294 = weight(_text_:digital in 3893) [ClassicSimilarity], result of:
      0.073112294 = score(doc=3893,freq=4.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.36980176 = fieldWeight in 3893, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.046875 = fieldNorm(doc=3893)
    0.022971334 = weight(_text_:library in 3893) [ClassicSimilarity], result of:
      0.022971334 = score(doc=3893,freq=2.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.17430481 = fieldWeight in 3893, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.046875 = fieldNorm(doc=3893)
    0.029598987 = product of:
      0.059197973 = sum of:
        0.059197973 = weight(_text_:project in 3893) [ClassicSimilarity], result of:
          0.059197973 = score(doc=3893,freq=2.0), product of:
            0.21156175 = queryWeight, product of:
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.050121464 = queryNorm
            0.27981415 = fieldWeight in 3893, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.046875 = fieldNorm(doc=3893)
      0.5 = coord(1/2)
  0.75 = coord(3/4)

Abstract: In 2017, the University of Virginia (U.Va.) will launch a two-year initiative to celebrate the bicentennial anniversary of the University's founding in 1819. The U.Va. Library is participating in this event by digitizing some 20,000 photographs and negatives that document student life on the U.Va. grounds in the 1960s and 1970s. Metadata librarians and archivists are well-versed in the challenges associated with generating digital content and accompanying description within the context of limited resources. This paper describes how technology and new approaches to metadata design have enabled the University of Virginia's Metadata Analysis and Design Department to rapidly and successfully generate accurate description for these digital objects. Python's pandas module improves efficiency by cleaning and repurposing data recorded at digitization, while the lxml module builds MODS XML programmatically from CSV tables. A simplified technique for subject heading selection and assignment in Google Sheets provides a collaborative environment for streamlined metadata creation and data quality control.
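
The CSV-to-MODS step mentioned in the abstract might look roughly like the sketch below. It is not the authors' code: to stay dependency-free it swaps in the standard-library `csv` and `xml.etree.ElementTree` modules for pandas and lxml, and the column names (`title`, `date`) and the sample row are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"   # MODS version 3 namespace
ET.register_namespace("mods", MODS_NS)

def row_to_mods(row):
    """Build a minimal MODS record from one CSV row (hypothetical columns)."""
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    # Basic cleaning: trim stray whitespace recorded at digitization time.
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = row["title"].strip()
    origin = ET.SubElement(mods, f"{{{MODS_NS}}}originInfo")
    ET.SubElement(origin, f"{{{MODS_NS}}}dateCreated").text = row["date"].strip()
    return mods

# Made-up sample data standing in for a digitization spreadsheet export.
SAMPLE_CSV = "title,date\n Students on the Lawn ,1968\n"

records = [row_to_mods(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
xml_text = ET.tostring(records[0], encoding="unicode")
print(xml_text)
```

A real workflow would add further MODS elements (subjects, identifiers, physical description) and validate the output against the MODS schema.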

METS: an overview & tutorial : Metadata Encoding & Transmission Standard (METS) (2001) 0.08
```
0.08348307 = product of:
  0.16696614 = sum of:
    0.11560067 = weight(_text_:digital in 1323) [ClassicSimilarity], result of:
      0.11560067 = score(doc=1323,freq=10.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.58470786 = fieldWeight in 1323, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.046875 = fieldNorm(doc=1323)
    0.05136547 = weight(_text_:library in 1323) [ClassicSimilarity], result of:
      0.05136547 = score(doc=1323,freq=10.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.38975742 = fieldWeight in 1323, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.046875 = fieldNorm(doc=1323)
  0.5 = coord(2/4)
```
Abstract

Maintaining a library of digital objects of necessity requires maintaining metadata about those objects. The metadata necessary for successful management and use of digital objects is both more extensive than and different from the metadata used for managing collections of printed works and other physical materials. While a library may record descriptive metadata regarding a book in its collection, the book will not dissolve into a series of unconnected pages if the library fails to record structural metadata regarding the book's organization, nor will scholars be unable to evaluate the book's worth if the library fails to note that the book was produced using a Ryobi offset press. The same cannot be said for a digital version of the same book. Without structural metadata, the page image or text files comprising the digital work are of little use, and without technical metadata regarding the digitization process, scholars may be unsure of how accurate a reflection of the original the digital version provides. For internal management purposes, a library must have access to appropriate technical metadata in order to periodically refresh and migrate the data, ensuring the durability of valuable resources.

Hunter, J.: MetaNet - a metadata term thesaurus to enable semantic interoperability between metadata domains (2001) 0.08

0.078551635 = product of:
  0.10473551 = sum of:
    0.060926907 = weight(_text_:digital in 6471) [ClassicSimilarity], result of:
      0.060926907 = score(doc=6471,freq=4.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.3081681 = fieldWeight in 6471, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.0390625 = fieldNorm(doc=6471)
    0.01914278 = weight(_text_:library in 6471) [ClassicSimilarity], result of:
      0.01914278 = score(doc=6471,freq=2.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.14525402 = fieldWeight in 6471, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.0390625 = fieldNorm(doc=6471)
    0.024665821 = product of:
      0.049331643 = sum of:
        0.049331643 = weight(_text_:project in 6471) [ClassicSimilarity], result of:
          0.049331643 = score(doc=6471,freq=2.0), product of:
            0.21156175 = queryWeight, product of:
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.050121464 = queryNorm
            0.23317845 = fieldWeight in 6471, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.0390625 = fieldNorm(doc=6471)
      0.5 = coord(1/2)
  0.75 = coord(3/4)

Abstract: Metadata interoperability is a fundamental requirement for access to information within networked knowledge organization systems. The Harmony international digital library project [1] has developed a common underlying data model (the ABC model) to enable the scalable mapping of metadata descriptions across domains and media types. The ABC model [2] provides a set of basic building blocks for metadata modeling and recognizes the importance of 'events' to describe unambiguously metadata for objects with a complex history. To test and evaluate the interoperability capabilities of this model, we applied it to some real multimedia examples and analysed the results of mapping from the ABC model to various different metadata domains using XSLT [3]. This work revealed serious limitations in the ability of XSLT to support flexible dynamic semantic mapping. To overcome this, we developed MetaNet [4], a metadata term thesaurus which provides the additional semantic knowledge that is non-existent within declarative XML-encoded metadata descriptions. This paper describes MetaNet, its RDF Schema [5] representation and a hybrid mapping approach which combines the structural and syntactic mapping capabilities of XSLT with the semantic knowledge of MetaNet, to enable flexible and dynamic mapping among metadata standards.
Source: Journal of digital information. 1(2001) no.8, art.# 42

Broughton, V.: Automatic metadata generation : Digital resource description without human intervention (2007) 0.07

0.072070494 = product of:
  0.14414099 = sum of:
    0.10339639 = weight(_text_:digital in 6048) [ClassicSimilarity], result of:
      0.10339639 = score(doc=6048,freq=2.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.52297866 = fieldWeight in 6048, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.09375 = fieldNorm(doc=6048)
    0.0407446 = product of:
      0.0814892 = sum of:
        0.0814892 = weight(_text_:22 in 6048) [ClassicSimilarity], result of:
          0.0814892 = score(doc=6048,freq=2.0), product of:
            0.17551683 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050121464 = queryNorm
            0.46428138 = fieldWeight in 6048, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=6048)
      0.5 = coord(1/2)
  0.5 = coord(2/4)

Date: 22. 9.2007 15:41:14

Blanchi, C.; Petrone, J.: Distributed interoperable metadata registry (2001) 0.07

0.07118432 = product of:
  0.14236864 = sum of:
    0.10446788 = weight(_text_:digital in 1228) [ClassicSimilarity], result of:
      0.10446788 = score(doc=1228,freq=6.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.5283983 = fieldWeight in 1228, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1228)
    0.03790077 = weight(_text_:library in 1228) [ClassicSimilarity], result of:
      0.03790077 = score(doc=1228,freq=4.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.28758827 = fieldWeight in 1228, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1228)
  0.5 = coord(2/4)

Abstract: Interoperability between digital libraries depends on effective sharing of metadata. Successful sharing of metadata requires common standards for metadata exchange. Previous efforts have focused on either defining a single metadata standard, such as Dublin Core, or building digital library middleware, such as Z39.50 or Stanford's Digital Library Interoperability Protocol. In this article, we propose a distributed architecture for managing metadata and metadata schema. Instead of normalizing all metadata and schema to a single format, we have focused on building a middleware framework that tolerates heterogeneity. By providing facilities for typing and dynamic conversion of metadata, our system permits continual introduction of new forms of metadata with minimal impact on compatibility.

Wen, D.; Sakaguchi, T.; Sugimoto, S.; Tabata, K.: Multilingual Access to Dublin Core Metadata of ULIS Library (2002) 0.06

0.062224608 = product of:
  0.124449216 = sum of:
    0.086163655 = weight(_text_:digital in 2342) [ClassicSimilarity], result of:
      0.086163655 = score(doc=2342,freq=2.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.4358155 = fieldWeight in 2342, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.078125 = fieldNorm(doc=2342)
    0.03828556 = weight(_text_:library in 2342) [ClassicSimilarity], result of:
      0.03828556 = score(doc=2342,freq=2.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.29050803 = fieldWeight in 2342, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.078125 = fieldNorm(doc=2342)
  0.5 = coord(2/4)

Source: Journal of digital information. 2(2002) no.2,

Patton, M.; Reynolds, D.; Choudhury, G.S.; DiLauro, T.: Toward a metadata generation framework : a case study at Johns Hopkins University (2004) 0.06
```
0.05752562 = product of:
  0.11505124 = sum of:
    0.0844228 = weight(_text_:digital in 1192) [ClassicSimilarity], result of:
      0.0844228 = score(doc=1192,freq=12.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.42701027 = fieldWeight in 1192, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.03125 = fieldNorm(doc=1192)
    0.030628446 = weight(_text_:library in 1192) [ClassicSimilarity], result of:
      0.030628446 = score(doc=1192,freq=8.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.23240642 = fieldWeight in 1192, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.03125 = fieldNorm(doc=1192)
  0.5 = coord(2/4)
```
Abstract

In the June 2003 issue of D-Lib Magazine, Kenney et al. (2003) discuss a comparative study between Cornell's email reference staff and Google's Answers service. This interesting study provided insights on the potential impact of "computing and simple algorithms combined with human intelligence" for library reference services. As mentioned in the Kenney et al. article, Bill Arms (2000) had discussed the possibilities of automated digital libraries in an even earlier D-Lib article. Arms discusses not only automating reference services, but also another library function that seems to inspire lively debates about automation: metadata creation. While intended to illuminate, these debates sometimes generate more heat than light. In an effort to explore the potential for automating metadata generation, the Digital Knowledge Center (DKC) of the Sheridan Libraries at The Johns Hopkins University developed and tested an automated name authority control (ANAC) tool. ANAC represents a component of a digital workflow management system developed in connection with the digital Lester S. Levy Collection of Sheet Music. The evaluation of ANAC followed the spirit of the Kenney et al. study that was, as they stated, "more exploratory than scientific." These ANAC evaluation results are shared with the hope of fostering constructive dialogue and discussions about the potential for semi-automated techniques or frameworks for library functions and services such as metadata creation. The DKC's research agenda emphasizes the development of tools that combine automated processes and human intervention, with the overall goal of involving humans at higher levels of analysis and decision-making. Others have looked at issues regarding the automated generation of metadata. A session at the 2003 Joint Conference on Digital Libraries was devoted to automatic metadata creation, and a session at the 2004 conference addressed automated name disambiguation.
Commercial vendors such as OCLC, Marcive, and LTI have long used automated techniques for matching names to Library of Congress authority records. We began developing ANAC as a component of a larger suite of open source tools to support workflow management for digital projects. This article describes the goals for the ANAC tool, provides an overview of the metadata records used for testing, describes the architecture for ANAC, and concludes with discussions of the methodology and evaluation of the experiment comparing human cataloging and ANAC-generated results.

Neumann, M.; Steinberg, J.; Schaer, P.: Web-scraping for non-programmers : introducing OXPath for digital library metadata harvesting (2017) 0.05
```
0.053888097 = product of:
  0.107776195 = sum of:
    0.07461992 = weight(_text_:digital in 3895) [ClassicSimilarity], result of:
      0.07461992 = score(doc=3895,freq=6.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.37742734 = fieldWeight in 3895, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3895)
    0.033156272 = weight(_text_:library in 3895) [ClassicSimilarity], result of:
      0.033156272 = score(doc=3895,freq=6.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.25158736 = fieldWeight in 3895, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3895)
  0.5 = coord(2/4)
```
Abstract

Building up new collections for digital libraries is a demanding task. Available data sets have to be extracted which is usually done with the help of software developers as it involves custom data handlers or conversion scripts. In cases where the desired data is only available on the data provider's website custom web scrapers are needed. This may be the case for small to medium-size publishers, research institutes or funding agencies. As data curation is a typical task that is done by people with a library and information science background, these people are usually proficient with XML technologies but are not full-stack programmers. Therefore we would like to present a web scraping tool that does not demand the digital library curators to program custom web scrapers from scratch. We present the open-source tool OXPath, an extension of XPath, that allows the user to define data to be extracted from websites in a declarative way. By taking one of our own use cases as an example, we guide you in more detail through the process of creating an OXPath wrapper for metadata harvesting. We also point out some practical things to consider when creating a web scraper (with OXPath). On top of that, we also present a syntax highlighting plugin for the popular text editor Atom that we developed to further support OXPath users and to simplify the authoring process.

DC-2013: International Conference on Dublin Core and Metadata Applications : Online Proceedings (2013) 0.05
```
0.052134253 = product of:
  0.06951234 = sum of:
    0.034465462 = weight(_text_:digital in 1076) [ClassicSimilarity], result of:
      0.034465462 = score(doc=1076,freq=2.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.17432621 = fieldWeight in 1076, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.03125 = fieldNorm(doc=1076)
    0.015314223 = weight(_text_:library in 1076) [ClassicSimilarity], result of:
      0.015314223 = score(doc=1076,freq=2.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.11620321 = fieldWeight in 1076, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.03125 = fieldNorm(doc=1076)
    0.019732658 = product of:
      0.039465316 = sum of:
        0.039465316 = weight(_text_:project in 1076) [ClassicSimilarity], result of:
          0.039465316 = score(doc=1076,freq=2.0), product of:
            0.21156175 = queryWeight, product of:
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.050121464 = queryNorm
            0.18654276 = fieldWeight in 1076, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.03125 = fieldNorm(doc=1076)
      0.5 = coord(1/2)
  0.75 = coord(3/4)
```
Abstract

The collocated conferences for DC-2013 and iPRES-2013 in Lisbon attracted 392 participants from over 37 countries. In addition to the Tuesday through Thursday conference days comprised of peer-reviewed paper and special sessions, 223 participants attended pre-conference tutorials and 246 participated in post-conference workshops for the collocated events. The peer-reviewed papers and presentations are available on the conference website Presentation page (URLs above). In sum, it was a great conference. In addition to links to PDFs of papers, project reports and posters (and their associated presentations), the published proceedings include presentation PDFs for the following: KEYNOTES Darling, we need to talk - Gildas Illien TUTORIALS -- Ivan Herman: "Introduction to Linked Open Data (LOD)" -- Steven Miller: "Introduction to Ontology Concepts and Terminology" -- Kai Eckert: "Metadata Provenance" -- Daniel Garijo: "The W3C Provenance Ontology" SPECIAL SESSIONS -- "Application Profiles as an Alternative to OWL Ontologies" -- "Long-term Preservation and Governance of RDF Vocabularies (W3C Sponsored)" -- "Data Enrichment and Transformation in the LOD Context: Poor & Popular vs Rich & Lonely--Can't we achieve both?" -- "Why Schema.org?"

Content

FULL PAPERS Provenance and Annotations for Linked Data - Kai Eckert How Portable Are the Metadata Standards for Scientific Data? A Proposal for a Metadata Infrastructure - Jian Qin, Kai Li Lessons Learned in Implementing the Extended Date/Time Format in a Large Digital Library - Hannah Tarver, Mark Phillips Towards the Representation of Chinese Traditional Music: A State of the Art Review of Music Metadata Standards - Mi Tian, György Fazekas, Dawn Black, Mark Sandler Maps and Gaps: Strategies for Vocabulary Design and Development - Diane Ileana Hillmann, Gordon Dunsire, Jon Phipps A Method for the Development of Dublin Core Application Profiles (Me4DCAP V0.1): A Description - Mariana Curado Malta, Ana Alice Baptista Find and Combine Vocabularies to Design Metadata Application Profiles using Schema Registries and LOD Resources - Tsunagu Honma, Mitsuharu Nagamori, Shigeo Sugimoto Achieving Interoperability between the CARARE Schema for Monuments and Sites and the Europeana Data Model - Antoine Isaac, Valentine Charles, Kate Fernie, Costis Dallas, Dimitris Gavrilis, Stavros Angelis With a Focused Intent: Evolution of DCMI as a Research Community - Jihee Beak, Richard P. Smiraglia Metadata Capital in a Data Repository - Jane Greenberg, Shea Swauger, Elena Feinstein DC Metadata is Alive and Well - A New Standard for Education - Liddy Nevile Representation of the UNIMARC Bibliographic Data Format in Resource Description Framework - Gordon Dunsire, Mirna Willer, Predrag Perozic

Lightle, K.S.; Ridgway, J.S.: Generation of XML records across multiple metadata standards (2003) 0.05

0.049779683 = product of:
  0.09955937 = sum of:
    0.068930924 = weight(_text_:digital in 2189) [ClassicSimilarity], result of:
      0.068930924 = score(doc=2189,freq=2.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.34865242 = fieldWeight in 2189, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.0625 = fieldNorm(doc=2189)
    0.030628446 = weight(_text_:library in 2189) [ClassicSimilarity], result of:
      0.030628446 = score(doc=2189,freq=2.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.23240642 = fieldWeight in 2189, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.0625 = fieldNorm(doc=2189)
  0.5 = coord(2/4)

Abstract: This paper describes the process that Eisenhower National Clearinghouse (ENC) staff went through to develop crosswalks between metadata based on three different standards and the generation of the corresponding XML records. ENC needed to generate different flavors of XML records so that metadata would be displayed correctly in catalog records generated through different digital library interfaces. The crosswalk between USMARC, IEEE LOM, and DC-ED is included, as well as examples of the XML records.
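
The score breakdown listed for this record (doc 2189) can be checked by hand. The following is a minimal sketch, not the search engine's own code: the function names are illustrative, and the idf formula `1 + ln(maxDocs / (docFreq + 1))` is an assumption inferred from the docFreq and maxDocs values shown in the listing (it is the classic Lucene TF-IDF form). Term frequency is taken as the square root of the raw frequency, as the `tf(freq=2.0)` lines indicate.

```python
import math

# Values copied from the explain output above.
MAX_DOCS = 44218
QUERY_NORM = 0.050121464


def idf(doc_freq, max_docs=MAX_DOCS):
    """Inverse document frequency (assumed form: 1 + ln(maxDocs / (docFreq + 1)))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm, query_norm=QUERY_NORM):
    """Score of one matching term: queryWeight * fieldWeight."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * query_norm                    # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


def document_score(term_scores, matched, total):
    """Sum of the term scores, scaled by coord(matched/total)."""
    return sum(term_scores) * (matched / total)


# Doc 2189 matches two of the four query terms, each with freq=2.0.
digital = term_score(freq=2.0, doc_freq=2326, field_norm=0.0625)
library = term_score(freq=2.0, doc_freq=8668, field_norm=0.0625)
score_2189 = document_score([digital, library], matched=2, total=4)

print(digital, library, score_2189)
```

Up to floating-point rounding (the listing appears to use single precision), the three printed values should agree with the 0.068930924, 0.030628446 and 0.049779683 shown for this record. The nested `coord(1/2)` that appears around the "project" term in other records is not covered by this sketch.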

Daniel Jr., R.; Lagoze, C.: Extending the Warwick framework : from metadata containers to active digital objects (1997) 0.05
```
0.04853967 = product of:
  0.09707934 = sum of:
    0.07386994 = weight(_text_:digital in 1264) [ClassicSimilarity], result of:
      0.07386994 = score(doc=1264,freq=12.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.37363398 = fieldWeight in 1264, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1264)
    0.02320939 = weight(_text_:library in 1264) [ClassicSimilarity], result of:
      0.02320939 = score(doc=1264,freq=6.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.17611115 = fieldWeight in 1264, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1264)
  0.5 = coord(2/4)
```
Abstract

Defining metadata as "data about data" provokes more questions than it answers. What are the forms of the data and metadata? Can we be more specific about the manner in which the metadata is "about" the data? Are data and metadata distinguished only in the context of their relationship? Is the nature of the relationship between the datasets declarative or procedural? Can the metadata itself be described by other data? Over the past several years, we have been engaged in a number of efforts examining the role, format, composition, and architecture of metadata for networked resources. During this time, we have noticed the tendency to be led astray by comfortable, but somewhat inappropriate, models in the non-digital information environment. Rather than pursuing familiar models, there is the need for a new model that fully exploits the unique combination of computation and connectivity that characterizes the digital library. In this paper, we describe an extension of the Warwick Framework that we call Distributed Active Relationships (DARs). DARs provide a powerful model for representing data and metadata in digital library objects. They explicitly express the relationships between networked resources, and even allow those relationships to be dynamically downloadable and executable. The DAR model is based on the following principles, which our examination of the "data about data" definition has led us to regard as axiomatic:

* There is no essential distinction between data and metadata. We can only make such a distinction in terms of a particular "about" relationship. As a result, what is metadata in the context of one "about" relationship may be data in another.
* There is no single "about" relationship. There are many different and important relationships between data resources.
* Resources can be related without regard for their location. The connectivity in networked information architectures makes it possible to have data in one repository describe data in another repository.
* The computational power of the networked information environment makes it possible to consider active or dynamic relationships between data sets. This adds considerable power to the "data about data" definition. First, data about another data set may not physically exist, but may be automatically derived. Second, the "about" relationship may be an executable object -- in a sense interpretable metadata. As will be shown, this provides useful mechanisms for handling complex metadata problems such as rights management of digital objects.

The remainder of this paper describes the development and consequences of the DAR model. Section 2 reviews the Warwick Framework, which is the basis for the model described in this paper. Section 3 examines the concept of the Warwick Framework Catalog, which provides a mechanism for expressing the relationships between the packages in a Warwick Framework container. With that background established, section 4 generalizes the Warwick Framework by removing the restriction that it only contains "metadata". This allows us to consider digital library objects that are aggregations of (possibly distributed) data sets, with the relationships between the data sets expressed using a Warwick Framework Catalog. Section 5 further extends the model by describing Distributed Active Relationships (DARs). DARs are the explicit relationships that have the potential to be executable, as alluded to earlier. Finally, section 6 describes two possible implementations of these concepts.

Edmunds, J.: Roadmap to nowhere : BIBFLOW, BIBFRAME, and linked data for libraries (2017) 0.05
```
0.04552724 = product of:
  0.09105448 = sum of:
    0.039787523 = weight(_text_:library in 3523) [ClassicSimilarity], result of:
      0.039787523 = score(doc=3523,freq=6.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.30190483 = fieldWeight in 3523, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.046875 = fieldNorm(doc=3523)
    0.051266953 = product of:
      0.10253391 = sum of:
        0.10253391 = weight(_text_:project in 3523) [ClassicSimilarity], result of:
          0.10253391 = score(doc=3523,freq=6.0), product of:
            0.21156175 = queryWeight, product of:
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.050121464 = queryNorm
            0.48465237 = fieldWeight in 3523, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.046875 = fieldNorm(doc=3523)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

On December 12, 2016, Carl Stahmer and MacKenzie Smith presented at the CNI Members Fall Meeting about the BIBFLOW project, self-described on Twitter as "a two-year project of the UC Davis University Library and Zepheira investigating the future of library technical services." In her opening remarks, Ms. Smith, University Librarian at UC Davis, stated that one of the goals of the project was to devise a roadmap "to get from where we are today, which is kind of the 1970s with a little lipstick on it, to 2020, which is where we're going to be very soon." The notion that where libraries are today is somehow behind the times is one of the commonly heard rationales behind a move to linked data. Stated more precisely:

- Libraries devote considerable time and resources to producing high-quality bibliographic metadata
- This metadata is stored in unconnected silos
- This metadata is in a format (MARC) that is incompatible with technologies of the emerging Semantic Web
- The visibility of library metadata is diminished as a result of the two points above

Are these assertions true? If yes, is linked data the solution?

Godby, C.J.; Young, J.A.; Childress, E.: A repository of metadata crosswalks (2004) 0.04

0.043557227 = product of:
  0.08711445 = sum of:
    0.060314562 = weight(_text_:digital in 1155) [ClassicSimilarity], result of:
      0.060314562 = score(doc=1155,freq=2.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.30507088 = fieldWeight in 1155, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1155)
    0.026799891 = weight(_text_:library in 1155) [ClassicSimilarity], result of:
      0.026799891 = score(doc=1155,freq=2.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.20335563 = fieldWeight in 1155, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1155)
  0.5 = coord(2/4)

Abstract: This paper proposes a model for metadata crosswalks that associates three pieces of information: the crosswalk, the source metadata standard, and the target metadata standard, each of which may have a machine-readable encoding and human-readable description. The crosswalks are encoded as METS records that are made available to a repository for processing by search engines, OAI harvesters, and custom-designed Web services. The METS object brings together all of the information required to access and interpret crosswalks and represents a significant improvement over previously available formats. But it raises questions about how best to describe these complex objects and exposes gaps that must eventually be filled in by the digital library community.

Hook, P.A.; Gantchev, A.: Using combined metadata sources to visualize a small library (OBL's English Language Books) (2017) 0.04
```
0.042943195 = product of:
  0.08588639 = sum of:
    0.043081827 = weight(_text_:digital in 3870) [ClassicSimilarity], result of:
      0.043081827 = score(doc=3870,freq=2.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.21790776 = fieldWeight in 3870, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3870)
    0.042804558 = weight(_text_:library in 3870) [ClassicSimilarity], result of:
      0.042804558 = score(doc=3870,freq=10.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.32479787 = fieldWeight in 3870, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.0390625 = fieldNorm(doc=3870)
  0.5 = coord(2/4)
```
Abstract

Data from multiple knowledge organization systems are combined to provide a global overview of the content holdings of a small personal library. Subject headings and classification data are used to effectively map the combined book and topic space of the library. While harvested and manipulated by hand, the work reveals issues and potential solutions when using automated techniques to produce topic maps of much larger libraries. The small library visualized consists of the thirty-nine, digital, English language books found in the Osama Bin Laden (OBL) compound in Abbottabad, Pakistan upon his death. As this list of books has garnered considerable media attention, it is worth providing a visual overview of the subject content of these books - some of which is not readily apparent from the titles. Metadata from subject headings and classification numbers was combined to create book-subject maps. Tree maps of the classification data were also produced. The books contain 328 subject headings. In order to enhance the base map with meaningful thematic overlay, library holding count data was also harvested (and aggregated from duplicates). This additional data revealed the relative scarcity or popularity of individual books.

Chan, L.M.; Zeng, M.L.: Metadata interoperability and standardization - a study of methodology, part I : achieving interoperability at the schema level (2006) 0.04
```
0.042796366 = product of:
  0.08559273 = sum of:
    0.060926907 = weight(_text_:digital in 1176) [ClassicSimilarity], result of:
      0.060926907 = score(doc=1176,freq=4.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.3081681 = fieldWeight in 1176, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1176)
    0.024665821 = product of:
      0.049331643 = sum of:
        0.049331643 = weight(_text_:project in 1176) [ClassicSimilarity], result of:
          0.049331643 = score(doc=1176,freq=2.0), product of:
            0.21156175 = queryWeight, product of:
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.050121464 = queryNorm
            0.23317845 = fieldWeight in 1176, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.220981 = idf(docFreq=1764, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1176)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

The rapid growth of Internet resources and digital collections has been accompanied by a proliferation of metadata schemas, each of which has been designed based on the requirements of particular user communities, intended users, types of materials, subject domains, project needs, etc. Problems arise when building large digital libraries or repositories with metadata records that were prepared according to diverse schemas. This article (published in two parts) contains an analysis of the methods that have been used to achieve or improve interoperability among metadata schemas and applications, for the purposes of facilitating conversion and exchange of metadata and enabling cross-domain metadata harvesting and federated searches. From a methodological point of view, implementing interoperability may be considered at different levels of operation: schema level, record level, and repository level. Part I of the article intends to explain possible situations in which metadata schemas may be created or implemented, whether in individual projects or in integrated repositories. It also discusses approaches used at the schema level. Part II of the article will discuss metadata interoperability efforts at the record and repository levels.

Weibel, S.L.: Border crossings : reflections on a decade of metadata consensus building (2005) 0.04
```
0.040034845 = product of:
  0.08006969 = sum of:
    0.060926907 = weight(_text_:digital in 1187) [ClassicSimilarity], result of:
      0.060926907 = score(doc=1187,freq=4.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.3081681 = fieldWeight in 1187, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1187)
    0.01914278 = weight(_text_:library in 1187) [ClassicSimilarity], result of:
      0.01914278 = score(doc=1187,freq=2.0), product of:
        0.1317883 = queryWeight, product of:
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.050121464 = queryNorm
        0.14525402 = fieldWeight in 1187, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          2.6293786 = idf(docFreq=8668, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1187)
  0.5 = coord(2/4)
```
Abstract

In June of this year, I performed my final official duties as part of the Dublin Core Metadata Initiative management team. It is a happy irony to affix a seal on that service in this journal, as both D-Lib Magazine and the Dublin Core celebrate their tenth anniversaries. This essay is a personal reflection on some of the achievements and lessons of that decade. The OCLC-NCSA Metadata Workshop took place in March of 1995, and as we tried to understand what it meant and who would care, D-Lib magazine came into being and offered a natural venue for sharing our work. I recall a certain skepticism when Bill Arms said "We want D-Lib to be the first place people look for the latest developments in digital library research." These were the early days in the evolution of electronic publishing, and the goal was ambitious. By any measure, a decade of high-quality electronic publishing is an auspicious accomplishment, and D-Lib (and its host, CNRI) deserve congratulations for having achieved their goal. I am grateful to have been a contributor. That first DC workshop led to further workshops, a community, a variety of standards in several countries, an ISO standard, a conference series, and an international consortium. Looking back on this evolution is both satisfying and wistful. While I am pleased that the achievements are substantial, the unmet challenges also provide a rich till in which to cultivate insights on the development of digital infrastructure.

Baker, T.: A grammar of Dublin Core (2000) 0.03
```
0.03116153 = product of:
  0.06232306 = sum of:
    0.048741527 = weight(_text_:digital in 1236) [ClassicSimilarity], result of:
      0.048741527 = score(doc=1236,freq=4.0), product of:
        0.19770671 = queryWeight, product of:
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.050121464 = queryNorm
        0.2465345 = fieldWeight in 1236, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.944552 = idf(docFreq=2326, maxDocs=44218)
          0.03125 = fieldNorm(doc=1236)
    0.013581533 = product of:
      0.027163066 = sum of:
        0.027163066 = weight(_text_:22 in 1236) [ClassicSimilarity], result of:
          0.027163066 = score(doc=1236,freq=2.0), product of:
            0.17551683 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.050121464 = queryNorm
            0.15476047 = fieldWeight in 1236, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.03125 = fieldNorm(doc=1236)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

Dublin Core is often presented as a modern form of catalog card -- a set of elements (and now qualifiers) that describe resources in a complete package. Sometimes it is proposed as an exchange format for sharing records among multiple collections. The founding principle that "every element is optional and repeatable" reinforces the notion that a Dublin Core description is to be taken as a whole. This paper, in contrast, is based on a much different premise: Dublin Core is a language. More precisely, it is a small language for making a particular class of statements about resources. Like natural languages, it has a vocabulary of word-like terms, the two classes of which -- elements and qualifiers -- function within statements like nouns and adjectives; and it has a syntax for arranging elements and qualifiers into statements according to a simple pattern. Whenever tourists order a meal or ask directions in an unfamiliar language, considerate native speakers will spontaneously limit themselves to basic words and simple sentence patterns along the lines of "I am so-and-so" or "This is such-and-such". Linguists call this pidginization. In such situations, a small phrase book or translated menu can be most helpful. By analogy, today's Web has been called an Internet Commons where users and information providers from a wide range of scientific, commercial, and social domains present their information in a variety of incompatible data models and description languages. In this context, Dublin Core presents itself as a metadata pidgin for digital tourists who must find their way in this linguistically diverse landscape. Its vocabulary is small enough to learn quickly, and its basic pattern is easily grasped. It is well-suited to serve as an auxiliary language for digital libraries. This grammar starts by defining terms. It then follows a 200-year-old tradition of English grammar teaching by focusing on the structure of single statements. 
It concludes by looking at the growing dictionary of Dublin Core vocabulary terms -- its registry, and at how statements can be used to build the metadata equivalent of paragraphs and compositions -- the application profile.

Date

26.12.2011 14:01:22
