Search (10 results, page 1 of 1)

  • Filter: type_ss:"el"
  • Filter: theme_ss:"Metadaten"
  1. Godby, C.J.; Young, J.A.; Childress, E.: ¬A repository of metadata crosswalks (2004) 0.03
    0.03444516 = product of:
      0.06889032 = sum of:
        0.05503747 = weight(_text_:processing in 1155) [ClassicSimilarity], result of:
          0.05503747 = score(doc=1155,freq=2.0), product of:
            0.175792 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.043425296 = queryNorm
            0.3130829 = fieldWeight in 1155, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1155)
        0.013852848 = product of:
          0.04155854 = sum of:
            0.04155854 = weight(_text_:29 in 1155) [ClassicSimilarity], result of:
              0.04155854 = score(doc=1155,freq=2.0), product of:
                0.15275662 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.043425296 = queryNorm
                0.27205724 = fieldWeight in 1155, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1155)
          0.33333334 = coord(1/3)
      0.5 = coord(2/4)
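    The breakdown above is Lucene's ClassicSimilarity (TF-IDF) explanation. As a minimal sketch, the "processing" leaf can be reproduced from the listed factors; the variable names are mine, the numbers are copied from the tree, and the final document score additionally applies the coord() factors shown:

      import math

      # Reproduce the weight(_text_:processing in 1155) leaf above.
      freq       = 2.0                                 # termFreq in doc 1155
      tf         = math.sqrt(freq)                     # 1.4142135 = tf(freq=2.0)
      idf        = 1.0 + math.log(44218 / (2097 + 1))  # 4.048147
      query_norm = 0.043425296
      field_norm = 0.0546875                           # fieldNorm(doc=1155)

      query_weight = idf * query_norm                  # 0.175792
      field_weight = tf * idf * field_norm             # 0.3130829
      print(query_weight * field_weight)               # ~0.05503747, as listed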
    
    Abstract
    This paper proposes a model for metadata crosswalks that associates three pieces of information: the crosswalk, the source metadata standard, and the target metadata standard, each of which may have a machine-readable encoding and human-readable description. The crosswalks are encoded as METS records that are made available to a repository for processing by search engines, OAI harvesters, and custom-designed Web services. The METS object brings together all of the information required to access and interpret crosswalks and represents a significant improvement over previously available formats. But it raises questions about how best to describe these complex objects and exposes gaps that must eventually be filled in by the digital library community.
    Date
    26.12.2011 16:29:02
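    The abstract above describes crosswalks packaged as METS records and exposed to OAI harvesters. As a rough sketch of fetching and inspecting one such record (the endpoint and identifier are hypothetical; the GetRecord verb and the METS namespace are standard):

      import requests
      from lxml import etree

      # Hypothetical OAI-PMH endpoint and record identifier.
      params = {
          "verb": "GetRecord",
          "identifier": "oai:example.org:crosswalk-42",
          "metadataPrefix": "mets",
      }
      resp = requests.get("https://example.org/oai", params=params, timeout=30)
      root = etree.fromstring(resp.content)

      # A METS fileSec lists the files that make up the object, e.g. the
      # machine-readable crosswalk and its human-readable description.
      NS = {"mets": "http://www.loc.gov/METS/"}
      for f in root.findall(".//mets:file", NS):
          print(f.get("ID"), f.get("MIMETYPE"))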
  2. Wolfe, E.W.: ¬A case study in automated metadata enhancement : Natural Language Processing in the humanities (2019) 0.02
    0.019458683 = product of:
      0.07783473 = sum of:
        0.07783473 = weight(_text_:processing in 5236) [ClassicSimilarity], result of:
          0.07783473 = score(doc=5236,freq=4.0), product of:
            0.175792 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.043425296 = queryNorm
            0.4427661 = fieldWeight in 5236, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5236)
      0.25 = coord(1/4)
    
    Abstract
    The Black Book Interactive Project at the University of Kansas (KU) is developing an expanded corpus of novels by African American authors, with an emphasis on lesser-known writers and a goal of expanding research in this field. Using a custom metadata schema with an emphasis on race-related elements, each novel is analyzed for a variety of elements, such as literary style, targeted content, and historical context. Librarians at KU have worked to develop a variety of computational text-analysis processes designed to assist with specific aspects of this metadata collection, including text mining and natural language processing, automated subject extraction based on word-sense disambiguation, and harvesting data from Wikidata.
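    One of the steps listed above, harvesting data from Wikidata, might look roughly like the following sketch; the SPARQL endpoint and the property IDs (P50 = author, P577 = publication date) are real, while the author QID is a placeholder:

      import requests

      # Minimal Wikidata SPARQL harvest: works by a given author.
      ENDPOINT = "https://query.wikidata.org/sparql"
      query = """
      SELECT ?work ?workLabel ?date WHERE {
        ?work wdt:P50 wd:Q12345 .            # placeholder author QID
        OPTIONAL { ?work wdt:P577 ?date . }  # publication date, if recorded
        SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
      }
      """
      resp = requests.get(ENDPOINT, params={"query": query, "format": "json"},
                          timeout=60)
      for row in resp.json()["results"]["bindings"]:
          print(row["workLabel"]["value"], row.get("date", {}).get("value", ""))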
  3. Cranefield, S.: Networked knowledge representation and exchange using UML and RDF (2001) 0.01
    0.013759367 = product of:
      0.05503747 = sum of:
        0.05503747 = weight(_text_:processing in 5896) [ClassicSimilarity], result of:
          0.05503747 = score(doc=5896,freq=2.0), product of:
            0.175792 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.043425296 = queryNorm
            0.3130829 = fieldWeight in 5896, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5896)
      0.25 = coord(1/4)
    
    Abstract
    This paper proposes the use of the Unified Modeling Language (UML) as a language for modelling ontologies for Web resources and the knowledge contained within them. To provide a mechanism for serialising and processing object diagrams representing knowledge, a pair of XSLT stylesheets have been developed to map from XML Metadata Interchange (XMI) encodings of class diagrams to corresponding RDF schemas and to Java classes representing the concepts in the ontologies. The Java code includes methods for marshalling and unmarshalling object-oriented information between in-memory data structures and RDF serialisations of that information. This provides a convenient mechanism for Java applications to share knowledge on the Web.
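    In spirit, the marshalling step resembles the following Python/rdflib sketch (the paper generates Java from UML; the ontology namespace and class here are invented for illustration):

      from rdflib import RDF, Graph, Literal, Namespace, URIRef

      # Invented ontology namespace; the paper derives this mapping from a
      # UML class diagram, here it is written out by hand.
      ONT = Namespace("http://example.org/ontology#")

      def marshal_person(uri: str, name: str, age: int) -> Graph:
          """Serialise an in-memory 'Person' object as RDF triples."""
          g = Graph()
          g.bind("ont", ONT)
          subject = URIRef(uri)
          g.add((subject, RDF.type, ONT.Person))
          g.add((subject, ONT.name, Literal(name)))
          g.add((subject, ONT.age, Literal(age)))
          return g

      print(marshal_person("http://example.org/people/1", "Ada", 36)
            .serialize(format="xml"))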
  4. Roy, W.; Gray, C.: Preparing existing metadata for repository batch import : a recipe for a fickle food (2018) 0.01
    0.008826248 = product of:
      0.035304993 = sum of:
        0.035304993 = product of:
          0.05295749 = sum of:
            0.023539849 = weight(_text_:science in 4550) [ClassicSimilarity], result of:
              0.023539849 = score(doc=4550,freq=4.0), product of:
                0.11438741 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.043425296 = queryNorm
                0.20579056 = fieldWeight in 4550, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4550)
            0.029417641 = weight(_text_:22 in 4550) [ClassicSimilarity], result of:
              0.029417641 = score(doc=4550,freq=2.0), product of:
                0.15206799 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043425296 = queryNorm
                0.19345059 = fieldWeight in 4550, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4550)
          0.6666667 = coord(2/3)
      0.25 = coord(1/4)
    
    Abstract
    In 2016, the University of Waterloo began offering a mediated copyright review and deposit service to support the growth of our institutional repository, UWSpace. This resulted in the need to batch import large lists of published works into the institutional repository quickly and accurately. A range of methods has been proposed for harvesting publication metadata en masse, but many technological solutions can easily become detached from a workflow that is both reproducible for support staff and applicable to a range of situations. Many repositories offer the capacity for batch upload via CSV, so our method provides a template Python script that leverages the Habanero library to populate CSV files with existing metadata retrieved from the CrossRef API. In our case, we have combined this with useful metadata contained in a TSV file downloaded from Web of Science to enrich our records further. The appeal of this 'low-maintenance' method is that it provides more robust options for gathering metadata semi-automatically, requiring only that the user be able to access Web of Science and run the Python program, while still remaining flexible enough for local customizations.
    Date
    10.11.2018 16:27:22
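    A minimal sketch of the approach described above, assuming a list of DOIs is already on hand (the DOI and the column set are placeholders; Crossref.works() is the real Habanero call):

      import csv
      from habanero import Crossref

      # Populate a repository batch-import CSV from the CrossRef API.
      cr = Crossref()
      dois = ["10.1234/placeholder"]  # replace with the works to import

      with open("batch_import.csv", "w", newline="") as fh:
          writer = csv.writer(fh)
          writer.writerow(["doi", "title", "container", "year"])
          for doi in dois:
              msg = cr.works(ids=doi)["message"]
              writer.writerow([
                  doi,
                  (msg.get("title") or [""])[0],
                  (msg.get("container-title") or [""])[0],
                  msg.get("issued", {}).get("date-parts", [[None]])[0][0],
              ])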
  5. Broughton, V.: Automatic metadata generation : Digital resource description without human intervention (2007) 0.01
    0.005883528 = product of:
      0.023534112 = sum of:
        0.023534112 = product of:
          0.070602335 = sum of:
            0.070602335 = weight(_text_:22 in 6048) [ClassicSimilarity], result of:
              0.070602335 = score(doc=6048,freq=2.0), product of:
                0.15206799 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043425296 = queryNorm
                0.46428138 = fieldWeight in 6048, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=6048)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 9.2007 15:41:14
  6. Understanding metadata (2004) 0.00
    0.0039223526 = product of:
      0.01568941 = sum of:
        0.01568941 = product of:
          0.047068227 = sum of:
            0.047068227 = weight(_text_:22 in 2686) [ClassicSimilarity], result of:
              0.047068227 = score(doc=2686,freq=2.0), product of:
                0.15206799 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043425296 = queryNorm
                0.30952093 = fieldWeight in 2686, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=2686)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    10. 9.2004 10:22:40
  7. Sewing, S.: Bestandserhaltung und Archivierung : Koordinierung auf der Basis eines gemeinsamen Metadatenformates in den deutschen und österreichischen Bibliotheksverbünden (2021) 0.00
    0.002941764 = product of:
      0.011767056 = sum of:
        0.011767056 = product of:
          0.035301168 = sum of:
            0.035301168 = weight(_text_:22 in 266) [ClassicSimilarity], result of:
              0.035301168 = score(doc=266,freq=2.0), product of:
                0.15206799 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043425296 = queryNorm
                0.23214069 = fieldWeight in 266, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=266)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    22. 5.2021 12:43:05
  8. Baker, T.: ¬A grammar of Dublin Core (2000) 0.00
    0.0019611763 = product of:
      0.007844705 = sum of:
        0.007844705 = product of:
          0.023534114 = sum of:
            0.023534114 = weight(_text_:22 in 1236) [ClassicSimilarity], result of:
              0.023534114 = score(doc=1236,freq=2.0), product of:
                0.15206799 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.043425296 = queryNorm
                0.15476047 = fieldWeight in 1236, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1236)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Date
    26.12.2011 14:01:22
  9. Neumann, M.; Steinberg, J.; Schaer, P.: Web scraping for non-programmers : introducing OXPath for digital library metadata harvesting (2017) 0.00
    0.001387099 = product of:
      0.005548396 = sum of:
        0.005548396 = product of:
          0.016645188 = sum of:
            0.016645188 = weight(_text_:science in 3895) [ClassicSimilarity], result of:
              0.016645188 = score(doc=3895,freq=2.0), product of:
                0.11438741 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.043425296 = queryNorm
                0.1455159 = fieldWeight in 3895, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3895)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Abstract
    Building up new collections for digital libraries is a demanding task. Available data sets have to be extracted, which is usually done with the help of software developers, as it involves custom data handlers or conversion scripts. In cases where the desired data is only available on the data provider's website, custom web scrapers are needed. This may be the case for small to medium-sized publishers, research institutes, or funding agencies. As data curation is a typical task carried out by people with a library and information science background, these people are usually proficient with XML technologies but are not full-stack programmers. Therefore we would like to present a web scraping tool that does not demand that digital library curators program custom web scrapers from scratch. We present the open-source tool OXPath, an extension of XPath, which allows the user to define the data to be extracted from websites in a declarative way. Taking one of our own use cases as an example, we guide you in more detail through the process of creating an OXPath wrapper for metadata harvesting. We also point out some practical things to consider when creating a web scraper (with OXPath). On top of that, we present a syntax-highlighting plugin for the popular text editor Atom that we developed to further support OXPath users and to simplify the authoring process.
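    OXPath itself embeds page actions and extraction markers directly in XPath expressions; as a plain-Python analogue of the extraction step only (this is not OXPath syntax, and the URL and element paths are hypothetical):

      import requests
      from lxml import html

      # Harvest metadata with plain XPath; OXPath layers actions such as
      # form filling and clicking on top of expressions like these.
      page = html.fromstring(
          requests.get("https://example.org/publications", timeout=30).content)

      records = []
      for item in page.xpath("//div[@class='publication']"):
          records.append({
              "title": item.xpath("string(.//h2)").strip(),
              "year":  item.xpath("string(.//span[@class='year'])").strip(),
          })
      print(records)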
  10. Roszkowski, M.; Lukas, C.: ¬A distributed architecture for resource discovery using metadata (1998) 0.00
    0.0011096792 = product of:
      0.004438717 = sum of:
        0.004438717 = product of:
          0.01331615 = sum of:
            0.01331615 = weight(_text_:science in 1256) [ClassicSimilarity], result of:
              0.01331615 = score(doc=1256,freq=2.0), product of:
                0.11438741 = queryWeight, product of:
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.043425296 = queryNorm
                0.11641272 = fieldWeight in 1256, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  2.6341193 = idf(docFreq=8627, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1256)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Abstract
    This article describes an approach for linking geographically distributed collections of metadata so that they are searchable as a single collection. We describe the infrastructure, which uses standard Internet protocols such as the Lightweight Directory Access Protocol (LDAP) and the Common Indexing Protocol (CIP), to distribute queries, return results, and exchange index information. We discuss the advantages of using linked collections of authoritative metadata as an alternative to using a keyword indexing search-engine for resource discovery. We examine other architectures that use metadata for resource discovery, such as Dienst/NCSTRL, the AHDS HTTP/Z39.50 Gateway, and the ROADS initiative. Finally, we discuss research issues and future directions of the project. The Internet Scout Project, which is funded by the National Science Foundation and is located in the Computer Sciences Department at the University of Wisconsin-Madison, is charged with assisting the higher education community in resource discovery on the Internet. To that end, the Scout Report and subsequent subject-specific Scout Reports were developed to guide the U.S. higher education community to research-quality resources. The Scout Report Signpost utilizes the content from the Scout Reports as the basis of a metadata collection. Signpost consists of more than 2000 cataloged Internet sites using established standards such as Library of Congress subject headings and abbreviated call letters, and emerging standards such as the Dublin Core (DC). This searchable and browseable collection is free and freely accessible, as are all of the Internet Scout Project's services.
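    As a sketch of the LDAP side of such an architecture (the host, base DN, and attribute names are invented; only the ldap3 API is real; in the architecture above, CIP index exchange decides which servers receive a given query):

      from ldap3 import SUBTREE, Connection, Server

      # Query one metadata collection in the distributed mesh.
      server = Server("ldap://metadata.example.org")
      conn = Connection(server, auto_bind=True)

      conn.search(
          search_base="ou=signpost,dc=example,dc=org",
          search_filter="(&(objectClass=document)(subject=metadata))",
          search_scope=SUBTREE,
          attributes=["title", "subject", "url"],
      )
      for entry in conn.entries:
          print(entry.title, entry.url)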