This database contains over 40,000 documents on topics from the fields of descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 4 June 2021)
1 Velden, V. ; Lagoze, C.: The extraction of community structures from publication networks to support ethnographic observations of field differences in scientific communication.
In: Journal of the American Society for Information Science and Technology. 64(2013) no.12, S.2405-2427.
Abstract: The scientific community of researchers in a research specialty is an important unit of analysis for understanding the field-specific shaping of scientific communication practices. These scientific communities are, however, a challenging unit of analysis to capture and compare because they overlap, have fuzzy boundaries, and evolve over time. We describe a network analytic approach that reveals the complexities of these communities through the examination of their publication networks in combination with insights from ethnographic field studies. We suggest that the structures revealed indicate overlapping subcommunities within a research specialty, and we provide evidence that they differ in disciplinary orientation and research practices. By mapping the community structures of scientific fields we increase confidence about the domain of validity of ethnographic observations as well as of collaborative patterns extracted from publication networks thereby enabling the systematic study of field differences. The network analytic methods presented include methods to optimize the delineation of a bibliographic data set to adequately represent a research specialty and methods to extract community structures from this data. We demonstrate the application of these methods in a case study of two research specialties in the physical and chemical sciences.
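As a loose illustration of the kind of publication-network analysis the abstract describes (not the authors' actual method, which handles overlapping, fuzzy-boundary communities), one can build a co-authorship graph and take its connected components as a crude first proxy for community structure, using only the standard library:

```python
from collections import defaultdict

def coauthor_graph(papers):
    """Build an undirected co-authorship graph: each paper is a list of
    author names, and every pair of co-authors gets an edge."""
    graph = defaultdict(set)
    for authors in papers:
        for a in authors:
            for b in authors:
                if a != b:
                    graph[a].add(b)
    return graph

def components(graph):
    """Return connected components -- a crude proxy for communities;
    real community detection would allow overlap and weighting."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups

# Hypothetical toy data, not from the paper's case study.
papers = [["Velden", "Lagoze"], ["Velden", "Haque"], ["Smith", "Jones"]]
print(sorted(len(g) for g in components(coauthor_graph(papers))))  # [2, 3]
```

Real delineation of a research specialty, as the paper stresses, additionally requires optimizing which bibliographic records enter the data set in the first place.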
2 Van de Sompel, H. ; Nelson, M.L. ; Lagoze, C. ; Warner, S.: Resource harvesting within the OAI-PMH framework.
In: D-Lib magazine. 10(2004) no.12, x S.
Abstract: Motivated by preservation and resource discovery, we examine how digital resources, and not just metadata about resources, can be harvested using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). We review and critique existing techniques for identifying and gathering digital resources using metadata harvested through the OAI-PMH. We introduce an alternative solution that builds on the introduction of complex object formats that provide a more accurate way to describe digital resources. We argue that the use of complex object formats as OAI-PMH metadata formats results in a reliable and attractive approach for incremental harvesting of resources using the OAI-PMH.
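OAI-PMH itself is plain HTTP GET with query parameters, so the incremental harvesting the abstract discusses can be sketched minimally as follows (the endpoint URL is hypothetical; only the protocol's verbs and namespaces are real):

```python
import urllib.parse
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Datestamp-based selective harvesting enables incremental updates.
        params["from"] = from_date
    return base_url + "?" + urllib.parse.urlencode(params)

def parse_identifiers(response_xml):
    """Pull record identifiers out of a ListRecords response document."""
    root = ET.fromstring(response_xml)
    return [h.findtext(OAI_NS + "identifier")
            for h in root.iter(OAI_NS + "header")]

url = list_records_url("http://export.example.org/oai", from_date="2004-11-01")
```

The paper's point is that what travels in the `<metadata>` payload need not be simple descriptive metadata: a complex object format there can carry (or point to) the resource itself.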
Note: Cf. http://dlib.ukoln.ac.uk/dlib/december04/vandesompel/12vandesompel.html.
3 Faaborg, A. ; Lagoze, C.: Semantic browsing.
In: Research and advanced technology for digital libraries : 7th European Conference, proceedings / ECDL 2003, Trondheim, Norway, August 17-22, 2003. Berlin : Springer, 2003. S.70-81.
(Lecture notes in computer science; vol.2769)
Abstract: We have created software applications that allow users to both author and use Semantic Web metadata. To create and use a layer of semantic content on top of the existing Web, we have (1) implemented a user interface that expedites the task of attributing metadata to resources on the Web, and (2) augmented a Web browser to leverage this semantic metadata to provide relevant information and tasks to the user. This project provides a framework for annotating and reorganizing existing files, pages, and sites on the Web that is similar to Vannevar Bush's original concepts of trail blazing and associative indexing.
Subject area: Semantic context in indexing and retrieval ; Semantic Web
4 Lagoze, C. ; Van de Sompel, H.: The making of the Open Archives Initiative Protocol for Metadata Harvesting.
In: Library hi tech. 21(2003) no.2, S.118-128.
Abstract: The authors, who jointly serve as the Open Archives Initiative (OAI) executive, reflect on the three-year history of the OAI. Three years of technical work recently culminated in the release of a stable production version 2 of the OAI Protocol for Metadata Harvesting (OAI-PMH). This technical product, the work that led up to it, and the process that made it possible have attracted some favor from the digital library and information community. The paper explores a number of factors in the history of the OAI that the authors believe have contributed to this positive response. The factors include focus on a defined problem statement, an operational model in which strong leadership is balanced with solicited participation, a healthy dose of community building and support, and sensible technical decisions.
Content: See also: http://www.emeraldinsight.com/10.1108/07378830310479776.
5 Arms, W.Y. ; Dushay, N. ; Fulker, D. ; Lagoze, C.: A case study in metadata harvesting : the NSDL.
In: Library hi tech. 21(2003) no.2, S.228-237.
Abstract: This paper describes the use of the Open Archives Initiative Protocol for Metadata Harvesting in the NSF's National Science Digital Library (NSDL). The protocol is used both as a method to ingest metadata into a central Metadata Repository and also as the means by which the repository exports metadata to service providers. The NSDL Search Service is used to illustrate this architecture. An early version of the Metadata Repository was an alpha test site for version 1 of the protocol and the production repository was a beta test site for version 2. This paper describes the implementation experience and early practical tests. Despite some teething troubles and the long-term difficulties of semantic compatibility, the overall conclusion is optimism that the Open Archives Initiative will be a successful part of the NSDL.
Content: See also: http://www.emeraldinsight.com/10.1108/07378830310479866.
7 Hitchcock, S. ; Bergmark, D. ; Brody, T. ; Gutteridge, C. ; Carr, L. ; Hall, W. ; Lagoze, C. ; Harnad, S.: Open citation linking : the way forward.
In: D-Lib magazine. 8(2002) no.10, x S.
Abstract: The speed of scientific communication - the rate of ideas affecting other researchers' ideas - is increasing dramatically. The factor driving this is free, unrestricted access to research papers. Measurements of user activity in mature eprint archives of research papers such as arXiv have shown, for the first time, the degree to which such services support an evolving network of texts commenting on, citing, classifying, abstracting, listing and revising other texts. The Open Citation project has built tools to measure this activity, to build new archives, and has been closely involved with the development of the infrastructure to support open access on which these new services depend. This is the story of the project, intertwined with the concurrent emergence of the Open Archives Initiative (OAI). The paper describes the broad scope of the project's work, showing how it has progressed from early demonstrators of reference linking to produce Citebase, a Web-based citation and impact-ranked search service, and how it has supported the development of the EPrints.org software for building OAI-compliant archives. The work has been underpinned by analysis and experiments on the semantics of documents (digital objects) to determine the features required for formally perfect linking - instantiated as an application programming interface (API) for reference linking - that will enable other applications to build on this work in broader digital library information environments.
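Reference linking of the kind the project pursued starts from plain-text references; as a loose illustration (not the project's actual code), a first pass might extract old-style arXiv identifiers with a deliberately simplified pattern:

```python
import re

# Old-style arXiv identifiers, e.g. "hep-th/9901001", as used in the
# eprint archives of that era. This pattern is a simplification for
# illustration, not the project's production matcher.
ARXIV_ID = re.compile(r"\b([a-z-]+(?:\.[A-Z]{2})?/\d{7})\b")

def extract_arxiv_refs(reference_text):
    """Return arXiv identifiers cited in a free-text reference string,
    as a first step toward resolving them to linkable targets."""
    return ARXIV_ID.findall(reference_text)

refs = "See A. Author, hep-th/9901001, and B. Author, astro-ph/0204507."
print(extract_arxiv_refs(refs))  # ['hep-th/9901001', 'astro-ph/0204507']
```

Formally perfect linking, as the abstract notes, needs far more than pattern matching: a document model and an API through which other services can consume the extracted links.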
Note: Cf. http://dlib.ukoln.ac.uk/dlib/october02/hitchcock/10hitchcock.html.
Object: Open Citation project
8 Lagoze, C.: Keeping Dublin Core simple : cross-domain discovery or resource description?
In: D-Lib magazine. 7(2001) no.1, xx S.
Abstract: Reality is messy. Individuals perceive or define objects differently. Objects may change over time, morphing into new versions of their former selves or into things altogether different. A book can give rise to a translation, derivation, or edition, and these resulting objects are related in complex ways to each other and to the people and contexts in which they were created or transformed. Providing a normalized view of such a messy reality is a precondition for managing information. From the first library catalogs, through Melvil Dewey's Decimal Classification system in the nineteenth century, to today's MARC encoding of AACR2 cataloging rules, libraries have epitomized the process of what David Levy calls "order making", whereby catalogers impose a veneer of regularity on the natural disorder of the artifacts they encounter. The pre-digital library within which the Catalog and its standards evolved was relatively self-contained and controlled. Creating and maintaining catalog records was, and still is, the task of professionals. Today's Web, in contrast, has brought together a diversity of information management communities, with a variety of order-making standards, into what Stuart Weibel has called the Internet Commons. The sheer scale of this context has motivated a search for new ways to describe and index information. Second-generation search engines such as Google can yield astonishingly good search results, while tools such as ResearchIndex for automatic citation indexing and techniques for inferring "Web communities" from constellations of hyperlinks promise even better methods for focusing queries on information from authoritative sources. Such "automated digital libraries," according to Bill Arms, promise to radically reduce the cost of managing information. Alongside the development of such automated methods, there is increasing interest in metadata as a means of imposing pre-defined order on Web content. 
While the size and changeability of the Web makes professional cataloging impractical, a minimal amount of information ordering, such as that represented by the Dublin Core (DC), may vastly improve the quality of an automatic index at low cost; indeed, recent work suggests that some types of simple description may be generated with little or no human intervention. ; Metadata is not monolithic. Instead, it is helpful to think of metadata as multiple views that can be projected from a single information object. Such views can form the basis of customized information services, such as search engines. Multiple views -- different types of metadata associated with a Web resource -- can facilitate a "drill-down" search paradigm, whereby people start their searches at a high level and later narrow their focus using domain-specific search categories. In Figure 1, for example, Mona Lisa may be viewed from the perspective of non-specialized searchers, with categories that are valid across domains (who painted it and when?); in the context of a museum (when and how was it acquired?); in the geo-spatial context of a walking tour using mobile devices (where is it in the gallery?); and in a legal framework (who owns the rights to its reproduction?). Multiple descriptive views imply a modular approach to metadata. Modularity is the basis of metadata architectures such as the Resource Description Framework (RDF), which permit different communities of expertise to associate and maintain multiple metadata packages for Web resources. As noted elsewhere, static association of multiple metadata packages with resources is but one way of achieving modularity. Another method is to computationally derive order-making views customized to the current needs of a client. This paper examines the evolution and scope of the Dublin Core from this perspective of metadata modularization. 
Dublin Core began in 1995 with a specific goal and scope -- as an easy-to-create and maintain descriptive format to facilitate cross-domain resource discovery on the Web. Over the years, this goal of "simple metadata for coarse-granularity discovery" came to mix with another goal -- that of community and domain-specific resource description and its attendant complexity. A notion of "qualified Dublin Core" evolved whereby the model for simple resource discovery -- a set of simple metadata elements in a flat, document-centric model -- would form the basis of more complex descriptions by treating the values of its elements as entities with properties ("component elements") in their own right. ; At the time of writing, the Dublin Core Metadata Initiative (DCMI) has clarified its commitment to the simple approach. The qualification principles announced in early 2000 support the use of DC elements as the basis for simple statements about resources, rather than as the foundation for more descriptive clauses. This paper takes a critical look at some of the issues that led up to this renewed commitment to simplicity. We argue that: * There remains a compelling need for simple, "pidgin" metadata. From a technical and economic perspective, document-centric metadata, where simple string values are associated with a finite set of properties, is most appropriate for generic, cross-domain discovery queries in the Internet Commons. Such metadata is not necessarily fixed in physical records, but may be projected algorithmically from more complex metadata or from content itself. * The Dublin Core, while far from perfect from an engineering perspective, is an acceptable standard for such simple metadata. Agreements in the global information space are as much social as technical, and the process by which the Dublin Core has been developed, involving a broad cross-section of international participants, is a model for such "socially developed" standards. 
* Efforts to introduce complexity into Dublin Core are misguided. Complex descriptions may be necessary for some Web resources and for some purposes, such as administration, preservation, and reference linking. However, complex descriptions require more expressive data models that differentiate between agents, documents, contexts, events, and the like. An attempt to intermix simplicity and complexity, and the data models most appropriate for them, defeats the equally noble goals of cross-domain description and extensive resource description. * The principle of modularity suggests that metadata formats tailored for simplicity be used alongside others tailored for complexity.
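The "pidgin" model the paper argues for is exactly a flat list of (element, string-value) pairs. A minimal sketch of serializing such a record as Dublin Core XML (the Mona Lisa values are the paper's own example; the function name is hypothetical):

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)

def simple_dc(pairs):
    """Serialize flat (element, string-value) pairs -- the document-centric
    'pidgin' model -- as Dublin Core XML. Elements may repeat; values are
    plain strings, never structured entities with their own properties."""
    root = ET.Element("metadata")
    for element, value in pairs:
        child = ET.SubElement(root, "{%s}%s" % (DC, element))
        child.text = value
    return ET.tostring(root, encoding="unicode")

record = simple_dc([
    ("title", "Mona Lisa"),
    ("creator", "Leonardo da Vinci"),
    ("date", "1503"),
])
```

The paper's argument is that such records need not even be stored: they can be projected algorithmically from richer metadata or from content itself, leaving complex description to separate, modular formats.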
Note: Cf. http://dlib.ukoln.ac.uk/dlib/january01/lagoze/01lagoze.html.
Object: Dublin Core
9 Davis, J.R. ; Lagoze, C.: NCSTRL : design and deployment of a globally distributed digital library.
In: Journal of the American Society for Information Science. 51(2000) no.3, S.273-280.
Abstract: The WWW provides unprecedented access to globally distributed content. The extent and uniform accessibility of the Web has proven beneficial for research, education, commerce, entertainment, and numerous other uses. Ironically, the fact that the Web is an information space without boundaries has also proven its biggest flaw. Key aspects of libraries, such as selectivity of content, customization of tools and services relative to collection and patron characteristics, and management of content and services are noticeably absent. In this paper, we review our experiences with NCSTRL and Dienst, describe some of the lessons we have learned from the deployment experience, and define some directions for the future.
10 Payette, S. ; Blanchi, C. ; Lagoze, C. ; Overly, E.A.: Interoperability for digital objects and repositories : the Cornell/CNRI experiments.
In: D-Lib magazine. 5(1999) no.5, xx S.
Abstract: For several years the Digital Library Research Group at Cornell University and the Corporation for National Research Initiatives (CNRI) have been engaged in research focused on the design and development of infrastructures for open architecture, confederated digital libraries. The goal of this effort is to achieve interoperability and extensibility of digital library systems through the definition of key digital library services and their open interfaces, allowing flexible interaction of existing services and augmentation of the infrastructure with new services. Some aspects of this research have included the development and deployment of the Dienst software, the Handle System®, and the architecture of digital objects and repositories. In this paper, we describe the joint effort by Cornell and CNRI to prototype a rich and deployable architecture for interoperable digital objects and repositories. This effort has challenged us to move theories of interoperability closer to practice. The Cornell/CNRI collaboration builds on two existing projects focusing on the development of interoperable digital libraries. Details relating to the technology of these projects are described elsewhere. Both projects were strongly influenced by the fundamental abstractions of repositories and digital objects as articulated by Kahn and Wilensky in A Framework for Distributed Digital Object Services. Furthermore, both programs were influenced by the container architecture described in the Warwick Framework, and by the notions of distributed dynamic objects presented by Lagoze and Daniel in their Distributed Active Relationship work. With these common roots, one would expect that the CNRI and Cornell repositories would be at least theoretically interoperable. However, the actual test would be the extent to which our independently developed repositories were practically interoperable. 
This paper focuses on the definition of interoperability in the joint Cornell/CNRI work and the set of experiments conducted to formally test it. Our motivation for this work is the eventual deployment of formally tested reference implementations of the repository architecture for experimentation and development by fellow digital library researchers. In Section 2, we summarize the digital object and repository approach that was the focus of our interoperability experiments. In Section 3, we describe the set of experiments that progressively tested interoperability at increasing levels of functionality. In Section 4, we discuss general conclusions, and in Section 5, we give a preview of our future work, including our plans to evolve our experimentation to the point of defining a set of formal metrics for measuring interoperability for repositories and digital objects. This is still a work in progress that is expected to undergo additional refinements during its development.
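The interoperability being tested above amounts to the same client code running unchanged against independently built repositories that expose a common open interface. A minimal sketch of that idea (class and method names are hypothetical, not the Cornell/CNRI API):

```python
from abc import ABC, abstractmethod

class Repository(ABC):
    """A deliberately minimal repository interface. Interoperability
    experiments can then be phrased as one test suite exercised against
    several independent implementations of this interface."""

    @abstractmethod
    def get_object(self, handle):
        """Return the digital object registered under a handle."""

    @abstractmethod
    def disseminate(self, handle, view):
        """Return a named view (e.g. 'metadata') of the object."""

class InMemoryRepository(Repository):
    """One trivial implementation; a second, independently written one
    would be the other party in an interoperability test."""

    def __init__(self):
        self._store = {}

    def put(self, handle, obj):
        self._store[handle] = obj

    def get_object(self, handle):
        return self._store[handle]

    def disseminate(self, handle, view):
        return self._store[handle][view]

repo = InMemoryRepository()
repo.put("hdl:1/1", {"metadata": {"title": "Test object"}})
print(repo.disseminate("hdl:1/1", "metadata")["title"])  # Test object
```

Testing "at increasing levels of functionality", in these terms, means growing the shared interface one method at a time and re-running the common suite against both implementations.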
Note: Cf. http://dlib.ukoln.ac.uk/dlib/may99/payette/05payette.html.
11 Daniel Jr., R. ; Lagoze, C.: Extending the Warwick framework : from metadata containers to active digital objects.
In: D-Lib magazine. 3(1997) no.11, xx S.
Abstract: Defining metadata as "data about data" provokes more questions than it answers. What are the forms of the data and metadata? Can we be more specific about the manner in which the metadata is "about" the data? Are data and metadata distinguished only in the context of their relationship? Is the nature of the relationship between the datasets declarative or procedural? Can the metadata itself be described by other data? Over the past several years, we have been engaged in a number of efforts examining the role, format, composition, and architecture of metadata for networked resources. During this time, we have noticed the tendency to be led astray by comfortable, but somewhat inappropriate, models in the non-digital information environment. Rather than pursuing familiar models, there is the need for a new model that fully exploits the unique combination of computation and connectivity that characterizes the digital library. In this paper, we describe an extension of the Warwick Framework that we call Distributed Active Relationships (DARs). DARs provide a powerful model for representing data and metadata in digital library objects. They explicitly express the relationships between networked resources, and even allow those relationships to be dynamically downloadable and executable. The DAR model is based on the following principles, which our examination of the "data about data" definition has led us to regard as axiomatic: * There is no essential distinction between data and metadata. We can only make such a distinction in terms of a particular "about" relationship. As a result, what is metadata in the context of one "about" relationship may be data in another. * There is no single "about" relationship. There are many different and important relationships between data resources. * Resources can be related without regard for their location. 
The connectivity in networked information architectures makes it possible to have data in one repository describe data in another repository. * The computational power of the networked information environment makes it possible to consider active or dynamic relationships between data sets. This adds considerable power to the "data about data" definition. First, data about another data set may not physically exist, but may be automatically derived. Second, the "about" relationship may be an executable object -- in a sense interpretable metadata. As will be shown, this provides useful mechanisms for handling complex metadata problems such as rights management of digital objects. The remainder of this paper describes the development and consequences of the DAR model. Section 2 reviews the Warwick Framework, which is the basis for the model described in this paper. Section 3 examines the concept of the Warwick Framework Catalog, which provides a mechanism for expressing the relationships between the packages in a Warwick Framework container. With that background established, section 4 generalizes the Warwick Framework by removing the restriction that it only contains "metadata". This allows us to consider digital library objects that are aggregations of (possibly distributed) data sets, with the relationships between the data sets expressed using a Warwick Framework Catalog. Section 5 further extends the model by describing Distributed Active Relationships (DARs). DARs are the explicit relationships that have the potential to be executable, as alluded to earlier. Finally, section 6 describes two possible implementations of these concepts.
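The core DAR claims -- that "about" is an explicit, typed relationship, and that the relationship itself may be executable rather than stored -- can be sketched in a few lines (class names and fields are illustrative assumptions, not the paper's formal model):

```python
class Resource:
    """A networked resource; its payload may live in any repository."""
    def __init__(self, name, payload):
        self.name, self.payload = name, payload

class Relationship:
    """An explicit, typed 'about' relationship between resources.
    Following the DAR idea, the relationship can be executable: instead
    of stored metadata, a callable derives the description on demand."""
    def __init__(self, kind, source, target=None, derive=None):
        self.kind, self.source = kind, source
        self.target, self.derive = target, derive

    def resolve(self):
        if self.derive is not None:
            return self.derive(self.source)   # metadata computed on demand
        return self.target.payload            # metadata stored as another resource

doc = Resource("report", "Interoperability results summary")
stored = Relationship("dc-description", doc,
                      target=Resource("report.meta", {"title": "Report"}))
derived = Relationship("word-count", doc,
                       derive=lambda r: {"words": len(r.payload.split())})
print(stored.resolve(), derived.resolve())
```

Note that the same `Resource` can be data in one relationship and metadata in another, which is exactly the paper's first axiom: the data/metadata distinction exists only relative to a particular "about" relationship.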
Note: Cf. http://dlib.ukoln.ac.uk/dlib/november97/daniel/11daniel.html.
Object: Warwick framework