Search (21 results, page 1 of 2)

  • type_ss:"a"
  • type_ss:"el"
  • year_i:[1990 TO 2000}
  1. Pitti, D.V.: Encoded Archival Description : an introduction and overview (1999) 0.01
    0.013643255 = product of:
      0.06139465 = sum of:
        0.036990993 = weight(_text_:bibliographic in 1152) [ClassicSimilarity], result of:
          0.036990993 = score(doc=1152,freq=2.0), product of:
            0.14333439 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.036818076 = queryNorm
            0.2580748 = fieldWeight in 1152, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.046875 = fieldNorm(doc=1152)
        0.024403658 = weight(_text_:data in 1152) [ClassicSimilarity], result of:
          0.024403658 = score(doc=1152,freq=2.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.2096163 = fieldWeight in 1152, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=1152)
      0.22222222 = coord(2/9)
    
    Abstract
    Encoded Archival Description (EAD) is an emerging standard used internationally in an increasing number of archives and manuscripts libraries to encode data describing corporate records and personal papers. The individual descriptions are variously called finding aids, guides, handlists, or catalogs. While archival description shares many objectives with bibliographic description, it differs from it in several essential ways. From its inception, EAD was based on SGML, and, with the release of EAD version 1.0 in 1998, it is also compliant with XML. EAD was, and continues to be, developed by the archival community. While development was initiated in the United States, international interest and contribution are increasing. EAD is currently administered and maintained jointly by the Society of American Archivists and the United States Library of Congress. Developers are currently exploring ways to internationalize the administration and maintenance of EAD to reflect and represent the expanding base of users.
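    The relevance figures shown with each hit follow Lucene's ClassicSimilarity formula: for every matching query term, weight = queryWeight × fieldWeight, with queryWeight = idf × queryNorm and fieldWeight = tf × idf × fieldNorm; the per-term weights are summed and multiplied by the coordination factor coord. A minimal Python sketch, using the numbers from the breakdown above, reproduces the score of this first hit.

      # Reproduce the ClassicSimilarity breakdown shown for doc 1152.
      query_norm = 0.036818076

      def term_weight(freq, idf, field_norm):
          tf = freq ** 0.5                 # tf(freq) = sqrt(freq) -> 1.4142135 for freq=2
          query_weight = idf * query_norm  # idf(t) * queryNorm
          field_weight = tf * idf * field_norm
          return query_weight * field_weight

      w_bibliographic = term_weight(2.0, 3.893044, 0.046875)
      w_data          = term_weight(2.0, 3.1620505, 0.046875)

      coord = 2 / 9                        # 2 of 9 query clauses matched
      score = coord * (w_bibliographic + w_data)
      print(round(score, 9))               # ~0.0136433, matching the 0.013643255 shown above
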
  2. Miller, E.: ¬An introduction to the Resource Description Framework (1998) 0.01
    0.011943011 = product of:
      0.1074871 = sum of:
        0.1074871 = weight(_text_:readable in 1231) [ClassicSimilarity], result of:
          0.1074871 = score(doc=1231,freq=2.0), product of:
            0.2262076 = queryWeight, product of:
              6.1439276 = idf(docFreq=257, maxDocs=44218)
              0.036818076 = queryNorm
            0.47517014 = fieldWeight in 1231, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.1439276 = idf(docFreq=257, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1231)
      0.11111111 = coord(1/9)
    
    Abstract
    The Resource Description Framework (RDF) is an infrastructure that enables the encoding, exchange and reuse of structured metadata. RDF is an application of XML that imposes needed structural constraints to provide unambiguous methods of expressing semantics. RDF additionally provides a means for publishing both human-readable and machine-processable vocabularies designed to encourage the reuse and extension of metadata semantics among disparate information communities. The structural constraints RDF imposes to support the consistent encoding and exchange of standardized metadata provide for the interchangeability of separate packages of metadata defined by different resource description communities.
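    As a rough illustration of the structured, machine-processable metadata the abstract describes (not code from the paper), the sketch below builds a small graph of Dublin Core statements with the rdflib library and serializes it as RDF/XML; the resource URI and literal values are invented, and rdflib itself postdates the article.

      from rdflib import Graph, Literal, Namespace, URIRef

      DC = Namespace("http://purl.org/dc/elements/1.1/")

      g = Graph()
      g.bind("dc", DC)

      doc = URIRef("http://example.org/docs/rdf-intro")   # illustrative URI
      g.add((doc, DC.creator, Literal("Eric Miller")))
      g.add((doc, DC.title, Literal("An introduction to the Resource Description Framework")))
      g.add((doc, DC.date, Literal("1998")))

      # Serialize the same statements as RDF/XML for exchange between communities.
      print(g.serialize(format="xml"))
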
  3. Bearman, D.; Miller, E.; Rust, G.; Trant, J.; Weibel, S.: ¬A common model to support interoperable metadata : progress report on reconciling metadata requirements from the Dublin Core and INDECS/DOI communities (1999) 0.01
    0.011369381 = product of:
      0.051162213 = sum of:
        0.03082583 = weight(_text_:bibliographic in 1249) [ClassicSimilarity], result of:
          0.03082583 = score(doc=1249,freq=2.0), product of:
            0.14333439 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.036818076 = queryNorm
            0.21506234 = fieldWeight in 1249, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1249)
        0.020336384 = weight(_text_:data in 1249) [ClassicSimilarity], result of:
          0.020336384 = score(doc=1249,freq=2.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.17468026 = fieldWeight in 1249, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1249)
      0.22222222 = coord(2/9)
    
    Abstract
    The Dublin Core metadata community and the INDECS/DOI community of authors, rights holders, and publishers are seeking common ground in the expression of metadata for information resources. Recent meetings at the 6th Dublin Core Workshop in Washington DC sketched out common models for semantics (informed by the requirements articulated in the IFLA Functional Requirements for Bibliographic Records) and conventions for knowledge representation (based on the Resource Description Framework under development by the W3C). Further development of detailed requirements is planned by both communities in the coming months with the aim of fully representing the metadata needs of each. An open "Schema Harmonization" working group has been established to identify a common framework to support interoperability among these communities. The present document represents a starting point identifying historical developments and common requirements of these perspectives on metadata and charts a path for harmonizing their respective conceptual models. It is hoped that collaboration over the coming year will result in agreed semantic and syntactic conventions that will support a high degree of interoperability among these communities, ideally expressed in a single data model and using common, standard tools.
  4. Hill, L.L.; Frew, J.; Zheng, Q.: Geographic names : the implementation of a gazetteer in a georeferenced digital library (1999) 0.01
    0.010593033 = product of:
      0.04766865 = sum of:
        0.024660662 = weight(_text_:bibliographic in 1240) [ClassicSimilarity], result of:
          0.024660662 = score(doc=1240,freq=2.0), product of:
            0.14333439 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.036818076 = queryNorm
            0.17204987 = fieldWeight in 1240, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03125 = fieldNorm(doc=1240)
        0.02300799 = weight(_text_:data in 1240) [ClassicSimilarity], result of:
          0.02300799 = score(doc=1240,freq=4.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.19762816 = fieldWeight in 1240, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=1240)
      0.22222222 = coord(2/9)
    
    Abstract
    The Alexandria Digital Library (ADL) Project has developed a content standard for gazetteer objects and a hierarchical type scheme for geographic features. Both of these developments are based on ADL experience with an earlier gazetteer component for the Library, based on two gazetteers maintained by the U.S. federal government. We define the minimum components of a gazetteer entry as (1) a geographic name, (2) a geographic location represented by coordinates, and (3) a type designation. With these attributes, a gazetteer can function as a tool for indirect spatial location identification through names and types. The ADL Gazetteer Content Standard supports contribution and sharing of gazetteer entries with rich descriptions beyond the minimum requirements. This paper describes the content standard, the feature type thesaurus, and the implementation and research issues. A gazetteer is a list of geographic names, together with their geographic locations and other descriptive information. A geographic name is a proper name for a geographic place or feature, such as Santa Barbara County, Mount Washington, St. Francis Hospital, and Southern California. There are many types of printed gazetteers. For example, the New York Times Atlas has a gazetteer section that can be used to look up a geographic name and find the page(s) and grid reference(s) where the corresponding feature is shown. Some gazetteers provide information about places and features; for example, a history of the locale, population data, physical data such as elevation, or the pronunciation of the name. Some lists of geographic names are available as hierarchical term sets (thesauri) designed for information retrieval; these are used to describe bibliographic or museum materials. Examples include the authority files of the U.S. Library of Congress and the GeoRef Thesaurus produced by the American Geological Institute. The Getty Museum has recently made their Thesaurus of Geographic Names available online. This is a major project to develop a controlled vocabulary of current and historical names to describe (i.e., catalog) art and architecture literature. U.S. federal government mapping agencies maintain gazetteers containing the official names of places and/or the names that appear on map series. Examples include the U.S. Geological Survey's Geographic Names Information System (GNIS) and the National Imagery and Mapping Agency's Geographic Names Processing System (GNPS). Both of these are maintained in cooperation with the U.S. Board on Geographic Names (BGN). Many other examples could be cited -- for local areas, for other countries, and for special purposes. There is remarkable diversity in approaches to the description of geographic places and no standardization beyond authoritative sources for the geographic names themselves.
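    The three minimum components named above (name, coordinates, type designation) map naturally onto a small record structure. The sketch below is purely illustrative, not the ADL Gazetteer Content Standard itself; the entries, coordinates, and type labels are approximate or invented.

      from dataclasses import dataclass

      @dataclass
      class GazetteerEntry:
          name: str                 # geographic name, e.g. "Santa Barbara County"
          latitude: float           # location represented by coordinates
          longitude: float
          feature_type: str         # designation drawn from a feature-type thesaurus

      gazetteer = [
          GazetteerEntry("Santa Barbara County", 34.72, -120.03, "administrative areas"),
          GazetteerEntry("Mount Washington", 44.27, -71.30, "mountains"),
      ]

      # Indirect spatial location identification: find coordinates by name and type.
      def locate(name, feature_type=None):
          return [e for e in gazetteer
                  if e.name == name and (feature_type is None or e.feature_type == feature_type)]

      print(locate("Mount Washington"))
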
  5. Atkins, H.: ¬The ISI® Web of Science® - links and electronic journals : how links work today in the Web of Science, and the challenges posed by electronic journals (1999) 0.01
    0.009095504 = product of:
      0.04092977 = sum of:
        0.024660662 = weight(_text_:bibliographic in 1246) [ClassicSimilarity], result of:
          0.024660662 = score(doc=1246,freq=2.0), product of:
            0.14333439 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.036818076 = queryNorm
            0.17204987 = fieldWeight in 1246, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.03125 = fieldNorm(doc=1246)
        0.016269106 = weight(_text_:data in 1246) [ClassicSimilarity], result of:
          0.016269106 = score(doc=1246,freq=2.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.1397442 = fieldWeight in 1246, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.03125 = fieldNorm(doc=1246)
      0.22222222 = coord(2/9)
    
    Abstract
    Since their inception in the early 1960s the strength and unique aspect of the ISI citation indexes has been their ability to illustrate the conceptual relationships between scholarly documents. When authors create reference lists for their papers, they make explicit links between their own, current work and the prior work of others. The exact nature of these links may not be expressed in the references themselves, and the motivation behind them may vary (this has been the subject of much discussion over the years), but the links embodied in references do exist. Over the past 30+ years, technology has allowed ISI to make the presentation of citation searching increasingly accessible to users of our products. Citation searching and link tracking moved from being rather cumbersome in print, to being direct and efficient (albeit non-intuitive) online, to being somewhat more user-friendly in CD format. But it is the confluence of the hypertext link and development of Web browsers that has enabled us to present to users a new form of citation product -- the Web of Science -- that is intuitive and makes citation indexing conceptually accessible. A cited reference search begins with a known, important (or at least relevant) document used as the search term. The search allows one to identify subsequent articles that have cited that document. This feature adds the dimension of prospective searching to the usual retrospective searching that all bibliographic indexes provide. Citation indexing is a prime example of a concept before its time - important enough to be used in the meantime by those sufficiently motivated, but just waiting for the right technology to come along to expand its use. While it was possible to follow citation links in earlier citation index formats, this required a level of effort on the part of users that was often just too much to ask of the casual user. In the citation indexes as presented in the Web of Science, the relationship between citing and cited documents is evident to users, and a click of the mouse is all it takes to follow a citation link. Citation connections are established between the published papers being indexed from the 8,000+ journals ISI covers and the items their reference lists contain during the data capture process. It is the standardized capture of each of the references included with these documents that enables us to provide the citation searching feature in all the citation index formats, as well as both internal and external links in the Web of Science.
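    The heart of a cited-reference search is an inverted mapping from each cited document to the later papers whose reference lists mention it, which is what makes prospective searching possible. A minimal sketch of that idea (not ISI's actual data-capture pipeline; the document identifiers are invented):

      # Reference lists captured at indexing time: paper -> documents it cites.
      references = {
          "Smith 1995": ["Garfield 1955", "Price 1965"],
          "Jones 1997": ["Garfield 1955", "Smith 1995"],
          "Lee 1998":   ["Smith 1995"],
      }

      # Invert into a citation index: cited document -> citing papers.
      cited_by = {}
      for citing, cited_list in references.items():
          for cited in cited_list:
              cited_by.setdefault(cited, []).append(citing)

      # Prospective search: start from a known document and follow links forward in time.
      print(cited_by.get("Smith 1995", []))   # ['Jones 1997', 'Lee 1998']
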
  6. Priss, U.: Description logic and faceted knowledge representation (1999) 0.01
    0.008748596 = product of:
      0.03936868 = sum of:
        0.024403658 = weight(_text_:data in 2655) [ClassicSimilarity], result of:
          0.024403658 = score(doc=2655,freq=2.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.2096163 = fieldWeight in 2655, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=2655)
        0.014965023 = product of:
          0.029930046 = sum of:
            0.029930046 = weight(_text_:22 in 2655) [ClassicSimilarity], result of:
              0.029930046 = score(doc=2655,freq=2.0), product of:
                0.12893063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.036818076 = queryNorm
                0.23214069 = fieldWeight in 2655, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2655)
          0.5 = coord(1/2)
      0.22222222 = coord(2/9)
    
    Abstract
    The term "facet" was introduced into the field of library classification systems by Ranganathan in the 1930's [Ranganathan, 1962]. A facet is a viewpoint or aspect. In contrast to traditional classification systems, faceted systems are modular in that a domain is analyzed in terms of baseline facets which are then synthesized. In this paper, the term "facet" is used in a broader meaning. Facets can describe different aspects on the same level of abstraction or the same aspect on different levels of abstraction. The notion of facets is related to database views, multicontexts and conceptual scaling in formal concept analysis [Ganter and Wille, 1999], polymorphism in object-oriented design, aspect-oriented programming, views and contexts in description logic and semantic networks. This paper presents a definition of facets in terms of faceted knowledge representation that incorporates the traditional narrower notion of facets and potentially facilitates translation between different knowledge representation formalisms. A goal of this approach is a modular, machine-aided knowledge base design mechanism. A possible application is faceted thesaurus construction for information retrieval and data mining. Reasoning complexity depends on the size of the modules (facets). A more general analysis of complexity will be left for future research.
    Date
    22. 1.2016 17:30:31
  7. Daniel Jr., R.; Lagoze, C.: Extending the Warwick framework : from metadata containers to active digital objects (1997) 0.01
    0.0072483453 = product of:
      0.06523511 = sum of:
        0.06523511 = weight(_text_:data in 1264) [ClassicSimilarity], result of:
          0.06523511 = score(doc=1264,freq=42.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.56033987 = fieldWeight in 1264, product of:
              6.4807405 = tf(freq=42.0), with freq of:
                42.0 = termFreq=42.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1264)
      0.11111111 = coord(1/9)
    
    Abstract
    Defining metadata as "data about data" provokes more questions than it answers. What are the forms of the data and metadata? Can we be more specific about the manner in which the metadata is "about" the data? Are data and metadata distinguished only in the context of their relationship? Is the nature of the relationship between the datasets declarative or procedural? Can the metadata itself be described by other data? Over the past several years, we have been engaged in a number of efforts examining the role, format, composition, and architecture of metadata for networked resources. During this time, we have noticed the tendency to be led astray by comfortable, but somewhat inappropriate, models in the non-digital information environment. Rather than pursuing familiar models, there is the need for a new model that fully exploits the unique combination of computation and connectivity that characterizes the digital library. In this paper, we describe an extension of the Warwick Framework that we call Distributed Active Relationships (DARs). DARs provide a powerful model for representing data and metadata in digital library objects. They explicitly express the relationships between networked resources, and even allow those relationships to be dynamically downloadable and executable. The DAR model is based on the following principles, which our examination of the "data about data" definition has led us to regard as axiomatic: * There is no essential distinction between data and metadata. We can only make such a distinction in terms of a particular "about" relationship. As a result, what is metadata in the context of one "about" relationship may be data in another. * There is no single "about" relationship. There are many different and important relationships between data resources. * Resources can be related without regard for their location. The connectivity in networked information architectures makes it possible to have data in one repository describe data in another repository. * The computational power of the networked information environment makes it possible to consider active or dynamic relationships between data sets. This adds considerable power to the "data about data" definition. First, data about another data set may not physically exist, but may be automatically derived. Second, the "about" relationship may be an executable object -- in a sense interpretable metadata. As will be shown, this provides useful mechanisms for handling complex metadata problems such as rights management of digital objects. The remainder of this paper describes the development and consequences of the DAR model. Section 2 reviews the Warwick Framework, which is the basis for the model described in this paper. Section 3 examines the concept of the Warwick Framework Catalog, which provides a mechanism for expressing the relationships between the packages in a Warwick Framework container. With that background established, section 4 generalizes the Warwick Framework by removing the restriction that it only contains "metadata". This allows us to consider digital library objects that are aggregations of (possibly distributed) data sets, with the relationships between the data sets expressed using a Warwick Framework Catalog. Section 5 further extends the model by describing Distributed Active Relationships (DARs). DARs are the explicit relationships that have the potential to be executable, as alluded to earlier. Finally, section 6 describes two possible implementations of these concepts.
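    To make the relational view of "data about data" concrete, here is a small, purely illustrative sketch (not the Warwick Framework implementation) in which packages sit in a container and each "about" relationship is an explicit object; one relationship is executable, deriving its metadata on demand in the spirit of a Distributed Active Relationship.

      # Packages are just named chunks of data; whether one counts as "metadata"
      # depends solely on an explicit "about" relationship to another package.
      container = {
          "report.txt": "Full text of a technical report about digital library objects ...",
          "dc-record":  {"title": "A technical report", "creator": "A. Author"},
      }

      # Static relationship: dc-record is metadata about report.txt.
      relationships = [("dc-record", "describes", "report.txt")]

      # Active relationship: the related data does not exist until it is derived.
      def word_count(package_name):
          return {"words": len(container[package_name].split())}

      relationships.append((word_count, "derives-statistics-for", "report.txt"))

      for subject, rel, obj in relationships:
          value = subject(obj) if callable(subject) else container[subject]
          print(rel, obj, "->", value)
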
  8. Brüggemann-Klein, A.; Klein, R.; Landgraf, B.: BibRelEx : Exploring bibliographic databases by visualization of annotated content-based relations (1999) 0.01
    0.00711892 = product of:
      0.06407028 = sum of:
        0.06407028 = weight(_text_:bibliographic in 1157) [ClassicSimilarity], result of:
          0.06407028 = score(doc=1157,freq=6.0), product of:
            0.14333439 = queryWeight, product of:
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.036818076 = queryNorm
            0.44699866 = fieldWeight in 1157, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.893044 = idf(docFreq=2449, maxDocs=44218)
              0.046875 = fieldNorm(doc=1157)
      0.11111111 = coord(1/9)
    
    Abstract
    Traditional searching and browsing functions for bibliographic databases no longer enable users to deal efficiently with the rapidly growing number of scientific publications. The main goal of our project BibRelEx is to develop a new method based on the visualization of content-based relations between documents such as cites, succeeds, improves with respect to. BibRelEx will therefore use these relationships for effective exploration. In addition, BibRelEx will take advantage of the additional insights into the area that can result from the aggregation of expert knowledge, which complements the specialized knowledge represented in the documents themselves. We are preparing to test this approach using a bibliographic database in a specific area of computer science.
  9. Rusch-Feja, D.; Becker, H.J.: Global Info : the German digital libraries project (1999) 0.01
    0.00642973 = product of:
      0.057867568 = sum of:
        0.057867568 = weight(_text_:germany in 1242) [ClassicSimilarity], result of:
          0.057867568 = score(doc=1242,freq=2.0), product of:
            0.21956629 = queryWeight, product of:
              5.963546 = idf(docFreq=308, maxDocs=44218)
              0.036818076 = queryNorm
            0.26355398 = fieldWeight in 1242, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.963546 = idf(docFreq=308, maxDocs=44218)
              0.03125 = fieldNorm(doc=1242)
      0.11111111 = coord(1/9)
    
    Abstract
    The concept for the German Digital Libraries Program is embedded in the Information Infrastructure Program of the German Federal Government for the years 1996-2000 which has been explicated in the Program Paper entitled "Information as Raw Material for Innovation". The Program Paper was published in 1996 by the Federal Ministry for Education, Research, and Technology. The actual grants program "Global Info" was initiated by the Information and Communication Commission of the Joint Learned Societies to further technological advancement in enabling all researchers in Germany direct access to literature, research results, and other relevant information. This Commission was founded by four of the learned societies in 1995, and it has sponsored a series of workshops to increase awareness of leading edge technology and innovations in accessing electronic information sources. Now, nine of the leading research-level learned societies -- often those with umbrella responsibilities for other learned societies in their field -- are members of the Information and Communication Commission and represent the mathematicians, physicists, computer scientists, chemists, educational researchers, sociologists, psychologists, biologists and information technologists in the German Association of Engineers. (The German professional librarian societies are not members, as such, of this Commission, but are represented through delegates from libraries in the learned societies and in the future, hopefully, also by the German Association of Documentalists or through the cooperation between the documentalist and librarian professional societies.) The Federal Ministry earmarked 60 million German marks for projects within the framework of the German Digital Libraries Program in two phases over the next six years. The scope for the German Digital Libraries Program was announced in a press release in April 1997, and the first call for preliminary projects and expressions of interest in participation ended in July 1997. The Consortium members were suggested by the Information and Communication Commission of the Learned Societies (IuK Kommission), by key scientific research funding agencies in the German government, and by the publishers themselves. The first official meeting of the participants took place on December 1, 1997, at the Deutsche Bibliothek, located in the renowned center of German book trade, Frankfurt, thus documenting the active role and participation of libraries and publishers. In contrast to the Digital Libraries Project of the National Science Foundation in the United States, the German Digital Libraries project is based on furthering cooperation with universities, scientific publishing houses (including various international publishers), book dealers, and special subject information centers, as well as academic and research libraries. The goals of the German Digital Libraries Project are to achieve: 1) efficient access to world wide information; 2) directly from the scientist's desktop; 3) while providing the organization for and stimulating fundamental structural changes in the information and communication process of the scientific community.
  10. Baker, T.: Languages for Dublin Core (1998) 0.01
    0.0059715053 = product of:
      0.05374355 = sum of:
        0.05374355 = weight(_text_:readable in 1257) [ClassicSimilarity], result of:
          0.05374355 = score(doc=1257,freq=2.0), product of:
            0.2262076 = queryWeight, product of:
              6.1439276 = idf(docFreq=257, maxDocs=44218)
              0.036818076 = queryNorm
            0.23758507 = fieldWeight in 1257, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.1439276 = idf(docFreq=257, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1257)
      0.11111111 = coord(1/9)
    
    Abstract
    Over the past three years, the Dublin Core Metadata Initiative has achieved a broad international consensus on the semantics of a simple element set for describing electronic resources. Since the first workshop in March 1995, which was reported in the very first issue of D-Lib Magazine, Dublin Core has been the topic of perhaps a dozen articles here. Originally intended to be simple and intuitive enough for authors to tag Web pages without special training, Dublin Core is being adapted now for more specialized uses, from government information and legal deposit to museum informatics and electronic commerce. To meet such specialized requirements, Dublin Core can be customized with additional elements or qualifiers. However, these refinements can compromise interoperability across applications. There are tradeoffs between using specific terms that precisely meet local needs versus general terms that are understood more widely. We can better understand this inevitable tension between simplicity and complexity if we recognize that metadata is a form of human language. With Dublin Core, as with a natural language, people are inclined to stretch definitions, make general terms more specific, specific terms more general, misunderstand intended meanings, and coin new terms. One goal of this paper, therefore, will be to examine the experience of some related ways to seek semantic interoperability through simplicity: planned languages, interlingua constructs, and pidgins. The problem of semantic interoperability is compounded when we consider Dublin Core in translation. All of the workshops, documents, mailing lists, user guides, and working group outputs of the Dublin Core Initiative have been in English. But in many countries and for many applications, people need a metadata standard in their own language. In principle, the broad elements of Dublin Core can be defined equally well in Bulgarian or Hindi. Since Dublin Core is a controlled standard, however, any parallel definitions need to be kept in sync as the standard evolves. Another goal of the paper, then, will be to define the conceptual and organizational problem of maintaining a metadata standard in multiple languages. In addition to a name and definition, which are meant for human consumption, each Dublin Core element has a label, or indexing token, meant for harvesting by search engines. For practical reasons, these machine-readable tokens are English-looking strings such as Creator and Subject (just as HTML tags are called HEAD, BODY, or TITLE). These tokens, which are shared by Dublin Cores in every language, ensure that metadata fields created in any particular language are indexed together across repositories. As symbols of underlying universal semantics, these tokens form the basis of semantic interoperability among the multiple Dublin Cores. As long as we limit ourselves to sharing these indexing tokens among exact translations of a simple set of fifteen broad elements, the definitions of which fit easily onto two pages, the problem of Dublin Core in multiple languages is straightforward. But nothing having to do with human language is ever so simple. Just as speakers of various languages must learn the language of Dublin Core in their own tongues, we must find the right words to talk about a metadata language that is expressible in many discipline-specific jargons and natural languages and that inevitably will evolve and change over time.
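    A tiny sketch of the separation described above between shared machine-readable indexing tokens and per-language names: records created in different languages index under the same token. The element list is abridged and the German labels are illustrative, not the official Dublin Core translation.

      # One shared indexing token per element; the human-readable names vary by language.
      dublin_core = {
          "Creator": {"en": "Creator", "de": "Urheber"},
          "Subject": {"en": "Subject", "de": "Thema"},
          "Title":   {"en": "Title",   "de": "Titel"},
      }

      # Records created in different languages carry the same tokens,
      # so they can be harvested and searched together across repositories.
      record_en = {"Creator": "Thomas Baker", "Title": "Languages for Dublin Core"}
      record_de = {"Creator": "Thomas Baker", "Title": "Sprachen für Dublin Core"}

      for token in dublin_core:
          print(token, "->", record_en.get(token), "|", record_de.get(token))
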
  11. Plotkin, R.C.; Schwartz, M.S.: Data modeling for news clip archive : a prototype solution (1997) 0.00
    0.0046964865 = product of:
      0.042268377 = sum of:
        0.042268377 = weight(_text_:data in 1259) [ClassicSimilarity], result of:
          0.042268377 = score(doc=1259,freq=6.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.3630661 = fieldWeight in 1259, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=1259)
      0.11111111 = coord(1/9)
    
    Abstract
    Film, videotape and multimedia archive systems must address the issues of editing, authoring and searching at the media (i.e. tape) or sub-media (i.e. scene) level in addition to the traditional inventory management capabilities associated with the physical media. This paper describes a prototype of a database design for the storage, search and retrieval of multimedia and its related information. It also provides a process by which legacy data can be imported to this schema. The Continuous Media Index, or Comix, system is the name of the prototype. An implementation of such a digital library solution incorporates multimedia objects, hierarchical relationships and timecode in addition to traditional attribute data. Present video and multimedia archive systems are easily migrated to this architecture. Comix was implemented for a videotape archiving system. It was written for, and implemented using, IBM Digital Library version 1.0. A derivative of Comix is currently in development for customer specific applications. Principles of the Comix design as well as the importation methods are not specific to the underlying systems used.
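    A sketch of the kind of hierarchy and timecode attributes such a schema has to capture: a physical tape, the clips (scenes) on it, and their in/out timecodes, searchable at the sub-media level. The table and field names are invented for illustration and are not the Comix schema.

      import sqlite3

      conn = sqlite3.connect(":memory:")
      conn.executescript("""
          CREATE TABLE tape (
              tape_id   INTEGER PRIMARY KEY,
              barcode   TEXT,       -- traditional inventory attribute
              location  TEXT
          );
          CREATE TABLE clip (
              clip_id      INTEGER PRIMARY KEY,
              tape_id      INTEGER REFERENCES tape(tape_id),  -- hierarchy: tape -> clip
              description  TEXT,
              tc_in        TEXT,    -- timecode, e.g. '01:02:10:00'
              tc_out       TEXT
          );
      """)
      conn.execute("INSERT INTO tape VALUES (1, 'NEWS-0042', 'vault A')")
      conn.execute("INSERT INTO clip VALUES (1, 1, 'City hall press conference', '01:02:10:00', '01:04:55:12')")

      # Search at the sub-media (scene) level, then resolve the physical tape.
      for row in conn.execute("""
          SELECT tape.barcode, clip.description, clip.tc_in, clip.tc_out
          FROM clip JOIN tape ON clip.tape_id = tape.tape_id
          WHERE clip.description LIKE '%press%'"""):
          print(row)
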
  12. Dunning, A.: Do we still need search engines? (1999) 0.00
    0.0038798207 = product of:
      0.034918386 = sum of:
        0.034918386 = product of:
          0.06983677 = sum of:
            0.06983677 = weight(_text_:22 in 6021) [ClassicSimilarity], result of:
              0.06983677 = score(doc=6021,freq=2.0), product of:
                0.12893063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.036818076 = queryNorm
                0.5416616 = fieldWeight in 6021, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=6021)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Source
    Ariadne. 1999, no.22
  13. Borgman, C.L.: Multi-media, multi-cultural, and multi-lingual digital libraries : or how do we exchange data In 400 languages? (1997) 0.00
    0.0038744034 = product of:
      0.03486963 = sum of:
        0.03486963 = weight(_text_:data in 1263) [ClassicSimilarity], result of:
          0.03486963 = score(doc=1263,freq=12.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.29951423 = fieldWeight in 1263, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1263)
      0.11111111 = coord(1/9)
    
    Abstract
    The Internet would not be very useful if communication were limited to textual exchanges between speakers of English located in the United States. Rather, its value lies in its ability to enable people from multiple nations, speaking multiple languages, to employ multiple media in interacting with each other. While computer networks broke through national boundaries long ago, they remain much more effective for textual communication than for exchanges of sound, images, or mixed media -- and more effective for communication in English than for exchanges in most other languages, much less interactions involving multiple languages. Supporting searching and display in multiple languages is an increasingly important issue for all digital libraries accessible on the Internet. Even if a digital library contains materials in only one language, the content needs to be searchable and displayable on computers in countries speaking other languages. We need to exchange data between digital libraries, whether in a single language or in multiple languages. Data exchanges may be large batch updates or interactive hyperlinks. In any of these cases, character sets must be represented in a consistent manner if exchanges are to succeed. Issues of interoperability, portability, and data exchange related to multi-lingual character sets have received surprisingly little attention in the digital library community or in discussions of standards for information infrastructure, except in Europe. The landmark collection of papers on Standards Policy for Information Infrastructure, for example, contains no discussion of multi-lingual issues except for a passing reference to the Unicode standard. The goal of this short essay is to draw attention to the multi-lingual issues involved in designing digital libraries accessible on the Internet. Many of the multi-lingual design issues parallel those of multi-media digital libraries, a topic more familiar to most readers of D-Lib Magazine. This essay draws examples from multi-media DLs to illustrate some of the urgent design challenges in creating a globally distributed network serving people who speak many languages other than English. First we introduce some general issues of medium, culture, and language, then discuss the design challenges in the transition from local to global systems, lastly addressing technical matters. The technical issues involve the choice of character sets to represent languages, similar to the choices made in representing images or sound. However, the scale of the language problem is far greater. Standards for multi-media representation are being adopted fairly rapidly, in parallel with the availability of multi-media content in electronic form. By contrast, we have hundreds (and sometimes thousands) of years worth of textual materials in hundreds of languages, created long before data encoding standards existed. Textual content from past and present is being encoded in language and application-specific representations that are difficult to exchange without losing data -- if they exchange at all. We illustrate the multi-language DL challenge with examples drawn from the research library community, which typically handles collections of materials in 400 or so languages. These are problems faced not only by developers of digital libraries, but by those who develop and manage any communication technology that crosses national or linguistic boundaries.
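    A small, self-contained illustration (not taken from the essay) of why character-set choice matters when textual data is exchanged: the same string survives a round trip through a Unicode encoding but loses data when forced into a single-byte, application-specific encoding.

      text = "Bibliothèque nationale de France / 図書館"

      # Unicode round trip: nothing is lost.
      assert text.encode("utf-8").decode("utf-8") == text

      # Forcing the string into Latin-1 drops every character the encoding
      # cannot represent (each is replaced with '?').
      degraded = text.encode("latin-1", errors="replace").decode("latin-1")
      print(degraded)   # 'Bibliothèque nationale de France / ???'
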
  14. Dolin, R.; Agrawal, D.; El Abbadi, A.; Pearlman, J.: Using automated classification for summarizing and selecting heterogeneous information sources (1998) 0.00
    0.0038346653 = product of:
      0.034511987 = sum of:
        0.034511987 = weight(_text_:data in 316) [ClassicSimilarity], result of:
          0.034511987 = score(doc=316,freq=4.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.29644224 = fieldWeight in 316, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=316)
      0.11111111 = coord(1/9)
    
    Abstract
    Information retrieval over the Internet increasingly requires the filtering of thousands of heterogeneous information sources. Important sources of information include not only traditional databases with structured data and queries, but also increasing numbers of non-traditional, semi- or unstructured collections such as Web sites, FTP archives, etc. As the number and variability of sources increases, new ways of automatically summarizing, discovering, and selecting collections relevant to a user's query are needed. One such method involves the use of classification schemes, such as the Library of Congress Classification (LCC) [10], within which a collection may be represented based on its content, irrespective of the structure of the actual data or documents. For such a system to be useful in a large-scale distributed environment, it must be easy to use for both collection managers and users. As a result, it must be possible to classify documents automatically within a classification scheme. Furthermore, there must be a straightforward and intuitive interface with which the user may use the scheme to assist in information retrieval (IR).
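    As a toy illustration of assigning documents to classes of an existing scheme, the sketch below matches document words against hand-picked cue words for a few Library of Congress Classification top-level classes. It is a stand-in, not the classifier the paper describes, and the cue lists are invented and grossly simplified.

      # A few LCC top-level classes with hand-picked cue words,
      # standing in for a real trained classifier.
      lcc_cues = {
          "G (Geography)":       {"map", "gazetteer", "geographic", "satellite"},
          "Q (Science)":         {"physics", "algorithm", "chemistry", "experiment"},
          "Z (Library Science)": {"catalog", "classification", "library", "metadata"},
      }

      def classify(text):
          words = set(text.lower().split())
          scores = {cls: len(words & cues) for cls, cues in lcc_cues.items()}
          return max(scores, key=scores.get)

      doc = "Automated classification of metadata for library collections"
      print(classify(doc))   # 'Z (Library Science)'
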
  15. Spink, A.; Wilson, T.; Ellis, D.; Ford, N.: Modeling users' successive searches in digital environments : a National Science Foundation/British Library funded study (1998) 0.00
    0.0031634374 = product of:
      0.028470935 = sum of:
        0.028470935 = weight(_text_:data in 1255) [ClassicSimilarity], result of:
          0.028470935 = score(doc=1255,freq=8.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.24455236 = fieldWeight in 1255, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1255)
      0.11111111 = coord(1/9)
    
    Abstract
    As digital libraries become a major source of information for many people, we need to know more about how people seek and retrieve information in digital environments. Quite commonly, users with a problem-at-hand and associated question-in-mind repeatedly search a literature for answers, and seek information in stages over extended periods from a variety of digital information resources. The process of repeatedly searching over time in relation to a specific, but possibly an evolving information problem (including changes or shifts in a variety of variables), is called the successive search phenomenon. The study outlined in this paper is currently investigating this new and little explored line of inquiry for information retrieval, Web searching, and digital libraries. The purpose of the research project is to investigate the nature, manifestations, and behavior of successive searching by users in digital environments, and to derive criteria for use in the design of information retrieval interfaces and systems supporting successive searching behavior. This study includes two related projects. The first project is based in the School of Library and Information Sciences at the University of North Texas and is funded by a National Science Foundation POWRE Grant <http://www.nsf.gov/cgi-bin/show?award=9753277>. The second project is based at the Department of Information Studies at the University of Sheffield (UK) and is funded by a grant from the British Library <http://www.shef.ac.uk/~is/research/imrg/uncerty.html> Research and Innovation Center. The broad objectives of each project are to examine the nature and extent of successive search episodes in digital environments by real users over time. The specific aim of the current project is twofold: * To characterize progressive changes and shifts that occur in: user situational context; user information problem; uncertainty reduction; user cognitive styles; cognitive and affective states of the user, and consequently in their queries; and * To characterize related changes over time in the type and use of information resources and search strategies particularly related to given capabilities of IR systems, and IR search engines, and examine changes in users' relevance judgments and criteria, and characterize their differences. The study is an observational, longitudinal data collection in the U.S. and U.K. Three questionnaires are used to collect data: reference, client post search and searcher post search questionnaires. Each successive search episode with a search intermediary for textual materials on the DIALOG Information Service is audiotaped and search transaction logs are recorded. Quantitative analysis includes statistical analysis using Likert scale data from the questionnaires and log-linear analysis of sequential data. Qualitative methods include: content analysis, structuring taxonomies; and diagrams to describe shifts and transitions within and between each search episode. Outcomes of the study are the development of appropriate model(s) for IR interactions in successive search episodes and the derivation of a set of design criteria for interfaces and systems supporting successive searching.
  16. Fowler, R.H.; Wilson, B.A.; Fowler, W.A.L.: Information navigator : an information system using associative networks for display and retrieval (1992) 0.00
    0.0027115175 = product of:
      0.024403658 = sum of:
        0.024403658 = weight(_text_:data in 919) [ClassicSimilarity], result of:
          0.024403658 = score(doc=919,freq=2.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.2096163 = fieldWeight in 919, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=919)
      0.11111111 = coord(1/9)
    
    Abstract
    Document retrieval is a highly interactive process dealing with large amounts of information. Visual representations can provide both a means for managing the complexity of large information structures and an interface style well suited to interactive manipulation. The system we have designed utilizes visually displayed graphic structures and a direct manipulation interface style to supply an integrated environment for retrieval. A common visually displayed network structure is used for query, document content, and term relations. A query can be modified through direct manipulation of its visual form by incorporating terms from any other information structure the system displays. An associative thesaurus of terms and an inter-document network provide information about a document collection that can complement other retrieval aids. Visualization of these large data structures makes use of fisheye views and overview diagrams to help overcome some of the inherent difficulties of orientation and navigation in large information structures.
  17. Landauer, T.K.; Foltz, P.W.; Laham, D.: ¬An introduction to Latent Semantic Analysis (1998) 0.00
    0.0027115175 = product of:
      0.024403658 = sum of:
        0.024403658 = weight(_text_:data in 1162) [ClassicSimilarity], result of:
          0.024403658 = score(doc=1162,freq=2.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.2096163 = fieldWeight in 1162, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.046875 = fieldNorm(doc=1162)
      0.11111111 = coord(1/9)
    
    Abstract
    Latent Semantic Analysis (LSA) is a theory and method for extracting and representing the contextual-usage meaning of words by statistical computations applied to a large corpus of text (Landauer and Dumais, 1997). The underlying idea is that the aggregate of all the word contexts in which a given word does and does not appear provides a set of mutual constraints that largely determines the similarity of meaning of words and sets of words to each other. The adequacy of LSA's reflection of human knowledge has been established in a variety of ways. For example, its scores overlap those of humans on standard vocabulary and subject matter tests; it mimics human word sorting and category judgments; it simulates word-word and passage-word lexical priming data; and as reported in 3 following articles in this issue, it accurately estimates passage coherence, learnability of passages by individual students, and the quality and quantity of knowledge contained in an essay.
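    A compact sketch of the statistical machinery commonly used for LSA: build a term-by-document matrix and take a truncated singular value decomposition, so that words and passages can be compared in the reduced space. The tiny corpus and the choice of two dimensions are purely illustrative.

      import numpy as np

      docs = ["human computer interaction",
              "user interface design",
              "graph theory algorithms",
              "graph algorithms trees"]

      terms = sorted({w for d in docs for w in d.split()})
      # term-by-document frequency matrix
      X = np.array([[d.split().count(t) for d in docs] for t in terms], dtype=float)

      # truncated SVD keeps only the k largest singular values (here k = 2)
      U, s, Vt = np.linalg.svd(X, full_matrices=False)
      k = 2
      doc_vectors = (np.diag(s[:k]) @ Vt[:k]).T   # documents in the latent space

      def cosine(a, b):
          return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

      # documents 3 and 4 share the "graph/algorithms" topic, so they end up close
      print(cosine(doc_vectors[2], doc_vectors[3]))
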
  18. Dolin, R.; Agrawal, D.; El Abbadi, A.; Pearlman, J.: Using automated classification for summarizing and selecting heterogeneous information sources (1998) 0.00
    0.0023482433 = product of:
      0.021134188 = sum of:
        0.021134188 = weight(_text_:data in 1253) [ClassicSimilarity], result of:
          0.021134188 = score(doc=1253,freq=6.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.18153305 = fieldWeight in 1253, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0234375 = fieldNorm(doc=1253)
      0.11111111 = coord(1/9)
    
    Abstract
    Information retrieval over the Internet increasingly requires the filtering of thousands of heterogeneous information sources. Important sources of information include not only traditional databases with structured data and queries, but also increasing numbers of non-traditional, semi- or unstructured collections such as Web sites, FTP archives, etc. As the number and variability of sources increases, new ways of automatically summarizing, discovering, and selecting collections relevant to a user's query are needed. One such method involves the use of classification schemes, such as the Library of Congress Classification (LCC), within which a collection may be represented based on its content, irrespective of the structure of the actual data or documents. For such a system to be useful in a large-scale distributed environment, it must be easy to use for both collection managers and users. As a result, it must be possible to classify documents automatically within a classification scheme. Furthermore, there must be a straightforward and intuitive interface with which the user may use the scheme to assist in information retrieval (IR). Our work with the Alexandria Digital Library (ADL) Project focuses on geo-referenced information, whether text, maps, aerial photographs, or satellite images. As a result, we have emphasized techniques which work with both text and non-text, such as combined textual and graphical queries, multi-dimensional indexing, and IR methods which are not solely dependent on words or phrases. Part of this work involves locating relevant online sources of information. In particular, we have designed and are currently testing aspects of an architecture, Pharos, which we believe will scale up to 1,000,000 heterogeneous sources. Pharos accommodates heterogeneity in content and format, both among multiple sources as well as within a single source. That is, we consider sources to include Web sites, FTP archives, newsgroups, and full digital libraries; all of these systems can include a wide variety of content and multimedia data formats. Pharos is based on the use of hierarchical classification schemes. These include not only well-known 'subject' (or 'concept') based schemes such as the Dewey Decimal System and the LCC, but also, for example, geographic classifications, which might be constructed as layers of smaller and smaller hierarchical longitude/latitude boxes. Pharos is designed to work with sophisticated queries which utilize subjects, geographical locations, temporal specifications, and other types of information domains. The Pharos architecture requires that hierarchically structured collection metadata be extracted so that it can be partitioned in such a way as to greatly enhance scalability. Automated classification is important to Pharos because it allows information sources to extract the requisite collection metadata automatically that must be distributed.
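    The "layers of smaller and smaller hierarchical longitude/latitude boxes" mentioned above can be pictured as a quadtree-style path: at each level the current box is split in four and the quadrant containing the point becomes the next, narrower class. A small illustrative sketch (not Pharos code):

      def lat_long_path(lat, lon, levels=4):
          """Return the chain of nested boxes (as quadrant labels) containing a point."""
          lat_lo, lat_hi = -90.0, 90.0
          lon_lo, lon_hi = -180.0, 180.0
          path = []
          for _ in range(levels):
              lat_mid = (lat_lo + lat_hi) / 2
              lon_mid = (lon_lo + lon_hi) / 2
              quadrant = ("N" if lat >= lat_mid else "S") + ("E" if lon >= lon_mid else "W")
              path.append(quadrant)
              lat_lo, lat_hi = (lat_mid, lat_hi) if lat >= lat_mid else (lat_lo, lat_mid)
              lon_lo, lon_hi = (lon_mid, lon_hi) if lon >= lon_mid else (lon_lo, lon_mid)
          return path

      # Santa Barbara, California (approx. 34.4 N, 119.7 W)
      print(lat_long_path(34.4, -119.7))   # ['NW', 'SW', 'NE', 'NW']
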
  19. Brin, S.; Page, L.: ¬The anatomy of a large-scale hypertextual Web search engine (1998) 0.00
    0.0022595983 = product of:
      0.020336384 = sum of:
        0.020336384 = weight(_text_:data in 947) [ClassicSimilarity], result of:
          0.020336384 = score(doc=947,freq=2.0), product of:
            0.11642061 = queryWeight, product of:
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.036818076 = queryNorm
            0.17468026 = fieldWeight in 947, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1620505 = idf(docFreq=5088, maxDocs=44218)
              0.0390625 = fieldNorm(doc=947)
      0.11111111 = coord(1/9)
    
    Abstract
    In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want
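    The "structure present in hypertext" that the prototype exploits is the link graph; the full paper describes PageRank, which rates a page by the damped probability that a random surfer following links lands on it. A minimal power-iteration sketch on a made-up four-page graph (d = 0.85 is the damping factor the paper suggests):

      # adjacency: page -> pages it links to (toy graph, not Web data)
      links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
      pages = sorted(links)
      d = 0.85                                  # damping factor
      rank = {p: 1.0 / len(pages) for p in pages}

      for _ in range(50):                       # power iteration until (roughly) convergent
          new = {p: (1 - d) / len(pages) for p in pages}
          for p, outs in links.items():
              share = rank[p] / len(outs)       # a page splits its rank across its outlinks
              for q in outs:
                  new[q] += d * share
          rank = new

      for p in sorted(rank, key=rank.get, reverse=True):
          print(p, round(rank[p], 3))           # C and A accumulate most of the rank
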
  20. Priss, U.: Faceted knowledge representation (1999) 0.00
    0.0019399103 = product of:
      0.017459193 = sum of:
        0.017459193 = product of:
          0.034918386 = sum of:
            0.034918386 = weight(_text_:22 in 2654) [ClassicSimilarity], result of:
              0.034918386 = score(doc=2654,freq=2.0), product of:
                0.12893063 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.036818076 = queryNorm
                0.2708308 = fieldWeight in 2654, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2654)
          0.5 = coord(1/2)
      0.11111111 = coord(1/9)
    
    Date
    22. 1.2016 17:30:31