Search (42 results, page 1 of 3)

Lusti, M.: Data Warehousing and Data Mining : Eine Einführung in entscheidungsunterstützende Systeme (1999) 0.07

0.06743487 = product of:
  0.13486974 = sum of:
    0.10949186 = weight(_text_:data in 4261) [ClassicSimilarity], result of:
      0.10949186 = score(doc=4261,freq=14.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.7394569 = fieldWeight in 4261, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0625 = fieldNorm(doc=4261)
    0.025377871 = product of:
      0.050755743 = sum of:
        0.050755743 = weight(_text_:22 in 4261) [ClassicSimilarity], result of:
          0.050755743 = score(doc=4261,freq=2.0), product of:
            0.16398162 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046827413 = queryNorm
            0.30952093 = fieldWeight in 4261, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=4261)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
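
The explain tree above can be reproduced by hand. The following is a minimal Python sketch, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which agree with the values printed in the tree. The function names are hypothetical, and queryNorm, fieldNorm and the coord factors are copied from the output rather than derived.

```python
import math

MAX_DOCS = 44218          # maxDocs, as printed in the explain output
QUERY_NORM = 0.046827413  # queryNorm, copied from the explain output


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_weight(freq: float, doc_freq: int, field_norm: float) -> float:
    """weight = queryWeight * fieldWeight for a single term in one document."""
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM                   # idf * queryNorm
    field_weight = math.sqrt(freq) * term_idf * field_norm  # tf * idf * fieldNorm
    return query_weight * field_weight


# Document 4261: the term "data" occurs 14 times, the term "22" twice.
w_data = term_weight(freq=14.0, doc_freq=5088, field_norm=0.0625)
w_22 = term_weight(freq=2.0, doc_freq=3622, field_norm=0.0625) * 0.5  # coord(1/2)

# The outer coord(2/4) scales the sum of the two matching clauses.
score = (w_data + w_22) * 0.5

print(f"data: {w_data:.8f}  22: {w_22:.8f}  total: {score:.8f}")
```

The total should land close to the reported 0.06743487; Lucene computes in 32-bit floats, so the last digits may differ slightly.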

Date: 17. 7.2002 19:22:06
RSWK: Data-warehouse-Konzept / Lehrbuch
Data mining / Lehrbuch
Subject: Data-warehouse-Konzept / Lehrbuch
Data mining / Lehrbuch
Theme: Data Mining

Miller, E.; Schloss, B.; Lassila, O.; Swick, R.R.: Resource Description Framework (RDF) : model and syntax (1997) 0.04
```
0.037649848 = product of:
  0.075299695 = sum of:
    0.05431654 = weight(_text_:data in 5903) [ClassicSimilarity], result of:
      0.05431654 = score(doc=5903,freq=18.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.36682853 = fieldWeight in 5903, product of:
          4.2426405 = tf(freq=18.0), with freq of:
            18.0 = termFreq=18.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.02734375 = fieldNorm(doc=5903)
    0.020983158 = product of:
      0.041966315 = sum of:
        0.041966315 = weight(_text_:processing in 5903) [ClassicSimilarity], result of:
          0.041966315 = score(doc=5903,freq=4.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.22138305 = fieldWeight in 5903, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.02734375 = fieldNorm(doc=5903)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

RDF - the Resource Description Framework - is a foundation for processing metadata; it provides interoperability between applications that exchange machine-understandable information on the Web. RDF emphasizes facilities to enable automated processing of Web resources. RDF metadata can be used in a variety of application areas; for example: in resource discovery to provide better search engine capabilities; in cataloging for describing the content and content relationships available at a particular Web site, page, or digital library; by intelligent software agents to facilitate knowledge sharing and exchange; in content rating; in describing collections of pages that represent a single logical "document"; for describing intellectual property rights of Web pages, and in many others. RDF with digital signatures will be key to building the "Web of Trust" for electronic commerce, collaboration, and other applications. Metadata is "data about data" or specifically in the context of RDF "data describing web resources." The distinction between "data" and "metadata" is not an absolute one; it is a distinction created primarily by a particular application. Many times the same resource will be interpreted in both ways simultaneously. RDF encourages this view by using XML as the encoding syntax for the metadata. The resources being described by RDF are, in general, anything that can be named via a URI. The broad goal of RDF is to define a mechanism for describing resources that makes no assumptions about a particular application domain, nor defines the semantics of any application domain. The definition of the mechanism should be domain neutral, yet the mechanism should be suitable for describing information about any domain. This document introduces a model for representing RDF metadata and one syntax for expressing and transporting this metadata in a manner that maximizes the interoperability of independently developed web servers and clients. 
The syntax described in this document is best considered as a "serialization syntax" for the underlying RDF representation model. The serialization syntax is XML, XML being the W3C's work-in-progress to define a richer Web syntax for a variety of applications. RDF and XML are complementary; there will be alternate ways to represent the same RDF data model, some more suitable for direct human authoring. Future work may lead to including such alternatives in this document.

Content

RDF Data Model: At the core of RDF is a model for representing named properties and their values. These properties serve both to represent attributes of resources (and in this sense correspond to usual attribute-value-pairs) and to represent relationships between resources. The RDF data model is a syntax-independent way of representing RDF statements. RDF statements that are syntactically very different could mean the same thing. This concept of equivalence in meaning is very important when performing queries, aggregation and a number of other tasks at which RDF is aimed. The equivalence is defined in a clean machine understandable way. Two pieces of RDF are equivalent if and only if their corresponding data model representations are the same.

Table of contents: 1. Introduction 2. RDF Data Model 3. RDF Grammar 4. Signed RDF 5. Examples 6. Appendix A: Brief Explanation of XML Namespaces

Priss, U.: Description logic and faceted knowledge representation (1999) 0.03
```
0.025035713 = product of:
  0.050071426 = sum of:
    0.031038022 = weight(_text_:data in 2655) [ClassicSimilarity], result of:
      0.031038022 = score(doc=2655,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.2096163 = fieldWeight in 2655, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=2655)
    0.019033402 = product of:
      0.038066804 = sum of:
        0.038066804 = weight(_text_:22 in 2655) [ClassicSimilarity], result of:
          0.038066804 = score(doc=2655,freq=2.0), product of:
            0.16398162 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046827413 = queryNorm
            0.23214069 = fieldWeight in 2655, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2655)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

The term "facet" was introduced into the field of library classification systems by Ranganathan in the 1930's [Ranganathan, 1962]. A facet is a viewpoint or aspect. In contrast to traditional classification systems, faceted systems are modular in that a domain is analyzed in terms of baseline facets which are then synthesized. In this paper, the term "facet" is used in a broader meaning. Facets can describe different aspects on the same level of abstraction or the same aspect on different levels of abstraction. The notion of facets is related to database views, multicontexts and conceptual scaling in formal concept analysis [Ganter and Wille, 1999], polymorphism in object-oriented design, aspect-oriented programming, views and contexts in description logic and semantic networks. This paper presents a definition of facets in terms of faceted knowledge representation that incorporates the traditional narrower notion of facets and potentially facilitates translation between different knowledge representation formalisms. A goal of this approach is a modular, machine-aided knowledge base design mechanism. A possible application is faceted thesaurus construction for information retrieval and data mining. Reasoning complexity depends on the size of the modules (facets). A more general analysis of complexity will be left for future research.

Date

22. 1.2016 17:30:31
Hill, L.L.; Frew, J.; Zheng, Q.: Geographic names : the implementation of a gazetteer in a georeferenced digital library (1999) 0.02
```
0.023109939 = product of:
  0.046219878 = sum of:
    0.029262928 = weight(_text_:data in 1240) [ClassicSimilarity], result of:
      0.029262928 = score(doc=1240,freq=4.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.19762816 = fieldWeight in 1240, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.03125 = fieldNorm(doc=1240)
    0.016956951 = product of:
      0.033913903 = sum of:
        0.033913903 = weight(_text_:processing in 1240) [ClassicSimilarity], result of:
          0.033913903 = score(doc=1240,freq=2.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.17890452 = fieldWeight in 1240, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.03125 = fieldNorm(doc=1240)
      0.5 = coord(1/2)
  0.5 = coord(2/4)
```
Abstract

The Alexandria Digital Library (ADL) Project has developed a content standard for gazetteer objects and a hierarchical type scheme for geographic features. Both of these developments are based on ADL experience with an earlier gazetteer component for the Library, based on two gazetteers maintained by the U.S. federal government. We define the minimum components of a gazetteer entry as (1) a geographic name, (2) a geographic location represented by coordinates, and (3) a type designation. With these attributes, a gazetteer can function as a tool for indirect spatial location identification through names and types. The ADL Gazetteer Content Standard supports contribution and sharing of gazetteer entries with rich descriptions beyond the minimum requirements. This paper describes the content standard, the feature type thesaurus, and the implementation and research issues. A gazetteer is a list of geographic names, together with their geographic locations and other descriptive information. A geographic name is a proper name for a geographic place and feature, such as Santa Barbara County, Mount Washington, St. Francis Hospital, and Southern California. There are many types of printed gazetteers. For example, the New York Times Atlas has a gazetteer section that can be used to look up a geographic name and find the page(s) and grid reference(s) where the corresponding feature is shown. Some gazetteers provide information about places and features; for example, a history of the locale, population data, physical data such as elevation, or the pronunciation of the name. Some lists of geographic names are available as hierarchical term sets (thesauri) designed for information retrieval; these are used to describe bibliographic or museum materials. Examples include the authority files of the U.S. Library of Congress and the GeoRef Thesaurus produced by the American Geological Institute. The Getty Museum has recently made their Thesaurus of Geographic Names available online.
This is a major project to develop a controlled vocabulary of current and historical names to describe (i.e., catalog) art and architecture literature. U.S. federal government mapping agencies maintain gazetteers containing the official names of places and/or the names that appear on map series. Examples include the U.S. Geological Survey's Geographic Names Information System (GNIS) and the National Imagery and Mapping Agency's Geographic Names Processing System (GNPS). Both of these are maintained in cooperation with the U.S. Board of Geographic Names (BGN). Many other examples could be cited -- for local areas, for other countries, and for special purposes. There is remarkable diversity in approaches to the description of geographic places and no standardization beyond authoritative sources for the geographic names themselves.

¬Das große Data Becker Lexikon (1995) 0.02

0.022399765 = product of:
  0.08959906 = sum of:
    0.08959906 = weight(_text_:data in 5368) [ClassicSimilarity], result of:
      0.08959906 = score(doc=5368,freq=6.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.60511017 = fieldWeight in 5368, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.078125 = fieldNorm(doc=5368)
  0.25 = coord(1/4)

Imprint: Düsseldorf : Data Becker
Object: Große Data Becker Lexikon

Woods, E.W.; IFLA Section on Classification and Indexing and Information Technology; Joint Working Group on a Classification Format: Requirements for a format of classification data : Final report, July 1996 (1996) 0.02

0.021947198 = product of:
  0.08778879 = sum of:
    0.08778879 = weight(_text_:data in 3008) [ClassicSimilarity], result of:
      0.08778879 = score(doc=3008,freq=4.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.5928845 = fieldWeight in 3008, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.09375 = fieldNorm(doc=3008)
  0.25 = coord(1/4)

Object: USMARC for classification data

Daniel Jr., R.; Lagoze, C.: Extending the Warwick framework : from metadata containers to active digital objects (1997) 0.02
```
0.020742472 = product of:
  0.08296989 = sum of:
    0.08296989 = weight(_text_:data in 1264) [ClassicSimilarity], result of:
      0.08296989 = score(doc=1264,freq=42.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.56033987 = fieldWeight in 1264, product of:
          6.4807405 = tf(freq=42.0), with freq of:
            42.0 = termFreq=42.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1264)
  0.25 = coord(1/4)
```
Abstract

Defining metadata as "data about data" provokes more questions than it answers. What are the forms of the data and metadata? Can we be more specific about the manner in which the metadata is "about" the data? Are data and metadata distinguished only in the context of their relationship? Is the nature of the relationship between the datasets declarative or procedural? Can the metadata itself be described by other data? Over the past several years, we have been engaged in a number of efforts examining the role, format, composition, and architecture of metadata for networked resources. During this time, we have noticed the tendency to be led astray by comfortable, but somewhat inappropriate, models in the non-digital information environment. Rather than pursuing familiar models, there is the need for a new model that fully exploits the unique combination of computation and connectivity that characterizes the digital library. In this paper, we describe an extension of the Warwick Framework that we call Distributed Active Relationships (DARs). DARs provide a powerful model for representing data and metadata in digital library objects. They explicitly express the relationships between networked resources, and even allow those relationships to be dynamically downloadable and executable. The DAR model is based on the following principles, which our examination of the "data about data" definition has led us to regard as axiomatic:

* There is no essential distinction between data and metadata. We can only make such a distinction in terms of a particular "about" relationship. As a result, what is metadata in the context of one "about" relationship may be data in another.
* There is no single "about" relationship. There are many different and important relationships between data resources.
* Resources can be related without regard for their location. The connectivity in networked information architectures makes it possible to have data in one repository describe data in another repository.
* The computational power of the networked information environment makes it possible to consider active or dynamic relationships between data sets. This adds considerable power to the "data about data" definition. First, data about another data set may not physically exist, but may be automatically derived. Second, the "about" relationship may be an executable object -- in a sense interpretable metadata. As will be shown, this provides useful mechanisms for handling complex metadata problems such as rights management of digital objects.

The remainder of this paper describes the development and consequences of the DAR model. Section 2 reviews the Warwick Framework, which is the basis for the model described in this paper. Section 3 examines the concept of the Warwick Framework Catalog, which provides a mechanism for expressing the relationships between the packages in a Warwick Framework container. With that background established, section 4 generalizes the Warwick Framework by removing the restriction that it only contains "metadata". This allows us to consider digital library objects that are aggregations of (possibly distributed) data sets, with the relationships between the data sets expressed using a Warwick Framework Catalog. Section 5 further extends the model by describing Distributed Active Relationships (DARs). DARs are the explicit relationships that have the potential to be executable, as alluded to earlier. Finally, section 6 describes two possible implementations of these concepts.
Plotkin, R.C.; Schwartz, M.S.: Data modeling for news clip archive : a prototype solution (1997) 0.01
```
0.013439858 = product of:
  0.053759433 = sum of:
    0.053759433 = weight(_text_:data in 1259) [ClassicSimilarity], result of:
      0.053759433 = score(doc=1259,freq=6.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.3630661 = fieldWeight in 1259, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=1259)
  0.25 = coord(1/4)
```
Abstract

Film, videotape and multimedia archive systems must address the issues of editing, authoring and searching at the media (i.e. tape) or sub media (i.e. scene) level in addition to the traditional inventory management capabilities associated with the physical media. This paper describes a prototype of a database design for the storage, search and retrieval of multimedia and its related information. It also provides a process by which legacy data can be imported to this schema. The prototype is called the Continuous Media Index, or Comix, system. An implementation of such a digital library solution incorporates multimedia objects, hierarchical relationships and timecode in addition to traditional attribute data. Present video and multimedia archive systems are easily migrated to this architecture. Comix was implemented for a videotape archiving system. It was written for, and implemented using, IBM Digital Library version 1.0. A derivative of Comix is currently in development for customer-specific applications. Principles of the Comix design as well as the importation methods are not specific to the underlying systems used.
Rindflesch, T.C.; Aronson, A.R.: Semantic processing in information retrieval (1993) 0.01
```
0.012849508 = product of:
  0.05139803 = sum of:
    0.05139803 = product of:
      0.10279606 = sum of:
        0.10279606 = weight(_text_:processing in 4121) [ClassicSimilarity], result of:
          0.10279606 = score(doc=4121,freq=6.0), product of:
            0.18956426 = queryWeight, product of:
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.046827413 = queryNorm
            0.54227555 = fieldWeight in 4121, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.048147 = idf(docFreq=2097, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4121)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```
Abstract

Intuition suggests that one way to enhance the information retrieval process would be the use of phrases to characterize the contents of text. A number of researchers, however, have noted that phrases alone do not improve retrieval effectiveness. In this paper we briefly review the use of phrases in information retrieval and then suggest extensions to this paradigm using semantic information. We claim that semantic processing, which can be viewed as expressing relations between the concepts represented by phrases, will in fact enhance retrieval effectiveness. The availability of the UMLS® domain model, which we exploit extensively, significantly contributes to the feasibility of this processing.

Information als Rohstoff für Innovation : Programm der Bundesregierung 1996-2000 (1996) 0.01

0.012688936 = product of:
  0.050755743 = sum of:
    0.050755743 = product of:
      0.101511486 = sum of:
        0.101511486 = weight(_text_:22 in 5449) [ClassicSimilarity], result of:
          0.101511486 = score(doc=5449,freq=2.0), product of:
            0.16398162 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046827413 = queryNorm
            0.61904186 = fieldWeight in 5449, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=5449)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 22. 2.1997 19:26:34

Ask me[@sk.me]: your global information guide : der Wegweiser durch die Informationswelten (1996) 0.01

0.012688936 = product of:
  0.050755743 = sum of:
    0.050755743 = product of:
      0.101511486 = sum of:
        0.101511486 = weight(_text_:22 in 5837) [ClassicSimilarity], result of:
          0.101511486 = score(doc=5837,freq=2.0), product of:
            0.16398162 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046827413 = queryNorm
            0.61904186 = fieldWeight in 5837, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=5837)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 30.11.1996 13:22:37

Kosmos Weltatlas 2000 : Der Kompass für das 21. Jahrhundert. Inklusive Welt-Routenplaner (1999) 0.01

0.012688936 = product of:
  0.050755743 = sum of:
    0.050755743 = product of:
      0.101511486 = sum of:
        0.101511486 = weight(_text_:22 in 4085) [ClassicSimilarity], result of:
          0.101511486 = score(doc=4085,freq=2.0), product of:
            0.16398162 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046827413 = queryNorm
            0.61904186 = fieldWeight in 4085, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=4085)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 7.11.1999 18:22:39

Vögel unserer Heimat (1999) 0.01

0.011102819 = product of:
  0.044411276 = sum of:
    0.044411276 = product of:
      0.08882255 = sum of:
        0.08882255 = weight(_text_:22 in 4084) [ClassicSimilarity], result of:
          0.08882255 = score(doc=4084,freq=2.0), product of:
            0.16398162 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046827413 = queryNorm
            0.5416616 = fieldWeight in 4084, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=4084)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Date: 7.11.1999 18:22:54

Dunning, A.: Do we still need search engines? (1999) 0.01

0.011102819 = product of:
  0.044411276 = sum of:
    0.044411276 = product of:
      0.08882255 = sum of:
        0.08882255 = weight(_text_:22 in 6021) [ClassicSimilarity], result of:
          0.08882255 = score(doc=6021,freq=2.0), product of:
            0.16398162 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046827413 = queryNorm
            0.5416616 = fieldWeight in 6021, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6021)
      0.5 = coord(1/2)
  0.25 = coord(1/4)

Source: Ariadne. 1999, no.22

Borgman, C.L.: Multi-media, multi-cultural, and multi-lingual digital libraries : or how do we exchange data in 400 languages? (1997) 0.01
```
0.011087317 = product of:
  0.044349268 = sum of:
    0.044349268 = weight(_text_:data in 1263) [ClassicSimilarity], result of:
      0.044349268 = score(doc=1263,freq=12.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.29951423 = fieldWeight in 1263, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1263)
  0.25 = coord(1/4)
```
Abstract

The Internet would not be very useful if communication were limited to textual exchanges between speakers of English located in the United States. Rather, its value lies in its ability to enable people from multiple nations, speaking multiple languages, to employ multiple media in interacting with each other. While computer networks broke through national boundaries long ago, they remain much more effective for textual communication than for exchanges of sound, images, or mixed media -- and more effective for communication in English than for exchanges in most other languages, much less interactions involving multiple languages. Supporting searching and display in multiple languages is an increasingly important issue for all digital libraries accessible on the Internet. Even if a digital library contains materials in only one language, the content needs to be searchable and displayable on computers in countries speaking other languages. We need to exchange data between digital libraries, whether in a single language or in multiple languages. Data exchanges may be large batch updates or interactive hyperlinks. In any of these cases, character sets must be represented in a consistent manner if exchanges are to succeed. Issues of interoperability, portability, and data exchange related to multi-lingual character sets have received surprisingly little attention in the digital library community or in discussions of standards for information infrastructure, except in Europe. The landmark collection of papers on Standards Policy for Information Infrastructure, for example, contains no discussion of multi-lingual issues except for a passing reference to the Unicode standard. The goal of this short essay is to draw attention to the multi-lingual issues involved in designing digital libraries accessible on the Internet. Many of the multi-lingual design issues parallel those of multi-media digital libraries, a topic more familiar to most readers of D-Lib Magazine. 
This essay draws examples from multi-media DLs to illustrate some of the urgent design challenges in creating a globally distributed network serving people who speak many languages other than English. First we introduce some general issues of medium, culture, and language, then discuss the design challenges in the transition from local to global systems, lastly addressing technical matters. The technical issues involve the choice of character sets to represent languages, similar to the choices made in representing images or sound. However, the scale of the language problem is far greater. Standards for multi-media representation are being adopted fairly rapidly, in parallel with the availability of multi-media content in electronic form. By contrast, we have hundreds (and sometimes thousands) of years' worth of textual materials in hundreds of languages, created long before data encoding standards existed. Textual content from past and present is being encoded in language- and application-specific representations that are difficult to exchange without losing data -- if they exchange at all. We illustrate the multi-language DL challenge with examples drawn from the research library community, which typically handles collections of materials in 400 or so languages. These are problems faced not only by developers of digital libraries, but by those who develop and manage any communication technology that crosses national or linguistic boundaries.
Dolin, R.; Agrawal, D.; El Abbadi, A.; Pearlman, J.: Using automated classification for summarizing and selecting heterogeneous information sources (1998) 0.01
```
0.010973599 = product of:
  0.043894395 = sum of:
    0.043894395 = weight(_text_:data in 316) [ClassicSimilarity], result of:
      0.043894395 = score(doc=316,freq=4.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.29644224 = fieldWeight in 316, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046875 = fieldNorm(doc=316)
  0.25 = coord(1/4)
```
Abstract

Information retrieval over the Internet increasingly requires the filtering of thousands of heterogeneous information sources. Important sources of information include not only traditional databases with structured data and queries, but also increasing numbers of non-traditional, semi- or unstructured collections such as Web sites, FTP archives, etc. As the number and variability of sources increases, new ways of automatically summarizing, discovering, and selecting collections relevant to a user's query are needed. One such method involves the use of classification schemes, such as the Library of Congress Classification (LCC) [10], within which a collection may be represented based on its content, irrespective of the structure of the actual data or documents. For such a system to be useful in a large-scale distributed environment, it must be easy to use for both collection managers and users. As a result, it must be possible to classify documents automatically within a classification scheme. Furthermore, there must be a straightforward and intuitive interface with which the user may use the scheme to assist in information retrieval (IR).

Strobel, S.: The complete Linux kit : fully configured LINUX system kernel (1997) 0.01

```
0.009516701 = product of:
  0.038066804 = sum of:
    0.038066804 = product of:
      0.07613361 = sum of:
        0.07613361 = weight(_text_:22 in 8959) [ClassicSimilarity], result of:
          0.07613361 = score(doc=8959,freq=2.0), product of:
            0.16398162 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046827413 = queryNorm
            0.46428138 = fieldWeight in 8959, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=8959)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```
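
In this explanation the term weight passes through two nested coordination factors before it becomes the final score. A short sketch of that arithmetic, assuming coord is simply matched clauses divided by total clauses at each nesting level (the helper name `apply_coords` is hypothetical):

```python
def apply_coords(term_weight: float, coords: list[tuple[int, int]]) -> float:
    # Multiply the weight by matched/total for each nesting level, innermost first.
    score = term_weight
    for matched, total in coords:
        score *= matched / total
    return score

# Term weight for "22" in doc 8959, then coord(1/2) and coord(1/4) from the listing.
score = apply_coords(0.07613361, [(1, 2), (1, 4)])
print(f"{score:.9f}")
```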

Date: 16. 7.2002 20:22:55

Birmingham, J.: Internet search engines (1996) 0.01

```
0.009516701 = product of:
  0.038066804 = sum of:
    0.038066804 = product of:
      0.07613361 = sum of:
        0.07613361 = weight(_text_:22 in 5664) [ClassicSimilarity], result of:
          0.07613361 = score(doc=5664,freq=2.0), product of:
            0.16398162 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046827413 = queryNorm
            0.46428138 = fieldWeight in 5664, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=5664)
      0.5 = coord(1/2)
  0.25 = coord(1/4)
```

Date: 10.11.1996 16:36:22

Kumar, V.; Furuta, R.; Allen, B.: Interactive interfaces for knowledge-rich domains (1996) 0.01
```
0.009052756 = product of:
  0.036211025 = sum of:
    0.036211025 = weight(_text_:data in 7082) [ClassicSimilarity], result of:
      0.036211025 = score(doc=7082,freq=2.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.24455236 = fieldWeight in 7082, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.0546875 = fieldNorm(doc=7082)
  0.25 = coord(1/4)
```
Abstract

Explores the use of interactive documents as interfaces to historical data, starting from the well-known representation of a timeline. When incorporated into the context of electronic documents, the timeline provides the basis for implementing an interface into an event space, relying particularly on hypertextual-style links. Generalizing timelines also permits the flexible representation of many different kinds of relationships beyond the temporal. Describes examples of such representations taken from prototype implementations.
Spink, A.; Wilson, T.; Ellis, D.; Ford, N.: Modeling users' successive searches in digital environments : a National Science Foundation/British Library funded study (1998) 0.01
```
0.009052756 = product of:
  0.036211025 = sum of:
    0.036211025 = weight(_text_:data in 1255) [ClassicSimilarity], result of:
      0.036211025 = score(doc=1255,freq=8.0), product of:
        0.14807065 = queryWeight, product of:
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.046827413 = queryNorm
        0.24455236 = fieldWeight in 1255, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          3.1620505 = idf(docFreq=5088, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1255)
  0.25 = coord(1/4)
```
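
The explanations for docs 7082 and 1255 arrive at the same fieldWeight (0.24455236) although "data" occurs 2 times in one and 8 times in the other. The sketch below checks that arithmetic with the values from the two listings. It assumes, as in Lucene's ClassicSimilarity, that fieldNorm acts as a length normalization that shrinks as the field gets longer; the function name is illustrative.

```python
import math

IDF_DATA = 3.1620505  # idf(docFreq=5088, maxDocs=44218), copied from the listings

def field_weight(freq: float, field_norm: float, idf: float = IDF_DATA) -> float:
    # fieldWeight = tf * idf * fieldNorm, with tf = sqrt(freq)
    return math.sqrt(freq) * idf * field_norm

fw_7082 = field_weight(freq=2.0, field_norm=0.0546875)
fw_1255 = field_weight(freq=8.0, field_norm=0.02734375)

# Doubling tf (sqrt(8) = 2 * sqrt(2)) is cancelled by halving fieldNorm.
print(f"{fw_7082:.6f} {fw_1255:.6f}")
```

Read this way, the longer record gains nothing from its additional occurrences of the term, which is why both hits are listed with the same score.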
Abstract

As digital libraries become a major source of information for many people, we need to know more about how people seek and retrieve information in digital environments. Quite commonly, users with a problem-at-hand and associated question-in-mind repeatedly search a literature for answers, and seek information in stages over extended periods from a variety of digital information resources. The process of repeatedly searching over time in relation to a specific, but possibly evolving, information problem (including changes or shifts in a variety of variables) is called the successive search phenomenon. The study outlined in this paper is currently investigating this new and little-explored line of inquiry for information retrieval, Web searching, and digital libraries. The purpose of the research project is to investigate the nature, manifestations, and behavior of successive searching by users in digital environments, and to derive criteria for use in the design of information retrieval interfaces and systems supporting successive searching behavior. This study includes two related projects. The first project is based in the School of Library and Information Sciences at the University of North Texas and is funded by a National Science Foundation POWRE Grant <http://www.nsf.gov/cgi-bin/show?award=9753277>. The second project is based at the Department of Information Studies at the University of Sheffield (UK) and is funded by a grant from the British Library <http://www.shef.ac.uk/~is/research/imrg/uncerty.html> Research and Innovation Center. The broad objectives of each project are to examine the nature and extent of successive search episodes in digital environments by real users over time.
The specific aim of the current project is twofold:

* To characterize progressive changes and shifts that occur in: user situational context; user information problem; uncertainty reduction; user cognitive styles; cognitive and affective states of the user, and consequently in their queries; and
* To characterize related changes over time in the type and use of information resources and search strategies particularly related to given capabilities of IR systems, and IR search engines, and examine changes in users' relevance judgments and criteria, and characterize their differences.

The study is an observational, longitudinal data collection in the U.S. and U.K. Three questionnaires are used to collect data: reference, client post search and searcher post search questionnaires. Each successive search episode with a search intermediary for textual materials on the DIALOG Information Service is audiotaped and search transaction logs are recorded. Quantitative analysis includes statistical analysis using Likert scale data from the questionnaires and log-linear analysis of sequential data. Qualitative methods include content analysis, structuring taxonomies, and diagrams to describe shifts and transitions within and between each search episode. Outcomes of the study are the development of appropriate model(s) for IR interactions in successive search episodes and the derivation of a set of design criteria for interfaces and systems supporting successive searching.
