Search (47 results, page 1 of 3)

Spink, A.; Wilson, T.; Ellis, D.; Ford, N.: Modeling users' successive searches in digital environments : a National Science Foundation/British Library funded study (1998) 0.04
```
0.036295157 = product of:
  0.060491923 = sum of:
    0.010081458 = product of:
      0.050407287 = sum of:
        0.050407287 = weight(_text_:problem in 1255) [ClassicSimilarity], result of:
          0.050407287 = score(doc=1255,freq=6.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.28428814 = fieldWeight in 1255, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1255)
      0.2 = coord(1/5)
    0.018944593 = weight(_text_:of in 1255) [ClassicSimilarity], result of:
      0.018944593 = score(doc=1255,freq=46.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.29000556 = fieldWeight in 1255, product of:
          6.78233 = tf(freq=46.0), with freq of:
            46.0 = termFreq=46.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1255)
    0.031465873 = product of:
      0.062931746 = sum of:
        0.062931746 = weight(_text_:mind in 1255) [ClassicSimilarity], result of:
          0.062931746 = score(doc=1255,freq=2.0), product of:
            0.2607373 = queryWeight, product of:
              6.241566 = idf(docFreq=233, maxDocs=44218)
              0.04177434 = queryNorm
            0.24136074 = fieldWeight in 1255, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              6.241566 = idf(docFreq=233, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1255)
      0.5 = coord(1/2)
  0.6 = coord(3/5)
```
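The leaf values in the explain tree above can be recomputed by hand. The following is a minimal sketch, not the search engine's own code: the function name `classic_term_score` is hypothetical, and the formulas (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))) are assumptions inferred from the numbers shown in the listing.

```python
import math

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Recompute one weight(...) node of the explain tree (assumed formulas)."""
    tf = math.sqrt(freq)                             # tf(freq=...)
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # idf(docFreq=..., maxDocs=...)
    query_weight = idf * query_norm                  # queryWeight
    field_weight = tf * idf * field_norm             # fieldWeight
    return query_weight * field_weight

# Values copied from the "problem" clause of doc 1255 in the listing.
score = classic_term_score(freq=6.0, doc_freq=1723, max_docs=44218,
                           query_norm=0.04177434, field_norm=0.02734375)
print(f"{score:.6f}")
```

The result should agree with the listed 0.050407287 to within single-precision rounding, since the listing appears to use 32-bit floats.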
Abstract

As digital libraries become a major source of information for many people, we need to know more about how people seek and retrieve information in digital environments. Quite commonly, users with a problem-at-hand and associated question-in-mind repeatedly search a literature for answers, and seek information in stages over extended periods from a variety of digital information resources. The process of repeatedly searching over time in relation to a specific, but possibly an evolving information problem (including changes or shifts in a variety of variables), is called the successive search phenomenon. The study outlined in this paper is currently investigating this new and little explored line of inquiry for information retrieval, Web searching, and digital libraries. The purpose of the research project is to investigate the nature, manifestations, and behavior of successive searching by users in digital environments, and to derive criteria for use in the design of information retrieval interfaces and systems supporting successive searching behavior. This study includes two related projects. The first project is based in the School of Library and Information Sciences at the University of North Texas and is funded by a National Science Foundation POWRE Grant <http://www.nsf.gov/cgi-bin/show?award=9753277>. The second project is based at the Department of Information Studies at the University of Sheffield (UK) and is funded by a grant from the British Library <http://www.shef.ac.uk/~is/research/imrg/uncerty.html> Research and Innovation Center. The broad objectives of each project are to examine the nature and extent of successive search episodes in digital environments by real users over time.
The specific aim of the current project is twofold:
* To characterize progressive changes and shifts that occur in: user situational context; user information problem; uncertainty reduction; user cognitive styles; cognitive and affective states of the user, and consequently in their queries; and
* To characterize related changes over time in the type and use of information resources and search strategies particularly related to given capabilities of IR systems, and IR search engines, and examine changes in users' relevance judgments and criteria, and characterize their differences.
The study is an observational, longitudinal data collection in the U.S. and U.K. Three questionnaires are used to collect data: reference, client post search and searcher post search questionnaires. Each successive search episode with a search intermediary for textual materials on the DIALOG Information Service is audiotaped and search transaction logs are recorded. Quantitative analysis includes statistical analysis using Likert scale data from the questionnaires and log-linear analysis of sequential data. Qualitative methods include: content analysis, structuring taxonomies; and diagrams to describe shifts and transitions within and between each search episode. Outcomes of the study are the development of appropriate model(s) for IR interactions in successive search episodes and the derivation of a set of design criteria for interfaces and systems supporting successive searching.
Priss, U.: Description logic and faceted knowledge representation (1999) 0.02
```
0.015775634 = product of:
  0.039439082 = sum of:
    0.022459546 = weight(_text_:of in 2655) [ClassicSimilarity], result of:
      0.022459546 = score(doc=2655,freq=22.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.34381276 = fieldWeight in 2655, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=2655)
    0.016979538 = product of:
      0.033959076 = sum of:
        0.033959076 = weight(_text_:22 in 2655) [ClassicSimilarity], result of:
          0.033959076 = score(doc=2655,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.23214069 = fieldWeight in 2655, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2655)
      0.5 = coord(1/2)
  0.4 = coord(2/5)
```
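The way clause scores roll up into the final figure can be checked the same way. This is a sketch under the assumption that each "product of" node multiplies the sum of its matching clauses by coord(matched/total), as the tree above suggests; the helper name `combine` is hypothetical.

```python
def combine(clause_scores, matched, total):
    """Sum the matching clause scores and scale by coord(matched/total)."""
    return sum(clause_scores) * (matched / total)

# Doc 2655: the "22" clause sits in a nested group where 1 of 2 terms matched.
nested = combine([0.033959076], matched=1, total=2)

# Top level: the "of" clause plus the nested group; 2 of 5 clauses matched.
total_score = combine([0.022459546, nested], matched=2, total=5)
print(round(total_score, 9))
```

The nested value corresponds to the 0.016979538 node and the top-level value to the 0.015775634 root, which the result list rounds to 0.02.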
Abstract

The term "facet" was introduced into the field of library classification systems by Ranganathan in the 1930s [Ranganathan, 1962]. A facet is a viewpoint or aspect. In contrast to traditional classification systems, faceted systems are modular in that a domain is analyzed in terms of baseline facets which are then synthesized. In this paper, the term "facet" is used in a broader sense. Facets can describe different aspects on the same level of abstraction or the same aspect on different levels of abstraction. The notion of facets is related to database views, multicontexts and conceptual scaling in formal concept analysis [Ganter and Wille, 1999], polymorphism in object-oriented design, aspect-oriented programming, views and contexts in description logic and semantic networks. This paper presents a definition of facets in terms of faceted knowledge representation that incorporates the traditional narrower notion of facets and potentially facilitates translation between different knowledge representation formalisms. A goal of this approach is a modular, machine-aided knowledge base design mechanism. A possible application is faceted thesaurus construction for information retrieval and data mining. Reasoning complexity depends on the size of the modules (facets). A more general analysis of complexity will be left for future research.

Date

22. 1.2016 17:30:31

Priss, U.: Faceted knowledge representation (1999) 0.01

0.014990156 = product of:
  0.03747539 = sum of:
    0.017665926 = weight(_text_:of in 2654) [ClassicSimilarity], result of:
      0.017665926 = score(doc=2654,freq=10.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.2704316 = fieldWeight in 2654, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0546875 = fieldNorm(doc=2654)
    0.019809462 = product of:
      0.039618924 = sum of:
        0.039618924 = weight(_text_:22 in 2654) [ClassicSimilarity], result of:
          0.039618924 = score(doc=2654,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.2708308 = fieldWeight in 2654, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=2654)
      0.5 = coord(1/2)
  0.4 = coord(2/5)

Abstract

Faceted Knowledge Representation provides a formalism for implementing knowledge systems. The basic notions of faceted knowledge representation are "unit", "relation", "facet" and "interpretation". Units are atomic elements and can be abstract elements or refer to external objects in an application. Relations are sequences or matrices of 0s and 1s (binary matrices). Facets are relational structures that combine units and relations. Each facet represents an aspect or viewpoint of a knowledge system. Interpretations are mappings that can be used to translate between different representations. This paper introduces the basic notions of faceted knowledge representation. The formalism is applied here to an abstract modeling of a faceted thesaurus as used in information retrieval.

Date

22. 1.2016 17:30:31

Brin, S.; Page, L.: The anatomy of a large-scale hypertextual Web search engine (1998) 0.01
```
0.012355096 = product of:
  0.030887738 = sum of:
    0.008315044 = product of:
      0.041575223 = sum of:
        0.041575223 = weight(_text_:problem in 947) [ClassicSimilarity], result of:
          0.041575223 = score(doc=947,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.23447686 = fieldWeight in 947, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0390625 = fieldNorm(doc=947)
      0.2 = coord(1/5)
    0.022572692 = weight(_text_:of in 947) [ClassicSimilarity], result of:
      0.022572692 = score(doc=947,freq=32.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.34554482 = fieldWeight in 947, product of:
          5.656854 = tf(freq=32.0), with freq of:
            32.0 = termFreq=32.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=947)
  0.4 = coord(2/5)
```
Abstract

In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.

Peters, C.; Picchi, E.: Across languages, across cultures : issues in multilinguality and digital libraries (1997) 0.01

0.011577157 = product of:
  0.028942892 = sum of:
    0.0133040715 = product of:
      0.066520356 = sum of:
        0.066520356 = weight(_text_:problem in 1233) [ClassicSimilarity], result of:
          0.066520356 = score(doc=1233,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.375163 = fieldWeight in 1233, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0625 = fieldNorm(doc=1233)
      0.2 = coord(1/5)
    0.01563882 = weight(_text_:of in 1233) [ClassicSimilarity], result of:
      0.01563882 = score(doc=1233,freq=6.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.23940048 = fieldWeight in 1233, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0625 = fieldNorm(doc=1233)
  0.4 = coord(2/5)

Abstract

With the recent rapid diffusion over the international computer networks of world-wide distributed document bases, the question of multilingual access and multilingual information retrieval is becoming increasingly relevant. We briefly discuss just some of the issues that must be addressed in order to implement a multilingual interface for a Digital Library system and describe our own approach to this problem.

Baker, T.: Languages for Dublin Core (1998) 0.01
```
0.011098954 = product of:
  0.027747385 = sum of:
    0.010081458 = product of:
      0.050407287 = sum of:
        0.050407287 = weight(_text_:problem in 1257) [ClassicSimilarity], result of:
          0.050407287 = score(doc=1257,freq=6.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.28428814 = fieldWeight in 1257, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1257)
      0.2 = coord(1/5)
    0.017665926 = weight(_text_:of in 1257) [ClassicSimilarity], result of:
      0.017665926 = score(doc=1257,freq=40.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.2704316 = fieldWeight in 1257, product of:
          6.3245554 = tf(freq=40.0), with freq of:
            40.0 = termFreq=40.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1257)
  0.4 = coord(2/5)
```
Abstract

Over the past three years, the Dublin Core Metadata Initiative has achieved a broad international consensus on the semantics of a simple element set for describing electronic resources. Since the first workshop in March 1995, which was reported in the very first issue of D-Lib Magazine, Dublin Core has been the topic of perhaps a dozen articles here. Originally intended to be simple and intuitive enough for authors to tag Web pages without special training, Dublin Core is being adapted now for more specialized uses, from government information and legal deposit to museum informatics and electronic commerce. To meet such specialized requirements, Dublin Core can be customized with additional elements or qualifiers. However, these refinements can compromise interoperability across applications. There are tradeoffs between using specific terms that precisely meet local needs versus general terms that are understood more widely. We can better understand this inevitable tension between simplicity and complexity if we recognize that metadata is a form of human language. With Dublin Core, as with a natural language, people are inclined to stretch definitions, make general terms more specific, specific terms more general, misunderstand intended meanings, and coin new terms. One goal of this paper, therefore, will be to examine the experience of some related ways to seek semantic interoperability through simplicity: planned languages, interlingua constructs, and pidgins. The problem of semantic interoperability is compounded when we consider Dublin Core in translation. All of the workshops, documents, mailing lists, user guides, and working group outputs of the Dublin Core Initiative have been in English. But in many countries and for many applications, people need a metadata standard in their own language. In principle, the broad elements of Dublin Core can be defined equally well in Bulgarian or Hindi. 
Since Dublin Core is a controlled standard, however, any parallel definitions need to be kept in sync as the standard evolves. Another goal of the paper, then, will be to define the conceptual and organizational problem of maintaining a metadata standard in multiple languages. In addition to a name and definition, which are meant for human consumption, each Dublin Core element has a label, or indexing token, meant for harvesting by search engines. For practical reasons, these machine-readable tokens are English-looking strings such as Creator and Subject (just as HTML tags are called HEAD, BODY, or TITLE). These tokens, which are shared by Dublin Cores in every language, ensure that metadata fields created in any particular language are indexed together across repositories. As symbols of underlying universal semantics, these tokens form the basis of semantic interoperability among the multiple Dublin Cores. As long as we limit ourselves to sharing these indexing tokens among exact translations of a simple set of fifteen broad elements, the definitions of which fit easily onto two pages, the problem of Dublin Core in multiple languages is straightforward. But nothing having to do with human language is ever so simple. Just as speakers of various languages must learn the language of Dublin Core in their own tongues, we must find the right words to talk about a metadata language that is expressible in many discipline-specific jargons and natural languages and that inevitably will evolve and change over time.
Van de Sompel, H.; Hochstenbach, P.: Reference linking in a hybrid library environment : part 3: generalizing the SFX solution in the "SFX@Ghent & SFX@LANL" experiment (1999) 0.01
```
0.010986221 = product of:
  0.027465552 = sum of:
    0.0094074 = product of:
      0.047036998 = sum of:
        0.047036998 = weight(_text_:problem in 1243) [ClassicSimilarity], result of:
          0.047036998 = score(doc=1243,freq=4.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.2652803 = fieldWeight in 1243, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.03125 = fieldNorm(doc=1243)
      0.2 = coord(1/5)
    0.018058153 = weight(_text_:of in 1243) [ClassicSimilarity], result of:
      0.018058153 = score(doc=1243,freq=32.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.27643585 = fieldWeight in 1243, product of:
          5.656854 = tf(freq=32.0), with freq of:
            32.0 = termFreq=32.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03125 = fieldNorm(doc=1243)
  0.4 = coord(2/5)
```
Abstract

This is the third part of our papers about reference linking in a hybrid library environment. The first part described the state-of-the-art of reference linking and contrasted various approaches to the problem. It identified static and dynamic linking solutions, open and closed linking frameworks as well as just-in-case and just-in-time linking. The second part introduced SFX, a dynamic, just-in-time linking solution we built for our own purposes. However, we suggested that the underlying concepts were sufficiently generic to be applied in a wide range of digital libraries. In this third part we show how this has been demonstrated conclusively in the "SFX@Ghent & SFX@LANL" experiment. In this experiment, local as well as remote distributed information resources of the digital library collections of the Research Library of the Los Alamos National Laboratory and the University of Ghent Library have been used as starting points for SFX-links into other parts of the collections. The SFX-framework has further been generalized in order to achieve a technology that can easily be transferred from one digital library environment to another and that minimizes the overhead in making the distributed information services that make up those libraries interoperable with SFX. This third part starts with a presentation of the SFX problem statement in light of the recent discussions on reference linking. Next, it introduces the notion of global and local relevance of extended services as well as an architectural categorization of open linking frameworks, also referred to as frameworks that are supportive of selective resolution. Then, an in-depth description of the generalized SFX solution is given.
Oard, D.W.: Alternative approaches for cross-language text retrieval (1997) 0.01
```
0.010920028 = product of:
  0.027300071 = sum of:
    0.010081458 = product of:
      0.050407287 = sum of:
        0.050407287 = weight(_text_:problem in 1164) [ClassicSimilarity], result of:
          0.050407287 = score(doc=1164,freq=6.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.28428814 = fieldWeight in 1164, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1164)
      0.2 = coord(1/5)
    0.017218614 = weight(_text_:of in 1164) [ClassicSimilarity], result of:
      0.017218614 = score(doc=1164,freq=38.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.2635841 = fieldWeight in 1164, product of:
          6.164414 = tf(freq=38.0), with freq of:
            38.0 = termFreq=38.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1164)
  0.4 = coord(2/5)
```
Abstract

The explosive growth of the Internet and other sources of networked information have made automatic mediation of access to networked information sources an increasingly important problem. Much of this information is expressed as electronic text, and it is becoming practical to automatically convert some printed documents and recorded speech to electronic text as well. Thus, automated systems capable of detecting useful documents are finding widespread application. With even a small number of languages it can be inconvenient to issue the same query repeatedly in every language, so users who are able to read more than one language will likely prefer a multilingual text retrieval system over a collection of monolingual systems. And since reading ability in a language does not always imply fluent writing ability in that language, such users will likely find cross-language text retrieval particularly useful for languages in which they are less confident of their ability to express their information needs effectively. The use of such systems can also be beneficial if the user is able to read only a single language. For example, when only a small portion of the document collection will ever be examined by the user, performing retrieval before translation can be significantly more economical than performing translation before retrieval. So when the application is sufficiently important to justify the time and effort required for translation, those costs can be minimized if an effective cross-language text retrieval system is available. Even when translation is not available, there are circumstances in which cross-language text retrieval could be useful to a monolingual user. For example, a researcher might find a paper published in an unfamiliar language useful if that paper contains references to works by the same author that are in the researcher's native language.
Multilingual text retrieval can be defined as selection of useful documents from collections that may contain several languages (English, French, Chinese, etc.). This formulation allows for the possibility that individual documents might contain more than one language, a common occurrence in some applications. Both cross-language and within-language retrieval are included in this formulation, but it is the cross-language aspect of the problem which distinguishes multilingual text retrieval from its well studied monolingual counterpart. At the SIGIR 96 workshop on "Cross-Linguistic Information Retrieval" the participants discussed the proliferation of terminology being used to describe the field and settled on "Cross-Language" as the best single description of the salient aspect of the problem. "Multilingual" was felt to be too broad, since that term has also been used to describe systems able to perform within-language retrieval in more than one language but that lack any cross-language capability. "Cross-lingual" and "cross-linguistic" were felt to be equally good descriptions of the field, but "cross-language" was selected as the preferred term in the interest of standardization. Unfortunately, at about the same time the U.S. Defense Advanced Research Projects Agency (DARPA) introduced "translingual" as their preferred term, so we are still some distance from reaching consensus on this matter.
I will not attempt to draw a sharp distinction between retrieval and filtering in this survey. Although my own work on adaptive cross-language text filtering has led me to make this distinction fairly carefully in other presentations (cf. (Oard 1997b)), such an approach does little to help understand the fundamental techniques which have been applied or the results that have been obtained in this case. Since it is still common to view filtering (detection of useful documents in dynamic document streams) as a kind of retrieval, I will simply adopt that perspective here.
Van de Sompel, H.; Hochstenbach, P.: Reference linking in a hybrid library environment : part 2: SFX, a generic linking solution (1999) 0.01
```
0.010812533 = product of:
  0.027031332 = sum of:
    0.008315044 = product of:
      0.041575223 = sum of:
        0.041575223 = weight(_text_:problem in 1241) [ClassicSimilarity], result of:
          0.041575223 = score(doc=1241,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.23447686 = fieldWeight in 1241, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1241)
      0.2 = coord(1/5)
    0.018716287 = weight(_text_:of in 1241) [ClassicSimilarity], result of:
      0.018716287 = score(doc=1241,freq=22.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.28651062 = fieldWeight in 1241, product of:
          4.690416 = tf(freq=22.0), with freq of:
            22.0 = termFreq=22.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1241)
  0.4 = coord(2/5)
```
Abstract

This is the second part of two articles about reference linking in hybrid digital libraries. The first part, Frameworks for Linking, described the current state-of-the-art and contrasted various approaches to the problem. It identified static and dynamic linking solutions, as well as open and closed linking frameworks. It also included an extensive bibliography. The second part describes our work at the University of Ghent to address these issues. SFX is a generic linking system that we have developed for our own needs, but its underlying concepts can be applied in a wide range of digital libraries. This is a description of the approach to the creation of extended services in a hybrid library environment that has been taken by the Library Automation team at the University of Ghent. The ongoing research has been grouped under the working title Special Effects (SFX). In order to explain the SFX-concepts in a comprehensive way, the discussion will start with a brief description of pre-SFX experiments. Thereafter, the basics of the SFX-approach are explained briefly, in combination with concrete implementation choices taken for the Elektron SFX-linking experiment. Elektron was the name of a modest digital library collaboration between the Universities of Ghent, Louvain and Antwerp.
Borgman, C.L.: Multi-media, multi-cultural, and multi-lingual digital libraries : or how do we exchange data in 400 languages? (1997) 0.01
```
0.009569087 = product of:
  0.023922717 = sum of:
    0.005820531 = product of:
      0.029102655 = sum of:
        0.029102655 = weight(_text_:problem in 1263) [ClassicSimilarity], result of:
          0.029102655 = score(doc=1263,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.1641338 = fieldWeight in 1263, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1263)
      0.2 = coord(1/5)
    0.018102186 = weight(_text_:of in 1263) [ClassicSimilarity], result of:
      0.018102186 = score(doc=1263,freq=42.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.2771099 = fieldWeight in 1263, product of:
          6.4807405 = tf(freq=42.0), with freq of:
            42.0 = termFreq=42.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.02734375 = fieldNorm(doc=1263)
  0.4 = coord(2/5)
```
Abstract

The Internet would not be very useful if communication were limited to textual exchanges between speakers of English located in the United States. Rather, its value lies in its ability to enable people from multiple nations, speaking multiple languages, to employ multiple media in interacting with each other. While computer networks broke through national boundaries long ago, they remain much more effective for textual communication than for exchanges of sound, images, or mixed media -- and more effective for communication in English than for exchanges in most other languages, much less interactions involving multiple languages. Supporting searching and display in multiple languages is an increasingly important issue for all digital libraries accessible on the Internet. Even if a digital library contains materials in only one language, the content needs to be searchable and displayable on computers in countries speaking other languages. We need to exchange data between digital libraries, whether in a single language or in multiple languages. Data exchanges may be large batch updates or interactive hyperlinks. In any of these cases, character sets must be represented in a consistent manner if exchanges are to succeed. Issues of interoperability, portability, and data exchange related to multi-lingual character sets have received surprisingly little attention in the digital library community or in discussions of standards for information infrastructure, except in Europe. The landmark collection of papers on Standards Policy for Information Infrastructure, for example, contains no discussion of multi-lingual issues except for a passing reference to the Unicode standard. The goal of this short essay is to draw attention to the multi-lingual issues involved in designing digital libraries accessible on the Internet. Many of the multi-lingual design issues parallel those of multi-media digital libraries, a topic more familiar to most readers of D-Lib Magazine. 
This essay draws examples from multi-media DLs to illustrate some of the urgent design challenges in creating a globally distributed network serving people who speak many languages other than English. First we introduce some general issues of medium, culture, and language, then discuss the design challenges in the transition from local to global systems, lastly addressing technical matters. The technical issues involve the choice of character sets to represent languages, similar to the choices made in representing images or sound. However, the scale of the language problem is far greater. Standards for multi-media representation are being adopted fairly rapidly, in parallel with the availability of multi-media content in electronic form. By contrast, we have hundreds (and sometimes thousands) of years' worth of textual materials in hundreds of languages, created long before data encoding standards existed. Textual content from past and present is being encoded in language- and application-specific representations that are difficult to exchange without losing data -- if they exchange at all. We illustrate the multi-language DL challenge with examples drawn from the research library community, which typically handles collections of materials in 400 or so languages. These are problems faced not only by developers of digital libraries, but by those who develop and manage any communication technology that crosses national or linguistic boundaries.
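The abstract's point that character sets must be represented consistently for an exchange to succeed can be illustrated with a minimal Python sketch. The sample strings and the choice of ISO 8859-1 (Latin-1) as the legacy encoding are assumptions made for illustration; they do not come from the essay.

```python
# Illustrative only: the sample strings and the Latin-1 legacy encoding are assumptions.
text = "Bibliothèque"

# The sender stores the text in a legacy single-byte character set.
legacy_bytes = text.encode("latin-1")

# The receiver wrongly assumes UTF-8, so the accented letter cannot be decoded
# and is substituted with the Unicode replacement character.
garbled = legacy_bytes.decode("utf-8", errors="replace")

# When both sides agree on one encoding, the exchange is lossless.
round_trip = text.encode("utf-8").decode("utf-8")

# A legacy character set may be unable to represent another script at all.
try:
    "図書館".encode("latin-1")
    representable = True
except UnicodeEncodeError:
    representable = False

print(garbled)
print(round_trip == text, representable)
```

The sketch shows the two failure modes the abstract alludes to: silent data loss when the encodings disagree, and outright failure when the target character set lacks the required characters.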
Van de Sompel, H.; Hochstenbach, P.: Reference linking in a hybrid library environment : part 1: frameworks for linking (1999) 0.01
```
0.009417556 = product of:
  0.02354389 = sum of:
    0.0066520358 = product of:
      0.033260178 = sum of:
        0.033260178 = weight(_text_:problem in 1244) [ClassicSimilarity], result of:
          0.033260178 = score(doc=1244,freq=2.0), product of:
            0.17731056 = queryWeight, product of:
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.04177434 = queryNorm
            0.1875815 = fieldWeight in 1244, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.244485 = idf(docFreq=1723, maxDocs=44218)
              0.03125 = fieldNorm(doc=1244)
      0.2 = coord(1/5)
    0.016891856 = weight(_text_:of in 1244) [ClassicSimilarity], result of:
      0.016891856 = score(doc=1244,freq=28.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.25858206 = fieldWeight in 1244, product of:
          5.2915025 = tf(freq=28.0), with freq of:
            28.0 = termFreq=28.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03125 = fieldNorm(doc=1244)
  0.4 = coord(2/5)
```
Abstract

The creation of services linking related information entities is an area that is attracting an ever-increasing interest in the ongoing development of the World Wide Web in general, and of research-related information systems in particular. Currently, both practice and theory point at linking services as being a major domain for innovation enabled by digital communication of content. Publishers, subscription agents, researchers and libraries are all looking into ways to create added value by linking related information entities, as such presenting the information within a broader context estimated to be relevant to the users of the information. This is the first of two articles in D-Lib Magazine on this topic. This first part describes the current state of the art and contrasts various approaches to the problem. It identifies static and dynamic linking solutions as well as open and closed linking frameworks. It also includes an extensive bibliography. The second part, SFX, a Generic Linking Solution, describes a system that we have developed for linking in a hybrid working environment. Although most writings on electronic scientific communication have touted other benefits, such as the increase in communication speed, the possibility to exchange multimedia content and the absence of limitations on the length of research papers, currently both practice and theory point at linking services as being a major opportunity for improved communication of content.

Dunning, A.: Do we still need search engines? (1999) 0.01

```
0.007923785 = product of:
  0.039618924 = sum of:
    0.039618924 = product of:
      0.07923785 = sum of:
        0.07923785 = weight(_text_:22 in 6021) [ClassicSimilarity], result of:
          0.07923785 = score(doc=6021,freq=2.0), product of:
            0.14628662 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.04177434 = queryNorm
            0.5416616 = fieldWeight in 6021, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=6021)
      0.5 = coord(1/2)
  0.2 = coord(1/5)
```

Source: Ariadne. 1999, no.22
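
The arithmetic in the score explanation for this entry can be retraced with a short Python sketch. The function name is hypothetical, and the idf form used here, 1 + ln(maxDocs / (docFreq + 1)), is simply the one that reproduces the printed values; `queryNorm`, `fieldNorm`, and the two `coord` factors are taken from the listing as given rather than derived.

```python
import math

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    """Score of one term in one document, following the factors in the explain tree."""
    tf = math.sqrt(freq)                              # tf(freq) is the square root of termFreq
    idf = 1.0 + math.log(max_docs / (doc_freq + 1))   # form that matches the printed idf
    query_weight = idf * query_norm                   # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm              # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight

# Values copied from the explanation for doc 6021 (term "22").
term_score = classic_term_score(freq=2.0, doc_freq=3622, max_docs=44218,
                                query_norm=0.04177434, field_norm=0.109375)

# Two coord factors appear in the tree: 1/2 on the inner clause, 1/5 on the outer query.
final_score = term_score * 0.5 * 0.2

print(term_score, final_score)
```

The listing prints 32-bit float values, so results computed in double precision agree with it only to about six or seven significant digits.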

Page, A.: ¬The search is over : the search-engines secrets of the pros (1996) 0.01

```
0.0050474075 = product of:
  0.025237037 = sum of:
    0.025237037 = weight(_text_:of in 5670) [ClassicSimilarity], result of:
      0.025237037 = score(doc=5670,freq=10.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.38633084 = fieldWeight in 5670, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.078125 = fieldNorm(doc=5670)
  0.2 = coord(1/5)
```

Abstract

Covers 8 of the most popular search engines. Gives a summary of each and has a nice table of features that also briefly lists the pros and cons. Includes a short explanation of Boolean operators too.

Lynch, C.A.: ¬The Z39.50 information retrieval standard : part I: a strategic view of its past, present and future (1997) 0.00
```
0.0049762414 = product of:
  0.024881206 = sum of:
    0.024881206 = weight(_text_:of in 1262) [ClassicSimilarity], result of:
      0.024881206 = score(doc=1262,freq=108.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.38088378 = fieldWeight in 1262, product of:
          10.392304 = tf(freq=108.0), with freq of:
            108.0 = termFreq=108.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.0234375 = fieldNorm(doc=1262)
  0.2 = coord(1/5)
```
Abstract

The Z39.50 standard for information retrieval is important from a number of perspectives. While still not widely known within the computer networking community, it is a mature standard that represents the culmination of two decades of thinking and debate about how information retrieval functions can be modeled, standardized, and implemented in a distributed systems environment. And -- importantly -- it has been tested through substantial deployment experience. Z39.50 is one of the few examples we have to date of a protocol that actually goes beyond codifying mechanism and moves into the area of standardizing shared semantic knowledge. The extent to which this should be a goal of the protocol has been an ongoing source of controversy and tension within the developer community, and differing views on this issue can be seen both in the standard itself and the way that it is used in practice. Given the growing emphasis on issues such as "semantic interoperability" as part of the research agenda for digital libraries (see Clifford A. Lynch and Hector Garcia-Molina. Interoperability, Scaling, and the Digital Libraries Research Agenda, Report on the May 18-19, 1995 IITA Libraries Workshop, <http://www-diglib.stanford.edu/diglib/pub/reports/iita-dlw/main.html>), the insights gained by the Z39.50 community into the complex interactions among various definitions of semantics and interoperability are particularly relevant. The development process for the Z39.50 standard is also of interest in its own right. Its history, dating back to the 1970s, spans a period that saw the eclipse of formal standards-making agencies by groups such as the Internet Engineering Task Force (IETF) and informal standards development consortia. Moreover, in order to achieve meaningful implementation, Z39.50 had to move beyond its origins in the OSI debacle of the 1980s. Z39.50 has also been, to some extent, a victim of its own success -- or at least promise.
Recent versions of the standard are highly extensible, and the consensus process of standards development has made it hospitable to an ever-growing set of new communities and requirements. As this process of extension has proceeded, it has become ever less clear what the appropriate scope and boundaries of the protocol should be, and what expectations one should have of practical interoperability among implementations of the standard. Z39.50 thus offers an excellent case study of the problems involved in managing the evolution of a standard over time. It may well offer useful lessons for the future of other standards such as HTTP and HTML, which seem to be facing some of the same issues.
This paper, which will appear in two parts, starting with this issue of D-Lib, looks at several strategic issues surrounding Z39.50. After a relatively brief overview of the function and history of the protocol, I will examine some of the competing visions of the protocol's role, with emphasis on issues of interoperability and the incorporation of semantics. The second installment of the paper will look at questions related to the management of the standard and the standards development process, with emphasis on the scope of the protocol and how that relates back again to interoperability questions. The paper concludes with a discussion of the adoption and deployment of the standard, its relationship to other standards, and some speculations on future directions for the protocol. This paper is not intended to be a tutorial on the details of how current or past versions of Z39.50 work. These technical details are covered not only in the standard itself (which can admittedly be rather difficult reading) but also in an array of tutorial and review papers (see <http://lcweb.loc.gov/z3950/agency> for bibliographies and pointers to on-line information on Z39.50). Instead, the paper's focus is on how and why Z39.50 developed the way it did, and the conceptual debates that have influenced its evolution and use. While a detailed technical knowledge of the operation of Z39.50 is certainly helpful, it should not be necessary in order to follow most of the material here. Some disclaimers are in order. I have been actively involved in the development of Z39.50 since the early 1980s and have been a participant -- and on occasion, even an instigator -- of some of the activities described here. This paper is an attempt to make a critical assessment of the current state of Z39.50 and a review of its development with the full benefit of hindsight. It recounts a number of debates that occurred within the developer community over the past years. 
In many of these, I advocated specific positions or approaches, sometimes successfully and sometimes unsuccessfully. What is presented here is one person's perspective -- mine -- which is sometimes at odds with the current consensus within the developer community; I've tried to represent opposing views fairly, and to differentiate my opinions from fact or consensus. However, others will undoubtedly disagree with many of the comments here.
Thiele, H.: ¬The Dublin Core and Warwick framework : a review of the literature, March 1995 - September 1997 (1998) 0.00
```
0.00488322 = product of:
  0.024416098 = sum of:
    0.024416098 = weight(_text_:of in 1254) [ClassicSimilarity], result of:
      0.024416098 = score(doc=1254,freq=26.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.37376386 = fieldWeight in 1254, product of:
          5.0990195 = tf(freq=26.0), with freq of:
            26.0 = termFreq=26.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=1254)
  0.2 = coord(1/5)
```
Abstract

The purpose of this essay is to identify and explore the dynamics of the literature associated with the Dublin Core Workshop Series. The essay opens by identifying the problems that the Dublin Core Workshop Series is addressing, the status of the Internet at the time of the first workshop, and the contributions each workshop has made to the ongoing discussion. The body of the essay describes the characteristics of the literature, highlights key documents, and identifies the major researchers. The essay closes with evaluation of the literary trends and considerations of future research directions. The essay concludes that a shift from a descriptive emphasis to a more empirical form of literature is about to take place. Future research questions are identified in the areas of satisfying searcher needs, the impact of surrogate descriptions on search engine performance, and the effectiveness of surrogate descriptions in authenticating Internet resources.
Landauer, T.K.; Foltz, P.W.; Laham, D.: ¬An introduction to Latent Semantic Analysis (1998) 0.00
```
0.00488322 = product of:
  0.024416098 = sum of:
    0.024416098 = weight(_text_:of in 1162) [ClassicSimilarity], result of:
      0.024416098 = score(doc=1162,freq=26.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.37376386 = fieldWeight in 1162, product of:
          5.0990195 = tf(freq=26.0), with freq of:
            26.0 = termFreq=26.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=1162)
  0.2 = coord(1/5)
```
Abstract

Latent Semantic Analysis (LSA) is a theory and method for extracting and representing the contextual-usage meaning of words by statistical computations applied to a large corpus of text (Landauer and Dumais, 1997). The underlying idea is that the aggregate of all the word contexts in which a given word does and does not appear provides a set of mutual constraints that largely determines the similarity of meaning of words and sets of words to each other. The adequacy of LSA's reflection of human knowledge has been established in a variety of ways. For example, its scores overlap those of humans on standard vocabulary and subject matter tests; it mimics human word sorting and category judgments; it simulates word-word and passage-word lexical priming data; and as reported in 3 following articles in this issue, it accurately estimates passage coherence, learnability of passages by individual students, and the quality and quantity of knowledge contained in an essay.
Paskin, N.: DOI: current status and outlook (1999) 0.00
```
0.0045145387 = product of:
  0.022572692 = sum of:
    0.022572692 = weight(_text_:of in 1245) [ClassicSimilarity], result of:
      0.022572692 = score(doc=1245,freq=50.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.34554482 = fieldWeight in 1245, product of:
          7.071068 = tf(freq=50.0), with freq of:
            50.0 = termFreq=50.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03125 = fieldNorm(doc=1245)
  0.2 = coord(1/5)
```
Abstract

Over the past few months the International DOI Foundation (IDF) has produced a number of discussion papers and other materials about the Digital Object Identifier (DOI℠) initiative. They are all available at the DOI web site, including a brief summary of the DOI origins and purpose. The aim of the present paper is to update those papers, reflecting recent progress, and to provide a summary of the current position and context of the DOI. Although much of the material presented here is the result of a consensus by the organisations forming the International DOI Foundation, some of the points discuss work in progress. The paper describes the origin of the DOI as a persistent identifier for managing copyrighted materials and its development under the non-profit International DOI Foundation into a system providing identifiers of intellectual property with a framework for open applications to be built using them. Persistent identification implementations consistent with URN specifications have up to now been hindered by lack of widespread availability of resolution mechanisms, content typology consensus, and sufficiently flexible infrastructure; DOI attempts to overcome these obstacles. Resolution of the DOI uses the Handle System®, which offers the necessary functionality for open applications. The aim of the International DOI Foundation is to promote widespread applications of the DOI, which it is doing by pioneering some early implementations and by providing an extensible framework to ensure interoperability of future DOI uses. Applications of the DOI will require an interoperable scheme of declared metadata with each DOI; the basis of the DOI metadata scheme is a minimal "kernel" of elements supplemented by additional application-specific elements, under an umbrella data model (derived from the INDECS analysis) that promotes convergence of different application metadata sets.
The IDF intends to require declaration of only a minimal set of metadata, sufficient to enable unambiguous look-up of a DOI, but this must be capable of extension by others to create open applications.
Payette, S.; Blanchi, C.; Lagoze, C.; Overly, E.A.: Interoperability for digital objects and repositories : the Cornell/CNRI experiments (1999) 0.00
```
0.004423326 = product of:
  0.02211663 = sum of:
    0.02211663 = weight(_text_:of in 1248) [ClassicSimilarity], result of:
      0.02211663 = score(doc=1248,freq=48.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.33856338 = fieldWeight in 1248, product of:
          6.928203 = tf(freq=48.0), with freq of:
            48.0 = termFreq=48.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03125 = fieldNorm(doc=1248)
  0.2 = coord(1/5)
```
Abstract

For several years the Digital Library Research Group at Cornell University and the Corporation for National Research Initiatives (CNRI) have been engaged in research focused on the design and development of infrastructures for open architecture, confederated digital libraries. The goal of this effort is to achieve interoperability and extensibility of digital library systems through the definition of key digital library services and their open interfaces, allowing flexible interaction of existing services and augmentation of the infrastructure with new services. Some aspects of this research have included the development and deployment of the Dienst software, the Handle System®, and the architecture of digital objects and repositories. In this paper, we describe the joint effort by Cornell and CNRI to prototype a rich and deployable architecture for interoperable digital objects and repositories. This effort has challenged us to move theories of interoperability closer to practice. The Cornell/CNRI collaboration builds on two existing projects focusing on the development of interoperable digital libraries. Details relating to the technology of these projects are described elsewhere. Both projects were strongly influenced by the fundamental abstractions of repositories and digital objects as articulated by Kahn and Wilensky in A Framework for Distributed Digital Object Services. Furthermore, both programs were influenced by the container architecture described in the Warwick Framework, and by the notions of distributed dynamic objects presented by Lagoze and Daniel in their Distributed Active Relationship work. With these common roots, one would expect that the CNRI and Cornell repositories would be at least theoretically interoperable. However, the actual test would be the extent to which our independently developed repositories were practically interoperable. 
This paper focuses on the definition of interoperability in the joint Cornell/CNRI work and the set of experiments conducted to formally test it. Our motivation for this work is the eventual deployment of formally tested reference implementations of the repository architecture for experimentation and development by fellow digital library researchers. In Section 2, we summarize the digital object and repository approach that was the focus of our interoperability experiments. In Section 3, we describe the set of experiments that progressively tested interoperability at increasing levels of functionality. In Section 4, we discuss general conclusions, and in Section 5, we give a preview of our future work, including our plans to evolve our experimentation to the point of defining a set of formal metrics for measuring interoperability for repositories and digital objects. This is still a work in progress that is expected to undergo additional refinements during its development.
Atkins, H.: ¬The ISI® Web of Science® - links and electronic journals : how links work today in the Web of Science, and the challenges posed by electronic journals (1999) 0.00
```
0.004330193 = product of:
  0.021650964 = sum of:
    0.021650964 = weight(_text_:of in 1246) [ClassicSimilarity], result of:
      0.021650964 = score(doc=1246,freq=46.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.33143494 = fieldWeight in 1246, product of:
          6.78233 = tf(freq=46.0), with freq of:
            46.0 = termFreq=46.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.03125 = fieldNorm(doc=1246)
  0.2 = coord(1/5)
```
Abstract

Since their inception in the early 1960s the strength and unique aspect of the ISI citation indexes has been their ability to illustrate the conceptual relationships between scholarly documents. When authors create reference lists for their papers, they make explicit links between their own, current work and the prior work of others. The exact nature of these links may not be expressed in the references themselves, and the motivation behind them may vary (this has been the subject of much discussion over the years), but the links embodied in references do exist. Over the past 30+ years, technology has allowed ISI to make the presentation of citation searching increasingly accessible to users of our products. Citation searching and link tracking moved from being rather cumbersome in print, to being direct and efficient (albeit non-intuitive) online, to being somewhat more user-friendly in CD format. But it is the confluence of the hypertext link and development of Web browsers that has enabled us to present to users a new form of citation product -- the Web of Science -- that is intuitive and makes citation indexing conceptually accessible. A cited reference search begins with a known, important (or at least relevant) document used as the search term. The search allows one to identify subsequent articles that have cited that document. This feature adds the dimension of prospective searching to the usual retrospective searching that all bibliographic indexes provide. Citation indexing is a prime example of a concept before its time -- important enough to be used in the meantime by those sufficiently motivated, but just waiting for the right technology to come along to expand its use. While it was possible to follow citation links in earlier citation index formats, this required a level of effort on the part of users that was often just too much to ask of the casual user.
In the citation indexes as presented in the Web of Science, the relationship between citing and cited documents is evident to users, and a click of the mouse is all it takes to follow a citation link. Citation connections are established between the published papers being indexed from the 8,000+ journals ISI covers and the items their reference lists contain during the data capture process. It is the standardized capture of each of the references included with these documents that enables us to provide the citation searching feature in all the citation index formats, as well as both internal and external links in the Web of Science.
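
The cited reference search described in this abstract can be sketched as an inverted index from cited items to the papers citing them. This is a minimal illustration of the general idea, not ISI's implementation; the paper identifiers, years, and function names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: each indexed paper with its publication year and reference list.
papers = {
    "A": {"year": 1990, "refs": []},
    "B": {"year": 1994, "refs": ["A"]},
    "C": {"year": 1997, "refs": ["A", "B"]},
    "D": {"year": 1998, "refs": ["B"]},
}

def build_citation_index(records):
    """Invert the reference lists: map each cited item to the papers citing it."""
    cited_by = defaultdict(list)
    for paper_id, record in records.items():
        for ref in record["refs"]:
            cited_by[ref].append(paper_id)
    return cited_by

def cited_reference_search(cited_by, records, known_doc):
    """Return later papers citing known_doc, ordered by publication year."""
    return sorted(cited_by.get(known_doc, []), key=lambda p: records[p]["year"])

index = build_citation_index(papers)
citing_a = cited_reference_search(index, papers, "A")  # prospective search from "A"
citing_d = cited_reference_search(index, papers, "D")  # nothing cites "D" yet

print(citing_a, citing_d)
```

Reading a paper's own `refs` list is the retrospective direction; looking the paper up in the inverted index is the prospective direction the abstract describes.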

Object

Web of Science
Dolin, R.; Agrawal, D.; El Abbadi, A.; Pearlman, J.: Using automated classification for summarizing and selecting heterogeneous information sources (1998) 0.00
```
0.004282867 = product of:
  0.021414334 = sum of:
    0.021414334 = weight(_text_:of in 316) [ClassicSimilarity], result of:
      0.021414334 = score(doc=316,freq=20.0), product of:
        0.06532493 = queryWeight, product of:
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.04177434 = queryNorm
        0.32781258 = fieldWeight in 316, product of:
          4.472136 = tf(freq=20.0), with freq of:
            20.0 = termFreq=20.0
          1.5637573 = idf(docFreq=25162, maxDocs=44218)
          0.046875 = fieldNorm(doc=316)
  0.2 = coord(1/5)
```
Abstract

Information retrieval over the Internet increasingly requires the filtering of thousands of heterogeneous information sources. Important sources of information include not only traditional databases with structured data and queries, but also increasing numbers of non-traditional, semi- or unstructured collections such as Web sites, FTP archives, etc. As the number and variability of sources increases, new ways of automatically summarizing, discovering, and selecting collections relevant to a user's query are needed. One such method involves the use of classification schemes, such as the Library of Congress Classification (LCC) [10], within which a collection may be represented based on its content, irrespective of the structure of the actual data or documents. For such a system to be useful in a large-scale distributed environment, it must be easy to use for both collection managers and users. As a result, it must be possible to classify documents automatically within a classification scheme. Furthermore, there must be a straightforward and intuitive interface with which the user may use the scheme to assist in information retrieval (IR).
