Search (143 results, page 1 of 8)

  • type_ss:"a"
  • type_ss:"el"
  • year_i:[2000 TO 2010}
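  The filters above are Solr facet constraints; the year filter uses Solr's half-open range syntax, where [ includes the bound and } excludes it (so 2000 through 2009 here). A minimal sketch of issuing such a filtered, paged query against a Solr endpoint - the host, core name, and free-text query are assumptions, not taken from this page:

```python
import requests

SOLR_URL = "http://localhost:8983/solr/literature/select"  # hypothetical core

params = {
    "q": "ontology retrieval",            # free-text query (assumed)
    "fq": [                               # filter queries mirroring the facets above
        'type_ss:"a"',
        'type_ss:"el"',
        "year_i:[2000 TO 2010}",          # 2000 inclusive, 2010 exclusive
    ],
    "rows": 20,                           # hits per page, as in this listing
    "start": 0,                           # offset 0 = page 1
    "debugQuery": "true",                 # emits score explanations like those below
    "wt": "json",
}

response = requests.get(SOLR_URL, params=params)
print(response.json()["response"]["numFound"])  # e.g. 143
```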
  1. Beppler, F.D.; Fonseca, F.T.; Pacheco, R.C.S.: Hermeneus: an architecture for an ontology-enabled information retrieval (2008) 0.02
    0.021399792 = product of:
      0.07489927 = sum of:
        0.027943838 = weight(_text_:system in 3261) [ClassicSimilarity], result of:
          0.027943838 = score(doc=3261,freq=6.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.36163113 = fieldWeight in 3261, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=3261)
        0.011207362 = weight(_text_:information in 3261) [ClassicSimilarity], result of:
          0.011207362 = score(doc=3261,freq=10.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.2602176 = fieldWeight in 3261, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=3261)
        0.025775949 = weight(_text_:retrieval in 3261) [ClassicSimilarity], result of:
          0.025775949 = score(doc=3261,freq=6.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.34732026 = fieldWeight in 3261, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=3261)
        0.009972124 = product of:
          0.019944249 = sum of:
            0.019944249 = weight(_text_:22 in 3261) [ClassicSimilarity], result of:
              0.019944249 = score(doc=3261,freq=2.0), product of:
                0.085914485 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02453417 = queryNorm
                0.23214069 = fieldWeight in 3261, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3261)
          0.5 = coord(1/2)
      0.2857143 = coord(4/14)
    
    Abstract
     Ontologies improve IR systems regarding their retrieval and presentation of information, making the task of finding information more effective, efficient, and interactive. In this paper we argue that ontologies also greatly improve the engineering of such systems. We created a framework that uses an ontology to drive the process of engineering an IR system. We developed a prototype that shows how a domain specialist without knowledge of the IR field can build an IR system with interactive components. The resulting system supports users not only in finding the information they need but also in extending their state of knowledge. In this way, our approach to ontology-enabled information retrieval addresses both the engineering aspect described here and the usability aspect described elsewhere.
    Date
    28.11.2016 12:43:22
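     The indented trace under each hit is Lucene's ClassicSimilarity explanation: every matching term contributes queryWeight x fieldWeight, where queryWeight = idf x queryNorm, fieldWeight = tf x idf x fieldNorm, tf = sqrt(termFreq), and idf = 1 + ln(maxDocs / (docFreq + 1)); the per-term sum is then scaled by coord, the fraction of query clauses that matched (4/14 above). A small sketch that recomputes the figures of the first clause, assuming exactly these ClassicSimilarity formulas:

```python
import math

def idf(doc_freq: int, max_docs: int) -> float:
    # ClassicSimilarity: idf(t) = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def clause_score(freq: float, doc_freq: int, max_docs: int,
                 query_norm: float, field_norm: float) -> float:
    tf = math.sqrt(freq)                                      # tf(freq) = sqrt(freq)
    query_weight = idf(doc_freq, max_docs) * query_norm       # 0.07727166 for "system"
    field_weight = tf * idf(doc_freq, max_docs) * field_norm  # 0.36163113
    return query_weight * field_weight

# Figures from the weight(_text_:system in 3261) clause of hit 1
print(idf(5152, 44218))                                      # ~3.1495528
print(clause_score(6.0, 5152, 44218, 0.02453417, 0.046875))  # ~0.027943838
# Final document score: the sum of the four clause scores, scaled by coord(4/14)
print(0.07489927 * 4 / 14)                                   # ~0.021399792
```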
  2. Paralic, J.; Kostial, I.: Ontology-based information retrieval (2003) 0.02
    0.015795203 = product of:
      0.07371095 = sum of:
        0.026618723 = weight(_text_:system in 1153) [ClassicSimilarity], result of:
          0.026618723 = score(doc=1153,freq=4.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.34448233 = fieldWeight in 1153, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1153)
        0.008269517 = weight(_text_:information in 1153) [ClassicSimilarity], result of:
          0.008269517 = score(doc=1153,freq=4.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.1920054 = fieldWeight in 1153, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1153)
        0.038822707 = weight(_text_:retrieval in 1153) [ClassicSimilarity], result of:
          0.038822707 = score(doc=1153,freq=10.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.5231199 = fieldWeight in 1153, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1153)
      0.21428572 = coord(3/14)
    
    Abstract
     In this article a new, ontology-based approach to information retrieval (IR) is presented. The system is based on a domain knowledge representation schema in the form of an ontology. New resources registered within the system are linked to concepts from this ontology. In this way resources may be retrieved based on associations, and not only on partial or exact term matching as the vector model presumes. In order to evaluate the quality of this retrieval mechanism, experiments measuring retrieval efficiency have been performed on the well-known Cystic Fibrosis collection of medical scientific papers. The ontology-based retrieval mechanism has been compared with traditional full-text search based on the vector IR model as well as with the Latent Semantic Indexing method.
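     For contrast with the association-based retrieval described above, here is a minimal sketch of the vector-model baseline the authors compare against: ranking by cosine similarity over term-frequency vectors (toy documents, purely illustrative):

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-frequency vectors
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = {
    "d1": Counter("cystic fibrosis lung therapy".split()),
    "d2": Counter("gene therapy clinical trial".split()),
}
query = Counter("cystic fibrosis therapy".split())
ranking = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
print(ranking)  # ['d1', 'd2'] - d1 shares more query terms
```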
  3. Fang, L.: ¬A developing search service : heterogeneous resources integration and retrieval system (2004) 0.01
    0.012306197 = product of:
      0.05742892 = sum of:
        0.02328653 = weight(_text_:system in 1193) [ClassicSimilarity], result of:
          0.02328653 = score(doc=1193,freq=6.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.30135927 = fieldWeight in 1193, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1193)
        0.009339468 = weight(_text_:information in 1193) [ClassicSimilarity], result of:
          0.009339468 = score(doc=1193,freq=10.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.21684799 = fieldWeight in 1193, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1193)
        0.024802918 = weight(_text_:retrieval in 1193) [ClassicSimilarity], result of:
          0.024802918 = score(doc=1193,freq=8.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.33420905 = fieldWeight in 1193, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1193)
      0.21428572 = coord(3/14)
    
    Abstract
     This article describes two approaches for searching heterogeneous resources, which are explained as they are used in two corresponding existing systems: RIRS (Resource Integration Retrieval System) and HRUSP (Heterogeneous Resource Union Search Platform). On analyzing the existing systems, a possible framework, the MUSP (Multimetadata-Based Union Search Platform), is presented. Libraries now face a dilemma. On one hand, libraries subscribe to many types of database retrieval systems produced by various providers, and they build their data and information systems independently. This results in highly heterogeneous and distributed systems at the technical level (e.g., different operating systems and user interfaces) and at the conceptual level (e.g., the same objects are named using different terms). On the other hand, end users want to access all these heterogeneous data via a single union interface, without having to know the structure of each information system or the different retrieval methods the systems use. Libraries must achieve a harmony between information providers and users. In order to bridge the gap between service providers and users, it would seem that all source databases would need to be rebuilt according to a uniform data structure and query language, but this seems impossible. Fortunately, however, libraries and information and technology providers are now making an effort to find a middle course that meets the requirements of both data providers and users. They are doing this through resource integration.
    Theme
    Information Gateway
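     The "middle course" described in the abstract above - one union interface over heterogeneous backends that are not rebuilt - is essentially a metasearch fan-out. A minimal sketch of that pattern; the source adapters and their normalized result shape are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical adapters: each wraps one heterogeneous backend and returns
# results normalized to a common (title, year, score) shape.
def search_opac(query):
    return [("OPAC record for " + query, 2003, 0.8)]

def search_document_server(query):
    return [("Full text for " + query, 2004, 0.6)]

SOURCES = [search_opac, search_document_server]

def union_search(query):
    # Fan the query out to every backend in parallel, then merge by score,
    # so the user never sees the differing retrieval methods behind it.
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(lambda source: source(query), SOURCES))
    merged = [hit for hits in result_lists for hit in hits]
    return sorted(merged, key=lambda hit: hit[2], reverse=True)

print(union_search("digital library"))
```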
  4. Ding, L.; Finin, T.; Joshi, A.; Peng, Y.; Cost, R.S.; Sachs, J.; Pan, R.; Reddivari, P.; Doshi, V.: Swoogle : a Semantic Web search and metadata engine (2004) 0.01
    0.0109178955 = product of:
      0.050950177 = sum of:
        0.022816047 = weight(_text_:system in 4704) [ClassicSimilarity], result of:
          0.022816047 = score(doc=4704,freq=4.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.29527056 = fieldWeight in 4704, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=4704)
        0.0070881573 = weight(_text_:information in 4704) [ClassicSimilarity], result of:
          0.0070881573 = score(doc=4704,freq=4.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.16457605 = fieldWeight in 4704, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=4704)
        0.021045974 = weight(_text_:retrieval in 4704) [ClassicSimilarity], result of:
          0.021045974 = score(doc=4704,freq=4.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.2835858 = fieldWeight in 4704, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=4704)
      0.21428572 = coord(3/14)
    
    Abstract
     Swoogle is a crawler-based indexing and retrieval system for the Semantic Web, i.e., for Web documents in RDF or OWL. It extracts metadata for each discovered document and computes relations between documents. Discovered documents are also indexed by an information retrieval system which can use either character N-grams or URIrefs as keywords to find relevant documents and to compute the similarity among a set of documents. One of the interesting properties we compute is rank, a measure of the importance of a Semantic Web document.
    Source
    CIKM '04 Proceedings of the thirteenth ACM international conference on Information and knowledge management
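     A minimal sketch of the character N-gram tokenization mentioned in the abstract above as one of Swoogle's two keyword options; n = 3 is an arbitrary choice for illustration:

```python
def char_ngrams(text: str, n: int = 3) -> list[str]:
    # Slide an n-character window across the text
    return [text[i:i + n] for i in range(len(text) - n + 1)]

print(char_ngrams("ontology"))
# ['ont', 'nto', 'tol', 'olo', 'log', 'ogy']
```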
  5. Birmingham, W.; Pardo, B.; Meek, C.; Shifrin, J.: ¬The MusArt music-retrieval system (2002) 0.01
    0.009874061 = product of:
      0.04607895 = sum of:
        0.021511177 = weight(_text_:system in 1205) [ClassicSimilarity], result of:
          0.021511177 = score(doc=1205,freq=8.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.27838376 = fieldWeight in 1205, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03125 = fieldNorm(doc=1205)
        0.0047254385 = weight(_text_:information in 1205) [ClassicSimilarity], result of:
          0.0047254385 = score(doc=1205,freq=4.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.10971737 = fieldWeight in 1205, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=1205)
        0.019842334 = weight(_text_:retrieval in 1205) [ClassicSimilarity], result of:
          0.019842334 = score(doc=1205,freq=8.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.26736724 = fieldWeight in 1205, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=1205)
      0.21428572 = coord(3/14)
    
    Abstract
    Music websites are ubiquitous, and music downloads, such as MP3, are a major source of Web traffic. As the amount of musical content increases and the Web becomes an important mechanism for distributing music, we expect to see a rising demand for music search services. Many currently available music search engines rely on file names, song title, composer or performer as the indexing and retrieval mechanism. These systems do not make use of the musical content. We believe that a more natural, effective, and usable music-information retrieval (MIR) system should have audio input, where the user can query with musical content. We are developing a system called MusArt for audio-input MIR. With MusArt, as with other audio-input MIR systems, a user sings or plays a theme, hook, or riff from the desired piece of music. The system transcribes the query and searches for related themes in a database, returning the most similar themes, given some measure of similarity. We call this "retrieval by query." In this paper, we describe the architecture of MusArt. An important element of MusArt is metadata creation: we believe that it is essential to automatically abstract important musical elements, particularly themes. Theme extraction is performed by a subsystem called MME, which we describe later in this paper. Another important element of MusArt is its support for a variety of search engines, as we believe that MIR is too complex for a single approach to work for all queries. Currently, MusArt supports a dynamic time-warping search engine that has high recall, and a complementary stochastic search engine that searches over themes, emphasizing speed and relevancy. The stochastic search engine is discussed in this paper.
    Theme
    Information Gateway
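     The dynamic time-warping engine mentioned in the abstract above aligns a sung query with stored themes even when the tempo drifts. A minimal sketch of the classic DTW distance on two pitch sequences (toy MIDI pitches, not MusArt's actual implementation):

```python
def dtw(a, b):
    # Classic dynamic-programming DTW distance between two sequences
    INF = float("inf")
    n, m = len(a), len(b)
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # insertion
                                 d[i][j - 1],      # deletion
                                 d[i - 1][j - 1])  # match
    return d[n][m]

# A hummed query with one repeated note vs. a stored theme (MIDI pitches)
print(dtw([60, 62, 64, 64, 65], [60, 62, 64, 65]))  # 0.0 - the repetition aligns at no cost
```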
  6. Whitney, C.; Schiff, L.: ¬The Melvyl Recommender Project : developing library recommendation services (2006) 0.01
    0.009485896 = product of:
      0.044267513 = sum of:
        0.016133383 = weight(_text_:system in 1173) [ClassicSimilarity], result of:
          0.016133383 = score(doc=1173,freq=2.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.20878783 = fieldWeight in 1173, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=1173)
        0.0070881573 = weight(_text_:information in 1173) [ClassicSimilarity], result of:
          0.0070881573 = score(doc=1173,freq=4.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.16457605 = fieldWeight in 1173, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=1173)
        0.021045974 = weight(_text_:retrieval in 1173) [ClassicSimilarity], result of:
          0.021045974 = score(doc=1173,freq=4.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.2835858 = fieldWeight in 1173, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1173)
      0.21428572 = coord(3/14)
    
    Abstract
     Popular commercial online services such as Google, eBay, Amazon, and Netflix have evolved quickly over the last decade to help people find what they want, developing information retrieval strategies such as usefully ranked results, spelling correction, and recommender systems. Online library catalogs (OPACs), in contrast, have changed little and are notoriously difficult for patrons to use (University of California Libraries, 2005). Over the past year (June 2005 to the present), the Melvyl Recommender Project (California Digital Library, 2005) has been exploring methods for, and the feasibility of, closing the gap between the features that library patrons want and have come to expect from information retrieval systems and what libraries are currently equipped to deliver. The project team conducted exploratory work in five topic areas: relevance ranking, auto-correction, use of a text-based discovery system, user interface strategies, and recommending. This article focuses specifically on the recommending portion of the project and potential extensions to that work.
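     A minimal sketch of the item-to-item recommending idea the project explored, here using co-occurrence in toy checkout histories as the signal; the project's actual method may differ:

```python
from collections import defaultdict
from itertools import combinations

# Toy checkout histories: each set is one patron's borrowed items
histories = [{"A", "B", "C"}, {"A", "B"}, {"B", "C"}, {"A", "D"}]

cooc = defaultdict(int)
for items in histories:
    for x, y in combinations(sorted(items), 2):
        cooc[(x, y)] += 1
        cooc[(y, x)] += 1

def recommend(item, k=2):
    # Rank other items by how often they were borrowed alongside `item`
    scores = {y: c for (x, y), c in cooc.items() if x == item}
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend("A"))  # ['B', 'C'] - B co-occurs with A most often
```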
  7. Summann, F.; Lossau, N.: Search engine technology and digital libraries : moving from theory to practice (2004) 0.01
    0.0086287 = product of:
      0.040267266 = sum of:
        0.021511177 = weight(_text_:system in 1196) [ClassicSimilarity], result of:
          0.021511177 = score(doc=1196,freq=8.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.27838376 = fieldWeight in 1196, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.03125 = fieldNorm(doc=1196)
        0.0047254385 = weight(_text_:information in 1196) [ClassicSimilarity], result of:
          0.0047254385 = score(doc=1196,freq=4.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.10971737 = fieldWeight in 1196, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=1196)
        0.014030648 = weight(_text_:retrieval in 1196) [ClassicSimilarity], result of:
          0.014030648 = score(doc=1196,freq=4.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.18905719 = fieldWeight in 1196, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=1196)
      0.21428572 = coord(3/14)
    
    Abstract
     This article describes the journey from the conception of and vision for a modern search-engine-based search environment to its technological realisation. In doing so, it takes up the thread of an earlier article on this subject, this time from a technical viewpoint. As well as presenting the conceptual considerations of the initial stages, this article will principally elucidate the technological aspects of this journey. The starting point for the deliberations about the development of an academic search engine was the experience we gained through the generally successful project "Digital Library NRW", in which from 1998 to 2000, with Bielefeld University Library in overall charge, we designed a system model for an Internet-based library portal with an improved academic search environment at its core. At the heart of this system was a metasearch with an availability function, to which we added a user interface integrating all relevant source material for study and research. The deficiencies of this approach were felt soon after the system was launched in June 2001. There were problems with the stability and performance of the database retrieval system, with the integration of full-text documents and Internet pages, and with acceptance by users, because users are increasingly performing searches themselves using search engines rather than going to the library for help. Since a long list of problems is also encountered when using commercial search engines for academic purposes (in particular the retrieval of academic information and long-term availability), the idea was born of a search engine configured specifically for academic use. We also hoped that with a single access point founded on improved search engine technology, we could access the heterogeneous academic resources of subject-based bibliographic databases, catalogues, electronic newspapers, document servers and academic web pages.
    Theme
    Information Gateway
  8. Linden, E.J. van der; Vliegen, R.; Wijk, J.J. van: Visual Universal Decimal Classification (2007) 0.01
    0.008542442 = product of:
      0.039864726 = sum of:
        0.02328653 = weight(_text_:system in 548) [ClassicSimilarity], result of:
          0.02328653 = score(doc=548,freq=6.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.30135927 = fieldWeight in 548, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=548)
        0.004176737 = weight(_text_:information in 548) [ClassicSimilarity], result of:
          0.004176737 = score(doc=548,freq=2.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.09697737 = fieldWeight in 548, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=548)
        0.012401459 = weight(_text_:retrieval in 548) [ClassicSimilarity], result of:
          0.012401459 = score(doc=548,freq=2.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.16710453 = fieldWeight in 548, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=548)
      0.21428572 = coord(3/14)
    
    Abstract
     UDC aims to be a consistent and complete classification system that enables practitioners to classify documents swiftly and smoothly. The eventual goal of UDC is to enable the public at large to retrieve documents from large collections of documents that are classified with UDC. The large size of the UDC Master Reference File (MRF), with over 66,000 records, makes it difficult to obtain an overview and to understand its structure. Moreover, finding the right classification in the MRF turns out to be difficult in practice. Last but not least, retrieval of documents requires insight into and understanding of the coding system. Visualization is an effective means to support the development of UDC as well as its use by practitioners. Moreover, visualization offers possibilities to use the classification without using the coding system as such. MagnaView has developed an application which demonstrates the use of interactive visualization to face these challenges. In our presentation, we discuss these challenges and give a demonstration of the way the application helps face them. Examples of visualizations can be found below.
    Content
     Paper presented at the 'UDC Seminar: Information Access for the Global Community, The Hague, 4-5 June 2007'. - See: http://www.udcc.org/seminar07/presentations/magnaview.pdf.
  9. Zhang, J.; Mostafa, J.; Tripathy, H.: Information retrieval by semantic analysis and visualization of the concept space of D-Lib® magazine (2002) 0.01
    0.008537784 = product of:
      0.039842993 = sum of:
        0.013444485 = weight(_text_:system in 1211) [ClassicSimilarity], result of:
          0.013444485 = score(doc=1211,freq=8.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.17398985 = fieldWeight in 1211, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1211)
        0.008860197 = weight(_text_:information in 1211) [ClassicSimilarity], result of:
          0.008860197 = score(doc=1211,freq=36.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.20572007 = fieldWeight in 1211, product of:
              6.0 = tf(freq=36.0), with freq of:
                36.0 = termFreq=36.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1211)
        0.017538311 = weight(_text_:retrieval in 1211) [ClassicSimilarity], result of:
          0.017538311 = score(doc=1211,freq=16.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.23632148 = fieldWeight in 1211, product of:
              4.0 = tf(freq=16.0), with freq of:
                16.0 = termFreq=16.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1211)
      0.21428572 = coord(3/14)
    
    Abstract
    In this article we present a method for retrieving documents from a digital library through a visual interface based on automatically generated concepts. We used a vocabulary generation algorithm to generate a set of concepts for the digital library and a technique called the max-min distance technique to cluster them. Additionally, the concepts were visualized in a spring embedding graph layout to depict the semantic relationship among them. The resulting graph layout serves as an aid to users for retrieving documents. An online archive containing the contents of D-Lib Magazine from July 1995 to May 2002 was used to test the utility of an implemented retrieval and visualization system. We believe that the method developed and tested can be applied to many different domains to help users get a better understanding of online document collections and to minimize users' cognitive load during execution of search tasks. Over the past few years, the volume of information available through the World Wide Web has been expanding exponentially. Never has so much information been so readily available and shared among so many people. Unfortunately, the unstructured nature and huge volume of information accessible over networks have made it hard for users to sift through and find relevant information. To deal with this problem, information retrieval (IR) techniques have gained more intensive attention from both industrial and academic researchers. Numerous IR techniques have been developed to help deal with the information overload problem. These techniques concentrate on mathematical models and algorithms for retrieval. Popular IR models such as the Boolean model, the vector-space model, the probabilistic model and their variants are well established.
     From the user's perspective, however, it is still difficult to use current information retrieval systems. Users frequently have problems expressing their information needs and translating those needs into queries. This is partly due to the fact that information needs cannot be expressed appropriately in system terms. It is not unusual for users to input search terms that are different from the index terms information systems use. Various methods have been proposed to help users choose search terms and articulate queries. One widely used approach is to incorporate into the information system a thesaurus-like component that represents both the important concepts in a particular subject area and the semantic relationships among those concepts. Unfortunately, the development and use of thesauri is not without its own problems. The thesaurus employed in a specific information system has often been developed for a general subject area and needs significant enhancement to be tailored to the information system where it is to be used. This thesaurus development process, if done manually, is both time consuming and labor intensive. Usage of a thesaurus in searching is complex and may raise barriers for the user. For illustration purposes, let us consider two scenarios of thesaurus usage. In the first scenario the user inputs a search term and the thesaurus then displays a matching set of related terms. Without an overview of the thesaurus - and without the ability to see the matching terms in the context of other terms - it may be difficult to assess the quality of the related terms in order to select the correct term. In the second scenario the user browses the whole thesaurus, which is organized as an alphabetically ordered list. The problem with this approach is that the list may be long, and it does not show users the global semantic relationships among all the listed terms.
     Nevertheless, because thesaurus use has been shown to improve retrieval, for our method we integrate functions in the search interface that permit users to explore built-in search vocabularies to improve retrieval from digital libraries. Our method automatically generates the terms and their semantic relationships representing relevant topics covered in a digital library. We call these generated terms the "concepts", and the generated terms and their semantic relationships we call the "concept space". Additionally, we used a visualization technique to display the concept space and allow users to interact with this space. The automatically generated term set is considered to be more representative of the subject area of a corpus than an "externally" imposed thesaurus, and our method also has the potential to save a significant amount of time and labor for those who have been manually creating thesauri. Information visualization is an emerging discipline that has developed very quickly over the last decade. With growing volumes of documents and associated complexities, information visualization has become increasingly important. Researchers have found information visualization to be an effective way to use and understand information while minimizing a user's cognitive load. Our work was based on an algorithmic approach to concept discovery and association. Concepts are discovered using an algorithm based on an automated thesaurus generation procedure. Subsequently, similarities among terms are computed using the cosine measure, and the associations among terms are established using a method known as max-min distance clustering. The concept space is then visualized in a spring embedding graph, which roughly shows the semantic relationships among concepts in a 2-D visual representation. The semantic space of the visualization is used as a medium for users to retrieve the desired documents. In the remainder of this article, we present our algorithmic approach to concept generation and clustering, followed by a description of the visualization technique and interactive interface. The paper ends with key conclusions and discussion of future work.
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
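     A minimal sketch of the max-min distance idea used above to cluster concepts: each new cluster seed is the point whose minimum distance to the seeds already chosen is maximal (farthest-first traversal; toy 2-D points stand in for term vectors):

```python
def max_min_centers(points, dist, k):
    # Farthest-first traversal: start anywhere, then repeatedly add the
    # point whose minimum distance to the chosen centers is maximal.
    centers = [points[0]]
    while len(centers) < k:
        candidates = [p for p in points if p not in centers]
        best = max(candidates, key=lambda p: min(dist(p, c) for c in centers))
        centers.append(best)
    return centers

points = [(0, 0), (0, 1), (5, 5), (6, 5), (0, 9)]
dist = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
print(max_min_centers(points, dist, 3))  # [(0, 0), (0, 9), (6, 5)] - well spread out
```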
  10. Veen, T. van; Oldroyd, B.: Search and retrieval in The European Library : a new approach (2004) 0.01
    0.008378824 = product of:
      0.03910118 = sum of:
        0.013444485 = weight(_text_:system in 1164) [ClassicSimilarity], result of:
          0.013444485 = score(doc=1164,freq=2.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.17398985 = fieldWeight in 1164, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1164)
        0.004176737 = weight(_text_:information in 1164) [ClassicSimilarity], result of:
          0.004176737 = score(doc=1164,freq=2.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.09697737 = fieldWeight in 1164, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1164)
        0.021479957 = weight(_text_:retrieval in 1164) [ClassicSimilarity], result of:
          0.021479957 = score(doc=1164,freq=6.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.28943354 = fieldWeight in 1164, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1164)
      0.21428572 = coord(3/14)
    
    Abstract
    The objective of the European Library (TEL) project [TEL] was to set up a co-operative framework and specify a system for integrated access to the major collections of the European national libraries. This has been achieved by successfully applying a new approach for search and retrieval via URLs (SRU) [ZiNG] combined with a new metadata paradigm. One aim of the TEL approach is to have a low barrier of entry into TEL, and this has driven our choice for the technical solution described here. The solution comprises portal and client functionality running completely in the browser, resulting in a low implementation barrier and maximum scalability, as well as giving users control over the search interface and what collections to search. In this article we will describe, step by step, the development of both the search and retrieval architecture and the metadata infrastructure in the European Library project. We will show that SRU is a good alternative to the Z39.50 protocol and can be implemented without losing investments in current Z39.50 implementations. The metadata model being used by TEL is a Dublin Core Application Profile, and we have taken into account that functional requirements will change over time and therefore the metadata model will need to be able to evolve in a controlled way. We make this possible by means of a central metadata registry containing all characteristics of the metadata in TEL. Finally, we provide two scenarios to show how the TEL concept can be developed and extended, with applications capable of increasing their functionality by "learning" new metadata or protocol options.
    Theme
    Information Gateway
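     SRU ("Search/Retrieve via URL") expresses a search as plain URL parameters, which is what allows the TEL portal and client functionality to run entirely in the browser. A minimal sketch of composing an SRU 1.1 searchRetrieve request with a CQL query; the endpoint is a placeholder:

```python
from urllib.parse import urlencode

BASE = "http://example.org/sru"               # placeholder endpoint

params = {
    "version": "1.1",
    "operation": "searchRetrieve",
    "query": 'dc.title = "digital library"',  # CQL query
    "maximumRecords": 10,
    "recordSchema": "dc",                     # ask for Dublin Core records
}
print(BASE + "?" + urlencode(params))
```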
  11. Doerr, M.: Semantic problems of thesaurus mapping (2001) 0.01
    0.007997492 = product of:
      0.03732163 = sum of:
        0.019013375 = weight(_text_:system in 5902) [ClassicSimilarity], result of:
          0.019013375 = score(doc=5902,freq=4.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.24605882 = fieldWeight in 5902, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5902)
        0.005906798 = weight(_text_:information in 5902) [ClassicSimilarity], result of:
          0.005906798 = score(doc=5902,freq=4.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.13714671 = fieldWeight in 5902, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5902)
        0.012401459 = weight(_text_:retrieval in 5902) [ClassicSimilarity], result of:
          0.012401459 = score(doc=5902,freq=2.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.16710453 = fieldWeight in 5902, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5902)
      0.21428572 = coord(3/14)
    
    Abstract
     With networked information access to heterogeneous data sources, the problem of terminology provision and interoperability of controlled vocabulary schemes such as thesauri becomes increasingly urgent. Solutions are needed to improve the performance of full-text retrieval systems and to guide the design of controlled terminology schemes for use in structured data, including metadata. Thesauri are created in different languages, with different scope and points of view and at different levels of abstraction and detail, to accommodate access to a specific group of collections. In any wider search accessing distributed collections, the user would like to start with familiar terminology and let the system find out the correspondences to other terminologies in order to retrieve equivalent results from all addressed collections. This paper investigates possible semantic differences that may hinder the unambiguous mapping and transition from one thesaurus to another. It focuses on the differences in meaning of terms and their relations as intended by their creators for indexing and querying a specific collection, in contrast to methods investigating the statistical relevance of terms for objects in a collection. It develops a notion of optimal mapping, paying particular attention to the intellectual quality of mappings between terms from different vocabularies and to problems of polysemy. Proposals are made to limit the vagueness introduced by the transition from one vocabulary to another. The paper shows ways in which thesaurus creators can improve their methodology to meet the challenges of networked access to distributed collections created under varying conditions. For system implementers, the discussion will lead to a better understanding of the complexity of the problem.
    Source
     Journal of digital information. 1(2001) no.8
  12. Hagedorn, K.; Chapman, S.; Newman, D.: Enhancing search and browse using automated clustering of subject metadata (2007) 0.01
    0.0077201175 = product of:
      0.036027215 = sum of:
        0.016133383 = weight(_text_:system in 1168) [ClassicSimilarity], result of:
          0.016133383 = score(doc=1168,freq=2.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.20878783 = fieldWeight in 1168, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=1168)
        0.0050120843 = weight(_text_:information in 1168) [ClassicSimilarity], result of:
          0.0050120843 = score(doc=1168,freq=2.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.116372846 = fieldWeight in 1168, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=1168)
        0.014881751 = weight(_text_:retrieval in 1168) [ClassicSimilarity], result of:
          0.014881751 = score(doc=1168,freq=2.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.20052543 = fieldWeight in 1168, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.046875 = fieldNorm(doc=1168)
      0.21428572 = coord(3/14)
    
    Abstract
    The Web puzzle of online information resources often hinders end-users from effective and efficient access to these resources. Clustering resources into appropriate subject-based groupings may help alleviate these difficulties, but will it work with heterogeneous material? The University of Michigan and the University of California Irvine joined forces to test automatically enhancing metadata records using the Topic Modeling algorithm on the varied OAIster corpus. We created labels for the resulting clusters of metadata records, matched the clusters to an in-house classification system, and developed a prototype that would showcase methods for search and retrieval using the enhanced records. Results indicated that while the algorithm was somewhat time-intensive to run and using a local classification scheme had its drawbacks, precise clustering of records was achieved and the prototype interface proved that faceted classification could be powerful in helping end-users find resources.
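     A minimal sketch of the clustering step described above: fit a topic model over subject metadata and tag each record with its dominant topic. Gensim's LDA stands in here for the project's Topic Modeling algorithm, and the records are toy data:

```python
from gensim import corpora, models

# Toy subject-metadata records, already tokenized
records = [
    "solar energy photovoltaic cells".split(),
    "renewable energy wind turbines".split(),
    "medieval manuscripts illumination".split(),
    "manuscripts paleography archives".split(),
]

dictionary = corpora.Dictionary(records)
corpus = [dictionary.doc2bow(r) for r in records]
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, random_state=0)

# Tag each record with its dominant topic - its cluster label
for record, bow in zip(records, corpus):
    topic, _ = max(lda.get_document_topics(bow), key=lambda t: t[1])
    print(topic, " ".join(record))
```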
  13. Aitken, S.; Reid, S.: Evaluation of an ontology-based information retrieval tool (2000) 0.01
    0.007019364 = product of:
      0.049135543 = sum of:
        0.009450877 = weight(_text_:information in 2862) [ClassicSimilarity], result of:
          0.009450877 = score(doc=2862,freq=4.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.21943474 = fieldWeight in 2862, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0625 = fieldNorm(doc=2862)
        0.03968467 = weight(_text_:retrieval in 2862) [ClassicSimilarity], result of:
          0.03968467 = score(doc=2862,freq=8.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.5347345 = fieldWeight in 2862, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0625 = fieldNorm(doc=2862)
      0.14285715 = coord(2/14)
    
    Abstract
    This paper evaluates the use of an explicit domain ontology in an information retrieval tool. The evaluation compares the performance of ontology-enhanced retrieval with keyword retrieval for a fixed set of queries across several data sets. The robustness of the IR approach is assessed by comparing the performance of the tool on the original data set with that on previously unseen data.
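     The comparison described above reduces to computing standard effectiveness measures for each system over the fixed query set. A minimal sketch of precision and recall for a single query, with toy relevance judgments:

```python
def precision_recall(retrieved, relevant):
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

relevant = {"d1", "d3", "d7"}                          # toy relevance judgments
print(precision_recall(["d1", "d3", "d2"], relevant))  # ~(0.67, 0.67)
print(precision_recall(["d2", "d4", "d1"], relevant))  # ~(0.33, 0.33)
```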
  14. Scheir, P.; Pammer, V.; Lindstaedt, S.N.: Information retrieval on the Semantic Web : does it exist? (2007) 0.01
    0.006828477 = product of:
      0.047799338 = sum of:
        0.013075255 = weight(_text_:information in 4329) [ClassicSimilarity], result of:
          0.013075255 = score(doc=4329,freq=10.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.3035872 = fieldWeight in 4329, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4329)
        0.034724083 = weight(_text_:retrieval in 4329) [ClassicSimilarity], result of:
          0.034724083 = score(doc=4329,freq=8.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.46789268 = fieldWeight in 4329, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4329)
      0.14285715 = coord(2/14)
    
    Abstract
     Plenty of contemporary search approaches exist that are associated with the Semantic Web. But which of them qualify as information retrieval for the Semantic Web? Do such approaches exist at all? To answer these questions we take a look at the nature of the Semantic Web and the Semantic Desktop and at definitions of information and data retrieval. We survey current approaches that are referred to by their authors as information retrieval for the Semantic Web, or that use Semantic Web technology for search.
    Source
     Lernen - Wissen - Adaption : workshop proceedings / LWA 2007, Halle, September 2007. Martin Luther University Halle-Wittenberg, Institute for Informatics, Databases and Information Systems. Ed.: Alexander Hinneburg
  15. Schaefer, A.; Jordan, M.; Klas, C.-P.; Fuhr, N.: Active support for query formulation in virtual digital libraries : a case study with DAFFODIL (2005) 0.01
    0.0064334315 = product of:
      0.03002268 = sum of:
        0.013444485 = weight(_text_:system in 4296) [ClassicSimilarity], result of:
          0.013444485 = score(doc=4296,freq=2.0), product of:
            0.07727166 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.02453417 = queryNorm
            0.17398985 = fieldWeight in 4296, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4296)
        0.004176737 = weight(_text_:information in 4296) [ClassicSimilarity], result of:
          0.004176737 = score(doc=4296,freq=2.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.09697737 = fieldWeight in 4296, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4296)
        0.012401459 = weight(_text_:retrieval in 4296) [ClassicSimilarity], result of:
          0.012401459 = score(doc=4296,freq=2.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.16710453 = fieldWeight in 4296, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4296)
      0.21428572 = coord(3/14)
    
    Abstract
     Daffodil is a front-end to federated, heterogeneous digital libraries aimed at strategic support of users during the information seeking process. This is done by offering a variety of functions for searching, exploring and managing digital library objects. However, the distributed search increases response time, and the conceptual model of the underlying search processes is inherently weaker. This makes query formulation harder, and the resulting waiting times can be frustrating. In this paper, we investigate the concept of proactive support during the user's query formulation. To improve user efficiency and satisfaction, we implemented annotations, proactive support and error markers on the query form itself. These functions decrease the probability of syntactic or semantic errors in queries. Furthermore, the user is able to make better tactical decisions and feels more confident that the system handles the query properly. Evaluations with 30 subjects showed that user satisfaction is improved, whereas no conclusive results were obtained for efficiency.
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
  16. Borlund, P.: ¬The IIR evaluation model : a framework for evaluation of interactive information retrieval systems (2003) 0.01
    0.0062771165 = product of:
      0.043939814 = sum of:
        0.014176315 = weight(_text_:information in 922) [ClassicSimilarity], result of:
          0.014176315 = score(doc=922,freq=4.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.3291521 = fieldWeight in 922, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.09375 = fieldNorm(doc=922)
        0.029763501 = weight(_text_:retrieval in 922) [ClassicSimilarity], result of:
          0.029763501 = score(doc=922,freq=2.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.40105087 = fieldWeight in 922, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.09375 = fieldNorm(doc=922)
      0.14285715 = coord(2/14)
    
    Source
    Information Research. 8(2003), no.3
  17. Bradford, R.B.: Relationship discovery in large text collections using Latent Semantic Indexing (2006) 0.01
    0.0054449434 = product of:
      0.025409736 = sum of:
        0.0088404855 = weight(_text_:information in 1163) [ClassicSimilarity], result of:
          0.0088404855 = score(doc=1163,freq=14.0), product of:
            0.04306919 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.02453417 = queryNorm
            0.20526241 = fieldWeight in 1163, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.03125 = fieldNorm(doc=1163)
        0.009921167 = weight(_text_:retrieval in 1163) [ClassicSimilarity], result of:
          0.009921167 = score(doc=1163,freq=2.0), product of:
            0.07421378 = queryWeight, product of:
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.02453417 = queryNorm
            0.13368362 = fieldWeight in 1163, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.024915 = idf(docFreq=5836, maxDocs=44218)
              0.03125 = fieldNorm(doc=1163)
        0.0066480828 = product of:
          0.0132961655 = sum of:
            0.0132961655 = weight(_text_:22 in 1163) [ClassicSimilarity], result of:
              0.0132961655 = score(doc=1163,freq=2.0), product of:
                0.085914485 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.02453417 = queryNorm
                0.15476047 = fieldWeight in 1163, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1163)
          0.5 = coord(1/2)
      0.21428572 = coord(3/14)
    
    Abstract
    This paper addresses the problem of information discovery in large collections of text. For users, one of the key problems in working with such collections is determining where to focus their attention. In selecting documents for examination, users must be able to formulate reasonably precise queries. Queries that are too broad will greatly reduce the efficiency of information discovery efforts by overwhelming the users with peripheral information. In order to formulate efficient queries, a mechanism is needed to automatically alert users regarding potentially interesting information contained within the collection. This paper presents the results of an experiment designed to test one approach to generation of such alerts. The technique of latent semantic indexing (LSI) is used to identify relationships among entities of interest. Entity extraction software is used to pre-process the text of the collection so that the LSI space contains representation vectors for named entities in addition to those for individual terms. In the LSI space, the cosine of the angle between the representation vectors for two entities captures important information regarding the degree of association of those two entities. For appropriate choices of entities, determining the entity pairs with the highest mutual cosine values yields valuable information regarding the contents of the text collection. The test database used for the experiment consists of 150,000 news articles. The proposed approach for alert generation is tested using a counterterrorism analysis example. The approach is shown to have significant potential for aiding users in rapidly focusing on information of potential importance in large text collections. The approach also has value in identifying possible use of aliases.
    Source
    Proceedings of the Fourth Workshop on Link Analysis, Counterterrorism, and Security, SIAM Data Mining Conference, Bethesda, MD, 20-22 April, 2006. [http://www.siam.org/meetings/sdm06/workproceed/Link%20Analysis/15.pdf]
    Theme
    Semantisches Umfeld in Indexierung u. Retrieval
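     A minimal sketch of the LSI machinery the paper builds on: factor a term-document matrix with a truncated SVD, then measure entity association by the cosine between representation vectors in the reduced space (toy matrix, NumPy only):

```python
import numpy as np

# Toy term-document matrix (rows = terms/entities, columns = documents)
A = np.array([[2.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                               # keep the top-k latent dimensions
term_vecs = U[:, :k] * s[:k]        # term/entity representations in LSI space

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Pairs with a high mutual cosine are flagged as associated
print(cos(term_vecs[2], term_vecs[3]))  # a pair that co-occurs across documents
print(cos(term_vecs[0], term_vecs[3]))  # a pair that rarely co-occurs
```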
  18. Baeza-Yates, R.; Boldi, P.; Castillo, C.: Generalizing PageRank : damping functions for linkbased ranking algorithms (2006) 0.01
    Date
16.01.2016 10:22:28
    Source
    http://chato.cl/papers/baeza06_general_pagerank_damping_functions_link_ranking.pdf [Proceedings of the ACM Special Interest Group on Information Retrieval (SIGIR) Conference, SIGIR'06, August 6-10, 2006, Seattle, Washington, USA]
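This entry carries no abstract, so the following Python sketch only illustrates the idea the title announces: a link-based ranking in which the contribution of length-t paths is weighted by a damping function, classic PageRank being the geometric choice (1 - lambda) * lambda^t. The example graph, the truncation depth, and the alternative linear damping are assumptions for illustration.

```python
# Sketch of link-based ranking with a pluggable damping function:
# rank = sum over t of damping(t) * p0 @ P^t, truncated at T terms.
import numpy as np

# Illustrative 4-page link graph (row = linking page); no dangling rows.
A = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
P = A / A.sum(axis=1, keepdims=True)   # random-walk transition matrix

def functional_rank(P, damping, T=50):
    """Accumulate damping(t) * (uniform start) @ P^t for t = 0..T."""
    n = P.shape[0]
    p = np.full(n, 1.0 / n)            # distribution over length-t paths
    rank = np.zeros(n)
    for t in range(T + 1):
        rank += damping(t) * p
        p = p @ P
    return rank

lam = 0.85
# Geometric damping: recovers (truncated) classic PageRank.
geometric = functional_rank(P, lambda t: (1 - lam) * lam ** t)
# Linear damping, zero beyond t = 10; 5.5 makes the weights sum to 1.
linear = functional_rank(P, lambda t: max(0.0, 1 - t / 10) / 5.5)
print("geometric:", np.round(geometric, 3))
print("linear:   ", np.round(linear, 3))
```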
  19. Francu, V.: Does convenience trump accuracy? : the avatars of the UDC in Romania (2007) 0.00
    Abstract
This paper concentrates on some major issues regarding the potential of UDC and the current controversy about the use of UDC in Romania: i) the importance of hierarchical structures in controlled vocabularies, which directly improve information retrieval through the browsing function that lets users visualize the hierarchies in subject areas rather than just locate a particular topic; ii) the lack of popularity of the UDC as an indexing and information retrieval language among its users, be they librarians or end users of library OPACs; and iii) the situation of UDC teachers and teaching in Romanian universities.
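Point i) turns on the fact that UDC notation expresses hierarchy positionally: a broader class number is, by and large, a prefix of its subdivisions, which is what makes a browsing function over the hierarchy cheap to support. A minimal Python sketch, assuming a toy set of UDC-like plain class numbers and ignoring auxiliaries and facet indicators:

```python
# Hierarchy-aware browsing over UDC-like notations, where broader
# classes are prefixes of narrower ones (toy captions; real UDC
# auxiliary signs and combinations are out of scope here).
classes = {
    "0":      "Science and knowledge",
    "00":     "Prolegomena",
    "004":    "Computer science",
    "004.4":  "Software",
    "004.43": "Computer languages",
    "02":     "Librarianship",
    "025":    "Administrative departments of libraries",
    "025.4":  "Indexing languages. Classification",
}

def digits(notation: str) -> str:
    return notation.replace(".", "")

def narrower(notation: str):
    """Immediate subdivisions: exactly one digit longer."""
    base = digits(notation)
    return sorted(n for n in classes
                  if digits(n).startswith(base)
                  and len(digits(n)) == len(base) + 1)

def browse(notation: str, depth: int = 0):
    # Visualize the hierarchy rather than just locate a single topic.
    print("  " * depth + f"{notation}  {classes[notation]}")
    for child in narrower(notation):
        browse(child, depth + 1)

browse("0")    # walks 0 -> 00, 004 -> 004.4 -> 004.43, 02 -> 025 ...
```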
    Content
Paper presented at the 'UDC Seminar: Information Access for the Global Community, The Hague, 4-5 June 2007'. - http://www.udcc.org/seminar07/presentations/francu.pdf.
  20. Lavoie, B.; Connaway, L.S.; Dempsey, L.: Anatomy of aggregate collections : the example of Google print for libraries (2005) 0.00
    Abstract
Google's December 2004 announcement of its intention to collaborate with five major research libraries - Harvard University, the University of Michigan, Stanford University, the University of Oxford, and the New York Public Library - to digitize and surface their print book collections in the Google searching universe has, predictably, stirred conflicting opinion, with some viewing the project as a welcome opportunity to enhance the visibility of library collections in new environments, and others wary of Google's prospective role as gateway to these collections. The project has been vigorously debated on discussion lists and blogs, with the participating libraries commonly referred to as "the Google 5". One point most observers seem to concede is that the questions raised by this initiative are both timely and significant.
The Google Print Library Project (GPLP) has galvanized a long overdue, multi-faceted discussion about library print book collections. The print book is core to library identity and practice, but in an era of zero-sum budgeting, it is almost inevitable that print book budgets will decline as budgets for serials, digital resources, and other materials expand. As libraries re-allocate resources to accommodate changing patterns of user needs, print book budgets may be adversely impacted. Of course, the degree of impact will depend on a library's perceived mission. A public library may expect books to justify their shelf-space, with de-accession the consequence of minimal use. A national library, on the other hand, has a responsibility to the scholarly and cultural record and may seek to collect comprehensively within particular areas, with the attendant obligation to secure the long-term retention of its print book collections. The combination of limited budgets, changing user needs, and differences in library collection strategies underscores the need to think about a collective, or system-wide, print book collection - in particular, how can an inter-institutional system be organized to achieve goals that would be difficult, and/or prohibitively expensive, for any one library to undertake individually [4]?
Mass digitization programs like GPLP cast new light on these and other issues surrounding the future of library print book collections, but at this early stage, it is light that illuminates only dimly. It will be some time before GPLP's implications for libraries and library print book collections can be fully appreciated and evaluated. But the strong interest and lively debate generated by this initiative suggest that some preliminary analysis - premature though it may be - would be useful, if only to undertake a rough mapping of the terrain over which GPLP potentially will extend. At the least, some early perspective helps shape interesting questions for the future, when the boundaries of GPLP become settled, workflows for producing and managing the digitized materials become systematized, and usage patterns within the GPLP framework begin to emerge.
This article offers some perspectives on GPLP in light of what is known about library print book collections in general, and those of the Google 5 in particular, from information in OCLC's WorldCat bibliographic database and holdings file. Questions addressed include:
• Coverage: What proportion of the system-wide print book collection will GPLP potentially cover? What is the degree of holdings overlap across the print book collections of the five participating libraries?
• Language: What is the distribution of languages associated with the print books held by the GPLP libraries? Which languages are predominant?
• Copyright: What proportion of the GPLP libraries' print book holdings are out of copyright?
• Works: How many distinct works are represented in the holdings of the GPLP libraries? How does a focus on works impact coverage and holdings overlap?
• Convergence: What are the effects on coverage of using a different set of five libraries? What are the effects of adding the holdings of additional libraries to those of the GPLP libraries, and how do these effects vary by library type?
These questions certainly do not exhaust the analytical possibilities presented by GPLP. More in-depth analysis might look at Google 5 coverage in particular subject areas; it also would be interesting to see how many books covered by the GPLP have already been digitized in other contexts. However, these questions are left to future studies. The purpose here is to explore a few basic questions raised by GPLP, and in doing so, provide an empirical context for the debate that is sure to continue for some time to come. A secondary objective is to lay some groundwork for a general set of questions that could be used to explore the implications of any mass digitization initiative. A suggested list of questions is provided in the conclusion of the article.
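The Coverage question above reduces to set arithmetic over holdings data: model each library's holdings as a set of book identifiers, take the union for the aggregate Google 5 collection, and count how many of the five libraries hold each item to profile overlap. A Python sketch with toy holdings; the article itself derives these measures from OCLC's WorldCat bibliographic database and holdings file.

```python
# Coverage and holdings-overlap analysis over toy per-library holdings,
# each modeled as a set of book identifiers.
from collections import Counter

holdings = {
    "Harvard":  {"b1", "b2", "b3", "b5"},
    "Michigan": {"b2", "b3", "b6"},
    "Stanford": {"b1", "b3", "b7"},
    "Oxford":   {"b3", "b4"},
    "NYPL":     {"b2", "b4", "b8"},
}
google5 = set.union(*holdings.values())

# Stand-in for the system-wide print book collection.
system_wide = google5 | {"b9", "b10"}
print(f"coverage: {len(google5) / len(system_wide):.0%}")

# Overlap profile: how many of the five libraries hold each book?
copies = Counter(b for books in holdings.values() for b in books)
for held_by in range(1, len(holdings) + 1):
    n = sum(1 for c in copies.values() if c == held_by)
    print(f"held by {held_by} of 5: {n} book(s)")
```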
    Date
    26.12.2011 14:08:22

Languages

  • e (English) 133
  • d (German) 8
  • i (Italian) 1