Search (28 results, page 1 of 2)

  • type_ss:"a"
  • type_ss:"el"
  • year_i:[2000 TO 2010}
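  The filters above use Lucene/Solr field syntax: type_ss and year_i are field names, and [2000 TO 2010} is a range that includes 2000 but excludes 2010. The field-name suffixes and the ClassicSimilarity explain trees below suggest a Solr index; as a minimal sketch under that assumption (the endpoint URL, core name, and query terms here are hypothetical), the same hit list could be requested with one filter query per active facet:

      import requests  # assumption: the index is exposed through a standard Solr select handler

      SOLR_URL = "http://localhost:8983/solr/literature/select"  # hypothetical host and core name

      params = {
          "q": '_text_:"e.g" OR _text_:"22"',  # guess: the explain trees below weight the terms "e.g" and "22"
          "fq": [                              # one filter query per active facet shown above
              'type_ss:"a"',
              'type_ss:"el"',
              "year_i:[2000 TO 2010}",         # inclusive lower bound, exclusive upper bound
          ],
          "rows": 20,
          "wt": "json",
      }

      response = requests.get(SOLR_URL, params=params)
      print(response.json()["response"]["numFound"])  # should report the 28 hits noted above, if the assumptions hold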
  1. Fang, L.: A developing search service : heterogeneous resources integration and retrieval system (2004) 0.02
    0.023835853 = product of:
      0.047671705 = sum of:
        0.047671705 = product of:
          0.09534341 = sum of:
            0.09534341 = weight(_text_:e.g in 1193) [ClassicSimilarity], result of:
              0.09534341 = score(doc=1193,freq=4.0), product of:
                0.23393378 = queryWeight, product of:
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.044842023 = queryNorm
                0.40756583 = fieldWeight in 1193, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1193)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     This article describes two approaches for searching heterogeneous resources, which are explained as they are used in two corresponding existing systems: RIRS (Resource Integration Retrieval System) and HRUSP (Heterogeneous Resource Union Search Platform). On analyzing the existing systems, a possible framework, the MUSP (Multimetadata-Based Union Search Platform), is presented. Libraries now face a dilemma. On one hand, libraries subscribe to many types of database retrieval systems that are produced by various providers. The libraries build their data and information systems independently. This results in highly heterogeneous and distributed systems at the technical level (e.g., different operating systems and user interfaces) and at the conceptual level (e.g., the same objects are named using different terms). On the other hand, end users want to access all these heterogeneous data via a union interface, without having to know the structure of each information system or the different retrieval methods used by the systems. Libraries must achieve a harmony between information providers and users. In order to bridge the gap between the service providers and the users, it would seem that all source databases would need to be rebuilt according to a uniform data structure and query language, but this seems impossible. Fortunately, however, libraries and information and technology providers are now making an effort to find a middle course that meets the requirements of both data providers and users. They are doing this through resource integration.
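     The score breakdown shown for each hit is Lucene's ClassicSimilarity explain output: a term's contribution is queryWeight x fieldWeight, where queryWeight = idf x queryNorm and fieldWeight = tf x idf x fieldNorm, and each coord(1/2) halves the result because only one of two query clauses matched. A minimal sketch, assuming ClassicSimilarity's standard formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), reproduces the 0.02 shown for hit 1:

       import math

       # Values copied from the explain tree of hit 1 (doc 1193, term "e.g")
       freq, doc_freq, max_docs = 4.0, 651, 44218
       query_norm, field_norm = 0.044842023, 0.0390625

       tf = math.sqrt(freq)                             # 2.0
       idf = 1.0 + math.log(max_docs / (doc_freq + 1))  # 5.2168427
       query_weight = idf * query_norm                  # 0.23393378
       field_weight = tf * idf * field_norm             # 0.40756583

       term_score = query_weight * field_weight         # 0.09534341
       final_score = term_score * 0.5 * 0.5             # the two coord(1/2) factors

       print(final_score)  # ~0.0238, displayed above rounded to 0.02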
  2. Mimno, D.; Crane, G.; Jones, A.: Hierarchical catalog records : implementing a FRBR catalog (2005) 0.02
    0.023354271 = product of:
      0.046708543 = sum of:
        0.046708543 = product of:
          0.093417086 = sum of:
            0.093417086 = weight(_text_:e.g in 1183) [ClassicSimilarity], result of:
              0.093417086 = score(doc=1183,freq=6.0), product of:
                0.23393378 = queryWeight, product of:
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.044842023 = queryNorm
                0.39933133 = fieldWeight in 1183, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1183)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    IFLA's Functional Requirements for Bibliographic Records (FRBR) lay the foundation for a new generation of cataloging systems that recognize the difference between a particular work (e.g., Moby Dick), diverse expressions of that work (e.g., translations into German, Japanese and other languages), different versions of the same basic text (e.g., the Modern Library Classics vs. Penguin editions), and particular items (a copy of Moby Dick on the shelf). Much work has gone into finding ways to infer FRBR relationships between existing catalog records and modifying catalog interfaces to display those relationships. Relatively little work, however, has gone into exploring the creation of catalog records that are inherently based on the FRBR hierarchy of works, expressions, manifestations, and items. The Perseus Digital Library has created a new catalog that implements such a system for a small collection that includes many works with multiple versions. We have used this catalog to explore some of the implications of hierarchical catalog records for searching and browsing. Current online library catalog interfaces present many problems for searching. One commonly cited failure is the inability to find and collocate all versions of a distinct intellectual work that exist in a collection and the inability to take into account known variations in titles and personal names (Yee 2005). The IFLA Functional Requirements for Bibliographic Records (FRBR) attempts to address some of these failings by introducing the concept of multiple interrelated bibliographic entities (IFLA 1998). In particular, relationships between abstract intellectual works and the various published instances of those works are divided into a four-level hierarchy of works (such as the Aeneid), expressions (Robert Fitzgerald's translation of the Aeneid), manifestations (a particular paperback edition of Robert Fitzgerald's translation of the Aeneid), and items (my copy of a particular paperback edition of Robert Fitzgerald's translation of the Aeneid). In this formulation, each level in the hierarchy "inherits" information from the preceding level. Much of the work on FRBRized catalogs so far has focused on organizing existing records that describe individual physical books. Relatively little work has gone into rethinking what information should be in catalog records, or how the records should relate to each other. It is clear, however, that a more "native" FRBR catalog would include separate records for works, expressions, manifestations, and items. In this way, all information about a work would be centralized in one record. Records for subsequent expressions of that work would add only the information specific to each expression: Samuel Butler's translation of the Iliad does not need to repeat the fact that the work was written by Homer. This approach has certain inherent advantages for collections with many versions of the same works: new publications can be cataloged more quickly, and records can be stored and updated more efficiently.
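     The hierarchy described above lends itself to records in which each level stores only what is new at that level and reaches the rest through its parent. The sketch below is a hypothetical illustration of that idea, not the Perseus catalog's actual data model; the field names and sample values are invented:

       from dataclasses import dataclass
       from typing import Optional

       @dataclass
       class Work:                      # e.g. Homer's Iliad: the author is stated once, here
           title: str
           creator: str

       @dataclass
       class Expression:                # e.g. Samuel Butler's translation
           work: Work
           language: str
           translator: Optional[str] = None

       @dataclass
       class Manifestation:             # e.g. a particular paperback edition
           expression: Expression
           publisher: str
           year: int

       @dataclass
       class Item:                      # e.g. one copy on the shelf
           manifestation: Manifestation
           barcode: str

       iliad = Work(title="Iliad", creator="Homer")
       butler = Expression(work=iliad, language="en", translator="Samuel Butler")
       paperback = Manifestation(expression=butler, publisher="Example Press", year=1999)
       copy1 = Item(manifestation=paperback, barcode="0001")

       # The item record never repeats the author; it inherits it through the hierarchy.
       print(copy1.manifestation.expression.work.creator)  # Homer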
  3. Van der Veer Martens, B.: Do citation systems represent theories of truth? (2001) 0.02
    0.021480048 = product of:
      0.042960096 = sum of:
        0.042960096 = product of:
          0.08592019 = sum of:
            0.08592019 = weight(_text_:22 in 3925) [ClassicSimilarity], result of:
              0.08592019 = score(doc=3925,freq=4.0), product of:
                0.15702912 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044842023 = queryNorm
                0.54716086 = fieldWeight in 3925, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3925)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    22. 7.2006 15:22:28
  4. Pika, J.: Universal Decimal Classification at the ETH-Bibliothek Zürich : a Swiss perspective (2007) 0.02
    0.020225393 = product of:
      0.040450785 = sum of:
        0.040450785 = product of:
          0.08090157 = sum of:
            0.08090157 = weight(_text_:e.g in 5899) [ClassicSimilarity], result of:
              0.08090157 = score(doc=5899,freq=2.0), product of:
                0.23393378 = queryWeight, product of:
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.044842023 = queryNorm
                0.34583107 = fieldWeight in 5899, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5899)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     The ETH library has been using the UDC for the past twenty-five years, yet most users have hardly ever noticed it. Queries in today's NEBIS-OPAC (formerly ETHICS) are based on verbal searching with trilingual descriptors and corresponding related search terms, including synonyms as well as user-friendly expressions from scientific journals (scientific jargon), to facilitate the dialogue with the OPAC. A single UDC number, standing behind these descriptors, connects them to the related document titles, regardless of language. Thus the user actually works with the UDC without realizing it. This paper describes the experience with this OPAC and the work behind it.
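     The mechanism described above (multilingual descriptors that all resolve to one UDC number, which in turn collects the titles attached to it in any language) amounts to a two-step lookup. A small sketch of the idea, with invented descriptors and titles purely for illustration:

       # Hypothetical data: three language variants of one descriptor map to a single UDC number,
       # and the UDC number collects document titles regardless of their language.
       descriptor_to_udc = {
           "Kristallographie": "548",    # German
           "crystallography": "548",     # English
           "cristallographie": "548",    # French
       }
       udc_to_titles = {
           "548": [
               "Einfuehrung in die Kristallographie",
               "Introduction to crystallography",
           ],
       }

       def search(term: str) -> list[str]:
           udc = descriptor_to_udc.get(term)
           return udc_to_titles.get(udc, [])

       print(search("crystallography"))  # returns both titles, German and English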
  5. Qin, J.; Paling, S.: Converting a controlled vocabulary into an ontology : the case of GEM (2001) 0.02
    0.018226424 = product of:
      0.03645285 = sum of:
        0.03645285 = product of:
          0.0729057 = sum of:
            0.0729057 = weight(_text_:22 in 3895) [ClassicSimilarity], result of:
              0.0729057 = score(doc=3895,freq=2.0), product of:
                0.15702912 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044842023 = queryNorm
                0.46428138 = fieldWeight in 3895, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.09375 = fieldNorm(doc=3895)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    24. 8.2005 19:20:22
  6. Crane, G.; Jones, A.: Text, information, knowledge and the evolving record of humanity (2006) 0.02
    0.016854495 = product of:
      0.03370899 = sum of:
        0.03370899 = product of:
          0.06741798 = sum of:
            0.06741798 = weight(_text_:e.g in 1182) [ClassicSimilarity], result of:
              0.06741798 = score(doc=1182,freq=8.0), product of:
                0.23393378 = queryWeight, product of:
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.044842023 = queryNorm
                0.28819257 = fieldWeight in 1182, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=1182)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     Although the Alexandria Digital Library provides far richer data than the TGN (5.9 vs. 1.3 million names), its added size lowers, rather than increases, the accuracy of most geographic name identification systems for historical documents: most of the extra 4.6 million names cover low frequency entities that rarely occur in any particular corpus. The TGN is sufficiently comprehensive to provide quite enough noise: we find place names that are used over and over (there are almost one hundred Washingtons) and semantically ambiguous (e.g., is Washington a person or a place?). Comprehensive knowledge sources emphasize recall but lower precision. We need data with which to determine which "Tribune" or "John Brown" a particular passage denotes. Secondly and paradoxically, our reference works may not be comprehensive enough. Human actors come and go over time. Organizations appear and vanish. Even places can change their names or vanish. The TGN does associate the obsolete name Siam with the nation of Thailand (tgn,1000142) - but also with towns named Siam in Iowa (tgn,2035651), Tennessee (tgn,2101519), and Ohio (tgn,2662003). Prussia appears but as a general region (tgn,7016786), with no indication when or if it was a sovereign nation. And if places do point to the same object over time, that object may have very different significance over time: in the foundational works of Western historiography, Herodotus reminds us that the great cities of the past may be small today, and the small cities of today great tomorrow (Hdt. 1.5), while Thucydides stresses that we cannot estimate the past significance of a place by its appearance today (Thuc. 1.10). In other words, we need to know the population figures for the various Washingtons in 1870 if we are analyzing documents from 1870. The foundations have been laid for reference works that provide machine actionable information about entities at particular times in history. The Alexandria Digital Library Gazetteer Content Standard represents a sophisticated framework with which to create such resources: places can be associated with temporal information about their foundation (e.g., Washington, DC, founded on 16 July 1790), changes in names for the same location (e.g., Saint Petersburg to Leningrad and back again), population figures at various times and similar historically contingent data. But if we have the software and the data structures, we do not yet have substantial amounts of historical content such as plentiful digital gazetteers, encyclopedias, lexica, grammars and other reference works to illustrate many periods and, even if we do, those resources may not be in a useful form: raw OCR output of a complex lexicon or gazetteer may have so many errors and have captured so little of the underlying structure that the digital resource is useless as a knowledge base. Put another way, human beings are still much better at reading and interpreting the contents of page images than machines. While people, places, and dates are probably the most important core entities, we will find a growing set of objects that we need to identify and track across collections, and each of these categories of objects will require its own knowledge sources. The following section enumerates and briefly describes some existing categories of documents that we need to mine for knowledge. This brief survey focuses on the format of print sources (e.g., highly structured textual "database" vs. unstructured text) to illustrate some of the challenges involved in converting our published knowledge into semantically annotated, machine actionable form.
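     The historically contingent gazetteer entry described above (founding date, name changes, population figures at particular times) can be sketched as a small data structure. This is only an illustration of the idea with a reduced set of fields, not the Alexandria Digital Library Gazetteer Content Standard's actual schema; the identifier is an invented placeholder:

       from dataclasses import dataclass

       @dataclass
       class NamePeriod:
           name: str
           start: int               # year the name came into use
           end: int | None = None   # None = still current

       @dataclass
       class PlaceRecord:
           place_id: str
           names: list[NamePeriod]
           founded: int | None = None
           population: dict[int, int] | None = None  # year -> population figure

       # Illustrative entry for the renamings mentioned above.
       st_petersburg = PlaceRecord(
           place_id="example:1",
           names=[
               NamePeriod("Saint Petersburg", 1703, 1914),
               NamePeriod("Petrograd", 1914, 1924),
               NamePeriod("Leningrad", 1924, 1991),
               NamePeriod("Saint Petersburg", 1991),
           ],
           founded=1703,
       )

       def name_in(place: PlaceRecord, year: int) -> str | None:
           for period in place.names:
               if period.start <= year and (period.end is None or year < period.end):
                   return period.name
           return None

       print(name_in(st_petersburg, 1950))  # Leningrad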
  7. Heery, R.; Carpenter, L.; Day, M.: Renardus project developments and the wider digital library context (2001) 0.02
    0.016854495 = product of:
      0.03370899 = sum of:
        0.03370899 = product of:
          0.06741798 = sum of:
            0.06741798 = weight(_text_:e.g in 1219) [ClassicSimilarity], result of:
              0.06741798 = score(doc=1219,freq=8.0), product of:
                0.23393378 = queryWeight, product of:
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.044842023 = queryNorm
                0.28819257 = fieldWeight in 1219, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=1219)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     For those building digital library services, the organisational barriers are sometimes far more intractable than technological issues. This was firmly flagged in one of the first workshops focusing specifically on the digital library research agenda: Digital libraries are not simply technological constructs; they exist within a rich legal, social, and economic context, and will succeed only to the extent that they meet these broader needs. The innovatory drive within the development of digital library services thrives on the tension between meeting both technical and social imperatives. The Renardus project partners have previously taken part in projects establishing the technical basis for subject gateways (e.g., ROADS, DESIRE, EELS) and are aware that technical barriers to interoperability are outweighed by challenges relating to the organisational and business models used. Within the Renardus project there has been a determination to address these organisational and business issues from the beginning. Renardus intends initially to create a pilot service, targeting the European scholar with a single point of access to quality selected Web resources. Looking ahead beyond current project funding, it aims to create the organisational and technological infrastructure for a sustainable service. This means the project is concerned with the range of processes required to establish a viable service, and is explicitly addressing business issues as well as providing a technical infrastructure. The overall aim of Renardus is to establish a collaborative framework for European subject gateways that will benefit both users in terms of enhanced services, and the gateways themselves in terms of shared solutions. In order to achieve this aim, Renardus will provide firstly a pilot service for the European academic and research communities brokering access to those European-based information gateways that currently participate in the project; in other words, brokering to gateways that are already in existence. Secondly the project will explore ways to establish the organisational basis for co-operative efforts such as metadata sharing, joint technical solutions and agreement on standardisation. It is intended that this exploration will feed back valuable experience to the individual participating gateways to suggest ways their services can be enhanced.
     Funding from the UK Electronic Libraries (eLib) programme and the European Community's Fourth Framework programme assisted the initial emergence of information gateways (e.g., SOSIG, EEVL, OMNI in the UK, and EELS in Sweden). Other gateways have been developed by initiatives co-ordinated by national libraries (such as DutchESS in the Netherlands, and AVEL and EdNA in Australia) and by universities and research funding bodies (e.g., GEM in the US, the Finnish Virtual Library, and the German SSG-FI services). An account of the emergence of subject gateways since the mid-1990s by Dempsey gives an historical perspective -- informed by UK experience in particular -- and also considers the future development of subject gateways in relation to other services. When considering the development and future of gateways, it would be helpful to have a clear definition of the service offered by a so-called 'subject gateway'. Precise definitions of 'information gateways', 'subject gateways' and 'quality controlled subject gateways' have been debated elsewhere. Koch has reviewed definitions and suggested typologies that are useful, not least in showing the differences that exist between broadly similar services. Working definitions that we will use in this article are that a subject gateway provides a search service to high quality Web resources selected from a particular subject area, whereas information gateways have wider criteria for the selection of resources, e.g., a national approach. Inevitably, in a rapidly changing international environment, different people perceive different emphases in attempts to label services; the significant issue is that users, developers and designers can recognise and benefit from commonalities in approach.
  8. Markey, K.: The online library catalog : paradise lost and paradise regained? (2007) 0.02
    0.016685098 = product of:
      0.033370197 = sum of:
        0.033370197 = product of:
          0.06674039 = sum of:
            0.06674039 = weight(_text_:e.g in 1172) [ClassicSimilarity], result of:
              0.06674039 = score(doc=1172,freq=4.0), product of:
                0.23393378 = queryWeight, product of:
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.044842023 = queryNorm
                0.28529608 = fieldWeight in 1172, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.02734375 = fieldNorm(doc=1172)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     The impetus for this essay is the library community's uncertainty regarding the present and future direction of the library catalog in the era of Google and mass digitization projects. The uncertainty is evident at the highest levels. Deanna Marcum, Associate Librarian for Library Services at the Library of Congress (LC), is struck by undergraduate students who favor digital resources over the online library catalog because such resources are available at any time and from anywhere (Marcum, 2006). She suggests that "the detailed attention that we have been paying to descriptive cataloging may no longer be justified ... retooled catalogers could give more time to authority control, subject analysis, [and] resource identification and evaluation" (Marcum, 2006, 8). In an abrupt about-face, LC terminated series added entries in cataloging records, one of the few subject-rich fields in such records (Cataloging Policy and Support Office, 2006). Mann (2006b) and Schniderman (2006) cite evidence of LC's prevailing viewpoint in favor of simplifying cataloging at the expense of subject cataloging. LC commissioned Karen Calhoun (2006) to prepare a report on "revitalizing" the online library catalog. Calhoun's directive is clear: divert resources from cataloging mass-produced formats (e.g., books) to cataloging the unique primary sources (e.g., archives, special collections, teaching objects, research by-products). She sums up her rationale for such a directive: "The existing local catalog's market position has eroded to the point where there is real concern for its ability to weather the competition for information seekers' attention" (p. 10). At the University of California Libraries (2005), a task force's recommendations parallel those in the Calhoun report, especially regarding the elimination of subject headings in favor of automatically generated metadata. Contemplating these events prompted me to revisit the glorious past of the online library catalog. For a decade and a half beginning in the early 1980s, the online library catalog was the jewel in the crown when people eagerly queued at its terminals to find information written by the world's experts. I despair at how eagerly people now embrace Google because of the suspect provenance of the information Google retrieves. Long ago, we could have added more value to the online library catalog but the only thing we changed was the catalog's medium. Our failure to act back then cost the online catalog the crown. Now that the era of mass digitization has begun, we have a second chance at redesigning the online library catalog, getting it right, coaxing back old users, and attracting new ones. Let's revisit the past, reconsidering missed opportunities, reassessing their merits, combining them with new directions, making bold decisions and acting decisively on them.
  9. Dobratz, S.; Neuroth, H.: nestor: Network of Expertise in long-term STOrage of digital Resources : a digital preservation initiative for Germany (2004) 0.01
    0.014301512 = product of:
      0.028603025 = sum of:
        0.028603025 = product of:
          0.05720605 = sum of:
            0.05720605 = weight(_text_:e.g in 1195) [ClassicSimilarity], result of:
              0.05720605 = score(doc=1195,freq=4.0), product of:
                0.23393378 = queryWeight, product of:
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.044842023 = queryNorm
                0.2445395 = fieldWeight in 1195, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=1195)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     As a follow-up, in 2002 the nestor long-term archiving working group provided an initial spark towards planning and organising coordinated activities concerning the long-term preservation and long-term availability of digital documents in Germany. This resulted in a workshop, held 29-30 October 2002, where major tasks were discussed. Influenced by the demands and progress of the nestor network, the participants reached agreement to start work on application-oriented projects and to address the following topics:
     * Overlapping problems
       o Collection and preservation of digital objects (selection criteria, preservation policy)
       o Definition of criteria for trusted repositories
       o Creation of models of cooperation, etc.
     * Digital objects production process
       o Analysis of potential conflicts between production and long-term preservation
       o Documentation of existing document models and recommendations for standards models to be used for long-term preservation
       o Identification systems for digital objects, etc.
     * Transfer of digital objects
       o Object data and metadata
       o Transfer protocols and interoperability
       o Handling of different document types, e.g. dynamic publications, etc.
     * Long-term preservation of digital objects
       o Design and prototype implementation of depot systems for digital objects (OAIS was chosen to be the best functional model.)
       o Authenticity
       o Functional requirements on user interfaces of a depot system
       o Identification systems for digital objects, etc.
     At the end of the workshop, participants decided to establish a permanent distributed infrastructure for long-term preservation and long-term accessibility of digital resources in Germany, comparable, e.g., to the Digital Preservation Coalition in the UK. The initial phase, nestor, is now being set up by the above-mentioned 3-year funding project.
  10. Dushay, N.: Visualizing bibliographic metadata : a virtual (book) spine viewer (2004) 0.01
    0.014301512 = product of:
      0.028603025 = sum of:
        0.028603025 = product of:
          0.05720605 = sum of:
            0.05720605 = weight(_text_:e.g in 1197) [ClassicSimilarity], result of:
              0.05720605 = score(doc=1197,freq=4.0), product of:
                0.23393378 = queryWeight, product of:
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.044842023 = queryNorm
                0.2445395 = fieldWeight in 1197, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=1197)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     When our experience of information discovery is mediated by a computer, we neither move ourselves nor the monitor. We have only the computer's monitor to view, and the keyboard and/or mouse to manipulate what is displayed there. Computer interfaces often reduce our ability to get a sense of the contents of a library: we don't perceive the scope of the library: its breadth (the quantity of materials/information), its density (how full the shelves are, how thorough the collection is for individual topics), or the general audience for the materials (e.g., whether the materials are appropriate for middle school students, college professors, etc.). Additionally, many computer interfaces for information discovery require users to scroll through long lists, to click numerous navigational links and to read a lot of text to find the exact text they want to read. Text features of resources are almost always presented alphabetically, and the number of items in these alphabetical lists sometimes can be very long. Alphabetical ordering is certainly an improvement over no ordering, but it generally has no bearing on features with an inherent non-alphabetical ordering (e.g., dates of historical events), nor does it necessarily group similar items together. Alphabetical ordering of resources is analogous to one of the most familiar complaints about dictionaries: sometimes you need to know how to spell a word in order to look up its correct spelling in the dictionary. Some have used technology to replicate the appearance of physical libraries, presenting rooms of bookcases and shelves of book spines in virtual 3D environments. This approach presents a problem, as few book spines can be displayed legibly on a monitor screen. This article examines the role of book spines, call numbers, and other traditional organizational and information discovery concepts, and integrates this knowledge with information visualization techniques to show how computers and monitors can meet or exceed similar information discovery methods. The goal is to tap the unique potentials of current information visualization approaches in order to improve information discovery, offer new services, and most important of all, improve user satisfaction. We need to capitalize on what computers do well while bearing in mind their limitations. The intent is to design GUIs to optimize utility and provide a positive experience for the user.
  11. Choudhury, G.S.; DiLauro, T.; Droettboom, M.; Fujinaga, I.; MacMillan, K.: Strike up the score : deriving searchable and playable digital formats from sheet music (2001) 0.01
    0.014301512 = product of:
      0.028603025 = sum of:
        0.028603025 = product of:
          0.05720605 = sum of:
            0.05720605 = weight(_text_:e.g in 1220) [ClassicSimilarity], result of:
              0.05720605 = score(doc=1220,freq=4.0), product of:
                0.23393378 = queryWeight, product of:
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.044842023 = queryNorm
                0.2445395 = fieldWeight in 1220, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=1220)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     In the final report to NEH, the Curator of Special Collections at the MSEL stated, "the most useful thing we learned from this project was that you can never overestimate the amount of time it will take to create a quality digital product" (Requardt 1998). The word "resources" might represent a more comprehensive choice than the word "time" in this previous statement. This "sink" of time and resources manifested itself by an increasing allocation of human labor and time to deal with workflow issues related to large-scale digitization. The Levy Collection experience provides ample evidence that there will be mistakes during and after digitization and that unforeseen challenges or difficulties will arise, especially when dealing with rare or fragile materials. The current strategy of allocating additional human labor neither limits costs nor scales well. Consequently, the Digital Knowledge Center (DKC) of the Milton S. Eisenhower Library sought and secured funding for the development of a workflow management system through the National Science Foundation's (NSF) Digital Libraries Initiative, Phase 2 and the Institute for Museum and Library Services (IMLS) National Leadership Grant Program. The Levy family and a technology entrepreneur in Maryland provided additional funding for other aspects of the project. The mission of this second phase of the Levy project ("Levy II") can be summarized as follows:
     * Reduce costs for large collection ingestion by creating a suite of open-source processes, tools and interfaces for workflow management
     * Increase access capabilities by providing a suite of research tools
     * Demonstrate utility of tools and processes with a subset of the online Levy Collection
     The cornerstones of the workflow management system include: optical music recognition (OMR) software to generate a logical representation of the score -- for sound generation, musical searching, and musicological research -- and an automated name authority control system to disambiguate names (e.g., the authors Mark Twain and Samuel Clemens are the same individual). The research tools focus upon enhanced searching capabilities through the development and application of a fast, disk-based search engine for lyrics and music, and the incorporation of an XML structure for metadata. Though this paper focuses on the OMR component of our work, a companion paper to be published in a future issue of D-Lib will describe more fully the other tools (e.g., the automated name authority control system and the disk-based search engine), the overall workflow management system, and the project management process.
  12. DiLauro, T.; Choudhury, G.S.; Patton, M.; Warner, J.W.; Brown, E.W.: Automated name authority control and enhanced searching in the Levy collection (2001) 0.01
    0.013483594 = product of:
      0.026967188 = sum of:
        0.026967188 = product of:
          0.053934377 = sum of:
            0.053934377 = weight(_text_:e.g in 1160) [ClassicSimilarity], result of:
              0.053934377 = score(doc=1160,freq=2.0), product of:
                0.23393378 = queryWeight, product of:
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.044842023 = queryNorm
                0.23055404 = fieldWeight in 1160, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1160)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     This paper is the second in a series in D-Lib Magazine and describes a workflow management system being developed by the Digital Knowledge Center (DKC) at the Milton S. Eisenhower Library (MSEL) of The Johns Hopkins University. Based on experience from digitizing the Lester S. Levy Collection of Sheet Music, it was apparent that large-scale digitization efforts require a significant amount of human labor that is both time-consuming and costly. Consequently, this workflow management system aims to reduce the amount of human labor and time for large-scale digitization projects. The mission of this second phase of the project ("Levy II") can be summarized as follows:
     * Reduce costs for large collection ingestion by creating a suite of open-source processes, tools, and interfaces for workflow management
     * Increase access capabilities by providing a suite of research tools
     * Demonstrate utility of tools and processes with a subset of the online Levy Collection
     The cornerstones of the workflow management system include optical music recognition (OMR) software and an automated name authority control system (ANAC). The OMR software generates a logical representation of the score for sound generation, music searching, and musicological research. The ANAC disambiguates names, associating each name with an individual (e.g., the composer Septimus Winner also published under the pseudonyms Alice Hawthorne and Apsley Street, among others). Complementing the workflow tools, a suite of research tools focuses upon enhanced searching capabilities through the development and application of a fast, disk-based search engine for lyrics and music and the incorporation of an XML structure for metadata. The first paper (Choudhury et al. 2001) described the OMR software and musical components of Levy II. This paper focuses on the metadata and intellectual access components that include automated name authority control and the aforementioned search engine.
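     The disambiguation step described for the ANAC (collapsing pseudonyms and name variants onto a single identity) can be pictured as a lookup against an authority file. This is only a sketch of the idea with an invented identifier scheme, not the ANAC's actual implementation:

       # Hypothetical authority file: every name form points to one authority record.
       AUTHORITY = {
           "Winner, Septimus": "auth:0001",
           "Hawthorne, Alice": "auth:0001",   # pseudonym of Septimus Winner
           "Street, Apsley": "auth:0001",     # another pseudonym
           "Foster, Stephen": "auth:0002",
       }
       PREFERRED = {"auth:0001": "Winner, Septimus", "auth:0002": "Foster, Stephen"}

       def normalize(name_on_sheet: str) -> str:
           """Map a name as printed on the sheet music to its preferred authority form."""
           auth_id = AUTHORITY.get(name_on_sheet)
           return PREFERRED[auth_id] if auth_id else name_on_sheet

       print(normalize("Hawthorne, Alice"))   # Winner, Septimus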
  13. Chan, L.M.; Zeng, M.L.: Metadata interoperability and standardization - a study of methodology, part II : achieving interoperability at the record and repository levels (2006) 0.01
    0.013483594 = product of:
      0.026967188 = sum of:
        0.026967188 = product of:
          0.053934377 = sum of:
            0.053934377 = weight(_text_:e.g in 1177) [ClassicSimilarity], result of:
              0.053934377 = score(doc=1177,freq=2.0), product of:
                0.23393378 = queryWeight, product of:
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.044842023 = queryNorm
                0.23055404 = fieldWeight in 1177, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1177)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     This is the second part of an analysis of the methods that have been used to achieve or improve interoperability among metadata schemas and their applications in order to facilitate the conversion and exchange of metadata and to enable cross-domain metadata harvesting and federated searches. From a methodological point of view, implementing interoperability may be considered at different levels of operation: schema level (discussed in Part I of the article), record level (discussed in Part II of the article), and repository level (also discussed in Part II). The results of efforts to improve interoperability may be observed from different perspectives as well, including element-based and value-based approaches. As discussed in Part I of this study, the results of efforts to improve interoperability can be observed at different levels:
     1. Schema level - Efforts are focused on the elements of the schemas, being independent of any applications. The results usually appear as derived element sets or encoded schemas, crosswalks, application profiles, and element registries.
     2. Record level - Efforts are intended to integrate the metadata records through the mapping of the elements according to the semantic meanings of these elements. Common results include converted records and new records resulting from combining values of existing records.
     3. Repository level - With harvested or integrated records from varying sources, efforts at this level focus on mapping value strings associated with particular elements (e.g., terms associated with subject or format elements). The results enable cross-collection searching.
     In the following sections, we will continue to analyze interoperability efforts and methodologies, focusing on the record level and the repository level. It should be noted that the models to be discussed in this article are not always mutually exclusive. Sometimes, within a particular project, more than one method may be used.
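     The record-level and repository-level methods distinguished above amount to two kinds of mapping: element names are mapped when records are converted between schemas, and value strings are mapped so that harvested records can be searched together. A compressed, hypothetical sketch of both (the target element names and the format vocabulary are invented for illustration; the dc: elements are Dublin Core):

       # Record level: convert a record by mapping element names between two schemas.
       ELEMENT_CROSSWALK = {"dc:creator": "author", "dc:title": "title", "dc:date": "year"}

       def convert_record(record: dict) -> dict:
           return {ELEMENT_CROSSWALK.get(k, k): v for k, v in record.items()}

       # Repository level: map value strings of one element (here, format terms) onto a
       # common vocabulary to enable cross-collection searching.
       FORMAT_VALUES = {"ebook": "electronic resource", "e-book": "electronic resource",
                        "print": "text"}

       def normalize_format(value: str) -> str:
           return FORMAT_VALUES.get(value.lower(), value)

       harvested = {"dc:creator": "Chan, L.M.", "dc:title": "Metadata interoperability", "dc:date": "2006"}
       print(convert_record(harvested))
       print(normalize_format("E-Book"))   # electronic resource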
  14. Bird, S.; Dale, R.; Dorr, B.; Gibson, B.; Joseph, M.; Kan, M.-Y.; Lee, D.; Powley, B.; Radev, D.; Tan, Y.F.: ¬The ACL Anthology Reference Corpus : a reference dataset for bibliographic research in computational linguistics (2008) 0.01
    0.013483594 = product of:
      0.026967188 = sum of:
        0.026967188 = product of:
          0.053934377 = sum of:
            0.053934377 = weight(_text_:e.g in 2804) [ClassicSimilarity], result of:
              0.053934377 = score(doc=2804,freq=2.0), product of:
                0.23393378 = queryWeight, product of:
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.044842023 = queryNorm
                0.23055404 = fieldWeight in 2804, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.03125 = fieldNorm(doc=2804)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Content
     See also: Automatic Term Recognition (ATR) is a research task that deals with the identification of domain-specific terms. Terms, in simple words, are textual realizations of significant concepts in an expertise domain. Additionally, domain-specific terms may be classified into a number of categories, in which each category represents a significant concept. A term classification task is often defined on top of an ATR procedure to perform such categorization. For instance, in the biomedical domain, terms can be classified as drugs, proteins, and genes. This is a reference dataset for terminology extraction and classification research in computational linguistics. It is a set of manually annotated terms in the English language that are extracted from the ACL Anthology Reference Corpus (ACL ARC). The ACL ARC is a canonicalised and frozen subset of scientific publications in the domain of Human Language Technologies (HLT). It consists of 10,921 articles from 1965 to 2006. The dataset, called ACL RD-TEC, is comprised of more than 69,000 candidate terms that are manually annotated as valid and invalid terms. Furthermore, valid terms are classified as technology and non-technology terms. Technology terms refer to a method, process, or in general a technological concept in the domain of HLT, e.g. machine translation, word sense disambiguation, and language modelling. On the other hand, non-technology terms refer to important concepts other than technological; examples of such terms in the domain of HLT are multilingual lexicon, corpora, word sense, and language model. The dataset is created to serve as a gold standard for the comparison of the algorithms of term recognition and classification. [http://catalog.elra.info/product_info.php?products_id=1236].
  15. Decimal Classification Editorial Policy Committee (2002) 0.01
    0.010740024 = product of:
      0.021480048 = sum of:
        0.021480048 = product of:
          0.042960096 = sum of:
            0.042960096 = weight(_text_:22 in 236) [ClassicSimilarity], result of:
              0.042960096 = score(doc=236,freq=4.0), product of:
                0.15702912 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044842023 = queryNorm
                0.27358043 = fieldWeight in 236, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=236)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
     The Decimal Classification Editorial Policy Committee (EPC) held its Meeting 117 at the Library Dec. 3-5, 2001, with chair Andrea Stamm (Northwestern University) presiding. Through its actions at this meeting, significant progress was made toward publication of DDC unabridged Edition 22 in mid-2003 and Abridged Edition 14 in early 2004. For Edition 22, the committee approved the revisions to two major segments of the classification: Table 2 through 55 Iran (the first half of the geographic area table) and 900 History and geography. EPC approved updates to several parts of the classification it had already considered: 004-006 Data processing, Computer science; 340 Law; 370 Education; 510 Mathematics; 610 Medicine; Table 3 issues concerning treatment of scientific and technical themes, with folklore, arts, and printing ramifications at 398.2 - 398.3, 704.94, and 758; Table 5 and Table 6 Ethnic Groups and Languages (portions concerning American native peoples and languages); and tourism issues at 647.9 and 790. Reports on the results of testing the approved 200 Religion and 305-306 Social groups schedules were received, as was a progress report on revision work for the manual being done by Ross Trotter (British Library, retired). Revisions for Abridged Edition 14 that received committee approval included 010 Bibliography; 070 Journalism; 150 Psychology; 370 Education; 380 Commerce, communications, and transportation; 621 Applied physics; 624 Civil engineering; and 629.8 Automatic control engineering. At the meeting the committee received print versions of DC& numbers 4 and 5. Primarily for the use of Dewey translators, these cumulations list changes, substantive and cosmetic, to DDC Edition 21 and Abridged Edition 13 for the period October 1999 - December 2001. EPC will hold its Meeting 118 at the Library May 15-17, 2002.
  16. Heflin, J.; Hendler, J.: Semantic interoperability on the Web (2000) 0.01
    0.010632081 = product of:
      0.021264162 = sum of:
        0.021264162 = product of:
          0.042528324 = sum of:
            0.042528324 = weight(_text_:22 in 759) [ClassicSimilarity], result of:
              0.042528324 = score(doc=759,freq=2.0), product of:
                0.15702912 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044842023 = queryNorm
                0.2708308 = fieldWeight in 759, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=759)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    11. 5.2013 19:22:18
  17. Beagle, D.: Visualizing keyword distribution across multidisciplinary c-space (2003) 0.01
    0.010112696 = product of:
      0.020225393 = sum of:
        0.020225393 = product of:
          0.040450785 = sum of:
            0.040450785 = weight(_text_:e.g in 1202) [ClassicSimilarity], result of:
              0.040450785 = score(doc=1202,freq=2.0), product of:
                0.23393378 = queryWeight, product of:
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.044842023 = queryNorm
                0.17291553 = fieldWeight in 1202, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  5.2168427 = idf(docFreq=651, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=1202)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Abstract
    But what happens to this awareness in a digital library? Can discursive formations be represented in cyberspace, perhaps through diagrams in a visualization interface? And would such a schema be helpful to a digital library user? To approach this question, it is worth taking a moment to reconsider what Radford is looking at. First, he looks at titles to see how the books cluster. To illustrate, I scanned one hundred books on the shelves of a college library under subclass HT 101-395, defined by the LCC subclass caption as Urban groups. The City. Urban sociology. Of the first 100 titles in this sequence, fifty included the word "urban" or variants (e.g. "urbanization"). Another thirty-five used the word "city" or variants. These keywords appear to mark their titles as the heart of this discursive formation. The scattering of titles not using "urban" or "city" used related terms such as "town," "community," or in one case "skyscrapers." So we immediately see some empirical correlation between keywords and classification. But we also see a problem with the commonly used search technique of title-keyword. A student interested in urban studies will want to know about this entire subclass, and may wish to browse every title available therein. A title-keyword search on "urban" will retrieve only half of the titles, while a search on "city" will retrieve just over a third. There will be no overlap, since no titles in this sample contain both words. The only place where both words appear in a common string is in the LCC subclass caption, but captions are not typically indexed in library Online Public Access Catalogs (OPACs). In a traditional library, this problem is mitigated when the student goes to the shelf looking for any one of the books and suddenly discovers a much wider selection than the keyword search had led him to expect. But in a digital library, the issue of non-retrieval can be more problematic, as studies have indicated. Micco and Popp reported that, in a study funded partly by the U.S. Department of Education, 65 of 73 unskilled users searching for material on U.S./Soviet foreign relations found some material but never realized they had missed a large percentage of what was in the database.
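     The retrieval figures cited above follow directly from the sample: 100 titles in the subclass, 50 containing "urban" or a variant, 35 containing "city" or a variant, and no overlap. A minimal sketch, assuming those counts:

       # Sample described above: 100 titles shelved under LCC subclass HT 101-395.
       total = 100
       urban_hits = 50    # titles containing "urban" or a variant
       city_hits = 35     # titles containing "city" or a variant (no overlap with the above)

       recall_urban = urban_hits / total             # 0.50, "only half of the titles"
       recall_city = city_hits / total               # 0.35, "just over a third"
       recall_or = (urban_hits + city_hits) / total  # 0.85, even if both keywords are ORed

       print(recall_urban, recall_city, recall_or)
       # Browsing the classified shelf (or an indexed class caption) reaches all 100 titles.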
  18. Bittner, T.; Donnelly, M.; Winter, S.: Ontology and semantic interoperability (2006) 0.01
    0.009113212 = product of:
      0.018226424 = sum of:
        0.018226424 = product of:
          0.03645285 = sum of:
            0.03645285 = weight(_text_:22 in 4820) [ClassicSimilarity], result of:
              0.03645285 = score(doc=4820,freq=2.0), product of:
                0.15702912 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044842023 = queryNorm
                0.23214069 = fieldWeight in 4820, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=4820)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    3.12.2016 18:39:22
  19. Beppler, F.D.; Fonseca, F.T.; Pacheco, R.C.S.: Hermeneus: an architecture for an ontology-enabled information retrieval (2008) 0.01
    0.009113212 = product of:
      0.018226424 = sum of:
        0.018226424 = product of:
          0.03645285 = sum of:
            0.03645285 = weight(_text_:22 in 3261) [ClassicSimilarity], result of:
              0.03645285 = score(doc=3261,freq=2.0), product of:
                0.15702912 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044842023 = queryNorm
                0.23214069 = fieldWeight in 3261, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=3261)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    28.11.2016 12:43:22
  20. Atran, S.; Medin, D.L.; Ross, N.: Evolution and devolution of knowledge : a tale of two biologies (2004) 0.01
    0.009113212 = product of:
      0.018226424 = sum of:
        0.018226424 = product of:
          0.03645285 = sum of:
            0.03645285 = weight(_text_:22 in 479) [ClassicSimilarity], result of:
              0.03645285 = score(doc=479,freq=2.0), product of:
                0.15702912 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.044842023 = queryNorm
                0.23214069 = fieldWeight in 479, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=479)
          0.5 = coord(1/2)
      0.5 = coord(1/2)
    
    Date
    23. 1.2022 10:22:18