Search (934 results, page 1 of 47)

  • Filter: type_ss:"el"
  1. Si, L.E.; O'Brien, A.; Probets, S.: Integration of distributed terminology resources to facilitate subject cross-browsing for library portal systems (2009) 0.11
    0.11058211 = product of:
      0.14744282 = sum of:
        0.01841403 = weight(_text_:for in 3628) [ClassicSimilarity], result of:
          0.01841403 = score(doc=3628,freq=8.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.20744109 = fieldWeight in 3628, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3628)
        0.11301481 = weight(_text_:computing in 3628) [ClassicSimilarity], result of:
          0.11301481 = score(doc=3628,freq=4.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.43214604 = fieldWeight in 3628, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3628)
        0.016013984 = product of:
          0.032027967 = sum of:
            0.032027967 = weight(_text_:22 in 3628) [ClassicSimilarity], result of:
              0.032027967 = score(doc=3628,freq=2.0), product of:
                0.16556148 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.047278564 = queryNorm
                0.19345059 = fieldWeight in 3628, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=3628)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
     Purpose: To develop a prototype middleware framework between different terminology resources in order to provide a subject cross-browsing service for library portal systems. Design/methodology/approach: Nine terminology experts were interviewed to collect appropriate knowledge to support the development of a theoretical framework for the research. Based on this, a simplified software-based prototype system was constructed incorporating the knowledge acquired. The prototype involved mappings between the computer science schedule of the Dewey Decimal Classification (which acted as a spine) and two controlled vocabularies, UKAT and the ACM Computing Classification. Subsequently, six further experts in the field were invited to evaluate the prototype system and provide feedback to improve the framework. Findings: The major findings showed that, given the large variety of terminology resources distributed on the web, the proposed middleware service is essential to integrate the different terminology resources technically and semantically in order to facilitate subject cross-browsing. A set of recommendations is also made outlining the important approaches and features that support such a cross-browsing middleware service.
    Content
     This is a pre-print version of the paper presented at the ISKO UK 2009 conference, 22-23 June, prior to peer review and editing. For the published proceedings, see the special issue of Aslib Proceedings.
    Object
    ACM Computing Classification
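     The spine-based cross-browsing that entry 1 describes can be sketched in a few lines: terms from two vocabularies are aligned to a shared DDC class, and the spine class is used to hop between them. This is only an illustration of the idea, not the authors' prototype; the mapping entries below are hypothetical.

```python
# Minimal sketch of spine-based subject cross-browsing: terms from two
# vocabularies (UKAT, ACM CCS) are aligned to a shared DDC class acting
# as the spine. All mappings below are hypothetical examples.
ukat_to_ddc = {"Computer programming": "005.1"}
acm_ccs_to_ddc = {"Software and its engineering": "005.1"}

def cross_browse(term: str, source: dict, target: dict) -> list[str]:
    """Return terms in the target vocabulary that share a DDC spine class."""
    ddc_class = source.get(term)
    if ddc_class is None:
        return []
    return [t for t, c in target.items() if c == ddc_class]

print(cross_browse("Computer programming", ukat_to_ddc, acm_ccs_to_ddc))
# -> ['Software and its engineering']
```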
  2. Dirks, L.: eResearch, semantic computing and the cloud : towards a smart cyberinfrastructure for eResearch (2009) 0.09
    0.09118978 = product of:
      0.18237956 = sum of:
        0.022552488 = weight(_text_:for in 2815) [ClassicSimilarity], result of:
          0.022552488 = score(doc=2815,freq=12.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.2540624 = fieldWeight in 2815, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2815)
        0.15982707 = weight(_text_:computing in 2815) [ClassicSimilarity], result of:
          0.15982707 = score(doc=2815,freq=8.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.6111468 = fieldWeight in 2815, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2815)
      0.5 = coord(2/4)
    
    Abstract
     In the future, frontier research in many fields will increasingly require the collaboration of globally distributed groups of researchers needing access to distributed computing, data resources and support for remote access to expensive, multi-national specialized facilities such as telescopes and accelerators or specialist data archives. There is also a general belief that an important road to innovation will be provided by multi-disciplinary and collaborative research - from bio-informatics and earth systems science to social science and archaeology. There will also be an explosion in the amount of research data collected in the next decade - hundreds of terabytes will be common in many fields. These future research requirements constitute the 'eResearch' agenda. Powerful software services will be widely deployed on top of the academic research networks to form the necessary 'Cyberinfrastructure' to provide a collaborative research environment for the global academic community. The difficulties in combining data and information from distributed sources, the multi-disciplinary nature of research and collaboration, and the need to present researchers with tooling that enables them to express what they want to do rather than how to do it, all highlight the need for an ecosystem of Semantic Computing technologies. Such technologies will further facilitate information sharing and discovery, will enable reasoning over information, and will allow us to start thinking about knowledge and how it can be handled by computers. This talk will review the elements of this vision and explain the need for semantic-oriented computing by exploring eResearch projects that have successfully applied relevant technologies. It will also suggest that a software + service model with scientific services delivered from the cloud will become an increasingly accepted model for research.
  3. Brand, A.: CrossRef turns one (2001) 0.08
    0.08230459 = product of:
      0.10973945 = sum of:
        0.018321728 = weight(_text_:for in 1222) [ClassicSimilarity], result of:
          0.018321728 = score(doc=1222,freq=22.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.20640129 = fieldWeight in 1222, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.0234375 = fieldNorm(doc=1222)
        0.047948122 = weight(_text_:computing in 1222) [ClassicSimilarity], result of:
          0.047948122 = score(doc=1222,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.18334404 = fieldWeight in 1222, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.0234375 = fieldNorm(doc=1222)
        0.043469597 = product of:
          0.08693919 = sum of:
            0.08693919 = weight(_text_:machinery in 1222) [ClassicSimilarity], result of:
              0.08693919 = score(doc=1222,freq=2.0), product of:
                0.35214928 = queryWeight, product of:
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.047278564 = queryNorm
                0.24688165 = fieldWeight in 1222, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.0234375 = fieldNorm(doc=1222)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    CrossRef, the only full-blown application of the Digital Object Identifier (DOI®) System to date, is now a little over a year old. What started as a cooperative effort among publishers and technologists to prototype DOI-based linking of citations in e-journals evolved into an independent, non-profit enterprise in early 2000. We have made considerable headway during our first year, but there is still much to be done. When CrossRef went live with its collaborative linking service last June, it had enabled reference links in roughly 1,100 journals from a member base of 33 publishers, using a functional prototype system. The DOI-X prototype was described in an article published in D-Lib Magazine in February of 2000. On the occasion of CrossRef's first birthday as a live service, this article provides a non-technical overview of our progress to date and the major hurdles ahead. The electronic medium enriches the research literature arena for all players -- researchers, librarians, and publishers -- in numerous ways. Information has been made easier to discover, to share, and to sell. To take a simple example, the aggregation of book metadata by electronic booksellers was a huge boon to scholars seeking out obscure backlist titles, or discovering books they would never otherwise have known to exist. It was equally a boon for the publishers of those books, who saw an unprecedented surge in sales of backlist titles with the advent of centralized electronic bookselling. In the serials sphere, even in spite of price increases and the turmoil surrounding site licenses for some prime electronic content, libraries overall are now able to offer more content to more of their patrons. Yet undoubtedly, the key enrichment for academics and others navigating a scholarly corpus is linking, and in particular the linking that takes the reader out of one document and into another in the matter of a click or two. Since references are how authors make explicit the links between their work and precedent scholarship, what could be more fundamental to the reader than making those links immediately actionable? That said, automated linking is only really useful from a research perspective if it works across publications and across publishers. Not only do academics think about their own writings and those of their colleagues in terms of "author, title, rough date" -- the name of the journal itself is usually not high on the list of crucial identifying features -- but they are oblivious as to the identity of the publishers of all but their very favorite books and journals.
    Citation linking is thus also a huge benefit to journal publishers, because, as with electronic bookselling, it drives readers to their content in yet another way. In step with what was largely a subscription-based economy for journal sales, an "article economy" appears to be emerging. Journal publishers sell an increasing amount of their content on an article basis, whether through document delivery services, aggregators, or their own pay-per-view systems. At the same time, most research-oriented access to digitized material is still mediated by libraries. Resource discovery services must be able to authenticate subscribed or licensed users somewhere in the process, and ensure that a given user is accessing as a default the version of an article that their library may have already paid for. The well-known "appropriate copy" issue is addressed below. Another benefit to publishers from including outgoing citation links is simply the value they can add to their own journals. Publishers carry out the bulk of the technological prototyping and development that has produced electronic journals and the enhanced functionality readers have come to expect. There is clearly competition among them to provide readers with the latest features. That a number of publishers would agree to collaborate in the establishment of an infrastructure for reference linking was thus by no means predictable. CrossRef was incorporated in January of 2000 as a collaborative venture among 12 of the world's top scientific and scholarly publishers, both commercial and not-for-profit, to enable cross-publisher reference linking throughout the digital journal literature. The founding members were Academic Press, a Harcourt Company; the American Association for the Advancement of Science (the publisher of Science); American Institute of Physics (AIP); Association for Computing Machinery (ACM); Blackwell Science; Elsevier Science; The Institute of Electrical and Electronics Engineers, Inc. (IEEE); Kluwer Academic Publishers (a Wolters Kluwer Company); Nature; Oxford University Press; Springer-Verlag; and John Wiley & Sons, Inc. Start-up funds for CrossRef were provided as loans from eight of the original publishers.
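     The reference links discussed above become actionable because any DOI can be dereferenced through the public doi.org proxy. A minimal sketch of that resolution step (using only the proxy, not CrossRef's member interfaces; the example DOI is only illustrative):

```python
# Minimal sketch: resolve a DOI to its current publisher URL via the
# public doi.org proxy by issuing a HEAD request and following redirects.
import urllib.request

def resolve_doi(doi: str) -> str:
    req = urllib.request.Request(f"https://doi.org/{doi}", method="HEAD")
    with urllib.request.urlopen(req) as resp:
        return resp.geturl()  # final URL after redirects

# print(resolve_doi("10.1000/182"))  # the DOI Handbook's own DOI, a common example
```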
  4. Durno, J.: Digital archaeology and/or forensics : working with floppy disks from the 1980s (2016) 0.08
    0.076688446 = product of:
      0.15337689 = sum of:
        0.02551523 = weight(_text_:for in 3196) [ClassicSimilarity], result of:
          0.02551523 = score(doc=3196,freq=6.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.28743884 = fieldWeight in 3196, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.0625 = fieldNorm(doc=3196)
        0.12786166 = weight(_text_:computing in 3196) [ClassicSimilarity], result of:
          0.12786166 = score(doc=3196,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.48891744 = fieldWeight in 3196, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.0625 = fieldNorm(doc=3196)
      0.5 = coord(2/4)
    
    Abstract
    While software originating from the domain of digital forensics has demonstrated utility for data recovery from contemporary storage media, it is not as effective for working with floppy disks from the 1980s. This paper details alternative strategies for recovering data from floppy disks employing software originating from the software preservation and retro-computing communities. Imaging hardware, storage formats and processing workflows are also discussed.
  5. Chen, H.: Semantic research for digital libraries (1999) 0.08
    0.07562129 = product of:
      0.15124258 = sum of:
        0.015624823 = weight(_text_:for in 1247) [ClassicSimilarity], result of:
          0.015624823 = score(doc=1247,freq=4.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.17601961 = fieldWeight in 1247, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.046875 = fieldNorm(doc=1247)
        0.13561776 = weight(_text_:computing in 1247) [ClassicSimilarity], result of:
          0.13561776 = score(doc=1247,freq=4.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.51857525 = fieldWeight in 1247, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.046875 = fieldNorm(doc=1247)
      0.5 = coord(2/4)
    
    Abstract
    In this era of the Internet and distributed, multimedia computing, new and emerging classes of information systems applications have swept into the lives of office workers and people in general. From digital libraries, multimedia systems, geographic information systems, and collaborative computing to electronic commerce, virtual reality, and electronic video arts and games, these applications have created tremendous opportunities for information and computer science researchers and practitioners. As applications become more pervasive, pressing, and diverse, several well-known information retrieval (IR) problems have become even more urgent. Information overload, a result of the ease of information creation and transmission via the Internet and WWW, has become more troublesome (e.g., even stockbrokers and elementary school students, heavily exposed to various WWW search engines, are versed in such IR terminology as recall and precision). Significant variations in database formats and structures, the richness of information media (text, audio, and video), and an abundance of multilingual information content also have created severe information interoperability problems -- structural interoperability, media interoperability, and multilingual interoperability.
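     Since the abstract refers to recall and precision, a short reference sketch of how the two measures are computed for a single query (toy document ids):

```python
# Minimal sketch: precision and recall for one query, given the set of
# retrieved document ids and the set of relevant document ids.
def precision_recall(retrieved: set, relevant: set) -> tuple[float, float]:
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

print(precision_recall({"d1", "d2", "d3"}, {"d2", "d3", "d4", "d5"}))
# -> (0.666..., 0.5)
```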
  6. McQueen, T.F.; Fleck, R.A. Jr.: Changing patterns of Internet usage and challenges at colleges and universities (2005) 0.07
    0.065053955 = product of:
      0.13010791 = sum of:
        0.01822896 = weight(_text_:for in 769) [ClassicSimilarity], result of:
          0.01822896 = score(doc=769,freq=4.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.20535621 = fieldWeight in 769, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.0546875 = fieldNorm(doc=769)
        0.111878954 = weight(_text_:computing in 769) [ClassicSimilarity], result of:
          0.111878954 = score(doc=769,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.42780277 = fieldWeight in 769, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.0546875 = fieldNorm(doc=769)
      0.5 = coord(2/4)
    
    Abstract
    Increased enrollments, changing student expectations, and shifting patterns of Internet access and usage continue to generate resource and administrative challenges for colleges and universities. Computer center staff and college administrators must balance increased access demands, changing system loads, and system security within constrained resources. To assess the changing academic computing environment, computer center directors from several geographic regions were asked to respond to an online questionnaire that assessed patterns of usage, resource allocation, policy formulation, and threats. Survey results were compared with data from a study conducted by the authors in 1999. The analysis includes changing patterns in Internet usage, access, and supervision. The paper also presents details of usage by institutional type and application as well as recommendations for more precise resource assessment by college administrators.
  7. Wong, W.; Liu, W.; Bennamoun, M.: Ontology learning from text : a look back and into the future (2010) 0.07
    0.065053955 = product of:
      0.13010791 = sum of:
        0.01822896 = weight(_text_:for in 4733) [ClassicSimilarity], result of:
          0.01822896 = score(doc=4733,freq=4.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.20535621 = fieldWeight in 4733, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4733)
        0.111878954 = weight(_text_:computing in 4733) [ClassicSimilarity], result of:
          0.111878954 = score(doc=4733,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.42780277 = fieldWeight in 4733, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.0546875 = fieldNorm(doc=4733)
      0.5 = coord(2/4)
    
    Abstract
     Ontologies are often viewed as the answer to the need for interoperable semantics in modern information systems. The explosion of textual information on the "Read/Write" Web, coupled with the increasing demand for ontologies to power the Semantic Web, has made (semi-)automatic ontology learning from text a very promising research area. This, together with the advanced state of related areas such as natural language processing, has fuelled research into ontology learning over the past decade. This survey looks at how far we have come since the turn of the millennium and discusses the remaining challenges that will define the research directions in this area in the near future.
    Content
     Pre-publication version for: ACM Computing Surveys, Vol. X, No. X, Article X, Publication date: X 2011.
  8. Tomassen, S.L.: Research on ontology-driven information retrieval (2006 (?)) 0.06
    0.057516333 = product of:
      0.115032665 = sum of:
        0.019136423 = weight(_text_:for in 4328) [ClassicSimilarity], result of:
          0.019136423 = score(doc=4328,freq=6.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.21557912 = fieldWeight in 4328, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.046875 = fieldNorm(doc=4328)
        0.095896244 = weight(_text_:computing in 4328) [ClassicSimilarity], result of:
          0.095896244 = score(doc=4328,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.36668807 = fieldWeight in 4328, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.046875 = fieldNorm(doc=4328)
      0.5 = coord(2/4)
    
    Abstract
     An increasing number of recent information retrieval systems make use of ontologies to help users clarify their information needs and come up with semantic representations of documents. A particular concern here is the integration of these semantic approaches with traditional search technology. The research presented in this paper examines how ontologies can be efficiently applied to large-scale search systems for the web. We describe how these systems can be enriched with adapted ontologies to provide both an in-depth understanding of the user's needs and an easy integration with standard vector-space retrieval systems. The ontology concepts are adapted to the domain terminology by computing a feature vector for each concept. Later, the feature vectors are used to enrich a provided query. The whole retrieval system is under development as part of a larger Semantic Web standardization project for the Norwegian oil & gas sector.
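     A rough sketch of the concept-based query enrichment described above: each ontology concept carries a feature vector of weighted domain terms, the concept closest to the query is found by cosine similarity, and its terms are appended to the query. The concepts, weights and enrichment rule are illustrative assumptions, not the system the paper describes.

```python
# Hypothetical concept feature vectors (term -> weight) and a toy query.
import math

concepts = {
    "Drilling": {"drill": 0.9, "wellbore": 0.7, "rig": 0.5},
    "Reservoir": {"porosity": 0.8, "permeability": 0.8, "rock": 0.4},
}

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def enrich(query_terms: list[str]) -> list[str]:
    qvec = {t: 1.0 for t in query_terms}
    best = max(concepts, key=lambda c: cosine(qvec, concepts[c]))
    return query_terms + [t for t in concepts[best] if t not in query_terms]

print(enrich(["wellbore", "pressure"]))
# -> ['wellbore', 'pressure', 'drill', 'rig']
```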
  9. Rehurek, R.; Sojka, P.: Software framework for topic modelling with large corpora (2010) 0.06
    0.055760533 = product of:
      0.111521065 = sum of:
        0.015624823 = weight(_text_:for in 1058) [ClassicSimilarity], result of:
          0.015624823 = score(doc=1058,freq=4.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.17601961 = fieldWeight in 1058, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.046875 = fieldNorm(doc=1058)
        0.095896244 = weight(_text_:computing in 1058) [ClassicSimilarity], result of:
          0.095896244 = score(doc=1058,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.36668807 = fieldWeight in 1058, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.046875 = fieldNorm(doc=1058)
      0.5 = coord(2/4)
    
    Abstract
    Large corpora are ubiquitous in today's world and memory quickly becomes the limiting factor in practical applications of the Vector Space Model (VSM). In this paper, we identify a gap in existing implementations of many of the popular algorithms, which is their scalability and ease of use. We describe a Natural Language Processing software framework which is based on the idea of document streaming, i.e. processing corpora document after document, in a memory independent fashion. Within this framework, we implement several popular algorithms for topical inference, including Latent Semantic Analysis and Latent Dirichlet Allocation, in a way that makes them completely independent of the training corpus size. Particular emphasis is placed on straightforward and intuitive framework design, so that modifications and extensions of the methods and/or their application by interested practitioners are effortless. We demonstrate the usefulness of our approach on a real-world scenario of computing document similarities within an existing digital library DML-CZ.
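     The software framework this paper describes is the gensim library. A minimal sketch of the workflow it outlines, here with a tiny in-memory corpus for brevity (real use streams documents from disk):

```python
# Minimal sketch: bag-of-words corpus -> LSI model -> similarity query
# using gensim.
from gensim import corpora, models, similarities

docs = [
    "semantic analysis of large document collections",
    "topic modelling with latent dirichlet allocation",
    "streaming corpora and memory independent processing",
]
texts = [d.lower().split() for d in docs]
dictionary = corpora.Dictionary(texts)
bow_corpus = [dictionary.doc2bow(t) for t in texts]

lsi = models.LsiModel(bow_corpus, id2word=dictionary, num_topics=2)
index = similarities.MatrixSimilarity(lsi[bow_corpus])

query = dictionary.doc2bow("latent semantic analysis".lower().split())
print(list(index[lsi[query]]))  # cosine similarity of the query to each document
```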
  10. Spitkovsky, V.I.; Chang, A.X.: ¬A cross-lingual dictionary for english Wikipedia concepts (2012) 0.05
    0.054518014 = product of:
      0.10903603 = sum of:
        0.022096837 = weight(_text_:for in 336) [ClassicSimilarity], result of:
          0.022096837 = score(doc=336,freq=8.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.24892932 = fieldWeight in 336, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.046875 = fieldNorm(doc=336)
        0.08693919 = product of:
          0.17387839 = sum of:
            0.17387839 = weight(_text_:machinery in 336) [ClassicSimilarity], result of:
              0.17387839 = score(doc=336,freq=2.0), product of:
                0.35214928 = queryWeight, product of:
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.047278564 = queryNorm
                0.4937633 = fieldWeight in 336, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  7.448392 = idf(docFreq=69, maxDocs=44218)
                  0.046875 = fieldNorm(doc=336)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
     We present a resource for automatically associating strings of text with English Wikipedia concepts. Our machinery is bi-directional, in the sense that it uses the same fundamental probabilistic methods to map strings to empirical distributions over Wikipedia articles as it does to map article URLs to distributions over short, language-independent strings of natural language text. For maximal interoperability, we release our resource as a set of flat line-based text files, lexicographically sorted and encoded with UTF-8. These files capture joint probability distributions underlying concepts (we use the terms article, concept and Wikipedia URL interchangeably) and associated snippets of text, as well as other features that can come in handy when working with Wikipedia articles and related information.
    Content
     See also: Spitkovsky, V., P. Norvig: From words to concepts and back: dictionaries for linking text, entities and ideas. In: http://googleresearch.blogspot.de/2012/05/from-words-to-concepts-and-back.html. For the data pool see: nlp.stanford.edu/pubs/corsswikis-data.tar.bz2.
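     A sketch of how such a line-based dictionary might be loaded and queried. The tab-separated column layout assumed here (surface string, probability, article) is only an illustration; consult the released files for their actual format.

```python
# Hypothetical tab-separated layout: surface string \t p(concept|string) \t article
import csv
from collections import defaultdict

def load_dictionary(path: str) -> dict:
    mapping = defaultdict(list)
    with open(path, encoding="utf-8") as fh:
        for string, prob, article in csv.reader(fh, delimiter="\t"):
            mapping[string].append((float(prob), article))
    return mapping

# lookup = load_dictionary("dictionary.tsv")          # hypothetical file name
# print(sorted(lookup["jaguar"], reverse=True)[:3])   # most probable concepts first
```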
  11. Dunning, T.: Statistical identification of language (1994) 0.05
    0.053472333 = product of:
      0.106944665 = sum of:
        0.0110484185 = weight(_text_:for in 3627) [ClassicSimilarity], result of:
          0.0110484185 = score(doc=3627,freq=2.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.12446466 = fieldWeight in 3627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.046875 = fieldNorm(doc=3627)
        0.095896244 = weight(_text_:computing in 3627) [ClassicSimilarity], result of:
          0.095896244 = score(doc=3627,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.36668807 = fieldWeight in 3627, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.046875 = fieldNorm(doc=3627)
      0.5 = coord(2/4)
    
    Abstract
    A statistically based program has been written which learns to distinguish between languages. The amount of training text that such a program needs is surprisingly small, and the amount of text needed to make an identification is also quite small. The program incorporates no linguistic presuppositions other than the assumption that text can be encoded as a string of bytes. Such a program can be used to determine which language small bits of text are in. It also shows a potential for what might be called 'statistical philology' in that it may be applied directly to phonetic transcriptions to help elucidate family trees among language dialects. A variant of this program has been shown to be useful as a quality control in biochemistry. In this application, genetic sequences are assumed to be expressions in a language peculiar to the organism from which the sequence is taken. Thus language identification becomes species identification.
    Series
    Technical report CRL MCCS-94-273, Computing Research Lab, New Mexico State University
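     A simplified illustration of the underlying idea: score text against per-language character-trigram profiles and pick the best-scoring language. This is not Dunning's exact log-likelihood formulation, and the training snippets are toy placeholders.

```python
# Toy statistical language identifier: score text against per-language
# character-trigram counts (add-one smoothing, log probabilities).
import math
from collections import Counter

training = {  # toy training text, far smaller than any real setting
    "en": "the quick brown fox jumps over the lazy dog and then some",
    "de": "der schnelle braune fuchs springt ueber den faulen hund",
}

def trigrams(text: str) -> Counter:
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

profiles = {lang: trigrams(t) for lang, t in training.items()}

def identify(text: str) -> str:
    def score(lang: str) -> float:
        prof, total = profiles[lang], sum(profiles[lang].values())
        return sum(math.log((prof[g] + 1) / (total + 1)) for g in trigrams(text))
    return max(profiles, key=score)

print(identify("springt der hund"))  # -> 'de'
```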
  12. Binding, C.; Tudhope, D.: Improving interoperability using vocabulary linked data (2015) 0.05
    0.051233012 = product of:
      0.102466024 = sum of:
        0.022552488 = weight(_text_:for in 2205) [ClassicSimilarity], result of:
          0.022552488 = score(doc=2205,freq=12.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.2540624 = fieldWeight in 2205, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2205)
        0.079913534 = weight(_text_:computing in 2205) [ClassicSimilarity], result of:
          0.079913534 = score(doc=2205,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.3055734 = fieldWeight in 2205, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2205)
      0.5 = coord(2/4)
    
    Abstract
     The concept of Linked Data has been an emerging theme within the computing and digital heritage areas in recent years. The growth and scale of Linked Data have underlined the need for greater commonality in concept referencing, to avoid local redefinition and duplication of reference resources. Achieving domain-wide agreement on common vocabularies would be an unreasonable expectation; however, datasets often already have local vocabulary resources defined, and so the prospects for large-scale interoperability can be substantially improved by creating alignment links from these local vocabularies out to common external reference resources. The ARIADNE project is undertaking large-scale integration of archaeology dataset metadata records, to create a cross-searchable research repository resource. Key to enabling this cross search will be the 'subject' metadata originating from multiple data providers, containing terms from multiple multilingual controlled vocabularies. This paper discusses various aspects of vocabulary mapping. Experience from the previous SENESCHAL project in the publication of controlled vocabularies as Linked Open Data is discussed, emphasizing the importance of unique URI identifiers for vocabulary concepts. There is a need to align legacy indexing data to the uniquely defined concepts, and examples of SENESCHAL data alignment work are discussed. A case study for the ARIADNE project presents work on mapping between vocabularies, based on the Getty Art and Architecture Thesaurus as a central hub and employing an interactive vocabulary mapping tool developed for the project, which generates SKOS mapping relationships in JSON and other formats. The potential use of such vocabulary mappings to assist cross search over archaeological datasets from different countries is illustrated in a pilot experiment. The results demonstrate the enhanced opportunities for interoperability and cross searching that the approach offers.
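     For reference, a minimal sketch of what a single SKOS alignment link looks like when asserted with rdflib and serialised as Turtle; both concept URIs are hypothetical placeholders, and the project's own tool emits such mappings in JSON and other formats.

```python
# Minimal sketch: assert a SKOS exactMatch between a local concept URI and
# an external reference concept URI, then serialise as Turtle.
from rdflib import Graph, URIRef
from rdflib.namespace import SKOS

g = Graph()
local_concept = URIRef("http://example.org/vocab/monuments/1234")   # hypothetical
reference_concept = URIRef("http://vocab.getty.edu/aat/300000000")  # hypothetical id
g.add((local_concept, SKOS.exactMatch, reference_concept))

print(g.serialize(format="turtle"))
```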
  13. Tang, J.; Liang, B.-Y.; Li, J.-Z.: Toward detecting mapping strategies for ontology interoperability (2005) 0.05
    0.050250523 = product of:
      0.100501046 = sum of:
        0.020587513 = weight(_text_:for in 3367) [ClassicSimilarity], result of:
          0.020587513 = score(doc=3367,freq=10.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.2319262 = fieldWeight in 3367, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3367)
        0.079913534 = weight(_text_:computing in 3367) [ClassicSimilarity], result of:
          0.079913534 = score(doc=3367,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.3055734 = fieldWeight in 3367, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3367)
      0.5 = coord(2/4)
    
    Abstract
     Ontology mapping is one of the core tasks for ontology interoperability. It aims to find semantic relationships between entities (i.e. concepts, attributes, and relations) of two ontologies. It benefits many applications, such as the integration of ontology-based web data sources and the interoperability of agents or web services. To reduce the amount of user effort as much as possible, (semi-)automatic ontology mapping is becoming more and more important to bring it to fruition. In the existing literature, many approaches have attracted considerable interest by combining several different similarity/mapping strategies (namely multi-strategy based mapping). However, experiments show that multi-strategy based mapping does not always outperform its single-strategy counterpart. In this paper, we mainly aim to deal with two problems: (1) for a new, unseen mapping task, should we select a multi-strategy based algorithm or just one single-strategy based algorithm? (2) if the task is suitable for multi-strategy mapping, how should the strategies be selected for the final combined scenario? We propose an approach for detecting multiple strategies for ontology mapping. The results obtained so far show that multi-strategy detection improves precision and recall significantly.
    Content
     Contribution presented at the Workshop on The Semantic Computing Initiative (SeC 2005) --- From Semantic Web to Semantic World --- held in conjunction with The 14th Int'l Conf. on World Wide Web (WWW2005); see: http://www.instsec.org/2005ws/.
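     To make the multi-strategy idea concrete, a toy sketch that combines a character-level similarity and a token-overlap similarity with fixed weights; the weights and the example strings are illustrative assumptions, not the detection method the paper proposes.

```python
# Toy multi-strategy matcher: combine a character-based and a token-based
# similarity with fixed weights. Real systems learn or detect the weights.
from difflib import SequenceMatcher

def char_sim(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def token_sim(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def combined(a: str, b: str, w_char: float = 0.5, w_token: float = 0.5) -> float:
    return w_char * char_sim(a, b) + w_token * token_sim(a, b)

print(round(combined("journal article", "article in a journal"), 2))
```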
  14. Patton, M.; Reynolds, D.; Choudhury, G.S.; DiLauro, T.: Toward a metadata generation framework : a case study at Johns Hopkins University (2004) 0.04
    0.043013833 = product of:
      0.08602767 = sum of:
        0.022096835 = weight(_text_:for in 1192) [ClassicSimilarity], result of:
          0.022096835 = score(doc=1192,freq=18.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.2489293 = fieldWeight in 1192, product of:
              4.2426405 = tf(freq=18.0), with freq of:
                18.0 = termFreq=18.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.03125 = fieldNorm(doc=1192)
        0.06393083 = weight(_text_:computing in 1192) [ClassicSimilarity], result of:
          0.06393083 = score(doc=1192,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.24445872 = fieldWeight in 1192, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.03125 = fieldNorm(doc=1192)
      0.5 = coord(2/4)
    
    Abstract
     In the June 2003 issue of D-Lib Magazine, Kenney et al. (2003) discuss a comparative study between Cornell's email reference staff and Google's Answers service. This interesting study provided insights on the potential impact of "computing and simple algorithms combined with human intelligence" for library reference services. As mentioned in the Kenney et al. article, Bill Arms (2000) had discussed the possibilities of automated digital libraries in an even earlier D-Lib article. Arms discusses not only automating reference services, but also another library function that seems to inspire lively debates about automation -- metadata creation. While intended to illuminate, these debates sometimes generate more heat than light. In an effort to explore the potential for automating metadata generation, the Digital Knowledge Center (DKC) of the Sheridan Libraries at The Johns Hopkins University developed and tested an automated name authority control (ANAC) tool. ANAC represents a component of a digital workflow management system developed in connection with the digital Lester S. Levy Collection of Sheet Music. The evaluation of ANAC followed the spirit of the Kenney et al. study, which was, as they stated, "more exploratory than scientific." These ANAC evaluation results are shared with the hope of fostering constructive dialogue and discussions about the potential for semi-automated techniques or frameworks for library functions and services such as metadata creation. The DKC's research agenda emphasizes the development of tools that combine automated processes and human intervention, with the overall goal of involving humans at higher levels of analysis and decision-making. Others have looked at issues regarding the automated generation of metadata. A session at the 2003 Joint Conference on Digital Libraries was devoted to automatic metadata creation, and a session at the 2004 conference addressed automated name disambiguation. Commercial vendors such as OCLC, Marcive, and LTI have long used automated techniques for matching names to Library of Congress authority records. We began developing ANAC as a component of a larger suite of open source tools to support workflow management for digital projects. This article describes the goals for the ANAC tool, provides an overview of the metadata records used for testing, describes the architecture for ANAC, and concludes with discussions of the methodology and evaluation of the experiment comparing human cataloging and ANAC-generated results.
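     A toy sketch of the general name-matching step such a tool automates: normalise a name string and rank candidate authority headings by string similarity. The candidate headings are hypothetical, and this is not ANAC itself.

```python
# Toy name-to-authority matching: normalise, then rank candidate headings.
import re
from difflib import SequenceMatcher

def normalise(name: str) -> str:
    name = re.sub(r"[.,]", " ", name.lower())
    return " ".join(sorted(name.split()))  # order-insensitive token form

def best_heading(name: str, headings: list[str]) -> tuple[str, float]:
    scored = [(h, SequenceMatcher(None, normalise(name), normalise(h)).ratio())
              for h in headings]
    return max(scored, key=lambda x: x[1])

candidates = ["Levy, Lester S.", "Levy, Leonard W.", "Levi, Primo"]  # hypothetical
print(best_heading("Lester S. Levy", candidates))
```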
  15. Kenney, A.R.; McGovern, N.Y.; Martinez, I.T.; Heidig, L.J.: Google meets eBay : what academic librarians can learn from alternative information providers (2003) 0.04
    0.04098641 = product of:
      0.08197282 = sum of:
        0.01804199 = weight(_text_:for in 1200) [ClassicSimilarity], result of:
          0.01804199 = score(doc=1200,freq=12.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.20324993 = fieldWeight in 1200, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.03125 = fieldNorm(doc=1200)
        0.06393083 = weight(_text_:computing in 1200) [ClassicSimilarity], result of:
          0.06393083 = score(doc=1200,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.24445872 = fieldWeight in 1200, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.03125 = fieldNorm(doc=1200)
      0.5 = coord(2/4)
    
    Abstract
     In April 2002, the dominant Internet search engine, Google, introduced a beta version of its expert service, Google Answers, with little fanfare. Almost immediately the buzz within the information community focused on implications for reference librarians. Google had already been lauded as the cheaper and faster alternative for finding information, and declining reference statistics and Online Public Access Catalog (OPAC) use in academic libraries had been attributed in part to its popularity. One estimate suggests that the Google search engine handles more questions in a day and a half than all the libraries in the country provide in a year. Indeed, Craig Silverstein, Google's Director of Technology, indicated that the raison d'être for the search engine was to "seem as smart as a reference librarian," even as he acknowledged that this goal was "hundreds of years away". Bill Arms had reached a similar conclusion regarding the more nuanced reference functions in a thought-provoking article in this journal on automating digital libraries. But with the launch of Google Answers, the power of "brute force computing" and simple algorithms could be combined with human intelligence to represent a market-driven alternative to library reference services. Google Answers is part of a much larger trend to provide networked reference assistance. Expert services have sprung up in both the commercial and non-profit sector. Libraries too have responded to the Web, providing a suite of services through the virtual reference desk (VRD) movement, from email reference to chat reference to collaborative services that span the globe. As the Internet's content continues to grow and deepen - encompassing over 40 million web sites - it has been met by a groundswell of services to find and filter information. These services include an extensive range from free to fee-based, cost-recovery to for-profit, and library providers to other information providers - both new and traditional. As academic libraries look towards the future in a dynamic and competitive information landscape, what implications do these services have for their programs, and what can be learned from them to improve library offerings? This paper presents the results of a modest study conducted by Cornell University Library (CUL) to compare and contrast its digital reference services with those of Google Answers. The study provided an opportunity for librarians to shift their focus from fearing the impact of Google, as usurper of the library's role and diluter of the academic experience, to gaining insights into how Google's approach to service development and delivery has made it so attractive.
  16. Gladney, H.M.; Bennett, J.L.: What do we mean by authentic? : what's the real McCoy? (2003) 0.04
    0.04098641 = product of:
      0.08197282 = sum of:
        0.01804199 = weight(_text_:for in 1201) [ClassicSimilarity], result of:
          0.01804199 = score(doc=1201,freq=12.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.20324993 = fieldWeight in 1201, product of:
              3.4641016 = tf(freq=12.0), with freq of:
                12.0 = termFreq=12.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.03125 = fieldNorm(doc=1201)
        0.06393083 = weight(_text_:computing in 1201) [ClassicSimilarity], result of:
          0.06393083 = score(doc=1201,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.24445872 = fieldWeight in 1201, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.03125 = fieldNorm(doc=1201)
      0.5 = coord(2/4)
    
    Abstract
    Authenticity is among digital document security properties needing attention. Literature focused on preservation reveals uncertainty - even confusion - about what we might mean by authentic. The current article provides a definition that spans vernacular usage of "authentic", ranging from digital documents through material artifacts to natural objects. We accomplish this by modeling entity transmission through time and space by signal sequences and object representations at way stations, and by carefully distinguishing objective facts from subjective values and opinions. Our model can be used to clarify other words that denote information quality, such as "evidence", "essential", and "useful". Digital documents are becoming important in most kinds of human activity. Whenever we buy something valuable, agree to a contract, design and build a machine, or provide a service, we should understand exactly what we intend and be ready to describe this as accurately as the occasion demands. This makes worthwhile whatever care is needed to devise definitions that are sufficiently precise and distinct from each other to explain what we are doing and to minimize community confusion. When we set out, some months ago, to describe answers to the open technical challenges of digital preservation, we took for granted the existence of a broad, unambiguous definition for authentic. Document authenticity is of fundamental importance not only for scholarly work, but also for practical affairs, including legal matters, regulatory requirements, military and other governmental information, and financial transactions. Trust, and evidence for deciding what can be trusted as authentic are considered in many works about digital preservation. These topics are broad, deep, and subtle, raising many questions. Among these, the current work addresses a single question, "What is a useful meaning of authentic or of authenticity for digital documents - a meaning that is not itself a source of confusion?" Progress in managing digital information would be hampered without a clear answer that is sufficiently objective to guide the evaluation of communication and computing technology. Our approach to constructing an answer to this question is to break each object transmission into pieces whose treatment we can describe explicitly and with attention to potential imperfections.
  17. Dextre Clarke, S.G.: Challenges and opportunities for KOS standards (2007) 0.04
    0.040648535 = product of:
      0.08129707 = sum of:
        0.03645792 = weight(_text_:for in 4643) [ClassicSimilarity], result of:
          0.03645792 = score(doc=4643,freq=4.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.41071242 = fieldWeight in 4643, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.109375 = fieldNorm(doc=4643)
        0.04483915 = product of:
          0.0896783 = sum of:
            0.0896783 = weight(_text_:22 in 4643) [ClassicSimilarity], result of:
              0.0896783 = score(doc=4643,freq=2.0), product of:
                0.16556148 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.047278564 = queryNorm
                0.5416616 = fieldWeight in 4643, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.109375 = fieldNorm(doc=4643)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Content
     Contribution on the occasion of the seminar "Tools for knowledge organization - ISKO UK Seminar", 4 September 2007
    Date
    22. 9.2007 15:41:14
  18. Robbio, A. de; Maguolo, D.; Marini, A.: Scientific and general subject classifications in the digital world (2001) 0.04
    0.04020042 = product of:
      0.08040084 = sum of:
        0.01647001 = weight(_text_:for in 2) [ClassicSimilarity], result of:
          0.01647001 = score(doc=2,freq=10.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.18554096 = fieldWeight in 2, product of:
              3.1622777 = tf(freq=10.0), with freq of:
                10.0 = termFreq=10.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.03125 = fieldNorm(doc=2)
        0.06393083 = weight(_text_:computing in 2) [ClassicSimilarity], result of:
          0.06393083 = score(doc=2,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.24445872 = fieldWeight in 2, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.03125 = fieldNorm(doc=2)
      0.5 = coord(2/4)
    
    Abstract
    In the present work we discuss opportunities, problems, tools and techniques encountered when interconnecting discipline-specific subject classifications, primarily organized as search devices in bibliographic databases, with general classifications originally devised for book shelving in public libraries. We first state the fundamental distinction between topical (or subject) classifications and object classifications. Then we trace the structural limitations that have constrained subject classifications since their library origins, and the devices that were used to overcome the gap with genuine knowledge representation. After recalling some general notions on structure, dynamics and interferences of subject classifications and of the objects they refer to, we sketch a synthetic overview on discipline-specific classifications in Mathematics, Computing and Physics, on one hand, and on general classifications on the other. In this setting we present The Scientific Classifications Page, which collects groups of Web pages produced by a pool of software tools for developing hypertextual presentations of single or paired subject classifications from sequential source files, as well as facilities for gathering information from KWIC lists of classification descriptions. Further we propose a concept-oriented methodology for interconnecting subject classifications, with the concrete support of a relational analysis of the whole Mathematics Subject Classification through its evolution since 1959. Finally, we recall a very basic method for interconnection provided by coreference in bibliographic records among index elements from different systems, and point out the advantages of establishing the conditions of a more widespread application of such a method. A part of these contents was presented under the title Mathematics Subject Classification and related Classifications in the Digital World at the Eighth International Conference Crimea 2001, "Libraries and Associations in the Transient World: New Technologies and New Forms of Cooperation", Sudak, Ukraine, June 9-17, 2001, in a special session on electronic libraries, electronic publishing and electronic information in science chaired by Bernd Wegner, Editor-in-Chief of Zentralblatt MATH.
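     The KWIC lists mentioned above can be illustrated briefly: a keyword-in-context index rotates each classification caption around every significant word. The captions and stopword list below are hypothetical examples.

```python
# Toy KWIC (keyword-in-context) index over classification captions.
STOPWORDS = {"and", "of", "the", "in"}

def kwic(captions: list[str]) -> list[tuple[str, str]]:
    entries = []
    for caption in captions:
        words = caption.split()
        for i, w in enumerate(words):
            if w.lower() not in STOPWORDS:
                context = " ".join(words[:i]) + " _ " + " ".join(words[i + 1:])
                entries.append((w.lower(), context.strip()))
    return sorted(entries)

for key, ctx in kwic(["Theory of computation", "Numerical analysis"]):  # hypothetical captions
    print(f"{key:12} {ctx}")
```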
  19. Page, A.: ¬The search is over : the search-engines secrets of the pros (1996) 0.04
    0.039956767 = product of:
      0.15982707 = sum of:
        0.15982707 = weight(_text_:computing in 5670) [ClassicSimilarity], result of:
          0.15982707 = score(doc=5670,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.6111468 = fieldWeight in 5670, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.078125 = fieldNorm(doc=5670)
      0.25 = coord(1/4)
    
    Source
    PC computing. 1996, Oct., S. -
  20. Arms, W.Y.; Blanchi, C.; Overly, E.A.: ¬An architecture for information in digital libraries (1997) 0.04
    0.03913265 = product of:
      0.0782653 = sum of:
        0.022325827 = weight(_text_:for in 1260) [ClassicSimilarity], result of:
          0.022325827 = score(doc=1260,freq=24.0), product of:
            0.08876751 = queryWeight, product of:
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.047278564 = queryNorm
            0.25150898 = fieldWeight in 1260, product of:
              4.8989797 = tf(freq=24.0), with freq of:
                24.0 = termFreq=24.0
              1.8775425 = idf(docFreq=18385, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1260)
        0.055939477 = weight(_text_:computing in 1260) [ClassicSimilarity], result of:
          0.055939477 = score(doc=1260,freq=2.0), product of:
            0.26151994 = queryWeight, product of:
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.047278564 = queryNorm
            0.21390139 = fieldWeight in 1260, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              5.5314693 = idf(docFreq=475, maxDocs=44218)
              0.02734375 = fieldNorm(doc=1260)
      0.5 = coord(2/4)
    
    Abstract
    Flexible organization of information is one of the key design challenges in any digital library. For the past year, we have been working with members of the National Digital Library Project (NDLP) at the Library of Congress to build an experimental system to organize and store library collections. This is a report on the work. In particular, we describe how a few technical building blocks are used to organize the material in collections, such as the NDLP's, and how these methods fit into a general distributed computing framework. The technical building blocks are part of a framework that evolved as part of the Computer Science Technical Reports Project (CSTR). This framework is described in the paper, "A Framework for Distributed Digital Object Services", by Robert Kahn and Robert Wilensky (1995). The main building blocks are: "digital objects", which are used to manage digital material in a networked environment; "handles", which identify digital objects and other network resources; and "repositories", in which digital objects are stored. These concepts are amplified in "Key Concepts in the Architecture of the Digital Library", by William Y. Arms (1995). In summer 1995, after earlier experimental development, work began on the implementation of a full digital library system based on this framework. In addition to Kahn/Wilensky and Arms, several working papers further elaborate on the design concepts. A paper by Carl Lagoze and David Ely, "Implementation Issues in an Open Architectural Framework for Digital Object Services", delves into some of the repository concepts. The initial repository implementation was based on a paper by Carl Lagoze, Robert McGrath, Ed Overly and Nancy Yeager, "A Design for Inter-Operable Secure Object Stores (ISOS)". Work on the handle system, which began in 1992, is described in a series of papers that can be found on the Handle Home Page. The National Digital Library Program (NDLP) at the Library of Congress is a large scale project to convert historic collections to digital form and make them widely available over the Internet. The program is described in two articles by Caroline R. Arms, "Historical Collections for the National Digital Library". The NDLP itself draws on experience gained through the earlier American Memory Program. Based on this work, we have built a pilot system that demonstrates how digital objects can be used to organize complex materials, such as those found in the NDLP. The pilot was demonstrated to members of the library in July 1996. The pilot system includes the handle system for identifying digital objects, a pilot repository to store them, and two user interfaces: one designed for librarians to manage digital objects in the repository, the other for library patrons to access the materials stored in the repository. Materials from the NDLP's Coolidge Consumerism compilation have been deposited into the pilot repository. They include a variety of photographs and texts, converted to digital form. The pilot demonstrates the use of handles for identifying such material, the use of meta-objects for managing sets of digital objects, and the choice of metadata. We are now implementing an enhanced prototype system for completion in early 1997.
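     A schematic sketch of the three building blocks named in the abstract (digital objects, handles, repositories) as plain data structures; the handle values and metadata are hypothetical, and this is not the NDLP pilot implementation.

```python
# Schematic sketch: a repository maps handles to digital objects that
# bundle content references and metadata (hypothetical values throughout).
from dataclasses import dataclass, field

@dataclass
class DigitalObject:
    handle: str                      # globally unique identifier (hypothetical form)
    metadata: dict = field(default_factory=dict)
    content: list[str] = field(default_factory=list)  # references to stored bitstreams

class Repository:
    def __init__(self) -> None:
        self._store: dict[str, DigitalObject] = {}

    def deposit(self, obj: DigitalObject) -> None:
        self._store[obj.handle] = obj

    def resolve(self, handle: str) -> DigitalObject:
        return self._store[handle]

repo = Repository()
repo.deposit(DigitalObject("loc.ndlp/coolidge-0001",
                           {"title": "Sample photograph"}, ["image-001.tif"]))
print(repo.resolve("loc.ndlp/coolidge-0001").metadata["title"])
```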

Types

  • a 452
  • r 21
  • s 17
  • n 16
  • i 14
  • p 13
  • x 12
  • m 10
  • b 5
