Search (88 results, page 1 of 5)

Snowhill, L.: E-books and their future in academic libraries (2001) 0.05
```
0.045333505 = product of:
  0.22666752 = sum of:
    0.22666752 = weight(_text_:books in 1218) [ClassicSimilarity], result of:
      0.22666752 = score(doc=1218,freq=12.0), product of:
        0.24756333 = queryWeight, product of:
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.051222645 = queryNorm
        0.9155941 = fieldWeight in 1218, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1218)
  0.2 = coord(1/5)
```
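
The explain tree above follows the arithmetic of Lucene's ClassicSimilarity (TF-IDF): `tf` is the square root of the term frequency, `idf` is `1 + ln(maxDocs / (docFreq + 1))`, and the term score is `queryWeight * fieldWeight`, scaled by `coord`. The sketch below recomputes the figures for doc 1218 from the inputs shown in the tree. The function names are hypothetical helpers for illustration, not part of any Lucene API, and small differences in the last digits are expected because Lucene works in 32-bit floats.

```python
import math

def classic_idf(doc_freq, max_docs):
    # idf as printed in the explain output: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_term_score(freq, doc_freq, max_docs, query_norm, field_norm):
    # Hypothetical helper: score of one term = queryWeight * fieldWeight
    idf = classic_idf(doc_freq, max_docs)
    tf = math.sqrt(freq)                   # tf(freq=12.0) = 3.4641016
    query_weight = idf * query_norm        # queryWeight = idf * queryNorm
    field_weight = tf * idf * field_norm   # fieldWeight = tf * idf * fieldNorm
    return query_weight * field_weight

# Inputs copied from the explain tree for doc 1218 (term "books")
term = classic_term_score(freq=12.0, doc_freq=956, max_docs=44218,
                          query_norm=0.051222645, field_norm=0.0546875)
total = term * (1 / 5)                     # coord(1/5): one of five query clauses matched

print(round(term, 6), round(total, 6))
```

The two printed values should agree with the `0.22666752` term weight and the `0.045333505` final score to within float rounding.
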
Abstract

The University of California's California Digital Library (CDL) formed an Ebook Task Force in August 2000 to evaluate academic libraries' experiences with electronic books (e-books), investigate the e-book market, and develop operating guidelines, principles and potential strategies for further exploration of the use of e-books at the University of California (UC). This article, based on the findings and recommendations of the Task Force Report, briefly summarizes task force findings, and outlines issues and recommendations for making e-books viable over the long term in the academic environment, based on the long-term goals of building strong research collections and providing high level services and collections to its users.

Object

E-books
Danowski, P.: Authority files and Web 2.0 : Wikipedia and the PND. An Example (2007) 0.04
```
0.040318966 = product of:
  0.100797415 = sum of:
    0.0660976 = weight(_text_:books in 1291) [ClassicSimilarity], result of:
      0.0660976 = score(doc=1291,freq=2.0), product of:
        0.24756333 = queryWeight, product of:
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.051222645 = queryNorm
        0.2669927 = fieldWeight in 1291, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1291)
    0.03469981 = weight(_text_:22 in 1291) [ClassicSimilarity], result of:
      0.03469981 = score(doc=1291,freq=2.0), product of:
        0.17937298 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.051222645 = queryNorm
        0.19345059 = fieldWeight in 1291, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1291)
  0.4 = coord(2/5)
```
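
When more than one query term matches, as for doc 1291 above, the per-term weights are summed before `coord` is applied. The sketch below reproduces that step under the same ClassicSimilarity assumptions (`tf = sqrt(freq)`, `idf = 1 + ln(maxDocs / (docFreq + 1))`). The helper names are hypothetical, and the three query clauses that did not match are not shown in the explain output, so only their count (five clauses in total) is used.

```python
import math

MAX_DOCS = 44218          # maxDocs from the explain tree
QUERY_NORM = 0.051222645  # queryNorm from the explain tree

def term_weight(freq, doc_freq, field_norm):
    # Hypothetical helper: queryWeight * fieldWeight for a single term
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    return (idf * QUERY_NORM) * (math.sqrt(freq) * idf * field_norm)

def coord_score(term_weights, total_clauses):
    # Sum the matching term weights, then scale by the fraction of clauses matched
    return sum(term_weights) * len(term_weights) / total_clauses

# Inputs copied from the explain tree for doc 1291
books = term_weight(freq=2.0, doc_freq=956, field_norm=0.0390625)
num22 = term_weight(freq=2.0, doc_freq=3622, field_norm=0.0390625)
score = coord_score([books, num22], total_clauses=5)   # coord(2/5)

print(round(books, 6), round(num22, 6), round(score, 6))
```

The rarer term ("books", docFreq=956) contributes roughly twice as much as the more common one ("22", docFreq=3622), even though both occur twice in the document.
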
Abstract

More and more users index everything on their own in Web 2.0. There are services for links, videos, pictures, books, encyclopaedic articles and scientific articles. All these services are library independent. But must that really be so? Can't libraries use their experience and tools to make user indexing better? Drawing on the experience of a project run by the German-language Wikipedia together with the German person authority file (Personen Namen Datei - PND) located at the German National Library (Deutsche Nationalbibliothek), I would like to show what is possible: how users can and will use the authority files, if we let them. We will look at how the project worked and what we can learn for future projects. Conclusions: authority files can have a role in Web 2.0; there must be an open interface/service for retrieval; everything that is indexed on the net with authority files can easily be integrated into a federated search; O'Reilly: you have to find ways for your data to become more important the more it is used.

Content

Lecture given at the workshop: "Extending the multilingual capacity of The European Library in the EDL project Stockholm, Swedish National Library, 22-23 November 2007".
Lavoie, B.; Connaway, L.S.; Dempsey, L.: Anatomy of aggregate collections : the example of Google print for libraries (2005) 0.04
```
0.035804212 = product of:
  0.08951053 = sum of:
    0.06869064 = weight(_text_:books in 1184) [ClassicSimilarity], result of:
      0.06869064 = score(doc=1184,freq=6.0), product of:
        0.24756333 = queryWeight, product of:
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.051222645 = queryNorm
        0.27746695 = fieldWeight in 1184, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.0234375 = fieldNorm(doc=1184)
    0.020819884 = weight(_text_:22 in 1184) [ClassicSimilarity], result of:
      0.020819884 = score(doc=1184,freq=2.0), product of:
        0.17937298 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.051222645 = queryNorm
        0.116070345 = fieldWeight in 1184, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.0234375 = fieldNorm(doc=1184)
  0.4 = coord(2/5)
```
Abstract

Google's December 2004 announcement of its intention to collaborate with five major research libraries - Harvard University, the University of Michigan, Stanford University, the University of Oxford, and the New York Public Library - to digitize and surface their print book collections in the Google searching universe has, predictably, stirred conflicting opinion, with some viewing the project as a welcome opportunity to enhance the visibility of library collections in new environments, and others wary of Google's prospective role as gateway to these collections. The project has been vigorously debated on discussion lists and blogs, with the participating libraries commonly referred to as "the Google 5". One point most observers seem to concede is that the questions raised by this initiative are both timely and significant. The Google Print Library Project (GPLP) has galvanized a long overdue, multi-faceted discussion about library print book collections. The print book is core to library identity and practice, but in an era of zero-sum budgeting, it is almost inevitable that print book budgets will decline as budgets for serials, digital resources, and other materials expand. As libraries re-allocate resources to accommodate changing patterns of user needs, print book budgets may be adversely impacted. Of course, the degree of impact will depend on a library's perceived mission. A public library may expect books to justify their shelf-space, with de-accession the consequence of minimal use. A national library, on the other hand, has a responsibility to the scholarly and cultural record and may seek to collect comprehensively within particular areas, with the attendant obligation to secure the long-term retention of its print book collections. 
The combination of limited budgets, changing user needs, and differences in library collection strategies underscores the need to think about a collective, or system-wide, print book collection - in particular, how can an inter-institutional system be organized to achieve goals that would be difficult, and/or prohibitively expensive, for any one library to undertake individually [4]? Mass digitization programs like GPLP cast new light on these and other issues surrounding the future of library print book collections, but at this early stage, it is light that illuminates only dimly. It will be some time before GPLP's implications for libraries and library print book collections can be fully appreciated and evaluated. But the strong interest and lively debate generated by this initiative suggest that some preliminary analysis - premature though it may be - would be useful, if only to undertake a rough mapping of the terrain over which GPLP potentially will extend. At the least, some early perspective helps shape interesting questions for the future, when the boundaries of GPLP become settled, workflows for producing and managing the digitized materials become systematized, and usage patterns within the GPLP framework begin to emerge.
This article offers some perspectives on GPLP in light of what is known about library print book collections in general, and those of the Google 5 in particular, from information in OCLC's WorldCat bibliographic database and holdings file. Questions addressed include:

* Coverage: What proportion of the system-wide print book collection will GPLP potentially cover? What is the degree of holdings overlap across the print book collections of the five participating libraries?
* Language: What is the distribution of languages associated with the print books held by the GPLP libraries? Which languages are predominant?
* Copyright: What proportion of the GPLP libraries' print book holdings are out of copyright?
* Works: How many distinct works are represented in the holdings of the GPLP libraries? How does a focus on works impact coverage and holdings overlap?
* Convergence: What are the effects on coverage of using a different set of five libraries? What are the effects of adding the holdings of additional libraries to those of the GPLP libraries, and how do these effects vary by library type?

These questions certainly do not exhaust the analytical possibilities presented by GPLP. More in-depth analysis might look at Google 5 coverage in particular subject areas; it also would be interesting to see how many books covered by the GPLP have already been digitized in other contexts. However, these questions are left to future studies. The purpose here is to explore a few basic questions raised by GPLP, and in doing so, provide an empirical context for the debate that is sure to continue for some time to come. A secondary objective is to lay some groundwork for a general set of questions that could be used to explore the implications of any mass digitization initiative. A suggested list of questions is provided in the conclusion of the article.

Date

26.12.2011 14:08:22

Celli, J.: The New Books Project : a prototype for re-inventing the Cataloguing-in-Publication program to meet the needs for publishers, libraries and readers in the 21st century (2001) 0.03

0.03172685 = product of:
  0.15863423 = sum of:
    0.15863423 = weight(_text_:books in 6897) [ClassicSimilarity], result of:
      0.15863423 = score(doc=6897,freq=2.0), product of:
        0.24756333 = queryWeight, product of:
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.051222645 = queryNorm
        0.6407824 = fieldWeight in 6897, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.09375 = fieldNorm(doc=6897)
  0.2 = coord(1/5)

Mann, T.: Is precoordination unnecessary in LCSH? : Are Web sites more important to catalog than books?: a reference librarian's thought on the future of bibliographic control (2000) 0.03

0.02643904 = product of:
  0.1321952 = sum of:
    0.1321952 = weight(_text_:books in 6135) [ClassicSimilarity], result of:
      0.1321952 = score(doc=6135,freq=2.0), product of:
        0.24756333 = queryWeight, product of:
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.051222645 = queryNorm
        0.5339854 = fieldWeight in 6135, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.078125 = fieldNorm(doc=6135)
  0.2 = coord(1/5)

Díaz, P.: Usability of hypermedia educational e-books (2003) 0.03
```
0.025904862 = product of:
  0.1295243 = sum of:
    0.1295243 = weight(_text_:books in 1198) [ClassicSimilarity], result of:
      0.1295243 = score(doc=1198,freq=12.0), product of:
        0.24756333 = queryWeight, product of:
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.051222645 = queryNorm
        0.52319664 = fieldWeight in 1198, product of:
          3.4641016 = tf(freq=12.0), with freq of:
            12.0 = termFreq=12.0
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.03125 = fieldNorm(doc=1198)
  0.2 = coord(1/5)
```
Abstract

To arrive at relevant and reliable conclusions concerning the usability of a hypermedia educational e-book, developers have to apply a well-defined evaluation procedure as well as a set of clear, concrete and measurable quality criteria. Evaluating an educational tool involves not only testing the user interface but also the didactic method, the instructional materials and the interaction mechanisms to prove whether or not they help users reach their goals for learning. This article presents a number of evaluation criteria for hypermedia educational e-books and describes how they are embedded into an evaluation procedure. This work is chiefly aimed at helping education developers evaluate their systems, as well as to provide them with guidance for addressing educational requirements during the design process. In recent years, more and more educational e-books are being created, whether by academics trying to keep pace with the advanced requirements of the virtual university or by publishers seeking to meet the increasing demand for educational resources that can be accessed anywhere and anytime, and that include multimedia information, hypertext links and powerful search and annotating mechanisms. To develop a useful educational e-book many things have to be considered, such as the reading patterns of users, accessibility for different types of users and computer platforms, copyright and legal issues, development of new business models and so on. Addressing usability is very important since e-books are interactive systems and, consequently, have to be designed with the needs of their users in mind. Evaluating usability involves analyzing whether systems are effective, efficient and secure for use; easy to learn and remember; and have a good utility. Any interactive system, as e-books are, has to be assessed to determine if it is really usable as well as useful. 
Such an evaluation is not only concerned with assessing the user interface but is also aimed at analyzing whether the system can be used in an efficient way to meet the needs of its users - who in the case of educational e-books are learners and teachers. Evaluation provides the opportunity to gather valuable information about design decisions. However, to be successful the evaluation has to be carefully planned and prepared so developers collect appropriate and reliable data from which to draw relevant conclusions.
Crane, G.: What do you do with a million books? (2006) 0.02
```
0.023647794 = product of:
  0.11823897 = sum of:
    0.11823897 = weight(_text_:books in 1180) [ClassicSimilarity], result of:
      0.11823897 = score(doc=1180,freq=10.0), product of:
        0.24756333 = queryWeight, product of:
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.051222645 = queryNorm
        0.477611 = fieldWeight in 1180, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.03125 = fieldNorm(doc=1180)
  0.2 = coord(1/5)
```
Abstract

The Greek historian Herodotus has the Athenian sage Solon estimate the lifetime of a human being at c. 26,250 days (Herodotus, The Histories, 1.32). If we could read a book on each of those days, it would take almost forty lifetimes to work through every volume in a single million book library. The continuous tradition of written European literature that began with the Iliad and Odyssey in the eighth century BCE is itself little more than a million days old. While libraries that contain more than one million items are not unusual, print libraries never possessed a million books of use to any one reader. The great libraries that took shape in the nineteenth and twentieth centuries were meta-structures, whose catalogues and finding aids allowed readers to create their own customized collections, building on the fixed classification schemes and disciplinary structures that took shape in the nineteenth century. The digital libraries of the early twenty-first century can be searched and their contents transmitted around the world. They can contain time-based media, images, quantitative data, and a far richer array of content than print, with visualization technologies blurring the boundaries between library and museum. But our digital libraries remain filled with digital incunabula - digital objects whose form remains firmly rooted in traditions of print, with HTML and PDF largely mimicking the limitations of their print predecessors. Vast collections based on image books - raw digital pictures of books with searchable but uncorrected text from OCR - could arguably retard our long-term progress, reinforcing the hegemony of structures that evolved to minimize the challenges of a world where paper was the only medium of distribution and where humans alone could read. Already the books in a digital library are beginning to read one another and to confer among themselves before creating a new synthetic document for review by their human readers.
Bailey, C.W. Jr.: Scholarly electronic publishing bibliography (2003) 0.02
```
0.022434268 = product of:
  0.11217134 = sum of:
    0.11217134 = weight(_text_:books in 1656) [ClassicSimilarity], result of:
      0.11217134 = score(doc=1656,freq=4.0), product of:
        0.24756333 = queryWeight, product of:
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.051222645 = queryNorm
        0.45310158 = fieldWeight in 1656, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.046875 = fieldNorm(doc=1656)
  0.2 = coord(1/5)
```
Abstract

This selective bibliography presents over 1,900 articles, books, and other printed and electronic sources that are useful in understanding scholarly electronic publishing efforts on the Internet

Content

Table of Contents

- 1 Economic Issues*
- 2 Electronic Books and Texts
  - 2.1 Case Studies and History
  - 2.2 General Works*
  - 2.3 Library Issues*
- 3 Electronic Serials
  - 3.1 Case Studies and History
  - 3.2 Critiques
  - 3.3 Electronic Distribution of Printed Journals
  - 3.4 General Works*
  - 3.5 Library Issues*
  - 3.6 Research*
- 4 General Works*
- 5 Legal Issues
  - 5.1 Intellectual Property Rights*
  - 5.2 License Agreements
  - 5.3 Other Legal Issues
- 6 Library Issues
  - 6.1 Cataloging, Identifiers, Linking, and Metadata*
  - 6.2 Digital Libraries*
  - 6.3 General Works*
  - 6.4 Information Integrity and Preservation*
- 7 New Publishing Models*
- 8 Publisher Issues
  - 8.1 Digital Rights Management*
- 9 Repositories and E-Prints*
- Appendix A. Related Bibliographies by the Same Author
- Appendix B. About the Author

Mitchell, J.S.: DDC 22 : an introduction (2003) 0.02
```
0.021725517 = product of:
  0.10862758 = sum of:
    0.10862758 = weight(_text_:22 in 1936) [ClassicSimilarity], result of:
      0.10862758 = score(doc=1936,freq=10.0), product of:
        0.17937298 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.051222645 = queryNorm
        0.6055961 = fieldWeight in 1936, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1936)
  0.2 = coord(1/5)
```
Abstract

Dewey Decimal Classification and Relative Index, Edition 22 (DDC 22) will be issued simultaneously in print and web versions in July 2003. The new edition is the first full print update to the Dewey Decimal Classification system in seven years - it includes several significant updates and many new numbers and topics. DDC 22 also features some fundamental structural changes that have been introduced with the goals of promoting classifier efficiency and improving the DDC for use in a variety of applications in the web environment. Most importantly, the content of the new edition has been shaped by the needs and recommendations of Dewey users around the world. The worldwide user community has an important role in shaping the future of the DDC.

Object

DDC-22
Beagle, D.: Visualizing keyword distribution across multidisciplinary c-space (2003) 0.02
```
0.020985337 = product of:
  0.10492668 = sum of:
    0.10492668 = weight(_text_:books in 1202) [ClassicSimilarity], result of:
      0.10492668 = score(doc=1202,freq=14.0), product of:
        0.24756333 = queryWeight, product of:
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.051222645 = queryNorm
        0.42383775 = fieldWeight in 1202, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.0234375 = fieldNorm(doc=1202)
  0.2 = coord(1/5)
```
Abstract

The concept of c-space is proposed as a visualization schema relating containers of content to cataloging surrogates and classification structures. Possible applications of keyword vector clusters within c-space could include improved retrieval rates through the use of captioning within visual hierarchies, tracings of semantic bleeding among subclasses, and access to buried knowledge within subject-neutral publication containers. The Scholastica Project is described as one example, following a tradition of research dating back to the 1980's. Preliminary focus group assessment indicates that this type of classification rendering may offer digital library searchers enriched entry strategies and an expanded range of re-entry vocabularies. Those of us who work in traditional libraries typically assume that our systems of classification: Library of Congress Classification (LCC) and Dewey Decimal Classification (DDC), are descriptive rather than prescriptive. In other words, LCC classes and subclasses approximate natural groupings of texts that reflect an underlying order of knowledge, rather than arbitrary categories prescribed by librarians to facilitate efficient shelving. Philosophical support for this assumption has traditionally been found in a number of places, from the archetypal tree of knowledge, to Aristotelian categories, to the concept of discursive formations proposed by Michel Foucault. Gary P. Radford has elegantly described an encounter with Foucault's discursive formations in the traditional library setting: "Just by looking at the titles on the spines, you can see how the books cluster together...You can identify those books that seem to form the heart of the discursive formation and those books that reside on the margins. Moving along the shelves, you see those books that tend to bleed over into other classifications and that straddle multiple discursive formations. 
You can physically and sensually experience...those points that feel like state borders or national boundaries, those points where one subject ends and another begins, or those magical places where one subject has morphed into another..."
But what happens to this awareness in a digital library? Can discursive formations be represented in cyberspace, perhaps through diagrams in a visualization interface? And would such a schema be helpful to a digital library user? To approach this question, it is worth taking a moment to reconsider what Radford is looking at. First, he looks at titles to see how the books cluster. To illustrate, I scanned one hundred books on the shelves of a college library under subclass HT 101-395, defined by the LCC subclass caption as Urban groups. The City. Urban sociology. Of the first 100 titles in this sequence, fifty included the word "urban" or variants (e.g. "urbanization"). Another thirty-five used the word "city" or variants. These keywords appear to mark their titles as the heart of this discursive formation. The scattering of titles not using "urban" or "city" used related terms such as "town," "community," or in one case "skyscrapers." So we immediately see some empirical correlation between keywords and classification. But we also see a problem with the commonly used search technique of title-keyword. A student interested in urban studies will want to know about this entire subclass, and may wish to browse every title available therein. A title-keyword search on "urban" will retrieve only half of the titles, while a search on "city" will retrieve just over a third. There will be no overlap, since no titles in this sample contain both words. The only place where both words appear in a common string is in the LCC subclass caption, but captions are not typically indexed in library Online Public Access Catalogs (OPACs). In a traditional library, this problem is mitigated when the student goes to the shelf looking for any one of the books and suddenly discovers a much wider selection than the keyword search had led him to expect. But in a digital library, the issue of non-retrieval can be more problematic, as studies have indicated. 
Micco and Popp reported that, in a study funded partly by the U.S. Department of Education, 65 of 73 unskilled users searching for material on U.S./Soviet foreign relations found some material but never realized they had missed a large percentage of what was in the database.
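
The recall figures in the "urban"/"city" example above can be checked with a few lines of set arithmetic. This is a minimal sketch: the title IDs are synthetic placeholders, and only the counts (100 titles, 50 containing "urban", 35 containing "city", no title containing both) come from the abstract.

```python
# Synthetic title IDs standing in for the 100 sampled titles in subclass HT 101-395
TOTAL_TITLES = 100
urban = set(range(0, 50))    # 50 titles containing "urban" or a variant
city = set(range(50, 85))    # 35 titles containing "city" or a variant

def recall(retrieved, relevant_count=TOTAL_TITLES):
    # Share of the subclass that a title-keyword search actually retrieves
    return len(retrieved) / relevant_count

print(recall(urban))           # search on "urban" alone
print(recall(city))            # search on "city" alone
print(recall(urban | city))    # search on "urban" OR "city"
print(len(urban & city))       # overlap between the two result sets
```

Even the combined search leaves 15 of the 100 titles unretrieved, which is the gap that indexing the subclass caption would be meant to close.
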

Van der Veer Martens, B.: Do citation systems represent theories of truth? (2001) 0.02

0.019629175 = product of:
  0.09814587 = sum of:
    0.09814587 = weight(_text_:22 in 3925) [ClassicSimilarity], result of:
      0.09814587 = score(doc=3925,freq=4.0), product of:
        0.17937298 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.051222645 = queryNorm
        0.54716086 = fieldWeight in 3925, product of:
          2.0 = tf(freq=4.0), with freq of:
            4.0 = termFreq=4.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.078125 = fieldNorm(doc=3925)
  0.2 = coord(1/5)

Date: 22. 7.2006 15:22:28

Dextre Clarke, S.G.: Challenges and opportunities for KOS standards (2007) 0.02

0.019431893 = product of:
  0.09715946 = sum of:
    0.09715946 = weight(_text_:22 in 4643) [ClassicSimilarity], result of:
      0.09715946 = score(doc=4643,freq=2.0), product of:
        0.17937298 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.051222645 = queryNorm
        0.5416616 = fieldWeight in 4643, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.109375 = fieldNorm(doc=4643)
  0.2 = coord(1/5)

Date: 22. 9.2007 15:41:14

Cohen, D.J.: From Babel to knowledge : data mining large digital collections (2006) 0.02
```
0.018317504 = product of:
  0.09158752 = sum of:
    0.09158752 = weight(_text_:books in 1178) [ClassicSimilarity], result of:
      0.09158752 = score(doc=1178,freq=6.0), product of:
        0.24756333 = queryWeight, product of:
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.051222645 = queryNorm
        0.36995593 = fieldWeight in 1178, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          4.8330836 = idf(docFreq=956, maxDocs=44218)
          0.03125 = fieldNorm(doc=1178)
  0.2 = coord(1/5)
```
Abstract

In Jorge Luis Borges's curious short story The Library of Babel, the narrator describes an endless collection of books stored from floor to ceiling in a labyrinth of countless hexagonal rooms. The pages of the library's books seem to contain random sequences of letters and spaces; occasionally a few intelligible words emerge in the sea of paper and ink. Nevertheless, readers diligently, and exasperatingly, scan the shelves for coherent passages. The narrator himself has wandered numerous rooms in search of enlightenment, but with resignation he simply awaits his death and burial - which Borges explains (with signature dark humor) consists of being tossed unceremoniously over the library's banister. Borges's nightmare, of course, is a cursed vision of the research methods of disciplines such as literature, history, and philosophy, where the careful reading of books, one after the other, is supposed to lead inexorably to knowledge and understanding. Computer scientists would approach Borges's library far differently. Employing the information theory that forms the basis for search engines and other computerized techniques for assessing in one fell swoop large masses of documents, they would quickly realize the collection's incoherence through sampling and statistical methods - and wisely start looking for the library's exit. These computational methods, which allow us to find patterns, determine relationships, categorize documents, and extract information from massive corpuses, will form the basis for new tools for research in the humanities and other disciplines in the coming decade. For the past three years I have been experimenting with how to provide such end-user tools - that is, tools that harness the power of vast electronic collections while hiding much of their complicated technical plumbing. 
In particular, I have made extensive use of the application programming interfaces (APIs) the leading search engines provide for programmers to query their databases directly (from server to server without using their web interfaces). In addition, I have explored how one might extract information from large digital collections, from the well-curated lexicographic database WordNet to the democratic (and poorly curated) online reference work Wikipedia. While processing these digital corpuses is currently an imperfect science, even now useful tools can be created by combining various collections and methods for searching and analyzing them. And more importantly, these nascent services suggest a future in which information can be gleaned from, and sense can be made out of, even imperfect digital libraries of enormous scale. A brief examination of two approaches to data mining large digital collections hints at this future, while also providing some lessons about how to get there.

Zumer, M.; Clavel, G.: EDLproject : one more step towards the European digital library (2007) 0.02

0.016655907 = product of:
  0.083279535 = sum of:
    0.083279535 = weight(_text_:22 in 3184) [ClassicSimilarity], result of:
      0.083279535 = score(doc=3184,freq=2.0), product of:
        0.17937298 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.051222645 = queryNorm
        0.46428138 = fieldWeight in 3184, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.09375 = fieldNorm(doc=3184)
  0.2 = coord(1/5)

Content: Lecture given at the workshop: "Extending the multilingual capacity of The European Library in the EDL project Stockholm, Swedish National Library, 22-23 November 2007".

Boleda, G.; Evert, S.: Multiword expressions : a pain in the neck of lexical semantics (2009) 0.02

0.016655907 = product of:
  0.083279535 = sum of:
    0.083279535 = weight(_text_:22 in 4888) [ClassicSimilarity], result of:
      0.083279535 = score(doc=4888,freq=2.0), product of:
        0.17937298 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.051222645 = queryNorm
        0.46428138 = fieldWeight in 4888, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.09375 = fieldNorm(doc=4888)
  0.2 = coord(1/5)

Date: 1. 3.2013 14:56:22

Bourdon, F.: Funktionale Anforderungen an bibliographische Datensätze und ein internationales Nummernsystem für Normdaten : wie weit kann Normierung durch Technik unterstützt werden? (2001) 0.02

0.016655907 = product of:
  0.083279535 = sum of:
    0.083279535 = weight(_text_:22 in 6888) [ClassicSimilarity], result of:
      0.083279535 = score(doc=6888,freq=2.0), product of:
        0.17937298 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.051222645 = queryNorm
        0.46428138 = fieldWeight in 6888, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.09375 = fieldNorm(doc=6888)
  0.2 = coord(1/5)

Date: 26.12.2011 12:30:22

Qin, J.; Paling, S.: Converting a controlled vocabulary into an ontology : the case of GEM (2001) 0.02

0.016655907 = product of:
  0.083279535 = sum of:
    0.083279535 = weight(_text_:22 in 3895) [ClassicSimilarity], result of:
      0.083279535 = score(doc=3895,freq=2.0), product of:
        0.17937298 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.051222645 = queryNorm
        0.46428138 = fieldWeight in 3895, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.09375 = fieldNorm(doc=3895)
  0.2 = coord(1/5)

Date: 24. 8.2005 19:20:22

Broughton, V.: Automatic metadata generation : Digital resource description without human intervention (2007) 0.02

0.016655907 = product of:
  0.083279535 = sum of:
    0.083279535 = weight(_text_:22 in 6048) [ClassicSimilarity], result of:
      0.083279535 = score(doc=6048,freq=2.0), product of:
        0.17937298 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.051222645 = queryNorm
        0.46428138 = fieldWeight in 6048, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.09375 = fieldNorm(doc=6048)
  0.2 = coord(1/5)

Date: 22. 9.2007 15:41:14

Tudhope, D.: Knowledge Organization System Services : brief review of NKOS activities and possibility of KOS registries (2007) 0.02

0.016655907 = product of:
  0.083279535 = sum of:
    0.083279535 = weight(_text_:22 in 100) [ClassicSimilarity], result of:
      0.083279535 = score(doc=100,freq=2.0), product of:
        0.17937298 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.051222645 = queryNorm
        0.46428138 = fieldWeight in 100, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.09375 = fieldNorm(doc=100)
  0.2 = coord(1/5)

Date: 22. 9.2007 15:41:14

Isaac, A.: After EDLproject : controlled Vocabularies in TELPlus (2007) 0.02

0.016655907 = product of:
  0.083279535 = sum of:
    0.083279535 = weight(_text_:22 in 116) [ClassicSimilarity], result of:
      0.083279535 = score(doc=116,freq=2.0), product of:
        0.17937298 = queryWeight, product of:
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.051222645 = queryNorm
        0.46428138 = fieldWeight in 116, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.5018296 = idf(docFreq=3622, maxDocs=44218)
          0.09375 = fieldNorm(doc=116)
  0.2 = coord(1/5)

Content: Lecture given at the workshop: "Extending the multilingual capacity of The European Library in the EDL project Stockholm, Swedish National Library, 22-23 November 2007".
