Search (129 results, page 1 of 7)

  • language_ss:"e"
  • theme_ss:"Metadaten"
  • type_ss:"a"
  1. Pope, J.T.; Holley, R.P.: Google Book Search and metadata (2011) 0.12
    0.11552312 = product of:
      0.17328468 = sum of:
        0.13227855 = weight(_text_:book in 1887) [ClassicSimilarity], result of:
          0.13227855 = score(doc=1887,freq=6.0), product of:
            0.2237077 = queryWeight, product of:
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.050679956 = queryNorm
            0.5913008 = fieldWeight in 1887, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1887)
        0.041006133 = product of:
          0.082012266 = sum of:
            0.082012266 = weight(_text_:search in 1887) [ClassicSimilarity], result of:
              0.082012266 = score(doc=1887,freq=6.0), product of:
                0.17614716 = queryWeight, product of:
                  3.475677 = idf(docFreq=3718, maxDocs=44218)
                  0.050679956 = queryNorm
                0.46558946 = fieldWeight in 1887, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.475677 = idf(docFreq=3718, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1887)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
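    The indented tree above is Lucene's "explain" output under ClassicSimilarity (tf-idf) scoring. As a minimal sketch, assuming Lucene's standard formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs/(docFreq+1)), per-term score = queryWeight x fieldWeight), the numbers for this result can be recomputed in Python:

      import math

      def term_score(freq, doc_freq, max_docs, query_norm, field_norm):
          """Recompute one weight(...) clause of a ClassicSimilarity explain tree."""
          tf = math.sqrt(freq)                               # 2.4494898 for freq=6.0
          idf = 1.0 + math.log(max_docs / (doc_freq + 1.0))  # 4.414126 for docFreq=1454
          query_weight = idf * query_norm                    # 0.2237077
          field_weight = tf * idf * field_norm               # 0.5913008 (fieldWeight)
          return query_weight * field_weight                 # 0.13227855

      book = term_score(6.0, 1454, 44218, 0.050679956, 0.0546875)
      search = term_score(6.0, 3718, 44218, 0.050679956, 0.0546875) * 0.5  # coord(1/2)
      print(round((book + search) * 2 / 3, 8))               # coord(2/3) -> ~0.11552312

    The same recipe reproduces every weight(...) clause in the score trees below.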
    
    Abstract
    This article summarizes published documents on metadata provided by Google for books scanned as part of the Google Book Search (GBS) project and provides suggestions for improvement. The faulty, misleading, and confusing metadata in current Google records can pose potentially serious problems for users of GBS. Google admits that it took data, which proved to be inaccurate, from many sources and is attempting to correct errors. Some argue that metadata is not needed with keyword searching, but optical character recognition (OCR) errors, synonym control, and materials in foreign languages make reliable metadata a requirement for academic researchers. The authors recommend that users be able to submit error reports to Google to correct faulty metadata.
    Object
    Google Book Search
  2. Roux, M.: Metadata for search engines : what can be learned from e-Sciences? (2012) 0.07
    0.07069763 = product of:
      0.10604644 = sum of:
        0.0654609 = weight(_text_:book in 96) [ClassicSimilarity], result of:
          0.0654609 = score(doc=96,freq=2.0), product of:
            0.2237077 = queryWeight, product of:
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.050679956 = queryNorm
            0.29261798 = fieldWeight in 96, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.046875 = fieldNorm(doc=96)
        0.04058554 = product of:
          0.08117108 = sum of:
            0.08117108 = weight(_text_:search in 96) [ClassicSimilarity], result of:
              0.08117108 = score(doc=96,freq=8.0), product of:
                0.17614716 = queryWeight, product of:
                  3.475677 = idf(docFreq=3718, maxDocs=44218)
                  0.050679956 = queryNorm
                0.460814 = fieldWeight in 96, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.475677 = idf(docFreq=3718, maxDocs=44218)
                  0.046875 = fieldNorm(doc=96)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    E-sciences are data-intensive sciences that make extensive use of the Web to share, collect, and process data. In this context, primary scientific data is becoming a new and challenging issue, as data must be extensively described (1) to account for the empirical conditions and results that allow interpretation and/or analysis and (2) to be understandable by the computers used for data storage and information retrieval. In this respect, metadata is a focal point, whether considered from the point of view of users, who must visualize and exploit data, or from that of search tools, which must find and retrieve information. Numerous disciplines are concerned with the issues of describing complex observations and addressing pertinent knowledge. In this paper, similarities and differences in data description and exploration strategies among disciplines in the e-sciences are examined.
    Footnote
    Cf.: http://www.igi-global.com/book/next-generation-search-engines/64420.
    Source
    Next generation search engines: advanced models for information retrieval. Eds.: C. Jouis et al.
  3. Baker, T.: ¬A grammar of Dublin Core (2000) 0.04
    0.03824898 = product of:
      0.057373464 = sum of:
        0.043640595 = weight(_text_:book in 1236) [ClassicSimilarity], result of:
          0.043640595 = score(doc=1236,freq=2.0), product of:
            0.2237077 = queryWeight, product of:
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.050679956 = queryNorm
            0.19507864 = fieldWeight in 1236, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.03125 = fieldNorm(doc=1236)
        0.013732869 = product of:
          0.027465738 = sum of:
            0.027465738 = weight(_text_:22 in 1236) [ClassicSimilarity], result of:
              0.027465738 = score(doc=1236,freq=2.0), product of:
                0.17747258 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679956 = queryNorm
                0.15476047 = fieldWeight in 1236, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=1236)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Dublin Core is often presented as a modern form of catalog card -- a set of elements (and now qualifiers) that describe resources in a complete package. Sometimes it is proposed as an exchange format for sharing records among multiple collections. The founding principle that "every element is optional and repeatable" reinforces the notion that a Dublin Core description is to be taken as a whole. This paper, in contrast, is based on a much different premise: Dublin Core is a language. More precisely, it is a small language for making a particular class of statements about resources. Like natural languages, it has a vocabulary of word-like terms, the two classes of which -- elements and qualifiers -- function within statements like nouns and adjectives; and it has a syntax for arranging elements and qualifiers into statements according to a simple pattern. Whenever tourists order a meal or ask directions in an unfamiliar language, considerate native speakers will spontaneously limit themselves to basic words and simple sentence patterns along the lines of "I am so-and-so" or "This is such-and-such". Linguists call this pidginization. In such situations, a small phrase book or translated menu can be most helpful. By analogy, today's Web has been called an Internet Commons where users and information providers from a wide range of scientific, commercial, and social domains present their information in a variety of incompatible data models and description languages. In this context, Dublin Core presents itself as a metadata pidgin for digital tourists who must find their way in this linguistically diverse landscape. Its vocabulary is small enough to learn quickly, and its basic pattern is easily grasped. It is well-suited to serve as an auxiliary language for digital libraries. This grammar starts by defining terms. It then follows a 200-year-old tradition of English grammar teaching by focusing on the structure of single statements. It concludes by looking at the growing dictionary of Dublin Core vocabulary terms -- its registry, and at how statements can be used to build the metadata equivalent of paragraphs and compositions -- the application profile.
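    As a minimal sketch of this "statement" pattern, using the familiar dc:/dcterms: prefixes but with an invented resource identifier and illustrative values (not Baker's own examples), elements and qualifiers can be rendered as simple triples:

      # A Dublin Core "statement": element (noun-like), optional qualifier
      # (adjective-like refinement), and a value. URI and values are assumptions.
      RESOURCE = "http://example.org/docs/1236"   # hypothetical identifier

      statements = [
          ("dc:creator", None,              "Baker, Thomas"),
          ("dc:date",    "dcterms:created", "2000"),
          ("dc:subject", "dcterms:LCSH",    "Dublin Core"),
      ]

      for element, qualifier, value in statements:
          term = f"{element} [{qualifier}]" if qualifier else element
          print(f'{RESOURCE}  {term}  "{value}"')

    A consumer that knows only the pidgin can drop the qualifier and keep the element value, which is the "dumb-down" reading of qualified statements.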
    Date
    26.12.2011 14:01:22
  4. Lagoze, C.: Keeping Dublin Core simple : Cross-domain discovery or resource description? (2001) 0.03
    0.031991065 = product of:
      0.047986593 = sum of:
        0.02727537 = weight(_text_:book in 1216) [ClassicSimilarity], result of:
          0.02727537 = score(doc=1216,freq=2.0), product of:
            0.2237077 = queryWeight, product of:
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.050679956 = queryNorm
            0.12192415 = fieldWeight in 1216, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.01953125 = fieldNorm(doc=1216)
        0.020711223 = product of:
          0.041422445 = sum of:
            0.041422445 = weight(_text_:search in 1216) [ClassicSimilarity], result of:
              0.041422445 = score(doc=1216,freq=12.0), product of:
                0.17614716 = queryWeight, product of:
                  3.475677 = idf(docFreq=3718, maxDocs=44218)
                  0.050679956 = queryNorm
                0.23515818 = fieldWeight in 1216, product of:
                  3.4641016 = tf(freq=12.0), with freq of:
                    12.0 = termFreq=12.0
                  3.475677 = idf(docFreq=3718, maxDocs=44218)
                  0.01953125 = fieldNorm(doc=1216)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
    
    Abstract
    Reality is messy. Individuals perceive or define objects differently. Objects may change over time, morphing into new versions of their former selves or into things altogether different. A book can give rise to a translation, derivation, or edition, and these resulting objects are related in complex ways to each other and to the people and contexts in which they were created or transformed. Providing a normalized view of such a messy reality is a precondition for managing information. From the first library catalogs, through Melvil Dewey's Decimal Classification system in the nineteenth century, to today's MARC encoding of AACR2 cataloging rules, libraries have epitomized the process of what David Levy calls "order making", whereby catalogers impose a veneer of regularity on the natural disorder of the artifacts they encounter. The pre-digital library within which the Catalog and its standards evolved was relatively self-contained and controlled. Creating and maintaining catalog records was, and still is, the task of professionals. Today's Web, in contrast, has brought together a diversity of information management communities, with a variety of order-making standards, into what Stuart Weibel has called the Internet Commons. The sheer scale of this context has motivated a search for new ways to describe and index information. Second-generation search engines such as Google can yield astonishingly good search results, while tools such as ResearchIndex for automatic citation indexing and techniques for inferring "Web communities" from constellations of hyperlinks promise even better methods for focusing queries on information from authoritative sources. Such "automated digital libraries," according to Bill Arms, promise to radically reduce the cost of managing information. Alongside the development of such automated methods, there is increasing interest in metadata as a means of imposing pre-defined order on Web content. While the size and changeability of the Web makes professional cataloging impractical, a minimal amount of information ordering, such as that represented by the Dublin Core (DC), may vastly improve the quality of an automatic index at low cost; indeed, recent work suggests that some types of simple description may be generated with little or no human intervention.
    Metadata is not monolithic. Instead, it is helpful to think of metadata as multiple views that can be projected from a single information object. Such views can form the basis of customized information services, such as search engines. Multiple views -- different types of metadata associated with a Web resource -- can facilitate a "drill-down" search paradigm, whereby people start their searches at a high level and later narrow their focus using domain-specific search categories. In Figure 1, for example, Mona Lisa may be viewed from the perspective of non-specialized searchers, with categories that are valid across domains (who painted it and when?); in the context of a museum (when and how was it acquired?); in the geo-spatial context of a walking tour using mobile devices (where is it in the gallery?); and in a legal framework (who owns the rights to its reproduction?). Multiple descriptive views imply a modular approach to metadata. Modularity is the basis of metadata architectures such as the Resource Description Framework (RDF), which permit different communities of expertise to associate and maintain multiple metadata packages for Web resources. As noted elsewhere, static association of multiple metadata packages with resources is but one way of achieving modularity. Another method is to computationally derive order-making views customized to the current needs of a client. This paper examines the evolution and scope of the Dublin Core from this perspective of metadata modularization. Dublin Core began in 1995 with a specific goal and scope -- as an easy-to-create and maintain descriptive format to facilitate cross-domain resource discovery on the Web. Over the years, this goal of "simple metadata for coarse-granularity discovery" came to mix with another goal -- that of community and domain-specific resource description and its attendant complexity. A notion of "qualified Dublin Core" evolved whereby the model for simple resource discovery -- a set of simple metadata elements in a flat, document-centric model -- would form the basis of more complex descriptions by treating the values of its elements as entities with properties ("component elements") in their own right.
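    A minimal sketch of the "multiple views" idea, mirroring the Mona Lisa example above (field names and values are illustrative assumptions):

      # Project different metadata packages ("views") from one information object.
      mona_lisa = {
          "title": "Mona Lisa", "creator": "Leonardo da Vinci", "created": "c. 1503",
          "accession": "INV 779", "gallery_location": "Denon wing, room 711",
          "rights_holder": "Musée du Louvre",
      }

      VIEWS = {   # which fields each community's view exposes
          "cross_domain_discovery": ["title", "creator", "created"],
          "museum": ["accession"],
          "walking_tour": ["gallery_location"],
          "legal": ["rights_holder"],
      }

      def project(obj, view):
          return {k: obj[k] for k in VIEWS[view] if k in obj}

      print(project(mona_lisa, "walking_tour"))

    Deriving views computationally, as the paper suggests, would replace the static VIEWS table with code tailored to the client's current needs.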
  5. Nabavi, M.; Karimi, E.: Metadata elements for children in theory and practice (2022) 0.03
    0.030858565 = product of:
      0.09257569 = sum of:
        0.09257569 = weight(_text_:book in 1110) [ClassicSimilarity], result of:
          0.09257569 = score(doc=1110,freq=4.0), product of:
            0.2237077 = queryWeight, product of:
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.050679956 = queryNorm
            0.41382432 = fieldWeight in 1110, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.046875 = fieldNorm(doc=1110)
      0.33333334 = coord(1/3)
    
    Abstract
    This research aimed to investigate the status of children-specific metadata elements in theory (the existing literature) and practice (metadata standards and children's digital libraries). Literature reviews, together with two cases, the children's online national libraries of Iran and Singapore, are used to identify children-specific metadata elements and their application. The results revealed that descriptive metadata types have been mentioned more often than analytical, social, and relational types; the DCMI metadata standard, alongside the LOM and ALTO metadata standards, can be used to develop an application profile for children's library catalogs. The two cases showed that they only partially cover children-specific metadata elements, and neither covers relational metadata elements. A deeper analysis of the children-specific metadata elements suggests that children's catalogs should be semantic and social. The results of this study can be insightful for children's book catalogers and children's book publishers (for marketing purposes).
  6. Dempsey, L.: Metadata (1997) 0.03
    0.029093731 = product of:
      0.08728119 = sum of:
        0.08728119 = weight(_text_:book in 46) [ClassicSimilarity], result of:
          0.08728119 = score(doc=46,freq=2.0), product of:
            0.2237077 = queryWeight, product of:
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.050679956 = queryNorm
            0.39015728 = fieldWeight in 46, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.0625 = fieldNorm(doc=46)
      0.33333334 = coord(1/3)
    
    Abstract
    The term 'metadata' is becoming commonly used to refer to a variety of types of data which describe other data. A familiar example is bibliographic data, which describes a book or a serial article. Suggests that a routine definition might be: 'metadata is data which describes attributes of a resource'. Gives some examples before looking at the Dublin Core, a simple response to the challenge of describing a wide range of network resources
  7. Dempsey, L.: Metadata (1997) 0.03
    0.029093731 = product of:
      0.08728119 = sum of:
        0.08728119 = weight(_text_:book in 107) [ClassicSimilarity], result of:
          0.08728119 = score(doc=107,freq=2.0), product of:
            0.2237077 = queryWeight, product of:
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.050679956 = queryNorm
            0.39015728 = fieldWeight in 107, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.0625 = fieldNorm(doc=107)
      0.33333334 = coord(1/3)
    
    Abstract
    The term 'metadata' is becoming commonly used to refer to a variety of types of data which describe other data. A familiar example is bibliographic data, which describes a book or a serial article. Suggests that a routine definition might be: 'Metadata is data which describes attributes of a resource'. Provides examples to expand on this before looking at the Dublin Core, a simple set of elements for describing a wide range of network resources
  8. Franklin, R.A.: Re-inventing subject access for the semantic web (2003) 0.03
    0.027261382 = product of:
      0.081784144 = sum of:
        0.081784144 = sum of:
          0.04058554 = weight(_text_:search in 2556) [ClassicSimilarity], result of:
            0.04058554 = score(doc=2556,freq=2.0), product of:
              0.17614716 = queryWeight, product of:
                3.475677 = idf(docFreq=3718, maxDocs=44218)
                0.050679956 = queryNorm
              0.230407 = fieldWeight in 2556, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.475677 = idf(docFreq=3718, maxDocs=44218)
                0.046875 = fieldNorm(doc=2556)
          0.041198608 = weight(_text_:22 in 2556) [ClassicSimilarity], result of:
            0.041198608 = score(doc=2556,freq=2.0), product of:
              0.17747258 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.050679956 = queryNorm
              0.23214069 = fieldWeight in 2556, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2556)
      0.33333334 = coord(1/3)
    
    Abstract
    First generation scholarly research on the Web lacked a firm system of authority control. Second generation Web research is beginning to model subject access with library science principles of bibliographic control and cataloguing. Harnessing the Web and organising the intellectual content with standards and controlled vocabulary provides precise search and retrieval capability, increasing relevance and efficient use of technology. Dublin Core metadata standards permit a full evaluation and cataloguing of Web resources appropriate to highly specific research needs and discovery. Current research points to a type of structure based on a system of faceted classification. This system allows the semantic and syntactic relationships to be defined. Controlled vocabulary, such as the Library of Congress Subject Headings, can be assigned, not in a hierarchical structure, but rather as descriptive facets of relating concepts. Web design features such as this are adding value to discovery and filtering out data that lack authority. The system design allows for scalability and extensibility, two technical features that are integral to future development of the digital library and resource discovery.
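    As a hedged sketch of the faceted approach described above, controlled terms can be assigned as descriptive facets of a resource rather than as one hierarchical class; the URI, facet names, and values below are illustrative assumptions, not Franklin's system:

      resource = {
          "uri": "http://example.org/r/2556",   # hypothetical
          "facets": {
              "topic":    ["Semantic Web", "Subject headings, Library of Congress"],
              "form":     ["Electronic journals"],
              "audience": ["Researchers"],
          },
      }

      def matches(res, **wanted):
          # precise retrieval: every requested facet value must be assigned
          return all(set(v) <= set(res["facets"].get(k, [])) for k, v in wanted.items())

      print(matches(resource, topic=["Semantic Web"], audience=["Researchers"]))  # True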
    Date
    30.12.2008 18:22:46
  9. Renear, A.H.; Wickett, K.M.; Urban, R.J.; Dubin, D.; Shreeves, S.L.: Collection/item metadata relationships (2008) 0.03
    0.027261382 = product of:
      0.081784144 = sum of:
        0.081784144 = sum of:
          0.04058554 = weight(_text_:search in 2623) [ClassicSimilarity], result of:
            0.04058554 = score(doc=2623,freq=2.0), product of:
              0.17614716 = queryWeight, product of:
                3.475677 = idf(docFreq=3718, maxDocs=44218)
                0.050679956 = queryNorm
              0.230407 = fieldWeight in 2623, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.475677 = idf(docFreq=3718, maxDocs=44218)
                0.046875 = fieldNorm(doc=2623)
          0.041198608 = weight(_text_:22 in 2623) [ClassicSimilarity], result of:
            0.041198608 = score(doc=2623,freq=2.0), product of:
              0.17747258 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.050679956 = queryNorm
              0.23214069 = fieldWeight in 2623, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=2623)
      0.33333334 = coord(1/3)
    
    Abstract
    Contemporary retrieval systems, which search across collections, usually ignore collection-level metadata. Alternative approaches, exploiting collection-level information, will require an understanding of the various kinds of relationships that can obtain between collection-level and item-level metadata. This paper outlines the problem and describes a project that is developing a logic-based framework for classifying collection/item metadata relationships. This framework will support (i) metadata specification developers defining metadata elements, (ii) metadata creators describing objects, and (iii) system designers implementing systems that take advantage of collection-level metadata. We present three examples of collection/item metadata relationship categories (attribute/value-propagation, value-propagation, and value-constraint) and show that even in these simple cases a precise formulation requires modal notions in addition to first-order logic. These formulations are related to recent work in information retrieval and ontology evaluation.
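    As a hedged sketch of one of these categories, a purely first-order reading of value-propagation (every value a collection has for an attribute is inherited by its items) might be written as follows; the predicate names are assumptions, and the formula deliberately omits the modal operators the authors show are also needed:

      \forall c\,\forall x\,\forall a\,\forall v\;
        \bigl( \mathrm{IsIn}(x, c) \land V(c, a, v) \bigr) \rightarrow V(x, a, v)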
    Source
    Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas
  10. Hook, P.A.; Gantchev, A.: Using combined metadata sources to visualize a small library (OBL's English Language Books) (2017) 0.03
    0.025715468 = product of:
      0.0771464 = sum of:
        0.0771464 = weight(_text_:book in 3870) [ClassicSimilarity], result of:
          0.0771464 = score(doc=3870,freq=4.0), product of:
            0.2237077 = queryWeight, product of:
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.050679956 = queryNorm
            0.34485358 = fieldWeight in 3870, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3870)
      0.33333334 = coord(1/3)
    
    Abstract
    Data from multiple knowledge organization systems are combined to provide a global overview of the content holdings of a small personal library. Subject headings and classification data are used to effectively map the combined book and topic space of the library. Although the data were harvested and manipulated by hand, the work reveals issues and potential solutions when using automated techniques to produce topic maps of much larger libraries. The small library visualized consists of the thirty-nine digital English-language books found in the Osama Bin Laden (OBL) compound in Abbottabad, Pakistan upon his death. As this list of books has garnered considerable media attention, it is worth providing a visual overview of the subject content of these books, some of which is not readily apparent from the titles. Metadata from subject headings and classification numbers was combined to create book-subject maps. Tree maps of the classification data were also produced. The books contain 328 subject headings. In order to enhance the base map with a meaningful thematic overlay, library holding count data was also harvested (and aggregated from duplicates). This additional data revealed the relative scarcity or popularity of individual books.
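    A minimal sketch of the book-subject mapping described above: invert per-book subject headings into subject-to-book edges, carrying a holdings-count overlay. All records below are illustrative assumptions, not the study's data:

      books = [
          {"title": "Book A", "subjects": ["United States", "Intelligence service"], "holdings": 2710},
          {"title": "Book B", "subjects": ["United States", "World politics"],       "holdings": 640},
      ]

      subject_map = {}                     # subject heading -> [(title, holdings)]
      for b in books:
          for s in b["subjects"]:
              subject_map.setdefault(s, []).append((b["title"], b["holdings"]))

      for subject, linked in sorted(subject_map.items()):
          print(subject, "->", linked)     # shared headings connect books in the map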
  11. Johansson, S.; Golub, K.: LibraryThing for libraries : how tag moderation and size limitations affect tag clouds (2019) 0.03
    0.025715468 = product of:
      0.0771464 = sum of:
        0.0771464 = weight(_text_:book in 5398) [ClassicSimilarity], result of:
          0.0771464 = score(doc=5398,freq=4.0), product of:
            0.2237077 = queryWeight, product of:
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.050679956 = queryNorm
            0.34485358 = fieldWeight in 5398, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.0390625 = fieldNorm(doc=5398)
      0.33333334 = coord(1/3)
    
    Abstract
    The aim of this study is to analyse differences between tags on LibraryThing's web page and tag clouds in their "LibraryThing for Libraries" service, and to assess if, and how, the LibraryThing tag moderation and limitations to the size of the tag cloud in the library catalogue affect the description of the information resource. An e-mail survey was conducted with personnel at LibraryThing, and the results were compared against tags for twenty different fiction books, collected from two different library catalogues with disparate tag cloud sizes, and LibraryThing's web page. The data were analysed using a modified version of Golder and Huberman's tag categories (2006). The results show that while LibraryThing claims to only remove the inherently personal tags, several other types of tags are found to have been discarded as well. Occasionally a certain type of tag is included in one book, and excluded in another. The comparison between the two tag cloud sizes suggests that the larger tag clouds provide a more pronounced picture regarding the contents of the book, but at the cost of an increase in the number of tags with synonymous or redundant information.
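    A minimal sketch of the two mechanisms studied here, moderation followed by a size cap on the tag cloud; the tag data and the 'personal' list are illustrative assumptions:

      from collections import Counter

      raw_tags = ["fantasy"]*42 + ["own"]*30 + ["to-read"]*25 + ["dragons"]*17 + ["magic"]*9
      PERSONAL = {"own", "to-read"}          # inherently personal tags (moderated out)

      def tag_cloud(tags, k):
          counts = Counter(t for t in tags if t not in PERSONAL)
          return counts.most_common(k)       # size limitation: keep only top-k tags

      print(tag_cloud(raw_tags, 2))          # smaller clouds drop rarer content tags like 'magic'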
  12. Wolfe, E.W.: ¬A case study in automated metadata enhancement : Natural Language Processing in the humanities (2019) 0.03
    0.025457015 = product of:
      0.076371044 = sum of:
        0.076371044 = weight(_text_:book in 5236) [ClassicSimilarity], result of:
          0.076371044 = score(doc=5236,freq=2.0), product of:
            0.2237077 = queryWeight, product of:
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.050679956 = queryNorm
            0.34138763 = fieldWeight in 5236, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5236)
      0.33333334 = coord(1/3)
    
    Abstract
    The Black Book Interactive Project at the University of Kansas (KU) is developing an expanded corpus of novels by African American authors, with an emphasis on lesser known writers and a goal of expanding research in this field. Using a custom metadata schema with an emphasis on race-related elements, each novel is analyzed for a variety of elements such as literary style, targeted content analysis, historical context, and other areas. Librarians at KU have worked to develop a variety of computational text analysis processes designed to assist with specific aspects of this metadata collection, including text mining and natural language processing, automated subject extraction based on word sense disambiguation, harvesting data from Wikidata, and other actions.
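    A hedged sketch of one enhancement step named above, automated subject extraction, reduced here to a bare frequency heuristic (the project uses word sense disambiguation and other NLP techniques; this is only an illustrative stand-in):

      import re
      from collections import Counter

      STOPWORDS = {"the", "a", "of", "and", "in", "to", "her", "his"}

      def candidate_subjects(text, k=3):
          words = re.findall(r"[a-z]+", text.lower())
          return [w for w, _ in Counter(w for w in words if w not in STOPWORDS).most_common(k)]

      print(candidate_subjects("The migration north shaped the family; migration "
                               "and work in the city shaped her life."))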
  13. Chou, C.: Purpose-driven assessment of cataloging and metadata services : transforming broken links into linked data (2019) 0.03
    0.025457015 = product of:
      0.076371044 = sum of:
        0.076371044 = weight(_text_:book in 5280) [ClassicSimilarity], result of:
          0.076371044 = score(doc=5280,freq=2.0), product of:
            0.2237077 = queryWeight, product of:
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.050679956 = queryNorm
            0.34138763 = fieldWeight in 5280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5280)
      0.33333334 = coord(1/3)
    
    Abstract
    Many primary school classrooms have book collections. Most teachers organize and maintain these collections by themselves, although some involve students in the processes. This qualitative study considers a third approach, parent-involved categorization, to understand how people without library or education training categorize books. We observed and interviewed parents and a teacher who worked together to categorize books in a kindergarten classroom. They employed multiple orthogonal organizing principles, felt that working collaboratively made the task less overwhelming, solved difficult problems pragmatically, organized books primarily to facilitate retrieval by the teacher, and left lumping and splitting decisions to the teacher.
  14. Heidorn, P.B.; Wei, Q.: Automatic metadata extraction from museum specimen labels (2008) 0.02
    0.022717819 = product of:
      0.068153456 = sum of:
        0.068153456 = sum of:
          0.033821285 = weight(_text_:search in 2624) [ClassicSimilarity], result of:
            0.033821285 = score(doc=2624,freq=2.0), product of:
              0.17614716 = queryWeight, product of:
                3.475677 = idf(docFreq=3718, maxDocs=44218)
                0.050679956 = queryNorm
              0.19200584 = fieldWeight in 2624, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.475677 = idf(docFreq=3718, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2624)
          0.034332175 = weight(_text_:22 in 2624) [ClassicSimilarity], result of:
            0.034332175 = score(doc=2624,freq=2.0), product of:
              0.17747258 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.050679956 = queryNorm
              0.19345059 = fieldWeight in 2624, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=2624)
      0.33333334 = coord(1/3)
    
    Abstract
    This paper describes the information properties of museum specimen labels and machine learning tools to automatically extract Darwin Core (DwC) and other metadata from these labels processed through Optical Character Recognition (OCR). The DwC is a metadata profile describing the core set of access points for search and retrieval of natural history collections and observation databases. Using the HERBIS Learning System (HLS), we extract 74 independent elements from these labels. The automated text extraction tools are provided as a web service so that users can reference digital images of specimens and receive back an extended Darwin Core XML representation of the content of the label. This automated extraction task is made more difficult by the high variability of museum label formats, OCR errors, and the open-class nature of some elements. In this paper we introduce our overall system architecture and variability-robust solutions, including the application of Hidden Markov and Naïve Bayes machine learning models, data cleaning, the use of field element identifiers, and specialist learning models. The techniques developed here could be adapted to any metadata extraction situation with noisy text and weakly ordered elements.
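    A hedged sketch of label-to-Darwin-Core extraction, using rule-based patterns as a stand-in for the HMM/Naïve Bayes models described above; the patterns and the sample label are illustrative assumptions, and real labels are far more variable:

      import re

      LABEL = "HERBARIUM OF KANSAS  Carex brevior  Coll. J. Smith 12 Jun 1931  Riley Co."

      PATTERNS = {
          "dwc:scientificName": r"\b([A-Z][a-z]+ [a-z]+)\b",
          "dwc:recordedBy":     r"Coll\.\s+([A-Z]\.\s*[A-Za-z]+)",
          "dwc:eventDate":      r"(\d{1,2} [A-Za-z]{3} \d{4})",
      }

      record = {dwc: m.group(1)
                for dwc, m in ((k, re.search(p, LABEL)) for k, p in PATTERNS.items()) if m}
      print(record)   # {'dwc:scientificName': 'Carex brevior', ...}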
    Source
    Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas
  15. Weibel, S.L.: Dublin Core Metadata Initiative (DCMI) : a personal history (2009) 0.02
    0.0218203 = product of:
      0.0654609 = sum of:
        0.0654609 = weight(_text_:book in 3772) [ClassicSimilarity], result of:
          0.0654609 = score(doc=3772,freq=2.0), product of:
            0.2237077 = queryWeight, product of:
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.050679956 = queryNorm
            0.29261798 = fieldWeight in 3772, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.046875 = fieldNorm(doc=3772)
      0.33333334 = coord(1/3)
    
    Footnote
    Cf.: http://www.tandfonline.com/doi/book/10.1081/E-ELIS3.
  16. Greenberg, J.: Metadata and digital information (2009) 0.02
    0.0218203 = product of:
      0.0654609 = sum of:
        0.0654609 = weight(_text_:book in 4697) [ClassicSimilarity], result of:
          0.0654609 = score(doc=4697,freq=2.0), product of:
            0.2237077 = queryWeight, product of:
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.050679956 = queryNorm
            0.29261798 = fieldWeight in 4697, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.046875 = fieldNorm(doc=4697)
      0.33333334 = coord(1/3)
    
    Content
    Available digitally at: http://dx.doi.org/10.1081/E-ELIS3-120044415. Cf.: http://www.tandfonline.com/doi/book/10.1081/E-ELIS3.
  17. White, M.: ¬The value of taxonomies, thesauri and metadata in enterprise search (2016) 0.02
    0.020324064 = product of:
      0.06097219 = sum of:
        0.06097219 = product of:
          0.12194438 = sum of:
            0.12194438 = weight(_text_:search in 2964) [ClassicSimilarity], result of:
              0.12194438 = score(doc=2964,freq=26.0), product of:
                0.17614716 = queryWeight, product of:
                  3.475677 = idf(docFreq=3718, maxDocs=44218)
                  0.050679956 = queryNorm
                0.69228697 = fieldWeight in 2964, product of:
                  5.0990195 = tf(freq=26.0), with freq of:
                    26.0 = termFreq=26.0
                  3.475677 = idf(docFreq=3718, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=2964)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Although the technical, mathematical, and linguistic principles of search date back to the early 1960s, and enterprise search applications have been commercially available since the 1980s, it is only since the launch of Microsoft SharePoint 2010 and the integration of the Apache Lucene and Solr projects in 2010 that there has been wider adoption of enterprise search applications. Surveys carried out over the last five years indicate that although enterprises accept that search applications are essential for locating information, there has not been any significant investment in search teams to support these applications. Where taxonomies, thesauri, and metadata have been used to improve the search user interface and enhance the search experience, the indications are that levels of search satisfaction are significantly higher. The challenges faced by search managers in developing and maintaining these tools include a lack of published research on their use and the difficulty of recruiting search team members with the requisite skills and experience. There would seem to be an important and immediate opportunity to bring together the research, knowledge organization, and enterprise search communities to explore how good practice in the use of taxonomies, thesauri, and metadata in enterprise search can be established, enhanced, and promoted.
  18. Bogaard, T.; Hollink, L.; Wielemaker, J.; Ossenbruggen, J. van; Hardman, L.: Metadata categorization for identifying search patterns in a digital library (2019) 0.02
    0.019526727 = product of:
      0.058580182 = sum of:
        0.058580182 = product of:
          0.117160365 = sum of:
            0.117160365 = weight(_text_:search in 5281) [ClassicSimilarity], result of:
              0.117160365 = score(doc=5281,freq=24.0), product of:
                0.17614716 = queryWeight, product of:
                  3.475677 = idf(docFreq=3718, maxDocs=44218)
                  0.050679956 = queryNorm
                0.66512775 = fieldWeight in 5281, product of:
                  4.8989797 = tf(freq=24.0), with freq of:
                    24.0 = termFreq=24.0
                  3.475677 = idf(docFreq=3718, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5281)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Abstract
    Purpose: For digital libraries, it is useful to understand how users search in a collection. Investigating search patterns can help them to improve the user interface, collection management and search algorithms. However, search patterns may vary widely in different parts of a collection. The purpose of this paper is to demonstrate how to identify these search patterns within a well-curated historical newspaper collection using the existing metadata. Design/methodology/approach: The authors analyzed search logs combined with metadata records describing the content of the collection, using this metadata to create subsets in the logs corresponding to different parts of the collection. Findings: The study shows that faceted search is more prevalent than non-faceted search in terms of number of unique queries, time spent, clicks and downloads. Distinct search patterns are observed in different parts of the collection, corresponding to historical periods, geographical regions or subject matter. Originality/value: First, this study provides deeper insights into search behavior at a fine granularity in a historical newspaper collection, by the inclusion of the metadata in the analysis. Second, it demonstrates how to use metadata categorization as a way to analyze distinct search patterns in a collection.
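    A minimal sketch of the approach, assuming invented log lines and category assignments: use the metadata records to split a search log into per-category subsets, then compare faceted and non-faceted use within each subset:

      from collections import defaultdict

      doc_category = {"d1": "1914-1918", "d2": "1914-1918", "d3": "colonial"}
      log = [  # (clicked_doc, used_facets?)
          ("d1", True), ("d2", True), ("d1", False), ("d3", False), ("d3", False),
      ]

      per_category = defaultdict(lambda: {"faceted": 0, "non_faceted": 0})
      for doc, faceted in log:
          key = "faceted" if faceted else "non_faceted"
          per_category[doc_category[doc]][key] += 1

      for cat, counts in per_category.items():
          print(cat, counts)   # distinct search patterns per part of the collection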
  19. Andresen, L.: Metadata in Denmark (2000) 0.02
    0.018310493 = product of:
      0.054931477 = sum of:
        0.054931477 = product of:
          0.10986295 = sum of:
            0.10986295 = weight(_text_:22 in 4899) [ClassicSimilarity], result of:
              0.10986295 = score(doc=4899,freq=2.0), product of:
                0.17747258 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050679956 = queryNorm
                0.61904186 = fieldWeight in 4899, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.125 = fieldNorm(doc=4899)
          0.5 = coord(1/2)
      0.33333334 = coord(1/3)
    
    Date
    16. 7.2000 20:58:22
  20. Pitti, D.V.: Encoded Archival Description (EAD) (2009) 0.02
    0.018183582 = product of:
      0.05455074 = sum of:
        0.05455074 = weight(_text_:book in 3777) [ClassicSimilarity], result of:
          0.05455074 = score(doc=3777,freq=2.0), product of:
            0.2237077 = queryWeight, product of:
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.050679956 = queryNorm
            0.2438483 = fieldWeight in 3777, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.414126 = idf(docFreq=1454, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3777)
      0.33333334 = coord(1/3)
    
    Footnote
    Cf.: http://www.tandfonline.com/doi/book/10.1081/E-ELIS3.
