Search (59 results, page 1 of 3)

  • year_i:[2010 TO 2020}
  • theme_ss:"Metadaten"
  1. Gursoy, A.; Wickett, K.; Feinberg, M.: Understanding tag functions in a moderated, user-generated metadata ecosystem (2018) 0.04
    0.03600474 = product of:
      0.09001185 = sum of:
        0.057061244 = weight(_text_:context in 3946) [ClassicSimilarity], result of:
          0.057061244 = score(doc=3946,freq=4.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.32380077 = fieldWeight in 3946, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3946)
        0.032950602 = weight(_text_:system in 3946) [ClassicSimilarity], result of:
          0.032950602 = score(doc=3946,freq=4.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.24605882 = fieldWeight in 3946, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3946)
      0.4 = coord(2/5)
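
    For readers unfamiliar with Lucene explain trees such as the one above: under ClassicSimilarity, each leaf score is queryWeight x fieldWeight, where queryWeight = idf * queryNorm and fieldWeight = tf * idf * fieldNorm with tf = sqrt(freq); the document score is the coordination factor times the sum of the matching-term scores. Reworking the "context" leaf and the total in LaTeX:

        \begin{align*}
        \text{queryWeight} &= \mathrm{idf} \cdot \mathrm{queryNorm} = 4.14465 \times 0.04251826 = 0.17622331\\
        \text{fieldWeight} &= \sqrt{4.0} \cdot 4.14465 \cdot 0.0390625 = 0.32380077\\
        \text{score}_{\text{context}} &= 0.17622331 \times 0.32380077 = 0.05706124\\
        \text{score}_{\text{doc}} &= \underbrace{\tfrac{2}{5}}_{\mathrm{coord}} \times (0.05706124 + 0.03295060) = 0.03600474
        \end{align*}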
    
    Abstract
    Purpose - The purpose of this paper is to investigate tag use in a metadata ecosystem that supports a fan work repository, to identify functions of tags, and to explore the system as a co-constructed communicative context. Design/methodology/approach - Using modified techniques from grounded theory (Charmaz, 2007), this paper integrates humanistic and social science methods to identify kinds of tag use in a rich setting. Findings - Three primary roles of tags emerge from detailed study of the metadata ecosystem: tags can identify elements in the fan work, tags can reflect on how those elements are used or adapted in the fan work, and tags can express the fan author's sense of her role in the discursive context of the fan work repository. Attending to each of these roles shifts focus away from just what tags say to include how they say it. Practical implications - Instead of building metadata systems designed solely for retrieval or description, this research suggests that it may be fruitful to build systems that recognize various metadata functions and allow for expressivity. It also suggests that metadata previously considered unusable may reflect the participants' sense of the system and their role within it. Originality/value - In addition to accommodating a wider range of tag functions, this research motivates consideration of metadata ecosystems, where different kinds of tags do different things and work together to create a multifaceted artifact.
  2. Peters, I.; Stock, W.G.: Power tags in information retrieval (2010) 0.03
    0.03469106 = product of:
      0.08672766 = sum of:
        0.06342807 = weight(_text_:index in 865) [ClassicSimilarity], result of:
          0.06342807 = score(doc=865,freq=4.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.3413878 = fieldWeight in 865, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0390625 = fieldNorm(doc=865)
        0.023299592 = weight(_text_:system in 865) [ClassicSimilarity], result of:
          0.023299592 = score(doc=865,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.17398985 = fieldWeight in 865, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0390625 = fieldNorm(doc=865)
      0.4 = coord(2/5)
    
    Abstract
    Purpose - Many Web 2.0 services (including Library 2.0 catalogs) make use of folksonomies. The purpose of this paper is to cut off all tags in the long tail of a document-specific tag distribution. The remaining tags at the beginning of the distribution are considered power tags and form a new, additional search option in information retrieval systems. Design/methodology/approach - In a theoretical approach the paper discusses document-specific tag distributions (power law and inverse-logistic shape), the development of such distributions (Yule-Simon process and shuffling theory) and introduces search tags (besides the well-known index tags) as a possibility for generating tag distributions. Findings - Search tags are compatible with broad and narrow folksonomies and with all knowledge organization systems (e.g. classification systems and thesauri), while index tags are only applicable in broad folksonomies. Based on these findings, the paper presents a sketch of an algorithm for mining and processing power tags in information retrieval systems. Research limitations/implications - This conceptual approach is in need of empirical evaluation in a concrete retrieval system. Practical implications - Power tags are a new search option for retrieval systems to limit the number of hits. Originality/value - The paper introduces power tags as a means for enhancing the precision of search results in information retrieval systems that apply folksonomies, e.g. catalogs in Library 2.0 environments.
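
    To make the power-tag idea concrete: a minimal sketch in Python (not the authors' algorithm; the cut-off rule of half the most frequent tag's count is an assumption for illustration) that keeps only the head of a document-specific tag distribution:

        from collections import Counter

        def power_tags(tag_assignments, head_ratio=0.5):
            """Return the tags at the head of one document's tag
            distribution, discarding the long tail. The head_ratio
            cut-off is illustrative, not the rule from the paper."""
            counts = Counter(tag_assignments)   # document-specific distribution
            if not counts:
                return []
            top = counts.most_common(1)[0][1]   # frequency of the top tag
            return [t for t, n in counts.most_common() if n >= head_ratio * top]

        # Example: a broad folksonomy, many users tagging one document
        tags = ["metadata"] * 40 + ["tagging"] * 35 + ["folksonomy"] * 6 + ["misc"]
        print(power_tags(tags))                 # ['metadata', 'tagging']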
  3. Kopácsi, S. et al.: Development of a classification server to support metadata harmonization in a long term preservation system (2016) 0.03
    0.026320523 = product of:
      0.06580131 = sum of:
        0.046599183 = weight(_text_:system in 3280) [ClassicSimilarity], result of:
          0.046599183 = score(doc=3280,freq=2.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.3479797 = fieldWeight in 3280, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.078125 = fieldNorm(doc=3280)
        0.019202124 = product of:
          0.057606373 = sum of:
            0.057606373 = weight(_text_:22 in 3280) [ClassicSimilarity], result of:
              0.057606373 = score(doc=3280,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.38690117 = fieldWeight in 3280, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.078125 = fieldNorm(doc=3280)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Source
    Metadata and semantics research: 10th International Conference, MTSR 2016, Göttingen, Germany, November 22-25, 2016, Proceedings. Eds.: E. Garoufallou
  4. Wisser, K.: The errors of our ways : using metadata quality research to understand common error patterns in the application of name headings (2014) 0.02
    0.024017572 = product of:
      0.06004393 = sum of:
        0.04841807 = weight(_text_:context in 1574) [ClassicSimilarity], result of:
          0.04841807 = score(doc=1574,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.27475408 = fieldWeight in 1574, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.046875 = fieldNorm(doc=1574)
        0.011625858 = product of:
          0.034877572 = sum of:
            0.034877572 = weight(_text_:29 in 1574) [ClassicSimilarity], result of:
              0.034877572 = score(doc=1574,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.23319192 = fieldWeight in 1574, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1574)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    Using data culled during a metadata quality research project for the Social Networks and Archival Context (SNAC) project, this article discusses common errors and problems in the use of standardized languages, specifically unambiguous names for persons and corporate bodies. Errors involving misspellings, qualifiers, formatting, and mis-encoding point to several areas where quality control measures can improve the aggregation of data. Results from a large data set indicate that there are predictable problems that can be retrospectively corrected before aggregation. This research looked specifically at name formation and expression in metadata records, but the errors detected could be extended to other controlled vocabularies as well.
    Source
    Metadata and semantics research: 8th Research Conference, MTSR 2014, Karlsruhe, Germany, November 27-29, 2014, Proceedings. Eds.: S. Closs et al
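
    As a toy illustration of the retrospective correction described in the abstract above (standard-library Python only; the normalization rules, the similarity threshold, and the names are assumptions for illustration), near-duplicate name headings can be flagged by normalizing and comparing strings:

        import unicodedata
        from difflib import SequenceMatcher

        def normalize(name):
            # Illustrative normalization: strip diacritics and mis-encoded
            # combining marks, flatten punctuation and case variance.
            name = unicodedata.normalize("NFKD", name)
            name = "".join(c for c in name if not unicodedata.combining(c))
            return " ".join(name.replace(",", " ").replace(".", " ").lower().split())

        def likely_same_heading(a, b, threshold=0.9):
            # Flag probable misspelling/format variants of one heading.
            return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold

        print(likely_same_heading("Wisser, Katherine", "Wisser, Katharine"))  # True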
  5. Jeffery, K.G.; Bailo, D.: EPOS: using metadata in geoscience (2014) 0.02
    0.020466631 = product of:
      0.05116658 = sum of:
        0.03954072 = weight(_text_:system in 1581) [ClassicSimilarity], result of:
          0.03954072 = score(doc=1581,freq=4.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.29527056 = fieldWeight in 1581, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=1581)
        0.011625858 = product of:
          0.034877572 = sum of:
            0.034877572 = weight(_text_:29 in 1581) [ClassicSimilarity], result of:
              0.034877572 = score(doc=1581,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.23319192 = fieldWeight in 1581, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1581)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    One of the key aspects of the approaching data-intensive science era is the integration of data through interoperability of systems providing data products or visualisation and processing services. Far from being simple, interoperability requires robust and scalable e-infrastructures capable of supporting it. In this work we present the case of EPOS, a project for data integration in the field of Earth Sciences. We describe the design of its e-infrastructure and show its main characteristics. One of the main elements enabling the system to integrate data, data products and services is the metadata catalog based on the CERIF metadata model. This model, modified to fit into the general e-infrastructure design, is part of a three-layer metadata architecture. CERIF guarantees robust handling of metadata, which is in this case the key to interoperability and to one of the features of the EPOS system: the possibility of carrying out data-intensive science by orchestrating the distributed resources made available by EPOS data providers and stakeholders.
    Source
    Metadata and semantics research: 8th Research Conference, MTSR 2014, Karlsruhe, Germany, November 27-29, 2014, Proceedings. Eds.: S. Closs et al
  6. Gradmann, S.: Container - Content - Context : zur Evolution bibliothekarischer Metadaten von Katalogdaten zu Library Linked Data (2012) 0.02
    0.01936723 = product of:
      0.09683614 = sum of:
        0.09683614 = weight(_text_:context in 1023) [ClassicSimilarity], result of:
          0.09683614 = score(doc=1023,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.54950815 = fieldWeight in 1023, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.09375 = fieldNorm(doc=1023)
      0.2 = coord(1/5)
    
  7. Palavitsinis, N.; Manouselis, N.; Sanchez-Alonso, S.: Metadata quality in digital repositories : empirical results from the cross-domain transfer of a quality assurance process (2014) 0.02
    0.01936723 = product of:
      0.09683614 = sum of:
        0.09683614 = weight(_text_:context in 1288) [ClassicSimilarity], result of:
          0.09683614 = score(doc=1288,freq=8.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.54950815 = fieldWeight in 1288, product of:
              2.828427 = tf(freq=8.0), with freq of:
                8.0 = termFreq=8.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.046875 = fieldNorm(doc=1288)
      0.2 = coord(1/5)
    
    Abstract
    Metadata quality presents a challenge faced by many digital repositories. A variety of proposed quality assurance frameworks are applied in repositories deployed in various contexts. Although studies report improvements in metadata quality in many of these applications, the transfer of a successful approach from one application context to another has not been studied to a satisfactory extent. This article presents the empirical results of applying a metadata quality assurance process that was developed and successfully applied in an educational context (learning repositories) to two different application contexts, in order to compare results with the previous application and assess its generalizability. More specifically, it reports results from the adaptation and application of this process in a library context (institutional repositories) and in a cultural context (digital cultural repositories). Initial empirical findings indicate that content providers seem to gain a better understanding of metadata when the proposed process is put in place and that the quality of the produced metadata records increases.
  8. Willis, C.; Greenberg, J.; White, H.: Analysis and synthesis of metadata goals for scientific data (2012) 0.02
    0.015983826 = product of:
      0.03995956 = sum of:
        0.032278713 = weight(_text_:context in 367) [ClassicSimilarity], result of:
          0.032278713 = score(doc=367,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.18316938 = fieldWeight in 367, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.03125 = fieldNorm(doc=367)
        0.0076808496 = product of:
          0.023042548 = sum of:
            0.023042548 = weight(_text_:22 in 367) [ClassicSimilarity], result of:
              0.023042548 = score(doc=367,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.15476047 = fieldWeight in 367, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.03125 = fieldNorm(doc=367)
          0.33333334 = coord(1/3)
      0.4 = coord(2/5)
    
    Abstract
    The proliferation of discipline-specific metadata schemes contributes to artificial barriers that can impede interdisciplinary and transdisciplinary research. The authors considered this problem by examining the domains, objectives, and architectures of nine metadata schemes used to document scientific data in the physical, life, and social sciences. They used a mixed-methods content analysis and Greenberg's metadata objectives, principles, domains, and architectural layout (MODAL) framework, and derived 22 metadata-related goals from textual content describing each metadata scheme. Relationships are identified between the domains (e.g., scientific discipline and type of data) and the categories of scheme objectives. For each strong correlation (>0.6), a Fisher's exact test for nonparametric data was used to determine significance (p < .05). Significant relationships were found between the domains and objectives of the schemes. Schemes describing observational data are more likely to have "scheme harmonization" (compatibility and interoperability with related schemes) as an objective; schemes with the objective "abstraction" (a conceptual model exists separate from the technical implementation) also have the objective "sufficiency" (the scheme defines a minimal amount of information to meet the needs of the community); and schemes with the objective "data publication" do not have the objective "element refinement." The analysis indicates that many metadata-driven goals expressed by communities are independent of scientific discipline or the type of data, although they are constrained by historical community practices and workflows as well as the technological environment at the time of scheme creation. The analysis reveals 11 fundamental goals for metadata documenting scientific data in support of sharing research data across disciplines and domains. The authors report these results and highlight the need for more metadata-related research, particularly in the context of recent funding agency policy changes.
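
    A minimal sketch of the statistical step named above, assuming SciPy; the 2x2 contingency counts are invented for illustration, not the paper's data:

        from scipy.stats import fisher_exact

        # Rows: schemes for observational data vs. other schemes;
        # columns: scheme has / lacks the "scheme harmonization" objective.
        table = [[4, 1],
                 [1, 3]]
        odds_ratio, p_value = fisher_exact(table)
        print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.3f}")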
  9. Raja, N.A.: Digitized content and index pages as alternative subject access fields (2012) 0.02
    0.015222736 = product of:
      0.07611368 = sum of:
        0.07611368 = weight(_text_:index in 870) [ClassicSimilarity], result of:
          0.07611368 = score(doc=870,freq=4.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.40966535 = fieldWeight in 870, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.046875 = fieldNorm(doc=870)
      0.2 = coord(1/5)
    
    Abstract
    This article describes a pilot study undertaken to test the benefits of the digitized content and index pages of books and the content pages of journal issues in providing subject access to documents in a collection. A partial digitization strategy is used to fossick specific information using the alternative subject access fields in bibliographic records. The pilot study searched for books and journal articles containing information on "Leadership", "Women Entrepreneurs", "Disinvestment", and "Digital Preservation" through the normal procedure and based on information stored in MARC 21 fields 653, 505, and 520 of the bibliographic records in the University of Mumbai Library. The results are compared to draw conclusions.
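
    A sketch of this kind of lookup, assuming the pymarc library and a hypothetical file of MARC 21 records; the plain substring match stands in for the study's actual search procedure:

        from pymarc import MARCReader

        def matches_topic(record, topic):
            # Check the alternative subject access fields: 653 (index
            # terms), 505 (contents note) and 520 (summary).
            for tag in ("653", "505", "520"):
                for field in record.get_fields(tag):
                    if topic.lower() in field.format_field().lower():
                        return True
            return False

        with open("catalog.mrc", "rb") as fh:       # hypothetical file name
            for record in MARCReader(fh):
                if matches_topic(record, "Leadership"):
                    print(record["245"])            # title statement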
  10. White, H.: Examining scientific vocabulary : mapping controlled vocabularies with free text keywords (2013) 0.01
    0.012345138 = product of:
      0.061725687 = sum of:
        0.061725687 = product of:
          0.09258853 = sum of:
            0.04650343 = weight(_text_:29 in 1953) [ClassicSimilarity], result of:
              0.04650343 = score(doc=1953,freq=2.0), product of:
                0.14956595 = queryWeight, product of:
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.04251826 = queryNorm
                0.31092256 = fieldWeight in 1953, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5176873 = idf(docFreq=3565, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1953)
            0.046085097 = weight(_text_:22 in 1953) [ClassicSimilarity], result of:
              0.046085097 = score(doc=1953,freq=2.0), product of:
                0.1488917 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04251826 = queryNorm
                0.30952093 = fieldWeight in 1953, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0625 = fieldNorm(doc=1953)
          0.6666667 = coord(2/3)
      0.2 = coord(1/5)
    
    Date
    29. 5.2015 19:09:22
  11. Rousidis, D.; Garoufallou, E.; Balatsoukas, P.; Sicilia, M.-A.: Evaluation of metadata in research data repositories : the case of the DC.Subject Element (2015) 0.01
    0.011412249 = product of:
      0.057061244 = sum of:
        0.057061244 = weight(_text_:context in 2392) [ClassicSimilarity], result of:
          0.057061244 = score(doc=2392,freq=4.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.32380077 = fieldWeight in 2392, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2392)
      0.2 = coord(1/5)
    
    Abstract
    Research data repositories are growing rapidly and exponentially in volume. Their main goal is to provide scientists with the essential mechanisms to store, share, and re-use datasets generated at various stages of the research process. Despite the fact that metadata play an important role for research data management in the context of these repositories, several factors - such as the large volume of data and its complex lifecycles, as well as operational constraints related to financial resources and human factors - may impede the effectiveness of several metadata elements. The aim of the research reported in this paper was to perform a descriptive analysis of the DC.Subject metadata element and to identify its data quality problems in the context of the Dryad research data repository. To address this aim, a total of 4,557 packages and 13,638 data files were analysed following a data-preprocessing method. The findings showed emerging trends in the subject coverage of the repository (e.g. the most popular subjects and the authors who contributed most to these subjects). Quality problems related to the lack of controlled vocabulary and standardisation were also very common. This study has implications for the evaluation of metadata and the improvement of the quality of the research data annotation process.
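
    A small sketch of such a descriptive analysis (the subject strings are invented): counting DC.Subject values and flagging uncontrolled variants that differ only in case or punctuation:

        from collections import Counter, defaultdict

        subjects = ["Ecology", "ecology", "Population Genetics",
                    "population-genetics", "Ecology"]

        counts = Counter(subjects)
        print(counts.most_common(3))        # most popular subjects

        variants = defaultdict(set)         # group case/punctuation variants
        for term in counts:
            key = "".join(ch for ch in term.lower() if ch.isalnum())
            variants[key].add(term)
        print({k: v for k, v in variants.items() if len(v) > 1})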
  12. Koch, G.; Koch, W.: Aggregation and management of metadata in the context of Europeana (2017) 0.01
    0.01129755 = product of:
      0.05648775 = sum of:
        0.05648775 = weight(_text_:context in 3910) [ClassicSimilarity], result of:
          0.05648775 = score(doc=3910,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.32054642 = fieldWeight in 3910, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3910)
      0.2 = coord(1/5)
    
  13. Wolfe, E.W.: A case study in automated metadata enhancement : Natural Language Processing in the humanities (2019) 0.01
    0.01129755 = product of:
      0.05648775 = sum of:
        0.05648775 = weight(_text_:context in 5236) [ClassicSimilarity], result of:
          0.05648775 = score(doc=5236,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.32054642 = fieldWeight in 5236, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.0546875 = fieldNorm(doc=5236)
      0.2 = coord(1/5)
    
    Abstract
    The Black Book Interactive Project at the University of Kansas (KU) is developing an expanded corpus of novels by African American authors, with an emphasis on lesser known writers and a goal of expanding research in this field. Using a custom metadata schema with an emphasis on race-related elements, each novel is analyzed for a variety of elements such as literary style, targeted content analysis, historical context, and other areas. Librarians at KU have worked to develop a variety of computational text analysis processes designed to assist with specific aspects of this metadata collection, including text mining and natural language processing, automated subject extraction based on word sense disambiguation, harvesting data from Wikidata, and other actions.
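
    A sketch of the "harvesting data from Wikidata" step, using Wikidata's public wbgetentities API; the helper and the example QID are illustrative, not taken from the project:

        import requests

        def wikidata_label(qid, lang="en"):
            r = requests.get(
                "https://www.wikidata.org/w/api.php",
                params={"action": "wbgetentities", "ids": qid,
                        "props": "labels|descriptions",
                        "languages": lang, "format": "json"},
                timeout=10,
            )
            entity = r.json()["entities"][qid]
            return (entity["labels"][lang]["value"],
                    entity["descriptions"][lang]["value"])

        print(wikidata_label("Q42"))   # ('Douglas Adams', ...)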
  14. Margaritopoulos, M.; Margaritopoulos, T.; Mavridis, I.; Manitsaris, A.: Quantifying and measuring metadata completeness (2012) 0.01
    0.00968546 = product of:
      0.048427295 = sum of:
        0.048427295 = weight(_text_:system in 43) [ClassicSimilarity], result of:
          0.048427295 = score(doc=43,freq=6.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.36163113 = fieldWeight in 43, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.046875 = fieldNorm(doc=43)
      0.2 = coord(1/5)
    
    Abstract
    Completeness of metadata is one of the most essential characteristics of their quality. An incomplete metadata record is a record of degraded quality. Existing approaches to measure metadata completeness limit their scope in counting the existence of values in fields, regardless of the metadata hierarchy as defined in international standards. Such a traditional approach overlooks several issues that need to be taken into account. This paper presents a fine-grained metrics system for measuring metadata completeness, based on field completeness. A metadata field is considered to be a container of multiple pieces of information. In this regard, the proposed system is capable of following the hierarchy of metadata as it is set by the metadata schema and admeasuring the effect of multiple values of multivalued fields. An application of the proposed metrics system, after being configured according to specific user requirements, to measure completeness of a real-world set of metadata is demonstrated. The results prove its ability to assess the sufficiency of metadata to describe a resource and provide targeted measures of completeness throughout the metadata hierarchy.
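
    A minimal sketch of the general idea, not the authors' metrics system: field-level completeness that distinguishes multivalued fields and credits additional values with diminishing returns (the logarithmic credit rule is an assumption):

        import math

        def field_completeness(values, multivalued=False):
            if not values:
                return 0.0
            if not multivalued:
                return 1.0
            # Assumed rule: extra values raise completeness, with
            # diminishing returns, toward (but never reaching) 1.0.
            return 1.0 - 1.0 / (1.0 + math.log(1 + len(values)))

        record = {"title": ["A study"], "creator": ["Smith, J.", "Doe, A."], "subject": []}
        schema = {"title": False, "creator": True, "subject": True}  # field -> multivalued?
        score = sum(field_completeness(record[f], m) for f, m in schema.items()) / len(schema)
        print(f"completeness = {score:.2f}")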
  15. Roux, M.: Metadata for search engines : what can be learned from e-Sciences? (2012) 0.01
    0.009683615 = product of:
      0.04841807 = sum of:
        0.04841807 = weight(_text_:context in 96) [ClassicSimilarity], result of:
          0.04841807 = score(doc=96,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.27475408 = fieldWeight in 96, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.046875 = fieldNorm(doc=96)
      0.2 = coord(1/5)
    
    Abstract
    E-sciences are data-intensive sciences that make extensive use of the Web to share, collect, and process data. In this context, primary scientific data is becoming a new and challenging issue, as data must be extensively described (1) to account for empirical conditions and results that allow interpretation and/or analysis and (2) to be understandable by the computers used for data storage and information retrieval. In this respect, metadata is a focal point, whether considered from the point of view of the user who visualizes and exploits data or from that of the search tools that find and retrieve information. Numerous disciplines are concerned with the issues of describing complex observations and addressing pertinent knowledge. In this paper, similarities and differences in data description and exploration strategies among disciplines in the e-sciences are examined.
  16. Bartczak, J.; Glendon, I.: Python, Google Sheets, and the Thesaurus for Graphic Materials for efficient metadata project workflows (2017) 0.01
    0.009683615 = product of:
      0.04841807 = sum of:
        0.04841807 = weight(_text_:context in 3893) [ClassicSimilarity], result of:
          0.04841807 = score(doc=3893,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.27475408 = fieldWeight in 3893, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.046875 = fieldNorm(doc=3893)
      0.2 = coord(1/5)
    
    Abstract
    In 2017, the University of Virginia (U.Va.) will launch a two year initiative to celebrate the bicentennial anniversary of the University's founding in 1819. The U.Va. Library is participating in this event by digitizing some 20,000 photographs and negatives that document student life on the U.Va. grounds in the 1960s and 1970s. Metadata librarians and archivists are well-versed in the challenges associated with generating digital content and accompanying description within the context of limited resources. This paper describes how technology and new approaches to metadata design have enabled the University of Virginia's Metadata Analysis and Design Department to rapidly and successfully generate accurate description for these digital objects. Python's pandas module improves efficiency by cleaning and repurposing data recorded at digitization, while the lxml module builds MODS XML programmatically from CSV tables. A simplified technique for subject heading selection and assignment in Google Sheets provides a collaborative environment for streamlined metadata creation and data quality control.
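
    A condensed sketch of that workflow, assuming pandas and lxml; the CSV file name and column names are hypothetical:

        import pandas as pd
        from lxml import etree

        MODS_NS = "http://www.loc.gov/mods/v3"

        df = pd.read_csv("digitization_log.csv")    # hypothetical export
        df["title"] = df["title"].str.strip()       # cleaning/repurposing step
        df = df.dropna(subset=["identifier"])

        for _, row in df.iterrows():                # build MODS XML per row
            mods = etree.Element(f"{{{MODS_NS}}}mods", nsmap={None: MODS_NS})
            title_info = etree.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
            etree.SubElement(title_info, f"{{{MODS_NS}}}title").text = row["title"]
            etree.SubElement(mods, f"{{{MODS_NS}}}identifier").text = str(row["identifier"])
            print(etree.tostring(mods, pretty_print=True).decode())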
  17. Bellotto, A.; Bekesi, J.: Enriching metadata for a university repository by modelling and infrastructure : a new vocabulary server for Phaidra (2019) 0.01
    0.009683615 = product of:
      0.04841807 = sum of:
        0.04841807 = weight(_text_:context in 5693) [ClassicSimilarity], result of:
          0.04841807 = score(doc=5693,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.27475408 = fieldWeight in 5693, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.046875 = fieldNorm(doc=5693)
      0.2 = coord(1/5)
    
    Abstract
    This paper illustrates an initial step towards the 'semantic enrichment' of the University of Vienna's Phaidra repository, one of the valuable and up-to-date strategies able to enhance its role and usage. First, a technical report explains a choice made in the local context, i.e. the deployment of the vocabulary server iQvoc instead of the formerly used SKOSMOS, covering the design decisions behind the current tool and the additional features that the implementation required. Afterwards, some modelling characteristics of the local LOD controlled vocabulary are described according to SKOS documentation and best practices, highlighting the approaches that can be pursued for making a LOD KOS available on the Web as well as issues that may be encountered.
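
    A minimal sketch of SKOS concept modelling with Python's rdflib (the base URI, concept, and labels are invented, not Phaidra's actual vocabulary):

        from rdflib import Graph, Literal, Namespace
        from rdflib.namespace import RDF, SKOS

        VOC = Namespace("https://example.org/vocab/")   # hypothetical base URI

        g = Graph()
        concept = VOC["openAccess"]
        g.add((concept, RDF.type, SKOS.Concept))
        g.add((concept, SKOS.prefLabel, Literal("open access", lang="en")))
        g.add((concept, SKOS.prefLabel, Literal("Open Access", lang="de")))
        g.add((concept, SKOS.broader, VOC["accessRights"]))
        print(g.serialize(format="turtle"))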
  18. Kopácsi, S.; Hudak, R.; Ganguly, R.: Implementation of a classification server to support metadata organization for long term preservation systems (2017) 0.01
    0.0092261685 = product of:
      0.04613084 = sum of:
        0.04613084 = weight(_text_:system in 3915) [ClassicSimilarity], result of:
          0.04613084 = score(doc=3915,freq=4.0), product of:
            0.13391352 = queryWeight, product of:
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.04251826 = queryNorm
            0.34448233 = fieldWeight in 3915, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.1495528 = idf(docFreq=5152, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3915)
      0.2 = coord(1/5)
    
    Abstract
    In this article we describe the implementation of a classification server for metadata organization in a long-term preservation system for digital objects. After a brief introduction to classifications and knowledge organization, we present the requirements for the system to be implemented. We describe all of the Simple Knowledge Organization System (SKOS) management tools we examined, including Skosmos, the solution we chose for the implementation. Skosmos is an open-source, web-based SKOS browser built on the Jena Fuseki SPARQL server. We discuss some crucial steps in the installation of the selected tools and present both the problems that can potentially arise with the classifications used and possible solutions.
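
    A sketch of querying the kind of Jena Fuseki SPARQL endpoint that Skosmos sits on, assuming the SPARQLWrapper library; the endpoint URL is a placeholder:

        from SPARQLWrapper import SPARQLWrapper, JSON

        sparql = SPARQLWrapper("http://localhost:3030/skos/sparql")  # placeholder
        sparql.setQuery("""
            PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
            SELECT ?concept ?label WHERE {
                ?concept a skos:Concept ;
                         skos:prefLabel ?label .
                FILTER (lang(?label) = "en")
            } LIMIT 10
        """)
        sparql.setReturnFormat(JSON)
        for row in sparql.query().convert()["results"]["bindings"]:
            print(row["concept"]["value"], "->", row["label"]["value"])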
  19. Husevag, A.-S.R.: Named entities in indexing : a case study of TV subtitles and metadata records (2016) 0.01
    0.008970084 = product of:
      0.044850416 = sum of:
        0.044850416 = weight(_text_:index in 3105) [ClassicSimilarity], result of:
          0.044850416 = score(doc=3105,freq=2.0), product of:
            0.18579477 = queryWeight, product of:
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.04251826 = queryNorm
            0.24139762 = fieldWeight in 3105, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.369764 = idf(docFreq=1520, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3105)
      0.2 = coord(1/5)
    
    Abstract
    This paper explores the possible role of named entities in an automatic indexing process based on text in subtitles. This is done by analyzing entity types, name density, and name frequencies in subtitles and metadata records from different TV programs. The name density in metadata records is much higher than the name density in subtitles, and named entities with high frequencies in the subtitles are more likely to be mentioned in the metadata records. Personal names, geographical names, and names of organizations were the most prominent entity types in both the news subtitles and news metadata, while persons, works, and locations are the most prominent in culture programs.
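
    A sketch of the named-entity extraction such a study requires, assuming spaCy and its small English model; the subtitle line is invented, and "name density" is computed here simply as entity tokens over all tokens:

        import spacy

        nlp = spacy.load("en_core_web_sm")      # assumes the model is installed
        subtitle = "Prime Minister Erna Solberg met NRK reporters in Oslo."
        doc = nlp(subtitle)

        for ent in doc.ents:
            print(ent.text, ent.label_)         # e.g. "Erna Solberg PERSON"

        density = sum(len(ent) for ent in doc.ents) / len(doc)
        print(f"name density = {density:.2f}")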
  20. Çelebi, A.; Özgür, A.: Segmenting hashtags and analyzing their grammatical structure (2018) 0.01
    0.008069678 = product of:
      0.040348392 = sum of:
        0.040348392 = weight(_text_:context in 4221) [ClassicSimilarity], result of:
          0.040348392 = score(doc=4221,freq=2.0), product of:
            0.17622331 = queryWeight, product of:
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.04251826 = queryNorm
            0.22896172 = fieldWeight in 4221, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.14465 = idf(docFreq=1904, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4221)
      0.2 = coord(1/5)
    
    Abstract
    Originally labels for marking specific tweets, hashtags are increasingly used to convey messages that people like to see in the trending hashtags list. Complex noun phrases and even sentences can be turned into a hashtag. Breaking hashtags into their words is a challenging task due to the irregular and compact nature of the language used in Twitter. In this study, we investigate feature-based machine learning and language model (LM)-based approaches for hashtag segmentation. Our results show that the LM alone is not successful at segmenting nontrivial hashtags. However, when the N-best LM-based segmentations are incorporated as features into the feature-based approach, along with the context-based features proposed in this study, state-of-the-art results in hashtag segmentation are achieved. In addition, we provide an analysis of over two million distinct hashtags, autosegmented using our best configuration. The analysis reveals that half of all 60 million hashtag occurrences contain multiple words and that 80% of sentiment is trapped inside multiword hashtags, justifying the need for hashtag segmentation. Furthermore, we analyze the grammatical structure of hashtags by parsing them and observe that 77% of the hashtags are noun-based, whereas 11.9% are verb-based.
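
    A toy illustration of dictionary-driven hashtag segmentation (much simpler than the authors' LM- and feature-based systems): dynamic programming over a tiny unigram table, scoring candidate splits by summed log relative frequencies:

        import math

        FREQ = {"no": 500, "bel": 5, "nobel": 80, "prize": 60,
                "winners": 40, "win": 90, "ners": 1}       # invented counts
        TOTAL = sum(FREQ.values())

        def segment(s):
            # best[i] holds (log-probability, words) for the prefix s[:i].
            best = [(0.0, [])] + [(-math.inf, [])] * len(s)
            for i in range(1, len(s) + 1):
                for j in range(max(0, i - 12), i):
                    word = s[j:i]
                    if word in FREQ and best[j][0] > -math.inf:
                        score = best[j][0] + math.log(FREQ[word] / TOTAL)
                        if score > best[i][0]:
                            best[i] = (score, best[j][1] + [word])
            return best[len(s)][1]

        print(segment("nobelprizewinners"))     # ['nobel', 'prize', 'winners']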

Languages

  • e 53
  • d 6

Types

  • a 51
  • el 9
  • m 5
  • s 3
  • r 1