Search (15 results, page 1 of 1)

  • author_ss:"Greenberg, J."
  1. White, H.; Willis, C.; Greenberg, J.: HIVEing : the effect of a semantic web technology on inter-indexer consistency (2014) 0.03
    Abstract
    Purpose - The purpose of this paper is to examine the effect of the Helping Interdisciplinary Vocabulary Engineering (HIVE) system on the inter-indexer consistency of information professionals when assigning keywords to a scientific abstract. This study examined, first, the inter-indexer consistency of potential HIVE users; second, the impact HIVE had on consistency; and third, challenges associated with using HIVE. Design/methodology/approach - A within-subjects quasi-experimental research design was used for this study. Data were collected using a task-scenario-based questionnaire. Analysis was performed on consistency results using Hooper's and Rolling's inter-indexer consistency measures. A series of t-tests was used to judge the significance between consistency measure results. Findings - Results suggest that HIVE improves inter-indexer consistency. Working with HIVE increased consistency rates by 22 percent (Rolling's) and 25 percent (Hooper's) when selecting relevant terms from all vocabularies. A statistically significant difference exists between the assignment of free-text keywords and machine-aided keywords. Issues with homographs, disambiguation, vocabulary choice, and document structure were all identified as potential challenges. Research limitations/implications - The main limitation of this study is the small number of vocabularies used. Future research will include implementing HIVE in the Dryad Repository and studying its application in a repository system. Originality/value - This paper showcases several features of the HIVE system. By using traditional consistency measures to evaluate a semantic web technology, this paper emphasizes the link between traditional indexing and next-generation machine-aided indexing (MAI) tools.
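The two inter-indexer consistency measures named in this abstract (Hooper's and Rolling's) are standard set-based ratios over the keyword sets two indexers assign to the same document. A minimal sketch; the term sets below are invented for illustration, not the study's data:

```python
def hoopers(a: set[str], b: set[str]) -> float:
    """Hooper's measure: terms in common / terms used by either indexer."""
    return len(a & b) / len(a | b) if a | b else 0.0

def rollings(a: set[str], b: set[str]) -> float:
    """Rolling's measure: 2 * terms in common / total terms assigned."""
    return 2 * len(a & b) / (len(a) + len(b)) if a or b else 0.0

# two hypothetical indexers describing the same abstract
indexer_1 = {"metadata", "indexing", "thesauri", "HIVE"}
indexer_2 = {"metadata", "indexing", "vocabulary"}

print(round(hoopers(indexer_1, indexer_2), 3))   # 2 shared / 5 in union -> 0.4
print(round(rollings(indexer_1, indexer_2), 3))  # 2*2 / (4+3) -> 0.571
```

Rolling's measure weights agreement against the total indexing effort, so it is always at least as large as Hooper's for the same pair of term sets.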
  2. White, H.C.; Carrier, S.; Thompson, A.; Greenberg, J.; Scherle, R.: The Dryad Data Repository : a Singapore framework metadata architecture in a DSpace environment (2008) 0.03
    Source
    Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas
  3. Shoffner, M.; Greenberg, J.; Kramer-Duffield, J.; Woodbury, D.: Web 2.0 semantic systems : collaborative learning in science (2008) 0.02
    Abstract
    The basic goal of education within a discipline is to transform a novice into an expert. This entails moving the novice toward the "semantic space" that the expert inhabits: the space of concepts, meanings, vocabularies, and other intellectual constructs that comprise the discipline. Metadata is significant to this goal in digitally mediated education environments. Encoding the experts' semantic space not only enables the sharing of semantics among discipline scientists, but also creates an environment that bridges the semantic gap between the common vocabulary of the novice and the granular descriptive language of the seasoned scientist (Greenberg et al., 2005). Developments underlying the Semantic Web, where vocabularies are formalized in the Web Ontology Language (OWL), and Web 2.0 approaches of user-generated folksonomies provide an infrastructure for linking vocabulary systems and promoting group learning via metadata literacy. Group learning is a pedagogical approach to teaching that harnesses the phenomenon of "collective intelligence" to increase learning by means of collaboration. Learning a new semantic system can be daunting for a novice, and yet it is integral to advance one's knowledge in a discipline and retain interest. These ideas are key to the "BOT 2.0: Botany through Web 2.0, the Memex and Social Learning" project (Bot 2.0). Bot 2.0 is a collaboration involving the North Carolina Botanical Garden, the UNC SILS Metadata Research Center, and the Renaissance Computing Institute (RENCI). Bot 2.0 presents a curriculum utilizing a memex as a way for students to link and share digital information, working asynchronously in an environment beyond the traditional classroom. Our conception of a memex is not a centralized black box but rather a flexible, distributed framework that uses the most salient and easiest-to-use collaborative platforms (e.g., Facebook, Flickr, wiki and blog technology) for personal information management.
By meeting students "where they live" digitally, we hope to attract students to the study of botanical science. A key aspect is to teach students scientific terminology and about the value of metadata, an inherent function in several of the technologies and in the instructional approach we are utilizing. This poster will report on a study examining the value of both folksonomies and taxonomies for post-secondary college students learning plant identification. Our data are drawn from a curriculum involving a virtual independent learning portion and a "BotCamp" weekend at UNC, where students work with digital plant specimens that they have captured. Results provide some insight into the importance of collaboration and shared vocabulary for gaining confidence and for student progression from novice to expert in botany.
    Source
    Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas
  4. Crystal, A.; Greenberg, J.: Relevance criteria identified by health information users during Web searches (2006) 0.01
    Abstract
    This article focuses on the relevance judgments made by health information users who use the Web. Health information users were conceptualized as motivated information users concerned about how an environmental issue affects their health. Users identified their own environmental health interests and conducted a Web search of a particular environmental health Web site. Users were asked to identify (by highlighting with a mouse) the criteria they use to assess relevance in both Web search engine surrogates and full-text Web documents. Content analysis of document criteria highlighted by users identified the criteria these users relied on most often. Key criteria identified included (in order of frequency of appearance) research, topic, scope, data, influence, affiliation, Web characteristics, and authority/person. A power-law distribution of criteria was observed (a few criteria represented most of the highlighted regions, with a long tail of occasionally used criteria). Implications of this work are that information retrieval (IR) systems should be tailored in terms of users' tendencies to rely on certain document criteria, and that relevance research should combine methods to gather richer, contextualized data. Metadata for IR systems, such as that used in search engine surrogates, could be improved by taking into account actual usage of relevance criteria. Such metadata should be user-centered (based on data from users, as in this study) and context-appropriate (fit to users' situations and tasks).
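The power-law observation in this abstract (a few criteria account for most highlighted regions) can be illustrated with a small frequency tally. The counts below are invented for illustration, not the study's data:

```python
from collections import Counter

# hypothetical tally of criterion labels highlighted by users, head-heavy
# by construction to mimic the power-law shape the study reports
highlights = (["research"] * 40 + ["topic"] * 25 + ["scope"] * 12 +
              ["data"] * 8 + ["influence"] * 5 + ["affiliation"] * 3 +
              ["web"] * 2 + ["authority"] * 1)

freq = Counter(highlights)
total = sum(freq.values())
top2 = sum(n for _, n in freq.most_common(2))

print(freq.most_common(3))      # the short head of the distribution
print(round(top2 / total, 2))   # two criteria cover most highlights -> 0.68
```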
  5. Greenberg, J.: Understanding metadata and metadata schemes (2005) 0.01
    Abstract
    Although the development and implementation of metadata schemes over the last decade has been extensive, research examining the sum of these activities is limited. This limitation is likely due to the massive scope of the topic. A framework is needed to study the full extent of, and functionalities supported by, metadata schemes. Metadata schemes developed for information resources are analyzed. To begin, I present a review of the definition of metadata, metadata functions, and several metadata typologies. Next, a conceptualization for metadata schemes is presented. The emphasis is on semantic container-like metadata schemes (data structures). The last part of this paper introduces the MODAL (Metadata Objectives and principles, Domains, and Architectural Layout) framework as an approach for studying metadata schemes. The paper concludes with a brief discussion of the value of frameworks for examining metadata schemes, including different types of metadata schemes.
  6. Greenberg, J.: Automatic query expansion via lexical-semantic relationships (2001) 0.01
    Abstract
    Structured thesauri encode equivalent, hierarchical, and associative relationships and have been developed as indexing/retrieval tools. Despite the fact that these tools provide a rich semantic network of vocabulary terms, they are seldom employed for automatic query expansion (QE) activities. This article reports on an experiment that examined whether thesaurus terms, related to a query in a specified semantic way (as synonyms and partial-synonyms (SYNs), narrower terms (NTs), related terms (RTs), and broader terms (BTs)), could be identified as having a more positive impact on retrieval effectiveness when added to a query through automatic QE. The research found that automatic QE via SYNs and NTs increased relative recall with a decline in precision that was not statistically significant, and that automatic QE via RTs and BTs increased relative recall with a decline in precision that was statistically significant. Recall-based and precision-based ranking orders for automatic QE via semantically encoded thesauri terminology were identified. Mapping results found between end-user query terms and the ProQuest Controlled Vocabulary (1997) (the thesaurus used in this study) are reported, and future research foci related to the investigation are discussed.
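The mechanics of automatic QE via semantically encoded thesaurus relations can be sketched in a few lines. The tiny thesaurus below is invented for illustration; the relation labels follow the abstract's SYN/NT/RT/BT encoding, and the default expansion set reflects the article's finding that SYNs and NTs are the safest candidates for automatic expansion:

```python
# hypothetical thesaurus fragment: term -> relation -> related terms
THESAURUS = {
    "automobile": {"SYN": ["car", "motorcar"], "NT": ["sedan", "coupe"],
                   "BT": ["vehicle"], "RT": ["highway"]},
}

def expand(query_terms, relations=("SYN", "NT")):
    """Expand a query by appending thesaurus terms linked to each query
    term through the chosen semantic relations."""
    expanded = list(query_terms)
    for term in query_terms:
        for rel in relations:
            expanded.extend(THESAURUS.get(term, {}).get(rel, []))
    return expanded

print(expand(["automobile"]))  # ['automobile', 'car', 'motorcar', 'sedan', 'coupe']
```

Passing `relations=("RT", "BT")` instead would model the riskier expansions the study found to significantly reduce precision.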
  7. Greenberg, J.; Mayer-Patel, K.; Trujillo, S.: YouTube: applying FRBR and exploring the multiple description coding compression model (2012) 0.01
    Abstract
    Nearly everyone who has searched YouTube for a favorite show, movie, newscast, or other known item, has retrieved multiple video clips (or segments) that appear to duplicate, overlap, and relate. The work presented in this paper considers this challenge and reports on a study examining the applicability of the Functional Requirements for Bibliographic Records (FRBR) for relating varying renderings of YouTube videos. The paper also introduces the Multiple Description Coding Compression (MDC2) to extend FRBR and address YouTube preservation/storage challenges. The study sample included 20 video segments from YouTube; 10 connected with the event, Small Step for Man (US astronaut Neil Armstrong's first step on the moon), and 10 with the 1966 classic movie, "Batman: The Movie." The FRBR analysis used a qualitative content analysis, and the MDC2 exploration was pursued via a high-level approach of protocol modeling. Results indicate that FRBR is applicable to YouTube, although the analyses required a localization of the Work, Expression, Manifestation, and Item (WEMI) FRBR elements. The MDC2 exploration illustrates an approach for exploring FRBR in the context of other models, and identifies a potential means for addressing YouTube-related preservation/storage challenges.
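A localized WEMI chain of the kind the abstract describes can be sketched as nested records. This is a minimal illustration, not the paper's model; all class fields and example values are invented:

```python
from dataclasses import dataclass, field

@dataclass
class Item:                 # one concrete copy (a specific uploaded file)
    url: str

@dataclass
class Manifestation:        # a particular encoding of an expression
    container: str
    items: list[Item] = field(default_factory=list)

@dataclass
class Expression:           # a particular rendering, e.g. a clip or segment
    label: str
    manifestations: list[Manifestation] = field(default_factory=list)

@dataclass
class Work:                 # the abstract event or film being rendered
    title: str
    expressions: list[Expression] = field(default_factory=list)

# hypothetical chain for one of the study's sample events
moon = Work("Small Step for Man",
            [Expression("broadcast excerpt",
                        [Manifestation("mp4",
                                       [Item("https://example.org/clip1")])])])
print(moon.expressions[0].manifestations[0].items[0].url)
```

Grouping duplicate or overlapping clips under a shared Work node is what lets the otherwise flat list of YouTube results express "same content, different rendering" relationships.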
  8. Greenberg, J.: Subject control of ephemera : MARC format options (1996) 0.01
    Footnote
    Paper presented at the Summer Seminar on Ephemera, Philadelphia, Pennsylvania, 4-6 Aug 1994
  9. Greenberg, J.: A quantitative categorical analysis of metadata elements in image-applicable metadata schemes (2001) 0.01
    Abstract
    This article reports on a quantitative categorical analysis of metadata elements in the Dublin Core, VRA Core, REACH, and EAD metadata schemas, all of which can be used for organizing and describing images. The study found that each of the examined metadata schemas contains elements that support the discovery, use, authentication, and administration of images, and that the number and proportion of elements supporting functions in these classes varies per schema. The study introduces a new schema comparison methodology and explores the development of a class-oriented functional metadata schema for controlling images across multiple domains
  10. Greenberg, J.: Optimal query expansion (QE) processing methods with semantically encoded structured thesaurus terminology (2001) 0.01
    Abstract
    While researchers have explored the value of structured thesauri as controlled vocabularies for general information retrieval (IR) activities, they have not identified the optimal query expansion (QE) processing methods for taking advantage of the semantic encoding underlying the terminology in these tools. The study reported on in this article addresses this question, and examined whether QE via semantically encoded thesauri terminology is more effective in the automatic or interactive processing environment. The research found that, regardless of end-users' retrieval goals, synonyms and partial synonyms (SYNs) and narrower terms (NTs) are generally good candidates for automatic QE and that related terms (RTs) are better candidates for interactive QE. The study also examined end-users' selection of semantically encoded thesauri terms for interactive QE, and explored how retrieval goals and QE processes may be combined in future thesauri-supported IR systems.
  11. Greenberg, J.: Intellectual control of visual archives : a comparison between the Art and Architecture Thesaurus and the Library of Congress Thesaurus for Graphic Materials (1993) 0.01
    Abstract
    The following investigation is a comparison between the Art and Architecture Thesaurus (AAT) and the LC Thesaurus for Graphic Materials (LCTGM), two popular sources for providing subject access to visual archives. The analysis begins with a discussion on the nature of visual archives and the employment of archival control theory to graphic materials. The major difference observed is that the AAT is a faceted structure geared towards a specialized audience of art and architecture researchers, while LCTGM is similar to LCSH in structure and aims to service the widespread archival community. The conclusion recognizes the need to understand the differences between subject thesauri and subject heading lists, and the pressing need to investigate and understand intellectual control of visual archives in today's automated environment.
  12. Willis, C.; Greenberg, J.; White, H.: Analysis and synthesis of metadata goals for scientific data (2012) 0.00
    Abstract
    The proliferation of discipline-specific metadata schemes contributes to artificial barriers that can impede interdisciplinary and transdisciplinary research. The authors considered this problem by examining the domains, objectives, and architectures of nine metadata schemes used to document scientific data in the physical, life, and social sciences. They used a mixed-methods content analysis and Greenberg's metadata objectives, principles, domains, and architectural layout (MODAL) framework, and derived 22 metadata-related goals from textual content describing each metadata scheme. Relationships are identified between the domains (e.g., scientific discipline and type of data) and the categories of scheme objectives. For each strong correlation (>0.6), a Fisher's exact test for nonparametric data was used to determine significance (p < .05). Significant relationships were found between the domains and objectives of the schemes. Schemes describing observational data are more likely to have "scheme harmonization" (compatibility and interoperability with related schemes) as an objective; schemes with the objective "abstraction" (a conceptual model exists separate from the technical implementation) also have the objective "sufficiency" (the scheme defines a minimal amount of information to meet the needs of the community); and schemes with the objective "data publication" do not have the objective "element refinement." The analysis indicates that many metadata-driven goals expressed by communities are independent of scientific discipline or the type of data, although they are constrained by historical community practices and workflows as well as the technological environment at the time of scheme creation. The analysis reveals 11 fundamental metadata goals for metadata documenting scientific data in support of sharing research data across disciplines and domains.
The authors report these results and highlight the need for more metadata-related research, particularly in the context of recent funding agency policy changes.
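The abstract's analysis pairs strong domain-objective correlations with Fisher's exact test on 2x2 tables. A stdlib-only sketch of the two-sided test (summing hypergeometric probabilities of all tables, with the same margins, no more likely than the observed one); the example table of scheme counts is invented for illustration:

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test p-value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def p(x):  # hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = p(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # sum over every table at least as extreme (no more probable) than observed
    return sum(p(x) for x in range(lo, hi + 1) if p(x) <= p_obs + 1e-12)

# hypothetical counts: observational vs. other schemes, with vs. without
# the "scheme harmonization" objective
print(round(fisher_exact_two_sided([[4, 0], [1, 4]]), 3))  # 0.048
```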
  13. Greenberg, J.: Theoretical considerations of lifecycle modeling : an analysis of the Dryad Repository demonstrating automatic metadata propagation, inheritance, and value system adoption (2009) 0.00
    Abstract
    The Dryad repository is for data supporting published research in the field of evolutionary biology and related disciplines. Dryad development team members seek a theoretical framework to aid communication about metadata issues and plans. This article explores lifecycle modeling as a theoretical framework for understanding metadata in the repository environment. A background discussion reviews the importance of theory, the status of a metadata theory, and lifecycle concepts. An analysis draws examples from the Dryad repository demonstrating automatic propagation, metadata inheritance, and value system adoption, and reports results from a faceted term mapping experiment that included 12 vocabularies and approximately 600 terms. The article also reports selected key findings from a recent survey on the data-sharing attitudes and behaviors of nearly 400 evolutionary biologists. The results confirm the applicability of lifecycle modeling to Dryad's metadata infrastructure. The article concludes that lifecycle modeling provides a theoretical framework that can enhance our understanding of metadata, aid communication about the topic of metadata in the repository environment, and potentially help sustain robust repository development.
  14. Li, K.; Greenberg, J.; Dunic, J.: Data objects and documenting scientific processes : an analysis of data events in biodiversity data papers (2020) 0.00
    Abstract
    The data paper, an emerging scholarly genre, describes research data sets and is intended to bridge the gap between the publication of research data and scientific articles. Research examining how data papers report data events, such as data transactions and manipulations, is limited. The research reported on in this article addresses this limitation and investigated how data events are inscribed in data papers. A content analysis was conducted examining the full texts of 82 data papers, drawn from the curated list of data papers connected to the Global Biodiversity Information Facility. Data events recorded for each paper were organized into a set of 17 categories. Many of these categories are described together in the same sentence, which indicates the messiness of data events in the laboratory space. The findings challenge the degree to which data papers are a distinct genre compared to research articles and the degree to which they describe data-centric research processes in a thorough way. This article also discusses how our results could inform a better data publication ecosystem in the future.
  15. Grabus, S.; Logan, P.M.; Greenberg, J.: Temporal concept drift and alignment : an empirical approach to comparing knowledge organization systems over time (2022) 0.00
    Abstract
    This research explores temporal concept drift and temporal alignment in knowledge organization systems (KOS). A comparative analysis is pursued using the 1910 Library of Congress Subject Headings, 2020 FAST Topical, and automatic indexing. The use case involves a sample of 90 nineteenth-century Encyclopedia Britannica entries. The entries were indexed using two approaches: 1) full-text indexing; 2) Named Entity Recognition was performed upon the entries with Stanza, Stanford's NLP toolkit, and entities were automatically indexed with the Helping Interdisciplinary Vocabulary Engineering (HIVE) application, using both 1910 LCSH and FAST Topical. The analysis focused on three goals: 1) identifying results that were exclusive to the 1910 LCSH output; 2) identifying terms in the exclusive set that have been deprecated from the contemporary LCSH, demonstrating temporal concept drift; and 3) exploring the historical significance of these deprecated terms. Results confirm that historical vocabularies can be used to generate anachronistic subject headings representing conceptual drift across time in KOS and historical resources. A methodological contribution is made demonstrating how to study changes in KOS over time and improve the contextualization of historical humanities resources.
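The comparison logic the abstract describes reduces to set operations over indexing output: find headings produced only by the 1910 vocabulary, then flag those no longer present in the contemporary vocabulary. A minimal sketch; all term lists are invented for illustration, not the study's data:

```python
# hypothetical indexing output for one encyclopedia entry
lcsh_1910_output = {"Aeroplanes", "Electricity", "Moving pictures"}
fast_output = {"Airplanes", "Electricity", "Motion pictures"}

# hypothetical slice of the contemporary vocabulary
contemporary_lcsh = {"Airplanes", "Electricity", "Motion pictures"}

# goal 1: headings exclusive to the 1910 LCSH output
exclusive_to_1910 = lcsh_1910_output - fast_output

# goal 2: exclusive headings deprecated from the contemporary vocabulary,
# i.e. evidence of temporal concept drift
deprecated = {t for t in exclusive_to_1910 if t not in contemporary_lcsh}
print(sorted(deprecated))  # ['Aeroplanes', 'Moving pictures']
```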