Search (203 results, page 1 of 11)

  • year_i:[2020 TO 2030}
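  Note: the year facet above uses Lucene/Solr range syntax, in which square brackets mark inclusive endpoints and curly brackets exclusive ones, so year_i:[2020 TO 2030} matches 2020 <= year_i < 2030, i.e. publication years 2020-2029.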
  1. Noever, D.; Ciolino, M.: ¬The Turing deception (2022) 0.14
    0.14355062 = product of:
      0.28710124 = sum of:
        0.07177531 = product of:
          0.21532592 = sum of:
            0.21532592 = weight(_text_:3a in 862) [ClassicSimilarity], result of:
              0.21532592 = score(doc=862,freq=2.0), product of:
                0.38312992 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.045191016 = queryNorm
                0.56201804 = fieldWeight in 862, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.046875 = fieldNorm(doc=862)
          0.33333334 = coord(1/3)
        0.21532592 = weight(_text_:2f in 862) [ClassicSimilarity], result of:
          0.21532592 = score(doc=862,freq=2.0), product of:
            0.38312992 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.045191016 = queryNorm
            0.56201804 = fieldWeight in 862, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.046875 = fieldNorm(doc=862)
      0.5 = coord(2/4)
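    Note on the score breakdowns: the explanation trees on this page are Lucene ClassicSimilarity (TF-IDF) output. As a reading aid (a sketch of the standard formula, not additional system output), each leaf weight and total reduce to

      \mathrm{score}(q,d) = \mathrm{coord}(q,d)\cdot\sum_{t\in q}\sqrt{\mathrm{freq}(t,d)}\cdot\mathrm{idf}(t)^{2}\cdot\mathrm{queryNorm}(q)\cdot\mathrm{fieldNorm}(t,d),
      \qquad \mathrm{idf}(t) = 1 + \ln\frac{\mathrm{maxDocs}}{\mathrm{docFreq}(t)+1}

    Checking result 1 against this: idf = 1 + ln(44218/25) ≈ 8.478; the _text_:2f leaf is sqrt(2) · 8.478² · 0.045191 · 0.046875 ≈ 0.21533; the _text_:3a branch is the same value scaled by coord(1/3) ≈ 0.07178; and the outer coord(2/4) = 0.5 yields the listed 0.14355.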
    
    Source
    https://arxiv.org/abs/2212.06721
  2. Gabler, S.: Vergabe von DDC-Sachgruppen mittels eines Schlagwort-Thesaurus (2021) 0.12
    0.11962552 = product of:
      0.23925105 = sum of:
        0.05981276 = product of:
          0.17943828 = sum of:
            0.17943828 = weight(_text_:3a in 1000) [ClassicSimilarity], result of:
              0.17943828 = score(doc=1000,freq=2.0), product of:
                0.38312992 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.045191016 = queryNorm
                0.46834838 = fieldWeight in 1000, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1000)
          0.33333334 = coord(1/3)
        0.17943828 = weight(_text_:2f in 1000) [ClassicSimilarity], result of:
          0.17943828 = score(doc=1000,freq=2.0), product of:
            0.38312992 = queryWeight, product of:
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.045191016 = queryNorm
            0.46834838 = fieldWeight in 1000, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              8.478011 = idf(docFreq=24, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1000)
      0.5 = coord(2/4)
    
    Content
    Master's thesis, Master of Science (Library and Information Studies) (MSc), Universität Wien. Advisor: Christoph Steiner. Cf.: https://www.researchgate.net/publication/371680244_Vergabe_von_DDC-Sachgruppen_mittels_eines_Schlagwort-Thesaurus. DOI: 10.25365/thesis.70030. Cf. also the presentation at: https://wiki.dnb.de/download/attachments/252121510/DA3%20Workshop-Gabler.pdf?version=1&modificationDate=1671093170000&api=v2.
  3. ¬The library's guide to graphic novels (2020) 0.09
    0.08566846 = product of:
      0.34267384 = sum of:
        0.34267384 = weight(_text_:graphic in 717) [ClassicSimilarity], result of:
          0.34267384 = score(doc=717,freq=40.0), product of:
            0.29924196 = queryWeight, product of:
              6.6217136 = idf(docFreq=159, maxDocs=44218)
              0.045191016 = queryNorm
            1.1451397 = fieldWeight in 717, product of:
              6.3245554 = tf(freq=40.0), with freq of:
                40.0 = termFreq=40.0
              6.6217136 = idf(docFreq=159, maxDocs=44218)
              0.02734375 = fieldNorm(doc=717)
      0.25 = coord(1/4)
    
    Abstract
    This monograph provides an overview of the various aspects involved in selecting, acquiring and cataloging graphic novels and making them available to patrons.
    The circ stats say it all: graphic novels' popularity among library users keeps growing, with more being published (and acquired by libraries) each year. The unique challenges of developing and managing a graphic novels collection have led the Association for Library Collections and Technical Services (ALCTS) to craft this guide, presented under the expert supervision of editor Ballestro, who has worked with comics for more than 35 years. Examining the ever-changing ways that graphic novels are created, packaged, marketed, and released, this resource gathers a range of voices from the field to explore such topics as: a cultural history of comics and graphic novels from their World War II origins to today, providing a solid grounding for newbies and fresh insights for all; catching up on the Big Two's reboots: Marvel's 10 and DC's 4; five questions to ask when evaluating nonfiction graphic novels and 30 picks for a core collection; key publishers and cartoonists to consider when adding international titles; developing a collection that supports curriculum and faculty outreach to ensure wide usage, with catalogers' tips for organizing your collection and improving discovery; real-world examples of how libraries treat graphic novels, such as an in-depth profile of the development of Penn Library's Manga collection; how to integrate the emerging field of graphic medicine into the collection; and specialized resources like The Cartoonists of Color and Queer Cartoonists databases, the open access scholarly journal Comic Grid, and the No Flying, No Tights website. Packed with expert guidance and useful information, this guide will assist technical services staff, catalogers, and acquisition and collection management librarians.
    Content
    Inhalt: Between the Panels: A Cultural History of Comic Books and Graphic Novels / by Joshua Everett -- Graphic Novel Companies, Reboots, and Numbering / by John Ballestro -- Creating and Developing a Graphic Literature Collection in an Academic Library / by Andrea Kingston -- Non-Fiction Graphic Novels / by Carli Spina -- Fiction Graphic Novels / by Kayla Kuni -- International Comics and Graphic Novels / by Emily Drew, Lucia Serantes, and Amie Wright -- Building a Japanese Manga Collection for Non-Traditional Patrons in an Academic Library / by Molly Desjardins and Michael P. Williams -- Graphic Medicine in Your Library: Ideas and Strategies for Collecting Comics about Healthcare / by Alice Jaggers, Matthew Noe, and Ariel Pomputius -- The Nuts and Bolts of Comics Cataloging / by Allison Bailund, Hallie Clawson, and Staci Crouch -- Teaching and Programming with Graphic Novels in Academic Libraries / by Jacob Gordon and Sarah Kern.
    LCSH
    Libraries / Special collections / Graphic novels
    RSWK
    Bibliothek / Comic / Graphic Novel / Sammlung / Universitätsbibliothek / Wissenschaftliche Bibliothek
    Subject
    Bibliothek / Comic / Graphic Novel / Sammlung / Universitätsbibliothek / Wissenschaftliche Bibliothek
    Libraries / Special collections / Graphic novels
  4. Becnel, K.; Moeller, R.A.: Graphic novels in the school library : questions of cataloging, classification, and arrangement (2022) 0.07
    0.07260963 = product of:
      0.29043853 = sum of:
        0.29043853 = weight(_text_:graphic in 1107) [ClassicSimilarity], result of:
          0.29043853 = score(doc=1107,freq=22.0), product of:
            0.29924196 = queryWeight, product of:
              6.6217136 = idf(docFreq=159, maxDocs=44218)
              0.045191016 = queryNorm
            0.97058094 = fieldWeight in 1107, product of:
              4.690416 = tf(freq=22.0), with freq of:
                22.0 = termFreq=22.0
              6.6217136 = idf(docFreq=159, maxDocs=44218)
              0.03125 = fieldNorm(doc=1107)
      0.25 = coord(1/4)
    
    Abstract
    In recent years, many school librarians have been scrambling to build and expand their graphic novel collections to meet the large and growing demand for these materials. For the purposes of this study, the term graphic novels refers to volumes in which the content is provided through sequential art, including fiction, nonfiction, and biographical material. As the library field has not yet arrived at a set of best practices or guidelines for institutions working to classify and catalog graphic novels, this study seeks to record the ways in which school librarians are handling these materials as well as issues and questions at the forefront of their minds. A survey of school librarians in the United States revealed that almost all of them collect fiction and nonfiction graphic novels, while 67% collect manga. Most respondents indicated that they are partly or solely responsible for the cataloging and classification decisions made in their media centers. For classification purposes, most have elected to create separate graphic novel collections to house their fictional graphic novels. Some include nonfiction graphic novels in this section, while others create a nonfiction graphic novel collection nearby or shelve nonfiction graphic novels with other items that deal with similar subject matter. Many school librarians express uncertainty about how best to catalog and classify longer series, adapted classics, superhero stories, and the increasing number and variety of inventive titles that defy categorization. They also struggle with inconsistent vendor records and past practices and suffer from a lack of full confidence in their knowledge of how to best classify and catalog graphic novels so that they are both searchable in the library catalog and easily accessible on the shelves.
  5. Candela, G.: ¬An automatic data quality approach to assess semantic data from cultural heritage institutions (2023) 0.03
    0.030688211 = product of:
      0.122752845 = sum of:
        0.122752845 = sum of:
          0.07989353 = weight(_text_:methods in 997) [ClassicSimilarity], result of:
            0.07989353 = score(doc=997,freq=4.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.43973273 = fieldWeight in 997, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.0546875 = fieldNorm(doc=997)
          0.042859312 = weight(_text_:22 in 997) [ClassicSimilarity], result of:
            0.042859312 = score(doc=997,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.2708308 = fieldWeight in 997, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0546875 = fieldNorm(doc=997)
      0.25 = coord(1/4)
    
    Abstract
    In recent years, cultural heritage institutions have been exploring the benefits of applying Linked Open Data to their catalogs and digital materials. Innovative and creative methods have emerged to publish and reuse digital contents to promote computational access, such as the concepts of Labs and Collections as Data. Data quality has become a requirement for researchers and for training methods based on artificial intelligence and machine learning. This article explores how the quality of Linked Open Data made available by cultural heritage institutions can be automatically assessed. The results obtained can be useful for other institutions that wish to publish and assess their collections.
    Date
    22. 6.2023 18:23:31
  6. Das, S.; Paik, J.H.: Gender tagging of named entities using retrieval-assisted multi-context aggregation : an unsupervised approach (2023) 0.03
    0.026304178 = product of:
      0.10521671 = sum of:
        0.10521671 = sum of:
          0.068480164 = weight(_text_:methods in 941) [ClassicSimilarity], result of:
            0.068480164 = score(doc=941,freq=4.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.37691376 = fieldWeight in 941, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.046875 = fieldNorm(doc=941)
          0.03673655 = weight(_text_:22 in 941) [ClassicSimilarity], result of:
            0.03673655 = score(doc=941,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.23214069 = fieldWeight in 941, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=941)
      0.25 = coord(1/4)
    
    Abstract
    Inferring the gender of named entities present in a text has several practical applications in the information sciences. Existing approaches to name gender identification rely exclusively on gender distributions from labeled data, and in the absence of such labeled data these methods fail. In this article, we propose a two-stage model that can infer the gender of names present in text without requiring explicit name-gender labels. We use coreference resolution as the backbone of our proposed model. To aid coreference resolution where the existing contextual information does not suffice, we use a retrieval-assisted context aggregation framework. We demonstrate that state-of-the-art name gender inference is possible without supervision. Our proposed method matches or outperforms several supervised approaches and commercially used methods on five English-language datasets from different domains.
    Date
    22. 3.2023 12:00:14
  7. Ma, Y.: Relatedness and compatibility : the concept of privacy in Mandarin Chinese and American English corpora (2023) 0.02
    0.021289835 = product of:
      0.08515934 = sum of:
        0.08515934 = sum of:
          0.048422787 = weight(_text_:methods in 887) [ClassicSimilarity], result of:
            0.048422787 = score(doc=887,freq=2.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.26651827 = fieldWeight in 887, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.046875 = fieldNorm(doc=887)
          0.03673655 = weight(_text_:22 in 887) [ClassicSimilarity], result of:
            0.03673655 = score(doc=887,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.23214069 = fieldWeight in 887, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.046875 = fieldNorm(doc=887)
      0.25 = coord(1/4)
    
    Abstract
    This study investigates how privacy as an ethical concept exists in two languages: Mandarin Chinese and American English. The exploration relies on two genres of corpora spanning 10 years (2010-2019): social media posts and news articles. A mixed-methods approach combining structural topic modeling (STM) and human interpretation was used to work with the data. Findings show various privacy-related topics across the two languages. Moreover, some of these topics reveal fundamental incompatibilities in how privacy is understood across the two languages. In other words, some of the variations of topics do not just reflect contextual differences; they reveal how the two languages value privacy in different ways that relate back to each society's ethical tradition. This study is one of the first empirically grounded intercultural explorations of the concept of privacy. It shows that natural language is a promising way to operationalize intercultural and comparative privacy research, and it provides an examination of the concept as it is understood in these two languages.
    Date
    22. 1.2023 18:59:40
  8. Palsdottir, A.: Data literacy and management of research data : a prerequisite for the sharing of research data (2021) 0.02
    0.020101214 = product of:
      0.080404855 = sum of:
        0.080404855 = sum of:
          0.05591382 = weight(_text_:methods in 183) [ClassicSimilarity], result of:
            0.05591382 = score(doc=183,freq=6.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.3077488 = fieldWeight in 183, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.03125 = fieldNorm(doc=183)
          0.024491036 = weight(_text_:22 in 183) [ClassicSimilarity], result of:
            0.024491036 = score(doc=183,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.15476047 = fieldWeight in 183, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03125 = fieldNorm(doc=183)
      0.25 = coord(1/4)
    
    Abstract
    Purpose The purpose of this paper is to investigate knowledge and attitudes about research data management, the use of data management methods and the perceived need for support, in relation to participants' field of research. Design/methodology/approach This is a quantitative study. Data were collected by an email survey sent to 792 academic researchers and doctoral students. The total response rate was 18% (N = 139). The measurement instrument consisted of six sets of questions: about data management plans, the assignment of additional information to research data, metadata, standard file naming systems, training in data management methods and the storing of research data. Findings The main finding is that knowledge about the procedures of data management is limited, and data management is not a normal practice in the researchers' work. They were, however, in general of the opinion that the university should take the lead by recommending and offering access to the necessary tools of data management. Taken together, the results indicate that there is an urgent need to increase researchers' understanding of the importance of data management that is based on professional knowledge and to provide them with resources and training that enable them to make effective and productive use of data management methods. Research limitations/implications The survey was sent to the entire population rather than a sample of it. Because of the response rate, the results cannot be generalized to all researchers at the university. Nevertheless, the findings may provide an important understanding of their research data procedures, in particular what characterizes their knowledge about data management and their attitude towards it. Practical implications Awareness of these issues is essential for information specialists at academic libraries, together with other units within the universities, to be able to design infrastructures and develop services that suit the needs of the research community. The findings can be used to develop data policies and services, based on professional knowledge of best practices and recognized standards, that assist the research community in data management. Originality/value The study contributes to the existing literature about research data management by examining the results by participants' field of research. Recognition of the issues is critical in order for information specialists, in collaboration with universities, to design relevant infrastructures and services for academics and doctoral students that can promote their research data management.
    Date
    20. 1.2015 18:30:22
  9. Kim, J.(im); Kim, J.(enna): Effect of forename string on author name disambiguation (2020) 0.02
    0.01774153 = product of:
      0.07096612 = sum of:
        0.07096612 = sum of:
          0.040352322 = weight(_text_:methods in 5930) [ClassicSimilarity], result of:
            0.040352322 = score(doc=5930,freq=2.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.22209854 = fieldWeight in 5930, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5930)
          0.030613795 = weight(_text_:22 in 5930) [ClassicSimilarity], result of:
            0.030613795 = score(doc=5930,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.19345059 = fieldWeight in 5930, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=5930)
      0.25 = coord(1/4)
    
    Abstract
    In author name disambiguation, author forenames are used to decide which name instances are disambiguated together and how likely they are to refer to the same author. Despite this crucial role of forenames, their effect on the performance of heuristic (string matching) and algorithmic disambiguation is not well understood. This study assesses the contributions of forenames in author name disambiguation using multiple labeled data sets under varying ratios and lengths of full forenames, reflecting real-world scenarios in which an author is represented by forename variants (synonyms) and some authors share the same forenames (homonyms). The results show that increasing the ratio of full forenames substantially improves both heuristic and machine-learning-based disambiguation. Performance gains from algorithmic disambiguation are pronounced when many forenames are initialized or homonyms are prevalent. As the ratio of full forenames increases, however, these gains become marginal compared to those from string matching. Using a small portion of forename strings does not much reduce the performance of either heuristic or algorithmic disambiguation compared to using full-length strings. These findings provide practical suggestions, such as restoring initialized forenames to a full-string format via record linkage for improved disambiguation performance. The synonym/homonym tension is easy to see in isolation; a minimal sketch follows below.
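    A minimal, hypothetical sketch (not the study's code): a blocking key built from the surname plus either a forename initial or the full forename, showing why initialized forenames merge homonyms. All names are invented.

      # Minimal illustration (not the study's code): initial-based keys merge
      # distinct authors (homonyms); full-forename keys keep them apart.
      def blocking_key(name: str, use_full_forename: bool) -> str:
          """Surname plus either the full forename or just its initial."""
          surname, forename = [p.strip() for p in name.split(",")]
          key = forename if use_full_forename else forename[0]
          return f"{surname}|{key.lower()}"

      authors = ["Kim, Jinseok", "Kim, Jenna", "Kim, J."]  # hypothetical records

      for full in (False, True):
          print("full forenames" if full else "initials only",
                {a: blocking_key(a, full) for a in authors})
      # With initials only, all three records share the key "Kim|j" and would be
      # disambiguated together; with full strings the two spelled-out names
      # separate, and the initialized record no longer merges with either.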
    Date
    11. 7.2020 13:22:58
  10. Jia, J.: From data to knowledge : the relationships between vocabularies, linked data and knowledge graphs (2021) 0.02
    0.01774153 = product of:
      0.07096612 = sum of:
        0.07096612 = sum of:
          0.040352322 = weight(_text_:methods in 106) [ClassicSimilarity], result of:
            0.040352322 = score(doc=106,freq=2.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.22209854 = fieldWeight in 106, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.0390625 = fieldNorm(doc=106)
          0.030613795 = weight(_text_:22 in 106) [ClassicSimilarity], result of:
            0.030613795 = score(doc=106,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.19345059 = fieldWeight in 106, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=106)
      0.25 = coord(1/4)
    
    Abstract
    Purpose The purpose of this paper is to identify the concepts, component parts and relationships between vocabularies, linked data and knowledge graphs (KGs) from the perspectives of data and knowledge transitions. Design/methodology/approach This paper uses conceptual analysis methods. The study focuses on distinguishing concepts and analyzing composition and intercorrelations to explore data and knowledge transitions. Findings Vocabularies are the cornerstone for building an accurate understanding of the meaning of data. Vocabularies provide a data-sharing model and play an important role in supporting the semantic expression of linked data and defining the schema layer; they are also used for entity recognition, alignment and linkage in KGs. KGs, which consist of a schema layer and a data layer, are presented as cubes that organically combine vocabularies, linked data and big data. Originality/value This paper first describes the composition of vocabularies, linked data and KGs. More importantly, it innovatively analyzes and summarizes the interrelatedness of these factors, which arises from frequent interactions between data and knowledge. The three factors empower each other and can ultimately empower the Semantic Web.
    Date
    22. 1.2021 14:24:32
  11. Zhang, L.; Lu, W.; Yang, J.: LAGOS-AND : a large gold standard dataset for scholarly author name disambiguation (2023) 0.02
    0.01774153 = product of:
      0.07096612 = sum of:
        0.07096612 = sum of:
          0.040352322 = weight(_text_:methods in 883) [ClassicSimilarity], result of:
            0.040352322 = score(doc=883,freq=2.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.22209854 = fieldWeight in 883, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.0390625 = fieldNorm(doc=883)
          0.030613795 = weight(_text_:22 in 883) [ClassicSimilarity], result of:
            0.030613795 = score(doc=883,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.19345059 = fieldWeight in 883, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=883)
      0.25 = coord(1/4)
    
    Abstract
    In this article, we present a method to automatically build large labeled datasets for the author ambiguity problem in the academic world by leveraging two authoritative academic resources, ORCID and DOI. Using the method, we built LAGOS-AND, two large gold-standard sub-datasets for author name disambiguation (AND), of which LAGOS-AND-BLOCK is created for clustering-based AND research and LAGOS-AND-PAIRWISE for classification-based AND research. Our LAGOS-AND datasets are substantially different from the existing ones. The initial versions of the datasets (v1.0, released in February 2021) include 7.5 M citations authored by 798 K unique authors (LAGOS-AND-BLOCK) and close to 1 M instances (LAGOS-AND-PAIRWISE). Both datasets show close similarities to the whole Microsoft Academic Graph (MAG) across validations of six facets. In building the datasets, we reveal the degrees of last-name variation in three literature databases, PubMed, MAG, and Semantic Scholar, by comparing the author names they host with the authors' official last names shown on their ORCID pages. Furthermore, we evaluate several baseline disambiguation methods as well as MAG's author ID system on our datasets, and the evaluation helps identify several interesting findings. We hope the datasets and findings will bring new insights for future studies. The code and datasets are publicly available. A sketch of the ORCID-DOI labeling idea follows below.
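    As a concrete illustration of the labeling idea, the sketch below derives (DOI, ORCID iD) gold pairs from ORCID's public v3.0 API. This is an assumption-laden sketch, not the LAGOS-AND pipeline: the endpoint and JSON shape follow the public API, and the iD shown is ORCID's well-known example record, used here as a placeholder.

      # Sketch: fetch one researcher's works from the ORCID public API and
      # emit (DOI, ORCID iD) pairs usable as gold labels for AND datasets.
      import requests

      def doi_orcid_pairs(orcid_id: str):
          """Yield (doi, orcid_id) pairs for one researcher."""
          resp = requests.get(
              f"https://pub.orcid.org/v3.0/{orcid_id}/works",
              headers={"Accept": "application/json"},
              timeout=30,
          )
          resp.raise_for_status()
          for group in resp.json().get("group", []):
              for summary in group.get("work-summary", []):
                  ids = (summary.get("external-ids") or {}).get("external-id", [])
                  for ext in ids:
                      if ext.get("external-id-type") == "doi":
                          yield ext.get("external-id-value"), orcid_id

      for doi, oid in doi_orcid_pairs("0000-0002-1825-0097"):  # example record
          print(doi, oid)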
    Date
    22. 1.2023 18:40:36
  12. Oh, H.; Nam, S.; Zhu, Y.: Structured abstract summarization of scientific articles : summarization using full-text section information (2023) 0.02
    0.01774153 = product of:
      0.07096612 = sum of:
        0.07096612 = sum of:
          0.040352322 = weight(_text_:methods in 889) [ClassicSimilarity], result of:
            0.040352322 = score(doc=889,freq=2.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.22209854 = fieldWeight in 889, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.0390625 = fieldNorm(doc=889)
          0.030613795 = weight(_text_:22 in 889) [ClassicSimilarity], result of:
            0.030613795 = score(doc=889,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.19345059 = fieldWeight in 889, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=889)
      0.25 = coord(1/4)
    
    Abstract
    The automatic summarization of scientific articles differs from other text genres because of the structured format and longer text length. Previous approaches have focused on tackling the lengthy nature of scientific articles, aiming to improve the computational efficiency of summarizing long text using a flat, unstructured abstract. However, the structured format of scientific articles and characteristics of each section have not been fully explored, despite their importance. The lack of a sufficient investigation and discussion of various characteristics for each section and their influence on summarization results has hindered the practical use of automatic summarization for scientific articles. To provide a balanced abstract proportionally emphasizing each section of a scientific article, the community introduced the structured abstract, an abstract with distinct, labeled sections. Using this information, in this study, we aim to understand tasks ranging from data preparation to model evaluation from diverse viewpoints. Specifically, we provide a preprocessed large-scale dataset and propose a summarization method applying the introduction, methods, results, and discussion (IMRaD) format reflecting the characteristics of each section. We also discuss the objective benchmarks and perspectives of state-of-the-art algorithms and present the challenges and research directions in this area.
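    The paper's summarization models are not reproduced here, but the IMRaD sectioning step it builds on is simple to sketch with the standard library; the one-sentence "summarizer" below is a stand-in for the proposed method, and the sample text is invented.

      # Sketch: split full text on IMRaD headings, then summarize each section
      # separately so every section is represented in the structured abstract.
      import re

      IMRAD = re.compile(r"^(Introduction|Methods|Results|Discussion)\s*$",
                         re.IGNORECASE | re.MULTILINE)

      def split_imrad(full_text: str) -> dict:
          """Map each IMRaD heading found in the text to its section body."""
          parts = IMRAD.split(full_text)
          # parts = [preamble, heading1, body1, heading2, body2, ...]
          return {h.title(): b.strip() for h, b in zip(parts[1::2], parts[2::2])}

      def first_sentence(text: str) -> str:  # stand-in summarizer
          return re.split(r"(?<=[.!?])\s+", text, maxsplit=1)[0]

      article = ("Introduction\nWe study X. It matters.\n"
                 "Methods\nWe did Y. Then Z.\n"
                 "Results\nY worked. Z too.\n"
                 "Discussion\nLimitations remain. More soon.")
      print({sec: first_sentence(body)
             for sec, body in split_imrad(article).items()})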
    Date
    22. 1.2023 18:57:12
  13. Yu, C.; Xue, H.; An, L.; Li, G.: ¬A lightweight semantic-enhanced interactive network for efficient short-text matching (2023) 0.02
    0.01774153 = product of:
      0.07096612 = sum of:
        0.07096612 = sum of:
          0.040352322 = weight(_text_:methods in 890) [ClassicSimilarity], result of:
            0.040352322 = score(doc=890,freq=2.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.22209854 = fieldWeight in 890, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.0390625 = fieldNorm(doc=890)
          0.030613795 = weight(_text_:22 in 890) [ClassicSimilarity], result of:
            0.030613795 = score(doc=890,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.19345059 = fieldWeight in 890, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=890)
      0.25 = coord(1/4)
    
    Abstract
    Knowledge-enhanced short-text matching has been a significant task attracting much attention in recent years. However, existing approaches cannot effectively balance effectiveness and efficiency. Effective models usually consist of complex network structures, leading to slow inference and difficulty of application in actual practice. In addition, most knowledge-enhanced models try to link the mentions in the text to the entities of a knowledge graph, and the difficulties of entity linking reduce generalizability across different datasets. To address these problems, we propose a lightweight Semantic-Enhanced Interactive Network (SEIN) model for efficient short-text matching. Unlike most current research, SEIN employs an unsupervised method to select WordNet's most appropriate paraphrase description as the external semantic knowledge. It focuses on integrating the semantic and interactive information of the text while simplifying the structure of the other modules. We conduct intensive experiments on four real-world datasets, that is, Quora, Twitter-URL, SciTail, and SICK-E. Compared with state-of-the-art methods, SEIN achieves the best performance on most datasets. The experimental results show that introducing external knowledge can effectively improve the performance of short-text matching models. The research sheds light on the role of lightweight models in leveraging external knowledge to improve the effect of short-text matching. A sketch of the WordNet-gloss idea follows below.
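    SEIN itself is not sketched here, but the flavor of its external knowledge is easy to show: WordNet glosses retrieved via NLTK as candidate paraphrase descriptions for a term. The naive "first few synsets" selection below is an assumption; the paper selects among paraphrases with an unsupervised method.

      # Sketch: WordNet glosses as candidate external semantic knowledge.
      import nltk
      nltk.download("wordnet", quiet=True)  # one-time corpus fetch
      from nltk.corpus import wordnet as wn

      def gloss_candidates(term: str, limit: int = 3) -> list:
          """Return up to `limit` dictionary-style paraphrases for a term."""
          return [s.definition() for s in wn.synsets(term)[:limit]]

      print(gloss_candidates("bank"))
      # A knowledge-enhanced matcher would pick one of these paraphrases and
      # feed it to the model alongside the two short texts being compared.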
    Date
    22. 1.2023 19:05:27
  14. Vakkari, P.; Järvelin, K.; Chang, Y.-W.: ¬The association of disciplinary background with the evolution of topics and methods in Library and Information Science research 1995-2015 (2023) 0.02
    0.01774153 = product of:
      0.07096612 = sum of:
        0.07096612 = sum of:
          0.040352322 = weight(_text_:methods in 998) [ClassicSimilarity], result of:
            0.040352322 = score(doc=998,freq=2.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.22209854 = fieldWeight in 998, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.0390625 = fieldNorm(doc=998)
          0.030613795 = weight(_text_:22 in 998) [ClassicSimilarity], result of:
            0.030613795 = score(doc=998,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.19345059 = fieldWeight in 998, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=998)
      0.25 = coord(1/4)
    
    Date
    22. 6.2023 18:15:06
  15. Barité, M.; Parentelli, V.; Rodríguez Casaballe, N.; Suárez, M.V.: Interdisciplinarity and postgraduate teaching of knowledge organization (KO) : elements for a necessary dialogue (2023) 0.02
    0.01774153 = product of:
      0.07096612 = sum of:
        0.07096612 = sum of:
          0.040352322 = weight(_text_:methods in 1125) [ClassicSimilarity], result of:
            0.040352322 = score(doc=1125,freq=2.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.22209854 = fieldWeight in 1125, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1125)
          0.030613795 = weight(_text_:22 in 1125) [ClassicSimilarity], result of:
            0.030613795 = score(doc=1125,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.19345059 = fieldWeight in 1125, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1125)
      0.25 = coord(1/4)
    
    Abstract
    Interdisciplinarity implies the previous existence of disciplinary fields, not their dissolution. As a general objective, we propose to establish an initial approach to the emphasis given to interdisciplinarity in the teaching of KO, through the teaching staff responsible for postgraduate courses focused on, or related to, KO in Ibero-American universities. To conduct the research, the framework and distribution of a survey addressed to teachers are proposed, based on four lines of action: 1. The way teachers manage the concept of interdisciplinarity. 2. The place that teachers give to interdisciplinarity in KO. 3. Assessment of the interdisciplinary content that teachers incorporate into their postgraduate courses. 4. The set of teaching strategies and resources used by teachers to include interdisciplinarity in the teaching of KO. The study analyzed 22 responses. Preliminary results show that KO teachers recognize the influence of other disciplines in concepts, theories, methods, and applications, but no consensus has been reached regarding which disciplines and authors build the interdisciplinary bridges. Among other conclusions, the study strongly suggests that environmental and social tensions are reflected in subject representation, especially in the construction of friendly knowledge organization systems with interdisciplinary visions, and in the expressions through which information is sought.
  16. Luo, L.; Ju, J.; Li, Y.-F.; Haffari, G.; Xiong, B.; Pan, S.: ChatRule: mining logical rules with large language models for knowledge graph reasoning (2023) 0.02
    0.01774153 = product of:
      0.07096612 = sum of:
        0.07096612 = sum of:
          0.040352322 = weight(_text_:methods in 1171) [ClassicSimilarity], result of:
            0.040352322 = score(doc=1171,freq=2.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.22209854 = fieldWeight in 1171, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1171)
          0.030613795 = weight(_text_:22 in 1171) [ClassicSimilarity], result of:
            0.030613795 = score(doc=1171,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.19345059 = fieldWeight in 1171, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1171)
      0.25 = coord(1/4)
    
    Abstract
    Logical rules are essential for uncovering the logical connections between relations, which could improve reasoning performance and provide interpretable results on knowledge graphs (KGs). Although there have been many efforts to mine meaningful logical rules over KGs, existing methods suffer from computationally intensive searches over the rule space and a lack of scalability for large-scale KGs. Besides, they often ignore the semantics of relations, which is crucial for uncovering logical connections. Recently, large language models (LLMs) have shown impressive performance in the field of natural language processing and various applications, owing to their emergent ability and generalizability. In this paper, we propose a novel framework, ChatRule, unleashing the power of large language models for mining logical rules over knowledge graphs. Specifically, the framework is initiated with an LLM-based rule generator, leveraging both the semantic and structural information of KGs to prompt LLMs to generate logical rules. To refine the generated rules, a rule ranking module estimates rule quality by incorporating facts from existing KGs. Finally, a rule validator harnesses the reasoning ability of LLMs to validate the logical correctness of ranked rules through chain-of-thought reasoning. ChatRule is evaluated on four large-scale KGs, w.r.t. different rule quality metrics and downstream tasks, showing the effectiveness and scalability of our method.
    Date
    23.11.2023 19:07:22
  17. Guo, T.; Bai, X.; Zhen, S.; Abid, S.; Xia, F.: Lost at starting line : predicting maladaptation of university freshmen based on educational big data (2023) 0.02
    0.01774153 = product of:
      0.07096612 = sum of:
        0.07096612 = sum of:
          0.040352322 = weight(_text_:methods in 1194) [ClassicSimilarity], result of:
            0.040352322 = score(doc=1194,freq=2.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.22209854 = fieldWeight in 1194, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1194)
          0.030613795 = weight(_text_:22 in 1194) [ClassicSimilarity], result of:
            0.030613795 = score(doc=1194,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.19345059 = fieldWeight in 1194, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1194)
      0.25 = coord(1/4)
    
    Abstract
    The transition from secondary education to higher education can be challenging for most freshmen. For students who fail to adjust to university life smoothly, their status may worsen if the university cannot offer timely and proper guidance. Helping students adapt to university life is a long-term goal for any academic institution. Therefore, understanding the nature of the maladaptation phenomenon and the early prediction of "at-risk" students are crucial tasks that urgently need to be tackled effectively. This article aims to analyze the relevant factors that affect the maladaptation phenomenon and to predict this phenomenon in advance. We develop a prediction framework (MAladaptive STudEnt pRediction, MASTER) for the early prediction of students with maladaptation. First, our framework uses the SMOTE (Synthetic Minority Oversampling Technique) algorithm to solve the data label imbalance issue. Moreover, a novel ensemble algorithm, priority forest, is proposed for outputting ranks instead of binary results, which enables us to perform proactive interventions in a prioritized manner where limited education resources are available. Experimental results on real-world education datasets demonstrate that the MASTER framework outperforms other state-of-the-art methods. A sketch of the oversampling step follows below.
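    The priority-forest ranker is the authors' own contribution and is not reproduced here; the sketch below shows only the class-imbalance step the abstract names, using imbalanced-learn's SMOTE on synthetic stand-in data (library and data are assumptions, not the study's setup).

      # Sketch of the oversampling step: SMOTE synthesizes minority-class
      # samples so "at-risk" students are not swamped during training.
      from collections import Counter
      from imblearn.over_sampling import SMOTE
      from sklearn.datasets import make_classification

      # A 5% minority class stands in for the maladapted freshmen.
      X, y = make_classification(n_samples=1000, weights=[0.95, 0.05],
                                 random_state=0)
      print("before:", Counter(y))
      X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
      print("after: ", Counter(y_res))  # classes now balanced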
    Date
    27.12.2022 18:34:22
  18. Hocker, J.; Schindler, C.; Rittberger, M.: Participatory design for ontologies : a case study of an open science ontology for qualitative coding schemas (2020) 0.02
    0.01753612 = product of:
      0.07014448 = sum of:
        0.07014448 = sum of:
          0.045653444 = weight(_text_:methods in 179) [ClassicSimilarity], result of:
            0.045653444 = score(doc=179,freq=4.0), product of:
              0.18168657 = queryWeight, product of:
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.045191016 = queryNorm
              0.25127584 = fieldWeight in 179, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.0204134 = idf(docFreq=2156, maxDocs=44218)
                0.03125 = fieldNorm(doc=179)
          0.024491036 = weight(_text_:22 in 179) [ClassicSimilarity], result of:
            0.024491036 = score(doc=179,freq=2.0), product of:
              0.15825124 = queryWeight, product of:
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.045191016 = queryNorm
              0.15476047 = fieldWeight in 179, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5018296 = idf(docFreq=3622, maxDocs=44218)
                0.03125 = fieldNorm(doc=179)
      0.25 = coord(1/4)
    
    Abstract
    Purpose The open science movement calls for transparent and retraceable research processes. While infrastructures to support these practices in qualitative research are lacking, their design needs to consider different approaches and workflows. The paper builds on the definition of ontologies as shared conceptualizations of knowledge (Borst, 1999). The authors argue that participatory design is a good way to create these shared conceptualizations by giving domain experts and future users a voice in the design process via interviews, workshops and observations. Design/methodology/approach This paper presents a novel approach for creating ontologies in the field of open science using participatory design. As a case study, the creation of an ontology for qualitative coding schemas is presented. Coding schemas are an important result of qualitative research, and their reuse holds great potential for open science by making qualitative research more transparent and enhancing the sharing of coding schemas and the teaching of qualitative methods. The participatory design process consisted of three parts: a requirement analysis using interviews and an observation, a design phase accompanied by interviews, and an evaluation phase based on user tests as well as interviews. Findings The research showed several positive outcomes of participatory design: higher commitment of users, mutual learning, high-quality feedback and better quality of the ontology. However, there are two obstacles in this approach: first, contradictory answers by the interviewees, which need to be balanced; second, this approach takes more time due to interview planning and analysis. Practical implications The long-term implication of the paper is to decentralize the design of open science infrastructures and to involve affected parties on several levels. Originality/value In ontology design, several methods exist using user-centered design or participatory design with workshops. In this paper, the authors outline the potential of participatory design using mainly interviews to create an ontology for open science. The authors focus on close contact with researchers in order to build the ontology upon the experts' knowledge.
    Date
    20. 1.2015 18:30:22
  19. Dietz, K.: en.wikipedia.org > 6 Mio. Artikel (2020) 0.01
    0.01495319 = product of:
      0.05981276 = sum of:
        0.05981276 = product of:
          0.17943828 = sum of:
            0.17943828 = weight(_text_:3a in 5669) [ClassicSimilarity], result of:
              0.17943828 = score(doc=5669,freq=2.0), product of:
                0.38312992 = queryWeight, product of:
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.045191016 = queryNorm
                0.46834838 = fieldWeight in 5669, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  8.478011 = idf(docFreq=24, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=5669)
          0.33333334 = coord(1/3)
      0.25 = coord(1/4)
    
    Content
    "Die Englischsprachige Wikipedia verfügt jetzt über mehr als 6 Millionen Artikel. An zweiter Stelle kommt die deutschsprachige Wikipedia mit 2.3 Millionen Artikeln, an dritter Stelle steht die französischsprachige Wikipedia mit 2.1 Millionen Artikeln (via Researchbuzz: Firehose <https://rbfirehose.com/2020/01/24/techcrunch-wikipedia-now-has-more-than-6-million-articles-in-english/> und Techcrunch <https://techcrunch.com/2020/01/23/wikipedia-english-six-million-articles/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+Techcrunch+%28TechCrunch%29&guccounter=1&guce_referrer=aHR0cHM6Ly9yYmZpcmVob3NlLmNvbS8yMDIwLzAxLzI0L3RlY2hjcnVuY2gtd2lraXBlZGlhLW5vdy1oYXMtbW9yZS10aGFuLTYtbWlsbGlvbi1hcnRpY2xlcy1pbi1lbmdsaXNoLw&guce_referrer_sig=AQAAAK0zHfjdDZ_spFZBF_z-zDjtL5iWvuKDumFTzm4HvQzkUfE2pLXQzGS6FGB_y-VISdMEsUSvkNsg2U_NWQ4lwWSvOo3jvXo1I3GtgHpP8exukVxYAnn5mJspqX50VHIWFADHhs5AerkRn3hMRtf_R3F1qmEbo8EROZXp328HMC-o>). 250120 via digithek ch = #fineBlog s.a.: Angesichts der Veröffentlichung des 6-millionsten Artikels vergangene Woche in der englischsprachigen Wikipedia hat die Community-Zeitungsseite "Wikipedia Signpost" ein Moratorium bei der Veröffentlichung von Unternehmensartikeln gefordert. Das sei kein Vorwurf gegen die Wikimedia Foundation, aber die derzeitigen Maßnahmen, um die Enzyklopädie gegen missbräuchliches undeklariertes Paid Editing zu schützen, funktionierten ganz klar nicht. *"Da die ehrenamtlichen Autoren derzeit von Werbung in Gestalt von Wikipedia-Artikeln überwältigt werden, und da die WMF nicht in der Lage zu sein scheint, dem irgendetwas entgegenzusetzen, wäre der einzige gangbare Weg für die Autoren, fürs erste die Neuanlage von Artikeln über Unternehmen zu untersagen"*, schreibt der Benutzer Smallbones in seinem Editorial <https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2020-01-27/From_the_editor> zur heutigen Ausgabe."
  20. Suominen, O.; Koskenniemi, I.: Annif Analyzer Shootout : comparing text lemmatization methods for automated subject indexing (2022) 0.01
    0.013345277 = product of:
      0.053381108 = sum of:
        0.053381108 = product of:
          0.106762215 = sum of:
            0.106762215 = weight(_text_:methods in 658) [ClassicSimilarity], result of:
              0.106762215 = score(doc=658,freq=14.0), product of:
                0.18168657 = queryWeight, product of:
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.045191016 = queryNorm
                0.5876176 = fieldWeight in 658, product of:
                  3.7416575 = tf(freq=14.0), with freq of:
                    14.0 = termFreq=14.0
                  4.0204134 = idf(docFreq=2156, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=658)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Abstract
    Automated text classification is an important function for many AI systems relevant to libraries, including automated subject indexing and classification. When implemented using the traditional natural language processing (NLP) paradigm, one key part of the process is the normalization of words using stemming or lemmatization, which reduces the amount of linguistic variation and often improves the quality of classification. In this paper, we compare the output of seven different text lemmatization algorithms as well as two baseline methods. We measure how the choice of method affects the quality of text classification using example corpora in three languages. The experiments have been performed using the open source Annif toolkit for automated subject indexing and classification, but should also generalize to other NLP toolkits and similar text classification tasks. The results show that lemmatization methods in most cases outperform baseline methods in text classification, particularly for Finnish and Swedish text, but not for English, where baseline methods are most effective. The differences between lemmatization methods are quite small. The systematic comparison will help optimize text classification pipelines and inform the further development of the Annif toolkit to incorporate a wider choice of normalization methods. A sketch of the comparison idea follows below.
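    Annif's analyzers are not reproduced here, but the shape of the comparison is easy to sketch: run competing normalizers over the same tokens, then (in the shootout proper) train and score the same classifier on each normalized corpus. The NLTK stemmer and lemmatizer below are stand-ins for the algorithms actually compared, and the corpus download line assumes a fresh environment.

      # Sketch: two normalization methods applied to the same tokens.
      import nltk
      nltk.download("wordnet", quiet=True)  # needed by the lemmatizer
      from nltk.stem import SnowballStemmer, WordNetLemmatizer

      stem = SnowballStemmer("english").stem
      lemma = WordNetLemmatizer().lemmatize

      tokens = "libraries indexing classified studies better".split()
      for name, normalize in (("stemmer", stem), ("lemmatizer", lemma)):
          print(name, "->", [normalize(t) for t in tokens])
      # Stemming truncates ("librari", "studi"); lemmatization returns
      # dictionary forms ("library", "study") but leaves verbs inflected
      # unless a part-of-speech tag is supplied.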

Languages

  • e 173
  • d 29
  • sp 1

Types

  • a 194
  • el 27
  • m 4
  • p 2
  • x 1