Search (3 results, page 1 of 1)

  • author_ss:"Zhu, Y."
  • year_i:[2020 TO 2030}
  1. Oh, H.; Nam, S.; Zhu, Y.: Structured abstract summarization of scientific articles : summarization using full-text section information (2023) 0.02
    0.023842867 = product of:
      0.0357643 = sum of:
        0.01886051 = weight(_text_:on in 889) [ClassicSimilarity], result of:
          0.01886051 = score(doc=889,freq=4.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.1718293 = fieldWeight in 889, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=889)
        0.01690379 = product of:
          0.03380758 = sum of:
            0.03380758 = weight(_text_:22 in 889) [ClassicSimilarity], result of:
              0.03380758 = score(doc=889,freq=2.0), product of:
                0.1747608 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04990557 = queryNorm
                0.19345059 = fieldWeight in 889, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=889)
          0.5 = coord(1/2)
      0.6666667 = coord(2/3)
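    The arithmetic in the explain tree above can be re-derived with a short sketch. It assumes the standard Lucene ClassicSimilarity formulas, tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The queryNorm and fieldNorm values are copied from the listing rather than recomputed, because they depend on the full query and on index-time length norms. The function names are hypothetical and are not part of any Lucene or Solr API.

```python
import math

# Values copied verbatim from the explain output above.
QUERY_NORM = 0.04990557
MAX_DOCS = 44218


def classic_idf(doc_freq, max_docs):
    """ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm,
               max_docs=MAX_DOCS, query_norm=QUERY_NORM):
    """Score of one term clause: queryWeight * fieldWeight."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                     # idf * queryNorm
    field_weight = math.sqrt(freq) * idf * field_norm   # tf * idf * fieldNorm
    return query_weight * field_weight


def score_doc_889():
    """Rebuild the nested product/sum tree shown for doc 889."""
    on = term_score(freq=4.0, doc_freq=13325, field_norm=0.0390625)
    t22 = term_score(freq=2.0, doc_freq=3622, field_norm=0.0390625)
    inner = t22 * (1 / 2)            # coord(1/2) on the nested clause
    return (on + inner) * (2 / 3)    # coord(2/3) on the outer query


def score_doc_391():
    """Single matching clause, scaled by coord(1/3)."""
    on = term_score(freq=4.0, doc_freq=13325, field_norm=0.046875)
    return on * (1 / 3)


if __name__ == "__main__":
    print(f"doc 889: {score_doc_889():.9f}")
    print(f"doc 391: {score_doc_391():.9f}")
```

    Lucene computes these values in single precision, so the results should agree with the listed scores only to about six decimal places.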
    
    Abstract
    The automatic summarization of scientific articles differs from that of other text genres because of their structured format and greater length. Previous approaches have focused on tackling the lengthy nature of scientific articles, aiming to improve the computational efficiency of summarizing long text using a flat, unstructured abstract. However, the structured format of scientific articles and the characteristics of each section have not been fully explored, despite their importance. The lack of sufficient investigation and discussion of the characteristics of each section and their influence on summarization results has hindered the practical use of automatic summarization for scientific articles. To provide a balanced abstract proportionally emphasizing each section of a scientific article, the community introduced the structured abstract, an abstract with distinct, labeled sections. Using this information, in this study, we aim to understand tasks ranging from data preparation to model evaluation from diverse viewpoints. Specifically, we provide a preprocessed large-scale dataset and propose a summarization method applying the introduction, methods, results, and discussion (IMRaD) format reflecting the characteristics of each section. We also discuss the objective benchmarks and perspectives of state-of-the-art algorithms and present the challenges and research directions in this area.
    Date
    22. 1.2023 18:57:12
  2. Wu, C.; Yan, E.; Zhu, Y.; Li, K.: Gender imbalance in the productivity of funded projects : a study of the outputs of National Institutes of Health R01 grants (2021) 0.01
    0.0075442037 = product of:
      0.02263261 = sum of:
        0.02263261 = weight(_text_:on in 391) [ClassicSimilarity], result of:
          0.02263261 = score(doc=391,freq=4.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.20619515 = fieldWeight in 391, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.046875 = fieldNorm(doc=391)
      0.33333334 = coord(1/3)
    
    Abstract
    This study examines the relationship between a team's gender composition and the outputs of funded projects using a large data set of National Institutes of Health (NIH) R01 grants and their associated publications between 1990 and 2017. It finds that while the presence of women investigators in NIH grants is generally low, a higher presence of women investigators is on average related to a slightly lower number of publications. The study finds empirically that women investigators elect to work in fields in which fewer publications per million dollars of funding are the norm. In fields where women investigators are relatively well represented, they are as productive as men. The overall lower productivity of women investigators may be attributed to the low representation of women in high-productivity fields dominated by men investigators. The findings shed light on possible reasons for gender disparity in grant productivity.
  3. Zhu, Y.; Quan, L.; Chen, P.-Y.; Kim, M.C.; Che, C.: Predicting coauthorship using bibliographic network embedding (2023) 0.00
    0.0044454644 = product of:
      0.013336393 = sum of:
        0.013336393 = weight(_text_:on in 917) [ClassicSimilarity], result of:
          0.013336393 = score(doc=917,freq=2.0), product of:
            0.109763056 = queryWeight, product of:
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.04990557 = queryNorm
            0.121501654 = fieldWeight in 917, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              2.199415 = idf(docFreq=13325, maxDocs=44218)
              0.0390625 = fieldNorm(doc=917)
      0.33333334 = coord(1/3)
    
    Abstract
    Coauthorship prediction applies predictive analytics to bibliographic data to predict authors who are highly likely to be coauthors. In this study, we propose an approach for coauthorship prediction based on bibliographic network embedding through a graph-based bibliographic data model that can be used to model common bibliographic data, including papers, terms, sources, authors, departments, research interests, universities, and countries. A real-world dataset released by AMiner that includes more than 2 million papers, 8 million citations, and 1.7 million authors was integrated into a large bibliographic network using the proposed bibliographic data model. Translation-based methods were applied to the entities and relationships to generate their low-dimensional embeddings while preserving their connectivity information in the original bibliographic network. We applied machine learning algorithms to embeddings that represent the coauthorship relationship between two authors and achieved high prediction results. The reference model, which combines a network embedding size of 100, the most basic translation-based method, and a gradient boosting method, achieved an F1 score of 0.9, and even higher scores are obtainable with different embedding sizes and more advanced embedding methods. Thus, the strengths of the proposed approach lie in its customizable components under a unified framework.