Search (5 results, page 1 of 1)

  • author_ss:"Zhu, Y."
  1. Oh, H.; Nam, S.; Zhu, Y.: Structured abstract summarization of scientific articles : summarization using full-text section information (2023) 0.02
    0.01597124 = product of:
      0.03194248 = sum of:
        0.014865918 = weight(_text_:information in 889) [ClassicSimilarity], result of:
          0.014865918 = score(doc=889,freq=6.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.16796975 = fieldWeight in 889, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=889)
        0.01707656 = product of:
          0.03415312 = sum of:
            0.03415312 = weight(_text_:22 in 889) [ClassicSimilarity], result of:
              0.03415312 = score(doc=889,freq=2.0), product of:
                0.17654699 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.050415643 = queryNorm
                0.19345059 = fieldWeight in 889, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=889)
          0.5 = coord(1/2)
      0.5 = coord(2/4)
    
    Abstract
    The automatic summarization of scientific articles differs from other text genres because of the structured format and longer text length. Previous approaches have focused on tackling the lengthy nature of scientific articles, aiming to improve the computational efficiency of summarizing long text using a flat, unstructured abstract. However, the structured format of scientific articles and characteristics of each section have not been fully explored, despite their importance. The lack of a sufficient investigation and discussion of various characteristics for each section and their influence on summarization results has hindered the practical use of automatic summarization for scientific articles. To provide a balanced abstract proportionally emphasizing each section of a scientific article, the community introduced the structured abstract, an abstract with distinct, labeled sections. Using this information, in this study, we aim to understand tasks ranging from data preparation to model evaluation from diverse viewpoints. Specifically, we provide a preprocessed large-scale dataset and propose a summarization method applying the introduction, methods, results, and discussion (IMRaD) format reflecting the characteristics of each section. We also discuss the objective benchmarks and perspectives of state-of-the-art algorithms and present the challenges and research directions in this area.
    Date
    22. 1.2023 18:57:12
    Source
    Journal of the Association for Information Science and Technology. 74(2023) no.2, pp. 234-248
  2. Zhu, Y.; Yan, E.; Song, I.-Y.: The use of a graph-based system to improve bibliographic information retrieval : system design, implementation, and evaluation (2017) 0.01
    0.006812419 = product of:
      0.027249675 = sum of:
        0.027249675 = weight(_text_:information in 3356) [ClassicSimilarity], result of:
          0.027249675 = score(doc=3356,freq=14.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.3078936 = fieldWeight in 3356, product of:
              3.7416575 = tf(freq=14.0), with freq of:
                14.0 = termFreq=14.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=3356)
      0.25 = coord(1/4)
    
    Abstract
    In this article, we propose a graph-based interactive bibliographic information retrieval system, GIBIR. GIBIR provides an effective way to retrieve bibliographic information. The system represents bibliographic information as networks and provides a form-based query interface. Users can develop their queries interactively by referencing the system-generated graph queries. Complex queries such as "papers on information retrieval, which were cited by John's papers that had been presented in SIGIR" can be effectively answered by the system. We evaluate the proposed system by developing another bibliographic information retrieval system, based on a relational database, with the same interface and functions. Experimental results show that the proposed system executes the same queries much faster than the relational database-based system; on average, it reduced the execution time by 72% (for 3-node queries), 89% (for 4-node queries), and 99% (for 5-node queries).
    Source
    Journal of the Association for Information Science and Technology. 68(2017) no.2, pp. 480-490
  3. Zhu, Y.; Quan, L.; Chen, P.-Y.; Kim, M.C.; Che, C.: Predicting coauthorship using bibliographic network embedding (2023) 0.00
    0.0030344925 = product of:
      0.01213797 = sum of:
        0.01213797 = weight(_text_:information in 917) [ClassicSimilarity], result of:
          0.01213797 = score(doc=917,freq=4.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.13714671 = fieldWeight in 917, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=917)
      0.25 = coord(1/4)
    
    Abstract
    Coauthorship prediction applies predictive analytics to bibliographic data to predict authors who are highly likely to be coauthors. In this study, we propose an approach for coauthorship prediction based on bibliographic network embedding through a graph-based bibliographic data model that can be used to model common bibliographic data, including papers, terms, sources, authors, departments, research interests, universities, and countries. A real-world dataset released by AMiner that includes more than 2 million papers, 8 million citations, and 1.7 million authors was integrated into a large bibliographic network using the proposed bibliographic data model. Translation-based methods were applied to the entities and relationships to generate their low-dimensional embeddings while preserving their connectivity information in the original bibliographic network. We applied machine learning algorithms to embeddings that represent the coauthorship relationships of the two authors and achieved high prediction results. The reference model, which combines a network embedding size of 100, the most basic translation-based method, and a gradient boosting method, achieved an F1 score of 0.9, and even higher scores are obtainable with different embedding sizes and more advanced embedding methods. Thus, the strengths of the proposed approach lie in its customizable components under a unified framework.
    Source
    Journal of the Association for Information Science and Technology. 74(2023) no.4, pp. 388-401
  4. Wu, C.; Yan, E.; Zhu, Y.; Li, K.: Gender imbalance in the productivity of funded projects : a study of the outputs of National Institutes of Health R01 grants (2021) 0.00
    0.0025748524 = product of:
      0.01029941 = sum of:
        0.01029941 = weight(_text_:information in 391) [ClassicSimilarity], result of:
          0.01029941 = score(doc=391,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.116372846 = fieldWeight in 391, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.046875 = fieldNorm(doc=391)
      0.25 = coord(1/4)
    
    Source
    Journal of the Association for Information Science and Technology. 72(2021) no.11, pp. 1386-1399
  5. Yan, E.; Zhu, Y.: Adding the dimension of knowledge trading to source impact assessment : approaches, indicators, and implications (2017) 0.00
    0.0021457102 = product of:
      0.008582841 = sum of:
        0.008582841 = weight(_text_:information in 3633) [ClassicSimilarity], result of:
          0.008582841 = score(doc=3633,freq=2.0), product of:
            0.08850355 = queryWeight, product of:
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.050415643 = queryNorm
            0.09697737 = fieldWeight in 3633, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              1.7554779 = idf(docFreq=20772, maxDocs=44218)
              0.0390625 = fieldNorm(doc=3633)
      0.25 = coord(1/4)
    
    Source
    Journal of the Association for Information Science and Technology. 68(2017) no.5, pp. 1090-1104
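
The score explanations listed under each result can be reproduced by hand. The following is a minimal Python sketch, not the search engine's own code. It assumes tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which is the form of the ClassicSimilarity formulas that agrees with the values printed above. The function names (`idf`, `term_score`) and constants are illustrative only.

```python
import math

# Collection-wide values copied from the explain output above.
MAX_DOCS = 44218
QUERY_NORM = 0.050415643


def idf(doc_freq, max_docs=MAX_DOCS):
    # Assumed form: 1 + ln(maxDocs / (docFreq + 1)); it matches the listed idf values.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq, doc_freq, field_norm):
    # score = queryWeight * fieldWeight, as in each "product of" branch.
    tf = math.sqrt(freq)                         # tf(freq)
    term_idf = idf(doc_freq)
    query_weight = term_idf * QUERY_NORM         # queryWeight
    field_weight = tf * term_idf * field_norm    # fieldWeight
    return query_weight * field_weight


# Result 1 (doc 889): two of four query clauses matched.
information = term_score(freq=6.0, doc_freq=20772, field_norm=0.0390625)
# The "22" clause sits in a nested query where 1 of 2 sub-clauses matched.
twenty_two = term_score(freq=2.0, doc_freq=3622, field_norm=0.0390625) * (1 / 2)
score_889 = (information + twenty_two) * (2 / 4)   # coord(2/4)

# Result 2 (doc 3356): only the "information" clause matched, coord(1/4).
score_3356 = term_score(freq=14.0, doc_freq=20772, field_norm=0.046875) * (1 / 4)

print(f"doc 889:  {score_889:.8f}")   # explain output lists 0.01597124
print(f"doc 3356: {score_3356:.9f}")  # explain output lists 0.006812419
```

The engine works in single-precision floats, so the last digits may differ slightly from the listing. The sketch also shows why result 1 ranks first despite a lower "information" frequency: it is the only document that matches the second clause ("22"), and the coord factor rewards matching more clauses.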