Jiang, Y.; Meng, R.; Huang, Y.; Lu, W.; Liu, J.: Generating keyphrases for readers : a controllable keyphrase generation framework (2023)
0.02
0.018256467 = product of:
  0.06389763 = sum of:
    0.032137483 = weight(_text_:wide in 1012) [ClassicSimilarity], result of:
      0.032137483 = score(doc=1012,freq=2.0), product of:
        0.1312982 = queryWeight, product of:
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.029633347 = queryNorm
        0.24476713 = fieldWeight in 1012, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          4.4307585 = idf(docFreq=1430, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1012)
    0.010089659 = weight(_text_:information in 1012) [ClassicSimilarity], result of:
      0.010089659 = score(doc=1012,freq=8.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.19395474 = fieldWeight in 1012, product of:
          2.828427 = tf(freq=8.0), with freq of:
            8.0 = termFreq=8.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1012)
    0.014978974 = weight(_text_:retrieval in 1012) [ClassicSimilarity], result of:
      0.014978974 = score(doc=1012,freq=2.0), product of:
        0.08963835 = queryWeight, product of:
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.029633347 = queryNorm
        0.16710453 = fieldWeight in 1012, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.024915 = idf(docFreq=5836, maxDocs=44218)
          0.0390625 = fieldNorm(doc=1012)
    0.0066915164 = product of:
      0.020074548 = sum of:
        0.020074548 = weight(_text_:22 in 1012) [ClassicSimilarity], result of:
          0.020074548 = score(doc=1012,freq=2.0), product of:
            0.103770934 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029633347 = queryNorm
            0.19345059 = fieldWeight in 1012, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=1012)
      0.33333334 = coord(1/3)
  0.2857143 = coord(4/14)
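The score breakdown above can be re-derived by hand. The following is a minimal sketch, assuming the classic TF-IDF definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)), which reproduce the figures listed for doc 1012. The function and variable names are illustrative and do not belong to any search library; `queryNorm` and `fieldNorm` are copied from the listing rather than recomputed.

```python
import math

# Constants copied from the explain output for doc 1012.
MAX_DOCS = 44218
QUERY_NORM = 0.029633347
FIELD_NORM = 0.0390625


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency (assumed classic form)."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def term_score(freq: float, doc_freq: int) -> float:
    """score = queryWeight * fieldWeight for a single term."""
    term_idf = idf(doc_freq, MAX_DOCS)
    query_weight = term_idf * QUERY_NORM
    field_weight = tf(freq) * term_idf * FIELD_NORM
    return query_weight * field_weight


# Per-term scores, using freq and docFreq from the listing.
wide = term_score(2.0, 1430)
information = term_score(8.0, 20772)
retrieval = term_score(2.0, 5836)

# The "22" clause sits in a nested query where 1 of 3 clauses matched,
# so its score is scaled by the inner coord(1/3).
date_22 = term_score(2.0, 3622) * (1 / 3)

# The outer coord(4/14) scales the sum of all matching clauses.
total = (wide + information + retrieval + date_22) * (4 / 14)

# The listing reports 0.018256467 for this document.
print(f"{total:.9f}")
```

Small differences in the last digits are to be expected, since the listed values appear to be single-precision floats while Python computes in double precision.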
- Abstract
- With the wide application of keyphrases in many Information Retrieval (IR) and Natural Language Processing (NLP) tasks, automatic keyphrase prediction has been emerging. However, these statistically important phrases are contributing increasingly less to the related tasks because the end-to-end learning mechanism enables models to learn the important semantic information of the text directly. Similarly, keyphrases are of little help for readers to quickly grasp the paper's main idea because the relationship between the keyphrase and the paper is not explicit to readers. Therefore, we propose to generate keyphrases with specific functions for readers to bridge the semantic gap between them and the information producers, and verify the effectiveness of the keyphrase function for assisting users' comprehension with a user experiment. A controllable keyphrase generation framework (the CKPG) that uses the keyphrase function as a control code to generate categorized keyphrases is proposed and implemented based on Transformer, BART, and T5, respectively. For the Computer Science domain, the Macro-avgs of the three reported metrics on the Paper with Code dataset are up to 0.680, 0.535, and 0.558, respectively. Our experimental results indicate the effectiveness of the CKPG models.
- Date
- 22. 6.2023 14:55:20
- Source
- Journal of the Association for Information Science and Technology. 74(2023) no.7, pp.759-774
Oh, H.; Nam, S.; Zhu, Y.: Structured abstract summarization of scientific articles : summarization using full-text section information (2023)
0.00
0.0022042028 = product of:
  0.0154294185 = sum of:
    0.008737902 = weight(_text_:information in 889) [ClassicSimilarity], result of:
      0.008737902 = score(doc=889,freq=6.0), product of:
        0.052020688 = queryWeight, product of:
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.029633347 = queryNorm
        0.16796975 = fieldWeight in 889, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.7554779 = idf(docFreq=20772, maxDocs=44218)
          0.0390625 = fieldNorm(doc=889)
    0.0066915164 = product of:
      0.020074548 = sum of:
        0.020074548 = weight(_text_:22 in 889) [ClassicSimilarity], result of:
          0.020074548 = score(doc=889,freq=2.0), product of:
            0.103770934 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.029633347 = queryNorm
            0.19345059 = fieldWeight in 889, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=889)
      0.33333334 = coord(1/3)
  0.14285715 = coord(2/14)
- Abstract
- The automatic summarization of scientific articles differs from other text genres because of the structured format and longer text length. Previous approaches have focused on tackling the lengthy nature of scientific articles, aiming to improve the computational efficiency of summarizing long text using a flat, unstructured abstract. However, the structured format of scientific articles and characteristics of each section have not been fully explored, despite their importance. The lack of a sufficient investigation and discussion of various characteristics for each section and their influence on summarization results has hindered the practical use of automatic summarization for scientific articles. To provide a balanced abstract proportionally emphasizing each section of a scientific article, the community introduced the structured abstract, an abstract with distinct, labeled sections. Using this information, in this study, we aim to understand tasks ranging from data preparation to model evaluation from diverse viewpoints. Specifically, we provide a preprocessed large-scale dataset and propose a summarization method applying the introduction, methods, results, and discussion (IMRaD) format reflecting the characteristics of each section. We also discuss the objective benchmarks and perspectives of state-of-the-art algorithms and present the challenges and research directions in this area.
- Date
- 22. 1.2023 18:57:12
- Source
- Journal of the Association for Information Science and Technology. 74(2023) no.2, pp.234-248