McKeown, K.; Daume III, H.; Chaturvedi, S.; Paparrizos, J.; Thadani, K.; Barrio, P.; Biran, O.; Bothe, S.; Collins, M.; Fleischmann, K.R.; Gravano, L.; Jha, R.; King, B.; McInerney, K.; Moon, T.; Neelakantan, A.; O'Seaghdha, D.; Radev, D.; Templeton, C.; Teufel, S.: Predicting the impact of scientific concepts using full-text features (2016)
- Abstract
- New scientific concepts, interpreted broadly, are continuously introduced in the literature, but relatively few concepts have a long-term impact on society. The identification of such concepts is a challenging prediction task that would help multiple parties, including researchers and the general public, focus their attention within the vast scientific literature. In this paper we present a system that predicts the future impact of a scientific concept, represented as a technical term, based on the information available from recently published research articles. We analyze the usefulness of rich features derived from the full text of the articles through a variety of approaches, including rhetorical sentence analysis, information extraction, and time-series analysis. The results from two large-scale experiments with 3.8 million full-text articles and 48 million metadata records support the conclusion that full-text features are significantly more useful for prediction than metadata-only features and that the most accurate predictions result from combining the metadata and full-text features. Surprisingly, these results hold even when the metadata features are available for a much larger number of documents than are available for the full-text features.
- Type
- a