This database contains more than 40,000 documents on topics in descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 28 April 2022)
1. Smalheiser, N.R.: Literature-based discovery : Beyond the ABCs.
In: Journal of the American Society for Information Science and Technology. 63(2012) no.2, pp. 218-224.
(Advances in information science)
Abstract: Literature-based discovery (LBD) refers to a particular type of text mining that seeks to identify nontrivial assertions that are implicit, and not explicitly stated, and that are detected by juxtaposing (generally a large body of) documents. In this review, I will provide a brief overview of LBD, both past and present, and will propose some new directions for the next decade. The prevalent ABC model is not "wrong"; however, it is only one of several different types of models that can contribute to the development of the next generation of LBD tools. Perhaps the most urgent need is to develop a series of objective literature-based interestingness measures, which can customize the output of LBD systems for different types of scientific investigations.
2. Swanson, D.R. ; Smalheiser, N.R. ; Torvik, V.I.: Ranking indirect connections in literature-based discovery : the role of Medical Subject Headings.
In: Journal of the American Society for Information Science and Technology. 57(2006) no.11, pp. 1427-1439.
Abstract: Arrowsmith, a computer-assisted process for literature-based discovery, takes as input two disjoint sets of records (A, C) from the Medline database. It produces a list of title words and phrases, B, that are common to A and C, and displays the title context in which each B-term occurs within A and within C. Subject experts then can try to find A-B and B-C title-pairs that together may suggest novel and plausible indirect A-C relationships (via B-terms) that are of particular interest in the absence of any known direct A-C relationship. The list of B-terms typically is so large that it is difficult to find the relatively few that contribute to scientifically interesting connections. The purpose of the present article is to propose and test several techniques for improving the quality of the B-list. These techniques exploit the Medical Subject Headings (MeSH) that are assigned to each input record. A MeSH-based concept of literature cohesiveness is defined and plays a key role. The proposed techniques are tested on a published example of indirect connections between migraine and magnesium deficiency. The tests demonstrate how the earlier results can be replicated with a more efficient and more systematic computer-aided process.
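The B-term step described in this abstract can be illustrated with a minimal sketch. It is not the Arrowsmith implementation: it handles single title words only (no phrases), the stopword list and the function names `title_terms` and `b_terms` are hypothetical, and the sample titles are made up for illustration rather than taken from Medline.

```python
import re

# Hypothetical stopword list; the actual Arrowsmith stoplist is not given in the abstract.
STOPWORDS = {"a", "an", "and", "in", "of", "on", "the", "with", "for", "to"}


def title_terms(title):
    """Return the set of lowercase non-stopword words in one title."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return {w for w in words if w not in STOPWORDS}


def b_terms(titles_a, titles_c):
    """Map each term shared by both title sets to its title contexts in A and in C."""
    index_a, index_c = {}, {}
    for index, titles in ((index_a, titles_a), (index_c, titles_c)):
        for title in titles:
            for term in title_terms(title):
                index.setdefault(term, []).append(title)
    shared = sorted(index_a.keys() & index_c.keys())
    return {term: (index_a[term], index_c[term]) for term in shared}


# Made-up titles for illustration only, not Medline records.
titles_a = ["Spreading depression in migraine", "Platelet aggregation and migraine"]
titles_c = ["Magnesium inhibits spreading depression", "Dietary magnesium intake"]

result = b_terms(titles_a, titles_c)
for term, (contexts_a, contexts_c) in result.items():
    print(term, "| A:", contexts_a, "| C:", contexts_c)
```

Even in this toy form the output pairs every shared term with its A and C title contexts, which is the raw list that the article's MeSH-based techniques are meant to filter and rank.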
3. Torvik, V.I. ; Weeber, M. ; Swanson, D.R. ; Smalheiser, N.R.: A probabilistic similarity metric for Medline records : a model for author name disambiguation.
In: Journal of the American Society for Information Science and Technology. 56(2005) no.2, pp. 140-158.
Abstract: We present a model for estimating the probability that a pair of author names (sharing last name and first initial), appearing on two different Medline articles, refer to the same individual. The model uses a simple yet powerful similarity profile between a pair of articles, based on title, journal name, coauthor names, medical subject headings (MeSH), language, affiliation, and name attributes (prevalence in the literature, middle initial, and suffix). The similarity profile distribution is computed from reference sets consisting of pairs of articles containing almost exclusively author matches versus nonmatches, generated in an unbiased manner. Although the match set is generated automatically and might contain a small proportion of nonmatches, the model is quite robust against contamination with nonmatches. We have created a free, public service ("Author-ity": http://arrowsmith.psych.uic.edu) that takes as input an author's name given on a specific article, and gives as output a list of all articles with that (last name, first initial) ranked by decreasing similarity, with match probability indicated.
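The abstract does not say how a similarity profile is turned into a match probability. One common way to do this, assumed here purely for illustration, is to estimate a likelihood ratio from the match and nonmatch reference sets and apply Bayes' rule. The function names, the two-feature profile, the additive smoothing, and the prior are all hypothetical, and the reference data is made up.

```python
from collections import Counter


def likelihood_ratio(profile, match_profiles, nonmatch_profiles, smoothing=1.0):
    """Estimate P(profile | match) / P(profile | nonmatch) from two reference sets.

    Additive smoothing (an assumption of this sketch) keeps the ratio finite
    when a profile is absent from one of the reference sets.
    """
    match_counts = Counter(match_profiles)
    nonmatch_counts = Counter(nonmatch_profiles)
    # Number of distinct profiles considered, including the queried one.
    k = len(set(match_profiles) | set(nonmatch_profiles) | {profile})
    p_match = (match_counts[profile] + smoothing) / (len(match_profiles) + smoothing * k)
    p_nonmatch = (nonmatch_counts[profile] + smoothing) / (len(nonmatch_profiles) + smoothing * k)
    return p_match / p_nonmatch


def match_probability(ratio, prior):
    """Posterior probability of a match, using Bayes' rule in odds form."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    posterior_odds = ratio * prior / (1.0 - prior)
    return posterior_odds / (1.0 + posterior_odds)


# Made-up reference sets; each profile is (same_journal, shared_coauthor).
match_profiles = [(1, 1), (1, 1), (1, 0), (0, 1)]
nonmatch_profiles = [(0, 0), (0, 0), (0, 0), (1, 0)]

ratio = likelihood_ratio((1, 1), match_profiles, nonmatch_profiles)
probability = match_probability(ratio, prior=0.5)
print(ratio, probability)
```

In a real setting the prior would depend on how common the name is, which is one plausible reading of the "prevalence in the literature" attribute mentioned above, and candidate articles would then be ranked by the resulting probability.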
4. Swanson, D.R. ; Smalheiser, N.R. ; Bookstein, A.: Information discovery from complementary literatures : categorizing viruses as potential weapons.
In: Journal of the American Society for Information Science and Technology. 52(2001) no.10, pp. 797-812.
Abstract: Using novel informatics techniques to process the output of Medline searches, we have generated a list of viruses that may have the potential for development as weapons. Our findings are intended as a guide to the virus literature to support further studies that might then lead to appropriate defense and public health measures. This article stresses methods that are more generally relevant to information science. Initial Medline searches identified two kinds of virus literatures: the first concerning the genetic aspects of virulence, and the second concerning the transmission of viral diseases. Both literatures taken together are of central importance in identifying research relevant to the development of biological weapons. Yet, the two literatures had very few articles in common. We downloaded the Medline records for each of the two literatures and used a computer to extract all virus terms common to both. The fact that the resulting virus list includes most of an earlier independently published list of viruses considered by military experts to have the highest threat as potential biological weapons served as a test of the method; the test outcome showed a high degree of statistical significance, thus supporting an inference that the new viruses on the list share certain important characteristics with viruses of known biological