Literature on information indexing
This database contains more than 40,000 documents on topics in the areas of descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft
(As of 28 April 2022)
Search results
Results 1–3 of 3
-
1. Fegley, B.D. ; Torvik, V.I.: On the role of poetic versus nonpoetic features in "kindred" and diachronic poetry attribution.
In: Journal of the American Society for Information Science and Technology. 63(2012) no.11, pp. 2165-2181.
Abstract: Author attribution studies have demonstrated remarkable success in applying orthographic and lexicographic features of text in a variety of discrimination problems. What might poetic features, such as syllabic stress and mood, contribute? We address this question in the context of two different attribution problems: (a) kindred: differentiate Langston Hughes' early poems from those of kindred poets and (b) diachronic: differentiate Hughes' early from his later poems. Using a diverse set of 535 generic text features, each categorized as poetic or nonpoetic, correlation-based greedy forward search ranked the features and a support vector machine classified the poems. A small subset of features (~10) achieved cross-validated precision and recall as high as 87%. Poetic features (rhyme patterns particularly) were nearly as effective as nonpoetic in kindred discrimination, but less effective diachronically. In other words, Hughes used both poetic and nonpoetic features in distinctive ways and his use of nonpoetic features evolved systematically while he continued to experiment with poetic features. These findings affirm qualitative studies attesting to structural elements from Black oral tradition and Black folk music (blues) and to the internal consistency of Hughes' early poetry.
Subject area: Computational linguistics
-
2. Swanson, D.R. ; Smalheiser, N.R. ; Torvik, V.I.: Ranking indirect connections in literature-based discovery : the role of Medical Subject Headings.
In: Journal of the American Society for Information Science and Technology. 57(2006) no.11, pp. 1427-1439.
Abstract: Arrowsmith, a computer-assisted process for literature-based discovery, takes as input two disjoint sets of records (A, C) from the Medline database. It produces a list of title words and phrases, B, that are common to A and C, and displays the title context in which each B-term occurs within A and within C. Subject experts then can try to find A-B and B-C title-pairs that together may suggest novel and plausible indirect A-C relationships (via B-terms) that are of particular interest in the absence of any known direct A-C relationship. The list of B-terms typically is so large that it is difficult to find the relatively few that contribute to scientifically interesting connections. The purpose of the present article is to propose and test several techniques for improving the quality of the B-list. These techniques exploit the Medical Subject Headings (MeSH) that are assigned to each input record. A MeSH-based concept of literature cohesiveness is defined and plays a key role. The proposed techniques are tested on a published example of indirect connections between migraine and magnesium deficiency. The tests demonstrate how the earlier results can be replicated with a more efficient and more systematic computer-aided process.
Subject area: Information services
Scientific discipline: Medicine
Object: MeSH
-
3. Torvik, V.I. ; Weeber, M. ; Swanson, D.R. ; Smalheiser, N.R.: A probabilistic similarity metric for Medline records : a model for author name disambiguation.
In: Journal of the American Society for Information Science and Technology. 56(2005) no.2, pp. 140-158.
Abstract: We present a model for estimating the probability that a pair of author names (sharing last name and first initial), appearing in two different Medline articles, refer to the same individual. The model uses a simple yet powerful similarity profile between a pair of articles, based on title, journal name, coauthor names, medical subject headings (MeSH), language, affiliation, and name attributes (prevalence in the literature, middle initial, and suffix). The similarity profile distribution is computed from reference sets consisting of pairs of articles containing almost exclusively author matches versus nonmatches, generated in an unbiased manner. Although the match set is generated automatically and might contain a small proportion of nonmatches, the model is quite robust against contamination with nonmatches. We have created a free, public service ("Author-ity": http://arrowsmith.psych.uic.edu) that takes as input an author's name given on a specific article, and gives as output a list of all articles with that (last name, first initial) ranked by decreasing similarity, with match probability indicated.
Subject area: Informetrics
Scientific discipline: Medicine
Object: Medline