Literatur zur Informationserschließung (Literature on Information Organization)
This database contains more than 40,000 documents on topics in descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg
(As of 28 April 2022)
Search results
Hits 1–20 of 75
-
1. Tavakolizadeh-Ravari, M.: Analysis of the long term dynamics in thesaurus developments and its consequences.
In: http://edoc.hu-berlin.de/docviews/abstract.php?id=28144. Berlin : Humboldt-Universität zu Berlin / Institut für Bibliotheks- und Informationswissenschaft, 2017. 128 pp.
Abstract: This thesis analyzes the dynamic development and use of thesaurus terms. It also examines the factors that influence the number of index terms per document or journal. The objects of study were MeSH and the corresponding database MEDLINE. The main findings are: 1. The MeSH thesaurus developed logarithmically through three distinct phases. Such a thesaurus should follow the equation "T = 3,076.6 ln(d) - 22,695 + 0.0039d" (T = terms, ln = natural logarithm, d = documents). To construct such a thesaurus, one therefore needs about 1,600 documents on different topics within the thesaurus's domain. The dynamic development of thesauri such as MeSH requires the introduction of one new term for every 256 newly indexed documents. 2. The distribution of thesaurus terms yielded three categories: heavily, normally, and rarely used headings. The last group is in a test phase, while in the first and second categories newly added descriptors lead to thesaurus growth. 3. There is a logarithmic relationship between the number of index terms per article and its page count for articles of between one and twenty-one pages. 4. Journal articles that appear in MEDLINE with abstracts receive almost two more descriptors. 5. The findability of non-English-language documents in MEDLINE is lower than that of English-language documents. 6. Articles in journals with an impact factor of 0 to 15 do not receive more index terms than those in the other journals covered by MEDLINE. 7. In an indexing system, different journals carry more or less weight in their findability. The distribution of index terms per page showed that MEDLINE has three categories of publications. In addition, there are a few strongly favoured journals.
Content: Cf.: https://www.ibi.hu-berlin.de/de/archiv/forschung/prom_habil/dissertationen/Tavakolizadeh-Ravari2007. Cf. also: http://mravari.blogfa.com/post-20.aspxgl.
Note: Dissertation, Humboldt-Universität zu Berlin - Institut für Bibliotheks- und Informationswissenschaft.
Subject area: Design and application of the thesaurus principle ; Informetrics ; Automatic indexing
Object: MEDLINE ; MeSH
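The growth equation quoted in the abstract of this entry can be checked numerically. The following is a minimal Python sketch, assuming the coefficients are read as 3076.6, 22695 and 0.0039 (the source prints them in German number notation). The function names are illustrative and are not taken from the dissertation.

```python
import math


def thesaurus_terms(d: float) -> float:
    """Predicted number of thesaurus terms T for d indexed documents.

    Assumed reading of the abstract: T = 3076.6 * ln(d) - 22695 + 0.0039 * d.
    """
    return 3076.6 * math.log(d) - 22695 + 0.0039 * d


def documents_at_zero_terms(lo: float = 1.0, hi: float = 10_000.0) -> float:
    """Bisection for the d where T(d) crosses zero (T is increasing in d)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if thesaurus_terms(mid) < 0:
            lo = mid  # still below zero terms: the crossing lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


start = documents_at_zero_terms()
# For large d the logarithmic slope 3076.6 / d vanishes, so growth tends to
# 0.0039 terms per document, i.e. one new term per 1 / 0.0039 documents.
docs_per_new_term = 1 / 0.0039
print(f"T(d) = 0 near d = {start:.0f}; one new term per {docs_per_new_term:.0f} documents")
```

Under this reading, T reaches zero just below 1,600 documents and the linear term corresponds to about one new term per 256 documents, which agrees with both figures stated in the abstract.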
-
2. Rotolo, D. ; Leydesdorff, L.: Matching Medline/PubMed data with Web of Science: A routine in R language.
In: Journal of the Association for Information Science and Technology. 66(2015) no.10, pp.2155-2159.
(Brief communications)
Abstract: We present a novel routine, namely medlineR, based on the R language, that allows the user to match data from Medline/PubMed with records indexed in the ISI Web of Science (WoS) database. The matching allows exploiting the rich and controlled vocabulary of medical subject headings (MeSH) of Medline/PubMed with additional fields of WoS. The integration provides data (e.g., citation data, list of cited references, list of the addresses of authors' host organizations, WoS subject categories) to perform a variety of scientometric analyses. This brief communication describes medlineR, the method on which it relies, and the steps the user should follow to perform the matching across the two databases. To demonstrate the differences from Leydesdorff and Opthof (Journal of the American Society for Information Science and Technology, 64(5), 1076-1080), we conclude this article by testing the routine on the MeSH category "Brugada syndrome."
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23385/abstract.
Subject area: Informetrics
Discipline: Medicine
Object: Medline ; PubMed ; Web of Science
-
3. Zhang, Y.: Searching for specific health-related information in MedlinePlus : behavioral patterns and user experience.
In: Journal of the Association for Information Science and Technology. 65(2014) no.1, pp.53-68.
Abstract: Searches for specific factual health information constitute a significant part of consumer health information requests, but little is known about how users search for such information. This study attempts to fill this gap by observing users' behavior while using MedlinePlus to search for specific health information. Nineteen students participated in the study, and each performed 12 specific tasks. During the search process, they submitted short queries or complete questions, and they examined less than 1 result per search. Participants rarely reformulated queries; when they did, they tended to make a query more specific or more general, or iterate in different ways. Participants also browsed, primarily relying on the alphabetical list and the anatomical classification, to navigate to specific health topics. Participants overall had a positive experience with MedlinePlus, and the experience was significantly correlated with task difficulty and participants' spatial abilities. The results suggest that, to better support specific item search in the health domain, systems could provide a more "natural" interface to encourage users to ask questions; effective conceptual hierarchies could be implemented to help users reformulate queries; and the search results page should be reconceptualized as a place for accessing answers rather than documents. Moreover, multiple schemas should be provided to help users navigate to a health topic. The results also suggest that users' experience with information systems in general and health-related systems in particular should be evaluated in relation to contextual factors, such as task features and individual differences.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.22960/abstract.
Discipline: Medicine
Object: MedlinePlus
-
4. Mirel, B. ; Tonks, J.S. ; Song, J. ; Meng, F. ; Xuan, W. ; Ameziane, R.: Studying PubMed usages in the field for complex problem solving : implications for tool design.
In: Journal of the American Society for Information Science and Technology. 64(2013) no.5, pp.874-892.
Abstract: Many recent studies on MEDLINE-based information seeking have shed light on scientists' behaviors and associated tool innovations that may improve efficiency and effectiveness. Few, if any, studies, however, examine scientists' problem-solving uses of PubMed in actual contexts of work and corresponding needs for better tool support. Addressing this gap, we conducted a field study of novice scientists (14 upper-level undergraduate majors in molecular biology) as they engaged in a problem-solving activity with PubMed in a laboratory setting. Findings reveal many common stages and patterns of information seeking across users as well as variations, especially variations in cognitive search styles. Based on these findings, we suggest tool improvements that both confirm and qualify many results found in other recent studies. Our findings highlight the need to use results from context-rich studies to inform decisions in tool design about when to offer improved features to users.
Discipline: Medicine
Object: PubMed ; MEDLINE
-
5. Lee, D.H. ; Schleyer, T.: Social tagging is no substitute for controlled indexing : a comparison of Medical Subject Headings and CiteULike tags assigned to 231,388 papers.
In: Journal of the American Society for Information Science and Technology. 63(2012) no.9, pp.1747-1757.
Abstract: Social tagging and controlled indexing both facilitate access to information resources. Given the increasing popularity of social tagging and the limitations of controlled indexing (primarily cost and scalability), it is reasonable to investigate to what degree social tagging could substitute for controlled indexing. In this study, we compared CiteULike tags to Medical Subject Headings (MeSH) terms for 231,388 citations indexed in MEDLINE. In addition to descriptive analyses of the data sets, we present a paper-by-paper analysis of tags and MeSH terms: the number of common annotations, Jaccard similarity, and coverage ratio. In the analysis, we apply three increasingly progressive levels of text processing, ranging from normalization to stemming, to reduce the impact of lexical differences. Annotations of our corpus consisted of over 76,968 distinct tags and 21,129 distinct MeSH terms. The top 20 tags/MeSH terms showed little direct overlap. On a paper-by-paper basis, the number of common annotations ranged from 0.29 to 0.5 and the Jaccard similarity from 2.12% to 3.3% using increased levels of text processing. At most, 77,834 citations (33.6%) shared at least one annotation. Our results show that CiteULike tags and MeSH terms are quite distinct lexically, reflecting different viewpoints/processes between social tagging and controlled indexing.
Subject area: Indexing studies ; Social tagging
Discipline: Medicine
Object: MeSH ; CiteULike ; MEDLINE
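The paper-by-paper measures named in the abstract of this entry (number of common annotations and Jaccard similarity) can be illustrated with a short Python sketch. The normalization shown here (lower-casing and treating hyphens as spaces) is only an assumed stand-in for the lightest text-processing level; the authors' actual pipeline, including stemming, is not reproduced, and the sample terms are invented.

```python
def normalize(term: str) -> str:
    """Assumed light normalization: lower-case, hyphens to spaces, collapse spaces."""
    return " ".join(term.lower().replace("-", " ").split())


def compare_annotations(tags: list[str], mesh_terms: list[str]) -> tuple[int, float]:
    """Return (number of common annotations, Jaccard similarity) for one paper."""
    tag_set = {normalize(t) for t in tags}
    mesh_set = {normalize(m) for m in mesh_terms}
    common = tag_set & mesh_set
    union = tag_set | mesh_set
    # Jaccard similarity: |intersection| / |union|; defined as 0.0 for two empty sets.
    jaccard = len(common) / len(union) if union else 0.0
    return len(common), jaccard


# Hypothetical annotations for a single paper (not taken from the study's data).
common_count, jaccard = compare_annotations(
    ["Gene-Expression", "cancer", "review"],
    ["Gene Expression", "Neoplasms", "Humans"],
)
print(common_count, jaccard)
```

In this invented example only "gene expression" matches after normalization, while "cancer" and "Neoplasms" do not, which mirrors the kind of lexical distance between tags and MeSH terms that the abstract reports.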
-
6. Leydesdorff, L. ; Rotolo, D. ; Rafols, I.: Bibliometric perspectives on medical innovation using the medical subject headings of PubMed.
In: Journal of the American Society for Information Science and Technology. 63(2012) no.11, pp.2239-2253.
Abstract: Multiple perspectives on the nonlinear processes of medical innovations can be distinguished and combined using the Medical Subject Headings (MeSH) of the MEDLINE database. Focusing on three main branches-"diseases," "drugs and chemicals," and "techniques and equipment"-we use base maps and overlay techniques to investigate the translations and interactions and thus to gain a bibliometric perspective on the dynamics of medical innovations. To this end, we first analyze the MEDLINE database, the MeSH index tree, and the various options for a static mapping from different perspectives and at different levels of aggregation. Following a specific innovation (RNA interference) over time, the notion of a trajectory which leaves a signature in the database is elaborated. Can the detailed index terms describing the dynamics of research be used to predict the diffusion dynamics of research results? Possibilities are specified for further integration between the MEDLINE database on one hand, and the Science Citation Index and Scopus (containing citation information) on the other.
Subject area: Informetrics
Discipline: Medicine
Object: PubMed ; MEDLINE
Tool: MeSH
-
7. Ibekwe-SanJuan, F.: Semantic metadata annotation : tagging Medline abstracts for enhanced information access.
In: Aslib proceedings. 62(2010) nos.4/5, pp.476-488.
Abstract: Purpose - The object of this study is to develop methods for automatically annotating the argumentative role of sentences in scientific abstracts. Working from Medline abstracts, sentences were classified into four major argumentative roles: objective, method, result, and conclusion. The idea is that, if the role of each sentence can be marked up, then these metadata can be used during information retrieval to seek particular types of information such as novelty, conclusions, methodologies, aims/goals of a scientific piece of work. Design/methodology/approach - Two approaches were tested: linguistic cues and positional heuristics. Linguistic cues are lexico-syntactic patterns modelled as regular expressions implemented in a linguistic parser. Positional heuristics make use of the relative position of a sentence in the abstract to deduce its argumentative class. Findings - The experiments showed that positional heuristics attained a much higher degree of accuracy on Medline abstracts with an F-score of 64 per cent, whereas the linguistic cues only attained an F-score of 12 per cent. This is mostly because sentences from different argumentative roles are not always announced by surface linguistic cues. Research limitations/implications - A limitation to the study was the inability to test other methods to perform this task such as machine learning techniques which have been reported to perform better on Medline abstracts. Also, to compare the results of the study with earlier studies using Medline abstracts, the different argumentative roles present in Medline had to be mapped on to four major argumentative roles. This may have favourably biased the performance of the sentence classification by positional heuristics. Originality/value - To the best of one's knowledge, this study presents the first instance of evaluating linguistic cues and positional heuristics on the same corpus.
Note: Contribution to a special issue: Content architecture: exploiting and managing diverse resources: proceedings of the first national conference of the United Kingdom chapter of the International Society for Knowledge Organization (ISKO)
Subject area: Knowledge representation
Discipline: Medicine
Object: Medline
-
8. Zhang, Y.: Dimensions and elements of people's mental models of an information-rich Web space.
In: Journal of the American Society for Information Science and Technology. 61(2010) no.11, pp.2206-2218.
Abstract: Although considered proxies for people to interact with a system, mental models have produced limited practical implications for system design. This might be due to the lack of exploration of the elements of mental models resulting from the methodological challenge of measuring mental models. This study employed a new method, concept listing, to elicit people's mental models of an information-rich space, MedlinePlus, after they interacted with the system for 5 minutes. Thirty-eight undergraduate students participated in the study. The results showed that, in this short period of time, participants perceived MedlinePlus from many different aspects in relation to four components: the system as a whole, its content, information organization, and interface. Meanwhile, participants expressed evaluations of or emotions about the four components. In terms of the procedural knowledge, an integral part of people's mental models, only one participant identified a strategy more aligned to the capabilities of MedlinePlus to solve a hypothetical task; the rest planned to use general search and browse strategies. The composition of participants' mental models of MedlinePlus was consistent with that of their models of information-rich Web spaces in general.
Subject area: User studies
Object: Medline
-
9. Humphrey, S.M. ; Névéol, A. ; Browne, A. ; Gobeil, J. ; Ruch, P. ; Darmoni, S.J.: Comparing a rule-based versus statistical system for automatic categorization of MEDLINE documents according to biomedical specialty.
In: Journal of the American Society for Information Science and Technology. 60(2009) no.12, pp.2530-2539.
Abstract: Automatic document categorization is an important research problem in Information Science and Natural Language Processing. Many applications, including, Word Sense Disambiguation and Information Retrieval in large collections, can benefit from such categorization. This paper focuses on automatic categorization of documents from the biomedical literature into broad discipline-based categories. Two different systems are described and contrasted: CISMeF, which uses rules based on human indexing of the documents by the Medical Subject Headings (MeSH) controlled vocabulary in order to assign metaterms (MTs), and Journal Descriptor Indexing (JDI), based on human categorization of about 4,000 journals and statistical associations between journal descriptors (JDs) and textwords in the documents. We evaluate and compare the performance of these systems against a gold standard of humanly assigned categories for 100 MEDLINE documents, using six measures selected from trec_eval. The results show that for five of the measures performance is comparable, and for one measure JDI is superior. We conclude that these results favor JDI, given the significantly greater intellectual overhead involved in human indexing and maintaining a rule base for mapping MeSH terms to MTs. We also note a JDI method that associates JDs with MeSH indexing rather than textwords, and it may be worthwhile to investigate whether this JDI method (statistical) and CISMeF (rule-based) might be combined and then evaluated showing they are complementary to one another.
Subject area: Automatic indexing ; Automatic classification
Discipline: Medicine
Object: MEDLINE
-
10. Huuskonen, S. ; Vakkari, P.: Students' search process and outcome in Medline in writing an essay for a class on evidence-based medicine.
In: Journal of documentation. 64(2008) no.2, pp.287-303.
Abstract: Purpose - The aim of this study is to explore to which extent searching by medical students in Medline produces information items useful for writing an essay measured by precision and relative recall as perceived by the students, the proportion of cited items, and their utilization on four dimensions of the essay writing task evaluated by external assessors. It also aims to study interrelations of search process and outcome. Design/methodology/approach - The study subjects were 42 third year medical students attending a class on Diagnostic and therapy. Searching in Medline was a part of their assignment of essay writing. The data consist of students' printed logs of Medline searches, students' assessments of the usefulness of the references retrieved, a questionnaire concerning the search process, and evaluation scores of the essays given by the teachers of the class. Pearson correlation coefficients were calculated for answering the research questions. Findings - The paper finds that precision and relative recall were not associated with evaluation scores in three of the four dimensions assessed. Some of the process variables were associated with precision and with assessment scores in two of the four dimensions assessed. Citing rate was negatively associated with recall. It seems that precision and recall are only weakly, if at all, associated to the use of information in the documents retrieved for writing the essay. Precision and relative recall are not associated to the way information in the retrieved items is used for performing the task. Users evidently look for a sufficient number of documents containing enough information for progressing in their task. Precision and recall are not sufficient measures in evaluating IR systems, but they have to be completed by other measures indicating the impact of the system on users' task performance. Originality/value - The paper provides useful information on students' information search process.
Subject area: Information services
Discipline: Medicine
Object: Medline
-
11. Abdou, S. ; Savoy, J.: Searching in Medline : query expansion and manual indexing evaluation.
In: Information processing and management. 44(2008) no.2, pp.781-789.
Abstract: Based on a relatively large subset representing one third of the Medline collection, this paper evaluates ten different IR models, including recent developments in both probabilistic and language models. We show that the best performing IR model is a probabilistic model developed within the Divergence from Randomness framework [Amati, G., & van Rijsbergen, C.J. (2002) Probabilistic models of information retrieval based on measuring the divergence from randomness. ACM-Transactions on Information Systems 20(4), 357-389], which results in 170% enhancements in mean average precision when compared to the classical tf idf vector-space model. This paper also reports on our impact evaluations on the retrieval effectiveness of manually assigned descriptors (MeSH or Medical Subject Headings), showing that by including these terms retrieval performance can improve from 2.4% to 13.5%, depending on the underlying IR model. Finally, we design a new general blind-query expansion approach showing improved retrieval performances compared to those obtained using the Rocchio approach.
Subject area: Retrieval studies
Discipline: Medicine
Object: Medline
-
12. Humphrey, S.M. ; Rogers, W.J. ; Kilicoglu, H. ; Demner-Fushman, D. ; Rindflesch, T.C.: Word sense disambiguation by selecting the best semantic type based on journal descriptor indexing : preliminary experiment.
In: Journal of the American Society for Information Science and Technology. 57(2006) no.1, pp.96-113.
Abstract: An experiment was performed at the National Library of Medicine® (NLM®) in word sense disambiguation (WSD) using the Journal Descriptor Indexing (JDI) methodology. The motivation is the need to solve the ambiguity problem confronting NLM's MetaMap system, which maps free text to terms corresponding to concepts in NLM's Unified Medical Language System® (UMLS®) Metathesaurus®. If the text maps to more than one Metathesaurus concept at the same high confidence score, MetaMap has no way of knowing which concept is the correct mapping. We describe the JDI methodology, which is ultimately based on statistical associations between words in a training set of MEDLINE® citations and a small set of journal descriptors (assigned by humans to journals per se) assumed to be inherited by the citations. JDI is the basis for selecting the best meaning that is correlated to UMLS semantic types (STs) assigned to ambiguous concepts in the Metathesaurus. For example, the ambiguity transport has two meanings: "Biological Transport" assigned the ST Cell Function and "Patient transport" assigned the ST Health Care Activity. A JDI-based methodology can analyze text containing transport and determine which ST receives a higher score for that text, which then returns the associated meaning, presumed to apply to the ambiguity itself. We then present an experiment in which a baseline disambiguation method was compared to four versions of JDI in disambiguating 45 ambiguous strings from NLM's WSD Test Collection. Overall average precision for the highest-scoring JDI version was 0.7873 compared to 0.2492 for the baseline method, and average precision for individual ambiguities was greater than 0.90 for 23 of them (51%), greater than 0.85 for 24 (53%), and greater than 0.65 for 35 (79%). On the basis of these results, we hope to improve performance of JDI and test its use in applications.
Subject area: Computational linguistics
Discipline: Medicine
Object: Medline ; UMLS
-
13. Torvik, V.I. ; Weeber, M. ; Swanson, D.R. ; Smalheiser, N.R.: A probabilistic similarity metric for Medline records : a model for author name disambiguation.
In: Journal of the American Society for Information Science and Technology. 56(2005) no.2, pp.140-158.
Abstract: We present a model for estimating the probability that a pair of author names (sharing last name and first initial), appearing on two different Medline articles, refer to the same individual. The model uses a simple yet powerful similarity profile between a pair of articles, based on title, journal name, coauthor names, medical subject headings (MeSH), language, affiliation, and name attributes (prevalence in the literature, middle initial, and suffix). The similarity profile distribution is computed from reference sets consisting of pairs of articles containing almost exclusively author matches versus nonmatches, generated in an unbiased manner. Although the match set is generated automatically and might contain a small proportion of nonmatches, the model is quite robust against contamination with nonmatches. We have created a free, public service ("Author-ity": http://arrowsmith.psych.uic.edu) that takes as input an author's name given on a specific article, and gives as output a list of all articles with that (last name, first initial) ranked by decreasing similarity, with match probability indicated.
Subject area: Informetrics
Discipline: Medicine
Object: Medline
-
14. Kaulen, H.: Deutsche Publikationen in MEDLINE unterschlagen [German publications suppressed in MEDLINE].
In: Password. 2005, no.3, p.15.
Content: "Systematic distortions in the construction of international literature databases have led to a growing imbalance in the public perception of research results. One important source of these distortions is the German language. Work that is not published in English is less likely to be included in such important databases as Medline, which is compiled and maintained by the US National Library of Medicine. What is not listed in these virtual libraries cannot be searched for electronically either. Many data therefore never enter the scientific discussion. Jens Türp and his colleagues at the Centre for Dental Medicine of the University of Basel have determined, for the field of dentistry, what consequences this has for Germany as a location for science. Of the clinical studies published in German dental journals over the past thirty years, only half or three quarters, depending on the type of study, were covered in Medline. Of these, in turn, many had been indexed incorrectly, which made them difficult to find, especially for inexperienced database users. Essential findings of German dental research thus appear to go unnoticed, which sets a fatal vicious circle in motion. What is not noticed is not cited. And what is not cited has even less chance of being listed in one of the respected databases. German journals are therefore increasingly losing importance for the publication of original research. Many journals have consequently retreated into continuing education and professional training and publish mainly review articles rather than original papers. Although the German journals are aware of the dilemma, there seems to be little they can do about it.
The 'Deutsche Zahnärztliche Zeitschrift' has been trying for years to be readmitted to Medline, where it was represented between 1970 and 1992. Although it still meets all the criteria, the listing is being withheld from it without any reasons being given, as Thomas Kerschbaum of the dental clinic of the University of Cologne explained. This can only be explained by a certain arbitrariness. The insufficient coverage of German-language original articles does not only distort the perception of research results, however. The researchers concerned are also coming under increasing pressure in the allocation of funding. Since part of the research funds has been allocated on a performance basis, with performance measured by the degree of public visibility, they have been receiving less and less money."
Note: Reprint of an article in the FAZ: "Deutsche Publikationen in Datenbanken unterschlagen"
Subject area: Information resources
Object: Medline
-
15. Comeau, D.C. ; Wilbur, W.J.: Non-Word Identification or Spell Checking Without a Dictionary.
In: Journal of the American Society for Information Science and Technology. 55(2004) no.2, pp.169-177.
Abstract: MEDLINE is a collection of more than 12 million references and abstracts covering recent life science literature. With its continued growth and cutting-edge terminology, spell-checking with a traditional lexicon based approach requires significant additional manual followup. In this work, an internal corpus based context quality rating α, frequency, and simple misspelling transformations are used to rank words from most likely to be misspellings to least likely. Eleven-point average precisions of 0.891 have been achieved within a class of 42,340 all alphabetic words having an α score less than 10. Our models predict that 16,274 or 38% of these words are misspellings. Based on test data, this result has a recall of 79% and a precision of 86%. In other words, spell checking can be done by statistics instead of with a dictionary. As an application we examine the time history of low α words in MEDLINE titles and abstracts.
Subject area: Computational linguistics
Discipline: Medicine
Object: Medline
-
16. Srinivasan, P.: Text mining : generating hypotheses from MEDLINE.
In: Journal of the American Society for Information Science and Technology. 55(2004) no.5, pp.396-413.
Abstract: Hypothesis generation, a crucial initial step for making scientific discoveries, relies on prior knowledge, experience, and intuition. Chance connections made between seemingly distinct subareas sometimes turn out to be fruitful. The goal in text mining is to assist in this process by automatically discovering a small set of interesting hypotheses from a suitable text collection. In this report, we present open and closed text mining algorithms that are built within the discovery framework established by Swanson and Smalheiser. Our algorithms represent topics using metadata profiles. When applied to MEDLINE, these are MeSH-based profiles. We present experiments that demonstrate the effectiveness of our algorithms. Specifically, our algorithms successfully generate ranked term lists where the key terms representing novel relationships between topics are ranked high.
Subject area: Data mining
Discipline: Medicine
Object: Medline
-
17. Mit gemeinsamer Such- und Findemaschine : ZBMed und DIMDI [With a joint search and find engine : ZBMed and DIMDI].
In: Password. 2002, no.12, p.32.
Abstract: DIMDI and the Deutsche Zentralbibliothek für Medizin (both in Cologne) have launched the joint search engine www.medpilot.de. The project is funded by the Deutsche Forschungsgemeinschaft. The ability to search simultaneously, through a single interface, in the databases of DIMDI (for example Medline, Cancerlit and Toxline) and of ZBMed (such as CCMed, the library catalogue and the link database) is intended to be used above all by physicians, with the full capabilities of professional searching. Where available, documents can be ordered directly or full texts accessed online. Book titles found can be ordered from the book trade via an integrated interface.
Note: Cf.: www.medpilot.de
Subject area: Information resources
Discipline: Medicine
Object: DIMDI ; Medline
Field of application: Specialized information institutions
-
18Kim, W. ; Wilbur, W.J.: Corpus-based statistical screening for content-bearing terms.
In: Journal of the American Society for Information Science and technology. 52(2001) no.3, S.247-259.
Abstract: Kim and Wilbur present three techniques for the algorithmic identification in text of content-bearing terms and phrases intended for human use as entry points or hyperlinks. Using a set of 1,075 terms from MEDLINE, rated on a scale from zero (stop word) to four (definite content word), they evaluate the ranked lists produced by the three methods according to how many content words each places in the top ranks. The data consist of the natural language elements of 304,057 MEDLINE records from 1996 and 173,252 Wall Street Journal records from the TIPSTER collection. Phrases are extracted by breaking the text at punctuation marks and stop words, then normalized by lowercasing, replacing nonalphanumerics with spaces, and reducing multiple spaces. In the "strength of context" approach, each document is a vector of binary values, one for each word or word pair. The words or word pairs of the phrase are removed from all documents, and the Robertson-Sparck Jones relevance weight is computed for each term. Negative weights are replaced with zero, weights below a randomness threshold are ignored, and the remainder are summed for each document to yield a document score. The term is then assigned the average score of the documents in which it occurred, and the average of these word scores is assigned to the original phrase. The "frequency clumping" approach defines a random phrase as one whose distribution among documents is Poisson in character. A p-value, the probability that a phrase's frequency of occurrence would be equal to or less than Poisson expectations, is computed, and the score assigned is the negative log of that value. In the "database comparison" approach, a phrase is considered content-bearing for MEDLINE if its occurrence in a document allows the prediction that the document is in MEDLINE rather than in the Wall Street Journal.
The score is computed by dividing the number of occurrences of the term in MEDLINE by its occurrences in the Journal and taking the product of all these values. For each method, the one hundred top-ranked and one hundred bottom-ranked phrases that occurred in at least 500 documents were collected; the union set had 476 phrases. A second selection was made of two-word phrases, each occurring in only three documents, with a union of 599 phrases. A judge then rated the two sets of terms for subject specificity on a 0 to 4 scale. Precision was the average subject specificity of the first r ranks, recall was the fraction of the subject-specific phrases in the first r ranks, and eleven-point average precision was used as a summary measure. All three methods move content-bearing terms forward in the lists, as does using the sum of the logs of the three methods.
Subject area: Computational linguistics
Object: Medline
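Note: The phrase extraction and the "frequency clumping" score described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The stop word list is a hypothetical placeholder, and the abstract does not say which count is compared with the Poisson expectation; the sketch assumes it is the number of documents containing the phrase, with the expectation derived from scattering all occurrences at random over the collection.

```python
import math
import re

# Hypothetical, deliberately tiny stop word list; the abstract does not
# reproduce the list that was actually used.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to", "with"}


def normalize(phrase):
    """Lowercase, replace nonalphanumerics with spaces, reduce multiple spaces."""
    phrase = re.sub(r"[^a-z0-9 ]", " ", phrase.lower())
    return re.sub(r"\s+", " ", phrase).strip()


def extract_phrases(text):
    """Break text at punctuation marks and stop words, then normalize."""
    phrases = []
    for chunk in re.split(r"[.,;:!?()\[\]\"]+", text):
        current = []
        # The trailing None flushes the last phrase of each chunk.
        for token in chunk.split() + [None]:
            if token is None or token.lower() in STOP_WORDS:
                phrase = normalize(" ".join(current))
                if phrase:
                    phrases.append(phrase)
                current = []
            else:
                current.append(token)
    return phrases


def clumping_score(total_occurrences, docs_with_phrase, num_docs):
    """Negative log of P(X <= docs_with_phrase) for X ~ Poisson(lam).

    Assumption of this sketch: lam is the number of documents expected to
    contain the phrase if its occurrences were scattered at random.
    """
    if total_occurrences <= 0 or num_docs <= 0:
        raise ValueError("counts must be positive")
    lam = num_docs * (1.0 - math.exp(-total_occurrences / num_docs))
    # Sum the Poisson probabilities in log space to avoid underflow.
    log_terms = [
        k * math.log(lam) - lam - math.lgamma(k + 1)
        for k in range(docs_with_phrase + 1)
    ]
    peak = max(log_terms)
    log_p = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return max(0.0, -log_p)


phrases = extract_phrases(
    "Resistance to beta-lactam antibiotics, in the Streptococcus pneumoniae."
)
clumped = clumping_score(total_occurrences=50, docs_with_phrase=5, num_docs=1000)
scattered = clumping_score(total_occurrences=50, docs_with_phrase=49, num_docs=1000)
```

Under this assumption, a phrase whose 50 occurrences are concentrated in 5 documents scores far higher than one spread over 49 documents, which is the behaviour the abstract attributes to content-bearing phrases.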
-
19Qin, J.: Semantic similarities between a keyword database and a controlled vocabulary database : an investigation in the antibiotic resistance literature.
In: Journal of the American Society for Information Science. 51(2000) no.2, pp. 166-180.
Abstract: The 'KeyWords Plus' in the Science Citation Index database represents an approach to combining citation and semantic indexing in describing the document content. This paper explores the similarities and dissimilarities between citation-semantic and analytic indexing. The dataset consisted of over 400 matching records in the SCI and MEDLINE databases on antibiotic resistance in pneumonia. The degree of similarity in indexing terms was found to vary on a scale from completely different to completely identical, with various levels in between. The within-document similarity in the 2 databases was measured by a variation on the Jaccard coefficient, the Inclusion Index. The average inclusion coefficient was 0.4134 for SCI and 0.3371 for MEDLINE. The 20 terms occurring most frequently in each database were identified. The 2 groups of terms shared the terms that constitute the 'intellectual base' for the subject. Conceptual similarity was analyzed through scatterplots of matching and nonmatching terms vs. partially identical and broader/narrower terms. The study also found that the two databases differed in assigning terms in various semantic categories. Implications of this research and further studies are suggested.
Subject area: Indexing studies
Discipline: Pharmacy
Object: Science Citation Index ; Medline
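Note: The abstract above names the Jaccard coefficient and an "Inclusion Index" without giving formulas. The following minimal Python sketch shows the standard Jaccard coefficient and, as an assumption, an asymmetric inclusion measure (the share of one record's terms that also occur in the other), which would be consistent with separate averages being reported for each database. The term sets are hypothetical, and only exact term matches are counted.

```python
def jaccard(a, b):
    """Standard Jaccard coefficient: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def inclusion(a, b):
    """Assumed inclusion measure: share of the terms in a that also occur in b."""
    a, b = set(a), set(b)
    return len(a & b) / len(a) if a else 0.0


# Hypothetical index terms for one article as recorded in each database.
sci_terms = {
    "penicillin resistance",
    "streptococcus pneumoniae",
    "pneumonia",
    "children",
}
medline_terms = {
    "streptococcus pneumoniae",
    "pneumonia",
    "drug resistance, microbial",
    "humans",
    "penicillins",
}

similarity = jaccard(sci_terms, medline_terms)
sci_inclusion = inclusion(sci_terms, medline_terms)
medline_inclusion = inclusion(medline_terms, sci_terms)
```

Because the two records carry different numbers of terms, the inclusion value differs by direction, whereas the Jaccard coefficient is symmetric.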
-
20Hehl, H.: Ein Linksystem zur Integration von Literatursuche und Literaturbeschaffung : Medline-LINK.
In: nfd Information - Wissenschaft und Praxis. 51(2000) no.4, pp. 209-216.
Abstract: Many database providers now use, or offer as standard, the possibilities opened up by the WWW: accessing electronic full texts from a database's search results via hyperlinks, ordering individual titles online, or linking to other databases or catalogues. The link system discussed here offers the same possibilities and connects search results with locally available electronic journals or with library catalogues. An automatic ordering function is also provided. This JavaScript-based link system uses a simple but still little-known method in which the entire result list of a database (50 to 200 titles) is pasted into the text input field of a form and then processed further with JavaScript. One advantage is the program's great adaptability to special or even individual needs. Medline-LINK applies this linking method to PubMed, a particularly efficient database that is also free of charge. In this test version, the e-journals subscribed to by the Regensburg University Library (UB Regensburg), plus a large share of the Elsevier journals, form the core stock of journals to be linked. Using the dynamically determined ISSN, the result display can be connected to the holdings displays of the BVB and GBV. The automatic ordering function is demonstrated using the interlibrary loan order form of the UB Regensburg.
Subject area: Internet ; Information resources
Discipline: Medicine
Object: Medline-LINK ; PubMed
Country/Place: Regensburg