This database contains over 40,000 documents on topics in the areas of descriptive cataloguing – subject indexing – information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 16 December 2019)
1. Zhang, J. ; Yu, Q. ; Zheng, F. ; Long, C. ; Lu, Z. ; Duan, Z.: Comparing keywords plus of WOS and author keywords : a case study of patient adherence research.
In: Journal of the Association for Information Science and Technology. 67(2016) no.4, pp. 967-972.
Abstract: Bibliometric analysis based on literature in the Web of Science (WOS) has become an increasingly popular method for visualizing the structure of scientific fields. Keywords Plus and Author Keywords are commonly selected as units of analysis, despite the limited research evidence demonstrating the effectiveness of Keywords Plus. This study was conceived to evaluate the efficacy of Keywords Plus as a parameter for capturing the content and scientific concepts presented in articles. Using scientific papers about patient adherence that were retrieved from WOS, a comparative assessment of Keywords Plus and Author Keywords was performed at the scientific field level and the document level, respectively. Our search yielded more Keywords Plus terms than Author Keywords, and the Keywords Plus terms were more broadly descriptive. Keywords Plus is as effective as Author Keywords in terms of bibliometric analysis investigating the knowledge structure of scientific fields, but it is less comprehensive in representing an article's content.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23437/abstract.
Object: Web of Science
2. Liu, W. ; Doğan, R.I. ; Kim, S. ; Comeau, D.C. ; Kim, W. ; Yeganova, L. ; Lu, Z. ; Wilbur, W.J.: Author name disambiguation for PubMed.
In: Journal of the Association for Information Science and Technology. 65(2014) no.4, pp. 765-781.
Abstract: Log analysis shows that PubMed users frequently use author names in queries for retrieving scientific literature. However, author name ambiguity may lead to irrelevant retrieval results. To improve the PubMed user experience with author name queries, we designed an author name disambiguation system consisting of similarity estimation and agglomerative clustering. A machine-learning method was employed to score the features for disambiguating a pair of papers with ambiguous names. These features enable the computation of pairwise similarity scores to estimate the probability of a pair of papers belonging to the same author, which drives an agglomerative clustering algorithm regulated by 2 factors: name compatibility and probability level. With transitivity violation correction, high precision author clustering is achieved by focusing on minimizing false-positive pairing. Disambiguation performance is evaluated with manual verification of random samples of pairs from clustering results. When compared with a state-of-the-art system, our evaluation shows that among all the pairs the lumping error rate drops from 10.1% to 2.2% for our system, while the splitting error rises from 1.8% to 7.7%. This results in an overall error rate of 9.9%, compared with 11.9% for the state-of-the-art method. Other evaluations based on gold standard data also show the increase in accuracy of our clustering. We attribute the performance improvement to the machine-learning method driven by a large-scale training set and the clustering algorithm regulated by a name compatibility scheme preferring precision. With integration of the author name disambiguation system into the PubMed search engine, the overall click-through-rate of PubMed users on author name query results improved from 34.9% to 36.9%.
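Note on record 2: the abstract describes agglomerative clustering driven by pairwise same-author probabilities and regulated by name compatibility and a probability level. The following is a minimal sketch of that general idea, not the authors' system. The compatibility rule, the average-link merge criterion, the threshold of 0.5, and all sample names and probabilities are assumptions made for illustration. The learned similarity model and the transitivity violation correction mentioned in the abstract are not modelled.

```python
from itertools import combinations


def names_compatible(a, b):
    """Hypothetical rule: same surname, and one given name is a prefix of the other."""
    (last_a, first_a), (last_b, first_b) = a, b
    return last_a == last_b and (
        first_a.startswith(first_b) or first_b.startswith(first_a)
    )


def cluster(papers, prob, threshold=0.5):
    """Greedy average-link agglomerative clustering.

    papers: dict mapping paper id -> (surname, given name)
    prob:   dict mapping frozenset({id1, id2}) -> assumed same-author probability
    """
    clusters = [{pid} for pid in papers]
    while True:
        best_score, best_pair = threshold, None
        for c1, c2 in combinations(clusters, 2):
            # Regulating factor 1: every cross-cluster name pair must be compatible.
            if not all(
                names_compatible(papers[p], papers[q]) for p in c1 for q in c2
            ):
                continue
            # Regulating factor 2: mean pairwise probability must exceed the threshold.
            total = sum(prob.get(frozenset((p, q)), 0.0) for p in c1 for q in c2)
            score = total / (len(c1) * len(c2))
            if score > best_score:
                best_score, best_pair = score, (c1, c2)
        if best_pair is None:
            return clusters
        c1, c2 = best_pair
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 | c2)


# Invented sample data: four papers by authors named "Lu".
papers = {
    "p1": ("lu", "z"),
    "p2": ("lu", "zhiyong"),
    "p3": ("lu", "zhihong"),
    "p4": ("lu", "z"),
}
prob = {
    frozenset(("p1", "p2")): 0.9,
    frozenset(("p1", "p3")): 0.2,
    frozenset(("p2", "p3")): 0.8,  # high score, but the given names conflict
    frozenset(("p1", "p4")): 0.6,
    frozenset(("p2", "p4")): 0.7,
    frozenset(("p3", "p4")): 0.1,
}
result = cluster(papers, prob)
```

In this sketch the name check vetoes a merge even when the probability is high, which favours precision (fewer lumping errors) at the cost of more splitting errors. That is the trade-off the abstract reports.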
3. Özel, S.A. ; Altingövde, I.S. ; Ulusoy, Ö. ; Özsoyoglu, G. ; Özsoyoglu, Z.M.: Metadata-Based Modeling of Information Resources on the Web.
In: Journal of the American Society for Information Science and Technology. 55(2004) no.2, pp. 97-110.
Abstract: This paper deals with the problem of modeling Web information resources using expert knowledge and personalized user information for improved Web searching capabilities. We propose a "Web information space" model, which is composed of Web-based information resources (HTML/XML [Hypertext Markup Language/Extensible Markup Language] documents on the Web), expert advice repositories (domain-expert-specified metadata for information resources), and personalized information about users (captured as user profiles that indicate users' preferences about experts as well as users' knowledge about topics). Expert advice, the heart of the Web information space model, is specified using topics and relationships among topics (called metalinks), along the lines of the recently proposed topic maps. Topics and metalinks constitute metadata that describe the contents of the underlying HTML/XML Web resources. The metadata specification process is semiautomated, and it exploits XML DTDs (Document Type Definition) to allow domain-expert guided mapping of DTD elements to topics and metalinks. The expert advice is stored in an object-relational database management system (DBMS). To demonstrate the practicality and usability of the proposed Web information space model, we created a prototype expert advice repository of more than one million topics/metalinks for DBLP (Database and Logic Programming) Bibliography data set. We also present a query interface that provides sophisticated querying facilities for DBLP Bibliography resources using the expert advice repository.
Subject area: Internet ; Metadata
4. Lu, Z. ; McKinley, K.S.: The effect of collection organization and query locality on information retrieval system performance.
In: Advances in information retrieval: Recent research from the Center for Intelligent Information Retrieval. Ed.: W.B. Croft. Boston, MA : Kluwer Academic Publ., 2000. pp. 173-202.
(The Kluwer international series on information retrieval; 7)
Abstract: The explosion of content in distributed information retrieval (IR) systems requires new mechanisms in order to attain timely and accurate retrieval of text. Collection selection and partial collection replication with replica selection are two such mechanisms that enable IR systems to search a small percentage of data and thus improve performance and scalability. To maintain effectiveness as well as efficiency, IR systems must be configured carefully to consider workload locality and possible collection organizations. We propose IR system architectures that incorporate collection selection and partial replication, and compare configurations using a validated simulator. Locality and collection organization have dramatic effects on performance. For example, we demonstrate with simulation results that collection selection performs especially well when the distribution of queries to collections is uniform and collections are organized by topics, but it suffers when particular collections are "hot." We find that when queries have even modest locality, configurations that replicate data outperform those that partition data, usually significantly. These results can be used as the basis for IR system designs under a variety of workloads and collection organizations.
5. Allan, J. ; Ballesteros, L. ; Callan, J.P. ; Croft, W.B. ; Lu, Z.: Recent experiments with INQUERY.
In: The Fourth Text Retrieval Conference (TREC-4). Ed.: K. Harman. Gaithersburg, MD : National Institute of Standards and Technology, 1996. pp. 49-63.
(NIST special publication; 500-236)
Object: INQUERY ; TREC