This database contains more than 40,000 documents on topics in descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 28 April 2022)
1. Leginus, M. ; Zhai, C.X. ; Dolog, P.: Personalized generation of word clouds from tweets.
In: Journal of the Association for Information Science and Technology. 67(2016) no.5, pp. 1021-1032.
Abstract: Active users of Twitter are often overwhelmed with the vast amount of tweets. In this work we attempt to help users browsing a large number of accumulated posts. We propose a personalized word cloud generation as a means for users' navigation. Various user past activities such as user published tweets, retweets, and seen but not retweeted tweets are leveraged for enhanced personalization of word clouds. The best personalization results are attained with user past retweets. However, users' own past tweets are not as useful as retweets for personalization. Negative preferences derived from seen but not retweeted tweets further enhance personalized word cloud generation. The ranking combination method outperforms the preranking approach and provides a general framework for combined ranking of various user past information for enhanced word cloud generation. To better capture subtle differences of generated word clouds, we propose an evaluation of word clouds with a mean average precision measure.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23494/abstract.
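The abstract of entry 1 names mean average precision (MAP) as its evaluation measure but does not say how a word cloud is turned into a ranked list. The sketch below shows only the standard information-retrieval definition of MAP. It assumes, purely for illustration, that each word cloud is a ranked list of terms and that the relevant terms for a user are given as a set; the function names are hypothetical and not taken from the paper.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant items."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant item occurs.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of the average precision over several (ranked list, relevant set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


if __name__ == "__main__":
    # Assumed toy data: two word clouds, each with the terms a user found relevant.
    clouds = [
        (["a", "b", "c", "d"], {"a", "c"}),
        (["x", "y"], {"y"}),
    ]
    print(mean_average_precision(clouds))
```

Because precision is accumulated at every rank where a relevant term appears, two clouds containing the same terms in a different order receive different scores, which is what makes the measure sensitive to ranking differences.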
2. Vinod Vydiswaran, V.G. ; Zhai, C.X. ; Roth, D. ; Pirolli, P.: Overcoming bias to learn about controversial topics.
In: Journal of the Association for Information Science and Technology. 66(2015) no.8, pp. 1655-1672.
Abstract: Deciding whether a claim is true or false often requires a deeper understanding of the evidence supporting and contradicting the claim. However, when presented with many evidence documents, users do not necessarily read and trust them uniformly. Psychologists and other researchers have shown that users tend to follow and agree with articles and sources that hold viewpoints similar to their own, a phenomenon known as confirmation bias. This suggests that when learning about a controversial topic, human biases and viewpoints about the topic may affect what is considered "trustworthy" or credible. It is an interesting challenge to build systems that can help users overcome this bias and help them decide the truthfulness of claims. In this article, we study various factors that enable humans to acquire additional information about controversial claims in an unbiased fashion. Specifically, we designed a user study to understand how presenting evidence with contrasting viewpoints and source expertise ratings affect how users learn from the evidence documents. We find that users do not seek contrasting viewpoints by themselves, but explicitly presenting contrasting evidence helps them get a well-rounded understanding of the topic. Furthermore, explicit knowledge of the credibility of the sources and the context in which the source provides the evidence document not only affects what users read but also whether they perceive the document to be credible.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23274/abstract.
3. Ding, Y. ; Zhang, G. ; Chambers, T. ; Song, M. ; Wang, X. ; Zhai, C.: Content-based citation analysis : the next generation of citation analysis.
In: Journal of the Association for Information Science and Technology. 65(2014) no.9, pp. 1820-1833.
Abstract: Traditional citation analysis has been widely applied to detect patterns of scientific collaboration, map the landscapes of scholarly disciplines, assess the impact of research outputs, and observe knowledge transfer across domains. It is, however, limited, as it assumes all citations are of similar value and weights each equally. Content-based citation analysis (CCA) addresses a citation's value by interpreting each one based on its context at both the syntactic and semantic levels. This paper provides a comprehensive overview of CCA research in terms of its theoretical foundations, methodical approaches, and example applications. In addition, we highlight how increased computational capabilities and publicly available full-text resources have opened this area of research to vast possibilities, which enable deeper citation analysis, more accurate citation prediction, and increased knowledge discovery.
Subject area: Citation indexing
4. Ling, X. ; Jiang, J. ; He, X. ; Mei, Q. ; Zhai, C. ; Schatz, B.: Generating gene summaries from biomedical literature : a study of semi-structured summarization.
In: Information processing and management. 43(2007) no.6, pp. 1777-1791.
Abstract: Most knowledge accumulated through scientific discoveries in genomics and related biomedical disciplines is buried in the vast amount of biomedical literature. Since understanding gene regulations is fundamental to biomedical research, summarizing all the existing knowledge about a gene based on literature is highly desirable to help biologists digest the literature. In this paper, we present a study of methods for automatically generating gene summaries from biomedical literature. Unlike most existing work on automatic text summarization, in which the generated summary is often a list of extracted sentences, we propose to generate a semi-structured summary which consists of sentences covering specific semantic aspects of a gene. Such a semi-structured summary is more appropriate for describing genes and poses special challenges for automatic text summarization. We propose a two-stage approach to generate such a summary for a given gene - first retrieving articles about a gene and then extracting sentences for each specified semantic aspect. We address the issue of gene name variation in the first stage and propose several different methods for sentence extraction in the second stage. We evaluate the proposed methods using a test set with 20 genes. Experiment results show that the proposed methods can generate useful semi-structured gene summaries automatically from biomedical literature, and our proposed methods outperform general purpose summarization methods. Among all the proposed methods for sentence extraction, a probabilistic language modeling approach that models gene context performs the best.
Subject area: Automatic abstracting
Discipline: Biology ; Medicine
5. Zhai, C.X. ; Lafferty, J.: A risk minimization framework for information retrieval.
In: Information processing and management. 42(2006) no.1, pp. 31-55.
Abstract: This paper presents a probabilistic information retrieval framework in which the retrieval problem is formally treated as a statistical decision problem. In this framework, queries and documents are modeled using statistical language models, user preferences are modeled through loss functions, and retrieval is cast as a risk minimization problem. We discuss how this framework can unify existing retrieval models and accommodate systematic development of new retrieval models. As an example of using the framework to model non-traditional retrieval problems, we derive retrieval models for subtopic retrieval, which is concerned with retrieving documents to cover many different subtopics of a general query topic. These new models differ from traditional retrieval models in that they relax the traditional assumption of independent relevance of documents.
Note: Contribution within a thematic focus section "Formal Methods for Information Retrieval"
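The abstract of entry 5 describes retrieval as risk minimization over query and document language models, with user preferences expressed as a loss function. The following is a minimal sketch of that idea, not the authors' implementation. It assumes unigram models, takes KL divergence between the query model and a smoothed document model as the loss, and uses Dirichlet prior smoothing with an arbitrary parameter `mu`; all function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Dict, List


def unigram_model(tokens: List[str]) -> Dict[str, float]:
    """Maximum-likelihood unigram language model of a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


def smoothed_doc_model(doc_tokens: List[str], collection: Dict[str, float],
                       mu: float = 10.0) -> Callable[[str], float]:
    """Document model with Dirichlet prior smoothing (mu is an assumed value)."""
    counts = Counter(doc_tokens)
    length = len(doc_tokens)

    def prob(word: str) -> float:
        return (counts[word] + mu * collection.get(word, 0.0)) / (length + mu)

    return prob


def kl_risk(query_model: Dict[str, float], doc_prob: Callable[[str], float]) -> float:
    """Assumed loss: KL divergence of the document model from the query model."""
    risk = 0.0
    for word, p_query in query_model.items():
        p_doc = doc_prob(word)
        if p_doc <= 0.0:
            return math.inf  # word unseen in the whole collection
        risk += p_query * math.log(p_query / p_doc)
    return risk


def rank_by_risk(query_tokens: List[str], docs: Dict[str, List[str]]) -> List[str]:
    """Return document ids ordered by increasing risk (lowest risk first)."""
    collection = unigram_model([t for tokens in docs.values() for t in tokens])
    query_model = unigram_model(query_tokens)
    scored = [(kl_risk(query_model, smoothed_doc_model(tokens, collection)), doc_id)
              for doc_id, tokens in docs.items()]
    return [doc_id for _, doc_id in sorted(scored)]


if __name__ == "__main__":
    docs = {
        "d1": ["language", "model", "retrieval"],
        "d2": ["gene", "summary", "biology"],
    }
    print(rank_by_risk(["language", "model"], docs))
```

This sketch scores each document independently. The subtopic retrieval models mentioned in the abstract relax exactly that independence assumption, which would require a loss that depends on the documents already ranked; that is not attempted here.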