This database contains more than 40,000 documents on topics in the fields of descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 4 June 2021)
1. Lee, J. ; Jatowt, A. ; Kim, K.-S.: Discovering underlying sensations of human emotions based on social media.
In: Journal of the Association for Information Science and Technology. 72(2021) no.4, pp. 417-432.
Abstract: Analyzing social media has become a common way for capturing and understanding people's opinions, sentiments, interests, and reactions to ongoing events. Social media has thus become a rich and real-time source for various kinds of public opinion and sentiment studies. According to psychology and neuroscience, human emotions are known to be strongly dependent on sensory perceptions. Although sensation is the most fundamental antecedent of human emotions, prior works have not looked into their relation to emotions based on social media texts. In this paper, we report the results of our study on sensation effects that underlie human emotions as revealed in social media. We focus on the key five types of sensations: sight, hearing, touch, smell, and taste. We first establish a correlation between emotion and sensation in terms of linguistic expressions. Then, in the second part of the paper, we define novel features useful for extracting sensation information from social media. Finally, we design a method to classify texts into ones associated with different types of sensations. The sensation dataset resulting from this research is opened to the public to foster further studies.
Content: Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24414.
2. Chin, J.Y. ; Bhowmick, S.S. ; Jatowt, A.: On-demand recent personal tweets summarization on mobile devices.
In: Journal of the Association for Information Science and Technology. 70(2019) no.6, pp. 547-562.
Abstract: Tweets summarization aims to find a group of representative tweets for a specific set of input tweets or a given topic. In recent times, there have been several research efforts toward devising a variety of techniques to summarize tweets in Twitter. However, these techniques are either not personal (that is, consider only tweets in the timeline of a specific user) or are too expensive to be realized on a mobile device. Given that 80% of active Twitter users access the site on mobile devices, in this article we present a lightweight, personal, on-demand, topic modeling-based tweets summarization engine called TOTEM, designed for such devices. Specifically, TOTEM first preprocesses recent tweets in a user's timeline and exploits Latent Dirichlet Allocation-based topic modeling to assign each preprocessed tweet to a topic. Then it generates a ranked list of relevant tweets, a topic label, and a topic summary for each of the topics. Our experimental study with real-world data sets demonstrates the superiority of TOTEM.
Content: Cf.: https://onlinelibrary.wiley.com/doi/10.1002/asi.24137.
3. Zielinski, K. ; Nielek, R. ; Wierzbicki, A. ; Jatowt, A.: Computing controversy : formal model and algorithms for detecting controversy on Wikipedia and in search queries.
In: Information processing and management. 54(2018) no.1, pp. 14-36.
Abstract: Controversy is a complex concept that has been attracting attention of scholars from diverse fields. In the era of Internet and social media, detecting controversy and controversial concepts by the means of automatic methods is especially important. Web searchers could be alerted when the contents they consume are controversial or when they attempt to acquire information on disputed topics. Presenting users with the indications and explanations of the controversy should offer them chance to see the "wider picture" rather than letting them obtain one-sided views. In this work we first introduce a formal model of controversy as the basis of computational approaches to detecting controversial concepts. Then we propose a classification based method for automatic detection of controversial articles and categories in Wikipedia. Next, we demonstrate how to use the obtained results for the estimation of the controversy level of search queries. The proposed method can be incorporated into search engines as a component responsible for detection of queries related to controversial topics. The method is independent of the search engine's retrieval and search results recommendation algorithms, and is therefore unaffected by a possible filter bubble. Our approach can be also applied in Wikipedia or other knowledge bases for supporting the detection of controversy and content maintenance. Finally, we believe that our results could be useful for social science researchers for understanding the complex nature of controversy and in fostering their studies.
Content: Cf.: https://doi.org/10.1016/j.ipm.2017.08.005.
Subject area: Information resources ; Internet
4. Jatowt, A. ; Yeung, C.M.A. ; Tanaka, K.: Generic method for detecting focus time of documents.
In: Information processing and management. 51(2015) no.6, pp. 851-868.
Abstract: Time is an important aspect of text documents. While some documents are atemporal, many have strong temporal characteristics and contain contents related to time. Such documents can be mapped to their corresponding time periods. In this paper, we propose estimating the focus time of documents which is defined as the time period to which document's content refers and which is considered complementary dimension to the document's creation time. We propose several estimators of focus time by utilizing statistical knowledge from external resources such as news article collections. The advantage of our approach is that document focus time can be estimated even for documents that do not contain any temporal expressions or contain only few of them. We evaluate the effectiveness of our methods on the diverse datasets of documents about historical events related to 5 countries. Our approach achieves average error of less than 21 years on collections of Wikipedia pages, extracts from history-related books and web pages, while using the total time frame of 113 years. We also demonstrate an example classification method to distinguish temporal from atemporal documents.
Content: Cf.: doi: 10.1016/j.ipm.2015.05.001.
Note: Contribution to a special topic section on "Time and information retrieval"
5. Joho, H. ; Jatowt, A. ; Blanco, R.: Temporal information searching behaviour and strategies.
In: Information processing and management. 51(2015) no.6, pp. 834-850.
Abstract: Temporal aspects have been receiving a great deal of interest in Information Retrieval and related fields. Although previous studies have proposed, designed and implemented temporal-aware systems and solutions, understanding of people's temporal information searching behaviour is still limited. This paper reports the findings of a user study that explored temporal information searching behaviour and strategies in a laboratory setting. Information needs were grouped into three temporal classes (Past, Recency, and Future) to systematically study their characteristics. The main findings of our experiment are as follows. (1) It is intuitive for people to augment topical keywords with temporal expressions such as history, recent, or future as a tactic of temporal search. (2) However, such queries produce mixed results and the success of query reformulations appears to depend on topics to a large extent. (3) Search engine interfaces should detect temporal information needs to trigger the display of temporal search options. (4) Finding a relevant Wikipedia page or similar summary page is a popular starting point of past information needs. (5) Current search engines do a good job for information needs related to recent events, but more work is needed for past and future tasks. (6) Participants found it most difficult to find future information. Searching for domain experts was a key tactic in Future search, and file types of relevant documents are different from other temporal classes. Overall, the comparison of search across temporal classes indicated that Future search was the most difficult and the least successful followed by the search for the Past and then for Recency information. This paper discusses the implications of these findings on the design of future temporal IR systems.
Content: Cf.: doi: 10.1016/j.ipm.2015.03.006.
Note: Contribution to a special topic section on "Time and information retrieval"