This database contains over 40,000 documents on topics in the fields of descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 4 June 2021)
1. Chen, J. ; Wang, D. ; Xie, I. ; Lu, Q.: Image annotation tactics : transitions, strategies and efficiency.
In: Information processing and management. 54(2018) no.6, pp. 985-1001.
Abstract: Human interpretation of images during image annotation is complicated, but most existing interactive image annotation systems are generally operated based on social tagging, while ignoring that tags are insufficient to convey image semantics. Hence, it is critical to study the nature of image annotation behaviors and process. This study investigated annotation tactics, transitions, strategies and their efficiency during the image annotation process. A total of 90 participants were recruited to annotate nine pictures in three emotional dimensions with three interactive annotation methods. Data collected from annotation logs and verbal protocols were analyzed by applying both qualitative and quantitative methods. The findings of this study show that the cognitive process of human interpretation of images is rather complex, which reveals a probable bias in research involving image relevance feedback. Participants preferred the scroll bar (Scr) and image comparison (Cim) tactics over the rating tactic (Val), and they did fewer fine-tuning activities, which reflects the influence of perceptual level and users' cognitive load during image annotation. Annotation tactic transition analysis showed that Cim was more likely to be adopted at the beginning of each phase, and the most remarkable transition was from Cim to Scr. By applying sequence analysis, the authors found the 10 most commonly used sequences representing four types of annotation strategies, including Single tactic strategy, Tactic combination strategy, Fix mode strategy and Shift mode strategy. Furthermore, two patterns, "quarter decreasing" and "transition cost," were identified based on time data, and both multiple tactics (e.g., the combination of Cim and Scr) and fine-tuning activities were recognized as efficient tactic applications. Annotation patterns found in this study suggest more research needs to be done considering the need for multi-interactive methods and their influence.
The findings of this study generated detailed and useful guidance for the interactive design in image annotation systems, including recommending efficient tactic applications in different phases, highlighting the most frequently applied tactics and transitions, and avoiding unnecessary transitions.
Content: Cf. https://doi.org/10.1016/j.ipm.2018.06.009.
Form covered: Images
2. Ouyang, Y. ; Li, W. ; Li, S. ; Lu, Q.: Intertopic information mining for query-based summarization.
In: Journal of the American Society for Information Science and Technology. 61(2010) no.5, pp. 1062-1072.
Abstract: In this article, the authors address the problem of sentence ranking in summarization. Although most existing summarization approaches are concerned with the information embodied in a particular topic (including a set of documents and an associated query) for sentence ranking, the authors propose a novel ranking approach that incorporates intertopic information mining. Intertopic information, in contrast to intratopic information, is able to reveal pairwise topic relationships and thus can be considered as the bridge across different topics. In this article, the intertopic information is used for transferring word importance learned from known topics to unknown topics under a learning-based summarization framework. To mine this information, the authors model the topic relationship by clustering all the words in both known and unknown topics according to various kinds of word conceptual labels, which indicate the roles of the words in the topic. Based on the mined relationships, the authors develop a probabilistic model using manually generated summaries provided for known topics to predict ranking scores for sentences in unknown topics. A series of experiments have been conducted on the Document Understanding Conference (DUC) 2006 data set. The evaluation results show that intertopic information is indeed effective for sentence ranking and the resultant summarization system performs comparably well to the best-performing DUC participating systems on the same data set.
Subject area: Automatic abstracting
3. Wei, F. ; Li, W. ; Lu, Q. ; He, Y.: Applying two-level reinforcement ranking in query-oriented multidocument summarization.
In: Journal of the American Society for Information Science and Technology. 60(2009) no.10, pp. 2119-2131.
Abstract: Sentence ranking is the issue of most concern in document summarization today. While traditional feature-based approaches evaluate sentence significance and rank the sentences relying on the features that are particularly designed to characterize the different aspects of the individual sentences, the newly emerging graph-based ranking algorithms (such as the PageRank-like algorithms) recursively compute sentence significance using the global information in a text graph that links sentences together. In general, the existing PageRank-like algorithms can model well the phenomenon that a sentence is important if it is linked by many other important sentences; in other words, they are capable of modeling the mutual reinforcement among the sentences in the text graph. However, when dealing with multidocument summarization these algorithms often assemble a set of documents into one large file, so the document dimension is totally ignored. In this article we present a framework to model the two-level mutual reinforcement among sentences as well as documents. Under this framework we design and develop a novel ranking algorithm such that the document reinforcement is taken into account in the process of sentence ranking. The convergence issue is examined. We also explore an interesting and important property of the proposed algorithm. When evaluated on the DUC 2005 and 2006 query-oriented multidocument summarization datasets, significant results are achieved.
Subject area: Automatic abstracting ; Retrieval algorithms
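Note: The single-level, PageRank-like sentence ranking that the abstract above describes as the existing baseline can be sketched roughly as follows. This is only an illustration under assumptions, not the two-level algorithm proposed in the article: the function name `rank_sentences`, the similarity matrix input, the damping factor of 0.85, and the fixed iteration count are all hypothetical choices made for the sketch.

```python
def rank_sentences(similarity, damping=0.85, iterations=50):
    """Score sentences with a PageRank-like power iteration.

    similarity[i][j] is an assumed non-negative link weight between
    sentence i and sentence j; self-links are ignored.
    """
    n = len(similarity)
    scores = [1.0 / n] * n
    # Total outgoing weight of each sentence, excluding its self-link.
    out_weight = [
        sum(w for j, w in enumerate(row) if j != i)
        for i, row in enumerate(similarity)
    ]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # A sentence gains score from the sentences linking to it,
            # in proportion to their own current scores.
            incoming = sum(
                scores[j] * similarity[j][i] / out_weight[j]
                for j in range(n)
                if j != i and out_weight[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


# Sentence 0 is linked to both other sentences, which are not linked to each other.
example = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
scores = rank_sentences(example)
```

In this toy graph the first sentence ends up with the highest score because both other sentences link to it. All sentences are treated as one flat set here, which is exactly the loss of the document dimension that the article sets out to address.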
4. Yang, Y. ; Lu, Q. ; Zhao, T.: A delimiter-based general approach for Chinese term extraction.
In: Journal of the American Society for Information Science and Technology. 61(2010) no.1, pp. 111-125.
Abstract: This article addresses a two-step approach for term extraction. In the first step on term candidate extraction, a new delimiter-based approach is proposed to identify features of the delimiters of term candidates rather than those of the term candidates themselves. This delimiter-based method is much more stable and domain independent than the previous approaches. In the second step on term verification, an algorithm using link analysis is applied to calculate the relevance between term candidates and the sentences from which the terms are extracted. All information is obtained from the working domain corpus without the need for prior domain knowledge. The approach is not targeted at any specific domain and there is no need for extensive training when applying it to new domains. In other words, the method is not domain dependent and it is especially useful for resource-limited domains. Evaluations of Chinese text in two different domains show quite significant improvements over existing techniques and also verify its efficiency and its relatively domain-independent nature. The proposed method is also very effective for extracting new terms so that it can serve as an efficient tool for updating domain knowledge, especially for expanding lexicons.
5. Lee, K.H. ; Ng, M.K.M. ; Lu, Q.: Text segmentation for Chinese spell checking.
In: Journal of the American Society for Information Science. 50(1999) no.9, pp. 751-759.
Abstract: Chinese spell checking is different from its counterparts for Western languages because Chinese words in texts are not separated by spaces. Chinese spell checking in this article refers to how to identify the misuse of characters in text composition. In other words, it is error correction at the word level rather than at the character level. Before Chinese sentences are spell checked, the text is segmented into semantic units. Error detection can then be carried out on the segmented text based on thesaurus and grammar rules. Segmentation is not a trivial process due to ambiguities in the Chinese language and errors in texts. Because it is not practical to define all Chinese words in a dictionary, words not predefined must also be dealt with. The number of word combinations increases exponentially with the length of the sentence. In this article, a Block-of-Combinations (BOC) segmentation method based on frequency of word usage is proposed to reduce the word combinations from exponential growth to linear growth. From experiments carried out on Hong Kong newspapers, BOC can correctly solve 10% more ambiguities than the Maximum Match segmentation method. To make the segmentation more suitable for spell checking, user interaction is also suggested.
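Note: The Maximum Match method that the abstract above uses as its point of comparison is commonly implemented as a greedy, longest-word-first dictionary lookup. The following is a minimal sketch of that baseline only, not of the BOC method; the function name `maximum_match`, the tiny lexicon, and the `max_len` limit are assumptions chosen for illustration.

```python
def maximum_match(text, lexicon, max_len=4):
    """Greedy forward segmentation: always take the longest lexicon word."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is accepted as a fallback so that words
            # missing from the lexicon cannot stall the loop.
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy lexicon; a real system would use a full dictionary.
lexicon = {"研究", "研究生", "生命", "起源"}
print(maximum_match("研究生命起源", lexicon))
# prints ['研究生', '命', '起源']
```

The greedy choice of 研究生 ("graduate student") over 研究 ("research") followed by 生命 ("life") shows the kind of segmentation ambiguity the abstract refers to.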