This database contains more than 40,000 documents on topics from the areas of descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 28 April 2022)
1. Huang, X. ; Soergel, D. ; Klavans, J.L.: Modeling and analyzing the topicality of art images.
In: Journal of the Association for Information Science and Technology. 66(2015) no.8, pp. 1616-1644.
Abstract: This study demonstrates an improved conceptual foundation to support well-structured analysis of image topicality. First we present a conceptual framework for analyzing image topicality, explicating the layers, the perspectives, and the topical relevance relationships involved in modeling the topicality of art images. We adapt a generic relevance typology to image analysis by extending it with definitions and relationships specific to the visual art domain and integrating it with schemes of image-text relationships that are important for image subject indexing. We then apply the adapted typology to analyze the topical relevance relationships between 11 art images and 768 image tags assigned by art historians and librarians. The original contribution of our work is the topical structure analysis of image tags that allows the viewer to more easily grasp the content, context, and meaning of an image and quickly tune into aspects of interest; it could also guide both the indexer and the searcher to specify image tags/descriptors in a more systematic and precise manner and thus improve the match between the two parties. An additional contribution is systematically examining and integrating the variety of image-text relationships from a relevance perspective. The paper concludes with implications for relational indexing and social tagging.
Content: Cf. http://onlinelibrary.wiley.com/doi/10.1002/asi.23269/abstract.
Form covered: Images
2. Huang, X. ; Soergel, D.: Relevance: an improved framework for explicating the notion.
In: Journal of the American Society for Information Science and Technology. 64(2013) no.1, pp. 18-35.
Abstract: Synthesizing and building on many ideas from the literature, this article presents an improved conceptual framework that clarifies the notion of relevance with its many elements, variables, criteria, and situational factors. Relevance is defined as a Relationship (R) between an Information Object (I) and an Information Need (N) (which consists of Topic, User, Problem/Task, and Situation/Context) with focus on R. This defines Relevance-as-is (conceptual relevance, strong relevance). To determine relevance, an Agent A (a person or system) operates on a representation I′ of the information object and a representation N′ of the information need, resulting in relevance-as-determined (operational measure of relevance, weak relevance, an approximation). Retrieval tests compare relevance-as-determined by different agents. This article discusses and compares two major approaches to conceptualizing relevance: the entity-focused approach (focus on elaborating the entities involved in relevance) and the relationship-focused approach (focus on explicating the relational nature of relevance). The article argues that because relevance is fundamentally a relational construct, the relationship-focused approach deserves a higher priority and more attention than it has received. The article further elaborates on the elements of the framework with a focus on clarifying several critical issues in the discourse on relevance.
3. Huang, X.: Applying a generic function-based topical relevance typology to structure clinical questions and answers.
In: Journal of the American Society for Information Science and Technology. 64(2013) no.1, pp. 65-85.
Abstract: This study investigates the manifestation and utility of a generic function-based topical relevance typology adapted to the subject domain of clinical medicine. By specifying the functional role of a given piece of relevant information in the overall structure of a topic, the proposed typology provides a generic framework for integrating different pieces of clinical evidence and a multifaceted view of a clinical problem. In medical problem solving structured knowledge plays a key role. The typology provides the conceptual basis for integrating and structuring knowledge; it incorporates and goes beyond existing clinical schemes (such as PICO and illness script) and offers extra assistance for physicians as well as lay users (such as patients and caregivers) to manage the vast amount of diversified evidence, to maintain a structured view of the patient problem at hand, and ultimately to make well-grounded clinical choices. Developed as a generic topical framework across topics and domains, the typology proved useful for clinical medicine once extended with domain-specific definitions and relationships. This article reports the findings of using the adapted and extended typology in the analysis of 26 clinical questions and their evidence-based answers. The article concludes with potential applications of the typology to improve clinical information seeking, organizing, and processing.
4. Zhao, L. ; Wu, L. ; Huang, X.: Using query expansion in graph-based approach for query-focused multi-document summarization.
In: Information processing and management. 45(2009) no.1, pp. 35-41.
Abstract: This paper presents a novel query expansion method, integrated into a graph-based algorithm for query-focused multi-document summarization, to address the limited information contained in the original query. Our approach makes use of both the sentence-to-sentence relations and the sentence-to-word relations to select query-biased informative words from the document set and uses them as query expansions to improve the sentence ranking result. Compared to previous query expansion approaches, our approach can capture more relevant information with less noise. We performed experiments on the data of the Document Understanding Conference (DUC) 2005 and DUC 2006, and the evaluation results show that the proposed query expansion method can significantly improve system performance and make our system comparable to state-of-the-art systems.
5. Liu, Z. ; Huang, X.: Gender differences in the online reading environment.
In: Journal of documentation. 64(2008) no.4, pp. 616-626.
Abstract: Purpose - The purpose of this study is to explore gender differences in the online reading environment. Design/methodology/approach - Survey and analysis methods are employed. Findings - Survey results reveal that female readers have a stronger preference for paper as a reading medium than male readers, whereas male readers exhibit a greater degree of satisfaction with online reading than females. Additionally, males and females differ significantly on the dimensions of selective reading and sustained attention. Originality/value - Understanding gender differences would enable a better understanding of the changing reading behavior in the online environment and support the development of more effective digital reading devices. Factors affecting gender differences in the online reading environment are discussed, and directions for future research are suggested.
6. Liu, Y. ; Huang, X. ; An, A.: Personalized recommendation with adaptive mixture of Markov models.
In: Journal of the American Society for Information Science and Technology. 58(2007) no.12, pp. 1851-1870.
Abstract: With more and more information available on the Internet, the task of making personalized recommendations to assist the user's navigation has become increasingly important. Considering there might be millions of users with different backgrounds accessing a Web site every day, it is infeasible to build a separate recommendation system for each user. To address this problem, clustering techniques can first be employed to discover user groups. Then, user navigation patterns for each group can be discovered, to allow the adaptation of a Web site to the interest of each individual group. In this paper, we propose to model user access sequences as stochastic processes, and a mixture of Markov models based approach is taken to cluster users and to capture the sequential relationships inherent in user access histories. Several important issues that arise in constructing the Markov models are also addressed. The first issue lies in the complexity of the mixture of Markov models. To improve the efficiency of building/maintaining the mixture of Markov models, we develop a lightweight adaptive algorithm to update the model parameters without recomputing them from scratch. The second issue concerns the proper selection of training data for building the mixture of Markov models. We investigate two different training data selection strategies and perform extensive experiments to compare their effectiveness on a real dataset that is generated by a Web-based knowledge management system, Livelink.
Note: Contribution to a thematic focus on "Mining Web resources for enhancing information retrieval"
Subject area: Data mining
7. Peng, F. ; Huang, X.: Machine learning for Asian language text classification.
In: Journal of documentation. 63(2007) no.3, pp. 378-397.
Abstract: Purpose - The purpose of this research is to compare several machine learning techniques on the task of text classification for Asian languages, such as Chinese and Japanese, where no word boundary information is available in written text. The paper advocates a simple language modeling based approach for this task. Design/methodology/approach - Naïve Bayes, maximum entropy model, support vector machines, and language modeling approaches were implemented and applied to Chinese and Japanese text classification. To investigate the influence of word segmentation, different word segmentation approaches were investigated and applied to Chinese text. A segmentation-based approach was compared with the non-segmentation-based approach. Findings - There were two findings: the experiments show that statistical language modeling can significantly outperform standard techniques, given the same set of features; and it was found that classification with word level features normally yields improved classification performance, but that classification performance is not monotonically related to segmentation accuracy. In particular, classification performance may initially improve with increased segmentation accuracy, but eventually classification performance stops improving, and can in fact even decrease, after a certain level of segmentation accuracy. Practical implications - Applying the findings to real web text classification is ongoing work. Originality/value - The paper is very relevant to Chinese and Japanese information processing, e.g. webpage classification, web search.
Subject area: Computational linguistics ; Automatic classification
8. Huang, X. ; Peng, F. ; An, A. ; Schuurmans, D.: Dynamic Web log session identification with statistical language models.
In: Journal of the American Society for Information Science and Technology. 55(2004) no.14, pp. 1290-1303.
Abstract: We present a novel session identification method based on statistical language modeling. Unlike standard timeout methods, which use fixed time thresholds for session identification, we use an information theoretic approach that yields more robust results for identifying session boundaries. We evaluate our new approach by learning interesting association rules from the segmented session files. We then compare the performance of our approach to three standard session identification methods (the standard timeout method, the reference length method, and the maximal forward reference method) and find that our statistical language modeling approach generally yields superior results. However, as with every method, the performance of our technique varies with changing parameter settings. Therefore, we also analyze the influence of the two key factors in our language-modeling-based approach: the choice of smoothing technique and the language model order. We find that all standard smoothing techniques, save one, perform well, and that performance is robust to language model order.
Note: Contribution to a special issue on webometrics
Subject area: Internet ; Informetrics
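Illustration for entry 8: the abstract contrasts the authors' language-model method with the standard timeout baseline, which splits a request log wherever the gap between consecutive requests exceeds a fixed threshold. The following is a minimal sketch of that baseline only, not of the authors' method; the function name `split_sessions` and the 30-minute default threshold are assumptions made for illustration and do not come from the paper.

```python
from typing import List


def split_sessions(timestamps: List[float], timeout: float = 1800.0) -> List[List[float]]:
    """Split time-ordered request timestamps (in seconds) into sessions.

    A new session starts whenever the gap to the previous request is
    larger than `timeout`. The 1800-second default is an assumed value.
    """
    sessions: List[List[float]] = []
    for t in timestamps:
        # Open a new session for the first request or after a long gap.
        if not sessions or t - sessions[-1][-1] > timeout:
            sessions.append([t])
        else:
            sessions[-1].append(t)
    return sessions
```

For example, `split_sessions([0, 60, 4000, 4100])` yields two sessions, because the gap between 60 and 4000 seconds exceeds the threshold.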
9. Beaulieu, M.M. ; Gatford, M. ; Huang, X. ; Robertson, S.E. ; Walker, S. ; Williams, P.: Okapi at TREC-5.
In: The Fifth Text Retrieval Conference (TREC-5). Ed.: E.M. Voorhees and D.K. Harman. Gaithersburg, MD : National Institute of Standards and Technology, 1997. pp. 143-165.
(NIST special publication;)
Object: TREC ; Okapi
10. Huang, X. ; Robertson, S.E.: Application of probabilistic methods to Chinese text retrieval.
In: Journal of documentation. 53(1997) no.1, pp. 74-79.
Abstract: Discusses the use of text retrieval methods based on the probabilistic model with Chinese language material. Since Chinese text has no natural word boundaries, either a dictionary-based word segmentation method must be applied to the text, or indexing and searching must be done in terms of single Chinese characters. In either case, it becomes important to have a good way of dealing with phrases or contiguous strings of characters; the probabilistic model does not at present have such a facility. Proposes some ad hoc modifications of the probabilistic weighting function and matching method for this purpose.
Content: See also: http://www.emeraldinsight.com/10.1108/EUM0000000007193.
Note: Contribution to a thematic issue on Okapi and information retrieval research
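Illustration for entry 10: the abstract mentions indexing Chinese text by single characters and the need to handle contiguous strings of characters. The sketch below shows only what such character-based index units could look like, using an inverted index over character n-grams (n=1 for single characters, n=2 for adjacent pairs). It does not reproduce the probabilistic weighting or matching modifications proposed in the paper; the function names `char_ngrams` and `build_index` and the sample strings are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def char_ngrams(text: str, n: int) -> List[str]:
    """Return all contiguous character n-grams of `text`, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def build_index(docs: Dict[str, str], n: int = 1) -> Dict[str, Set[str]]:
    """Build an inverted index mapping each character n-gram to document ids."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in char_ngrams(text, n):
            index[gram].add(doc_id)
    return dict(index)
```

With n=1 every character becomes an index term; with n=2 a query string such as "信息" is matched as a unit, which is one simple way to approximate phrase handling without a segmentation dictionary.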