This database contains more than 40,000 documents on topics from the fields of descriptive cataloguing – subject indexing – information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 4 June 2021)
1. Tang, X. ; Chen, L. ; Cui, J. ; Wei, B.: Knowledge representation learning with entity descriptions, hierarchical types, and textual relations.
In: Information processing and management. 56(2019) no.3, pp. 809-822.
Abstract: Knowledge representation learning methods usually only utilize triple facts, or just consider one kind of extra information. In this paper, we propose a multi-source knowledge representation learning (MKRL) model, which can combine entity descriptions, hierarchical types, and textual relations with triple facts. Specifically, for entity descriptions, a convolutional neural network is used to get representations. For hierarchical types, weighted hierarchy encoders are used to construct the projection matrices of hierarchical types, and the projection matrix of an entity combines all hierarchical type projection matrices of the entity with the relation-specific type constraints. For textual relations, a sentence-level attention mechanism is employed to get representations. We evaluate the MKRL model on the knowledge graph completion task with the FB15k-237 dataset, and experimental results demonstrate that our model outperforms the state-of-the-art methods, which indicates the effectiveness of multi-source information for knowledge representation.
Content: Cf.: https://doi.org/10.1016/j.ipm.2019.01.005.
2. Chen, L. ; Fang, H.: An automatic method for extracting innovative ideas based on the Scopus® database.
In: Knowledge organization. 46(2019) no.3, pp. 171-186.
Abstract: The novelty of knowledge claims in a research paper can be considered an evaluation criterion for papers to supplement citations. To provide a foundation for research evaluation from the perspective of innovativeness, we propose an automatic approach for extracting innovative ideas from the abstracts of technology and engineering papers. The approach extracts N-grams as candidates based on part-of-speech tagging and determines whether they are novel by checking the Scopus® database to determine whether they had ever been presented previously. Moreover, we discussed the distributions of innovative ideas in different abstract structures. To improve the performance by excluding noisy N-grams, a list of stopwords and a list of research description characteristics were developed. We selected abstracts of articles published from 2011 to 2017 with the topic of semantic analysis as the experimental texts. Excluding noisy N-grams, considering the distribution of innovative ideas in abstracts, and suitably combining N-grams can effectively improve the performance of automatic innovative idea extraction. Unlike co-word and co-citation analysis, innovative-idea extraction aims to identify the differences in a paper from all previously published papers.
Content: DOI:10.5771/0943-7444-2019-3-171.
Subject field: Informetrics ; Computational linguistics
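The candidate-and-check idea described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: part-of-speech tagging is replaced here by a simple stopword filter, and the Scopus® lookup is replaced by a comparison against an in-memory list of earlier abstracts. The names STOPWORDS, candidate_ngrams, and novel_ngrams are hypothetical.

```python
import re

# Tiny illustrative stopword list (an assumption; the paper develops its own lists).
STOPWORDS = {"the", "a", "an", "of", "for", "and", "to", "in", "we", "is", "on"}


def candidate_ngrams(text, n_max=3):
    """Return word n-grams (n = 2..n_max) that neither start nor end with a stopword."""
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    grams = set()
    for n in range(2, n_max + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue  # treat as a noisy candidate and skip it
            grams.add(" ".join(gram))
    return grams


def novel_ngrams(abstract, prior_abstracts, n_max=3):
    """Return candidates from `abstract` that occur in none of the earlier abstracts."""
    seen = set()
    for prior in prior_abstracts:
        seen |= candidate_ngrams(prior, n_max)
    return candidate_ngrams(abstract, n_max) - seen


earlier = ["We study semantic analysis of text."]
result = novel_ngrams("We propose quantum semantic analysis.", earlier)
```

In this toy run, "semantic analysis" is rejected because it already appears in the earlier abstract, while "quantum semantic analysis" is kept as a novel candidate.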
3. Han, B. ; Chen, L. ; Tian, X.: Knowledge based collection selection for distributed information retrieval.
In: Information processing and management. 54(2018) no.1, pp. 116-128.
Abstract: Recent years have seen a great deal of work on collection selection. Most collection selection methods use a central sample index (CSI), which consists of some documents sampled from each collection, as the collection description. The limitations of these methods are the usage of 'flat' meaning representations that ignore structure and relationships among words in the CSI, and the calculation of a query-collection similarity metric that ignores the semantic distance between query words and indexed words. In this paper, we propose a knowledge based collection selection method (KBCS) to improve the collection representation and the query-collection similarity metric. KBCS models a collection as a weighted entity set and applies a novel query-collection similarity metric to select highly scored collections. Specifically, in the part of collection representation, context- and structure-based measures are employed to weight the semantic distance between two entities extracted from the sampled documents of a collection. In addition, the novel query-collection similarity metric takes the entity weight, collection size, and other factors into account. To enrich the concepts contained in a query, DBpedia-based query expansion is integrated. Finally, extensive experiments were conducted on a large webpage dataset, and DBpedia was chosen as the graph knowledge base. Experimental results demonstrate the effectiveness of KBCS.
Content: Cf.: https://doi.org/10.1016/j.ipm.2017.10.002.
4. Chen, L. ; Holsapple, C.W. ; Hsiao, S.-H. ; Ke, Z. ; Oh, J.-Y. ; Yang, Z.: Knowledge-dissemination channels : analytics of stature evaluation.
In: Journal of the Association for Information Science and Technology. 68(2017) no.4, pp. 911-930.
Abstract: Understanding relative statures of channels for disseminating knowledge is of practical interest to both generators and consumers of knowledge flows. For generators, stature can influence attractiveness of alternative dissemination routes and deliberations of those who assess generator performance. For knowledge consumers, channel stature may influence knowledge content to which they are exposed. This study introduces a novel approach to conceptualizing and measuring stature of knowledge-dissemination channels: the power-impact (PI) technique. It is a flexible technique having 3 complementary variants, giving holistic insights about channel stature by accounting for both attraction of knowledge generators to a distribution channel and degree to which knowledge consumers choose to use a channel's knowledge content. Each PI variant is expressed in terms of multiple parameters, permitting customization of stature evaluation to suit its user's preferences. In the spirit of analytics, each PI variant is driven by objective evidence of actual behaviors. The PI technique is based on 2 building blocks: (a) power that channels have for attracting results of generators' knowledge work, and (b) impact that channel contents exhibit on prospective recipients. Feasibility and functionality of the PI-technique design are demonstrated by applying it to solve a problem of journal stature evaluation for the information-systems discipline.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23725/full.
Note: This article was published online on 21 December 2016. An error was subsequently identified. This notice is included in the online and print versions to indicate that both have been corrected on 15 February 2017.
5. Xie, H. ; Li, X. ; Wang, T. ; Lau, R.Y.K. ; Wong, T.-L. ; Chen, L. ; Wang, F.L. ; Li, Q.: Incorporating sentiment into tag-based user profiles and resource profiles for personalized search in folksonomy.
In: Information processing and management. 52(2016) no.1, pp. 61-72.
Abstract: In recent years, there has been a rapid growth of user-generated data in collaborative tagging (a.k.a. folksonomy-based) systems due to the prevalence of Web 2.0 communities. To effectively assist users in finding their desired resources, it is critical to understand user behaviors and preferences. Tag-based profile techniques, which model users and resources by a vector of relevant tags, are widely employed in folksonomy-based systems. This is mainly because personalized search and recommendations can be facilitated by measuring relevance between user profiles and resource profiles. However, conventional measurements neglect the sentiment aspect of user-generated tags. In fact, tags can be very emotional and subjective, as users usually express their perceptions and feelings about the resources by tags. Therefore, it is necessary to take sentiment relevance into account in these measurements. In this paper, we present a novel generic framework, SenticRank, to incorporate various kinds of sentiment information into personalized search based on user profiles and resource profiles. In this framework, content-based sentiment ranking and collaborative sentiment ranking methods are proposed to obtain sentiment-based personalized ranking. To the best of our knowledge, this is the first work to integrate sentiment information to address the problem of personalized tag-based search in collaborative tagging systems. Moreover, we compare the proposed sentiment-based personalized search with baselines in the experiments, the results of which have verified the effectiveness of the proposed framework. In addition, we study the influence of popular sentiment dictionaries and find that SenticNet is the most prominent knowledge base for boosting the performance of personalized search in folksonomy.
Content: Cf.: doi:10.1016/j.ipm.2015.03.001.
Note: Contribution to a special issue "Emotion and sentiment in social and expressive media"
Subject field: Folksonomies ; Content analysis
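The conventional tag-based relevance measurement that the abstract above takes as its starting point (users and resources modelled as tag vectors, relevance measured between the two) can be sketched as follows. This is only the sentiment-free baseline, not SenticRank itself. Cosine similarity is an assumed choice of measure, and the names cosine and rank_resources as well as the toy tag weights are hypothetical.

```python
import math


def cosine(profile_a, profile_b):
    """Cosine similarity between two tag profiles given as {tag: weight} dicts."""
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[tag] * profile_b[tag] for tag in shared)
    norm_a = math.sqrt(sum(w * w for w in profile_a.values()))
    norm_b = math.sqrt(sum(w * w for w in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_resources(user_profile, resource_profiles):
    """Return resource names ordered by decreasing similarity to the user profile."""
    scored = [(cosine(user_profile, profile), name)
              for name, profile in resource_profiles.items()]
    # Sort by score (descending), breaking ties by name for a stable order.
    return [name for score, name in sorted(scored, key=lambda s: (-s[0], s[1]))]


user = {"jazz": 2.0, "live": 1.0}
resources = {
    "r1": {"jazz": 1.0},
    "r2": {"rock": 1.0},
    "r3": {"jazz": 2.0, "live": 1.0},
}
ranking = rank_resources(user, resources)
```

A sentiment-aware variant would add a sentiment component to each profile or to the score. How SenticRank does this is not specified in the abstract, so it is left out here.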
6. Chen, L.-C.: Next generation search engine for the result clustering technology.
In: Next generation search engines: advanced models for information retrieval. Eds.: C. Jouis et al. Hershey, PA : IGI Publishing, 2012. pp. 274-290.
Abstract: Result clustering has recently attracted a lot of attention because it provides users with a more succinct overview of relevant search results than traditional search engines do. This chapter proposes a mixed clustering method to organize all returned search results into a hierarchical tree structure. The clustering method accomplishes two main tasks: label construction and tree building. This chapter uses precision to measure the quality of clustering results. According to the results of experiments, the author preliminarily concluded that the performance of the system is better than many other well-known commercial and academic systems. This chapter makes several contributions. First, it presents a high performance system based on the clustering method. Second, it develops a divisive hierarchical clustering algorithm to organize all returned snippets into a hierarchical tree structure. Third, it performs a wide range of experimental analyses to show that almost all commercial systems are significantly better than most current academic systems.
Note: Cf.: http://www.igi-global.com/book/next-generation-search-engines/64429.
7. Chen, L. ; Zeng, J. ; Tokuda, N.: A "stereo" document representation for textual information retrieval.
In: Journal of the American Society for Information Science and Technology. 57(2006) no.6, pp. 768-774.
Abstract: A new document representation model is presented in this paper. This model is based on the idea of representing a document by two or more pictures of the document taken from different perspectives. It is shown that by applying the stereo representation model, enhanced textual retrieval performance is achieved because the new model improves the capability of capturing individual features of the document. Experiments have been conducted on two standard corpora, TIME and ADI, using the standard term vector method and the latent semantic indexing (LSI) method based upon both the stereo representation model and the traditional representation model. Statistical t-tests on the experimental results have convincingly illustrated that these methods achieve significant improvements in retrieval performances with the stereo representation model over those with the traditional representation model.
8. Gaines, B.R. ; Chen, L.-J. ; Shaw, M.L.G.: Modeling the human factors of scholarly communities supported through the Internet and World Wide Web.
In: Journal of the American Society for Information Science. 48(1997) no.11, pp. 987-1003.
Abstract: Provides a framework for analysing the utility, usability and likeability of net and web services and illustrates its application to significant aspects of supporting scholarly communities. The utility of the net and web is measured in terms of the growth of usage, and the different services involved are distinguished in terms of their specific utilities. A layered protocol model is used to model discourse through the net and is extended to encompass interaction in communities. An operational criterion for distinguishing different communities is defined in terms of the types of awareness that resource providers and users have of one another. Develops a temporal model of discourse that enables the spectrum of services ranging from real-time discourse to long-term publication to be analyzed in a unified framework. The dimensions of awareness and time are used to characterise and compare the full range of net services and model their unification through the next generation of web browsers.
Note: Contribution to a special topic issue on current research in human-computer interaction