This database contains more than 40,000 documents on topics in descriptive cataloging, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 4 June 2021)
1. Kang, X. ; Wu, Y. ; Ren, W.: Toward action comprehension for searching : mining actionable intents in query entities.
In: Journal of the Association for Information Science and Technology. 71(2020) no.2, pp.143-157.
Abstract: Understanding search engine users' intents has been a popular study in information retrieval, which directly affects the quality of retrieved information. One of the fundamental problems in this field is to find a connection between the entity in a query and the potential intents of the users, the latter of which would further reveal important information for facilitating the users' future actions. In this article, we present a novel research method for mining the actionable intents for search users, by generating a ranked list of the potentially most informative actions based on a massive pool of action samples. We compare different search strategies and their combinations for retrieving the action pool and develop three criteria for measuring the informativeness of the selected action samples, that is, the significance of an action sample within the pool, the representativeness of an action sample for the other candidate samples, and the diverseness of an action sample with respect to the selected actions. Our experiment, based on the Action Mining (AM) query entity data set from the Actionable Knowledge Graph (AKG) task at NTCIR-13, suggests that the proposed approach is effective in generating an informative and early-satisfying ranking of potential actions for search users.
Content: Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24220.
Subject area: Search engines ; Search tactics
2. Yang, L. ; Wu, Y.: Creating a taxonomy of earthquake disaster response and recovery for online earthquake information management.
In: Knowledge organization. 46(2019) no.2, pp.77-89.
Abstract: The goal of this study is to develop a taxonomy of earthquake response and recovery using online information resources for organizing and sharing earthquake-related online information resources. A constructivist/interpretivist research paradigm was used in the study. A combination of top-down and bottom-up approaches was used to build the taxonomy. Facet analysis of disaster management, the timeframe of disaster management, and modular design were performed when designing the taxonomy. Two case studies were done to demonstrate the usefulness of the taxonomy for organizing and sharing information. The facet-based taxonomy can be used to organize online information for browsing and navigation. It can also be used to index and tag online information resources to support searching. It creates a common language for earthquake management stakeholders to share knowledge. The top three level categories of the taxonomy can be applied to the management of other types of disasters. The taxonomy has implications for earthquake online information management, knowledge management and disaster management. The approach can be used to build taxonomies for managing online information resources on other topics (including various types of time-sensitive disaster responses). We propose a common language for sharing information on disasters, which has great social relevance.
3. Wu, Y. ; Liu, Y. ; Tsai, Y.-H.R. ; Yau, S.-T.: Investigating the role of eye movements and physiological signals in search satisfaction prediction using geometric analysis.
In: Journal of the Association for Information Science and Technology. 70(2019) no.9, pp.981-999.
Abstract: Two general challenges faced by data analysis are the existence of noise and the extraction of meaningful information from collected data. In this study, we used a multiscale framework to reduce the effects caused by noise and to extract explainable geometric properties to characterize finite metric spaces. We conducted lab experiments that integrated the use of eye-tracking, electrodermal activity (EDA), and user logs to explore users' information-seeking behaviors on search engine result pages (SERPs). Experimental results of 1,590 search queries showed that the proposed strategies effectively predicted query-level user satisfaction using EDA and eye-tracking data. The bootstrap analysis showed that combining EDA and eye-tracking data with user behavior data extracted from user logs led to a significantly better linear model fit than using user behavior data alone. Furthermore, cross-user and cross-task validations showed that our methods can be generalized to different search engine users performing different preassigned tasks.
Content: Cf.: https://onlinelibrary.wiley.com/doi/10.1002/asi.24240.
Note: Contribution to a 'Special issue on neuro-information science'.
4. Wu, Y. ; Bai, R.: An event relationship model for knowledge organization and visualization.
In: http://www.iskocus.org/NASKO2017papers/NASKO2017_paper_34.pdf [NASKO 2017, June 15-16, 2017, Champaign, IL, USA].
Abstract: An event is a specific occurrence involving participants, which is a typed, n-ary association of entities or other events, each identified as a participant in a specific semantic role in the event (Pyysalo et al. 2012; Linguistic Data Consortium 2005). Event types may vary across domains. Representing relationships between events can facilitate the understanding of knowledge in complex systems (such as economic systems, human body, social systems). In the simplest form, an event can be represented as Entity A
Entity B. This paper evaluates several knowledge organization and visualization models and tools, such as concept maps (Cmap), topic maps (Ontopia), network analysis models (Gephi), and ontology (Protégé), then proposes an event relationship model that aims to integrate the strengths of these models, and can represent complex knowledge expressed in events and their relationships.
Content: Contribution to: NASKO 2017: Visualizing Knowledge Organization: Bringing Focus to Abstract Realities. The sixth North American Symposium on Knowledge Organization (NASKO 2017), June 15-16, 2017, in Champaign, IL, USA.
5. Li, J. ; Zhang, P. ; Song, D. ; Wu, Y.: Understanding an enriched multidimensional user relevance model by analyzing query logs.
In: Journal of the Association for Information Science and Technology. 68(2017) no.12, pp.2743-2754.
Abstract: Modeling multidimensional relevance in information retrieval (IR) has attracted much attention in recent years. However, most existing studies are conducted through relatively small-scale user studies, which may not reflect a real-world and natural search scenario. In this article, we propose to study the multidimensional user relevance model (MURM) on large scale query logs, which record users' various search behaviors (e.g., query reformulations, clicks and dwelling time, etc.) in natural search settings. We advance an existing MURM model (including five dimensions: topicality, novelty, reliability, understandability, and scope) by providing two additional dimensions, that is, interest and habit. The two new dimensions represent personalized relevance judgment on retrieved documents. Further, for each dimension in the enriched MURM model, a set of computable features are formulated. By conducting extensive document ranking experiments on Bing's query logs and TREC session Track data, we systematically investigated the impact of each dimension on retrieval performance and gained a series of insightful findings which may bring benefits for the design of future IR systems.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23868/full.
6. Du, J. ; Tang, X. ; Wu, Y.: The effects of research level and article type on the differences between citation metrics and F1000 recommendations.
In: Journal of the Association for Information Science and Technology. 67(2016) no.12, pp.3008-3021.
Abstract: F1000 recommendations were assessed as a potential data source for research evaluation, but the reasons for differences between F1000 Article Factor (FFa scores) and citations remain unexplored. By linking recommendations for 28,254 publications in F1000 with citations in Scopus, we investigated the effect of research level (basic, clinical, mixed) and article type on the internal consistency of assessments based on citations and FFa scores. The research level has little impact on the differences between the 2 evaluation tools, while article type has a big effect. These 2 measures differ significantly for 2 groups: (a) nonprimary research or evidence-based research are more highly cited but not highly recommended, while (b) translational research or transformative research are more highly recommended but have fewer citations. This can be expected, since citation activity is usually practiced by academic authors while the potential for scientific revolutions and the suitability for clinical practice of an article should be investigated from a practitioners' perspective. We conclude with a recommendation that the application of bibliometric approaches in research evaluation should consider the proportion of 3 types of publications: evidence-based research, transformative research, and translational research. The latter 2 types are more suitable for assessment through peer review.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23548/full.
7. Huang, M.-H. ; Wu, L.-L. ; Wu, Y.-C.: A study of research collaboration in the pre-web and post-web stages : a coauthorship analysis of the information systems discipline.
In: Journal of the Association for Information Science and Technology. 66(2015) no.4, pp.778-797.
Abstract: To explore the possible facilitative role of the Internet in the process of research collaboration, this study endeavored to systematically compare the phenomenon of co-authorship and the impacts of co-authorship between pre-web and post-web stages in the field of information systems. Three hypotheses were proposed in this study. First, research collaboration increases in the post-web stage relative to the pre-web stage. Second, research collaboration is positively related to research impact, operationally defined as the number of citations. Lastly, the positive relationship between research collaboration and research impact is stronger in the post-web stage than that in the pre-web stage. Articles published in the field of information systems in both time periods were collected to test the hypotheses. The empirical results strongly support H1 and H2, showing that co-authorship increases in the post-web stage, and positively correlates with citations received by information systems articles. The positive effects of interdisciplinary collaborations and collaborations among multiple authors are enhanced in the post-web stage, but such enhancement is not found for international collaboration. H3 is partially supported.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23196/abstract.
8. Wu, Y. ; Yang, L.: Construction and evaluation of an oil spill semantic relation taxonomy for supporting knowledge discovery.
In: Knowledge organization. 42(2015) no.4, pp.222-231.
Abstract: The paper presents the rationale, significance, method and procedure of building a taxonomy of semantic relations in the oil spill domain for supporting knowledge discovery through inference. Difficult problems during the development of the taxonomy are discussed and partial solutions are proposed. A preliminary functional evaluation of the taxonomy for supporting knowledge discovery was performed. Durability and expansibility of the taxonomy were evaluated by using the taxonomy to classify the terms in a biomedical relation ontology. The taxonomy was found to have full expansibility and a high degree of durability. The study proposes more research problems than solutions.
Content: Papers from the Fifth North American Symposium on Knowledge Organization (NASKO 2015), sponsored by ISKO-Canada/US, June 18-19, 2015, Los Angeles, California. Cf.: http://www.ergon-verlag.de/isko_ko/downloads/ko_42_2015_4.pdf.
Subject area: Knowledge representation ; Theory of verbal documentation languages
9. Wu, Y. ; Lehman, A. ; Dunaway, D.J.: Evaluations of a large topic map as a knowledge organization tool for supporting self-regulated learning.
In: Knowledge organization. 42(2015) no.6, pp.386-398.
Abstract: A large topic map was created to facilitate understanding of the impacts of the 2010 Gulf of Mexico Oil Spill Incident. The topic map has both a text and a graphical interface, which complement each other. A formative evaluation and two summative evaluations were conducted, as qualitative studies, to assess the usefulness and usability of the large topic maps for facilitating self-regulated learning. The topic maps were found useful for knowledge fusion and discovery, and can be useful when undertaking interdisciplinary and multidisciplinary research. Users reported some usability issues with the graphical topic map, including information overload and a cluttered display when showing a large number of topics and their associated topics. The text topic map was found easier to use because it displays topics, relationships and references in a linear view.
Content: Cf.: http://www.ergon-verlag.de/isko_ko/downloads/ko_42_2015_6.
Object: Topic maps
10. Xiao, C. ; Zhou, F. ; Wu, Y.: Predicting audience gender in online content-sharing social networks.
In: Journal of the American Society for Information Science and Technology. 64(2013) no.6, pp.1284-1297.
Abstract: Understanding the behavior and characteristics of web users is valuable when improving information dissemination, designing recommendation systems, and so on. In this work, we explore various methods of predicting the ratio of male viewers to female viewers on YouTube. First, we propose and examine two hypotheses relating to audience consistency and topic consistency. The former means that videos made by the same authors tend to have similar male-to-female audience ratios, whereas the latter means that videos with similar topics tend to have similar audience gender ratios. To predict the audience gender ratio before video publication, two features based on these two hypotheses and other features are used in multiple linear regression (MLR) and support vector regression (SVR). We find that these two features are the key indicators of audience gender, whereas other features, such as gender of the user and duration of the video, have limited relationships. Second, another method is explored to predict the audience gender ratio. Specifically, we use the early comments collected after video publication to predict the ratio via simple linear regression (SLR). The experiments indicate that this model can achieve better performance by using a few early comments. We also observe that the correlation between the number of early comments (cost) and the predictive accuracy (gain) follows the law of diminishing marginal utility. We build the functions of these elements via curve fitting to find the appropriate number of early comments (approximately 250) that can achieve maximum gain at minimum cost.
11. Wu, Y.: Indexing historical, political cartoons for retrieval.
In: Knowledge organization. 40(2013) no.5, pp.283-294.
Abstract: Previous literature indicates that political cartoons are difficult to index because they have a subjective nature, and indexers may fail to understand the content of a cartoon or may interpret its content subjectively. This study aims to investigate the indexability of historical, political cartoons and the variables that affect the indexing results. It proposes an indexing scheme for describing historical, political cartoons, and uses that indexing scheme to conduct indexing experiments. Through indexing experiments and statistical analysis, three variables, which affect the indexing results, are identified: indexers, indexing fields, and cartoons. There is a statistically significant difference in inter-indexer consistency on indexers, indexing fields, and cartoons. The paper argues that historical, political cartoons can be indexed if knowledgeable indexers are available, and the context of the cartoons is provided. It also proposes a mediated, collaborative indexing approach to indexing such materials.
Content: Cf.: http://www.ergon-verlag.de/isko_ko/downloads/ko_40_2013_5_a.pdf.
Form covered: Cartoons
12. Lee, Y.-S. ; Wu, Y.-C. ; Yang, J.-C.: BVideoQA : Online English/Chinese bilingual video question answering.
In: Journal of the American Society for Information Science and Technology. 60(2009) no.3, pp.509-525.
Abstract: This article presents a bilingual video question answering (QA) system, namely BVideoQA, which allows users to retrieve Chinese videos through English or Chinese natural language questions. Our method first extracts an optimal one-to-one string pattern matching according to the proposed dense and long N-gram match. On the basis of the matched string patterns, it gives a passage score based on our term-weighting scheme. The main contributions of this approach to multimedia information retrieval literatures include: (a) development of a truly bilingual video QA system, (b) presentation of a robust bilingual passage retrieval algorithm to handle no-word-boundary languages such as Chinese and Japanese, (c) development of a large-scale bilingual video QA corpus for system evaluation, and (d) comparisons of seven top-performing retrieval methods under the fair conditions. The experimental studies indicate that our method is superior to other existing approaches in terms of precision and main rank reciprocal rates. When ported to English, encouraging empirical results also are obtained. Our method is very important to Asian-like languages since the development of a word tokenizer is optional.
13. Li, Q. ; Wu, Y.-f.B.: People search : searching people sharing similar interests from the Web.
In: Journal of the American Society for Information Science and Technology. 59(2008) no.1, pp.111-125.
Abstract: On the Web, there are limited ways of finding people sharing similar interests with a given person. The current methods are either ineffective or time consuming. In this paper, we present a new approach for searching people sharing similar interests from the Web. Given a person, to find similar people from the Web, there are two major research issues: person representation and matching persons. In this study, we propose a person representation method which uses a person's website to represent this person. Our design of matching process takes person representation into consideration to allow the same representation to be used when composing the query. Under this person representation method, the proposed algorithm integrates textual content and hyperlink information of all the pages belonging to a personal website to represent a person and match persons. Other algorithms are also explored and compared to the proposed algorithm. Experimental results are presented.
14. Wu, Y.-f.B. ; Li, Q. ; Bot, R.S. ; Chen, X.: Finding nuggets in documents : a machine learning approach.
In: Journal of the American Society for Information Science and Technology. 57(2006) no.6, pp.740-752.
Abstract: Document keyphrases provide a concise summary of a document's content, offering semantic metadata summarizing a document. They can be used in many applications related to knowledge management and text mining, such as automatic text summarization, development of search engines, document clustering, document classification, thesaurus construction, and browsing interfaces. Because only a small portion of documents have keyphrases assigned by authors, and it is time-consuming and costly to manually assign keyphrases to documents, it is necessary to develop an algorithm to automatically generate keyphrases for documents. This paper describes a Keyphrase Identification Program (KIP), which extracts document keyphrases by using prior positive samples of human identified phrases to assign weights to the candidate keyphrases. The logic of our algorithm is: The more keywords a candidate keyphrase contains and the more significant these keywords are, the more likely this candidate phrase is a keyphrase. KIP's learning function can enrich the glossary database by automatically adding new identified keyphrases to the database. KIP's personalization feature will let the user build a glossary database specifically suitable for the area of his/her interest. The evaluation results show that KIP's performance is better than the systems we compared to and that the learning function is effective.
Subject area: Automatic abstracting
15. Allen, R.B. ; Wu, Y.: Metrics for the scope of a collection.
In: Journal of the American Society for Information Science and Technology. 56(2005) no.12, pp.1243-1249.
Abstract: Some collections cover many topics, while others are narrowly focused on a limited number of topics. We introduce the concept of the "scope" of a collection of documents and we compare two ways of measuring it. These measures are based on the distances between documents. The first uses the overlap of words between pairs of documents. The second measure uses a novel method that calculates the semantic relatedness to pairs of words from the documents. Those values are combined to obtain an overall distance between the documents. The main validation for the measures compared Web pages categorized by Yahoo. Sets of pages sampled from broad categories were determined to have a higher scope than sets derived from subcategories. The measure was significant and confirmed the expected difference in scope. Finally, we discuss other measures related to scope.
16. Liu, J. ; Wu, Y. ; Zhou, L.: A hybrid method for abstracting newspaper articles.
In: Journal of the American Society for Information Science. 50(1999) no.13, pp.1234-1245.
Abstract: This paper introduces a hybrid method for abstracting Chinese text. It integrates the statistical approach with language understanding. Some linguistic heuristics and segmentation are also incorporated into the abstracting process. The prototype system is of a multipurpose type catering for various users with different requirements. Initial responses show that the proposed method contributes much to the flexibility and accuracy of the automatic Chinese abstracting system. In practice, the present work provides a path to developing an intelligent Chinese system for automating the information
Subject area: Automatic abstracting
Form covered: Newspapers