This database contains over 40,000 documents on topics in the fields of descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 4 June 2021)
1. Li, W. ; Zheng, Y. ; Zhan, Y. ; Feng, R. ; Zhang, T. ; Fan, W.: Cross-modal retrieval with dual multi-angle self-attention.
In: Journal of the Association for Information Science and Technology. 72(2021) no.1, pp.46-65.
Abstract: In recent years, cross-modal retrieval has been a popular research topic in both fields of computer vision and natural language processing. There is a huge semantic gap between different modalities on account of heterogeneous properties. How to establish the correlation among different modality data faces enormous challenges. In this work, we propose a novel end-to-end framework named Dual Multi-Angle Self-Attention (DMASA) for cross-modal retrieval. Multiple self-attention mechanisms are applied to extract fine-grained features for both images and texts from different angles. We then integrate coarse-grained and fine-grained features into a multimodal embedding space, in which the similarity degrees between images and texts can be directly compared. Moreover, we propose a special multistage training strategy, in which the preceding stage can provide a good initial value for the succeeding stage and make our framework work better. Very promising experimental results over the state-of-the-art methods can be achieved on three benchmark datasets of Flickr8k, Flickr30k, and MSCOCO.
Content: Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24373.
2. Zhang, Y. ; Li, X. ; Fan, W.: User adoption of physician's replies in an online health community : an empirical study.
In: Journal of the Association for Information Science and Technology. 71(2020) no.10, pp.1179-1191.
Abstract: Online health question-and-answer consultation with physicians is becoming a common phenomenon. However, it is unclear how users identify the most satisfying reply. Based on the dual-process theory of knowledge adoption, we developed a conceptual model and empirical method to study which factors influence adoption of a reply. We extracted 6 variables for argument quality (Ease of understanding, Relevance, Completeness, Objectivity, Timeliness, Structure) and 4 for source credibility (Physician's online experience, Physician's offline expertise, Hospital location, Hospital level). The empirical results indicate that both central and peripheral routes affect user's adoption of a response. Physician's offline expertise negatively affects user's adoption decision, while physician's online experience positively affects it; this effect is positively moderated by user involvement.
3. Zeng, M.L. ; Fan, W.: SKOS and its application in transferring traditional thesauri into networked knowledge organization systems.
In: New perspectives on subject indexing and classification: essays in honour of Magda Heiner-Freiling. Ed.: K. Knull-Schlomann et al. Leipzig : Deutsche Nationalbibliothek, 2008. pp.157-166.
Abstract: In remembrance of Magda Heiner-Freiling who dedicated her professional efforts in promoting the sharing of subject access among world libraries, we sincerely wish to add our contribution to the endeavor she started and dreamed of finishing by writing this paper in Chinese, introducing SKOS and discussing its applications in transferring the largest controlled vocabulary in China, the Chinese Classified Thesaurus (CCT), into a SKOS-based knowledge organization system (KOS). The paper discusses the conceptual models of concept-based and term-based systems, the converting solutions of CCT, and the potential usage of a KOS registry built on SKOS and other Web-based protocols and technologies.
Note: In Chinese
Object: SKOS ; Chinese Classified Thesaurus
4. Zeng, M.L. ; Fan, W. ; Lin, X.: SKOS for an integrated vocabulary structure.
In: Metadata for semantic and social applications : proceedings of the International Conference on Dublin Core and Metadata Applications, Berlin, 22 - 26 September 2008, DC 2008: Berlin, Germany / ed. by Jane Greenberg and Wolfgang Klas. Göttingen : Univ.-Verl., 2008. pp.200-201.
Abstract: In order to transfer the Chinese Classified Thesaurus (CCT) into a machine-processable format and provide CCT-based Web services, a pilot study has been conducted in which a variety of selected CCT classes and mapped thesaurus entries are encoded with SKOS. OWL and RDFS are also used to encode the same contents for the purposes of feasibility and cost-benefit comparison. CCT is a collected effort led by the National Library of China. It is an integration of the national standards Chinese Library Classification (CLC) 4th edition and Chinese Thesaurus (CT). As a manually created mapping product, CCT provides for each of the classes the corresponding thesaurus terms, and vice versa. The coverage of CCT includes four major clusters: philosophy, social sciences and humanities, natural sciences and technologies, and general works. There are 22 main-classes, 52,992 sub-classes and divisions, 110,837 preferred thesaurus terms, 35,690 entry terms (non-preferred terms), and 59,738 pre-coordinated headings (Chinese Classified Thesaurus, 2005). Major challenges of encoding this large vocabulary come from its integrated structure. CCT is a result of the combination of two structures (illustrated in Figure 1): a thesaurus that uses ISO-2788 standardized structure and a classification scheme that is basically enumerative, but provides some flexibility for several kinds of synthetic mechanisms. Other challenges include the complex relationships caused by differences of granularities of two original schemes and their presentation with various levels of SKOS elements; as well as the diverse coordination of entries due to the use of auxiliary tables and pre-coordinated headings derived from combining classes, subdivisions, and thesaurus terms, which do not correspond to existing unique identifiers. The poster reports the progress, shares the sample SKOS entries, and summarizes problems identified during the SKOS encoding process.
Although OWL Lite and OWL Full provide richer expressiveness, the cost-benefit issues and the final purposes of encoding CCT raise questions of using such approaches.
Note: Cf.: http://dcpapers.dublincore.org/ojs/pubs/article/view/935/931.
Subject area: Knowledge representation ; Semantic Web
Object: SKOS ; CCT ; OWL
5. Radev, D. ; Fan, W. ; Qu, H. ; Wu, H. ; Grewal, A.: Probabilistic question answering on the Web.
In: Journal of the American Society for Information Science and Technology. 56(2005) no.6, pp.571-583.
Abstract: Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this article, we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), uses proximity and question type features and achieves a total reciprocal document rank of .20 on the TREC8 corpus. Our techniques have been implemented as a Web-accessible system, called NSIR.
Subject area: Search engines ; Retrieval algorithms ; Computational linguistics ; Language retrieval
6. Fan, W. ; Fox, E.A. ; Pathak, P. ; Wu, H.: The effects of fitness functions on genetic programming-based ranking discovery for Web search.
In: Journal of the American Society for Information Science and Technology. 55(2004) no.7, pp.628-636.
Abstract: Genetic-based evolutionary learning algorithms, such as genetic algorithms (GAs) and genetic programming (GP), have been applied to information retrieval (IR) since the 1980s. Recently, GP has been applied to a new IR task, discovery of ranking functions for Web search, and has achieved very promising results. However, in our prior research, only one fitness function has been used for GP-based learning. It is unclear how other fitness functions may affect ranking function discovery for Web search, especially since it is well known that choosing a proper fitness function is very important for the effectiveness and efficiency of evolutionary algorithms. In this article, we report our experience in contrasting different fitness function designs on GP-based learning using a very large Web corpus. Our results indicate that the design of fitness functions is instrumental in performance improvement. We also give recommendations on the design of fitness functions for genetic-based information retrieval experiments.
7. Fan, W. ; Gordon, M.D. ; Pathak, P.: A generic ranking function discovery framework by genetic programming for information retrieval.
In: Information processing and management. 40(2004) no.4, pp.587-602.
Abstract: Ranking functions play a substantial role in the performance of information retrieval (IR) systems and search engines. Although there are many ranking functions available in the IR literature, various empirical evaluation studies show that ranking functions do not perform consistently well across different contexts (queries, collections, users). Moreover, it is often difficult and very expensive for human beings to design optimal ranking functions that work well in all these contexts. In this paper, we propose a novel ranking function discovery framework based on Genetic Programming and show through various experiments how this new framework helps automate the ranking function design/discovery process.
8. Fan, W. ; Luo, M. ; Wang, L. ; Xi, W. ; Fox, E.A.: Tuning before feedback : combining ranking discovery and blind feedback for robust retrieval.
In: SIGIR'04: Proceedings of the 27th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Ed.: K. Järvelin et al. New York, NY : ACM Press, 2004. pp.138-145.
9. Radev, D.R. ; Libner, K. ; Fan, W.: Getting answers to natural language questions on the Web.
In: Journal of the American Society for Information Science and Technology. 53(2002) no.5, pp.359-364.
Abstract: Seven hundred natural language questions from TREC-8 and TREC-9 were sent by Radev, Libner, and Fan to each of nine web search engines. The top 40 sites returned by each system were stored for evaluation of their productivity of correct answers. Each question per engine was scored as the sum of the reciprocal ranks of identified correct answers. The large number of zero scores gave a positive skew violating the normality assumption for ANOVA, so values were transformed to zero for no hit and one for one or more hits. The non-zero values were then square-root transformed to remove the remaining positive skew. Interactions were observed between search engine and answer type (name, place, date, et cetera), search engine and number of proper nouns in the query, search engine and the need for time limitation, and search engine and total query words. All effects were significant. Shortest queries had the highest mean scores. One or more proper nouns present provides a significant advantage. Non-time dependent queries have an advantage. Place, name, person, and text description had mean scores between .85 and .9 with date at .81 and number at .59. There were significant differences in score by search engine. Search engines found at least one correct answer in between 87.7 and 75.45 of the cases. Google and Northern Light were just short of a 90% hit rate. No evidence indicated that a particular engine was better at answering any particular sort of question.
Subject area: Retrieval studies ; Search engines
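The scoring described in the abstract of entry 9 (each question scored per engine as the sum of the reciprocal ranks of correct answers among the top 40 results, then reduced to zero for no hit and one for one or more hits) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the `cutoff` parameter, and the example ranks are assumptions made here for clarity.

```python
from typing import Iterable


def reciprocal_rank_score(correct_ranks: Iterable[int], cutoff: int = 40) -> float:
    """Sum 1/rank over the 1-based ranks at which a correct answer was found.

    Ranks outside 1..cutoff are ignored; the default of 40 mirrors the
    "top 40" limit mentioned in the abstract (assumed to apply per question).
    """
    return sum((1.0 / r for r in correct_ranks if 1 <= r <= cutoff), 0.0)


def hit_indicator(score: float) -> int:
    """Binary transform: 0 for no hit, 1 for one or more hits."""
    return 1 if score > 0 else 0


# Hypothetical data: correct answers at ranks 1, 4, and 50 (the last is past the cutoff).
example = reciprocal_rank_score([1, 4, 50])
print(example)                 # 1/1 + 1/4 = 1.25
print(hit_indicator(example))  # 1
```

A question with no correct answer in the top 40 scores 0.0 and maps to a hit indicator of 0, which is the case the abstract reports as producing the large number of zero scores.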