This database contains over 40,000 documents on topics in the fields of descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 4 June 2021)
1. Santana, A.F. ; Gonçalves, M.A. ; Laender, A.H.F. ; Ferreira, A.A.: Incremental author name disambiguation by exploiting domain-specific heuristics.
In: Journal of the Association for Information Science and Technology. 68(2017) no.4, pp.931-945.
Abstract: The vast majority of current author name disambiguation solutions are designed to disambiguate a whole digital library (DL) at once, considering the entire repository. However, besides being very expensive and having scalability problems, these solutions may not benefit from any manual corrections, as these may be lost whenever the entire repository has to be disambiguated again. In the real world, in which repositories are updated on a daily basis, incremental solutions that disambiguate only the newly introduced citation records are likely to produce improved results in the long run. However, the problem of incremental author name disambiguation has been largely neglected in the literature. In this article we present a new author name disambiguation method, specially designed for the incremental scenario. In our experiments, our new method largely outperforms recent incremental proposals reported in the literature as well as the current state-of-the-art non-incremental method.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23726/full.
2. Ferreira, A.A. ; Veloso, A. ; Gonçalves, M.A. ; Laender, A.H.F.: Self-training author name disambiguation for information scarce scenarios.
In: Journal of the Association for Information Science and Technology. 65(2014) no.6, pp.1257-1278.
Abstract: We present a novel 3-step self-training method for author name disambiguation, SAND (self-training associative name disambiguator), which requires no manual labeling, no parameterization (in real-world scenarios) and is particularly suitable for the common situation in which only the most basic information about a citation record is available (i.e., author names, and work and venue titles). During the first step, real-world heuristics on coauthors are able to produce highly pure (although fragmented) clusters. The most representative of these clusters are then selected to serve as training data for the third supervised author assignment step. The third step exploits a state-of-the-art transductive disambiguation method capable of detecting unseen authors not included in any training example and incorporating reliable predictions to the training data. Experiments conducted with standard public collections, using the minimum set of attributes present in a citation, demonstrate that our proposed method outperforms all representative unsupervised author grouping disambiguation methods and is very competitive with fully supervised author assignment methods. Thus, different from other bootstrapping methods that explore privileged, hard-to-obtain information such as self-citations and personal information, our proposed method produces top-notch performance with no (manual) training data or parameterization and in the presence of scarce information.
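Note: the three steps named in this abstract can be illustrated with a minimal Python sketch. It is not the SAND implementation. The record fields (`id`, `coauthors`, `title`, `venue`), the `min_size` and `min_overlap` thresholds, and the sample data are all assumptions made for illustration, and step 3 swaps in a plain term-overlap rule in place of the transductive associative classifier the abstract refers to.

```python
from collections import defaultdict


def coauthor_clusters(records):
    """Step 1 (sketch): link records of one ambiguous name that share a coauthor."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_seen = {}  # coauthor name -> id of the first record listing it
    for r in records:
        for name in r["coauthors"]:
            if name in first_seen:
                parent[find(r["id"])] = find(first_seen[name])
            else:
                first_seen[name] = r["id"]

    groups = defaultdict(list)
    for r in records:
        groups[find(r["id"])].append(r)
    return list(groups.values())


def select_training(clusters, min_size=2):
    """Step 2 (sketch): keep larger clusters as pseudo-labelled training data."""
    training = [c for c in clusters if len(c) >= min_size]
    unlabeled = [r for c in clusters if len(c) < min_size for r in c]
    return training, unlabeled


def terms(record):
    """Title words plus venue name, lower-cased (an assumed feature set)."""
    return set(record["title"].lower().split()) | {record["venue"].lower()}


def assign(record, training, min_overlap=2):
    """Step 3 (stand-in): pick the training cluster sharing the most terms.

    Returns None when no cluster shares enough terms, which here plays the
    role of flagging a possibly unseen author.
    """
    best_label, best_score = None, 0
    for label, cluster in enumerate(training):
        cluster_terms = set().union(*(terms(r) for r in cluster))
        score = len(terms(record) & cluster_terms)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= min_overlap else None


# Hypothetical citation records for a single ambiguous author name.
records = [
    {"id": 1, "coauthors": ["B. Lee"], "title": "Graph mining at scale", "venue": "KDD"},
    {"id": 2, "coauthors": ["B. Lee"], "title": "Mining frequent subgraphs", "venue": "KDD"},
    {"id": 3, "coauthors": [], "title": "Scalable graph mining", "venue": "KDD"},
    {"id": 4, "coauthors": ["D. Kim"], "title": "Protein folding simulation", "venue": "Bioinformatics"},
]
training, unlabeled = select_training(coauthor_clusters(records))
labels = [assign(r, training) for r in unlabeled]
```

In this toy data, records 1 and 2 form the only training cluster; record 3 is assigned to it through shared title and venue terms, while record 4 matches nothing and is left unassigned.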
3. Cota, R.G. ; Ferreira, A.A. ; Nascimento, C. ; Gonçalves, M.A. ; Laender, A.H.F.: An unsupervised heuristic-based hierarchical method for name disambiguation in bibliographic citations.
In: Journal of the American Society for Information Science and Technology. 61(2010) no.9, pp.1853-1870.
Abstract: Name ambiguity in the context of bibliographic citations is a difficult problem which, despite the many efforts from the research community, still has a lot of room for improvement. In this article, we present a heuristic-based hierarchical clustering method to deal with this problem. The method successively fuses clusters of citations of similar author names based on several heuristics and similarity measures on the components of the citations (e.g., coauthor names, work title, and publication venue title). During the disambiguation task, the information about fused clusters is aggregated, providing more information for the next round of fusion. In order to demonstrate the effectiveness of our method, we ran a series of experiments in two different collections extracted from real-world digital libraries and compared it, under two metrics, with four representative methods described in the literature. We present comparisons of results using each considered attribute separately (i.e., coauthor names, work title, and publication venue title) with the author name attribute and using all attributes together. These results show that our unsupervised method, when using all attributes, performs competitively against all other methods, under both metrics, losing only in one case against a supervised method, whose result was very close to ours. Moreover, such results are achieved without the burden of any training and without using any privileged information such as knowing a priori the correct number of clusters.
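Note: the successive fusion with aggregated cluster information described in this abstract might look roughly like the Python sketch below. It is an illustration under stated assumptions, not the authors' method: the two fusion heuristics (a shared coauthor, or Jaccard similarity of title terms above a hypothetical `title_threshold`), the record fields, and the sample data are invented here, and venue titles are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """A group of citation records believed to belong to one author."""
    coauthors: set = field(default_factory=set)
    title_terms: set = field(default_factory=set)
    records: list = field(default_factory=list)


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def should_fuse(c1, c2, title_threshold=0.5):
    # Assumed heuristics: a shared coauthor, or sufficiently similar titles.
    if c1.coauthors & c2.coauthors:
        return True
    return jaccard(c1.title_terms, c2.title_terms) >= title_threshold


def fuse_clusters(records, title_threshold=0.5):
    """Start with one cluster per record and fuse until nothing changes."""
    clusters = [
        Cluster(set(r["coauthors"]), set(r["title"].lower().split()), [r["id"]])
        for r in records
    ]
    changed = True
    while changed:
        changed = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if should_fuse(clusters[i], clusters[j], title_threshold):
                    other = clusters.pop(j)
                    # Aggregate the fused cluster's information so that
                    # later rounds can match on it.
                    clusters[i].coauthors |= other.coauthors
                    clusters[i].title_terms |= other.title_terms
                    clusters[i].records += other.records
                    changed = True
                    break
            if changed:
                break  # restart the scan with the updated clusters
    return clusters


# Hypothetical citation records sharing one ambiguous author name.
example = [
    {"id": 1, "coauthors": ["B. Lee"], "title": "Graph mining methods"},
    {"id": 2, "coauthors": ["B. Lee", "C. Wu"], "title": "Query optimization"},
    {"id": 3, "coauthors": ["C. Wu"], "title": "Index structures"},
    {"id": 4, "coauthors": ["D. Kim"], "title": "Protein folding simulation"},
]
result = [sorted(c.records) for c in fuse_clusters(example)]
```

Records 1 and 3 have no coauthor in common, yet they end up in the same cluster because record 2 links them once its coauthors are aggregated into the fused cluster; record 4 stays on its own.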
4. Kommers, P.A.M. ; Ferreira, A. ; Kwak, A.K.: Document management for hypermedia design.
Berlin : Springer, 1997. X, 287 pp.
Abstract: Electronic texts offer new ways to store, retrieve, update, and cross-link information. Hypermedia documents require new levels of organization and strict discipline from authors, editors, and managers. This book provides a step-by-step guide to all aspects of hypermedia development, from strategic decision-making to editing formats and production methods.
Subject area: Electronic publishing ; Hypertext
LCSH: Interactive multimedia ; Hypertext systems
RSWK: Hypermedia / Document processing (21) ; SGML ; Hypertext