This database contains over 40,000 documents on topics in the fields of descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 4 June 2021)
1. Vilares, J. ; Alonso, M.A. ; Doval, Y. ; Vilares, M.: Studying the effect and treatment of misspelled queries in Cross-Language Information Retrieval.
In: Information processing and management. 52(2016) no.4, pp. 646-657.
Abstract: General graph random walk has been successfully applied in multi-document summarization, but processing documents this way has some limitations. In this paper, we propose a novel hypergraph-based vertex-reinforced random walk framework for multi-document summarization. The framework first exploits the Hierarchical Dirichlet Process (HDP) topic model to learn a word-topic probability distribution in sentences. Then the hypergraph is used to capture both the cluster relationship based on the word-topic probability distribution and the pairwise similarity among sentences. Finally, a time-variant random walk algorithm for hypergraphs is developed to rank sentences, ensuring sentence diversity in summaries through vertex reinforcement. Experimental results on the publicly available dataset demonstrate the effectiveness of our framework.
Content: Cf.: http://www.sciencedirect.com/science/article/pii/S0306457315001478.
Subject area: Multilingual problems
2. Vilares, D. ; Alonso, M.A. ; Gómez-Rodríguez, C.: On the usefulness of lexical and syntactic processing in polarity classification of Twitter messages.
In: Journal of the Association for Information Science and Technology. 66(2015) no.9, pp. 1799-1816.
Abstract: Millions of micro texts are published every day on Twitter. Identifying the sentiment present in them can be helpful for measuring the frame of mind of the public, their satisfaction with respect to a product, or their support of a social event. In this context, polarity classification is a subfield of sentiment analysis focused on determining whether the content of a text is objective or subjective, and in the latter case, if it conveys a positive or a negative opinion. Most polarity detection techniques tend to take into account individual terms in the text and even some degree of linguistic knowledge, but they do not usually consider syntactic relations between words. This article explores how relating lexical, syntactic, and psychometric information can be helpful to perform polarity classification on Spanish tweets. We provide an evaluation for both shallow and deep linguistic perspectives. Empirical results show an improved performance of syntactic approaches over pure lexical models when using large training sets to create a classifier, but this tendency is reversed when small training collections are used.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23284/abstract.
Subject area: Automatic indexing ; Automatic classification
3. Vilares, J. ; Alonso, M.A. ; Vilares, M.: Extraction of complex index terms in non-English IR : a shallow parsing based approach.
In: Information processing and management. 44(2008) no.4, pp. 1517-1537.
Abstract: The performance of information retrieval systems is limited by the linguistic variation present in natural language texts. Word-level natural language processing techniques have been shown to be useful in reducing this variation. In this article, we summarize our work on the extension of these techniques for dealing with phrase-level variation in European languages, taking Spanish as a case in point. We propose the use of syntactic dependencies as complex index terms in an attempt to solve the problems deriving from both syntactic and morpho-syntactic variation and, in this way, to obtain more precise index terms. Such dependencies are obtained through a shallow parser based on cascades of finite-state transducers in order to reduce as far as possible the overhead due to this parsing process. The use of different sources of syntactic information, queries or documents, has been also studied, as has the restriction of the dependencies applied to those obtained from noun phrases. Our approaches have been tested using the CLEF corpus, obtaining consistent improvements with regard to classical word-level non-linguistic techniques. Results show, on the one hand, that syntactic information extracted from documents is more useful than that from queries. On the other hand, it has been demonstrated that by restricting dependencies to those corresponding to noun phrases, important reductions of storage and management costs can be achieved, albeit at the expense of a slight reduction in performance.
4. Alonso, M.A.L.: ¬Los tesauros conceptuales como herramienta de precision en los sistemas de organizacion cientifica.
In: Revista interamericana de bibliotecologia. 22(1999) no.1, pp. 21-35.
Note: Translation of title: Conceptual thesauri as precision tools in scientific organisation systems
6. Alonso, M.A.L.: ¬Un tesauro conceptual para la recuperacion de la informacion juridica comercial.
In: Revista Española de Documentación Científica. 21(1998) no.2, pp. 164-173.
Abstract: The Commercial Law Thesaurus was elaborated as an application of a doctoral investigation into the application of a conceptual model to legal information retrieval. Justifies the need for such a thesaurus, briefly describes the genesis of its construction, and outlines conclusions obtained. Validates the efficiency of the thesaurus in the monitoring carried out in the legal database Juridoc and comments on the display of results as categorized, alphabetical, permuted and conceptual indexes.
Note: Translation of title: A cognitive thesaurus for retrieval of legal commercial information
Subject area: Conception and application of the thesaurus principle