This database contains more than 40,000 documents on topics in descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 28 April 2022)
1. Buccio, E. Di ; Melucci, M. ; Moro, F.: Detecting verbose queries and improving information retrieval.
In: Information processing and management. 50(2014) no.2, pp. 342-360.
Abstract: Although most of the queries submitted to search engines are composed of a few keywords and have a length that ranges from three to six words, more than 15% of the total volume of the queries are verbose, introduce ambiguity and cause topic drifts. We consider verbosity a different property of queries from length since a verbose query is not necessarily long: it might be succinct, and a short query might be verbose. This paper proposes a methodology to automatically detect verbose queries and conditionally modify queries. The methodology proposed in this paper exploits state-of-the-art classification algorithms, combines concepts from a large linguistic database and uses a topic gisting algorithm we designed for verbose query modification purposes. Our experimental results have been obtained using the TREC Robust track collection, thirty topics classified by difficulty degree, four queries per topic classified by verbosity and length, and human assessment of query verbosity. Our results suggest that the methodology for query modification conditioned to query verbosity detection and topic gisting is significantly effective and that query modification should be refined when topic difficulty and query verbosity are considered since these two properties interact and query verbosity is not straightforwardly related to query length.
Content: Cf. doi: 10.1016/j.ipm.2013.09.003.
Subject area: Semantic context in indexing and retrieval
2. Melucci, M.: Contextual search : a computational framework.
Boston, MA : Now Publ., 2012. 152 pp.
(Foundations and trends® in information retrieval; 6, 4/5)
Abstract: The growing availability of data in electronic form, the expansion of the World Wide Web and the accessibility of computational methods for large-scale data processing have allowed researchers in Information Retrieval (IR) to design systems which can effectively and efficiently constrain search within the boundaries given by context, thus transforming classical search into contextual search. Contextual Search: A Computational Framework introduces contextual search within a computational framework based on contextual variables, contextual factors and statistical models. It describes how statistical models can process contextual variables to infer the contextual factors underlying the current search context. It also provides background to the subject by: placing it among other surveys on relevance, interaction, context, and behaviour; providing a description of the contextual variables used for implementing the statistical models which represent and predict relevance and contextual factors; and providing an overview of the evaluation methodologies and findings relevant to this subject. Contextual Search: A Computational Framework is a highly recommended read, both for beginners who are embarking on research in this area and as a useful reference for established IR researchers.
Content: Table of contents: 1. Introduction 2. Query Intent 3. Personal Interest 4. Document Quality 5. Contextual Search Evaluation 6. Conclusions Acknowledgements References A. Implementations
Subject area: Semantic context in indexing and retrieval
3. Melucci, M. ; Orio, N.: Design, implementation, and evaluation of a methodology for automatic stemmer generation.
In: Journal of the American Society for Information Science and Technology. 58(2007) no.5, pp. 673-686.
Abstract: The authors describe a statistical approach based on hidden Markov models (HMMs) for generating stemmers automatically. The proposed approach requires little effort to insert new languages in the system even if minimal linguistic knowledge is available. This is a key advantage especially for digital libraries, which are often developed for a specific institution or government, because the program can manage a large number of documents written in local languages. The evaluation described in the article shows that the stemmers implemented by means of HMMs are as effective as those based on linguistic rules.
4. Bacchin, M. ; Ferro, N. ; Melucci, M.: A probabilistic model for stemmer generation.
In: Information processing and management. 41(2005) no.1, pp. 121-137.
Abstract: In this paper we present a language-independent probabilistic model which can automatically generate stemmers. Stemmers can improve the retrieval effectiveness of information retrieval systems; however, the design and implementation of stemmers require laborious effort because documents and queries are often written or spoken in several different languages. The probabilistic model proposed in this paper aims at the development of stemmers used for several languages. The proposed model describes the mutual reinforcement relationship between stems and derivations and then provides a probabilistic interpretation. A series of experiments shows that the stemmers generated by the probabilistic model are as effective as the ones based on linguistic knowledge.
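Illustration: the mutual reinforcement idea mentioned in the abstract above (good stems tend to combine with good derivations, and vice versa) can be sketched as a HITS-style iteration over prefix/suffix splits. This is a minimal sketch and not the authors' model: the function names, the treatment of the suffix left by each split as the "derivation", the uniform initialization, and the normalization are all assumptions made for illustration, and the paper's probabilistic interpretation is not reproduced here.

```python
def split_candidates(word):
    """Return every (prefix, suffix) split of a word with both parts non-empty."""
    return [(word[:i], word[i:]) for i in range(1, len(word))]


def mutual_reinforcement(words, iterations=20):
    """Score candidate stems and suffixes so that each reinforces the other.

    Hypothetical helper: a stem scores high when it combines with
    high-scoring suffixes, and a suffix scores high when it combines
    with high-scoring stems. Scores are renormalized to sum to 1.
    """
    # Each edge links a candidate stem to the suffix it was observed with.
    edges = sorted({pair for w in set(words) for pair in split_candidates(w)})
    stems = {s for s, _ in edges}
    suffixes = {x for _, x in edges}
    stem_score = {s: 1.0 / len(stems) for s in stems}
    suffix_score = {x: 1.0 / len(suffixes) for x in suffixes}

    for _ in range(iterations):
        # Suffix update: sum the scores of the stems it attaches to.
        new_suffix = dict.fromkeys(suffixes, 0.0)
        for stem, suffix in edges:
            new_suffix[suffix] += stem_score[stem]
        total = sum(new_suffix.values())
        suffix_score = {x: v / total for x, v in new_suffix.items()}

        # Stem update: sum the scores of the suffixes it attaches to.
        new_stem = dict.fromkeys(stems, 0.0)
        for stem, suffix in edges:
            new_stem[stem] += suffix_score[suffix]
        total = sum(new_stem.values())
        stem_score = {s: v / total for s, v in new_stem.items()}

    return stem_score, suffix_score


def best_split(word, stem_score, suffix_score):
    """Pick the split whose stem and suffix scores have the largest product.

    Assumes the word has at least two characters and was part of the
    vocabulary passed to mutual_reinforcement.
    """
    return max(
        split_candidates(word),
        key=lambda pair: stem_score[pair[0]] * suffix_score[pair[1]],
    )


if __name__ == "__main__":
    vocabulary = ["connect", "connected", "connecting", "connection",
                  "walk", "walked", "walking"]
    stem_score, suffix_score = mutual_reinforcement(vocabulary)
    print(best_split("walking", stem_score, suffix_score))
```

Because the sketch considers every split of every word, it needs no language-specific rules, which is the property the abstract emphasizes. A real stemmer generator would need a principled scoring model and a much larger vocabulary.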
5. Melucci, M.: Making digital libraries effective : automatic generation of links for similarity search across hyper-textbooks.
In: Journal of the American Society for Information Science and Technology. 55(2004) no.5, pp. 414-430.
Abstract: Textbooks are more available in electronic format now than in the past. Because textbooks are typically large, the end user needs effective tools to rapidly access information encapsulated in textbooks stored in digital libraries. Statistical similarity-based links among hypertextbooks are a means to provide those tools. In this paper, the design and the implementation of a tool that generates networks of links within and across hypertextbooks through a completely automatic and unsupervised procedure is described. The design is based on statistical techniques. The overall methodology is presented together with the results of a case study reached through a working prototype that shows that connecting hyper-textbooks is an efficient way to provide an effective retrieval capability.
Subject area: Computer Based Training ; Hypertext
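Illustration: the similarity-based links described in the abstract of record 5 can be sketched with a plain term-frequency cosine measure. This is a minimal sketch under stated assumptions, not the tool from the article: the section identifiers, the `threshold` parameter, the tokenizer, and the choice of cosine similarity over raw term counts are all hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Build a bag-of-words term-frequency vector from lowercase ASCII words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def generate_links(sections, threshold=0.3):
    """Link every pair of sections whose similarity reaches the threshold.

    `sections` maps a section identifier to its text. The result is a
    list of (id_a, id_b, similarity) tuples, most similar pair first.
    No manual labelling is involved, so the procedure is unsupervised.
    """
    vectors = {sid: term_vector(text) for sid, text in sections.items()}
    ids = sorted(vectors)
    links = []
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            similarity = cosine(vectors[a], vectors[b])
            if similarity >= threshold:
                links.append((a, b, similarity))
    return sorted(links, key=lambda link: -link[2])


if __name__ == "__main__":
    sections = {
        "book1/ch1": "probabilistic retrieval models rank documents",
        "book2/ch3": "probabilistic models rank documents by relevance",
        "book1/ch2": "hypertext navigation and browsing",
    }
    for a, b, similarity in generate_links(sections):
        print(f"{a} <-> {b}: {similarity:.3f}")
```

The same function produces links within one book and across books, since it only compares section texts. Comparing all pairs is quadratic in the number of sections, so a large collection would need an inverted index or similar pruning.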
6. Melucci, M. ; Orio, N.: Combining melody processing and information retrieval techniques : methodology, evaluation, and system implementation.
In: Journal of the American Society for Information Science and Technology. 55(2004) no.12, pp. 1058-1066.
Abstract: The article describes the project on music information retrieval that has been carried out at the University of Padova, Italy. The research work has been characterized by the synergy of the modular integration of sound techniques of melody processing and of statistical information retrieval. After illustrating the background from which the project has originated, we describe the complete process, from methodology design through evaluation and system implementation. Conclusions, impacts on research in music information retrieval, and future directions are also described.
Note: Contribution to a special issue on music indexing and music retrieval
8. Melucci, M.: Passage retrieval : a probabilistic technique.
In: Information processing and management. 34(1998) no.1, pp. 43-68.
Abstract: This paper presents a probabilistic technique to retrieve passages from texts having a large size or heterogeneous semantic content. The proposed technique is independent of any supporting auxiliary data, such as text structure, topic organization, or pre-defined text segments. A Bayesian framework implements the probabilistic technique. We carried out experiments to compare the probabilistic technique to one based on a text segmentation algorithm. In particular, the probabilistic technique is more effective than, or as effective as, the one based on text segmentation at retrieving small passages. Results show that passage size affects passage retrieval performance. Results also suggest that text organization and query generality may have an impact on the difference in effectiveness between the two techniques.
9. Agosti, M. ; Crestani, F. ; Melucci, M.: On the use of information retrieval techniques for the automatic construction of hypertext.
In: Information processing and management. 33(1997) no.2, pp. 133-144.
Abstract: Introduces what automatic authoring of a hypertext for information retrieval means. The most difficult part of the automatic construction of a hypertext is the creation of links connecting documents or document fragments that are related. Because of this, to many researchers it seemed natural to use information retrieval techniques for this purpose, since information retrieval has always dealt with the construction of relationships between mutually relevant objects. Presents a survey of some of the attempts toward the automatic construction of hypertexts for information retrieval. Identifies and compares scope, advantages and limitations of different approaches. Points out the main and most successful current lines of research.
Note: Contribution to a special issue on methods and tools for the automatic construction of hypertext
10. Agosti, M. ; Crestani, F. ; Melucci, M.: Design and implementation of a tool for the automatic construction of hypertexts for information retrieval.
In: Information processing and management. 32(1996) no.4, pp. 459-476.
Abstract: Describes the design and implementation of TACHIR, a tool for the automatic construction of hypertexts for information retrieval. Through the use of an authoring methodology employing a set of well-known information retrieval techniques, TACHIR automatically builds up a hypertext from a document collection. The structure of the hypertext reflects a three-level conceptual model which enables navigation among documents, index terms, and concepts using automatically determined links. The hypertext is implemented using the HTML language. It can be distributed on different sites and different machines over the Internet, and it can be navigated using WWW interfaces.