This database contains more than 40,000 documents on topics in descriptive cataloguing, subject indexing, and information retrieval.
© 2015 W. Gödert, TH Köln, Institut für Informationswissenschaft / Powered by litecat, BIS Oldenburg (as of 3 March 2020)
1. Varathan, K.D. ; Giachanou, A. ; Crestani, F.: Comparative opinion mining : a review.
In: Journal of the Association for Information Science and Technology. 68(2017) no.4, pp. 811-829.
Abstract: Opinion mining refers to the use of natural language processing, text analysis, and computational linguistics to identify and extract subjective information in textual material. Opinion mining, also known as sentiment analysis, has received a lot of attention in recent times, as it provides a number of tools to analyze public opinion on a number of different topics. Comparative opinion mining is a subfield of opinion mining which deals with identifying and extracting information that is expressed in a comparative form (e.g., "paper X is better than the Y"). Comparative opinion mining plays a very important role when one tries to evaluate something because it provides a reference point for the comparison. This paper provides a review of the area of comparative opinion mining. It is the first review that specifically covers this topic, as all previous reviews dealt mostly with general opinion mining. This survey covers comparative opinion mining from two different angles: one from the perspective of techniques and the other from the perspective of comparative opinion elements. It also incorporates preprocessing tools as well as data sets that were used by past researchers and that can be useful to future researchers in the field of comparative opinion mining.
Content: Cf.: http://onlinelibrary.wiley.com/doi/10.1002/asi.23716/full.
Subject area: Data Mining
2. Crestani, F. ; Mizzaro, S. ; Scagnetto, I.: Mobile information retrieval.
Cham : Springer, 2017. VI, 110 pp.
(Springer briefs in computer science)
Abstract: This book offers a helpful starting point in the scattered, rich, and complex body of literature on Mobile Information Retrieval (Mobile IR), reviewing more than 200 papers in nine chapters. Highlighting the most interesting and influential contributions that have appeared in recent years, it particularly focuses on both user interaction and techniques for the perception and use of context, which, taken together, shape much of today's research on Mobile IR. The book starts by addressing the differences between IR and Mobile IR, while also reviewing the foundations of Mobile IR research. It then examines the different kinds of documents, users, and information needs that can be found in Mobile IR, and which set it apart from standard IR. Next, it discusses the two important issues of user interfaces and context-awareness. In closing, it covers issues related to the evaluation of Mobile IR applications. Overall, the book offers a valuable tool, helping new and veteran researchers alike to navigate this exciting and highly dynamic area of research.
Note: Reviewed in: JASIST 69(2018) no.10, pp. 1283-1287 (Daqing He).
LCSH: Computer science ; Information storage and retrieval ; User interfaces (Computer systems) ; Text processing (Computer science)
RSWK: Mobiles Endgerät ; Mobile Computing ; Information Retrieval
DDC: 025.04 / dc23
RVK: ST 270
3. Keikha, M. ; Crestani, F. ; Carman, M.J.: Employing document dependency in blog search.
In: Journal of the American Society for Information Science and Technology. 63(2012) no.2, pp. 354-365.
Abstract: The goal in blog search is to rank blogs according to their recurrent relevance to the topic of the query. State-of-the-art approaches view it as an expert search or resource selection problem. We investigate the effect of content-based similarity between posts on the performance of the retrieval system. We test two different approaches for smoothing (regularizing) relevance scores of posts based on their dependencies. In the first approach, we smooth term distributions describing posts by performing a random walk over a document-term graph in which similar posts are highly connected. In the second, we directly smooth scores for posts using a regularization framework that aims to minimize the discrepancy between scores for similar documents. We then extend these approaches to consider the time interval between the posts in smoothing the scores. The idea is that if two posts are temporally close, then they are good sources for smoothing each other's relevance scores. We compare these methods with the state-of-the-art approaches in blog search that employ Language Modeling-based resource selection algorithms and fusion-based methods for aggregating post relevance scores. We show performance gains over the baseline techniques which do not take advantage of the relation between posts for smoothing relevance estimates.
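The second approach in this abstract (smoothing post scores so that similar posts receive similar scores) can be illustrated with a minimal sketch. The objective, the parameter `mu`, the function name `smooth_scores`, and the similarity matrix below are all assumptions chosen for illustration; the abstract does not give the authors' actual formulation. The sketch minimizes sum_i (f_i - s_i)^2 + mu * sum_{i<j} w_ij (f_i - f_j)^2 by simple fixed-point iteration.

```python
def smooth_scores(scores, sim, mu=1.0, iterations=200):
    """Smooth relevance scores over a post-similarity graph (illustrative only).

    scores: initial relevance scores s_i, one per post.
    sim:    symmetric matrix of non-negative similarities w_ij (assumed given).
    mu:     hypothetical weight trading fidelity to s_i against smoothness.
    """
    n = len(scores)
    f = list(scores)
    for _ in range(iterations):
        new_f = []
        for i in range(n):
            # Total similarity of post i to all other posts.
            degree = sum(sim[i][j] for j in range(n) if j != i)
            # Similarity-weighted sum of the neighbours' current scores.
            neighbour_sum = sum(sim[i][j] * f[j] for j in range(n) if j != i)
            # Stationarity condition of the objective, solved for f_i.
            new_f.append((scores[i] + mu * neighbour_sum) / (1.0 + mu * degree))
        f = new_f
    return f


if __name__ == "__main__":
    # Two fully similar posts: the scores are pulled toward each other.
    print(smooth_scores([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0]], mu=1.0))
```

A time-aware variant, as the abstract suggests, could scale each w_ij by a factor that decays with the time interval between the two posts; the form of that decay is not specified in the abstract.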
4. Bache, R. ; Baillie, M. ; Crestani, F.: Measuring the likelihood property of scoring functions in general retrieval models.
In: Journal of the American Society for Information Science and Technology. 60(2009) no.6, pp. 1294-1297.
Abstract: Although retrieval systems based on probabilistic models will rank the objects (e.g., documents) being retrieved according to the probability of some matching criterion (e.g., relevance), they rarely yield an actual probability, and the scoring function is interpreted to be purely ordinal within a given retrieval task. In this brief communication, it is shown that some scoring functions possess the likelihood property, which means that the scoring function indicates the likelihood of matching when compared to other retrieval tasks, which is potentially more useful than pure ranking although it cannot be interpreted as an actual probability. This property can be detected by using two modified effectiveness measures: entire precision and entire recall.
5. Simeoni, F. ; Yakici, M. ; Neely, S. ; Crestani, F.: Metadata harvesting for content-based distributed information retrieval.
In: Journal of the American Society for Information Science and Technology. 59(2008) no.1, pp. 12-24.
Abstract: We propose an approach to content-based Distributed Information Retrieval based on the periodic and incremental centralization of full-content indices of widely dispersed and autonomously managed document sources. Inspired by the success of the Open Archives Initiative's (OAI) Protocol for Metadata Harvesting, the approach occupies middle ground between content crawling and distributed retrieval. As in crawling, some data move toward the retrieval process, but it is statistics about the content rather than content itself; this grants more efficient use of network resources and wider scope of application. As in distributed retrieval, some processing is distributed along with the data, but it is indexing rather than retrieval; this reduces the costs of content provision while promoting the simplicity, effectiveness, and responsiveness of retrieval. Overall, we argue that the approach retains the good properties of centralized retrieval without giving up cost-effective, large-scale resource pooling. We discuss the requirements associated with the approach and identify two strategies to deploy it on top of the OAI infrastructure. In particular, we define a minimal extension of the OAI protocol which supports the coordinated harvesting of full-content indices and descriptive metadata for content resources. Finally, we report on the implementation of a proof-of-concept prototype service for multimodel content-based retrieval of distributed file collections.
6. Sweeney, S. ; Crestani, F. ; Losada, D.E.: 'Show me more' : incremental length summarisation using novelty detection.
In: Information processing and management. 44(2008) no.2, pp. 663-686.
Abstract: The paper presents a study investigating the effects of incorporating novelty detection in automatic text summarisation. By condensing a textual document, automatic text summarisation can reduce the need to refer to the source document. It also offers a means to deliver device-friendly content when accessing information in non-traditional environments. An effective method of summarisation could be to produce a summary that includes only novel information. However, focusing exclusively on novel parts may result in a loss of context, which may have an impact on the correct interpretation of the summary with respect to the source document. In this study we compare two strategies to produce summaries that incorporate novelty in different ways: a constant length summary, which contains only novel sentences, and an incremental summary, containing additional sentences that provide context. The aim is to establish whether a summary that contains only novel sentences provides sufficient basis to determine relevance of a document, or if indeed we need to include additional sentences to provide context. Findings from the study seem to suggest that there is only a minimal difference in performance for the tasks we set our users and that the presence of contextual information is not so important. However, for the case of mobile information access, a summary that contains only novel information does offer benefits, given bandwidth constraints.
Subject area: Automatic abstracting
7. Crestani, F. ; Du, H.: Written versus spoken queries : a qualitative and quantitative comparative analysis.
In: Journal of the American Society for Information Science and Technology. 57(2006) no.7, pp. 881-890.
Abstract: The authors report on an experimental study on the differences between spoken and written queries. A set of written and spontaneous spoken queries are generated by users from written topics. These two sets of queries are compared in qualitative terms and in terms of their retrieval effectiveness. Written and spoken queries are compared in terms of length, duration, and part of speech. In addition, assuming perfect transcription of the spoken queries, written and spoken queries are compared in terms of their aptitude to describe relevant documents. The retrieval effectiveness of spoken and written queries is compared using three different information retrieval models. The results show that using speech to formulate one's information need provides a way to express it more naturally and encourages the formulation of longer queries. Despite that, longer spoken queries do not seem to significantly improve retrieval effectiveness compared with written queries.
8. Crestani, F. ; Wu, S.: Testing the cluster hypothesis in distributed information retrieval.
In: Information processing and management. 42(2006) no.5, pp. 1137-1150.
Abstract: How to merge and organise query results retrieved from different resources is one of the key issues in distributed information retrieval. Some previous research and experiments suggest that cluster-based document browsing is more effective than a single merged list. Cluster-based retrieval results presentation is based on the cluster hypothesis, which states that documents that cluster together have a similar relevance to a given query. However, while this hypothesis has been demonstrated to hold in classical information retrieval environments, it has never been fully tested in heterogeneous distributed information retrieval environments. Heterogeneous document representations, the presence of document duplicates, and disparate qualities of retrieval results are major features of a heterogeneous distributed information retrieval environment that might disrupt the effectiveness of the cluster hypothesis. In this paper we report on an experimental investigation into the validity and effectiveness of the cluster hypothesis in highly heterogeneous distributed information retrieval environments. The results show that although clustering is affected by different retrieval results representations and quality, the cluster hypothesis still holds and that generating hierarchical clusters in highly heterogeneous distributed information retrieval environments is still a very effective way of presenting retrieval results to users.
Subject area: Distributed bibliographic databases
9. Crestani, F. ; Vegas, J. ; Fuente, P. de la: A graphical user interface for the retrieval of hierarchically structured documents.
In: Information processing and management. 40(2004) no.2, pp. 269-289.
Abstract: Past research has proved that graphical user interfaces (GUIs) can significantly improve the effectiveness of the information access task. Our work is based on the consideration that structured document retrieval requires different graphical user interfaces from standard information retrieval. In structured document retrieval a GUI has to enable a user to query, browse retrieved documents, provide query refinement and relevance feedback based not only on full documents, but also on specific document parts in relation to the document structure. In this paper, we present a new GUI for structured document retrieval specifically designed for hierarchically structured documents. A user task-oriented evaluation has shown that the proposed interface provides the user with an intuitive and powerful set of tools for structured document searching, retrieved list navigation, and search refinement.
10. Crestani, F. ; Dominich, S. ; Lalmas, M. ; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue.
In: Journal of the American Society for Information Science and Technology. 54(2003) no.4, pp. 281-284.
Abstract: Research on the use of mathematical, logical, and formal methods has been central to Information Retrieval research for a long time. Research in this area is important not only because it helps enhance retrieval effectiveness, but also because it helps clarify the underlying concepts of Information Retrieval. In this article we outline some of the major aspects of the subject, and summarize the papers of this special issue with respect to how they relate to these aspects. We conclude by highlighting some directions of future research, which are needed to better understand the formal characteristics of Information Retrieval.
Note: Introduction to the contributions of a special issue: Mathematical, logical, and formal methods in information retrieval
13. Tombros, T. ; Crestani, F.: Users' perception of relevance of spoken documents.
In: Journal of the American Society for Information Science. 51(2000) no.10, pp. 929-939.
Abstract: We present the results of a study of users' perception of relevance of documents. The aim is to study experimentally how users' perception varies depending on the form in which retrieved documents are presented. Documents retrieved in response to a query are presented to users in a variety of ways, from full text to a machine spoken query-biased automatically-generated summary, and the difference in users' perception of relevance is studied. The experimental results suggest that the effectiveness of advanced multimedia Information Retrieval applications may be affected by the low level of users' perception of relevance of retrieved documents.
Subject area: Retrieval studies ; User studies
14. Agosti, M. ; Crestani, F. ; Melucci, M.: On the use of information retrieval techniques for the automatic construction of hypertext.
In: Information processing and management. 33(1997) no.2, pp. 133-144.
Abstract: Introduces what automatic authoring of a hypertext for information retrieval means. The most difficult part of the automatic construction of a hypertext is the creation of links connecting documents or document fragments that are related. Because of this, to many researchers it seemed natural to use information retrieval techniques for this purpose, since information retrieval has always dealt with the construction of relationships between mutually relevant objects. Presents a survey of some of the attempts toward the automatic construction of hypertexts for information retrieval. Identifies and compares scope, advantages and limitations of different approaches. Points out the main and most successful current lines of research.
Note: Contribution to a special issue on methods and tools for the automatic construction of hypertext
15. Agosti, M. ; Crestani, F. ; Melucci, M.: Design and implementation of a tool for the automatic construction of hypertexts for information retrieval.
In: Information processing and management. 32(1996) no.4, pp. 459-476.
Abstract: Describes the design and implementation of TACHIR, a tool for the automatic construction of hypertexts for information retrieval. Through the use of an authoring methodology employing a set of well known information retrieval techniques, TACHIR automatically builds up a hypertext from a document collection. The structure of the hypertext reflects a three-level conceptual model which enables navigation among documents, index terms, and concepts using automatically determined links. The hypertext is implemented using the HTML language. It can be distributed on different sites and different machines over the Internet, and it can be navigated using WWW interfaces.
16. Crestani, F. ; Rijsbergen, C.J. van: Information retrieval by imaging.
In: Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon. London : Taylor Graham, 1996. pp. 47-76.
Abstract: Explains briefly what constitutes the imaging process and explains how imaging can be used in information retrieval. Proposes an approach based on the concept of 'a term is a possible world', which enables the exploitation of term to term relationships which are estimated using an information theoretic measure. Reports results of an evaluation exercise to compare the performance of imaging retrieval, using possible world semantics, with a benchmark and using the Cranfield 2 document collection to measure precision and recall. Initially, the performance of imaging retrieval was seen to be better but statistical analysis proved that the difference was not significant. The problem with imaging retrieval lies in the amount of computations needed to be performed at run time and a later experiment investigated the possibility of reducing this amount. Notes lines of further investigation.
17. Crestani, F. ; Ruthven, I. ; Sanderson, M. ; Rijsbergen, C.J. van: The troubles with using a logical model of IR on a large collection of documents : experimenting retrieval by logical imaging on TREC.
In: The Fourth Text Retrieval Conference (TREC-4). Ed.: K. Harman. Gaithersburg, MD : National Institute of Standards and Technology, 1996. pp. 509-526.
(NIST special publication; 500-236)
18. Crestani, F. ; Rijsbergen, C.J. van: Information retrieval by logical imaging.
In: Journal of documentation. 51(1995) no.1, pp. 3-17.
Abstract: The evaluation of an implication by imaging is a logical technique developed in the framework of modal logic. Its interpretation in the context of a 'possible worlds' semantics is very appealing for information retrieval. In 1989, Van Rijsbergen suggested its use for solving one of the fundamental problems of logical models of information retrieval: the evaluation of the logical implication that a document is relevant to a query if it implies the query. Since then, others have tried to follow that suggestion proposing models and applications, though without much success. Most of these approaches had as their basic assumption the consideration that 'a document is a possible world'. Proposes instead an approach based on a completely different assumption: 'a term is a possible world'. This approach enables the exploitation of term-term relationships which are estimated using an information theoretic measure.
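The 'a term is a possible world' idea can be illustrated with a small sketch. It follows one common reading of imaging, which is an assumption here and is not spelled out in the abstract: every term carries a prior probability, imaging on a document moves the probability of each term to the most similar term that occurs in the document, and the score of the document is the probability mass that ends up on query terms. The function name `imaging_score`, the prior values, and the similarity table are hypothetical; in particular, the information theoretic similarity measure mentioned in the abstract is replaced by a hand-written lookup table.

```python
def imaging_score(doc_terms, query_terms, prior, sim):
    """Score a document for a query by imaging over terms (illustrative only).

    doc_terms, query_terms: iterables of terms.
    prior: dict term -> prior probability (assumed to sum to 1).
    sim:   dict (term, doc_term) -> similarity; missing pairs count as 0.
    """
    doc_terms = set(doc_terms)
    query_terms = set(query_terms)
    if not doc_terms:
        return 0.0
    score = 0.0
    for term, probability in prior.items():
        if term in doc_terms:
            # A term already in the document keeps its own probability.
            nearest = term
        else:
            # Otherwise its probability is transferred to the most similar
            # document term (ties broken alphabetically for determinism).
            nearest = max(sorted(doc_terms),
                          key=lambda t: sim.get((term, t), 0.0))
        if nearest in query_terms:
            score += probability
    return score


if __name__ == "__main__":
    prior = {"retrieval": 0.4, "search": 0.3, "logic": 0.2, "cat": 0.1}
    sim = {("search", "retrieval"): 0.9, ("search", "logic"): 0.1,
           ("cat", "logic"): 0.2}
    print(imaging_score({"retrieval", "logic"}, {"retrieval"}, prior, sim))
```

In this toy example the document does not contain "search", but the probability of "search" is transferred to "retrieval", so the document is credited for a term it never mentions. Finding the nearest document term for every vocabulary term at query time is also where the run-time cost noted in record 16 would arise.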