Search (10 results, page 1 of 1)

  • × author_ss:"Crestani, F."
  1. Crestani, F.; Rijsbergen, C.J. van: Information retrieval by imaging (1996) 0.11
    0.10762094 = product of:
      0.14349459 = sum of:
        0.0070626684 = product of:
          0.028250674 = sum of:
            0.028250674 = weight(_text_:based in 6967) [ClassicSimilarity], result of:
              0.028250674 = score(doc=6967,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.19973516 = fieldWeight in 6967, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6967)
          0.25 = coord(1/4)
        0.117351316 = weight(_text_:term in 6967) [ClassicSimilarity], result of:
          0.117351316 = score(doc=6967,freq=6.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.5357528 = fieldWeight in 6967, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.046875 = fieldNorm(doc=6967)
        0.019080611 = product of:
          0.038161222 = sum of:
            0.038161222 = weight(_text_:22 in 6967) [ClassicSimilarity], result of:
              0.038161222 = score(doc=6967,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23214069 = fieldWeight in 6967, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=6967)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
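    The breakdown above is Lucene's ClassicSimilarity explain output: each clause score is a queryWeight (idf x queryNorm) multiplied by a fieldWeight (tf x idf x fieldNorm); the clause scores are summed and scaled by a coordination factor. A minimal sketch reproducing the _text_:term clause of this result, assuming Lucene's standard TF-IDF formulas (queryNorm is copied from the output, since it depends on the whole query):

```python
# Reproducing the _text_:term clause for doc 6967 from the explain output
# above, using Lucene ClassicSimilarity's TF-IDF formulas.
import math

tf = math.sqrt(6.0)                       # 2.4494898, sqrt of termFreq
idf = 1 + math.log(44218 / (1130 + 1))    # 4.66603, ln(maxDocs / (docFreq + 1)) + 1
query_norm = 0.04694356                   # taken from the output (query-dependent)
field_norm = 0.046875                     # per-field length normalisation

query_weight = idf * query_norm           # 0.21904005
field_weight = tf * idf * field_norm      # 0.5357528
print(query_weight * field_weight)        # 0.117351316, the clause score
```

    Summing the three clause scores (0.0070626684 + 0.117351316 + 0.019080611 = 0.14349459) and applying coord(3/4) = 0.75 yields the displayed document score of 0.10762094.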
    
    Abstract
    Explains briefly what constitutes the imaging process and how imaging can be used in information retrieval. Proposes an approach based on the concept that 'a term is a possible world', which enables the exploitation of term-to-term relationships, estimated using an information theoretic measure. Reports results of an evaluation exercise comparing the performance of imaging retrieval, using possible world semantics, against a benchmark, with the Cranfield 2 document collection used to measure precision and recall. Initially the performance of imaging retrieval appeared to be better, but statistical analysis showed that the difference was not significant. The problem with imaging retrieval lies in the amount of computation that must be performed at run time; a later experiment investigated the possibility of reducing this amount. Notes lines of further investigation.
    Source
    Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon
  2. Giachanou, A.; Rosso, P.; Crestani, F.: The impact of emotional signals on credibility assessment (2021) 0.10
    0.09810272 = product of:
      0.13080363 = sum of:
        0.005885557 = product of:
          0.023542227 = sum of:
            0.023542227 = weight(_text_:based in 328) [ClassicSimilarity], result of:
              0.023542227 = score(doc=328,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.16644597 = fieldWeight in 328, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=328)
          0.25 = coord(1/4)
        0.056460675 = weight(_text_:term in 328) [ClassicSimilarity], result of:
          0.056460675 = score(doc=328,freq=2.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.25776416 = fieldWeight in 328, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=328)
        0.068457395 = product of:
          0.13691479 = sum of:
            0.13691479 = weight(_text_:assessment in 328) [ClassicSimilarity], result of:
              0.13691479 = score(doc=328,freq=6.0), product of:
                0.25917634 = queryWeight, product of:
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.04694356 = queryNorm
                0.5282689 = fieldWeight in 328, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  5.52102 = idf(docFreq=480, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=328)
          0.5 = coord(1/2)
      0.75 = coord(3/4)
    
    Abstract
    Fake news is considered one of the main threats to our society. The aim of fake news is usually to confuse readers and trigger intense emotions in them, in an attempt to spread through social networks. Even though recent studies have explored the effectiveness of different linguistic patterns for fake news detection, the role of emotional signals has not yet been explored. In this paper, we focus on extracting emotional signals from claims and evaluating their effectiveness for credibility assessment. First, we explore different methodologies for extracting the emotional signals that can be triggered in users when they read a claim. Then, we present emoCred, a model based on a long short-term memory (LSTM) network that incorporates emotional signals extracted from the text of the claims to differentiate between credible and non-credible ones. In addition, we perform an analysis to understand which emotional signals and which terms are the most useful for the different credibility classes. We conduct extensive experiments and a thorough analysis on real-world datasets. Our results indicate the importance of incorporating emotional signals in the credibility assessment problem.
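    The abstract describes emoCred only at a high level. A minimal sketch of the general idea (an illustrative assumption, not the authors' implementation): an LSTM encodes the claim text, and its final hidden state is concatenated with a vector of emotion-signal features before classification. All layer names and dimensions below are assumptions.

```python
# Hedged sketch of an LSTM credibility classifier that also consumes
# emotion-signal features, in the spirit of emoCred (illustrative only).
import torch
import torch.nn as nn

class EmoCredSketch(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128, n_emotions=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # The classifier sees the LSTM summary plus the emotion features.
        self.fc = nn.Linear(hidden_dim + n_emotions, 2)  # credible / non-credible

    def forward(self, token_ids, emotion_feats):
        _, (h_n, _) = self.lstm(self.embed(token_ids))
        return self.fc(torch.cat([h_n[-1], emotion_feats], dim=1))

model = EmoCredSketch(vocab_size=20000)
tokens = torch.randint(0, 20000, (4, 30))  # batch of 4 claims, 30 tokens each
emotions = torch.rand(4, 8)                # e.g. lexicon-based emotion scores
logits = model(tokens, emotions)           # shape (4, 2)
```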
  3. Crestani, F.; Rijsbergen, C.J. van: Information retrieval by logical imaging (1995) 0.07
    0.072574824 = product of:
      0.14514965 = sum of:
        0.00823978 = product of:
          0.03295912 = sum of:
            0.03295912 = weight(_text_:based in 1759) [ClassicSimilarity], result of:
              0.03295912 = score(doc=1759,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23302436 = fieldWeight in 1759, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=1759)
          0.25 = coord(1/4)
        0.13690987 = weight(_text_:term in 1759) [ClassicSimilarity], result of:
          0.13690987 = score(doc=1759,freq=6.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.62504494 = fieldWeight in 1759, product of:
              2.4494898 = tf(freq=6.0), with freq of:
                6.0 = termFreq=6.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1759)
      0.5 = coord(2/4)
    
    Abstract
    The evaluation of an implication by imaging is a logical technique developed in the framework of modal logic. Its interpretation in the context of a 'possible worlds' semantics is very appealing for information retrieval. In 1989, Van Rijsbergen suggested its use for solving one of the fundamental problems of logical models of information retrieval: evaluating the logical implication by which a document is relevant to a query if it implies the query. Since then, others have tried to follow that suggestion, proposing models and applications, though without much success. Most of these approaches took as their basic assumption that 'a document is a possible world'. Proposes instead an approach based on a completely different assumption: 'a term is a possible world'. This approach enables the exploitation of term-term relationships, which are estimated using an information theoretic measure.
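    As a toy illustration of retrieval by imaging under the 'a term is a possible world' reading (a simplified reconstruction, not the paper's full model): each term's prior probability is transferred to the most similar term occurring in the document, and the document is scored by the probability mass that lands on query terms. The priors and similarity function below are assumptions.

```python
# Toy sketch of imaging-based retrieval where terms play the role of
# possible worlds; priors and the similarity function are illustrative.

def imaging_score(doc_terms, query_terms, prior, sim):
    score = 0.0
    for t in prior:  # every indexing term is a possible world
        # Image of world t on the document: its most similar document term.
        t_img = max(doc_terms, key=lambda u: sim(t, u))
        if t_img in query_terms:
            score += prior[t]
    return score

prior = {t: 0.25 for t in ["imaging", "logic", "world", "retrieval"]}
sim = lambda a, b: 1.0 if a == b else 0.1  # trivial term-term similarity
print(imaging_score(["imaging", "world"], {"imaging"}, prior, sim))  # 0.75
```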
  4. Keikha, M.; Crestani, F.; Carman, M.J.: Employing document dependency in blog search (2012) 0.05
    0.045809288 = product of:
      0.091618575 = sum of:
        0.011771114 = product of:
          0.047084454 = sum of:
            0.047084454 = weight(_text_:based in 4987) [ClassicSimilarity], result of:
              0.047084454 = score(doc=4987,freq=8.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.33289194 = fieldWeight in 4987, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=4987)
          0.25 = coord(1/4)
        0.07984746 = weight(_text_:term in 4987) [ClassicSimilarity], result of:
          0.07984746 = score(doc=4987,freq=4.0), product of:
            0.21904005 = queryWeight, product of:
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.04694356 = queryNorm
            0.3645336 = fieldWeight in 4987, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              4.66603 = idf(docFreq=1130, maxDocs=44218)
              0.0390625 = fieldNorm(doc=4987)
      0.5 = coord(2/4)
    
    Abstract
    The goal in blog search is to rank blogs according to their recurrent relevance to the topic of the query. State-of-the-art approaches view it as an expert search or resource selection problem. We investigate the effect of content-based similarity between posts on the performance of the retrieval system. We test two different approaches for smoothing (regularizing) relevance scores of posts based on their dependencies. In the first approach, we smooth term distributions describing posts by performing a random walk over a document-term graph in which similar posts are highly connected. In the second, we directly smooth scores for posts using a regularization framework that aims to minimize the discrepancy between scores for similar documents. We then extend these approaches to consider the time interval between the posts in smoothing the scores. The idea is that if two posts are temporally close, then they are good sources for smoothing each other's relevance scores. We compare these methods with the state-of-the-art approaches in blog search that employ Language Modeling-based resource selection algorithms and fusion-based methods for aggregating post relevance scores. We show performance gains over the baseline techniques which do not take advantage of the relation between posts for smoothing relevance estimates.
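    The second approach, score regularization over a post-similarity graph, can be sketched as follows (an illustrative simplification; the similarity matrix, blend weight, and iteration count are assumptions, and the time-interval extension is omitted):

```python
# Hedged sketch of smoothing post relevance scores over a similarity graph:
# each score is repeatedly blended with the weighted average of its
# neighbours' scores, so similar posts end up with similar scores.
import numpy as np

def smooth_scores(scores, sim, alpha=0.5, iters=20):
    W = sim / sim.sum(axis=1, keepdims=True)  # row-normalised similarities
    s = scores.copy()
    for _ in range(iters):
        s = (1 - alpha) * scores + alpha * W @ s
    return s

scores = np.array([0.9, 0.1, 0.8])   # initial post relevance scores
sim = np.array([[1.0, 0.2, 0.9],     # toy pairwise post similarities
                [0.2, 1.0, 0.1],
                [0.9, 0.1, 1.0]])
print(smooth_scores(scores, sim))
```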
  5. Crestani, F.; Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue (2003) 0.00
    0.0047701527 = product of:
      0.019080611 = sum of:
        0.019080611 = product of:
          0.038161222 = sum of:
            0.038161222 = weight(_text_:22 in 1451) [ClassicSimilarity], result of:
              0.038161222 = score(doc=1451,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23214069 = fieldWeight in 1451, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=1451)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    22. 3.2003 19:27:36
  6. Crestani, F.; Du, H.: Written versus spoken queries : a qualitative and quantitative comparative analysis (2006) 0.00
    0.0047701527 = product of:
      0.019080611 = sum of:
        0.019080611 = product of:
          0.038161222 = sum of:
            0.038161222 = weight(_text_:22 in 5047) [ClassicSimilarity], result of:
              0.038161222 = score(doc=5047,freq=2.0), product of:
                0.16438834 = queryWeight, product of:
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23214069 = fieldWeight in 5047, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.5018296 = idf(docFreq=3622, maxDocs=44218)
                  0.046875 = fieldNorm(doc=5047)
          0.5 = coord(1/2)
      0.25 = coord(1/4)
    
    Date
    5. 6.2006 11:22:23
  7. Simeoni, F.; Yakici, M.; Neely, S.; Crestani, F.: Metadata harvesting for content-based distributed information retrieval (2008) 0.00
    0.0029427784 = product of:
      0.011771114 = sum of:
        0.011771114 = product of:
          0.047084454 = sum of:
            0.047084454 = weight(_text_:based in 1336) [ClassicSimilarity], result of:
              0.047084454 = score(doc=1336,freq=8.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.33289194 = fieldWeight in 1336, product of:
                  2.828427 = tf(freq=8.0), with freq of:
                    8.0 = termFreq=8.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=1336)
          0.25 = coord(1/4)
      0.25 = coord(1/4)
    
    Abstract
    We propose an approach to content-based Distributed Information Retrieval based on the periodic and incremental centralization of full-content indices of widely dispersed and autonomously managed document sources. Inspired by the success of the Open Archive Initiative's (OAI) Protocol for metadata harvesting, the approach occupies middle ground between content crawling and distributed retrieval. As in crawling, some data move toward the retrieval process, but it is statistics about the content rather than content itself; this grants more efficient use of network resources and wider scope of application. As in distributed retrieval, some processing is distributed along with the data, but it is indexing rather than retrieval; this reduces the costs of content provision while promoting the simplicity, effectiveness, and responsiveness of retrieval. Overall, we argue that the approach retains the good properties of centralized retrieval without renouncing cost-effective, large-scale resource pooling. We discuss the requirements associated with the approach and identify two strategies to deploy it on top of the OAI infrastructure. In particular, we define a minimal extension of the OAI protocol which supports the coordinated harvesting of full-content indices and descriptive metadata for content resources. Finally, we report on the implementation of a proof-of-concept prototype service for multimodel content-based retrieval of distributed file collections.
  8. Crestani, F.; Wu, S.: Testing the cluster hypothesis in distributed information retrieval (2006) 0.00
    0.002548521 = product of:
      0.010194084 = sum of:
        0.010194084 = product of:
          0.040776335 = sum of:
            0.040776335 = weight(_text_:based in 984) [ClassicSimilarity], result of:
              0.040776335 = score(doc=984,freq=6.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.28829288 = fieldWeight in 984, product of:
                  2.4494898 = tf(freq=6.0), with freq of:
                    6.0 = termFreq=6.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0390625 = fieldNorm(doc=984)
          0.25 = coord(1/4)
      0.25 = coord(1/4)
    
    Abstract
    How to merge and organise query results retrieved from different resources is one of the key issues in distributed information retrieval. Some previous research and experiments suggest that cluster-based document browsing is more effective than a single merged list. Cluster-based presentation of retrieval results is based on the cluster hypothesis, which states that documents that cluster together have a similar relevance to a given query. However, while this hypothesis has been demonstrated to hold in classical information retrieval environments, it has never been fully tested in heterogeneous distributed information retrieval environments. Heterogeneous document representations, the presence of document duplicates, and disparate quality of retrieval results are major features of a heterogeneous distributed information retrieval environment that might disrupt the effectiveness of the cluster hypothesis. In this paper we report on an experimental investigation into the validity and effectiveness of the cluster hypothesis in highly heterogeneous distributed information retrieval environments. The results show that although clustering is affected by different representations and quality of retrieval results, the cluster hypothesis still holds, and that generating hierarchical clusters in highly heterogeneous distributed information retrieval environments is still a very effective way of presenting retrieval results to users.
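    A minimal sketch of how such a test can be set up (a toy setup under assumed tooling, not the paper's experimental protocol): cluster the merged results hierarchically and compare the share of relevant documents inside each cluster with the overall share; under the cluster hypothesis, relevant documents should concentrate in a few clusters.

```python
# Hedged toy check of the cluster hypothesis: do relevant documents
# concentrate in clusters? Document vectors and judgements are synthetic.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def cluster_hypothesis_check(doc_vectors, relevant, n_clusters=5):
    labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(doc_vectors)
    overall = relevant.mean()
    for c in range(n_clusters):
        share = relevant[labels == c].mean()
        print(f"cluster {c}: {share:.2f} relevant vs {overall:.2f} overall")

rng = np.random.default_rng(0)
docs = rng.random((100, 20))                  # toy document vectors
rel = (rng.random(100) < 0.2).astype(float)   # toy relevance judgements
cluster_hypothesis_check(docs, rel)
```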
  9. Crestani, F.; Vegas, J.; Fuente, P. de la: A graphical user interface for the retrieval of hierarchically structured documents (2004) 0.00
    0.0024970302 = product of:
      0.009988121 = sum of:
        0.009988121 = product of:
          0.039952483 = sum of:
            0.039952483 = weight(_text_:based in 2555) [ClassicSimilarity], result of:
              0.039952483 = score(doc=2555,freq=4.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.28246817 = fieldWeight in 2555, product of:
                  2.0 = tf(freq=4.0), with freq of:
                    4.0 = termFreq=4.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.046875 = fieldNorm(doc=2555)
          0.25 = coord(1/4)
      0.25 = coord(1/4)
    
    Abstract
    Past research has proved that graphical user interfaces (GUIs) can significantly improve the effectiveness of the information access task. Our work is based on the consideration that structured document retrieval requires different graphical user interfaces from standard information retrieval. In structured document retrieval a GUI has to enable a user to query, browse retrieved documents, and provide query refinement and relevance feedback based not only on full documents, but also on specific document parts in relation to the document structure. In this paper, we present a new GUI for structured document retrieval specifically designed for hierarchically structured documents. A user task-oriented evaluation has shown that the proposed interface provides the user with an intuitive and powerful set of tools for structured document searching, retrieved list navigation, and search refinement.
  10. Bache, R.; Baillie, M.; Crestani, F.: Measuring the likelihood property of scoring functions in general retrieval models (2009) 0.00
    0.002059945 = product of:
      0.00823978 = sum of:
        0.00823978 = product of:
          0.03295912 = sum of:
            0.03295912 = weight(_text_:based in 2860) [ClassicSimilarity], result of:
              0.03295912 = score(doc=2860,freq=2.0), product of:
                0.14144066 = queryWeight, product of:
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.04694356 = queryNorm
                0.23302436 = fieldWeight in 2860, product of:
                  1.4142135 = tf(freq=2.0), with freq of:
                    2.0 = termFreq=2.0
                  3.0129938 = idf(docFreq=5906, maxDocs=44218)
                  0.0546875 = fieldNorm(doc=2860)
          0.25 = coord(1/4)
      0.25 = coord(1/4)
    
    Abstract
    Although retrieval systems based on probabilistic models will rank the objects (e.g., documents) being retrieved according to the probability of some matching criterion (e.g., relevance), they rarely yield an actual probability, and the scoring function is interpreted to be purely ordinal within a given retrieval task. In this brief communication, it is shown that some scoring functions possess the likelihood property, which means that the scoring function indicates the likelihood of matching when compared to other retrieval tasks, which is potentially more useful than pure ranking although it cannot be interpreted as an actual probability. This property can be detected by using two modified effectiveness measures: entire precision and entire recall.