Search (3 results, page 1 of 1)

Beitzel, S.M.; Jensen, E.C.; Chowdhury, A.; Grossman, D.; Frieder, O; Goharian, N.: Fusion of effective retrieval strategies in the same information retrieval system (2004) 0.00
```
0.0019955188 = product of:
  0.011973113 = sum of:
    0.011973113 = weight(_text_:in in 2502) [ClassicSimilarity], result of:
      0.011973113 = score(doc=2502,freq=10.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.20163295 = fieldWeight in 2502, product of:
          3.1622777 = tf(freq=10.0), with freq of:
            10.0 = termFreq=10.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=2502)
  0.16666667 = coord(1/6)
```
Abstract

Prior efforts have shown that under certain situations retrieval effectiveness may be improved via the use of data fusion techniques. Although these improvements have been observed from the fusion of result sets from several distinct information retrieval systems, it has often been thought that fusing different document retrieval strategies in a single information retrieval system will lead to similar improvements. In this study, we show that this is not the case. We hold constant systemic differences such as parsing, stemming, phrase processing, and relevance feedback, and fuse result sets generated from highly effective retrieval strategies in the same information retrieval system. From this, we show that data fusion of highly effective retrieval strategies alone shows little or no improvement in retrieval effectiveness. Furthermore, we present a detailed analysis of the performance of modern data fusion approaches, and demonstrate the reasons why they do not perform weIl when applied to this problem. Detailed results and analyses are included to support our conclusions.
Chowdhury, A.; Mccabe, M.C.: Improving information retrieval systems using part of speech tagging (1993) 0.00
```
0.0015457221 = product of:
  0.009274333 = sum of:
    0.009274333 = weight(_text_:in in 1061) [ClassicSimilarity], result of:
      0.009274333 = score(doc=1061,freq=6.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.1561842 = fieldWeight in 1061, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.046875 = fieldNorm(doc=1061)
  0.16666667 = coord(1/6)
```
Abstract

The object of Information Retrieval is to retrieve all relevant documents for a user query and only those relevant documents. Much research has focused on achieving this objective with little regard for storage overhead or performance. In the paper we evaluate the use of Part of Speech Tagging to improve, the index storage overhead and general speed of the system with only a minimal reduction to precision recall measurements. We tagged 500Mbs of the Los Angeles Times 1990 and 1989 document collection provided by TREC for parts of speech. We then experimented to find the most relevant part of speech to index. We show that 90% of precision recall is achieved with 40% of the document collections terms. We also show that this is a improvement in overhead with only a 1% reduction in precision recall.
Beitzel, S.M.; Jensen, E.C.; Chowdhury, A.; Frieder, O.; Grossman, D.: Temporal analysis of a very large topically categorized Web query log (2007) 0.00
```
0.0012881019 = product of:
  0.007728611 = sum of:
    0.007728611 = weight(_text_:in in 60) [ClassicSimilarity], result of:
      0.007728611 = score(doc=60,freq=6.0), product of:
        0.059380736 = queryWeight, product of:
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.043654136 = queryNorm
        0.1301535 = fieldWeight in 60, product of:
          2.4494898 = tf(freq=6.0), with freq of:
            6.0 = termFreq=6.0
          1.3602545 = idf(docFreq=30841, maxDocs=44218)
          0.0390625 = fieldNorm(doc=60)
  0.16666667 = coord(1/6)
```
Abstract

The authors review a log of billions of Web queries that constituted the total query traffic for a 6-month period of a general-purpose commercial Web search service. Previously, query logs were studied from a single, cumulative view. In contrast, this study builds on the authors' previous work, which showed changes in popularity and uniqueness of topically categorized queries across the hours in a day. To further their analysis, they examine query traffic on a daily, weekly, and monthly basis by matching it against lists of queries that have been topically precategorized by human editors. These lists represent 13% of the query traffic. They show that query traffic from particular topical categories differs both from the query stream as a whole and from other categories. Additionally, they show that certain categories of queries trend differently over varying periods. The authors key contribution is twofold: They outline a method for studying both the static and topical properties of a very large query log over varying periods, and they identify and examine topical trends that may provide valuable insight for improving both retrieval effectiveness and efficiency.

Search (3 results, page 1 of 1)

Authors

Years

Types

Themes