Search (32 results, page 1 of 2)

Cannane, A.; Williams, H.E.: General-purpose compression for efficient retrieval (2001) 0.14
```
0.13827354 = product of:
  0.27654707 = sum of:
    0.27654707 = product of:
      0.55309415 = sum of:
        0.55309415 = weight(_text_:compression in 5705) [ClassicSimilarity], result of:
          0.55309415 = score(doc=5705,freq=20.0), product of:
            0.36069217 = queryWeight, product of:
              7.314861 = idf(docFreq=79, maxDocs=44218)
              0.049309507 = queryNorm
            1.5334243 = fieldWeight in 5705, product of:
              4.472136 = tf(freq=20.0), with freq of:
                20.0 = termFreq=20.0
              7.314861 = idf(docFreq=79, maxDocs=44218)
              0.046875 = fieldNorm(doc=5705)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
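The score tree above can be checked by hand. The sketch below is not taken from the search system itself: the function names are hypothetical, and the formulas (`tf = sqrt(freq)`, `idf = 1 + ln(maxDocs / (docFreq + 1))`) are assumed because they reproduce the figures shown in the explanation. The two `coord(1/2)` factors are applied as plain multipliers.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; it reproduces idf(docFreq=79, maxDocs=44218) = 7.314861.
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def classic_tf(freq: float) -> float:
    # Square-root term frequency, e.g. tf(freq=20.0) = 4.472136.
    return math.sqrt(freq)


def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coords=(0.5, 0.5)):
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm                     # queryWeight line
    field_weight = classic_tf(freq) * idf * field_norm  # fieldWeight line
    score = query_weight * field_weight                 # weight(_text_:term in doc)
    for c in coords:                                    # the two coord(1/2) factors
        score *= c
    return score


# Values copied from the explanation for doc 5705.
score_5705 = explain_score(
    freq=20.0,
    doc_freq=79,
    max_docs=44218,
    query_norm=0.049309507,
    field_norm=0.046875,
)
print(f"{score_5705:.6f}")  # should agree with 0.13827354 up to float rounding
```

The same function applies to the results matched on the term `22`; only `freq`, `doc_freq`, and `field_norm` change.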
Abstract

Compression of databases not only reduces space requirements but can also reduce overall retrieval times. In text databases, compression of documents based on semistatic modeling with words has been shown to be both practical and fast. Similarly, for specific applications (such as databases of integers or scientific databases), specially designed semistatic compression schemes work well. We propose a scheme for general-purpose compression that can be applied to all types of data stored in large collections. We describe our approach, which we call RAY, in detail, and show experimentally the compression available, compression and decompression costs, and performance as a stream and random-access technique. We show that, in many cases, RAY achieves better compression than an efficient Huffman scheme and popular adaptive compression techniques, and that it can be used as an efficient general-purpose compression scheme.
Cheng, C.-S.; Chung, C.-P.; Shann, J.J.-J.: Fast query evaluation through document identifier assignment for inverted file-based information retrieval systems (2006) 0.05
```
0.0515315 = product of:
  0.103063 = sum of:
    0.103063 = product of:
      0.206126 = sum of:
        0.206126 = weight(_text_:compression in 979) [ClassicSimilarity], result of:
          0.206126 = score(doc=979,freq=4.0), product of:
            0.36069217 = queryWeight, product of:
              7.314861 = idf(docFreq=79, maxDocs=44218)
              0.049309507 = queryNorm
            0.5714735 = fieldWeight in 979, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              7.314861 = idf(docFreq=79, maxDocs=44218)
              0.0390625 = fieldNorm(doc=979)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

Compressing an inverted file can greatly improve query performance of an information retrieval system (IRS) by reducing disk I/Os. We observe that a good document identifier assignment (DIA) can make the document identifiers in the posting lists more clustered, and result in better compression as well as shorter query processing time. In this paper, we tackle the NP-complete problem of finding an optimal DIA to minimize the average query processing time in an IRS when the probability distribution of query terms is given. We indicate that the greedy nearest neighbor (Greedy-NN) algorithm can provide excellent performance for this problem. However, the Greedy-NN algorithm is inappropriate if used in large-scale IRSs, due to its high complexity O(N² × n), where N denotes the number of documents and n denotes the number of distinct terms. In real-world IRSs, the distribution of query terms is skewed. Based on this fact, we propose a fast O(N × n) heuristic, called partition-based document identifier assignment (PBDIA) algorithm, which can efficiently assign consecutive document identifiers to those documents containing frequently used query terms, and improve compression efficiency of the posting lists for those terms. This can result in reduced query processing time. The experimental results show that the PBDIA algorithm can yield a competitive performance versus the Greedy-NN for the DIA problem, and that this optimization problem has significant advantages for both long queries and parallel information retrieval (IR).
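Why clustered identifiers compress better can be illustrated with a small sketch. It is not the PBDIA or Greedy-NN algorithm from the abstract: it assumes posting lists are stored as gaps between consecutive identifiers and encoded with a variable-byte code (7 payload bits per byte), a coding choice the abstract does not specify. The function names and the example lists are hypothetical.

```python
def d_gaps(doc_ids):
    """Turn a sorted posting list into gaps between consecutive identifiers."""
    gaps = []
    prev = 0
    for doc_id in doc_ids:
        gaps.append(doc_id - prev)
        prev = doc_id
    return gaps


def vbyte_length(n):
    """Bytes needed for a positive integer with 7 payload bits per byte."""
    length = 1
    while n >= 128:
        n >>= 7
        length += 1
    return length


def encoded_size(doc_ids):
    """Total bytes for a posting list stored as variable-byte coded gaps."""
    return sum(vbyte_length(gap) for gap in d_gaps(sorted(doc_ids)))


# Hypothetical posting lists for one term under two identifier assignments.
scattered = [3, 40000, 90000, 250000, 1000000]
clustered = [3, 4, 5, 6, 7]

print(encoded_size(scattered), encoded_size(clustered))
```

Under these assumptions the clustered assignment needs one byte per posting, while the scattered one needs three bytes for most gaps.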
Moffat, A.; Bell, T.A.H.: In situ generation of compressed inverted files (1995) 0.04
```
0.04372593 = product of:
  0.08745186 = sum of:
    0.08745186 = product of:
      0.17490372 = sum of:
        0.17490372 = weight(_text_:compression in 2648) [ClassicSimilarity], result of:
          0.17490372 = score(doc=2648,freq=2.0), product of:
            0.36069217 = queryWeight, product of:
              7.314861 = idf(docFreq=79, maxDocs=44218)
              0.049309507 = queryNorm
            0.48491132 = fieldWeight in 2648, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              7.314861 = idf(docFreq=79, maxDocs=44218)
              0.046875 = fieldNorm(doc=2648)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```
Abstract

An inverted index stores, for each term that appears in a collection of documents, a list of document numbers containing that term. Such an index is indispensable when Boolean or informal ranked queries are to be answered. Construction of the index is, however, a nontrivial task. Simple methods using in-memory data structures cannot be used for large collections because they require too much random-access storage, and traditional disc-based methods require large amounts of temporary file space. Describes a new indexing algorithm designed to create large compressed inverted indexes in situ. It makes use of simple compression codes for the positive integers and an in-place external multi-way merge sort. The new technique has been used to invert a 2-gigabyte text collection in under 4 hours, using less than 40 megabytes of temporary disc space and less than 20 megabytes of main memory.

Voorhees, E.M.: Implementing agglomerative hierarchic clustering algorithms for use in document retrieval (1986) 0.03

```
0.026723031 = product of:
  0.053446062 = sum of:
    0.053446062 = product of:
      0.106892124 = sum of:
        0.106892124 = weight(_text_:22 in 402) [ClassicSimilarity], result of:
          0.106892124 = score(doc=402,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.61904186 = fieldWeight in 402, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.125 = fieldNorm(doc=402)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Source: Information processing and management. 22(1986) no.6, pp. 465-476

Smeaton, A.F.; Rijsbergen, C.J. van: ¬The retrieval effects of query expansion on a feedback document retrieval system (1983) 0.02

```
0.02338265 = product of:
  0.0467653 = sum of:
    0.0467653 = product of:
      0.0935306 = sum of:
        0.0935306 = weight(_text_:22 in 2134) [ClassicSimilarity], result of:
          0.0935306 = score(doc=2134,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.5416616 = fieldWeight in 2134, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=2134)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 30. 3.2001 13:32:22

Back, J.: ¬An evaluation of relevancy ranking techniques used by Internet search engines (2000) 0.02

```
0.02338265 = product of:
  0.0467653 = sum of:
    0.0467653 = product of:
      0.0935306 = sum of:
        0.0935306 = weight(_text_:22 in 3445) [ClassicSimilarity], result of:
          0.0935306 = score(doc=3445,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.5416616 = fieldWeight in 3445, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.109375 = fieldNorm(doc=3445)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 25. 8.2005 17:42:22

Fuhr, N.: Ranking-Experimente mit gewichteter Indexierung (1986) 0.02

```
0.020042272 = product of:
  0.040084545 = sum of:
    0.040084545 = product of:
      0.08016909 = sum of:
        0.08016909 = weight(_text_:22 in 58) [ClassicSimilarity], result of:
          0.08016909 = score(doc=58,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.46428138 = fieldWeight in 58, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=58)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 14. 6.2015 22:12:44

Fuhr, N.: Rankingexperimente mit gewichteter Indexierung (1986) 0.02

```
0.020042272 = product of:
  0.040084545 = sum of:
    0.040084545 = product of:
      0.08016909 = sum of:
        0.08016909 = weight(_text_:22 in 2051) [ClassicSimilarity], result of:
          0.08016909 = score(doc=2051,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.46428138 = fieldWeight in 2051, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.09375 = fieldNorm(doc=2051)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 14. 6.2015 22:12:56

MacFarlane, A.; Robertson, S.E.; McCann, J.A.: Parallel computing for passage retrieval (2004) 0.01

```
0.0133615155 = product of:
  0.026723031 = sum of:
    0.026723031 = product of:
      0.053446062 = sum of:
        0.053446062 = weight(_text_:22 in 5108) [ClassicSimilarity], result of:
          0.053446062 = score(doc=5108,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.30952093 = fieldWeight in 5108, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=5108)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 20. 1.2007 18:30:22

Faloutsos, C.: Signature files (1992) 0.01

```
0.0133615155 = product of:
  0.026723031 = sum of:
    0.026723031 = product of:
      0.053446062 = sum of:
        0.053446062 = weight(_text_:22 in 3499) [ClassicSimilarity], result of:
          0.053446062 = score(doc=3499,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.30952093 = fieldWeight in 3499, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=3499)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 7. 5.1999 15:22:48

Losada, D.E.; Barreiro, A.: Embedding term similarity and inverse document frequency into a logical model of information retrieval (2003) 0.01

```
0.0133615155 = product of:
  0.026723031 = sum of:
    0.026723031 = product of:
      0.053446062 = sum of:
        0.053446062 = weight(_text_:22 in 1422) [ClassicSimilarity], result of:
          0.053446062 = score(doc=1422,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.30952093 = fieldWeight in 1422, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1422)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 22. 3.2003 19:27:23

Bornmann, L.; Mutz, R.: From P100 to P100' : a new citation-rank approach (2014) 0.01

```
0.0133615155 = product of:
  0.026723031 = sum of:
    0.026723031 = product of:
      0.053446062 = sum of:
        0.053446062 = weight(_text_:22 in 1431) [ClassicSimilarity], result of:
          0.053446062 = score(doc=1431,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.30952093 = fieldWeight in 1431, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1431)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 22. 8.2014 17:05:18

Tober, M.; Hennig, L.; Furch, D.: SEO Ranking-Faktoren und Rang-Korrelationen 2014 : Google Deutschland (2014) 0.01

```
0.0133615155 = product of:
  0.026723031 = sum of:
    0.026723031 = product of:
      0.053446062 = sum of:
        0.053446062 = weight(_text_:22 in 1484) [ClassicSimilarity], result of:
          0.053446062 = score(doc=1484,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.30952093 = fieldWeight in 1484, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0625 = fieldNorm(doc=1484)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 13. 9.2014 14:45:22

Ravana, S.D.; Rajagopal, P.; Balakrishnan, V.: Ranking retrieval systems using pseudo relevance judgments (2015) 0.01

```
0.011810022 = product of:
  0.023620045 = sum of:
    0.023620045 = product of:
      0.04724009 = sum of:
        0.04724009 = weight(_text_:22 in 2591) [ClassicSimilarity], result of:
          0.04724009 = score(doc=2591,freq=4.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.27358043 = fieldWeight in 2591, product of:
              2.0 = tf(freq=4.0), with freq of:
                4.0 = termFreq=4.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0390625 = fieldNorm(doc=2591)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 20. 1.2015 18:30:22
18. 9.2018 18:22:56

Chang, C.-H.; Hsu, C.-C.: Integrating query expansion and conceptual relevance feedback for personalized Web information retrieval (1998) 0.01

```
0.011691325 = product of:
  0.02338265 = sum of:
    0.02338265 = product of:
      0.0467653 = sum of:
        0.0467653 = weight(_text_:22 in 1319) [ClassicSimilarity], result of:
          0.0467653 = score(doc=1319,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.2708308 = fieldWeight in 1319, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=1319)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 1. 8.1996 22:08:06

Kanaeva, Z.: Ranking: Google und CiteSeer (2005) 0.01

```
0.011691325 = product of:
  0.02338265 = sum of:
    0.02338265 = product of:
      0.0467653 = sum of:
        0.0467653 = weight(_text_:22 in 3276) [ClassicSimilarity], result of:
          0.0467653 = score(doc=3276,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.2708308 = fieldWeight in 3276, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.0546875 = fieldNorm(doc=3276)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 20. 3.2005 16:23:22

Joss, M.W.; Wszola, S.: ¬The engines that can : text search and retrieval software, their strategies, and vendors (1996) 0.01

```
0.010021136 = product of:
  0.020042272 = sum of:
    0.020042272 = product of:
      0.040084545 = sum of:
        0.040084545 = weight(_text_:22 in 5123) [ClassicSimilarity], result of:
          0.040084545 = score(doc=5123,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.23214069 = fieldWeight in 5123, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=5123)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 12. 9.1996 13:56:22

Kelledy, F.; Smeaton, A.F.: Signature files and beyond (1996) 0.01

```
0.010021136 = product of:
  0.020042272 = sum of:
    0.020042272 = product of:
      0.040084545 = sum of:
        0.040084545 = weight(_text_:22 in 6973) [ClassicSimilarity], result of:
          0.040084545 = score(doc=6973,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.23214069 = fieldWeight in 6973, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=6973)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Source: Information retrieval: new systems and current research. Proceedings of the 16th Research Colloquium of the British Computer Society Information Retrieval Specialist Group, Drymen, Scotland, 22-23 Mar 94. Ed.: R. Leon

Crestani, F.; Dominich, S.; Lalmas, M.; Rijsbergen, C.J.K. van: Mathematical, logical, and formal methods in information retrieval : an introduction to the special issue (2003) 0.01

```
0.010021136 = product of:
  0.020042272 = sum of:
    0.020042272 = product of:
      0.040084545 = sum of:
        0.040084545 = weight(_text_:22 in 1451) [ClassicSimilarity], result of:
          0.040084545 = score(doc=1451,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.23214069 = fieldWeight in 1451, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=1451)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 22. 3.2003 19:27:36

Fan, W.; Fox, E.A.; Pathak, P.; Wu, H.: ¬The effects of fitness functions on genetic programming-based ranking discovery for Web search (2004) 0.01

```
0.010021136 = product of:
  0.020042272 = sum of:
    0.020042272 = product of:
      0.040084545 = sum of:
        0.040084545 = weight(_text_:22 in 2239) [ClassicSimilarity], result of:
          0.040084545 = score(doc=2239,freq=2.0), product of:
            0.1726735 = queryWeight, product of:
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.049309507 = queryNorm
            0.23214069 = fieldWeight in 2239, product of:
              1.4142135 = tf(freq=2.0), with freq of:
                2.0 = termFreq=2.0
              3.5018296 = idf(docFreq=3622, maxDocs=44218)
              0.046875 = fieldNorm(doc=2239)
      0.5 = coord(1/2)
  0.5 = coord(1/2)
```

Date: 31. 5.2004 19:22:06
