-
Baeza-Yates, R.; Navarro, G.: XQL and proximal nodes (2002)
0.01
0.011089402 = product of:
0.022178805 = sum of:
0.022178805 = product of:
0.04435761 = sum of:
0.04435761 = weight(_text_:r in 454) [ClassicSimilarity], result of:
0.04435761 = score(doc=454,freq=2.0), product of:
0.17326194 = queryWeight, product of:
3.3102584 = idf(docFreq=4387, maxDocs=44218)
0.05234091 = queryNorm
0.25601473 = fieldWeight in 454, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.3102584 = idf(docFreq=4387, maxDocs=44218)
0.0546875 = fieldNorm(doc=454)
0.5 = coord(1/2)
0.5 = coord(1/2)
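The arithmetic in the explanation above can be checked by hand. The following is a minimal sketch, not the search engine's own code: it assumes the idf form `1 + ln(maxDocs / (docFreq + 1))` and `tf = sqrt(freq)`, which reproduce the listed values, and takes `queryNorm`, `fieldNorm`, and the two `coord` factors directly from the listing. The function names are illustrative.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Assumed idf form; it reproduces the 3.3102584 shown in the listing.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)

# Values copied from the explanation for doc 454.
idf = classic_idf(doc_freq=4387, max_docs=44218)
query_norm = 0.05234091
field_norm = 0.0546875

query_weight = idf * query_norm                     # listed as 0.17326194
field_weight = classic_tf(2.0) * idf * field_norm   # listed as 0.25601473

# Two coord(1/2) factors scale the term weight down to the final score.
score = query_weight * field_weight * 0.5 * 0.5
print(f"{score:.9f}")
```

The same steps apply to the other entries; only `fieldNorm`, the document frequency, and the `coord` factors change.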
-
Baeza-Yates, R.; Navarro, G.: Block addressing indices for approximate text retrieval (2000)
0.01
0.009505202 = product of:
0.019010404 = sum of:
0.019010404 = product of:
0.03802081 = sum of:
0.03802081 = weight(_text_:r in 4295) [ClassicSimilarity], result of:
0.03802081 = score(doc=4295,freq=2.0), product of:
0.17326194 = queryWeight, product of:
3.3102584 = idf(docFreq=4387, maxDocs=44218)
0.05234091 = queryNorm
0.2194412 = fieldWeight in 4295, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.3102584 = idf(docFreq=4387, maxDocs=44218)
0.046875 = fieldNorm(doc=4295)
0.5 = coord(1/2)
0.5 = coord(1/2)
-
Navarro, G.; Baeza-Yates, R.; Azevedo Arcoverde, J.M.: Matchsimile : a flexible approximate matching tool for searching proper names (2003)
0.01
0.009505202 = product of:
0.019010404 = sum of:
0.019010404 = product of:
0.03802081 = sum of:
0.03802081 = weight(_text_:r in 1420) [ClassicSimilarity], result of:
0.03802081 = score(doc=1420,freq=2.0), product of:
0.17326194 = queryWeight, product of:
3.3102584 = idf(docFreq=4387, maxDocs=44218)
0.05234091 = queryNorm
0.2194412 = fieldWeight in 1420, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
3.3102584 = idf(docFreq=4387, maxDocs=44218)
0.046875 = fieldNorm(doc=1420)
0.5 = coord(1/2)
0.5 = coord(1/2)
-
Adiego, J.; Navarro, G.; Fuente, P. de la: Lempel-Ziv compression of highly structured documents (2007)
0.01
0.007511559 = product of:
0.015023118 = sum of:
0.015023118 = product of:
0.06009247 = sum of:
0.06009247 = weight(_text_:authors in 4993) [ClassicSimilarity], result of:
0.06009247 = score(doc=4993,freq=2.0), product of:
0.23861247 = queryWeight, product of:
4.558814 = idf(docFreq=1258, maxDocs=44218)
0.05234091 = queryNorm
0.25184128 = fieldWeight in 4993, product of:
1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
4.558814 = idf(docFreq=1258, maxDocs=44218)
0.0390625 = fieldNorm(doc=4993)
0.25 = coord(1/4)
0.5 = coord(1/2)
- Abstract
- The authors describe Lempel-Ziv to Compress Structure (LZCS), a novel Lempel-Ziv approach suitable for compressing structured documents. LZCS takes advantage of repeated substructures that may appear in the documents by replacing them with a backward reference to their previous occurrence. The result of the LZCS transformation is still a valid structured document, which is human-readable and can be transmitted over ASCII channels. Moreover, LZCS-transformed documents are easy to search, display, access at random, and navigate. In a second stage, the transformed documents can be further compressed using any semistatic technique, so that all of those operations can still be performed efficiently, or with any adaptive technique to boost compression. LZCS is especially efficient in the compression of collections of highly structured data, such as extensible markup language (XML) forms, invoices, e-commerce, and Web-service exchange documents. The comparison with other structure-aware and standard compressors shows that LZCS is a competitive choice for these types of documents, whereas the others are not well suited to supporting navigation or random access. When joined to an adaptive compressor, LZCS obtains by far the best compression ratios.
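The core idea in the abstract, replacing a repeated substructure with a backward reference to its earlier occurrence while keeping the result a valid structured document, can be illustrated with a toy sketch. This is not the authors' LZCS algorithm or reference format: the `ref` tag, the `to` attribute, and the numbering of first occurrences are all assumptions made for illustration, and the sketch assumes the input contains no element named `ref`.

```python
import copy
import xml.etree.ElementTree as ET

REF = "ref"  # hypothetical tag for a backward reference (not from the paper)

def compress(root: ET.Element) -> ET.Element:
    """Return a copy where each repeated subtree becomes <ref to="k"/>."""
    first_seen = {}  # serialized subtree -> index of its first occurrence

    def visit(elem):
        for pos, child in enumerate(list(elem)):
            key = ET.tostring(child, encoding="unicode")
            if key in first_seen:
                ref = ET.Element(REF, {"to": str(first_seen[key])})
                ref.tail = child.tail
                elem[pos] = ref  # replace the duplicate in place
            else:
                first_seen[key] = len(first_seen)
                visit(child)

    out = copy.deepcopy(root)
    visit(out)
    return out

def decompress(root: ET.Element) -> ET.Element:
    """Expand every <ref to="k"/> back into a copy of subtree k."""
    table = []  # index -> subtree, numbered in the same order as compress()

    def visit(elem):
        for pos, child in enumerate(list(elem)):
            if child.tag == REF:
                elem[pos] = copy.deepcopy(table[int(child.get("to"))])
            else:
                table.append(child)
                visit(child)

    out = copy.deepcopy(root)
    visit(out)
    return out

doc = ("<orders>"
       "<order><item>pen</item><qty>1</qty></order>"
       "<order><item>pen</item><qty>1</qty></order>"
       "<order><item>ink</item><qty>1</qty></order>"
       "</orders>")
original = ET.fromstring(doc)
small = compress(original)
restored = decompress(small)
print(ET.tostring(small, encoding="unicode"))
```

The compressed tree is still well-formed XML, so it can be parsed and navigated with ordinary tools; a second-stage compressor could then be applied to its serialized form.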