Akman, K.I.: A new text compression technique based on natural language structure (1995) 0.02
```
0.015656501 = product of:
  0.0469695 = sum of:
    0.0469695 = weight(_text_:search in 1860) [ClassicSimilarity], result of:
      0.0469695 = score(doc=1860,freq=2.0), product of:
        0.1747324 = queryWeight, product of:
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.05027291 = queryNorm
        0.2688082 = fieldWeight in 1860, product of:
          1.4142135 = tf(freq=2.0), with freq of:
            2.0 = termFreq=2.0
          3.475677 = idf(docFreq=3718, maxDocs=44218)
          0.0546875 = fieldNorm(doc=1860)
  0.33333334 = coord(1/3)
```
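The explain tree above can be re-derived by hand. The following is a minimal sketch, assuming the classic TF-IDF formulas (tf as the square root of the term frequency, idf as 1 + ln(maxDocs / (docFreq + 1))), which reproduce the printed values; the function names are illustrative and not part of any library.

```python
from math import log, sqrt

def classic_idf(doc_freq, max_docs):
    # Assumed idf formula; it reproduces the 3.475677 shown in the tree.
    return 1.0 + log(max_docs / (doc_freq + 1))

def explain_score(freq, doc_freq, max_docs, query_norm, field_norm, coord):
    idf = classic_idf(doc_freq, max_docs)
    query_weight = idf * query_norm               # queryWeight = idf * queryNorm
    field_weight = sqrt(freq) * idf * field_norm  # fieldWeight = tf * idf * fieldNorm
    return coord * query_weight * field_weight    # coord scales the summed term score

# Values copied from the explain output for doc 1860; coord(1/3) means one of three query terms matched.
score = explain_score(2.0, 3718, 44218, 0.05027291, 0.0546875, 1 / 3)
print(round(score, 6))
```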
Abstract

Describes a new data compression technique which utilizes some of the common structural characteristics of languages. The proposed algorithm partitions words into their roots and suffixes, which are then replaced by shorter bit representations. The method uses 3 dictionaries in the form of binary search trees and 1 character array. The first 2 dictionaries are for roots, and the third one is for suffixes. The character array is used both for searching compressible words and for coding incompressible words. The number of bits used to represent a substring depends on the number of entries in the dictionary in which the substring is found. The proposed algorithm is implemented for the Turkish language and tested using 3 different text groups of different lengths. Results indicate a compression factor of up to 47 per cent.
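The root/suffix idea can be illustrated with a small sketch. This is not the paper's implementation: the toy dictionaries, the use of plain lists instead of binary search trees, the single root dictionary (the paper uses two), the one-bit flag, and the byte-wise fallback for incompressible words are all assumptions made for illustration. The one detail taken from the abstract is that the code length depends on the number of entries in the dictionary.

```python
from math import ceil, log2

# Hypothetical toy dictionaries; the paper's actual contents are not given.
ROOTS = ["ev", "kitap", "okul", "deniz"]  # house, book, school, sea
SUFFIXES = ["", "de", "ler", "lerde"]     # none, locative, plural, plural + locative

def bits_for(n_entries):
    """Bits needed to index a dictionary with n_entries entries."""
    return max(1, ceil(log2(n_entries)))

def split_word(word):
    """Return (root, suffix) for the longest known root, or None if no split fits."""
    for cut in range(len(word), 0, -1):
        root, suffix = word[:cut], word[cut:]
        if root in ROOTS and suffix in SUFFIXES:
            return root, suffix
    return None

def encode_word(word):
    """Encode one word as a string of '0'/'1' characters (assumed layout)."""
    parts = split_word(word)
    if parts is None:
        # Assumed fallback: flag '0', an 8-bit byte count, then the raw bytes.
        raw = word.encode("utf-8")
        return "0" + format(len(raw), "08b") + "".join(format(b, "08b") for b in raw)
    root, suffix = parts
    root_bits = format(ROOTS.index(root), f"0{bits_for(len(ROOTS))}b")
    suffix_bits = format(SUFFIXES.index(suffix), f"0{bits_for(len(SUFFIXES))}b")
    return "1" + root_bits + suffix_bits  # flag '1' marks a dictionary-coded word

print(encode_word("evlerde"))  # "in the houses": root "ev" + suffix "lerde"
```

With four entries in each toy dictionary, a compressible word costs 1 + 2 + 2 = 5 bits here, against 56 bits for the seven characters of "evlerde" stored as bytes. Larger dictionaries would need more bits per index, which is the trade-off the abstract describes.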