The Library
The effect of flexible parsing for dynamic dictionary based data compression
Tools
Matias, Yossi, Rajpoot, Nasir M. (Nasir Mahmood) and Sahinalp, Suleyman Cenk (2001) The effect of flexible parsing for dynamic dictionary based data compression. Journal of Experimental Algorithmics (JEA), Volume 6 . pp. 1-19. Article Number 10 . doi:10.1145/945394.945404 ISSN 1084-6654.
|
PDF
WRAP_Rajpoot_research%2Fpcav%2Fimproc%2Fpubs%2Fmatias2002.pdf - Accepted Version - Requires a PDF viewer. Download (415Kb) | Preview |
|
PDF
a10-matias (1).pdf - Published Version Embargoed item. Restricted access to Repository staff only - Requires a PDF viewer. Download (355Kb) |
Official URL: http://dx.doi.org/10.1145/945394.945404
Abstract
We report on the performance evaluation of greedy parsing with a single step lookahead (which we call flexible Parsing or FP as an alternative to the commonly used greedy parsing (with no-lookaheads) scheme. Greedy parsing is the basis of most popular compression programs including UNIX compress and gzip, however it usually results in far from optimal parsing/compression with regard to the dictionary construction scheme in use. Flexible parsing, however, is optimal [MS99], i.e. partitions any given input to the smallest number of phrases possible, for dictionary construction schemes which satisfy the prefix property throughout their execution.
We focus on the application of FP in the context of the LZW variant of the Lempel-Ziv'78 dictionary construction method [Wel84, ZL78], which is of considerable practical interest. We implement two compression algorithms which use (1) FP with LZW dictionary (LZW-FP), and (2) FP with an alternative flexible dictionary (FPA as introduced in [Hor95]). Our implementations are based on novel on-line data structures enabling us to use linear time and space. We test our implementations on a collection of input sequences which includes textual files, DNA sequences, medical images, and pseudorandom binary files, and compare our results with two of the most popular compression programs UNIX compress and gzip. Our results demonstrate that flexible parsing is especially useful for non-textual data, on which it improves over the compression rates of compress and gzip by up to 20% and 35%, respectively.
Item Type: | Journal Article | ||||
---|---|---|---|---|---|
Subjects: | Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software | ||||
Divisions: | Faculty of Science, Engineering and Medicine > Science > Computer Science | ||||
Library of Congress Subject Headings (LCSH): | Data compression (Computer science), Image compression | ||||
Journal or Publication Title: | Journal of Experimental Algorithmics (JEA) | ||||
Publisher: | Association for Computing Machinery | ||||
ISSN: | 1084-6654 | ||||
Official Date: | 2001 | ||||
Dates: |
|
||||
Volume: | Volume 6 | ||||
Number of Pages: | 19 | ||||
Page Range: | pp. 1-19 | ||||
Article Number: | Article Number 10 | ||||
DOI: | 10.1145/945394.945404 | ||||
Status: | Peer Reviewed | ||||
Publication Status: | Published | ||||
Access rights to Published version: | Restricted or Subscription Access | ||||
Date of first compliant deposit: | 27 December 2015 | ||||
Date of first compliant Open Access: | 27 December 2015 | ||||
Funder: | Israel Science Foundation (ISF), Israel. Miśrad ha-madaʻ, North Atlantic Treaty Organization (NATO), European Commission (EC) | ||||
Grant number: | CRG- 972175 (NATO), 20244 - ALCOM IT (EC) |
Request changes or add full text files to a record
Repository staff actions (login required)
View Item |
Downloads
Downloads per month over past year