Implementation
All of the compression/decompression modes (see below) are implemented by subclassing Compressor to provide the required low-level compression implementation. Each class provides both compression and decompression capabilities. The compressor and decompressor use a dictionary implemented as a trie for looking up the codes assigned to sequences of characters, combined with a vector of pointers into the trie for looking up character sequences based on codes.
Compression Modes
The compressor has a couple of compression modes, variants on the basic LZW algorithm. For the purposes of specifying which to use (see notes on running the compressor), each has an abbreviation, given in parentheses below.
- Fixed code width, static dictionary (fs): This is the simplest LZW variant; each code in the compressed output is the same width (traditionally 12 bits), and when the dictionary is full compression continues without adding any new entries to the dictionary.
- Variable code width, static dictionary (vs): Instead of all codes being the same width, the code width grows so as to be just large enough to uniquely identify all entries in the dictionary, up to some maximum. Once the code width has reached this maximum and all codes of that width have been used, no new entries are added to the dictionary. This scheme provides improved compression as the dictionary is being built up, but as it fills up the compression ratio approaches that of the fixed-width scheme.