PROGRAMMER'S GUIDE: Compression / decompression library
Compression / decompression library

1. Guide


1.1 Scope

This library decompresses data that was compressed with the run-length compression tool, which is provided separately.

Run-length compression tool: CMPRUN.EXE

1.2 Compression method

The method is run-length encoding with an original addition: mismatched series processing. In a simple run-length algorithm, sequences of non-repeating symbols worsen the compression ratio, so such mismatched sequences are grouped and processed together. A matched series is expressed by its consecutive length and its value. A mismatched series is expressed by its length and its values (the uncompressed data).
The mismatched series length is written as a negative value (two's complement) so that the code representing a matched series can be distinguished from the code representing a mismatched series.
Because short runs would otherwise split a mismatched series and worsen the compression ratio, the end condition of a mismatched series is designed so that short runs do not terminate it.

[Definition of mismatched series]
A mismatched series is defined by the following start and end conditions.

<Start condition of mismatched series>

Input of a symbol (character) with a consecutive length of 1.

<End condition of mismatched series>

Input of a matched series with a consecutive length of 3 or more,
or
the mismatched series length reaching the limit that can be expressed in the number of bytes of the processing unit.

[Example of mismatched series]
The following figure shows examples of mismatched series. In each row, [ marks the start and ] marks the end of the mismatched series that begins at B.

Figure 1.1 Example of mismatched series
... A A A [B] C C C C D D D ...
... A A A [B C D E F] G G G ...
... A A A [B C C D D] E E E ...
... A A A [B C C] D D D E F F ...
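
The start and end conditions can be sketched as a small segmentation routine. The following Python is only an illustration under stated assumptions, not the code of CMPRUN.EXE: the function names, the tuple output format, and the `max_mismatch` parameter (standing in for the limit that the processing unit can express) are hypothetical, and the upper limit of a matched series length is not modeled.

```python
def run_length(symbols, i):
    """Number of consecutive symbols equal to symbols[i], starting at i."""
    j = i
    while j < len(symbols) and symbols[j] == symbols[i]:
        j += 1
    return j - i


def split_series(symbols, max_mismatch=128):
    """Split symbols into matched and mismatched series.

    max_mismatch is an assumed limit for the mismatched series length;
    the real limit depends on the processing unit.
    """
    out = []
    i = 0
    n = len(symbols)
    while i < n:
        run = run_length(symbols, i)
        if run >= 2:
            # Matched series: consecutive length is always 2 or more.
            out.append(("match", symbols[i], run))
            i += run
            continue
        # Start condition: a symbol with consecutive length 1.
        start = i
        i += 1
        while i < n and i - start < max_mismatch:
            # End condition: a matched series of length 3 or more.
            if run_length(symbols, i) >= 3:
                break
            i += 1  # runs of 1 or 2 are absorbed into the mismatched series
        out.append(("mismatch", symbols[start:i]))
    return out
```

For the third row of Figure 1.1, `split_series("AAABCCDDEEE")` yields a matched series of three A, the mismatched series `BCCDD`, and a matched series of three E: the runs C C and D D stay inside the mismatched series because only a run of 3 or more ends it.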

[Processing unit]
Select the processing unit from the three types shown in the following table. The input data is divided at processing unit boundaries, starting from the beginning, and each unit is read as one symbol (character).

Table 1.1 Processing unit
Processing unit    Image of compression processing unit for input data
BYTE (1 byte)
WORD (2 bytes)
DWORD (4 bytes)

* The numbers are hexadecimal. Each number is equivalent to 1 byte.
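
As a rough illustration of how input data is cut into symbols, the sketch below splits a byte string at processing unit boundaries. The names `UNIT_SIZES` and `split_units` are hypothetical, and rejecting input whose length is not a multiple of the unit size is an assumption; the text does not say how the tool treats a trailing partial unit.

```python
# Processing unit sizes in bytes, as listed in Table 1.1.
UNIT_SIZES = {"BYTE": 1, "WORD": 2, "DWORD": 4}


def split_units(data, unit):
    """Split data into symbols of one processing unit each.

    Symbols are kept as raw byte chunks, so no byte order is implied.
    """
    size = UNIT_SIZES[unit]
    if len(data) % size != 0:
        # Assumption: a partial trailing unit is treated as an error here.
        raise ValueError("data length is not a multiple of the unit size")
    return [data[i:i + size] for i in range(0, len(data), size)]
```

With the WORD unit, the four bytes `12 34 12 34` become two equal symbols and therefore form a matched series of consecutive length 2, whereas with the BYTE unit the same bytes contain no run at all.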

[Expression of the consecutive length]
The consecutive length (matched series length) is represented by a non-negative number, and the mismatched series length is represented by a negative number. Since a consecutive length is always 2 or more, it is stored after subtracting 2.
The number of bytes used to represent the length is the same as the number of bytes in the processing unit.
The length counts processing units, not bytes. For example, when the processing unit is WORD and the consecutive length is 5, the code covers 10 bytes of data.
The following table shows the range of length values that can be represented.

Table 1.2 Consecutive length width and consecutive length expression

Processing unit    Consecutive length (matched series length)    Mismatched series length
(length width)     Expression (description)                      Expression (description)
BYTE  (1 byte)
WORD  (2 bytes)
DWORD (4 bytes)
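
The length expression can be illustrated with a short sketch. The function names are hypothetical, and two points are assumptions that the text does not state: the length field is stored in big-endian byte order, and a mismatched series of length m is stored as the value -m.

```python
def encode_length(kind, length, unit_size):
    """Encode a series length into a field of unit_size bytes."""
    if kind == "match":
        if length < 2:
            raise ValueError("a consecutive length is always 2 or more")
        value = length - 2          # stored after subtracting 2
    else:
        if length < 1:
            raise ValueError("a mismatched series has at least 1 symbol")
        value = -length             # negative value, two's complement
    # Raises OverflowError when the value does not fit in unit_size bytes.
    # Byte order "big" is an assumption.
    return value.to_bytes(unit_size, "big", signed=True)


def decode_length(field):
    """Return (kind, length in processing units) for a length field."""
    value = int.from_bytes(field, "big", signed=True)
    if value >= 0:
        return ("match", value + 2)
    return ("mismatch", -value)
```

For the WORD example in the text, `encode_length("match", 5, 2)` gives the two bytes `00 03`, and the decoded length of 5 units corresponds to 5 × 2 = 10 bytes of data.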

[Processing image diagram]

The following figure illustrates the compression algorithm with run-length / mismatched series processing.

Figure 1.2 Run-length / mismatched series processing

・ Matched series value or mismatched series value (uncompressed data)
・ Processing unit: one of BYTE, WORD, DWORD
・ Consecutive length or mismatched series length
・ Consecutive length width: equal to the processing unit
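
To make the layout in Figure 1.2 concrete, the sketch below expands a sequence of length fields and values back into the original data. It is an illustration only, not the library's routine: the function name `decompress` is hypothetical, and it assumes a bare stream with no header or terminator, big-endian length fields, and a mismatched series of length m stored as -m.

```python
def decompress(data, unit_size):
    """Expand run-length / mismatched series data.

    unit_size is the processing unit in bytes (1, 2 or 4).
    """
    out = bytearray()
    pos = 0
    while pos < len(data):
        field = data[pos:pos + unit_size]
        if len(field) < unit_size:
            raise ValueError("truncated length field")
        pos += unit_size
        # Byte order "big" is an assumption.
        length = int.from_bytes(field, "big", signed=True)
        if length >= 0:
            # Matched series: one value, repeated (length + 2) times.
            count, repeat = 1, length + 2
        else:
            # Mismatched series: -length uncompressed values, copied once.
            count, repeat = -length, 1
        values = data[pos:pos + count * unit_size]
        if len(values) < count * unit_size:
            raise ValueError("truncated series values")
        pos += count * unit_size
        out += values * repeat
    return bytes(out)
```

With the BYTE unit, the bytes `01 'A' FB 'B' 'C' 'C' 'D' 'D' 01 'E'` expand to `AAABCCDDEEE`: three A from the first matched series, five copied symbols from the mismatched series, and three E from the last matched series.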


Copyright SEGA ENTERPRISES, LTD., 1997