8.16.0.4
9 Prefix Codes: Encoding and Decoding🔗ℹ
Added in version 1.2 of package binaryio-lib.
This module provides encoding and decoding support using
prefix codes (aka
prefix-free codes), including
Huffman codes.
See binaryio/huffman for support for computing such codes.
Encodes the values of
input using
encode-table, which
represents the prefix code as a mapping of values to
short bitvectors. The
input can be any sequence, including strings, byte strings, lists, and
so on. The result is a byte string containing the encoded bits, along with the
number of bits in the encoding. See
output-bitport-get-output for the
format of the encoded byte string and the interpretation of the
pad-sbv
argument.
The encode-table must be one of the following:
Each code in the table must be unique, and the set of codes must form a valid
prefix code. Otherwise, the results of encoding and decoding are unpredictable.
If msf? is #t (the default), then bits within each byte are
added in most significant first order. If msf? is #f,
then bits within each byte are added in least significant first order.
Examples:
> (require binaryio/examples/hpack) |
|
> (values enc enc-bits) |
#"\234\264Pu<\36\312$\376?" | 74 |
|
Converts
encode-table (see
prefixcode-encode) into a data
structure suitable for decoding the same prefix code.
The representation is not specified, but if all values in the table are readable
(or quotable), then the representation of the decoder tree is readable (or
quotable).
Example:
Decodes
bs (from
start-bit-index, inclusive, to
end-bit-index, exclusive) using the prefix code represented by
decode-tree, which must have been produced by
prefixcode-build-decode-tree.
Each value represented by decode-tree must be a byte, character, byte
string, or character string. Each decoded value is accumulated into a byte
string, which is the result of successful decoding.
If the decoder encounters a sequence of bits that is not a valid code prefix, it calls
(handle-error 'bad bad-start-index bad-end-index bad-code)
to handle the error.
If the decoder reaches end-bit-index without completing the current
code, and if end-code is #f, then it handles the error by
calling
(handle-error 'incomplete incomplete-start-index end-bit-index incomplete-code)
But if end-code is a bitvector, then no error is signaled if the bits
between the end of the last complete code and end-bit-index are a
prefix of end-code.
Note that if handle-error returns normally, its result is discarded, so
it is recommended that handle-error escape (for example, by raising an
exception).
Examples:
Like
prefixcode-decode, but each decoded value is sent to
output, and the result of a successful decoding is
(void).
If output is an output port, then a decoded value must be a byte,
character, byte string, or character string, and the value is emitted by writing
it to the port. If output is a procedure, then any value is allowed,
and the value is emitted by calling output on it.
If decoding completes successfully, the result is (void); otherwise, it
is the result of the call to handle-error.
Examples:
Like
prefixcode-decode, but decodes the input to a list. This allows
values other than bytes, characters, and character strings to be conveniently
decoded.
Note that if handle-error returns normally, its result is discarded, so
it is recommended that handle-error escape (for example, by raising an
exception).
Like
prefixcode-decode, but decodes a single value from the input. The
result is one of the following:
(values 'ok next-bit-index value) — The bits from
start-bit-index (inclusive) to next-bit-index (exclusive)
represent the code for value.
(values 'bad next-bit-index bad-code) — The bits from
start-bit-index to next-bit-index do not represent a valid
code or its prefix. The bad-code result contains those bits as a
bitvector.
(values 'incomplete next-bit-index incomplete-code) — The
bits from start-bit-index to next-bit-index represent an
incomplete code, but it is not a prefix of end-code. The
incomplete-code result contains those bits as a bitvector.
(values 'end next-bit-index final-code) — The bits from
start-bit-index to next-bit-index represent a prefix of
end-code — possibly all of end-code, possibly empty (if
start-bit-index equals end-bit-index). The
final-code result contains those bits as a bitvector.
Examples: