4 Data Generator
These data files themselves are generated by tooling from the Unicode Character Database, primarily UCD.zip and Unihan.zip. These zip files, along with other reference material, are downloaded into a temporary location and used to generate the Racket data files used by codepoint/properties. The fetch and generate processes may be used separately but are intended to be run together by the ucd-generator executable.
4.1 Running the generator
The build executed by raco setup produces an executable named ucd-generator. This tool creates the set of data files in the following steps.
- If the temporary directory, data, is not present:
  - create the temporary directory,
  - use the curl command to download the required files, and
  - unzip the UCD.zip file.
- If the output directory, generated, is not present, create it.
- For each of the relevant source files, generate each target data file.
Note that this tool must be run in the root directory of the package.
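The fetch and directory-preparation steps above can be approximated in shell. This is only a sketch of the described behavior, not the tool's actual code: the function names fetch_ucd and prepare_output are hypothetical, the download URL is an assumption (the public Unicode "latest" directory), and the list of downloaded files is limited to the two zip files named in this section. The final step, generating the target data files, is done by the Racket code and is not shown.

```shell
#!/bin/sh
# Sketch of the fetch and prepare steps; NOT the actual ucd-generator code.
set -e

# Assumption: the real tool may use a different URL or a pinned UCD version.
UCD_BASE_URL="${UCD_BASE_URL:-https://www.unicode.org/Public/UCD/latest/ucd}"
DATA_DIR="${DATA_DIR:-data}"        # temporary directory named in the steps
OUT_DIR="${OUT_DIR:-generated}"     # output directory named in the steps

# Hypothetical helper: download and unpack only when data/ is missing.
fetch_ucd() {
    if [ ! -d "$DATA_DIR" ]; then
        mkdir -p "$DATA_DIR"
        curl -fsSL -o "$DATA_DIR/UCD.zip" "$UCD_BASE_URL/UCD.zip"
        curl -fsSL -o "$DATA_DIR/Unihan.zip" "$UCD_BASE_URL/Unihan.zip"
        # Only UCD.zip is unpacked, as described in the steps.
        unzip -q -o "$DATA_DIR/UCD.zip" -d "$DATA_DIR"
    fi
}

# Hypothetical helper: create generated/ if it does not exist yet.
prepare_output() {
    [ -d "$OUT_DIR" ] || mkdir -p "$OUT_DIR"
}
```

Calling fetch_ucd and then prepare_output from the package root would mirror the first two steps. Because the download is skipped whenever data already exists, removing that directory is what forces a fresh fetch in this sketch.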
4.2 Internal module codepoint/generator
(require codepoint/generator) | package: codepoint |
This module provides both a function interface and a main module. The main module is the entry point for the generated ucd-generator executable, but it can also be invoked directly as racket ./private/generator.rkt.
4.3 Internal module codepoint/ucd
(require codepoint/ucd) | package: codepoint |
This module also provides both a function interface and a main module. While it can be invoked on its own to download data files, it is typically invoked by the generate-modules function or the ucd-generator executable.