GTP measure
(require gtp-measure) | package: gtp-measure |
1 Command-line: raco gtp-measure
See also: GTP targets
See also: GTP configuration
To see all accepted flags: raco gtp-measure --help
To measure performance and print status messages:
PLTSTDERR="error info@gtp-measure" raco gtp-measure ....
1.1 Stages of measurement
After gtp-measure is invoked on the command line, it operates in five stages:
- resolve the command-line targets to actual files / directories;
- resolve the command-line configuration options to a configuration;
- set up a measuring task, based on the targets;
- divide the measuring task into sub-tasks;
- collect data and write it to the task’s data directory.
1.2 Configuration and Data Files
The gtp-measure library uses the basedir library to obtain configuration and data files.
User-level configuration settings are stored in the file:
(writable-config-file "config.rktd" #:program "gtp-measure")
Each task gets a data directory, stored under:
(writable-data-dir #:program "gtp-measure")
Together, the data files and command-line arguments build a gtp-measure configuration value. See Configuration Fallback for details on how these data sources work together.
2 GTP targets
- gtp file target : a file containing a Racket module and exactly one call to time-apply (possibly via time);
- gtp typed-untyped target : a directory containing: (1) a "typed" directory, (2) an "untyped" directory, (3) optionally a "base" directory, and (4) optionally a "both" directory.
  - The "typed" directory must contain a few typed/racket modules.
  - The "untyped" directory must contain matching Racket modules. These modules must have the same names as the modules in the "typed" directory, and should have the same code as the typed modules, just without the type annotations and type casts.
  - The optional "base" directory may contain data files that the "typed" and "untyped" modules may reference via a relative path (e.g. "../base/file.rkt").
  - The optional "both" directory may contain modules that the "typed" and "untyped" modules may reference as if they were in the same directory (e.g. "file.rkt"). If so, the "typed" and "untyped" modules will not compile unless the "both" modules are copied into their directory. This is by design.
- gtp manifest target : a file containing a gtp-measure/manifest module.
- gtp deep-shallow-untyped target : a directory containing: (1) a "typed" directory, (2) an "untyped" directory, (3) a "shallow" directory, (4) optionally a "base" directory, and (5) optionally a "both" directory.
  - The "typed" and "untyped" directories must follow the same guidelines as for a typed-untyped target.
  - The "shallow" directory must contain matching typed modules in Transient mode (#lang typed/racket #:transient).
  - The optional "base" directory may contain data files that the "typed" and "untyped" modules may reference via a relative path (e.g. "../base/file.rkt").
  - The optional "both" directory may contain modules that the "typed" and "untyped" modules may reference as if they were in the same directory (e.g. "file.rkt"). If so, the "typed" and "untyped" modules will not compile unless the "both" modules are copied into their directory. This is by design.
To measure a file target, gtp-measure compiles the file once, then repeatedly runs the file and parses the output of time-apply. See GTP configuration for details on how gtp-measure compiles and runs Racket modules.
To measure a typed-untyped target, gtp-measure chooses a sequence of typed-untyped configurations and, for each one, copies the configuration to a directory and runs the configuration’s entry module as a file target. The sequence of configurations is either exhaustive or approximate.
To measure a manifest target, gtp-measure runs the targets listed in the manifest.
To measure a deep-shallow-untyped target, the protocol is similar to typed-untyped targets.
2.1 Typed-Untyped Configuration
A typed-untyped configuration for a typed-untyped target with M modules is a working program with M modules, each taken from either the "typed" directory or the "untyped" directory.
The gtp-measure library encodes such a configuration with a string of length M where each character is either #\0 or #\1. If the character at position i is #\0, the configuration uses the i-th module in the "untyped" directory and ignores the i-th module in the "typed" directory. If the character at position i is #\1, the configuration uses the i-th "typed" module and ignores the "untyped" module. Modules are ordered by filename-sort.
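The encoding can be illustrated with a small sketch. The function name configuration->modules is hypothetical (it is not part of gtp-measure), and the module names are assumed to be already sorted by filename:

```racket
#lang racket/base

;; configuration->modules : string (listof string) -> (listof (cons string string))
;; Hypothetical helper: pair each module name with the directory whose copy
;; of the module the configuration uses.
;; Assumes `module-names` is already sorted by filename.
(define (configuration->modules bits module-names)
  (unless (= (string-length bits) (length module-names))
    (raise-argument-error 'configuration->modules
                          "a string with one character per module"
                          bits))
  (for/list ([c (in-string bits)]
             [name (in-list module-names)])
    (cons (case c
            [(#\0) "untyped"]
            [(#\1) "typed"]
            [else (raise-argument-error 'configuration->modules
                                        "a string of #\\0 and #\\1 characters"
                                        bits)])
          name)))

(configuration->modules "010" '("a.rkt" "b.rkt" "main.rkt"))
;; => '(("untyped" . "a.rkt") ("typed" . "b.rkt") ("untyped" . "main.rkt"))
```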
2.2 Exhaustive vs. Approximate evaluation
An exhaustive evaluation of a typed-untyped target with M modules measures the performance of all 2^M configurations. This is a lot of measuring, and will probably take a very long time if M is 15 or more.
An R-S-approximate evaluation measures R * S * M randomly-selected configurations; more precisely, R sequences, each containing S*M configurations. This number, R*S*M, is probably less than 2^M. (If it’s not, just do an exhaustive evaluation.) See GTP configuration for how to set R and S, and how to switch from an exhaustive evaluation to an approximate one.
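To make the difference in scale concrete, here is a minimal sketch that computes both counts. The function names are hypothetical, and random-configuration only illustrates what a randomly-selected configuration string looks like; how gtp-measure itself draws its samples is not described here.

```racket
#lang racket/base

;; Number of configurations measured by an exhaustive evaluation.
(define (exhaustive-count M)
  (expt 2 M))

;; Number of configurations measured by an R-S-approximate evaluation:
;; R sequences of S*M configurations each.
(define (approximate-count R S M)
  (* R S M))

;; random-configuration : natural -> string
;; One random configuration string of length M (illustration only).
(define (random-configuration M)
  (build-string M (lambda (_i) (if (zero? (random 2)) #\0 #\1))))

(exhaustive-count 15)         ; => 32768
(approximate-count 10 10 15)  ; => 1500
```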
The idea of an approximate evaluation comes from our work on Typed Racket. Greenman and Migeed (PEPM 2018) give a more precise definition, and apply the idea to Reticulated Python. Note that gtp-measure uses a different definition of S than the PEPM paper.
2.3 Design: typed-untyped directory
The point of a typed-untyped directory is to describe an exponentially-large set of programs in “less than exponential” space. The set is all ways of taking a Typed Racket program and removing the types from some of its modules.
The "typed" and "untyped" directories are a first step to reduce space. Instead of storing all 2^M programs for a program with M modules, we store 2*M modules. The reason we store 2*M modules instead of just M typed modules is that we do not have a way to automatically remove types from a Typed Racket program (to remove types, we sometimes want to translate type casts to Racket).
The "base" directory is a second way to save space. If a program depends on data or libraries, they belong in the "base" directory so that all configurations can reference one copy.
The "both" directory helps us automatically generate configurations by solving a technical problem. The problem is that if an untyped module defines a struct and two typed modules import it, both typed modules need to reference a canonical require/typed for the struct’s type definitions. We solve this by putting a type adaptor module with the require/typed in the "both" directory. An adaptor can require "typed" or "untyped" modules, and typed modules can require the adaptor.
3 GTP configuration
(require gtp-measure/configure) | package: gtp-measure |
The gtp-measure library is parameterized by a set of key/value pairs. This section documents the available keys and the type of values each key expects.
key:bin : symbol
Used to compile and run Racket programs.
In particular, if <BIN> is the value of key:bin then the command to compile the target <FILE> is:
<BIN>/raco make -v <FILE>
and the command to run <FILE> is:
<BIN>/racket <FILE>
Since this package was originally created to measure the GTP benchmarks, which depend on the require-typed-check package, invoking raco gtp-measure ensures that the package is installed for the current value of key:bin. If the package is missing, <BIN>/raco pkg installs it.
Changed in version 0.3 of package gtp-measure: Automatically install require-typed-check if missing.
key:iterations : symbol
Determines the number of times to run a file target and collect data.
key:jit-warmup : symbol
Determines the number of times (if any) to run a file target and ignore the output BEFORE collecting data.
key:num-samples : symbol
Determines R, the number of samples for any approximate evaluations.
key:sample-factor : symbol
Determines the size of each sample in any approximate evaluations. The size is S*M, where S is the value associated with this key and M is the number of modules in the typed-untyped target.
key:cutoff : symbol
Determines whether to run an exhaustive or approximate evaluation for a typed-untyped target. Let M be the number of modules in the target and let C be the value associated with this key. If (<= M C), then gtp-measure runs an exhaustive evaluation; otherwise, it runs an approximate evaluation.
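The cutoff rule can be sketched as follows. The function evaluation-plan and its keyword arguments are hypothetical. The default values 9, 10, and 10 are the ones this document lists for key:cutoff, key:num-samples, and key:sample-factor on the machine that rendered it.

```racket
#lang racket/base

;; evaluation-plan : natural -> (list symbol natural)
;; Hypothetical helper: report which kind of evaluation applies to a target
;; with M modules, and how many configurations it would measure.
(define (evaluation-plan M
                         #:cutoff [C 9]
                         #:num-samples [R 10]
                         #:sample-factor [S 10])
  (if (<= M C)
      (list 'exhaustive (expt 2 M))
      (list 'approximate (* R S M))))

(evaluation-plan 5)   ; => '(exhaustive 32)
(evaluation-plan 12)  ; => '(approximate 1200)
```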
key:entry-point : symbol
Determines the entry module of all typed-untyped targets. This module is treated as a file target for each configuration in the typed-untyped evaluation.
key:start-time : symbol
By default, this is the value of current-inexact-milliseconds when gtp-measure was invoked. You should probably not override this default.
key:time-limit : symbol
Sets a time limit for the total time to run a configuration. If the value is #false then there is no time limit. Otherwise, the value is the time limit in seconds.
The total time includes all the warmup iterations and all the collecting iterations.
See also Time Limit Parsing.
Added in version 0.3 of package gtp-measure.
key:argv : symbol
By default, this is the value of (vector->list (current-command-line-arguments)) when gtp-measure was invoked. You should probably not override this default.
All intermediate files and all results are saved in the given directory.
3.1 Configuration Fallback
The gtp-measure library defines a default value for each configuration key. Users can override this default by writing a hashtable with relevant keys (a subset of the keys listed above) to their configuration file. Users can override both the defaults and their global configuration by supplying a command-line flag. Run raco gtp-measure --help to see available flags.
The defaults for the machine that rendered this document are the following:
key:bin = "/home/root/racket/bin/"
key:iterations = 8
key:jit-warmup = 1
key:num-samples = 10
key:sample-factor = 10
key:cutoff = 9
key:entry-point = "main.rkt"
key:start-time = 0
key:time-limit = #f
key:argv = ()
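The fallback order (defaults, then the user's configuration file, then command-line flags) amounts to merging tables where later sources win. The sketch below shows that idea only. merge-configs is a hypothetical helper, not gtp-measure's implementation, and the symbol keys are assumptions modeled on the iterations key that appears in the manifest example.

```racket
#lang racket/base

;; Hypothetical stand-ins for the three data sources.
(define defaults    (hash 'iterations 8 'jit-warmup 1 'cutoff 9))
(define user-config (hash 'iterations 10)) ; e.g. read from the config file
(define cmdline     (hash 'cutoff 5))      ; e.g. parsed from flags

;; merge-configs : hash ... -> hash
;; Later tables take priority over earlier ones.
(define (merge-configs . tables)
  (for*/fold ([acc (hash)])
             ([t (in-list tables)]
              [(k v) (in-hash t)])
    (hash-set acc k v)))

(define config (merge-configs defaults user-config cmdline))

(hash-ref config 'iterations) ; => 10 (from the user configuration)
(hash-ref config 'jit-warmup) ; => 1  (from the defaults)
(hash-ref config 'cutoff)     ; => 5  (from the command line)
```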
4 GTP measuring task
A task describes a sequence of targets to measure.
4.1 GTP task setup
Before measuring the targets in a task, the gtp-measure library allocates a directory for the task and writes files that describe what is to be run. If the task is interrupted, gtp-measure may be able to resume the task; run raco gtp-measure --help for instructions.
4.2 GTP sub-task
A sub-task is one unit of a task. This concept is not well-defined. The idea is to divide measuring tasks into small pieces so there is little to recompute if a task is interrupted.
More later.
5 Data Description Languages
The gtp-measure library includes a few small languages to describe data formats.
5.1 Manifest = Benchmark Instructions
#lang gtp-measure/manifest | package: gtp-measure |
#lang gtp-measure/manifest

#:config #hash((iterations . 10))
file-0.rkt
typed-untyped-dir-0
"file-1.rkt"
("file-2.rkt" . file)
(typed-untyped-dir-1 . typed-untyped)
There is an internal syntax class for these “target descriptors” that should be made public.
5.2 Output Data: File Target
#lang gtp-measure/output/file | package: gtp-measure |
- successful time output, containing the CPU time, real time, and GC time;
- a Racket runtime error message;
- or a timeout notice ("timeout N").
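A reader of such data needs to tell the three cases apart. The sketch below is one way to do so, under stated assumptions: the time format is taken from the sample strings in this document ("cpu time: 566 real time: 567 gc time: 62"), N in a timeout notice is assumed to be a natural number, and any other string is treated as an error message. parse-output-line is a hypothetical name.

```racket
#lang racket/base

(define time-rx
  #rx"^cpu time: ([0-9]+) real time: ([0-9]+) gc time: ([0-9]+)$")
(define timeout-rx
  #rx"^timeout ([0-9]+)$")

;; parse-output-line : string -> list
;; Returns (list 'time cpu real gc), (list 'timeout n), or (list 'error str).
(define (parse-output-line str)
  (cond
    [(regexp-match time-rx str)
     => (lambda (m) (cons 'time (map string->number (cdr m))))]
    [(regexp-match timeout-rx str)
     => (lambda (m) (list 'timeout (string->number (cadr m))))]
    [else (list 'error str)]))

(parse-output-line "cpu time: 566 real time: 567 gc time: 62")
;; => '(time 566 567 62)
(parse-output-line "timeout 30")
;; => '(timeout 30)
```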
5.3 Output Data: Typed-Untyped Target
#lang gtp-measure/output/typed-untyped
("00000" ("cpu time: 566 real time: 567 gc time: 62" "cpu time: 577 real time: 578 gc time: 62"))
("00001" ("cpu time: 820 real time: 822 gc time: 46" "cpu time: 793 real time: 795 gc time: 44"))
("00010" ("cpu time: 561 real time: 562 gc time: 46" "cpu time: 565 real time: 566 gc time: 44"))
("00011" ("cpu time: 805 real time: 807 gc time: 47" "cpu time: 813 real time: 815 gc time: 45"))
....
$ racket jpeg-2020-08-17.rktd
dataset info:
- num configs: 32
- num timings: 256
- min time: 110 ms
- max time: 8453 ms
- total time: 968537 ms
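A summary of this shape can be computed from the rows with a few lines of Racket. This is a sketch under assumptions, not the library's code: summarize and real-time are hypothetical names, and the choice of the "real time" field is a guess, since this document does not say which field the printed summary uses. The data is the four sample rows shown in this section.

```racket
#lang racket/base

(define data
  '(("00000" ("cpu time: 566 real time: 567 gc time: 62"
              "cpu time: 577 real time: 578 gc time: 62"))
    ("00001" ("cpu time: 820 real time: 822 gc time: 46"
              "cpu time: 793 real time: 795 gc time: 44"))
    ("00010" ("cpu time: 561 real time: 562 gc time: 46"
              "cpu time: 565 real time: 566 gc time: 44"))
    ("00011" ("cpu time: 805 real time: 807 gc time: 47"
              "cpu time: 813 real time: 815 gc time: 45"))))

;; real-time : string -> natural
;; Extract the "real time" field (an assumed choice) from one timing string.
(define (real-time str)
  (define m (regexp-match #rx"real time: ([0-9]+)" str))
  (unless m (error 'real-time "not a time line: ~s" str))
  (string->number (cadr m)))

;; summarize : (listof (list string (listof string))) -> hash
(define (summarize rows)
  (define times
    (for*/list ([row (in-list rows)]
                [line (in-list (cadr row))])
      (real-time line)))
  (hash 'num-configs (length rows)
        'num-timings (length times)
        'min-time (apply min times)
        'max-time (apply max times)
        'total-time (apply + times)))

(define info (summarize data))
(hash-ref info 'num-configs) ; => 4
(hash-ref info 'num-timings) ; => 8
(hash-ref info 'min-time)    ; => 562
(hash-ref info 'max-time)    ; => 822
(hash-ref info 'total-time)  ; => 5512
```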
5.4 Output Data: Deep-Shallow-Untyped Target
#lang gtp-measure/output/deep-shallow-untyped
("00000" ("cpu time: 325 real time: 325 gc time: 60"))
("00001" ("cpu time: 336 real time: 336 gc time: 64"))
("00002" ("cpu time: 332 real time: 332 gc time: 64"))
("00010" ("cpu time: 7059 real time: 7061 gc time: 70"))
("00020" ("cpu time: 410 real time: 410 gc time: 64"))
("00011" ("cpu time: 7119 real time: 7121 gc time: 76"))
("00012" ("cpu time: 7035 real time: 7037 gc time: 76"))
("00021" ("cpu time: 426 real time: 426 gc time: 63"))
("00022" ("cpu time: 433 real time: 433 gc time: 77"))
("00100" ("cpu time: 7154 real time: 7158 gc time: 80"))
....
$ racket jpeg-2020-08-17.rktd
dataset info:
- num configs: 243
- num timings: 1944
- min time: 117 ms
- max time: 9036 ms
- total time: 6827787 ms
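The configuration counts in the two summaries, 32 and 243, equal 2^5 and 3^5. That is what one would expect for a five-module program in which each position of the configuration string takes one of two or three digits. The timing counts, 256 and 1944, are each eight times the configuration count. The sketch below enumerates such strings; all-configurations is a hypothetical helper, and this document does not specify what the digits mean in a deep-shallow-untyped configuration.

```racket
#lang racket/base

;; all-configurations : natural (listof char) -> (listof string)
;; Every string of length M over the given digits; the first position
;; varies slowest.
(define (all-configurations M digits)
  (cond
    [(zero? M) '("")]
    [else
     (define shorter (all-configurations (sub1 M) digits))
     (for*/list ([d (in-list digits)]
                 [s (in-list shorter)])
       (string-append (string d) s))]))

(length (all-configurations 5 '(#\0 #\1)))     ; => 32
(length (all-configurations 5 '(#\0 #\1 #\2))) ; => 243
(all-configurations 2 '(#\0 #\1))
;; => '("00" "01" "10" "11")
```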
6 gtp-measure Utilities
6.1 Time Limit Parsing
procedure
(string->time-limit str) → exact-nonnegative-integer?
str : string?
> (string->time-limit "1")
1
> (string->time-limit "1s")
1
> (string->time-limit "1m")
60
> (string->time-limit "1h")
3600
procedure
(hours->seconds h) → exact-nonnegative-integer?
h : exact-nonnegative-integer?
procedure
(minutes->seconds m) → exact-nonnegative-integer?
m : exact-nonnegative-integer?
> (hours->seconds 1)
3600
> (minutes->seconds 1)
60
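For readers who want to see how such parsing might work, here is a standalone sketch that reproduces the four string->time-limit examples above. It is not the library's implementation: parse-time-limit is a hypothetical name, and the accepted syntax (digits followed by an optional s, m, or h) is inferred only from those examples.

```racket
#lang racket/base

;; parse-time-limit : string -> exact-nonnegative-integer
;; Convert "N", "Ns", "Nm", or "Nh" to a number of seconds.
(define (parse-time-limit str)
  (define m (regexp-match #rx"^([0-9]+)([smh]?)$" str))
  (unless m
    (raise-argument-error 'parse-time-limit
                          "a string of digits with an optional s, m, or h suffix"
                          str))
  (define n (string->number (cadr m)))
  (case (caddr m)
    [("" "s") n]
    [("m") (* 60 n)]
    [("h") (* 60 60 n)]))

(parse-time-limit "1")  ; => 1
(parse-time-limit "1s") ; => 1
(parse-time-limit "1m") ; => 60
(parse-time-limit "1h") ; => 3600
```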