microyaml
(require microyaml) | package: microyaml
1 Introduction
This library is a much faster, non-compliant YAML parser for Racket and Typed Racket. It is non-compliant because some features are not supported.
1.1 Supported features
Inline values (strings, numbers, booleans)
Multi-line sequences
Multi-line mappings
Explicit mappings
Nesting
Comments
UTF-8
1.2 Unsupported features
List marker immediately followed by indentation
Inline sequences (square brackets notation)
Inline mappings (curly braces notation)
Multi-line strings (whether arrow, pipe, implicit, or quoted notation)
Document markers (triple dash/triple dot notation)
Tags (exclamation mark notation)
Anchor nodes (ampersand notation)
Alias nodes (asterisk notation)
Directives (percent notation)
Escape sequences (backslash notation)
Serialisation
1.3 Performance
Benchmark file: ‘femboy_music_FRAGILE.yaml’ (105k lines; 2.9 MB) (source)
| Language   | Program             | Speed-down = Time    |
|------------|---------------------|----------------------|
| JavaScript | js-yaml             | 1x = 180 ms          |
| Racket     | microyaml (strings) | 2x = 370 ms          |
| Racket     | microyaml (types)   | 3x = 481 ms          |
| .NET       | YamlDotNet          | 6x = 1.17 s          |
| Python     | yaml                | 35x = 6.40 s         |
| Racket     | yaml                | 23,500x = 1h 10m 33s |
Benchmark file: ‘saltern.yaml’ (38k lines; 719 kB) (source, lightly edited)
| Language   | Program             | Speed-down = Time |
|------------|---------------------|-------------------|
| Racket     | microyaml (types)   | 1x = 209 ms |
| .NET       | YamlDotNet          | 4x = 795 ms |
| Racket     | yaml                | 2,124x = 7m 15s |
| Racket     | microyaml (strings) | in strings mode, found explicit hash key [...] which is not a string |
| JavaScript | js-yaml             | YAMLException: duplicated mapping key at 349:8 [...] |
| Python     | yaml                | ConstructorError: while constructing a mapping, found unhashable key |
2 Provides
microyaml allows data to be parsed in strings mode or typed mode. Each mode has corresponding functions to parse data in that mode.
The mode refers only to how data is parsed, not to the Racket language you write in: both modes can be used from Typed Racket and from standard Racket. Which mode to use depends on your use case. For example, if you want numbers to be parsed as numbers, use typed mode.
2.1 Typed mode
procedure
(port->yaml in) → yaml-value?
in : input-port?
procedure
(file->yaml file) → yaml-value?
file : path-string?
In typed mode, which is the closer of the two modes to YAML (though still not spec-compliant), YAML values are determined by:
procedure
(yaml-value? v) → boolean?
v : any/c
Hash: an immutable hash? where each key is a yaml-key? and each value is a yaml-value?
Sequence: an immutable list? of yaml-value?
type
type
type
type
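As a minimal sketch of typed mode in use, the snippet below parses a small document from a string port with `port->yaml` and reads one value back. The YAML text and the name `doc` are made up for illustration. The lookup assumes that plain scalar keys come back as symbols, which is what the printed output in the Examples section suggests; a default of `#f` is passed to `hash-ref` so the lookup does not raise if that assumption is wrong.

```racket
#lang racket/base
(require microyaml)

;; Parse a two-line document from an in-memory port in typed mode.
;; The YAML text here is illustrative only.
(define doc
  (port->yaml (open-input-string "name: demo\ncount: 3\n")))

;; A typed-mode result should satisfy the typed-mode predicate.
(yaml-value? doc)

;; Look up a key. ASSUMPTION: scalar keys are symbols, as the example
;; output in this documentation prints them. The #f default keeps the
;; lookup from raising if the key is absent.
(hash-ref doc 'count #f)
```

`file->yaml` takes a path instead of a port and returns the same kind of value, so the lookup code does not change when reading from disk.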
2.2 Strings mode
procedure
(port->yaml/string in) → yaml-value-string?
in : input-port?
procedure
(file->yaml/string file) → yaml-value-string?
file : path-string?
In strings mode, which is closer to StrictYAML (but not spec-compliant with it), YAML values are determined by:
procedure
(yaml-value-string? v) → boolean?
v : any/c
Scalar: just a string?
Hash: an immutable hasheq? where each key is a symbol? and each value is a yaml-value-string?
Sequence: an immutable list? of yaml-value-string?
Since hash keys must always be symbol? in strings mode, hash keys defined as non-scalars using question mark syntax are not supported, and will produce an error.
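Because every scalar is a string in strings mode, numeric values have to be converted by the caller. The sketch below shows one way to do that with Racket's built-in `string->number`. It works on a `hasheq` literal shaped like strings-mode output, so it runs without parsing anything; the names `config` and `ref-number` are hypothetical and not part of microyaml.

```racket
#lang racket/base

;; A value shaped like strings-mode output: symbol keys, string scalars.
;; Written as a literal so the sketch does not depend on the parser.
(define config
  (hasheq 'decimal "123.4"
          'rational "7/8"
          'message "hello world"))

;; Hypothetical helper (not part of microyaml): look up a key and convert
;; its string to a number. Returns #f when the key is missing or the
;; string does not read as a number.
(define (ref-number h key)
  (define s (hash-ref h key #f))
  (and (string? s) (string->number s)))

(ref-number config 'decimal)   ; 123.4
(ref-number config 'rational)  ; 7/8
(ref-number config 'message)   ; #f
```

Keeping the conversion at the call site means the caller decides which fields are numeric, which is the trade-off strings mode makes compared with typed mode.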
3 Examples
Example YAML file:
message: hello world
decimal: 123.4
rational: 7/8
blank:
nested:
  hash:
    key: value
  list:
    - 5
    - 6
    - - 7.1
      - 7.2
> (define typed-output (port->yaml (open-input-string example-yaml)))
> typed-output
'#hash((blank . null)
       (decimal . 123.4)
       (message . "hello world")
       (nested
        .
        #hash((hash . #hash((key . "value"))) (list . (5 6 (7.1 7.2)))))
       (rational . 7/8))
> (yaml-value? typed-output)
#t
> (yaml-value-string? typed-output)
#f
> (define string-output (port->yaml/string (open-input-string example-yaml)))
> string-output
'#hasheq((blank . "")
         (decimal . "123.4")
         (message . "hello world")
         (nested
          .
          #hasheq((hash . #hasheq((key . "value")))
                  (list . ("5" "6" ("7.1" "7.2")))))
         (rational . "7/8"))
> (yaml-value? string-output)
#t
> (yaml-value-string? string-output)
#t