1.1 Specifying languages
This section describes the syntax of the syntax-spec metalanguage, used to describe the grammar, binding structure, and host interface of a DSL.
Language specifications are made via the subforms of syntax-spec, which must be used at module-level.
syntax
(syntax-spec spec-def ...)
spec-def = binding-class | extension-class | nonterminal | host-interface
The following subsections address each kind of declaration allowed within the syntax-spec form.
1.1.1 Binding classes
Binding classes distinguish types of binding. When a reference resolves to a binder, it is an error if the binding class declared for the reference position does not match the binding class of the binding position.
syntax
maybe-description = #:description string-literal |
maybe-binding-space = #:binding-space space-symbol |
The #:description option provides a user-friendly phrase describing the kind of binding. This description is used in error messages.
The #:binding-space option specifies a binding space to use for all bindings and references declared with this class.
Operationally, the binding space declaration causes the syntax-spec expander to add the binding space scope to bindings and references. The scope is added to the scope sets of all binding occurrences. When parsing a reference position declared with a binding class that has an associated binding space, the name that is looked up is augmented with the binding class scope in order to give it access to bindings defined in the space.
1.1.2 Extension classes
Extension classes distinguish types of extensions to languages. A syntax transformer is tagged with an extension class using define-dsl-syntax. Nonterminals can be declared extensible by a certain extension class using #:allow-extension. These extensions are expanded away into core DSL forms before compilation.
syntax
maybe-description = #:description string-literal |
maybe-binding-space = #:binding-space space-symbol |
The #:description option provides a user-friendly phrase describing the kind of extension. This description is used in error messages.
The #:binding-space option specifies a binding space to use for all extensions with this class.
1.1.3 Nonterminals
syntax
Defines a nonterminal supporting let-like binding structure.
Example:
(syntax-spec (binding-class my-var) (nonterminal my-expr n:number x:my-var (my-let ([x:my-var e:my-expr]) body:my-expr) #:binding (scope (bind x) body)))
syntax
(nonterminal/nesting id (nested-id) nonterminal-options production ...)
Defines a nesting nonterminal supporting nested, let*-like binding structure. Nesting nonterminals may also be used to describe complex binding structures like for match.
Example:
(syntax-spec (binding-class my-var) (nonterminal my-expr n:number x:my-var (my-let* (b:binding-pair ...) body:my-expr) #:binding (nest b body)) (nonterminal/nesting binding-pair (nested) [x:my-var e:my-expr] #:binding (scope (bind x) nested)))
syntax
(nonterminal/exporting id nonterminal-options production ...)
Defines an exporting nonterminal which can export bindings, like define and begin.
Example:
(syntax-spec (binding-class my-var) (nonterminal/exporting my-defn (my-define x:my-var e:my-expr) #:binding (export x) (my-begin d:my-defn ...) #:binding (re-export d)) (nonterminal my-expr n:number))
1.1.3.1 Nonterminal options
nonterminal-options | = |
| |||||
maybe-description | = | #:description string-literal | |||||
| | |||||||
maybe-allow-extension | = | #:allow-extension extension-class-spec | |||||
| | |||||||
extension-class-spec | = | extension-class-id | |||||
| | (extension-class-id ...) | ||||||
maybe-binding-space | = | #:binding-space space-symbol | |||||
| |
The #:description option provides a user-friendly phrase describing the kind of nonterminal. This description is used in error messages.
The #:allow-extension option makes the nonterminal extensible by macros of the given extension class(es).
The #:binding-space option specifies a binding space to use for all bindings and references declared with this nonterminal.
1.1.3.2 Productions
production | = | rewrite-production | |||||
| | form-production | ||||||
| | syntax-production | ||||||
rewrite-production | = |
| |||||
form-production | = | (form-id . syntax-spec) maybe-binding-spec | |||||
| | form-id | ||||||
syntax-production | = | syntax-spec maybe-binding-spec | |||||
maybe-binding-spec | = | #:binding binding-spec | |||||
| |
A rewrite production allows certain terms to be re-written into other forms. For example, you might want to tag literals:
(syntax-spec (nonterminal peg (~> (~or s:string s:char s:number s:regexp) #:with #%peg-datum (datum->syntax #'s '#%peg-datum) #'(#%peg-datum s)) ...))
Rewrite productions don’t have binding specs since they declare an expansion of surface syntax into a another DSL form. The don’t necessarily have to expand into a core form like one declared by a form production or a syntax production. A rewrite production can expand into a DSL macro usage or another rewrite production.
Form productions and syntax productions declare core forms in the nonterminal which may have binding specs. If a binding spec is not provided, one is implicitly created. In this case, or if any spec variable is excluded from a binding spec in general, it will be treated as a reference position and implicitly added to the binding spec.
A form production defines a form with the specified name. You may want to use a syntax production if you are re-interpreting racket syntax. For example, if you are implementing your own block macro using syntax-spec:
(syntax-spec (nonterminal/exporting block-form #:allow-extension racket-macro ((~literal define-values) (x:racket-var ...) e:racket-expr) #:binding (export x) ((~literal define-syntaxes) (x:racket-macro ...) e:expr) #:binding (export-syntaxes x e) e:racket-expr))
When a form production’s form is used outside of the context of a syntax-spec DSL, like being used directly in Racket, a syntax error is thrown.
1.1.4 Syntax specs
syntax-spec | = | () | |||
| | keyword | ||||
| | ... | ||||
| | ...+ | ||||
| | (~literal id maybe-space) | ||||
| | (~datum id) | ||||
| | (syntax-spec . syntax-spec) | ||||
| | spec-variable-id:binding-class-id | ||||
| | spec-variable-id:nonterminal-id | ||||
| | spec-variable-id:extension-class-id | ||||
maybe-space | = | #:space space-name | |||
| |
Syntax specs declare the grammar of a DSL form.
() specifies an empty list.
keyword specifies a keyword like #:key .
... specifies zero or more repetitions of the previous syntax spec.
...+ specifies one or more repetitions of the previous syntax spec.
(~literal id maybe-space) specifies a literal identifier and optionally allows the specification of a binding space for the identifier.
(syntax-spec . syntax-spec) specifies a pair of syntax specs.
spec-variable-id:binding-class-id specifies an identifier sub-expression belonging to the specified binding class. For example: x:my-var specifies an identifier of the my-var binding class.
spec-variable-id:nonterminal-id specifies a sub-expression that conforms to the specified nonterminal’s grammar. For example: e:my-expr specifies a sub-expression that is a my-expr, where my-expr is a nonterminal name.
spec-variable-id:extension-class-id specifies a sub-expression that is treated as a phase-1 transformer expression. This is used when your language has macro definitions.
1.1.5 Binding specs
binding-spec | = | spec-variable-id | ||
| | (bind spec-variable-id ...+) | |||
| | (bind-syntax spec-variable-id spec-variable-id) | |||
| | (bind-syntaxes spec-variable-id spec-variable-id) | |||
| | (scope spec ...) | |||
| | [spec ...] | |||
| | (nest spec-variable-id binding-spec) | |||
| | (nest-one spec-variable-id binding-spec) | |||
| | (import spec-variable-id ...+) | |||
| | (export spec-variable-id ...+) | |||
| | (export-syntax spec-variable-id spec-variable-id) | |||
| | (export-syntaxes spec-variable-id spec-variable-id) | |||
| | (re-export spec-variable-id ...+) |
Binding specs declare the binding rules of a DSL’s forms. They allow us to control the scope of bound variables and to check that programs are well-bound before compilation. A binding spec is associated with a production and refers to spec variables from the production
For a spec-variable-id binding spec, if the specified sub-expression is an identifier (the syntax spec would be something like x:my-var), then the identifier is treated as a reference. If the specified sub-expression is something else (the syntax spec would be something like x:my-expr), then the resulting binding structure depends on the content of the sub-expression. If a sub-expression of a production is excluded from its binding spec it is implicitly added as this type of binding spec.
(bind x) declares that the variable specified by x is bound in the current scope. bind must be used inside of a scope and x must be specified with a binding class in its syntax spec like x:my-var.
Example:(syntax-spec (binding-class my-var) (nonterminal my-expr n:number x:my-var (my-let ([x:my-var e:my-expr]) body:my-expr) #:binding (scope (bind x) body))) (bind-syntax x e) declares that the variable specified by x is bound to the transformer specified by e. bind-syntax must be used inside of a scope, x must be specified with a binding class in its syntax spec like x:my-var, and e must be specified with an extension class in its syntax spec like e:my-macro.
Example:(syntax-spec (binding-class my-var) (extension-class my-macro) (nonterminal my-expr #:allow-extension my-macro n:number (my-let-syntax ([x:my-var trans:my-macro]) body:my-expr) #:binding (scope (bind-syntax x trans) body))) bind-syntaxes is similar to bind-syntax, except it binds multiple identifiers.
Example:(syntax-spec (binding-class my-var) (extension-class my-macro) (nonterminal my-expr #:allow-extension my-macro n:number (my-let-syntaxes ([(x:my-var ...) trans:my-macro]) body:my-expr) #:binding (scope (bind-syntaxes x trans) body))) Here, trans should evaluate to multiple transformers using values.scope declares that bindings and sub-expressions in the sub-specs are in a particular scope. Local bindings binding specs like bind must occur directly in a scope binding spec.
(my-let ([x:my-var e:my-expr]) body:my-expr) #:binding (scope (bind x) e) We could instead write(my-let ([x:my-var e:my-expr]) body:my-expr) #:binding [(scope (bind x) body) e] Which adds that e is a sub-expression outside of the scope of the let. All un-referenced syntax spec variables get implicitly added to a group with the provided binding spec, so the former example is equivalent to the latter.nest is used with nesting nonterminals. In particular, the first argument to nest must be a spec variable associated with a nesting nonterminal. The second argument is treated as the "base case" of the "fold". nest should only be used when the first argument has ellipsis depth 1.
Example:
(syntax-spec (binding-class my-var) (nonterminal my-expr n:number x:my-var (my-let* (b:binding-pair ...) body:my-expr) #:binding (nest b body)) (nonterminal/nesting binding-pair (nested) [x:my-var e:my-expr] #:binding (scope (bind x) nested))) The nest binding spec sort of folds over the binding pairs. In this example, it’ll produce binding structure like
[e1 (scope (bind x1) [e2 (scope (bind x2) [... [en (scope (bind xn) body)]])])]
nest-one is similar to nest, except it is used when the first argument has ellipsis depth 0.
Example:(syntax-spec (binding-class pattern-var) (nonterminal clause [p:pat e:racket-expr] #:binding (nest-one p e)) (nonterminal/nesting pat (nested) x:pattern-var #:binding (scope (bind x) nested) ((~literal cons) car-pat:pat cdr-pat:pat) #:binding (nest-one car-pat (nest-one cdr-pat nested)))) (import d) imports the bindings exported from the sub-expression specified by d. import must be used inside of a scope and must refer to a syntax spec associated with an exporting nonterminal.
Example:(syntax-spec (binding-class my-var) (nonterminal/exporting my-defn (my-define x:my-var e:my-expr) #:binding (export x) (my-begin d:my-defn ...) #:binding (re-export d)) (nonterminal my-expr n:number (my-local [d:my-defn ...] body:my-expr) #:binding (scope (import d) body))) (export x) exports the variable specified by x. x must refer to a syntax spec variable associated with a binding class and export can only be used in an exporting nonterminal. See import for an example of usage.
(export-syntax x e) is like bind-syntax, except it exports the binding instead of binding the identifier locally to the current scope. Like export, it can only be used in an exporting nonterminal.
(export-syntaxes x e) is like bind-syntaxes, except it exports the bindings instead of binding the identifiers locally to the current scope. Like export, it can only be used in an exporting nonterminal.
(re-export d) exports all bindings that are exported by d. d must be associated with an exporting nonterminal and re-export can only be used in an exporting nonterminal. See import for an example of usage.
1.1.6 Host interface forms
Host interface forms are the entry point to the DSL from the host language. They often invoke a compiler macro to translate the DSL forms into Racket expressions.
syntax
(host-interface/expression (id . syntax-spec) maybe-binding-spec pattern-directive ... body ...+)
Defines a host interface to be used in expression positions.
Can only be used inside of a syntax-spec block.
An example from the miniKanren DSL:
(syntax-spec ... (host-interface/expression (run n:expr q:term-variable g:goal) #:binding (scope (bind q) g) #`(let ([q (var 'q)]) (map (reify q) (run-goal n (compile-goal g))))))
This defines run, which takes in a Racket expression representing a number, a term variable, and a goal, and invokes the compiler compile-goal to translate the DSL forms into Racket.
syntax
(host-interface/definition (id . syntax-spec) maybe-binding-spec #:lhs [pattern-directive ... body ...+] #:rhs [pattern-directive ... body ...+])
Defines a host interface to be used in a definition context.
#:lhs runs before the right-hand-sides of definitions in the current context expand and must produce the identifier being defined.
#:rhs runs after the left-hand-sides of definitions and must produce the Racket expression whose value will be bound to the identifier (don’t emit the definition syntax, just the syntax for producing the value).
Can only be used inside of a syntax-spec block.
An example from the miniKanren DSL:
(syntax-spec ... (host-interface/definition (defrel (name:relation-name x:term-variable ...) g:goal) #:binding [(export name) (scope (bind x) g)] #:lhs [(symbol-table-set! relation-arity #'name (length (syntax->list #'(x ...)))) #'name] #:rhs [#`(lambda (x ...) (lambda (s) (lambda () (#%app (compile-goal g) s))))]))
This defines defrel, which defines a relation. In the #:lhs, We record arity information about the identifier before producing it. Since the left-hand-sides all run before the right-hand-sides, even if there is mutual recursion, all arity information will be available before any goals are compiled. Note that the #:rhs produces a lambda expression, not a define.
syntax
(host-interface/definitions (id . syntax-spec) maybe-binding-spec pattern-directive ... body ...+)
Defines a host interface to be used in a definition context.
Can be used to produce multiple definitions.
Can only be used inside of a syntax-spec block.
An example from the PEG DSL:
(syntax-spec (host-interface/definitions (define-pegs [name:nonterm p:peg] ...) #:binding (export name) (run-leftrec-check! (attribute name) (attribute p)) #'(begin (define name (lambda (in) (with-reference-compilers ([var immutable-reference-compiler]) (compile-peg p in)))) ...)))
Unlike host-interface/definition, the definitions are directly produced by the host interface.
1.1.7 Defining macros for DSLs
syntax
(define-dsl-syntax name extension-class-id transformer-expr)
Defines a macro of the specified extension class. The transformer expression can be any Racket expression that evaluates to a (-> syntax? syntax?) procedure, so it is possible to use syntax-rules, syntax-case, syntax-parse, etc.
Example:
(define-dsl-syntax conj goal-macro (syntax-parser [(_ g) #'g] [(_ g1 g2 g* ...) #'(conj (conj2 g1 g2) g* ...)]))
This defines a macro conj that expands to a goal in miniKanren.
1.1.8 Embedding Racket syntax
nonterminal
A nonterminal that allows arbitrary host language expressions. Expressions are wrapped with #%host-expression during DSL expansion.
binding class
A binding class for host language bindings.
extension class
A binding class for arbitrary host language transformers.