1.1 Notation
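Like most languages, Rhombus syntax builds on a set of rules for parsing characters into tokens. Unlike most languages, but like Lisp, Scheme, and Racket, Rhombus syntax uses an additional layer of rules for grouping and nesting tokens. For Rhombus, that intermediate structure is shrubbery notation.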
To explore shrubbery notation independent of Rhombus, try #lang shrubbery. The parsed form is represented as an S-expression, so the output is only useful if you’re familiar with S-expression notation.
Numbers are decimal, either integer or floating-point, or they’re hexadecimal, octal, or binary integers written with a 0x, 0o, or 0b prefix, respectively. An underscore can be used to separate digits in a number.
0
42
-42
1_048_576
3.14157
.5
6.022e23
0xf00ba7ba2
0o377
0b1001
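As a rough illustration of the literal forms listed above, the following Python sketch recognizes them with a regular expression and converts them to Python numbers. The names NUMBER and parse_number are hypothetical. Details that the text does not spell out are assumptions, not the shrubbery specification: an optional leading sign, underscores only between digits, and lowercase prefixes only.

```python
import re

# Digits optionally separated by single underscores (assumed: no leading,
# trailing, or doubled underscores).
_DEC = r"[0-9]+(?:_[0-9]+)*"

NUMBER = re.compile(
    r"""[-+]?(?:
          0x[0-9a-fA-F]+(?:_[0-9a-fA-F]+)*      # hexadecimal integer
        | 0o[0-7]+(?:_[0-7]+)*                  # octal integer
        | 0b[01]+(?:_[01]+)*                    # binary integer
        | (?:{d}(?:\.(?:{d})?)?|\.{d})          # decimal integer or fraction
          (?:[eE][-+]?{d})?                     # optional exponent
    )""".format(d=_DEC),
    re.VERBOSE,
)


def parse_number(text):
    """Convert a number literal to a Python int or float (sketch only)."""
    if not NUMBER.fullmatch(text):
        raise ValueError("not a number literal: %r" % text)
    body = text.replace("_", "")          # underscores only separate digits
    unsigned = body.lstrip("+-")
    if unsigned[:2] in ("0x", "0o", "0b"):
        return int(body, 0)               # base is taken from the prefix
    if any(c in body for c in ".eE"):
        return float(body)
    return int(body)
```

For example, parse_number("1_048_576") returns the integer 1048576, and parse_number("0o377") returns 255.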
Identifiers use Unicode alphanumeric characters, _, and emoji sequences, with an initial character that is not numeric.
pi
scissor7
π
underscore_case
camelCase
Keywords are like identifiers, but prefixed with ~ and no space. As a datatype distinct from identifiers, they are useful as names that cannot be misconstrued as bound variables or as any other kind of expression form.
~base
~stronger_than
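The identifier and keyword rules above can be sketched in Python as follows. The function names is_identifier and classify are hypothetical, and the sketch deliberately ignores emoji sequences, which the notation allows but which are awkward to detect with the standard library alone. It also assumes that the text after ~ in a keyword follows the same rules as an identifier.

```python
def is_identifier(text):
    """Unicode alphanumerics and _, with a non-numeric first character."""
    if not text or text[0].isnumeric():
        return False
    return all(ch == "_" or ch.isalnum() for ch in text)


def classify(token):
    """Label a token as 'keyword', 'identifier', or 'other' (sketch only)."""
    # A keyword is ~ followed immediately (no space) by identifier characters.
    if token.startswith("~") and is_identifier(token[1:]):
        return "keyword"
    if is_identifier(token):
        return "identifier"
    return "other"
```

With these definitions, classify("scissor7") gives "identifier", classify("~base") gives "keyword", and classify("7up") gives "other" because of the leading digit.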
The following characters are used for shrubbery structure and are mostly not available for use in operators:
( ) [ ] { } ' ; , : | « » \ " # @
Any other Unicode punctuation or symbol character (but not an emoji) is fair game for an operator:
->
!^$&%$
The : and | characters can be used as part of an operator, even though the characters have a special meaning when used alone. To avoid confusion with blocks, an operator cannot end with : unless it contains only : characters. Similarly, to avoid potential confusion with operators alongside numbers, an operator that ends in +, -, or . must consist only of that character. So, ++ and ... are operators, but !+ is not. Similar problems happen with comments, so an operator cannot contain // or /* or have multiple characters and end in /. A ~ cannot be used by itself as an operator to avoid confusion with ~ to form a keyword.
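The operator restrictions in the preceding paragraph are easier to check when written out as a predicate. The Python sketch below is one reading of those rules, not the shrubbery lexer itself. The name is_operator is hypothetical, the emoji exclusion is not implemented, and treating a lone : or | as structure and _ as an identifier character are assumptions drawn from the surrounding text.

```python
import unicodedata

# Structural characters from the list above. ':' and '|' are left out because
# they may appear inside a longer operator.
STRUCTURAL = set("()[]{}';,«»\\\"#@")


def is_operator(text):
    """Return True if text satisfies the operator rules described above."""
    # Alone, ':' and '|' are structure and '~' is reserved for keywords.
    if not text or text in (":", "|", "~"):
        return False
    for ch in text:
        category = unicodedata.category(ch)
        # Only punctuation (P*) or symbol (S*) characters are allowed;
        # '_' is assumed to belong to identifiers instead.
        if ch in STRUCTURAL or ch == "_" or category[0] not in "PS":
            return False
    # Ending in ':' is allowed only for operators made entirely of ':'.
    if text[-1] == ":" and set(text) != {":"}:
        return False
    # Ending in '+', '-', or '.' requires consisting only of that character.
    if text[-1] in "+-." and set(text) != {text[-1]}:
        return False
    # Avoid confusion with comments.
    if "//" in text or "/*" in text:
        return False
    if len(text) > 1 and text.endswith("/"):
        return False
    return True
```

Under this reading, is_operator("++") and is_operator("...") are true, while is_operator("!+") and is_operator("~") are false, matching the examples in the text.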
Shrubbery notation does not include a notion of operator precedence. Instead, Rhombus builds a precedence-parsing layer on top of shrubbery notation (which is why shrubberies are not full-grown trees). Precedence in Rhombus is macro-defined in the same way that syntactic forms are macro-defined in Rhombus.
Booleans are written with a leading # followed immediately by true or false.
#true
#false
Strings of Unicode characters use double quotes, and byte strings are similar, but with a # prefix. Strings and byte strings support the usual escapes, such as \n for a newline character or byte.
"This is a string,\n just like you'd expect"
#"a byte string"
Comments are C-style, but block comments are nestable.
// This is a line comment
/* This is a multiline
   comment that /* continues */
   on further lines */
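Nestable block comments need a depth counter: a simple search for the first */ is not enough. The following Python sketch shows the idea. The name strip_comments is hypothetical, and the sketch assumes the input contains no string literals, which a real lexer would have to skip over.

```python
def strip_comments(src):
    """Remove // line comments and nestable /* ... */ block comments."""
    out = []
    depth = 0          # current nesting depth of block comments
    i = 0
    n = len(src)
    while i < n:
        two = src[i:i + 2]
        if two == "/*":
            depth += 1
            i += 2
        elif two == "*/" and depth > 0:
            depth -= 1
            i += 2
        elif depth > 0:
            i += 1                      # inside a block comment: skip
        elif two == "//":
            while i < n and src[i] != "\n":
                i += 1                  # skip to end of line, keep the newline
        else:
            out.append(src[i])
            i += 1
    if depth:
        raise ValueError("unterminated block comment")
    return "".join(out)
```

Because the counter only returns to zero at the outermost */, the inner /* continues */ in the example above does not end the enclosing comment.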
To aid interoperability with Racket and to support some rarely useful datatypes, such as characters, shrubbery notation includes an escape to S-expression notation through #{…}. For example, #{list-first} is a single identifier that includes - as one of its characters. A #{…} cannot wrap a list-structured S-expression that uses immediate parentheses, however.
Shrubbery notation is whitespace-sensitive, and it uses line breaks and indentation for grouping. A line with more indentation starts a block, and it’s always after a line that ends with a :. A | alternative also starts a block, and the | itself can start a new line, in which case it must line up with the start of its enclosing form. So, the |s below are written with the same indentation as if, match, or cond to create the alternative cases within those forms:
block:
  println("group within block")
  println("another group within block")

if is_rotten(apple)
| get_another()
| take_bite()
  be_happy()

match x
| 0:
    def zero = x
    x + zero
| n:
    n + 1

cond
| // check the weather
  is_raining():
    take_umbrella()
| // check the destination
  going_to_beach():
    wear_sunscreen()
    take_umbrella()
| // assume a hat is enough
  ~else:
    wear_hat()

In DrRacket, hit Tab to cycle through the possible indentations for a line. See also Shrubbery Support in DrRacket.
A : isn’t needed before the first | in an alts-block, because the | itself is enough of an indication that a sequence of alternatives is starting, but a : is allowed. Some forms support the combination of a : followed by a sequence of | alternatives, but most forms have either a : block or a sequence of | alternatives.
Each line within a block forms a group. Groups are important, because parsing and macro expansion are constrained to operate on groups (although a group can contain nested blocks, etc.). Groups at the same level of indentation as a previous line continue that group’s block. A | can have multiple groups in the subblock to its right. A : block or sequence of | alternatives can only be at the end of an enclosing group.
A : doesn’t have to be followed by a new line, but it starts a new block, anyway. Similarly, a | that starts an alternative doesn’t have to be on a new line. These examples parse the same as the previous examples:
block: println("group within block")
       println("another group within block")

if is_rotten(apple) | get_another() | take_bite()
                                      be_happy()

match x | 0: def zero = x
             x + zero
        | n: n + 1

cond | is_raining(): take_umbrella()
     | going_to_beach(): wear_sunscreen()
                         take_umbrella()
     | ~else: wear_hat()
Within a block, a ; can be used instead of a new line to start a new group, so these examples also parse the same:
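block: println("group within block"); println("another group within block")

if is_rotten(apple) | get_another() | take_bite(); be_happy()

match x | 0: def zero = x; x + zero
        | n: n + 1

cond | is_raining(): take_umbrella()
     | going_to_beach(): wear_sunscreen(); take_umbrella()
     | ~else: wear_hat()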
You can add extra ;s, such as at the end of lines, since ; will never create an empty group.
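The claim that ; never creates an empty group can be modeled with a small splitter. The Python sketch below works on the text of a single block body only. The name split_groups is hypothetical, and the sketch assumes no string literals, comments, or nested : and | blocks in its input.

```python
def split_groups(block_text):
    """Split one block body into groups at top-level ';' separators."""
    groups = []
    current = []
    depth = 0                           # nesting inside (), [], or {}
    for ch in block_text:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if ch == ";" and depth == 0:
            group = "".join(current).strip()
            if group:                   # extra ';' never yields an empty group
                groups.append(group)
            current = []
        else:
            current.append(ch)
    last = "".join(current).strip()
    if last:
        groups.append(last)
    return groups
```

For example, split_groups("wear_sunscreen(); take_umbrella();") yields the two groups "wear_sunscreen()" and "take_umbrella()", with nothing added for the trailing ;.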
Finally, anything that can be written with newlines and indentation can be written on a single line, but « and » may be required to delimit a block, using « just after : or | and » at the end of the block. Normally, parentheses work just as well, since they can be wrapped around any expression.
Parentheses (…), square brackets […], and curly braces {…} combine a sequence of groups. A comma , can be used to separate groups on one line between the opener and closer. Furthermore, a , is required to separate groups, even if they’re not on the same line. You can’t have extra ,s, except after the last group.
f(1, 2,
  3, 4)

["apples",
 "bananas",
 "cookies",
 "milk"]

map(add_five, [1, 2, 3, 4,])
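In contrast to ;, a , is strict: it is required between groups, and only one trailing , is tolerated. The Python sketch below models that rule for the text between an opener and its closer. The name split_comma_groups is hypothetical, and the sketch assumes the input has no string literals or comments.

```python
def split_comma_groups(inner):
    """Split the text between an opener and closer at top-level commas."""
    parts = []
    current = []
    depth = 0                           # nesting inside (), [], or {}
    for ch in inner:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    if parts[-1] == "":
        parts.pop()                     # empty input, or one trailing comma
    if any(part == "" for part in parts):
        raise ValueError("extra comma")
    return parts
```

Applied to the argument text of the last example, split_comma_groups("add_five, [1, 2, 3, 4,]") gives two groups, "add_five" and "[1, 2, 3, 4,]", while an input such as "1,, 2" is rejected.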
Indentation still works for creating blocks within (…), […], or {…}:
map(fun (x):
      x + 5,
    [1, 2, 3, 4])
There are some subtleties related to the “precedence” of :, |, ;, and ,, but they’re likely to work as you expect in a given example.
Single-quote marks '…' are used for quoting code (not strings), as in macros. Quotes work like (…), except that the content is more like a top-level or block sequence, and ; is used as a group separator (optional when groups are on separate lines).
Nested quoting sometimes requires the use of ' « ... » ' so that the nested opening quote is not parsed as a close quote. This counts as a different use of « and » than with : or |, and it doesn’t disable indentation for the quoted code.