8.16.0.1

12.2 Regular Expressions

 import: rhombus/rx package: rhombus-lib

A regular expression, or regexp, can be matched against the content of a string, byte string, or input port. The rx and rx_in forms create regexps, which are represented as RX objects. A successful match is represented as an RXMatch object, which reports either a matching (byte) string or a range of the input.

The rx and rx_in binding forms match input while directly binding named capture groups within the regexp pattern, instead of returning an RXMatch object.

A regexp matches in either character or byte mode. The mode is inferred from the elements of the pattern, but bytes or string can force a choice of mode. A regexp in character mode can be matched against a byte string or input port, in which case it matches UTF-8 sequences whose decoding matches the character regexp. A regexp in byte mode can similarly be matched against a string, in which case it matches a string whose UTF-8 encoding matches the byte regexp. Matches are reported in terms of strings only when the regexp is in character mode and the input is a string; otherwise, matches are reported in terms of bytes.
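The mode rules above can be sketched as follows. This is a minimal illustration, not taken from the reference itself: the open import and the exact spelling of the bytes: pattern operator are assumptions, so check Regexp Patterns before relying on them.

```rhombus
#lang rhombus
import:
  rhombus/rx open   // assumption: `open` makes rx and rx_in available unprefixed

// Character-mode regexp, string input: the match is reported as a string.
rx'any*'.match("abc")

// Same character-mode regexp, byte-string input: the match is reported as bytes.
rx'any*'.match(#"abc")

// Assumed syntax: forcing byte mode with the bytes: operator.
// Even with string input, a byte-mode regexp reports its match as bytes.
rx'bytes: any*'.match("abc")
```

Only the first call combines a character-mode regexp with string input, so it is the only one expected to report a string.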

expression

rx'pat'

 

expression

rx_in'pat'

Creates a regexp, which is represented as an instance of RX.

See Regexp Patterns for patterns that can be used in pat.

The rx form produces a regexp that matches with RX.match only when the whole input string, byte string, or port content matches the pattern. An rx_in regexp matches with RX.match the same as with RX.match_in, which means that it can always match against a portion of the input.

> rx'any*'

rx 'any *'

> rx'any*'.match("abc")

RXMatch("abc", [], {})

> rx'any ($more: any*)'.match("abc")

RXMatch("abc", ["bc"], {#'more: 1})

> rx'any ($more: any*)'.match("abc")[#'more]

"bc"

> rx'["a"-"z"]*'.match("abc")

RXMatch("abc", [], {})

> rx'["a"-"z"]*'.match("_abc_")

#false

> rx_in'["a"-"z"]+'.match("_abc_")

RXMatch("abc", [], {})

> rx_in'["a"-"z"]+'.match_range("_abc_")

RXMatch(1 .. 4, [], {})
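As a further sketch of the whole-input versus partial-input distinction, the same rx regexp can be used with both RX.match and RX.match_in. The name lower is hypothetical, and the open import is an assumption.

```rhombus
#lang rhombus
import:
  rhombus/rx open   // assumption: `open` makes rx available unprefixed

// `lower` is a hypothetical name for a regexp created with rx (not rx_in).
def lower = rx'["a"-"z"]+'

// RX.match on an rx regexp requires the whole input to match,
// so the surrounding underscores prevent a match here.
lower.match("_abc_")

// RX.match_in searches for a match within the input instead.
lower.match_in("_abc_")
```

Choosing rx over rx_in therefore only changes the behavior of RX.match; a partial match remains available through RX.match_in.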

binding operator

rx'pat'

 

binding operator

rx_in'pat'

Matches a string, byte string, or input port whose content matches pat, and binds capture-group names in pat to their corresponding matches.

See Regexp Patterns for patterns that can be used in pat.

The rx and rx_in binding forms are analogous to the rx and rx_in expression forms: rx matches only when the whole input matches, and rx_in can match a part of the input.

> def rx'"hello " ($name: any*)' = "hello alice"

> name

"alice"

> def rx'alpha+' = "!!! alice ???"

def: value does not satisfy annotation

  value: "!!! alice ???"

  annotation: matching(rx'alpha+')

> def rx_in'$who: alpha+' = "!!! alice ???"

> who

"alice"
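Because rx is a binding operator, it should also be usable wherever a binding pattern is accepted, such as a match clause. The following is a minimal sketch under that assumption; greet is a hypothetical helper, and the open import is assumed as well.

```rhombus
#lang rhombus
import:
  rhombus/rx open   // assumption: `open` makes rx available unprefixed

// `greet` is a hypothetical helper, not part of rhombus/rx.
fun greet(s):
  match s
  // Whole-input match; binds `name` to the capture group on success.
  | rx'"hello " ($name: any*)': "hi, " ++ name
  // Fallback when the regexp does not match.
  | ~else: "no greeting found"

greet("hello alice")
greet("goodbye")
```

Using the binding form in a match clause avoids the annotation error that def raises on a failed match, since a non-matching input simply falls through to the next clause.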