randstr: Random String Generator
(require randstr)    package: randstr
A library for generating random strings based on regex-like patterns.
1 Functions
(randstr "[a-z]{5}")
(randstr "[0-9][a-z]+")
(randstr "(abc|def)+")
(randstr* "[0-9]{3}" 5)
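One way to sanity-check these calls is to match the generated strings against the same patterns with Racket's built-in regular expressions. The sketch below assumes that randstr returns a string and that randstr* returns a list of strings whose length is given by its second argument. Neither is stated above, so treat both as assumptions.

```racket
#lang racket/base
(require randstr)

;; Assumption: randstr returns one string generated from the pattern.
(define s (randstr "[a-z]{5}"))
(regexp-match? #px"^[a-z]{5}$" s)

;; Assumption: randstr* returns a list of n generated strings.
(define codes (randstr* "[0-9]{3}" 5))
(for/and ([c (in-list codes)])
  (regexp-match? #px"^[0-9]{3}$" c))
```

If both assumptions hold, each expression evaluates to #t. The #px syntax is used for the checks because it supports the {n} repetition form.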
2 Pattern Syntax
The following pattern syntax is supported:
[abc] - Choose randomly from characters a, b, or c
[a-z] - Choose randomly from lowercase letters a through z
(abc|def) - Choose randomly between "abc" or "def"
a* - Zero or more repetitions of the preceding element (here, a)
a+ - One or more repetitions of the preceding element
a? - Zero or one occurrence of the preceding element
. - Any character
[:alpha:] - Alphabetic characters
[:digit:] - Numeric characters
[:alnum:] - Alphanumeric characters (the POSIX standard name); [:alphanum:] is accepted as an alias
[:word:] - Word characters (alphanumeric plus underscore)
[:blank:] - Blank characters (space and tab)
[:space:] - Whitespace characters
[:upper:] - Uppercase letters
[:lower:] - Lowercase letters
[:ascii:] - ASCII characters
[:cntrl:] - Control characters
[:graph:] - Printable characters except space
[:print:] - Printable characters including space
[:punct:] - Punctuation characters
[:xdigit:] - Hexadecimal digits
\\p{L} - Unicode letters
\\p{N} - Unicode numbers
\\p{P} - Unicode punctuation
\\p{M} - Unicode marks
\\p{S} - Unicode symbols
\\p{Z} - Unicode separators
\\p{C} - Unicode other (control, format, and similar non-graphic characters)
\\p{Lu} - Unicode uppercase letters
\\p{Ll} - Unicode lowercase letters
\\p{Nd} - Unicode decimal numbers
\\p{Letter} - Unicode letters (alias for \\p{L})
\\p{Number} - Unicode numbers (alias for \\p{N})
\\p{Punctuation} - Unicode punctuation (alias for \\p{P})
\\p{Script=Han} - Unicode characters from Han script
\\p{Script=Latin} - Unicode characters from Latin script
\\p{Block=Basic_Latin} - Unicode characters from Basic Latin block
\\p{Block=CJK_Unified_Ideographs} - Unicode characters from CJK Unified Ideographs block
\\p{Alphabetic} - Unicode alphabetic characters
\\p{Uppercase} - Unicode uppercase characters
\\p{Lowercase} - Unicode lowercase characters
\\p{White_Space} - Unicode whitespace characters
\\p{Cased} - Unicode characters with case distinctions
\\p{Dash} - Unicode dash characters
\\p{Emoji} - Unicode emoji characters
\\p{Emoji_Component} - Unicode emoji component characters
\\p{Emoji_Modifier} - Unicode emoji modifier characters
\\p{Emoji_Modifier_Base} - Unicode emoji modifier base characters
\\p{Emoji_Presentation} - Unicode emoji presentation characters
\\p{Extended_Pictographic} - Unicode extended pictographic characters
\\p{Hex_Digit} - Unicode hexadecimal digits
\\p{ID_Continue} - Unicode identifier continuation characters
\\p{ID_Start} - Unicode identifier start characters
\\p{Ideographic} - Unicode ideographic characters
\\p{Math} - Unicode mathematical symbols
\\p{Quotation_Mark} - Unicode quotation mark characters
3 Advanced Examples
In addition to basic patterns, the library supports POSIX character classes and Unicode properties, which can be combined with ranges and quantifiers:
(randstr "[[:alpha:]]{5}")
(randstr "[[:digit:]]{3}")
(randstr "[[:alnum:]]{4}")
(randstr "[[:word:]]+")
(randstr "[[:upper:]0-9]+")
(randstr "[[:lower:]_]+")
(randstr "[[:alpha:]0-9]+")
(randstr "\\p{L}{5}")
(randstr "\\p{N}{3}")
(randstr "\\p{P}{2}")
(randstr "\\p{Lu}{3}\\p{Ll}{3}")
(randstr "\\p{Letter}{5}")
(randstr "\\p{Number}{3}")
(randstr "\\p{Script=Han}{2}")
(randstr "\\p{Block=Basic_Latin}{5}")
(randstr "\\p{Alphabetic}{4}")
(randstr "\\p{White_Space}{3}")
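To see which characters a Unicode property pattern actually produced, the result can be inspected with char-general-category from racket/base. This is a minimal sketch that assumes \\p{Lu} and \\p{Ll} draw from the general categories Racket reports as lu and ll; the names s and categories are illustrative only.

```racket
#lang racket/base
(require randstr)

;; Three uppercase letters followed by three lowercase letters.
(define s (randstr "\\p{Lu}{3}\\p{Ll}{3}"))

;; Assumption: each generated character belongs to the requested
;; Unicode general category, so this list should start with three
;; 'lu symbols followed by three 'll symbols.
(define categories
  (for/list ([c (in-string s)])
    (char-general-category c)))

;; Cross-check with Racket's own \p{...} support in #px patterns.
(regexp-match? #px"^\\p{Lu}{3}\\p{Ll}{3}$" s)
```

The same check works for the other general-category properties, such as \\p{Nd}, by changing the pattern on both sides.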
4 Character Class Duplicate Handling
When a character class contains duplicate elements, each unique character is treated equally regardless of how many times it appears in the class. For example:
[aaabbbccc] - Each of a, b, c has equal probability (1/3 each). Repeats add no weight: in [aab], a and b are each chosen with probability 1/2, not 2/3 and 1/3
[a-cb-e] - Each of a, b, c, d, e has equal probability (1/5 each)
[[:digit:]0-2] - Digits 0, 1, 2 appear in both the POSIX class and the range, but each digit still has equal probability
This keeps selection uniform over the distinct characters in every character class.
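The behavior described above can be sketched as "expand the class, drop duplicates, then pick uniformly". The code below is only an illustration of that idea and does not show how randstr is implemented; char-range and pick-unique are hypothetical helpers, not part of the library.

```racket
#lang racket/base
(require racket/list)

;; Hypothetical helper: expand an inclusive range such as a-c
;; into a list of characters.
(define (char-range lo hi)
  (for/list ([i (in-range (char->integer lo) (add1 (char->integer hi)))])
    (integer->char i)))

;; Hypothetical helper: choose uniformly among the distinct characters.
(define (pick-unique chars)
  (define unique (remove-duplicates chars))
  (list-ref unique (random (length unique))))

;; [aaabbbccc] collapses to three candidates: a, b, c.
(define from-repeats
  (remove-duplicates (string->list "aaabbbccc")))

;; [a-cb-e] collapses to five candidates: a, b, c, d, e.
(define from-ranges
  (remove-duplicates (append (char-range #\a #\c) (char-range #\b #\e))))

;; Each of a, b, c is returned with probability 1/3.
(pick-unique (string->list "aaabbbccc"))
```

Deduplicating before the random draw is what makes the probabilities depend only on the set of characters, not on how the class was written.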
5 License
This project is licensed under the MIT License. See the "LICENSE" file for details.