7.3 Characters
A character is a Unicode code point.
Characters are comparable, which means that generic operations like < and > work on characters.
> Char"a"
Char"a"
| ~else: "no"
"yes"
> Char"too long"
Char: expected a literal single-character string
The classification methods for characters report the following properties:
- alphabetic: Unicode “Alphabetic” property 
- lowercase: Unicode “Lowercase” property 
- uppercase: Unicode “Uppercase” property 
- titlecase: Unicode general category Lt 
- numeric: Unicode “Numeric_Type” property other than None 
- symbolic: Unicode general category Sm, Sc, Sk, or So 
- punctuation: Unicode general category Pc, Pd, Ps, Pe, Pi, Pf, or Po 
- graphic: alphabetic, numeric, symbolic, punctuation, or Unicode general category is Ll, Lm, Lo, Lt, Lu, Nd, Nl, No, Mn, Mc, or Me 
- whitespace: Unicode “White_Space” property 
- blank (horizontal whitespace): Unicode general category Zs, or the tab character
- ISO control: Unicode value between 0x0 and 0x1F (inclusive) or between 0x7F and 0x9F (inclusive) 
- extended pictographic: Unicode “Extended_Pictographic” property 
- general category: #'lu, #'ll, #'lt, #'lm, #'lo, #'mn, #'mc, #'me, #'nd, #'nl, #'no, #'ps, #'pe, #'pi, #'pf, #'pd, #'pc, #'po, #'sc, #'sm, #'sk, #'so, #'zs, #'zp, #'zl, #'cc, #'cf, #'cs, #'co, or #'cn. 
- grapheme break property: #'Other, #'CR, #'LF, #'Control, #'Extend, #'ZWJ, #'Regional_Indicator, #'Prepend, #'SpacingMark, #'L, #'V, #'T, #'LV, or #'LVT 
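Several of the definitions above are stated purely in terms of Unicode general categories or code-point ranges, so they can be checked outside Rhombus. The following is a minimal sketch in Python using the standard unicodedata module. It is not Rhombus code and not the implementation behind the Char methods; the function names are hypothetical and chosen only to mirror the list entries.

```python
import unicodedata

# General categories taken directly from the list entries above.
SYMBOLIC = {"Sm", "Sc", "Sk", "So"}
PUNCTUATION = {"Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po"}


def is_titlecase(ch: str) -> bool:
    """titlecase: Unicode general category Lt."""
    return unicodedata.category(ch) == "Lt"


def is_symbolic(ch: str) -> bool:
    """symbolic: general category Sm, Sc, Sk, or So."""
    return unicodedata.category(ch) in SYMBOLIC


def is_punctuation(ch: str) -> bool:
    """punctuation: general category Pc, Pd, Ps, Pe, Pi, Pf, or Po."""
    return unicodedata.category(ch) in PUNCTUATION


def is_blank(ch: str) -> bool:
    """blank: general category Zs, or the tab character."""
    return ch == "\t" or unicodedata.category(ch) == "Zs"


def is_iso_control(ch: str) -> bool:
    """ISO control: code point in 0x0..0x1F or 0x7F..0x9F (inclusive)."""
    cp = ord(ch)
    return cp <= 0x1F or 0x7F <= cp <= 0x9F


def general_category(ch: str) -> str:
    """Lowercased category name, to match the spelling of the symbols above."""
    return unicodedata.category(ch).lower()
```

For instance, under these definitions a newline is an ISO control character but not blank, while a space is blank but not an ISO control character.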
A value of 0 for state represents the initial state or a state where no characters are pending toward a new boundary. Thus, if a sequence of characters is exhausted and the accumulated state is not 0, then the end of the stream creates one last grapheme-cluster boundary. When Char.grapheme_step produces a true value as its first result and a non-0 value as its second result, then the given ch must be the only character pending toward the next grapheme cluster (by the rules of Unicode grapheme clustering).
The Char.grapheme_step function produces a result for any fixnum state, but the meaning of a non-0 state is otherwise unspecified: the only guarantee is that passing a state produced by one call to Char.grapheme_step into another call continues detecting grapheme-cluster boundaries in the sequence.
See also String.grapheme_span and String.grapheme_count.
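The state-threading protocol described above can be sketched with a deliberately simplified step function. This is a Python illustration, not Rhombus code: toy_step and count_clusters are hypothetical names, the non-0 state values 1 and 2 are arbitrary, and the rules cover only CR LF pairs and combining marks (general categories Mn, Mc, Me), not the full Unicode grapheme-clustering rules. A linefeed that does not follow a carriage return is treated as an ordinary character here, which differs from Unicode.

```python
import unicodedata

# Assumed toy states (not the states used by Char.grapheme_step):
#   0 = nothing pending, 1 = ordinary cluster pending, 2 = lone CR pending.


def toy_step(ch: str, state: int) -> tuple[bool, int]:
    """Return (cluster_terminated, new_state) for the next character."""
    if state == 0:
        # Nothing pending: ch starts a cluster, no boundary is reported yet.
        return (False, 2 if ch == "\r" else 1)
    if state == 2 and ch == "\n":
        # CR LF forms one cluster that ends here with nothing pending.
        return (True, 0)
    if state == 1 and unicodedata.category(ch) in ("Mn", "Mc", "Me"):
        # A combining mark extends the pending cluster.
        return (False, 1)
    # Otherwise the pending cluster ends and ch alone is now pending.
    return (True, 2 if ch == "\r" else 1)


def count_clusters(text: str) -> int:
    """Count clusters by threading the state through every character."""
    state = 0
    count = 0
    for ch in text:
        terminated, state = toy_step(ch, state)
        if terminated:
            count += 1
    # A non-0 state at the end of input means one last cluster is pending.
    if state != 0:
        count += 1
    return count
```

With this sketch, count_clusters("e\u0301x") returns 2, because the combining acute accent extends the cluster started by "e", and the final "x" is counted only by the end-of-input check on the non-0 state.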
As always for a veneer, CharCI works only in static mode (see use_static) to help ensure that it has the intended effect.
#false
#true