Bitset improvements and Unicode support
In R3, a number of changes have just been made to the BITSET! datatype. They will appear in the next alpha release:
- Bitsets directly support Unicode character values (codepoints).
- Bitsets are variable length and auto-expand as necessary.
- New functions (datatype actions) and arguments are supported. For example, you can now AND, OR, and XOR bitsets.
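As a rough sketch of how the points above might look in practice, the following combines two bitsets with OR and builds a third from non-ASCII characters. The word names (digits, hex-letters, hex-digits, greek) are made up for illustration, and the exact behavior in the alpha release may differ from what is shown here.

```rebol
; Illustrative only: word names are hypothetical, not part of R3.
digits:      make bitset! "0123456789"
hex-letters: make bitset! "abcdefABCDEF"

; OR the two bitsets together to get a set that contains both ranges.
hex-digits: digits or hex-letters

if find hex-digits #"c" [print "c is a hex digit"]

; A bitset built from characters outside the ASCII range.
greek: make bitset! "αβγ"
if find greek #"β" [print "beta is in the set"]
```

AND and XOR should follow the same infix pattern as OR.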
These changes are a result of making R3 Unicode-aware. We want code like the following to work for any string, Unicode included:
white-space: make bitset! " ^-^/"
if find white-space a-char [...]
and:
where: find a-string white-space
We also want bitsets that work with Unicode to be efficient and use very little memory, even though Unicode spans a large range of possible characters. In the example above, the bitset only requires 10 bytes.
We also want bitsets to expand when needed. In R2, bitsets were fairly restrictive and errors would be thrown for actions that should, in theory, be valid.
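A minimal sketch of that expansion, assuming APPEND is one of the actions accepted for bitsets in the new release (this post does not say so explicitly) and using the made-up word name marks:

```rebol
; Assumption: APPEND works on bitsets in the new release.
; "marks" is a hypothetical name used only for this example.
marks: make bitset! "abc"

; Add a codepoint far above the highest one currently in the set.
; The bitset is expected to grow as needed rather than throw an error.
append marks #"^(03A9)"   ; Greek capital omega

if find marks #"^(03A9)" [print "omega was added"]
```

The bitset starts out only large enough for the low ASCII characters, so adding the higher codepoint is the case that relies on auto-expansion.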
So, these improvements have been made, and we may make a few more once users get a chance to try them out.
See the R3 Documentation Home Page and click on the bitset link for more information.