REBOL 3.0

PARSE: TO and THRU multiple

Carl Sassenrath, CTO
REBOL Technologies
29-Sep-2009 7:18 GMT

Article #0256

A84 provides a first-draft implementation of TO and THRU commands that match multiple targets.

For example, to match CR, LF, or END in a string, you can write:

to [cr | lf | end]
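
As a minimal sketch of that rule in use (the sample string and the word LINE are made up for illustration, not taken from the release), this copies the first line of a string, whether it ends with a CR, an LF, or the end of the input:

```rebol
; Sample input, made up for illustration: two lines separated by an LF (^/).
str: "first line^/second line"

; COPY captures everything up to the first CR, LF, or END,
; then TO END skips the remainder of the input.
parse str [copy line to [cr | lf | end] to end]

probe line  ; the text that precedes the first line break
```

Including END as a target means the rule still succeeds on a string that contains no line break at all.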

Main rules:

  • For blocks, each target matches a single input value. Use QUOTE for special literals.
  • For strings, a target can be a string, a char, or an integer char value. Matching is case-insensitive unless the /case refinement is used.
  • For binary, a target can be a binary! value, an integer byte value (the lowest 8 bits of the integer), or a char value (less than 256).
  • Each target can be followed by a paren that is evaluated when that target matches. (This lets you set a variable if you need to know which target you hit.)
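
The paren rule can be sketched as follows. The sample string and the word HIT are hypothetical names chosen for this illustration; the sketch assumes the A84 behavior described above, where the paren after a target runs only when that target is the one matched:

```rebol
; Sample input, made up for illustration: the text contains a CR (^M).
str: "abc^Mdef"
hit: none

; Each target is followed by a paren that records which target was reached.
parse str [to [cr (hit: 'cr) | lf (hit: 'lf) | end (hit: 'end)] to end]

probe hit  ; the word naming the target that stopped the TO
```

Recording a word such as 'cr keeps the rule compact; the paren could just as well set a flag or bump a counter.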

A few special notes:

  1. Do not forget the or-bar (|) that separates the targets.
  2. Only singular match rules are supported at this time. Do not use complex rules.
  3. Matching can be very CPU intensive. Don't use string targets where char targets will do (e.g. use #"a", not "a", when possible). Also be aware that using variables for targets slows matching down. (There is no target caching yet.)
  4. You cannot mix string and binary types. Remember that strings are Unicode-oriented and binary is encoded data (such as UTF-8 or anything else).
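
To illustrate note 3, here is a small sketch that writes the same pair of targets both ways. The sample string is made up, and no timings are claimed; the point is only the form of the targets:

```rebol
; Sample input, made up for illustration.
str: "a,b;c"

; Preferred: single-character targets written as char! values.
parse str [any [thru [#"," | #";"]] to end]

; Accepts the same input, but uses string targets where chars would do.
parse str [any [thru ["," | ";"]] to end]
```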

Examples:

This code removes all CRs and LFs from the string:

parse str [any [to [cr | lf] remove skip]]

Count the number of CR and LF chars in a string and display them:

cr's: lf's: 0
parse str [any [thru [cr (++ cr's) | lf (++ lf's)]]]
?? [cr's lf's]

Updated 26-Dec-2024 - Copyright REBOL Technologies