Forms of replacement/composition

Many developers have wanted to add new functions to help with the creation of series (blocks and strings) using replacement and composition methods.

So, I want to summarize a few thoughts on the topic here, then let the REBOL DocBase wiki continue with outlining the spec.

Below are the main composition techniques, which I'm sure are outlined in some computer science textbook, but for this blog, I'll just summarize.

Inline composition:

The composition is generated inline. The abstract form is:

this (new1) and (new2)

Here new1 and new2 are the values or constructors of the value (block or string). This is the typical method used in REBOL, as can be seen in:

compose [this ('new1) and ('new2)]
reform ["this" 'new1 "and" 'new2]
repend out ["this " 'new1 " and " 'new2]

Note that the 'new values can be any evaluated expression:

compose [this (make-new 1) and (make-new 2)]

Where make-new joins 'new to its argument, an integer.
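
As a rough sketch of that, here is one way make-new could be written. The function body is my assumption for illustration (the name comes from the example above, the implementation does not), and it should paste straight into a REBOL 2 console:

```rebol
; make-new is a hypothetical helper: it joins "new" to an integer
; and returns the result as a word (make-new 1 gives new1)
make-new: func [n [integer!]] [
    to word! join "new" n
]

; Inline composition into a block: only the parens are evaluated
probe compose [this (make-new 1) and (make-new 2)]

; Inline composition into a string: every expression is evaluated,
; then the results are formed with spaces between them
print reform ["this" make-new 1 "and" make-new 2]
```

The first line should give a block of four words and the second the equivalent space-separated string.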

Replacement, notated by position:

The composition is generated by substitution based on positions. The form uses a template:

this <1> and <2>

and provides its values with just a series:

[new1 new2]

It should be noted that this form requires an "escape" or meta notation to identify the targets. For example, this is a common method used in various shells and scripting languages:

this $1 and $2

There is no equivalent function in REBOL, but for REBOL blocks, the method is trivial. For strings, replace would be used, but that's not precisely the same (in either semantics or performance).
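
To make the string case concrete, here is a minimal sketch built on replace. The name subst-pos and the $n notation are assumptions made for this example, not an existing REBOL function:

```rebol
; subst-pos is a hypothetical helper, not part of REBOL.
; It replaces $1, $2, ... in a copy of the template with the
; values found at those positions in the values block.
subst-pos: func [template [string!] values [block!] /local out n] [
    out: copy template
    n: length? values
    ; Work from the highest position down, so that "$1" does not
    ; match the front of "$10" before "$10" has been handled
    while [n > 0] [
        replace/all out rejoin ["$" n] form pick values n
        n: n - 1
    ]
    out
]

print subst-pos "this $1 and $2" [new1 new2]
```

This also shows why replace is not precisely the same thing: the string is rescanned once per value, and a value that happens to contain something like $1 would itself be substituted on a later pass. A true positional function would walk the template once.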

Replacement, notated by name:

The composition is generated by substitution based on names (labels). The form uses a template:

this <label1> and <label2>

and provides its values with name value pairs:

[label1 new1 label2 new2]

So, label1 is replaced with new1, etc.

As with positional replacement, this form requires an "escape" or meta notation to identify the targets. For example, this is a common method used in various shells and scripting languages:

this $label1 and $label2

The $ provides the indication.

In REBOL, this is done with replace, but it would require multiple calls to the function, so there's no direct equivalent. Users normally create their own little foreach for it:

foreach [lab val][label1 new1 label2 new2][
    replace...
]
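
Filled out a little, such a loop might look like the sketch below. The name subst-names and the $ prefix are assumptions for illustration only:

```rebol
; subst-names is a hypothetical helper, not part of REBOL.
; pairs is a block of label/value pairs; each $label found in a
; copy of the template is replaced with the formed value.
subst-names: func [template [string!] pairs [block!] /local out] [
    out: copy template
    foreach [lab val] pairs [
        ; rejoin builds the escaped target, for example "$label1"
        replace/all out rejoin ["$" lab] form val
    ]
    out
]

print subst-names "this $label1 and $label2" [label1 new1 label2 new2]
```

One thing to watch with this approach: a label that is a prefix of another (say label1 and label10) will match inside the longer one, so the longer labels would need to come first in the pairs block.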

Direct replacement:

I'm including this one because it's what we normally do in REBOL.

We would write:

this label1 and label2

and that can be a block or a string where we use foreach and replace, similar to that shown above. The arguments would be:

[label1 new1 label2 new2]

Of course, this method does not separate the data from the meta-data (there's no escape), so it's not a general form, and must be used with greater care.
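
For the block case, a minimal sketch of direct replacement could be as simple as this (the word out is just a name picked for the example):

```rebol
; Direct replacement on a block: the labels are ordinary words,
; with no escape notation to mark them as targets
out: copy [this label1 and label2]

foreach [lab val] [label1 new1 label2 new2] [
    ; lab and val hold words here, and replace/all swaps every
    ; occurrence of the label word for the value word
    replace/all out lab val
]

probe out
```

The care needed is easy to see here: if the word this or and were ever used as a label, the ordinary content of the block would be replaced too.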

More complicated forms:

I should point out that there are more complicated forms. In many languages, formatting (such as printf in C) uses a separate language. Yes, writing stuff like "%s %5d" is a language: a simple char-based one, but not an actual dialect of C. It's something totally different.

Formatting is a more complicated situation because it can control replacement field size, conversion precision, alignment, spacing, and more.

So, for this discussion, let's not get into that area. Let's stick with straight replacement/substitution.

Wiki continuation...

With the goal of defining a reasonable function for REBOL, the refinement and continuation of this discussion can be found on DocBase: Replacement.
