Tiny HTML to Text Converter
Author: Carl Sassenrath

Here's a little script I use so often that I thought it should be part of the cookbook. This script takes any HTML file (such as a web page or REBOL document) and converts it to plain text. It does nothing special for formatting, but it is an easy way to get at the actual content of a page.
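A minimal sketch of the long form of such a script, written for REBOL 2. The URL, the output file name %page.txt, and the variable names page, tags, and text are placeholders chosen for illustration and may differ from the author's listing:

```rebol
REBOL [Title: "Tiny HTML to Text Converter (sketch)"]

; Read the raw HTML of the page as a string (placeholder URL).
page: read http://www.rebol.com

; Split the HTML into a block of tag! and string! values.
tags: load/markup page

; Drop every tag! value, leaving only the text strings.
remove-each tag tags [tag? tag]

; Join the remaining strings into one string and save it
; (placeholder file name).
text: rejoin tags
write %page.txt text
```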
This example reads a page from a web site and loads the text into a REBOL block. The block contains tags as tag! datatypes and text as string! datatypes. REMOVE-EACH removes all the tags, which leaves only the strings. REJOIN then joins the strings into a single string, which is written to a text file.

As with most REBOL code, the above example can be simplified down to just:
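A sketch of what the shortened form might look like in REBOL 2, again assuming a placeholder URL and the placeholder file name %page.txt:

```rebol
; LOAD/MARKUP reads the URL itself and returns a block of tags and strings.
page: load/markup http://www.rebol.com

; Remove the tag! values in place, keeping only the text.
remove-each tag page [tag? tag]

; Writing the block joins its strings on the way out.
write %page.txt page
```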
Note you don't need the separate READ (LOAD will do that for you) or the REJOIN (which is implicit in a WRITE of a block).