# | User | Message | Date |
4818 | BenBran | yes it works perfect in R3. Thanks again. | 6-Jan-10 20:18 |
4817 | BenBran | lol :-) | 6-Jan-10 20:17 |
4816 | BrianH | You were right, it was something simple :) | 6-Jan-10 20:16 |
4815 | BenBran | doh! | 6-Jan-10 20:15 |
4814 | BrianH | That is R2, not R3. | 6-Jan-10 20:15 |
4813 | BenBran | >> help system SYSTEM is an object of value: version tuple! 2.7.7.3.1 build date! 1-Jan-2010/12:15:27-8:00 product word! View core tuple! 2.7.7 components block! length: 60 | 6-Jan-10 20:14 |
4812 | BrianH | What version of REBOL are you using? system/version ... | 6-Jan-10 20:13 |
4811 | BenBran | for completeness in R3 - I tried the lines above: >> parse "GET /a.html HTTP/1.1" ["get " return to " "] ** Script Error: Invalid argument: ?native? ** Where: halt-view ** Near: parse "GET /a.html HTTP/1.1" ["get " return to " "] I must be missing something simple | 6-Jan-10 20:11 |
4810 | BrianH | >> parse "GET /a.html HTTP/1.1" ["get " return to " "] == "/a.html" Note that /all is the default in R3 so you need to specify space after GET. | 6-Jan-10 19:43 |
4809 | BrianH | That would return the file instead of setting a variable and not return false because of leftover input. | 6-Jan-10 19:40 |
4808 | BrianH | PARSE returns true if the rule matches and covers the entire input, or false otherwise. Your rule matched but there was input left over. PARSE's return value doesn't matter in this case, just whether file is set or not. If you are using R3 you can do this too: parse buffer [ "get" [ "http" | "/" | return to " "]] | 6-Jan-10 19:39 |
4807 | Graham | parse buffer [ "get" [ "http" | "/" | copy file to #" " ( print file) ] to end ] will return true | 6-Jan-10 19:37 |
4806 | BrianH | Was going to reply but Graham types faster :) | 6-Jan-10 19:36 |
4805 | BenBran | ok I see. Thanks. | 6-Jan-10 19:36 |
4804 | Graham | true if the rule completes to the end, false otherwise | 6-Jan-10 19:35 |
4803 | Graham | umm.. parse returns either true or false ... | 6-Jan-10 19:35 |
4802 | Graham | if you want the value you have to change the parse rule | 6-Jan-10 19:34 |
4801 | Graham | false is the value returned by the parse function | 6-Jan-10 19:34 |
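A minimal sketch of the return-value point made in the replies above, assuming R2 (or R3) with `parse/all` so the space is matched literally. `request` is a made-up sample line, not the original script's buffer.

```rebol
REBOL []

request: "GET /a.html HTTP/1.1"   ; hypothetical sample input

; The rule matches but stops at the space after the path, so input is left over.
probe parse/all request ["GET " copy file to " "]           ; false
probe file                                                   ; "/a.html"

; Consuming the rest of the input with TO END lets PARSE return true.
probe parse/all request ["GET " copy file to " " to end]     ; true
probe file                                                   ; "/a.html"
```

Either way `file` is set; only the value returned by PARSE differs.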
4800 | BenBran | I get what's happening now. If i compare buffer and file I see the clipped text: >> probe file == "index.html" >> probe buffer {GET /a.html HTTP/1.1 Host: localhost User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/531.21.8 (KHTML, like Gecko) Version/4.0.4 Safari/531.21.10 Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5 Accept-Language: en-US Accept-Encoding: gzip, deflate Connection: keep-alive Address: 127.0.0.1} >> probe parse buffer ["get" ["http" | "/ " | copy file to " "]] == false >> probe file == "/a.html" Should I have been able to see the results instead of == false? | 6-Jan-10 19:31 |
4799 | BrianH | The break being a parse match fail, and file being set to none for a zero-length match. | 6-Jan-10 19:06 |
4798 | BrianH | Sort of. The actual code is a little more complex, more like this: either tmp: find data " " [file: if 0 < offset? data tmp [copy/part data tmp]] [break] | 6-Jan-10 19:04 |
4797 | BrianH | So, copy file to " " is the equivalent of this regular REBOL code: file: if find data " " [copy/part data find data " "] | 6-Jan-10 18:59 |
4796 | BrianH | The copy and to are parse operations. COPY copies the data covered by the next operation, the TO. TO covers the data from the current parse position until the first instance it can find of its argument. | 6-Jan-10 18:56 |
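The COPY/TO explanation above can be tried side by side with its plain-REBOL counterpart. This is only a sketch: `data`, `file2` and `tmp` are illustrative names, and `data` is assumed to start right after the "GET " prefix.

```rebol
REBOL []

data: "/a.html HTTP/1.1"          ; hypothetical input, positioned after "GET "

; PARSE version: COPY the span that TO " " covers.
parse/all data [copy file to " " to end]
probe file                        ; "/a.html"

; Plain REBOL version of the same step.
file2: if tmp: find data " " [copy/part data tmp]
probe file2                       ; "/a.html"
```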
4795 | BrianH | BenBran:
Not sure where to put this so asking here: I downloaded a web script and it has a snippet I don't understand: buffer: make string! 1024 ;; contains the browser request file: "index.html" parse buffer ["get" ["http" | "/ " | copy file to " " ]] what does: copy file to " " mean or do? tia | 6-Jan-10 18:53 |
4794 | Pekr | Carl - first "error" in parse rewrite with some/any is the auto protection for non advancing input. It is like writing in BASIC 10 Print "Hello" 20 goto 10 ... and not expecting it to run forever, because some magical internal mechanism kicks in. If I write the code which could cause infinite loop, then be it. For me it causes the opposite reaction - some/any are not safe to use, let us use while instead .... something like: parse str [some [to "abc"]] is so obvious and self explanatory, that actually not looping forever almost feels like parse error. But - even if I don't like it, maybe most such infinite loop hits are more difficult to notice, so that actually the prevention might be ok, I don't know. As for me though, I would probably prefer some internal capability to detect such case, and some debug option to show last rule/position, where it happens ... I am not fluent enough with parse theory, but maybe it also relates to your loop vs matching note above ... | 1-Jan-10 8:45 |
4793 | Gregg | For example - Parsing an input that has nested structures, and how to collect the values you want. - Showing the user where the parse failed. - How to avoid infinite parse loops. - How to safely modify the input stream. More advanced examples would be great too of course. | 31-Dec-09 21:33 |
4792 | Gregg | We have some cool new parse enhancements; really, really nice some of them. What I think will add the most value to PARSE--and maybe this is just me--are practical examples, idioms, and best practices. | 31-Dec-09 21:30 |
4791 | Steeve | I see your point, but what if the ANY block contains production rules ?
parse "" [any [and skip copy tmp to end break | insert "1" and insert "2"]] (i know, stupid example) | 31-Dec-09 18:47 |
4790 | Carl | There are a few ways to do it, but that is not my point. | 31-Dec-09 18:40 |
4789 | Steeve | any [and skip copy tmp to end] any [copy tmp [skip to end]] etc... | 31-Dec-09 18:39 |
4788 | Steeve | We have so much alternatives that i don't see this as a burden | 31-Dec-09 18:36 |
4787 | Carl | It's a small thing, and maybe too late to change. I wanted to point it out. | 31-Dec-09 18:34 |
4786 | Carl | In other words, is ANY smart about the input? If there is no input, why should it even try? Of course, in the past we've used ANY a bit like WHILE -- as a LOOPing method, not really as a MATCHing method. | 31-Dec-09 18:33 |
4785 | Carl | In the rewrite of DECODE-CGI, that behavior of ANY forces me to write: parse "" [any [end break | copy tmp to end]] This seems wrong to me if we define ANY as a MATCHing function, not as a LOOP function. This topic has been debated a bit between a few of us, but I think it deserves more attention. | 31-Dec-09 18:29 |
4784 | Steeve | what do you expect in this case ? | 31-Dec-09 18:29 |
4783 | Carl | I'm still running into some problems with PARSE... mainly from the expectation of what ANY and SOME should do. For example: >> parse "" [any [copy tmp to end]] >> tmp == "" | 31-Dec-09 18:26 |
4782 | Carl | Right: synonyms. | 31-Dec-09 18:23 |
4781 | Ladislav | Carl made a distinction in R3 blog, but they currently work the same, as far as I can tell, so, the only difference I see is, that ACCEPT is more self-explanatory. | 30-Dec-09 11:52 |
4780 | Pekr | What is the difference between BREAK and ACCEPT? Both "break" out of the rule, both with success (IMO). | 30-Dec-09 7:52 |
4779 | Ladislav | e.g. parse [a b c] [?? copy value thru 1 skip to end] should have preferably been parse [a b c] [?? copy value 1 skip to end] | 29-Dec-09 18:09 |
4778 | Ladislav | COPY should accept any rule, not just the ones you mentioned | 29-Dec-09 17:57 |
4777 | Fork | kcollins: I'm using OS/X, I still haven't found a way to reproduce it. Comes and goes. | 29-Dec-09 17:49 |
4776 | Fork | Ladislav: I didn't realize you could use "while" as the second argument to copy, I thought it only worked with to and thru... | 29-Dec-09 17:49 |
4775 | Ladislav | I overlooked, that you used the STRING! datatype: parse [1 2 3] [?? while [integer! string! accept | skip | reject] ?? integer!] | 29-Dec-09 13:08 |
4774 | Ladislav | Re the THRU problem: you can use parse [1 2 3] [?? while [integer! block! accept | skip | reject] ?? integer!] | 29-Dec-09 13:05 |
4773 | Ladislav | Regarding the QUOTE keyword: the original proposal was to treat blocks as in quote [1 2] as sequences of elements, not as embedded blocks, wouldn't you prefer that behaviour? | 29-Dec-09 13:01 |
4772 | kcollins | Fork, are you seeing these outputs "coo", "thte", etc. on a Linux build of R3? I have seen similar corrupted output with Linux R3 when testing TCP client code, as documented in Curecode #1322. | 29-Dec-09 6:32 |
4771 | Fork | Well, I should find a way to reproduce it before doing that. Left a note about how getting a CureCode account didn't work the other day. | 28-Dec-09 20:02 |
4770 | BrianH | Definitely another bug. CureCode it. | 28-Dec-09 19:57 |
4769 | Fork | >> parse [a b c] [?? copy value thru 1 skip to end] coo:: [a b c] == true | 28-Dec-09 19:56 |
4768 | BrianH | But no such characters should be output by ?? | 28-Dec-09 19:56 |
4767 | Fork | Indeterminate, e.g. just ran it again and: | 28-Dec-09 19:56 |
4766 | BrianH | Seems like a Unicode to ANSI translation error. | 28-Dec-09 19:56 |
4765 | Fork | (That question mark not visible in the terminal, showed up when I pasted here) | 28-Dec-09 19:54 |
4764 | Fork | >> parse [a b c] [?? copy value thru 1 skip to end] co? : [a b c] == true | 28-Dec-09 19:54 |
4763 | Fork | FYI still seeing some erratic behavior with ?? at head of the parse rule | 28-Dec-09 19:54 |
4762 | BrianH | Fork, the fact that both of those examples work incorrectly instead of throwing an error is a bug in PARSE. It should be CureCoded. | 28-Dec-09 19:46 |
4761 | Pekr | to/thru were reimplemented to allow multiple options. There are cases, where they are not supposed to work, but in above case I would regard it being a bug .... unless some guru finds a theory showing us why it should be regarded being a correct result :-) | 28-Dec-09 19:45 |
4760 | Pekr | >> parse [a b c][?? 3 skip ??] 3: [a b c] end!: [] == true | 28-Dec-09 19:44 |
4759 | Pekr | I would expect that ... | 28-Dec-09 19:42 |
4758 | Fork | Should the latter be [a b c] ? | 28-Dec-09 19:42 |
4757 | Pekr | brian - so we can use things like any-string! or other typesets to match? | 28-Dec-09 19:41 |
4756 | Fork | >> parse [a b c] [(value: none) copy value to 3 skip to end (probe value)]
[a b]
== true >> parse [a b c] [(value: none) copy value thru 3 skip to end (probe value)] [a b] == true | 28-Dec-09 19:41 |
4755 | BrianH | Fortunately typesets work for block parsing like bitsets do for string parsing, so first sets are easy. | 28-Dec-09 19:40 |
4754 | BrianH | Yes. You can express a sequence of characters in a string as a string literal, but not a sequence of types in a block. You are going to need first sets and the other LL tricks for that. | 28-Dec-09 19:38 |
4753 | Fork | Is a sequence of things one of the complex rules that you can't use in a thru? | 28-Dec-09 19:35 |
4752 | Fork | And it stopped doing that. I'll see if I can get it to do it again. | 28-Dec-09 19:33 |
4751 | Fork | Hm. Version: 2.100.96.2.5 I quit and restarted. | 28-Dec-09 19:32 |
4750 | Pekr | what do you mean by "match thru a series of things"? | 28-Dec-09 19:31 |
4749 | Pekr | >> parse [1 2 3][?? thru [integer! string!] ?? integer!] thru: [1 2 3] integer!: [2 3] == false | 28-Dec-09 19:30 |
4748 | Fork | ?? not initialized after first match? And secondly, how do I match thru a series of things (e.g. integer! integer!, but just wondering about the thte. ?? problem before the first match?) | 28-Dec-09 19:28 |
4747 | Fork | What's that "thte" thing? | 28-Dec-09 19:27 |
4746 | Fork | >> parse [1 2 3] [?? thru [integer! string!] ?? integer!] thte: [1 2 3] integer!: [2 3] == false | 28-Dec-09 19:26 |
4745 | Ladislav | More complicated rules can be easily simulated using the While keyword, the opposite isn't true. Carl's example just proves, why While is useful. | 25-Dec-09 14:17 |
4744 | Ladislav | sorry, I meant a: [b a |] | 25-Dec-09 13:51 |
4743 | Ladislav | The WHILE keyword is the simplest possible cycle. The rule: a: [while b] is equivalent to recursive: a: [b a] | 25-Dec-09 13:50 |
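The recursive form (with the correction `a: [b a |]` from the follow-up message) can be tried in R2, which has no WHILE keyword. A small sketch, where `b` is just a placeholder for any rule that consumes input:

```rebol
REBOL []

b: "x"            ; placeholder for any rule that consumes input
a: [b a |]        ; match B then recurse; otherwise match nothing and succeed

probe parse/all "xxx" a     ; true: three B matches, then the empty alternative
probe parse/all "xxy" a     ; false: the loop stops at "y" with input left over
```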
4742 | Pekr | I probably need more examples .. | 24-Dec-09 10:49 |
4741 | Pekr | Running above examples, my opinion is, that in fact adding 'while was probably not a good decision. I can understand, that now we have more power - our code will not easily cause an infinite loops, but otoh you now have to think, if it can happen or not, and 'some becomes your enemy ... | 24-Dec-09 10:47 |
4740 | Pekr | I probably don't understand the usefulness of 'while at all. Because now I have to think, if my code would cause infinite loop, or not, and use 'some or 'while accordingly ... | 24-Dec-09 10:42 |
4739 | Pekr | Henrik - according to docs explanation, 'parse contains some internal protection for the case, when input stream does not advance its position. In R2, following code causes infinite loop, in R3, it returns false: parse str [some [to "abc"]] (I am not sure I like that it returns false - normally I expect it to cause infinite loop. This is imo overprotecting programmer, and you have to think, why your code returns false anyway, which for me is the same, as if it would cause an infinite loop) Further from docs: To avoid infinite looping, a special internal rule is triggered based on the fact that the rule did not change the input position. However, this shows a problem with this rule: parse str [some [to "a" remove thru "b"]] Here the input did not appear to advance, but something useful happened. In such cases, the some word should not be used, and the while word is better: parse str [while [to "a" remove thru "b"]] | 24-Dec-09 10:40 |
4738 | Henrik | Looking at the new WHILE keyword and I was quite baffled by Carl's use of it in his latest blog example. Then I read the docs and it didn't get much better: - WHILE is a variant of ANY - ANY stops, if input does not change - WHILE doesn't stop, even if input does not change What does "input does not change" mean? Is it about changing the parse series length during parse? Is it actively moving the parse index back or forth using special commands? Is it normal progression of parse index with each cycle of WHILE or ANY? Is it alteration of the parse series content while maintaining length during parse? | 24-Dec-09 9:32 |
4737 | Maxim | hehe | 16-Dec-09 19:02 |
4736 | Gabriele | Maxim, maybe you thought I was kidding the other day... ;) | 16-Dec-09 10:22 |
4735 | Maxim | the funny thing is that the C language reference on the MSDN is actually pretty well done... there are a lot of evil C examples for some of the more obscure parts of the language like pointers, structs and unions. funny thing is that some of the most complex things to express were the literal constants! integers, with octal, hex notation... not as simple as some [digits] ;-) | 16-Dec-09 5:31 |
4734 | BrianH | Well, good luck! :) | 16-Dec-09 5:30 |
4733 | Maxim | my goal is to get the host code and OpenGL headers past the parsing phase. once that is done, I'll start work on adding the production phase. I still have to write the pre-processor, but that in fact is pretty straight forward. there are little rules and they are much more static and well defined on the MS web site. | 16-Dec-09 5:28 |
4732 | BrianH | "data" in this case being C source. | 16-Dec-09 5:28 |
4731 | BrianH | No, really. The syntax of C is so complex that you would need a lot of data to test all of the common variations. | 16-Dec-09 5:27 |
4730 | Maxim | you are being sarcastic right? :-) | 16-Dec-09 5:26 |
4729 | Maxim | there is all in all only two or three rules that I'm unsure of the transformation, as some aspects of the C syntax are a bit obscure to represent. | 16-Dec-09 5:26 |
4728 | BrianH | Are you sure you have enough test code/data? | 16-Dec-09 5:26 |
4727 | BrianH | Sounds about right. | 16-Dec-09 5:25 |
4726 | Maxim | well, considering that I just finished the basic rule re-organisation... eheheh I think I'll apply the unit testing phase right now to test if all the rules perform as they should using input text. there is probably going to be about 100kb of unit test code for what is now about 12kb of parse rules. | 16-Dec-09 5:25 |
4725 | BrianH | You might be better off translating a C grammar for a PEG or TDPL parser generator into PARSE - less topological shifts needed. | 16-Dec-09 5:23 |
4724 | BrianH | Unfortunately, the C grammar was designed with LR parsers in mind. | 16-Dec-09 5:21 |
4723 | BrianH | BNF is just a syntax form, with a *lot* of variation. The real difference that matters between Yacc and PARSE is the parsing model. Yacc implements an LR parser (or some variant thereof), and PARSE implements a variant of TDPL parsing (related to PEG), though more powerful and with a completely different syntax. How you structure the parse rules depends on the parsing model, not the syntax. For instance, LR parsers tend to do recursion rather than iteration, and when they recurse the recursive call tends to be on the left, with the distinguishing clause on the right. For PEG parsers, recursion goes the other way. This is not an error, this is a difference in parsing model. If you are translating from Yacc to PARSE, it's not just a syntax change. You have to reorganize the rules to match the new model. And watch out: Certain patterns are easier to express in some parsing models than in others. Some patterns aren't supported at all in some models, and so no amount of translation will help you. We chose the TDPL model for PARSE because it is more expressive than the LR model, so in theory you should be able to translate LR rules to PARSE with some topological twists (redoing the structure of the rules). However, there are patterns that you can express in PARSE that can't be translated to LR, even with topological changes. | 16-Dec-09 5:21 |
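As a hedged illustration of the reorganisation described above: a Yacc-style grammar would typically write a sum left-recursively (`expr: expr '+' term | term`). Transcribed literally into PARSE, such a rule would call itself before consuming any input, so the usual PARSE form iterates instead. The rule names `digit`, `term` and `expr` are made up for this sketch.

```rebol
REBOL []

digit: charset "0123456789"
term:  [some digit]
; Iteration replaces the left recursion of the LR-style grammar.
expr:  [term any ["+" term]]

probe parse/all "1+22+3" expr    ; true
probe parse/all "1+" expr        ; false: the trailing "+" is left unconsumed
```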
4722 | Maxim | I've been rewriting bnf generated parse rules (and often a bit cryptically) into proper parse ordered rules for 3 days now... <sigh>
C is sooo complex for what it really does. I've discovered a few quite mind-boggling language capabilities...
stuff like: char *( *(*var)() )[10]; it takes 7 steps to define what that really is and there are other "fun" examples which end up being interpretation nightmares, but look really simple. one thing is certain at this point... although I will be able to build a C to rebol converter with relative precision under specific goals, some of the crazy stuff just will have to be finished manually by humans. at least I rarely see such twisted C code in most of what I've been reading so far. | 16-Dec-09 3:55 |
4721 | Maxim | sure. | 14-Dec-09 22:34 |
4720 | Gregg | Generating PARSE rules wasn't too hard. It is a nice fit. Same issue with existing grammars though, in that you have to fix some things up manually, or we have to make the generator smarter. I'll zap you what I have. Can't remember where I've posted it elsewhere. | 14-Dec-09 22:33 |
4719 | Maxim | that is nice, is your ABNF parser still accessible somewhere? it could improve the quality and ease of integrating the protocols to R3 IMO. ABNF also seems much more aligned to parse | 14-Dec-09 22:30 |
4718 | Gregg | There are a lot of differences, unfortunately. It's not terrible, just different. It's not EBNF. http://en.wikipedia.org/wiki/Augmented_Backus%E2%80%93Naur_Form | 14-Dec-09 22:27 |
4717 | Maxim | is ABNF == EBNF ? | 14-Dec-09 22:27 |
4716 | Maxim | what is the difference? | 14-Dec-09 22:25 |
4715 | Gregg | Yup. Different mindset. I just looked at your BNF compiler earlier. Good stuff. I did an ABNF-to-parse generator some time back. ABNF is used in a lot of IETF RFCs and such. | 14-Dec-09 22:25 |
4714 | Maxim | one strange thing I realised is that most people who write bnf, will write them in exactly the opposite of what parse needs to be.. they'll put the smallest pattern first. so that if applied in parse directly, it always short-circuits the other rules following it. | 14-Dec-09 19:56 |
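The short-circuit effect is easy to reproduce with two string alternatives. A minimal sketch, not taken from the BNF converter itself:

```rebol
REBOL []

; Shorter alternative first: "ab" matches, "c" is left over, no retry.
probe parse/all "abc" ["ab" | "abc"]    ; false

; Longer alternative first: the whole input is matched.
probe parse/all "abc" ["abc" | "ab"]    ; true
```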
4713 | Maxim | finished the rewrite of the BNF parser... funny... there is more documentation & comments than code. | 13-Dec-09 22:55 |
4712 | Maxim | I've used word= for other things before and I liked it. | 13-Dec-09 20:00 |
4711 | Maxim | I'll try that, its a good variant, even better since then we clearly identify the 3 different parse constructs separately. | 13-Dec-09 19:59 |
4710 | Gregg | For a long time I've added = to the end of my parse rules, and = to the beginning of parse variables. I think it matches the production rule grammar well, and also emulates set-word/get-word syntax. | 13-Dec-09 19:56 |
4709 | Maxim | the new parse rejection system is VERY cool. ( can simplify the structure of some rules a lot :-) | 13-Dec-09 5:44 |
4708 | Maxim | (all in R3, but not using newer parse stuff, cause its not required) | 13-Dec-09 4:17 |
4707 | Maxim | yay, I've got the BNF grammar done... its ripping through a C language BNF grammar definition... :-) now I've just got to make a parse rule emitter ... easy enough. | 13-Dec-09 4:17 |
4706 | PeterWood | "any others care to comment?" I'm afraid it looks very messy to me and reminded me of Perl for some reason. | 13-Dec-09 1:31 |
4705 | Maxim | true :-) | 13-Dec-09 0:58 |
4704 | Graham | it's not a syntax but a convention ... | 13-Dec-09 0:56 |
4703 | Maxim | I'm just trying to get a feel for what others think about the idea. and sharing a bit of a discovery at the same time, if it may help others. the goal isn't to be popular or convince others... and sorry, if my last line may have looked harsh, it wasn't. :-) I was just resuming your reaction plainly and relaunching the question to be sure others realize I want a few opinions. | 13-Dec-09 0:39 |
4702 | Graham | Max, just do what ever suits you. | 13-Dec-09 0:35 |
4701 | Maxim | unfortunately what you say isn't feasible, even if you can technically do it. who is going to program a parser to colorise code which is useful for only one application? its actually going to take more time to write your color parser for each piece of code than write the code itself :-P so bottom line, Graham doesn't like this syntax. any others care to comment? | 13-Dec-09 0:18 |
4700 | Graham | exactly ... for coding. | 13-Dec-09 0:05 |
4699 | Maxim | but not while I'm coding... this is not for presentation, its for coding... I'm writing rules twice as fast now... just cause I'm not wasting time "searching" for the keywords within all of that text. | 13-Dec-09 0:02 |
4698 | Graham | without the need for all those = signs everywhere | 13-Dec-09 0:01 |
4697 | Graham | so you could write a parser that reads your rules and colorises them ... | 13-Dec-09 0:01 |
4696 | Maxim | stuff is colorized... (*in my editor*) | 13-Dec-09 0:00 |
4695 | Maxim | syntax highlighting colorizes words ... stuff is colorized... but user words aren't colorised and they all get mixed up between functions, variables and rules... and having colors which are too strong next to each other and in relative distribution ... cancels out. | 12-Dec-09 23:59 |
4694 | Graham | Chuck Moore uses color extensively in his color forth .. to replace other types of syntactic markup. | 12-Dec-09 23:58 |
4693 | Graham | Gab uses the == in his literate editor .. | 12-Dec-09 23:57 |
4692 | Graham | Use an editor that colorises the words | 12-Dec-09 23:57 |
4691 | Maxim | what do you mean color? | 12-Dec-09 23:56 |
4690 | Graham | use color instead :) | 12-Dec-09 23:55 |
4689 | Maxim | with syntax highlighting it's quite amazing how bits stands out. ... in my editor at least. | 12-Dec-09 23:25 |
4688 | Maxim | when using rules in other contexts, they also stick out... =alphabet=: rejoin [=digit= =letter= bits "_"] here I immediately see that bits isn't a rule, but a function or a word. | 12-Dec-09 23:24 |
4687 | Maxim | another example.... in this dense block of text, I can spot the =eol= (end of line) token instantly in both x and y dimensions of the rule paragraph: =line-comment=: [ =comment-symbol= [ [thru =eol= (print "comment to end of line")] |[to end] ] (print "success") ] | 12-Dec-09 23:22 |
4686 | Maxim | I just adopted a new notation standard for parse rules... the goal is to make rules a bit more verbose as to the type of each rule token... I find this reads well in any direction, since we encounter the "=" character when reading from left to right or right to left... and parse rules often have to be read from right to left. example: =terminal=: [ =quote= copy terminal to =quote= skip (print ["found terminal: " terminal]) ] on very large rules, and with the syntax highlighting in my editor making the "=" signs very distinct, I can instantly detect what parts of my rules are other rules or character patterns... it also helps out in the declarations... I see when blocks are intended to be used as rules quite instantly where ever they are in my code. in my current little parser, I find I can edit my rules almost twice as fast and lose MUCH less time scanning my blocks to find the rule tokens, and switching them around. wonder what you guys think about it... | 12-Dec-09 23:20 |
4685 | WuJian | newbie's solution,without PARSE: >> s2: {1 ''2 '3 4 ' '5 ''6 '7 8 9 '0'} >> replace/all s2 {''} {'} replace/all s2 {'} {''} print str 1 ''2 ''3 4 '' ''5 ''6 ''7 8 9 ''0'' >> str == s2 == true | 12-Dec-09 3:01 |
4684 | Reichart | Jack, Parse is my fav REBOL command. If I ever have time, this is the one function I would like to create hundreds of examples for in a Wiki. | 12-Dec-09 1:27 |
4683 | Maxim | I'd gladly give back a few $ for their efforts | 11-Dec-09 18:15 |
4682 | Maxim | I sure would use it... some people have helped save days of work with free code and insight. | 11-Dec-09 18:14 |
4681 | Maxim | actually, having a paypal account linked with your login and a "donate" button would be really nice :-) right in the chat tool. | 11-Dec-09 18:14 |
4680 | Steeve | we should add a DONATE account somewhere, linked with Altme. I'm sure people would be glad to add 1 dollar for such fast assistance. Then, we could finance some interesting projects | 11-Dec-09 18:13 |
4679 | Rebolek | Just curious, I tested both versions and Steeve's version is about 2times faster than Maxim's :) | 11-Dec-09 18:12 |
4678 | Maxim | ( I can see that being misleading when read hehehe :-) | 11-Dec-09 18:08 |
4677 | jack-ort | Ah! when you said "...you match double quotes first then fallback to single quotes, ..." I was thinking double-quote character, not double single-quotes. Need more coffee... Thanks very much! | 11-Dec-09 18:07 |
4676 | Steeve | corrected version with thru: >> parse/all str [ any [thru {'} [{'} | p: (insert p {'} ) skip ]]] | 11-Dec-09 18:06 |
4675 | Maxim | print it out in the rebol console... you will see that my example doesn't have any double quote characters.. they just look like so in altme's font ;-) | 11-Dec-09 18:05 |
4674 | jack-ort | Thanks! I'm going to have to look @ this for awhile to understand why you even need to worry about the double-quote character. Much to learn.... Thanks Maxim and Steeve for the prompt replies! | 11-Dec-09 18:04 |
4673 | Steeve | same as mine, except i use THRU to speed up the process | 11-Dec-09 18:04 |
4672 | Maxim | note all ticks... ( ' ) are single quote chars in the above. | 11-Dec-09 18:02 |
4671 | Maxim | >> str: {1 ''2 '3 4 ' '5 ''6 '7 8 9 '0'} >> parse/all str [some [{''} | [{'} here: (insert here {'}) skip] | skip]] >> print str == {1 ''2 ''3 4 '' ''5 ''6 ''7 8 9 ''0''} | 11-Dec-09 18:01 |
4670 | Steeve | i think i misunderstood something, replace {"} by {'} maybe | 11-Dec-09 17:59 |
4669 | Steeve | >> parse/all str [ any [thru {"} [{"} | p: (insert p {"} skip) ]]] something like this (not tested) | 11-Dec-09 17:57 |
4668 | jack-ort | yes, View 2.7.6 under Windows XP | 11-Dec-09 17:54 |
4667 | Maxim | R2? | 11-Dec-09 17:52 |
4666 | Maxim | easy, actually. you match double quotes first then fallback to single quotes, adding a new one and skiping one char... give me a minute I should get something working... | 11-Dec-09 17:52 |
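The approach described here (try the doubled quote first, fall back to a single quote and insert a second one) can be laid out as a commented rule. This is a sketch based on the rule Maxim posted afterwards, using the same sample string; `here` is just a position marker.

```rebol
REBOL []

str: {1 ''2 '3 4 ' '5 ''6 '7 8 9 '0'}

parse/all str [
    some [
        {''}                                ; already doubled: step over both
      | {'} here: (insert here {'}) skip    ; single: insert a second one, step over it
      | skip                                ; any other character
    ]
]

print str    ; 1 ''2 ''3 4 '' ''5 ''6 ''7 8 9 ''0''
```

Putting `{''}` first matters: PARSE takes the first alternative that matches, so an existing pair is never split and doubled again.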
4665 | jack-ort | Help! Still struggling to understand parse. How could I replace any and all SINGLE occurrences of the single-quote character anywhere in a string (beginning, middle or end) with TWO single-quotes? But if there are already TWO single-quotes together, I want to leave them alone. TIA for any and all help for a newbie! | 11-Dec-09 17:50 |
4664 | BrianH | | | 4-Dec-09 6:09 |
4663 | Graham | Ladislav, what 'choice operator? | 3-Dec-09 22:52 |
4662 | Graham | Janko, charset is short for make bitset! so you can call them bitsets or charsets :) | 3-Dec-09 22:50 |
4661 | Ladislav | It looks, that I could have used: C: [while [and [A | B] accept | skip | reject]] | 3-Dec-09 11:39 |
4660 | Ladislav | "I didn't know you could set the position back with :here" - you can set the position back even without :here, the choice operator is sufficient for you to be able to do that, see the above idioms as an example | 3-Dec-09 11:08 |
4659 | Ladislav | Just to complete the list of possible equivalents to the C: [to [A | B]] rule, here is a way how to do it in Rebol3 parse: C: [while [and [A | B] break | skip | reject]] you can find other equivalent idioms at http://en.wikibooks.org/wiki/REBOL_Programming/Language_Features/Parse#Parse_idioms | 3-Dec-09 11:01 |
4658 | Janko | but it is a level less simple and nice to use than simple parse modes that's why the simple ones should be powerful *if possible* too - you can't get a newbie impressed with charset parsing because he won't understand it probably. | 3-Dec-09 10:57 |
4657 | Janko | yes, you are right .. if you can write parser for php then you can make anything with it. I always supposed parse with charsets is like low level step by one char in a loop and call "events" and change states , with which you can parse anything from xml to languages .. well but parse with charsets is still much more elegant | 3-Dec-09 10:56 |
4656 | Janko | Oldes if that is in R3 >> copy x to [" ." | " !"] << this is exactly as I was proposing above :) , very nice! I know I have to .. I haven't really needed them yet I guess, I solved some things less elegantly in other ways without them. I intend to take the plunge next time I need them. | 3-Dec-09 10:54 |
4655 | Janko | Ladislav, thanks.. I didn't know you could set the position back with :here , that is interesting and probably expands what you can do with parse a lot. | 3-Dec-09 10:52 |
4654 | Oldes | And Janko... if you don't use charsets at all, I think you should give it a try. It's not so difficult. I think that if I can write parser to colorize PHP code, than you can parse everything. | 2-Dec-09 22:04 |
4653 | Oldes | Just would like to remember that there is something like R3 where: >> parse "I like Apple . I like Windows ! I like Linux . I like Amiga ." [any ["I like " copy x to [" ." | " !"] (probe x) to "I like "]] "Apple" "Windows" "Linux" "Amiga" | 2-Dec-09 22:02 |
4652 | Ladislav | Janko: the only problem is, that you cannot use: C: [to [A | B]] , where A and B are "general rules", but you can always write: C: [here: [A | B] :here | skip C] , which would do what you want | 2-Dec-09 20:49 |
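A short sketch of this idiom in use, assuming R2. The strings bound to `a` and `b` merely stand in for "general rules", and the input is a made-up fragment in the style of Janko's example.

```rebol
REBOL []

a: " ."                                 ; stand-ins for "general rules" A and B
b: " !"
c: [here: [a | b] :here | skip c]       ; advance until A or B matches, then rewind to the match start

parse/all "Windows ! I like Linux ." [copy x c to end]
probe x                                 ; "Windows"
```

Because `c` recurses once per skipped character, very long inputs may be better served by an iterative form.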
4651 | Graham | http://www.mail-archive.com/rebol-bounce@rebol.com/msg01983.html | 2-Dec-09 19:59 |
4650 | Graham | BTW, Bolek wrote a regex engine in Rebol ... | 2-Dec-09 19:59 |
4649 | Janko | (aha bitsets.. I was calling them charsets upthere) | 2-Dec-09 19:57 |
4648 | Graham | you have to turn off parse's default delimiters and use bitsets | 2-Dec-09 19:56 |
4647 | Janko | and I know everything has limitations ... this functionality OR with taking the first that appears would just in practice solve me many cases | 2-Dec-09 19:56 |
4646 | Janko | I know parsing csv can be messy ... at least at this high level I don't know how to do it with escapes and commas in etc | 2-Dec-09 19:55 |
4645 | Gregg | That said, if you know the format (e.g. WRT quotes and escapes), it can be done with PARSE. It just may not be a one-liner. | 2-Dec-09 19:54 |
4644 | Gregg | CSV parsing is an issue, because REBOL handles some inputs well, but fails for what may be a common way things are formatted. "CSV" isn't always as simple as it sounds. | 2-Dec-09 19:54 |
4643 | Gregg | It's not necessarily a PARSE limitation, but there are things we'd like PARSE to do that aren't always reasonable. :-) TO and THRU can work very well, but that doesn't mean they'll work for every situation. You may have to use rules where you check for your target value or just SKIP, marking locations in the input as you go. | 2-Dec-09 19:52 |
4642 | Janko | "janko","some\"thing92!","graham" I am not sure but I think here you have the same problem | 2-Dec-09 19:51 |
4641 | Janko | I just started talking about this as a general limitation of parse that I have met a lot of times, and I suppose Paul could have met it when trying to parse CSV | 2-Dec-09 19:49 |
4640 | Janko | I don't have real example right now :) I had them few times before and I also asked here about them and I solved with your help somehow | 2-Dec-09 19:49 |
4639 | Janko | >> parse "I like Apple . I like Windows ! I like Linux . I like Amiga ." [ some [ thru "I like" copy IT [to "." ( prin "so so: ") | to "!" (prin "very much: ") ] (print IT) ]] so so: Apple so so: Windows ! I like Linux so so: Amiga | 2-Dec-09 19:48 |
4638 | Graham | Janko, best thing to do is show us a string you can't parse ... and someone will show you how to do it. | 2-Dec-09 19:45 |
4637 | Janko | BUT .. what if I want to have control there .. or if, for the sake of example, it's a more complex multicharacter difference like "<DOT>" "<EXCLAMATION>" | 2-Dec-09 19:44 |
4636 | Janko | ok , you again found a solution to my specific problem :)) | 2-Dec-09 19:42 |
4635 | Graham | charset [ #"!" #"." ] | 2-Dec-09 19:42 |
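Graham's charset hint in message 4635 can be turned into a small R2 sketch. It assumes Janko's "This is ..." test string (shortened) and uses made-up names (`enders`, `body`, `results`). Instead of `to`, the rule consumes every character that is not an ending, then branches on whichever ending actually follows, so each ending can have its own action.

```rebol
enders: charset ".!"        ; the two possible sentence endings
body: complement enders     ; every character that is not an ending

results: copy []
parse/all "This is Apple . This is Windows ! This is Linux ." [
    some [
        "This is " copy it some body
        [
            #"." (append results reduce ["so so" trim it])
            | #"!" (append results reduce ["very much" trim it])
        ]
        opt " "
    ]
]
probe results
; prints: ["so so" "Apple" "very much" "Windows" "so so" "Linux"]
```

This only covers single-character endings. For multi-character endings such as "<DOT>", the position-reset rule from message 4652 is the more general tool.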
4634 | Janko | this is common to all the problems that I am describing .. if I had > to [ "." | "!" ] and parse would find both and go to the one that is closer, it would be solved. | 2-Dec-09 19:41 |
4633 | Janko | >> parse "This is Apple . This is Windows ! This is Linux . This is Amiga ." [ some [ thru "This is" copy IT [to "." | to "!" ] (print IT) ]] Apple Windows ! This is Linux Amiga | 2-Dec-09 19:39 |
4632 | Janko | The pattern is known ... the sentence starts with "this is" and can end with . or ! but they can come in any order .. if you try to parse with "." first you will get ---- oops, some errors up there .. just a sec | 2-Dec-09 19:38 |
4631 | Janko | parse "This is Apple . This is Windows ! This is Linux . This is Amiga ." [ some [ "This is" copy IT (print IT) to [ "." | "!" ] ] | 2-Dec-09 19:36 |
4630 | Graham | If you don't know what pattern the data is .. you can't parse it with anything. | 2-Dec-09 19:34 |
4629 | Graham | I know what you mean .. so you have to order your rules knowing what the data looks like | 2-Dec-09 19:34 |
4628 | Janko | whigh = which | 2-Dec-09 19:33 |
4627 | Janko | no wgih is the closest .. look at this example (I hope this will be better) | 2-Dec-09 19:33 |
4626 | Graham | and see which has the best fit ? | 2-Dec-09 19:32 |
4625 | Graham | to go to the closest one .. means it has to try all the rules?? | 2-Dec-09 19:32 |
4624 | Graham | [ some [ "start" digits [ "end" | "finish" ] ] ] should work | 2-Dec-09 19:31 |
4623 | Janko | you can use to but it still won't work | 2-Dec-09 19:31 |
4622 | Janko | yes, then you have to do charset parsing (but I don't know that yet :) ) .. I was just trying to say that if there were a way to say something like to any [ "A" | "B" ] and it would go to the closest one, A LOT of problems with parse would be easily solvable | 2-Dec-09 19:30 |
4621 | Graham | your problem is because you are using 'thru which breaks the other rule | 2-Dec-09 19:29 |
4620 | Graham | parse string [ some [ "start" digits "end" | "start" digits "finish" ]] | 2-Dec-09 19:28 |
4619 | Graham | In this case I would use block parsing ... then I'm no expert in parsing | 2-Dec-09 19:27 |
4618 | Janko | I was trying to show an example where you have two possible endings and you want to process both (and you can process them differently with parens) but you don't know in what order they will come or anything | 2-Dec-09 19:25 |
4617 | Graham | change the rule again | 2-Dec-09 19:24 |
4616 | Janko | ok .. but I meant that you have "start 111 end start 222 finish start 333 end " then it won't work :) | 2-Dec-09 19:23 |
4615 | Graham | [ to "end" | to "finish" ] | 2-Dec-09 19:23 |
4614 | Graham | change it | 2-Dec-09 19:22 |
4613 | Janko | parse "start 111 end start 222 finish" [ some [ thru "start" copy NUMS [ to "finish" | to "end" ] ] ] this won't work | 2-Dec-09 19:21 |
4612 | Graham | this is a current parse limitation. | 2-Dec-09 19:20 |
4611 | Janko | from Advocacy --> Graham [ to "A" | to "B" ] won't work as I want .. I will try to find a concrete example | 2-Dec-09 19:16 |
4610 | Janko | I know I was stopped by parse on some occasions. I think every time the problem would have been solvable if I had, for example, >> to [ "A" | "B" ] where the parser would check where A is and where B is and go to the closest one. | 2-Dec-09 19:15 |
4609 | Pekr | Dialect is a dialect. The only difference in string vs block parsing, imo is, that with block parsing, you are using REBOL datatypes to identify/match your types, whereas with string you are more "free-form" :-) | 17-Nov-09 16:53 |
4608 | JoshF | OK. Thanks again for the timely help! I have to run off to work (which is firewalled up the yang), so you'll be able to avoid more silly questions from me for at least the next ten hours! ; - ) | 17-Nov-09 14:27 |
4607 | Ladislav | right, what you are doing is a dialect | 17-Nov-09 14:24 |
4606 | JoshF | I understood that character stuff wouldn't work in a dialect -- but my understanding is imperfect. | 17-Nov-09 14:23 |
4605 | JoshF | The difference between what I'm doing and what you linked to is that it's working against a string, while I'm doing a dialect, no? | 17-Nov-09 14:22 |
4604 | Pekr | it is a bit difficult to understand recursive rules, but :-) | 17-Nov-09 14:20 |
4603 | Henrik | Depending on the situation, it can be hard to tell whether you are dealing with a word or a specific value. That's the price for freely interchangeable code/data. :-) a: [none] b: copy a b: reduce b ; me doing this behind your back a == [none] ; word! b == [none] ; none! | 17-Nov-09 14:20 |
4602 | Pekr | http://www.rebol.com/docs/core23/rebolcore-15.html#section-6 | 17-Nov-09 14:20 |
4601 | Ladislav | ...except for the fact, that lit-words are used in the Do dialect (= when Rebol is concerned, as you say), when you want to write an expression, which evaluates to a specific word, so, e.g. the expression: 'a evaluates to the same value as the expression: first [a] , which happens to be the word A | 17-Nov-09 14:19 |
4600 | Henrik | a trap that you might fall into: type? first [none] == word! type? first reduce [none] == none! type? first reduce ['none] == word! | 17-Nov-09 14:18 |
4599 | JoshF | OK... Thanks very much. That helps a lot. I was right down the road to writing an expression parser, then that whole slash thing stopped me dead in my tracks. Now I should be able to get into some _real_ trouble! | 17-Nov-09 14:18 |
4598 | Ladislav | right | 17-Nov-09 14:17 |
4597 | JoshF | OK... So, let me paraphrase... As far as REBOL is concerned, lit-words are used only by the parse dialect to represent a thing to match to, whereas words are evaluated to find the thing to match to. However, because of parsing constraints in REBOL as a whole (the significance of "/" when dealing with indexable variables), there's no way to "escape" the slash into an unevaluated (literal) word without the dodge you showed me. | 17-Nov-09 14:16 |
4596 | Henrik | I think you can say, that a word can be an evaluated lit-word. When you are typing a word directly into the console, you evaluate the word into a value that it's bound to. When entering a lit-word, it's evaluated into a word. | 17-Nov-09 14:15 |
4595 | Ladislav | Compare: >> parse [a] [a] ** Script Error: a has no value ** Near: parse [a] [a] >> parse [a] ['a] == true | 17-Nov-09 14:14 |
4594 | Ladislav | in Parse, lit-words are used for matching, while words are looked up for values, which then are used for matching, so totally different behaviour | 17-Nov-09 14:13 |
4593 | JoshF | Or are they just used for the special case of dealing with a / in load? ; - ) | 17-Nov-09 14:13 |
4592 | JoshF | I thought there was only word!'s and then everything else were more concrete types. I guess what I am asking is what is the purpose of lit-words? | 17-Nov-09 14:12 |
4591 | Ladislav | just a different datatype | 17-Nov-09 14:12 |
4590 | JoshF | OK... Mechanically, I see what you're saying, but what's the difference between a lit-word and a word? The spirit eludes me... | 17-Nov-09 14:11 |
4589 | Henrik | And also hence the expression "a block is or isn't loadable" | 17-Nov-09 14:11 |
4588 | Henrik | If LOAD won't eat a block, PARSE won't either, so you can test your block with LOAD. Some words can't be typed directly in, hence ladislav's solution. | 17-Nov-09 14:11 |
4587 | Ladislav | check as follows: type? :lit-div type? :tdiv | 17-Nov-09 14:10 |
4586 | Ladislav | My example works, since the LIT-DIV variable refers to a lit-word, while your tdiv refers to a word | 17-Nov-09 14:09 |
4585 | JoshF | Both tdiv and lit-div type? to a word!... | 17-Nov-09 14:09 |
4584 | JoshF | Ha! Black magic! That works a champ Ladislav, thanks very much! I had tried >> tdiv: to-word "/" == / >> parse [3 / 2] [some [integer! (print "number") | ['+ | '- | '* | tdiv ] (print "op ")]] But had gotten the same error. What makes yours work? | 17-Nov-09 14:07 |
4583 | Ladislav | JoshF: Rebol load does not parse the '/, but you can do: as-lit-word: func ['word [any-word!]] [to lit-word! word] lit-div: as-lit-word / parse [3 - 2] [some [integer! (print "number") | ['+ | '- | '* | lit-div] (print "op")]] | 17-Nov-09 14:04 |
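Ladislav's example in message 4583 parses `[3 - 2]`, so the division case itself is never exercised. A minimal R2 sketch that does exercise it follows. The helper and `lit-div` are copied from his message, while the input block and the `ops` and `op` names are assumptions added here for illustration.

```rebol
; Helper from message 4583: turn a literally given word into a lit-word!.
as-lit-word: func ['word [any-word!]] [to lit-word! word]
lit-div: as-lit-word /

ops: copy []
parse [8 / 2 * 3] [
    some [
        integer!
        ; PARSE fetches LIT-DIV without evaluating it, finds a lit-word!
        ; and therefore matches the word / in the input.
        | set op ['+ | '- | '* | lit-div] (append ops op)
    ]
]
probe ops
; prints: [/ *]
```

As Ladislav notes in message 4587, `type? :lit-div` is the way to confirm that the variable really holds a lit-word! rather than a plain word!.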
4582 | JoshF | The second one failed when I tried to extend the dialect with multiply (*) and divide (/). After further experimentation, it seems that you can't escape the "/". Google has not been helpful here... Does anybody have any ideas? I could parse for just a word! instead of the +, -, etc., but I wanted parse to do the work of deciding what was a valid operation or not. Sorry for the multiple messages, I'm still trying to figure this client out... Thanks for any advice! | 17-Nov-09 14:02 |
4581 | JoshF | >> parse [3 - 2] [some [integer! (print "number") | ['+ | '- | '* | '/ ] (print "op")]] ** Syntax Error: Invalid word-lit -- ' ** Near: (line 1) parse [3 - 2] [some [integer! (print "number") | ['+ | '- | '* | '/ ] (print "op")]] | 17-Nov-09 14:00 |
4580 | JoshF | >> parse [3 + 2] [some [integer! (print "number") | ['+ | '- ] (print "op")]] number op number == true | 17-Nov-09 13:59 |
4579 | JoshF | Hi! I'm trying to use REBOL's parse to make a simple calculator dialect. However, I'm having trouble with escaping entities (I think)... Here's my first try (that worked): | 17-Nov-09 13:58 |
4578 | Robert | IMO that would be really nice. | 8-Nov-09 12:04 |
4577 | Robert | I have used www.antlr.org stuff several years ago with C/C++ target. It's a very cool parser generator toolkit. Just took a look again. It has emitters for different languages. Maybe one of the parse gurus here can take a look if we can do a REBOL emitter. | 8-Nov-09 12:04 |
4576 | BrianH | Agreed :) | 26-Oct-09 18:36 |
4575 | Steeve | But it should return a proper error message as Pekr noticed it. | 26-Oct-09 18:35 |
4574 | BrianH | Otherwise adding them would be difficult. | 26-Oct-09 18:05 |
4573 | BrianH | Keywords that are *planned* to be added should definitely be reserved. | 26-Oct-09 18:03 |
4572 | Pekr | posted to Chat/R3/Parse group ... | 26-Oct-09 13:31 |
4571 | Pekr | Hmm, you are right .... But we might need better error message, no? >> test: ["123"] parse "123" [test] == true >> limit: ["123"] parse "123" [limit] ** Script error: PARSE - invalid rule or usage of rule: end! ** Where: parse ** Near: parse "123" [limit] | 26-Oct-09 13:28 |
4570 | Steeve | if you just try to use it, your parsing may crash. So, it's doing nothing but it's here. | 26-Oct-09 13:13 |
4569 | Pekr | I thought it is not implemented yet, hence no reservation? | 26-Oct-09 13:11 |
4568 | Pekr | :-) | 26-Oct-09 13:11 |
4567 | Steeve | (in R3) | 26-Oct-09 13:07 |
4566 | Steeve | Something funny. I spent an hour debugging a parsing rule. To finally understand this. Never name a rule, LIMIT. LIMIT keyword is reserved for a further use in parse apparently. | 26-Oct-09 13:07 |
4565 | BrianH | Will, R2/Forward is already available for download in DevBase (R3 chat). It is a little outdated though, since I had to take a break to rewrite R3's module system. I'll catch up when I get the chance. The percentage of R3 that I can emulate has gone down drastically since the last update, since R3 has made a lot of changes to basic datatype behavior since then. We'll see what we can do. | 26-Oct-09 5:45 |
4564 | BrianH | Chris, there can be an advantage in R3 to breaking up a bitset into more than one bitset on occasion, mostly memory savings. However, it might not work as well as you might like since offset and/or sparse bitsets aren't supported. Bitsets that involve high codepoints will take a lot of RAM no matter what you do. | 26-Oct-09 5:40 |
4563 | Graham | Rebol doesn't have lines :) | 26-Oct-09 4:49 |
4562 | Steeve | R3 one liner ;-) >> map-each [a b] parse "this-is-a-string" "-" [ajoin [a #"-" b]] | 26-Oct-09 0:16 |
4561 | Geomol | Another: >> out: parse "this-is-a-string" "-" >> forall out [change/part out rejoin [out/1 "-" out/2] 2] >> out == ["this-is" "a-string"] | 25-Oct-09 22:35 |
4560 | Geomol | Sunanda, one way: >> out: clear [] >> parse "this-is-a-string" [mark1: any [thru "-" [to "-" | to end] mark2: (append out copy/part mark1 mark2) skip mark1:]] >> out == ["this-is" "a-string"] | 25-Oct-09 22:32 |
4559 | Will | is R2/Forward available for download? thx | 25-Oct-09 22:29 |
4558 | Sunanda | I guess parse can do this too? http://stackoverflow.com/questions/1621906/is-there-a-way-to-split-a-string-by-every-nth-seperator-in-python | 25-Oct-09 21:49 |
4557 | Chris | An example: a nested d: [k v] structure where 'k is a word and 'v is 'd or any other type: data: [k [k "s"]] R2, you can validate with d: [word! [into d | skip]] Now you have to specify: d: [word! [and any-block! into d | skip]] otherwise you get an error if 'v is a string! | 22-Oct-09 21:58 |
4556 | Chris | Allowing 'into to look inside strings can break current usage of 'into, requiring [and any-block! into ...] | 22-Oct-09 21:40 |
4555 | Chris | Not size, efficiency. | 22-Oct-09 20:03 |
4554 | Steeve | if the size is a problem you can build a function to test each range. But It will be slow | 22-Oct-09 19:35 |
4553 | Steeve | It seems | 22-Oct-09 19:31 |
4552 | Chris | That's what I'm asking. Complemented bitsets wouldn't make a difference here though as the excluded range is of similar scope, right? | 22-Oct-09 19:30 |
4551 | Steeve | So W1 + W+ = 128Kb Is this a problem ? | 22-Oct-09 19:26 |
4550 | Steeve | 64 Kb , sorry | 22-Oct-09 19:24 |
4549 | Steeve | Anyway, a bitset with a length of 2 ** 16 is not so huge in memory (only 16kb) | 22-Oct-09 19:23 |
4548 | Steeve | Use R3 (and its optimized complemented bitsets) | 22-Oct-09 19:21 |
4547 | Chris | Both w1 and w+ appear to be very large values. Would it be smart to perhaps do: [[aw1 | w1] any [aw+ | w+]] Where 'aw1 and 'aw+ are limited to ascii values? | 22-Oct-09 19:08 |
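The split Chris proposes in message 4547 could look roughly like the sketch below. It is written to run in R2, so the `w1` and `w+` sets here are deliberately reduced Latin-1 stand-ins for the full XML name charsets in message 4545, not the real definitions. Whether trying the small ASCII set first is actually faster is not measured here.

```rebol
; ASCII-only sets, tried first because most XML names are plain ASCII.
aw1: charset [#"A" - #"Z" #"_" #"a" - #"z"]
aw+: union aw1 charset [#"-" #"." #"0" - #"9"]

; Reduced stand-ins (assumption) for the full w1 / w+ of message 4545.
w1: union aw1 charset [#"^(C0)" - #"^(D6)" #"^(D8)" - #"^(F6)"]
w+: union aw+ union w1 charset [#"^(B7)"]

word: [[aw1 | w1] any [aw+ | w+]]

print parse/all "xml-name.1" word   ; true
print parse/all "1bad" word         ; false, a name cannot start with a digit
```

The alternatives are ordered so that the wide sets are only consulted when the ASCII test fails. The memory cost of the wide sets stays the same either way.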
4546 | Chris | (sorry if that looks messy) | 22-Oct-09 19:04 |
4545 | Chris | Is there any advantage in breaking up charsets that represent a large varied range of the 16-bit character space? For example, XML names are defined as below (excluding > 2 ** 16), but are most commonly limited to the ascii-friendly subset: w1: charset [ #"A" - #"Z" #"_" #"a" - #"z" #"^(C0)" - #"^(D6)" #"^(D8)" - #"^(F6)" #"^(F8)" - #"^(02FF)" #"^(0370)" - #"^(037D)" #"^(037F)" - #"^(1FFF)" #"^(200C)" - #"^(200D)" #"^(2070)" - #"^(218F)" #"^(2C00)" - #"^(2FEF)" #"^(3001)" - #"^(D7FF)" #"^(f900)" - #"^(FDCF)" #"^(FDF0)" - #"^(FFFD)" ] w+: charset [ #"-" #"." #"0" - #"9" #"A" - #"Z" #"_" #"a" - #"z" #"^(B7)" #"^(C0)" - #"^(D6)" #"^(D8)" - #"^(F6)" #"^(F8)" - #"^(037D)" #"^(037F)" - #"^(1FFF)" #"^(200C)" - #"^(200D)" #"^(203F)" - #"^(2040)" #"^(2070)" - #"^(218F)" #"^(2C00)" - #"^(2FEF)" #"^(3001)" - #"^(D7FF)" #"^(f900)" - #"^(FDCF)" #"^(FDF0)" - #"^(FFFD)" ] word: [w1 any w+] | 22-Oct-09 19:04 |
4544 | Pekr | ah, got reply on Chat from Carl towards complementing: "Re #5718: Pekr, that's a good question, and I think the answer must be YES. We need to be able to complement bitmaps in a "nice way". Otherwise, Unicode bitmaps, even if simply used on ASCII chars, would take a lot of memory. This change should be listed on the project sheet, and if not, I'll add it there." | 18-Oct-09 7:11 |
4543 | Maxim | (it only accepts a string... dummy :-) | 18-Oct-09 1:04 |
4542 | Maxim | doh... when you're too close to the tree... you can't see the forest... I was using TO parse command on a rule ... this obviously won't work.... | 18-Oct-09 1:03 |
4541 | Maxim | my deadline is to have a site working by this week... unless this darned bug I am trying to kill doesn't kill me first. | 18-Oct-09 0:42 |
4540 | BrianH | It's on my list... | 18-Oct-09 0:41 |
4539 | Maxim | I promise. | 18-Oct-09 0:41 |
4538 | Maxim | well, build it and I will try it ;-) | 18-Oct-09 0:40 |
4537 | BrianH | Which is what a rule compiler does :) Actually, it sounds like you could adapt the tricks of the rule compiler to *your* rule compiler, which would let you use the new operations in your rule source and have the workarounds generated in the output. | 18-Oct-09 0:39 |
4536 | Maxim | and it's not simple parsing since I use parsing index manipulation, which is also dictated by the source data it encounters. it's like swatting flies using a fly swatter at the end of a rope, while riding a roller coaster which changes layout every time you ride it ;-) | 18-Oct-09 0:39 |
4535 | Maxim | really, the problem is not the parsing itself... it's getting the darn rules to generate the proper rules hehehe. | 18-Oct-09 0:37 |
4534 | BrianH | Of course the *result* of the compilation would be self-modifying rules :) | 18-Oct-09 0:36 |
4533 | BrianH | If the self-modifying rules are strung-together basic blocks, you can use the rule compiler to generate the blocks. And the R3 changes make self-modifying rules less necessary, so you can have even larger basic blocks. | 18-Oct-09 0:35 |
4532 | Maxim | since I use binding to map inner rules which are also constructed on the fly but have to be pushed and popped from the stack as I traverse data... it's a lot of fun :-D | 18-Oct-09 0:34 |
4531 | Maxim | the rule I am writing now actually does JIT rule compilation... hairy to debug :-) | 18-Oct-09 0:32 |
4530 | Maxim | laden with many paren expressions and a stack on top of it. | 18-Oct-09 0:30 |
4529 | Maxim | a rule compiler doesn't adapt very well to self-modifying rules | 18-Oct-09 0:29 |
4528 | BrianH | Maxim, that is what Pekr was talking about. That is planned to be fixed. | 18-Oct-09 0:29 |
4527 | BrianH | Maxim, Remark could be adjusted to use the rule compiler. For that matter, Remark could use R2/Forward (which needs some work, but is already better than R2 on its own). | 18-Oct-09 0:28 |
4526 | Maxim | you end up with a full codepoint bitset minus one byte whether it is complemented or not | 18-Oct-09 0:28 |
4525 | Maxim | one situation which complement can't handle very well (RAM-wise): union charset "a" complement charset "b" | 18-Oct-09 0:27 |
4524 | Maxim | but wouldn't work with remark ;-) | 18-Oct-09 0:26 |
4523 | BrianH | Pekr, we still need complementing to be enhanced. Even Carl has said so. | 18-Oct-09 0:26 |
4522 | BrianH | Gabriele, these changes can be backported to R2 in the form of a rule compiler that generates (unreadable) R2 parse rules. | 18-Oct-09 0:25 |
4521 | Pekr | So - we don't need complementing to be enhanced? Because we talked about it, but it is not defined in proposal, it is not part of Carl's feature table, and I also got no reaction on R3 Chat .... | 17-Oct-09 14:41 |
4520 | Pekr | An=And | 17-Oct-09 11:50 |
4519 | Pekr | Gabriele - wrong perception :-) The correct claim should be - "An now nothing prevents me from fully switching to R3 ..." :-) | 17-Oct-09 11:50 |