syntax improvement, new operators
er... at atlasocean.com
Thu Feb 8 21:44:46 UTC 2007
I've been working on an extensive tutorial for Ragel, RegEx, and
FSM newbies, based on a full PDF parser I built with Ragel. Based on
that work, I believe that the current Ragel action embedding names can
be improved conceptually by re-categorizing and re-naming them, and
that doing so exposes three operators that are missing from the
re-categorization and that I personally would like to see added to
Ragel. I have suggested currently unused symbols for them.
I'm posting this to the group to get some discussion and feedback on
the proposal. Here are the operators and their suggested names and
categories. The new operators are listed at the end of each group, and
should be obvious.
I've also included notes on what the operators mean, along with
some usage notes. Developing this conceptual framework has greatly
aided my understanding of Ragel and hopefully will help others as
well. I have found that action embeddings are the most difficult
aspect of Ragel to learn, and believe this conceptual framework
improves the situation immensely.
> aka First -- This action will be executed on the first character the machine recognizes.
$ aka Each -- This action will be executed on each character the
machine recognizes.
@ aka Match -- This action will be executed on each character the
machine recognizes that puts the machine into a match state.
< aka Continue -- (New) This action will be executed on the next
character the machine recognizes when the machine is in a match state.
Multiple character actions can be executed on the recognition of a
single character. For example, both the First and Each action are
executed (in that order) after the machine recognizes the very first
character.
Ragel guarantees that character actions will always be executed in the
following sequence:
Character_Actions_Seq = First Each+ Match (Continue Each+ Match)*
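The sequence above is itself a regular expression over action names, so a trace of fired actions can be checked against it mechanically. Here is a minimal Python sketch of that idea. The one-letter codes and the function name is_valid_action_sequence are my own illustration, not anything Ragel provides, and the grammar is taken literally from the line above.

```python
import re

# One-letter codes for the proposed character action names (illustrative only).
CODES = {"First": "F", "Each": "E", "Match": "M", "Continue": "C"}

# Character_Actions_Seq = First Each+ Match (Continue Each+ Match)*
SEQ_RE = re.compile(r"FE+M(?:CE+M)*")


def is_valid_action_sequence(actions):
    """Return True if the list of action names follows the proposed order."""
    try:
        encoded = "".join(CODES[name] for name in actions)
    except KeyError:
        return False  # unknown action name
    return SEQ_RE.fullmatch(encoded) is not None
```

For example, ["First", "Each", "Match"] passes, while ["Each", "Match"] is rejected because the trace does not start with First.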
% aka Accept -- This action will only be executed when the machine
accepts a match.
%\ aka Fail -- (New) This action will only be executed when the
machine fails to either: (a) recognize a character, or (b) accept a
match.
%? aka Skip -- (New) This action will be executed instead of Fail when
either the Optional operator or the Kleene Star operator is applied to
the machine.
A machine can execute its Fail or Skip action even if it has already
recognized one or more characters. Therefore, to avoid resource leaks:
(a) only acquire resources in your First and Each actions that will
be cleaned up by your Fail or Skip action, and/or
(b) acquire and release resources in your Match and Continue actions.
The latter is usually the best choice if it's an option.
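To make option (a) concrete, here is a rough Python sketch of the pattern: a resource acquired in First and used in Each is always released by Fail, even after several characters have been recognized. Everything in it is hypothetical. BufferHandler, its on_* methods, and run_literal are a toy stand-in for a machine that matches one literal string, not Ragel code or generated output, and having Accept hand off the buffer is my own assumption.

```python
class BufferHandler:
    """Hypothetical handler; method names mirror the proposed action names."""

    def __init__(self):
        self.buffer = None   # the resource being guarded
        self.result = None

    def on_first(self, ch):
        self.buffer = []     # First: acquire the resource

    def on_each(self, ch):
        self.buffer.append(ch)   # Each: use the resource

    def on_accept(self):
        # Accept: hand off the result, then release (assumed behavior)
        self.result = "".join(self.buffer or [])
        self.buffer = None

    def on_fail(self):
        self.buffer = None   # Fail: release whatever First/Each acquired


def run_literal(word, text, handler):
    """Toy stand-in for a machine that matches exactly `word` (not Ragel)."""
    for i, ch in enumerate(text):
        if i >= len(word) or ch != word[i]:
            handler.on_fail()    # failed to recognize a character
            return False
        if i == 0:
            handler.on_first(ch)
        handler.on_each(ch)
    if len(text) != len(word):
        handler.on_fail()        # input ended before a match was accepted
        return False
    handler.on_accept()
    return True
```

Running run_literal("abc", "abx", handler) recognizes two characters before failing, and the handler's buffer is still released, which is the leak the advice above is meant to prevent.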