How to avoid % action being called when the match continues?

Adrian Thurston adrian.thurston at esentire.com
Mon Dec 7 13:31:54 UTC 2009

When the '%' is seen, it is not yet known whether Ragel is parsing more 
of node_name or the start of L_BRACKET. Since it can't know, it does 
both, and the leaving action runs on that shared transition. Refactor 
like this:

step      = node_name L_BRACKET @command .....

Now, when command is called, there is no longer any ambiguity about what 
is being parsed: L_BRACKET has been matched in full, so node_name has 
ended.
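
To make the difference concrete, here is a small hand-written Python simulation of the two placements. It is only a sketch, not Ragel-generated code: the function name simulate is made up for illustration, and it assumes ALPHA, DIGIT and HEXDIG are the usual ASCII classes and that "%5b" is matched case-sensitively. It records the input offsets at which a leaving action (node_name %command L_BRACKET) and a finishing action (node_name L_BRACKET @command) would run.

```python
import string

ALNUM = set(string.ascii_letters + string.digits)   # assumed ALPHA | DIGIT
HEXDIG = set(string.hexdigits)                       # assumed HEXDIG


def simulate(data):
    """Return (leaving, finishing) offsets for the two action placements.

    leaving   : where `node_name %command L_BRACKET` would run command
    finishing : where `node_name L_BRACKET @command` would run command
    """
    leaving, finishing = [], []
    complete = 0   # complete pchars read so far in node_name
    esc = ""       # characters of the escape currently being read
    for i, ch in enumerate(data):
        if esc == "":
            if ch in ALNUM:
                complete += 1
            elif ch == "%":
                # After at least one pchar, '%' may continue node_name OR
                # start L_BRACKET. Both share one transition, so the
                # leaving action runs here every time.
                if complete > 0:
                    leaving.append(i)
                esc = "%"
            else:
                raise ValueError("unexpected %r at offset %d" % (ch, i))
        else:
            if ch not in HEXDIG:
                raise ValueError("bad escape at offset %d" % i)
            esc += ch
            if len(esc) == 3:
                if esc == "%5b":
                    if complete == 0:
                        raise ValueError("node_name must not be empty")
                    # L_BRACKET is now fully matched: the finishing
                    # action runs exactly once, on its last character.
                    finishing.append(i)
                    return leaving, finishing
                complete += 1   # an ordinary escape is one more pchar
                esc = ""
    return leaving, finishing


print(simulate("qqq%00www%11eee%5bzzz"))   # ([3, 9, 15], [17])
```

For the input from the original question, the leaving action runs at each of the three '%' characters, while the finishing action runs once, on the 'b' that completes "%5b".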


Iñaki Baz Castillo wrote:
> Hi, I have a simple grammar:
>   pchar     = ALPHA | DIGIT | ("%" HEXDIG HEXDIG);
>   L_BRACKET = "%5b";
>   node_name = ( pchar )+ -- L_BRACKET;
>   step      = node_name %command L_BRACKET ..... 
> As you can see, when parsing "step" Ragel calls the "command" action when 
> L_BRACKET is detected. However, if any other hex escape appears in 
> node_name, Ragel runs the "command" action and then continues in 
> "node_name".
> That is: when Ragel is parsing "node_name" and finds "%" (even if it's "%99" 
> rather than "%5b"), it runs the "command" action. I expected that Ragel 
> wouldn't run the leaving action, as it remains in node_name.
> Why does Ragel run the % leaving action when it finds "%"? Perhaps because Ragel 
> must make the decision per byte, without reading more than one byte?
> If so, is there any way to avoid the "command" action being called several times 
> for the same node_name?
> To clarify: if "step" is:
>   qqq%00www%11eee%5bzzz
> then Ragel calls the "command" action 3 times (each time it finds "%").
> Thanks for any help.

