[ragel-users] Problem generating code.

Brian Pane brianp at brianp.net
Sun May 9 21:13:18 UTC 2010


On Sun, May 9, 2010 at 1:43 PM, Husam Senussi <husam at senussi.com> wrote:

> I'm trying to create an RFC 2822 parser, but I'm having a problem generating
> the code: for some reason ragel keeps running until I press CTRL-C (it was
> running for 20 minutes). Below is the grammar I'm trying to use.

Whenever I've encountered very long Ragel processing times with my own
grammars, the reason usually has been nondeterminism. For example, with
a grammar like this:

word = space* alpha+ space*;
number = space* digit+ space*;
main = ( space+ | word | number )*;

there is an ambiguity: if the first input character is a space, it might
be the start of the "space+" option in main, but it might also be the
start of the "word" option or the "number" option in main, since those
can also start with a space.

Internally, Ragel has to build a state graph that models those
nondeterministic states.  The more ambiguity there is in the
grammar, the bigger this graph becomes, and the longer it
takes for Ragel to run.  With my own grammars, I've found
that the run time of Ragel and the subsequent C compilation
is a good estimator of how nondeterministic my grammar is.

I've found that it helps, when using the "|" operator, to make
the different options start with distinct prefix strings.  In
the case of my example grammar above, a less ambiguous
alternative would be:

word = alpha+;
number = digit+;
main := ( space+ | word | number )+;

For more complicated languages, another good pattern I've
learned from other people's grammars is to put optional space
at the end of each rule and never at the start.
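
As a minimal sketch of that pattern (the machine name and the rules
below are made up for illustration and are not taken from any
particular grammar), each rule consumes its own trailing space, and
leading space is consumed only once, at the top level:

```ragel
%%{
    machine trailing_space;

    # Hypothetical token rules: each one may end with whitespace,
    # but none of them may begin with it.
    word   = alpha+ space*;
    number = digit+ space*;

    # Leading whitespace is handled in exactly one place.
    main := space* ( word | number )*;
}%%
```

With this layout, a space in the input can only continue the rule that
was just matched, so the alternatives under "|" still begin with
distinct characters (a letter or a digit).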

-Brian
