Grammar testing proposal

Adrian Thurston thurs... at
Tue Sep 19 02:13:17 UTC 2006

Hi Colin,

I don't have a TXL grammar for Ragel. You could get by with a minimal 
one which captured the regular language as simply a list. But I agree, 
TXL might be too much for merely testing Ragel. You'd need a (partial) 
grammar for all the host languages. Some exist already, but future 
languages supported might need to have grammars made.


Colin Fleming wrote:
> Hehe, TXL does actually look really interesting, and potentially
> appropriate! I'd be worried about raising the barrier to entry for Ragel
> though; it looks like TXL might take some time to wrap your head around.
> BTW do you have a grammar spec for Ragel?
> On 9/15/06, Adrian Thurston <thurs... at> wrote:
>> Colin, great idea. One issue might be specifying language independent
>> actions. This could get tough if in the future we support non-C-like
>> languages. For example, there was mention of supporting Ruby.
>> Perhaps TXL might be useful. It could be used to define
>> a mini toy language and to write transformations to the host languages.
>> Though I'm connected to that project so I'm biased in regard to it being
>> appropriate :)
>> -Adrian
>> Colin Fleming wrote:
>>> Hi all,
>>> I've been thinking about various ways to test Ragel and the generated
>>> grammars, here's what I've come up with. I'm really interested in any
>>> feedback. I'm currently developing a couple of grammars that I'm
>>> primarily interested in using with Java. The Java generation is still
>>> a bit experimental, so I'd like to be able to use acceptance tests
>>> that confirm that a) the grammar works as expected, b) the results are
>>> consistent across Java/C++/whatever, and c) the results are also
>>> consistent across different code generation strategies.
>>> This last one is probably currently more useful to Adrian than anyone,
>>> but I'm probably going to reimplement rlcodegen in Java shortly, so it
>>> will be great for testing that as well as testing code generation
>>> implementations for any new languages, or new code generation
>>> strategies.
>>> So, I propose a parser class generator that will take a raw Ragel
>>> grammar and generate an rl file for whichever of the supported
>>> languages the user requests. This rl file will generate a basic
>>> parsing class, with the standard methods: init, execute, finish. The
>>> Ragel syntax would be slightly extended to specify features of the
>>> generated class, and these extensions stripped out when the rl file is
>>> written. This would actually probably be pretty generally useful too,
>>> a lot of people just want a support class that they can integrate into
>>> a larger project, I imagine.
>>> The whole point of this thing is testing, so unit test data and
>>> expected values would be encoded in the source file. Either a test
>>> class or just the parser could be generated, or both.
>>> An example is worth a thousand words, so here goes:
>>> %%{
>>>   # Variables for the generated class, initialised in init() method
>>>   # public vars generate getters
>>>   public int val = 0;
>>>   private boolean neg = false;
>>>   action see_neg {
>>>     neg = true;
>>>   }
>>>   action add_digit {
>>>     val = val * 10 + (fc - '0');
>>>   }
>>>   main :=
>>>     ( '-'@see_neg | '+' )? ( digit @add_digit )+
>>>     '\n' @{ fbreak; };
>>>   test {
>>>     input "1\n";
>>>     output "1";
>>>   }
>>>   test {
>>>     input "213 3213\n";
>>>     output "unexpected char ' ' in input";
>>>     failure;
>>>   }
>>> }%%
>>> Obviously one concern here is overloading the Ragel syntax; maybe a
>>> prefix would be good to highlight the new keywords as preprocessor
>>> directives.
>>> A few more thoughts:
>>> It would be good to be able to specify variables of the alphabet type:
>>> public alphtype character;
>>> It would also be interesting to track the states the machine moves
>>> through on each run; the traces could be compared to ensure that the
>>> different strategies behave identically.
>>> I'm also not sure about having the test code in with the actual
>>> grammar, but I guess an include directive would make that easier.
>>> Any thoughts or ideas?
>>> Cheers,
>>> Colin
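
For illustration, here is a rough hand-written Java sketch of the kind of support class the proposal describes, with the init/execute/finish methods and a getter for the public variable. It is not Ragel output: the class name IntParser, the state constants, the run() helper and the "unexpected end of input" message are all assumptions made up for this sketch. As in the example grammar, the sign is recorded in neg but never applied to val.

```java
// Hand-written approximation of the proposed parser class.
// Real Ragel-generated code would be table- or goto-driven and look different.
class IntParser {
    // Hypothetical state numbering, chosen for readability only.
    private static final int ERROR = -1, START = 0, SIGNED = 1, DIGITS = 2, DONE = 3;

    private int cs;          // current state
    private int val;         // "public int val" in the grammar, exposed via getter
    private boolean neg;     // "private boolean neg" in the grammar, no getter
    private String error;    // message for the first unexpected character

    // Reset the machine and the user variables.
    public void init() {
        cs = START;
        val = 0;
        neg = false;
        error = null;
    }

    // Feed a block of input; stops at the terminating newline or first error.
    public void execute(char[] data) {
        for (int p = 0; p < data.length && cs != DONE && cs != ERROR; p++) {
            char fc = data[p];
            if (cs == START && fc == '-') {
                neg = true;                        // action see_neg
                cs = SIGNED;
            } else if (cs == START && fc == '+') {
                cs = SIGNED;
            } else if (fc >= '0' && fc <= '9') {
                val = val * 10 + (fc - '0');       // action add_digit
                cs = DIGITS;
            } else if (cs == DIGITS && fc == '\n') {
                cs = DONE;                         // '\n' @{ fbreak; }
            } else {
                error = "unexpected char '" + fc + "' in input";
                cs = ERROR;
            }
        }
    }

    // Report whether the input was accepted; incomplete input counts as failure.
    public boolean finish() {
        if (cs != DONE && cs != ERROR) {
            error = "unexpected end of input";     // assumed wording
            cs = ERROR;
        }
        return cs == DONE;
    }

    public int getVal() { return val; }
    public String getError() { return error; }

    // Mirrors one "test { input ...; output ...; }" block: returns the output string.
    static String run(String input) {
        IntParser parser = new IntParser();
        parser.init();
        parser.execute(input.toCharArray());
        return parser.finish() ? String.valueOf(parser.getVal()) : parser.getError();
    }
}
```

With this sketch, IntParser.run("1\n") yields the output of the first test block, and IntParser.run("213 3213\n") yields the failure message of the second. A generated test class could simply compare run(input) against the expected output for each block.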
