[ragel-users] How do I act on eof in state charts

Adrian Thurston thurston at complang.org
Fri Oct 23 02:04:24 UTC 2009


If you are unioning tokens together and then applying a Kleene star (here 
the longest-match form, **), then you could do this:

     action lit {}
     action space {}

     string_literal = ( "'" ( [^'] | "''" )* "'" ) %lit;
     ws = ' ' @space;

     main := ( ws | string_literal )**;
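
For readers who want to see the intended behaviour without running Ragel, 
here is a minimal hand-written Ruby sketch of what the machine above is 
meant to do. It is not Ragel output: the scan function and its event 
symbols are made up for illustration. A quote directly after a closing 
quote is treated as an escaped quote (the longest match), and the lit 
event is also emitted when the literal ends at the end of the input, 
which is the case the original question asks about.

```ruby
# Hand-written stand-in for: main := ( ws | string_literal )**;
# Returns the list of actions that would fire, in order.
# (Illustrative only; names here are assumptions, not part of Ragel.)
def scan(input)
  events = []
  i = 0
  while i < input.length
    case input[i]
    when ' '
      events << :space            # ws = ' ' @space
      i += 1
    when "'"
      i += 1                      # consume the opening quote
      closed = false
      until closed
        return events << :error if i >= input.length  # unterminated literal
        if input[i] != "'"
          i += 1                  # ordinary body character
        elsif input[i + 1] == "'"
          i += 2                  # '' is an escaped quote: stay in the literal
        else
          i += 1                  # closing quote
          closed = true
        end
      end
      # Leaving action: emitted whether the next character is a space
      # or the input has ended (the EOF case).
      events << :lit
    else
      return events << :error     # character outside the grammar
    end
  end
  events
end
```

With this sketch, scan("'a''b'") reports a single lit, while 
scan("'a' 'b'") reports lit, space, lit.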

If you're not, and you want a self-contained string literal that is 
safe to use regardless of which operation is applied to it next, then 
do the following (you seem to have something like this already).

     action lit {}
     action space {}

     string_literal = ( "'" ( [^'] | "''" $1 )* "'" ) %0 %lit;
     ws = ' ' @space;

     main := ws string_literal ( ws | string_literal )*;
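
How the $1 and %0 priorities settle the conflict can be modelled in a few 
lines of plain Ruby. This is only a sketch of the idea, not how Ragel is 
implemented: the Transition struct and the resolve helper are 
hypothetical. Just after a quote that might close the literal, a 
following ' could either continue the literal as an escaped quote 
(priority 1, from $1) or leave the literal and start a new one 
(priority 0, from %0). The higher priority wins, so the literal is not 
left in the middle of 'a''b'.

```ruby
# Hypothetical model of one candidate transition on an input character.
Transition = Struct.new(:label, :priority)

# When several transitions compete on the same character, keep the one
# with the highest priority (a simplified view of priority resolution).
def resolve(candidates)
  candidates.max_by(&:priority)
end

# State: just after a quote that might close the literal; next char is "'".
candidates = [
  Transition.new(:continue_literal, 1),  # "''" $1: stay inside the literal
  Transition.new(:leave_literal, 0)      # %0: leave the literal, running %lit
]

winner = resolve(candidates)
puts winner.label
```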

-Adrian

Antony Blakey wrote:
> On 22/10/2009, at 12:10 AM, Antony Blakey wrote:
> 
>> string_literal_body =
>>  start: (
>>    "'" -> seen_quote |
>>    [^'] -> start
>>  ),
>>  seen_quote: (
>>    "'" -> start |
>>    [^'] @{ fhold; } -> final
>>  );
>> string_literal = "'" string_literal_body %{ puts "string_literal" } ;
>>
>> The problem occurs when a string literal ends at eof. How do I
>> specify the eof 'match' in the seen_quote state such that all the
>> leaving-transition actions that are in place above the
>> string_literal are executed, such as the 'puts' on the
>> 'string_literal' machine? I don't want to manually duplicate the
>> parent code because multiple machines reference string_literal, with
>> different leaving-transition actions.
>>
>> I couldn't get it to work using priorities - the terminator needs  
>> lookahead to disambiguate; the following doesn't work:
>>
>> string_literal = "'" [^']* ( "''" [^']* )* '"'
> 
> I ended up doing this:
> 
>    string_literal_unqoted = "'" [^']* "'" ;
>    string_literal = string_literal_unqoted+ $(longest, 1) %(longest, 0) % { puts "string_literal" };
> 
> which works. I would have thought that this:
> 
>    string_literal = string_literal_unqoted string_literal_unqoted** % { puts "string_literal" };
> 
> would work, but '**' doesn't work for me - if I use a main like this:
> 
>    main := space* ( string_literal space* )* ;
> 
> then I get two string_literals from 'a''b' rather than one, while the  
> explicit priorities do give me one string_literal.
> 
> Of course this works as well:
> 
>    main := space* ( string_literal <: space* )* ;
> 
> but given the pervasiveness of whitespace handling in my grammar (a  
> full Smalltalk parser) it's a real PITA because everything you want  
> greedy consumption for has to be thus annotated wherever it is used.
> 
> Still, I'm interested in how you specify EOF transitions in an  
> explicit state machine. I also had success appending a 0 onto the  
> input for use as an EOF marker that can be matched in the state  
> machine, but I'm not sure yet if/how that will interfere with true EOF  
> functioning.
> 
> Antony Blakey