[ragel-users] Re: Is this the right way to do it ?

Gaspard Bucher gasp... at teti.ch
Wed Oct 31 17:07:55 UTC 2007


Speed was not the main reason for choosing ragel: gluing my Command
class to the lexer and lemon was not easy and felt unnatural. The
way ragel works is very intuitive to me. Moreover, I had a gripe
with lemon: when the current state is terminal (its only action is a
default 'reduce'), it still needs one more token (or EOF) to trigger
the reduction.

Rubyk (the tool I am working on) is about multimedia and AI, so state
machines feel like home, and learning about ragel might help me with
the music production (networks of possible melodies with paths chosen
by the pattern recognition). Music is a state machine !

I think I am becoming a fan of ragel. I might also use it to parse
zafu templates and zazen (textile improved) for the CMS I am working
on (http://zenadmin.org).

Ragel is the kind of goodie that puts you into the state "I should
rewrite this using ragel" a couple of times a day... So I am very glad
flex/lemon were not such good friends (even though lemon is really
nice to use).

Thanks for the reply. I feel more confident with the way I am doing things.

Gaspard

2007/10/31, Adrian Thurston <thurs... at cs.queensu.ca>:
> Hi Gaspard,
>
> The other way to capture token text is to set pointers to mark the start and end of tokens. It is faster but requires that you be careful about buffer boundaries.
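
For illustration, here is a minimal sketch of the pointer-marking idea
in plain C++, without ragel. The function name scan_words and the word
rule (letters, digits, '_') are assumptions made for this example and
are not part of prototype.rl. Instead of appending one character at a
time to a side buffer, it records where a token starts and copies the
whole range once the token ends. The buffer-boundary caveat shows up at
the end of the function.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not part of prototype.rl): collect words from the
// range [p, pe) by marking where each one starts and ends.
std::vector<std::string> scan_words(const char *p, const char *pe)
{
  std::vector<std::string> tokens;
  const char *ts = nullptr;  // token start; nullptr while outside a token

  for (; p != pe; ++p) {
    unsigned char c = static_cast<unsigned char>(*p);
    bool in_word = std::isalnum(c) || c == '_';
    if (in_word && ts == nullptr) {
      ts = p;                      // mark the first character of the token
    } else if (!in_word && ts != nullptr) {
      tokens.emplace_back(ts, p);  // copy [ts, p) in one step
      ts = nullptr;
    }
  }
  if (ts != nullptr) {
    // The input ended inside a token. This is only safe because the whole
    // input sits in one buffer; with chunked input the partial token would
    // have to be carried over to the next chunk before ts is reused.
    tokens.emplace_back(ts, pe);
  }
  return tokens;
}
```

In a ragel action the same idea would presumably mean saving p when a
token starts and building the string from the saved pointer and p when
the token ends.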
>
> In my opinion this is a valid way to parse and the motivation is speed. However if speed is not a requirement and you're dealing with a token stream I would suggest that you use the more traditional lexer+parser approach.
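
As a rough sketch of the lexer+parser split, assuming only the link
syntax (a.1=>1.b) needs to be handled: a lexer stage turns characters
into tokens and a separate parser stage checks their order. All names
here (TokenType, Token, Link, tokenize, parse_link) are hypothetical.
In the setup discussed in this thread the lexer would be generated by
ragel or flex and the parser by lemon, rather than written by hand.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical token kinds, covering the link syntax only.
enum class TokenType { Identifier, Integer, Dot, Arrow, Invalid };

struct Token {
  TokenType type;
  std::string text;
};

// Lexer stage: characters in, tokens out. It knows nothing about grammar.
std::vector<Token> tokenize(const std::string &src)
{
  std::vector<Token> out;
  std::size_t i = 0;
  while (i < src.size()) {
    unsigned char c = static_cast<unsigned char>(src[i]);
    if (std::isspace(c)) { ++i; continue; }
    std::size_t start = i;
    if (std::isdigit(c)) {
      while (i < src.size() &&
             std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
      out.push_back({TokenType::Integer, src.substr(start, i - start)});
    } else if (std::islower(c)) {
      while (i < src.size() &&
             (std::isalnum(static_cast<unsigned char>(src[i])) ||
              src[i] == '_')) ++i;
      out.push_back({TokenType::Identifier, src.substr(start, i - start)});
    } else if (c == '.') {
      ++i;
      out.push_back({TokenType::Dot, "."});
    } else if (src.compare(i, 2, "=>") == 0) {
      i += 2;
      out.push_back({TokenType::Arrow, "=>"});
    } else {
      ++i;
      out.push_back({TokenType::Invalid, src.substr(start, 1)});
    }
  }
  return out;
}

struct Link {
  std::string from;
  int from_port;
  int to_port;
  std::string to;
};

// Parser stage: accept exactly
//   Identifier Dot Integer Arrow Integer Dot Identifier
// and fill `link` on success.
bool parse_link(const std::vector<Token> &t, Link &link)
{
  static const TokenType shape[] = {
    TokenType::Identifier, TokenType::Dot, TokenType::Integer,
    TokenType::Arrow,
    TokenType::Integer, TokenType::Dot, TokenType::Identifier};
  const std::size_t n = sizeof(shape) / sizeof(shape[0]);
  if (t.size() != n) return false;
  for (std::size_t i = 0; i < n; ++i)
    if (t[i].type != shape[i]) return false;
  link.from = t[0].text;
  link.from_port = std::atoi(t[2].text.c_str());
  link.to_port = std::atoi(t[4].text.c_str());
  link.to = t[6].text;
  return true;
}
```

The point of the split is that the token texts are already extracted
by the time the grammar rules run, so the parser side never has to
append characters itself.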
>
> Adrian
>
> -----Original Message-----
> From: Gaspard Bucher <gasp... at teti.ch>
>
> Date: Wed, 31 Oct 2007 07:58:21
> To:ragel-users <ragel-users at googlegroups.com>
> Subject: [ragel-users] Is this the right way to do it ?
>
>
>
> I am implementing a parser to read commands from user (interactive) or
> from a stored file. The idea is to build the objects and their
> relation inside rubyk (http://rubyk.org). Some examples of the syntax:
>
> create a metronome object:  m1 = Metro(120)
> create a metronome object:  m1 = Metro(metro:120) # same as above
> create a note out object:   n = NoteOut(velocity:80 port:"funk")
> create a script object:     cooking = Script(".... Lua code ....")
> create links:               m1.1 => 1.cooking, cooking.1 => 1.n
>
> Here is a rough prototype that implements the parsing with ragel (I
> have been using flex/lemon so far).
>
> Am I doing this right ? More precisely :
> 1. is there a better way to extract token values (instead of
> repeated @a appends) ?
> 2. would it be simpler to use ragel only for building the tokens and
> let lemon handle the actions ?
>
> Thanks for your answers.
>
> Gaspard
>
> =================== prototype.rl ========
> #include <iostream>
> #include <string>
> #include <cstdio>
> #include <cstdlib>   // atoi
> #include <cstring>   // strlen
> #define MAX_BUFFER_SIZE 2048
>
> %%{
>   machine foo;
>   write data noerror;
> }%%
>
> class Command
> {
> public:
>   void parse(const char * str)
>   {
>     const char *p = str; // data pointer
>     const char *pe = str + strlen(str); // past end
>     int cs;        // machine state
>     int len = 0;
>     char token[MAX_BUFFER_SIZE + 1];
>
>     %%{
>       action a {
>         if (len >= MAX_BUFFER_SIZE) {
>           std::cerr << "Buffer overflow !" << std::endl;
>           // stop parsing
>           return;
>         }
>         token[len] = fc; /* append */
>         len++;
>       }
>
>       action set_var {
>         token[len] = '\0';
>         mVariable = token;
>         len = 0;
>       }
>
>       action key {
>         token[len] = '\0';
>         std::cout << "[key   :" << token << "]" << std::endl;
>         len = 0;
>       }
>
>       action set_klass {
>         token[len] = '\0';
>         mClass = token;
>         len = 0;
>       }
>
>       action space {
>         printf(" ");
>       }
>
>       action ret {
>         printf("\n");
>       }
>
>       action set_string {
>         token[len] = '\0';
>         mValue = token;
>         len = 0;
>       }
>
>       action set_float {
>         token[len] = '\0';
>         mValue = token;
>         len = 0;
>       }
>
>       action set_integer {
>         token[len] = '\0';
>         mValue = token;
>         len = 0;
>       }
>
>       action set_from {
>         mFromPort = atoi(mValue.c_str());
>         mFrom = mVariable;
>       }
>
>       action create_instance {
>         std::cout << "NEW  (" << mVariable << "=" << mClass << "()"
>                   << ")" << std::endl;
>       }
>
>       action create_link {
>         mToPort = atoi(mValue.c_str());
>         mTo   = mVariable;
>         std::cout << "LINK (" << mFrom << "." << mFromPort << "=>"
>                   << mToPort << "." << mTo << ")" << std::endl;
>       }
>
>       ws     = (' ' | '\n' | '\t')+;
>
>       identifier = 'a'..'z' @a (digit | alpha | '_')* @a;
>
>       var    = identifier %set_var;
>
>       klass  = 'A'..'Z' @a (digit | alpha | '_')* @a %set_klass;
>
>       string  = '"' ([^"\\] | '\n' | ( '\\' (any | '\n') ))* @a
>                 %set_string '"';
>       float   = ('1'..'9' @a digit* @a '.' @a digit+ @a) %set_float;
>       integer = ('1'..'9' @a digit* @a) %set_integer;
>
>       value  = (string | float | integer);
>
>       key    = identifier %key;
>
>       param  = (key ':' ws* value);
>
>       parameters = value | (param ws*)+;
>
>       create_instance = var ws* '=' ws* klass '(' parameters? ')'
>                         @create_instance;
>
>       create_link = var '.' integer @set_from ws* '=>' ws* integer '.'
>                     var @create_link;
>
>       main := ((create_instance | create_link) ws*)+  ;
>
>       write init;
>       write exec;
>     }%%
>
>     printf("\n");
>   }
> private:
>   std::string mVariable, mFrom, mTo, mClass, mValue;
>   int         mFromPort,     mToPort;
> };
>
> int main()
> {
>   Command cmd;
>   cmd.parse("a=Value() b=Super(23.3)c=This(hey:\"mosdffasl\" come: 3)\n"
>             "a.1=>1.b a.2=>2.b");
> }
> ===========================


