[ragel-users] Parsing of names with spaces in them

Victor Khimenko khim at chromium.org
Mon Jan 23 08:26:54 UTC 2012


On Mon, Jan 23, 2012 at 11:11 AM, Gerald Gutierrez <
gerald.gutierrez at gmail.com> wrote:

> Hello folks,
>
> I recently found Ragel and have discovered that it is a very pleasant
> piece of software. That said, I've run into a problem that I was
> hoping is common and a solution available.
>

Sadly, the problem is common, but a solution is quite explicitly NOT
available.


> Please see the example code at https://gist.github.com/1661150.
>
> Basically, I'd like to parse the following:
>
> name:name
>
> where the names start and end with an alnum, and can contain any
> combination of alnum and spaces inside. They could also be blank. My
> rules for this are:
>
> identifier = alnum (space* alnum)*;
> name       = (identifier | zlen) >sName $pName %fName;
>
> The names can be separated by a colon and optionally spaces in between
> the names and the colon. My rules for this are:
>
> sep = space* ":" space*;
> main := name sep name;
>
> This doesn't work because apparently the space* in identifier and the
> space* in sep confuse the parser. I end up getting the action fName
> executed in every space of the name.
>
> If I change sep to:
>
> sep = ":";
>
> then everything is fine. How do I modify these rules so that the
> parser does what I intend?
>

The answer is simple: you can't. Ragel generates a DFA with actions attached
to its transitions. This means: symbol in => action out, with no way to
postpone the decision.

Your definition is ambiguous: when you see a space you have no idea whether it
belongs to the identifier or not. You must scan ahead and look for the next
non-space char: if it's a colon then the preceding spaces were not part of the
identifier; if it's an alnum then they were. This is not something a DFA
can/should do...
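
To make the "scan ahead" requirement concrete, here is a rough hand-written
sketch in Python (not Ragel output, and not how Ragel works internally). It
holds back every space in a `pending` buffer until the next non-space
character shows whether the spaces belonged to the name or to the separator.
The function name `parse_pair` and the buffer names are made up for
illustration; the character classes are assumed to be ASCII alnum and the
usual ASCII whitespace.

```python
SPACES = " \t\n\v\f\r"   # assumed equivalent of the `space` class


def is_alnum(ch):
    # ASCII-only alnum, assumed to match the `alnum` class in the grammar.
    return ch.isascii() and ch.isalnum()


def parse_pair(text):
    """Parse 'name sep name', deferring the decision on each run of spaces."""
    names = []           # completed names
    current = []         # characters already committed to the current name
    pending = []         # spaces seen, ownership not yet known
    seen_colon = False

    for ch in text:
        if is_alnum(ch):
            if current:
                # The held-back spaces turned out to be inside the name.
                current.extend(pending)
            elif pending and not seen_colon:
                raise ValueError("spaces before the first name")
            # Otherwise the spaces trailed the colon and belong to sep.
            pending.clear()
            current.append(ch)
        elif ch in SPACES:
            pending.append(ch)          # cannot decide yet
        elif ch == ":" and not seen_colon:
            # The held-back spaces belonged to the separator: drop them.
            pending.clear()
            names.append("".join(current))
            current = []
            seen_colon = True
        else:
            raise ValueError(f"unexpected character {ch!r}")

    if not seen_colon:
        raise ValueError("missing ':' separator")
    if pending and current:
        raise ValueError("trailing spaces after the second name")
    names.append("".join(current))
    return tuple(names)
```

The `pending` buffer is exactly the extra memory that a plain
"symbol in => action out" machine does not have: the end-of-name action can
only be fired correctly once the character after the spaces is known.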

To solve your problem you need something more powerful: a scanner (see the
last chapter of the Ragel documentation), or a full-blown parser: kelbt -
http://www.complang.org/kelbt/ (ragel itself uses it), bison -
http://www.gnu.org/software/bison/ (the most commonly used parser), etc.
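
As a loose analogy for the scanner approach, the sketch below tokenizes the
input with Python's `re` module rather than with Ragel's own scanner
construct. The regex engine's backtracking plays the role of the lookahead:
the identifier pattern must end on an alnum, so trailing spaces are given
back and become a separate token. The names `TOKEN` and `tokenize` are
hypothetical, and the character classes are assumed to be ASCII.

```python
import re

WS = r"[ \t\n\v\f\r]"     # assumed equivalent of the `space` class
ALNUM = r"[0-9A-Za-z]"    # assumed equivalent of the `alnum` class

# The three token kinds start with different characters, so at most one
# alternative can match at any position.
TOKEN = re.compile(
    rf"(?P<identifier>{ALNUM}(?:{WS}*{ALNUM})*)"
    rf"|(?P<colon>:)"
    rf"|(?P<space>{WS}+)"
)


def tokenize(text):
    """Return a list of (kind, lexeme) pairs covering the whole input."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(
                f"unexpected character {text[pos]!r} at offset {pos}"
            )
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

With the identifiers delivered as whole tokens, checking the
`name sep name` shape becomes a simple pass over the token list, and the
start/end actions can run once per identifier token instead of once per
character.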