[ragel-users] longest matching kleene star & parse error

Matthieu Tourne matthieu.tourne at gmail.com
Fri Feb 11 01:58:38 UTC 2011


Hi,

I'm trying to write a simple HTML lexer with Ragel.
I have something like this, which matches the 'src=' attribute (it can
appear in several kinds of tags) and runs actions on its value:

    tag_content = (
        ('src='i  ((('\'' string_sgl_exp) | '"' string_dbl_exp)
                      >src_attr_start
                      @src_attr_end))

        | any
    )** <>lerr{  };

    tag_exp = tag_content :>> '>';

My problem shows up with a tag such as <img srt="...">: srt is not a real
attribute and should simply be skipped, but it generates a parse error.
What I'd like to do is <>lerr{ fhold; fgoto tag_content; }, which
would work if tag_content were an entry point.
But I use tag_exp in several places where an entry point wouldn't work, for
instance:

    img_tag := tag_exp [...] @end_img_action;
    script_tag := tag_exp [...] @end_script_tag_action;

I've considered using a Ragel scanner, but I don't really care for
backtracking; I'd just like to be able to hide the error. Doing an
fgoto tag_content would work exactly the way I want: it would basically
restart the parse at t='...' and have it fall under the "any"
category.
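
For illustration only, the behaviour described above can be sketched outside
Ragel. This is a minimal Python sketch, not a Ragel solution: the function
name scan_tag_content is hypothetical, and it assumes that string_sgl_exp and
string_dbl_exp (not shown in this message) simply match up to the closing
quote. It also assumes the scan stops at the first '>' outside a recognized
src value. A failed attempt at 'src=' just falls through to the "any" branch
instead of raising an error.

```python
def scan_tag_content(text):
    """Collect src values from tag content, stopping at the closing '>'.

    Returns (values, end), where end is the index just past '>' or
    len(text) if no '>' was found.
    """
    values = []
    i = 0
    n = len(text)
    while i < n:
        if text[i] == '>':
            return values, i + 1
        # Try the src= branch: case-insensitive 'src=' followed by a quote.
        starts_src = text[i:i + 4].lower() == 'src='
        if starts_src and i + 4 < n and text[i + 4] in ('"', "'"):
            quote = text[i + 4]
            close = text.find(quote, i + 5)
            if close != -1:
                # Assumed equivalent of src_attr_start / src_attr_end.
                values.append(text[i + 5:close])
                i = close + 1
                continue
        # Fallback: the "any" branch consumes a single character,
        # so 'srt=' is skipped silently rather than reported as an error.
        i += 1
    return values, n
```

With the input img srt="x.png" src="a.png"> the misspelled attribute is
consumed character by character and only the value of the real src attribute
is collected.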

Is there an elegant way to do this, or to just hide the error?

Thank you,

Matthieu.