[ragel-users] Maintaining char & line counts in a scanner

Joe Wildish joe at elusive.cx
Wed Apr 21 22:44:49 UTC 2010


Hi All,

I'm using ragel as a scanner to tokenise input for parsing of a database query language. I'd like to maintain a line number and character offset in the struct that represents a matched token but I'm having a little difficulty.

My question is this: is it possible to define a match for an expression (e.g. a newline) that does NOT consume the input, but instead passes it on to other expressions? The original expression would still execute a user action when it matches.

My idea is to have two expressions: one that matches a newline and one that matches any other character. Each would have an associated action that maintains the line and character counts. Currently I have various expressions, some of which can match multiple newlines (think multi-line comments), and some of which consume dead input (whitespace). I have tried keeping a tally of the counts on each successful match of a token (outside of the machine exec), but because in some cases I discard input completely within the state machine without creating a token, it becomes difficult to track. Ideally, I'd like to keep it all within the machine, but I can't see the best way to proceed.
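
To make the bookkeeping concrete, here is a rough sketch of the tally in plain C (no Ragel involved). The names pos_t and pos_advance are made up for illustration, and I am assuming a 1-based line number and a 0-based column. It simply walks a span of already-matched bytes and updates the counts:

```c
#include <stddef.h>

/* Hypothetical position record: 1-based line, 0-based column. */
typedef struct {
    size_t line;
    size_t col;
} pos_t;

/* Advance *pos over the bytes in [start, end).
 * A newline bumps the line count and resets the column;
 * any other byte moves the column forward by one. */
static void pos_advance(pos_t *pos, const char *start, const char *end)
{
    for (const char *p = start; p < end; ++p) {
        if (*p == '\n') {
            pos->line++;
            pos->col = 0;
        } else {
            pos->col++;
        }
    }
}
```

The difficulty is that something like this would have to be called for every match, including the ones that discard input (presumably with the scanner's ts and te pointers as the span), and it rescans text the machine has already seen. That is why I would prefer the machine itself to do the counting.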

Any help or pointers would be much appreciated.

Cheers,
-Joe
_______________________________________________
ragel-users mailing list
ragel-users at complang.org
http://www.complang.org/mailman/listinfo/ragel-users