[ragel-users] Detect keywords with a ragel scanner

Alec Tica alexandru.tica at gmail.com
Thu Jul 14 21:20:42 UTC 2011


Hi,

I'm new to Ragel and I'm trying to figure out how to solve what looks
like a very simple problem. Let's say I have the following text:

"select 1 from dual;select 2 from dual;/*comment*/select 3 from dual;select"

I want to detect all "select" keywords using a scanner, taking word
boundaries into account. "select" is a keyword only if it:

1. starts at the very beginning of the text, or is preceded by
whitespace, a comment, or a statement separator (;)
2. ends at the very end of the text, or is followed by whitespace, a
comment, or a statement separator (;)
3. is not within quotes
4. is not part of a comment
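
To make the expected result concrete, here is a rough plain-Ruby sketch
of the four rules, written without Ragel and meant only as a reference
to compare a scanner against. The helper name find_select_keywords is
made up for illustration, and the sketch assumes ASCII input, a
case-sensitive "select", '' as the escape inside single-quoted strings,
and that a closing quote does not count as a word boundary.

```ruby
require "strscan"

# Hypothetical helper (not part of Ragel): returns [start, end) offsets of
# every "select" that satisfies rules 1-4 above.
def find_select_keywords(text)
  ss = StringScanner.new(text)
  hits = []
  # True at the start of the text and right after whitespace, ';' or a comment.
  boundary = true
  until ss.eos?
    if ss.scan(/'(?:[^']|'')*'/m) || ss.scan(/"[^"]*"/m)
      boundary = false   # quoted string: skipped, never a keyword (rule 3)
    elsif ss.scan(%r{/\*.*?\*/}m) || ss.scan(/--[^\n]*/)
      boundary = true    # comment: skipped (rule 4), counts as a boundary
    elsif ss.scan(/[\s;]/)
      boundary = true    # whitespace or statement separator
    elsif boundary && ss.scan(%r{select(?=\s|;|/\*|--|\z)})
      hits << [ss.pos - 6, ss.pos]   # rules 1 and 2 both hold
      boundary = false
    else
      ss.getch           # any other character breaks the boundary
      boundary = false
    end
  end
  hits
end
```

For the sample text above this should return
[[0, 6], [19, 25], [49, 55], [68, 74]], and nothing for "unselect 1;".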

So far I have:

<code>
%%{
  machine example;

  action is_eof {
    true if p == eof - 1
  }

  # eof
  EOF = zlen when is_eof;

  # strings
  squoted_string = ['] ( (any - [''])** ) ['];
  dquoted_string = '"' ( any )* :>> '"';

  # comments
  ml_comment = '/*' ( any )* :>> '*/';
  sl_comment = '--' ( any )* :>> ('\n' | EOF);
  comment = ml_comment | sl_comment;

  tail = space | comment | ';' | EOF;

  # keyword
  select = 'select' . tail;

  main := |*
    squoted_string;
    dquoted_string;
    comment;
    select => { puts "found at #{ts}-#{te}" };
    any;
  *|;

}%%

%% write data;

data = 'unselect 1 from dual;select 2 from dual;/*comment*/select 3 from dual;select'
# convert the provided string in a stream of chars
stream_data = data.unpack("c*") if(data.is_a?(String))
eof = stream_data.length

%% write init;
%% write exec;
</code>

Of course, the scanner above incorrectly matches the "select" inside
the word "unselect" in the data. In any case, I feel that I'm not on
the right track, so I'd like to ask for your advice.
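
The false match can be reproduced outside Ragel with a rough plain-Ruby
model. It assumes the scanner picks the longest match at each position
and prefers the earlier pattern on ties. The function scan_selects and
the \w+ "word" pattern are hypothetical and not part of the grammar
above; strings and comments are left out to keep the model small.

```ruby
require "strscan"

# Hypothetical model of a longest-match scanner, not generated by Ragel.
# Returns the start offsets at which the "select" pattern fires.
def scan_selects(text, with_word_rule)
  rules = [[:select, /select(?=\s|;|\z)/]]
  rules << [:word, /\w+/] if with_word_rule  # assumed extra pattern
  rules << [:any, /./m]                      # one-character fallback
  ss = StringScanner.new(text)
  found = []
  until ss.eos?
    best_name = nil
    best_len = 0
    rules.each do |name, re|
      len = ss.match?(re) || 0   # match length at the current position
      # Strict '>' keeps the earlier rule when two lengths are equal.
      best_name, best_len = name, len if len > best_len
    end
    found << ss.pos if best_name == :select
    ss.pos += best_len
  end
  found
end
```

With only the fallback, "un" is consumed one character at a time and
"select" then fires at offset 2 of "unselect 1 from dual;select". With
the word pattern enabled, "unselect" is consumed as one token and only
the trailing "select" at offset 21 is reported.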

Many thanks in advance!

-- 
talek



