Transition actions on EOF handling

Alexander Strange astra... at gmail.com
Sat Jun 30 01:07:22 UTC 2007


I wrote this program:
#include <stdio.h>
#include <string.h>
%% machine romaji;
%% write data;

int main(void)
{
	char buf[1024];
	int ilen;

	/* Bound the read to the buffer size and stop if no input arrives. */
	if (scanf("%1023s", buf) != 1)
		return 1;
	ilen = strlen(buf);
	{
		char *p = buf, *pe = &buf[ilen], *m="qqq";
		int cs;
		%%{
			alphtype unsigned char;

			action mora_out {printf("%s",m);}

			mora = ("a" % {m="あ";} |
					"e" % {m="え";} |
					"i" % {m="い";} |
					"o" % {m="お";} |
					"u" % {m="う";} |
					"ga" % {m="が";} |
					"ge" % {m="げ";} |
					"gi" % {m="ぎ";} |
					"go" % {m="ご";} |
					"gu" % {m="ぐ";}) % mora_out;

			char = mora;

			main := char+;
		}%%

		%% write init;
		%% write exec;
		%% write eof;
	}
	printf("\n");
	return 0;
}

and ran it:

> ./r2h
aei
あえ

I don't blame you if you can't read the output, but there should be
three characters instead of two. The action for the last character
isn't firing, apparently because the input ends there and nothing
follows it. I have the same problem (worked around) in Perian 1.0,
which uses Ragel for subtitle parsing. Is there a general way to
handle this?
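
To make the symptom concrete, here is a minimal hand-written sketch in
plain C. It is not Ragel output and says nothing about how Ragel itself
should be configured; it only mimics an action that runs when a token is
left. The names vowel_kana, append and convert_vowels are hypothetical,
and only the five single vowels are handled, as a simplification of the
grammar above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: kana for one romaji vowel, or NULL if unknown. */
static const char *vowel_kana(char c)
{
	switch (c) {
	case 'a': return "あ";
	case 'e': return "え";
	case 'i': return "い";
	case 'o': return "お";
	case 'u': return "う";
	default:  return NULL;
	}
}

/* Hypothetical helper: append s to out, keeping it NUL-terminated. */
static int append(char *out, size_t cap, size_t *used, const char *s)
{
	size_t n = strlen(s);

	if (*used + n + 1 > cap)
		return -1;
	memcpy(out + *used, s, n + 1);	/* copies the terminating NUL too */
	*used += n;
	return 0;
}

/*
 * Convert a string of vowels to kana. A token's output is written only
 * when the token is left, that is, when the next character arrives.
 * The last token therefore stays pending unless flush_at_eof is set.
 * Returns 0 on success, -1 on an unknown character or a full buffer.
 */
static int convert_vowels(const char *in, char *out, size_t cap,
                          int flush_at_eof)
{
	const char *pending = NULL;
	size_t used = 0;

	if (cap == 0)
		return -1;
	out[0] = '\0';

	for (; *in != '\0'; in++) {
		const char *kana = vowel_kana(*in);

		if (kana == NULL)
			return -1;
		/* Leaving the previous token: emit what it stored. */
		if (pending != NULL && append(out, cap, &used, pending) != 0)
			return -1;
		pending = kana;
	}

	/* End of input: nothing "leaves" the last token unless we flush. */
	if (flush_at_eof && pending != NULL)
		return append(out, cap, &used, pending);
	return 0;
}
```

With flush_at_eof set to 0 the input "aei" produces only the first two
kana, matching the run above; with it set to 1 all three appear. The
explicit flush is the part that an end-of-input hook would have to
supply.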

Also, how fast is Ragel for general text searches? At the expense of
more tables and FSM purity, it could perhaps implement a skip-ahead
search that is really fast, like Boyer-Moore.
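
For reference, the kind of skip-ahead search meant here can be sketched
in plain C. This is the Horspool simplification of Boyer-Moore (bad
character shifts only), not the full algorithm and not anything Ragel
generates; the name horspool_search is hypothetical.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the first occurrence of needle in hay, or NULL.
 * Lengths are passed explicitly so the buffers need not be NUL-terminated.
 */
static const char *horspool_search(const char *hay, size_t hlen,
                                   const char *needle, size_t nlen)
{
	size_t shift[UCHAR_MAX + 1];
	size_t i, pos;

	if (nlen == 0)
		return hay;
	if (nlen > hlen)
		return NULL;

	/* Bytes absent from the needle allow a jump of the full length. */
	for (i = 0; i <= UCHAR_MAX; i++)
		shift[i] = nlen;
	/* Bytes in the needle (except its last) allow a shorter jump. */
	for (i = 0; i + 1 < nlen; i++)
		shift[(unsigned char)needle[i]] = nlen - 1 - i;

	pos = 0;
	while (pos <= hlen - nlen) {
		if (memcmp(hay + pos, needle, nlen) == 0)
			return hay + pos;
		/* Shift by the entry for the byte under the needle's last slot. */
		pos += shift[(unsigned char)hay[pos + nlen - 1]];
	}
	return NULL;
}
```

The shift table is what lets the search skip input bytes without
examining them, which a state machine that consumes every character in
order does not do.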


