[ragel-users] 'string' ranges
thurs... at cs.queensu.ca
Fri Apr 6 14:38:27 UTC 2007
In theory, enumerating all the possibilities with a script and then
leaving it up to the minimization routine would work, though it might
end up taking forever to compile.
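
A minimal sketch of the enumeration approach, in Python. The function
name enumerate_range and the machine name lao_digits are made up for
illustration; the script only prints Ragel source text, one alternative
per code point, and leaves all the merging to Ragel's minimization.

```python
def enumerate_range(name, lo, hi):
    """Return a Ragel definition listing every code point in lo..hi
    as its own UTF-8 byte string (hypothetical helper)."""
    alternatives = []
    for cp in range(lo, hi + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates have no UTF-8 encoding
        encoded = chr(cp).encode("utf-8")
        # Juxtaposed bytes are concatenated, as in the hand-written example.
        alternatives.append("( " + " ".join("0x%02X" % b for b in encoded) + " )")
    return "%s =\n    %s;" % (name, " |\n    ".join(alternatives))


# The range from the original question, U+0ED0 .. U+0ED9.
print(enumerate_range("lao_digits", 0x0ED0, 0x0ED9))
```

For a ten-character range this is harmless. For ranges covering tens of
thousands of code points the generated file gets large, which is where
the compile time concern comes from.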
Semantic conditions could be made to work, but I would advise trying to
express the ranges directly. If that doesn't work well (or maybe if it
generates too many states) you could go the semantic condition route.
To express them directly, assemble the ranges byte by byte and then
section by section. I've never done this in a real program, so if you try
it out (or one of the other techniques), would you mind sending a message
to the list to say how it went?

    alphtype unsigned char;

    # 0x0DD0 .. 0x0DD9, one byte at a time
    r1 = 0x0D ( 0xD0 .. 0xD9 );

    # 0x0A07 .. 0x0D40, split into three sections
    r2 =
        0x0A ( 0x07 .. 0xFF ) |
        ( 0x0B | 0x0C ) any |
        0x0D ( 0x00 .. 0x40 );

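
The example above splits plain two-byte values. The same splitting can
be done mechanically for real UTF-8 encodings. The following Python is a
sketch under assumptions, not something taken from Ragel itself: the
names split_range and to_ragel are hypothetical, and the input is
assumed to lie in 0 .. 0x10FFFF. It cuts a code-point range at the
encoded-length boundaries and around the surrogates, then keeps cutting
until each byte position can vary independently of the others.

```python
# Largest code points that encode to 1, 2, 3 and 4 UTF-8 bytes.
LENGTH_LIMITS = (0x7F, 0x7FF, 0xFFFF, 0x10FFFF)


def utf8(cp):
    return chr(cp).encode("utf-8")


def split_range(lo, hi):
    """Return a list of sequences of (low_byte, high_byte) pairs that
    together match exactly the UTF-8 encodings of lo..hi."""
    if lo > hi:
        return []
    # Surrogates (U+D800 .. U+DFFF) have no UTF-8 encoding: step around them.
    if lo <= 0xDFFF and hi >= 0xD800:
        return split_range(lo, 0xD7FF) + split_range(max(lo, 0xE000), hi)
    # Keep each piece within a single encoded length.
    for limit in LENGTH_LIMITS:
        if lo <= limit < hi:
            return split_range(lo, limit) + split_range(limit + 1, hi)
    # If the bits above the last i continuation bytes differ, those
    # continuation bytes must span their full range; otherwise cut.
    for i in range(1, len(utf8(lo))):
        mask = (1 << (6 * i)) - 1
        if (lo & ~mask) != (hi & ~mask):
            if (lo & mask) != 0:
                return split_range(lo, lo | mask) + split_range((lo | mask) + 1, hi)
            if (hi & mask) != mask:
                return split_range(lo, (hi & ~mask) - 1) + split_range(hi & ~mask, hi)
    return [list(zip(utf8(lo), utf8(hi)))]


def to_ragel(name, lo, hi):
    """Format the pieces as one Ragel definition, section by section."""
    def fmt(a, b):
        return "0x%02X" % a if a == b else "( 0x%02X .. 0x%02X )" % (a, b)
    sections = [" ".join(fmt(a, b) for a, b in seq) for seq in split_range(lo, hi)]
    return "%s =\n    %s;" % (name, " |\n    ".join(sections))


print(to_ragel("lao_digits", 0x0ED0, 0x0ED9))
```

For U+0ED0 .. U+0ED9 this yields a single section, since only the last
byte varies (0xE0 0xBB followed by 0x90 .. 0x99). Wider ranges produce
several sections joined with |, in the same shape as r2 above.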
You're probably aware of this, but I'll mention it just to put it out
there: for a really simple solution you can always process in two
passes. First expand the input to fixed-width characters, then change
the alphabet type to short or int and process with Ragel.
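
A rough illustration of the first pass, again in Python. Both function
names are hypothetical, and the packed buffer layout (native-endian
32-bit integers) is an assumption about what a machine built with
"alphtype int" would be handed; adjust it to whatever the host program
actually uses.

```python
import struct


def expand_to_fixed_width(utf8_bytes):
    """First pass: decode UTF-8 and return one integer per character."""
    return [ord(ch) for ch in utf8_bytes.decode("utf-8")]


def pack_as_int32(code_points):
    """Pack the code points as 4-byte signed integers in native byte order
    (assumed layout for a scanner using a 32-bit int alphabet)."""
    return struct.pack("=%di" % len(code_points), *code_points)


code_points = expand_to_fixed_width(b"\xe0\xbb\x90a")  # U+0ED0 followed by 'a'
buffer = pack_as_int32(code_points)
```

With the input widened like this, the second pass can test whole
characters, so a range such as 0x0ED0 .. 0x0ED9 no longer needs to be
split into bytes at all.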
> Hello all,
> I'm wanting my ragel state machine to process unicode text encoded as utf-8.
> There are some unicode ranges that I want to transition on e.g.
> range = [0x0ED0-0x0ED9];
> but I don't know how to express this in a minimal way with an unsigned char
> alphabet (i.e. I don't think it can be done directly in ragel's expression
> language).
> My brain isn't in the best condition, but the two approaches I have thought
> of are:
> 1.) use a script to write out the set of strings in the range and leave it to
> ragel to minimise the states (or something like this)
> 2.) use ragel's semantic conditions somehow.. (assemble utf-32 version and use
> integer comparison)
> But before I attempt either, has anyone had to do anything similar? Or are
> there any suggestions I could use?
> Thanks, and have a good Easter
> - Paul