Minimisation question

Colin Fleming colin.flem... at coreproc.com
Wed Sep 13 23:48:49 UTC 2006


Hi Adrian,

OK, great that there's no downside to it. Here are some stats (I've
attached the grammar in case you're interested; it's not finished, but
it's close):

The part that causes the problem is the doctype. The smallest part I
could get to compile was doctypedecl:

No minimisation:

time ragel asciixml.rl -s -n -M doctypedecl | rlcodegen -V > test.dot
fsm name  : AsciiXml
num states: 1361

real    1m58.401s
user    1m56.850s
sys     0m1.481s

Memory use peaks at about 765MB. If I try to use the next-largest
production, it allocates up to 1.8GB and then malloc fails.

With minimisation:

time ragel asciixml.rl -s -M doctypedecl | rlcodegen -V > test.dot
fsm name  : AsciiXml
num states: 269

real    1m58.358s
user    1m56.792s
sys     0m1.453s

More or less the same time and memory usage.

However, with incremental minimisation:

time ragel asciixml.rl -s -e -M doctypedecl | rlcodegen -V > test.dot
fsm name  : AsciiXml
num states: 269

real    0m0.076s
user    0m0.069s
sys     0m0.010s

It's practically instantaneous and works a charm. It also easily
allows me to compile the whole grammar, which is significantly more
complex:

time ragel asciixml.rl -s -e | rlcodegen -V > test.dot
fsm name  : AsciiXml
num states: 445

real    0m0.124s
user    0m0.119s
sys     0m0.010s
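
To put the doctypedecl timings in perspective, here is a small Python
sketch of the arithmetic. The figures are copied from the runs above; the
helper names (parse_real, speedup, state_reduction) are made up for
illustration and are not part of Ragel.

```python
import re


def parse_real(text):
    """Convert a `time` value such as '1m58.358s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def speedup(slow, fast):
    """How many times faster the `fast` run is, given two 'real' times."""
    return parse_real(slow) / parse_real(fast)


def state_reduction(before, after):
    """Fraction of states removed by minimisation."""
    return (before - after) / before


# 'real' times and state counts copied from the doctypedecl runs above.
runs = {
    "no minimisation (-n)": {"real": "1m58.401s", "states": 1361},
    "default minimisation": {"real": "1m58.358s", "states": 269},
    "incremental (-e)": {"real": "0m0.076s", "states": 269},
}

default = runs["default minimisation"]
incremental = runs["incremental (-e)"]
unminimised = runs["no minimisation (-n)"]

print(f"speedup of -e over default: "
      f"{speedup(default['real'], incremental['real']):.0f}x")
print(f"states removed by minimisation: "
      f"{state_reduction(unminimised['states'], default['states']):.1%}")
```

On these figures the -e run comes out roughly 1557 times faster than the
default, and minimisation removes about 80% of the states.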

This is a cut-down grammar that only uses ASCII characters. The full
XML spec requires Unicode, which makes the machine much more complex
because all the character ranges are treated properly (i.e. the same
number of states but a lot more transitions). Using incremental
minimisation allows that machine to be generated in 2.199s.

Assuming it's reliable, I can't see a reason not to use it, or not to
have it turned on by default.

Cheers,
Colin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: asciixml.rl
Type: application/octet-stream
Size: 6017 bytes
Desc: not available
URL: <http://www.colm.net/pipermail/ragel-users/attachments/20060913/e9e98fee/attachment-0001.obj>

