Minimisation question

Adrian Thurston thurs... at
Thu Sep 14 21:45:33 UTC 2006

I've been using gdb and hitting ctrl-c when it explodes. From there I 
explore the data to see what's going on. But if it terminates I would use 
gprof to get a sense of what's happening. Once I have an idea of where the 
problem might be I make the smallest ragel spec I can which reproduces the 
problem and explore using gdb or sometimes print statements.

If you want to send me the full spec I'd love to look into why it's taking 
so long. Ragel should be able to process it faster.


Colin Fleming wrote:
> Great, thanks! The patch works well for the grammar I sent:
> time ragel asciixml.rl -s -M doctypedecl | rlcodegen -V >
> fsm name  : AsciiXml
> num states: 269
> real    0m0.087s
> user    0m0.079s
> sys     0m0.011s
> But the full Unicode grammar still takes a while:
> time ragel xml.rl -s | rlcodegen -V >
> fsm name  : Xml
> num states: 445
> real    0m43.829s
> user    0m43.293s
> sys     0m0.565s
> I guess there's another production creating unreachable states. It's a
> vast improvement though, it never even got close to compiling
> beforehand. Is there some way I can diagnose what's happening myself?
> On 9/14/06, Adrian Thurston <thurs... at> wrote:
>> Pretty much every operation can cause unreachable states. For example,
>> during union a new start state is created with epsilon transitions drawn to
>> the old start states. The epsilon operation effectively copies transition
>> lists from the destination state to the source state. If the old start
>> states don't have any transitions in from elsewhere in the machine then they
>> become unreachable. The on-the-fly cleanup will reap states with no external
>> entry points. This gets most unreachables, but just like in garbage
>> collection, cycles cause problems.
>> In your ragel file the subtraction operation is leaving behind unreachables.
>> Subtraction may in fact need a full mark and sweep (I had thought the
>> on-the-fly was sufficient). The attached patch will allow you to compile
>> without -e.
>> Cheers,
>>   Adrian
>> Colin Fleming wrote:
>>> Ok, thanks for the help! What causes unreachable states? Can I change
>>> something about the grammar to avoid them (apart from -e, of course)?
>>> Cheers,
>>> Colin
>> Index: ragel/fsmgraph.cpp
>> ===================================================================
>> --- ragel/fsmgraph.cpp  (revision 3586)
>> +++ ragel/fsmgraph.cpp  (working copy)
>> @@ -496,6 +496,7 @@
>>         /* Remove states that have no path to a final state. */
>>         removeDeadEndStates();
>> +       removeUnreachableStates();
>>  }
>>  bool FsmAp::inEptVect( EptVect *eptVect, StateAp *state )

More information about the ragel-users mailing list