[ragel-users] RFC-2822 recognizer: best way to test it?

Adrian Thurston thurs... at cs.queensu.ca
Thu Jun 7 20:08:16 UTC 2007


Hi Wincent,

What I normally do is embed actions which collect the text matching a
pattern and print it out. If you have this machine definition:

atext = [a-z]+;

You can do this:

# Assumes a C++ host program with a std::string named buf.
action clear { buf.clear(); }
action append { buf.push_back(*p); }
action log_atext { std::cout << "atext: " << buf << "\n"; }

atext = [a-z]+ >clear $append %log_atext;
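
If it is unclear when each of the three actions fires, the following is a rough hand-written C equivalent of that one rule. It is only a sketch and not what Ragel generates. The names token_buf, buf_clear, buf_append and collect_atext, the 64-byte limit, and the sample input are all made up for illustration. Unlike a real recognizer it simply scans for every [a-z]+ run in the input, which is enough to show the entering (>), all-transition ($) and leaving (%) hooks.

```c
#include <stdio.h>
#include <string.h>

#define MAX_TOKEN 64

/* Hypothetical fixed-size buffer standing in for `buf`. */
struct token_buf {
    char data[MAX_TOKEN];
    size_t len;
};

static void buf_clear(struct token_buf *b)
{
    b->len = 0;
    b->data[0] = '\0';
}

static void buf_append(struct token_buf *b, char c)
{
    /* Overlong tokens are silently truncated in this sketch. */
    if (b->len + 1 < MAX_TOKEN) {
        b->data[b->len++] = c;
        b->data[b->len] = '\0';
    }
}

static int is_atext(char c)
{
    return c >= 'a' && c <= 'z';
}

/*
 * Scan `input`, print each [a-z]+ run and copy it into out[].
 * Returns the number of runs stored (at most `max`).
 */
size_t collect_atext(const char *input, char out[][MAX_TOKEN], size_t max)
{
    struct token_buf buf;
    size_t count = 0;
    int inside = 0;
    const char *p;

    buf_clear(&buf);
    for (p = input; ; p++) {
        if (*p != '\0' && is_atext(*p)) {
            if (!inside) {              /* like >clear: first char of a match */
                buf_clear(&buf);
                inside = 1;
            }
            buf_append(&buf, *p);       /* like $append: every char of the match */
        } else {
            if (inside) {               /* like %log_atext: the match just ended */
                printf("atext: %s\n", buf.data);
                if (count < max)
                    strcpy(out[count++], buf.data);
                inside = 0;
            }
            if (*p == '\0')
                break;
        }
    }
    return count;
}
```

Calling collect_atext("foo@example.com", out, 4) with a char out[4][MAX_TOKEN] array prints one "atext: ..." line each for foo, example and com. Logging every sub-pattern this way lets you check the pieces of the grammar one at a time.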

-Adrian

Wincent Colaiuta wrote:
> Hi!
> 
> As my first Ragel project I'm writing a recognizer for RFC-2822 email
> addresses. All the recognizer has to do is scan an input string and
> decide whether or not it conforms to RFC-2822. I'll write a little bit
> of background first; but in the end my question is, what's the best
> way to test this?
> 
> I basically started by taking RFC-2822
> (<http://www.ietf.org/rfc/rfc2822.txt>) and taking the rules -- written
> in the RFC using Augmented Backus-Naur Form (ABNF) notation
> (<http://www.ietf.org/rfc/rfc2234.txt>) -- and rewriting them using
> Ragel syntax.
> 
> There is one circular dependency in those rules ("comment" needs
> "ccontent", but "ccontent" needs "comment") and so for the time being
> I've commented out that dependency (in other words, nesting of
> comments inside comments isn't yet implemented). If everything works
> out ok I will as a last step use the trick described here
> <http://groups.google.com/group/ragel-users/browse_thread/thread/f3fdde1d51c86aaf/e4f2b110236b8660>
> to manually handle the recursion.
> 
> Running ragel on the input causes it to spin forever, so I've
> simplified some of the rules (mostly by commenting out the optional
> whitespace) and now it compiles (using C as the output language).
> Before I begin tweaking the rules back into conformance with the RFC I
> wanted to ask about testing techniques.
> 
> What I have is effectively a black box where I stick input in and get
> a success or failure message back at the end. Is there any way to break
> this down into smaller parts for testing purposes? In other words,
> instead of testing that "f... at example.com" passes (it does), can I test
> that "example.com" matches a "domain", or even lower, that "foo" is
> valid "atext"? Basically, I can test that the whole works, but I'd be
> much more confident if I could individually test the parts as well.
> 
> What's the best methodology here?
> 
> Cheers,
> Wincent