[ragel-users] How to distinguish between two machines from within the same action?

François Beausoleil francois at teksol.info
Thu Sep 8 21:06:32 UTC 2011

 Hey folks!  

I'm writing the "definitive" URL parser class. Lofty goal, perhaps, but also a learning exercise. I have an issue with entering and leaving actions.

My code's on GitHub: https://github.com/francois/urlparser/blob/master/url.rl#L34

Given the following two URLs:


For both URLs, I correctly recognize the scheme. For both URLs, either the user or the hostname comes out wrong, and in both cases the port isn't recognized.

My Ruby implementation is at https://github.com/francois/urlparser/blob/master/ruby/lib/urlparser/parser.rl#L14

My question boils down to: how do I definitively know whether what I'm looking at is a user or a hostname, since both allow nearly the same set of characters? Should I be using "State Action Embedding Operators"? Actually, scratch that: it seems that's what I should be doing, because I managed to recognize the host in some cases. For the first URL above, I can recognize most of the port: I end up with 123, not 1234, thus losing the last character.
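To make the ambiguity concrete, here is a minimal hand-written Ruby sketch. It is not Ragel output and not the code in the repository. It scans only the authority part of a URL and defers the user-vs-hostname decision until a delimiter settles it: text before a ':' stays "pending" until either an '@' shows up (then it was the user) or the input ends (then it was the host). The method name split_authority and the mark/p variable names are assumptions chosen to mirror the usual Ragel idiom. IPv6 literals and passwords containing ':' are deliberately not handled.

```ruby
# Hypothetical helper for illustration only; not part of urlparser.
# Splits an authority such as "user:pass@host:port" into its parts.
def split_authority(authority)
  mark = 0        # index where the token currently being scanned starts
  pending = nil   # text seen before a ':' whose role is not yet known
  user = password = host = port = nil

  authority.each_char.with_index do |ch, p|
    case ch
    when ":"
      # Could be user:password or host:port; remember the text and move on.
      pending = authority[mark...p]   # exclusive range: stops before ':'
      mark = p + 1
    when "@"
      # An '@' proves that everything so far was user information.
      if pending
        user = pending
        password = authority[mark...p]
      else
        user = authority[mark...p]
      end
      pending = nil
      mark = p + 1
    end
  end

  # At end of input the slice has to run to authority.length. Stopping at the
  # index of the last character instead would drop that character, which is
  # one way to end up with "123" rather than "1234".
  tail = authority[mark...authority.length]
  if pending
    host = pending
    port = tail
  else
    host = tail
  end

  { user: user, password: password, host: host, port: port }
end

p split_authority("bob:secret@example.com:1234")
p split_authority("example.com:1234")
```

The same idea carries over to a Ragel machine in principle: record a mark when a token starts, and only assign the captured text to user or host once the character that disambiguates it has been seen.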

A pointer to an existing parser with similar behavior would be appreciated.


