[ragel-users] parsing a netstring

Adrian Thurston thurs... at cs.queensu.ca
Tue Oct 9 21:10:52 UTC 2007


Hi Chuck,

Yes, using fnext to call out of the string consuming machine is one way
to do it. The code looks good to me.

As you said, you can use conditions as well. I think one of the examples
in the manual deals with variable-length fields, so that route is open too.

And yes, you can also modify p to jump past the string data. Just be
mindful of jumping past pe. If you have all the data at once this isn't
a problem, but if you receive your data in blocks the string may extend
beyond the current block, so you have to remember how many bytes are
still left to skip when the next block arrives.
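
To illustrate the point about pe, here is a minimal plain-Ruby sketch of the
bookkeeping (it is not Ragel-generated code). The helper name skip_payload and
the remaining counter are hypothetical; p and pe stand in for Ragel's variables
of the same name, and how the result is written back to p inside an action is
left out.

```ruby
# Hypothetical helper: advance p past up to `remaining` payload bytes
# without ever moving beyond pe (the end of the current block).
# Returns the new p and the number of payload bytes still owed, which
# the caller has to carry over to the next block.
def skip_payload(p, pe, remaining)
  available = pe - p                  # bytes left in this block
  step = [remaining, available].min   # clamp so p never passes pe
  [p + step, remaining - step]
end

# A 10-byte payload ("hello wrld") split across two blocks.
block1 = "hello "   # first 6 payload bytes
block2 = "wrld,"    # last 4 payload bytes plus the ',' terminator

p1, left = skip_payload(0, block1.length, 10)    # => [6, 4]
p2, left = skip_payload(0, block2.length, left)  # => [4, 0]
terminator = block2[p2]                          # => ","
```

When all the data is available at once, the first call already returns a
remaining count of zero and the clamp never triggers.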

With error actions you have to keep in mind that the operators have
slightly different meanings because they select states as opposed to
transitions. The error action embedding operators let you handle the
case of 'no transition' in the states they select.
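
As a rough illustration of error actions belonging to states rather than
transitions, here is a toy table-driven machine in plain Ruby for the size
prefix (digits followed by ':'). It is only a sketch of the idea, not what
Ragel generates; the state names, tables, and messages are all made up for
the example, and the nonzero first digit simply mirrors validate_not_zero
in the machine quoted below.

```ruby
# Transition table: each state maps an input character to the next
# state, or to nil when the state has no transition on that character.
TRANSITIONS = {
  start:   ->(c) { c =~ /[1-9]/ ? :in_size : nil },
  in_size: ->(c) { c =~ /[0-9]/ ? :in_size : (c == ":" ? :done : nil) },
  done:    ->(_c) { nil }
}

# Error actions are keyed by state, not by transition. Each one runs
# when its state has no transition on the current character.
ERROR_ACTIONS = {
  start:   ->(c) { "size must start with 1-9, got #{c.inspect}" },
  in_size: ->(c) { "expected digit or ':', got #{c.inspect}" }
}

def run(input)
  state = :start
  input.each_char do |c|
    nxt = TRANSITIONS[state].call(c)
    if nxt.nil?
      handler = ERROR_ACTIONS[state]
      return [:error, handler ? handler.call(c) : "no transition"]
    end
    state = nxt
  end
  [state, nil]
end

ok  = run("12:")   # reaches :done with no error message
bad = run("1x")    # fails in :in_size, so that state's error action runs
```

Which states receive an error action is what the different embedding
operators control.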

Adrian

Chuck Remes wrote:
> I'm suddenly finding all sorts of uses for ragel!
> 
> I want to write a parser for netstrings. The definition of a  
> netstring is pretty simple. It comes in the following format:
> 
> <size_in_decimal> ':' <string, size_in_decimal bytes long> ','
> 
> I wrote a machine to parse through this and capture every byte, but  
> I'm unclear how to terminate my get_string machine. Right now I have  
> it call the action store_string as a finishing action for each byte  
> processed. The action stores the byte and increments a counter  
> variable. When the counter variable exceeds the number of bytes to be  
> processed, I want to advance out of that machine and move to the next  
> machine to confirm the byte array was terminated properly.
> 
> I'm not sure I'm doing this correctly. From the docs (section 6.5) it  
> appears using a 'semantic condition' would make sense here, but that  
> part of the documentation is unclear to me so I'm using this  
> alternate methodology. Am I on the right track? Also, is there a way  
> to skip 'N' bytes forward instead of copying them one by one into a  
> new array (super slow!)? I'm thinking I can directly modify the 'p'  
> variable but I'm not sure this is the right way.
> 
> Secondly, I'm not sure how to capture errors. I'm already using the  
> form '@action' to do some work in a machine. Can I specify an error  
> action using the same operator in the same machine? E.g., get_size =  
> ( digit @store_size @err(size_error) )+;
> 
> Thanks for any input. My sample machine is listed below.
> 
> %%{
> 	machine parse_netstring;
> 
> 	# snipped out some actions for the sake of brevity
> 
> 	action store_size {
> 		size = ( size * 10 ) + ( fc - 48 ); # accumulate string length; fc is a character code, 48 is '0'
> 	};
> 
> 	action alloc_buffer {
> 		buffer = Array.new(size);
> 		i = 0;
> 	};
> 
> 	action store_string {
> 		buffer[i] = fc;
> 		i = i + 1;
> 		fnext get_netstring_terminator if i >= size; # leave once all size bytes are stored
> 	};
> 
> 	get_size = ( digit >validate_not_zero ) . ( digit @store_size )*;
> 
> 	get_delimeter = ( ':' @alloc_buffer );
> 
> 	get_string = ( any @store_string )*;
> 
> 	get_netstring_terminator = ',' @finalize;
> 
> 	main := get_size . get_delimeter . get_string;
> }%%
> 
