[ragel-users] Re: Newbie question: Scanners?

Adrian Thurston thurs... at cs.queensu.ca
Mon May 14 22:18:17 UTC 2007


Hi Andrew,

You could use a scanner for this but it isn't necessary. You could
handle it with pure state machines if you like.

Make one machine definition for IP addresses and another for delays.
Both start with digit+ '.' digit+ and diverge at the next character: a
second dot for an IP address, a space (or the 'ms') for a delay. As long
as the actions that report IP/delay are embedded at or past that point,
you can union the two machines together.

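# @ embeds a finishing action: it runs on the transition into the final state (the s of ms).
delay = digit+ '.' digit+ ' '* 'ms' @{print 'delay';};
# % embeds a leaving action: it runs when the machine is left (first character after the IP).
IP = digit+ '.' digit+ '.' digit+ '.' digit+ %{print 'IP';};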

main := 'something' ( delay | IP ) 'something else';
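
To see why no backtracking is needed, here is a minimal Python sketch of
the deterministic machine that the union delay | IP roughly amounts to.
It is not Ragel and not Ragel output: the state names, the TRANSITIONS
table and the classify helper are all hypothetical, made up for
illustration. It reads each character once and only commits to "IP" or
"delay" after the shared digit+ '.' digit+ prefix.

```python
DIGITS = "0123456789"

# (state, input class) -> next state. A missing entry means "no match".
# All state names are hypothetical labels for this sketch.
TRANSITIONS = {
    ("start", "digit"): "num1",
    ("num1", "digit"): "num1",
    ("num1", "."): "dot1",
    ("dot1", "digit"): "num2",
    ("num2", "digit"): "num2",
    # The two patterns diverge here, after the shared digit+ '.' digit+ prefix.
    ("num2", "."): "dot2",      # only an IP can continue with a second dot
    ("num2", " "): "spaces",    # only a delay can continue with a space
    ("num2", "m"): "m",         # ... or directly with 'ms'
    # IP branch
    ("dot2", "digit"): "num3",
    ("num3", "digit"): "num3",
    ("num3", "."): "dot3",
    ("dot3", "digit"): "num4",
    ("num4", "digit"): "num4",
    # delay branch
    ("spaces", " "): "spaces",
    ("spaces", "m"): "m",
    ("m", "s"): "ms",
}

# Accepting states and the label each one reports.
ACCEPTING = {"num4": "IP", "ms": "delay"}


def classify(token):
    """Return 'IP', 'delay', or None, reading each character exactly once."""
    state = "start"
    for ch in token:
        key = "digit" if ch in DIGITS else ch
        state = TRANSITIONS.get((state, key))
        if state is None:
            return None
    return ACCEPTING.get(state)
```

With this sketch, classify("202.129.63.70") returns "IP" and
classify("2.751 ms") returns "delay"; neither call ever re-reads a
character.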

-Adrian

AndrewO wrote:
> Hi Adrian,
> 
> Thanks for the response. I think I understand the part about
> backtracking, but I guess I'm still wondering about the first part.
> It might help if I describe my situation a little more: I'm trying to
> write something to parse the output from the traceroute utility more
> quickly than a standard regex based solution written in Perl or Ruby.
> The thing that's tripped me up in the past is that there's some
> possible ambiguity which I think would have to be solved with
> backtracking.  For example, you can have lines like the following:
> 
> 6 sl-bb24-pen-15-0.sprintlink.net (144.232.16.81)  113.927 ms  110.118
> ms  109.133 ms
> 
> 5 195.3.70.65 (195.3.70.65)  17.557 ms  10.957 ms  11.692 ms
> 
> 7  61.19.60.22 (61.19.60.22)  2.708 ms 202.129.63.70 (202.129.63.70)
> 2.751 ms *
> 
> Hostnames are easy to extract.  Where it gets complicated is on the
> last line.  It can't be known that 202.129.63.70 is an IP address and
> not a delay until the second period.
> 
> So does this situation fit the criterion of being able to be "broken
> down into a list of items taken from a pool of possibilities"?
> 
> Sorry if this is a little remedial and thanks for the help.
> 
> -Andrew
> 
> On May 11, 3:49 pm, Adrian Thurston <thurs... at cs.queensu.ca> wrote:
>> Hi Andrew,
>>
>> Scanners are suitable for processing streams of tokens. Generally this
>> is any input that can be broken down into a list of items taken from a
>> pool of possibilities.
>>
>> You can also use a scanner for its backtracking features. They are
>> useful when you want to attempt one pattern and, should that fail,
>> match some other pattern against a shorter string.
>>
>> Regards,
>>  Adrian
>>
>> AndrewO wrote:
>>> Hi,
>>> I'm pretty new to FSMs, so this is probably an obvious question: when
>>> would you use a scanner over a standard machine?  Is it just for being
>>> able to backtrack if you're going to have ambiguity?  Or is it deeper
>>> than that?
>>> Sorry if this has been covered elsewhere.
>>> Thanks,
>>> Andrew O'Brien
