[ragel-users] Writing a Telnet parser

Adrian Thurston thurston at complang.org
Sun Oct 3 04:53:25 UTC 2010


Try refactoring your grammar:

plain_text = [a-z];
something_else = ^plain_text;

main := (
	plain_text+ %{ end_of_plain_text(); } |
	something_else+
)*;
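
For comparison, here is a hand-written C sketch of the behaviour the
grammar above is aiming for: report each maximal run of plain_text as a
pointer and length into the caller's own block, with no copying. It is not
Ragel output. The names scan_block, text_cb, run_log and record_run are
hypothetical and exist only for this illustration, and the [a-z] range
assumes an ASCII-compatible character set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: receives a pointer into the caller's block. */
typedef void (*text_cb)(const char *start, size_t len, void *user);

/* Same character class as plain_text = [a-z] in the grammar. */
static int is_plain_text(unsigned char c)
{
    return c >= 'a' && c <= 'z';
}

/*
 * Scan one block [p, pe) and report every maximal run of plain_text.
 * A run ends when the next byte is not plain_text or the block ends.
 */
static void scan_block(const char *p, const char *pe,
                       text_cb on_text, void *user)
{
    const char *left = NULL;    /* first byte of the current run, if any */

    assert(p <= pe);
    for (; p != pe; ++p) {
        if (is_plain_text((unsigned char)*p)) {
            if (left == NULL)
                left = p;       /* entering a run */
        } else if (left != NULL) {
            on_text(left, (size_t)(p - left), user);   /* leaving a run */
            left = NULL;
        }
    }
    /* Flush at the block boundary: pointers into this block
       are only valid while the caller still holds it. */
    if (left != NULL)
        on_text(left, (size_t)(pe - left), user);
}

/* Hypothetical recorder, used only to demonstrate the callback. */
enum { MAX_RUNS = 8 };

struct run_log {
    const char *start[MAX_RUNS];
    size_t len[MAX_RUNS];
    size_t count;
};

static void record_run(const char *start, size_t len, void *user)
{
    struct run_log *log = user;

    if (log->count < MAX_RUNS) {
        log->start[log->count] = start;
        log->len[log->count] = len;
        log->count++;
    }
}
```

Calling scan_block(buf, buf + n, record_run, &log) on a block holding
"abcdef", two non-text bytes, then "ghi" records two runs. Because the
pending run is flushed at the end of every block, a stretch of text that
spans two network reads arrives as two callbacks. That is the block
boundary trade-off mentioned further down the thread.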

On 10-09-29 05:39 PM, Jonathan Castello wrote:
> I have, actually. If I have "plain_text %leaving", the leaving action
> is still executed after every plain_text character, as the generated
> graph seems to indicate. If I put %leaving after the telnet_stream
> itself, the graph suggests that it's only executed on EOF, which will
> never occur as I am processing a potentially infinite network stream.
> (Because of this, I explicitly set eof = NULL in the code before "%%
> write init", as the documentation suggests.)
>
> ~Jonathan
>
> On Wed, Sep 29, 2010 at 10:43 AM, Adrian Thurston
> <adrian.thurston at esentire.com> wrote:
>> Have you tried leaving actions? It sounds like that is what you want.
>>
>> -Adrian
>>
>> On 10-09-29 10:34 AM, Jonathan Castello wrote:
>>>
>>> Hi Adrian,
>>>
>>> Thanks for your help. Actually, I know how I want to buffer them; the
>>> problem is actually extracting them when I want to. I need some way to
>>> extract the characters only when the next character isn't plain_text
>>> or there is no next character. I've tried adding an entry action to
>>> cr_sequence and iac_sequence, but that doesn't work when you reach the
>>> end of the subject data without seeing a CR or IAC. What I was hoping
>>> to do is maintain a 'left' pointer to the first plain_text character,
>>> and use fpc as the 'right' pointer when I reach the last contiguous
>>> plain_text character. Then I would pass the left pointer and the
>>> length of that contiguous stretch (fpc-left) to the user-provided
>>> callback.
>>>
>>> I could copy each character to a temporary buffer, but I was hoping to
>>> avoid extra allocations. I want to just pass pointers into the
>>> original block of text being parsed, so the calling code can do any
>>> copying and allocating required. My entry action attempt was the
>>> closest I could get: it would properly fire before a non plain_text
>>> sequence, but the major issue is that it wouldn't fire at all when it
>>> reached the end of the subject line.
>>>
>>> Thanks again,
>>> ~Jonathan
>>>
>>> On Wed, Sep 29, 2010 at 10:07 AM, Adrian Thurston
>>> <adrian.thurston at esentire.com> wrote:
>>>>
>>>> Hi Jonathan,
>>>>
>>>> Ragel does not do any buffering of text for you. It's up to you to decide
>>>> how you want to do that, then implement it yourself. There are a couple
>>>> options. You can copy text to a buffer as you move over characters, or
>>>> you can extract them from the input buffer when you need them. The first
>>>> approach is simpler and guaranteed to work without hitches. The second
>>>> technique is faster, but you have to consider buffer block boundaries.
>>>>
>>>> -Adrian
>>>>
>>>> On 10-09-28 08:30 PM, Jonathan Castello wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I'm building a Telnet parser using Ragel, and I'm having an issue
>>>>> making the actions do what I want. I've pasted the machine definition
>>>>> to a gist: http://gist.github.com/602242
>>>>>
>>>>> The issue is a little hard for me to describe, so I'll try to
>>>>> illustrate it as best as I can. If I have a stream of input, and some
>>>>> part of it is "abcdef<IAC><GA>ghi" (where <x> is a mnemonic for a
>>>>> single byte), I want to emit events as such: text("abcdef"),
>>>>> command("<GA>"), text("ghi"). The caller provides callbacks, and I
>>>>> would pass the data to them as I interpret it.
>>>>>
>>>>> The problem is that I can't figure out how to define actions that
>>>>> would only trigger when the next character doesn't match plain_text
>>>>> (or there's no more data left to parse in that particular packet), so
>>>>> I can get that full stretch of characters. At the moment, I can only
>>>>> get text("a"), text("b"), text("c") etc. to work, i.e. one plain_text
>>>>> match at a time.
>>>>>
>>>>> I suspect the problem is that cr_sequence and iac_sequence are
>>>>> supposed to behave this way - they, too, match singular "terms" each
>>>>> time before returning to the start - but here I am, wanting to give
>>>>> plain_text special treatment. Am I even coming at this from the right
>>>>> angle?
>>>>>
>>>>> Thanks in advance for any advice!
>>>>> ~Jonathan Castello
>>>>>
>>>>> _______________________________________________
>>>>> ragel-users mailing list
>>>>> ragel-users at complang.org
>>>>> http://www.complang.org/mailman/listinfo/ragel-users
>>>>>
