[ragel-users] Re: Feature Request: Inline Scanner
cmantu... at gmail.com
Sat Nov 4 21:24:19 UTC 2006
On 11/4/06, Adrian Thurston <thurs... at cs.queensu.ca> wrote:
> But I still think containment could be useful. Maybe you'd want to have one
> markup for the whole email address and other markups which give you the user
> and host names.
Sure, you could do that but, at the same time, you could also do it
sequentially with pretty much the same end result given that string
concatenation is a pretty simple thing to do. Personally, I don't feel
the need for capture within capture although it could prove useful in
> > Hmm, keep a global variable (ex: alltokstarts)? This 'alltokstarts'
> > var could be defined as min(tokstart, ts1, ts2, ts3, ...).
> I wonder if maintaining this could be made fast even when the number of
> variables grows.
Hmm, 'alltokstarts' could be updated at the beginning of each capture
with something like min(alltokstarts, ts(n)), no? This would scale
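
To make the min() idea concrete, here is a rough sketch in Python (not Ragel output, just an illustration). The CaptureStarts class and its begin/end methods are names made up for this example. The begin step is the O(1) update described above; what to do when a capture ends is not covered in this thread, so the recompute in end() is only an assumption.

```python
class CaptureStarts:
    """Hypothetical helper: tracks the earliest start among open captures."""

    def __init__(self):
        self.starts = {}          # capture name -> start offset (ts1, ts2, ...)
        self.alltokstarts = None  # min over all open starts, None if none open

    def begin(self, name, pos):
        # Constant-time update: alltokstarts = min(alltokstarts, ts(n)).
        self.starts[name] = pos
        if self.alltokstarts is None or pos < self.alltokstarts:
            self.alltokstarts = pos

    def end(self, name):
        # Assumption: when a capture closes, recompute over what is still open.
        del self.starts[name]
        self.alltokstarts = min(self.starts.values(), default=None)
```

With this, the cost at the start of each capture stays constant no matter how many ts(n) variables exist; only closing a capture touches the others.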
> > From the point of view of the FSM, the inline scanner would be a
> > virtual state. Transitions to this virtual state would happen if and
> > only if at least one of the inline scanner patterns matches. If there
> > is no possible match then the FSM would error.
> You'll have to bear with me here, I can be thick sometimes!
From my point of view you are the expert here. Therefore, if you don't
understand what I'm saying, the blame is totally on me! :-)
> From what you're saying it seems like it's not really a scanner
No, it's not like a regular scanner that keeps repeatedly trying to
match any of the expressions. I guess I should rename my proposed
'inline scanner' to 'longest-match capture'.
> but more like a union because if it finishes when it matches a
> pattern then it won't ever match more than one. Is that right?
Well, union with a twist. For example, with:
|> patternA => actionA; patternB => actionB; <|
Once patternA or patternB matches (the longest or the first wins as
with a regular scanner), the capture machine is done.
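
A rough sketch of the intended behaviour, written in Python rather than Ragel since the |> ... <| syntax is only a proposal. The function name longest_match_capture and the (pattern, action) pair layout are made up for this example, and Python's re module stands in for the sub-machines, so each alternative's match length is whatever re reports for it.

```python
import re

def longest_match_capture(alternatives, text, pos=0):
    """Try every alternative once at pos; the longest match wins, ties go to the first."""
    best = None  # (action, matched text)
    for pattern, action in alternatives:
        m = re.compile(pattern).match(text, pos)
        if m is None:
            continue
        # Strict '>' keeps the earlier alternative when lengths are equal.
        if best is None or len(m.group()) > len(best[1]):
            best = (action, m.group())
    if best is None:
        # No alternative matched: the enclosing machine would error here.
        raise ValueError("no capture pattern matches at offset %d" % pos)
    action, matched = best
    # The capture is done after a single match; it does not loop like a scanner.
    return action(matched), pos + len(matched)
```

Unlike a regular scanner, this runs exactly once and then hands control back to whatever follows it.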
> If that's the case then it seems like the criteria for it starting is
> the same as for it finishing.
Hmm, not sure I'm following you here. In any case, after I emailed the
list yesterday, I thought a little bit more about the use of state
embeddings as a way to emulate this functionality and ended up
concluding that it was probably rubbish. But I thought that the state
chart paradigm could be used to illustrate the basic idea. For
example, with an expression like:
pattern = patA |> patB1 => actionB1; patB2 => actionB2; <| patC;
one could have a state chart like so:
pattern = (
  start: ( patA -> matched_patA ),
  matched_patA: ( |> patB1 => actionB1; patB2 => actionB2; <| -> matched_patB ),
  matched_patB: ( patC -> final )
);
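
The state chart above can be emulated by hand to show the control flow. This is only a sketch in Python: the concrete regexes standing in for patA, patB1, patB2 and patC are placeholders invented for the example, and run_pattern is a made-up name. The state names are the ones from the chart.

```python
import re

def run_pattern(text):
    """Walk start -> matched_patA -> matched_patB -> final over text."""
    pat_a = re.compile(r"<")              # placeholder for patA
    pat_c = re.compile(r">")              # placeholder for patC
    captures = [                          # placeholders for patB1 / patB2
        (re.compile(r"[a-z]+"), "actionB1"),
        (re.compile(r"[a-z]+[0-9]+"), "actionB2"),
    ]
    state, pos, fired = "start", 0, None
    while state != "final":
        if state == "start":
            m = pat_a.match(text, pos)
            if m is None:
                raise ValueError("patA failed")
            pos, state = m.end(), "matched_patA"
        elif state == "matched_patA":
            # The capture acts as one virtual state: longest alternative wins.
            best = None
            for rx, action in captures:
                m = rx.match(text, pos)
                if m and (best is None or m.end() > best[0]):
                    best = (m.end(), action)
            if best is None:
                raise ValueError("no capture alternative matched")
            pos, fired = best
            state = "matched_patB"
        else:  # matched_patB
            m = pat_c.match(text, pos)
            if m is None:
                raise ValueError("patC failed")
            pos, state = m.end(), "final"
    return fired, pos
```

The point is that the transition out of matched_patA is taken only if at least one of the capture alternatives matches; otherwise the whole machine errors.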
Another thing to consider is whether my initial proposal of strictly
relying on longest-match for capture makes sense. Maybe the programmer
should have a choice?
So, do you think this is something you'd be willing to implement? :-)
"We hold [...] that all men are created equal; that they are
endowed [...] with certain inalienable rights; that among
these are life, liberty, and the pursuit of happiness"
-- Thomas Jefferson