[ragel-users] signed/unsigned portability issue
william at 25thandClement.com
Thu Oct 24 19:53:12 UTC 2013
On Thu, Oct 24, 2013 at 08:52:17PM +0200, Peter van Dijk wrote:
> Hello folks,
> we (PowerDNS) have a small Ragel parser for segmenting and unescaping DNS
> TXT record data. Some time ago, we expanded the allowed inputs for this
> parser to the full 8-bit 'extended ASCII' range (which Ragel calls
> This works well on most platforms - but it failed for us on Debian/s390x.
> After a lot of digging I found that char is unsigned on s390x, while it is
> signed on amd64, i386 and many other platforms.
> I have added 'alphtype unsigned char' to our Ragel file. This makes the
> parser work reliably on both amd64 and s390x (and, hopefully, many other
> However, I feel something is wrong. It seems that on s390x, Ragel is
> mostly confused about the type of char. It generates a parser that treats
> extend as -128..127, but maps non-ASCII inputs in the 128..255 range. This
> discrepancy feels like a Ragel issue to me.
> A much longer version of this story is at
> My question: is this a Ragel bug? Regardless of yes/no, is what I did
> (alphtype unsigned char) the best workaround?
IMHO it would probably be better for Ragel to use unsigned char arithmetic
for both char and unsigned char. Off the top of my head it even seems like
Ragel should treat all input as unsigned.
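
A minimal sketch of what "treat all input as unsigned" can look like on the
C side, independent of Ragel itself. The helper names byte_value and
count_high_bytes are hypothetical and exist only for illustration; the point
is that reading each byte through unsigned char gives 0..255 on every
platform, whether plain char is signed (amd64, i386) or unsigned (s390x).

```c
#include <limits.h>
#include <stddef.h>

/* Reports whether plain char is signed on the platform being compiled for. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: read one input byte as a value in 0..UCHAR_MAX,
 * regardless of the signedness of plain char. */
static int byte_value(const char *p)
{
    return (unsigned char)*p;
}

/* Hypothetical helper: count the bytes in a buffer that fall outside
 * 7-bit ASCII (values 128 and above). */
static size_t count_high_bytes(const char *buf, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (byte_value(buf + i) >= 128)
            n++;
    }
    return n;
}
```

Comparing *p directly against 128 would never match where char is signed,
because the byte 0xE9 reads as -23 there; the cast removes that difference.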
FWIW, I always use unsigned arithmetic, for Ragel and most everything else.
Signed arithmetic is for mathematical formulas, not bit twiddling and string
processing. At the very least, signed arithmetic quickly leads to undefined
behavior (overflow, for one), whereas signed->unsigned conversions in C are
always well defined.
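
To illustrate the well-defined conversion mentioned above, here is a small
sketch. The function names to_byte and wrapping_add are made up for this
example. In C, converting a signed value to an unsigned type reduces it
modulo 2^N, where N is the width of the destination type, and unsigned
addition wraps the same way rather than overflowing.

```c
#include <limits.h>

/* Converting a negative int to unsigned char is well defined: the result
 * is the value reduced modulo (UCHAR_MAX + 1). */
static unsigned char to_byte(int v)
{
    return (unsigned char)v;
}

/* Unsigned addition wraps around modulo (UINT_MAX + 1); the equivalent
 * signed addition would be undefined behavior on overflow. */
static unsigned int wrapping_add(unsigned int a, unsigned int b)
{
    return a + b;
}
```

For example, to_byte(-1) yields UCHAR_MAX, and wrapping_add(UINT_MAX, 1u)
yields 0.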
Does anybody on the list actually use or depend on signed behavior in their