A collection of high-performance C-string transformations

Groxx · on March 29, 2010

Does it support UTF or other encodings? I really have no use for it if it doesn't, and I don't see anything mentioning it on the first page.

klodolph · on March 29, 2010

This was my question too, so I looked at the source. It refers to a page on algorithms that claims it is 8-bit clean, i.e., it passes non-ASCII bytes through unchanged rather than case-converting them. I then compiled a test program to verify this, because the bit twiddling in the source code looks a little subtle. It IS 8-bit clean.

Although, if you're looking for something that converts non-ASCII characters to their lower case equivalents, just use LibICU. This library looks nice for decoding things like HTTP, which has plenty of ASCII-only case insensitivity.
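For anyone who wants to repeat the 8-bit-clean check described above, here is a minimal sketch of such a test. It is not the test program from this comment or the library's code: `demo_toupper`, `demo_strip_high_bit`, and `is_8bit_clean` are hypothetical names, and `demo_toupper` is only a stand-in for whatever routine is being checked.

```c
#include <stddef.h>

/* Stand-in for the routine under test (hypothetical, not the library's code):
   uppercases ASCII a-z and leaves every other byte alone. */
void demo_toupper(char *s, size_t n) {
    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c >= 'a' && c <= 'z')
            s[i] = (char)(c - ('a' - 'A'));
    }
}

/* Deliberately broken counter-example: clears the high bit of every byte. */
void demo_strip_high_bit(char *s, size_t n) {
    for (size_t i = 0; i < n; i++)
        s[i] = (char)((unsigned char)s[i] & 0x7f);
}

/* Feeds all 256 byte values through the transform and returns 1 if every
   byte with the high bit set (0x80..0xFF) comes back unchanged, else 0. */
int is_8bit_clean(void (*transform)(char *, size_t)) {
    unsigned char buf[256];
    for (int i = 0; i < 256; i++)
        buf[i] = (unsigned char)i;
    transform((char *)buf, sizeof buf);
    for (int i = 0x80; i < 256; i++)
        if (buf[i] != (unsigned char)i)
            return 0;
    return 1;
}
```

Calling `is_8bit_clean` with the real function in place of `demo_toupper` would give the same kind of yes/no answer. A word-at-a-time implementation would also need each byte value tried in every position within the word.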

mbreese · on March 29, 2010

Since it is doing bit manipulation, probably not. I mean, the toupper function treats four 8-bit chars as one 32-bit uint and then does math on it, so I assume that it will miss the finer points of I18N.

However, the base64/85 en/decoders should work just fine.
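The "four chars as one 32-bit uint" trick mentioned above can be sketched roughly as follows. This is one common way to do it, written from scratch for illustration; it is not copied from the library, and the names `swar_toupper4` and `ascii_toupper_buf` are made up. It only touches ASCII a-z, so it says nothing about proper Unicode case mapping.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Uppercase four packed bytes at once. Only bytes in 'a'..'z' change. */
uint32_t swar_toupper4(uint32_t w) {
    /* Drop each byte's top bit so the additions below cannot carry
       from one byte into its neighbour (0x7f + 0x1f < 0x100). */
    uint32_t low7 = w & 0x7f7f7f7fu;
    /* Bit 7 of a byte becomes set when that byte is >= 'a' (0x61 + 0x1f = 0x80). */
    uint32_t ge_a = low7 + 0x1f1f1f1fu;
    /* Bit 7 of a byte becomes set when that byte is > 'z' (0x7b + 0x05 = 0x80). */
    uint32_t gt_z = low7 + 0x05050505u;
    /* Keep bit 7 only where the byte is in 'a'..'z' and its original
       top bit was clear, so bytes 0x80..0xFF are never touched. */
    uint32_t is_lower = ge_a & ~gt_z & ~w & 0x80808080u;
    /* 0x80 >> 2 == 0x20, the ASCII case bit; flipping it uppercases. */
    return w ^ (is_lower >> 2);
}

/* Uppercase a buffer in place, four bytes per step, then the leftover tail. */
void ascii_toupper_buf(char *s, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        uint32_t w;
        memcpy(&w, s + i, 4);      /* alignment-safe load */
        w = swar_toupper4(w);
        memcpy(s + i, &w, 4);      /* alignment-safe store */
    }
    for (; i < n; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c >= 'a' && c <= 'z')
            s[i] = (char)(c - 0x20);
    }
}
```

Because each byte is handled independently inside the word, the result does not depend on endianness. The `~w` term is what keeps bytes with the high bit set untouched, so multi-byte UTF-8 sequences pass through unmodified but are not case-converted.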

jws · on March 29, 2010

And in the same vein as the nginx optimizations suggested in a comment yesterday, it fails on architectures that don't support unaligned access.
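One usual way around the unaligned-access problem is to go through `memcpy` instead of casting the `char *` to a wider pointer type. The sketch below illustrates that idea and is not a patch for the library; `load_u32` and `store_u32` are hypothetical helper names.

```c
#include <stdint.h>
#include <string.h>

/* Risky pattern, shown only as a comment: dereferencing a cast pointer is
   undefined behaviour in C when p is not suitably aligned for uint32_t,
   and may trap on strict-alignment CPUs.
       uint32_t w = *(const uint32_t *)p;                                  */

/* Read four bytes from any address. Compilers typically turn this into a
   single load on targets where unaligned access is allowed. */
uint32_t load_u32(const void *p) {
    uint32_t w;
    memcpy(&w, p, sizeof w);
    return w;
}

/* Write four bytes to any address. */
void store_u32(void *p, uint32_t w) {
    memcpy(p, &w, sizeof w);
}
```

With helpers like these, a word-at-a-time loop can start at any byte offset in the string without having to align the pointer first.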

malkia · on March 29, 2010

Funny, but typing CString into Google reveals some new interesting products.

DrJokepu · on March 29, 2010

Great stuff, it's a shame it doesn't work on Windows (yet). Makes me want to spend the rest of my day trying to port it to MSVC.

shin_lao · on March 29, 2010

You can use <boost/cstdint.hpp> from Boost: http://www.boost.org/

sid0 · on March 29, 2010

Surely if stdint is all that's missing, http://code.google.com/p/msinttypes/ will work?

halostatue · on March 29, 2010

We use http://www.azillionmonkeys.com/qed/pstdint.h in house.

sausagefeet · on March 29, 2010

I've been looking for a high performance toupper lately too.