Unicode in C

Elazar Leibovich elazarl at gmail.com
Tue Mar 13 17:43:12 IST 2012


On Tue, Mar 13, 2012 at 5:22 PM, Nadav Har'El <nyh at math.technion.ac.il> wrote:

>
> Qt appears to use internally UTF-16. What major free software C library
> actually prefer UTF-8?
>

Are you talking about the internal representation, or the external
interface?

The internal representation is in many cases UTF-16. Indeed, except for
Go and, it seems, Perl, I can't think of any other language, open
source or not, that uses UTF-8 as its internal representation.

That said, the internal representation should not be exposed to anyone, so
it shouldn't really matter to anyone whether you're using ISO-8859-1
internally, as long as they don't have to convert their text to that arcane
format.
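
To make the internal/external split concrete, here is a rough sketch of
what the conversion at the boundary could look like, assuming a library
that accepts UTF-8 from callers and stores UTF-16 internally. The names
utf8_decode and utf8_to_utf16 are made up for illustration and do not come
from Qt or any other library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence from s (n bytes available).
 * Returns the number of bytes consumed, or 0 if the input is malformed
 * (bad lead byte, truncated, overlong, surrogate, or above U+10FFFF). */
static size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len, i;
    uint32_t c, min;

    if (n == 0)
        return 0;
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        len = 2; c = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {
        len = 3; c = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {
        len = 4; c = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;
    }
    if (n < len)
        return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)      /* must be a continuation byte */
            return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;
    *cp = c;
    return len;
}

/* Hypothetical boundary function: convert caller-supplied UTF-8 into the
 * UTF-16 code units kept internally. Returns the number of code units
 * written, or (size_t)-1 on malformed input or insufficient space. */
size_t utf8_to_utf16(const char *in, size_t in_len,
                     uint16_t *out, size_t out_cap)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t i = 0, o = 0;

    while (i < in_len) {
        uint32_t cp;
        size_t used = utf8_decode(s + i, in_len - i, &cp);

        if (used == 0)
            return (size_t)-1;
        i += used;
        if (cp < 0x10000) {
            if (o + 1 > out_cap)
                return (size_t)-1;
            out[o++] = (uint16_t)cp;
        } else {
            /* Outside the BMP: emit a surrogate pair. */
            if (o + 2 > out_cap)
                return (size_t)-1;
            cp -= 0x10000;
            out[o++] = (uint16_t)(0xD800 | (cp >> 10));
            out[o++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    return o;
}
```

The caller only ever sees the UTF-8 side; the uint16_t buffer stays a
private detail, so it could be swapped for another representation without
changing the interface.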

However, if you look around, a lot of text files, documentation, HTML
files, and open network wire formats (e.g., JSON) use UTF-8 as their text
encoding. So in this sense, I think it's a de facto standard.
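
For the output side, writing text into one of those UTF-8 formats comes
down to encoding each code point as one to four bytes. A minimal sketch,
assuming the internal form has already been turned into code points;
utf8_encode is a made-up name for illustration, not a function from any
particular library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
 * Writes 1 to 4 bytes into out and returns how many were written,
 * or 0 if cp is a surrogate or above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)   /* surrogates are not characters */
            return 0;
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

ASCII passes through as single unchanged bytes, which is a large part of
why existing text files and wire formats could adopt UTF-8 so easily.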