Unicode in C

Nadav Har'El nyh at math.technion.ac.il
Tue Mar 13 09:56:14 IST 2012


On Mon, Mar 12, 2012, Omer Zak wrote about "Re: Unicode in C":
> It depends upon your tradeoffs.
>...
> 2. Otherwise, specify two such APIs - one is UTF-8 based, one is fixed
> size wide character based.  Create two binary variants of the libhspell
>...

This is why I asked this question in the first place - I'm aware of the
tradeoffs, and of the possibility to create two variants of every
function (or three, if you include the existing ISO-8859-8 API).
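
To make the "variants" idea concrete, here is a minimal sketch in C. It is
not hspell's actual API: the names demo_check_utf32, demo_check_utf8 and
demo_check_iso8859_8 are hypothetical, and the "check" is only a stand-in
that tests whether every character is a Hebrew letter. It assumes one core
routine working on UTF-32 code points, with a thin wrapper per encoding.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_WORD 64   /* arbitrary limit chosen for this sketch */

/* Hypothetical core routine: sees UTF-32 code points only.
 * Stand-in logic: 1 if the word is non-empty and every code point
 * is a Hebrew letter (U+05D0..U+05EA), 0 otherwise. */
static int demo_check_utf32(const uint32_t *word, size_t len)
{
    if (len == 0)
        return 0;
    for (size_t i = 0; i < len; i++)
        if (word[i] < 0x05D0 || word[i] > 0x05EA)
            return 0;
    return 1;
}

/* Variant 1: ISO-8859-8 input. Bytes 0xE0..0xFA are the Hebrew letters
 * and map to U+05D0..U+05EA. Other bytes are passed through unchanged,
 * which is only correct for the ASCII range. */
int demo_check_iso8859_8(const char *s)
{
    uint32_t buf[DEMO_MAX_WORD];
    size_t n = 0;
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (n == DEMO_MAX_WORD)
            return 0;
        buf[n++] = (c >= 0xE0 && c <= 0xFA) ? 0x05D0 + (uint32_t)(c - 0xE0) : c;
    }
    return demo_check_utf32(buf, n);
}

/* Variant 2: UTF-8 input. A simplified decoder: it rejects bad lead
 * bytes and bad or missing continuation bytes, but does not reject
 * overlong encodings or surrogates. */
int demo_check_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    uint32_t buf[DEMO_MAX_WORD];
    size_t n = 0;
    while (*p != '\0') {
        uint32_t cp;
        int extra;
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return 0;                 /* invalid lead byte */
        p++;
        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)
                return 0;              /* truncated or invalid sequence */
            cp = (cp << 6) | (*p & 0x3F);
            p++;
        }
        if (n == DEMO_MAX_WORD)
            return 0;
        buf[n++] = cp;
    }
    return demo_check_utf32(buf, n);
}
```

Each additional encoding costs only one more small wrapper like these, but
every public function in the library would need the same treatment.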

I was just wondering - could it be that 20 years (!) after UTF-8 was
invented for use in Plan 9 to counter "wide characters", neither
method has "won"?

As I see it, UTF-8 vs. wide characters (or UTF-16, or UTF-32, which
are all similar for my needs) is a big-endian/little-endian kind of
issue, where it's possible to list all sorts of advantages for each one,
but in the end, each of the choices is good and has a large number of
followers, and a choice has to be made. Continuing to use all of these
approaches is not a good thing, as I see it. Even if in practice I
can write all these APIs without too much effort, I think it's ugly.

-- 
Nadav Har'El                        |                   Tuesday, Mar 13 2012, 
nyh at math.technion.ac.il             |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |Live as if you were to die tomorrow,
http://nadav.harel.org.il           |learn as if you were to live forever.


