Unicode in C

Mon Mar 12 17:49:08 IST 2012

On Mon, Mar 12, 2012 at 5:39 PM, E L <elylevy at cs.huji.ac.il> wrote:

> What's the advantage of using ucs-4 internally?
> Especially if the program needs to save memory (embedded devices are
> pretty common these days).
>

UTF-32 (or UCS-4) is the only encoding form that allows random access to
each Unicode code point, because every code point takes exactly 32 bits. As I
mentioned, 16-bit Unicode (UCS-2, which UTF-16 later extended) was created
with the intention of having indexable code points, but eventually there were
too many of them to fit in 16 bits (e.g.
http://www.fileformat.info/info/unicode/char/1f3e9/index.htm
https://plus.google.com/109925364564856140495/posts etc.), so in UTF-16 such
code points take a surrogate pair of two 16-bit units.
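
To make the indexing difference concrete, here is a rough sketch in C. The
function names utf32_at and utf16_at are made up for this mail and do not come
from any library, and the UTF-16 version assumes mostly well-formed input
(an unpaired surrogate is simply returned as-is).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* UTF-32: code point number i is simply element i, so lookup is O(1).
   Hypothetical helper; the caller must make sure i is in range. */
uint32_t utf32_at(const uint32_t *s, size_t i)
{
    assert(s != NULL);
    return s[i];
}

/* UTF-16: code points above U+FFFF occupy two code units (a surrogate
   pair), so finding code point number `index` means scanning from the
   start, which is O(n). `len` is the length in 16-bit code units.
   Returns UINT32_MAX if `index` is past the end (an assumption of this
   sketch, chosen because it is not a valid code point). */
uint32_t utf16_at(const uint16_t *s, size_t len, size_t index)
{
    assert(s != NULL);
    size_t pos = 0;
    while (pos < len) {
        uint32_t cp = s[pos++];
        /* High surrogate followed by a low surrogate: combine them. */
        if (cp >= 0xD800 && cp <= 0xDBFF && pos < len) {
            uint32_t lo = s[pos];
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                pos++;
            }
        }
        if (index == 0)
            return cp;
        index--;
    }
    return UINT32_MAX;
}
```

For the string "A", U+1F3E9, "B", the UTF-32 array has three elements and
utf32_at(s, 1) reads the middle one directly. The UTF-16 array has four code
units (0x0041, 0xD83C, 0xDFE9, 0x0042), and utf16_at has to walk over the
surrogate pair to find out that code point 2 is "B".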