<div dir="ltr"><div class="gmail_quote">On Mon, Mar 12, 2012 at 5:39 PM, E L <span dir="ltr">&lt;<a href="mailto:elylevy@cs.huji.ac.il">elylevy@cs.huji.ac.il</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div dir="ltr">What&#39;s the advantage of using ucs-4 internally?<br>Especially if the program needs to save memory (embedded devices are pretty common these days).<br></div></blockquote><div><br></div><div>UTF-32, or UCS-4, is the only encoding form that allows random access to each Unicode code point, because every code point takes exactly 32 bits. As I mentioned, the 16-bit encoding that became UTF-16 was originally designed with indexable code points in mind, but eventually there were too many of them to fit in 16 bits, so anything above U+FFFF now takes two 16-bit units (e.g. <a href="http://www.fileformat.info/info/unicode/char/1f3e9/index.htm">http://www.fileformat.info/info/unicode/char/1f3e9/index.htm</a>, <a href="https://plus.google.com/109925364564856140495/posts">https://plus.google.com/109925364564856140495/posts</a>, etc.).</div>
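<div><br></div><div>To make the indexing point concrete, here is a minimal Python sketch (not taken from any particular library; the sample string and the helper names codepoint_at_utf32 and utf16_units are made up for illustration). It assumes little-endian buffers without a BOM and reuses U+1F3E9 from the link above.</div>

```python
# Sample text with one code point outside the Basic Multilingual Plane.
text = "a\U0001F3E9b"

utf32 = text.encode("utf-32-le")  # 4 bytes per code point, no BOM
utf16 = text.encode("utf-16-le")  # 2 or 4 bytes per code point, no BOM


def codepoint_at_utf32(buf, index):
    """Return the code point at position `index` in a UTF-32-LE buffer.

    Every code point is exactly 4 bytes, so the byte offset is index * 4
    and no scanning is needed.
    """
    start = index * 4
    return int.from_bytes(buf[start:start + 4], "little")


def utf16_units(buf):
    """Split a UTF-16-LE buffer into its 16-bit code units."""
    return [int.from_bytes(buf[i:i + 2], "little") for i in range(0, len(buf), 2)]


# Direct indexing works in UTF-32: position 1 is U+1F3E9, position 2 is "b".
print(hex(codepoint_at_utf32(utf32, 1)))
print(chr(codepoint_at_utf32(utf32, 2)))

# In UTF-16 the same code point occupies two units (a surrogate pair),
# so unit index 2 is the second half of U+1F3E9, not "b".
print([hex(u) for u in utf16_units(utf16)])
```

<div>The cost is the one raised in the question: the three code points above take 12 bytes in UTF-32 versus 8 in UTF-16, so this is a trade of memory for constant-time indexing.</div>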

<div><br></div></div></div>