<div dir="ltr"><div class="gmail_quote">On Mon, Mar 12, 2012 at 7:37 PM, Nadav Har&#39;El <span dir="ltr">&lt;<a href="mailto:nyh@math.technion.ac.il">nyh@math.technion.ac.il</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

On Mon, Mar 12, 2012, Elazar Leibovich wrote about &quot;Re: Unicode in C&quot;:<br>

<div class="im">&gt; The simplest option is, to accept StringPiece-like structure (pointer to<br>

&gt; buffer + size), and encoding, then to convert the data internally to your<br>

&gt; encoding (say, ISO-8859-8, replacing illegal characters with whitespace),<br>

&gt; and convert the other output back.<br>

<br>

</div>This is an option, but certainly not the simplest :-)<br></blockquote><div><br></div><div>It was the simplest idea <i>I could think of</i> at this moment ;p</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="im"><br>

</div>What &quot;iconv-like library&quot;?<br></blockquote><div><br></div><div>By &quot;iconv-like&quot; I mean: do you mind using iconv from glibc? And if that&#39;s a problem on Windows, embedded systems, and other platforms that do not ship glibc, do you mind depending on another library such as ICU, or at least on something more lightweight that handles all the UTF-* conversions?</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

I&#39;m not ruling this idea out. But what worries me is that at the end,<br>

my users only use 1% of this library&#39;s features - e.g., I&#39;ll never need<br>

this library&#39;s support from converting one encoding of Chinese to<br>

another. So people who want to use the 50 KB libhspell will suddenly need<br>

the 15 MB libicu.<br></blockquote><div><br></div><div>At least with iconv on Linux this is not the case. First, the library is available in every distro. Second, iconv splits its functionality across many .so files and dynamically loads only the required shared objects at runtime. I&#39;m not sure about the state of iconv on Windows, though. Maybe you can fall back to native system calls there.</div>
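<div><br></div><div>To make the iconv option concrete, here is a minimal sketch of the conversion step, assuming a POSIX iconv (as in glibc) that knows the name &quot;ISO-8859-8&quot;. The helper name to_utf8 and its signature are hypothetical, made up for illustration, and not part of hspell or iconv.</div>

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper (not an hspell or iconv API): convert the bytes
 * in[0..in_len) from the encoding named `from` into UTF-8.
 * Returns the number of bytes written to `out`, or -1 on any failure
 * (unknown encoding, invalid input, or output buffer too small). */
long to_utf8(const char *from, const char *in, size_t in_len,
             char *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);

    /* iconv_open takes the target encoding first, then the source. */
    iconv_t cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t)-1)
        return -1;

    /* iconv wants char ** for the input; the cast drops const only
     * to satisfy that signature. */
    char *src = (char *)in;
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = out_cap;

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;

    return (long)(out_cap - dst_left);
}
```

<div>A caller would pass the pointer and size from the StringPiece-like structure together with the encoding name. On systems where iconv lives in a separate libiconv, the program may need to be linked with -liconv.</div>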

<div><br></div><div>That said, on second thought, single-byte encodings seem more and more obsolete to me. So I think it might be sufficient to support only UTF-16 and UTF-8. UTF-8 is common on the network and in files, and UTF-16 is common as the internal string format in C++ libraries, Java, and C#, so it&#39;s important to support it for easier interoperability with those.</div>
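<div><br></div><div>If only UTF-8 and UTF-16 are supported, the conversion is small enough to write by hand with no extra dependency. A rough sketch, assuming UTF-16 input in native byte order and treating unpaired surrogates as errors. The function name utf16_to_utf8 and its signature are hypothetical, chosen only for this example.</div>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from any existing library.
 * Converts in[0..in_len) (UTF-16 code units, native byte order) to UTF-8.
 * Returns the number of bytes written to `out`, or -1 on an unpaired
 * surrogate or when `out` is too small. */
long utf16_to_utf8(const uint16_t *in, size_t in_len,
                   unsigned char *out, size_t out_cap)
{
    size_t o = 0;

    for (size_t i = 0; i < in_len; i++) {
        uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (i + 1 >= in_len || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                return -1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00u);
            i++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            /* Low surrogate without a preceding high surrogate. */
            return -1;
        }

        /* Number of UTF-8 bytes this code point needs. */
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (out_cap - o < need)
            return -1;

        if (need == 1) {
            out[o++] = (unsigned char)cp;
        } else if (need == 2) {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xF0 | (cp >> 18));
            out[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return (long)o;
}
```

<div>The reverse direction (UTF-8 to UTF-16) is similar in size, though it also has to reject overlong and truncated sequences.</div>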

</div></div>