<div dir="ltr">On Tue, Mar 13, 2012 at 10:16 PM, Nadav Har'El <span dir="ltr"><<a href="mailto:nyh@math.technion.ac.il">nyh@math.technion.ac.il</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On Tue, Mar 13, 2012, Elazar Leibovich wrote about "Re: Unicode in C":<br>
<div class="im">> Something very important that one needs to consider is Unicode normalization.<br>
> That is, how to strip out the Niqud, and to substitute, say KAF WITH DAGESH<br>
> (U+FB3B) with just a KAF (U+05DB) etc.<br>
<br>
</div>Is this really important? Does anybody actually use "Kaf with Dagesh" ?<br>
Why does it even exist? :(<br></blockquote><div><br></div><div>I'm not sure, just as I'm not sure why LOVE HOTEL or JAPANESE GOBLIN exist. When I read that stuff I don't know whether to laugh or cry. Most of these are probably never used, although I need to ask people in the publishing industry; maybe they use special symbols there. Maybe some of the wise folks on the list will enlighten us.</div>
<div>However, as the saying goes, the Unicode Consortium prepared the cure before the blow (הקדים רפואה למכה) and defined standard normalization algorithms, which are supposed to solve this problem by converting all text to a standard form (I'm not sure they really cover all the edge cases, though).</div>
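<div><br></div><div>To make the idea concrete, here is a rough sketch in Python rather than C, since Python ships the normalization tables in its standard <code>unicodedata</code> module. It is not hspell code; the function name <code>strip_niqud</code> and the choice of "every nonspacing mark in U+0591..U+05C7" as the set to drop are my own assumptions. It decomposes the text canonically (NFD), drops the Hebrew points, and recomposes (NFC). Presentation forms that only have a compatibility mapping would need NFKD instead, which this sketch does not attempt.</div>

```python
import unicodedata

# Assumption: treat every nonspacing mark (category Mn) in this range as
# niqud or cantillation to be dropped. Punctuation such as MAQAF (U+05BE)
# lies inside the range but is not category Mn, so it is kept.
HEBREW_MARK_RANGE = range(0x0591, 0x05C8)


def strip_niqud(text):
    """Hypothetical helper: return text without Hebrew points.

    NFD splits precomposed presentation forms such as KAF WITH DAGESH
    (U+FB3B) into the base letter KAF (U+05DB) plus DAGESH (U+05BC),
    so removing the marks afterwards leaves the plain letter.
    """
    decomposed = unicodedata.normalize("NFD", text)
    kept = [
        ch
        for ch in decomposed
        if not (
            ord(ch) in HEBREW_MARK_RANGE
            and unicodedata.category(ch) == "Mn"
        )
    ]
    # Recompose so that non-Hebrew text (e.g. accented Latin letters)
    # comes back in its usual precomposed form.
    return unicodedata.normalize("NFC", "".join(kept))


if __name__ == "__main__":
    kaf_with_dagesh = "\uFB3B"
    print(hex(ord(strip_niqud(kaf_with_dagesh))))
```

<div>A caller that wants the input "expected to be normalized" could run something like this once on each word before looking it up, instead of teaching the dictionary about every pointed variant.</div>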
<div><br></div><div>I'm not sure whether normalization should be included in hspell itself; however, I would add a notice that the input is expected to be normalized for hspell to work correctly. And I would at least support Niqud.</div><div>
<br></div></div></div>