Unicode in C

Daniel Shahaf d.s at daniel.shahaf.name
Tue Mar 13 22:39:00 IST 2012


Nadav Har'El wrote on Tue, Mar 13, 2012 at 22:16:23 +0200:
> On Tue, Mar 13, 2012, Elazar Leibovich wrote about "Re: Unicode in C":
> > Something very important that one needs to consider is Unicode normalization.
> > That is, how to strip out the Niqud, and to substitute, say KAF WITH DAGESH
> > (U+FB3B) with just a KAF (U+05DB) etc.
> 
> Is this really important? Does anybody actually use "Kaf with Dagesh" ?
> Why does it even exist? :(
> 

FWIW, Unicode normalization isn't just about ignoring niqud, it's also
about having two or more equivalent forms for the same character, such as
é (U+00E9) and é (U+0065, U+0301).
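
To make the two forms concrete, here is a minimal C sketch. It assumes the
strings are UTF-8 encoded; the names precomposed, decomposed and same_bytes
are made up for illustration. Standard C has no normalization routine, so a
plain strcmp() treats the two spellings as different strings.

```c
#include <string.h>

/* Precomposed form: U+00E9 LATIN SMALL LETTER E WITH ACUTE (UTF-8: C3 A9). */
static const char precomposed[] = "\xc3\xa9";

/* Decomposed form: U+0065 'e' followed by U+0301 COMBINING ACUTE ACCENT
 * (UTF-8: 65 CC 81). It renders the same as the precomposed form. */
static const char decomposed[] = "e\xcc\x81";

/* Hypothetical helper: returns 1 if the two strings are byte-identical,
 * 0 otherwise. It performs no normalization at all. */
static int same_bytes(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

same_bytes(precomposed, decomposed) returns 0 even though both strings
display as the same character, which is why text has to be normalized to one
form (with a library such as ICU or libunistring) before it is compared.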

I'm not sure whether this particular issue applies to Hebrew.
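
For what it's worth, the Unicode character database gives U+FB3B a canonical
decomposition to KAF (U+05DB) followed by DAGESH (U+05BC), so the
equivalent-forms issue does seem to come up in Hebrew as well. Below is a
rough sketch of the "strip the niqud" step from the quoted message. It
assumes the text has already been decoded into an array of code points. The
names is_hebrew_mark and strip_hebrew_marks are hypothetical, and the only
presentation form handled is U+FB3B; a real program would take the full
decomposition data from a Unicode library rather than hard-code it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hebrew combining marks (cantillation marks and points). The punctuation
 * code points interleaved with them (U+05BE, U+05C0, U+05C3, U+05C6) are
 * deliberately not treated as marks. */
static int is_hebrew_mark(uint32_t cp)
{
    return (cp >= 0x0591 && cp <= 0x05BD) || cp == 0x05BF ||
           cp == 0x05C1 || cp == 0x05C2 ||
           cp == 0x05C4 || cp == 0x05C5 || cp == 0x05C7;
}

/* Hypothetical helper: copies in[0..n) to out, dropping Hebrew marks and
 * folding U+FB3B (KAF WITH DAGESH) to U+05DB (KAF). out must have room for
 * n code points. Returns the number of code points written. */
static size_t strip_hebrew_marks(const uint32_t *in, size_t n, uint32_t *out)
{
    size_t w = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t cp = in[i];
        if (cp == 0xFB3B)   /* the only presentation form in this sketch */
            cp = 0x05DB;
        if (!is_hebrew_mark(cp))
            out[w++] = cp;
    }
    return w;
}
```

With this, both { U+FB3B } and { U+05DB, U+05BC } reduce to the single code
point U+05DB, so the two spellings compare equal after stripping.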

Daniel
(maybe you knew this already)

> I noticed there are even more bizarre characters, like "HEBREW LETTER
> ALEF WITH MAPIQ" (!?), "HEBREW LIGATURE ALEF LAMED", "HEBREW LETTER WIDE
> ALEF", "HEBREW LETTER ALEF WITH QAMATS" (Is Yiddish called Hebrew now??)
> "HEBREW LETTER ALTERNATIVE AYIN", and other junk. Why do these exist?
> This is sad.
> 
> Nadav.
> 
> 
> -- 
> Nadav Har'El                        |                   Tuesday, Mar 13 2012, 
> nyh at math.technion.ac.il             |-----------------------------------------
> Phone +972-523-790466, ICQ 13349191 |War doesn't determine who's right but
> http://nadav.harel.org.il           |who's left.
> 


