convert old DOS Hebrew encoding

Tzadik Vanderhoof tzadik.vanderhoof at gmail.com
Fri Nov 19 15:21:11 IST 2010


Just wanted to close the loop on this and let you know that I published a
CPAN module for this extended DOS encoding of Hebrew (including vowels and
dageshim).  Here it is:

http://search.cpan.org/~tzadikv/Encode-DosHebrew-0.51/lib/Encode/DosHebrew.pm

The encoding table itself is at the end of the code, in hex.  The first
hex pair on each line is the 8-bit DOS Hebrew byte.  The remaining hex
pairs are the Unicode equivalent.

Example:

*e1 d1 bc* beis

means that the byte *e1* (hex) in the DOS encoding converts to the two
Unicode code points 05*d1* 05*bc*, i.e. U+05D1 U+05BC (the leading "05" of
the Hebrew block is implied).  The last part is just a comment that this
represents a beis with a dagesh (oops, sorry for the Ashkenaz
transliteration).
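
For anyone who wants to read the table outside of Perl, here is a rough
Python sketch of how one line in the format described above could be parsed.
It is not taken from the module. The function name parse_table_line and the
rule "leading two-digit hex fields are data, everything after is the
comment" are my own assumptions, so a comment that happens to start with a
two-letter hex-looking word would be misread.

```python
import string

HEBREW_BLOCK = 0x0500  # the implied leading "05" of each code point


def parse_table_line(line):
    """Split a table line such as 'e1 d1 bc beis' into its three parts.

    Returns (dos_byte, unicode_text, comment). Hypothetical helper, not
    part of Encode::DosHebrew.
    """
    fields = line.split()
    dos_byte = int(fields[0], 16)  # first hex pair: the 8-bit DOS byte

    # Assumption: every following two-digit hex field is the low byte of a
    # code point in the Hebrew block; the first non-hex field starts the
    # free-text comment.
    low_bytes = []
    for field in fields[1:]:
        if len(field) == 2 and all(c in string.hexdigits for c in field):
            low_bytes.append(int(field, 16))
        else:
            break

    comment = " ".join(fields[1 + len(low_bytes):])
    text = "".join(chr(HEBREW_BLOCK | b) for b in low_bytes)
    return dos_byte, text, comment


dos_byte, text, comment = parse_table_line("e1 d1 bc beis")
print(hex(dos_byte), [f"U+{ord(c):04X}" for c in text], comment)
# prints: 0xe1 ['U+05D1', 'U+05BC'] beis
```

Note that the asterisks in the example line above are only emphasis from the
mail formatting, so the sketch expects the line without them.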

I am still looking for more information about the origin of this encoding
and what programs used it, so I can add that to the documentation.

Tzadik