Emacs & Hebrew

Omer Zak w1 at zak.co.il
Thu Jun 14 22:27:22 IDT 2012


The discussion below reminds me of the Worse Is Better debate (see, for
example, http://www.codinghorror.com/blog/2004/08/worse-is-better.html).

Once we start making the BiDi rendering mode depend on nitpicking
details of the particular text displayed in a buffer, it is a losing
game.  There are so many special cases that you are bound to miss some
pathological corner cases.

And according to Behdad Esfahbod, the very Unicode BiDi algorithm itself
fails to correctly handle all kinds of corner cases in rendering Arabic
text.

How, for example, should we handle a Perl code snippet having Nadav
Har'EL's example, which is embedded in Hebrew text (say, a chapter in a
Perl textbook written in Hebrew)?
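
To see why that snippet (quoted further down) gets reordered, it helps
to look at the Unicode bidirectional class of each of its characters.
Here is a minimal Python sketch using the standard unicodedata module;
the helper name bidi_classes is made up for illustration and is not
part of Emacs or of any BiDi library.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, bidi class) pairs for every character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Nadav's example s/aleph/bet/, written with escapes so that this
# listing itself is not reordered by a bidi-aware viewer.
snippet = "s/\u05d0/\u05d1/"

for ch, cls in bidi_classes(snippet):
    print(f"U+{ord(ch):04X} {cls}")
```

The slashes are class CS (a weak separator) and the two Hebrew letters
are class R.  The slash between the letters has strong right-to-left
characters on both sides, so the algorithm resolves it to right-to-left
as well, and aleph, slash, bet become a single right-to-left run that
is drawn reversed.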

The solution that I propose is:
- Turn off BiDi by default in all programming language major modes.
- Turn on BiDi by default in major modes which handle text.

In both cases, provide an easy way to display a marked text snippet in
the opposite BiDi rendering mode.
- When the default is to turn off the BiDi mode, the display of text
after BiDi rendering can be an uneditable pop-up window.
- When the default is to turn on BiDi mode, then when the user wants to
see the text rendered in logical ordering (without BiDi), he should be
able to edit it in this mode (and with an easy way to insert
directionality modifying/overriding special characters) - I expect it to
be used to clean up places where the BiDi rendering engine messed up the
text.
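
As a rough illustration of what inserting directionality characters
could look like, here is a Python sketch that puts a LEFT-TO-RIGHT MARK
(U+200E) after every run of right-to-left characters.  The function
name add_lrm_after_rtl_runs is hypothetical, and the sketch assumes a
left-to-right paragraph direction.  It is meant for producing a display
copy of a code fragment, not for changing the source: a mark inserted
inside a regular expression or a string becomes part of the program.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK, an invisible strong left-to-right character
RTL_CLASSES = {"R", "AL"}  # strong right-to-left classes (Hebrew, Arabic)


def add_lrm_after_rtl_runs(text):
    """Return a copy of text with an LRM after each run of RTL characters.

    Intended for display copies of code fragments only.
    """
    out = []
    prev_rtl = False
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "NSM":
            # Combining marks (e.g. Hebrew points) stay attached to their base.
            out.append(ch)
            continue
        is_rtl = cls in RTL_CLASSES
        if prev_rtl and not is_rtl:
            out.append(LRM)  # close the RTL run before the next character
        out.append(ch)
        prev_rtl = is_rtl
    if prev_rtl:
        out.append(LRM)  # close a run that reaches the end of the text
    return "".join(out)


display_copy = add_lrm_after_rtl_runs("s/\u05d0/\u05d1/")
print([f"U+{ord(c):04X}" for c in display_copy])
```

With the marks in place, each slash has a strong left-to-right
character on at least one side, so it takes the paragraph direction and
the two Hebrew letters stay in their logical positions.  The sketch is
not suitable for Hebrew prose, where it would break up multi-word
right-to-left phrases.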

--- Omer


On Thu, 2012-06-14 at 19:58 +0300, Eli Zaretskii wrote:
> > Date: Thu, 14 Jun 2012 07:36:21 +0300
> > From: Nadav Har'El <nyh at math.technion.ac.il>
> > Cc: linux-il at cs.huji.ac.il
> > 
> > Like I said, try the example Perl instruction of changing aleph into
> > bet:
> > 
> > 	s/א/ב/
> > 
> > See how the Bidi algorithm makes it appear as if we're changing the
> > other way around - bet into aleph. I honestly don't see what kind of
> > base direction choice or other high-level protocol can "save" this
> > case. I have to admit I didn't try it on Emacs 24 (which I don't have
> > yet), but did try it on other bidi-capable programs. Maybe there is a
> > workaround in this case, and you can educate me?
> 
> This snippet will indeed be rendered confusingly by any bidi-aware
> application (including Emacs 24.1).  But the solution is not to turn
> off the reordering wholesale.  That's because you do want the
> reordering here:
> 
>     $word =~ s/ה$/h/o if $opts{"שמור_מפיק"};
> 
> and here:
> 
>   # TODO: resolve the following linguistic question: Is there a difference
>   # in the pronunciation and spelling of בדוחפם (when they push, bdoxpam)
>   # and לדחופם (to push them, lidxpam)? The first is an subjectization of
>   # לדחוף, and the second is a objectization. I do *not* know if the above
>   # differentiation is valid or correct, and failed to find references to
>   # support my gut feeling. Thus, on the mean while, I produce a waw-less
>   # form, as done by rav-millim.
> 
> So what is needed is _selective_ reordering: some portions of the
> buffer need to be reordered, while others need to stay in their
> original logical order.
> 
> For buffers that hold source code in some programming language, the
> parts to reorder are comments and strings.  Precisely what is a
> comment or a string is something each major mode should determine --
> and they all do already.  For markup languages such as HTML and XML,
> these are labels and perhaps other things.  It's possible that there
> are other types of non-plain text that need similar treatment; I need
> feedback for making sure all the possibilities are covered.
> 
> Emacs currently lacks some fundamental infrastructure to support such
> selective reordering, but it is on my todo list, and I will publish
> the basic design principles in the near future.  However, my ideas of
> which infrastructure is needed and how to expose it to Lisp
> applications are based on a very limited number of use cases, most of
> them from my daily experience, where I almost never use Hebrew.  I
> cannot be sure my experience covers enough turf to be a solid basis
> for supporting non-plain text.
> 
> This is where this community should enter the picture.  When you find
> a use case where the default reordering doesn't do a good enough job,
> don't just disable reordering.  Instead, report those cases as bugs or
> missing features, and if you can, send suggestions for how to solve
> this, so that you and others will be happy.  If you just disable
> reordering, we will never be able to get it better.
> 
> Thanks in advance.
-- 
MS-Windows is the Pal-Kal of the PC world.
My own blog is at http://www.zak.co.il/tddpirate/

My opinions, as expressed in this E-mail message, are mine alone.
They do not represent the official policy of any organization with which
I may be affiliated in any way.
WARNING TO SPAMMERS:  at http://www.zak.co.il/spamwarning.html



