Emacs & Hebrew
Omer Zak
w1 at zak.co.il
Thu Jun 14 23:38:16 IDT 2012
Hello Eli,
Since I didn't say it before, I'll say it now.
First of all, it is GREAT that Emacs 24 has BiDi support.
None of my comments are meant to detract from this major achievement.
I am sorry for my part in leading you to feel that we detract in any way
from your achievement.
On Thu, 2012-06-14 at 23:04 +0300, Eli Zaretskii wrote:
> > From: Omer Zak <w1 at zak.co.il>
> > Date: Thu, 14 Jun 2012 22:27:22 +0300
> >
> > Once we start making BiDi rendering mode dependent upon nitpicking
> > details of the particular text displayed in a buffer, it is a losing
> > game. There are so many special cases, you are bound to lose some
> > pathological corner cases.
>
> You are, in effect, saying that there's no hope for Emacs to support
> programming languages, something it already does quite well.
Simply, have Emacs continue to support programming languages, just
without default BiDi support when editing such text.
Since it is good practice anyway to localize strings in separate *.po
files, I suggest that all of us invest our nitpicking skills in
specifying how to get Emacs to provide excellent support for BiDi
rendering of *.po files and similar ones (such as Android's
strings.xml).
> Because
> the same machinery that is used now to find comments and strings,
> fontify them correctly, and support various specialized commands for
> them -- the same machinery is what is needed to reorder those same
> comments and strings.
About strings, see above.
About comments, let's not bother. I don't think it is good practice
to write comments in any language other than English. If anyone
wants to do so, caveat emptor.
However, there is a question:
What if anyone needs to illustrate how a function handles Hebrew text
and because of this he needs to write a portion of the function's
comment in Hebrew?
Would it be reasonable to require the user to insert one of the
directionality special characters as a signal for Emacs to turn on
visual BiDi ordering mode in that comment (and turn it back off at
the comment's end as defined in that programming language)?
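To illustrate what such a signal could look like, here is a minimal
Python sketch. It assumes the marker is RIGHT-TO-LEFT EMBEDDING (U+202B)
closed by POP DIRECTIONAL FORMATTING (U+202C). The helper names has_rtl
and mark_rtl are made up for illustration, and nothing here describes
how Emacs itself would act on the markers.

```python
import unicodedata

RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING (assumed choice of marker)
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING, closes the embedding


def has_rtl(text):
    """Return True if any character has a strong right-to-left bidi class."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


def mark_rtl(text):
    """Wrap text in RLE...PDF, but only if it contains right-to-left characters."""
    return RLE + text + PDF if has_rtl(text) else text
```

A comment fragment without Hebrew passes through unchanged, so the extra
characters would appear only where the user actually needs them.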
You ignored other places where Hebrew text might appear in programming
languages:
- Regular expressions in Perl (and also in Python).
- Identifiers, in languages which allow non-ASCII characters in
identifiers.
> Maybe I'm missing something, but then please give specific examples
> why you think this is a losing game.
- Strings may have various and differing formatting characters/phrases
(C, LISP and FORTRAN have their differing formatting languages).
- HTML/XML fragments - can be part of either strings or comments.
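To make the formatting-character problem concrete, the following Python
sketch lists the Unicode bidirectional class of each character in a
string. The helper name bidi_classes and the sample string are my own
illustration. A directive such as %d mixes a weak character ('%', class
ET) with a strong left-to-right letter, so it does not simply follow the
Hebrew text around it.

```python
import unicodedata


def bidi_classes(text):
    """Return the Unicode bidirectional class of every character, in logical order."""
    return [unicodedata.bidirectional(ch) for ch in text]


# A C-style format directive followed by a Hebrew word (written with
# escapes so the source itself stays unambiguous in any editor).
sample = "%d \u05e7\u05d1\u05e6\u05d9\u05dd"
print(bidi_classes(sample))
```

Each formatting language has its own directives, so the mix of classes
inside a string differs between C, LISP and FORTRAN.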
> > How, for example, should we handle a Perl code snippet having Nadav
> > Har'EL's example, which is embedded in Hebrew text (say, a chapter in a
> > Perl textbook written in Hebrew)?
>
> By putting text properties on each portion of that text guiding
> the reordering engine which portions to reorder and with what base
> embedding direction. Typically, a textbook with embedded code
> snippets has those snippets marked by some markup, like @code..@end code
> in Texinfo; these can be used to place the text properties as required.
I agree with this approach.
With time, algorithms for automatic derivation of text directionality
handling for each programming language will become better and better.
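As a rough sketch of the marking step, the Python below tags each line
of a document with a base direction, treating everything between two
markup lines as a code snippet. The marker strings, the "LTR"/"RTL"
labels and the function name split_by_direction are assumptions made for
illustration. In Emacs the result would be text properties on the
buffer, not a list of pairs.

```python
def split_by_direction(lines, start="@example", end="@end example"):
    """Return a list of (base_direction, line) pairs.

    Lines between the start and end markers are tagged "LTR" (code);
    all other lines are tagged "RTL" (the surrounding Hebrew prose).
    The marker lines themselves are dropped from the result.
    """
    in_code = False
    result = []
    for line in lines:
        stripped = line.strip()
        if stripped == start:
            in_code = True
            continue
        if stripped == end:
            in_code = False
            continue
        result.append(("LTR" if in_code else "RTL", line))
    return result
```

This handles only the easy case where the markup is explicit. It says
nothing about the pathological cases raised in the questions that
follow.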
Questions/points:
- How will we let the programmer override, for one place in one file,
the automatic derivation as needed to deal with pathological cases?
- How reasonable will it be to ask the programmer to insert extra
characters into the text just to get it correctly rendered?
> In any case, even if the use case you describe is hard to handle, it
> doesn't yet mean that we shouldn't handle simpler cases. Emacs is a
> programmer's editor, so rendering correctly a program source code is
> something that it should do well, even if it has problems with code
> embedded in plain text. If MS Studio does it, it would be a shame if
> Emacs didn't, don't you think?
Strings are not such simple cases (see above).
Also, in several cases, the correct way to render a piece of text
depends upon various external considerations, of which there is no
reasonable way to make Emacs aware.
Hence, my preference is toward making it easy for the user to see the
text in both logical and visual ordering (not necessarily at the same
time).
> > The solution that I propose is:
> > - Turn off BiDi by default in all programming language major modes.
>
> That doesn't provide solution for displaying comments and strings in a
> legible form.
See my comment above about the use of *.po files.
> > In both cases, provide an easy way to display a marked text snippet in
> > the opposite BiDi rendering mode.
>
> Isn't that what I describe above? And if so, what are we exactly
> arguing about? is that about who marks the text to be reordered, where
> I think Emacs should know that automagically, while you want to
> place that burden on the user?
My claim is that the computer will get it wrong anyway, frequently
enough that we need to provide the user with an easy way to override the
computer's way of rendering text.
Unless priorities have changed without my knowing about it, Emacs is
an editor. It is not meant to be a WYSIWYG type editor. It is not
meant to be a text viewer (such as a Web browser). It must make it easy
to edit the text, not necessarily render it in some final form.
And when programming, there are enough cases in which it is easier to
edit BiDi text displayed in logical order (not reordered) rather than
visual order.
> > - When the default is to turn off the BiDi mode, the display of text
> > after BiDi rendering can be an uneditable pop-up window.
> > - When the default is to turn on BiDi mode, then when the user wants to
> > see the text rendered in logical ordering (without BiDi), he should be
> > able to edit it in this mode (and with an easy way to insert
> > directionality modifying/overriding special characters) - I expect it to
> > be used to clean up places where the BiDi rendering engine messed up the
> > text.
>
> These are separate features, not directly related to display of
> program source code. They can be easily implemented, if the consensus
> is that they are needed, since the infrastructure for that already
> exists in Emacs. By contrast, selectively reordering portions of a
> buffer is not yet possible, so that is where my work will happen in
> the near future.
OK, infrastructure to allow a buffer to be displayed in mixed logical +
visual renderings will be helpful.
--- Omer
--
MS-Windows is the Pal-Kal of the PC world.
My own blog is at http://www.zak.co.il/tddpirate/
My opinions, as expressed in this E-mail message, are mine alone.
They do not represent the official policy of any organization with which
I may be affiliated in any way.
WARNING TO SPAMMERS: at http://www.zak.co.il/spamwarning.html