Linux HTML mail agent with RTL and LTR paragraph explicit support
Dotan Cohen
dotancohen at gmail.com
Mon Jun 25 18:21:25 IDT 2012
Shachar, before addressing the issue at hand, I would like to state
an observation. When I reply to your mail, all text is of the same
quote level. That is, there is a single > at the beginning of each
line, whether it is a line that you wrote or a line that I wrote.
Obviously, I am replying to the HTML portion of your multipart
message, not the plain text portion. My mailer (Gmail) does not know
that blockquote type="cite" means that the text is a quote. Why should
it? Is that a standard (I don't know, it might be)? This is a good
argument against HTML mail.
You can tell me that my mailer (Gmail) is broken. But remember that
Gmail is now no less a de facto standard mailer than Outlook once
was, and that you advocate compatibility with Outlook based on its
de facto standard status.
I've manually fixed the nesting below:
On Mon, Jun 25, 2012 at 5:19 PM, Shachar Shemesh <shachar at shemesh.biz> wrote:
>> On 06/25/2012 01:42 PM, Dotan Cohen wrote:
>>> On Mon, Jun 25, 2012 at 8:06 AM, Shachar Shemesh <shachar at shemesh.biz>
>>> wrote:
>>>
>>> I disagree completely. The embedding control characters are designed for,
> well, embedding.
>
>> Correct.
>
> Good. But
>
>> As plain text has no concept of a paragraph,
>
> Well, that really depends on what you mean by "plain text".
Plain text is a sequence of bytes in a standard encoding which may or
may not begin with a BOM and is designed to be read in a text editor.
A text editor is a program that reads a sequence of bytes and, using a
table commonly referred to as an encoding, displays the corresponding
character glyphs on the screen.
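To make that definition concrete, here is a minimal sketch of the decoding step, assuming the encoding is UTF-8 (my assumption for illustration; the definition above does not name one). The function name read_plain_text is hypothetical.

```python
import codecs


def read_plain_text(raw: bytes) -> str:
    """Decode UTF-8 bytes into text, dropping a leading BOM if one is present."""
    # The "utf-8-sig" codec skips an initial UTF-8 BOM and otherwise
    # behaves like plain "utf-8".
    return raw.decode("utf-8-sig")


# The same Hebrew word, once with a BOM and once without.
with_bom = codecs.BOM_UTF8 + "שלום".encode("utf-8")
without_bom = "שלום".encode("utf-8")

# Both byte sequences decode to the same characters.
same_text = read_plain_text(with_bom) == read_plain_text(without_bom)
```

Either way the editor ends up with the same sequence of characters; the BOM only affects how the bytes are recognized, not what is displayed.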
> RLE/PDF are
> defined by the UBA (Unicode BiDi Algorithm), and it, clearly, does have a
> concept of a paragraph.
>
Are you referring to the use of linefeeds to designate the end of an
embedded section?
>> using \n, \n\n,
>> \r\n, \r\n\r\n, or any other convention for a paragraph is arbitrary.
>
> Technically true, but both irrelevant and misleading. Misleading because the
> choice of \n or \r\n was arbitrary, but is now standard. Irrelevant because
> we are talking about the UBA, not "plain text" (whatever that means).
>
I see that you are. Fine, I was unaware that the UBA calls that a
paragraph, though I do know that embedded sections end at newlines.
Let us agree, then, that sections of text separated by newlines are
paragraphs, as that is where the embedded sections end.
>> So if any arbitrary part of the text is to be RTL (no matter if the
>> user calls it a paragraph or not) then it is to be marked as an
>> embedded RTL section.
>
> This is incorrect. It does not matter much what the user calls a paragraph,
> but if the text editor calls a certain run a paragraph, then that is the
> case.
>
Alright.
> You make it sound as if, in the sequence "something <RLE> more something \n
> even more <PDF>", the third part, saying " even more" will have an RTL
> level. That will simply not be the case with any UBA conforming text editor,
> as UBA specifically says that any embedding levels are reset when the
> paragraph is terminated. This is because the embedding controls are embedded
> in the paragraph.
>
> In other words, a paragraph is a paragraph, with BiDi direction, and
> embedding is embedding. The two are not the same.
>
So we have established that sections of text separated by newlines are
paragraphs. Let us return to the issue. In a plain text file, as
defined above, there does exist a method by which the author of the
file may specify that a paragraph is to be RTL. Therefore there is no
need for HTML to send RTL emails, nor is there a technical need for
the email client to guess. However, I agree that there is a practical
need for the email client to guess, as many users may not mark RTL
paragraphs as RTL (be they plain text or HTML).
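
To illustrate the paragraph discussion, here is a rough sketch, not a conforming UBA implementation. It assumes that paragraphs end at characters of bidi class B (which includes \n and \r), that the base direction comes from the first strong character (a simplified reading of the UBA paragraph rules), and that a leading RLM is one way for the author to mark a paragraph as RTL. The function names are hypothetical, and the depth counter is only a nesting count, not the real UBA level numbers.

```python
import unicodedata

RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING
RLM = "\u200f"  # RIGHT-TO-LEFT MARK (a strong RTL character)


def split_paragraphs(text):
    """Split text after each paragraph separator (bidi class B)."""
    paragraphs, start, i = [], 0, 0
    while i < len(text):
        if unicodedata.bidirectional(text[i]) == "B":
            # Treat CR LF as a single separator.
            if text[i] == "\r" and text[i + 1:i + 2] == "\n":
                i += 1
            paragraphs.append(text[start:i + 1])
            start = i + 1
        i += 1
    if start < len(text):
        paragraphs.append(text[start:])
    return paragraphs


def base_direction(paragraph):
    """Simplified: the first strong character decides; default is LTR."""
    for ch in paragraph:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "LTR"
        if cls in ("R", "AL"):
            return "RTL"
    return "LTR"


def embedding_depths(paragraph):
    """Nesting depth after each character, starting from 0 per paragraph."""
    depth, depths = 0, []
    for ch in paragraph:
        cls = unicodedata.bidirectional(ch)
        if cls in ("LRE", "RLE", "LRO", "RLO"):
            depth += 1
        elif cls == "PDF" and depth > 0:
            depth -= 1  # an unmatched PDF is simply ignored here
        depths.append(depth)
    return depths


# Shachar's example: "something <RLE> more something \n even more <PDF>"
text = "something " + RLE + "more something\n even more" + PDF
first, second = split_paragraphs(text)

first_max = max(embedding_depths(first))    # the RLE is open in paragraph one
second_max = max(embedding_depths(second))  # nothing carries over the newline
plain_dir = base_direction(second)          # Latin text, so LTR
marked_dir = base_direction(RLM + second)   # a leading RLM makes it RTL
```

Because each paragraph is processed from a depth of zero, the RLE opened before the newline has no effect on " even more", while the RLM placed at the start of a paragraph does change that paragraph's direction.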
Have I forgotten anything?
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com