Reading RTF files

Alan Yaniger alan at tkos.co.il
Mon Jul 13 14:38:33 IDT 2009


The problem with OOo's RTF import of text boxes is a known bug:

see http://qa.openoffice.org/issues/show_bug.cgi?id=95665


This is part of a general problem with OOo's import of RTF drawing
objects, and is not restricted to Hebrew.


Alan


Ehud Karni wrote:

> On Fri, 10 Jul 2009 09:38:45 Micha Silver wrote:
>   
>> Ehud Karni wrote:
>>
>>     
>>> I use `catdoc', which works quite well (for both *doc and *rtf).
>>> `catdoc' is available as a package for Centos and Debian.
>>>       
>> Thanks for the tip, but I can't get any sensible output. I ran:
>>
>>  catdoc -a -d8859-8 invoice150711.rtf | fribidi --charset ISO8859-8 --width=80 --rtl
>>
>> and I get 188 empty lines. :-(
>>     
>
> After Micha sent me his RTF file, I found out that it contains
> "text boxes", not plain text.
>
> Using OpenOffice does not help; it shows an (almost) empty page.
>
> Filter your RTF with the sed command below; it will drop the boxes.
> Then run catdoc as above or use OpenOffice (I used `ooviewdoc'); both
> will show you the data (ooviewdoc preserves more of the original layout).
> BTW, when viewed with M$word, the filtered file shows empty boxes.
>
> Ehud.
>
>
> sed -e "s/{...do.dobxpage.dobypara.dodhgt8192.dptxbx.dptxbxmar0{/{/g"   \
>     -e "s/}.dpx[0-9]*.dpy[0-9]*.dpxsize[0-9]*.dpysize[0-9]*.dplinehollow0}/}/g"
>
>
>
> --
>  Ehud Karni           Tel: +972-3-7966-561  /"\
>  Mivtach - Simon      Fax: +972-3-7976-561  \ /  ASCII Ribbon Campaign
>  Insurance agencies   (USA) voice mail and   X   Against   HTML   Mail
>  http://www.mvs.co.il  FAX:  1-815-5509341  / \
>  GnuPG: 98EA398D <http://www.keyserver.net/>    Better Safe Than Sorry
>
>   
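
For anyone who wants to reuse the filter quoted above, here is a minimal
sketch that wraps the same two sed expressions in a shell function. The
function name strip_rtf_textboxes is made up for illustration. The patterns
are copied from Ehud's message, where each `.' presumably stands in for an
RTF backslash (or the `*' in `\*'). They were written against one particular
file, so other RTF files may well use different control words or values and
need adjusted patterns.

```shell
#!/bin/sh
# Hypothetical helper (the name is not from the thread).
# Reads RTF on stdin and writes filtered RTF on stdout.
strip_rtf_textboxes() {
    # 1st expression: replace the text-box header
    #   {\*\do\dobxpage\dobypara\dodhgt8192\dptxbx\dptxbxmar0{
    # with a plain "{", so the box contents become an ordinary group.
    # 2nd expression: replace the position/size trailer
    #   }\dpxN\dpyN\dpxsizeN\dpysizeN\dplinehollow0}
    # with a plain "}".
    # Single quotes keep the shell from touching the patterns.
    sed -e 's/{...do.dobxpage.dobypara.dodhgt8192.dptxbx.dptxbxmar0{/{/g' \
        -e 's/}.dpx[0-9]*.dpy[0-9]*.dpxsize[0-9]*.dpysize[0-9]*.dplinehollow0}/}/g'
}
```

Assuming catdoc and fribidi are installed, the filter can be placed in front
of the command Micha ran (filtered.rtf is just an example name):

    strip_rtf_textboxes < invoice150711.rtf > filtered.rtf
    catdoc -a -d8859-8 filtered.rtf | fribidi --charset ISO8859-8 --width=80 --rtl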



