Odyssey of trying to convert several utf-8 encoded text files into MS-Word *.doc files

Amos Shapira amos.shapira at gmail.com
Sun Nov 27 05:27:45 IST 2011


Hi Omer,

Try pandoc (http://johnmacfarlane.net/pandoc/)?

I haven't played with it (just heard about it being used by some Wiki
software and filed it away in my bookmarks), but one of its claims to fame
is "convert markdown to ... OpenDocument... and ODT".

It won't solve the whole original problem but might be better than going
through HTML.

--Amos
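
For reference, a minimal and untested sketch of what driving pandoc over several files might look like. It assumes pandoc is installed and on the PATH, that the text files are close enough to markdown to be read as such, and that ODT is an acceptable intermediate format (this does not produce *.doc, so a further conversion step would still be needed). The helper names pandoc_command and convert_all are made up for illustration.

```python
import subprocess
import sys
from pathlib import Path


def pandoc_command(src, out_dir):
    """Build the pandoc argument list for one input file (hypothetical helper).

    Assumes the input can be read as markdown and that an .odt file named
    after the input is wanted in out_dir.
    """
    src = Path(src)
    target = Path(out_dir) / (src.stem + ".odt")
    return [
        "pandoc",
        "--from", "markdown",
        "--to", "odt",
        "--output", str(target),
        str(src),
    ]


def convert_all(paths, out_dir):
    """Run pandoc once per input file; raises CalledProcessError on failure."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for path in paths:
        subprocess.run(pandoc_command(path, out_dir), check=True)


if __name__ == "__main__":
    # Usage (illustrative): python to_odt.py OUT_DIR file1.txt file2.txt ...
    convert_all(sys.argv[2:], sys.argv[1])
```

Building the argument list in a separate function keeps the command easy to inspect before anything is actually run.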

On 27 November 2011 10:22, Omer Zak <w1 at zak.co.il> wrote:

> Hello Matan,
> Thanks for the suggestion.
>
> The unoconv version 0.3-6 package in Debian Squeeze still depends upon
> some OpenOffice packages, which conflict with some LibreOffice packages,
> which I don't want to uninstall.
>
> However, I do have the Python uno module installed.
> I was referred by LibreOffice's help (search term 'API', item
> 'Programming LibreOffice') to http://api.openoffice.org/ and from there
> found a code snippet for importing a plain text encoded with a given
> character set at
>
> http://codesnippets.services.openoffice.org/Writer/Writer.ImportingAPlainTextEncodedWithAGivenCharacterSet.snip
> However, the code itself is missing - it seems to have been removed at
> some point (never mind the fact that it is said to be written in OOBasic
> rather than in Python/Java/CPP).
>
> Yet another possibility is to write a script in OOBasic, to be invoked
> from within LibreOffice.
>
> At the moment, conversion into HTML seems to be easier for me (I found
> that AbiWord can import an HTML file with suitable font declarations,
> and set the font correctly) - as it doesn't require me to cram yet
> another programming language (one less yak to shave!).
>
> --- Omer
>
>
> On Sun, 2011-11-27 at 00:31 +0200, Matan Ziv-Av wrote:
> > On Sun, 27 Nov 2011, Omer Zak wrote:
> >
> > > I need to convert several utf-8 encoded text files into MS-Word *.doc
> > > format, so I need to accomplish it from the command line.
> > > In Linux, it is easy to find tools to convert from MS-Word formats into
> > > text, but a Google search failed to yield converters in the opposite
> > > direction.
> > >
> > > I tried two word processors:  LibreOffice and AbiWord.
> > >
> > > LibreOffice (version 1:3.4.3-3~bpo60+1 in Debian Squeeze - yes, it's a
> > > backport) allows you to perform the conversion, but you must go through
> > > a GUI.  I did not find instructions on how to accomplish this from the
> > > command line.
> >
> > LibreOffice has a headless mode, where it accepts commands from a pipe.
> > unoconv is a nice command line front end for this mode.
> >
> > LibreOffice also has a --convert-to option. Try libreoffice --help to see
> > all options.
>
> --
> Did you shave a yak today?
> My own blog is at http://www.zak.co.il/tddpirate/
>
> My opinions, as expressed in this E-mail message, are mine alone.
> They do not represent the official policy of any organization with which
> I may be affiliated in any way.
> WARNING TO SPAMMERS:  at http://www.zak.co.il/spamwarning.html
>
>
> _______________________________________________
> Linux-il mailing list
> Linux-il at cs.huji.ac.il
> http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
>
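
The --convert-to suggestion quoted above could be scripted roughly as follows. This is an untested sketch, assuming the installed LibreOffice accepts --headless, --convert-to and --outdir in this form (check libreoffice --help for the exact spelling in your version) and that the binary is called libreoffice. The helper name libreoffice_command is hypothetical. The thread does not settle whether the plain-text import will pick up UTF-8 on its own, so the result should be checked on a file containing non-ASCII text.

```python
import subprocess
import sys


def libreoffice_command(sources, out_dir, target_format="doc",
                        binary="libreoffice"):
    """Build a headless conversion command (hypothetical helper).

    Assumption: the installed LibreOffice understands --headless,
    --convert-to and --outdir as spelled here.
    """
    return [
        binary,
        "--headless",
        "--convert-to", target_format,
        "--outdir", out_dir,
        *sources,
    ]


if __name__ == "__main__":
    # Usage (illustrative): python to_doc.py OUT_DIR file1.txt file2.txt ...
    subprocess.run(libreoffice_command(sys.argv[2:], sys.argv[1]), check=True)
```

All input files are passed in one invocation, so the office suite is started only once for the whole batch.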