Preparing to convince to shift to non-proprietary document formats

Dotan Cohen dotancohen at gmail.com
Mon Feb 20 17:15:45 IST 2012


On Mon, Feb 20, 2012 at 10:40, Nadav Har'El <nyh at math.technion.ac.il> wrote:
> On Sun, Feb 19, 2012, Dotan Cohen wrote about "Re: Preparing to convince to shift to non-propriety documents formats":
>> Undocumented? Which file format is that? All the .doc and .docx
>> formats are documented, even the older binary formats.
>
> Where is the ".doc" format documented?
>
> I once wrote a tool to extract the text in MS Office files (for a search
> engine). It was a really annoying reverse-engineering-like
> trial-and-error process, and I could hardly find any documentation.
> The PowerPoint format (.ppt) was particularly odd.
>
> What documentation do you refer to?
>

Here are the pre-2007 formats:
http://msdn.microsoft.com/en-us/library/ff381461.aspx

And here are the current versions:
http://msdn.microsoft.com/en-us/library/cc313118.aspx
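
As a rough illustration of the split between the two families of formats linked above, the sketch below tells them apart by their leading bytes. It assumes the pre-2007 binary files are OLE2 compound files and the 2007-and-later files are ZIP containers. The function name sniff_office_format is made up for this example, and the check says nothing about whether a file is Word, Excel, or PowerPoint, or whether a ZIP file is an Office document at all.

```python
# Signature of an OLE2 compound file, the container assumed here for
# the pre-2007 binary formats (.doc, .xls, .ppt).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

# Local file header signature of a ZIP archive, the container assumed
# here for the 2007+ formats (.docx, .xlsx, .pptx).
ZIP_MAGIC = b"PK\x03\x04"


def sniff_office_format(data: bytes) -> str:
    """Classify a file by its first bytes (hypothetical helper)."""
    if data.startswith(OLE2_MAGIC):
        return "ole2-binary"
    if data.startswith(ZIP_MAGIC):
        return "zip-container"
    return "unknown"
```

To try it on a real file, read the first eight bytes in binary mode and pass them in: sniff_office_format(open(path, "rb").read(8)), where path is whatever file you want to check.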


-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


