Preparing to convince to shift to non-proprietary document formats
Dotan Cohen
dotancohen at gmail.com
Mon Feb 20 17:15:45 IST 2012
On Mon, Feb 20, 2012 at 10:40, Nadav Har'El <nyh at math.technion.ac.il> wrote:
> On Sun, Feb 19, 2012, Dotan Cohen wrote about "Re: Preparing to convince to shift to non-proprietary document formats":
>> Undocumented? Which file format is that? All the .doc and .docx
>> formats are documented, even the older binary formats.
>
> Where is the ".doc" format documented?
>
> I once wrote a tool to extract the text in MS Office files (for a search
> engine). It was a really annoying reverse-engineering-like
> trial-and-error process, and I could hardly find any documentation.
> The PowerPoint format (.ppt) was particularly odd.
>
> What documentation do you refer to?
>
Here are the pre-2007 formats:
http://msdn.microsoft.com/en-us/library/ff381461.aspx
And here are the current versions:
http://msdn.microsoft.com/en-us/library/cc313118.aspx
--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com