Hebrew spell-checking in OpenOffice

Tue Nov 2 14:44:20 IST 2010

On Tue, Nov 02, 2010, Lior Kaplan wrote about "Re: Hebrew spell-checking in OpenOffice":
> I've double checked this, and Debian doesn't include a tool needed for
> building the hunspell target. See
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=602189

I see :(

However, if we're talking about the OpenOffice package, not the Debian
package, you're not really constrained by what is available on Debian.

> The solution to #66939 was providing the dictionary in UTF8 instead of
> iso-8859-8 encoding. The "freeze" by oo.org was actually a conversion to
> UTF-8.
> 
> Hunspell might be faster than myspell, but the difference is minor compared
> to the UTF8 conversion. At the moment oo.org loads the dictionary in less
> than 1 sec.

Oh, sorry. I guess I misremembered.

You're right. I checked on my system, and the myspell-format dictionary takes
0.3 seconds to load, while the hunspell-format one takes 0.1 seconds. Not a
dramatic difference. The uncompressed size of the hunspell format is half
that of myspell - again, not dramatic. I think the difference in memory use
is more dramatic (9 MB vs. 36 MB in a test I just did).

I never understood the UTF-8 problem, by the way. Was this bug ever fixed?
No encoding conversion should ever have been this slow. Even if they had
piped the file through an external "iconv" process, it would still have been
100 times faster ;-)
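
To give a rough feel for how cheap such a conversion ought to be, here is a
minimal Python sketch that re-encodes an ISO-8859-8 word list as UTF-8 and
times it. This is only an illustration, not what oo.org actually does: the
function names and the synthetic word list (one Hebrew word repeated 300,000
times) are my own assumptions, standing in for a real dictionary file.

```python
import time


def convert_iso8859_8_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-8 bytes as UTF-8 in memory."""
    return data.decode("iso-8859-8").encode("utf-8")


def make_sample_wordlist(n_words: int) -> bytes:
    """Build a synthetic word list (hypothetical stand-in for a dictionary).

    Each line holds the same four-letter Hebrew word, encoded as ISO-8859-8.
    """
    word = "שלום".encode("iso-8859-8")
    return (word + b"\n") * n_words


if __name__ == "__main__":
    sample = make_sample_wordlist(300_000)
    start = time.perf_counter()
    converted = convert_iso8859_8_to_utf8(sample)
    elapsed = time.perf_counter() - start
    print(f"{len(sample)} bytes -> {len(converted)} bytes in {elapsed:.4f} s")
```

The timing will of course depend on the machine and on the real dictionary's
size, so treat the printed number as a ballpark only.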

-- 
Nadav Har'El                        |    Tuesday, Nov  2 2010, 25 Heshvan 5771
nyh at math.technion.ac.il             |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |Classical music: music written by a
http://nadav.harel.org.il           |decomposing composer.