Announce: Hspell 1.1

kobi zamir kobi.zamir at gmail.com
Fri Jan 1 13:38:07 IST 2010


Nadav and Dan:
It's great news to hear about the new release.
Thank you for the hard work and time you have put into this project over the years.
Without Hspell, Hebrew free software would not be what it is today.

Ely:
Dan suggested earlier: Luah HaShemot HaShalem and Luah HaP`alim
HaShalem by Shaul Bakali.
I will add: http://culmus.sourceforge.net/dictionary/index.html

Anyway, you are missing the point: *you* should do the things you
listed. All of them are doable, but doing them takes hard work and
time.

If you think these things are important, just do them. They are
possible to do, and when you start you will see that a lot of the work
has already been done by people in the community. Hspell is not just a
spell checker but also a grammatical analyzer that can tell you a word's
tense, number, type, gender and inflection ("hataye"). These options in
Hspell are a big step toward achieving your goal, and Culmus's dictionary
project is another. I'm sure you will find other sources when you start
the work. I hope to hear about your project when it has some working code
to show.

kobi

2010/1/1 Ely Levy <elylevy at cs.huji.ac.il>:
> I think it should be done in the following order:
> - If hspell doesn't have it, add for each word whether it's a verb,
> adjective and so on.
> - Grammatical analyzer - I saw a doc work that was released under GPL about
> it long ago.
> - Grammatical fixer (maybe better spelling suggestions based on grammar)
> - Independent of that, we need a list of words and their nikud (I also saw
> one in that doc work)
> - Nikud checker
> - Nakdan
>
> Does anyone know where a good place would be to start getting a word list
> with nikud?
> Or where is the doc work that made the grammatical analyzer?
>
> Ely
> On Fri, Jan 1, 2010 at 10:18 AM, Dan Kenigsberg <danken at cs.technion.ac.il>
> wrote:
>>
>> Who said anything about *few* rules? They are many, and are complex, and
>> have a gazillion exceptions. But they exist, and putting them into effect
>> in hspell's inflection scripts is doable, albeit requiring a lot of
>> meticulous work. The classical references for niqqud are Luah HaShemot
>> HaShalem and Luah HaP`alim HaShalem by Shaul Bakali. These tables include
>> all the rules and all the exceptions needed to add the correct niqqud to
>> Hebrew words.
>>
>> On Fri, Jan 01, 2010 at 02:02:21AM +0200, Ely Levy wrote:
>> > I can only talk from my own experience: I couldn't find any good source
>> > for rules about nikud and grammar in a simple form.
>> > I did find some GPLed work list with nikud, and I think I even talked to
>> > the people in mila.
>> > But no one could provide those few rules you are talking about.
>> > (And I'm still confused about the difference between old and modern
>> > grammar/nikud...)
>> >
>> > Ely
>> >
>> > On Thu, Dec 31, 2009 at 4:11 PM, Nadav Har'El
>> > <nyh at math.technion.ac.il> wrote:
>> >
>> > > On Thu, Dec 31, 2009, E L wrote about "Re: Announce: Hspell 1.1":
>> > > > I think the main problem is what needs to be done, and not the
>> > > > manpower to program it.
>> > > > If someone knows what rules grammar or nikud checkers should
>> > > > follow, I'm sure it won't be a big deal programming one.
>> > >
>> > > I beg to differ.
>> > >
>> > > First of all, most of the needed knowledge already exists, published
>> > > in numerous papers and books, and demonstrated by several pieces of
>> > > commercial software. One doesn't need to come with advanced knowledge
>> > > of the topic, any more than I had to be some spell-checking expert
>> > > before I started Hspell.
>> > > All one needs is a willingness to learn, and of course the
>> > > resourcefulness to put it into good use.
>> > >
>> > > Second, while the work on Hspell had a lot of very interesting
>> > > theoretical sides and problems to solve (in linguistics, language,
>> > > compression, etc.), most of the work was actually the mundane and
>> > > almost endless task of making lists of words (a task which you can
>> > > see, still isn't done 10 years after starting the project). For niqqud
>> > > checking, there is also a lot of similar mundane work that needs to be
>> > > done (writing the right niqqud for each word), and that takes a lot of
>> > > time.
>> > > For grammar checking, it depends what you call grammar: If you also
>> > > want to include semantics, and not just grammar - like Prof. Uzzi
>> > > Ornan did in his text-to-speech and niqqud research (and product) -
>> > > there's also tons of work that needs to be done on creating classes of
>> > > nouns, listing arguments of verbs, and so on. I guess you can start
>> > > with just grammar, though, and in this case, you're right - it should
>> > > be doable without too much data collection - so maybe this is indeed a
>> > > good project to start with.
>> > >
>> > > This is all very interesting work. Unfortunately, I do not see myself
>> > > starting it in the near future. If anyone is interested in taking a
>> > > shot at it, I'd love to advise - please contact me and/or Dan
>> > > privately.
>> > >
>> > > Nadav.
>> > >
>> > > --
>> > > Nadav Har'El                        |     Thursday, Dec 31 2009, 14 Tevet 5770
>> > > nyh at math.technion.ac.il          |-----------------------------------------
>> > > Phone +972-523-790466, ICQ 13349191 |I couldn't afford a cool signature, so I
>> > > http://nadav.harel.org.il           |just got this one.
>> > >
>>
>> > _______________________________________________
>> > Linux-il mailing list
>> > Linux-il at cs.huji.ac.il
>> > http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
>>
>>
>> --
>> Dan Kenigsberg        http://www.cs.technion.ac.il/~danken        ICQ 162180901


