Wrote a little add() function to clean up the duplication of
add_document().
Delete the index directory if it exists and we're being forced to
recreate it.
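
A minimal sketch of that delete-then-recreate logic, using only the standard library. The function name `prepare_index_dir` and its `recreate` flag are hypothetical stand-ins, not the names used in the package.

```python
import os
import shutil


def prepare_index_dir(directory, recreate=False):
    """Return True if a fresh index has to be built in `directory`.

    Hypothetical helper: the name and signature are assumptions.
    """
    if recreate and os.path.exists(directory):
        # Forced rebuild: throw away whatever index was there before.
        shutil.rmtree(directory)

    if not os.path.exists(directory):
        # Nothing on disk (or it was just deleted), so start from scratch.
        os.makedirs(directory)
        return True

    # The directory survived, so an existing index can be reopened.
    return False
```

The boolean return lets the caller decide whether to create a new index or open the one already on disk.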
English fuzzy matches are preferred, followed by Roomaji and then
everything else.
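
Roughly, the ordering could be expressed as a sort key like the one below. This is a sketch under assumptions: the `(name, language, score)` tuple shape, the `LANGUAGE_PRIORITY` table, and the choice to let language outrank score are all made up for illustration.

```python
# Hypothetical priority table: lower sorts first.
# Any language not listed here ties for last place.
LANGUAGE_PRIORITY = {"English": 0, "Roomaji": 1}
OTHER_LANGUAGE = 2


def rank_matches(matches):
    """Sort (name, language, score) tuples by language preference.

    Within one language, higher scores come first (assumed tie-breaker).
    """
    def sort_key(match):
        name, language, score = match
        return (LANGUAGE_PRIORITY.get(language, OTHER_LANGUAGE), -score)

    return sorted(matches, key=sort_key)
```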
The return tuple from lookup() now has a `name` field for the actual
name that was matched.
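
To illustrate why the matched name is worth returning, here is a toy lookup over a plain dict. Only the `name` field comes from the change above; the `LookupResult` type, the `object` field, and this simplified `lookup()` signature are assumptions for the sketch.

```python
from collections import namedtuple

# Only `name` is taken from the commit; `object` is a placeholder field.
LookupResult = namedtuple("LookupResult", ["object", "name"])


def lookup(query, table):
    """Case-insensitive exact lookup over a {name: object} dict (toy version)."""
    wanted = query.strip().lower()
    return [
        LookupResult(obj, name)
        for name, obj in table.items()
        if name.lower() == wanted
    ]
```

The caller typed one thing and the index matched another, so `result.name` is the only reliable way to show what was actually hit.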
The setup command loads the default data into a default location, then
creates a whoosh index in a default location.
get_index is now open_index and can be made to explicitly recreate the
index. It also actually opens the index if it already exists, even
across processes, now that FileStorage is working.
The lookup command takes no switches for aiming at a different database;
it only uses the default data stores.
csvimport is now load; csvexport is now dump.
Both take an optional -e switch to specify an engine, but will happily
use a default SQLite database in the pokedex package directory.
Additionally, the CSV directory is now controlled by the optional -d
switch, and defaults to Doing The Right Thing.
So `pokedex load` now does exactly what you'd expect: loads the data
from the right files into a consistently-located database.
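
A sketch of how the two optional switches might be wired up, using argparse for brevity (not necessarily what the package uses). The `build_parser` helper, the `dest` names, and the default file locations under the package directory are hypothetical; only the `load`/`dump` commands and the `-e`/`-d` switches come from the change above.

```python
import argparse
import os


def build_parser(package_dir):
    """Build a parser for `pokedex load` and `pokedex dump` (hypothetical)."""
    # Assumed default locations; the real paths may differ.
    default_engine = "sqlite:///" + os.path.join(package_dir, "data", "pokedex.sqlite")
    default_csv_dir = os.path.join(package_dir, "data", "csv")

    parser = argparse.ArgumentParser(prog="pokedex")
    commands = parser.add_subparsers(dest="command")
    for name in ("load", "dump"):
        sub = commands.add_parser(name)
        # Both switches are optional and fall back to the package defaults.
        sub.add_argument("-e", dest="engine_uri", default=default_engine)
        sub.add_argument("-d", dest="directory", default=default_csv_dir)
    return parser
```

With no switches at all, `build_parser(...).parse_args(["load"])` ends up pointing at the default database and the default CSV directory.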
Good news: This no longer relies on InnoDB's default row order.
Bad news: InnoDB in MySQL 5.0 has a bug where it will sort rows
physically according to a secondary index, if there's a composite
primary key and a single-column index and the phase of the moon is
right. So a couple tables have been, once again, reordered -- but
correctly this time.
Good news: This bug will no longer fuck me up!
Whoosh's spelling module unfortunately ignores any "words" that don't
look like words, even though the algorithm works fine with arbitrary
input.
I had to clone some code from whoosh.spelling, but avoiding the
isalpha() check solved a bunch of problems. Now the index happily
compares against anything I feed into it.
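
The effect of that check is easy to see with plain Python. This sketch does not reproduce any whoosh code; `indexable_words` is a hypothetical helper that only mimics filtering on `str.isalpha()`.

```python
def indexable_words(words, require_alpha):
    """Return the words a spelling index would accept (toy model)."""
    if require_alpha:
        # Digits, spaces, apostrophes and hyphens all fail isalpha().
        return [word for word in words if word.isalpha()]
    return list(words)


names = ["Pikachu", "Porygon2", "Mr. Mime", "Farfetch'd", "Ho-Oh"]
```

With `require_alpha=True` only "Pikachu" survives; with the check dropped, all five names stay available for comparison.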