34 commits

Author SHA1 Message Date
a_magical_me
e7c40a08af Speed up import pokedex.db slightly.
Importing pokedex can take several seconds due to its rather large
dependencies—in particular, sqlalchemy, whoosh, and pkg_resources seem
to be the largest offenders. Normally, it would be possible to import
only the submodules one needs (pokedex.db, say), but pokedex.__init__
brings in all the submodules, for use by the command-line interface.

The fix is rather obvious:

- Move the command-line stuff into pokedex.main.

  Note: because the submodules are no longer imported by default, any
  script which expects `import pokedex` to be useful will likely break.

  Note: the `pokedex` command will not work until you re-run `python
  setup.py develop`, to update entry_points.txt.

- Don't import pkg_resources until necessary.
2011-04-03 03:13:07 -07:00
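The "don't import until necessary" idea in the commit above can be illustrated with a function-level import. This is only a minimal sketch, not the code from the commit: `default_db_uri` is a hypothetical helper, and `posixpath` stands in for a genuinely slow dependency such as pkg_resources.

```python
def default_db_uri():
    """Return a default SQLite URI (hypothetical helper, not pokedex's API)."""
    # Deferred import: the cost is paid on the first call, not when the
    # enclosing module is imported. `posixpath` is a cheap stand-in for a
    # slow dependency such as pkg_resources.
    import posixpath

    return "sqlite:///" + posixpath.join("data", "pokedex.sqlite")


if __name__ == "__main__":
    print(default_db_uri())
```

Python caches modules in `sys.modules`, so repeated calls only pay for a dictionary lookup after the first import.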
Petr Viktorin
bb4861b8c6 Faster pokedex load for PostgreSQL #526
Also added the -S (--safe) option, which disables the backend-specific
optimizations.

This gives over 3× speedup on my machine :)
2011-03-29 17:42:48 +03:00
Petr Viktorin
497ba412b0 Speed tweaks for pokedex load in SQLite 2011-01-27 21:51:30 -08:00
Eevee
c1c1d8cb63 Added a big old construct-based pkm parser. #183 2010-06-17 21:47:44 -07:00
a_magical_me
febfb239fb Python 2.5 compatibility 2010-05-25 14:41:15 -07:00
a_magical_me
ffc30bff8f Factor out logic for finding the default db/index. #180
Note: `if not x:` has changed to `if x is not None:`, changing the
semantics slightly.  Shouldn't be a big issue.
2010-05-13 21:45:51 -07:00
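The semantic difference that commit note warns about shows up with values that are falsy but not `None`, such as an empty string. A small sketch, using hypothetical function names that do not appear in pokedex:

```python
def pick_by_truthiness(explicit, default):
    # Any falsy value ("", 0, [], None) falls back to the default.
    return explicit if explicit else default


def pick_by_none_check(explicit, default):
    # Only None falls back; an empty string is treated as a real value.
    return explicit if explicit is not None else default


if __name__ == "__main__":
    print(repr(pick_by_truthiness("", "default.sqlite")))
    print(repr(pick_by_none_check("", "default.sqlite")))
```

With a truthiness test, a caller passing an empty string silently gets the default; with the `None` test, the empty string is passed through.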
Eevee
5e52bef91a Make plumbing respect the same env vars as the CLI. #180 2010-05-12 23:23:05 -07:00
Eevee
1c230f5990 Make pokedex status a bit more useful. #180 2010-05-12 23:18:02 -07:00
Eevee
79df4768bf Split PokedexLookup(recreate=True) into its own method. #216 2010-05-12 22:38:36 -07:00
Eevee
6106737465 Tiny fix for CLI help. 2010-04-24 14:52:23 -07:00
Eevee
2204b95585 Overhauled CLI. #180
- Everything now accepts -i, -e, -q, and -v.

- Plumbing commands now announce what database/index they're using and
  where they got them from.

- New command status, which does nothing but still does the announcing.

- New command reindex, which recreates only the whoosh index.
2010-04-24 14:06:56 -07:00
Eevee
d6fd697018 Totally overhauled lookup to use a class.
Now state is held within an object, rather than passed back to the
caller who must then pass it in again.  That was retarded and I don't
know why I ever did it.

Code is much cleaner now.

With apologies to anyone running annotate.
2010-03-21 23:27:47 -07:00
Eevee
e2bd074146 Fix crash when stdin has no encoding. 2009-08-25 08:07:54 -07:00
Eevee
909e61cc97 Added support for type: prefix and forme lookup. #15 2009-08-23 16:27:13 -07:00
Eevee
2bc41e2c62 Added support for lookup by other language name. #15
English fuzzy matches are preferred, followed by Roomaji and then
everything else.

The return tuple from lookup() now has a `name` parameter for the actual
name that was matched.
2009-08-22 01:13:34 -07:00
Eevee
4e51867e95 Added lookup support for foreign language names. #15
Changed lookup()'s return value to be a list of named tuples so the
caller can know which language each result is in.
2009-08-21 00:30:01 -07:00
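Returning a list of named tuples, as that commit describes, might look roughly like the following. The field names (`object`, `language`), the `NAMES` table, and the exact-match logic are all assumptions made for illustration; the real lookup() goes through a whoosh index.

```python
from collections import namedtuple

# Field names are illustrative assumptions, not the fields used by pokedex.
LookupResult = namedtuple("LookupResult", ["object", "language"])

# Tiny stand-in for the name index: (name, language) pairs per species.
NAMES = {
    "eevee": [("Eevee", "en"), ("Évoli", "fr"), ("Evoli", "de")],
}


def lookup(query):
    """Return every case-insensitive exact match, tagged with its language."""
    results = []
    for species, names in NAMES.items():
        for name, language in names:
            if name.lower() == query.lower():
                results.append(LookupResult(object=species, language=language))
    return results


if __name__ == "__main__":
    for result in lookup("evoli"):
        print(result.object, result.language)
```

Because the results are tuples with named fields, callers can read `result.language` while older code that unpacks by position keeps working.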
Eevee
16072ceb44 Added setup command and made lookup work sanely. #15
The setup command loads the default data into a default location, then
creates a whoosh index in a default location.

get_index is now open_index and can be made to explicitly recreate the
index.  It also actually opens the index if it already existed, even
across processes, now that FileStorage is working.

The lookup command takes no switches for aiming at a different database;
it only uses the default data stores.
2009-08-18 23:50:13 -07:00
Eevee
fd5e863eed Added --quiet switch to dump/load. 2009-08-18 18:36:45 -07:00
Eevee
1a7d046fbc Vastly improved the pokedex import/export UI.
csvimport is now load; csvexport is now dump.

Both take an optional -e switch to specify an engine, but will happily
use a default SQLite database in the pokedex package directory.

Additionally, the CSV directory is now controlled by the optional -d
switch, and defaults to Doing The Right Thing.

So `pokedex load` now does exactly what you'd expect: loads the data
from the right files into a consistently-located database.
2009-08-18 18:02:53 -07:00
Eevee
e8ed55c297 Improved CSV import speed by several orders of magnitude. 2009-07-31 00:03:02 -07:00
Eevee
398545a77f Make help message readable for people without a UTF-8 terminal. 2009-07-28 18:31:06 -07:00
Eevee
d997e27112 Changed exception syntax to work with Python 2.5. 2009-07-28 08:25:11 -07:00
Eevee
64d3c7d5f1 Fixed csvexport to write in primary key order.
Good news: This no longer relies on InnoDB's default row order.

Bad news: InnoDB in MySQL 5.0 has a bug where it will sort rows
physically according to a secondary index, if there's a composite
primary key and a single-column index and the phase of the moon is
right.  So a couple tables have been, once again, reordered -- but
correctly this time.

Good news: This bug will no longer fuck me up!
2009-07-26 22:19:27 -07:00
Eevee
d4077cc71d Added command_ prefix to CLI commands to fix import problems. 2009-07-25 02:43:30 -07:00
Eevee
b13ffac247 Pokédex lookup now uses a whoosh index and spell-checker. #15 2009-07-25 01:28:33 -07:00
Eevee
8fb0e550ad Stubbed in a simple lookup command. #15 2009-07-22 23:44:53 -07:00
Eevee
634ef3ed1e Fixed a slew of foreign key import problems. #29
Curse's type_id was 0, which is bogus; this has been fixed by creating a
real ????? type.
Fourth-gen moves all had zero as a contest effect id, which was also
bogus.
Pokémon 494 and 495 were junk and have been scrapped entirely.
pokemon_form_groups's description column was too short.

pokedex's connect() now takes kwargs passed to sessionmaker().

A more major change: some tables, like pokemon, are self-referential and
contain rows that refer to rows later in the table (for example, Pikachu
evolves from Pichu, which has a higher id).  At the moment such a row is
loaded, the foreign key is thus bogus.  I solved this by turning on
autocommit and wrapping add() in a try block, then attempting to re-add
every failed row again after the rest of the table is finished.  Slows
the import down a bit, but makes it work perfectly with foreign key
checks on.
2009-07-03 23:12:13 -04:00
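The retry strategy described in that commit can be sketched with the standard-library `sqlite3` module instead of SQLAlchemy. The table layout and the `load_rows` helper are assumptions for illustration. Unlike the single retry pass the commit describes, this sketch keeps retrying until a pass makes no progress, which also covers chains such as Raichu → Pikachu → Pichu.

```python
import sqlite3


def load_rows(conn, rows):
    """Insert rows, retrying those that fail a foreign key check."""
    pending = list(rows)
    while pending:
        failed = []
        for row in pending:
            try:
                conn.execute(
                    "INSERT INTO pokemon (id, name, evolves_from)"
                    " VALUES (?, ?, ?)",
                    row,
                )
            except sqlite3.IntegrityError:
                # The referenced row may simply not be loaded yet.
                failed.append(row)
        if len(failed) == len(pending):
            # No progress in a whole pass: the references are truly broken.
            raise ValueError("unresolvable rows: %r" % (failed,))
        pending = failed


def demo():
    # isolation_level=None puts the connection in autocommit mode, so a
    # failed insert does not poison the inserts that follow it.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE pokemon ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " evolves_from INTEGER REFERENCES pokemon (id))"
    )
    load_rows(
        conn,
        [(25, "Pikachu", 172), (26, "Raichu", 25), (172, "Pichu", None)],
    )
    return conn.execute("SELECT id, name FROM pokemon ORDER BY id").fetchall()


if __name__ == "__main__":
    print(demo())
```

The trade-off is the one the commit mentions: rows that fail are attempted more than once, which costs some time but lets foreign key checks stay enabled.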
Eevee
15ee3fcccf Fixed csvimport to load in table dependency order. 2009-05-28 21:16:18 -07:00
Eevee
8812dd9654 Fixed table loading under SQLAlchemy 0.5.3.
Apparently the secret property on a singleton hidden in the guts of
SQLAlchemy has been made private recently, so what I wanted to do (get a
list of all ORM classes) is now impossible.  I gave up on trying to find
a real solution and just slapped together something using dir().
2009-05-02 17:44:26 -07:00
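A `dir()`-based scan of the kind that commit mentions could look something like this. It is a sketch under loud assumptions: `TableBase` stands in for the declarative base class, and a `SimpleNamespace` stands in for the tables module, so nothing here depends on SQLAlchemy or reflects its API.

```python
import types


class TableBase(object):
    """Stand-in for the ORM base class (hypothetical)."""


class Pokemon(TableBase):
    pass


class Type(TableBase):
    pass


# Stand-in for a module holding table classes alongside unrelated names.
tables = types.SimpleNamespace(
    TableBase=TableBase, Pokemon=Pokemon, Type=Type, helper=len
)


def find_table_classes(namespace, base):
    """Collect every proper subclass of `base` exposed by `namespace`."""
    found = []
    for name in dir(namespace):  # dir() returns names in sorted order
        obj = getattr(namespace, name)
        if isinstance(obj, type) and issubclass(obj, base) and obj is not base:
            found.append(obj)
    return found


if __name__ == "__main__":
    print([cls.__name__ for cls in find_table_classes(tables, TableBase)])
```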
Eevee
d9a2d96ede Made csvimport somewhat tolerant of load errors.
It used to abruptly abort if a CSV file were missing, which wasn't very
nice when I'd just added a new table definition and was trying to reload
everything else.

Now it prints a status per table while loading, and will declare missing
tables to be...  missing.
2009-05-01 06:24:09 -07:00
Eevee
85ee27dedd Fixed CSV import's handling of Boolean columns. 2009-03-25 20:43:09 -04:00
Eevee
ac325b620d CSV import now respects NULLability of columns.
Empty strings loaded into NULL columns are changed to NULL instead.
2009-03-08 21:34:48 -04:00
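The empty-string rule from that commit can be sketched with the standard `csv` module. The helper names and the way nullable columns are passed in (a plain set of column names) are assumptions; the actual importer reads nullability from the table definitions.

```python
import csv
import io


def coerce_row(row, nullable_columns):
    """Turn empty strings into None, but only for nullable columns."""
    return {
        column: (None if value == "" and column in nullable_columns else value)
        for column, value in row.items()
    }


def read_csv(text, nullable_columns):
    """Parse CSV text into dicts, applying the NULL rule to each row."""
    reader = csv.DictReader(io.StringIO(text))
    return [coerce_row(row, nullable_columns) for row in reader]


if __name__ == "__main__":
    sample = "id,name,evolves_from\n172,Pichu,\n25,Pikachu,172\n"
    for row in read_csv(sample, {"evolves_from"}):
        print(row)
```

Empty strings in non-nullable columns are left alone, so a genuinely blank value is not silently turned into a constraint violation of a different kind.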
Eevee
20c9c23f51 Fixed some MySQL import problems.
Tables weren't being defined as UTF-8 if that wasn't the server default.

A lot of tables were trying to create erroneous auto_increment columns.

Foreign key checks were pretty much fucking everything up.
2009-03-07 18:54:01 -08:00
Eevee
bad044d1d8 Initial commit, with much of the data imported.
Includes a wrapper script 'pokedex' that can, so far, read data from a
db and spit out CSVs or deploy CSVs to a db.
2009-02-05 00:05:42 -08:00