Monday, 19 June 2017

python - sqlite - rapidly constructing table with primary key

I am trying to work around the 999-variable limit that SQLite imposes on a single statement (SQLITE_MAX_VARIABLE_NUMBER) when it is wrapped by ORMs such as peewee. I need to build a few dozen tables with ~50k rows and ~20 columns each, but because of the 999 limit I can insert only ~50 rows per INSERT statement. This is extremely slow.




How can I make this faster? Without a primary key constraint the limit is not a problem, because I can just use pandas to dump directly to SQL, but modifying the table afterwards to add a primary key is a pain.
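For comparison, one way to sidestep the multi-row INSERT entirely is the standard-library sqlite3 module: executemany binds one row's worth of parameters at a time, so the 999-variable ceiling applies per row (~20 variables here), not per batch, and the primary key can be declared up front. A minimal sketch under those assumptions (the table and column names are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path in practice
cols = ["A", "B", "C", "D", "E", "F", "G", "H"]

# Declare the primary key at CREATE time, so no ALTER is needed later.
col_defs = ", ".join(f"{c} TEXT" for c in cols)
conn.execute(f"CREATE TABLE mytable (id INTEGER PRIMARY KEY, {col_defs})")

# 50k rows of dummy text data, one tuple per row.
rows = [tuple(str(i) for _ in cols) for i in range(50000)]

placeholders = ", ".join("?" for _ in cols)
with conn:  # one transaction for the whole bulk insert
    conn.executemany(
        f"INSERT INTO mytable ({', '.join(cols)}) VALUES ({placeholders})",
        rows,
    )

count = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
print(count)  # 50000
```

Wrapping the executemany in a single transaction is what makes this fast; committing per row would be orders of magnitude slower.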



Here is an example:



from peewee import *

database = SqliteDatabase(None)

class Base(Model):
    class Meta:
        database = database

colnames = ["A", "B", "C", "D", "E", "F", "G", "H"]
cols = {x: TextField() for x in colnames}

table = type('mytable', (Base,), cols)
database.init('test.db')
database.create_tables([table])

data = []
for i in range(150):  # if this number is any higher, this crashes
    data.append({col: 1 for col in colnames})

with database.atomic() as txn:
    table.insert_many(data).execute()



How can I get around this limitation? The peewee documentation mentions using apsw, which can modify SQLite's max_variables setting, but I am worried about the side effects of raising it to some huge number.
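Staying within peewee, a common workaround is to keep insert_many but cap each batch so that rows × columns stays under 999. A sketch of the sizing arithmetic, with a hypothetical batched() helper (no peewee needed to verify the chunk sizes):

```python
SQLITE_MAX_VARS = 999  # SQLite's default compile-time limit

def batched(rows, num_columns, max_vars=SQLITE_MAX_VARS):
    """Yield chunks of rows small enough that each multi-row
    INSERT binds at most max_vars parameters."""
    batch_size = max(1, max_vars // num_columns)
    for i in range(0, len(rows), batch_size):
        yield rows[i:i + batch_size]

# 1000 dummy rows for an 8-column table, as in the example above:
rows = [{"A": 1}] * 1000
chunks = list(batched(rows, num_columns=8))
print(len(chunks), len(chunks[0]))  # 9 chunks of at most 999 // 8 = 124 rows
```

With the example model this would be roughly `with database.atomic(): [table.insert_many(chunk).execute() for chunk in batched(data, len(colnames))]`; newer peewee releases also ship a `chunked()` utility that does the same slicing.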
