Partitioning a 250+ million row table
Posted by: bob bob
Date: May 14, 2009 11:44AM
I am doing some research on email logs. Lots of them. I put the data into a table, and it grows at about 250 million records per month. With some nice indexing I can do ad hoc queries for my research. I am the only user of the DB.
My problem is that when I add a new week's worth of data (about 5 million records), the load takes forever because of the indexing. I have tried several of the published methods for re-indexing, but the table and its index are getting too large.
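To make the question concrete, the weekly load is basically a LOAD DATA run. The table and file names below are placeholders, and this is the general pattern rather than my exact script:

ALTER TABLE maillog DISABLE KEYS;

LOAD DATA INFILE '/data/maillog_week.csv'
INTO TABLE maillog
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n';

ALTER TABLE maillog ENABLE KEYS;

Even with the keys disabled during the load, the ENABLE KEYS step has to rebuild the non-unique indexes over the whole table, and that rebuild is the part that seems to take forever as the table grows.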
So, on to partitioning. I like the idea of partitioning the table. Here is the table structure (with a rough DDL sketch after the list):
- Hash field on (datetime, To field, subject) to produce a UID
- Datetime
- From field userid
- Subject header
- Mail file size
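In rough DDL terms it looks something like this (column names and types are approximate placeholders, not the real definitions):

CREATE TABLE maillog (
  uid       BINARY(16)    NOT NULL,   -- hash of (datetime, To field, subject)
  msg_time  DATETIME      NOT NULL,
  userid    VARCHAR(64)   NOT NULL,   -- the From userid I search on
  subject   VARCHAR(255),
  msg_size  INT UNSIGNED,             -- mail file size
  KEY idx_userid (userid)
) ENGINE=MyISAM;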
Most examples partition on the date field, e.g. a partition per week or similar. However, I never search on date; I always search on userid. Given that I search on userid, how can I partition for a performance advantage? There are millions of userids, so I need an index.
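What I had in mind was something along these lines (just a sketch, using the placeholder names from above; the partition count is a guess):

ALTER TABLE maillog
  PARTITION BY KEY(userid)
  PARTITIONS 64;

My hope is that a query with WHERE userid = '...' would then be pruned to a single partition, with a much smaller index to search, but I don't know whether that actually buys anything over one big index on userid.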
I would really like to use the ARCHIVE engine as well (to save space), but I am stuck with MyISAM because of the need for an index on userid. Any ideas?
Finally, the UID is there to act as a key into two other tables: one holds the names of any attachments (if present), and the other holds the 'To:' fields.
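Those two look roughly like this (again, placeholder names and types):

CREATE TABLE mail_attachments (
  uid       BINARY(16)    NOT NULL,   -- joins back to maillog.uid
  filename  VARCHAR(255)  NOT NULL,
  KEY idx_uid (uid)
) ENGINE=MyISAM;

CREATE TABLE mail_to (
  uid        BINARY(16)   NOT NULL,   -- joins back to maillog.uid
  to_userid  VARCHAR(64)  NOT NULL,
  KEY idx_uid (uid)
) ENGINE=MyISAM;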
Thanks