MySQL Forums
Forum List  »  MyISAM

Re: Dumping enormous MYISAM table with BLOB based on WHERE clause
Posted by: Rick James
Date: January 29, 2015 04:53PM

OPTIMIZE will read the old table, write the new table, then swap names and DROP the old table.
Reading depends on how much data is still in the table (one month's worth).
Writing will also take that much time.
Swapping and dropping takes no time.

Note that it, at the peak disk usage, it will need enough disk space for the entire current table, plus the space of the new table.

Consider this for going forward: PARTITION the table by week (or maybe day or month). DROP PARTITION is instantaneous and does not require any temporary growth in disk space. It immediately returns the DROPped space to the OS. This blog discusses the technique:
http://mysql.rjweb.org/doc.php/partitionmaint

In general, think of the time as being I/O-bound. Count the number of rows read or written or deleted by the entire process. That will give you a metric by which to compare techniques.

For raw performance, the partitioning scheme wins, hands down. Zero reads, zero writes, zero deletes.

Your dump, drop, reload:
Dump: read + write 1 month's worth
Drop: 0
Reload: read + write 1 month's worth
Total: 4 month's worth of I/O.

Jon's DELETE + OPTIMIZE:
Delete: N-1 month's worth (assuming the table now has N months in it)
Optimize: 1 write + 1 read
swap + drop: 0
Total: N+1 month's worth of I/O.

500GB (N) will take hours to do. If a month is a tiny fraction (N is large), then your technique will beat Jon's. If you have less than 3 month's data, Jon wins.

Note, since you are using MyISAM, the table is locked for the entire OPTIMIZE or DELETE.

Do the PARTITION gain immediately after this purging. Or do it as part of the process, thereby possibly saving 2 units of work. Here's how:

Your technique:
1. mysqldump (2 units - one read, one write)
2. CREATE TABLE with PARTITION BY RANGE(TO_DAYS(serve_date)) etc. (0 units)
3. DROP huge table (0)
4. mysql < dump (2 units)
Now it is partitioned, and ready for the next 'free' purge, based on my notes.

Jon's technique:
1. DELETE (N-1)
2. Skip the Optimize; step 3+4+5 have the same effect.
3. CREATE partitioned table
4. Copy table over (2 units)
5. DROP big table

But, at that point, you may as well do:
1. CREATE partitioned table (0)
2. INSERT INTO that table SELECT last month's data (2, plus expand disk by 1 month's worth) (No OPTIMIZE needed)
3. DROP old table (0)
Total: Only 2 units of time.

Bingo-- Do that last one; it is the fastest, and it sets you up for quick purging in the future! And it uses the same amount of extra disk space that the others take.

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.