MySQL Forums
Forum List  »  InnoDB

Re: deleting data from large fragmented InnoDB table
Posted by: Rick James
Date: March 05, 2014 05:10PM

> Do you think partitioning by hash is ok? When we delete data daily, will the partition which runs empty on each end of a week be dropped automatically and will it be used for the new data?

No & no. The reasons are the same as for your current partitioning.

> partitions: p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16,p17,p18,p19,p20,p21,p22,p23,p24,p25,p26,p27,p28,p29,p30,p31
> type: ALL

Bad & bad. The first is no partition "pruning", and each entire partition is scanned (implying that no indexes were useful). (That is, the first table needed a full scan.)

> Yes, we need all the data to show requests and responses.

OK.

> Data_length: 204959367168
> Data_free: 194778234880

Smells like a lot of churn.

> With "Beginning" I mean after we have added the two new fields and renamed the tables, the "new" table will have two more fields wich are NULL for all old records. At beginning of the work with the two additional fields ;)

(See caveat below.) Rather than go through the costly exercise in adding two columns, I would create a new table for the two columns, plus the same PRIMARY KEY as the original table (but without AUTO_INCREMENT). Then, I would LEFT JOIN to the new table. This would have exactly the same output as a SELECT on a single combined table.
Advantage: No ALTER needed.
Disadvantage: The JOIN will be slightly slower.
No diff: 'old' rows will show NULL for the two new columns either way.
Caveat: Assuming you will be changing the partitioning, you could add the columns at that time 'for free'.

> My plan was to select 1000 rows, insert into the new table and delete in the old table, but maybe I should use the partitions and select the data by DAY. It would use almost 500,000 rows in average.

1K rows -- good. Insert -- good. Delete -- costly; it might be better to wait until a whole day's data has been moved, then truncate the partition. Some notes on batched actions like that: http://mysql.rjweb.org/doc.php/deletebig

Hmmm... a potential problem: Will you need to SELECT the rows that were moved and deleted? If so, a UNION between the two tables would work. (And your Delete would work better than my Truncate.)

My link (partitionmaint) addresses some of your issues -- slow SELECTs, churn, slow deletes.

Another suggestion -- both for query speed and disk space: compress the TEXT and BLOB fields that are compressible. (English usually compresses 3:1.) Compress in the client, change the fields to BLOB (since TEXT does not like the non-char bytes), etc.

Since restructuring the table will be very costly, you may as well add the two new columns, change the partitioning, and do anything else, all at the same time.

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.