
Re: large multiple table update taking a very long time, help!
Posted by: Rick James
Date: April 19, 2009 11:50AM

* All 4 tables have the same number of rows? Is there some reason to have 4 tables, not one?

* TEXT vs VARCHAR -- There is very little difference, especially now that VARCHAR can go up to 65535 characters (the limit for TEXT). If the VARCHARs are not too big, it might help certain queries avoid creating tmp tables on disk. This is because tmp tables would like to be MEMORY tables, but cannot be if TEXT/BLOB columns are involved.
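You can check whether that is actually hurting you with two standard status counters (your numbers will differ, of course):

   SHOW GLOBAL STATUS LIKE 'Created_tmp%';
   -- Compare Created_tmp_disk_tables to Created_tmp_tables: if a large
   -- fraction of tmp tables end up on disk, TEXT/BLOB columns (or very
   -- wide rows) in those queries are a likely cause.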

* "Avg_row_length: 205" -- This tells me that the TEXT fields are usually very small. So, my comments about using another table for TEXT fields does not apply.

* "Avg_row_length: 205" -- This makes me wonder if the TEXT fields are usually empty?? If there are a lot of fields that are NULL (especially if they tend to be NULL at the same time), then there is another reason for splitting them off into their own table -- that extra table would have fewer rows. The JOIN to get the stuff (IF you don't have any filtering via WHERE, or GROUPing) should not be costly.

* Your bad experience with JOIN -- it probably came from having WHERE clause filtering on both tables.

* KEY `field4` (`field4`(32)), KEY `field14` (`field14`(20)), -- I find that "prefix" indexes are virtually useless. And you could avoid them by changing those fields from TINYTEXT to VARCHAR(255) (or less).
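Something along these lines would do it (assuming those fields live in table1 and really fit in 255 characters -- check first):

   SELECT MAX(LENGTH(field4)), MAX(LENGTH(field14)) FROM table1;   -- verify actual sizes

   ALTER TABLE table1
     MODIFY field4  VARCHAR(255),
     MODIFY field14 VARCHAR(255),
     DROP INDEX field4,  ADD INDEX field4  (field4),
     DROP INDEX field14, ADD INDEX field14 (field14);
   -- The re-added indexes cover the whole column, not just a prefix.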

* I see no "compound keys". Do you understand the benefit of such?

* I still don't understand your FK_id -- sounds like a mistake.

* table2 and table4 -- if they are mainly to map between id and class_id, recommend INDEX(id, class_id) and INDEX(class_id, id). Since you are using MyISAM, that would cut in half the effort of mapping one id to the other.
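For example, on table2 (same idea for table4):

   ALTER TABLE table2
     ADD INDEX (id, class_id),
     ADD INDEX (class_id, id);

   SELECT class_id FROM table2 WHERE id = 12345;
   -- Either compound index "covers" that lookup, so MyISAM answers it
   -- from the index file alone, without touching the .MYD data.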

* I hope your use of "table1" and "field1" is just to hide your business logic, and not really what you call things.

* Smaller -> more cacheable -> faster. Do you really need BIGINT (8 bytes)? Or would INT (4 bytes, limited to 2G signed / 4G unsigned) do? Or MEDIUMINT, or ...?
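A sketch of the shrink -- the column name here is hypothetical, and you should check MAX() before picking the smaller type:

   SELECT MAX(some_big_id) FROM table1;            -- `some_big_id` is a made-up name
   ALTER TABLE table1
     MODIFY some_big_id INT UNSIGNED NOT NULL;     -- 4 bytes, max ~4.29 billion
     -- or MEDIUMINT UNSIGNED for 3 bytes, max 16,777,215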

* "mysql should take the max if it is over, right?" -- No, it is much worse; it simply takes the bottom 32 bits, which is 0. key_buffer_size = 0 would not be good.

* You have 25GB of data and 22GB of index currently, and 32GB of RAM. Are these all the tables? Do you expect them to grow significantly? The data will probably be fully cached now, but will soon overflow. The indexes may or may not be happy with 4GB (but would not be happy with 0). Do
SHOW STATUS LIKE 'Key%';
Key_reads/Key_read_requests is the fraction of index block reads that needed disk I/O. If that is under a few percent, you are doing fine. If it is higher, let's look for remedies.
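A worked example of that ratio, with made-up numbers:

   SHOW GLOBAL STATUS LIKE 'Key_read%';
   -- Suppose it reports (hypothetical values):
   --   Key_read_requests   1000000000
   --   Key_reads              12000000
   -- 12000000 / 1000000000 = 1.2% of index reads hit the disk -- under
   -- "a few percent", so the key_buffer would be doing its job.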

*
UPDATE IGNORE `table1`, `table2`
   SET `table1`.`class_id` = `table2`.`class_id`
   WHERE `table1`.`id` = `table2`.`FK_id`;
table2 has no index on FK_id and table1 does have an index on id. This means that the query will do a table scan of table2, reaching into table1 once for each of the 70M rows. This happens to be the right order, since it wants to change table1. And if it is actually changing every row, that requires a lot of writes (9GB).
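If you want to see that plan for yourself: EXPLAIN does not work on UPDATE in MySQL of this vintage, but you can EXPLAIN the equivalent SELECT:

   EXPLAIN
   SELECT `table1`.`class_id`, `table2`.`class_id`
     FROM `table1`, `table2`
    WHERE `table1`.`id` = `table2`.`FK_id`;
   -- Expect table2 first with type=ALL (full scan, no index on FK_id),
   -- then table1 with type=eq_ref or ref via its index on id.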

* A NULLable INT field takes 1 bit for the NULL flag, plus 0 or 4 bytes for the INT. You started with 0 bytes, then changed them to 4. This means that each row got fatter, requiring reallocating the row, possibly somewhere else in the .MYD file. I believe (without proof) that the UPDATE would have run a lot faster if class_id had been made NOT NULL to start with.
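A sketch of "NOT NULL to start with" -- the INT UNSIGNED type and the 0 default are assumptions, not taken from your schema:

   ALTER TABLE `table1`
     MODIFY `class_id` INT UNSIGNED NOT NULL DEFAULT 0;
   -- Each row then already carries the 4 bytes for class_id, so the big
   -- UPDATE overwrites values in place instead of growing rows and
   -- relocating them inside the .MYD file.
   -- (Running this ALTER now rebuilds the whole table; the point above is
   -- about how the column should have been declared before loading data.)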
