MySQL Forums
Forum List  »  Backup

Re: Backup and trimming data idea
Posted by: Rick James
Date: June 30, 2012 11:50AM

> look for all the where key and the order by combine them as one index even how many fields it could be
Close. Be sure to have the '=' or 'IN' things first. It is not useful to have more than one other field last in the index. Example

WHERE a = ... AND b > ... ORDER BY c
INDEX(a, b) is optimal
* a is first because of '='
* WHERE fields take priority over GROUP BY and ORDER BY, so 'b' is useful, but not 'c'.
* That is, INDEX(a, b, c) would be ok, but 'c' would not be used.

Exception: There are cases where the following would be best. (Sorry, I cannot explain the cases.)
INDEX(c)
That is, the optimizer decides that fetching the fields in sorted order (c) is more important than filtering (on a and b). It won't hurt to include that INDEX, then use EXPLAIN to see if it is used.

> useless when we are looking it via the like 'J%'?
My bad.
It is not completely "useless", but INDEX(last_name, first_name) will only find last_names beginning with J, without making any use of first_name="Rick". The point is that this would be better for that case:
INDEX(first_name, last_name)
The 'rule' is to have the '=' fields first in the INDEX.

JOINs are more involved. First, the optimizer has to decide which table to start with (unless you are using LEFT JOIN). It will say to itself:
1. Let's see how efficient it might be to start with table1.
2. Let's see how efficient it might be to start with table2.
Now, it will decide which way to do it.

The index for the first table (that it picked) follows the guidelines already outlined. The second table is JOINed ON some field; that field becomes part of the decision making for the second table. Think of the ON field as being '='. Example:

FROM tbl1
JOIN tbl2 ON tbl1.id = tbl2.id
WHERE tbl2.b = 345 AND tbl2.c LIKE 'J%' ...

For tbl2, either of these would be useful if tbl2 is done _second_:
INDEX(id, b, c)
INDEX(b, id, c)

But... If the Optimizer decides to use tbl2 _first_, then
INDEX(b, c)
is likely to be the 'best'.

So, if you can't figure out which table will be picked first, have two indexes so that the optimizer is free to pick:
INDEX(b, id, c) or (id, b, c), and
INDEX(b, c)
(Again, order of the fields is important.)

To do a complete job, you need to find all the SELECTs, UPDATEs, DELETEs, on a table, and decide what is the minimal set of indexes that handle all the cases. (Do not simply 'index every field'.)

> chunk deletes but there is a risk of table locking right
DELETEing a 'chunk' of 1000 rows is usually so fast that users do not notice the impact. DELETEing a million rows in a single statement (or single transaction) will impact other users. DROP PARTITION has no impact. (I started that document before PARTITIONing was common.) So, PARTITIONing is optimal, but only when it is otherwise appropriate. One Rule-of-Thumb: Don't PARTITION a table with fewer than a million rows. More RoTs: http://mysql.rjweb.org/doc.php/ricksrots

Options: ReplyQuote


Subject
Views
Written By
Posted
3879
June 22, 2012 07:34PM
1809
June 23, 2012 05:23PM
1881
June 23, 2012 05:36PM
1899
June 24, 2012 01:51PM
1929
June 25, 2012 10:16AM
1767
June 26, 2012 08:14AM
1887
June 26, 2012 08:46AM
1779
June 27, 2012 08:36AM
1861
June 27, 2012 09:03AM
1714
June 28, 2012 11:19PM
1672
June 29, 2012 07:12AM
Re: Backup and trimming data idea
1667
June 30, 2012 11:50AM
1762
July 02, 2012 09:47AM
1702
July 03, 2012 01:32PM
1760
July 05, 2012 11:13AM
1919
July 06, 2012 09:57AM
1696
July 07, 2012 10:50AM
1744
July 08, 2012 09:33AM
1731
July 16, 2012 08:08PM
1695
July 21, 2012 10:57PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.