MySQL Forums
Forum List  »  Backup

Re: Backup and trimming data idea
Posted by: Rick James
Date: June 30, 2012 11:50AM

> look for all the where key and the order by combine them as one index even how many fields it could be
Close. Be sure to have the '=' or 'IN' things first. It is not useful to have more than one other field last in the index. Example

WHERE a = ... AND b > ... ORDER BY c
INDEX(a, b) is optimal
* a is first because of '='
* WHERE fields take priority over GROUP BY and ORDER BY, so 'b' is useful, but not 'c'.
* That is, INDEX(a, b, c) would be ok, but 'c' would not be used.

Exception: There are cases where the following would be best. (Sorry, I cannot explain the cases.)
INDEX(c)
That is, the optimizer decides that fetching the fields in sorted order (c) is more important than filtering (on a and b). It won't hurt to include that INDEX, then use EXPLAIN to see if it is used.

> useless when we are looking it via the like 'J%'?
My bad.
It is not completely "useless", but INDEX(last_name, first_name) will only find last_names beginning with J, without making any use of first_name="Rick". The point is that this would be better for that case:
INDEX(first_name, last_name)
The 'rule' is to have the '=' fields first in the INDEX.

JOINs are more involved. First, the optimizer has to decide which table to start with (unless you are using LEFT JOIN). It will say to itself:
1. Let's see how efficient it might be to start with table1.
2. Let's see how efficient it might be to start with table2.
Now, it will decide which way to do it.

The index for the first table (that it picked) follows the guidelines already outlined. The second table is JOINed ON some field; that field becomes part of the decision making for the second table. Think of the ON field as being '='. Example:

FROM tbl1
JOIN tbl2 ON tbl1.id = tbl2.id
WHERE tbl2.b = 345 AND tbl2.c LIKE 'J%' ...

For tbl2, either of these would be useful if tbl2 is done _second_:
INDEX(id, b, c)
INDEX(b, id, c)

But... If the Optimizer decides to use tbl2 _first_, then
INDEX(b, c)
is likely to be the 'best'.

So, if you can't figure out which table will be picked first, have two indexes so that the optimizer is free to pick:
INDEX(b, id, c) or (id, b, c), and
INDEX(b, c)
(Again, order of the fields is important.)

To do a complete job, you need to find all the SELECTs, UPDATEs, DELETEs, on a table, and decide what is the minimal set of indexes that handle all the cases. (Do not simply 'index every field'.)

> chunk deletes but there is a risk of table locking right
DELETEing a 'chunk' of 1000 rows is usually so fast that users do not notice the impact. DELETEing a million rows in a single statement (or single transaction) will impact other users. DROP PARTITION has no impact. (I started that document before PARTITIONing was common.) So, PARTITIONing is optimal, but only when it is otherwise appropriate. One Rule-of-Thumb: Don't PARTITION a table with fewer than a million rows. More RoTs: http://mysql.rjweb.org/doc.php/ricksrots

Options: ReplyQuote


Subject
Views
Written By
Posted
3521
June 22, 2012 07:34PM
1689
June 23, 2012 05:23PM
1748
June 23, 2012 05:36PM
1765
June 24, 2012 01:51PM
1800
June 25, 2012 10:16AM
1651
June 26, 2012 08:14AM
1750
June 26, 2012 08:46AM
1636
June 27, 2012 08:36AM
1732
June 27, 2012 09:03AM
1594
June 28, 2012 11:19PM
1548
June 29, 2012 07:12AM
Re: Backup and trimming data idea
1553
June 30, 2012 11:50AM
1628
July 02, 2012 09:47AM
1571
July 03, 2012 01:32PM
1615
July 05, 2012 11:13AM
1786
July 06, 2012 09:57AM
1579
July 07, 2012 10:50AM
1597
July 08, 2012 09:33AM
1589
July 16, 2012 08:08PM
1565
July 21, 2012 10:57PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.