MySQL :: Need help understanding optimizer behavior in 5.6

New Topic

Need help understanding optimizer behavior in 5.6

Posted by: Aaron Craven
Date: June 22, 2015 08:30AM

Hi all! I'm trying to understand a bit of optimizer behavior in 5.6 and I'm hoping you can help me out.

The background is that we are working towards upgrading to 5.6 from a very old 5.0 server. Overall things seem to be working well, but we have a financial system that performs extremely poorly on the new server. This system has some very large tables in it (as large as 22MM records), and is quite complex. So I decided to try and create a simplified test scenario.

My scenario consists of three tables, an "entries" table, an "entries group" table, and a "category" table (basically just a value lookup table) as follows (note, this is created on a 5.6 server. Our only 5.0 server is production and I cannot make changes there, so I cannot test this on 5.0 for comparison):

CREATE TABLE `entry` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`group_id` int(11) NOT NULL,
`entry_val` smallint(6) NOT NULL,
`uneven_val` smallint(6) NOT NULL,
PRIMARY KEY (`id`),
KEY `idx_entry_on_group_id` (`group_id`),
KEY `idx_entry_on_uneven_val` (`uneven_val`)
) ENGINE=InnoDB AUTO_INCREMENT=50001 DEFAULT CHARSET=utf8 COMMENT=''This is a test table.''

CREATE TABLE `entry_group` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`category_id` smallint(6) NOT NULL,
`grp_val1` smallint(6) NOT NULL,
`grp_val2` smallint(6) NOT NULL,
PRIMARY KEY (`id`),
KEY `idx_entry_group_on_category_id` (`category_id`)
) ENGINE=InnoDB AUTO_INCREMENT=9762 DEFAULT CHARSET=utf8 COMMENT='This is a test table.'

CREATE TABLE `category` (
`id` smallint(6) NOT NULL AUTO_INCREMENT,
`name` varchar(10) NOT NULL,
PRIMARY KEY (`id`),
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8 COMMENT='This is a test table.'

I populated the category table with 7 values by hand, and then used a program to generate sample data in the other tables as follows:

entry:
25000 rows were inserted into entries, randomly choosing when to create a new associated group record (see below), creating an average group size of 10 entries (max 68, min 1). For each entry, entry_val is chosen randomly from the entire range of smallint values. For uneven_val, about 90% of the records were given a single value (212), and the rest are assigned a random smallint value.

entry_group:
This table is populated during creation of entry records (see above), which results in approximately 2500 records (2451 in my test). grp_val1 and grp_val2 are randomly assigned a smallint value (and are not used in this test). For category_id, approximately 90% of records are given the value of 1. The rest are randomly given an id between 1 and 7 (these values are of course the ids of the records in the category table).

So everything is evenly distributed except the relationship between entry_group and category and the value of entry.uneven_val.

After doing all this, I started playing with queries until I got a plan that acted similarly to the production tables. For this test, I had to be a little generous with my range predicate, but in our production financial system, we have predicates that are quite a bit more selective. This is the query I found:

select *
from entry e inner join
entry_group g on
g.id = e.group_id inner join
category c on
c.id = g.category_id
where e.uneven_val between 300 and 32000;

The plan for this has MySQL starting with a tablescan on category, then joining to entry_group, and finally entry. The range predicate is ignored for plan purposes, despite the fact that it is 1) highly selective and 2) the ONLY limiting condition in the query.

+----+-------------+-------+------+-------------------------------------------------------+------------------------------------+---------+------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-------------------------------------------------------+------------------------------------+---------+------------+------+-------------+
| 1 | SIMPLE | c | ALL | PRIMARY | NULL | NULL | NULL | 7 | NULL |
| 1 | SIMPLE | g | ref | PRIMARY,idx_entry_group_on_category_id | idx_entry_group_on_category_id | 2 | db.c.id | 175 | NULL |
| 1 | SIMPLE | e | ref | idx_entry_on_group_id,idx_entry_on_uneven_val | idx_entry_on_group_id | 4 | db.g.id | 5 | Using where |
+----+-------------+-------+------+-------------------------------------------------------+------------------------------------+---------+------------+------+-------------+

I know that I can force the access plan I want (start with entry and work towards category) using a straight_join. Interestingly, I can even do it by changing the category join to a left join and adding ""e.category_id is not null" to the where clause.

With this data, the two plans actually have very similar runtime performance (~0.16sec), but as entry_group and entry grow, it doesn't scale well because it is chewing through more and more unnecessary data.

But what I really want to know is why MySQL chooses this access plan in the first place... I suspect this has something to do with uneven distribution of keys, but I'm uncertain.

Also, if you had to fix this problem without modifying the SQL at all, how would you do that?

Navigate: Previous Message• Next Message

Options: Reply• Quote

Subject

Views

Written By

Posted

Need help understanding optimizer behavior in 5.6

2560

Aaron Craven

June 22, 2015 08:30AM

Re: Need help understanding optimizer behavior in 5.6

1203

Aaron Craven

June 22, 2015 08:31AM

Re: Need help understanding optimizer behavior in 5.6

1208

Rick James

June 23, 2015 07:00PM

Re: Need help understanding optimizer behavior in 5.6

1208

Aaron Craven

June 24, 2015 06:22AM

Re: Need help understanding optimizer behavior in 5.6

1175

Rick James

June 25, 2015 11:37PM

Re: Need help understanding optimizer behavior in 5.6

1228

Aaron Craven

June 26, 2015 06:56AM

Re: Need help understanding optimizer behavior in 5.6

1193

Øystein Grøvlen

June 30, 2015 05:46AM

Re: Need help understanding optimizer behavior in 5.6

1775

Aaron Craven

June 30, 2015 08:06AM

Re: Need help understanding optimizer behavior in 5.6

1155

Aaron Craven

June 30, 2015 08:16AM

Re: Need help understanding optimizer behavior in 5.6

1449

Øystein Grøvlen

July 01, 2015 03:49AM

Re: Need help understanding optimizer behavior in 5.6

1361

Aaron Craven

July 01, 2015 06:48AM

Re: Need help understanding optimizer behavior in 5.6

1206

Edwin Desouza

July 01, 2015 03:05PM

Re: Need help understanding optimizer behavior in 5.6

1252

Aaron Craven

July 02, 2015 07:48AM

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.