Hi all. I have a simple select query on a large table selecting the wrong index, and I can't see why. I've renamed tables & columns to protect internal company info. The query:
---
SELECT ID FROM V2_THING WHERE DEMO='N' AND CREATED_DATETIME > NOW() - INTERVAL 50 DAY AND STATUS='C'
---
Relevant indexes:
KEY `idx_status_thing` (`status`)
KEY `idx_created_datetime_thing` (`created_datetime`)
---
Data profile:
STATUS = 'C': 14,690,555 records
STATUS = 'I': 67,348,939 records
STATUS = 'X': 3 records
CREATED_DATETIME > NOW() - INTERVAL 50 DAY: 4,569,918 records
---
Based on this, I would expect the optimizer to choose the idx_created_datetime_thing index, as it culls the table down to 4,569,918 records, as opposed to the idx_status_thing index, which only culls the table to 14,690,555 records. ( Note this profile was taken about 2 days before the optimizer trace, so figures might be *slightly* different now, but not significantly.
---
The optimizer trace:
https://pastebin.com/eEua9Tqv
Parts of this trace suggest the optimizer is correctly ( in my opinion ) choosing the idx_created_datetime_thing idex, but then later on it switches to using idx_status_thing, and near the bottom it even appears to mix up the rows_for_plan ( idx_created_datetime_thing ) and cost_for_plan ( idx_status_thing ) figures:
"condition_filtering_pct": 1.2741,
"rows_for_plan": 503604,
"cost_for_plan": 1.11e7,
"chosen": true
What's going on?