MySQL Forums
Forum List  »  Optimizer & Parser

group by date(indexed datetime) hard to optimize
Posted by: Lee Marshall
Date: March 30, 2010 12:37AM

I'm trying to understand the optimization logic for tight index scans http://dev.mysql.com/doc/refman/5.0/en/tight-index-scan.html as it relates to group bys on leading parts of dates. At one point my inner query had an explain that included "Using filesort".

The only column that really matters is "date", which is an indexed datetime with at this point 10s of millions of distinct values.

Somewhere in my thrashing, I added ORDER BY NULL and that managed to get the inner "Using filesort" to go away. It still reports "Using temporary", and I've only made the query go twice as fast as originally. I needed to force the index range scan.

The outer query is still "Using filesort" but it's only 12K rows, so I don't really care.


Questions:

* Is there a document that explains how I can transform a datetime in a group by and still get maximum benefit from the index?

* http://dev.mysql.com/doc/refman/5.0/en/using-explain.html interprets many message that show up in the Extra column. It doesn't have an entry for "Using temporary". How can I understand how bad it is?

* How can I tell if the tight index scan GROUP BY optimization is being used?

Thank you.


EXPLAIN EXTENDED select date_format(d,"%Y/%m") ym, caption, player_gender, sum(ct) ct
from
(
select date(date) D, player_age, player_gender, count(*) ct
from impressions i FORCE INDEX (date)
where Date >= '2010-01-01' and Date < '2010-03-01'
group by date(date), player_age, player_gender
ORDER BY NULL
) thing
LEFT OUTER JOIN neilsenAges a
ON a.fromAge<=thing.player_age and thing.player_age<=a.toAge
group by ym, caption, player_gender;

CREATE TABLE `impressions` (
  `id` int(11) NOT NULL,
  `line_id` int(11) default NULL,
  `site_id` int(6) NOT NULL,
  `date` datetime NOT NULL,
  `duration` float default NULL,
  `ad_id` int(11) NOT NULL,
  `game_id` int(11) NOT NULL,
  `slot` varchar(40) default NULL,
  `player_id` int(11) NOT NULL,
  `player_age` tinyint(4) default NULL,
  `player_gender` tinyint(4) default NULL,
  `player_zipcode` mediumint(9) default NULL,
  `game_type` varchar(3) default NULL,
  `session_key` varchar(32) default NULL,
  `account_id` int(11) default NULL,
  `device_id` int(11) default NULL,
  PRIMARY KEY  (`id`),
  UNIQUE KEY `line_id` (`line_id`),
  KEY `site_id` (`site_id`),
  KEY `ad_id` (`ad_id`),
  KEY `player_id` (`player_id`),
  KEY `date` (`date`),
  KEY `device_id` (`device_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Options: ReplyQuote


Subject
Views
Written By
Posted
group by date(indexed datetime) hard to optimize
7891
March 30, 2010 12:37AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.