Re: How do I avoid "Using file sort" and "Using temporary"
Jay,
Thanks for the recommendation! I am unable to test it right now, because I changed my schema and it seems to resolve the problem.
I re-ordered the fields in the compound PRIMARY keys, based on the cardinality and it made my query run significantly faster.
Here are the changes:
mysql> show create table page_view_fact \G
*************************** 1. row ***************************
Table: page_view_fact
Create Table: CREATE TABLE `page_view_fact` (
`page_dim_id` int(11) NOT NULL default '0',
`registrant_dim_id` int(11) NOT NULL default '0',
`time_dim_id` int(11) NOT NULL default '0',
`time_on_page` int(11) NOT NULL default '0',
PRIMARY KEY (`registrant_dim_id`,`time_dim_id`,`page_dim_id`),
KEY `page_dim` (`page_dim_id`),
KEY `registrant_dim` (`registrant_dim_id`),
KEY `time_dim` (`time_dim_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
mysql> show index from page_view_fact;
+----------------+------------+----------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------+------------+----------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| page_view_fact | 0 | PRIMARY | 1 | registrant_dim_id | A | 25002 | NULL | NULL | | BTREE | |
| page_view_fact | 0 | PRIMARY | 2 | time_dim_id | A | 100010 | NULL | NULL | | BTREE | |
| page_view_fact | 0 | PRIMARY | 3 | page_dim_id | A | 100010 | NULL | NULL | | BTREE | |
| page_view_fact | 1 | page_dim | 1 | page_dim_id | A | 510 | NULL | NULL | | BTREE | |
| page_view_fact | 1 | registrant_dim | 1 | registrant_dim_id | A | 25002 | NULL | NULL | | BTREE | |
| page_view_fact | 1 | time_dim | 1 | time_dim_id | A | 92 | NULL | NULL | | BTREE | |
+----------------+------------+----------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
6 rows in set (0.00 sec)
mysql> show create table registrant_selection \G
*************************** 1. row ***************************
Table: registrant_selection
Create Table: CREATE TABLE `registrant_selection` (
`report_id` int(11) NOT NULL default '0',
`registrant_dim_id` int(11) NOT NULL default '0',
PRIMARY KEY (`registrant_dim_id`,`report_id`),
KEY `registrant_dim` (`registrant_dim_id`),
KEY `report` (`report_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
mysql> show index from registrant_selection;
+----------------------+------------+----------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------------+------------+----------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| registrant_selection | 0 | PRIMARY | 1 | registrant_dim_id | A | 24240 | NULL | NULL | | BTREE | |
| registrant_selection | 0 | PRIMARY | 2 | report_id | A | 96963 | NULL | NULL | | BTREE | |
| registrant_selection | 1 | registrant_dim | 1 | registrant_dim_id | A | 24240 | NULL | NULL | | BTREE | |
| registrant_selection | 1 | report | 1 | report_id | A | 99 | NULL | NULL | | BTREE | |
+----------------------+------------+----------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
4 rows in set (0.00 sec)
mysql> select pd.page_type,sum(time_on_page) from page_view_fact pvf join page_dim pd on pvf.page_dim_id=pd.page_dim_id join registrant_sele
ction rs on pvf.registrant_dim_id=rs.registrant_dim_id and rs.report_id=87 group by pd.page_type;
+-------------------+-------------------+
| page_type | sum(time_on_page) |
+-------------------+-------------------+
| | 93115 |
| Products Page | 100634 |
| TouchPoint Portal | 121134 |
| White Paper | 85469 |
+-------------------+-------------------+
4 rows in set (0.39 sec)
mysql> explain select pd.page_type,sum(time_on_page) from page_view_fact pvf join page_dim pd on pvf.page_dim_id=pd.page_dim_id join registr
ant_selection rs on pvf.registrant_dim_id=rs.registrant_dim_id and rs.report_id=87 group by pd.page_type;
+----+-------------+-------+--------+---------------------------------+---------+---------+---------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------------------------+---------+---------+---------------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | rs | ref | PRIMARY,registrant_dim,report | report | 4 | const | 885 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | pvf | ref | PRIMARY,page_dim,registrant_dim | PRIMARY | 4 | core_dimensional.rs.registrant_dim_id | 4 | |
| 1 | SIMPLE | pd | eq_ref | PRIMARY | PRIMARY | 4 | core_dimensional.pvf.page_dim_id | 1 | |
+----+-------------+-------+--------+---------------------------------+---------+---------+---------------------------------------+------+----------------------------------------------+
3 rows in set (0.00 sec)
The amount of rows is 885*4 vs 1024*4, however query executes about 8x faster.
Thanks, Solon.