MySQL Forums
Forum List  »  Optimizer & Parser

Millions of Rows - Several GB - How to Optimize these queries
Posted by: Mike K
Date: February 18, 2008 07:46PM

The DB structure is very flexible, allowing for various types of widgets and each type having various fields/values associated.

For the sake of this question, assume that there are 3 types of widgets:
mysql> SELECT * FROM widgets_types;
+--------+---------------------+---------------------+
| typeid | typename            | modified            |
+--------+---------------------+---------------------+
|  1     | X Widgets           | 2008-02-05 09:57:37 | 
|  2     | Y Widgets           | 2008-02-05 09:58:39 | 
|  3     | Z Widgets           | 2008-02-05 09:59:29 | 
+--------+---------------------+---------------------+
3 rows in set (0.02 sec)

There are various fields that can be associated with each widget type:
mysql> SELECT * FROM widgets_fields;
+----+-----------------------+-------------+
| id | fieldname             | fieldtypeid |
+----+-----------------------+-------------+
|  1 | First Name            |         001 | 
|  2 | Last Name             |         001 | 
|  3 | Primary Phone         |         001 | 
|  4 | E-mail                |         001 | 
|  5 | Address               |         001 | 
|  6 | City                  |         001 | 
|  7 | Zip/Postal Code       |         001 | 
|  8 | State                 |         001 | 
+----+-----------------------+-------------+
8 rows in set (0.02 sec)

Each widget type has a different set of fields associated with it.
mysql> SELECT * FROM widgets_types_fields;
+----+--------+---------+--------+
| id | typeid | fieldid | active |
+----+--------+---------+--------+
|  1 |      1 |       1 |      1 |
|  2 |      1 |       2 |      1 |
|  3 |      1 |       3 |      1 |
| 12 |      2 |       1 |      1 |
| 13 |      2 |       2 |      1 |
| 15 |      2 |       5 |      1 |
| 16 |      2 |       8 |      1 |
| 23 |      3 |       1 |      1 |
| 24 |      3 |       2 |      1 |
| 25 |      3 |       3 |      1 |
| 26 |      3 |       6 |      1 |
| 27 |      3 |       8 |      1 |
+----+--------+---------+--------+
12 rows in set (0.21 sec)


The dynamic capability of this DB structure needs to be retained as new widget types will get added and different fields will be associated with different widget types.

** due to the required flexibility, it won't be possible to have a single table widget type with fixed columns for each field for that type **

mysql> describe widgets;
+------------------+---------------------------+------+-----+-------------------+----------------+
| Field            | Type                      | Null | Key | Default           | Extra          |
+------------------+---------------------------+------+-----+-------------------+----------------+
| id               | int(8) unsigned           | NO   | PRI | NULL              | auto_increment | 
| typeid           | int(8) unsigned           | NO   | MUL | 0                 |                | 
| start_price      | decimal(5,2)              | NO   |     | NULL              |                | 
| current_price    | decimal(5,2)              | NO   |     | NULL              |                | 
| sale_price       | decimal(5,2)              | NO   |     | NULL              |                | 
| ts_sold          | int(10) unsigned zerofill | NO   |     | 0000000000        |                | 
| modified         | timestamp                 | NO   |     | CURRENT_TIMESTAMP |                | 
+------------------+---------------------------+------+-----+-------------------+----------------+
7 rows in set (0.00 sec)

mysql> show index from widgets;
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table   | Non_unique | Key_name   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| widgets |          0 | PRIMARY    |            1 | id          | A         |     2316349 |     NULL | NULL   |      | BTREE      |         | 
| widgets |          1 | idx_typeid |            1 | typeid      | A         |          18 |     NULL | NULL   |      | BTREE      |         | 
+---------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
2 rows in set (2.19 sec)

mysql> describe widget_values;
+----------+-----------------+------+-----+-------------------+----------------+
| Field    | Type            | Null | Key | Default           | Extra          |
+----------+-----------------+------+-----+-------------------+----------------+
| id       | int(8) unsigned | NO   | PRI | NULL              | auto_increment | 
| widgetid | int(8) unsigned | NO   | MUL | NULL              |                | 
| fieldid  | int(8) unsigned | NO   | MUL | NULL              |                | 
| valchar  | varchar(255)    | NO   |     | NULL              |                | 
| modified | timestamp       | NO   |     | CURRENT_TIMESTAMP |                | 
+----------+-----------------+------+-----+-------------------+----------------+
5 rows in set (0.00 sec)

mysql> show index from widget_values;
+---------------+------------+------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table         | Non_unique | Key_name                     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------------+------------+------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| widget_values |          0 | PRIMARY                      |            1 | id          | A         |    17488933 |     NULL | NULL   |      | BTREE      |         | 
| widget_values |          1 | idx_widgetid_fieldid         |            1 | widgetid    | A         |     5829644 |     NULL | NULL   |      | BTREE      |         | 
| widget_values |          1 | idx_widgetid_fieldid         |            2 | fieldid     | A         |    17488933 |     NULL | NULL   |      | BTREE      |         | 
| widget_values |          1 | idx_fieldid_valchar          |            1 | fieldid     | A         |          17 |     NULL | NULL   |      | BTREE      |         | 
| widget_values |          1 | idx_fieldid_valchar          |            2 | valchar     | A         |     5829644 |     NULL | NULL   |      | BTREE      |         | 
| widget_values |          1 | idx_widgetid_fieldid_valchar |            1 | widgetid    | A         |     5829644 |     NULL | NULL   |      | BTREE      |         | 
| widget_values |          1 | idx_widgetid_fieldid_valchar |            2 | fieldid     | A         |    17488933 |     NULL | NULL   |      | BTREE      |         | 
| widget_values |          1 | idx_widgetid_fieldid_valchar |            3 | valchar     | A         |    17488933 |     NULL | NULL   |      | BTREE      |         | 
+---------------+------------+------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
8 rows in set (17.89 sec)

The problem:
There are millions of rows in the widgets table, and even more in widget_values.

I need to be able to select matching widgets, such as:
Scenario 1:
- typeid = 3
- fieldid 8 has value 'CA'
- fieldid 1 has value LIKE '%john%'
mysql> EXPLAIN SELECT `widgetid` FROM `widget_values` WHERE `widgetid` IN (SELECT `widgetid` FROM `widget_values` WHERE `fieldid` = '8' AND `valchar` = 'CA') AND `widgetid` IN (SELECT `widgetid` FROM `widgets` WHERE `typeid` = '3') AND `fieldid` = '1' AND `valchar` LIKE '%john%';
+----+--------------------+---------------+----------------+-----------------------------------------------------------------------+-----------------------+---------+------------+---------+--------------------------+
| id | select_type        | table         | type           | possible_keys                                                         | key                   | key_len | ref        | rows    | Extra                    |
+----+--------------------+---------------+----------------+-----------------------------------------------------------------------+-----------------------+---------+------------+---------+--------------------------+
|  1 | PRIMARY            | widget_values | ref            | idx_fieldid_valchar                                                   | idx_fieldid_valchar   | 4       | const      | 5051360 | Using where              | 
|  3 | DEPENDENT SUBQUERY | widgets       | ref            | idx_typeid                                                            | idx_typeid            | 4       | const      |  917832 | Using where; Using index | 
|  2 | DEPENDENT SUBQUERY | widget_values | index_subquery | idx_widgetid_fieldid,idx_fieldid_valchar,idx_widgetid_fieldid_valchar | idx_widgetid_fieldid  | 8       | func,const |       1 | Using where              | 
+----+--------------------+---------------+----------------+-----------------------------------------------------------------------+-----------------------+---------+------------+---------+--------------------------+
3 rows in set (0.32 sec)
and running the actual query:
1887 rows in set (3 min 6.83 sec)
SHOW INNODB STATUS while running the query showed this: view


Scenario 2:
- typeid = 3
- fieldid 7 = 90210
- fieldid 8 = 'CA'
mysql> EXPLAIN SELECT `widgetid` FROM `widget_values` WHERE `widgetid` IN (SELECT `widgetid` FROM `widget_values` WHERE `fieldid` = '8' AND `valchar` = 'CA') AND `widgetid` IN (SELECT `widgetid` FROM `widgets` WHERE `typeid` = '3') AND `fieldid` = '7' AND `valchar` = '90210';
+----+--------------------+---------------+----------------+-----------------------------------------------------------------------+-----------------------+---------+-------------+--------+--------------------------+
| id | select_type        | table         | type           | possible_keys                                                         | key                   | key_len | ref         | rows   | Extra                    |
+----+--------------------+---------------+----------------+-----------------------------------------------------------------------+-----------------------+---------+-------------+--------+--------------------------+
|  1 | PRIMARY            | widget_values | ref            | idx_fieldid_valchar                                                   | idx_fieldid_valchar   | 261     | const,const |    464 | Using where              | 
|  3 | DEPENDENT SUBQUERY | widgets       | ref            | idx_typeid                                                            | idx_typeid            | 4       | const       | 917832 | Using where; Using index | 
|  2 | DEPENDENT SUBQUERY | widget_values | index_subquery | idx_widgetid_fieldid,idx_fieldid_valchar,idx_widgetid_fieldid_valchar | idx_widgetid_fieldid  | 8       | func,const  |      1 | Using where              | 
+----+--------------------+---------------+----------------+-----------------------------------------------------------------------+-----------------------+---------+-------------+--------+--------------------------+
3 rows in set (0.60 sec)
and running the actual query:
181 rows in set (0.10 sec)


Scenario 3:
- typeid = 3
- fieldid 1 LIKE '%angel%'
- fieldid 2 LIKE '%goff%'
mysql> EXPLAIN SELECT lv1.widgetid FROM widget_values AS lv1, widget_values AS lv2, widgets WHERE widgets.typeid = 3 AND lv1.fieldid = 1 AND lv1.valchar LIKE '%angel%' AND lv1.widgetid = widgets.id AND lv2.fieldid = 2 AND lv2.valchar LIKE '%goff%' AND lv2.widgetid = widgets.id;   
+----+-------------+---------+------+-----------------------------------------------------------------------+----------------------+---------+--------------------------+--------+-------------+
| id | select_type | table   | type | possible_keys                                                         | key                  | key_len | ref                      | rows   | Extra       |
+----+-------------+---------+------+-----------------------------------------------------------------------+----------------------+---------+--------------------------+--------+-------------+
|  1 | SIMPLE      | widgets | ref  | PRIMARY,idx_typeid                                                    | idx_typeid           | 4       | const                    | 917832 | Using index | 
|  1 | SIMPLE      | lv1     | ref  | idx_widgetid_fieldid,idx_fieldid_valchar,idx_widgetid_fieldid_valchar | idx_widgetid_fieldid | 8       | widgetr.widgets.id,const |      1 | Using where | 
|  1 | SIMPLE      | lv2     | ref  | idx_widgetid_fieldid,idx_fieldid_valchar,idx_widgetid_fieldid_valchar | idx_widgetid_fieldid | 8       | widgetr.widgets.id,const |      1 | Using where | 
+----+-------------+---------+------+-----------------------------------------------------------------------+----------------------+---------+--------------------------+--------+-------------+
3 rows in set (0.00 sec)
and running the actual query:
47 rows in set (5 min 17.17 sec)



The ultimate goal is to get the matching widgetid's as fast as possible. Eagerly awaiting feedback from you all...


Hardware: Q6600 2.4Ghz & 4GB memory
show variables output: view

Thanks in advance.

Options: ReplyQuote


Subject
Views
Written By
Posted
Millions of Rows - Several GB - How to Optimize these queries
9399
February 18, 2008 07:46PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.