MySQL Forums
Forum List  »  NDB clusters

Relative speed of Subquery / Join statements
Posted by: Hayden Stainsby
Date: November 21, 2006 09:30AM

I have a cluster set up on five 3GHz P4 machines (2Gb RAM for NDB nodes and 1Gb RAM SQL/MGM nodes).

I've found that a query based around a select (subselect join) was much, much slower than a query without using a subquery (but rather two joins).

Details:

The subquery incantation query string is:
SELECT Universe.universeid, Universe.name, Universe.skuCount FROM Universe WHERE (Universe.universeid in (SELECT Product.universeid FROM Item INNER JOIN Product ON Item.itemid=Product.productid WHERE Item.manufacturerid=2138)) AND Universe.skuCount>0 ORDER BY Universe.skuCount

and gives this on EXPLAIN:
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: Universe
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 706
Extra: Using where; Using filesort
*************************** 2. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: Item
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 2009
Extra: Using where with pushed condition
*************************** 3. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: Product
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: fp.Item.itemid
rows: 1
Extra: Using where
3 rows in set (0.00 sec)

It returns a single row in 13-14 seconds.

The JOIN only incantation looks like:
SELECT Universe.universeid, Universe.name, Universe.skuCount FROM Universe LEFT JOIN Product ON Product.universeid = Universe.universeid INNER JOIN Item ON Item.itemid=Product.productid WHERE Universe.skuCount>0 AND Item.manufacturerid=2138 ORDER BY Universe.skuCount

and gives this explanation:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: Item
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 2009
Extra: Using where with pushed condition; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: Product
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: fp.Item.itemid
rows: 1
Extra: Using where
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: Universe
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: fp.Product.universeid
rows: 1
Extra: Using where with pushed condition
3 rows in set (0.00 sec)

It returns 47 (identical) rows in 0.06 seconds. The extra rows are due to the database structure and the time added by grouping on the primary key is negligible.

If anyone is able to shed any light on such and extreme difference in computational time it would be greatly appreciated. My best guess is still that I've misconfigured the cluster in some way.

As a comparison I ran the same queries on the same data but running on a single MySQL server and using the InnoDB engine for the tables in the database. The machine running the server is the same spec as the SQL nodes.

The subselect incantation took 0.73 seconds and the non-subselect incantation returned in 0.01 seconds. I'm not expecting the to get these speeds out of the cluter for single queries, that's not why I've got a cluster, but I'm hoping to get something closer or at least know what sort of queries I should be avoiding and why.

Thanks in advance.

Options: ReplyQuote


Subject
Views
Written By
Posted
Relative speed of Subquery / Join statements
3238
November 21, 2006 09:30AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.