MySQL Forums
Forum List  »  Router & Proxy

Running one query in parallel on lots of computers
Posted by: David Andersen
Date: July 20, 2007 07:03PM

Thanks a lot for the answer Jan!

I would like to use MySQL proxy to run one query in parallel on lots of computers.

My database has a huge table, which is split horizontally, each part being placed on its own node (or computer). What I specifically want to use MySQL Proxy for, is to run a query like the following in parallel on the nodes:
select count(*) from t group by c;

At the end of this post you can see an example on how this query could be run in parallel.

If it is possible to implement this using MySQL proxy, then MySQL will have its first solution for running one query in parallel across several nodes (there is a large demand for this functionality).

The critical point is whether MySQL Proxy can send queries to several nodes in parallel without blocking after sending the first query to the first node?



Example proving that a query can be run across many tables (each table could be placed on a different node) and still retreve the correct result:

mysql> create table sales_partition1 (productid int, customerid int);
Query OK, 0 rows affected (0.11 sec)

mysql> create table sales_partition2 (productid int, customerid int);
Query OK, 0 rows affected (0.11 sec)

mysql> create table sales_partition3 (productid int, customerid int);
Query OK, 0 rows affected (0.22 sec)

mysql> insert into sales_partition1 (productid, customerid) values (1, 1), (1, 2);
Query OK, 2 rows affected (0.11 sec)
Records: 2 Duplicates: 0 Warnings: 0

mysql> insert into sales_partition2 (productid, customerid) values (1, 3), (2, 1);
Query OK, 2 rows affected (0.11 sec)
Records: 2 Duplicates: 0 Warnings: 0

mysql> insert into sales_partition3 (productid, customerid) values (2, 2), (1, 4);
Query OK, 2 rows affected (0.11 sec)
Records: 2 Duplicates: 0 Warnings: 0

mysql> select * from sales_partition1;
+-----------+------------+
| productid | customerid |
+-----------+------------+
| 1 | 1 |
| 1 | 2 |
+-----------+------------+
2 rows in set (0.09 sec)

mysql> select * from sales_partition2;
+-----------+------------+
| productid | customerid |
+-----------+------------+
| 1 | 3 |
| 2 | 1 |
+-----------+------------+
2 rows in set (0.09 sec)

mysql> select * from sales_partition3;
+-----------+------------+
| productid | customerid |
+-----------+------------+
| 2 | 2 |
| 1 | 4 |
+-----------+------------+
2 rows in set (0.09 sec)

mysql> select productid, sum(thecount) from (
-> (select productid, count(*) as thecount from sales_partition1 group by productid) union all
-> (select productid, count(*) as thecount from sales_partition2 group by productid) union all
-> (select productid, count(*) as thecount from sales_partition3 group by productid)) as combinedResultSet group by productid;
+-----------+---------------+
| productid | sum(thecount) |
+-----------+---------------+
| 1 | 4 |
| 2 | 2 |
+-----------+---------------+
2 rows in set (0.11 sec)

Options: ReplyQuote


Subject
Views
Written By
Posted
Running one query in parallel on lots of computers
7061
July 20, 2007 07:03PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.