Performance question: Double join much slower than each single join separately
Hi all, I've used MySQL for over a decade, but this is my first post.
I have a serious performance problem and I wonder if anyone can give me a tip.
I have 3 tables: L (lines of invoices), A (invoices) and B (product categories).
I want to join all of them into one large result set, attaching to each invoice line the details of its invoice and of its product category.
Naturally, I do two joins: L JOIN A JOIN B.
Nevertheless, this is MUCH slower than either L JOIN A alone or L JOIN B alone.
- All involved fields are properly indexed (primary keys in A and B, secondary indexes in L).
- All tables have the same charset and collation.
- The matching fields are defined as fixed-length CHAR() with the same length in the corresponding tables.
I expected that if one join took 5 seconds and the other 5 seconds, the double join might take somewhere between 5 and 10 seconds, or even up to 5 x 5 = 25 seconds, but it takes orders of magnitude longer.
- Table L has nearly 5 million rows.
- Tables A and B are TEMPORARY tables.
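For reference, the setup looks roughly like this. This is only a sketch: the CHAR lengths, the omitted columns and any charset details are illustrative, not my real schema.

CREATE TEMPORARY TABLE A (
    InvoiceId CHAR(14) NOT NULL,   /* same type/length/charset as L.InvoiceId */
    /* ... invoice header columns ... */
    PRIMARY KEY (InvoiceId)
);

CREATE TEMPORARY TABLE B (
    CategoryId CHAR(7) NOT NULL,   /* same type/length/charset as L.CategoryId */
    /* ... product category columns ... */
    PRIMARY KEY (CategoryId)
);

/* L is the permanent invoice-lines table, with secondary indexes
   on both join columns: KEY (InvoiceId), KEY (CategoryId). */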
Here are the row counts for each table:
SELECT COUNT(*) FROM L; -> 4.491.396 rows
SELECT COUNT(*) FROM A; -> 233.426 rows
SELECT COUNT(*) FROM B; -> 2.512 rows
Here are the timing details. For each query I LIMITed the result to 10.000 rows: LxA is practically immediate, LxB is also practically immediate, but LxAxB takes more than one minute!!!
---------- RESULTS WITH LIMIT 10.000 ----------
SELECT
L.*,
A.*
FROM L
INNER JOIN A
ON L.InvoiceId = A.InvoiceId
LIMIT 10000;
/* 0 rows affected, 10.000 rows found. Duration for 1 query: 0,015 sec. (+ 0,203 sec. network) */
SELECT
L.*,
B.*
FROM L
INNER JOIN B
ON L.CategoryId = B.CategoryId
LIMIT 10000;
/* 0 rows affected, 10.000 rows found. Duration for 1 query: 0,015 sec. (+ 0,250 sec. network) */
SELECT
L.*,
A.*,
B.*
FROM L
INNER JOIN A
ON L.InvoiceId = A.InvoiceId
INNER JOIN B
ON L.CategoryId = B.CategoryId
LIMIT 10000;
/* 0 rows affected, 10.000 rows found. Duration for 1 query: 0,500 sec. (+ 61,838 sec. network) */
---------- RESULTS WITH LIMIT 100.000 ----------
With 10 times more rows (LIMIT 100.000), the results for LxA, LxB and LxAxB are respectively:
LxA ------> time < 2s
/* 0 rows affected, 100.000 rows found. Duration for 1 query: 0,062 sec. (+ 1,591 sec. network) */
LxB ------> time < 4s
/* 0 rows affected, 100.000 rows found. Duration for 1 query: 0,031 sec. (+ 3,229 sec. network) */
LxAxB ------> time ~ 25 min!!!
/* 0 rows affected, 100.000 rows found. Duration for 1 query: 0,499 sec. (+ 1.486,237 sec. network) */
---------- EXPLAIN RESULTS ----------
I do not know how to read the EXPLAIN output, but I post it here because it may help anyone willing to assist:
LxA:
"id";"select_type";"table";"type";"possible_keys";"key";"key_len";"ref";"rows";"Extra"
"1";"SIMPLE";"A";"ALL";"PRIMARY";NULL;NULL;NULL;"233691";""
"1";"SIMPLE";"L";"ref";"PRIMARY,InvoiceId";"PRIMARY";"42";"memora.A.InvoiceId";"4";"Using where"
LxB:
"id";"select_type";"table";"type";"possible_keys";"key";"key_len";"ref";"rows";"Extra"
"1";"SIMPLE";"B";"ALL";NULL;NULL;NULL;NULL;"2438";""
"1";"SIMPLE";"L";"ref";"CategoryId";"CategoryId";"21";"func";"409";"Using where"
LxAxB:
"id";"select_type";"table";"type";"possible_keys";"key";"key_len";"ref";"rows";"Extra"
"1";"SIMPLE";"B";"ALL";NULL;NULL;NULL;NULL;"2438";""
"1";"SIMPLE";"A";"ALL";"PRIMARY";NULL;NULL;NULL;"233691";"Using join buffer"
"1";"SIMPLE";"L";"ref";"PRIMARY,InvoiceId,CategoryId";"PRIMARY";"42";"memora.A.InvoiceId";"4";"Using where"
---------- So... what? ----------
The server runs on localhost (a Windows laptop) and the client connects via 127.0.0.1 to force TCP/IP instead of sockets, so slow network hardware (hubs, cabling) cannot be the cause: client and server are on the same machine.
Any tips on why it is so slow?