3 queries for same data: runtime: 5 hours, 5 hours and... 0.01 seconds...
Hi,
We are using MySQL on a large emailing database. When are attempting to get data out regarding people contacted 5 days ago.
We have a log table containing everyone contacted (250 000 000 rows) with a user ID and a send ID, and a campaign table (30000 rows) with a campaign name and a send Id (one campaign can correspond to several send ID, like 'daily email', and each daily mail has 1 unique ID). All columns needed by the queries are indexed.
So, our mail interface runs a standard join. The query took about 5 hours to run. An explain showed us that it was attempting to use the wrong index (iStatus, common to each record, rather than iDeliveryId), though adding a FORCE INDEX just made MySQL ignore everything :
-----
SELECT B0.iBroadLogId, B0.iDeliveryId, D1.iDeliveryId
FROM
NmsBroadLogRcp B0 JOIN NmsDelivery D1
ON (D1.iDeliveryId = B0.iDeliveryId)
WHERE (D1.sDeliveryCode = 'DM18342')
AND (B0.iStatus = 1)
AND (DATE(D1.tsContact) = DATE((CURDATE() - INTERVAL 5 DAY)))
LIMIT 1000;
-----
I rewrote the query with a subquery:
-----
select iBroadLogId, iDeliveryId
FROM NmsBroadLogRcp
where iStatus = 1
and iDeliveryId in (
select iDeliveryId FROM NmsDelivery WHERE sDeliveryCode = 'DM18342'
and date(tsContact) = DATE(CURDATE() - INTERVAL 5 DAY)
) limit 1000;
-----
I killed this query after 15 minutes... Strange thing is: the subquery when run on it's own executes in less than 0.01 seconds:
-----
select iDeliveryId FROM NmsDelivery WHERE sDeliveryCode = 'DM18342'
and date(tsContact) = DATE(CURDATE() - INTERVAL 5 DAY)
-----
This query returned the ID 541687496
So, I rewrote the query again, but replacing the subquery with the ID that it just returned:
-----
select iBroadLogId, iDeliveryId
FROM NmsBroadLogRcp
where iStatus = 1
and iDeliveryId = 541687496
limit 1000;
-----
Now I get my result set in 0.01 seconds...
I have seen this before on other similar queries... I cannot understand why there is such a huge performance hit between the second and the third versions of this query. If I can get a subquery to run in less than 1/10th of a second, and the main query to run with a hardcoded value in less than 1/10th of a second, why does the complete query+subquery take minues to hours to run? Can anyone light my candle?
Cheers,
Daniel