Re: Optimize Inner Join With WHERE BETWEEN dates
The server has 64GB on a dedicated v5.5 machine.
Making the suggested optimizations resulted in the following run times:
Q1:
...
AND ( a.ClockIn BETWEEN b.ClockIn AND b.ClockOut
OR a.ClockOut BETWEEN b.ClockIn AND b.ClockOut
)
AND a.UserID IN (1,37491,74806,73680,13179,13220)
AND b.UserID IN (1,37491,74806,73680,13179,13220)
...
13 rows in set (3.07 sec)
---
Q2:
...
AND a.ClockIn < b.ClockOut AND a.ClockOut > b.ClockIn
AND a.UserID IN (1,37491,74806,73680,13179,13220)
AND b.UserID IN (1,37491,74806,73680,13179,13220)
...
14 rows in set (2.32 sec)
---
Q3:
...
AND ( a.ClockIn BETWEEN b.ClockIn AND b.ClockOut
OR a.ClockOut BETWEEN b.ClockIn AND b.ClockOut
)
AND a.UserID = 1 AND b.UserID = 1 )
UNION ALL ( ...
---
13 rows in set (3.06 sec)
Q4:
...
AND a.ClockIn < b.ClockOut AND a.ClockOut > b.ClockIn
AND a.UserID = 1 AND b.UserID = 1 )
UNION ALL ( ...
---
13 rows in set (2.41 sec)
So breaking the query into UNIONS with constant UserID and rewriting the BETWEEN constraints showed the best improvement although not as much as I was hoping to get. I did realize though that I don't need the RecordIDs of offending records, just the UserID of the user with them so I can add a LIMIT 1 to each query:
Q5:
...
AND a.ClockIn < b.ClockOut AND a.ClockOut > b.ClockIn
AND a.UserID = 1 AND b.UserID = 1
LIMIT 1 )
UNION ALL ( ...
...
1 row in set (1.94 sec)
Worst case runtime is still the same assuming no overlapping records exist but when at least one is found that's enough for this check.
I also added another constraint to limit the range of user records since I don't need to check all records for every user, only the records over a requested report range.
Q6:
...
AND a.ClockIn < b.ClockOut AND a.ClockOut > b.ClockIn
AND a.UserID = 1 AND b.UserID = 1
AND a.ClockIn >= '2008-01-01 00:00:00' AND a.ClockOut <= '2010-01-01 00:00:00'
LIMIT 1 )
UNION ALL ( ...
...
1 row in set (0.73 sec)
This looks like it's probably as good as it's going to get.