Re: Optimize Inner Join With WHERE BETWEEN dates
There are eight timeclass values 1-8.
The example query of:
SELECT a.UserID
FROM records AS a INNER JOIN records AS b
WHERE a.UserID = b.UserID
AND a.TimeClass = b.TimeClass
AND a.RecordID != b.RecordID
AND a.ClockIn <> b.ClockOut
AND a.ClockOut <> b.ClockIn
AND ( a.ClockIn BETWEEN b.ClockIn AND b.ClockOut
OR a.ClockOut BETWEEN b.ClockIn AND b.ClockOut
)
AND a.UserID IN (1,37491,74806,73680,13179,13220)
AND b.UserID IN (1,37491,74806,73680,13179,13220)
showed no difference with the new indexes:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: a
type: range
possible_keys: ix_accrualsum,idx_recordalerts,idx_payroll_elligible,idx_overlappcheck,idx_hourlyreport,idx_hourlyreport3,idx_useridclockin,idx_useridclockout
key: idx_overlappcheck
key_len: 4
ref: NULL
rows: 3767
Extra: Using where; Using index
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: b
type: ref
possible_keys: ix_accrualsum,idx_recordalerts,idx_payroll_elligible,idx_overlappcheck,idx_hourlyreport,idx_hourlyreport3,idx_useridclockin,idx_useridclockout
key: idx_overlappcheck
key_len: 5
ref: timeclock_beta.a.UserID,timeclock_beta.a.TimeClass
rows: 97
Extra: Using where; Using index
2 rows in set (0.00 sec)
On the other hand here is an abbreviated query of one of the largest queries to be expected using production data for ~300 unique users when using the new idx_useridclockin and idx_useridclockout indexes:
( SELECT a.UserID
FROM records AS a INNER JOIN records AS b
WHERE a.UserID = b.UserID
AND a.TimeClass = b.TimeClass
AND a.RecordID != b.RecordID
AND a.ClockIn <> b.ClockOut AND a.ClockOut <> b.ClockIn
AND a.ClockIn < b.ClockOut AND a.ClockOut > b.ClockIn
AND a.UserID = 64240
AND b.UserID = 64240 AND a.ClockIn >= '2013-05-01 11:48:00' AND a.ClockIn <= '2013-05-31 15:00:00' LIMIT 1 )
UNION ALL ( SELECT a.UserID
FROM records AS a INNER JOIN records AS b
WHERE a.UserID = b.UserID
AND a.TimeClass = b.TimeClass
AND a.RecordID != b.RecordID
AND a.ClockIn <> b.ClockOut AND a.ClockOut <> b.ClockIn
AND a.ClockIn < b.ClockOut AND a.ClockOut > b.ClockIn
AND a.UserID = 75916
AND b.UserID = 75916 AND a.ClockIn >= '2013-05-01 11:48:00' AND a.ClockIn <= '2013-05-31 15:00:00' LIMIT 1 )
UNION ALL ( SELECT a.UserID
FROM records AS a INNER JOIN records AS b
WHERE a.UserID = b.UserID
AND a.TimeClass = b.TimeClass
AND a.RecordID != b.RecordID
AND a.ClockIn <> b.ClockOut AND a.ClockOut <> b.ClockIn
AND a.ClockIn < b.ClockOut AND a.ClockOut > b.ClockIn
AND a.UserID = 21449
AND b.UserID = 21449 AND a.ClockIn >= '2013-05-01 11:48:00' AND a.ClockIn <= '2013-05-31 15:00:00' LIMIT 1 )
UNION ALL ( SELECT a.UserID
FROM records AS a INNER JOIN records AS b
WHERE a.UserID = b.UserID
AND a.TimeClass = b.TimeClass
AND a.RecordID != b.RecordID
AND a.ClockIn <> b.ClockOut AND a.ClockOut <> b.ClockIn
AND a.ClockIn < b.ClockOut AND a.ClockOut > b.ClockIn
AND a.UserID = 52195
AND b.UserID = 52195 AND a.ClockIn >= '2013-05-01 11:48:00' AND a.ClockIn <= '2013-05-31 15:00:00' LIMIT 1 )
...
And unique parts of the EXPLAIN for the full query:
...
*************************** 567. row ***************************
id: 284
select_type: UNION
table: a
type: ref
possible_keys: ix_accrualsum,idx_recordalerts,idx_payroll_elligible,idx_overlappcheck,idx_hourlyreport,idx_hourlyreport3,idx_useridclockin,idx_useridclockout
key: idx_recordalerts
key_len: 4
ref: const
rows: 1
Extra: Using where
*************************** 568. row ***************************
id: 284
select_type: UNION
table: b
type: ref
possible_keys: ix_accrualsum,idx_recordalerts,idx_payroll_elligible,idx_overlappcheck,idx_hourlyreport,idx_hourlyreport3,idx_useridclockin,idx_useridclockout
key: idx_recordalerts
key_len: 4
ref: const
rows: 1
Extra: Using where
...
*************************** 578. row ***************************
id: 289
select_type: UNION
table: b
type: ref
possible_keys: ix_accrualsum,idx_recordalerts,idx_payroll_elligible,idx_overlappcheck,idx_hourlyreport,idx_hourlyreport3,idx_useridclockin,idx_useridclockout
key: idx_overlappcheck
key_len: 5
ref: const,timeclock_beta.a.TimeClass
rows: 97
Extra: Using where; Using index
*************************** 579. row ***************************
id: 290
select_type: UNION
table: a
type: range
possible_keys: ix_accrualsum,idx_recordalerts,idx_payroll_elligible,idx_overlappcheck,idx_hourlyreport,idx_hourlyreport3,idx_useridclockin,idx_useridclockout
key: idx_useridclockin
key_len: 12
ref: NULL
rows: 3
Extra: Using where
*************************** 580. row ***************************
id: 290
select_type: UNION
table: b
type: ref
possible_keys: ix_accrualsum,idx_recordalerts,idx_payroll_elligible,idx_overlappcheck,idx_hourlyreport,idx_hourlyreport3,idx_useridclockin,idx_useridclockout
key: idx_overlappcheck
key_len: 5
ref: const,timeclock_beta.a.TimeClass
rows: 97
Extra: Using where; Using index
*************************** 581. row ***************************
id: NULL
select_type: UNION RESULT
table: <union1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,...>
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
Extra:
581 rows in set (0.04 sec)
So the new indexes are being used.
So the question is one of performance. On a cold restart of the server the query WITHOUT the new indexes takes 13 seconds. If I add back the indexes, restart the server, and run it again it takes 17 second when using idx_useridclockin for table `a` instead of idx_overlapcheck.