First, here are a bunch of general tips, some of which apply:
* BIGINT for ids -- Will you really have billions of rows? BIGINT takes 8 bytes; consider using INT (4 bytes) or a smaller datatype.
* UNSIGNED -- The default for INT (etc) is SIGNED. Many (not all) database numbers are really UNSIGNED; say so.
* INT(3) -- The "(3)" does nothing. If you are implying that the numbers are small, use TINYINT or SMALLINT instead.
* NULL -- The default is NULL, but usually you know that the field will never be NULL; so say NOT NULL.
* TEXT/BLOB -- Triggers the use of on-disk temp tables in some cases. Consider using VARCHAR/VARBINARY.
* foo1, foo2, foo3 -- Do not use columns to create arrays; create another table with one row per array entry.
* MyISAM -- Consider switching to the preferred InnoDB.
* INDEX(foo(255)) -- Consider not indexing foo; consider shortening "255".
* VARCHAR(255) -- Use a reasonable limit.
* CHAR(10) -- Is it really fixed length? Probably you should use VARCHAR.
* Lots of indexes -- Slows down INSERTs.
* Only 1-column indexes -- You probably need to learn about "compound indexes".
* INDEX(flag) -- Rarely is it useful to index a 2-valued field (or even a field with a small number of distinct values).
* "using temporary using filesort" -- This is not evil in itself; the complexity of the query may demand such. Each of DISTINCT, GROUP BY, and ORDER BY may require a filesort.
* name LIKE '%Joe' -- The leading wild card makes it impossible to use the INDEX on `name`.
* WHERE ... OR ... -- Hard to optimize.
* WHERE ... IN ( SELECT ... ) -- Likely to be terribly slow; see if you can use a JOIN instead.
* Instead of using LIKE, see if FULLTEXT will work for you.
* FOREIGN KEYs slow down INSERTs.
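To make a few of these tips concrete, here is a sketch of a table definition that applies several of them at once -- smaller UNSIGNED ids, NOT NULL, a reasonable VARCHAR limit, InnoDB, a compound index, and a child table instead of foo1/foo2/foo3. The table and column names here are invented for illustration, not taken from your schema:

```sql
-- Hypothetical example; names are invented for illustration.
CREATE TABLE ticket_example (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- INT, not BIGINT; UNSIGNED; NOT NULL
    queue_id SMALLINT UNSIGNED NOT NULL,      -- small range, so SMALLINT, not INT(3)
    status TINYINT UNSIGNED NOT NULL,         -- only a handful of values
    title VARCHAR(100) NOT NULL,              -- a reasonable limit, not a reflexive 255
    create_time_unix INT UNSIGNED NOT NULL,
    PRIMARY KEY (id),
    INDEX (queue_id, create_time_unix)        -- compound index, not two 1-column indexes
) ENGINE=InnoDB;

-- Instead of tag1, tag2, tag3 columns, use one row per array entry:
CREATE TABLE ticket_tag_example (
    ticket_id INT UNSIGNED NOT NULL,
    tag VARCHAR(30) NOT NULL,
    PRIMARY KEY (ticket_id, tag)
) ENGINE=InnoDB;
```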
Now, let's dissect the query:
SELECT DISTINCT st.id, st.tn, st.create_time_unix
FROM ticket st
INNER JOIN queue sq ON sq.id = st.queue_id
INNER JOIN article art ON st.id = art.ticket_id
WHERE 1=1
AND sq.group_id IN (8, 8, 14, 14, 15, 15, 18, 18, 19, 19,
24, 24, 25, 25, 26, 26, 28, 28, 29, 29, 30, 30, 31, 31,
35, 35, 36, 36, 40, 40, 41, 41, 42, 42, 43, 43, 46, 46,
47, 47, 52, 52, 53, 53, 55, 55, 56, 56, 62, 62, 63, 63,
64, 64, 66, 66, 68, 68, 70, 70, 71, 71, 72, 72, 73, 73,
74, 74, 75, 75, 76, 76, 78, 78, 80, 80, 81, 81, 82, 82,
83, 83 )
AND art.a_body LIKE '%Jayne'
AND art.a_body LIKE 'Twyford%'
ORDER BY st.create_time_unix DESC
LIMIT 20000000;
First, I wonder if you are really expecting `a_body` to _start_ with "Twyford"? Perhaps you meant
AND art.a_body LIKE '%Jayne Twyford%'
And, if the query is likely to involve names, then FULLTEXT would be an excellent way to go.
This mixture could be an issue: DISTINCT, ORDER BY, LIMIT. It demands that the entire set of possible rows be gathered, then sorted, then dedupped (DISTINCT), and finally limited. That _may_ involve two "filesorts" and their associated "temporaries". There is no getting around that -- without rethinking the query.
Since you did not SELECT any TEXT fields, the "temp" and "filesort" may have been done entirely in RAM. It tries to use MEMORY for the temp, but certain things (too big, TEXT, etc) cause it to switch to MyISAM.
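If you want to see whether your temp tables are spilling to disk, MySQL exposes the relevant limits and counters:

```sql
-- In-memory temp tables are limited by the smaller of these two settings:
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';

-- Compare disk-based temp tables to total temp tables created:
SHOW GLOBAL STATUS LIKE 'Created_tmp%tables';
```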
DISTINCT is natural when you are expecting duplicates.
DISTINCT is also useful when you have a JOIN and get unexpected duplicates. I'm guessing this is your situation. However, this often implies that too many rows are being generated by the JOIN. (I'm pretty sure the duplicate ids in the IN(...) are not causing duplicate output rows.) Since I don't understand the intent of the JOINs, let me merely point out a solution that works in some cases:
Turn a JOINed table into a subquery of this form:
SELECT ...
FROM ... x
JOIN ( SELECT ... ) y ON x... = y...
This is especially useful if a GROUP BY or LIMIT can be moved into the inner SELECT, thereby minimizing the number of rows in the subquery.
Improperly done, such subqueries can slow things down.
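Applied to your query, that pattern might look like the following sketch -- assuming the LIKE is the most selective filter, so it goes in the derived table. This is untested against your schema, and the IN list is shortened here for readability:

```sql
SELECT DISTINCT st.id, st.tn, st.create_time_unix
FROM  ( SELECT ticket_id
          FROM article
         WHERE a_body LIKE '%Jayne Twyford%' ) AS a
JOIN  ticket st ON st.id = a.ticket_id
JOIN  queue  sq ON sq.id = st.queue_id
WHERE sq.group_id IN (8, 14, 15, 18, 19, ...)   -- dedupped list, abbreviated here
ORDER BY st.create_time_unix DESC
LIMIT 20000000;
```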
A question to help understand what the optimizer is thinking: are most of the rows of `queue` covered by the IN (...)? If so, there was no advantage in starting with `queue`. The optimizer seems to have decided that the LIKEs might whittle down the number of rows a lot, making `article` a good table to start with.
If FULLTEXT will work for you, then add this index and filter:
FULLTEXT(a_body)
AND MATCH(art.a_body) AGAINST('+Jayne +Twyford' IN BOOLEAN MODE)
It may run 100 times as fast, even without getting rid of filesort and temp.
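Putting that together, the rewrite might look like this -- assuming `a_body` lives in `article` and you are willing to add the index (building a FULLTEXT index on a large table takes time and disk space):

```sql
ALTER TABLE article ADD FULLTEXT(a_body);

SELECT DISTINCT st.id, st.tn, st.create_time_unix
FROM  ticket st
JOIN  queue  sq  ON sq.id = st.queue_id
JOIN  article art ON art.ticket_id = st.id
WHERE sq.group_id IN (8, 14, 15, 18, 19, ...)   -- dedupped list, abbreviated here
  AND MATCH(art.a_body) AGAINST('+Jayne +Twyford' IN BOOLEAN MODE)
ORDER BY st.create_time_unix DESC
LIMIT 20000000;
```

Note that FULLTEXT matches whole words, so `+Jayne +Twyford` finds rows containing both words anywhere, which is close to, but not identical to, `LIKE '%Jayne Twyford%'`.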