MySQL Forums
Forum List  »  Newbie

Re: Slow query, trying to optimize search..
Posted by: Rick James
Date: January 14, 2015 10:18PM

Post SHOW VARIABLES as another message in this thread; I'll critique the combination of VARIABLES and GLOBAL STATUS. Also, how much RAM do you have?

Meanwhile, let's look at the SELECTs and their speed. Let's look at the processing steps that occur in most of the SELECTs you presented:

1. The WHERE clause is used to filter down to the interesting rows:

1a. WHERE MATCH ...
These will use the FULLTEXT index and take very little time (assuming there are not many rows with the word(s) you are checking for).

1b. WHERE flag = 1
Since `flag` is not indexed, the entire table must be scanned; this takes time -- a "full table scan".

1c. WHERE (this) AND (that)
The optimizer is usually good at deciding whether (this) or (that) will be faster; it filters on the faster one, then checks the other comparison.

2. If the query can be satisfied just using the index, step 3 is skipped. This should explain why "COUNT(*) WHERE MATCH" is a little faster than "COUNT(*) WHERE MATCH AND flag".

3. The rows that remain after filtering (step 1) are fetched from the data. This involves a random lookup using the PRIMARY KEY (which is appended to each SECONDARY KEY). (Note: the "data" and the FULLTEXT "index" are stored separately.) If there are only a few rows, this lookup does not cost much time.

4. ORDER BY -- Sort the data according to an ORDER BY, if present. If there were an INDEX(timestamp), all of that I have said is up for grabs -- the Optimizer might or might not use it, _instead_ of Step 1. The time for the sort depends on how many rows are left after filtering. Sorting time is rarely dominant in the overall picture.

5. LIMIT -- Only after the sort can the LIMIT be applied. A LIMIT 100 without an ORDER BY is rather meaningless, but it is much faster since it can abort step 3 after fetching 100 rows.

I hope that covers all the SELECTs you presented; if not, let me know and I will explain.

Some further notes, leading up to INDEX recommendations:
* MySQL almost never uses two INDEXes in a single query.
* If an indexed value occurs in more than ~20% of the rows, the INDEX won't be used because of the lookup cost in step 3.
* A FULLTEXT index is used in preference to other INDEXes.
* "NOT" _may_ invalidate the use of an INDEX. (Do EXPLAIN SELECT ... to see what index it does or does not use.)

Those notes lead to the following observations:
* Based on your COUNT(*) with various WHERE clauses, neither of these be of much use (20% rule):
INDEX(status_int), INDEX(site_filter)
* However this would trigger step 2, possibly doubling the speed of the queries that reference only those two fields:
INDEX(status_int, site_filter) -- or the opposite order.
* Since name="blah" only 1% of the time, this is likely to be very useful:
INDEX(name)
* And this would trigger step 2 for at least one of your SELECTs:
INDEX(name, status_int, site_filter)
And that compound index would be beneficial for the previous case.
* INDEX(timestamp), if the optimizer chose to use it for the ORDER BY + LIMIT, would do something like (1) Scan the index, (2) reach into the data to check the WHERE, (3) stop after 100 _filtered_ rows. The risk in this is that if most of the rows are filtered out, it would have to check a lot of rows before getting to 100.

Caveat: All of what I said is _usually_ correct; there can be exceptions.

I doubt if you have presented the "real" SELECTs. I hope I have explained things enough so you can predict what happens for them as well. And you should be able to estimate when a possible INDEX will be helpful. Feel free to ask us about more SELECTs and INDEXes.

I have an accurate model of how most of the MySQL optimizer works; it has been validated by thousands of SELECT timings, such as yours.

Thank you for using SQL_NO_CACHE without prompting. (Generally, I recommend turning OFF the Query Cache for production machines. With the query_cache% VARIABLEs and the Qc% STATUSes, I will see how effective you QC is.)

Options: ReplyQuote


Subject
Written By
Posted
Re: Slow query, trying to optimize search..
January 14, 2015 10:18PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.