MySQL Forums
Forum List  »  Performance

Re: query to find gaps in sequential indexes causes CPU spikes
Posted by: Rick James
Date: December 01, 2011 10:13AM

> would setting the innodb_buffer_pool_instances to 27 help overall performance? Is 1GB is the optimal instance size
(I have no experience (yet) with such.) It _should_ help because it would split up a significant performance hindrance -- the "mutex contention". Having multiple instances has some overhead, hence the admonition about not making them smaller than a GB. At the other extreme, having lots (27) may have some overhead, too. How about 9 instances of 3GB? And, if you get any benchmarks that look like they prove something, please post them.

> The table could be split vertically as you suggested, but would that help performance in this case?
It is hard to say. It depends on many more factors that are not yet available (and hard to even ask for) -- What are the common SELECTs? Do they include that BLOB? When you do want the BLOB, are you fetching just one, or several? How many rows are scanned in the various queries? What percent of the BLOBs bigger than 255 characters? The answers interact in complex ways, so even if I had the answers, I could only guess at whether Vertical partitioning would be beneficial. (Some other schemas jump up and say they will likely benefit; yours is less clear cut.)

> I did try the NOT EXISTS suggestion. The cpu usage was down, but only 1 - 2%.
Alas, I can't get them all right.

> max_connections is currently 2500
If you do stick with that high number, you should probably decrease the buffer_pool by a GB, to allow room for 2500 processes.

Is this EXPLAIN still valid?
> select_type type key key_len ref rows extra
> primary range column1_column0 8 NULL 1250268 Using where; Using index
> dependant subquery ref column1_column0 8 const,func 1 Using where; Using index
If so, then the query is "Using index", and the Vertical partitioning would have no effect on it. "Using index" means that the query need only access the index, not the data, to find the desired info. (Think of an INDEX as another table, but with only a subset of the columns in it, and ordered in a different way than the table.)

> The table has 600,001 rows where Column1 (descid) = 10.
So, it works something like:
1. a "range scan" of 600K rows,
2. 600K random lookups into the other table (actually just into the index)
3. Sort (ORDER BY ...).
4. Deliver one row (LIMIT 1).
I don't know if it will cut short on the 600K when it finds what it needs.

This leads back to the question of "why?". Why are you doing this query? Why do you need to re-do it so often? Is there some other way to achieve the result?

> It is possible to pre-calculate and save the information elsewhere which I might have to do.
Yes...

Case 1: The table rarely changes.
Solution 1: Use the Query cache. (Use DEMAND and SQL_CACHE)
Solution 2: Do your own flavor of caching.

Case 2: The table changes frequently, but it is ok to deliver a "stale" answer:
Solution: Do your own flavor of caching.

Case 3: You could do the following:
Solution: Whenever a new row is inserted (or updated or deleted), see if that would lead to the change in this query. Then update a table of all the answers -- a 2 column table with descid (the PRIMARY KEY) and regid. (Without understanding the data, I hesitate to be more specific.)

Case 3 would be slightly more complicated, and much faster.

The query looks for "the first gap in regid for a given descid", correct?

I'm toying with something like
SET @x = 1;
SELECT regid, @x := @x+1 FROM tbl WHERE descid = 10 AND regid != @x LIMIT 1;
That (if it works) would be less than one pass, no sort, etc.

If you don't want to catch a missing regid=1, then replace the SET with
SELECT @x := MIN(regid) FROM tbl WHERE descid = 10;

"Think out of the box".

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: query to find gaps in sequential indexes causes CPU spikes
1159
December 01, 2011 10:13AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.