MySQL Forums
Forum List  »  Performance

Re: Looking for understanding of optimizing queries
Posted by: Rick James
Date: September 09, 2011 09:29AM

> select managementState, count(ID) from cfgNodes group by managementState;
IF the information can be scanned in the GROUP BY order, the calculation is simpler. It reads until there is a change in managementState, tallying as it goes, then works on the next value of managementState. If it did not scan in that order, it would either have to keep a hash of managementState value, together with partial counts, or do a sort.

You can run EXPLAIN twice on the same query and get two different explain plans! This is because of all the things that it takes into consideration, and those things could be changing. Especially if you change constants in the query, the plan can change. (Example WHERE a=1 --> WHERE a=2) I have seen as many as 6 different plans for the 'same' query (where constants differed).

Back to
> select managementState, count(ID) from cfgNodes group by managementState;
The only reason (I think) for saying COUNT(ID) instead of COUNT(*) is if ID could be NULL. Saying COUNT(ID) requires the execution to check ID for NULL -- usually a waste of time.

INDEX(managementState, ID) is perfect for that query, and it will probably be "Using index". However, if this is InnoDB, and you have PRIMARY KEY(managementState, ID), now things get a little tricky. The PK is embedded in the data, not stored separately like other indexes are. Now it has to do a "table scan". (Unfortunately, it sometimes implies it is doing an "index scan".) This is a _rare_ case where adding a redundant index _may_ improve performance.

In MyISAM (which you are using), PRIMARY KEY, UNIQUE KEY, KEY, and INDEX are virtually identical -- they are stored in a separate BTree. (Essentially the only difference: PK and UNIQUE have a uniqueness constraint.)

Now for another wrinkle:
> UNIQUE KEY `selectByManagementState` (`managementState`,`ID`)
That has _all_ the fields of the table. The data in a MyISAM table cannot be assumed to be in any particular order. That index, because of the order (as discussed above), is best for doing the GROUP BY. However, INDEXes are in BTrees, which might be slightly less efficient than the raw data for scanning.

Eh?
CREATE TABLE `IDs` (
`ID` int(11) NOT NULL default '0',
PRIMARY KEY (`ID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Why does this 1-cell table exist? Do you know about AUTO_INCREMENT?

log-queries-not-using-indexes -- I find this to be essentially useless, so I never turn it on. The one obscure benefit: If my tables are small today, and will grow huge tomorrow, this flag may warn me about queries that will become inefficient later. Not using an index is fine if there are only a few rows in the table, but disastrous for million-row tables.

Recommend
long_query_time = 1

> We're planning an MyISAM -> Innodb transition soon.
Such a deal I have for you! (OK, it is free.)
http://mysql.rjweb.org/doc.php/myisam2innodb
That explains a lot of subtle INDEX issues that are different between the two Engines.

> my original goal of eliminating all entries in the slow log is actually going to be impossible.
Essentially, yes. I monitor several dozen different systems' SlowLogs. They all have SlowLog entries. I check them to see if worst ones can be improved. But often they "don't matter", or are "acceptable". Examples
* the SELECT from mysqldump
* batch inserts of data
* report writing (often this can be improved, but still may be acceptable)
* one-of queries where someone is rummaging around in the data.
* experimenting with reformulated SELECTs.
* Oops -- such as a cross-join by mistake (and it won't happen again).

> the WHERE condition is only supposed to filter out very few rows
Then the INDEX is probably useless. In fact, if the WHERE clause is not estimated to whittle the data down to under 10-30% of the total, a table scan is (often) preferred. Again log-queries-not-using-indexes leads to bogus entries.

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: Looking for understanding of optimizing queries
888
September 09, 2011 09:29AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.