MySQL :: Looking for understanding of optimizing queries

New Topic

Looking for understanding of optimizing queries

Posted by: Peter Valdemar Mørch
Date: September 07, 2011 07:00AM

Hi,

We have an app with 101 tables that need a review with respect to indexes and queries for performance reasons.

I therefore enabled the slow log, and set out to eradicate all entries in it.

Now I discover, that:

1: Sometimes "explain $query" shows that an index is used (all key columns are
   NOT NULL), when in fact "$query" shows up in the slow log exactly iff
   log-queries-not-using-indexes is set in /etc/mysql/my.cnf. (See below for
   example) Why is that?
2: Whether or not an index is used depends on many things, including:
  2a: What (number of) rows that are in each table involved
  2b: What columns are selected
3: Sometimes even if indexes are used, I've seen that another set of indexes
   can speed up the query 10-fold. How can I identify such cases?

select Version() is "5.1.49-3-log" from mysql-server-5.1 5.1.49-3 on debian stable

A couple of questions:

Especially 1: above, that "explain $query" and the slow log disagree, really makes it difficult for me to move on. How can I get a list of which queries don't use indexes? Am I doing something wrong here?

Are there better tools to analyse queries than "explain extended $query"? (I hope so)

My naive attempt at an approach was: Take a backup/mysqldump from a representative large installation. Analyse its slow log. Make sure that queries on that large installation don't show up in the slow log (that doesn't work because of 1: above). Does that make sense? Are there better approaches? Unfortunately my previous approach: Run it all on my developer machine with a tiny debug database doesn't work...

Since which index is used depends on so many things, I guess for the developer optimising queries and indexes is more of an experience-based trial-and-error sort of thing than something that it is predictable. Or what?

We're planning an MyISAM -> Innodb transition soon. Do the index performance characteristics change so that we should do the transition first and optimise post-transition? I guess that is a yes, but how much so?

Extra details and dumps, etc:

###############
About 1:
###############

CREATE TABLE `atable` (
  `ID` int(11) NOT NULL auto_increment,
  `otherID` int(11) NOT NULL default '0',
  PRIMARY KEY  (`ID`),
  KEY `otherID` (`otherID`)
) ENGINE=MyISAM;
INSERT INTO `atable` VALUES (98,135),(97,134),(4,4),(5,5);

RESET QUERY CACHE; FLUSH TABLES;

-- This shows that key = "otherID" which I interpret to mean that the otherID
-- index will be used
EXPLAIN SELECT otherID FROM atable;

-- However, this still shows up in the show log exactly iff
-- log-queries-not-using-indexes is set in my.cnf
SELECT otherID FROM atable;

###############
About 2a:
###############
I have two database dumps that are too large for inlining, so I've put them up on github's gist for downloading [1]. See links at the bottom.

In the following, I'll provide the details for two databases with identical structure but different rows and a particular query, where explain says something different for the two databases, but even though explain in both cases assures me that indexes are used, the slow log still mentions these selects for some reason (as described above):

The tables involved are here and at [2]:

CREATE TABLE `IDs` (
  `ID` int(11) NOT NULL default '0',
  PRIMARY KEY  (`ID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `cfgMonitoredTableRows` (
  `ID` int(11) NOT NULL,
  `nodeID` int(11) NOT NULL default '0',
  `monitoredTableID` int(11) NOT NULL default '0',
  PRIMARY KEY  (`ID`),
  KEY `nodeTable` (`nodeID`,`monitoredTableID`),
  KEY `mtrSelectInDtMonitoredTablesCron5` (`nodeID`,`ID`,`monitoredTableID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `cfgNodes` (
  `ID` int(11) NOT NULL,
  `managementState` enum('managed','silent','unmanaged') NOT NULL default 'managed',
  PRIMARY KEY  (`ID`),
  UNIQUE KEY `selectByManagementState` (`managementState`,`ID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE `dtEventSources` (
  `ID` int(11) NOT NULL,
  `sourceType` enum('capmonStatus','nodeAlarm','serviceAlarm','statThresholdAlarm','interfaceStatusAlarm','newNodeEvent','syslogFilterMatchEvent','spanningTreeEvent','configBackupEvent','syslogFilterAlarm','configBackupAlarm') NOT NULL default 'nodeAlarm',
  `currentState` enum('ok','pending','unknown','softWarning','warning','softCritical','critical','unmanaged') default NULL,
  `nodeID` int(11) default NULL,
  PRIMARY KEY  (`ID`),
  KEY `nodeID` (`nodeID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;j

The select involved can be found in [3]:

SELECT MTR.nodeID
    FROM     cfgMonitoredTableRows as MTR,
             IDs,
             cfgNodes as N
             LEFT JOIN dtEventSources as ES ON
               ( ES.sourceType = "nodeAlarm" and ES.nodeID = N.ID )
    WHERE    MTR.ID = IDs.ID
             AND MTR.nodeID = N.ID
             AND ES.currentState != "critical"
             AND N.managementState != "unmanaged";

For a "large" database [4] with "many" rows, explain tells me that a key is used for all tables in the select:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: N
         type: index
possible_keys: PRIMARY,selectByManagementState
          key: selectByManagementState
      key_len: 5
          ref: NULL
         rows: 1814
     filtered: 100.00
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: ES
         type: ref
possible_keys: nodeID
          key: nodeID
      key_len: 5
          ref: large.N.ID
         rows: 4
     filtered: 100.00
        Extra: Using where
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: MTR
         type: ref
possible_keys: PRIMARY,nodeTable,mtrSelectInDtMonitoredTablesCron5
          key: mtrSelectInDtMonitoredTablesCron5
      key_len: 4
          ref: large.ES.nodeID
         rows: 47
     filtered: 100.00
        Extra: Using where; Using index
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: IDs
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: large.MTR.ID
         rows: 1
     filtered: 100.00
        Extra: Using index

But for the exact same structure in a "small" database [5] with "few" rows, explain tells me that a different set of keys will be used:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: IDs
         type: index
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 118
     filtered: 100.00
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: MTR
         type: eq_ref
possible_keys: PRIMARY,nodeTable,mtrSelectInDtMonitoredTablesCron5
          key: PRIMARY
      key_len: 4
          ref: small.IDs.ID
         rows: 1
     filtered: 100.00
        Extra: 
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: N
         type: eq_ref
possible_keys: PRIMARY,selectByManagementState
          key: PRIMARY
      key_len: 4
          ref: small.MTR.nodeID
         rows: 1
     filtered: 100.00
        Extra: Using where
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: ES
         type: ref
possible_keys: nodeID
          key: nodeID
      key_len: 5
          ref: small.MTR.nodeID
         rows: 10
     filtered: 103.83
        Extra: Using where

Regardless of what explain says, they go in the slow log...

URLS:

1: https://gist.github.com/1200395
2: https://raw.github.com/gist/1200395/createTables.sql
3: https://raw.github.com/gist/1200395/select.sh
4: https://raw.github.com/gist/1200395/large.sql
5: https://raw.github.com/gist/1200395/small.sql

Navigate: Previous Message• Next Message

Options: Reply• Quote

Subject

Views

Written By

Posted

Looking for understanding of optimizing queries

3467

Peter Valdemar Mørch

September 07, 2011 07:00AM

Re: Looking for understanding of optimizing queries

1025

Jørgen Løland

September 08, 2011 06:34AM

Re: Looking for understanding of optimizing queries

904

Peter Valdemar Mørch

September 08, 2011 07:14AM

Re: Looking for understanding of optimizing queries

891

Jørgen Løland

September 09, 2011 02:35AM

Re: Looking for understanding of optimizing queries

773

Peter Valdemar Mørch

September 09, 2011 03:59AM

Re: Looking for understanding of optimizing queries

894

Rick James

September 09, 2011 09:29AM

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.