MySQL Forums
Forum List  »  Performance

Re: Adding index kills query
Posted by: Robert Stafford
Date: April 27, 2014 07:53PM

After several suggestions, I have now fixed two issues with my original code: (a) I added primary keys to all tables; (b) I changed the Transaction Amount variable from float to decimal.

To clarify a few issues, this database does not change - it was provided to me to analyze, so there are no writes to it. It lives on a standalone computer, I am the only user, and I have generally only run one process against the database at a time. I would like to perform a variety of analyses on the main table, which has 350 million rows, including summing by various category variables, looking for patterns within those sums, etc.

In an effort to understand the problem with the query performance, I ran a series of tests, which involved creating a table, Trans_smp_micro, consisting of a subset of the original table of various sizes, and running two queries:

select iCompanyID, sum(dcPurchaseAmt)
FROM Trans_smp_micro
GROUP BY iCompanyID
;

select iCompanyID, sum(dcPurchaseAmt)
FROM Trans_smp_micro
ignore index (idxCompBrand)
GROUP BY iCompanyID
;

I ran this query on subsets varying from 30K to 19M rows, using both the InnoDB and MyISAM engines, and found the following patterns (all performance reported as execution time seconds per million rows):

InnoDB ignoring index: 1.5 s/M for less than 1M rows, then jumping to 4.5 s/M up to 4M rows. I have never run the query on bigger sets to completion

InnoDB using index: increasing from 3.0 s/M at 30K to 3.9 s/M at 500K, then jumping to 238 s/M at 1M, 850 s/M at 2m, and 1,257 s/M at 4M. I have never run the query on bigger sets to completion.

MyISAM ignoring index: 0.9 s/M up to 17M rows, then jumping to 1.8 s/M for 19M rows

MyISAM using index: 0.9 s/M up to 17M rows, then jumping to 1,450 s/M for 19M rows

Several observations:
- With both engines, both using and ignoring the index, the performance remains relatively stable until the row count hits a threshold, at which point the performance deteriorates. I assume this is because the system runs out of a resource, probably RAM allocated to the sort function, and changes the approach.

- MyISAM is always faster than InnoDB for the same size, and the performance with MyISAM does not deteriorate until I get to much bigger sets.

- With MyISAM, the performance using and ignoring the index is roughly identical until the system hits the threshold, at which point the execution time doubles ignoring the index, but increases by a factor of more than 1000 using the index.

- with InnoDB, ignoring the index is always at least twice as fast as using the index, and the performance wall is much smaller: execution time doubles without the indexs, while it increases by factor of hundreds using the index

Which leaves me with a few questions:

- Why is using an index not helping, and sometimes badly hurting, the performance?

- Are there resource settings that I can adjust to avoid the performance threshold? Ultimately, once I'm done testing and tuning, I need to run a series of queries on the full set, with 350M rows.

Options: ReplyQuote


Subject
Views
Written By
Posted
2287
April 23, 2014 06:10PM
1019
April 23, 2014 06:14PM
1059
April 24, 2014 02:44AM
1205
April 24, 2014 09:07AM
1009
April 25, 2014 07:39AM
1140
April 25, 2014 11:54AM
1064
April 25, 2014 02:25PM
1107
April 28, 2014 06:06AM
923
April 24, 2014 09:37AM
981
April 24, 2014 06:31PM
1267
April 24, 2014 07:10PM
1085
April 24, 2014 07:19PM
1224
April 25, 2014 10:55AM
942
April 25, 2014 11:02PM
Re: Adding index kills query
1898
April 27, 2014 07:53PM
1024
April 27, 2014 08:50PM
1009
April 27, 2014 09:00PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.