MySQL Forums
Forum List  »  Performance

Re: INNODB large database performance
Posted by: Rick James
Date: February 25, 2013 10:39PM

> My thoughts behind merge tables was that they would be easier to manage than partitions?

There are arguments either way.
With PARTITIONs in 5.6, you can play some of the tricks that used to exist only with MERGE.
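For example, purging old data: with MERGE you drop one underlying table and redefine the union; with PARTITION BY RANGE you drop a whole partition almost instantly. A rough sketch (the partition layout and names here are made up for illustration, not from your schema):

```sql
-- Hypothetical layout: one partition per month, keyed on the date column
-- (EntryDate is in the PRIMARY KEY, which partitioning requires).
ALTER TABLE data
    PARTITION BY RANGE (TO_DAYS(EntryDate)) (
        PARTITION p201301 VALUES LESS THAN (TO_DAYS('2013-02-01')),
        PARTITION p201302 VALUES LESS THAN (TO_DAYS('2013-03-01')),
        PARTITION pFuture VALUES LESS THAN MAXVALUE
    );

-- Dropping a month is a near-instant metadata operation,
-- not billions of row-deletes:
ALTER TABLE data DROP PARTITION p201301;
```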

> I already have summary tables which is what I use if the date range is greater than a month

Compare the time taken for a report of 27 days versus 1 month. I suspect the 'month' is much faster. A summary table with 1-day granularity will work fine for any number of days from 1 up. If the users don't like a "day" breaking at midnight, then maybe a summary table by hours would work for shorter time spans.
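A minimal sketch of such a 1-day summary table (table and column names here are illustrative, not from your schema):

```sql
CREATE TABLE data_daily (
    dy        DATE               NOT NULL,  -- one row per channel per day
    ChannelID MEDIUMINT UNSIGNED NOT NULL,
    cnt       INT UNSIGNED       NOT NULL,  -- number of raw readings
    sum_val   DOUBLE             NOT NULL,  -- keep SUM, not AVG (see below)
    min_val   FLOAT              NOT NULL,
    max_val   FLOAT              NOT NULL,
    PRIMARY KEY (ChannelID, dy)
) ENGINE=InnoDB;

-- Populate (or re-populate) one day's worth:
INSERT INTO data_daily
    SELECT DATE(EntryDate), ChannelID, COUNT(*), SUM(DataVal),
           MIN(DataVal), MAX(DataVal)
    FROM data
    WHERE EntryDate >= '2013-01-01'
      AND EntryDate <  '2013-01-01' + INTERVAL 1 DAY
    GROUP BY DATE(EntryDate), ChannelID;
```

Store SUM and COUNT rather than AVG -- then an average over any multi-day span is SUM(sum_val)/SUM(cnt), which averaging the averages would get wrong.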

> A customer can have 10000 channels each writing data every 5 or 15 minutes

For efficiency, use 'batched' inserts or LOAD DATA. (Not one row at a time.)
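That is, something like this (column list assumed from your schema; the file path is made up):

```sql
-- One statement, many rows -- much faster than one INSERT per row:
INSERT INTO data (EntryDate, DataVal, ChannelID)
VALUES ('2013-02-25 10:00:00', 1.5, 101),
       ('2013-02-25 10:00:00', 2.7, 102),
       ('2013-02-25 10:00:00', 0.9, 103);

-- Or, if the feed lands in a file:
LOAD DATA INFILE '/tmp/feed.csv'
    INTO TABLE data
    FIELDS TERMINATED BY ','
    (EntryDate, DataVal, ChannelID);
```

Aim for something like 100-1000 rows per batch; beyond that, the returns diminish.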

> d.EntryDate BETWEEN '2013-01-01 00:00:00' AND '2013-02-15 23:59:59'

I prefer
      d.EntryDate >= '2013-01-01'
 AND  d.EntryDate <  '2013-01-01' + INTERVAL 45 DAY
It is much easier than worrying about leap years, month lengths, the 23:59:59 endpoint, etc.

14M rows puts it at the 90th percentile of tables discussed on these forums; 1B is at the 99th percentile. You will need a lot more techniques than we have discussed so far. With 3 secondary keys, a billion rows could take 12 months to load (with 100% of the disk I/O going to writes), unless you take some combination of these actions:
* some flavor of RAID striping.
* get rid of some of the indexes.
* get rid of foreign keys.
* summary tables -- this helps you get rid of indexes.

I assume you have enough disk space for a billion (or 15B) rows?

Goal #1:
What is the most number of rows a single 'customer' can have? Will that fit on a single machine? Will it perform 'adequately'?

Goal #2:
If you are going for 15B rows, you should consider sharding. That is, put different customers on different machines. Or... (and this gets more complex), spread a customer's data across multiple shards.

Assuming you are talking about 15B rows in `data` but much smaller counts in the other table, let's take a serious look at shrinking the size of `data`:
CREATE TABLE `data` (
  `EntryDate` datetime NOT NULL,     -- 8 bytes; TIMESTAMP is only 4; change it!
  `DataVal` float NOT NULL,          -- 4 bytes; OK, unless you can use MEDIUMINT (3 bytes) or SMALLINT (2)
  `ChannelID` int(11) NOT NULL,      -- 4 bytes (and unnecessarily SIGNED?); would MEDIUMINT UNSIGNED (0..16M, 3 bytes) suffice?
  `SiteRateID` int(11) DEFAULT NULL, -- 4 bytes plus a NULL bit; can you use, say, zero instead of NULL?
  PRIMARY KEY (`EntryDate`,`ChannelID`),  -- good order, assuming data arrives roughly chronologically
  KEY `FK_data_ChannelID` (`ChannelID`),  -- is this needed? How many distinct ChannelIDs will there be?
  CONSTRAINT `FK_data_ChannelID` FOREIGN KEY ...  -- the dynamic checks will eat performance!
) ENGINE=InnoDB          -- good -- self-recovers after a power failure
  DEFAULT CHARSET=latin1; -- no CHARs, so it does not matter

Changing EntryDate from DATETIME to TIMESTAMP saves 4 bytes x 15B rows = 60GB!
Smaller --> more cacheable --> faster
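The change itself is one ALTER. Caveats: on a multi-billion-row table this rebuilds the whole table, so schedule it; and TIMESTAMP only covers 1970 through 2038, so check your date range first.

```sql
ALTER TABLE data
    MODIFY EntryDate TIMESTAMP NOT NULL;  -- 8 bytes -> 4 bytes per row
```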
