> My thoughts behind merge tables was that they would be easier to manage than partitions?
There are arguments either way.
With PARTITIONs in 5.6, you can play some of the tricks that used to exist only with MERGE.
> I already have summary tables which is what I use if the date range is greater than a month
Compare the time taken for a report of 27 days versus 1 month. I suspect the 'month' is much faster. A summary table with 1-day granularity will work fine for any number of days from 1 up. If the users don't like a "day" breaking at midnight, then maybe a summary table by hours would work for shorter time spans.
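A sketch of what a 1-day summary table might look like (the table and column names here are illustrative assumptions based on the `data` table discussed below, not your actual schema):

```sql
-- Hypothetical daily summary table; names are illustrative.
CREATE TABLE data_daily (
    Day DATE NOT NULL,
    ChannelID MEDIUMINT UNSIGNED NOT NULL,
    NumReadings INT UNSIGNED NOT NULL,
    SumVal FLOAT NOT NULL,             -- store SUM and COUNT, not AVG, so days can be combined
    PRIMARY KEY (ChannelID, Day)
) ENGINE=InnoDB;

-- Augment nightly from the raw data:
INSERT INTO data_daily (Day, ChannelID, NumReadings, SumVal)
    SELECT DATE(EntryDate), ChannelID, COUNT(*), SUM(DataVal)
      FROM data
     WHERE EntryDate >= CURDATE() - INTERVAL 1 DAY
       AND EntryDate <  CURDATE()
     GROUP BY DATE(EntryDate), ChannelID;
```

A report then gets an average over any span of whole days via SUM(SumVal)/SUM(NumReadings), touching one summary row per channel per day instead of hundreds of raw rows.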
> A customer can have 10000 channels each writing data every 5 or 15 minutes
For efficiency, use 'batched' inserts or LOAD DATA. (Not one row at a time.)
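For example (illustrative only; column names assume the `data` table shown below), one INSERT carrying many rows, or LOAD DATA from a file that an intermediate process has batched up:

```sql
-- One round trip for many rows beats many single-row INSERTs:
INSERT INTO data (EntryDate, DataVal, ChannelID)
VALUES ('2013-01-01 00:00:00', 1.2, 101),
       ('2013-01-01 00:00:00', 3.4, 102),
       ('2013-01-01 00:05:00', 1.3, 101);

-- Or, faster still, bulk-load a CSV:
LOAD DATA INFILE '/tmp/batch.csv'
    INTO TABLE data
    FIELDS TERMINATED BY ','
    (EntryDate, DataVal, ChannelID);
```

Batching on the order of 100-1000 rows per statement captures most of the benefit.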
> d.EntryDate BETWEEN '2013-01-01 00:00:00' AND '2013-02-15 23:59:59'
I prefer
d.EntryDate >= '2013-01-01'
AND d.EntryDate < '2013-01-01' + INTERVAL 45 DAY
The half-open range (>= start, < end) is much easier than worrying about leap years, month lengths, etc., and there is no fencepost fiddling with '23:59:59' at the end.
14M rows puts it at the 90th percentile of tables discussed on these forums; 1B is at the 99th percentile. You will need more techniques than we have discussed so far. With 3 secondary keys, a billion rows will take 12 months to load into the database (with the disk 100% busy with writes) unless you take various actions:
* some flavor of RAID striping.
* get rid of some of the indexes.
* get rid of foreign keys.
* summary tables -- this helps you get rid of indexes.
I assume you have enough disk space for a billion (or 15B) rows?
Goal #1:
What is the maximum number of rows a single 'customer' can have? Will that fit on a single machine? Will it perform 'adequately'?
Goal #2:
If you are going for 15B rows, you should consider sharding. That is, put different customers on different machines. Or... (and this gets more complex), spread a customer's data across multiple shards.
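One simple way to route customers to shards (a sketch; the table and column names are hypothetical) is a small lookup table kept on a central or replicated server:

```sql
-- Hypothetical customer-to-shard map; the application looks up the
-- shard host here, then connects to that machine for the customer's data.
CREATE TABLE customer_shard (
    CustomerID INT UNSIGNED NOT NULL,
    ShardHost  VARCHAR(64) NOT NULL,   -- which machine holds this customer's rows
    PRIMARY KEY (CustomerID)
) ENGINE=InnoDB;
```

A lookup table like this also makes it easy to move a big customer to a less-loaded machine later; hashing CustomerID to a shard is simpler but makes rebalancing harder.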
Assuming you are talking about 15B rows in `data` but much smaller counts in the other table, let's take a serious look at shrinking the size of `data`:
CREATE TABLE `data` (
`EntryDate` datetime NOT NULL, -- 8 bytes; TIMESTAMP is only 4; change it!
`DataVal` float NOT NULL, -- 4 bytes; OK, unless you can use MEDIUMINT (3 bytes) or SMALLINT (2)
`ChannelID` int(11) NOT NULL, -- 4 bytes (and unnecessarily SIGNED?); would MEDIUMINT UNSIGNED (0..16M, 3 bytes) suffice?
`SiteRateID` int(11) DEFAULT NULL, -- 4 bytes plus NULL bit; can you use, say, zero instead of NULL?
PRIMARY KEY (`EntryDate`,`ChannelID`), -- Good order, assuming data comes in roughly in chronological order
KEY `FK_data_ChannelID` (`ChannelID`), -- is this needed? How many different ChannelIDs will there be?
CONSTRAINT `FK_data_ChannelID` FOREIGN KEY ... -- the dynamic checks will eat performance!
) ENGINE=InnoDB -- good -- self-recovers after power failure.
DEFAULT CHARSET=latin1; -- no CHARs, so does not matter.
Changing EntryDate from DATETIME to TIMESTAMP saves 4 bytes per row -- 60GB over 15B rows!
Smaller --> more cacheable --> faster
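Putting those shrinkage suggestions together, the table might become something like this (a sketch; whether MEDIUMINT UNSIGNED suffices for ChannelID, and whether 0 can stand in for NULL in SiteRateID, depends on your data):

```sql
CREATE TABLE `data` (
    `EntryDate` TIMESTAMP NOT NULL,             -- 4 bytes instead of 8 (note: range ends in 2038)
    `DataVal` FLOAT NOT NULL,                   -- or SMALLINT/MEDIUMINT if the precision allows
    `ChannelID` MEDIUMINT UNSIGNED NOT NULL,    -- 3 bytes, 0..16M channels
    `SiteRateID` INT UNSIGNED NOT NULL DEFAULT 0,  -- 0 instead of NULL; no NULL bit
    PRIMARY KEY (`EntryDate`, `ChannelID`)
    -- secondary key and FOREIGN KEY dropped for load speed
) ENGINE=InnoDB;
```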