Re: Large table design issue
Posted by: Rick James
Date: October 25, 2009 09:11AM

Commenting on your ideas:

a) (Split out old data) -- This will help some. It does add complexity, but it sounds like you know how to handle it. This is worth considering, even if you take other steps. However, it is only a partial solution.
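A rough sketch of what that split might look like (the table and column names here -- fact, fact_archive, created_at -- are made up, so map them to your schema):

[ code ]
-- Copy old rows into an archive table, then trim them from the live table.
CREATE TABLE fact_archive LIKE fact;

INSERT INTO fact_archive
SELECT * FROM fact
WHERE  created_at < '2009-01-01';

-- Delete in chunks so you don't hold locks for a long time; repeat until no rows remain.
DELETE FROM fact
WHERE  created_at < '2009-01-01'
LIMIT  10000;
[ / code ]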

b) (5.1; partitioning) -- 5.1, itself, probably has nothing to help your situation. Partitioning probably won't help any. More below.

c) (more slaves) -- More slaves will certainly let you scale out to more users. But it sounds like sluggishness of each query is the big issue.

Adding on...
d) Summary tables -- You mentioned aggregate queries. Suggest you make a table of aggregates of old data, indexed by, say, day. Each night, add the appropriate rows to that table. Then have your frontends query that table. It should be designed to have fewer rows (I aim for at least a 10x reduction), fewer columns (no details, just SUMs and COUNTs), and better index(es). Don't keep AVG; keep SUM and COUNT and divide when you display. Running the aggregation over just the new data each night will be quite fast.
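Here's roughly what I mean (a sketch only; fact, item_id, metric, created_at, and daily_summary are made-up names, so adjust to your schema):

[ code ]
-- One row per item per day; keep SUM and COUNT, never AVG.
CREATE TABLE daily_summary (
  dy          DATE   NOT NULL,
  item_id     INT    NOT NULL,
  cnt         INT    NOT NULL,
  sum_metric  BIGINT NOT NULL,
  PRIMARY KEY (dy, item_id)
);

-- Nightly job: fold in yesterday's rows from the fact table.
INSERT INTO daily_summary (dy, item_id, cnt, sum_metric)
SELECT  DATE(created_at), item_id, COUNT(*), SUM(metric)
FROM    fact
WHERE   created_at >= CURDATE() - INTERVAL 1 DAY
  AND   created_at <  CURDATE()
GROUP BY DATE(created_at), item_id;

-- The frontend computes averages from the summary, not the fact table:
SELECT  item_id, SUM(sum_metric) / SUM(cnt) AS avg_metric
FROM    daily_summary
WHERE   dy >= CURDATE() - INTERVAL 7 DAY
GROUP BY item_id;
[ / code ]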

By doing so, you don't need to add indexes to the main ("fact") table. In fact, you should remove all the indexes except the auto_increment primary key. That makes inserting into it faster.

Ok, perhaps you want up-to-the-second aggregations? Some options:
* [Re]build today's summary rows on the fly. (One UI that does that takes half a second to catch up on an hourly summary of a 13M-row table.)
* UNION between the summary SELECT and a SELECT on the Fact table (sketch below).
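The UNION approach looks something like this (same made-up names as the summary-table sketch above; it works best if you can find today's rows cheaply, e.g. by remembering the last summarized auto_increment id):

[ code ]
SELECT  dy, SUM(cnt) AS cnt, SUM(sum_metric) AS sum_metric
FROM (
        SELECT dy, cnt, sum_metric
        FROM   daily_summary
        WHERE  dy >= CURDATE() - INTERVAL 7 DAY
    UNION ALL
        SELECT DATE(created_at), 1, metric
        FROM   fact
        WHERE  created_at >= CURDATE()   -- only today's not-yet-summarized rows
) AS u
GROUP BY dy;
[ / code ]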

I assume you have an "API layer" between users and the data. Users ask for "foobar stats for last week"; the layer turns that into "SELECT ...". If you already have that layer, switching to the Summary table(s) would be relatively easy. If not, I recommend you add that layer now.

To advise in more detail, please provide
* SHOW CREATE TABLE tbl\G -- engine, indexes
* SHOW TABLE STATUS LIKE 'tbl'\G -- sizes
* EXPLAIN SELECT ...\G -- clues of inefficiencies
* SHOW VARIABLES LIKE '%buffer%'; -- cache size
and surround them with [ code ] and [ / code ]

More of my ramblings on Summary tables:
http://forums.mysql.com/read.php?10,282768 (Which storage engine?)
http://forums.mysql.com/read.php?10,276359 (should i go with myisam or innodb?)
http://forums.mysql.com/read.php?125,266570 (Best way to store/access large amounts of data?)
http://forums.mysql.com/read.php?24,263259 (Slow JOIN to convert IP to Country Code)
http://forums.mysql.com/read.php?10,263071 (Bitmap Index)
http://forums.mysql.com/read.php?24,253221 (InnoDB CPU Spikes)
http://forums.mysql.com/read.php?10,254332 (Very long query doubt... (Monster query...))
http://forums.mysql.com/read.php?125,252723 (Database Design for huge set of data)
http://forums.mysql.com/read.php?10,252593 (Counters)
http://forums.mysql.com/read.php?21,247124 (Solving table locking issues)
http://forums.mysql.com/read.php?10,247779 (compressing tables)
http://forums.mysql.com/read.php?125,245133 (Design approach for summary table by 3 items)
