MySQL :: Re: large InnoDB table partitioning without explicit PK

New Topic

Re: large InnoDB table partitioning without explicit PK

Posted by: Rick James
Date: November 19, 2014 12:12AM

(I guess I missed this message.)

> Regarding Select_scan / Com_select - (relatively high) number of full table scans, there was a select statement...

I love working with you -- I point you in a direction; you carry the ball from there.

> My idea it to change it and perform aggregations on MySQL directly (either on a heap table / temp or on a slave or both).

Sure. The use of the MEMORY cache was designed specifically to make that work well. (BTW, my implementation did not have multiple threads, and it could batch multiple rows in a single INSERT.)

An issue with aggregation -- when doing the scan to collect the rows to summarize, it is best to have those rows consecutive. That is, they should be one of
* an entire table (such as the MEMORY cache table) or
* an entire partition or
* a range of rows based on the PRIMARY KEY (not a secondary key!)

This is because it is very efficient to do a table scan or a range scan; it is much less efficient to use a secondary key. In case you are not familiar with InnoDB's indexing scheme, I'll briefly explain it.
* A B+Tree is efficient to scan consecutive rows.
* The data and the PRIMARY KEY live together in one BTree (actually a B+Tree);
* Each secondary INDEX lives in a separate B+Tree -- it contains the secondary column(s), plus the PK's column(s);
* A secondary key lookup requires 2 BTree lookups; Thus, scanning one of your secondary keys would be quite inefficient.

By using an AUTO_INCREMENT PK (and no cache table), you have a way to summarize less than the full table (or partition) efficiently. You just remember where you "left off".

The aggregation would probably be something like
INSERT INTO agg SELECT a,b, sum(c), count(*), ... FROM xx [WHERE (range)] GROUP BY a,b;
That requires touching each original row 1 extra time, on top of the 1 INSERT that touches it now. I would guess that the aggregation might take one-fifth the CPU cost of the single-row inserts.

Sorry, but adding in aggregation will mess with your nice 23K inserts/sec. I don't really know how much it will degrade things -- to make the scan possible and/or to add an AI and/or to have an efficient cache table, etc. I would hope that you could get aggregation + 3rd party code at better than 10K inserts/sec.

Your COUNT(*) slow queries could, instead, run off some agg table.

> compressed and transferred onto another system.

Do you need replication for backup? Seems like this 'dump' could be your backup?
I rarely seen a need to push "reports" to a Slave -- given that the Summary tables are well designed.

> I quickly noticed that all available RAM was eaten up and MySQL started crashing.

Was that because of running out of memory outside MySQL? When MySQL (buffer_pool, etc) starts swapping, it feels like a crash.

Navigate: Previous Message• Next Message

Options: Reply• Quote

Subject

Views

Written By

Posted

large InnoDB table partitioning without explicit PK

5213

Miko M

November 12, 2014 07:41PM

Re: large InnoDB table partitioning without explicit PK

2535

Rick James

November 13, 2014 06:40PM

Re: large InnoDB table partitioning without explicit PK

2384

Miko M

November 14, 2014 03:02AM

Re: large InnoDB table partitioning without explicit PK

1810

Rick James

November 15, 2014 12:05AM

Re: large InnoDB table partitioning without explicit PK

1838

Miko M

November 16, 2014 07:39PM

Re: large InnoDB table partitioning without explicit PK

1800

Rick James

November 17, 2014 04:21PM

Re: large InnoDB table partitioning without explicit PK

1856

Rick James

November 17, 2014 07:48PM

Re: large InnoDB table partitioning without explicit PK

1844

Miko M

November 18, 2014 02:00AM

Re: large InnoDB table partitioning without explicit PK

1917

Rick James

November 18, 2014 11:35PM

Re: large InnoDB table partitioning without explicit PK

1908

Miko M

November 22, 2014 06:44AM

Re: large InnoDB table partitioning without explicit PK

1781

Miko M

November 18, 2014 01:13AM

(non-unique) index more efficient than partition pruning

2238

Miko M

November 19, 2014 12:27AM

Re: (non-unique) index more efficient than partition pruning

2004

Rick James

November 19, 2014 06:01PM

Re: (non-unique) index more efficient than partition pruning

2127

Miko M

November 22, 2014 08:39AM

Re: large InnoDB table partitioning without explicit PK

2077

Miko M

November 14, 2014 03:04AM

Re: large InnoDB table partitioning without explicit PK

2118

Miko M

November 14, 2014 03:05AM

Re: large InnoDB table partitioning without explicit PK

2132

Rick James

November 15, 2014 12:32AM

Re: large InnoDB table partitioning without explicit PK

2138

Miko M

November 16, 2014 08:31PM

Re: large InnoDB table partitioning without explicit PK

1916

Rick James

November 19, 2014 12:12AM

Re: large InnoDB table partitioning without explicit PK

1810

Miko M

November 22, 2014 09:07AM

Re: large InnoDB table partitioning without explicit PK

1857

Rick James

November 23, 2014 09:00PM

Re: large InnoDB table partitioning without explicit PK

2000

Miko M

December 01, 2014 11:16AM

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.