MySQL Forums :: Partitioning :: large InnoDB table partitioning without explicit PK


Advanced Search

Re: large InnoDB table partitioning without explicit PK
Posted by: Miko M ()
Date: November 22, 2014 09:07AM

> (I guess I missed this message.)
>
> > Regarding Select_scan / Com_select - (relatively
> high) number of full table scans, there was a
> select statement...
>
> I love working with you -- I point you in a
> direction; you carry the ball from there.
[Miko] The pleasure's all mine!
>
> > My idea it to change it and perform aggregations
> on MySQL directly (either on a heap table / temp
> or on a slave or both).
>
> Sure. The use of the MEMORY cache was designed
> specifically to make that work well. (BTW, my
> implementation did not have multiple threads, and
> it could batch multiple rows in a single INSERT.)
[Miko] Ok, thanks for letting me know.

> An issue with aggregation -- when doing the scan
> to collect the rows to summarize, it is best to
> have those rows consecutive. That is, they should
> be one of
> * an entire table (such as the MEMORY cache table)
> or
> * an entire partition or
> * a range of rows based on the PRIMARY KEY (not a
> secondary key!)
>
> This is because it is very efficient to do a table
> scan or a range scan; it is much less efficient to
> use a secondary key. In case you are not familiar
> with InnoDB's indexing scheme, I'll briefly
> explain it.
> * A B+Tree is efficient to scan consecutive rows.
> * The data and the PRIMARY KEY live together in
> one BTree (actually a B+Tree);
> * Each secondary INDEX lives in a separate B+Tree
> -- it contains the secondary column(s), plus the
> PK's column(s);
> * A secondary key lookup requires 2 BTree lookups;
> Thus, scanning one of your secondary keys would
> be quite inefficient.
>
> By using an AUTO_INCREMENT PK (and no cache
> table), you have a way to summarize less than the
> full table (or partition) efficiently. You just
> remember where you "left off".
>
> The aggregation would probably be something like
> INSERT INTO agg SELECT a,b, sum(c), count(*), ...
> FROM xx [WHERE (range)] GROUP BY a,b;
> That requires touching each original row 1 extra
> time, on top of the 1 INSERT that touches it now.
> I would guess that the aggregation might take
> one-fifth the CPU cost of the single-row inserts.
[Miko] Many thanks for your explanation.
So it seems that I will have to use the partition to collect the rows.
In this case do you think it is still worth to add an AUTO_INCREMENT BIGINT as the PK (together with existing partitioning key)?

> Sorry, but adding in aggregation will mess with
> your nice 23K inserts/sec. I don't really know
> how much it will degrade things -- to make the
> scan possible and/or to add an AI and/or to have
> an efficient cache table, etc. I would hope that
> you could get aggregation + 3rd party code at
> better than 10K inserts/sec.
[Miko] Aggregations will be carried out in a slightly different way.
Data will be copied based on the partitioning key into the MEMORY engine and all the aggregations will done there.

> Your COUNT(*) slow queries could, instead, run off
> some agg table.
>
> > compressed and transferred onto another system.
>
>
> Do you need replication for backup? Seems like
> this 'dump' could be your backup?
> I rarely seen a need to push "reports" to a Slave
> -- given that the Summary tables are well
> designed.
[Miko] Actually, not. Initially I thought that all the aggregations and summaries could be carried out on a slave. But MEMORY on a master seems to be even a better option.

> > I quickly noticed that all available RAM was
> eaten up and MySQL started crashing.
>
> Was that because of running out of memory outside
> MySQL? When MySQL (buffer_pool, etc) starts
> swapping, it feels like a crash.
[Miko] If I can recall it correctly this situation that often led to MySQL instance crash was occuring when large volume of data (whole hourly partition) was save into a flat file and a tmp table has been used (select * from Table1 INTO OUTFILE '/tmp/partP2014111213' where col20 between...)
It really looked strange since tmp_table_size was configured to 16777216
I am happy to perform such test again as ling as you want to know the details.

Options: ReplyQuote


Subject Views Written By Posted
large InnoDB table partitioning without explicit PK 2455 Miko M 11/12/2014 07:41PM
Re: large InnoDB table partitioning without explicit PK 1388 Rick James 11/13/2014 06:40PM
Re: large InnoDB table partitioning without explicit PK 1220 Miko M 11/14/2014 03:02AM
Re: large InnoDB table partitioning without explicit PK 973 Rick James 11/15/2014 12:05AM
Re: large InnoDB table partitioning without explicit PK 992 Miko M 11/16/2014 07:39PM
Re: large InnoDB table partitioning without explicit PK 875 Rick James 11/17/2014 04:21PM
Re: large InnoDB table partitioning without explicit PK 976 Rick James 11/17/2014 07:48PM
Re: large InnoDB table partitioning without explicit PK 943 Miko M 11/18/2014 02:00AM
Re: large InnoDB table partitioning without explicit PK 989 Rick James 11/18/2014 11:35PM
Re: large InnoDB table partitioning without explicit PK 1033 Miko M 11/22/2014 06:44AM
Re: large InnoDB table partitioning without explicit PK 927 Miko M 11/18/2014 01:13AM
(non-unique) index more efficient than partition pruning 1152 Miko M 11/19/2014 12:27AM
Re: (non-unique) index more efficient than partition pruning 1029 Rick James 11/19/2014 06:01PM
Re: (non-unique) index more efficient than partition pruning 1055 Miko M 11/22/2014 08:39AM
Re: large InnoDB table partitioning without explicit PK 1042 Miko M 11/14/2014 03:04AM
Re: large InnoDB table partitioning without explicit PK 1093 Miko M 11/14/2014 03:05AM
Re: large InnoDB table partitioning without explicit PK 1176 Rick James 11/15/2014 12:32AM
Re: large InnoDB table partitioning without explicit PK 1112 Miko M 11/16/2014 08:31PM
Re: large InnoDB table partitioning without explicit PK 953 Rick James 11/19/2014 12:12AM
Re: large InnoDB table partitioning without explicit PK 918 Miko M 11/22/2014 09:07AM
Re: large InnoDB table partitioning without explicit PK 911 Rick James 11/23/2014 09:00PM
Re: large InnoDB table partitioning without explicit PK 999 Miko M 12/01/2014 11:16AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.