Re: large InnoDB table partitioning without explicit PK
Posted by: Miko M
Date: December 01, 2014 11:16AM
> > As far as I understand MySQL is a multithreaded
> Yes and no.
> * Each connection might use a separate process
> (thread). But only one CPU per connection is
> * Disk I/O (for all connections) makes use of some
> number of extra threads. The number of such
> threads is controlled by
[Miko] Ok, I see.
> > I am not sure whether I fully understand Plan C.
> Could you please give me a hint which variable
> should I use/tune here?
> Let's put that off. I don't have enough
> confidence that it is worth pursuing.
> > However with 200B of average size of a record,
> even high insert rate (5k inserts between BEGIN
> and COMMIT) should non be a problem for
> innodb_log_buffer_size=16M (as long as I
> understand this variable properly).
> I don't know how much overhead there is on top of
> the 200B, so I don't know how to do the
> > Moreover I've found very few sources in which
> authors advise to use more than 32M for
> > However in one source I found a number of 25% of
> And those sources probably don't give enough
> details to explain their recommendations. Sorry,
> I don't have any good rationale, either.
> > I should have observed status after either every
> parameter/variable change or load characteristic
> Yes, but only after running for a significant
> amount of time (say, a day, or a zillion bytes of
> > How running sql #1 and sql #2 one by one for
> every column may affect the inserting performance,
> I mean inserting into the Staging table?
> 1. Insert 1000 rows into Staging
> 2. run the 2 SQL statements to normalize col1.
> 3. run the 2 SQL statements to normalize col2.
> 4. run the 2 SQL statements to normalize col3.
> 5. Summarize: one INSERT .. SELECT .. GROUP BY..
> for each summary table
> 6. INSERT INTO FACT SELECT * FROM Staging.
> The more rows you INSERT in step 1, the more you
> amortize the cost of the other steps.
> Running SQL 1&2 on 10000 rows does not take much
> longer than for 1000 rows. It is somewhere
> between 1x and 10x.
> No, don't take that to extremes -- don't Stage
> millions of rows. There is a hard-to-define sweet
> spot somewhere:
> * A smaller number of rows gets the data into
> Summary faster
> * A larger number is more efficient overall.
> * Larger may have issues with Replication.
> * Larger may cause less frequent, but longer,
> interference with SELECTs.
> * Etc.
> So, it depends on what is important to you. I
> don't yet have the full list; and I doubt if you
> have the full list.
[Miko] Yes! It makes sense and works well!
However, I have to admit that I've expected that operations on heap tables (staging) will run faster (more efficient). Btree index helps a lot but it takes some time to built it too. I wonder if you have ever compared heap tables to innodb on ram disk?
> > To me it looks like more and more exclusive
> The exclusive locks on Staging are irrelevant if
> you ping-pong.
> The exclusive locks on Summary and Fact are
> important, but they are less frequent because of
> the way Staging works -- a goal of Staging is to
> do all the prep work in Staging and StageProcess,
> thereby making the transfer to Summary & Fact fast
> and efficient. Or at least more efficient that
> your original IODKU.
> Anyway, do you care about locks on Summary and
> Fact? You probably do, but we have not gotten
> into those details yet.
[Miko] There are not many (concurrent) users and locks (on Summary especially) are not significant. As a matter of fact this system has much more writes (inserts) than reads (selects).
> * If the app is doing a point query into Fact, it
> is extremely unlikely to hit the same rows as
> StageProcess -> Fact is hitting.
> * If the app is building a "report" from Summary,
> it could be conflicting. But building a report
> may take second(s), while converting 1000 Staging
> rows into 100 rows in Summary will take only
> milliseconds. Even 10K rows, maybe even 100K
> rows, would not have a noticeable impact to the
> end user waiting for the report. On the other
> side, the Report may be blocking the StageProcess
> -> Summary. With the ping-pong approach, that
> simply says that this Staging will take longer to
> finish and the next Staging table (after RENAMEs)
> will be bigger. Not a big problem.
> Note that none of these comments mention any
> slowdown in INSERTing into Staging. The locks
> caused by those INSERTs are fighting with each
> other. And hence the suggestion of having a pair
> of Staging tables for each insertion thread.
> > folds SQL #2 into the INSERT INTO Fact
> INSERT INTO Fact
> SELECT s.this, s.that, h.host_id, ... FROM
> StageProcess s JOIN HostNorm h ...
> I needed host_id in some of the Summary tables and
> the Fact table. So it was better (in my case) for
> SQL#2 to actually populate
> Stage_process.`host_id`. That minimized the
> number of times I needed to lookup hostname to get
> > To me, inserting into the Fact table must be
> allowed only when Staging and HostNorm tables are
> fully referenced to each other (in other words
> there are no NULLs in Staging.host_id and
> HostNorm.host_name contains all values from
> I agree. SQL#1 assures that "HostNorm.host_name
> contains all..."; SQL#2 takes care of all the
> NULLs in StageProcess.host_id. After that,
> "inserting into Fact" should be safe.
> > Partition pruning does not work with the
> 'between' or '>' condition (i.e. col20 between
> '2014-11-23 00:00:00' and '2014-11-23 01:00:00').
> That's unfortunate. Surely that's been fixed
> since MariaDB 5.2.
> (Note that in that example, the BETWEEN covers one
> second more than one hour.)
> > [Miko] You mean P1970010101? It is always empty.
> It is created with 'create table'. No records are
> inserted here.
> Yes. Sorry, I failed to notice it.
> > So it seems that I will have to use the
> partition to collect the rows.
> > In this case do you think it is still worth to
> add an AUTO_INCREMENT BIGINT as the PK (together
> with existing partitioning key)?
> Staging with ping-pong is beautiful in that the
> Fact table is not touched for Summarization; and
> StageProcess is a full table scan (and all rows
> are needed; A_I not used).
> To do the Summarization from the Fact table,
> having an A_I of sufficient size is vital. And
> that is an argument against having no explicit PK.
> (A "subtle" reason, as mentioned in an earlier
> > Data will be copied based on the partitioning
> key into the MEMORY engine and all the
> aggregations will done there.
> To maximize performance, you need to minimize the
> number of times data is copied.
> If the 3rd party software were to INSERT into
> Staging (while thinking it is Fact), you might be
> able to cut out a "copy".
[Miko] True. I've also noticed that with copying whole (hourly) partition from Fact to Staging is just 10-25% faster with partition pruning that using an index. But it might be the case of the index (it is an incremental timestamp inserted as varchar).
> > select * from Table1 INTO OUTFILE
> Is /tmp a separate "filesystem"? Perhaps the
> crash had to do with filling up that filesystem?
> Or is /tmp a "ramdisk" (which I frown on)?
[Miko] No, the file system was not full. The /tmp was not a separate file system but it was not the problem with system full (which led to the crash).
November 12, 2014 07:41PM
November 13, 2014 06:40PM
November 14, 2014 03:02AM
November 15, 2014 12:05AM
November 16, 2014 07:39PM
November 17, 2014 04:21PM
November 17, 2014 07:48PM
November 18, 2014 02:00AM
November 18, 2014 11:35PM
November 22, 2014 06:44AM
November 18, 2014 01:13AM
November 19, 2014 12:27AM
November 19, 2014 06:01PM
November 22, 2014 08:39AM
November 14, 2014 03:04AM
November 14, 2014 03:05AM
November 15, 2014 12:32AM
November 16, 2014 08:31PM
November 19, 2014 12:12AM
November 22, 2014 09:07AM
November 23, 2014 09:00PM
Re: large InnoDB table partitioning without explicit PK
December 01, 2014 11:16AM