Re: large InnoDB table partitioning without explicit PK
Posted by: Rick James
Date: November 23, 2014 09:00PM
> As far as I understand MySQL is a multithreaded application.
Yes and no.
* Each connection might use a separate process (thread). But only one CPU per connection is used.
* Disk I/O (for all connections) makes use of some number of extra threads. The number of such threads is controlled by
> I am not sure whether I fully understand Plan C. Could you please give me a hint which variable should I use/tune here?
Let's put that off. I don't have enough confidence that it is worth pursuing.
> However with 200B of average size of a record, even high insert rate (5k inserts between BEGIN and COMMIT) should non be a problem for innodb_log_buffer_size=16M (as long as I understand this variable properly).
I don't know how much overhead there is on top of the 200B, so I don't know how to do the arithmetic.
> Moreover I've found very few sources in which authors advise to use more than 32M for inno_db_log_buffer.
> However in one source I found a number of 25% of innodb_buffer_pool_size.
And those sources probably don't give enough details to explain their recommendations. Sorry, I don't have any good rationale, either.
> I should have observed status after either every parameter/variable change or load characteristic change.
Yes, but only after running for a significant amount of time (say, a day, or a zillion bytes of activity).
> How running sql #1 and sql #2 one by one for every column may affect the inserting performance, I mean inserting into the Staging table?
1. Insert 1000 rows into Staging
2. run the 2 SQL statements to normalize col1.
3. run the 2 SQL statements to normalize col2.
4. run the 2 SQL statements to normalize col3. etc.
5. Summarize: one INSERT .. SELECT .. GROUP BY.. for each summary table
6. INSERT INTO FACT SELECT * FROM Staging.
The more rows you INSERT in step 1, the more you amortize the cost of the other steps.
Running SQL 1&2 on 10000 rows does not take much longer than for 1000 rows. It is somewhere between 1x and 10x.
No, don't take that to extremes -- don't Stage millions of rows. There is a hard-to-define sweet spot somewhere:
* A smaller number of rows gets the data into Summary faster
* A larger number is more efficient overall.
* Larger may have issues with Replication.
* Larger may cause less frequent, but longer, interference with SELECTs.
So, it depends on what is important to you. I don't yet have the full list; and I doubt if you have the full list.
> To me it looks like more and more exclusive locks.
The exclusive locks on Staging are irrelevant if you ping-pong.
The exclusive locks on Summary and Fact are important, but they are less frequent because of the way Staging works -- a goal of Staging is to do all the prep work in Staging and StageProcess, thereby making the transfer to Summary & Fact fast and efficient. Or at least more efficient that your original IODKU.
Anyway, do you care about locks on Summary and Fact? You probably do, but we have not gotten into those details yet.
* If the app is doing a point query into Fact, it is extremely unlikely to hit the same rows as StageProcess -> Fact is hitting.
* If the app is building a "report" from Summary, it could be conflicting. But building a report may take second(s), while converting 1000 Staging rows into 100 rows in Summary will take only milliseconds. Even 10K rows, maybe even 100K rows, would not have a noticeable impact to the end user waiting for the report. On the other side, the Report may be blocking the StageProcess -> Summary. With the ping-pong approach, that simply says that this Staging will take longer to finish and the next Staging table (after RENAMEs) will be bigger. Not a big problem.
Note that none of these comments mention any slowdown in INSERTing into Staging. The locks caused by those INSERTs are fighting with each other. And hence the suggestion of having a pair of Staging tables for each insertion thread.
> folds SQL #2 into the INSERT INTO Fact
INSERT INTO Fact
SELECT s.this, s.that, h.host_id, ... FROM StageProcess s JOIN HostNorm h ...
I needed host_id in some of the Summary tables and the Fact table. So it was better (in my case) for SQL#2 to actually populate Stage_process.`host_id`. That minimized the number of times I needed to lookup hostname to get host_id.
> To me, inserting into the Fact table must be allowed only when Staging and HostNorm tables are fully referenced to each other (in other words there are no NULLs in Staging.host_id and HostNorm.host_name contains all values from Staging.host_name).
I agree. SQL#1 assures that "HostNorm.host_name contains all..."; SQL#2 takes care of all the NULLs in StageProcess.host_id. After that, "inserting into Fact" should be safe.
> Partition pruning does not work with the 'between' or '>' condition (i.e. col20 between '2014-11-23 00:00:00' and '2014-11-23 01:00:00').
That's unfortunate. Surely that's been fixed since MariaDB 5.2.
(Note that in that example, the BETWEEN covers one second more than one hour.)
> [Miko] You mean P1970010101? It is always empty. It is created with 'create table'. No records are inserted here.
Yes. Sorry, I failed to notice it.
> So it seems that I will have to use the partition to collect the rows.
> In this case do you think it is still worth to add an AUTO_INCREMENT BIGINT as the PK (together with existing partitioning key)?
Staging with ping-pong is beautiful in that the Fact table is not touched for Summarization; and StageProcess is a full table scan (and all rows are needed; A_I not used).
To do the Summarization from the Fact table, having an A_I of sufficient size is vital. And that is an argument against having no explicit PK. (A "subtle" reason, as mentioned in an earlier comment.)
> Data will be copied based on the partitioning key into the MEMORY engine and all the aggregations will done there.
To maximize performance, you need to minimize the number of times data is copied.
If the 3rd party software were to INSERT into Staging (while thinking it is Fact), you might be able to cut out a "copy".
> select * from Table1 INTO OUTFILE '/tmp/partP2014111213'...
Is /tmp a separate "filesystem"? Perhaps the crash had to do with filling up that filesystem? Or is /tmp a "ramdisk" (which I frown on)?
November 12, 2014 07:41PM
November 13, 2014 06:40PM
November 14, 2014 03:02AM
November 15, 2014 12:05AM
November 16, 2014 07:39PM
November 17, 2014 04:21PM
November 17, 2014 07:48PM
November 18, 2014 02:00AM
November 18, 2014 11:35PM
November 22, 2014 06:44AM
November 18, 2014 01:13AM
November 19, 2014 12:27AM
November 19, 2014 06:01PM
November 22, 2014 08:39AM
November 14, 2014 03:04AM
November 14, 2014 03:05AM
November 15, 2014 12:32AM
November 16, 2014 08:31PM
November 19, 2014 12:12AM
November 22, 2014 09:07AM
Re: large InnoDB table partitioning without explicit PK
November 23, 2014 09:00PM
December 01, 2014 11:16AM