Re: What Method is Best
Posted by: Larry Adams
Date: September 14, 2012 05:30AM
The table type is a job event table. The primary key is a combnation of multiple columns. The primary key includes:
clusterid (int) + jobid (int) + indexid (int) + submit_time (time_stamp)
Active job records remain in an active table, and their state changes quite frequently. I use ON DUPLICATE KEY UPDATE heavily here (MySQL rules here). But once a job ends, it is transferred to a finished table that holds the data for long term storage.
The reason that the start and end times are important, and proper pruning is important is that there is one feature in the product that asks the question, what jobs were running on so an so host between some selected date range.
The resulting query is a tricky one. And technically, you can not perform the query wihtout what would be esstentially a scan of all tables. Here is that scenario:
Tell me what jobs were running between 12:00pm and 4:00pm yesterday. So, you have a few cases:
From Finished Table:
1) End Time between 12:00 - 4:00
2) Start Time between 12:00 - 4:00
3) Start Time before 12:00 and end time After 4:00
From Active Table (running/pending):
1) Start Time Before 4:00
So, there is a union that takes place to return results first between active and finished jobs. The query and predicament becomes problematic in case 3) from the Finished table.
So, to remediate that, we put a time buffer around the times of case 3, that is roughly 3 standard deviations from the average run time of jobs. That way we can prevent a full partition scan.
So, right now, I'm using what I refer to as poor mans partitioning where I maintain individual tables partitioned by endTime. I further more metainain a table that stores the min(startTime) and max(endTime) in each table.
Whenever I get a request for a Job Zoom Window (the case above), I select the partitions to include in the union query based first on the start time of the window, and finally on the end time.
I'm just wondering if there is anything inherit is the native design of partioning that can get me out this evil union game I'm in. I would rather have mysql do the heavy lifting here, if you know what I mean.
Edited 3 time(s). Last edit at 09/14/2012 05:33AM by Larry Adams.
September 12, 2012 03:12PM
September 14, 2012 02:36AM
Re: What Method is Best
September 14, 2012 05:30AM
September 14, 2012 11:11PM
Sorry, you can't reply to this topic. It has been closed.
Content reproduced on this site is the property of the respective copyright holders.
It is not reviewed in advance by Oracle and does not necessarily represent the opinion
of Oracle or any other party.