MySQL Forums
Forum List  »  Performance

Re: Optimize multiple writes on the same row
Posted by: Rick James
Date: March 20, 2012 11:32AM

Thanks for the details. That redirects the effort radically. Let me rephrase the situation as it eventually become:

* 5M row table
* random rows are being UPDATEd, upwards of 1000 changes/second.
* desire prompt response to simple SUM/COUNT/AVG over entire table

(Or will the desired stats be filtered by some WHERE clause? -- Please so state, as it will change my answer!)

As an UPDATE is done, also INSERT one row into a table I will call `Delta`. It will contain the changes in values for the row you are UPDATEing. Delta will have
ct -- 0 for UPDATE, +1 for INSERT, -1 for DELETE
val1 -- new value - old value (yes, this requires reading the data; we can discuss techniques to batch this.)
val2 -- ditto for another column.
val1sq -- If you need standard deviation, then have a column with the delta in the square of the value.

So far, an UPDATE has turned into
* SELECT (to get old value)
* UPDATE (to change the actual table)
* INSERT (one row into Delta)

On the side you need another table; lets call it `Totals`. It contains one(?) row with all the 'current' averages, etc. Well, no. It does not contain averages; instead it contains `count` and `sum`. The query to get the "average" will divide those two while SELECTing this _one_ row. (Very fast.)

But that 1-row table is not quite up to date. What to do? Periodically do the following. (Suggest you do it in a continual loop, and not really "time" it.) The discussion above always INSERTs into `Delta2`.
Setup:
* Have two tables Delta1 and Delta2; we will ping-pong between them (details coming). If possible, make them ENGINE=MEMORY (to avoid I/O).
* Have a tmp table:
-- Do once:
CREATE TEMPORARY TABLE Tmp (...) ENGINE=MEMORY
SELECT SUM(count), SUM(sum_val1), ...
FROM Delta1;
[/code]

The guts of the loop:
UPDATE    `Totals`
    JOIN  `Tmp`
    SET Totals.count = Totals.count + Tmp.count, -- Fold Delta1 stuff into Totals
        Totals.sum_val1 = ...;
TRUNCATE TABLE Tmp; -- get ready for next time thru the loop
TRUNCATE TABLE Delta1; -- since that info has now been put into Totals
RENAME TABLE Delta1 TO x, Delta2 TO Delta1, x TO Delta2; (swap names) -- atomic, so it (briefly) blocks the INSERTs.

Now, how to get the AVG, etc?
SELECT ... sum_val1 / count AS 'average val1', ... FROM Totals;

This algorithm is self-adjusting. For example, in peak traffic, it will run 'slower' (because of "SELECT SUM(...) FROM Delta1" taking longer), but it will run more efficiently (because the other steps will be less frequent). The lock on Totals (due to UPDATE) to be so brief that the 'report' query will never notice.

The loop may run so fast that you need to add a SLEEP to it. While developing this, I suggest you instrument it so you can see things like
* how often it runs
* how many rows are in Delta1

Clever enough? And it should be very efficient.

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: Optimize multiple writes on the same row
1180
March 20, 2012 11:32AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.