MySQL Forums
Forum List  »  Newbie

Replacing a subset of records in a 8.5M row table, with performance?
Posted by: David Wynter
Date: June 29, 2005 09:48AM

I have a table with 8.5M rows. Every week I receive in files the latest set of these records, which I want to reconcile against the existing records. I.e. remove those that do not exist in the file and add those that do not exist in the table.

I tried to use the INSERT ...... ON DUPLICATE UPDATE method, but each file has about 60,000 records and takes about 30 minutes to complete. Since there are about 104 of these then it would take 52 hours to complete.

So I thought I was going to fix it by loading each files' worth of records into a HEAP table (takes about 40 seconds). Then DELETE all those rows that match on the grouping key in the table with 8.5M rows then insert the new records from the HEAP table into the tables with the 8.5M rows. But the DELETE on the 60,000 reocrds in the first files is still running after 40 minutes, so it is proving slower.

Here is the temp table DDL:

CREATE TABLE IF NOT EXISTS FI_CROSS_REF_DATA_IDENTIFIERS_TEMP
(ASSET_ID BIGINT NOT NULL, ID_CD CHAR(3) NOT NULL, ID_NUMBER VARCHAR(30) NULL, ID_EFFECTIVE_DATE DATE NOT NULL, KEY `U_Index` TYPE HASH (`ASSET_ID`,`ID_CD`,`ID_NUMBER`,`ID_EFFECTIVE_DATE`)) ENGINE = MEMORY

The index is there to remove duplicate records ( they are in the file).

Then after the fast INSERT I do the delete:

DELETE FROM FI_CROSS_REF_DATA_IDENTIFIERS
WHERE FI_CROSS_REF_DATA_IDENTIFIERS.ASSET_ID IN
(SELECT ASSET_ID FROM FI_CROSS_REF_DATA_IDENTIFIERS_TEMP)

Finally the INSERT of the new records

INSERT INTO FI_CROSS_REF_DATA_IDENTIFIERS
( ASSET_ID, ID_CD, ID_NUMBER, EFFECTIVE_DATE )
SELECT ASSET_ID, ID_CD, ID_NUMBER, ID_EFFECTIVE_DATE
FROM FI_CROSS_REF_DATA_IDENTIFIERS_TEMP

The tables being loaded cannot afford to be off line. It needs Select only access from a front end

Suggested approaches?

thx.

David

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.