MySQL Forums
Forum List  »  Newbie

Re: Replacing a subset of records in a 8.5M row table, with performance?
Posted by: Jay Pipes
Date: June 29, 2005 10:54AM

LOL. ;) Chapter 8 covers using both of your SQL scenarios to accomplish de-duping and addressing orphaned records...cheers.

But......

I have a sneaking suspicion that you may actually be able to get better performance by using an alternate technique here. Basically, you're causing a lot of memory trashing to be done by doing the DELETE and then the INSERT, when in fact, all you really need to do is an INSERT with a LEFT JOIN WHERE ... IS NOT NULL:

INSERT INTO FI_CROSS_REF_DATA_IDENTIFIERS
SELECT ASSET_ID, ID_CD, ID_NUMBER, ID_EFFECTIVE_DATE
FROM FI_CROSS_REF_DATA_IDENTIFIERS_TEMP
LEFT JOIN FI_CROSS_REF_DATA_IDENTIFIERS
ON FI_CROSS_REF_DATA_IDENTIFIERS.ASSET_ID = FI_CROSS_REF_DATA_IDENTIFIERS_TEMP.ASSET_ID
WHERE FI_CROSS_REF_DATA_IDENTIFIERS.ASSET_ID IS NULL;

This gets rid of your DELETE altogether, and allows MySQL to optimize based on a NOT EXISTS predicate (you'll see this optimization if you use EXPLAIN on the SELECT statement from the INSERT above)

One final thing:

For the new record data, I'd recommend simply using a regular temporary MyISAM table, and building an index on the ASSET_ID column *only*, since it's the only column used in the LEFT JOIN. Also, to speed things up, create the index *after* inserting the new records into the temporary table, which will allow MySQL to build the index in one operation, instead of adding to it for each inserted record.

So, the end process would be:

/* Create a regular temp table with b-tree index on ASSET_ID */
CREATE TEMPORARY TABLE FI_CROSS_REF_DATA_IDENTIFIERS_TEMP
(
ASSET_ID BIGINT NOT NULL
, ID_CD CHAR(3) NOT NULL
, ID_NUMBER VARCHAR(30) NULL
, ID_EFFECTIVE_DATE DATE NOT NULL
, PRIMARY KEY (ASSET_ID)
);

/* Do your load into the temp table here... (I'm assuming LOAD DATA statement of some sort...) */
LOAD DATA ...

/* Insert only new records into permanent storage */
INSERT INTO FI_CROSS_REF_DATA_IDENTIFIERS
SELECT ASSET_ID, ID_CD, ID_NUMBER, ID_EFFECTIVE_DATE
FROM FI_CROSS_REF_DATA_IDENTIFIERS_TEMP
LEFT JOIN FI_CROSS_REF_DATA_IDENTIFIERS
ON FI_CROSS_REF_DATA_IDENTIFIERS.ASSET_ID = FI_CROSS_REF_DATA_IDENTIFIERS_TEMP.ASSET_ID
WHERE FI_CROSS_REF_DATA_IDENTIFIERS.ASSET_ID IS NULL;

/* Drop the temp table */
DROP TABLE FI_CROSS_REF_DATA_IDENTIFIERS_TEMP;

---------

footnote: I'm still sticking by my original comment that you should get rid of the BIGINTs :)

Cheers,

Jay

Jay Pipes
Community Relations Manager, North America, MySQL Inc.

Got Cluster? http://www.mysql.com/cluster
Personal: http://jpipes.com

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.