Re: 120M records - dedupe across multi columns not working
Posted by: Rick James
Date: March 14, 2011 11:10PM

Another one to try:
SET SESSION SQL_BIG_SELECTS = 1;
CREATE TABLE new_table (
    `Domain_Domain` varchar(255) collate utf8_unicode_ci NOT NULL,
    `fk_Suffix_SuffixID` int unsigned NOT NULL default '0',
     PRIMARY KEY (`Domain_Domain`, `fk_Suffix_SuffixID`);
INSERT IGNORE INTO `new_table`
    (`Domain_Domain`, `fk_Suffix_SuffixID`)
    SELECT `Domain_Domain`, `fk_Suffix_SuffixID` 
        FROM `old_table`
        ORDER BY `Domain_Domain`, `fk_Suffix_SuffixID`;

The trick is to get it to do a sort instead of using the key_buffer for DISTINCT, IGNORE, etc.

Also, using this on the SELECT may help:
SQL_BIG_RESULT
http://dev.mysql.com/doc/refman/5.1/en/select.html

See also:
http://forums.mysql.com/read.php?24,277763 (enable keys, force sort, not keycache)

Please report back on what does or does not work.

Options: ReplyQuote


Subject
Written By
Posted
Re: 120M records - dedupe across multi columns not working
March 14, 2011 11:10PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.