Error inserting UTF8 data into primary key
Posted by: Jim Jake
Date: March 11, 2011 06:12AM

I am attempting to perform a restoration of a mysqldump from a 5.0 server into a new 5.5 server. This data is currently in UTF8 format, and seems to be functioning correctly on the 5.0 machine:

#
#	Mysql 5.0
#

mysql> select * from search_total where word like "weiß" \G
*************************** 1. row ***************************
 word: weiß
count: 0.138662
1 row in set (0.00 sec)

mysql> select * from search_total where word like "weis" \G
*************************** 1. row ***************************
 word: weis
count: 0.30103
1 row in set (0.00 sec)

mysqldump generates the following insert statements:
INSERT INTO `search_total` VALUES ('weis',0.30103);
INSERT INTO `search_total` VALUES ('weiß',0.138662);
However, running these insert statements on the 5.5.9 system provides an error:

mysql> INSERT INTO `search_total` VALUES ('weis',0.30103);
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO `search_total` VALUES ('weiß',0.138662);
ERROR 1062 (23000): Duplicate entry 'weiß' for key 'PRIMARY'
mysql>

It seems the insert statement is translating weiß into weis, then checking against the index, whose contents do not seem to be stored in utf8

For example, if I reverse the order of the inserts, I receive the same error. The first entry should store the UTF8 value

mysql> INSERT INTO `search_total` VALUES ('weiß',0.138662);
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO `search_total` VALUES ('weis',0.30103);
ERROR 1062 (23000): Duplicate entry 'weis' for key 'PRIMARY'
mysql>

I've tried different collations (which affect selects but shouldn't affect plain inserts like these) dumping and importing as latin1, and even inserting the result from hex values:

mysql> insert into `search_total` values ((select 0x77656973), .111) \G
Query OK, 1 row affected (0.00 sec)

mysql> insert into `search_total` values ((select 0x776569C39F), .111)\G
ERROR 1062 (23000): Duplicate entry 'weiß' for key 'PRIMARY'
mysql> select * from search_total \G
*************************** 1. row ***************************
 word: weis
count: 0.111
1 row in set (0.00 sec)

mysql>

What do I need to do in 5.5.9 to get this data imported and stored correctly?


Databases are set to /*!40100 DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci */

The table as it exists on 5.0:

mysql> show create table search_total \G
*************************** 1. row ***************************
       Table: search_total
Create Table: CREATE TABLE `search_total` (
  `word` varchar(50) NOT NULL default '',
  `count` float default NULL,
  PRIMARY KEY  (`word`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
1 row in set (0.00 sec)


mysql> show session variables like "coll%" \G
*************************** 1. row ***************************
Variable_name: collation_connection
        Value: utf8_general_ci
*************************** 2. row ***************************
Variable_name: collation_database
        Value: utf8_general_ci
*************************** 3. row ***************************
Variable_name: collation_server
        Value: utf8_general_ci
3 rows in set (0.00 sec)

mysql>

#
# And on mysql 5.5.9
#

mysql> show create table search_total \G
*************************** 1. row ***************************
       Table: search_total
Create Table: CREATE TABLE `search_total` (
  `word` varchar(50) NOT NULL DEFAULT '',
  `count` float DEFAULT NULL,
  PRIMARY KEY (`word`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> 


mysql> show session variables like "coll%" \G
*************************** 1. row ***************************
Variable_name: collation_connection
        Value: utf8_unicode_ci
*************************** 2. row ***************************
Variable_name: collation_database
        Value: utf8_unicode_ci
*************************** 3. row ***************************
Variable_name: collation_server
        Value: utf8_unicode_ci
3 rows in set (0.01 sec)

Options: ReplyQuote


Subject
Views
Written By
Posted
Error inserting UTF8 data into primary key
6584
March 11, 2011 06:12AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.