I am trying to import data from a UTF-8 encoded .txt file into a new table that supports Croatian text, but some characters do not survive the import. They show up as unknown characters.
Let me also add that this is a 5.5.12 MySQL Community Server (GPL) by Remi running on the latest CentOS distribution.
Here is the table definition:
delimiter $$
CREATE TABLE `abcd` (
`xxxx4` varchar(100) CHARACTER SET latin1 NOT NULL DEFAULT 'xxxx4 Here',
`xxxx5` varchar(100) CHARACTER SET latin2 COLLATE latin2_croatian_ci NOT NULL DEFAULT 'xxxx5 Here',
`xxxx3` varchar(100) CHARACTER SET latin1 DEFAULT NULL,
`xxxx2` varchar(200) CHARACTER SET latin2 COLLATE latin2_croatian_ci DEFAULT 'xxx2 Here',
`xxxx1` varchar(100) CHARACTER SET latin1 NOT NULL DEFAULT 'xxxx1 Here',
PRIMARY KEY (`xxxx5`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german1_ci$$
And here is a data sample (fields are separated by **):
Acquitted**Mirjan Kupreškić++/w/index.php?title=Mirjan_Kupre%C5%A1ki%C4%87&action=edit&redlink=1**Bosnian Croat, HVO member**Lašva Valley massacres against Bosniak civilians**Acquitted on 23 October 2001.7++#cite_note-kupreskic-6
Acquitted**Vlatko Kupreškić++/w/index.php?title=Vlatko_Kupre%C5%A1ki%C4%87&action=edit&redlink=1**Bosnian Croat, HVO member**Lašva Valley massacres against Bosniak civilians**Acquitted on 23 October 2001.7++#cite_note-kupreskic-6
Here is how those rows show up in the table after the import:
'Acquitted', 'Mirjan Kupre??kiÄ?++/w/index.php?title=Mirjan_Kupre%C5%A1ki%C4%87&action=edit&redlink=1', 'Bosnian Croat, HVO member', 'La??va Valley massacres against Bosniak civilians', 'Acquitted on 23 October 2001.7++#cite_note-kupreskic-6'
'Acquitted', 'Vlatko Kupre??kiÄ?++/w/index.php?title=Vlatko_Kupre%C5%A1ki%C4%87&action=edit&redlink=1', 'Bosnian Croat, HVO member', 'La??va Valley massacres against Bosniak civilians', 'Acquitted on 23 October 2001.7++#cite_note-kupreskic-6'
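In case it helps with the diagnosis, the same garbling can be reproduced outside MySQL. This is only a minimal Python sketch and nothing in it comes from the server. It assumes the file's UTF-8 bytes were interpreted as latin1 (which MySQL treats as cp1252) and then converted to latin2 for the Croatian columns. The function name simulate_import is made up for the illustration.

```python
def simulate_import(text: str) -> str:
    """Mimic the suspected conversion chain (an assumption, not confirmed)."""
    # The file on disk holds UTF-8 bytes.
    raw = text.encode("utf-8")
    # Assumption: the bytes are read as latin1, which MySQL handles as cp1252.
    misread = raw.decode("cp1252")
    # Assumption: the result is then converted to latin2 (ISO 8859-2);
    # characters that latin2 cannot represent become '?'.
    stored = misread.encode("iso8859_2", errors="replace")
    return stored.decode("iso8859_2")


print(simulate_import("Kupreškić"))  # Kupre??kiÄ?
print(simulate_import("Lašva"))      # La??va
```

The output matches what ends up in the latin2 columns, so the file may be getting read as latin1 rather than UTF-8 somewhere during the import.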
As far as I know, the LOAD DATA INFILE command does not involve itself with character sets or collations, but I may be wrong about that.
I already have a table in the same database that shows these characters correctly. The new table copies that table's attributes: the column collations, the table type, and the table's default collation.
Note that the Croatian characters show up fine in this post.
What is going on? What do I change to make sure these characters survive the import?
Edited 1 time(s). Last edit at 08/21/2011 02:18PM by Paul Pikowsky.