MySQL :: Re: How to correct corrupted UTF-8 characters

Posted by: James Cobban
Date: October 23, 2010 03:37PM

I don't really understand your question. I know almost nothing about the MS Access database that I am converting from, or about how MS Access stores text. I am assuming that it uses Microsoft's 16-bit character encoding, which is similar to UTF-16 with the low-order byte first, in which case the ½ should be represented internally as the bytes 0xBD 0x00. But I don't know whether that is the case, because I don't really want to spend time learning MS Access. I used SQLFront 5.1 build 4.16 to convert the .mdb file to MySQL tables, specifying that I wanted UTF-8 output. SQLFront created tables such as:

CREATE TABLE `tblLR` (
`IDLR` int(10) NOT NULL AUTO_INCREMENT,
`FSPlaceID` varchar(255) CHARACTER SET latin1 DEFAULT NULL,
`Preposition` varchar(120) CHARACTER SET latin1 DEFAULT NULL,
`Location` varchar(255) CHARACTER SET latin1 DEFAULT NULL,
`SortedLocation` varchar(255) CHARACTER SET latin1 DEFAULT NULL,
`ShortName` varchar(255) CHARACTER SET latin1 DEFAULT NULL,
`Tag1` tinyint(3) unsigned DEFAULT NULL,
`Used` tinyint(3) unsigned DEFAULT NULL,
`Notes` longtext CHARACTER SET latin1,
`Verified` tinyint(3) unsigned DEFAULT NULL,
`Latitude` double(53,0) DEFAULT NULL,
`Longitude` double(53,0) DEFAULT NULL,
`FSResolved` tinyint(3) unsigned DEFAULT NULL,
`VEResolved` tinyint(3) unsigned DEFAULT NULL,
`qsTag` tinyint(3) unsigned DEFAULT NULL,
PRIMARY KEY (`IDLR`)
) ENGINE=MyISAM AUTO_INCREMENT=32882 DEFAULT CHARSET=utf8

Despite the references to "latin1", the varchar and text fields all contain otherwise valid UTF-8. The only problem, as I said, is that the characters from the portion of page 0 with the high-order bit on (0x80 through 0xFF) were left as the single byte 0xhh instead of being translated to the two-byte UTF-8 form (0xC2hh for 0x80 through 0xBF, and 0xC3 followed by hh minus 0x40 for 0xC0 through 0xFF) for some reason. I don't really care why SQLFront translated the characters that way; I just want to search for and replace all occurrences of the badly translated characters. I could write a program to do the search and replace, but I was hoping there was some simpler way to do it. I tried explicitly redefining the character set of one of the fields to UTF8, and MySQL said it had converted the column, but no change was made to the data.
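For anyone who does end up writing the program, here is a minimal sketch of the byte-level repair in Python. It is not something SQLFront or MySQL provides; the function name `repair_mixed_utf8` is made up for illustration, and it assumes that the stray single bytes are Latin-1 (ISO-8859-1) code points. If the source data was really Windows-1252, the bytes 0x80 through 0x9F would need a different mapping. The idea is to walk the raw bytes, keep every well-formed UTF-8 sequence as it is, and treat any other high-order byte as a lone Latin-1 character.

```python
def repair_mixed_utf8(raw: bytes) -> str:
    """Decode bytes that are mostly UTF-8 but contain stray Latin-1 bytes.

    Assumption: any byte >= 0x80 that does not start a well-formed UTF-8
    sequence is an untranslated Latin-1 character (hypothetical helper).
    """
    out = []
    i = 0
    n = len(raw)
    while i < n:
        b = raw[i]
        if b < 0x80:
            # Plain ASCII is identical in Latin-1 and UTF-8.
            out.append(chr(b))
            i += 1
            continue

        # Expected sequence length if this byte is a UTF-8 lead byte.
        if 0xC2 <= b <= 0xDF:
            length = 2
        elif 0xE0 <= b <= 0xEF:
            length = 3
        elif 0xF0 <= b <= 0xF4:
            length = 4
        else:
            length = 0  # continuation byte or invalid lead byte

        chunk = raw[i:i + length]
        if length and len(chunk) == length:
            try:
                # The strict decoder rejects bad continuation bytes,
                # overlong forms and surrogates.
                out.append(chunk.decode("utf-8"))
                i += length
                continue
            except UnicodeDecodeError:
                pass

        # Not valid UTF-8 at this position: treat as one Latin-1 byte.
        out.append(bytes([b]).decode("latin-1"))
        i += 1
    return "".join(out)


if __name__ == "__main__":
    broken = b"1\xbd cup"                      # lone 0xBD, as in the bad rows
    fixed = repair_mixed_utf8(broken).encode("utf-8")
    print(fixed)                               # bytes to write back to the column
```

One caveat: the approach is ambiguous when two stray Latin-1 characters happen to form a valid UTF-8 pair (for example 0xC3 followed by 0xA9), because such a pair is read as a single UTF-8 character. For typical text this is rare, but it is worth spot-checking the results. The column values would have to be read and written as raw bytes (for example by selecting the column cast to binary) so that the client connection does not transcode them on the way through.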
