MySQL Forums
Forum List  »  Microsoft Access

Convert Access Latin1 to utfmb4 problem
Posted by: Preston Hardison
Date: January 22, 2021 11:05PM

I have a problem converting legacy Access databases to MySQL. Access ported over multiple generations and in .accdb format. I assume they are encoded in Latin 1 case insensitive accent sensitive. I am wrestling with early poor decisions Access allowed me to make and I was too naive to know better.

The main field is a varchar field 255 char ("Title") with a unique index and contains text strings. The database is a long annotated list of single words and phrases. Most are short, but hundreds of thousands of entries are over 191 characters. I need them to be unique and accent sensitive (e.g. Aché are indigenous peoples in Paraguay. ache is to hurt).

If I import to utf8mb4, the maximum index length is 191 characters. I don't really need universal character and emoji capability in the Title field, but I do with others. I can create a "tiebreaker field" (188 char + 3 char tiebreaker unique index), But then I have the problem of identifying all the duplicates beforehand, and imports to MySQL drop hundreds of thousands of records.

I spent many hours trying to find documentation of this issue on the web. I've not encountered comprehensive treatments of these kinds of collation issues. I'm even only guessing at Access collation, since I haven't found a way to get Access to reveal it's exact encoding.

Any links or search terms I can use that are not so generic I just get hundreds of pages on collation-free descriptions of performing ideal transfers?

Options: ReplyQuote


Subject
Views
Written By
Posted
Convert Access Latin1 to utfmb4 problem
563
January 22, 2021 11:05PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.