Why are 'Ä' and 'A' collated together?
Posted by: Tim Martin
Date: April 19, 2007 07:35AM

I was a little surprised to find that MySQL treats 'Ä' and 'A' as the same character under both utf8_unicode_ci and utf8_general_ci. As I understand it, this can break German text, where the meaning of a word can depend on the presence or absence of an umlaut.

By my understanding of the Unicode Collation Algorithm, these letters should compare differently at level 2 using the version-4.0.0 UCA weight keys referred to in the manual.
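
To make the level 1 versus level 2 distinction concrete, here is a rough sketch in Python. It is not MySQL's implementation and it does not use the real UCA weight tables: it only approximates a level-1 key by dropping combining marks and a level-2 key by keeping them. The function names are made up for illustration.

```python
import unicodedata


def primary_key(text):
    """Approximate a level-1 key: base letters only, case folded.

    Assumption: stripping combining marks after NFD decomposition is a
    stand-in for UCA primary weights, not the actual algorithm.
    """
    decomposed = unicodedata.normalize("NFD", text)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


def secondary_key(text):
    """Approximate a level-1 + level-2 key: accents kept, case folded."""
    return unicodedata.normalize("NFD", text).casefold()


def equal_at_level(a, b, level):
    """Compare two strings using the approximate key for the given level."""
    key = primary_key if level == 1 else secondary_key
    return key(a) == key(b)


if __name__ == "__main__":
    # Level 1 ignores the umlaut; level 2 takes it into account.
    print(equal_at_level("\u00c4", "A", 1))
    print(equal_at_level("\u00c4", "A", 2))
```

In this sketch, 'Ä' and 'A' are equal under the level-1 key and differ under the level-2 key, which is the behaviour I expected from a comparison that goes beyond level 1.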

What's the rationale for this? It seems to run deeper than just these characters: every pair of characters I tried was compared at level 1 only. I can cope without level 2 ordering, but for my application it's important that distinct characters are treated as distinct.
