Collations utf8_general_ci and utf8_unicode_ci and character equivalence
Posted by: chrislb
Date: February 10, 2007 02:58PM

I was getting duplicate entry warnings when adding new data, and found out that some characters are treated as the same under the collations utf8_general_ci and utf8_unicode_ci. So an entry containing a ü (u-umlaut) can compare equal to an entry that has a plain u at that position.
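To illustrate the effect outside the database, here is a rough Python sketch. It is not how MySQL implements these collations; it only approximates an accent- and case-insensitive comparison by stripping combining marks, and the function names are made up for this example.

```python
import unicodedata


def accent_insensitive_key(text: str) -> str:
    """Approximate an accent- and case-insensitive comparison key.

    Assumption: this only mimics the visible effect described above;
    it is not MySQL's actual collation algorithm.
    """
    # Split each character into its base letter plus combining marks (NFD).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, so "ü" becomes a plain "u".
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold case so "U" and "u" compare equal as well.
    return stripped.casefold()


def would_collide(a: str, b: str) -> bool:
    """Return True if two values end up with the same comparison key."""
    return accent_insensitive_key(a) == accent_insensitive_key(b)


if __name__ == "__main__":
    # Two different pinyin syllables that collapse to the same key.
    print(would_collide("lü", "lu"))
    # A genuinely different syllable still compares as different.
    print(would_collide("lü", "li"))
```

With a unique index on such a column, the second of the two pinyin syllables "lü" and "lu" would be rejected as a duplicate, which matches the warnings I was seeing.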

This behaviour seems fairly bad when viewed from an average user's perspective, and switching to a collation that doesn't fit the original language (as many collations (still?) don't exist) isn't a good solution either.

In my case I am storing Chinese pinyin, which can contain the u-umlaut, and I use the Turkish collation for it.

My questions thus are:
1) Why are utf8_general_ci and utf8_unicode_ci not applicable to the general case?
2) Will there ever be utf8 collations for other languages apart from the existing European ones?

Chris
