MySQL Forums
Forum List  »  Newbie

Re: Charset and Collation Question
Posted by: Rick James
Date: November 25, 2014 07:33PM

> I read through most of your site (still digesting it!) and making my way through the other sections

I wish I had some grand insight that would make it easier to digest.

> Thanks for pointing it out the canonical/exact pattern.

Hmmm... I should add yet another section to the blog.

> use utf8mb4 ... you appear to suggest utf8_unicode_ci. Should I prefer this over utf8mb4_unicode_ci

I doubt if utf8_... collation will work with utf8mb4 charset. Use utf8mb4_...

> Is the index abandoned just because you use the "Collate xx" clause in a query, or is it abandoned because a specific collation is used

Neither. Because you asked to perform the WHERE with a different collation than that of the INDEX.

foo VARCHAR(...) ... COLLATE xx
INDEX(foo)
WHERE foo LIKE '...' COLLATE yy
When xx != yy, the INDEX will not used. This probably means a slow response. So, for speed, pick xx according to the type of compare you desire:
foo VARCHAR(...) ... COLLATE xx
INDEX(foo)
WHERE foo LIKE '...'
(I was suggesting that some _offline_ query might need the WHERE...COLLATE syntax for some different purpose.)

Yes, _unicode_ci is slower than _general_ci, but don't let that be a factor. The effort to launch the SELECT, plus getting the answer back, plus building the autocomplete, plus etc, outweighs the collation cost.

No, I can't advise which is 'best' for _your_ application:
* utf8mb4_bin
* utf8mb4_general_ci
* utf8mb4_unicode_ci
* canonical/exact pattern
* or even create a case-insensitive, accent-sensitive collation to use

Options: ReplyQuote


Subject
Written By
Posted
November 15, 2014 08:55AM
Re: Charset and Collation Question
November 25, 2014 07:33PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.