Re: character equality under collation should not be used to test uniqueness
Posted by: Alexander Barkov
Date: April 25, 2006 02:18AM

Greg, according to the SQL standards, any operation
which involves comparison must honor the argument's collations,
which includes comparison operators in WHERE clause (<,>,=,etc),
IN, BETWEEN, DISTINCT, MIN, MAX, ORDER BY, UNIQUE constraints, and so on.

If you want to use different rules for DISTINCT and =,
the SQL standard provides this possibility by means of COLLATE clause.

For example, select distinct vdata collate utf8_bin from v;

You can also create the column with utf8_bin:

create table v2 (v2data varchar(80) character set utf8 collate utf8_bin unique not null) engine=InnoDB;

And then use COLLATE for comparison:

select * from v2 where v2data = 'naive' collate utf8_general_ci;

Note, utf8_bin is not just accent-sensitive, it is also case sensitive.
Instead if using utf8_bin in CREATE, you might probably need some accent-sensitive
but case insensitive collation. Unfortunately, utf8 does not currently have any
collations of this kind. Consider switching to other character sets, which do have
accent-sensitive case insensitive collations. Which languages do you need to cover?
I could try to suggest the best choice.

Options: ReplyQuote

Written By
Re: character equality under collation should not be used to test uniqueness
April 25, 2006 02:18AM

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.