Re: What am I missing when dealing with foreign chars?
Posted by: Rick James
Date: November 09, 2010 10:44AM

That depends.
mysql> SHOW COLLATION LIKE 'utf8%';
+--------------------+---------+-----+---------+----------+---------+
| Collation          | Charset | Id  | Default | Compiled | Sortlen |
+--------------------+---------+-----+---------+----------+---------+
| utf8_general_ci    | utf8    |  33 | Yes     | Yes      |       1 |
| utf8_bin           | utf8    |  83 |         | Yes      |       1 |
| utf8_unicode_ci    | utf8    | 192 |         | Yes      |       8 |
| utf8_icelandic_ci  | utf8    | 193 |         | Yes      |       8 |
| utf8_latvian_ci    | utf8    | 194 |         | Yes      |       8 |
| utf8_romanian_ci   | utf8    | 195 |         | Yes      |       8 |
| utf8_slovenian_ci  | utf8    | 196 |         | Yes      |       8 |
| utf8_polish_ci     | utf8    | 197 |         | Yes      |       8 |
| utf8_estonian_ci   | utf8    | 198 |         | Yes      |       8 |
| utf8_spanish_ci    | utf8    | 199 |         | Yes      |       8 |
| utf8_swedish_ci    | utf8    | 200 |         | Yes      |       8 |
| utf8_turkish_ci    | utf8    | 201 |         | Yes      |       8 |
| utf8_czech_ci      | utf8    | 202 |         | Yes      |       8 |
| utf8_danish_ci     | utf8    | 203 |         | Yes      |       8 |
| utf8_lithuanian_ci | utf8    | 204 |         | Yes      |       8 |
| utf8_slovak_ci     | utf8    | 205 |         | Yes      |       8 |
| utf8_spanish2_ci   | utf8    | 206 |         | Yes      |       8 |
| utf8_roman_ci      | utf8    | 207 |         | Yes      |       8 |
| utf8_persian_ci    | utf8    | 208 |         | Yes      |       8 |
| utf8_esperanto_ci  | utf8    | 209 |         | Yes      |       8 |
| utf8_hungarian_ci  | utf8    | 210 |         | Yes      |       8 |
+--------------------+---------+-----+---------+----------+---------+
(Do that on your machine; the list may be different.)
_ci means "case insensitive", which (unfortunately) also brings "accent insensitive" along with it in the general-purpose collations.  That is, 'n' and 'ñ' (tilde-n) are considered equal under utf8_general_ci and utf8_unicode_ci.

WHERE ... COLLATE utf8_bin
will treat the two characters as different.  But it will also treat 'N' and 'n' as different.
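
A quick way to see the difference without touching any table is to compare literals directly. This is only a sketch: it assumes your client really is sending UTF-8 (hence the SET NAMES), and the table `people` and column `name` in the last statement are made-up names for illustration, not anything from your schema.

```sql
SET NAMES utf8;  -- assumption: your client/terminal sends UTF-8

-- The same comparisons under an accent-insensitive and a binary collation
SELECT 'n' = 'ñ' COLLATE utf8_general_ci AS n_vs_ntilde_ci,
       'n' = 'ñ' COLLATE utf8_bin        AS n_vs_ntilde_bin,
       'n' = 'N' COLLATE utf8_bin        AS lower_vs_upper_bin;

-- Applying it in a query (hypothetical table `people` with a utf8 column `name`)
SELECT name FROM people WHERE name = 'Muñoz' COLLATE utf8_bin;
```

Putting COLLATE on the comparison only affects that one query; the column keeps whatever collation it was declared with.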

Is that Spanish?  If you are dealing only with Spanish, you could experiment with these:
mysql> SHOW COLLATION LIKE '%span%';
+-------------------+---------+-----+---------+----------+---------+
| Collation         | Charset | Id  | Default | Compiled | Sortlen |
+-------------------+---------+-----+---------+----------+---------+
| latin1_spanish_ci | latin1  |  94 |         | Yes      |       1 |
| utf8_spanish_ci   | utf8    | 199 |         | Yes      |       8 |
| utf8_spanish2_ci  | utf8    | 206 |         | Yes      |       8 |
+-------------------+---------+-----+---------+----------+---------+
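
One possible experiment, again as a sketch that assumes a UTF-8 client connection: run the same literal comparison under both Spanish collations. As far as I recall from the manual, both of them treat ñ as a separate letter between n and o while still ignoring case, but check what your server actually returns.

```sql
SET NAMES utf8;  -- assumption: your client/terminal sends UTF-8

SELECT 'n' = 'ñ' COLLATE utf8_spanish_ci  AS n_vs_ntilde_spanish,
       'n' = 'ñ' COLLATE utf8_spanish2_ci AS n_vs_ntilde_spanish2,
       'N' = 'n' COLLATE utf8_spanish_ci  AS upper_vs_lower_spanish;
```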

Please post any results you get.
