MySQL :: Re: UTF8 characters substituted in queries?

New Topic

Re: UTF8 characters substituted in queries?

Posted by: Dave Varon
Date: September 15, 2005 06:37PM

Hi again,

Thanks for your suggestions. I've tried a number of different utf8-based collations with only moderate progress: utf_unicode_ci, utf8_bin, utf8_roman_ci, etc.

These are my current settings:
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/local/mysql-standard-4.1.9-apple-darwin7.7.0-powerpc/share/mysql/charsets/ |
| collation_connection | utf8_bin |
| collation_database | utf8_bin |
| collation_server | utf8_bin
...
| init_connect | SET NAMES utf8 |

init_connect doesn't seem to have an effect:

mysql> select * from party where name_human_first = 'Rub\303\251n' collate utf8_unicode_ci;
ERROR 1253 (42000): COLLATION 'utf8_unicode_ci' is not valid for CHARACTER SET 'latin1'
mysql>

After 'SET NAMES utf8', the charset settings are consistent:

mysql> select name_human_first from party where name_human_first = 'Rub\303\251n' collate utf8_unicode_ci;
+------------------+
| name_human_first |
+------------------+
| Rubén |
| Ruben |
+------------------+
2 rows in set (0.12 sec)

mysql> select name_human_first from party where name_human_first = 'Ruben' collate utf8_unicode_ci;
+------------------+
| name_human_first |
+------------------+
| Rubén |
| Ruben |
+------------------+
2 rows in set (0.09 sec)

However, only the utf8_bin collation provides the expected results:

mysql> select name_human_first from party where name_human_first = 'Rub\303\251n' collate utf8_bin;
+------------------+
| name_human_first |
+------------------+
| Rubén |
+------------------+
1 row in set (0.01 sec)

mysql> select name_human_first from party where name_human_first = 'Ruben' collate utf8_bin;
+------------------+
| name_human_first |
+------------------+
| Ruben |
+------------------+
1 row in set (0.00 sec)

...but only when specified on the CLI as above. Without it, as indicated below, the characters are still treated as equivalent:

mysql> select name_human_first from party where name_human_first = 'Rub\303\251n';
+------------------+
| name_human_first |
+------------------+
| Rubén |
| Ruben |
+------------------+
2 rows in set (0.00 sec)

So I'm guessing there's another way to set the client or connection collation that I've overlooked. Right now the database, connection, server, client, table, column, and results are all set to utf8/utf8_bin.

What am I missing, and will it apply to the webapp too?

Navigate: Previous Message• Next Message

Options: Reply• Quote

Subject

Written By

Posted

UTF8 characters substituted in queries?

Dave Varon

September 15, 2005 07:25AM

Re: UTF8 characters substituted in queries?

Mark Matthews

September 15, 2005 07:37AM

Re: UTF8 characters substituted in queries?

Dave Varon

September 15, 2005 08:39AM

Re: UTF8 characters substituted in queries?

Antoine Magnier

September 15, 2005 08:56AM

Re: UTF8 characters substituted in queries?

Mark Matthews

September 15, 2005 09:02AM

Re: UTF8 characters substituted in queries?

Dave Varon

September 15, 2005 06:37PM

Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.