Searching/inserting certain Unicode characters
Hello,
I have done some research on this and also posted a question on Usenet, but didn't get any replies and still have no solution. The problem seems to be somewhat unique :-(
I want to insert a certain character in a text or char field and I also want to search for it (as proof that it was correctly inserted).
The character is "no-break space" which can either be represented as char(160) or hex-code A0 or as U+00A0. Or, generally speaking, I want to know how I can insert or find characters by Unicode or a similar representation?
MySQL is 5.0.77 on CentOS 5.5. The field is text with varying collations. In this case it is utf8_roman_ci (for this specific use I want to replace all "?" characters in the table with "no-break space"+"?").
There is no problem inserting the character "manually" by either creating it in the input field in PHPMyAdmin or our PHP application with ALT+255 or creating it in Word with that key sequence and then copying it to the PHP application and saving it. The character is inserted correctly which I can prove by cutting it out in the PHP output and back to Word.
What doesn't work is the insertion by REPLACE() nor a find operation with a REGEXP.
For instance, I tried
SET fr=REPLACE(fr, '?','x?')
where "x" was the character copied from Word. Or \u00A0 or \xA0. Non of this inserts the correct character. It either inserted a normal space (when I paste the character "as is" in) or throws an error or corrupts the text field.
I also tried things like
SET fr=REPLACE(fr, 'uniquestring',char(160))
or
SET fr=REPLACE(fr, 'uniquestring',CONCAT(char(160),'?'))
This cuts off all characters after the replacement position and doesn't insert the wanted character.
To simplify the whole thing I went to trying with "simple" letters, e.g. a "C" = \u0043. Let's call the Unicode representation "\unnnn". No dice.
I'm also trying to do the opposite, e.g. find a character by using the Unicode or hex representation.
SELECT * FROM col WHERE (fr LIKE '%\unnnn%')
or
SELECT * FROM col WHERE (fr REGEXP '\unnnn')
Doesn't work, too.
I can't find anything on this topic in the MySQL documentation, but I hope there *is* a distinct way to search and insert such characters with some Unicode representation.
Thanks!
Edited 2 time(s). Last edit at 02/14/2011 09:52AM by Kai Schaetzl.