Searching/inserting certain Unicode characters
Posted by: Kai Schaetzl
Date: February 14, 2011 09:48AM

Hello,

I have done some research on this and also posted a question on Usenet, but didn't get any replies and still have no solution. The problem seems to be somewhat unique :-(

I want to insert a certain character in a text or char field and I also want to search for it (as proof that it was correctly inserted).

The character is "no-break space" which can either be represented as char(160) or hex-code A0 or as U+00A0. Or, generally speaking, I want to know how I can insert or find characters by Unicode or a similar representation?
MySQL is 5.0.77 on CentOS 5.5. The field is text with varying collations. In this case it is utf8_roman_ci (for this specific use I want to replace all "?" characters in the table with "no-break space"+"?").

There is no problem inserting the character "manually" by either creating it in the input field in PHPMyAdmin or our PHP application with ALT+255 or creating it in Word with that key sequence and then copying it to the PHP application and saving it. The character is inserted correctly which I can prove by cutting it out in the PHP output and back to Word.

What doesn't work is the insertion by REPLACE() nor a find operation with a REGEXP.

For instance, I tried

SET fr=REPLACE(fr, '?','x?')

where "x" was the character copied from Word. Or \u00A0 or \xA0. Non of this inserts the correct character. It either inserted a normal space (when I paste the character "as is" in) or throws an error or corrupts the text field.
I also tried things like

SET fr=REPLACE(fr, 'uniquestring',char(160))
or
SET fr=REPLACE(fr, 'uniquestring',CONCAT(char(160),'?'))

This cuts off all characters after the replacement position and doesn't insert the wanted character.

To simplify the whole thing I went to trying with "simple" letters, e.g. a "C" = \u0043. Let's call the Unicode representation "\unnnn". No dice.

I'm also trying to do the opposite, e.g. find a character by using the Unicode or hex representation.

SELECT * FROM col WHERE (fr LIKE '%\unnnn%')
or
SELECT * FROM col WHERE (fr REGEXP '\unnnn')

Doesn't work, too.
I can't find anything on this topic in the MySQL documentation, but I hope there *is* a distinct way to search and insert such characters with some Unicode representation.

Thanks!



Edited 2 time(s). Last edit at 02/14/2011 09:52AM by Kai Schaetzl.

Options: ReplyQuote


Subject
Views
Written By
Posted
Searching/inserting certain Unicode characters
19117
February 14, 2011 09:48AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.