CHAR_LENGTH() returns incorrect value on Japanese UTF8 text
Execute the following:
CREATE TABLE multibyte
(
thing VARCHAR(20) CHARACTER SET utf8
);
INSERT INTO multibyte (thing) VALUES('human');
INSERT INTO multibyte (thing) VALUES('ははは'); #if you can't read this it's "hahaha" in hiragana
SELECT thing, CHAR_LENGTH(thing), LENGTH(thing) FROM multibyte;
Result:
+--------+--------------------+---------------+
| thing  | CHAR_LENGTH(thing) | LENGTH(thing) |
+--------+--------------------+---------------+
| human  |                  5 |             5 |
| ははは |                  9 |            18 |
+--------+--------------------+---------------+
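For the second row, CHAR_LENGTH() should return 3, as that is the actual number of characters in the string, and LENGTH() should return 9, since each of these hiragana takes three bytes in UTF8. However, if one were to open the file in a plain text editor that does not decode UTF8, it would appear as "„ÅØ„ÅØ„ÅØ", which is 9 characters, just not 9 meaningful characters.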
So, is this a bug, or is there a correct way to call this that I am simply missing?
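In case it helps with diagnosis, here is a sketch of a check that could be run against the same table. It rests on an assumption that I have not confirmed: that the client is sending UTF8 bytes over a connection whose character set is not utf8 (latin1, for example). A result of 9 characters and 18 bytes would be consistent with that, because each of the 9 UTF8 bytes would be treated as its own single-byte character and then stored as a two-byte UTF8 sequence.

```sql
# Show which character sets the current session uses for the client,
# connection and results (assumption: one of these is not utf8).
SHOW VARIABLES LIKE 'character_set_%';

# Inspect the raw bytes of the stored rows. Correctly stored 'ははは' is
# E381AF repeated three times (9 bytes). If every byte was re-encoded as a
# separate latin1 character, the pattern is C3A3C281C2AF repeated (18 bytes).
SELECT thing, HEX(thing), CHAR_LENGTH(thing), LENGTH(thing) FROM multibyte;

# Declare that the client sends and expects utf8, then insert the value again.
SET NAMES utf8;
INSERT INTO multibyte (thing) VALUES('ははは');

# If the assumption above holds, the newly inserted row should report
# CHAR_LENGTH = 3 and LENGTH = 9, while the old row stays at 9 and 18.
SELECT thing, HEX(thing), CHAR_LENGTH(thing), LENGTH(thing) FROM multibyte;
```

If the HEX() output for the original row already shows E381AF three times, the connection character set is not the cause and this sketch does not apply.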
Many thanks and much appreciation in advance,
Gregor
Posted: August 23, 2007 10:05AM