MySQL Forums

CHAR_LENGTH() returns incorrect value on Japanese UTF8 text
Posted by: Gregor Kaplan
Date: August 23, 2007 10:05AM

Execute the following:

CREATE TABLE multibyte
(
thing VARCHAR(20) CHARACTER SET utf8
);

INSERT INTO multibyte (thing) VALUES('human');
INSERT INTO multibyte (thing) VALUES('ははは'); #if you can't read this it's "hahaha" in hiragana

SELECT thing, CHAR_LENGTH(thing), LENGTH(thing) FROM multibyte;

Result:
+--------+--------------------+---------------+
| thing  | CHAR_LENGTH(thing) | LENGTH(thing) |
+--------+--------------------+---------------+
| human  |                  5 |             5 |
| ははは |                  9 |            18 |
+--------+--------------------+---------------+

CHAR_LENGTH() should return 3 for the second row, as that is the actual number of UTF8 characters. However, if one were to open the file in a plain text editor that does not decode UTF8, it would appear as "„ÅØ„ÅØ„ÅØ", which is 9 characters, just not 9 meaningful characters.
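For what it's worth, the 9 and 18 above can be reproduced outside MySQL. This is only a sketch of one possible explanation, under the assumption (not confirmed anywhere in this thread) that the client connection was using a single-byte character set such as latin1, so that each of the 9 UTF8 bytes was taken as one character and then converted to UTF8 again when stored. Python's "latin-1" codec stands in for MySQL's latin1 here, and the variable names are made up for illustration.

```python
# Sketch of a possible double-encoding path; the latin1 connection
# character set is an assumption, not something the post confirms.
original = "ははは"

# The client sends the string as UTF-8: 3 characters, 9 bytes.
utf8_bytes = original.encode("utf-8")

# Assumed: the server reads each of those bytes as one latin1 character.
misread = utf8_bytes.decode("latin-1")

# The utf8 column then stores those 9 characters, 2 bytes each.
stored_bytes = misread.encode("utf-8")

char_length = len(misread)       # analogue of CHAR_LENGTH(thing)
byte_length = len(stored_bytes)  # analogue of LENGTH(thing)

print(len(original), len(utf8_bytes))  # 3 9
print(char_length, byte_length)        # 9 18
```

If that assumption holds, the counts would be correct for what was actually stored, and the thing to check would be the connection character set rather than CHAR_LENGTH() itself.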

So, is this a bug, or is there a correct way to call this that I am simply missing?

Many thanks and much appreciation in advance,

Gregor

