MySQL Forums :: Japanese :: CHAR_LENGTH() returns incorrect value on Japanese UTF8 text


CHAR_LENGTH() returns incorrect value on Japanese UTF8 text
Posted by: Gregor Kaplan
Date: August 23, 2007 10:05AM

Execute the following:

CREATE TABLE multibyte
(
thing VARCHAR(20) CHARACTER SET utf8
);

INSERT INTO multibyte (thing) VALUES('human');
INSERT INTO multibyte (thing) VALUES('ははは'); #if you can't read this it's "hahaha" in hiragana

SELECT thing, CHAR_LENGTH(thing), LENGTH(thing) FROM multibyte;

Result:
+--------+--------------------+---------------+
| thing  | CHAR_LENGTH(thing) | LENGTH(thing) |
+--------+--------------------+---------------+
| human  |                  5 |             5 |
| ははは |                  9 |            18 |
+--------+--------------------+---------------+

CHAR_LENGTH() should return 3 for the second row, since the string contains three characters. However, if the file is opened in a plain text editor that does not interpret it as UTF8, the text appears as "„ÅØ„ÅØ„ÅØ", which is 9 characters, just not 9 meaningful ones.
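
One way the 9 and the 18 could both arise is sketched below, in Python rather than MySQL. It assumes the client sent UTF8 bytes over a connection whose character set was latin1. That is only an assumption, since the connection settings are not shown here. Python's latin-1 codec is used as a stand-in for the server's handling, and the variable names are made up for illustration.

```python
# Illustration only: Python standing in for what the server may have done.
text = "ははは"  # the value from the INSERT above

# What the client sends: UTF-8, 3 bytes per hiragana character.
utf8_bytes = text.encode("utf-8")

# Assumption: the server reads those bytes as latin1, one character per byte.
misread = utf8_bytes.decode("latin-1")

# Converted for storage in a utf8 column, each of those characters takes 2 bytes.
stored = misread.encode("utf-8")

print(len(text))        # characters in the original string
print(len(utf8_bytes))  # bytes sent by the client
print(len(misread))     # characters as the server would count them
print(len(stored))      # bytes that end up in the column
```

Under that assumption the four counts are 3, 9, 9 and 18, which lines up with the CHAR_LENGTH() and LENGTH() values in the result. If this is what happened, running SET NAMES utf8 on the connection before the INSERT statements would be the first thing to try.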

So, is this a bug, or is there a correct way to call this that I am simply missing?

Many thanks and much appreciation in advance,

Gregor




