Unicode collation, utf8 data fetch/manip problem
Posted by: masi pay
Date: April 12, 2010 07:18PM

I have looked everywhere for an answer, but everything I found was too vague.

I am simply trying to fetch a substring of a certain length from my table, where the data is stored as Unicode. Each stored Unicode character (say Hindi or Nepali) seems to have a length of 5 ASCII chars, so trying to get a substr of length 10 (of actual Unicode/UTF-8 characters) fetches only 10 ASCII chars, which when printed show up as only 2 UTF-8 chars in the browser.
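To illustrate why the lengths can differ so much, here is a rough Python sketch (not the PHP/MySQL code in question). It counts the same four characters three ways: as code points, as UTF-8 bytes, and as HTML numeric entities. The variable names are made up for the example, and the assumption that the data might be stored as numeric entities is only a guess.

```python
# Four Devanagari characters, as they would be typed into the form.
text = "कखगघ"

# Number of Unicode code points (what a character-aware substr should count).
char_count = len(text)

# Number of bytes once the text is encoded as UTF-8 (3 bytes per character here).
utf8_byte_count = len(text.encode("utf-8"))

# The same text written as HTML numeric character references, e.g. "&#2325;".
# "xmlcharrefreplace" is a standard Python error handler for str.encode.
entity_form = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
entity_length = len(entity_form)

print(char_count)        # 4
print(utf8_byte_count)   # 12
print(entity_form)       # &#2325;&#2326;&#2327;&#2328;
print(entity_length)     # 28
```

A substring function that counts bytes or entity characters instead of code points will cut the text in the middle of a character, which would match the symptom described.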

I have tried mb_strlen; it gives the same result.

On my server:
the MySQL charset is UTF-8 Unicode (utf8)
the MySQL connection collation is utf8_unicode_ci
and all the tables are collated as utf8_unicode_ci

The UTF-8 text submitted from forms is stored in the tables, where it displays something like:
>>भारतमा पासपोर्ट छा..

Is this normal?
When I correct such text through phpMyAdmin using UTF-8 chars like क ख ग, these chars are displayed as ????? in the browser.

Can you please advise me why this is happening?

My main purpose here, however, is to fetch a certain length of Unicode text.

Data passed from the web form: कखगघ
is stored as: ªà¤¾à¤¸ (something like that), instead of HTML entities such as ಎ
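The sample ªà¤¾à¤¸ looks like what UTF-8 bytes turn into when they are read back as Latin-1. The Python sketch below only demonstrates that assumption; the helper names are hypothetical, and Latin-1 is a guess at the charset involved somewhere between the form, the connection and the table.

```python
def to_mojibake(text: str) -> str:
    """Encode as UTF-8, then wrongly decode the bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair_mojibake(garbled: str) -> str:
    """Undo the mistake: recover the bytes via Latin-1, then decode as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


original = "पास"                 # three Devanagari code points
garbled = to_mojibake(original)  # nine Latin-1 characters, one per UTF-8 byte
restored = repair_mojibake(garbled)

print(len(original), len(garbled))  # 3 9
print(restored == original)         # True
```

If the stored value really is of this kind, each original character occupies three Latin-1 characters, so a substr of length 2 cannot return even one whole character.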

I want to fetch only 2 chars, i.e. कख, but if I use substr(mytext, 0,2) in a MySQL query it fetches 2 ASCII chars, which do not even print क, because the char क alone seems to have a length equivalent to 5 ASCII chars.
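A minimal sketch of the behaviour being asked for, in Python rather than MySQL: decode the stored form into real characters first, then slice by character. It assumes, purely for illustration, that the column holds HTML numeric entities; `first_chars` is a hypothetical helper, and `html.unescape` is the standard-library function that decodes such entities.

```python
import html


def first_chars(stored: str, n: int) -> str:
    """Decode HTML entities to real characters, then take the first n of them."""
    decoded = html.unescape(stored)
    return decoded[:n]


# Assumed stored form of कखगघ (an assumption, not taken from the actual table).
stored = "&#2325;&#2326;&#2327;&#2328;"

print(first_chars(stored, 2))  # कख
print(stored[:2])              # &#  (slicing the raw stored form cuts an entity apart)
```

Note that slicing by code point can still separate a consonant from a following vowel sign (for example प and ा in पा), so a cut point may need adjusting for display.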

I would very much appreciate it if you could elaborate on this and let me know what I am doing incorrectly or what I need to do.

Thanks a lot

