MySQL character sets
Posted by: jon
Date: March 23, 2006 08:14AM

Hi,

I'm trying to build a dictionary in a table with these columns:
word_id auto_increment,
word varchar(50),
unique key (word)

I read the 'words' from a search engine log file that contains a nice mix of ISO-8859-1 and UTF-8. First I put each word in a hash (I'm also counting the words):
$words{'äpple'}++;
$words{'æpple'}++;
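
To illustrate what a mixed log does to the hash, here is a minimal sketch, assuming the words arrive as raw bytes and are not decoded before being used as keys. The variable names ($latin1, $utf8) are made up for the example, and Encode is the core Perl module. The same word in the two encodings becomes two separate keys:

```perl
use strict;
use warnings;
use Encode qw(encode);

# "\x{e4}pple" is the word "apple" with a-umlaut as its first character.
# Assumption for this sketch: the log holds the word once per encoding.
my $word   = "\x{e4}pple";
my $latin1 = encode('ISO-8859-1', $word);   # bytes E4 70 70 6C 65
my $utf8   = encode('UTF-8', $word);        # bytes C3 A4 70 70 6C 65

# Count the raw byte strings, the same way the %words hash is filled.
my %words;
$words{$_}++ for ($latin1, $utf8, $latin1);

# Different byte sequences are different hash keys, even for the same word.
printf "%d distinct keys\n", scalar keys %words;
for my $key (sort keys %words) {
    my $hex = join ' ', map { sprintf '%02X', ord } split //, $key;
    printf "%s => %d\n", $hex, $words{$key};
}
```

This should report 2 distinct keys, with a count of 1 for the UTF-8 bytes and 2 for the ISO-8859-1 bytes.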

Then I insert the words from the %words hash into the DB. When I select the same words afterwards (i.e. first "insert into words (word) values ('äpple'), ('æpple')" and then "select * from words where word in ('äpple', 'æpple')"), they don't match. If I use latin1, the DB thinks ä == æ and refuses to insert both. If I use utf8 I get similar problems, and binary converts some of the characters, so the word I get back from the select isn't eq to the word I have in the hash.

What I wonder is what the correct(tm) way to do this is. My words contain characters from two languages, and äpple != æpple. I want to get out exactly what I put into the DB: if I insert UTF-8 I want UTF-8 back, if I insert ISO-8859-1 I want that back, with no messing with any characters. Is there a nice way of getting that?

Thanks in advance
