Problem while storing XML with a different encoding in a MySQL table
Posted by: Ashish Duggal
Date: May 07, 2005 05:31AM

While parsing an XML document with a SAX parser, I found that the encoding of the XML document received as an input stream is "ISO-8859-1". After parsing, certain fields have to be stored in a MySQL table whose character set is "utf8". What I found is that certain characters in the original XML document are stored as question marks (?) in the database.

1. I am using MySQL 4.1.7 with the system variable character_set_database set to "utf8", so all my tables have the charset "utf8".

2. I am parsing some XML files as an input stream using the SAX parser API (org.apache.xerces.parsers.SAXParser) with encoding "iso-8859-1". After parsing, certain fields have to be stored in the MySQL database.

3. Some XML files contain an "iso-8859-1" character with character code 146. It looks like an apostrophe, but it is actually ’, and the problem is that words like can’t are shown as can?t by the database.

4. I noticed that parsing goes well and the character code is 146 while parsing. But when I retrieve the value from the database using JDBC, the character code is 63.

5. I am using a JDBC PreparedStatement to insert the parsed XML into the database. Some problem seems to occur during the insert, but I don't know what it is.

6. I tried to convert iso-8859-1 to utf-8 before storing it in the database by using
utfString = new String(isoString.getBytes("ISO-8859-1"),"UTF-8");
But when I retrieve it from the database, it still shows the character code as 63.

7. I also tried to retrieve it using description = new String(rs.getBytes(1),"UTF-8");
But description still contains a character with code 63 instead of 146, and can’t is still shown as can?t.
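
The observations in points 3, 4, 6 and 7 can be reproduced without a database. The sketch below is only an illustration; the class and method names (Byte146Demo, decodeByte, reinterpret, encodeFirstByte) are made up for it. It assumes the byte 146 in the files is really the Windows-1252 right single quotation mark (U+2019), which is how that byte is usually displayed as ’; in ISO-8859-1 proper, byte 146 decodes to the control character U+0092. It also shows that the conversion from point 6 produces the replacement character U+FFFD rather than a usable string, and that encoding a character into a charset that lacks it yields byte 63 ('?'), which is one plausible origin of the question marks.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Byte146Demo {

    // Decode one raw byte under the named charset and return the resulting code point.
    static int decodeByte(int rawByte, String charsetName) {
        String s = new String(new byte[] {(byte) rawByte}, Charset.forName(charsetName));
        return s.codePointAt(0);
    }

    // The conversion from point 6: encode as ISO-8859-1, then decode those bytes as UTF-8.
    static String reinterpret(String isoString) {
        byte[] latin1Bytes = isoString.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1Bytes, StandardCharsets.UTF_8);
    }

    // Encode a string into the given charset and return its first byte as an unsigned value.
    static int encodeFirstByte(String s, Charset charset) {
        return s.getBytes(charset)[0] & 0xFF;
    }

    public static void main(String[] args) {
        // Byte 146 read as ISO-8859-1 is the control character U+0092.
        System.out.println(decodeByte(146, "ISO-8859-1"));   // prints 146
        // The same byte read as windows-1252 is U+2019, the curly apostrophe.
        System.out.println(decodeByte(146, "windows-1252")); // prints 8217

        // A lone byte 0x92 is not valid UTF-8, so the decoder substitutes U+FFFD.
        String converted = reinterpret("can\u0092t");
        System.out.println((int) converted.charAt(3));       // prints 65533

        // U+2019 does not exist in ISO-8859-1, so the encoder writes '?' (63).
        System.out.println(encodeFirstByte("\u2019", StandardCharsets.ISO_8859_1)); // prints 63
    }
}
```

If the second line prints 8217 for the bytes in the real files, decoding the input as windows-1252 instead of iso-8859-1 would give the parser a character that utf8 can store. Whether that applies here depends on how the XML files were actually produced.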

Please help me find out where the problem is: in parsing, or while storing to and retrieving from the database. Sorry for any spelling mistakes.
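
One thing worth checking on the storing side is the character encoding of the JDBC connection itself, since the driver converts Java strings to bytes before the server ever sees them. The sketch below assumes MySQL Connector/J and uses its useUnicode and characterEncoding connection properties; the host, database, table and column names (localhost, test, items, description) and the class name Utf8InsertSketch are placeholders, not taken from the setup described above. It is not a confirmed fix for this case.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class Utf8InsertSketch {

    // Build a Connector/J URL that asks the driver to send strings as UTF-8.
    static String jdbcUrl(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    // Insert one parsed field; "items" and "description" are hypothetical names.
    static void insertDescription(Connection conn, String description) throws SQLException {
        PreparedStatement ps =
                conn.prepareStatement("INSERT INTO items (description) VALUES (?)");
        try {
            // Pass the Java String directly; no manual getBytes/new String round trip.
            ps.setString(1, description);
            ps.executeUpdate();
        } finally {
            ps.close();
        }
    }

    public static void main(String[] args) {
        // Only prints the URL; opening a connection needs a running server and the driver.
        System.out.println(jdbcUrl("localhost", "test"));
    }
}
```

A connection would then be opened with DriverManager.getConnection(jdbcUrl(...), user, password) and passed to insertDescription.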

