Insert Chinese from Php page yields strange characters 四分衛
Posted by:
Tian He
Date: January 23, 2006 11:59PM
Hello,
I am trying to write a small translation engine using MySQL and PHP 5.
I have a php page that allows users to insert new records into a dictionary database.
Here's what I've done:
1) The php page calls:
mysql_query("SET NAMES 'utf8'", $dblink);
mysql_query("SET CHARCTER SET utf8", $dblink);
So the connections between php and mysql should be in utf8 format.
2) I've also set the html to
<meta http-equiv="content-type" content="text-html; charset=utf-8">
So the php page should be receive utf-8, and I guess the form should also be taking in utf-8.
3) The database and its table has utf8_unicode_ci collation and utf8 charset
Now the weird part:
I've tried inserting an entry from phpMyAdmin into my database, and the record shows up just fine. So I am guessing mysql database is setup correctly.
However when I try to insert an entry from my php page using
$query = "INSERT INTO terms VALUE('$eng','$eng_alt','$chi','$chi_alt')";
Where $chi contains the chinese values ' 四分衛'. When I use phpMyAdmin to look at the new entry i just made, I see the chinese entry becomes '四分衛'.
The first character matches but the other two character doesn't. It also seems that the font is different. If I try other terms, some might not even have the first character matching.
I 've output $query onto the webpage right before I call mysql_query($query), and the output gives me the correct query. INSERT INTO terms VALUE('Quarterback','','四分衛','')
Does any one know what could be wrong? Help would be much appeciated!
I think it's related to the mysql_connection, corrupting the data somehow, but I am not sure. But here is my code:
---------------------
<?php header("Content-type: text/html; charset=utf-8");
/*
* Created on Jan 9, 2006
*
* To change the template for this generated file go to
* Window - Preferences - PHPeclipse - PHP - Code Templates
*/
$eng=$_REQUEST['eng'];
$chi=$_REQUEST['chi'];
$test=$_REQUEST['charset_check'];
?>
<html>
<head><title>Football Terms</title>
<meta http-equiv="content-type" content="text-html; charset=utf-8">
<meta http-equiv="Content-Language" content="zh-Hans, en" />
</head>
<h1 align="center" >
<font face="Bookman Old Style, Book Antiqua, Garamond">Chinese/English Football Translator</font>
</h1>
<h3 align="center">
<form method="post" action="addterm.php">
<table>
<tr>
<td align='center'><b>English</b>
<td align='center'><b>Alt. English</b>
<td align='center'><b>Chinese</b>
<td align='center'><b>Alt. Chinese</b>
</tr>
<tr>
<td><input name="eng" type="text" size="20"/>
<td><input name="eng_alt" type="text" size="20"/>
<td><input name="chi" type="text" size="20">
<td><input name="chi_alt" type="text" size="20"/>
<input type="hidden" name="charset_check" value="ä™®" />
</tr>
</table>
<input type="submit" />
</form>
<a href="addterm.php">Add Term To DB</a>
<a href="search.php">Search Term In DB</a>
</h3>
</html>
<?
#Database connections:
$user="root";
$password="xxxx";
$database="football1";
mysql_connect('localhost',$user,$password);
mysql_query("SET NAMES 'utf8'", $dblink);
mysql_query("SET CHARCTER SET utf8", $dblink);
mysql_query("SET COLLATION_CONNECTION='utf8_unicode_ci'");
$cEncoding = mysql_client_encoding();
echo "CLIENT ENCODING: $cEncoding";
@mysql_select_db($database);
$chi=mysql_real_escape_string($chi);
$chi_alt=mysql_real_escape_string($chi_alt);
if (bin2hex($test) == "c3a4e284a2c2ae") { /* UTF-8 for "ä™®" */
echo "$test: it's UTF-8!";
}
else{
echo "$test: it's NOT UTF-8!'";
}
if(empty($eng)||empty($chi)){
}
else{
$query = "INSERT INTO terms VALUES('$eng','$eng_alt','$chi','$chi_alt')";
echo $query;
# $query = "INSERT INTO test VALUES(convert('$chi' using utf8))";
mysql_query($query);
if(mysql_error()){ ?>
<html><head><title>Insertion Error</title><head>
<p>Insertion Error: Cannot perform query: <? echo $query ?> </p>
<?
echo mysql_error();
}
}
?>