MySQL Forums
Forum List  »  Perl

Character sets
Posted by: filippo guenzi
Date: October 13, 2010 08:10AM

Hello,

I am working with :
- MySQL Server 5.1.31
- MySQL Administrator 1.2.17
- MySQL Query Browser 1.2.17
- Windows XP Pro 2002 Service Pack 3
- Activeperl 5.10.1 Build 1700
- PHP 5.2.9-2 (not relevant here)

As far as I know character sets are specified in 3 places :
- MySQL Administrator -> Startup variables -> Advanced -> Localization -> Default Character Set which corresponds to what is specified in my.ini I guess.
- In table options (MySQL Administrator -> Select schema -> Select table -> Edit table -> Table Options -> Character Set)
- In column details (MySQL Administrator -> Select schema -> Select table -> Edit table -> Select column -> Column details -> Column Character Set)

In our business we run a program that deals with clients, orders, invoices.
In addition I use MySQL+php to get more info from that data.
So I need to export the data from the above mentioned program and import it in mySQL.
Our program exports text files in ASCII.

Before I had my whole database in utf8 and when I was loading data into it with MySQL Query Browser everything was working fine.
I then tried to automate loading of data into the database with Perl.
But all characters like é, ê, è, à, ô, î, û, ü, ç are transformed in ? when I do it with perl. Except if I save my text file in UTF-8 before running my perl script.

On one hand I am looking for a script to save all my text files in utf8.

On the other hand I am wondering if I should change the character set of my database to ascii. I tried to do that but I had a lot of problems. MySQL Query Browser was raising errors when I was trying to go from UTF8 to ASCII showing me rows where there were accented characters like é.

I heard that :
- UTF-8 and ascii match on the first 128 chars.
- UTF-8 and latin-1 match on the first 256 chars.

It looks like a mess to me :/

Here is the perl script that works fine when the database is utf-8 and the loaded file is utf-8 and kills my accented chars when it moves them from ascii files to utf8 database :

-------------------------------------------------------------

#!/usr/bin/perl
###
### simply calls LOAD DATA INFILE for specified database, tablename and csv file
###
use DBI;
use Getopt::Long;

GetOptions( "host=s" => \$hostname,
"port=i" => \$port,
"username=s" => \$username,
"password=s" => \$password,
"database=s" => \$database,
"file=s" => \$file,
"table=s" => \$table);

#data file MUST be specified (plus we substitute all occurances of \ with a /)
#die "No path to *.csv file specified!!!" unless defined $file;
#die "No table name specified!!!" unless defined $table;
#$file =~ s/\\/\//g;

#here we set default DB connection parameters
if (not defined $hostname) {$hostname = "localhost";}
if (not defined $port) {$port = "3306";}
if (not defined $username) {$username = "root";}
if (not defined $password) {$password = 'myPassword';}
if (not defined $database) {$database = "";}


#connect to database
$dsn = "DBI:mysql:database=$database;host=$hostname;port=$port";
$dbh = DBI->connect($dsn, $username, $password) or die("Could not connect to database '$database'!");

#create and execute SQL queries
$sql = '
LOAD DATA
LOCAL
INFILE "C:/flow.csv"
INTO TABLE `qua`.`catalogue_from_ciel`
fields terminated by "\t"
LINES TERMINATED BY "\r\n"
ignore 1 lines
(
`qua`.`catalogue_from_ciel`.idUnite,
`qua`.`catalogue_from_ciel`.idUniteFournisseur,
`qua`.`catalogue_from_ciel`.designationLongueCiel,
`qua`.`catalogue_from_ciel`.designationCourteCiel,
@poidsNet,
@poidsBrut,
@stockDisponible,
@disponibleSurCommande,
@prixAchatHT,
@remiseAchat1,
@remiseAchat2,
@remiseAchat3,
@fraisDeTransportAuKG,
@margeSurPaesVoulue,
@prixVenteHTPro,
@prixVenteTTCPart,
@tauxTVA,
`qua`.`catalogue_from_ciel`.codeBarre,

`qua`.`catalogue_from_ciel`.codeFournisseurCiel,
@degresAlcoolPur,
`qua`.`catalogue_from_ciel`.alcool_categorieDouanes,
`qua`.`catalogue_from_ciel`.bapsa,

@qteParColis,

@bloque
)
set
`qua`.`catalogue_from_ciel`.poidsNet=convert(replace(replace(@poidsNet," ",""),",","."),decimal(10,2)),
`qua`.`catalogue_from_ciel`.poidsBrut=convert(replace(replace(@poidsBrut," ",""),",","."),decimal(10,2)),
`qua`.`catalogue_from_ciel`.stockDisponible=convert(replace(replace(@stockDisponible," ",""),",","."),decimal(8,3)),
`qua`.`catalogue_from_ciel`.prixAchatHT=convert(replace(replace(@prixAchatHT," ",""),",","."),decimal(10,2)),
`qua`.`catalogue_from_ciel`.remiseAchat1=convert(replace(replace(@remiseAchat1," ",""),",","."),decimal(10,2)),
`qua`.`catalogue_from_ciel`.remiseAchat2=convert(replace(replace(@remiseAchat2," ",""),",","."),decimal(10,2)),
`qua`.`catalogue_from_ciel`.remiseAchat3=convert(replace(replace(@remiseAchat3," ",""),",","."),decimal(10,2)),
`qua`.`catalogue_from_ciel`.fraisDeTransportAuKG=convert(replace(replace(@fraisDeTransportAuKG," ",""),",","."),decimal(10,2)),
`qua`.`catalogue_from_ciel`.margeSurPaesVoulue=convert(replace(replace(@margeSurPaesVoulue," ",""),",","."),decimal(10,2)),
`qua`.`catalogue_from_ciel`.prixVenteHTPro=convert(replace(replace(@prixVenteHTPro," ",""),",","."),decimal(10,2)),
`qua`.`catalogue_from_ciel`.prixVenteTTCPart=convert(replace(replace(@prixVenteTTCPart," ",""),",","."),decimal(10,2)),
`qua`.`catalogue_from_ciel`.tauxTVA=convert(replace(replace(@tauxTVA," ",""),",","."),decimal(10,2)),
`qua`.`catalogue_from_ciel`.degresAlcoolPur=convert(replace(replace(@degresAlcoolPur," ",""),",","."),decimal(10,2)),
`qua`.`catalogue_from_ciel`.disponibleSurCommande=replace(replace(replace(@disponibleSurCommande,"Oui",1),"Non",0),"",0),
`qua`.`catalogue_from_ciel`.qteParColis=convert(replace(replace(@qteParColis," ",""),",","."),decimal(6,3)),
`qua`.`catalogue_from_ciel`.bloque=replace(replace(replace(@bloque,"Oui",1),"Non",0),"",0)
';
$sth = $dbh->prepare($sql);
$sth->execute;

#disconnect from database
$dbh->disconnect;

-------------------------------------------------------------

Can you please give me advices/guidelines on how to deal with character sets and explain me my perl problem (what works with MySQL Query Browser doesn't with Perl) ?

Thank you

Options: ReplyQuote


Subject
Written By
Posted
Character sets
October 13, 2010 08:10AM
October 13, 2010 10:11AM
October 17, 2010 12:38PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.