MySQL Forums
Forum List  »  Newbie

MySQL/PHP import millions of rows: need optimization
Posted by: Gilles Lemagnen
Date: January 26, 2010 02:52AM

Hi all.

I have an import script which gets all the text files in a folder and imports them into a couple of tables, most notably one massive MyISAM MySQL table (millions of rows, 3 columns) called 'current_table'.

This 'current_table' has 3 fields:
Date TYPE: datetime
Current TYPE: float
Panel TYPE: varchar(20)

I am using WampServer for the MySQL database.

Here are the details of the (uncommented) my.ini file:

# The MySQL server
[wampmysqld]
port = 3306
socket = /tmp/mysql.sock
skip-locking
key_buffer = 50M
max_allowed_packet = 1M
table_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

[isamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M

[myisamchk]
key_buffer = 30M
sort_buffer_size = 30M
read_buffer = 3M
write_buffer = 3M

[mysqlhotcopy]
interactive-timeout

[mysqld]
port=3306
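Since the question is about bulk-loading into MyISAM, I also wondered whether the buffers above are just too small for this workload. A hedged sketch of what I might try in the [wampmysqld] section (the values are made-up starting points, not tuned recommendations):

```ini
# hypothetical additions -- values are guesses to experiment with
key_buffer = 256M                 # index blocks for MyISAM tables
bulk_insert_buffer_size = 64M     # MyISAM cache for bulk INSERTs
myisam_sort_buffer_size = 64M     # used when building/repairing indexes
```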

Given that there are hundreds of files in the datafile folder, and each file contains about a thousand rows, I am not surprised that Firefox becomes unresponsive and does not perform a live update of the echo:
	$insert_file_flag = (mysql_num_rows($result)==0);	// if no occurrence found, insert the file
	$text = $insert_file_flag ? "import successful...........................<br>" : "already in database<br>";

but instead shows the hourglass cursor and displays all the echoes at once when the script stops.
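I did wonder whether PHP's output buffering is part of why nothing shows until the end. A sketch of what I mean (assuming Apache + mod_php and no mod_deflate/zlib compression sitting between PHP and the browser):

```php
<?php
// Sketch: disable PHP's output buffering so each echo reaches the
// browser as it happens, instead of all at once when the script ends.
while (@ob_end_flush());    // unwind any active output buffers
ob_implicit_flush(true);    // flush automatically after every echo

echo "Filename: somefile.txt, import successful<br>\n";
flush();                    // push pending bytes out to the web server
```

Whether the browser actually renders incrementally also depends on output_buffering in php.ini and on the web server's own buffering.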

In fact, the script never completes: it stops after a few dozen file imports with:

Quote

Fatal error: Maximum execution time of 180 seconds exceeded in D:\Program Files\wamp\www\import_file.php on line 60

The php.ini file in WampServer has the maximum execution time set to 240, so I'm puzzled as to why it says 180 in Firefox.
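My guess is that the running Apache is loading a different php.ini than the one I edited (WAMP ships several copies), or that Apache wasn't restarted after the change. A sketch of how I could check from inside the script, and lift the limit for this one long-running import (php_ini_loaded_file() exists since PHP 5.2.4):

```php
<?php
// Sketch: report which php.ini the server actually loads and what
// limit is in effect, then remove the limit for this request only.
echo "Loaded php.ini: " . php_ini_loaded_file() . "\n";
echo "max_execution_time: " . ini_get('max_execution_time') . "\n";

set_time_limit(0);   // 0 = no time limit for the rest of this request
```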

So I have to re-run the import_file.php script to get another couple dozen text files imported, then re-run it again, and again, until all the files have been processed. (But have the files that were being processed when the timeout hit been imported correctly? THAT remains the ultimate question!)

Is there a way to make sure the whole import process works smoothly at once?
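I did wonder whether letting MySQL parse the files itself with LOAD DATA INFILE would be the right fix, since it is typically far faster than a per-row INSERT loop. A hypothetical sketch (the file path and panel name below are made up; the 7 skipped header lines and the DD.MM.YYYY / decimal-comma conversions mirror what the script below does):

```php
<?php
// Hypothetical sketch: one LOAD DATA statement per file instead of
// one INSERT per row. STR_TO_DATE and REPLACE do the date and
// decimal-comma conversions on the server side.
$file  = 'D:/datafile/Panel1_2010.txt';   // hypothetical example path
$panel = 'Panel1';                        // hypothetical panel name

$sql = "LOAD DATA LOCAL INFILE '$file'
        INTO TABLE current_data
        FIELDS TERMINATED BY ';'
        IGNORE 7 LINES
        (@d, @t, @v)
        SET Date    = STR_TO_DATE(CONCAT(@d, ' ', @t), '%d.%m.%Y %H:%i:%s'),
            Current = REPLACE(@v, ',', '.'),
            Panel   = '$panel'";
// mysql_query($sql) or die(mysql_error());   // then run it as usual
```

This would need LOCAL INFILE enabled on the server, and real paths/panel names escaped properly.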

Hopefully, even though this is partly a PHP question, anyone working with MySQL will have enough PHP experience to help me. I didn't write the import script, and I am not competent with MySQL data structures or optimization techniques.

Thanks


<?php

$debug = 1;	// $debug=1 will activate all the echos

require("config.php");

$directory = "./datafile/";	// directory containing ALL the data files

// connect once, outside the loop, instead of once per file
$conn = mysql_connect($host,$user,$pass) or die("Error connecting to MySQL database");
mysql_select_db($database, $conn) or die("Could not select database");

foreach (new DirectoryIterator($directory) as $fileInfo) {
    if ($fileInfo->isDot()) continue;
    $fileName = $fileInfo->getFilename();
	
	if ($debug) echo "Filename: ".$fileName.", ";
	
	$sqlquery = "SELECT filename FROM inserted_file_log WHERE filename = '".mysql_real_escape_string($fileName)."'";	// exact match, with the filename escaped
	$result = mysql_query($sqlquery);

	$insert_file_flag = (mysql_num_rows($result)==0);	// if no occurrence found, insert the file (mysql_num_rows, not mysql_affected_rows, for a SELECT)
	$text = $insert_file_flag ? "import successful...........................<br>" : "already in database<br>";
	if ($debug) echo $text;

	if ($insert_file_flag) {
		$panel = "";
		$fileNamePart = explode("_",$fileName);	//divide filename information
		if ($fileNamePart[0]=="Rainsensor") {
			$table = "sensor_rain";	// rain sensor data table
		} elseif ($fileNamePart[0]=="Temperature") {
			$table = "sensor_temperature";	// temperature sensor data table
		} elseif ($fileNamePart[0]=="Iradiation") {
			$table = "sensor_iradiation";	// irradiation sensor data table
		} else {
			$panel = $fileNamePart[0];	// panel information to use after
			$table = "current_data";	// current data table
		}
		
		$handle = fopen($directory.$fileName,"r");

		if ($handle) {
			$i=0;
			while (($buffer = fgets($handle)) !== false) {	// fgets returns false at EOF
				$i++;
				if ($i<8 || trim($buffer)=="") continue;	// skip the 7 header lines and blank lines
				$data = explode(";",trim($buffer));	// separates columns (date;time;value)
				$dateInfo = explode(".",$data[0]);
				$dateTime = $dateInfo[2]."-".$dateInfo[1]."-".$dateInfo[0]." ".$data[1];	// Change to YYYY-MM-DD HH:MM:SS format
				$value = str_replace(",", ".", $data[2]);
				
				// execute insert query
				if ($panel) { 
					// current information has also a panel value
					$sqlquery = "INSERT INTO $table VALUES ('$dateTime','$value','$panel')";
				} else { 
					// other information
					$sqlquery = "INSERT INTO $table VALUES ('$dateTime','$value')"; 
				}
				$result = mysql_query($sqlquery) or die(mysql_error());	// stop on the first failed insert so errors are visible
			}
			fclose($handle);
			$sqlquery = "INSERT INTO inserted_file_log VALUES ('$fileName')"; 
			$result = mysql_query($sqlquery);
		}
	}
}

?>
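I also sketched what a batched version of the insert loop might look like: collecting rows and flushing them as one multi-row INSERT every 500 lines, so there is one round trip per 500 rows instead of one per row. The table name and sample rows here are just illustrations:

```php
<?php
// Hypothetical sketch: build multi-row INSERT statements in batches.
$batch = array();
$rows  = array(                                       // sample data
    array('2010-01-26 02:52:00', '3.14', 'Panel1'),
    array('2010-01-26 02:53:00', '2.71', 'Panel1'),
);
foreach ($rows as $r) {
    $batch[] = "('" . implode("','", $r) . "')";      // one (d,v,p) tuple
    if (count($batch) >= 500) {                       // flush every 500 rows
        // mysql_query("INSERT INTO current_data VALUES " . implode(',', $batch));
        $batch = array();
    }
}
if ($batch) {                                         // flush the remainder
    $sql = "INSERT INTO current_data VALUES " . implode(',', $batch);
    // mysql_query($sql);
}
```

In the real script the tuples would come from the fgets loop, with the values escaped.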



Edited 3 time(s). Last edit at 01/26/2010 03:06AM by Gilles Lemagnen.
