MySQL Forums
Forum List  »  Perl

Query running too slow
Posted by: Jorge Soares
Date: November 26, 2007 08:38AM

Hi all,

I was wondering if you could help me out. I have a DB made up of 7 tables. I'm querying 3 of them at the moment.

The first one:

ZFHSkeys has the correspondence between ZF gene names and HS gene names (orthologues for the biology based people reading this, and yes, I'm looking at Zebrafish and Human genes)

and is as follows:

id - INT NOT NULL AUTO_INCREMENT (and is the primary key)
EnsemblID_ZF - TINYBLOB
EnsemblID_HS - TINYBLOB

Total entries: 13950

The second table:

ZFq is a table that stores the Gene names for ZF, External ID, Chromossome, Position and Repeats so:

id - INT NOT NULL AUTO_INCREMENT (and is the primary key)
EnsemblID - TINYBLOB
ExtID - TINYBLOB
Chr - TINYBLOB
Rep - TINYINT
Pos - MEDIUMINT

Total entries: 7108

The third table:

HSq is a table that stores the Gene names for HS, External ID, Chromossome, Position and Repeats so:

id - INT NOT NULL AUTO_INCREMENT (and is the primary key)
EnsemblID - TINYBLOB
ExtID - TINYBLOB
Chr - TINYBLOB
Rep - TINYINT
Pos - MEDIUMINT

Total entries: 11208

I need a query that allows me to get the orthologues between ZF and HS with all the information that I have for them. Here is the query I came up with:

SELECT HSZFkeys.EnsemblID_ZF, ZFq.ExtID, ZFq.Chr, ZFq.Rep, ZFq.Pos, HSZFkeys.EnsemblID_HS, HSq.ExtID, HSq.Chr, HSq.Rep, HSq.Pos FROM HSZFkeys, ZFq, HSq WHERE HSZFkeys.EnsemblID_ZF = ZFq.EnsemblID AND HSZFkeys.EnsemblID_HS = HSq.EnsemblID;

I have developed a Perl script to do the query for me and output it into a file. The script is sure taking its sweet time, so I decided to run it purely on MySQL as a normal query, output on the screen. The query takes huge amount of time as well, at the time that I'm posting this, I still haven't seen any result.

I'm also now querying the database with MySQL Query Browser and again the query is taking very long time to run (also haven't seen the result).

I am working on Ubuntu Gutsy Gibbon with a 1Ghrz processor, although it will be tweaked to 2Ghrz tomorrow. My ram is 1024Mb.

Now seriously, the query on the three ways I am using is taking more than 2 hours to run. I think that it'll take even more.

Is there something wrong with my query (not syntax wise, cause MySQL has no problems there) just semantics wise? Should I break the query in two? Although this query is already a shorter version of the one I came up with before. This query is a query of 2 queries that were then uploaded on to the DB as tables.

I seriously don't know what to do. Should I just wait for the results?

All help appreciated.

Best wishes, Jorge

P.S. As I haven't found a single specialized forum where I could post this at, I decided to go with what's most important for me, that is the Perl community. I know that the script is sound as I have used it for many other queries before. If by any chance you guys think that this is not the correct place to post it, just let me know. I'll include the script I've designed:

#!/usr/lib/perl -w
use DBI;
use Net::MySQL;
use strict;
use warnings;


print "Where do you want to store the query (file extension needed)?\n";
chomp (my $query = <>);

# declare array
my @data;
my ($x, $end, $start);


# start timer
$start = time();
# perform a math operation 200000 times
for ($x=0; $x<=200000; $x++){

$data[$x] = $x/($x+2);
}

# end timer
$end = time();

# report
print "Time taken was ", ($end - $start), " seconds";



my $host = "localhost";
my $db = "Zebra_Human";
my $user = "root";
my $password = "omshanti";

my $mysql = Net::MySQL->new(
hostname => $host,
database => $db,
user => $user,
password => $password,
);

$mysql->query(q{
SELECT HSZFkeys.EnsemblID_ZF, ZFq.ExtID, ZFq.Chr, ZFq.Rep, ZFq.Pos, HSZFkeys.EnsemblID_HS, HSq.ExtID, HSq.Chr, HSq.Rep, HSq.Pos FROM HSZFkeys, ZFq, HSq WHERE HSZFkeys.EnsemblID_ZF = ZFq.EnsemblID AND HSZFkeys.EnsemblID_HS = HSq.EnsemblID
});
my $a_record_iterator = $mysql->create_record_iterator();
open (QUERY, ">> $query") or die "Unable to create file: query.txt";

while (my $record = $a_record_iterator->each) {
printf QUERY "EnID_ZF: %s ExtID: %s Chr: %s Rep: %s Pos: %s
EnID_HS: %s ExtID: %s Chr: %s Rep: %s Pos: %s\n",
$record->[0], $record->[1], $record->[2], $record->[3], $record->[4],
$record->[5], $record->[6], $record->[7], $record->[8], $record->[9];
}
close();

for ($x=0; $x<=200000; $x++){

$data[$x] = $x/($x+2);
}

# end timer
$end = time();

# report
print "\nTime taken was ", ($end - $start), " seconds", "\n";

print "Query stored, check $query", "/n";
close(QUERY);


exit;

All the best,

Jorge

Options: ReplyQuote


Subject
Written By
Posted
Query running too slow
November 26, 2007 08:38AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.