Slow query performance
Posted by: Claude van der Ryst
Date: May 26, 2024 04:03PM

Hi

I have written a Python program that parses a CSV log file containing every connection to our streaming server.

The data is extracted from the CSV log and processed line by line, and each line is stored as a row in a table.

Every time I read a line from the CSV file, I check whether a matching entry already exists in the database, to prevent duplicates.

The problem I have is that the database table currently has over 1 900 000 records. To try to optimize the query, I select only one column and filter on only the columns needed to find a match. I only need a row count returned.
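For illustration, the per-line check described above looks roughly like the sketch below. The table name `connections` and the columns `client_ip`, `stream` and `started_at` are hypothetical, since the real schema is not shown in this post. The stdlib `sqlite3` module is used only so the snippet runs without a server; with Connector/Python the parameter placeholder would be `%s` instead of `?`.

```python
import sqlite3

# Hypothetical schema: the real table and column names are not given in the post.
MATCH_SQL = (
    "SELECT COUNT(id) FROM connections "
    "WHERE client_ip = ? AND stream = ? AND started_at = ?"
)
INSERT_SQL = (
    "INSERT INTO connections (client_ip, stream, started_at) VALUES (?, ?, ?)"
)


def count_matches(cursor, row):
    """Return how many stored rows match the fields of one CSV line."""
    cursor.execute(MATCH_SQL, row)
    return cursor.fetchone()[0]


def insert_if_new(conn, row):
    """Insert the row only when the count query finds no existing match."""
    cursor = conn.cursor()
    if count_matches(cursor, row) == 0:
        cursor.execute(INSERT_SQL, row)
        conn.commit()
        return True
    return False


# In-memory stand-in for the real MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE connections ("
    "id INTEGER PRIMARY KEY, client_ip TEXT, stream TEXT, started_at TEXT)"
)

# Made-up sample lines; the second one duplicates the first.
sample_rows = [
    ("10.0.0.1", "live1", "2024-05-26 15:00:00"),
    ("10.0.0.1", "live1", "2024-05-26 15:00:00"),
    ("10.0.0.2", "live1", "2024-05-26 15:00:05"),
]
inserted = [insert_if_new(conn, row) for row in sample_rows]
total_rows = conn.execute("SELECT COUNT(*) FROM connections").fetchone()[0]
```

Each CSV line costs one `SELECT COUNT` round trip before any insert, which is the query that takes 3 to 4 seconds on the real table.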

Each time I run the query, it takes about 3 to 4 seconds to return a count. Checking the CPU usage of mysql shows it maxed out at 100%.

A CSV log file can grow by as many as 4000 new lines in one hour, and this will increase as we get more clients.

If one line takes 3 seconds to process, the current instance of the Python program will take just over 3 hours to complete (4000 lines at 3 seconds each).
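The estimate works out as follows, using only the figures quoted in this post (4000 lines per hourly file, 3 to 4 seconds per duplicate check). The function name `run_time` is just for illustration.

```python
def run_time(lines, seconds_per_line):
    """Return (hours, minutes) needed to process `lines` rows one at a time."""
    total_seconds = lines * seconds_per_line
    hours, remainder = divmod(total_seconds, 3600)
    return hours, remainder // 60


# 4000 lines at 3 s each is 12000 s, i.e. 3 hours 20 minutes.
best_case = run_time(4000, 3)
# 4000 lines at 4 s each is 16000 s, i.e. 4 hours 26 minutes.
worst_case = run_time(4000, 4)
```

Either way, one hour of log data takes well over an hour to process, so the hourly runs overlap.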

To add to the problem, I have a cron job that runs the Python program every hour to process the past hour's entries. This will slow the system down even further, because the previous instance of the program is still running.

Is there any way I can optimize the query further, change settings in MySQL, or otherwise speed up these queries?



