MySQL Forums
Forum List  »  General

Improve speed of SQL statement
Posted by: Jan Becker
Date: January 05, 2017 04:30AM

I want to search for specific words in a Table "Article". The words are in a list called DB_Positive.

Example: In DB_Positive there is %Christmas% and %Berlin%. In ArticleID 1 there is Articletext1 (short version of text) "On Christmas all people in Berlin go on vacation." and Articletext2 (Long version) "As every year on Christmas people of the capitol of Germany go on vacation." (In total there are 7 versions of the text - all of which have to be checked)

I want to create a results table, which shows, how often each phrase was mentioned somewhere in the Article table fields.

Thus in this case the results should be
1 (ArticleID), %Christmas% (Positiveword), 2 (number of fields, in which Christmas occurs)
1, %Berlin%, 1 (as Berlin only occurs in Articletext1)

There are at the moment 50 line items in DB_Positive and 500,000 items in Articles. The initial SQL statement ran very very long. I already optimized it by first reading the DB_Positive into a in Memory table and then do the SQL statement. The result is 25,000,000 lines (50 * 500,000). But 99.9% of all lines have the result "0" (Positiveword doesn't occur anywhere in Article. I don't need these lines. There are only 25,000 relevant lines. Unfortunately when I want to select these non-zero lines, the SQL statement takes ages. And at the moment I'm only using the short version of DB_Positive and Articles. In the end there will be approx. 300 lines in DB_Positive and 2,000,000 lines in Articles.

Any suggestions on how to optimize the SQL statement?


CREATE TEMPORARY TABLE Mem_Positive (Positiveword VARCHAR(20))
ENGINE = MEMORY
AS
SELECT Positiveword FROM DB_Positive;
SELECT * FROM
(SELECT ArticleID, Positiveword, (if(Articletext1 LIKE Mem_Positive.Positiveword, 1, 0) + (if(Articletext2 LIKE Mem_Positive.Positiveword, 1, 0) + ... (in total 7 versions of the text)) AS Points
FROM Articles, Mem_Positive
WHERE
Articles.Inactive IS NULL
GROUP BY Articles.ArticleID, Positiveword) AS t1
WHERE Points > 0;

Options: ReplyQuote


Subject
Written By
Posted
Improve speed of SQL statement
January 05, 2017 04:30AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.