MySQL Forums
Forum List  »  Newbie

Slow query, trying to optimize search..
Posted by: Richard Vialoux
Date: January 13, 2015 11:23PM

Hey MySQL forum!

I am trying to speed up my website's search function, it is basically the only bottleneck at this moment.

Just upgraded to MySQL 5.6 to utilize FULLTEXT search on innodb. It works perfectly, much faster and better results already.

Essentially, my function is simple and has two separate queries, one to get the COUNT(*) of the results and the other is the query itself.

I looked online and it seemed like doing it all in one query is better in some cases but slower in others, so I have no problem leaving it split it up if it makes more sense.

Also the query I use is different if they submit a blank query. Instead of asking "WHERE MATCH" I omit that aspect of the query. That is essentially the whole problem as I am not sure where to put the index in that case, or even if I should.

We are only querying against one table in the database, but there are 14 million rows.

Here is basically how it looks:

Server specs:

32GB RAM, quad core w/hyperthreading intel E5, pretty beefy in all respects, hard drive is a 2TB spinning disk not sure how many RPM, assume 7200.
whole system is a dedicated Ubuntu box hosting Apache and MySQL, nothing else.

mysql> SELECT VERSION();
5.6.19-0ubuntu0.14.04.1

This is the table:

CREATE TABLE `subjects` (
`id` INT(16) NOT NULL AUTO_INCREMENT,
`title` CHAR(255) NULL DEFAULT NULL,
`desc` VARCHAR(1024) NULL DEFAULT NULL,
`url` VARCHAR(320) NULL DEFAULT NULL,
`status` CHAR(64) NULL DEFAULT NULL,
`status_int` TINYINT(4) NOT NULL DEFAULT '1',
`name` CHAR(100) NULL DEFAULT NULL,
`timestamp` INT(11) NULL DEFAULT NULL,
`uniq` VARCHAR(128) NULL DEFAULT NULL,
`category` SMALLINT(6) NULL DEFAULT NULL,
`view_count` INT(16) NULL DEFAULT NULL,
`site_filter` TINYINT(4) NULL DEFAULT NULL,
`img_folder` INT(5) NULL DEFAULT NULL,
`embed` INT(3) NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE INDEX `url` (`url`),
FULLTEXT INDEX `fulltextindex` (`title`, `desc`)
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
AUTO_INCREMENT=14000000;

I've added SQL_NO_CACHE to some queries to give you an uncached result.

This is the COUNT query if nothing is specified (but we need status_int to be NOT be 0 (or be a 1) and site_filter to be a 1 (or 2, 3, etc):

mysql> SELECT SQL_NO_CACHE COUNT(*) as results_count FROM subjects WHERE NOT status_int = '0' AND site_filter = '1';;
+---------------+
| results_count |
+---------------+
| 12790935 |
+---------------+
1 row in set (6.00 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) as results_count FROM subjects WHERE status_int = '1' AND site_filter = '1';
+---------------+
| results_count |
+---------------+
| 12790926 |
+---------------+
1 row in set (6.54 sec)

This is the COUNT query if we are checking for results that fall under a common 'name' (there are about 100 different "names")

mysql> SELECT COUNT(*) as results_count FROM subjects WHERE NOT status_int = '0' AND site_filter = '1' AND name = 'blah';
+---------------+
| results_count |
+---------------+
| 17395 |
+---------------+
1 row in set (5.61 sec)

Then if we actually give it a real query, like 'test', we get our result back instantly, even though we are specifying the status columns!

mysql> SELECT COUNT(*) as results_count FROM subjects WHERE MATCH (title,desc) AGAINST ('test' IN BOOLEAN MODE) AND NOT status_int = '0' AND site_filter = '1';
+---------------+
| results_count |
+---------------+
| 3131 |
+---------------+
1 row in set (0.02 sec)

mysql> SELECT COUNT(*) as results_count FROM subjects WHERE NOT status_int = '0' AND site_filter = '1' AND name = 'blah';
+---------------+
| results_count |
+---------------+
| 17395 |
+---------------+
1 row in set (5.72 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) as results_count FROM subjects WHERE NOT status_int = '0' AND site_filter = '1' AND name = 'blah';
+---------------+
| results_count |
+---------------+
| 17395 |
+---------------+
1 row in set (6.21 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) as results_count FROM subjects WHERE MATCH (title,desc) AGAINST ('test' IN BOOLEAN MODE) AND NOT status_int = '0' AND site_filter = '1';
+---------------+
| results_count |
+---------------+
| 3131 |
+---------------+
1 row in set (0.01 sec)

Even specifying a result within a certain name, is like instant:

mysql> SELECT SQL_NO_CACHE COUNT(*) as results_count FROM subjects WHERE MATCH (title,desc) AGAINST ('test' IN BOOLEAN MODE) AND NOT status_int = '0' AND site_filter = '1' AND name = 'blah';
+---------------+
| results_count |
+---------------+
| 48 |
+---------------+
1 row in set (0.01 sec)

Basically how can I utilize the magic of MySQL to make the empty queries just as fast?

For good measure here are two more count queries:

mysql> SELECT SQL_NO_CACHE COUNT(*) as results_count FROM subjects WHERE MATCH (title,desc) AGAINST ('test' IN BOOLEAN MODE);
+---------------+
| results_count |
+---------------+
| 3423 |
+---------------+
1 row in set (0.00 sec)

mysql> SELECT SQL_NO_CACHE COUNT(*) as results_count FROM subjects WHERE name = 'blah';
+---------------+
| results_count |
+---------------+
| 98694 |
+---------------+
1 row in set (5.17 sec)



The second part of the query gets the results themselves.

If you just leave the query empty this is what you get:

mysql> SELECT * FROM subjects WHERE NOT status_int = '0' AND site_filter = '1' ORDER BY timestamp DESC LIMIT 0,100;
100 rows in set (7.36 sec)

(it defaults to timestamp sorting)

If we cut off the ordering, the query would be:

mysql> SELECT SQL_NO_CACHE * FROM subjects WHERE NOT status_int = '0' AND site_filter = '1' LIMIT 0,100;
100 rows in set (0.00 sec)

So basically instant, the ordering is killing it here.

Empty query, but specifying name:

mysql> SELECT SQL_NO_CACHE * FROM subjects WHERE NOT status_int = '0' AND site_filter = '1' AND name = 'blah' ORDER BY timestamp DESC LIMIT 0,100;
100 rows in set (6.42 sec)

Same query without ordering:
mysql> SELECT SQL_NO_CACHE * FROM subjects WHERE NOT status_int = '0' AND site_filter = '1' AND name = 'blah' LIMIT 0,100;
100 rows in set (0.02 sec)

But I need it to be ordered.. at least by some column.

SELECT SQL_NO_CACHE * FROM subjects WHERE MATCH (title,desc) AGAINST ('test' IN BOOLEAN MODE) AND NOT status_int = '0' AND site_filter = '1' ORDER BY timestamp DESC LIMIT 0,100;
100 rows in set (0.02 sec)

Instant results! Without cache AND with the ordering and all my specified columns.

As you can imagine, giving it a name column to check and a real query is also instant.

So I'm pretty much one index or one change away from this search page being finished.

Thanks in advance for your help!

Also if you have any tips on my MySQL/innodb settings please advise! I have one more query I think you might find useful:

Options: ReplyQuote


Subject
Written By
Posted
Slow query, trying to optimize search..
January 13, 2015 11:23PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.