MySQL Forums
Forum List  »  Newbie

Re: Efficient MySQL for pagination of large amounts of data on website
Posted by: Jack McGuire
Date: August 16, 2010 04:07AM

Hi Barry.

The thing is I don't want to use

LIMIT offset,nr_of_rows

because according to pg147 of the MySQL 5.0 Certification Study Guide in the section "9.3.3 Limiting a Selection Using LIMIT"...

It's possible to abuse the LIMIT feature. For example, it isn't a good idea to use a clause such as LIMIT 1000000,10 to return 10 rows from a query that normally would return more than a million rows. The server must still process the query to determine the first million rows before returning the 10 rows. It's better to use a WHERE clause to reduce the query result to a more manageable size, and then use LIMIT to pull rows from the reduced result.

As far as I can make out, limiting the result set without an offset does not create efficiecy problems as in...

LIMIT 10

to return the first 10 rows. But using it with a large offset...

LIMIT 100000,10

does cause efficiency problems, ie. the query takes a long time to process.

The question is efficiency. Note what http://dev.mysql.com/doc/refman/5.1/en/limit-optimization.html says about sql_calc_found_rows...

As soon as MySQL has sent the required number of rows to the client, it aborts the query unless you are using SQL_CALC_FOUND_ROWS.

sql_calc_found_rows forces MySQL to scan the whole table.

I've come across 3 solutions to this problem. Two of which are be easily implemented in MySQL but the the other is in MS SQL Server - and I can't see how to migrate that solution to MySQL. This is outlined in my original post above.

Regarding the other solutions...the first one is...

For example let's imagine a table of posts for a forum. We want to display a paginated set of results so we have a page displaying say 10 results and underneath that we have links 1,2,3,4,5,6,next,last to take us to another page of results. This would be fine if we never deleted any records as we could simply use a WHERE clause to return the results for each page. However the nature of our table of posts, is that certain posts will get deleted. Now our WHERE clause doesn't work as WHERE id > 10 and id <=20 will not return 10 results if some rows have been deleted in this range.

So what we could do is create a temporary table with an auto_increment field called say seq_no and an id field to hold the id's from our posts table. We can now populate this temporary table with ids from the posts table and even though some rows from our posts table will be missing our seq_no's will be in nice clean incremental order. Now we can select joining our temporary seq table to our posts table and use a clause WHERE seq_no > 10 and id <=20

For a simple outline of this method see http://stackoverflow.com/questions/1243952/how-can-i-speed-up-a-mysql-query-with-a-large-offset-in-the-limit-clause

For a more indepth explanation of this method see http://www.4guysfromrolla.com/webtech/041206-1.shtml

The second solution that I've come across is to display the first page of results, say 10 rows. Then using Php we "remember" the id of the last row. We can then use a statement like

SELECT * FROM posts WHERE id > the last row we remembered LIMIT 10

we are now LIMITing the output to the first 10 records. We are not, however, using any OFFSET - which causes efficiency problems. This is my brief explanation but to properly understand this please look at

http://www.percona.com/ppc2009/PPC2009_mysql_pagination.pdf

There is a more detailed explanation of how this actually works in this pdf.

These methods have their pros and cons.

It seems that this

http://www.4guysfromrolla.com/webtech/042606-1.shtml

method (mentioned in my original post) would be ideal. But I can't figure out how to "port" it to MySQL

Thanks for any help on this



Edited 1 time(s). Last edit at 08/16/2010 08:00AM by Jack McGuire.

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.