Top 10 airports referenced from a table of flights
So, I've got a "flights" table containing 100,000+ flights (and growing fast), where each flight is between two "airports" (~5000 entries, fairly static), identified as "src_apid" (source) and "dst_apid" (destination). My problem: how do I get the list of the Top 10 most popular airports?
My current approach is this:
select ... from airports as a,flights as f where f.uid != 1 and (a.apid=f.src_apid or a.apid=f.dst_apid) group by a.apid order by count(*) desc limit 10;
But this is painfully slow, with a query taking >400 seconds, and describe shows it's using a temporary table and a filesort:
+----+-------------+-------+-------+-----------------------+------+---------+------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+-----------------------+------+---------+------+------+----------------------------------------------+
| 1 | SIMPLE | f | range | src_apid,dst_apid,uid | uid | 5 | NULL | 656 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | a | ALL | PRIMARY | NULL | NULL | NULL | 5242 | Using where |
+----+-------------+-------+-------+-----------------------+------+---------+------+------+----------------------------------------------+
Now, it's infinitely faster (<0.1s) to get a list of only "source" or "destination" airports:
select src_apid, count(*) from flights group by dst_apid order by count(*) desc limit 10;
select dst_apid, count(*) from flights group by dst_apid order by count(*) desc limit 10;
But then I need to merge two results together, and the end result might not be accurate (say, an airport has a high "src" ranking but doesn't make the top 10 for "dst").
Any ideas?
Cheers,
-j.