I am using the 'group by trick' to select only the first event for each object.
I left out the time column and the order by as they were not my main concern.
The main problem is that the group by takes around 5 minutes to complete when the event table contains around 10 million rows. This is only one of the queries required in generating a report and some of the others require a similar group by taking another 5 minutes.
Here is the query containing the order by which unfortunately means another sub query.
SELECT person_id, COUNT(id)
FROM (SELECT id, person_id
FROM (SELECT id, person_id
FROM events
ORDER BY time DESC
) AS a
GROUP BY object_id
) AS subquery
GROUP BY person_id;
And here is the create statement including the time column:
CREATE TABLE events(
id INT NOT NULL AUTO_INCREMENT,
object_id INT NOT NULL,
person_id INT NOT NULL,
time DATETIME NOT NULL,
PRIMARY KEY(id),
INDEX(object_id),
INDEX(person_id),
INDEX(time)
) ENGINE=InnoDB;
The order by in the subquery does not seem to slow the query down much which is why I omitted it from the query I posted earlier.
I am thinking a major change is needed to improve the speed measurably, and wonder if using joins instead of sub queries would be faster. Although I think using joins would be much slower though due to mysql doing a 10M x 10M cartesian product.
I appreciate your suggestions Rick but still think I need some conceptual change to improve the speed.