SELECT b.linea, b.sentido,b.dia, ti.id
FROM transaccion t, bus_sentido b, intervalos ti
WHERE (t.ppu=b.ppu and t.dia=b.dia)
and (t.hora between b.hora_inicio and b.hora_fin)
and (t.hora >= ti.horai and t.hora < ti.horaf)
GROUP BY b.linea, b.sentido, t.dia, ti.id;
To start with, drop the GROUP BY and add DISTINCT. It is probably equivalent; maybe it is faster.
SELECT DISTINCT b.linea, b.sentido,b.dia, ti.id
FROM transaccion t, bus_sentido b, intervalos ti
WHERE (t.ppu=b.ppu and t.dia=b.dia)
and (t.hora between b.hora_inicio and b.hora_fin)
and (t.hora >= ti.horai and t.hora < ti.horaf)
Moving on to
SELECT b.linea, b.sentido, b.dia, ti.id as media_hora,
count(*) as num_trx
FROM ( transaccion t force index(dia)
left join bus_sentido b
on t.ppu=b.ppu and t.dia=b.dia )
join intervalos ti -- MISSING "ON" HERE ---
WHERE (t.hora between b.hora_inicio and b.hora_fin)
and t.hora>=ti.horai
and t.hora< ti.horaf
and b.dia>='2009-04-01'
and b.dia<='2009-05-40'
group by b.dia,b.sentido,b.linea,ti.id
* Fix the ON.
* Change
KEY `dia` (`dia`)
to
KEY `dia` (`dia`, ppu)
* Remove the FORCE INDEX -- want to see what the EXPLAIN is without it.
* Try doing the intervalos as a subquery, something like
( SELECT id FROM intervalos ti
WHERE t.hora>=ti.horai
and t.hora< ti.horaf )
as media_hora
Something like this...
SELECT b.linea, b.sentido, b.dia,
( SELECT id FROM intervalos ti
WHERE t.hora>=ti.horai
and t.hora< ti.horaf ) as media_hora,
count(*) as num_trx
FROM transaccion t
left join bus_sentido b
on t.ppu=b.ppu and t.dia=b.dia
WHERE (t.hora between b.hora_inicio and b.hora_fin)
and b.dia>='2009-04-01'
and b.dia<='2009-05-40'
group by b.dia, b.sentido, b.linea, media_hora
How fast does that run? What is the EXPLAIN?
If that fails to run well enough, consider doing it in two steps. I'm thinking that it is collecting more rows than it needs to before getting to the GROUP BY.