MySQL Forums
Forum List  »  Newbie

Re: SELECT took lengthy of time
Posted by: Rick James
Date: August 28, 2010 10:54AM

1. Typo?
where airdate >= '2010-03-01' and airdate >= '2010-03-31'
where airdate >= '2010-03-01' and airdate <= '2010-03-31'
The first one gets all rows on or after Mar. 31. The second one gets just the month of march.

1. It takes time to shovel 50000 rows.

2. A JOIN can have a small impact or a HUGE impact on the speed of a query. This is why I asked.
select  sl.*, e.EmployeeName, c.CityName,
        ...etc
    from  stationlog sl
    left join  city c ON c.CityCode = sl.CityCode
    left join  employee e ON e.EmployeeID = sl.EmployeeID
    where  sl.airdate >= '2010-03-01'
      and  sl.airdate <= '2010-03-31'  -- (fixed operator)
If city has an index (likely to be PRIMARY KEY) on CityCode and employee has index on EmployeeID, then these JOINs will have a "small impact". Meanwhile, stationlog should have an index on airdate.

3. GUID vs AUTO_INCREMENT. Yes, there are reasons for using a GUID. But there are drawbacks.

* GUID (UUID) is a Globally (Universially) Unique ID. That is, it stands on its head to create a string that is, in all likelihood, to be unique relative to all the other GUIDs/UUIDs on earth, ever. It is composed of the current time (helps with the 'ever'), the MAC address (manufacturers guarantee this is unique to your networking card), etc. But, if you are not competing with IDs outside your one machine, this is gross overkill.

* An AUTO_INCREMENT index gives you "temporal" "locality of reference". That is, all the IDs you create today will be adjacent to each other. Meanwhile, GUIDs will be scattered all over the table. If you tend to access only recent data (or, in your case, data for one month), then the data will be "clustered" together. This leads to more efficient access. This efficiency difference becomes striking (like, 10 times as fast) when the dataset is bigger than RAM.

* Since the GUID/UUID is 32-36-39 bytes long (depending on representation), it is bulkier, hence the indexes are bulkier. Etc. Bigger --> Less cacheable --> Slower. INT UNSIGNED (4 billion max) is only 4 bytes.

Bottom line? If your dataset is 'small', don't worry about my comments.

4. If your dataset is less than about 10 years, then grabbing a year's report will probably ignore the index and do a table scan.

If you want to speed this up even more, consider summary table(s). Have a table that contains a row for each day, with counts, sums, etc. Do the reports (monthly, or year-long, or weekly, or...) against them. This will be immensely faster (probably 10x faster). Each night add a new row(s) to the table(s). The report process might hit as few as 365 rows (or 250 working days?) for a year report. That's probably more than 10x faster than 50000 rows!

Options: ReplyQuote


Subject
Written By
Posted
August 23, 2010 07:23PM
August 23, 2010 09:51PM
August 24, 2010 09:26PM
August 26, 2010 07:05PM
August 26, 2010 10:05PM
August 28, 2010 05:42AM
Re: SELECT took lengthy of time
August 28, 2010 10:54AM
August 29, 2010 10:20PM
August 23, 2010 11:56PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.