MySQL Forums
Forum List  »  Performance

greatest-n-per-group performace for different data types
Posted by: Martin Schulz
Date: December 13, 2010 03:59AM

Hi there,

I have this database structure:

    |- purchaseID -|- customerID -|- emotionID -|- customerCountryCode -|- customerContinentCode-|
    |     1        |     2345     |     0       |        US             |            NA          |
    |     2        |     2345     |     3       |        US             |            NA          |
    |     3        |     4456     |     0       |        UK             |            EU          |
    |     3        |     4456     |     5       |        UK             |            EU          |
    |     4        |     4456     |     2       |        UK             |            EU          |
    |     5        |     4456     |     2       |        UK             |            EU          |
    |     6        |     1234     |     0       |        US             |            NA          |
    |     7        |     6678     |     0       |        US             |            NA          |
    |     8        |     9900     |     0       |        US             |            NA          |
    |     9        |     3334     |     0       |        US             |            NA          |    
    |     10       |     3334     |     4       |        US             |            NA          |

I'm using this query to get the top 6 customers of a certain emotionID. The query takes around 30 sceonds...

    SELECT customers.customerID, Count( customers.emotionID ) / C.totalPeople * 100.0 AS emotionPercent 
    FROM `customers` 
    INNER JOIN 
    
        (SELECT customers.customerID, Count( customers.emotionID ) AS totalPeople
        FROM `customers` 
        GROUP BY customers.customerID) C 
    
    ON customers.customerID = C.customerID 
    WHERE customers.emotionID = 0 
    GROUP BY customers.customerID 
    ORDER BY emotionPercent DESC 
    LIMIT 0,6

So I'm getting this:
    |- customerID -|- emotionPercent -|
    |     1234     |        100       |     
    |     6678     |        100       |     
    |     9900     |        100       | 
    |     2345     |        50        |     
    |     3334     |        50        | 
    |     4456     |        25        |

When I search for the top 6 countries of a certain emotion I use this query, which is structually identical:

    SELECT customers.customerCountryCode, Count( customers.emotionID ) / C.totalPeople * 100.0 AS emotionPercent 
    FROM `customers` 
    INNER JOIN 
    
        (SELECT customers.customerCountryCode, Count( customers.emotionID ) AS totalPeople
        FROM `customers` 
        GROUP BY customerCountryCode) C 
    
    ON customers.customerCountryCode = C.customerCountryCode 
    WHERE customers.emotionID = 0 
    GROUP BY customers.customerCountryCode 
    ORDER BY emotionPercent DESC 
    LIMIT 0,6

This query takes 0,4 seconds. How can this be? In the last query, when I change the INNER query from
GROUP BY customers.customerCountryCode
to
GROUP BY customers.customerID

it also takes around 30 seconds. Anyone knows why?

The customerCountryCode data type is varchar(2), the customerID is an unsgined mediumint. Is this causing the problem?

Thanks in advance!

Options: ReplyQuote


Subject
Views
Written By
Posted
greatest-n-per-group performace for different data types
3137
December 13, 2010 03:59AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.