MySQL Forums
Forum List  »  Newbie

Inserting a MySQL select result not working? part 2
Posted by: Peter Brawley
Date: October 23, 2017 12:45PM

Phorum readability issues as the text drifts rightward off the page, so a new thread.

Quote

Both the number of rows returned and the avg value calculated are different from the SELECT query I used.

SELECT
x.global_id,
x.species_code_c,
x.year_c,
COUNT(*) as N,
AVG(y.fao_total) as avg_total
FROM fao_global x
JOIN fao_global y ON x.species_code_c = y.species_code_c
AND x.year_c > 0
AND x.year_c BETWEEN y.year_c AND y.year_c + 5
WHERE x.species_code_c <> 'NULL'
GROUP BY species_code_c, x.year_c
ORDER BY global_id;

global_id is the table's primary key, so it's unique to each row, so selecting it in an aggregating query like the above is entirely meaningless: as the manual page for Group By says, if sql_mode does not include "only_full_group_by", MySQL will let you include ungrouped columns in the select list, but the returned values for that column will be abritrary.

Quote

Is the syntax "year_c > 0" saying

"if there is no catch record for a species in a given year in the five year sequence assume the catch value = 0.

Or is it saying

"assume there is no catch value"?

No mysteries. SQL is literal. The join clause says: include rows where species values match and x.year>0 and x.year between y.year and y.year+5, ie when x.year is 2001 it includes species matches and y.year values from 1996 though 2001.

Quote

Either way the demoninator matters.

Is the query dividing by 5 irrespective of whether there are 5 sequential years - in which case the average will be lower.

Or is it dividing by the number of years in the five year sequence where there is a year value. So if only 3 of the 5 years in a five year sequence have a catch value the 5 year average is actually a 3 year one - in which case the average will be higher?

The query engine figures out the denominator for Avg():

drop table if exists t;
create table t(yr year,val smallint);
insert into t values(2001,0),(2000,1),(1999,2),(1998,3),(1997,4),(1996,5);
select x.yr,avg(y.val)
from t x
join t y on x.yr between y.yr and y.yr+5
group by x.yr;
+------+------------+
| yr   | avg(y.val) |
+------+------------+
| 1996 |     5.0000 |
| 1997 |     4.5000 |
| 1998 |     4.0000 |
| 1999 |     3.5000 |
| 2000 |     3.0000 |
| 2001 |     2.5000 |
+------+------------+

But does the SQL misrepresent the requirement?

Re Concat(): yes it concatenates everything it's given as a string.

Quote

+5 versus -5 in the BETWEEN statement - other than another example of my need to actually think about what is written, I wanted to check if the order of the years in the data set could matter? (2015,2014,2013,..,1950) versus (1950,1951,1952,..,2015). Not necessarily in this case but as a broader note to keep in mind.

SQL logic is set logic. Between means between in the available data irrespective of order found. Ordering by an index will speed up the query, is all.



Edited 2 time(s). Last edit at 10/23/2017 01:54PM by Peter Brawley.

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.