Use [], not <> around 'code'.
Subqueries are usually not well optimized by MySQL. An equivalent JOIN is often much faster.
The NULL question applies to
select PUB_A_ID, count(B_ID) from PUB_A_B group by PUB_A_ID;
versus
select PUB_A_ID, count(*) from PUB_A_B group by PUB_A_ID;
The former needs to look at each B_ID to see if it is NULL.
The former would like
INDEX(PUB_A_ID, B_ID)
The latter needs only
INDEX(PUB_A_ID)
Those indexes would let the entire query run inside the index (EXPLAIN would say "Using index") and not have to hit the data. Also, the output rows would be in the 'right' order, thereby avoiding a sort pass.
select NUMTERMS from PUB WHERE PUB_ID='TAG' ;
Not much can be done to optimize this. Changing the index to
INDEX(PUB_ID, NUMTERMS)
might help, but seems rather dumb since all the columns are in the index. Probably that table would stay in RAM, so we are not talking about an extra disk hit. How much RAM do you have?
select PUB_A_ID, V_I from PUB_A_B
will not use the index. Why? Why bother. All rows need to be read anyway. If the entire table happens to be cached, it will take a lot of CPU. If not, it will take a lot of disk I/O and be 10x slower.
EXECUTE SELECT 2 -- select PUB_A_ID, count(B_ID) from PUB_A_B group by PUB_A_ID;
WHILE (SELECT.2){
GET_FIELD1.SELECT.2
USE_FIELD1 w/ SELECT.3 (PREPARED STATEMENT)
EXECUTE SELECT 3 -- select NUMTERMS from PUB WHERE PUB_ID='TAG' ;
GET_FIELD1.SELECT.3
doSomeMath
updateQuery1 (PREPARED STATEMENT)
}
If you do that with a JOIN, not a loop of SELECTs, it run much faster. Something like:
select x.PUB_A_ID,
x.B_count,
p.NUMTERMS
from (
select PUB_A_ID,
count(B_ID) as B_count
from PUB_A_B
group by PUB_A_ID
) x
join PUB p on p.PUB_ID = x.PUB_A_ID