MySQL Forums
Forum List  »  General

Inconsistent behaviour with multi-table updates
Posted by: Beat Vontobel
Date: March 11, 2006 03:25PM

This is a discussion thread I started today via mail. Roland Bouman, Jay Pipes (MySQL), Markus Popp and Giuseppe Maxia were involved so far. I think it makes more sense to continue this here on the forums.


**** Original mail: Beat Vontobel ****

Hi guys,

Don't ask me what wierd thing I currently try to do... ;-) But I'm in a situation where I need a single UPDATE statement (multi-table) to process one row possibly multiple times in a complex JOIN. Now MySQL just seems to forget some rows in such statements. Rows from the table to be updated are only matched once, even if the JOIN would return them multiple times. I'd consider this a serious bug - unless somebody can point me to some source telling me that this is actually standard or intended behaviour (I checked the manual on the UPDATE syntax and didn't find anything, only a user comment from one guy that actually seems to have the same problem).

A simple test case (doesn't make much sense, but shows the behaviour):

CREATE TABLE mu (i INT, j INT);
INSERT INTO mu VALUES (1, 10), (2, 20);

SELECT * FROM mu AS x INNER JOIN mu AS y;

+------+------+------+------+
| i | j | i | j |
+------+------+------+------+
| 1 | 10 | 1 | 10 |
| 2 | 20 | 1 | 10 |
| 1 | 10 | 2 | 20 |
| 2 | 20 | 2 | 20 |
+------+------+------+------+
4 rows in set (0.00 sec)

UPDATE mu AS x INNER JOIN mu AS y SET x.j = x.j + 1;

Query OK, 2 rows affected (0.01 sec)
Rows matched: 2 Changed: 2 Warnings: 0

I'd expect to have 4 rows matched (and changed) in the UPDATE as well.

What do you think?

Thanks,
Beat


**** Roland Bouman ****

LOL!

more fun ahead:

mysql> SELECT * FROM mu AS x, (select * from mu) AS y;
+------+------+------+------+
| i | j | i | j |
+------+------+------+------+
| 1 | 16 | 1 | 16 |
| 2 | 16 | 1 | 16 |
| 1 | 16 | 2 | 16 |
| 2 | 16 | 2 | 16 |
+------+------+------+------+
4 rows in set (0.00 sec)

mysql> UPDATE mu AS x, (select * from mu) AS y SET x.j = y.j + 1;
Query OK, 1 row affected (0.05 sec)
Rows matched: 2 Changed: 1 Warnings: 0

(if i dont write the subquery it still matches 2 rows...)

I think the multi-table UPDATE syntax is very strange - there are only
two cases where I have ever needed it:
- sometimes, you can write a join that lets me test some criterion
that I would normally write as a subquery in oracle or ms sql. In
MySQL you are forced to use the join because you cannot write a
subquery on the table that is being updated.
- when you're dealing with sub-and supertypes (1 to 0 or 1
relationships), the multi-table update provides atomicity (kind of).
In this case, I find it actually quite elegant.

I have never been in the position where i needed to update the same
time multiple times like you do.

I don't think the multi-table update is standard, so it will be hard
to find out what it is supposed to do.

(...fumbling papers....just a sec....)

here's the sqlstandard (2003 it's in book 2, i think i sent you the
stuff the previous time we discussed the standard? if you havent got
it, i can send you a pointer. just letme know.) entry for UPDATE, and
the relevant productions upon which it is dependant:

<update statement: searched> ::=
UPDATE <target table> [ [ AS ] <correlation name> ]
SET <set clause list>
[ WHERE <search condition> ]

<target table> ::=
<table name>
| ONLY <left paren> <table name> <right paren>

<table name> ::= <local or schema qualified name>

<local or schema qualified name> ::=
[ <local or schema qualifier> <period> ] <qualified identifier>

<local or schema qualifier> ::=
<schema name>
| <local qualifier>

<qualified identifier> ::= <identifier>

and some more - the point is it looks like the multi-table update is
non -standard.

Now , I think that mysql flags the rows that it has updated -
immediately. somehow, the 'virtual' rows resulting from your cartesian
product are not virtual after all - the refer to the same internal
flag. It also sais that only 2 rows were matched - but we really do
see 4 virtual rows.

Frankly, I find it hard to have strong opinion about the desired
bahviour. It depends on your point of view of SQL Statements i guess.
When you regard the statement as a (declarative) specification of the
intended changes you want to apply to the set, I think the current
behaviour is correct.
On the other hand, when you look at it as procedural problem of
looping through rows and apply changes to them one by one, I would
think it should match and update all the rows produced by the
cartesian product.

kind regards,

Roland

(MySQL handles updates very directly: it immediately applies the
changes to a row on a row by row basis. This leads to all kinds of
irritating behaviour: For example, when you are updating two rows in
one statement swapping the primary key values this leads to a
constraint violations in mysql because once it is handling the first
row, it immediately puts the new value in place which at that points
violates the constraint. MS SQL, Oracle and friends and the SQL
Standard too regard the statement as a unit - not the inidividual rows
handled by it. That's why it works in those products.
I have seen another case where someone had a multi-table update with a
join, and this guy was updateing the joined columns - immediately
leading to the rest of the join being canceled because the update
caused the join condition to fail immediately after the first row was
updated. )


**** Jay Pipes ****

Beat, Roland,

Interesting stuff! As Roland mentioned, I too have never had occasion to try updating a table row twice within the same statement like Beat's example shows.

My mindset is that MySQL conceptually does the correct thing, here: it updates the distinct set of rows and fields from the target table (mu) a single time. If 2 updates are required, two UPDATE statements are needed; meaning: an UPDATE is atomic to a single unique tuple for each execution of the UPDATE which matches the row.

I would be interested to see what the behaviour of Oracle and MSSQL are in this respect. Any takers?

Cheers, and cool example, Beat!

-jay


**** Markus Popp ****

Hi folks,

I haven't tested it yet and I don't know if it's helpful in this particular
situation - but you can also add ORDER BY clauses (and also a LIMIT clause)
to an UPDATE (and also DELETE) statement as well to adjust the order in
which the rows will be updated.

There's also something similar with REPLACE queries where you can replace
more than one record, although you only insert (replace) one record (that's
when more records meet the condition to have identical unique values) - this
example reminded me a bit of this kind of REPLACE.

Later in the evening I'll test this example and I could also compare it to
PostgreSQL 8.1.3, Oracle XE beta and MS SQL 2000.

Cheers,
Markus


**** Giuseppe Maxia ****

Hi Beat,
It seems that the DBMS optimizes the access to the table in such a way that
every row is modified just once. I used a trigger to check that this is what seems to be happening.
OTOH, if you set an update with two different actions (see below), the rows are accessed twice.
The strange thing is that during the second passage the DBMS still retains the old value from the first
passage. Namely, I asked for column "j" to be incremented by 1 (first passage). Its value was '10'.
That was carried out, and the value was then '11'. During the second pass, column 'j' should be
incremented by 100. You'd expect its value to be '111', and instead it was '110'.

It looks like a bug to me, although I would never use this shaky syntax in production code.

Ciao

Giuseppe

#-----------------
create database if not exists test;
use test ;

drop table if exists mu;
CREATE TABLE mu (i INT, j INT);

INSERT INTO mu VALUES (1, 10), (2, 20);

drop table if exists log_mu;
create table log_mu (event_name varchar(10), old_j int, new_j int);

create trigger mu_bi before update on mu for each row insert into log_mu values ('before',old.j, new.j);
create trigger mu_ai after update on mu for each row insert into log_mu values ('after',old.j, new.j);

UPDATE mu as x,mu AS y SET x.j = x.j + 1;
Query OK, 2 rows affected (0.00 sec)
Rows matched: 2 Changed: 2 Warnings: 0

select * from mu; select * from log_mu;
+------+------+
| i | j |
+------+------+
| 1 | 11 |
| 2 | 21 |
+------+------+
+------------+-------+-------+
| event_name | old_j | new_j |
+------------+-------+-------+
| before | 10 | 11 |
| after | 10 | 11 |
| before | 20 | 21 |
| after | 20 | 21 |
+------------+-------+-------+

truncate mu; INSERT INTO mu VALUES (1, 10), (2, 20);
truncate log_mu;
UPDATE mu as x,mu AS y SET x.j = x.j + 1, y.j=y.j+100;
Query OK, 4 rows affected (0.00 sec)
Rows matched: 4 Changed: 4 Warnings: 0


select * from mu; select * from log_mu;
+------+------+
| i | j |
+------+------+
| 1 | 110 |
| 2 | 120 |
+------+------+
+------------+-------+-------+
| event_name | old_j | new_j |
+------------+-------+-------+
| before | 10 | 11 |
| after | 10 | 11 |
| before | 20 | 21 |
| after | 20 | 21 |
| before | 11 | 110 |
| after | 11 | 110 |
| before | 21 | 120 |
| after | 21 | 120 |
+------------+-------+-------+


**** Jay Pipes ****

Hi all!

I've forwarded your very interesting emails to Brian Aker and asked if he could provide an opinion as to whether the behaviour is a big or simply a side-effect of a rare and weird syntax...

:)


**** Markus Popp ****

Hi,

seems, the other DBMS don't understand this syntax at all (sorry, but the
error messages
that I get in Oracle and MS SQL are all in German):

PostgreSQL 8.1.3:

mpopp=# CREATE TABLE mu (i INT, j INT);
CREATE TABLE
mpopp=# INSERT INTO mu VALUES (1,10);
INSERT 0 1
mpopp=# INSERT INTO mu VALUES (2,20);
INSERT 0 1
mpopp=# SELECT * FROM mu AS x INNER JOIN mu AS y;
ERROR: syntax error at or near ";" bei Zeichen 41
ZEILE 1: SELECT * FROM mu AS x INNER JOIN mu AS y;
^
mpopp=# UPDATE mu AS x INNER JOIN mu AS y SET x.j = x.j + 1;
ERROR: syntax error at or near "AS" bei Zeichen 11
ZEILE 1: UPDATE mu AS x INNER JOIN mu AS y SET x.j = x.j + 1;
^
mpopp=# UPDATE mu x INNER JOIN mu y SET x.j = x.j + 1;
ERROR: syntax error at or near "x" bei Zeichen 11
ZEILE 1: UPDATE mu x INNER JOIN mu y SET x.j = x.j + 1;

Oracle XE:

SQL> CREATE TABLE mu (i INT, j INT);

Tabelle wurde erstellt.

SQL> INSERT INTO mu VALUES (1, 10), (2, 20);
INSERT INTO mu VALUES (1, 10), (2, 20)
*
FEHLER in Zeile 1:
ORA-00933: SQL-Befehl wurde nicht korrekt beendet


SQL> INSERT INTO mu VALUES (1, 10);

1 Zeile wurde erstellt.

SQL> INSERT INTO mu VALUES (2, 20);

1 Zeile wurde erstellt.

SQL> COMMIT;

Transaktion mit COMMIT abgeschlossen.

SQL> SELECT * FROM mu AS x INNER JOIN mu AS y;
SELECT * FROM mu AS x INNER JOIN mu AS y
*
FEHLER in Zeile 1:
ORA-00933: SQL-Befehl wurde nicht korrekt beendet


SQL> UPDATE mu AS x INNER JOIN mu AS y SET x.j = x.j + 1;
UPDATE mu AS x INNER JOIN mu AS y SET x.j = x.j + 1
*
FEHLER in Zeile 1:
ORA-00971: Schl³sselwort SET fehlt

MS SQL 2000:

CREATE TABLE mu (i INT, j INT)

INSERT INTO mu VALUES (1, 10)
INSERT INTO mu VALUES (2, 20)

SELECT * FROM mu AS x INNER JOIN mu AS y
Server: Nachr.-Nr. 170, Schweregrad 15, Status 1, Zeile 1
Zeile 1: Falsche Syntax in der Nähe von 'y'.

UPDATE mu AS x INNER JOIN mu AS y SET x.j = x.j + 1
Server: Nachr.-Nr. 156, Schweregrad 15, Status 1, Zeile 1
Falsche Syntax in der Nähe des AS-Schlüsselwortes.


By the way - I thought that multi-row inserts are SQL standard - but it
seems that MySQL is
the only one (among those that I tested here) that understands it. Can you
confirm that?

Cheers,
Markus


**** Roland Bouman ****

> I would be interested to see what the behaviour of Oracle
> and MSSQL are in this respect. Any takers?

well, I think (in fact, i'm pretty sure) that they don't support the
multi-table update.

Beat, maybe you can tell us why you need it? Maybe there's a way out after all?


**** Giuseppe Maxia ****

Hi again,
The multiple table UPDATE syntax was introduced in MySQL 4.0.4, when subqueries
were not yet available.
According to Peter Gulutzan's "SQL-99 Complete", the standard UPDATE syntax
only allows one table. ( http://www.ocelot.ca/sql99.htm )

Personally, I avoid this multiple table update syntax whenever I can. Actually,
I remember using it only when I was testing the syntax for my certification exam
or when students of my courses asked what was it about :)

ciao
Giuseppe


**** Markus Popp ****

I used it several times, when I updated rows depending on rows in a
different table - so it looks like a very useful feature to me (however, I
never made updates that were supposed to update a single row multiple times
;-)). So it makes me wonder that the other DBMS don't support it.

It also doesn't work to delete (and I think, also update) rows in a table if
you have a subquery inside the WHERE clause that accesses the same table.
The inner query might lock the table so the outer query can't access it or
something like that.

Markus


**** Roland Bouman ****

> I used it several times, when I updated rows depending
> on rows in a different table - so it looks like a very useful
> feature to me (however, I never made updates that were
> supposed to update a single row multiple times ;-)).
> So it makes me wonder that the other DBMS don't support it.

I think it has do to with the concept of a update: you update a base
table, not virtual rows generated by a join. (a case could be made
when the join is between tables that form have a one to one
relationship though)

> It also doesn't work to delete (and I think, also update) rows
> in a table if you have a subquery inside the WHERE clause
> that accesses the same table. The inner query might lock
> the table so the outer query can't access it or something
> like that.

But this is just a (IMO annoying) limitation of mysql. I don't know
about the locking and how it is solved on other rdbms-es but I have
never experienced any trouble with this in Oracle and MS SQL.


**** Markus Popp ****

But this is just a (IMO annoying) limitation of mysql.

I agree with that. I guess, this sort of query should be executed something
like this:

* execute the subquery
* release the locks
* insert the rows fetched from the subquery into the outer query
* execute the outer query

I guess, the lock release happens too late here.

Markus


**** Beat Vontobel ****

Hi guys,

thanks a lot for the analysis and your opinions (and sorry for filling your inboxes with this thread...). I personally think MySQL should update the rows multiple times if they appear multiple times in the result set of a join. After all it's just unnatural that an update and the exactly same select match a different number of rows. I agree with all of you though that it's a wierd situation - but although wierd it's just a logical consequence of MySQL's extension to allow multi-table updates (which I think is a nice feature, but the possibility of such situations is maybe the reason that the others don't support it).

> I think it has do to with the concept of a update: you update
> a base table, not virtual rows generated by a join.

Roland, that's exactly the point: Virtual or base table. But with joins allowed and updates on the base table (the current situation?) things become _very_, _very_ unintuitive. On the example table from my original mail I can even do:

UPDATE mu AS a, mu AS b SET a.j = 2, b.j = 3;
Query OK, 4 rows affected (0.11 sec)
Rows matched: 4 Changed: 4 Warnings: 0

Here all 4 rows from the join are matched! (Only two rows in the base table!) So the behaviour is completely contradictive, MySQL can in fact do multiple updates to one row in a single update statement (both rows were set to j=3 in the table after execution). I've just discovered this a second ago. And now I really think we got something that could be called a bug (maybe a conceptual one).

> * execute the subquery
> * release the locks
> * insert the rows fetched from the subquery into the outer query
> * execute the outer query

Markus, this is probably not that easy. What do you do if you've got a dependant subquery? I think we have to leave this discussion to Brian and his real geeks. ;-) We can probably just help to find the conceptual weaknesses.

Now, what the heck did make me think about all of this in the first place? ;-) I was just in the situation to solve a problem that involves some linear algebra: And after a little bit of thinking I realized that the problem of LU-decomposition of a matrix/solving of linear equation systems could be done _very_ elegantly in-place (no additional memory needed) in a declarative way in a single multi-self-join update statement! Of course I could do it with some nested loops in a stored routine (as it would be done in any imperative language), but I always feel like this is a very unnatural way to deal with data in a relational system - and SQL is a declarative language after all with just some imperative features added.

When I started this I never believed that MySQL's syntax would allow me to go that far - but now the only thing that's missing to complete this is this multi-update thing. ;-)

Cheers,
Beat


**** Roland Bouman ****

I can't really judge this - I (shame....blushing..etc) never really
questioned the algorithms behind it.

Anyway - here's another one while we're at it. What I don't get is
this: How is it possible that in the multiple table UPDATE we can SET
the fields coming from all of the tables, whereas an UPDATE on an
updateble VIEW that's based on the same join can only update the
fields coming from one of the tables?

Let me explain with code:

use test;
create table master(id int not null primary key, name varchar(64) not null);
create table detail(master_id int not null primary key, position int not null);
create view master_detail
as
select m.id
, m.name
, d.master_id
, d.position
from master m
, detail d
where m.id = d.master_id
;

insert into master values (1,'hi there');

insert into detail values (1,1);


Ok, here's where we are now:

select * from master_detail;

+----+----------+-----------+----------+
| id | name | master_id | position |
+----+----------+-----------+----------+
| 1 | hi there | 1 | 1 |
+----+----------+-----------+----------+

fine. Now, we attempt an update on the master part, just to prove we can update:

update master_detail set name = 'bye bye';
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1 Changed: 0 Warnings: 0

Ok, everything's still hunky dory. Now, we decide we want to update
both the name and the position:

update master_detail set name = 'cu dude!',position=2;
ERROR 1393 (HY000): Can not modify more than one base table through a
join view 'test.master_detail'

bummer! Now we sneak our way back in:

update master m, detail d
set m.name = 'cu dude!'
, d.position = 2
where m.id = d.master_id
;
Query OK, 2 rows affected (0.09 sec)
Rows matched: 2 Changed: 2 Warnings: 0

mysql> select * from master_detail;
+----+----------+-----------+----------+
| id | name | master_id | position |
+----+----------+-----------+----------+
| 1 | cu dude! | 1 | 2 |
+----+----------+-----------+----------+
1 row in set (0.01 sec)

Anyone to shed some light on this?

Beat Vontobel
http://www.futhark.ch/mysql

Options: ReplyQuote


Subject
Written By
Posted
Inconsistent behaviour with multi-table updates
March 11, 2006 03:25PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.