May 14, 2008
MS SQL And Sub-Selects
Today I discovered what I would consider a very serious flaw in MS SQL. Unfortunately, this discovery caused quite a bit of damage to my data, and although I could/should have been more careful, I place most of the blame on poorly designed software. After some simple tests, it appears that MySQL does not suffer from the same issue.
Our exceptional DBA’s (and others) suggest that a good practice to prevent UPDATE errors is to execute a SELECT clause that uses the same WHERE clause before your run your UPDATE command. Had I done this, it would have prevented my mistake, which is why I say I should have been more careful.
Anyway, the bottom line is that it appears that MS SQL’s implementation of Sub-Selects in both SELECT and UPDATE commands do not function as expected. It turns out that if you use a completely invalid SELECT statement as a Sub-Select in an UPDATE or SELECT statement, MS SQL will return records when it should not, in some circumstances.
Here is the SQL statement that I ran that caused all of my grief:
UPDATE FoldersAccounts
SET iAccountID = 56789
WHERE iFoldersAccountsID in (
SELECT iFoldersAccountsID
FROM Folders
WHERE iFolderID in
(1159,1160,1161,1162,1193,1200,1197,1196,1195,1186,1191,1189))
Now, what is so outrageous is that the sub-select is completely invalid because there is no column named “iFoldersAccountsID” in the Folders table. If you try to run this statement by itself:
SELECT iFoldersAccountsID
FROM Folders
WHERE iFolderID in
(1159,1160,1161,1162,1193,1200,1197,1196,1195,1186,1191,1189))
The above query, when run by itself in MS SQL Server Mangagement Studio, returns the following error:
Msg 207, Level 16, State 1, Line 1
Invalid column name ‘iFoldersAccountsID’.
As I mentioned earlier, this is the proper response for this query because there is no iFoldersAccountsID column in the Folders table.
What is so crazy is that when you run the same SELECT statement as a sub-select, no error is returned an every single record in the FoldersAccounts table is returned. In this case, that’s 1204 records. That also means that when this same SELECT statement is used in the sub-select of an UPDATE command, every single record in the table is updated, which is obviously terrible.
Even more strange is that when you make the exact same Sub-select but swap out the table and use a table name in the FROM clause that has neither the column in the SELECT clause or the WHERE clause, then still no error is returned and a seemingly ambiguous number or records are returned. Thus, the following query returns 12 records, which appears to be one for each iFolderID:
SELECT iFoldersAccountsID
FROM PriceCodes
WHERE iFolderID in
(1159,1160,1161,1162,1193,1200,1197,1196,1195,1186,1191,1189))
Obviously, the PriceCodes table has neither an iFoldersAccountsID nor an iFolderID, yet the above query returns no errors.
Interestingly, if you use a nonexistant table name, that will throw an error. At least they got that right.
Since I am clearly biased as to the “correctness” of this behavior, since it caused me so much trouble, I decided to see how MySQL handles the exact same scenario. So I created the exact same table structure in a MySQL database and tried the following query:
UPDATE FoldersAccounts
SET iAccountID = 56789
WHERE iFoldersAccountsID in (
SELECT iFoldersAccountsID
FROM Folders
WHERE iFolderID in
(1159,1160,1161,1162,1193,1200,1197,1196,1195,1186,1191,1189))
The above query in MySQL does not return an error, but it does return an “empty result” (selects zero records):

IMO, this is the better behavior, but ideally this should return an error. This seems like something that should be a no-brainer, but maybe I am missing something. If so, fill me in.
One thing that I did think about here is how this behavior in MSSQL could potentially be used in an SQL Injection attack to completely screw up an entire table. This is a great reason to use Stored Procedures in web applications for any data access in web applications.
Comments(5)