AP 33 - WHERE NOT IN without NOT NULL check in subquery

Severity

Caution

Type

Correctness

The problem

If any row of a subquery using NOT IN returns NULL, the entire NOT IN operator will evaluate to either FALSE or UNKNOWN and no records will be returned. NOT IN will not match any rows if the subquery just returns a single row with NULL.

Let’s look at an example

We have an orders table

and an order_details table with one NULL for order_id

You create a query that returns orders in the orders table without a record in the order_details table

SELECT order_id 
  FROM orders 
 WHERE order_id NOT IN(SELECT order_id 
                          FROM order_details);

You might expect the following result

but you get back the following result

The query returns no rows at all. NOT IN will not match any rows if the subquery just returns a single row with NULL.

Solution

You can rewrite your query using a defensive programming style to check for NULL in the subquery

SELECT * 
  FROM orders 
 WHERE order_id NOT IN(SELECT order_id 
                          FROM order_details 
                         WHERE order_id IS NOT NULL);

This query will give you the correct answer to your question.

Alternatively, you could use NOT EXISTS instead of NOT IN.

SELECT * 
  FROM orders o 
 WHERE NOT EXISTS(SELECT NULL 
                     FROM order_details od 
                    WHERE o.order_id=od.order_id);

Exists cannot return NULL. It’s checking solely for the presence or absence of a row in the subquery and, hence, it can only return true or false. Since it cannot return NULL, there’s no possibility of a single NULL resulting in the entire expression evaluating to UNKNOWN.

While NOT and EXISTS are equivalent, NOT IN and NOT EXISTS have different behaviour when it comes to NULL.

Legitimate use of the antipattern

If you know that the column in question has a NOT NULL constraint you can leave out the extra check for NULL values in your NOT IN subquery. However, in the spirit of defensive programming you should still add it. The constraint may be dropped in the future