AP 03 - Avoid count(*) in the outer join

Severity

Warning Warning

Type

Performance

Correctness

Problem

All rows—including those in the outer join—are counted when the COUNT(*) aggregate function is used in an outer join. The count of the outer rows is typically not something you want to include in your query.

SELECT a.product _name
      ,COUNT(*) num_orders
  FROM product a
  LEFT
  JOIN orders b
    ON a.product_id=b.product_id
GROUP BY
   a.product_name

With this query, we want to count the number of orders for each product. If the query uses count(*), it will also count products that don’t have any orders. The result of the query will be wrong and not what was expected.

product_nameproduct_idorder_id
sofa11
sofa12
chair2NULL

The query would return the following result:

product_namenum_orders
sofa2
chair1

Solution

You can fix this by using the count(<column_name>) function. For <column_name>, give a column that can’t be NULL from the table with non preserved rows (the table in the outer join). A primary key column is an example.

SELECT a.product _name
      ,COUNT(b.order_id) num_orders
  FROM product a
  LEFT
  JOIN orders b
    on a.product_id=b.product_id
GROUP BY
   a.product_name

Legitimate use of the anti pattern

None