Performance
Correctness
All rows—including those in the outer join—are counted when the COUNT(*) aggregate function is used in an outer join. The count of the outer rows is typically not something you want to include in your query.
SELECT a.product _name
,COUNT(*) num_orders
FROM product a
LEFT
JOIN orders b
ON a.product_id=b.product_id
GROUP BY
a.product_name
With this query, we want to count the number of orders for each product. If the query uses count(*), it will also count products that don’t have any orders. The result of the query will be wrong and not what was expected.
product_name | product_id | order_id |
---|---|---|
sofa | 1 | 1 |
sofa | 1 | 2 |
chair | 2 | NULL |
The query would return the following result:
product_name | num_orders |
---|---|
sofa | 2 |
chair | 1 |
You can fix this by using the count(<column_name>) function. For <column_name>, give a column that can’t be NULL from the table with non preserved rows (the table in the outer join). A primary key column is an example.
SELECT a.product _name
,COUNT(b.order_id) num_orders
FROM product a
LEFT
JOIN orders b
on a.product_id=b.product_id
GROUP BY
a.product_name
None