Performance
Readability
The human brain can only store a small set of items in short term memory. That’s why we break longer pieces of information into smaller parts. This is commonly referred to as chunking, e.g. when we break up a long phone number into smaller pieces of information to remember it. The same problem exists with SQL when dealing with a large number of table joins. It gets hard for people to read, understand, and debug the code. As a rule of thumb we should not join more than five tables at a time. If possible, even reduce the number of tables you join to 3-4.
You may also get performance issues when joining a large number of tables. Each time you add a new table join, the number of possible explain plans that the CBO needs to look at grows exponentially. The optimizer might get overwhelmed as it does not have enough time to explore all of the possible plans. This may lead to poor explain plans and poor performance.
As a rule of thumb we should not join more than 5 tables together at a time. For queries with a large number of tables you should break down the query into smaller chunks that are easier to read and debug.
You can do this by using Common Table Expressions (CTE) or inline views. Let’s go through an example query with overloaded joins.
SELECT sb.buyinggroupname
,SUM(sordl.quantity*sordl.unitprice) total_sales
,SUM(sinvl.quantity*sinvl.unitprice) total_invoice
FROM SALES_CUSTOMERS sc
LEFT
JOIN SALES_BUYINGGROUPS sb
ON sc.buyinggroupid=sb.buyinggroupid
LEFT
JOIN APPLICATION_DELIVERYMETHODS ad
ON sc.DELIVERYMETHODID=ad.DELIVERYMETHODID
LEFT
JOIN SALES_ORDERS sord
ON sc.customerid=sord.customerid
LEFT
JOIN SALES_ORDERLINES sordl
ON sord.orderId=sordl.orderid
LEFT
JOIN SALES_INVOICES sinv
ON sord.orderid=sinv.orderid
LEFT
JOIN SALES_INVOICELINES sinvl
ON sinv.invoiceid=sinvl.invoiceid
GROUP BY sb.buyinggroupname
In this query we join seven tables. The query becomes hard to understand and debug. Let’s break it up into smaller chunks.
WITH customer AS(
SELECT sc.customerid
,sc.customername
,sb.buyinggroupname
,ad.deliverymethodname
FROM SALES_CUSTOMERS sc
LEFT
JOIN SALES_BUYINGGROUPS sb
ON sc.buyinggroupid=sb.buyinggroupid
LEFT
JOIN APPLICATION_DELIVERYMETHODS ad
on sc.DELIVERYMETHODID=ad.DELIVERYMETHODID)
,sales_order AS(
SELECT sord.orderid
,sord.customerid
,sordl.quantity
,sordl.unitprice
,sordl.taxrate
,sordl.stockitemid
,sordl.orderlineid
FROM SALES_ORDERS sord
JOIN SALES_ORDERLINES sordl
ON sord.orderId=sordl.orderid)
,invoice AS(
SELECT sinv.invoiceid
,sinv.customerid
,sinv.orderid
,sinvl.invoicelineid
,sinvl.quantity
,sinvl.unitprice
FROM SALES_INVOICES sinv
JOIN SALES_INVOICELINES sinvl
ON sinv.invoiceid=sinvl.invoiceid)
SELECT customer.buyinggroupname
,SUM(sales_order.quantity*sales_order.unitprice) total_sales
,SUM(sales_order.quantity*sinvl.unitprice) total_invoice
FROM customer
LEFT
JOIN sales_order
ON customer.customerid=sales_order.customerid
LEFT
JOIN invoice
ON sales_order.orderid=invoice.orderid
GROUP BY customer.buyinggroupname
We have broken the query up into three logical parts
Using a CTE we can query each of the components separately for debugging purposes. Let’s look at visual representation of the query
Visual representation of the query
Let’s zoom in on customer CTE
Customer CTE
We can collapse the CTEs
Collapse CTEs
None