22 July, 2013

Changing the order of Joins in SQL Server can increase performance

Let use create a scenario where we can test the effects. We are going to create three tables:

CREATE TABLE TableA
(
    TableA_ID UNIQUEIDENTIFIER,
    NOTE      VARCHAR (500)   
);
 
CREATE TABLE TableB
(
    TableA_ID UNIQUEIDENTIFIER,
    NOTE      VARCHAR (500)   
);
 
CREATE TABLE TableC
(
    TableA_ID UNIQUEIDENTIFIER,
    NOTE      VARCHAR (500)   
);  

There is a logical relationship between these tables, means without any primary key and also there is no index on any column.

After creating tables let us check the execution plan between following queries:

--Query 1
SELECT *
FROM   TableA AS tblA
       INNER JOIN
       TableB AS tblB
       ON tblA.TableA_ID = tblB.TableA_ID
       INNER JOIN
       TableC AS tblC
       ON tblA.TableA_ID = tblC.TableA_ID;
 
 
GO
--Query 2 
SELECT *
FROM   TableA AS tblA
       INNER JOIN
       TableC AS tblC
       ON tblA.TableA_ID = tblC.TableA_ID
       INNER JOIN
       TableB AS tblB
       ON tblA.TableA_ID = tblB.TableA_ID;  

(To check execution plan, select above complete queries and click on 'Query>Include Actual Execution Plan'. Now execute Query.)

Both execution plans are same. Now, inserting 500 rows in TableA, 500 rows in TableB and 1000 rows in TableC

INSERT  INTO TableA (TableA_ID, NOTE)
VALUES             (NEWID(), 'Some Text');
 
GO 500
INSERT INTO TableB (TableA_ID, NOTE)
SELECT TableA_ID,
       NOTE + '.1'
FROM   TableA;
 
INSERT INTO TableC (TableA_ID, NOTE)
SELECT TableA_ID,
       NOTE + '.1'
FROM   TableA
UNION ALL
SELECT TableA_ID,
       NOTE + '.2'
FROM   TableA;  

Now let us again check the execution plan of:

--Query 1
SELECT *
FROM   TableA AS tblA
       INNER JOIN
       TableB AS tblB
       ON tblA.TableA_ID = tblB.TableA_ID
       INNER JOIN
       TableC AS tblC
       ON tblA.TableA_ID = tblC.TableA_ID;
 
 
GO
--Query 2 
SELECT *
FROM   TableA AS tblA
       INNER JOIN
       TableC AS tblC
       ON tblA.TableA_ID = tblC.TableA_ID
       INNER JOIN
       TableB AS tblB
       ON tblA.TableA_ID = tblB.TableA_ID;  



According to execution plan 'Query 1' is faster than 'Query 2'.

Now question is why two different plans for these queries? The Query Optimizer generate possible execution plans and try to select best one. Join ordering is directly related to the size of the search space as the number of possible plans for a query grows very rapidly depending on the number of tables joined. The optimizer needs to make two important decisions regarding joins:
  1. The selection of a join order
  2. The choice of a join algorithm
The order in which the tables are joined determines the cost and performance of a query. Join ordering is one of the most complex problems in query optimization. In our case SQL Server pick two different execution plans where query cost are different.

We can extend this research by including relationship between these tables and read the execution plan to identify the behavior.

At the end, do not forget to drop the tables.

DROP TABLE TableA;
 
DROP TABLE TableB;
 
DROP TABLE TableC;  

2 comments:

  1. Hmmm, Replace your INNER JOIN with INNER MERGE JOIN and see the execution plan.

    ReplyDelete
    Replies
    1. As per your suggestion changing 'JOIN' to 'INNER MERGE JOIN' then we have:

      --Query 3
      SELECT * FROM TableA tblA
      INNER MERGE JOIN TableB tblB
      ON tblA.TableA_ID = tblB.TableA_ID
      INNER MERGE JOIN TableC tblC
      ON tblA.TableA_ID = tblC.TableA_ID

      GO

      --Query 4
      SELECT * FROM TableA tblA
      INNER MERGE JOIN TableC tblC
      ON tblA.TableA_ID = tblC.TableA_ID
      INNER MERGE JOIN TableB tblB
      ON tblA.TableA_ID = tblB.TableA_ID

      According to the execution plan, each query cost is 50%.

      BUT if I compare 'Query 1' and 'Query 3' then simple 'Inner Join' is faster than 'INNER MERGE JOIN'.

      Delete