SQL SERVER – Parallelism for Heap Scan

October 29, 2018

Every consulting engagement is different, and I enjoy interacting with different people and working with different experts. Earlier this week, here is what I heard about parallelism for heap scans during one of my Comprehensive Database Performance Health Check engagements.

During the consultation, I heard two of the DBAs talking about why a clustered index is important. Let me write down our conversation here. I am calling them Mark and John, which are made-up names for readability.

Mark: Pinal, Do we need a clustered index on this table?

Pinal: Of course, we already have a primary key on the table and as per the workload analysis, I strongly believe we should create a clustered index on it.

Mark: John, didn’t we discuss yesterday that we need to have a clustered index on the table? Additionally, there is currently no clustered index on the table, so the table is a heap, and when we retrieve data from a heap, SQL Server does not use the power of parallelism.

John: Yeah, I get that now. I never knew it before. I will for sure create a clustered index on the table so we can take advantage of parallelism.

Well, so far everything was going fine. However, as soon as I heard the conversation turn to parallelism, I realized that I had to jump in and clear up some doubts.

Pinal: Hold on, I need to give some clarification here. Let us for sure create a clustered index on the table as we have already identified a good reason for it based on the workload analysis. However, it has nothing to do with parallelism and heap scan. In SQL Server you can easily see that a heap table is also scanned with multiple processors. Let me show you a demo for a heap scan.

First, let us create a table and populate it with data.

-- Create the test table as a temporary table; no clustered index, so it is a heap
CREATE TABLE #TempTable (ID INT, Col1 CHAR(100),
Col2 CHAR(100))
GO
-- Insert one million rows into the table
INSERT INTO #TempTable (ID, Col1, Col2)
SELECT TOP 1000000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob','Brown'
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
GO
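
If you want to confirm that the table really is a heap before running the test, a quick catalog lookup along these lines should do it. This is only a sketch, and it assumes the demo table exists in your current session as the temporary table #TempTable (the name used by the SELECT and DROP statements in this post), which is why it looks in tempdb.

```sql
-- Assumption: the demo table was created in this session as #TempTable.
-- A heap appears in sys.indexes with index_id = 0 and type_desc = 'HEAP'.
SELECT i.index_id,
       i.type_desc
FROM tempdb.sys.indexes AS i
WHERE i.object_id = OBJECT_ID('tempdb..#TempTable');
GO
```

A table with a clustered index would show index_id = 1 and type_desc = 'CLUSTERED' instead.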

Now enable the actual execution plan by pressing the shortcut CTRL + M or by following this blog post. Next, run the following query and check the execution plan.

-- SELECT Tests
SELECT *
FROM #TempTable
WHERE ID = 1000
GO

Now go to the execution plan, and you will notice that the query uses parallel threads to read the data from the table and is indeed doing a table scan.

You can clean up by dropping the temporary test table with the following drop script.
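
If you do not see parallelism operators in your plan, it may be worth checking the server settings that influence that choice, and comparing against a plan that is forced to run serially. The script below is a rough sketch rather than part of the original demo. It assumes the test table is available in your session as #TempTable, and the MAXDOP 1 hint is added only for comparison.

```sql
-- Assumption: the demo table exists in this session as #TempTable.

-- Server-wide settings that influence whether a parallel plan is considered.
SELECT name, value_in_use
FROM sys.configurations
WHERE name IN (N'cost threshold for parallelism',
               N'max degree of parallelism');
GO

-- The same query restricted to a single thread, to compare with the parallel plan.
SELECT *
FROM #TempTable
WHERE ID = 1000
OPTION (MAXDOP 1);
GO
```

With the actual execution plan still enabled, the second query should show a table scan without any parallelism operator, which makes the difference from the earlier plan easy to spot.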

DROP TABLE #TempTable
GO

As there is no clustered index on the table, it is called a heap, and this demo proves that SQL Server can use parallel operations to read data from a heap as well.

Reference: Pinal Dave (https://blog.sqlauthority.com)

1 Comment

Jose María Laguna
October 30, 2018 9:28 pm
Thanks for this article. It has been very helpful for viewing parallelism behaviour on our server.
The SELECT item in the plan also includes “Degree of Parallelism”.
There is a mistake in the table name.
In the CREATE and INSERT statements, the name is not a temporary table ( TempTable ),
unlike the one used in the SELECT statement ( #TempTable ).