SQL SERVER – Parallelism for Heap Scan

Every consulting engagement is different and I enjoy interacting with different people while I am working with different experts. Earlier this week, here is what I heard during one of the  Comprehensive Database Performance Health Check engagement about Parallelism for Heap Scan.

During a consultation, I have heard two of the DBAs talking about why clustered index is important. Let me write down our conversation here. I am just calling them Mark and John, which are just made up a name for readability.

Mark: Pinal, Do we need a clustered index on this table?

Pinal: Of course, we already have a primary key on the table and as per the workload analysis, I strongly believe we should create a clustered index on it.

Mark: John, didn’t we discussed yesterday that we need to have a clustered index on the table. Additionally, currently there is no clustered index on the table hence, the table is heap and when we retrieve data from Heap, SQL Server does not use the power of parallelism to retrieve data from a heap.

John: Yeah, I get that now. I never knew it before. I will for sure create on a clustered index on the table so we can take advantage of parallelism.

Well, so far everything was going fine, however, as soon as I heard about parallelism conversation, I realized that I must now jump into the conversation and clear up some doubt.

Pinal: Hold on, I need to give some clarification here. Let us for sure create a clustered index on the table as we have already identified a good reason for it based on the workload analysis. However, it has nothing to do with parallelism and heap scan. In SQL Server you can easily see that a heap table is also scanned with multiple processors. Let me show you a demo for a heap scan.

First, let us create a table and populate it with data.

CREATE TABLE TempTable (ID INT, Col1 CHAR(100),
Col2 CHAR(100))
GO
-- Insert into table
INSERT INTO TempTable (ID, Col1, Col2)
SELECT TOP 1000000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob','Brown'
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
GO

Now enable execution plan by pressing the shortcut CTRL + M or following this blog post. Next run the following query and check the execution plan.

-- SELECT Tests
SELECT *
FROM #tempTable
WHERE ID = 1000
GO

SQL SERVER - Parallelism for Heap Scan heapscanparallel

Now go to an execution plan and you will notice that the query is using parallel threads to read the data from the table and indeed doing a table scan.

You can clean up the code by dropping your temporary test table by executing following drop script.

DROP TABLE #TempTable
GO

As there is no clustered index on the table it is called heap and it proves that SQL Server can use parallel operations to read the data on the heap as well.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Clustered Index, Parallel, SQL CPU, SQL Heap, SQL Index, SQL Scripts, SQL Server
Previous Post
SQL Authroity News – Optimizing Lazy Spool – PASS Blog
Next Post
SQL Server Performance Tuning Practical Workshop – Relaunched

Related Posts

1 Comment. Leave new

  • Jose María Laguna
    October 30, 2018 9:28 pm

    Thanks for this article. It has been very helpfull viewing paralelism behaviour on our server
    At item SELECT plan also include “Degree of Paralelism”

    There is a mistake on table name.
    At CREATE and INSERT statements , name is not a temporal table ( TempTable )
    that is used at SELECT statement ( #TempTable )

    Reply

Leave a Reply