Which is Faster – INSERT INTO SELECT or SELECT INTO? – Interview Question of the Week #187

August 19, 2018

Question: Which is Faster – INSERT INTO SELECT or SELECT INTO?

Answer: During the recent Comprehensive Database Performance Health Check, I ended up in a very interesting situation with my client. I realized that my customer had a large data which they wanted to move from one table to another table. However, this was taking a lot of time for them. They keep on asking me if I know any trick which they can use to reduce the insert time. As it was their critical for them we decided to work together and find out the root cause of the issue and resolve it for them.

After carefully looking into it we realized that to load the large data, my customer was using the syntax of INSERT INTO SELECT. instead of SELECT INTO Though both of the syntaxes looks same. The performance difference between them is very huge. Let us see a quick demonstration for the same.

Set Up for Test

First, let us set up a table with 10 Million rows which we will use for loading into two methods.

USE [WideWorldImporters]
GO
-- SampleTable Build
CREATE TABLE [10MillionRows](
[StockItemTransactionID] [int] NOT NULL,
[StockItemID] [int] NOT NULL,
[TransactionTypeID] [int] NOT NULL,
[CustomerID] [int] NULL,
[InvoiceID] [int] NULL,
[SupplierID] [int] NULL,
[PurchaseOrderID] [int] NULL,
[TransactionOccurredWhen] [datetime2](7) NOT NULL,
[Quantity] [decimal](18, 3) NOT NULL,
[LastEditedBy] [int] NOT NULL,
[LastEditedWhen] [datetime2](7) NOT NULL)
GO
INSERT INTO [10MillionRows]
SELECT TOP 10000000 SI.*
FROM [Warehouse].[StockItemTransactions] SI
CROSS JOIN [Warehouse].[StockItemTransactions] SI1
GO

Now we have a table which has 10 million rows we will start our experiment.

Test 1: INSERT INTO SELECT

In this test, we will test the syntax of INSERT INTO SELECT with the Statistics Time on.

-- Creating Table
CREATE TABLE [InsertIntoTest](
[StockItemTransactionID] [int] NOT NULL,
[StockItemID] [int] NOT NULL,
[TransactionTypeID] [int] NOT NULL,
[CustomerID] [int] NULL,
[InvoiceID] [int] NULL,
[SupplierID] [int] NULL,
[PurchaseOrderID] [int] NULL,
[TransactionOccurredWhen] [datetime2](7) NOT NULL,
[Quantity] [decimal](18, 3) NOT NULL,
[LastEditedBy] [int] NOT NULL,
[LastEditedWhen] [datetime2](7) NOT NULL)
GO
-- Actual Test
SET STATISTICS TIME ON
INSERT INTO [InsertIntoTest]
SELECT *
FROM [10MillionRows]
GO
SET STATISTICS TIME OFF
-- Clean up
DROP TABLE [InsertIntoTest]
GO

Let us see the statistics details of above query.

SQL Server Execution Times:
CPU time = 18937 ms, elapsed time = 19046 ms.

Please note that in the above details, we have not included the time of creating table as that would be negligible compared to the entire insert statement.

Test 2: SELECT INTO

In this test, we will test the syntax of SELECT INTO with the Statistics Time on.

SET STATISTICS TIME ON
SELECT *
INTO [dbo].[SelectIntoTest]
FROM [10MillionRows]
OPTION (MAXDOP 1)
GO
SET STATISTICS TIME OFF

Though the syntax looks very simple, we should not come to conclusion by looking at the syntax, we should only decide this based on the actual time taken by the query. Let us see the statistics details of above query.

SQL Server Execution Times:
CPU time = 6625 ms, elapsed time = 6861 ms.

You can see that the query took very little time compared to Test 2. If you noticed I have used the OPTION (MAXDOP 1) in my query. This is because when I ran the first query it was doing a single CPU query and the second query was doing a parallel execution of the query. To compare apples to apples, I have added MAXDOP into my query.

However, when I ran the same query without MAXDOP 1, I got the following results.

SQL Server Execution Times:
CPU time = 14611 ms, elapsed time = 2517 ms.

You can clearly see that when a query runs in parallel, it just uses a bit more CPU but the query execution time (elapsed time is still quicker than test 1).

Conclusion

If you compare the time of this query with the previous query which we ran in Test 1, you can clearly see that absolutely hands down the winner is Test 2: SELECT INTO. I have run this test multiple times with different datasets and scenarios and I noticed that every time SELECT INTO is faster than INSERT INTO SELECT.

Reference: Pinal Dave (https://blog.SQLAuthority.com)

SQL Performance, SQL Scripts, SQL Server

What is Lock Pages In Memory? – Interview Question of the Week #186

What Part of the SQLDump Suggest Database Corruption? – Interview Question of the Week #188

SQL SERVER – Finding Tables Created Last Week – DBA Tip Part II

January 12, 2015

SQL SERVER – Fix Error 3271: A nonrecoverable I/O error occurred on file. The remote server returned an error: (403) Forbidden

March 9, 2018

SQL SERVER – 2005 – Get Current User – Get Logged In User

October 27, 2007

15 Comments. Leave new

Wilfred Moses
August 19, 2018 3:02 pm
Great information. Since last week, I am gathering details about the SQL experience.
There are some amazing details on your blog which I didn’t know. Thanks.
Reply
Nils
August 19, 2018 6:30 pm
This is really interesting as I almost never use select into. Do you have any idea WHY the different syntax does the job so much quicker?
Nils
Reply
- David
  August 20, 2018 4:13 am
  I would hazard that INSERT INTO has overhead because you can specify which fields to use whereas SELECT INTO is optimised to loop over all fields.
  Reply
Sohrab
August 20, 2018 9:57 am
Sir,
No doubt, SELECT INTO is Faster.
But it does not copy Table or Column Level Constraints.
Sohrab
Reply
Miles
August 20, 2018 2:08 pm
Like Nils, I always assumed that there would be an overhead creating the table and therefore it would be more efficient to use an existing table. So I dont often use SELECT INTO.
Even with no Keys or constraints my testing shows it is faster – but why?
Reply
- Miles
  August 20, 2018 2:12 pm
  Execution plans seem identical!
  Reply
Miles
August 20, 2018 3:14 pm
Worked it out I think :) Creating a new table, SQL can assume that no one else is using it so places a table lock.
The following seems (with my limited testing) to have the same performance as the INSERT INTO.
INSERT INTO
{DestinationTable} WITH (TABLOCKX)
SELECT
{Field1},{Fields2}….
FROM
{SourceTable}
Can you confirm Pinal?
Reply
José
August 20, 2018 7:55 pm
Try INSERT into … with (tablock) SELECT …
SET STATISTICS TIME ON
INSERT INTO [InsertIntoTest] with (tablock)
SELECT *
FROM [10MillionRows]
GO
SET STATISTICS TIME OFF
Reply
Dmitriy Ivanov
September 21, 2018 4:10 am
Select into uses minimal logging.
Please see the article on msdn site.
Reply
beasleylarry
October 8, 2019 9:12 pm
I created a SPROC that inserts into table variables and that method performs faster than select into for me. DECLARE @TEMP_TABLE TABLE performed faster than INSERT INTO #Temp_Table vs SELECT INTO #Temp_Table
What could that possibly mean?
Reply
Sandy
January 29, 2020 4:14 pm
SELECT INTO is not supported in MySQL
Reply
- Pinal Dave
  January 29, 2020 4:24 pm
  That is correct.
  Reply
Md. Shamshuzzoha
August 27, 2020 3:40 pm
Hi Pinal,
Thanks for sharing, i got same question in my interview yesterday , i have given same answer SELECT INTO but stuck on reason for faster.
I wanted to know reason behind faster Test2 query.
Thanks
Reply