Question: Which is Faster – INSERT INTO SELECT or SELECT INTO?
Answer: During the recent Comprehensive Database Performance Health Check, I ended up in a very interesting situation with my client. I realized that my customer had a large data which they wanted to move from one table to another table. However, this was taking a lot of time for them. They keep on asking me if I know any trick which they can use to reduce the insert time. As it was their critical for them we decided to work together and find out the root cause of the issue and resolve it for them.
After carefully looking into it we realized that to load the large data, my customer was using the syntax of INSERT INTO SELECT. instead of SELECT INTO Though both of the syntaxes looks same. The performance difference between them is very huge. Let us see a quick demonstration for the same.
Set Up for Test
First, let us set up a table with 10 Million rows which we will use for loading into two methods.
USE [WideWorldImporters] GO -- SampleTable Build CREATE TABLE [10MillionRows]( [StockItemTransactionID] [int] NOT NULL, [StockItemID] [int] NOT NULL, [TransactionTypeID] [int] NOT NULL, [CustomerID] [int] NULL, [InvoiceID] [int] NULL, [SupplierID] [int] NULL, [PurchaseOrderID] [int] NULL, [TransactionOccurredWhen] [datetime2](7) NOT NULL, [Quantity] [decimal](18, 3) NOT NULL, [LastEditedBy] [int] NOT NULL, [LastEditedWhen] [datetime2](7) NOT NULL) GO INSERT INTO [10MillionRows] SELECT TOP 10000000 SI.* FROM [Warehouse].[StockItemTransactions] SI CROSS JOIN [Warehouse].[StockItemTransactions] SI1 GO
Now we have a table which has 10 million rows we will start our experiment.
Test 1: INSERT INTO SELECT
In this test, we will test the syntax of INSERT INTO SELECT with the Statistics Time on.
-- Creating Table CREATE TABLE [InsertIntoTest]( [StockItemTransactionID] [int] NOT NULL, [StockItemID] [int] NOT NULL, [TransactionTypeID] [int] NOT NULL, [CustomerID] [int] NULL, [InvoiceID] [int] NULL, [SupplierID] [int] NULL, [PurchaseOrderID] [int] NULL, [TransactionOccurredWhen] [datetime2](7) NOT NULL, [Quantity] [decimal](18, 3) NOT NULL, [LastEditedBy] [int] NOT NULL, [LastEditedWhen] [datetime2](7) NOT NULL) GO -- Actual Test SET STATISTICS TIME ON INSERT INTO [InsertIntoTest] SELECT * FROM [10MillionRows] GO SET STATISTICS TIME OFF -- Clean up DROP TABLE [InsertIntoTest] GO
Let us see the statistics details of above query.
SQL Server Execution Times:
CPU time = 18937 ms, elapsed time = 19046 ms.
Please note that in the above details, we have not included the time of creating table as that would be negligible compared to the entire insert statement.
Test 2: SELECT INTO
In this test, we will test the syntax of SELECT INTO with the Statistics Time on.
SET STATISTICS TIME ON SELECT * INTO [dbo].[SelectIntoTest] FROM [10MillionRows] OPTION (MAXDOP 1) GO SET STATISTICS TIME OFF
Though the syntax looks very simple, we should not come to conclusion by looking at the syntax, we should only decide this based on the actual time taken by the query. Let us see the statistics details of above query.
SQL Server Execution Times:
CPU time = 6625 ms, elapsed time = 6861 ms.
You can see that the query took very little time compared to Test 2. If you noticed I have used the OPTION (MAXDOP 1) in my query. This is because when I ran the first query it was doing a single CPU query and the second query was doing a parallel execution of the query. To compare apples to apples, I have added MAXDOP into my query.
However, when I ran the same query without MAXDOP 1, I got the following results.
SQL Server Execution Times:
CPU time = 14611 ms, elapsed time = 2517 ms.
You can clearly see that when a query runs in parallel, it just uses a bit more CPU but the query execution time (elapsed time is still quicker than test 1).
Conclusion
If you compare the time of this query with the previous query which we ran in Test 1, you can clearly see that absolutely hands down the winner is Test 2: SELECT INTO. I have run this test multiple times with different datasets and scenarios and I noticed that every time SELECT INTO is faster than INSERT INTO SELECT.
Reference: Pinal Dave (https://blog.SQLAuthority.com)
15 Comments. Leave new
Great information. Since last week, I am gathering details about the SQL experience.
There are some amazing details on your blog which I didn’t know. Thanks.
Welcome to my world.
This is really interesting as I almost never use select into. Do you have any idea WHY the different syntax does the job so much quicker?
Nils
I would hazard that INSERT INTO has overhead because you can specify which fields to use whereas SELECT INTO is optimised to loop over all fields.
Sir,
No doubt, SELECT INTO is Faster.
But it does not copy Table or Column Level Constraints.
Sohrab
Like Nils, I always assumed that there would be an overhead creating the table and therefore it would be more efficient to use an existing table. So I dont often use SELECT INTO.
Even with no Keys or constraints my testing shows it is faster – but why?
Execution plans seem identical!
Worked it out I think :) Creating a new table, SQL can assume that no one else is using it so places a table lock.
The following seems (with my limited testing) to have the same performance as the INSERT INTO.
INSERT INTO
{DestinationTable} WITH (TABLOCKX)
SELECT
{Field1},{Fields2}….
FROM
{SourceTable}
Can you confirm Pinal?
Try INSERT into … with (tablock) SELECT …
SET STATISTICS TIME ON
INSERT INTO [InsertIntoTest] with (tablock)
SELECT *
FROM [10MillionRows]
GO
SET STATISTICS TIME OFF
Thanks for sharing this Jose.
Select into uses minimal logging.
Please see the article on msdn site.
I created a SPROC that inserts into table variables and that method performs faster than select into for me. DECLARE @TEMP_TABLE TABLE performed faster than INSERT INTO #Temp_Table vs SELECT INTO #Temp_Table
What could that possibly mean?
SELECT INTO is not supported in MySQL
That is correct.
Hi Pinal,
Thanks for sharing, i got same question in my interview yesterday , i have given same answer SELECT INTO but stuck on reason for faster.
I wanted to know reason behind faster Test2 query.
Thanks