I am really enjoying writing about computed column and its effect in terms of storage. Before I go on with this topic, I suggest you read the earlier articles about computed column to get the complete context.
This is the list of the all the articles in the series of computed column.
SQL SERVER – Computed Column – PERSISTED and Storage
This article talks about how computed columns are created and why they take more storage space than before.
SQL SERVER – Computed Column – PERSISTED and Performance
This article talks about how PERSISTED columns give better performance than non-persisted columns.
SQL SERVER – Computed Column – PERSISTED and Performance – Part 2
This article talks about how non-persisted columns give better performance than PERSISTED columns.
SQL SERVER – Computed Column and Performance – Part 3
This article talks about how Index improves the performance of Computed Columns.
SQL SERVER – Computed Column – PERSISTED and Storage – Part 2
This article talks about how creating index on computed column does not grow the row length of table.
SQL SERVER – Computed Columns – Index and Performance
This article summarized all the articles related to computed columns.
This article actually originates from the questions asked by one of the smartest attendees I have ever met in training a few weeks ago. He really impressed me with these good questions. He asked me this: “if we create the index on computed column, does it increase the data size of the original table? In other words, does the computed column become persisted if we create an index on it?“
This is a great question and the answer is rather very simple: No. It does not increase the data size of the original table as well as it does not turn the column into persisted. When we create an index on the column of the table, there is an additional space occupied by that index. Let us run the following code and understand this behavior.
USE tempdb
GO
-- Create Table
CREATE TABLE CompCol (ID INT,
FirstName VARCHAR(100),
LastName VARCHAR(100))
GO
-- Insert One Hundred Thousand Records
INSERT INTO CompCol (ID,FirstName,LastName)
SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
GO
-- Check the space used by table
sp_spaceused 'CompCol'
GO
-- Add Computed Column
ALTER TABLE dbo.CompCol ADD
FullName AS (FirstName+' '+LastName)
GO
-- Check the space used by table
sp_spaceused 'CompCol'
GO
-- Create non clustered index on Computed Column
CREATE NONCLUSTERED INDEX IX_CompCol_FullName
ON dbo.CompCol (FullName)
GO
-- Check the space used by table
sp_spaceused 'CompCol'
GO
-- Add Computed Column PERSISTED
ALTER TABLE dbo.CompCol ADD
FullName_P AS (FirstName+' '+LastName) PERSISTED
GO
-- Check the space used by table
sp_spaceused 'CompCol'
GO
-- Clean up Database
DROP TABLE CompCol
GO
Here we have done the following efforts:
- Basic table and measured the used space
- Computed column and measured the used space
- Index on computed column and measured the used space
- Table and measured the used space
Now let us observe the resultset.
Taking a look at the result, it is very clear that columns made persisted only takes additional spaces. On the other hand, when an index is created, it does not increase the data size of the table; rather, the index created just uses up the unused space in the table.
Let me know your opinion about this series.
Reference: Pinal Dave (https://blog.sqlauthority.com)
4 Comments. Leave new
Hi there,
what a about an index on a NON persist computed column.
It’s not a problem at all to add such an index; but what might be the
reason to allow such an index.
One could be is it th computed column is part of an index (via SUBSTR(..)
Do you have any idea why MS supports an on a NON persisted column?
pinaldave,
I am doing a SQL 2008 R2 data conversion. Oneof the tables has 2 computed persistent fields (BodyMassIndex & WtHR – which is waist to hip ratio) when I attempt to import the data after table creation, I get errors. How do I go about importing all the other data and leaving these fields null so they can compute automatically? I have tried using NULL, ”, and several other methods that have not worked. I have found nothing online to assist with this problem.
Here are the errors:
(Error)
Messages
Error 0xc0202009: Data Flow Task 1: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005.
An OLE DB record is available. Source: “Microsoft SQL Server Native Client 10.0” Hresult: 0x80004005 Description: “The statement has been terminated.”.
An OLE DB record is available. Source: “Microsoft SQL Server Native Client 10.0” Hresult: 0x80004005 Description: “Divide by zero error encountered.”.
(SQL Server Import and Export Wizard)
Error 0xc0209029: Data Flow Task 1: SSIS Error Code DTS_E_INDUCEDTRANSFORMFAILUREONERROR. The “input “Destination Input” (317)” failed because error code 0xC020907B occurred, and the error row disposition on “input “Destination Input” (317)” specifies failure on error. An error occurred on the specified object of the specified component. There may be error messages posted before this with more information about the failure.
(SQL Server Import and Export Wizard)
Error 0xc0047022: Data Flow Task 1: SSIS Error Code DTS_E_PROCESSINPUTFAILED. The ProcessInput method on component “Destination – ExamV2” (304) failed with error code 0xC0209029 while processing input “Destination Input” (317). The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running. There may be error messages posted before this with more information about the failure.
(SQL Server Import and Export Wizard)
Error 0xc02020c4: Data Flow Task 1: The attempt to add a row to the Data Flow task buffer failed with error code 0xC0047020.
(SQL Server Import and Export Wizard)
Error 0xc0047038: Data Flow Task 1: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component “Source – Query” (1) returned error code 0xC02020C4. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure.
(SQL Server Import and Export Wizard)
Well I figured it out after 1.5 days trying everything I could think of…..
The errors were due to abou 15 rows of data that someone has inserted 0.00 into rather than leave it NULL. The calculation for BodyMassIndex could not calculate on a ZERO (0) so it would throw an error and fail to import the remainer of data. I ran an update statement to set the 0.00’s to NULL and the import worked PERFECTLY!
I wanted to post what I found in the event someone else out there runs across the same type of problem.
Well DeAnn…. since the first error message reported “Divide by zero error encountered” it seems strange to me that it took 1,5 days before thinking about that ;-)
On a serious note, it’s a common mistake to overlook error messages, one feels like drowned in them! Read error messages carefully it’s an always valid piece of advice!