SQL SERVER – Computed Column – PERSISTED and Performance – Part 2

This is the third article in the series which I am writing on Persisted Columns. I suggest you read following two article first before continuing on this article.

This is the list of the all the articles in the series of computed column.

SQL SERVER – Computed Column – PERSISTED and Storage

This article talks about how computed columns are created and why they take more storage space than before.

SQL SERVER – Computed Column – PERSISTED and Performance

This article talks about how PERSISTED columns give better performance than non-persisted columns.

SQL SERVER – Computed Column – PERSISTED and Performance – Part 2

This article talks about how non-persisted columns give better performance than PERSISTED columns.

SQL SERVER – Computed Column and Performance – Part 3

This article talks about how Index improves the performance of Computed Columns.

SQL SERVER – Computed Column – PERSISTED and Storage – Part 2

This article talks about how creating index on computed column does not grow the row length of table.

SQL SERVER – Computed Columns – Index and Performance

This article summarized all the articles related to computed columns.

In this article we will see the other side, how persisted column can be expensive as well. Please note, this is not always the case, but it is the other side of the story of the persisted columns.

In earlier article we noticed that when computation is complex, computed columns are expensive to be created at run time. However, in this case we have relatively simpler example and while comparing the cost of the computed column with persisted data is less than non-persisted data. Let us quickly examine the example.

USE tempdb
GO
-- Create Table
IF EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[CompCol]') AND TYPE IN (N'U'))
DROP TABLE [dbo].[CompCol]
GO
IF EXISTS (SELECT * FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(N'[dbo].[CompCol_P]') AND TYPE IN (N'U'))
DROP TABLE [dbo].[CompCol_P]
GO
CREATE TABLE CompCol (ID INT,
FirstName VARCHAR(100),
LastName CHAR(800))
GO
CREATE TABLE CompCol_P (ID INT,
FirstName VARCHAR(100),
LastName CHAR(800))
GO
-- Insert One Hundred Thousand Records
INSERT INTO CompCol (ID,FirstName,LastName)
SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
GO
INSERT INTO CompCol_P (ID,FirstName,LastName)
SELECT TOP 10000 ROW_NUMBER() OVER (ORDER BY a.name) RowID,
'Bob',
CASE WHEN ROW_NUMBER() OVER (ORDER BY a.name)%2 = 1 THEN 'Smith'
ELSE 'Brown' END
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
GO
-- Add Computed Column
ALTER TABLE dbo.CompCol ADD
FullName AS (FirstName+' '+LastName)
GO
-- Add Computed Column PERSISTED
ALTER TABLE dbo.CompCol_P ADD
FullName_P AS (FirstName+' '+LastName) PERSISTED
GO
-- Select Comparision
SELECT FullName
FROM dbo.CompCol
WHERE FullName = 'Bob Smith'
GO
SELECT FullName_P
FROM dbo.CompCol_P
WHERE FullName_P = 'Bob Smith'
GO
-- Clean up Database
DROP TABLE CompCol
DROP TABLE CompCol_P
GO


Let us examine the resultset where we are comparing two SELECT statements.

SQL SERVER - Computed Column - PERSISTED and Performance - Part 2 persisted3

It is clear from above example that here column which are persisted are more expensive. In earlier example here we have seen column which was persisted was less expensive. I suggest that you consider different parameters like size of the column, computation logic, number of the rows in consideration before marking the column as persisted.

Reference: Pinal Dave (https://blog.sqlauthority.com)

SQL Scripts
Previous Post
SQL SERVER – Computed Column – PERSISTED and Performance
Next Post
SQLAuthority News – SQL Data Camp, Chennai, July 17, 2010 – A Huge Success

Related Posts

13 Comments. Leave new

  • Hello,

    Interesting article. Never thought that a persisted column could get a behavior like this. I have tried the example and looked at it with statistics io on and statistics profile on.

    When i look at the results, the first query is using 1112 logial reads, the second one, with the persisted columns is using 8062 logical reads.

    with an index on the computed column ‘Firstname’ i get a result of both 626 logical reads and even the outcome of the profile is exactly the same..

    With this in mind, isn’t it usefull to always use an index on a computed column?

    Reply
    • Marko Parkkola
      August 4, 2010 2:15 pm

      With persisted computed columns you shift performance from selects to inserts/updates. If you add index to the column you shift it a bit more. If you are not having performance issues with inserts/updates then it is propably wise to index the column.

      But this is weird. Let’s say we create a table with computed, non-persisted, column in it.

      CREATE TABLE T (a INT PRIMARY KEY, b AS a * 2)

      Then we create nonclustered covering index.

      CREATE NONCLUSTERED INDEX IX_T ON T(a) INCLUDE(b)

      Now how is this possible? If T.b is not persisted, how it can be used as non-key column in covering index?

      I mean that to be used as non-key column it means that the value of the column is copied into the tempdb and is fetched from there (if index is used in the query). But how can it be copied into the tempdb when it has no persisted value?

      Unfortunately I can’t find any info from the MSDN about this.

      Reply
  • Thanks for your whole effort and tips in this blog. This is one of your least interesting series of articles though: “It can be black on white, investigate” :)

    Reply
  • Interesting article on Computed columns. Did not expect that persisted columns can be more expensive.

    Reply
  • Thanks. I do find it interesting. I have a few tables that use computed columns. I may have to revisit some of them to see which could benefit from being persisted.

    Reply
  • All articles based on computed columns and performance are really very very interesting. If possible can you please write a post or comment on when to use it and when not to use. I mean to say what is the best case in which we should use computed columns. :-).

    Reply
    • Marko Parkkola
      August 5, 2010 1:29 pm

      I have one example where Pinal helped me to tackle some performance issues.

      Let’s say we have tables Foo, Bar1 and Bar1 as follows.

      CREATE TABLE Foo(id1 INT PRIMARY KEY, id2 INT)
      CREATE TABLE Bar1(id INT PRIMARY KEY FooId INT)
      CREATE TABLE Bar2(id INT PRIMARY KEY FooId INT)

      Now in order to match Foo and Bar tables we have to do outer joins as follows.

      SELECT *
      FROM Foo f
      LEFT JOIN Bar1 b1 ON b1.FooId = ISNULL(f.id2, f.id1)
      LEFT JOIN Bar2 b2 ON b2.FooId = ISNULL(f.id2, f.id1)

      Now, please, don’t ask why we did things this way. I really don’t know and I think it’s horrible :)

      But anyway. Pinal solved our performance issue with persisted computed column with index.

      ALTER TABLE Foo ADD CompId AS ISNULL(id2, id1)
      CREATE NONCLUSTERED INDEX IX_Foo_Comp ON Foo (CompId)

      And now we can join tables much more efficiently.

      SELECT *
      FROM Foo f
      LEFT JOIN Bar1 b1 ON b1.FooId = f.CompId
      LEFT JOIN Bar2 b2 ON b2.FooId = f.CompId

      Reply
  • The ONLY reason that it estimates 31%:69% between the two queries is because the second table (CompCol_P) has a larger rowsize. It has an extra VARCHAR(901) column in it (because of the PERSISTED column), so it takes up more space on disk, so there are more pages to process, so there’s a higher I/O cost.

    If you added a new column to CompCol and populate it so that it is just like CompCol_P…

    ALTER TABLE CompCol ADD Junk VARCHAR(901)
    GO
    UPDATE CompCol
    SET Junk = FirstName+’ ‘+LastName
    GO

    And then compared the two queries again:

    SELECT FullName
    FROM dbo.CompCol
    WHERE FullName = ‘Bob Smith’
    GO
    SELECT FullName_P
    FROM dbo.CompCol_P
    WHERE FullName_P = ‘Bob Smith’
    GO

    Now they are 50%:50%.

    So the fact that a persistent column exists in the CompCol_P table is only INDIRECTLY responsible for making the second query more expensive… just because it makes the table take up more space.

    That’s all.

    –Brad

    Reply
    • Exactly! I believe that’s correct. it’s total bytes of I/O
      check out this experiment I did:

      if object_id(‘ContactList’) is not null
      drop table ContactList;
      go
      create table ContactList (
      RecordId int identity primary key,
      FirstName char(25) not null,
      LastName char(25) not null,
      filler char(50),
      FullName as (LastName + ‘, ‘ + FirstName) –persisted
      );
      go
      if object_id(‘ContactList_P’) is not null
      drop table ContactList_P;
      go
      create table ContactList_P (
      RecordId int identity primary key,
      FirstName char(25) not null,
      LastName char(25) not null,
      FullName as (LastName + FirstName) persisted
      );
      go

      Create nonclustered Index IX_CListFullName on ContactList (FullName);
      go
      Create nonclustered Index IX_CListFullName on ContactList_P (FullName);
      go

      set nocount on;
      insert into ContactList
      values (‘John’, ‘Smith’, space(35))
      , (‘Jane’, ‘Doe’, space(35)), (‘Sally’, ‘Lane’, space(35)), (‘Kimberly’, ‘Smithstone’, space(35))
      insert into ContactList_P
      values (‘John’, ‘Smith’)
      , (‘Jane’, ‘Doe’), (‘Sally’, ‘Lane’), (‘Kimberly’, ‘Smithstone’)
      go 10000
      set statistics io on; –set statistics io off;

      select count(*) from ContactList; select count(*) from ContactList_P

      exec sp_spaceused ContactList; exec sp_spaceused ContactList_P

      select FirstName, LastName ,FullName
      from ContactList
      go
      select FirstName, LastName ,FullName
      from ContactList_P
      go

      It produces identical logical reads for persisted AND non-persisted when adding a filler column to compare apples with apples.

      Reply
      • so I’d add that unless you have large result-sets, the results are trivial and a moot point in terms of I/O reads.

  • Jack Sparrow
    June 22, 2016 7:01 pm

    The key lesson learned from this article is to make sure you have an index on your computed columns

    Reply

Leave a Reply