SQL SERVER – Data and Page Compressions – Data Storage and IO Improvement

The performance of SQL Server is largely determined by disk I/O efficiency, so improving I/O generally improves overall performance. SQL Server 2008 introduced Data and Backup compression features to reduce disk I/O. Here, I will explain Data compression.

Data compression reduces the disk space occupied by data. It can be configured for a table, a clustered index, a non-clustered index, an indexed view, or a partition of a table or index.

Data compression is implemented at two levels: ROW and PAGE. Page compression automatically includes row compression. Tables and indexes can be compressed when they are created by using the CREATE TABLE and CREATE INDEX statements, as shown in the sketch below. To change the compression state of an existing table, index, or partition, use the ALTER TABLE ... REBUILD WITH or ALTER INDEX ... REBUILD WITH statements. When the compression state of a heap is changed, all of its non-clustered indexes are rebuilt, but they keep their own compression setting rather than inheriting the table's.
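
Here is a minimal sketch of both approaches; the table and index names are placeholders, so adjust them to your own schema.

-- Compression declared at creation time
CREATE TABLE dbo.OrdersArchive
(
    OrderID INT NOT NULL,
    OrderDate DATETIME NOT NULL,
    Amount MONEY NOT NULL
)
WITH (DATA_COMPRESSION = PAGE);
GO
CREATE NONCLUSTERED INDEX IX_OrdersArchive_OrderDate
ON dbo.OrdersArchive (OrderDate)
WITH (DATA_COMPRESSION = ROW);
GO
-- Changing the compression state afterwards
ALTER TABLE dbo.OrdersArchive REBUILD WITH (DATA_COMPRESSION = ROW);
GO
ALTER INDEX IX_OrdersArchive_OrderDate ON dbo.OrdersArchive REBUILD WITH (DATA_COMPRESSION = PAGE);
GO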

Let me tell you what happens when data is compressed. Row compression uses the following four methods to remove unused space (a script to estimate the potential savings follows the list):

  1. The metadata overhead of the record is reduced.
  2. All numeric (for example, integer, decimal, and float) and numeric-based (for example, datetime and money) data type values are converted into variable-length values. By default, the values of these data types are stored at a fixed length, such as 4 bytes for an integer and 8 bytes for a datetime, but after compression all unconsumed space is reclaimed. For example, suppose the value 100 is stored in an integer column. An integer value between 0 and 255 fits in 1 byte, yet the column still reserves the full 4 bytes on disk; after compression, the 3 unused bytes are reclaimed.
  3. CHAR and NCHAR type values are stored in variable-length format, so after compression the trailing blank characters are no longer stored. For example, suppose the value “DBMS” is stored in a CHAR(10) column. By default, this value reserves 10 bytes on disk, but after compression it occupies only 4 bytes.
  4. NULL and 0 values across all data types are optimized and take no bytes.
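
Before compressing anything, you can ask SQL Server to estimate how much space a given setting would save. The sketch below runs the estimate for ROW compression against the TestCompression table created in the script later in this post; substitute your own schema and table name.

EXEC sp_estimate_data_compression_savings
    @schema_name = 'dbo',
    @object_name = 'TestCompression',
    @index_id = NULL,
    @partition_number = NULL,
    @data_compression = 'ROW';
GO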

Page compression is implemented through the following three methods (a query to confirm the resulting compression setting follows the list):

  1. Row compression: as discussed above
  2. Prefix compression: Within each page, for each column, a common prefix value shared by the rows is identified and stored in an anchor row just below the page header. Each occurrence of that value in the column is then replaced with a reference to the anchor row.
  3. Dictionary compression: Dictionary compression applies the same idea as prefix compression a second time. Whereas prefix compression identifies a separate common value for each column, dictionary compression looks for values that repeat anywhere on the page, across all columns, and stores them in a second row below the header. Those repeated values are then replaced with references to the entries in that dictionary row.
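
Once a table or index has been rebuilt, the current compression setting of every partition can be confirmed from sys.partitions. This sketch checks the TestCompression table used in the script below; swap in any table name you like.

SELECT OBJECT_NAME(p.object_id) AS table_name,
       p.index_id,
       p.partition_number,
       p.data_compression_desc
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID('dbo.TestCompression');
GO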

Below is an example. It is not a realistic production scenario, but it is sufficient to demonstrate how compression works.

Review your own database for performance before implementing these strategies. Let us now run row-level and page-level compression and observe the size of the table after each change.

USE tempdb
GO
CREATE TABLE TestCompression (col1 INT, col2 CHAR(50))
GO
-- GO followed by a count repeats the preceding batch; this inserts 5,000 rows (SSMS/sqlcmd feature)
INSERT INTO TestCompression VALUES (10, 'compression testing')
GO 5000
-- Original
EXEC sp_spaceused TestCompression
GO
-- DATA_COMPRESSION = ROW
ALTER TABLE TestCompression
REBUILD
WITH (DATA_COMPRESSION = ROW);
GO
EXEC sp_spaceused TestCompression
GO
-- DATA_COMPRESSION = PAGE
ALTER TABLE TestCompression
REBUILD
WITH (DATA_COMPRESSION = PAGE);
GO
EXEC sp_spaceused TestCompression
GO
-- DATA_COMPRESSION = NONE
ALTER TABLE TestCompression
REBUILD
WITH (DATA_COMPRESSION = NONE);
GO
EXEC sp_spaceused TestCompression
GO

Do you use data compression on your production server? If so, it would be interesting to hear your feedback in the comments.

Reference: Pinal Dave (http://blog.SQLAuthority.com)


25 thoughts on “SQL SERVER – Data and Page Compressions – Data Storage and IO Improvement”

  1. Hi Pinal Dave,

    Thanks for yet another good technical overview. From your experience, what sort of additional CPU load do you see once compression is enabled?

  2. Hi Pinal,
    I have used row-level data compression on a reporting database that I use for SSRS reporting. The good thing was that row compression did not create any performance issues. The database is not huge either; it is only around 5 GB. There were space savings at the table level.

  3. Hi Pinal, I use page-level data compression on my 4 major tables. I don’t have problems with CPU performance, and I reduced the size of those tables by almost 20%.

  4. Good article. I have a question: is this beneficial only in terms of storage, or does it also reduce I/O during queries? In other words, when we submit a query, is there any performance benefit?

  5. The optimisation of NULL and 0 to no bytes sounds like a potential problem. What happens if I have a numeric field where NULL is allowed as a valid entry (i.e. the bank balance is unknown) and where 0 is also a valid entry (the bank balance is 0)? If both truly are stored exactly the same way with no other disambiguation, then some of the data will be incorrect.

    • I think NULL is optimized, but ‘0’ would still have to be stored like any other number, in whatever data type (int, smallint, bigint, etc.) is specified for the column storing it.
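
    To be clear, the optimization does not merge NULL and 0; row compression only changes how the values are stored on disk, and the two remain fully distinguishable. A small sketch to verify this (the table name is made up):

    CREATE TABLE dbo.NullZeroTest (id INT IDENTITY(1,1), balance INT NULL)
    WITH (DATA_COMPRESSION = ROW);
    GO
    INSERT INTO dbo.NullZeroTest (balance) VALUES (NULL), (0), (100);
    GO
    -- NULL and 0 come back as distinct values
    SELECT id, balance,
           CASE WHEN balance IS NULL THEN 'NULL' ELSE 'NOT NULL' END AS balance_state
    FROM dbo.NullZeroTest;
    GO
    DROP TABLE dbo.NullZeroTest;
    GO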

  6. Hi Pinal,

    Love your site and straight forward info– without unnecessary bells & whistles.

    DB compression is great. The question is how to reclaim space after the database has been compressed.

    MS articles advise against using SHRINK on files/databases, but at times that’s the only option. Would you please advise the best course to reclaim space after compression using DBCC SHRINKDATABASE (without causing too much fragmentation)?

    Thanks in advance,
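
    A cautious sketch of one common approach, using DBCC SHRINKFILE (more targeted than SHRINKDATABASE) followed by an index rebuild; the database, file, and table names are placeholders. Note that rebuilding indexes needs free space and can grow the file again, so pick a shrink target with headroom rather than the minimum size.

    USE YourDatabase;
    GO
    -- Shrink the data file to a target size in MB, leaving working headroom
    DBCC SHRINKFILE (YourDataFile, 10240);
    GO
    -- Rebuild (or REORGANIZE) the indexes that the shrink fragmented
    ALTER INDEX ALL ON dbo.YourLargeTable REBUILD;
    GO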

  7. Hi, it was informative and a good learning experience for me. How does compression react when new data is loaded into a compressed table? Does it impact table performance, or do we need to rebuild the compression every time we update?

    inspired by pinal dave.(ELA)

  8. Pingback: SQL SERVER – Effect of Compressed Backup Setting at Server Level on Database Backup « SQL Server Journey with SQLAuthority

  9. Pingback: SQL SERVER – Reclaiming Space Back from Database – Quiz – Puzzle – 28 of 31 « SQL Server Journey with SQL Authority

  10. Hi Pinal,

    I have a question about page compression.

    1. Created table (heap) tblcompress with page compression
    2. Inserted 2 million rows using INSERT INTO .. WITH (TABLOCK) SELECT FROM syntax
    3. Used the query below to verify whether the data actually was page compressed, as it should have been per MSDN.

    SELECT
    o.name, ips.index_type_desc, p.partition_number, p.data_compression_desc,
    ips.page_count, ips.compressed_page_count
    FROM sys.dm_db_index_physical_stats
    (DB_ID(), 266288554, NULL, NULL, 'DETAILED') ips
    JOIN sys.objects o ON o.object_id = ips.object_id
    JOIN sys.partitions p ON p.object_id = o.object_id

    But the data doesn’t seem to be page compressed. Please help.

    As per MSDN:

    The newly inserted row is page-compressed:
    • if the new row goes to an existing page with page compression
    • if the new row is inserted through BULK INSERT with TABLOCK
    • if the new row is inserted through INSERT INTO … (TABLOCK) SELECT … FROM
    Otherwise, the row is row-compressed.
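
    For reference, here is a sketch of the same check with the object resolved by name instead of a hard-coded id, and with sys.partitions joined on the matching index and partition as well (the table name is the one from the comment above). compressed_page_count reports only the pages that actually ended up page-compressed; pages that do not save enough space stay row-compressed even in a PAGE-compressed object.

    SELECT  o.name,
            ips.index_type_desc,
            p.partition_number,
            p.data_compression_desc,
            ips.page_count,
            ips.compressed_page_count
    FROM sys.dm_db_index_physical_stats
         (DB_ID(), OBJECT_ID('dbo.tblcompress'), NULL, NULL, 'DETAILED') AS ips
    JOIN sys.objects AS o
         ON o.object_id = ips.object_id
    JOIN sys.partitions AS p
         ON p.object_id = ips.object_id
        AND p.index_id = ips.index_id
        AND p.partition_number = ips.partition_number;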

  11. Hi Pinal,

    We are doing ETL of several data sources into SQL Server. We do not have much space to accommodate the data that comes into the target database. Which of these two compression techniques can be applied during real-time data loads into the target DB? If there is any other way to compress during data loading into the target database, please share it with us.

    Many Thanks,
    Daniel

  12. I have a client requirement to compress data at the row or page level.
    I am not sure whether I should compress at the row level or the page level.
    What should I do for a banking domain application?

  13. Pingback: SQL SERVER – Weekly Series – Memory Lane – #018 | SQL Server Journey with SQL Authority

  14. When I execute the following command:

    ALTER TABLE [dbo].tbl_MMSG REBUILD PARTITION = ALL
    WITH (DATA_COMPRESSION = PAGE);

    I get the following message:

    Msg 7738, Level 16, State 2, Line 1
    Cannot enable compression for object 'tbl_MMSG'. Only SQL Server Enterprise Edition supports compression.

    Please let me know how to compress a table in Enterprise Edition.
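
    A quick way to check which edition an instance is running before attempting compression; in the SQL Server 2008 / 2008 R2 timeframe, data compression is available only on Enterprise, Developer, and Evaluation editions.

    SELECT SERVERPROPERTY('Edition') AS edition,
           SERVERPROPERTY('EngineEdition') AS engine_edition; -- 3 = Enterprise-class engine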

  16. Thanks, Pinal Dave. This helped me improve the performance from 1 minute to 30 seconds. But I can see inconsistency in the performance: it fluctuates between 20 and 40 seconds. Can you tell me how we can make the procedure return data in a consistent manner (a variation of a few seconds is acceptable)?

    Regards,
    Sivanesan
