The performance of SQL Server is primarily decided by the disk I/O efficiency. Improving I/O definitely improves the performance. SQL Server 2008 introduced Data and Backup compression features to improve the disk I/O. Here, I will explain Data compression.
Data compression implies the reduction in the disk space reserved by data. Therefore, data compression can be configured for a table, clustered index, non-clustered index, indexed view or a partition of table or index.
Data compression is implemented at two levels: ROW and PAGE. Even page compression automatically implements row compression. Tables and indexes can be compressed when they are created by using the CREATE TABLE and CREATE INDEX statements. To change the compression state of a table, index, or partition, use the ALTER TABLE.. REBUILD WITH or ALTER INDEX.. REBUILD WITH statements. When compression state of a heap is changed, all non-clustered index are rebuilt (not with the compression type of table but there own).
Let me tell you what happens during the compression of data. In Row compression, the following four methods are used to remove unused space:
- The metadata overhead of the record is reduced.
- All numeric (for example integer, decimal, and float) and numeric-based (for example datetime and money) data type values are converted into variable length values. By default, the values of these data types are stored as fixed length like integer in 4 bytes datetime in 8 bytes, but after compression all unconsumed space is reclaimed. For example, a value 100 is stored in an integer-type column. We know an integer value between 0 and 255 can be stored in 1 byte. However, it reserves 4 bytes (integer type takes 4 bytes) on disk. Here, after compression, 3 bytes are reclaimed.
- CHAR and NCHAR type values are stored in variable length format. After compression, no blank character is stored with this type of data. For example, a value “DBMS” is stored in CHAR(10) type column. By default, this value will reserve 10 bytes on disk, but after compression, it will reserve only 4 bytes.
- NULL and 0 values across all data types are optimized and take no bytes.
Page compression is implemented through the following three methods:
- Row compression: as discussed above
- Prefix compression: In each page, for each column, a common value from all rows is identified and stored in a row below the header. After that, from all the rows, that common value is replaced with the reference of header row.
- Dictionary compression: Dictionary compression is same concept as that of Prefix compression implemented the second time. In Prefix compression, a separate common value is identified for each column, but in Dictionary compression, common values are identified from all columns in that page and stored in second row below the header. Then, these common values are replaced with the reference of values in new row.
Below is an example. Even it is not a good practical scenario, but sufficiently explains the compression functionality.
Review your database for performance and implement this strategies. Let us run the following different row level and page level compression and observe the size of the table once compression is in place.
USE tempdb
GO
CREATE TABLE TestCompression (col1 INT, col2 CHAR(50))
GO
INSERT INTO TestCompression VALUES (10, 'compression testing')
GO 5000
-- Original
EXEC sp_spaceused TestCompression
GO
-- DATA_COMPRESSION = ROW
ALTER TABLE TestCompression
REBUILD WITH (DATA_COMPRESSION = ROW);
GO
EXEC sp_spaceused TestCompression
GO
-- DATA_COMPRESSION = PAGE
ALTER TABLE TestCompression
REBUILD WITH (DATA_COMPRESSION = PAGE);
GO
EXEC sp_spaceused TestCompression
GO
-- DATA_COMPRESSION = NONE
ALTER TABLE TestCompression
REBUILD WITH (DATA_COMPRESSION = NONE);
GO
EXEC sp_spaceused TestCompression
GO
Do you use this compression on your production server? If yes, it will be interesting if you share your feedback with us over here.
Reference: Pinal Dave (https://blog.sqlauthority.com)
30 Comments. Leave new
Hi Pinal Dave,
Thanks for yet another good technical overview, from your experience what sort of additional CPU load do you see once enabling compression?
Hi Pinal,
I have used data compression at row level on a reporting database which i use for SSRS reporting. The good thing was that it did not create any performance issues by using the row compression. The database is not huge either it is only around 5GB. There were space savings at the table level.
Only SQL Server Enterprise Edition supports compression.
also in developer edition
Hi Pinal, I use data compression at page level on my 4 majors tables. I don’t have problems with CPU performance and I reduced almost 20% of the size of that tables.
Good article. I have a question, that either this will be beneficial in-term of storage or IO during query too. Like when we will submit query, is there any performance benefits?
Yes, comprassion can improve performance because reduced IO proportionally improves the performance.
Regards,
Pinal Dave
The optimisation of NULL and 0 to no bytes sounds like a potential problem. What happens if I have a numeric field where NULL is allowed as a valid entry (i.e. bank balance is unknown) and where 0 is also a valid entry (back balance is 0)? If both truly are stored exactly the same way with no other disambiguation, then some of the data will be incorrect.
i think null is optimized but ‘0’ would still have to be stored like any other number as either int or smallint or bigint whatever the datatype is specified for the column storing ‘0’
It is very informative one. Is this advisable to set it up in the SharePoint tables?
Hi Pinal Dave,
Any updates for the above?
Hi Pinal,
Love your site and straight forward info– without unnecessary bells & whistles.
DB compression is great…The question –is how to reclaim Space subsequent to DB compression.
MS article — advise not to use SHRINK file/database – but at times that’s the only option. Would you please advise best course to reclaim space after Compression using DBCC Shrink Database (without causing too much fragmentation)
Thanks in advance,
Hi, it was informative and good learning to me. How these compression reacts when new data is loaded to compressed table. Does it impact the table performance or do we need to rebuild compression every time when we update.
inspired by pinal dave.(ELA)
once again a great article .very helpful
so not only it’s a bless for the storage but even for the performances am i right?
Hi Pinal,
I have a question about page compression.
1. Created table (heap) tblcompress with page compression
2. Inserted 2 million rows using INSERT INTO .. WITH (TABLOCK) SELECT FROM syntax
3. Used below to verify if the data actually was page compressed as it should have as per msdn.
SELECT
o.name, ips.index_type_desc, p.partition_number, p.data_compression_desc,
ips.page_count, ips.compressed_page_count
FROM sys.dm_db_index_physical_stats
(DB_ID(), 266288554, NULL, NULL, ‘DETAILED’) ips
JOIN sys.objects o ON o.object_id = ips.object_id
JOIN sys.partitions p ON p.object_id = o.object_id
But data doesnt seem to be page compressed? Please help
–As per Msdn:
The newly inserted row is page-compressed:
• if new row goes to an existing page with page compression
• if the new row is inserted through BULK INSERT with TABLOCK
• if the new row is inserted through INSERT INTO … (TABLOCK) SELECT … FROM
Otherwise, the row is row-compressed.*
Hi Pinal,
We are doing ETL of several datasources into SQL Server. We do not have much space to accomodate data which comes into target DataBase. Which of these two compression techniques can be applied during real time data load into the target DB? If there is any other way to compress during data loading into target Database please share with us,
Many Thanks,
Daniel
Thanks it helped clarifying the concept.
i have a client requirement to compress data row or page
I am not sure What can i compressthe data or row
what can i do for banking domain application???????
When I execute following command,
ALTER TABLE [dbo].tbl_MMSG REBUILD PARTITION = ALL
WITH
(DATA_COMPRESSION = PAGE
)
I get following message.
Msg 7738, Level 16, State 2, Line 1
Cannot enable compression for object ‘tbl_MMSG’. Only SQL Server Enterprise Edition supports compression.
Please let me know how to compress table in Enterprise Editiion
I have been exploring for a little bit for any high quality articles or weblog posts on
this sort of space . Exploring in Yahoo I finally stumbled upon this web site.
Reading this info So i am happy to convey that I have a very just right uncanny feeling I came upon
exactly what I needed. I most no doubt will make sure to do not put out of your mind this web site and give it a
look regularly.
Thanks Pinal Dave. It helped me to improve the performance from 1 minute to 30 seconds. But I could see the inconsistency in the performance. It fluctuates from 20 seconds – 40 seconds. Can you tell me, how can we make the procedure to result data in consistent manner (variation of few seconds acceptable).
Regards,
Sivanesan