SQL SERVER – COLUMNSTORE and COLUMNSTORE_ARCHIVE data compression

In earlier blogs on “Fundamentals of Columnstore Index”, I took a simple approach on taking a tour of how to create a simple Columnstore index. Having said that, SQL Server 2014 supports columnstore and columnstore archival compression for columnstore tables and indexes. Columnstore tables and indexes are always stored with columnstore compression. Most data warehouses have some data that is frequently accessed and some that is accessed more infrequently.

You might be wondering the use case scenario for these commands. For example, in reality the data may be partitioned by date and the most recent data is accessed much more frequently than older data. In such cases the older data can benefit from additional compression at the cost of slower query performance. I talk to a lot of customers about this special feature:

  • To enable this scenario SQL Server added support for archival compression of SQL Server column stores from SQL Server 2014.
  • The archival compression option is enabled on a per-table or partition (object) basis; a column store can contain objects with archival compression applied and objects without. To allow for easy extensibility of existing on-disk structures, archival compression is implemented as an extra stream compression layer on top of Columnstore index implementation.
  • The Columnstore compression implementation transparently compresses the bytes being written to disk during the column store serialization process and transparently decompresses them during the deserialization process. Stream decompression is always applied when data is read from disk. Data is not cached in memory with stream compression.

The further reduction obtained by archival compression is substantial, ranging anywhere from 25% to 60%+ depending on the data. These are rough numbers based on what I have seen at customer interactions in the past.

  • Use COLUMNSTORE_ARCHIVE data compression to compress columnstore data with archival compression.

Let us look at how this can be achieved using TSQL commands.

To Apply Compression:

The simple TSQL commands to play with ColumnStore and columnStore Archive bits in single ALTER script:

-- Adding Archive bit for single Partition
ALTER TABLE tbl_myColumnStore
REBUILD PARTITION
= 1 WITH (DATA_COMPRESSION =  COLUMNSTORE_ARCHIVE) ;
-- Adding Archive bit for all Partition
ALTER TABLE tbl_myColumnStore
REBUILD PARTITION
= ALL WITH (DATA_COMPRESSION =  COLUMNSTORE_ARCHIVE) ;
-- Adding Columnstore for all parition and Archive for few Partition
ALTER TABLE tbl_myColumnStore
REBUILD PARTITION
= ALL WITH (DATA_COMPRESSION =  COLUMNSTORE_ARCHIVE ON PARTITIONS (1,3)) ;

Use COLUMNSTORE data compression to decompress archival compression. This resulting data will continue to be compressed with columnstore compression.

To Remove the Archive Compression

Similar to above commands, we can also use to remove our archival compression on specific partitions or on objects. A representative command looks like:

-- Turning Archive bit off to Columnstore for all Partitions
ALTER TABLE tbl_myColumnStore
REBUILD PARTITION
= ALL WITH (DATA_COMPRESSION =  COLUMNSTORE) ;

As I wrap up this blog, I would like to know how many of you out there are using the archival columnstore in your environments. How much compression are you getting and what is the space savings because of this? I would love to hear from your experience as we learn together.

Reference: Pinal Dave (https://blog.sqlauthority.com)

ColumnStore Index, SQL Index
Previous Post
SQL SERVER – Script to Find Strings with Same Characters
Next Post
SQL SERVER – Configure, Monitor and Restore Backups from Mobile & Web browser

Related Posts

Leave a Reply