SQL SERVER – Introduction to Dynamic Data Masking

SQL SERVER - Introduction to Dynamic Data Masking dynamicdatamasking-800x580 Data security is a critical concern for all organizations dealing with sensitive information. Protecting this data is not just a matter of good business practice; it’s often a legal requirement. One of the ways to ensure data security is through a technique known as data masking. In this blog post, we’ll explore the concept of dynamic data masking, a powerful feature available in Microsoft SQL Server 2016 and later, as well as Azure SQL Database, Azure SQL Managed Instance, and Azure Synapse Analytics. I have recently discussed this in detail with my client while working on my popular service Comprehensive Database Performance Health Check.

What is Dynamic Data Masking?

Dynamic Data Masking (DDM) is a security feature that limits the exposure of sensitive data to non-privileged users. It’s a way to ‘obfuscate’ sensitive data, replacing it with fictitious yet realistic data without changing the data in the database. DDM can be applied to specific database fields, hiding sensitive data in the results of queries.

The real beauty of DDM is its simplicity of integration with existing applications. The masking rules are applied in the query results, enabling many applications to mask sensitive data without any alterations to existing queries.

How Does It Work?

DDM allows customers to specify how much sensitive data to reveal with minimal impact on their application layer. A central data masking policy acts directly on sensitive fields in the database, and privileged users or roles with access to the sensitive data can be specified.

DDM offers full and partial masking functions, including a random mask for numeric data. These masks are defined and managed using simple Transact-SQL commands.

Types of Dynamic Data Masks

There are several types of masks available:

  1. Default: Provides full masking based on the data types of the designated fields. For example, string data types are masked with “XXXX”, numeric data types are replaced with zero, and date/time data types are replaced with a default date and time.
  2. Email: This masking method exposes the first letter of an email address and replaces the rest with a constant suffix “.com”.
  3. Random: This masking function is used on any numeric type to replace the original value with a random value within a specified range.
  4. Custom String: This method exposes the first and last letters of a value and adds a custom padding string in the middle.
  5. DateTime: This method, available in SQL Server 2022, helps to mask portions of date and time data types.

Permissions and Best Practices

No special permissions are required to create a table with a dynamic data mask, only the standard CREATE TABLE and ALTER on schema permissions. However, adding, replacing, or removing a mask on a column requires the ALTER ANY MASK permission and ALTER permission on the table.

Remember, creating a mask on a column doesn’t prevent updates to that column. So, even though users receive masked data when querying the masked column, the same users can update the data if they have write permissions. Likewise, using SELECT INTO or INSERT INTO to copy data from a masked column into another table results in masked data in the target table (assuming it’s exported by a user without UNMASK privileges).

Example for Dynamic Data Masking

The following code demonstrates how to implement dynamic data masking in SQL Server. It performs the following steps:

  1. It creates a CustomerData table with some columns masked using built-in masking functions. This will conceal sensitive data within those columns.
  2. It inserts a sample record into the table with real data.
  3. It then creates a non-privileged user and grants them SELECT permission on the table.
  4. It impersonates that user to show that they only see the masked data when querying the table.
  5. It then grants the UNMASK permission to the user.
  6. When impersonating the user again, they now see the unmasked data since they have UNMASK permission.
  7. Finally, it removes the non-privileged user and drops the table.
-- Create a table to store customer data with masked columns
CREATE TABLE CustomerData (
ID int IDENTITY PRIMARY KEY,
FullName nvarchar(100) MASKED WITH (FUNCTION = 'default()') NULL,
Email nvarchar(100) MASKED WITH (FUNCTION = 'email()') NULL,
CreditCard varchar(50) MASKED WITH (FUNCTION = 'partial(0,"XXXX-XXXX-XXXX-",4)') NULL,
BirthDate date MASKED WITH (FUNCTION = 'default()') NULL
);

-- Insert sample data
INSERT INTO CustomerData (FullName, Email, CreditCard, BirthDate)
VALUES ('John Doe', 'johndoe@example.com', '1234-5678-9012-3456', '1980-01-01');

-- Create a non-privileged user
CREATE USER NonPrivilegedUser WITHOUT LOGIN;

-- Grant SELECT permission
GRANT SELECT ON CustomerData TO NonPrivilegedUser;

-- Impersonate the user to show initial masked view
EXECUTE AS USER = 'NonPrivilegedUser';

-- Query the data
SELECT * FROM CustomerData;

-- Revert impersonation
REVERT;

-- Grant UNMASK permission
GRANT UNMASK TO NonPrivilegedUser;

-- Impersonate again to show unmasked view
EXECUTE AS USER = 'NonPrivilegedUser';

-- Query the data
SELECT * FROM CustomerData;

-- Revert impersonation
REVERT;

-- Remove the user
DROP USER NonPrivilegedUser;

-- Drop the table
DROP TABLE CustomerData;

Here is the output when the first time select runs, and the user does not have the privilege. They are not able to see all the data.

SQL SERVER - Introduction to Dynamic Data Masking mask1

Once the user has privilege, they are able to see all the necessary data.

SQL SERVER - Introduction to Dynamic Data Masking mask2

Conclusion

While DDM is a powerful tool, it’s not without its limitations. For instance, a masking rule can’t be defined for encrypted columns, FILESTREAM, COLUMN_SET or a sparse column that is part of a column set. Also, cross-database queries involving masked columns may not provide correct results.

In conclusion, Dynamic Data Masking is a valuable tool for enhancing data security. However, it’s not a standalone solution and should be used with other SQL Server security features to achieve comprehensive data protection.

If you are interested in such SQL Server Security Concepts, you can watch my recent SQL Server Security Fundamentals course at Pluralsight.

Here is the link to my Twitter and LinkedIn profile.

Reference: Pinal Dave (https://blog.sqlauthority.com)

Data Masking, SQL Server Security
Previous Post
SQL SERVER – STRING_AGG Function to Concatenate Strings
Next Post
SQL SERVER – Stored Procedures to Check for Pythagorean Triples

Related Posts

Leave a Reply