SQL SERVER – Prevent Constraint to Allow NULL

With naked eyes, we often spot the evident problems but the specific details are missed many a time. Something similar happened recently. One of the blog readers sent me an email asking about a bug in how CHECK CONSTRAINT works. He suggested that check constraint accepts NULL even though the rule is specified.

After looking at the whole script, I found out what he has done and how to prevent this type of error.

Let us first reproduce the script where the constraint allows NULL value in the column.

CREATE DATABASE TestDB
GO
USE TestDB
GO
CREATE TABLE TestTable (ID INT, Col1 INT, Col2 INT)
GO
-- Create Constraint on Col1
ALTER TABLE TestTable ADD CONSTRAINT CK_TestTable_Col1
CHECK (Col1 > 0)
GO
-- Insert will work fine
INSERT INTO TestDB.dbo.TestTable (ID, Col1, Col2)
VALUES(1,1,1)
GO
-- Insert will throw an error
INSERT INTO TestDB.dbo.TestTable (ID, Col1, Col2)
VALUES(1,0,1)
GO
-- Insert will work fine with NULL
INSERT INTO TestDB.dbo.TestTable (ID, Col1, Col2)
VALUES(1,NULL,1)
GO

Inserting the zero will throw an error.

Inserting NULL will not throw error.

The reality is that any constraint will prevent values that are evaluated as FALSE. When NULL is evaluated, it is not evaluated as FALSE but as UNKNOWN. Owing to the same reason, the constraint is allowing NULL to be inserted. If you want NULL not to be inserted, the constraint has to be created in such a way that NULL will not be allowed.

The following script is created on Col2 where NULL is not allowed. When NULL is attempted to inserted, it will throw Error 547 as displayed in the earlier image.

-- Add the Constraint on Col2
ALTER TABLE TestTable ADD CONSTRAINT CK_TestTable_Col2
CHECK (Col2 > 0 AND Col2 IS NOT NULL)
GO
-- Insert will throw an error
INSERT INTO TestDB.dbo.TestTable (ID, Col1, Col2)
VALUES(1,1,NULL)
GO
-- Insert will work fine
INSERT INTO TestDB.dbo.TestTable (ID, Col1, Col2)
VALUES(1,1,1)
GO
-- Clean up
USE MASTER
GO
ALTER DATABASE TestDB
SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO
DROP DATABASE TestDB
GO

Reference: Pinal Dave (http://blog.SQLAuthority.com)

About these ads

SQL SERVER – Disable Clustered Index and Data Insert

Earlier today, I received following email.

“Dear Pinal,

We looked at your script and found out that in your script of disabling indexes, you have only included selected non-clustered index during the bulk insert and missed to disabled all the clustered index. Our DBA [name removed] has changed your script a bit and included all the clustered indexes. Since then our application is not working.

When DBA [name removed] tried to enable clustered indexes again he is facing error Incorrect syntax error.

We are in deep problem [word replaced]

[Removed Identity of organization and few unrelated stuff ]”

I have replied to my client and helped them fixed the problem. However, what really caught my attention was the concept of disabling clustered index. Let us try to learn a lesson from this experience.

In this case, there was no need to disable clustered index at all. I had done all the crucial work when I was called in to work on the tuning project. I removed unused indexes, created a few optimal indexes and wrote a script to disable selected high cost indexes when bulk insert operations (and the like) are performed. There was another script which rebuilds all the indexes as well. The solution worked until they included a clustered index in order to disable the script.

Clustered indexes are in fact original tables (or heap) which are physically ordered (any more things – not scope of this article) according to one or more keys (columns). When a clustered index is disabled, its data rows  cannot be accessed. This means that there will be no insertion process possible. On the other hand, when non-clustered indexes are disabled, all the data related to it are physically deleted, but the definition of the index is kept in the system.

Due to the same reason, even reorganization of the index is not possible until the clustered index (which was disabled) is rebuilt. Now, let us come to the second part of the question which is in regards to the receiving of the error when a clustered index is ‘enabled’. This is a very common question that I receive on the blog. (The following statement is written keeping the syntax of T-SQL in mind) Clustered indexes can be disabled but cannot be enabled again; they have to be rebuilt to become enabled. It is indeed a common thinking that something which we have ‘disabled’ can be ‘enabled’ but the syntax for this is ‘rebuild’. This issue has been explained here: SQL SERVER – How to Enable Index – How to Disable Index – Incorrect syntax near ‘ENABLE’.

Let us go over this example where inserting the data is not possible when a clustered index is disabled.

USE AdventureWorks
GO
-- Create Table
CREATE TABLE [dbo].[TableName](
[ID] [int] NOT NULL,
[FirstCol] [varchar](50) NULL,
CONSTRAINT [PK_TableName] PRIMARY KEY CLUSTERED
([ID] ASC)
)
GO
-- Create Nonclustered Index
CREATE UNIQUE NONCLUSTERED INDEX [IX_NonClustered_TableName] ON [dbo].[TableName]
([FirstCol] ASC)
GO
-- Populate Table
INSERT INTO [dbo].[TableName]
SELECT 1, 'First'
UNION ALL
SELECT 2, 'Second'
UNION ALL
SELECT 3, 'Third'
GO
-- Disable Nonclustered Index
ALTER INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] DISABLE
GO
-- Insert Data should work fine
INSERT INTO [dbo].[TableName]
SELECT 4, 'Fourth'
UNION ALL
SELECT 5, 'Fifth'
GO
-- Disable Clustered Index
ALTER INDEX [PK_TableName] ON [dbo].[TableName] DISABLE
GO
-- Insert Data will fail
INSERT INTO [dbo].[TableName]
SELECT 6, 'Sixth'
UNION ALL
SELECT 7, 'Seventh'
GO
/*
Error: Msg 8655, Level 16, State 1, Line 1
The query processor is unable to produce a plan because the index 'PK_TableName' on table or view 'TableName' is disabled.
*/
-- Reorganizing Index will also throw an error
ALTER INDEX [PK_TableName] ON [dbo].[TableName] REORGANIZE
GO
/*
Error: Msg 1973, Level 16, State 1, Line 1
Cannot perform the specified operation on disabled index 'PK_TableName' on table 'dbo.TableName'.
*/
-- Rebuliding should work fine
ALTER INDEX [PK_TableName] ON [dbo].[TableName] REBUILD
GO
-- Insert Data should work fine
INSERT INTO [dbo].[TableName]
SELECT 6, 'Sixth'
UNION ALL
SELECT 7, 'Seventh'
GO
-- Clean Up
DROP TABLE [dbo].[TableName]
GO

I hope this example is clear enough. There were a few additional posts I had written years ago, and they are as follows:

SQL SERVER – Enable and Disable Index Non Clustered Indexes Using T-SQL

SQL SERVER – Enabling Clustered and Non-Clustered Indexes – Interesting Fact

Reference: Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – GUID vs INT – Your Opinion

I think the title is clear what I am going to write in your post.

This is age old problem and I want to compile the list stating advantages and disadvantages of using GUID and INT as a Primary Key or Clustered Index or Both (the usual case).

Let me start a list by suggesting one advantage and one disadvantage in each case.

INT

Advantage:

  1. Numeric values (and specifically integers) are better for performance when used in joins, indexes and conditions.
  2. Numeric values are easier to understand for application users if they are displayed.

Disadvantage:

  1. If your table is large, it is quite possible it will run out of it and after some numeric value there will be no additional identity to use.

GUID

Advantage:

  1. Unique across the server.

Disadvantage:

  1. String values are not as optimal as integer values for performance when used in joins, indexes and conditions.
  2. More storage space is required than INT.

Please note that I am looking to create list of all the generic comparisons. There can be special cases where the stated information is incorrect, feel free to comment on the same.

Please leave your opinion and advice in comment section. I will combine a final list and update this blog after a week. By listing your name in post, I will also give due credit.

Reference: Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – Create Primary Key with Specific Name when Creating Table

It is interesting how sometimes the documentation of simple concepts is not available online. I had received email from one of the reader where he has asked how to create Primary key with a specific name when creating the table itself. He said, he knows the method where he can create the table and then apply the primary key with specific name. The attached code was as follows:

CREATE TABLE [dbo].[TestTable](
[ID] [int] IDENTITY(1,1) NOT NULL,
[FirstName] [varchar](100) NULL)
GO
ALTER TABLE [dbo].[TestTable] ADD  CONSTRAINT [PK_TestTable] PRIMARY KEY CLUSTERED
([ID] ASC)
GO

He wanted to know if we can create Primary Key as part of the table name as well, and also give it a name at the same time. Though it would look very normal to all experienced developers, it can be still confusing to many. Here is the quick code that functions as the above code in one single statement.

CREATE TABLE [dbo].[TestTable](
[ID] [int] IDENTITY(1,1) NOT NULL,
[FirstName] [varchar](100) NULL,
CONSTRAINT [PK_TestTable] PRIMARY KEY CLUSTERED
([ID] ASC)
)
GO

Reference: Pinal Dave (http://blog.SQLAuthority.com)

SQL SERVER – A Common Design Problem – Should the Primary Key Always be a Clustered Index

In SQL Server, whenever we create any key, a Primary Key automatically creates clustered index on the same. I like this feature and I use this feature every now and then.

The question is does the change of any column as Primary Key should also create a Clustered Index? Moreover, is there any case, where one would not do the same?

One of the recent conversations I had with one SQL Expert is with regard to the SSN number. The discussion was that SSN numbers are always unique and never repeated and hence are the best candidates for primary key. Additionally SSN numbers contains dashes (-), which make the datatype of the SSN numbers as String (VARCHAR or NVARCHAR). A clustered index on an integer usually performs better over a clustered index on an integer and makes the DBA to choose Primary Key of datatype Integer. At one point in our conversation, we discussed that if SSN number should be a Unique Constraint and if there should be another Identity Column as the Primary Key.

Some of the questions from our interesting discussion are as follows:

Would you have the datatype of your Primary Key as string?
Would you treat SSN as string datatype or remove the dashes and change it into an integer?
Do you have a real life example, where your primary key is not a clustered index?
What are the best practices for SSN number to store in database and obtain optimal performance?

I will post an interesting answer discussing this subject in a separate post with due credit.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Disable CHECK Constraint – Enable CHECK Constraint

Foreign Key and Check Constraints are two types of constraints that can be disabled or enabled when required. This type of operation is needed when bulk loading operations are required or when there is no need to validate the constraint. The T-SQL Script that does the same is very simple.

USE AdventureWorks
GO
-- Disable the constraint
ALTER TABLE HumanResources.Employee
NOCHECK CONSTRAINT CK_Employee_BirthDate
GO
-- Enable the constraint
ALTER TABLE HumanResources.Employee
WITH CHECK CHECK CONSTRAINT CK_Employee_BirthDate
GO

It is very interesting that when the constraint is enabled, the world CHECK is used twice – WITH CHECK CHECK CONSTRAINT. I often ask those to find the mistake in this script when they claim to know the syntax very well.

Have you ever disabled and enabled constraints in your production environment? I would like to know why you did so.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Difference Between Candidate Keys and Primary Key

Introduction

Not long ago, I had an interesting and extended debate with one of my friends regarding which column should be primary key in a table. The debate instigated an in-depth discussion about candidate keys and primary keys. My present article revolves around the two types of keys.

Let us first try to grasp the definition of the two keys.

Candidate Key – A Candidate Key can be any column or a combination of columns that can qualify as unique key in database. There can be multiple Candidate Keys in one table. Each Candidate Key can qualify as Primary Key.

Primary Key – A Primary Key is a column or a combination of columns that uniquely identify a record. Only one Candidate Key can be Primary Key.

One needs to be very careful in selecting the Primary Key as an incorrect selection can adversely impact the database architect and future normalization. For a Candidate Key to qualify as a Primary Key, it should be Non-NULL and unique in any domain. I have observed quite often that Primary Keys are seldom changed. I would like to have your feedback on not changing a Primary Key.

An Example to Understand Keys

Let us look at an example where we have multiple Candidate Keys, from which we will select an appropriate Primary Key.

Given below is an example of a table having three columns that can qualify as single column Candidate Key, and on combining more than one column the number of possible Candidate Keys touches seven. A point to remember here is that only one column can be selected as Primary Key. The decision of Primary Key selection from possible combinations of Candidate Key is often very perplexing but very imperative!

On running the following script it will always give 504 rows in all the options. This proves that they are all unique in database and meet the criteria of a Primary Key.

Run the following script to verify if all the tables have unique values or not.

USE AdventureWorks
GO
SELECT *
FROM Production.Product
GO
SELECT DISTINCT ProductID
FROM Production.Product
GO
SELECT DISTINCT Name
FROM Production.Product
GO
SELECT DISTINCT ProductNumber
FROM Production.Product
GO

All of the above queries will return the same number of records; hence, they all qualify as Candidate Keys. In other words, they are the candidates for Primary Key. There are few points to consider while turning any Candidate Key into a Primary Key.

Select a key that does not contain NULL

It may be possible that there are Candidate Keys that presently do not contain value (not null) but technically they can contain null. In this case, they will not qualify for Primary Key. In the following table structure, we can see that even though column [name] does not have any NULL value it does not qualify as it has the potential to contain NULL value in future.

CREATE TABLE [Production].[Product](
[ProductID] [int] IDENTITY(1,1) NOT NULL,
[Name] [dbo].[Name] NULL,
[ProductNumber] [nvarchar](25) NOT NULL,
[Manufacturer] [nvarchar](25) NOT NULL
)

Select a key that is unique and does not repeat

It may be possible that Candidate Keys that are unique at this moment may contain duplicate value. These kinds of Candidate Keys do not qualify for Primary Key. Let us understand this scenario by looking into the example given above. It is absolutely possible that two Manufacturers can create products with the same name; the resulting name will be a duplicate and only the name of the Manufacturer will differ in the table. This disqualifies Name in the table to be a Primary Key.

Make sure that Primary Key does not keep changing

This is not a hard and fast rule but rather a general recommendation: Primary Key values should not keep changing. It is quite convenient for a database if Primary Key is static. Primary Keys are referenced in numerous places in the database, from Index to Foreign Keys.  If they keep changing then they can adversely affect database integrity, data statistics as well as internal of Indexes.

Selection of Primary Key

Let us examine our case by applying the above three rules to the table and decide on the appropriate candidate for Primary Key. Name can contain NULL so it disqualifies as per Rule 1 and Rule 2. Product Number can be duplicated for different Manufacturers so it disqualifies as per Rule 2. ProductID is Identity and Identity column cannot be modified. So, in this case ProductID qualifies as Primary Key.

Please note that many database experts suggest that it is not a good practice to make Identity Column as Primary Key. The reason behind this suggestion is that many times Identity Column that has been assigned as Primary Key does not play any role in database. There is no use of this Primary Key in both application and in T-SQL. Besides, this Primary Key may not be used in Joins. It is a known fact that when there is JOIN on Primary Key or when Primary Key is used in the WHERE condition it usually gives better performance than non primary key columns. This argument is absolutely valid and one must make sure not to use such Identity Column. However, our example presents a different case. Here, although ProductID is Identity Column it uniquely defines the row and the same column will be used as foreign key in other tables. If a key is used in any other table as foreign key it is likely that it will be used in joins.

Quick Note on Other Kinds of Keys

The above paragraph evokes another question – what is a foreign key? A foreign key in a database table is a key from another table that refers to the primary key in the table being used. A primary key can be referred by multiple foreign keys from other tables. It is not required for a primary key to be the reference of any foreign keys. The interesting part is that a foreign key can refer back to the same table but to a different column. This kind of foreign key is known as “self-referencing foreign key”.

Summary

A table can have multiple Candidate Keys that are unique as single column or combined multiple columns to the table. They are all candidates for Primary Key. Candidate keys that follow all the three rules – 1) Not Null, 2) Unique Value in Table and 3) Static – are the best candidates for Primary Key. If there are multiple candidate keys that are satisfying the criteria for Primary Key, the decision should be made by experienced DBAs who should keep performance in mind.

Reference: Pinal Dave (http://blog.sqlauthority.com), DNS