With naked eyes, we often spot the evident problems but the specific details are missed many a time. Something similar happened recently. One of the blog readers sent me an email asking about a bug in how CHECK CONSTRAINT works. He suggested that check constraint accepts NULL even though the rule is specified.
After looking at the whole script, I found out what he has done and how to prevent this type of error.
Let us first reproduce the script where the constraint allows NULL value in the column.
CREATE DATABASE TestDB
GO USE TestDB
GO CREATE TABLE TestTable (ID INT, Col1 INT, Col2 INT) GO -- Create Constraint on Col1 ALTER TABLE TestTable ADD CONSTRAINT CK_TestTable_Col1 CHECK (Col1 > 0) GO -- Insert will work fine INSERT INTO TestDB.dbo.TestTable (ID, Col1, Col2) VALUES(1,1,1) GO -- Insert will throw an error INSERT INTO TestDB.dbo.TestTable (ID, Col1, Col2) VALUES(1,0,1) GO -- Insert will work fine with NULL INSERT INTO TestDB.dbo.TestTable (ID, Col1, Col2) VALUES(1,NULL,1) GO
Inserting the zero will throw an error.
Inserting NULL will not throw error.
The reality is that any constraint will prevent values that are evaluated as FALSE. When NULL is evaluated, it is not evaluated as FALSE but as UNKNOWN. Owing to the same reason, the constraint is allowing NULL to be inserted. If you want NULL not to be inserted, the constraint has to be created in such a way that NULL will not be allowed.
The following script is created on Col2 where NULL is not allowed. When NULL is attempted to inserted, it will throw Error 547 as displayed in the earlier image.
-- Add the Constraint on Col2 ALTER TABLE TestTable ADD CONSTRAINT CK_TestTable_Col2 CHECK (Col2 > 0 AND Col2 IS NOT NULL) GO -- Insert will throw an error INSERT INTO TestDB.dbo.TestTable (ID, Col1, Col2) VALUES(1,1,NULL) GO -- Insert will work fine INSERT INTO TestDB.dbo.TestTable (ID, Col1, Col2) VALUES(1,1,1) GO -- Clean up USE MASTER GO ALTER DATABASE TestDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE; GO DROP DATABASE TestDB
We looked at your script and found out that in your script of disabling indexes, you have only included selected non-clustered index during the bulk insert and missed to disabled all the clustered index. Our DBA [name removed] has changed your script a bit and included all the clustered indexes. Since then our application is not working.
When DBA [name removed] tried to enable clustered indexes again he is facing error Incorrect syntax error.
We are in deep problem [word replaced]
[Removed Identity of organization and few unrelated stuff ]”
I have replied to my client and helped them fixed the problem. However, what really caught my attention was the concept of disabling clustered index. Let us try to learn a lesson from this experience.
In this case, there was no need to disable clustered index at all. I had done all the crucial work when I was called in to work on the tuning project. I removed unused indexes, created a few optimal indexes and wrote a script to disable selected high cost indexes when bulk insert operations (and the like) are performed. There was another script which rebuilds all the indexes as well. The solution worked until they included a clustered index in order to disable the script.
Clustered indexes are in fact original tables (or heap) which are physically ordered (any more things – not scope of this article) according to one or more keys (columns). When a clustered index is disabled, its data rows cannot be accessed. This means that there will be no insertion process possible. On the other hand, when non-clustered indexes are disabled, all the data related to it are physically deleted, but the definition of the index is kept in the system.
Due to the same reason, even reorganization of the index is not possible until the clustered index (which was disabled) is rebuilt. Now, let us come to the second part of the question which is in regards to the receiving of the error when a clustered index is ‘enabled’. This is a very common question that I receive on the blog. (The following statement is written keeping the syntax of T-SQL in mind) Clustered indexes can be disabled but cannot be enabled again; they have to be rebuilt to become enabled. It is indeed a common thinking that something which we have ‘disabled’ can be ‘enabled’ but the syntax for this is ‘rebuild’. This issue has been explained here: SQL SERVER – How to Enable Index – How to Disable Index – Incorrect syntax near ‘ENABLE’.
Let us go over this example where inserting the data is not possible when a clustered index is disabled.
GO -- Create Table CREATE TABLE [dbo].[TableName]( [ID] [int] NOT NULL, [FirstCol] [varchar](50) NULL, CONSTRAINT [PK_TableName] PRIMARY KEY CLUSTERED ([ID] ASC)
) GO -- Create Nonclustered Index CREATE UNIQUE NONCLUSTERED INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] ([FirstCol] ASC) GO -- Populate Table INSERT INTO [dbo].[TableName] SELECT 1, 'First' UNION ALL SELECT 2, 'Second' UNION ALL SELECT 3, 'Third' GO -- Disable Nonclustered Index ALTER INDEX [IX_NonClustered_TableName] ON [dbo].[TableName] DISABLE
GO -- Insert Data should work fine INSERT INTO [dbo].[TableName] SELECT 4, 'Fourth' UNION ALL SELECT 5, 'Fifth' GO -- Disable Clustered Index ALTER INDEX [PK_TableName] ON [dbo].[TableName] DISABLE
GO -- Insert Data will fail INSERT INTO [dbo].[TableName] SELECT 6, 'Sixth' UNION ALL SELECT 7, 'Seventh' GO /*
Error: Msg 8655, Level 16, State 1, Line 1
The query processor is unable to produce a plan because the index 'PK_TableName' on table or view 'TableName' is disabled.
-- Reorganizing Index will also throw an error ALTER INDEX [PK_TableName] ON [dbo].[TableName] REORGANIZE
Error: Msg 1973, Level 16, State 1, Line 1
Cannot perform the specified operation on disabled index 'PK_TableName' on table 'dbo.TableName'.
-- Rebuliding should work fine ALTER INDEX [PK_TableName] ON [dbo].[TableName] REBUILD
GO -- Insert Data should work fine INSERT INTO [dbo].[TableName] SELECT 6, 'Sixth' UNION ALL SELECT 7, 'Seventh' GO -- Clean Up DROP TABLE [dbo].[TableName]
I hope this example is clear enough. There were a few additional posts I had written years ago, and they are as follows:
It is interesting how sometimes the documentation of simple concepts is not available online. I had received email from one of the reader where he has asked how to create Primary key with a specific name when creating the table itself. He said, he knows the method where he can create the table and then apply the primary key with specific name. The attached code was as follows:
CREATE TABLE [dbo].[TestTable]( [ID] [int] IDENTITY(1,1) NOT NULL, [FirstName] [varchar](100) NULL) GO ALTER TABLE [dbo].[TestTable] ADD CONSTRAINT [PK_TestTable] PRIMARY KEY CLUSTERED ([ID] ASC) GO
He wanted to know if we can create Primary Key as part of the table name as well, and also give it a name at the same time. Though it would look very normal to all experienced developers, it can be still confusing to many. Here is the quick code that functions as the above code in one single statement.
CREATE TABLE [dbo].[TestTable]( [ID] [int] IDENTITY(1,1) NOT NULL, [FirstName] [varchar](100) NULL, CONSTRAINT [PK_TestTable] PRIMARY KEY CLUSTERED ([ID] ASC)
In SQL Server, whenever we create any key, a Primary Key automatically creates clustered index on the same. I like this feature and I use this feature every now and then.
The question is does the change of any column as Primary Key should also create a Clustered Index? Moreover, is there any case, where one would not do the same?
One of the recent conversations I had with one SQL Expert is with regard to the SSN number. The discussion was that SSN numbers are always unique and never repeated and hence are the best candidates for primary key. Additionally SSN numbers contains dashes (-), which make the datatype of the SSN numbers as String (VARCHAR or NVARCHAR). A clustered index on an integer usually performs better over a clustered index on an integer and makes the DBA to choose Primary Key of datatype Integer. At one point in our conversation, we discussed that if SSN number should be a Unique Constraint and if there should be another Identity Column as the Primary Key.
Some of the questions from our interesting discussion are as follows:
Would you have the datatype of your Primary Key as string?
Would you treat SSN as string datatype or remove the dashes and change it into an integer?
Do you have a real life example, where your primary key is not a clustered index?
What are the best practices for SSN number to store in database and obtain optimal performance?
I will post an interesting answer discussing this subject in a separate post with due credit.
Foreign Key and Check Constraints are two types of constraints that can be disabled or enabled when required. This type of operation is needed when bulk loading operations are required or when there is no need to validate the constraint. The T-SQL Script that does the same is very simple.
GO -- Disable the constraint ALTER TABLE HumanResources.Employee NOCHECK CONSTRAINT CK_Employee_BirthDate
GO -- Enable the constraint ALTER TABLE HumanResources.Employee WITH CHECK CHECK CONSTRAINT CK_Employee_BirthDate
It is very interesting that when the constraint is enabled, the world CHECK is used twice – WITH CHECK CHECK CONSTRAINT. I often ask those to find the mistake in this script when they claim to know the syntax very well.
Have you ever disabled and enabled constraints in your production environment? I would like to know why you did so.
Not long ago, I had an interesting and extended debate with one of my friends regarding which column should be primary key in a table. The debate instigated an in-depth discussion about candidate keys and primary keys. My present article revolves around the two types of keys.
Let us first try to grasp the definition of the two keys.
Candidate Key – A Candidate Key can be any column or a combination of columns that can qualify as unique key in database. There can be multiple Candidate Keys in one table. Each Candidate Key can qualify as Primary Key.
Primary Key – A Primary Key is a column or a combination of columns that uniquely identify a record. Only one Candidate Key can be Primary Key.
One needs to be very careful in selecting the Primary Key as an incorrect selection can adversely impact the database architect and future normalization. For a Candidate Key to qualify as a Primary Key, it should be Non-NULL and unique in any domain. I have observed quite often that Primary Keys are seldom changed. I would like to have your feedback on not changing a Primary Key.
An Example to Understand Keys
Let us look at an example where we have multiple Candidate Keys, from which we will select an appropriate Primary Key.
Given below is an example of a table having three columns that can qualify as single column Candidate Key, and on combining more than one column the number of possible Candidate Keys touches seven. A point to remember here is that only one column can be selected as Primary Key. The decision of Primary Key selection from possible combinations of Candidate Key is often very perplexing but very imperative!
On running the following script it will always give 504 rows in all the options. This proves that they are all unique in database and meet the criteria of a Primary Key.
Run the following script to verify if all the tables have unique values or not.
GO SELECT * FROM Production.Product
GO SELECT DISTINCT ProductID FROM Production.Product
GO SELECT DISTINCT Name FROM Production.Product
GO SELECT DISTINCT ProductNumber FROM Production.Product
All of the above queries will return the same number of records; hence, they all qualify as Candidate Keys. In other words, they are the candidates for Primary Key. There are few points to consider while turning any Candidate Key into a Primary Key.
Select a key that does not contain NULL
It may be possible that there are Candidate Keys that presently do not contain value (not null) but technically they can contain null. In this case, they will not qualify for Primary Key. In the following table structure, we can see that even though column [name] does not have any NULL value it does not qualify as it has the potential to contain NULL value in future.
CREATE TABLE [Production].[Product]( [ProductID] [int] IDENTITY(1,1) NOT NULL, [Name] [dbo].[Name] NULL, [ProductNumber] [nvarchar](25) NOT NULL, [Manufacturer] [nvarchar](25) NOT NULL
Select a key that is unique and does not repeat
It may be possible that Candidate Keys that are unique at this moment may contain duplicate value. These kinds of Candidate Keys do not qualify for Primary Key. Let us understand this scenario by looking into the example given above. It is absolutely possible that two Manufacturers can create products with the same name; the resulting name will be a duplicate and only the name of the Manufacturer will differ in the table. This disqualifies Name in the table to be a Primary Key.
Make sure that Primary Key does not keep changing
This is not a hard and fast rule but rather a general recommendation: Primary Key values should not keep changing. It is quite convenient for a database if Primary Key is static. Primary Keys are referenced in numerous places in the database, from Index to Foreign Keys. If they keep changing then they can adversely affect database integrity, data statistics as well as internal of Indexes.
Selection of Primary Key
Let us examine our case by applying the above three rules to the table and decide on the appropriate candidate for Primary Key. Name can contain NULL so it disqualifies as per Rule 1 and Rule 2. Product Number can be duplicated for different Manufacturers so it disqualifies as per Rule 2. ProductID is Identity and Identity column cannot be modified. So, in this case ProductID qualifies as Primary Key.
Please note that many database experts suggest that it is not a good practice to make Identity Column as Primary Key. The reason behind this suggestion is that many times Identity Column that has been assigned as Primary Key does not play any role in database. There is no use of this Primary Key in both application and in T-SQL. Besides, this Primary Key may not be used in Joins. It is a known fact that when there is JOIN on Primary Key or when Primary Key is used in the WHERE condition it usually gives better performance than non primary key columns. This argument is absolutely valid and one must make sure not to use such Identity Column. However, our example presents a different case. Here, although ProductID is Identity Column it uniquely defines the row and the same column will be used as foreign key in other tables. If a key is used in any other table as foreign key it is likely that it will be used in joins.
Quick Note on Other Kinds of Keys
The above paragraph evokes another question – what is a foreign key? A foreign key in a database table is a key from another table that refers to the primary key in the table being used. A primary key can be referred by multiple foreign keys from other tables. It is not required for a primary key to be the reference of any foreign keys. The interesting part is that a foreign key can refer back to the same table but to a different column. This kind of foreign key is known as “self-referencing foreign key”.
A table can have multiple Candidate Keys that are unique as single column or combined multiple columns to the table. They are all candidates for Primary Key. Candidate keys that follow all the three rules – 1) Not Null, 2) Unique Value in Table and 3) Static – are the best candidates for Primary Key. If there are multiple candidate keys that are satisfying the criteria for Primary Key, the decision should be made by experienced DBAs who should keep performance in mind.
About a week ago, SQL Server Expert, Imran Mohammed, provided a script, which will list all the missing identity values of a table in a database. In this post, I asked my readers if any could write a similar or better script. The results were interesting. While no one provided a new script, my question sparked a very active discussion that is still ongoing.
When providing the script, Imran asked me if I knew of any specific circumstances in which this kind of query could be useful, as he could not think of an instance where it would be necessary to find a missing identity. I was unable to think of a single reason for listing missing identities in a table. I posted Imran’s script on the assumption that someone would come up with an improved script, but as mentioned earlier, nobody did. Instead, we have been able to follow a very interesting discussion on subject of the need, if any, for listing Missing Identity values.
So, the question is this: “Do you know a real-world scenario where a Missing Identity value in any table can create problems?“
I have already received some extremely interesting comments from many experts, and all have posed the above question in one form or another. At this moment, I am still trying to think of an example from my own experience, but have yet to find one. Imran has since come up with one good example. Here is what he and other experts have suggested so far.
Jacob Sebastian – IDENTITY values are not expected to be sequential and there are all chances of having missing identity values, the most common cause is transaction rollbacks.
Simon Worth – The identity column is basically just a random number – even though they come sequentially. A developer making an assumption that the next record inserted will have an identity that is 1 more than the last inserted record. And if this is the case – then there are flaws in the logic of the developer.
Jacob Sebastian – What if the value in my table is 1, 2, 3, 5, 6 etc where “4″ is missing from the sequence. So what is the importance of knowing whether a table has missing identity values or not?
Imran Mohammed – If you use Identity property as your most unique column and Transaction Identifier, then definitely you would want to know why few transaction did not completely, Is there any specific fashion these transaction fails (Can be found out looking at missing values of identity)… Could be helpful to debug.
In one of my recent projects, a large database migration project, I confronted a peculiar situation. SQL Server tables were already moved from Database_Old to Database_New. However, all the Primary Key and Foreign Keys were yet to be moved from the old server to the new server.
Please note that this puzzle is to be solved for SQL Server 2005 or SQL Server 2008. As noted by Kuldip it is possible to do this in SQL Server 2000.
In SQL Server Management Studio (SSMS), there is no option to script all the keys. If one is required to script keys they will have to manually script each key one at a time. If database has many tables, generating one key at a time can be a very intricate task. I want to throw a question to all of you if any of you have script for the same purpose.
As per my opinion, I think the challenge is to get orders of the column included in Primary Key as well on the filegroup they exist.
Please note here that I am not looking for names of Primary Key or Foreign Key of database. I have already written an article for the same here : SQL SERVER – Two Methods to Retrieve List of Primary Keys and Foreign Keys of Database . I am looking for T-SQL script that generates Primary Key from the existing table for all tables in database. There are already a couple of answers on my post here from two SQL Experts; you can refer to those for example here.
Following is an example of T-SQL that we need to generate. I am looking for script that will generate T-SQL script for all the Primary Key and Foreign Key for the entire database.
ALTER TABLE [Person].[Address] ADD CONSTRAINT [PK_Address_AddressID] PRIMARY KEY CLUSTERED ( [AddressID] ASC ) ON [PRIMARY]
If you have a solution for the same, please post here or email me at pinal ‘at’ sqlauthority.com and I will post on this blog with due credit. Again, please spread the word and help community become stronger by your active participation.
There are two different methods of retrieving the list of Primary Keys and Foreign Keys from database.
Method 1: INFORMATION_SCHEMA
DISTINCT Constraint_Name AS [Constraint], Table_Schema AS [Schema], Table_Name AS [TableName] FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE
Method 2: sys.objects
SELECT OBJECT_NAME(OBJECT_ID) AS NameofConstraint, SCHEMA_NAME(schema_id) AS SchemaName, OBJECT_NAME(parent_object_id) AS TableName, type_desc AS ConstraintType FROM sys.objects WHERE type_desc IN ('FOREIGN_KEY_CONSTRAINT','PRIMARY_KEY_CONSTRAINT') GO
I am often asked about my preferred method of retrieving list of Primary Keys and Foreign Keys from database. I have a standard answer. I prefer method 3, which is querying sys database. The reason is very simple. sys. schema always provides more information and all the data can be retrieved in our preferred fashion with the preferred filter.
Let us look at the example we have on our hand. When Information Schema is used, we will not be able to discern between primary key and foreign key; we will have both the keys together. In the case of sys schema, we can query the data in our preferred way and can join this table to another table, which can retrieve additional data from the same.
Let us play a small puzzle here. Try to modify both the scripts in such a way that we are able to see the original definition of the key, that is, create a statement for this primary key and foreign key.
If I get an appropriate answer from my readers, I will publish the solution on this blog with due credit.