Just a day ago, I was reading the blog post of Michale J Swart. If you are a regular reader of this blog, I am sure you will be familiar with him. He is a very interesting blogger for sure. He recently wrote an article about Ten Things I hate to See in T-SQL; it was really fun, but the thing which caught my eyes was the subject of isDeleted Column. First of all, let me say that I totally agree with his view point. Let me re-produce what Michale exactly suggests.
“Deleted records aren’t deleted. Look, they’re right there!”
You sometimes see this. Records that are marked as deleted with a flag. It’s sometimes needed to support an undelete feature but it still bugs me.
It’s also confusing when looking at procedures called s_OBJECT_Delete(). Without looking at the definition, does this procedure delete the record, or just mark it as deleted?”
I, in fact, encountered a similar experience very soon. I had gone for performance tuning consultation and I was reviewing a large table. I spotted one table called Orders. Naturally, the size of the table was in millions of the rows. I thought – it is fine, a table can have that many rows. Then I checked another table called customers and it had under thousand records. The question which came to my mind was how come thousand customers ordered millions of items. I asked local DBA coordinator the same question. He said oh, we just cleaned the customer table but the orders table is yet to clean – consider it as a smaller table.
I asked him how can I consider the orders table as a small table when it is too large. He said, well we have most of the column marked as isDeleted.
I just do not like this kind of design at all. I am firm believer of the architecture where only necessary data should be in single table and the useless data should be moved to an archived table. Instead of following the isDeleted column, I suggest the usage of two different tables: one with orders and another with deleted orders. In that case, you will have to maintain both the table, but in reality, it is very easy to maintain. When you write UPDATE statement to the isDeleted column, write INSERT INTO another table and DELETE it from original table. If the situation is of rollback, write another INSERT INTO and DELETE in reverse order. If you are worried about a failed transaction, wrap this code in TRANSACTION.
What are the advantages of the smaller table verses larger table in above described situations?
- A smaller table is easy to maintain
- Index Rebuild operations are much faster
- Moving the archive data to another filegroup will reduce the load of primary filegroup (considering that all filegroups are on different system) – this will also speed up the backup as well.
- Statistics will be frequently updated due to smaller size and this will be less resource intensive.
- Size of the index will be smaller
- Performance of the table will improve with a smaller table size.
I think we can count many other advantages. Again, this is totally my opinion and I would like to ask you question, do you have this kind of soft delete mechanism in your database? If yes, what is the usage of it and how do you handle this. This is very interesting question to me; so I may publish the result on blog post as well.
Reference: Pinal Dave (http://blog.sqlauthority.com)