Just a day ago, I was reading the blog post of Michale J Swart. If you are a regular reader of this blog, I am sure you will be familiar with him. He is a very interesting blogger for sure. He recently wrote an article about Ten Things I hate to See in T-SQL; it was really fun, but the thing which caught my eyes was the subject of isDeleted Column. First of all, let me say that I totally agree with his view point. Let me re-produce what Michale exactly suggests.
“Deleted records aren’t deleted. Look, they’re right there!”
You sometimes see this. Records that are marked as deleted with a flag. It’s sometimes needed to support an undelete feature but it still bugs me.
It’s also confusing when looking at procedures called s_OBJECT_Delete(). Without looking at the definition, does this procedure delete the record, or just mark it as deleted?”
I, in fact, encountered a similar experience very soon. I had gone for performance tuning consultation and I was reviewing a large table. I spotted one table called Orders. Naturally, the size of the table was in millions of the rows. I thought – it is fine, a table can have that many rows. Then I checked another table called customers and it had under thousand records. The question which came to my mind was how come thousand customers ordered millions of items. I asked local DBA coordinator the same question. He said oh, we just cleaned the customer table but the orders table is yet to clean – consider it as a smaller table.
I asked him how can I consider the orders table as a small table when it is too large. He said, well we have most of the column marked as isDeleted.
I just do not like this kind of design at all. I am firm believer of the architecture where only necessary data should be in single table and the useless data should be moved to an archived table. Instead of following the isDeleted column, I suggest the usage of two different tables: one with orders and another with deleted orders. In that case, you will have to maintain both the table, but in reality, it is very easy to maintain. When you write UPDATE statement to the isDeleted column, write INSERT INTO another table and DELETE it from original table. If the situation is of rollback, write another INSERT INTO and DELETE in reverse order. If you are worried about a failed transaction, wrap this code in TRANSACTION.
What are the advantages of the smaller table verses larger table in above described situations?
- A smaller table is easy to maintain
- Index Rebuild operations are much faster
- Moving the archive data to another filegroup will reduce the load of primary filegroup (considering that all filegroups are on different system) – this will also speed up the backup as well.
- Statistics will be frequently updated due to smaller size and this will be less resource intensive.
- Size of the index will be smaller
- Performance of the table will improve with a smaller table size.
I think we can count many other advantages. Again, this is totally my opinion and I would like to ask you question, do you have this kind of soft delete mechanism in your database? If yes, what is the usage of it and how do you handle this. This is very interesting question to me; so I may publish the result on blog post as well.
Reference: Pinal Dave (https://blog.sqlauthority.com)
I am using isDeleted Flag to achieve the soft delete. Before setting the flag IsDeleted = 1, i want to check whether this record is referenced by other tables. Is there any way to achieve this
If you’re uncertain of what the foreign keys actually *are* you can locate them through the sys.foreign_keys, sys.foreign_key_columns, sysobjects and sys.all_columns tables. Once you know what the references are then just back-check the references to make sure things are nice and clean. I’m rather with Pinal Dave on the notion that you should keep your operational database clean and archive old traffic on a regular basis, but it is a lot of extra overhead to make sure that kind of a facility is set up and used regularly. As a consequence, since disk space is cheap these days, a lot of teams, being pressed for time, go with the soft-delete to save the time that would otherwise be sacrificed to the gods of ‘doing it right’. However it is inevitably true that nothing is eternal, and if you don’t clean it up now, somebody else will have to do so under tremendous duress with highly limited time constraints at a really, really bad time at some point in the future :)