SQL SERVER – Introduction to BINARY_CHECKSUM and Working Example

In a recent consultancy engagement, I was asked whether I could give a working example of BINARY_CHECKSUM. This function is usually used to detect changes in a row: if any value in a row changes, BINARY_CHECKSUM can be used to figure out that the row has changed. However, if a row is changed from A to B and then changed back to A, BINARY_CHECKSUM cannot detect the change. Let us see a quick example of the same.

The following example is modified from the original example in BOL (Books Online).

USE AdventureWorks;
GO
-- Create table
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[myTable]') AND type IN (N'U'))
DROP TABLE [dbo].[myTable];
GO
CREATE TABLE myTable (column1 INT, column2 VARCHAR(256));
GO
-- Insert Value
INSERT INTO myTable VALUES (1, 'test');
GO
-- Check the Checksum
SELECT BINARY_CHECKSUM(*) AS BCH FROM myTable;
GO
-- Change the value
UPDATE myTable SET column2 = 'TEST';
GO
-- Check the Checksum
SELECT BINARY_CHECKSUM(*) AS BCH FROM myTable;
GO
-- Change the value
UPDATE myTable SET column2 = 'test';
GO
-- Check the Checksum
SELECT BINARY_CHECKSUM(*) AS BCH FROM myTable;
GO
-- Clean up
DROP TABLE [dbo].[myTable];
GO

Let us check the result set here.

[Screenshot: BINARY_CHECKSUM result set]

You can clearly see that when the value is changed, the value of BINARY_CHECKSUM changes as well; however, when the value is changed back to the original, the BINARY_CHECKSUM is restored too. Are you using this feature in your application? If yes, I am interested to know where and when you use it.
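Since the checksum only tells you *that* something changed, a common pattern is to persist it and compare on the next pass. The sketch below stores the last known checksum per key in a tracking table; myTableTracking and its columns are my own hypothetical names, not part of the example above.

```sql
-- Hypothetical tracking table: remembers the last checksum seen per key
CREATE TABLE myTableTracking (column1 INT PRIMARY KEY, LastChecksum INT);

-- Capture the current state of myTable
INSERT INTO myTableTracking (column1, LastChecksum)
SELECT column1, BINARY_CHECKSUM(column1, column2)
FROM myTable;

-- Later: list the rows whose checksum no longer matches the stored value
SELECT t.column1
FROM myTable AS t
INNER JOIN myTableTracking AS tr ON tr.column1 = t.column1
WHERE BINARY_CHECKSUM(t.column1, t.column2) <> tr.LastChecksum;
```

Note that, as shown above, a row changed from A to B and back to A between two passes would slip through this comparison.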

Reference: Pinal Dave (https://blog.sqlauthority.com)

Database, SQL Function, SQL Scripts

13 Comments

  • Chetana preetham
    August 1, 2010 8:38 am

    Hi Pinal
    Thanks for the post. I am an avid follower of the SQLAuthority blog.

    Here is my two cents on BINARY_CHECKSUM :)
    It can be used in transactional/merge replication.
    There are two methods to identify differences between the Subscriber and the Publisher:
    1) Row count – counts the number of rows but doesn't detect content differences between the Subscriber and the Publisher.

    2) Row count and BINARY_CHECKSUM – counts the number of rows and also detects content differences between the Subscriber and the Publisher.
    However, this validation is very resource intensive and should be executed sparingly.

    Regards
    Chetana Pretham

    Reply
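    The two validation modes Chetana describes map onto the @rowcount_only parameter of the replication validation procedure; a minimal sketch, assuming a transactional publication whose name here ('myPublication') is made up:

    ```sql
    -- Row count only: cheap, but misses content differences
    -- (the publication name is hypothetical)
    EXEC sp_publication_validation
        @publication = N'myPublication',
        @rowcount_only = 1;

    -- Row count plus binary checksum: detects content differences,
    -- but is resource intensive -- run sparingly
    EXEC sp_publication_validation
        @publication = N'myPublication',
        @rowcount_only = 2;
    ```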
  • Thanks Pinal, it's very informative. I also want to know whether, using BINARY_CHECKSUM or any other function, we can find which column(s) have been changed.

    Regards,
    Nishit

    Reply
    • Srinivasan Prasanna
      August 2, 2010 9:15 pm

      I would think CDC should be used for column changes as opposed to binary checksum. Am I right? Does anyone have a thought as to which one is better for column changes?

      Reply
      • Luciano Evaristo Guerche (Gorše)
        August 4, 2010 6:58 pm

        CDC is used mainly by SSIS (integration) and SSAS (analysis – cubes) as far as I know, whereas binary checksum is used for optimistic locking and concurrency collision detection when table(s) do not have timestamp column(s). In short: a) CDC runs in the background, and the tables it yields are used by SSIS and SSAS; b) binary checksums are added as extra computed columns to the SELECT statements used by ADO.NET data adapters to fill ADO.NET DataSets/DataTables. The data adapters should then have their UPDATE and DELETE commands modified so that their SQL statements use the binary checksum in the WHERE clause for optimistic locking and concurrency collision detection.
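        That optimistic-locking pattern can be sketched in plain T-SQL against the article's myTable; the checksum value and the variable name below are illustrative, not from the comment:

        ```sql
        -- 1) On read, fetch the row together with its checksum
        SELECT column1, column2,
               BINARY_CHECKSUM(column1, column2) AS RowChecksum
        FROM myTable
        WHERE column1 = 1;

        -- 2) On write, update only if the checksum still matches the one
        --    read earlier; the application carries @OriginalChecksum
        --    between the two calls (the value here is illustrative).
        DECLARE @OriginalChecksum INT = 12345;
        UPDATE myTable
        SET column2 = 'new value'
        WHERE column1 = 1
          AND BINARY_CHECKSUM(column1, column2) = @OriginalChecksum;

        IF @@ROWCOUNT = 0
            PRINT 'Concurrency collision: the row changed since it was read.';
        ```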

  • Nakul Vachhrajani
    August 1, 2010 8:42 pm

    Hello!

    Great article, Pinal. It very simply demonstrates the concept of binary checksum and also its pros & cons.

    We use the binary checksum in our application for merge replication (in fact, ours has a custom filter; but conceptually, it's merge replication). We use the binary checksum only to check for row-count and checksum violations. Because most of our tables have audit records, even if a record circumvented the replication filters, it would show up as a violation.

    That's where I have seen a few other products use binary checksums as well. A few use them for auditing, but now I realize that someone can easily go untraced by resetting a value back to the original.

    Have a great week ahead!

    Reply
  • Luciano Evaristo Guerche (Gorše)
    August 2, 2010 7:13 pm

    Hi Pinal,

    Take a look at where Shawn resorted to BINARY_CHECKSUM for optimistic locking and collision detection.

    Cheers.

    Reply
  • Hi Pinal,

    My company uses BINARY_CHECKSUM to compare snapshots of data in an application that integrates our different systems. We aren’t too worried about auditing the values, i.e. if they change back to their original values, we only need the latest changed data at a set time, so BINARY_CHECKSUM works quite well for us.

    Lee.

    Reply
    • The following also returns the same Binary_Checksum values.
      select BINARY_CHECKSUM('1044 ZONE 2')
      select BINARY_CHECKSUM('1074 ZONE 1')

      Reply
  • Hi Pinal

    We’ve found a bit of a potential problem with BINARY_CHECKSUM(), perhaps we are missing something but could you explain if this is intended behavior or a bug?

    -- these should all be different and are
    SELECT BINARY_CHECKSUM('ABCDEF-ABCDEF')
    SELECT BINARY_CHECKSUM('abcdef-abcdef')

    -- add one character to each string that is separated by a '-' character
    -- these should be different you would imagine
    -- however if we have 7 or more characters and the same number of characters on each side they are the same
    SELECT BINARY_CHECKSUM('ABCDEFG-ABCDEFG')
    SELECT BINARY_CHECKSUM('abcdefg-abcdefg')

    -- same effect with any other character upper case or not or alphanumerical (same 'split' character and case though!)
    SELECT BINARY_CHECKSUM('ABCDEFGxABCDEFG')
    SELECT BINARY_CHECKSUM('abcdefgxabcdefg')

    -- now try 7 characters without the '-' character and all is ok
    SELECT BINARY_CHECKSUM('ABCDEFGABCDEFG')
    SELECT BINARY_CHECKSUM('abcdefgabcdefg')

    -- or put 2 '-' characters in and all is ok!
    SELECT BINARY_CHECKSUM('ABCDEFG--ABCDEFG')
    SELECT BINARY_CHECKSUM('abcdefg--abcdefg')

    Paul

    Reply
    • The following also returns the same Binary_Checksum values.
      select BINARY_CHECKSUM('1044 ZONE 2')
      select BINARY_CHECKSUM('1074 ZONE 1')

      Reply
      • David Weprin
        June 4, 2012 10:06 pm

        CHECKSUM is acceptable when your application can tolerate infrequent missed changes. However, when accuracy is required, you can use the HASHBYTES() function to detect changes, although at a much higher cost of speed and storage space. See https://docs.microsoft.com/en-us/sql/t-sql/functions/hashbytes-transact-sql?view=sql-server-2017
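        A minimal sketch of that HASHBYTES alternative, applied to the article's myTable; the separator and algorithm choice are my own (SHA2_256 requires SQL Server 2012 or later):

        ```sql
        -- HASHBYTES takes a single (var)char/(var)binary input, so the
        -- columns must be concatenated; the '|' separator guards against
        -- 'AB' + 'C' hashing the same as 'A' + 'BC'.
        SELECT column1,
               HASHBYTES('SHA2_256',
                         CONVERT(VARCHAR(12), column1) + '|' + column2) AS RowHash
        FROM myTable;
        ```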

      • I found a usage for it when I was trying to discover an easy way to delete duplicate rows from a table without specifying all columns (could be 50+). The suggestion made was to use the checksum within the context of the partition by clause for the row_number function. Bear in mind that I don’t think this can be blindly used in all cases (check data type restrictions on checksum), but it sure does work for my test cases. Link to the original source is here:
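      A sketch of that duplicate-removal idea, using CHECKSUM(*) inside PARTITION BY so that 50+ columns need not be listed; the table name is hypothetical, and (as the commenter warns) a checksum collision could in theory delete a row that is not a true duplicate:

      ```sql
      -- Number the rows within each checksum group, then delete all but
      -- the first row of each group via the updatable CTE.
      WITH numbered AS (
          SELECT ROW_NUMBER() OVER (PARTITION BY CHECKSUM(*)
                                    ORDER BY (SELECT NULL)) AS rn
          FROM dbo.WideTable  -- hypothetical wide table with duplicates
      )
      DELETE FROM numbered
      WHERE rn > 1;
      ```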

  • Hello, the following returns the same value. What do you recommend?

    SELECT BINARY_CHECKSUM(-2.05)

    SELECT BINARY_CHECKSUM(2.05)

    Reply
