SQL SERVER – Introduction to Change Data Capture (CDC) in SQL Server 2008

Simple-Talk.com has published my very first article on their site. This article is introducing Change Data Capture – the new concept introduced in SQL Server 2008. Change Data Capture records INSERTs, UPDATEs, and DELETEs applied to SQL Server tables, and makes a record available of what changed, where, and when, in simple relational ‘change tables’ rather than in an esoteric chopped salad of XML. These change tables contain columns that reflect the column structure of the source table you have chosen to track, along with the metadata needed to understand the changes that have been made.

Introduction to Change Data Capture (CDC) in SQL Server 2008

Please read my article and leave your valuable comments here. If you like the article do not forget to rate it.

Reference : Pinal Dave (http://blog.sqlauthority.com)

About these ads

58 thoughts on “SQL SERVER – Introduction to Change Data Capture (CDC) in SQL Server 2008

  1. Nice Article and i read a similar once before but this one is so simple to understand.. really you have a art to present things in very simple way…. see u change my thinking about indexes i dropped 4 after came back to Delhi

  2. Hello sir,

    One of the best article i have ever read.

    I have one question
    what, you are explaining about the Mask,
    that it stores in hexadecimal and after convert it into
    bit you know which columns are updated.
    what is the limit of that mask storing.
    i have table which has 185 columns so any easy way to
    find which column i have updated or
    can i get the name of the columns which are updated.

    one things is
    if i update with same value
    example column id has value – 1 and
    i update it with again 1 then that
    cdc records that thing ?

    • Hi Kuldip,

      You can have any number of column here. CDC supports as much as column in tracking as much are allowed by table creation.

      No, as name suggests Changed Data Capture, it will not track the column if do not change the value.

      Kind Regards,
      Pinal

      • Hi Pinal,
        I have some issue with CDC. My table has one varchar(max) column and one xml column. Whenever any update happens in the table where above two columns are not involved the before update data shows some strange values for those two columns.
        My xml column has value “”. In the CDC table before update it shows null. Same thing is happenning for varchar(max) column. Any idea on this?

  3. I have a system of triggers now that provides an undo/redo feature along with history commands that show a record’s history of changes. I’m wondering if CDC can be used to such functionality. Any thoughts?

  4. very good article and simple also. Now my question, is there any problem in using the CDC system that sql server provides, coz as i was searching for some pros and cons for using CDC i came across a link http://technet.microsoft.com/en-us/magazine/2008.11.sql.aspx?pr=blog#id0110025 which tells some comparison b/w change tracking and change data capture. Now is there any problem like performance hit in any way can occur when using the cdc facility

    Please reply!
    Thanks in advance

  5. Hi pinalDave,

    Great article, it helped a lot. I have two
    Is there any disadvantage of CDC that we have identified till now incase of real-time data transfer?
    How good is the performance and does CDC create any additional overhead on the server?
    Is it possible that CDC can replication process used in sql? Can it be compared?

    Awaiting your reply in anticipation
    Thanks in advance!

  6. Pingback: SQL SERVER – Several Readers Questions and Readers Answers Journey to SQL Authority with Pinal Dave

  7. Hi Pinal,

    Before I start my query I would like to say that whenever I have face any problem I have also got the solution through your Forum. It really helpful for us

    Thanks a million for all your efforts.

    I was trying to apply CDC on our Server where I would like to store ALL changes done on database. I was unable to figure out how do you track date and time at which those changes were carried out because when I go through the LOGS I only get information about the CHANGES but it doesn’t mention when this changes were carried out.

    Please let me know if I am missing something on tracing the time at which those changes where carried out.

    Thanks
    Neil

  8. Dear pinalDave
    I Red one article about introduction to Change Data Capture (CDC) in SQL Server 2008 which is Posted in simple-Talk.In that (a mirror of the tracked table is created with the same column structure of the original table+ Additional columns for tracking the Changes)..
    Does it Increases the DB Size while Executing?

  9. Hello Ajith,

    When CDC is enabled for a table the history of all changes is stored in a change table. As this table reserves size on disk, db size obviously increases. But using CDC retention based cleanup policy its size is maintained within a limit.

    Regards,
    Pinal Dave

  10. Hi Pinal,

    while search on google i have found out this alternative where you can add columns to existing CDC table to capture alternative information.

    Simply alter the cdc.schema_TableName_CT table and add a column with a default. The columns I have added are

    username varchar(30) null default(suser_sname())
    auditdte datetime2 null default(getdate())

    but i am not sure on impact it may carried out on Database.

    can you please let me know if there will any impact on database when I add additional columns which are not created by CDC

    Regards
    Neil

  11. Hi Dave

    I have a principal-mirror servers setup without witness and enabled CDC on a database 1 specific table on principal server. Whenever I failover I create the jobs(cleanup and capture) after failover was made(as BOL is saying) but the capture of the data seems is not happening on the newly principal server.
    If I failover back the capturing is wk ok on the initial principal.

    Any ideea if I miss something?

    Thx for help!

  12. Hi Pinal,

    CDC doesnt give me details on who made the changes.For this I can go with Audits.Other than this do you recommend any other methods.

    Thank You,
    Anup

  13. Hello Pinal,
    YOur blog was very useful for auditing. I want to know that how can we take backup and restore of cdc tables created under system tables.

    I increased retention to 1 week and backup database weekly.And if any issues want to restore the database and check cdc data.

    Thanks,

  14. Hi Pinal,

    Great article! You simplified the steps to make it easy to implement.

    Is there a way to also capture the User who made the modification?

    Is it as simple as adding a column called UserName with a DEFAULT VALUE of SUSER_SNAME() to the CT Table?

    Or is there a better way of doing this?

  15. Hello,

    Initial setup was clean and easy. When I dig further I am noticing the cleanup job is calling functions and stored procedures that do not exist.

    For example:

    sp_MScdc_cleanup_job calls the following functions and SP’s that do not exist.

    fn_cdc_is_db_enabled
    sp_cdc_get_cleanup_retention
    sys.sp_cdc_cleanup_job_internal

    Is anyone actually using this in a production envirnoment? Do I need to be running SQL2008 R2?

    Thanks,

    Randy

  16. OK, I as able to find the stored procedures in mssqlsystemresource.mdf database, although I still am looking for the above function.

    Thanks,

    Randy

  17. Hi Pinal,

    I do not have SQL Server 2008 and working with SQL server 2005, is there anything similar that i can do to my SQL 2005 version like adding some SP just like CDC that would give me the same functionality ??????

    Thanks
    Farooq

  18. Hi Pinal,

    I tried implementing the CDC and Configured Transctional Replication on the same database.. when i am doing this.. the CDC at the database level looks good..but CDC at the table level is disabling/tweeking the metadata tables of CDC..

    So, Could you plzz suggest me on thiss……

    Thanks
    Anil Inampudi

  19. hi Pinal,
    it’s very helpful article. but still i have small doubt, what is the Architecture of SQL server change data capture(CDC)? is sql server using trigger to capture change data in CDC?

    Thanks
    Tharindu Dhaneenja

  20. Hi Pinal,
    I have some issue with CDC. My table has one varchar(max) column and one xml column. Whenever any update happens in the table where above two columns are not involved the before update data shows some strange values for those two columns.
    My xml column has value “”. In the CDC table before update it shows null. Same thing is happenning for varchar(max) column. Any idea on this?

  21. This is cool feature and very well explained.

    do you have similar explanation for Change Tracking features, it’s pros and cons?
    Is it a good practice to use Change Tracking when polling is done with very high frequency say in every 2-3 minutes?

  22. Hi Pinal,

    I am looking for the opportunities to use this feature in my current DW implementation however, our source systems are in SQL 2005 and Sybase. Can CDC support other data sources somehow?

    Thanks
    Ashish

  23. Pingback: SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Day 18 of 31 Journey to SQLAuthority

  24. Pingback: SQL SERVER – Interview Questions and Answers – Frequently Asked Questions – Day 19 of 31 Journey to SQLAuthority

  25. Hi Pinal,
    great article, i love your way to explain.
    i’m not clearing that when i change views,stored proc, function, then cdc can work?
    and if any dba open table in edit mode & perform any insert/update/delete operation then in that case it also works?

  26. Hi Pinal,

    CDC was enabled on the existing database table where it had few records in it. CDC is not capcturing changes for old data which is there prior to the CDC, working great for new records that got inserted. What is the work around for this problem ?

  27. Hi Pinal,

    I have applied CDC on child tables, now the requirement has changed and i need to capture changes in parent tables. I can apply CDC on parent tables, but while fetching these change records in parent table, i want the records from its child tables as well which are associated with it at that point of time. Hope you get the point here.

  28. Is there any way to insert into CDC table manually?

    We have history tables already existing, we are trying to replace them with CDC tables, but we need to migrate data from old history tables into CDC table.

  29. Hi Pinal,

    Is there any way to keep my Change Data tables in another database instead of other schema?

    Your kind help is highly appreciated.

    Regards,
    Atiq Rahman

  30. How can I use CDC for bulk load when I have one database on a SQL Server 2005 server and the staging database on SQL Server 2008 with CDC? I don’t want to use lookups or joins or exist statements but I need to get the initial bulk load from the 2005 server and all incremental loads from the 2005 server. Can you help?

  31. How to push CDC data in change tables to destination

    The requirement is to license my DB data to others.
    After I give initial dump of data I am ware of only 2 options to keep the data in sync between source & destination

    1. Steup replication or
    2. Enable CDC .
    I have very limited knowledge of both.

    Can you please help me with which of the 2 mentioned above will be an efficient approach ?
    Also in case of replication the changes made to publisher are pushed to subscriber but in case of CDC i am not aware how to push the captured changes from CDC tables to the subscriber/destination server/DB

    Please guide me on this

  32. Hello Pinal,

    I just want to THANK YOU for every of your article. Your articles are so easy to understand. Really appreciate your help.

    Nick

  33. Pingback: SQL SERVER – A Quick Look at Logging and Ideas around Logging « SQL Server Journey with SQL Authority

  34. Wish I had read the article today before fumbling through BOL to gather an understanding :) Reading through your article confirms most of what I comprehended by reading BOL, but it also explained a few additional items. Excellent job with the article.

  35. Hi

    Generally while enable CDC default it created on same database. But now my requirement is enable the CDC on database1 and capture all the changes and logged into database2. That means all dml changes on database1 logged into database2 using CDC. If possible please assist on this.

    Regards.

  36. Sir i have question regarding that,
    If we disable CDC and then re-enable, then we lose existing CDC tables that tracked changes up to the point of disabling CDC. How can we use both feature disable CDC but keep already tracked data?
    Thx in advance …pls ans ASAP.

  37. Hi,
    I am using CDC to track my historical data.I have a doubt ,while clearing transaction log whether my CDC tables will clear its data?.

    Thanks in advance.

  38. Hi,
    Very Nice Article.
    Looks like what I was searching for.
    But, I have a doubt over here,,
    it says,
    “By Default clean up occurs after 3 days.”
    So how can I keep details(say in same db or another) which are being cleaned.

  39. Great article as always…

    Q: With CDC is it possible to set up a ‘number’ of previous tracking changes, rather than time based?

    eg. Given a record in a database, track the last (say) 100 edits done to that row. Keep the newest, and purge the older ones.

    Cheers!
    Kon,

  40. hi Pinal, thanks for the detailed article.

    Few questions on CDC implementation:
    1. What is the impact of CDC on DB server over classic Trigger on same set of tables; in terms of I/O, CPU, etc?
    2. Can CDC tables be created onto a different database than the source tables?

    Many thanks in advance.

  41. Hi Pinal,

    For example I have a table employee having column empid,ename,lastupdate_date

    Is their any sql server built in functionality which sets the last update datetime in each row , I don’t want to do it using trigger or adding code to set last updated datetime. But i want all the rows fetched which is last update between given datetime interval.

    Please help me on this.

    Thanks
    Saurabh

  42. Hi pinal,

    I have a set of database in which i want to create trigger to track the changes and update another table at the same time. How can i find the changed column name as well as the changed value.

    Please help me on this if you have any ideas on trigger.

  43. Hi Pinal,

    We are investigating on the working process of CDC to use this.
    But it seems there are many unanswered questions in this SQL Authority blog itself. We are just doubting whether we have to go with CDC or not.

    Thanks,
    Santhosh

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s