Tuesday, September 30, 2008

Administrator & Monitoring Change Data Capture in SQL Server 2008

SQL Server 2008 introduces a new feature called Change Data Capture (CDC). CDC Captures DDL and DML activities on a SQL Server Table and places the changes in a separate SQL Server relational table. In this article we will see how DBA can administer and monitor CDC in SQL Server 2008. The CDC feature is disabled by default at the database level. A member of the sysadmin server role must enable a database for change data capture. Once the database is enabled for CDC any member of the dbo fixed database role can enable a table within the database for change data capture.

Overview of Change Data Capture
Once CDC is enabled at the database level, the next step is to enable CDC for a specific table for which the change needs to be captured. The CDC feature gathers the changed data from the database transaction log file and inserts the change information in an associated change table which is created during the setup and configuration process of CDC. There is a one to one relationship that exists between the source table and the change capture table. You can have a maximum of two change tables for a single source table. As the CDC feature needs to continuously read the transaction log file it’s obvious that for CDC to work SQL Server Agent should be running. As this feature in entrenched in the transaction log architecture and thus a lot of the metadata in CDC is related around the concept of a Log Sequence Number (LSN)

Definition of Log Sequence Number (LSN)
Every record in the Microsoft SQL Server transaction log is uniquely identified by a log sequence number (LSN). LSN's are ordered such that if LSN2 is greater than LSN1, the change described by the log record referred to by LSN2 occurs after the change described by the log record LSN.

SQL Server Versions Supporting Change Data Capture
The Change Data Capture feature is available in SQL Server 2008 Enterprise, Developer Editions.

How to Enable CDC for a SQL Server 2008 Database
1. Connect to a SQL Server 2008 instance using SQL Server Management Studio
2. In the query window, type the following TSQL Query to create a ChangeDataCapture Database:

Use Master
Go

IF EXISTS (SELECT name FROM sys.databases WHERE name = N'ChangeDataCapture')
DROP DATABASE ChangeDataCapture
GO

USE [master]
GO
Create Database ChangeDataCapture
Go

3. Once the database is successfully created you need to enable the change data capture feature for the database, this can be done by executing the following TSQL Query:

Use ChangeDataCapture
Go
EXEC sys.sp_cdc_enable_db
GO

4. Execute the query below to check whether the database is enabled for CDC:

Select [name] as DBName, is_cdc_enabled from sys.databases

The value of 1 for the is_cdc_enabled column means that the database is enabled for CDC and value of 0 means that the database is not enabled for CDC.

5. Once the database is enabled for CDC, you will see a new cdc schema, cdc user and a few metadata tables and other system objects created in the ChangeDataCapture database. The most important things which a DBA needs to keep in mind when enabling CDC for a database is to make sure that there are not a cdc schema or cdc user existing in the database before configuring CDC. If there was a cdc schema or cdc user existing in the database then the configuration of CDC will fail, so a DBA needs to remove or rename any previously existing cdc schema or user from the database before configuring CDC.



How to Enable CDC for a SQL Server 2008 Database Table
1. Now let’s create a Currency Table in the ChangeDataCapture Database by executing the following TSQL Query:

Use ChangeDataCapture
Go
Create table Currency
(
CurrencyKey Int Identity(1,1) Primary Key NOT NULL,
CurrencyAlternateKey varchar(5),
CurrencyName varchar(25)
)

2. Once the Currency table is successfully created, a DBA needs to make sure that the SQL Server Agent Service is running. In order for the CDC to be successful the SQL Server Agent should be running.

3. Enable CDC for the table Currency by executing the following TSQL:

Use ChangeDataCapture
Go
EXEC sp_cdc_enable_table 'dbo', 'Currency', @role_name = NULL, @supports_net_changes =1
Go




4. Execute the query below to check whether the table is enabled for CDC:

Use ChangeDataCapture
Go
Select [name], is_tracked_by_cdc from sys.tables
GO

The value of 1 for the is_tracked_by_cdc column means that the CDC is enabled for the table and the value of 0 for is_tracked_by_cdc column means that the CDC is disabled.



5. Once you have enabled CDC for the Currency table, another table is created for keeping changed data and the information about the changes in the source table. The new table created will have the cdc.dbo_Currency_CT name as highlighted in the above snippet:

How to disable CDC for a SQL Server 2008 Database Table
Database Administrators can run the TSQL Query below to disable CDC on a table:

Use ChangeDataCapture
Go
EXEC sys.sp_cdc_disable_table 'dbo', 'Currency', 'all'
Go


How to disable CDC for a SQL Server 2008 Database
Database Administrators can run the TSQL Query below to disable CDC on a database. The SQL Server Agent should be running when DBA a decides to disable CDC for the database. Once the TSQL Query below has been executed successfully you could see that the SQL Server Agent Jobs which monitor the changes happening to table by reading the transaction log are deleted and also all the system tables related to CDC are also dropped.

Use ChangeDataCapture
Go
EXEC sys.sp_cdc_disable_db
Go


Growth of Change Data Capture System Tables
Once in every three days there is an automatic cleanup process that occurs. For more intense environments you can leverage the manual method using the system stored procedure: sys.sp_cdc_cleanup_change_table. When you execute this system procedure you specify the low LSN and any change records occurring before this point are removed and the start_lsn is set to the low LSN you specified.

Change Data Capture System Tables
Once the DBA has enabled CDC for the Currency table, SQL Server creates the tables below along with a schema named CDC. CDC enables change tracking on tables so that Data Manipulation Language (DML) and Data Definition Language (DDL) changes made to the tables can be captured. The system tables that store information used by change data capture operations are mentioned below:

Table Name Description
cdc._CT It returns one row for each change made to a captured column in the CDC enabled source table
cdc.captured_columns It returns one row for each column tracked in a capture instance
cdc.change_tables It returns one row for each change table in the database
cdc.ddl_history All the Data Definition Language (DDL) change made to tables that are enabled for CDC are captured
cdc.lsn_time_mapping It returns one row for each transaction having rows in a change table. This table is used to map between log sequence number (LSN) commit values and the time the transaction committed
cdc.index_columns It has information about all the indexes which are associated to a CDC enabled table
cdc.cdc_jobs This table is created in MSDB database and has the configuration parameters for CDC agent jobs
dbo.systranschemas This table is used to track schema changes in articles published in transactional and snapshot publications. This table is stored in both publication and subscription databases

Change Data Capture System Stored Procedures
Change data capture enables change tracking on tables so that Data Manipulation Language (DML) and Data Definition Language (DDL) changes made to the tables can be captured. The following stored procedures help DBA to implement, monitor, and maintain change data capture:
1. sys.sp_cdc_add_job
2. sys.sp_cdc_change_job
3. sys.sp_cdc_cleanup_change_table
4. sys.sp_cdc_dbsnapshotLSN
5. sys.sp_cdc_disable_db
6. sys.sp_cdc_disable_table
7. sys.sp_cdc_drop_job
8. sys.sp_cdc_enable_db
9. sys.sp_cdc_enable_table
10. sys.sp_cdc_generate_wrapper_function
11. sys.sp_cdc_get_captured_columns
12. sys.sp_cdc_get_ddl_history
13. sys.sp_cdc_help_change_data_capture
14. sys.sp_cdc_help_jobs
15. sys.sp_cdc_restoredb
16. sys.sp_cdc_scan
17. sys.sp_cdc_start_job
18. sys.sp_cdc_stop_job
19. sys.sp_cdc_vupgrade
20. sys.sp_cdc_vupgrade_databases

Change Data Capture Dynamic Management Views
Database Administrators can monitor the change data capture process to determine if changes made are being written correctly and with a reasonable latency to the change tables. The following below mentioned DMV can be used to identify if there are any errors:
1. sys.dm_cdc_log_scan_sessions
2. sys.dm_cdc_errors

Change Data Capture System Functions
Change data capture records DDL & DML activity applied to SQL Server 2008 tables, supplying the details of the changes in an easily consumed relational format. Column information that mirrors the column structure of a tracked source table is captured for the modified rows, along with the metadata needed to apply the changes to a target environment. The following functions are used to return information about the changes:
1. cdc.fn_cdc_get_all_changes_
2. cdc.fn_cdc_get_net_changes_
3. sys.fn_cdc_decrement_lsn
4. sys.fn_cdc_get_column_ordinal ( 'capture_instance' , 'column_name' )
5. sys.fn_cdc_get_max_lsn
6. sys.fn_cdc_get_min_lsn
7. sys.fn_cdc_has_column_changed
8. sys.fn_cdc_increment_lsn
9. sys.fn_cdc_is_bit_set
10. sys.fn_cdc_map_lsn_to_time
11. sys.fn_cdc_map_time_to_lsn

Change Data Capture SQL Server Agent Jobs
There are two SQL Server Agent Jobs which are created when CDC is enabled for a particular table.



Conclusion
Database Administrators can use this feature to monitor DDL and DML changes which are happening to tables.

No comments: