Tuesday, September 30, 2008

SQL Server 2008 MERGE Statement

SQL Server 2008 introduces the MERGE statement which will allow users to perform insert, update and delete operations in a single statement. In the earlier versions of SQL Server to achieve the same functionality the database developer or database administrator needed to write separate statements to perform the insert, update or delete of data in one table based on certain conditions in another table

How MERGE Statement Internally Works
The MERGE statement internally works as an individual insert, update and delete statement within a single Merge statement. You need to specify the SOURCE and the TARGET table or query which should be joined together. Within the MERGE statement you also need to specify the type of the data modification that needs to be performed when the records between the source and target are matched and what actions needs to be performed when they are not matched. With the introduction of MERGE statement the complex TSQL codes which was used earlier to do checks for the existence or inexistence of data within the data warehouse can be replaced with single Merge statement. The use of Merge statement will also improve the query performance.
Below are the three different matched clauses in MERGE:
  • WHEN MATCHED THEN
    • Rows that meet the criteria
  • WHEN [TARGET] NOT MATCHED THEN
    • Rows that do not match with another row in the target table
  • WHEN SOURCE NOT MATCHED THEN
    • Rows that do not match with another row in the source table

Overview on OUTPUT Clause
You can use the OUTPUT clause which was introduced in SQL Server 2005 to find out what records are being inserted, updated or deleted when you run a MERGE statement. OUTPUT clause will output information you specify for every record which is inserted, updated or deleted. The OUTPUT clause can also be useful to the value of the identity or any computed columns after an insert or update operation is performed against records in a table. In the below example you can see that you are using OUTPUT clause to identify what actions has been taken on records.

Syntax of MERGE statement:

MERGE
[ TOP ( expression ) [ PERCENT ] ]
[ INTO ] target_table [ WITH ( ) ] [ [ AS ] table_alias]
USING
ON
[ WHEN MATCHED [ AND ]
THEN ]
[ WHEN NOT MATCHED [ BY TARGET ] [ AND ]
THEN ]
[ WHEN NOT MATCHED BY SOURCE [ AND ]
THEN ]
[ ]
[ OPTION ( [ ,...n ] ) ];

Example: How to Use Merge Statement
In this example you will be creating a Target table named CricketTeams (destination table) and will populate it with the list of cricket playing nations. The second table named CricketTeams_UpdatedList will act as Source table, which has updated information about the cricket playing nations. Using MERGE statement you will be modifying the Target table named CricketTeams with the changes that of CricketTeams_UpdatedList Source table.

USE tempdb;
GO

IF OBJECT_ID (N'dbo.CricketTeams', N'U') IS NOT NULL
DROP TABLE dbo.CricketTeams;
GO

-- TARGET TABLE
CREATE TABLE dbo.CricketTeams
(
CricketTeamID tinyint NOT NULL PRIMARY KEY,
CricketTeamCountry nvarchar(30),
CricketTeamContinent nvarchar(50)
)
GO

-- INSERT DATA INTO TARGET TABLE
INSERT INTO dbo.CricketTeams VALUES
(1, 'Australia', 'Australia'),
(2, 'India', 'Asia'),
(3, 'Pakistan', 'Asia'),
(4, 'Srilanka', 'Asia'),
(5, 'Bangaladesh', 'Asia'),
(6, 'HongKong', 'Asia'),
(7, 'U.A.E', 'Asia'),
(8, 'England', 'Europe'),
(9, 'South Africa', 'Africa'),
(10, 'West Indies', 'North America');
GO

SELECT * FROM CricketTeams
GO

IF OBJECT_ID (N'dbo.CricketTeams_UpdatedList', N'U') IS NOT NULL
DROP TABLE dbo.CricketTeams_UpdatedList;
GO

-- SOURCE TABLE
CREATE TABLE dbo.CricketTeams_UpdatedList
(
CricketTeamID tinyint NOT NULL PRIMARY KEY,
CricketTeamCountry nvarchar(30),
CricketTeamContinent nvarchar(50)
)
GO

-- INSERT DATA INTO SOURCE TABLE
INSERT INTO dbo.CricketTeams_UpdatedList VALUES
(1, 'Australia', 'Australia'),
(2, 'India', 'Asia'),
(3, 'Pakistan', 'Asia'),
(4, 'Srilanka', 'Asia'),
(5, 'Bangaladesh', 'Asia'),
(6, 'Hong Kong', 'Asia'),
(8, 'England', 'Europe'),
(9, 'South Africa', 'Africa'),
(10, 'West Indies', 'North America'),
(11, 'Zimbabwe', 'Africa');
GO

/*
MERGE STATEMENT WHICH WILL PERFORM INSERT, UPDATE OR DELETE OPERATION BASED ON CONDITIONS MENTIONED WITHIN STATEMENT
*/

MERGE dbo.CricketTeams AS TARGET
USING dbo.CricketTeams_UpdatedList AS SOURCE
ON (TARGET.CricketTeamID = SOURCE.CricketTeamID)
WHEN MATCHED AND TARGET.CricketTeamContinent <> SOURCE.CricketTeamContinent OR
TARGET.CricketTeamCountry <> SOURCE.CricketTeamCountry
THEN UPDATE SET TARGET.CricketTeamContinent = SOURCE.CricketTeamContinent ,
TARGET.CricketTeamCountry = SOURCE.CricketTeamCountry
WHEN NOT MATCHED THEN
INSERT (CricketTeamID, CricketTeamCountry, CricketTeamContinent)
VALUES (SOURCE.CricketTeamID, SOURCE.CricketTeamCountry, SOURCE.CricketTeamContinent)
WHEN NOT MATCHED BY SOURCE THEN
DELETE
OUTPUT $action,
INSERTED.CricketTeamID AS SourceCricketTeamID,
INSERTED.CricketTeamCountry AS SourceCricketTeamCountry,
INSERTED.CricketTeamContinent AS SourceCricketTeamContinent,
DELETED.CricketTeamID AS TargetCricketTeamID,
DELETED.CricketTeamCountry AS TargetCricketTeamCountry,
DELETED.CricketTeamContinent AS TargetCricketTeamContinent;

SELECT @@ROWCOUNT;

SELECT * FROM CricketTeams_UpdatedList

Important Points to Remember While Using MERGE Statement

  • The MERGE statement requires a semicolon (;) as a statement terminator. Otherwise Error 10713 is raised when a MERGE statement is executed without the statement terminator
  • At least one of the three MATCHED clauses must be specified when using MERGE statement; the MATCHED clauses can be specified in any order. However a variable cannot be updated more than once in the same MATCHED clause
  • If you are using @@ROWCOUNT at the end of MERGE statement, then it will return the total number of rows inserted, updated and deleted in the target table
  • Use executing the MERGE statement should have SELECT Permission on the SOURCE Table and INSERT, UPDATE and DELETE Permission on the TARGET Table
  • MERGE Statement runs through data only once in the database which also improves performance

MERGE Statement Execution Results



Even though at the first glance the MERGE statement code looks very confusing, if you analysis it the logical implementation is very simple. Now let me explain what actually happees when we used the MERGE statement.


In the above snippet you could see that you have performed an UPDATE, DELETE and INSERT operation for records in CricketTeams (TARGET Table) based on certain conditions in CricketTeams_UpdatedList (SOURCE Table).

You could see that the record with CricketTeamID as “6” is UPDATED in CricketTeams (TARGET) Table. This is because in CricketTeams_UpdatedList SOURCE Table the value for CricketTeamCountry is “Hong Kong” where as in CricketTeams TARGET Table the value was “HongKong”. As there is a change in value for the CricketTeamCountry column in the SOURCE Table when compared to data in TARGET Table the record was UPDATED.

Next, you can also see a DELETE operation being performed on the CricketTeams TARGET Table for the record which had CricketTeamID as “7”. This has happens because there is no record in CricketTeams_UpdatedList SOURCE Table which has CricketTeamID as “7”. This means that the record is no longer required and its deleted from the CricketTeams TARGET table.

Finally, you can see an INSERT Operation which is being performed on CricketTeams TARGET Table. The new row is getting inserted into the CricketTeams TARGET table because, in CricketTeams_UpdatedList SOURCE Table there is new record with CricketTeamID as “11” appearing, which is not present in CricketTeams TARGET table. Hence the record is INSERTED into CricketTeams TARGET Table.

Conclusion
Using the MERGE statement a developer can reduce the complex TSQL code which is used to implement insert, update and delete logic when loading data into warehouse tables. The use of the Merge statement will also improve query performance.


No comments: