Sunday, November 2, 2008

Grouping Sets in SQL Server 2008

SQL Server 2008 introduces a new feature called GROUPING SETS for SQL Server Database Developers. When a GROUP BY clause is used with the GROUPING SETS feature in SQL Server 2008 it will help you generate a result set which will be equivalent to that generated by a UNION ALL of multiple simple transact SQL Group By statements. A Grouping Sets statement will generate a result set which is equivalent to a result set generated by the use of Group By, Rollup or Cube operations. It is easier to write transact SQL statements using the Grouping Sets clause, as it avoids the overhead of writing many queries and then using UNION ALL to get the desired results. In this article you will be see how to use the new Grouping Sets feature introduced in SQL Server 2008.

Advantages of Using Grouping Sets Clause

* The Grouping Sets feature is really helpful when you want to generate a set of aggregate results and at the same time you want to group by varying columns
* It is much easier to maintain and provides better performance when compared to running different queries against the same data and then finally performing a UNION ALL to get the desired results
* It provides better performance as it is executes once against the data source
* It is much easier to program and use Grouping Sets than writing multiple
select statements

Let us assume that as per business requirement you have been asked to find the count of Cricket Teams based on the following criteria such as CricketTeamShortCode or CricketTeamCountry or CricketTeamContinent. In order to get the desired results in the previous versions of SQL Server you will end up writing as many select statements with group by clause as shown below.

a) Select statement to find the count of CricketTeams by CricketTeamShortCode among the cricket playing nations:

USE GroupingSetsDemo
GO
SELECT CricketTeamShortCode, COUNT(*) FROM CricketTeams
GROUP BY CricketTeamShortCode
GO

b) Select statement to find the count of CricketTeams by CricketTeamCountry among the cricket playing nations:

USE GroupingSetsDemo
GO
SELECT CricketTeamCountry, COUNT(*) FROM CricketTeams
GROUP BY CricketTeamCountry
GO

c) Select statement to find the count of CricketTeams by Continent among the cricket playing nations:

USE GroupingSetsDemo
GO
SELECT CricketTeamContinent, COUNT(*) FROM CricketTeams
GROUP BY CricketTeamContinent
GO

Next you need to run all the queries using the UNION ALL operator to get the desired results. The only drawback with this approach is that you end up writing many select statements. However, it is not very easy to write a statement without many select statements when you have many conditions, especially when the table has many columns. Moreover there will be a performance hit as you will end up running multiple select queries against the same data source. The following query when executed will provide you all the results as desired:

Use GroupingSetsDemo
GO

SELECT CricketTeamContinent, COUNT(*) FROM CricketTeams
GROUP BY CricketTeamContinent

UNION ALL

SELECT CricketTeamCountry, COUNT(*) FROM CricketTeams
GROUP BY CricketTeamCountry

UNION ALL

SELECT CricketTeamShortCode, COUNT(*) FROM CricketTeams
GROUP BY CricketTeamShortCode
GO




Using Grouping Sets Clause Introduced in SQL Server 2008
Now let us see how we can write a select statement using the Grouping Sets clause which was introduced in SQL Server 2008. The result set which we got by running multiple select statements with UNION ALL operator can be obtained by just executing the below mentioned piece of TSQL code:

Use GroupingSetsDemo
GO

/* Using the Grouping Sets clause introduced in SQL Server 2008 */
SELECT CricketTeamShortCode, CricketTeamCountry, CricketTeamContinent, Count(*) AS Count
FROM CricketTeams
GROUP BY GROUPING SETS (
(CricketTeamShortCode),
(CricketTeamCountry),
(CricketTeamContinent),
()
)
ORDER BY CricketTeamShortCode, CricketTeamCountry, CricketTeamContinent
GO

You could see that the syntax is much similar to the GROUP BY clause which was available in the previous versions of SQL Server. In the below snippet you can see the result set which we have obtained by executing the above TSQL code which is using the GROUPING SETS feature:


You can see in the above snippet that by using the Grouping Sets clause we have got the same result set which in previous versions of SQL Server to get similar result set you end up writing multiple queries. Using the Grouping Sets we get the same result set we got by just executing a single TSQL query which makes this feature an excellent choice for developers when developing application on SQL Server 2008.

Conclusion
The Grouping Sets feature is an enhancement to the existing Group By clause of SQL Server. The Grouping Sets feature allows database developers to merge multiple Group By transact SQL queries into a single query. This feature is very useful when you need to collect summary data for different criteria’s as per business requirements. With the introduction of Grouping Sets clause in SQL Server 2008 you don’t have to write multiple select queries to collect summary data. Developers can use this new enhancement if your application has many Group By queries. Using this feature will make your queries much simpler to write and it will also improve the performance of your queries as they are running against the data source once.

No comments: