Tuesday, September 30, 2008

Common Table Expression in SQL Server 2005

When working with SQL 2005, there are times when you need to perform several aggregate functions, or to define a view that needs to be stored only during the execution of a particular query. A Common Table Expression (CTE) does just that; it helps in the definition and manipulation of query related results that need temporary storage. Its scope is limited to the statement that defines it. CTEs provide an easy way of writing and reviewing queries. Another unique and defining element about CTE is that it can self-reference, meaning it can refer to itself as many times as required in the same query. Microsoft developed the Common Table Expression for SQL Server 2005 based on the ANSI SQL-99 standard. Before its introduction, SQL 2000 users were using derived and temporary tables.



Types of Common Table Expression

CTEs can be recursive and non-recursive. In its non-recursive form a CTE can serve as a substitute for derived tables or a view, and can be used in place of user-defined routines. It provides a convenient tool for creating queries by building tables as and when they are required within nested SELECT statements. This simplifies query building by allowing them to be developed in separate logical blocks.

In a recursive CTE, the original CTE is executed repetitively in order to arrive at different subsets of data that together form the complete result set. It is best used in queries that return hierarchical data results like reporting relationships within an organization. A good example would be employees in an organizational chart or products that have subcomponents. You can think of it as a combination of a hybrid derived table and a declared temporary table.



Creating a Common Table Expression

The syntax for creating a CTE consists of the WITH statement followed by the expression name that will identify the CTE and a query definition. If the query definition provides for separate names for the columns that will store the results then a column list is provided immediately after the expression name of the CTE. The basic syntax is:

[
WITH [ ,...n ]
]
::=
expression_name [ ( column_name [ ,...n ] ) ]
AS
( CTE_query_definition )

In the following example, a Common Table Expression "MyCTE" is being defined that has two resulting columns LP and SP. The statement following AS provides the query definition by populating the two column names. MyCTE can be run by using the SELECT statement.

USE AdventureWorks
GO

WITH MyCTE (LP, SP) AS
(
SELECT LP, LP * .95 FROM itemlist.item
)
SELECT * FROM My_CTE
GO


CTEs and Temporary Tables

A CTE can be created using the WITH statement with the CTE name following it. You will find that temporary tables aren't as convenient. For instance, unlike CTEs, temporary tables can be populated only after they have been created and not otherwise. Study the following syntax to note the difference:

CREATE TABLE #My_CTE
(
LP money, SP money
)
INSERT INTO #My_CTE
(LP, SP)
SELECT LP, LP * .95 FROM itemlist.item

Temporary tables as used in SQL 2000, allow them to be called repeatedly from within a statement, whereas in a Common Table Expression, it can be called immediately after stating it. Hence, if you write syntax similar to the following example, there won't be any response to the CTE you call.

USE AdventureWorks
GO
WITH MyCTE (LP, SP) AS
(
SELECT LP, LP * .95 FROM itemlist.item
)
SELECT * FROM itemlist.item
SELECT * FROM My_CTE
GO

Both a CTE and a temporary table can be called by name using the SELECT * statement.



CTEs and Derived Tables

Derived tables in SQL 2000 and CTEs in SQL 2005 are very similar. Derived tables serve in simple queries; however, they suffer from two drawbacks: they can be used only once, and you cannot refer to it by name. CTEs score over derived tables when it comes to complex statements. In fact, CTEs can efficiently handle simple queries too, since they can be rewritten as derived tables as shown below:

SELECT * FROM
(
SELECT LP, (LP * .95) AS SP
FROM itemlist.item
)
MyDerivedTable



Recursive Queries

A recursive query used in SQL Server 2005 refers to a recursive CTE. It allows you to set up arbitrary levels of queries on data that have a tree like structure (hierarchical), for instance, the employee reporting structure in an organization — something that was not possible in SQL 2000. You can either set the number of levels of recursion you want or leave them without any limit depending upon the query required.

Let us study the following example of reporting an organizational hierarchy where a initial subquery is set up to return only those records (employees) that report to the top management (represented by Mng_ID = null). If we limit the number of levels of recursion here, we will be excluding those employees who fall outside the defined chain of command. If we set no limit, no employee is excluded but there is a risk of an infinite recursion occurring if any employee happens to be reporting directly or indirectly to himself, like a director who is also an employee. In such cases, you can join the table to itself once for each level.

A recursive subquery is then set up by using the CTE name "DictRep" to append additional rows to the result set. The two are then connected by using the operator UNION ALL. The first or the initial subquery is nonrecursive and is processed first while the recursive query refers to the rows that were added in the previous cycle. Whenever iteration generates no new rows, recursion comes to a halt. Running the following syntax will display the result set that shows the reporting levels in the organization.

USE AdventureWorks ;
GO
WITH DictRep(Log_ID, Mng_ID, Emp_ID) AS
(
SELECT Log_ID, Mng_ID, Emp_ID
FROM HRs.Emp
WHERE Mng_ID IS NULL
UNION ALL
SELECT e.Log_ID, e.Mng_ID, e.Emp_ID
FROM HRs.Emp e
INNER JOIN DictRep d
ON e.Mng_ID = d.Emp_ID
)
SELECT * FROM DictRep ;
GO

The result set would be:

Log_ID

Emp_ID

Mng_ID

Ken0

109

NULL

David0

6

109

Terri0

12

109

Peter0

21

109

Jean0

42

109

Laura1

140

109

James1

148

109

Brain3

273

109

Stephen0

268


Amy0

284

273

Syed0

288

273

Jae0

290

288

Lynn0

285

284










Recursive CTEs need an anchor member definition and a UNION ALL statement to generate appropriate results.

For all those who have so far been working with temporary tables and derived tables alone, Common Expression Tables are a good option to explore. Unlike recursive routines used in other languages, a recursive CTE can return multiple rows instead of one scalar value. It can handle complex queries and has a syntax that is easy to understand, read and write.

No comments: