Wednesday, October 1, 2008

Differences between SQL Server temporary tables and table variables

Temporary Tables

Temporary tables are created in tempdb. The name "temporary" is slightly misleading, for even though the tables are instantiated in tempdb, they are backed by physical disk and their changes are written to the tempdb transaction log. They act like regular tables in that you can query their data via SELECT queries and modify their data via UPDATE, INSERT, and DELETE statements. If created inside a stored procedure, they are destroyed upon completion of the stored procedure. Furthermore, the scope of any particular temporary table is the session in which it is created, meaning it is only visible to that session. Multiple users could each create a temp table named #TableX, and queries run simultaneously would not affect one another: each session gets its own separate table. You may notice that my sample temporary table name started with a "#" sign. This is how SQL Server identifies that it is dealing with a temporary table.

The syntax for creating a temporary table is identical to creating a physical table in Microsoft SQL Server with the exception of the aforementioned pound sign (#):

CREATE TABLE dbo.#Cars
(
Car_id int NOT NULL,
ColorCode varchar(10),
ModelName varchar(20),
Code int,
DateEntered datetime
)

Temporary tables act like physical tables in many ways. You can create indexes and statistics on temporary tables. You can also apply Data Definition Language (DDL) statements against temporary tables to add constraints and defaults, including primary keys. Foreign keys are the exception: SQL Server does not enforce foreign key constraints on temporary tables. You can also add and drop columns from temporary tables. For example, if I wanted to add a default value to the DateEntered column and create a primary key using the Car_id field, I would use the following syntax:

ALTER TABLE dbo.#Cars
ADD
CONSTRAINT
[DF_DateEntered] DEFAULT (GETDATE()) FOR [DateEntered],
PRIMARY KEY CLUSTERED
(
[Car_id]
) ON [PRIMARY]
GO
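The index, statistics, and column changes mentioned above can be sketched in the same way. This is a minimal illustration rather than a recommended design: the names IX_Cars_ModelName, ST_Cars_ColorCode, and the Notes column are hypothetical, and the script drops and recreates #Cars so that it can be run on its own.

```sql
-- Remove any copy of #Cars left over from earlier in this session.
IF OBJECT_ID('tempdb..#Cars') IS NOT NULL
    DROP TABLE #Cars;

CREATE TABLE #Cars
(
    Car_id      int NOT NULL,
    ColorCode   varchar(10),
    ModelName   varchar(20),
    Code        int,
    DateEntered datetime
);

-- Explicit index and statistics, exactly as on a permanent table.
-- (Both object names are made up for this example.)
CREATE NONCLUSTERED INDEX IX_Cars_ModelName ON #Cars (ModelName);
CREATE STATISTICS ST_Cars_ColorCode ON #Cars (ColorCode);

-- Add a column, then drop it again (Notes is a hypothetical column).
ALTER TABLE #Cars ADD Notes varchar(50) NULL;
ALTER TABLE #Cars DROP COLUMN Notes;

DROP TABLE #Cars;
```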

Table Variables

The syntax for creating table variables is quite similar to creating either regular or temporary tables. The only differences involve a naming convention unique to variables in general, and the need to declare the table variable as you would any other local variable in Transact-SQL:

DECLARE @Cars table (
Car_id int NOT NULL,
ColorCode varchar(10),
ModelName varchar(20),
Code int,
DateEntered datetime
)

As you can see the syntax bridges local variable declaration (DECLARE @variable_name variable_data_type) and table creation (column_name, data_type, nullability). As with any other local variable in T-SQL, the table variable must be prefixed with an "@" sign. Unlike temporary or regular table objects, table variables have certain clear limitations.

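  • You can not run CREATE INDEX against a table variable; the only indexes available are the ones that back PRIMARY KEY and UNIQUE constraints written into the declaration
  • You can not name constraints on table variables, add them after the declaration, or use foreign keys; PRIMARY KEY, UNIQUE, CHECK, and NULL/NOT NULL must be written inside the DECLARE statement
  • You can not add default values to table variable columns after the fact; a DEFAULT must also be written inside the DECLARE statement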
  • Statistics can not be created against table variables

Similarities with temporary tables include:

  • Instantiated in tempdb
  • Clustered indexes can be created on table variables and temporary tables (on a table variable, only through a PRIMARY KEY or UNIQUE constraint in the declaration)
  • Both are logged in the transaction log
  • Just as with temp and regular tables, users can run SELECT queries and all Data Manipulation Language (DML) statements against a table variable: INSERT, UPDATE, and DELETE.
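As a hedged illustration of how a clustered index ends up on a table variable, the sketch below writes a PRIMARY KEY and a DEFAULT directly into the DECLARE statement, which is the only place a table variable accepts them. It reuses the @Cars layout from above; the choice of which columns carry the key and the default is an assumption made for the example.

```sql
DECLARE @Cars TABLE
(
    -- The primary key constraint is what creates the clustered index.
    Car_id      int NOT NULL PRIMARY KEY CLUSTERED,
    ColorCode   varchar(10),
    ModelName   varchar(20),
    Code        int,
    -- A default is accepted here, but could not be added later.
    DateEntered datetime NOT NULL DEFAULT (GETDATE())
);

-- DateEntered is omitted so that the default fills it in.
INSERT INTO @Cars (Car_id, ColorCode, ModelName, Code)
VALUES (1, 'BlueGreen', 'Austen', 200801);

SELECT Car_id, ModelName, DateEntered FROM @Cars;
```

The whole script has to be run as a single batch, since the variable does not outlive it.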

Usage

Temporary tables are usually preferred over table variables for a few important reasons: they behave more like physical tables with respect to indexing, statistics creation, and lifespan. An interesting limitation of table variables comes into play depending on how the code that uses them is executed. The following two blocks of code each create a table: #Cars in the first and @Cars in the second. A row is then inserted into the table, and the table is finally queried for its values.

--Temp Table:
CREATE TABLE dbo.#Cars
(
Car_id int NOT NULL,
ColorCode varchar(10),
ModelName varchar(20),
Code int,
DateEntered datetime
)

INSERT INTO dbo.#Cars (Car_id, ColorCode, ModelName, Code, DateEntered)
VALUES (1, 'BlueGreen', 'Austen', 200801, GETDATE())

SELECT Car_id, ColorCode, ModelName, Code, DateEntered FROM dbo.#Cars

DROP TABLE dbo.[#Cars]

This returns the single row that was just inserted. The table variable version follows:

--Table Variable:
DECLARE @Cars TABLE
(
Car_id int NOT NULL,
ColorCode varchar(10),
ModelName varchar(20),
Code int,
DateEntered datetime
)

INSERT INTO @Cars (Car_id, ColorCode, ModelName, Code, DateEntered)
VALUES (1, 'BlueGreen', 'Austen', 200801, GETDATE())

SELECT Car_id, ColorCode, ModelName, Code, DateEntered FROM @Cars

The results differ, depending upon how you run the code. If you run the entire block of code as one batch, the same single row is returned.

However, you receive an error if you execute the statements separately rather than all together:

Msg 1087, Level 15, State 2, Line 1
Must declare the table variable "@Cars"

What is the reason for this behavior? It is quite simple. A table variable's lifespan is the batch (or stored procedure or function) in which it is declared, not the transaction or the session. If we execute the DECLARE statement first, then attempt to insert records into the @Cars table variable in a separate execution, we receive the error because the table variable has already passed out of existence. The results are the same if we declare and insert records into @Cars in one batch and then attempt to query the table in another. Note also that we execute a DROP TABLE statement against #Cars. This is because the temporary table persists until the session ends or until the table is dropped, so without the DROP the CREATE TABLE statement would fail if the script were run again in the same session.
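A short script makes the lifespan rules easier to see. The sketch below assumes a client such as Management Studio or sqlcmd that treats GO as a batch separator; #Demo and @Demo are names invented for the illustration. It shows two things: a rollback undoes the insert into the temporary table but not the insert into the table variable, and the table variable is gone once its batch ends.

```sql
CREATE TABLE #Demo (id int NOT NULL);
DECLARE @Demo TABLE (id int NOT NULL);

BEGIN TRANSACTION;
    INSERT INTO #Demo (id) VALUES (1);
    INSERT INTO @Demo (id) VALUES (1);
ROLLBACK TRANSACTION;

-- The temp table insert was rolled back; the table variable insert was not.
SELECT (SELECT COUNT(*) FROM #Demo) AS TempTableRows,      -- 0
       (SELECT COUNT(*) FROM @Demo) AS TableVariableRows;  -- 1

DROP TABLE #Demo;
GO

-- New batch: @Demo no longer exists here. Uncommenting the next line
-- raises Msg 1087, "Must declare the table variable".
-- SELECT id FROM @Demo;
```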

So, it would appear that I don't advocate the use of table variables. That is not true. They serve a very useful purpose in returning results from table-valued functions. Take for example the following code for creating a user-defined function that returns rows from the Customers table in the Northwind database for any customers in a given PostalCode:

CREATE FUNCTION dbo.usp_customersbyPostalCode ( @PostalCode VARCHAR(15) )
RETURNS @CustomerHitsTab TABLE (
    [CustomerID] [nchar] (5),
    [ContactName] [nvarchar] (30),
    [Phone] [nvarchar] (24),
    [Fax] [nvarchar] (24)
)
AS
BEGIN
    DECLARE @HitCount INT

    INSERT INTO @CustomerHitsTab
    SELECT [CustomerID],
           [ContactName],
           [Phone],
           [Fax]
    FROM [Northwind].[dbo].[Customers]
    WHERE PostalCode = @PostalCode

    SELECT @HitCount = COUNT(*) FROM @CustomerHitsTab

    IF @HitCount = 0
        --No Records Match Criteria
        INSERT INTO @CustomerHitsTab (
            [CustomerID],
            [ContactName],
            [Phone],
            [Fax] )
        VALUES ('', 'No Companies In Area', '', '')

    RETURN
END
GO

The @CustomerHitsTab table variable is created for the purpose of collecting the results and returning them to whoever calls the dbo.usp_customersbyPostalCode function:

SELECT * FROM dbo.usp_customersbyPostalCode('1010')

SELECT * FROM dbo.usp_customersbyPostalCode('05033')

An unofficial rule of thumb is to use table variables for returning results from user-defined functions that return table values, and to use temporary tables for storage and manipulation of temporary data, particularly when dealing with large amounts of data. However, when smaller row counts are involved, and when indexing is not a factor, table variables and temporary tables perform comparably. It then comes down to the preference of the individual writing the code.
