Friday, February 13, 2009

Passing a Table to a Stored Procedure

SQL Server 2005 and previous versions do not support passing a table variable to a stored procedure.

This article introduces the new feature added to SQL Server 2008, which supports passing a TABLE to a stored procedure or function.

This article is based on SQL Server 2008 CTP 3. Some of the information may change by the time the product is finally released.

Before we create a Function or Stored Procedure that accepts a TABLE variable, we need to define a User Defined TABLE Type. SQL Server 2008 introduced a new User defined TABLE type. A TABLE type represents the structure of a table that can be passed to a stored procedure or function.

So the first step is to create a User Defined TABLE type. The following TSQL code creates a User defined TABLE type named "ItemInfo".

ItemNumber VARCHAR(50),
Qty INT )

You can use the system view SYS.TYPES to see the type that you have just created. The following query returns all the types defined in the system.

/* If you just need to find information about the TABLE types, you could find it from the following TSQL query.*/
SELECT * FROM SYS.TYPES WHERE is_table_type = 1
/* There is another view, which is handy to find information about TABLE types. */

We have created a TABLE type that we need. Now let us see how it works. Let us create a variable of type "ItemInfo" and try to insert a few records to it. Then lets query the table variable to see if the information is correctly inserted. [code]

/* Let us declare a variable of type ItemInfo which is a TABLE Type */

DECLARE @items AS ItemInfo
/* Insert values to the variable */
INSERT INTO @Items (ItemNumber, Qty)

SELECT '11000', 100 UNION ALL

SELECT '22000', 200 UNION ALL
SELECT '33000', 300

/* Lets check if the values are correctly inserted or not */


ItemNumber Qty
-------------------------------------------------- -----------
11000 100
22000 200
33000 300

Now let us create a stored procedure that accepts a TABLE variable. Let us create a very simple stored procedure which accepts a TABLE variable and SELECTs contents of the table.

2 (
3 @Items ItemInfo
4 )
6 AS
9 FROM @Items

Well, this would generate the following error:

1 /*
2 Msg 352, Level 15, State 1, Procedure TableParamDemo, Line 1
3 The table-valued parameter "@Items" must be declared with the READONLY option.
4 */

A table variable that is passed to a stored procedure or function should be marked as READONLY. The "callee" cannot modify the table being passed into it. Here is the correct code.

2 (
3 @Items ItemInfo READONLY
4 )
6 AS
9 FROM @Items

Now let us execute the stored procedure we just created. Run the following code.

1 /*
2 declare the variable
3 */
4 DECLARE @items AS ItemInfo
6 /*
7 Insert values to the variable
8 */
10 INSERT INTO @Items (ItemNumber, Qty)
11 SELECT '11000', 100 UNION ALL
12 SELECT '22000', 200 UNION ALL
13 SELECT '33000', 300
15 /*
16 Execute the procedure
17 */
18 EXECUTE TableParamDemo @Items
20 /*

23 ItemNumber Qty
24 -------------------------------------------------- -----------
25 11000 100
26 22000 200
27 33000 300
29 */

You cannot modify the TABLE parameter passed into the stored procedure. If you try to do so, you will get an error as shown in the following example.

2 (
3 @Items ItemInfo READONLY
4 )
6 AS
9 FROM @Items
11 INSERT INTO @Items (ItemNumber, Qty)
12 SELECT '1001', 20
14 /*
17 Msg 10700, Level 16, State 1, Procedure TableParamDemo, Line 11
18 The table-valued parameter "@Items" is READONLY and cannot be modified.
19 */

The support for TABLE variables is very interesting. While working with User Defined TABLE Type, please note that you cannot use it as a column of a table. Please also note that, once created, you cannot alter the structure of the TABLE.

SQL Server Federated Database Performance Tuning

For very large databases, consider using federated database servers to balance the processing load across multiple SQL 2000 or 2005 servers. This technique horizontally partitions SQL 2000 or 2005 data over one or more SQL Servers, allowing the client application to send SQL statements to the server in the federation having most of the data required by the statement. This way, the query load is spread over multiple SQL 2000 or 2005 servers.

Federated database servers works best for databases that can be naturally partitioned over multiple servers. For example, if the data can be segregated by product line or geographical location, then the data can easily be separated over multiple databases in a federation.

Here's a federated database might work. Let's say that a company has customers in North America, South America, Europe, and Asia, and that the data stored for every customer, no matter where they are located, is identical. One option would be to partition the data based horizontally on continent. In this case, four different SQL Servers would be needed in the federation, each storing its respective data.

In almost all cases, when a query is run against the customer data stored in the federated SQL Servers, the data will all be related to a specific continent. Because of this, only one SQL Server will be hit with the query, not all of the SQL Servers. And assuming that customers in all continents will be queried, the separation of customers by continent on different SQL Servers will spread the queries among all of the servers, allowing greater performance and scalability. [2000, 2005]

Sunday, February 8, 2009

String comparison: binary vs. dictionary

It is well known that you get better performance if you compare two strings in a binary collation than in a dictionary collation. But is the performance difference significant enough to warrant the use of an explicit COLLATE clause in the string comparison expression? That was a question came up in a conversation I had with a colleague recently.

To get a feel for the extent of the performance difference, I ran several tests comparing two commonly-used collations:

· Latin1_General_BIN, and
· SQL_Latin1_General_CP1_CI_AS

More specifically, I ran the following T-SQL script in a loop for 100 times:

DECLARE @s1 varchar(max), @s2 varchar(max), @i int
DECLARE @dummy int, @dt datetime
DECLARE @length int

SET @length = 1000000 -- or 100, or 1000000

SELECT @s1 = REPLICATE(CAST('a' as varchar(max)), @length)
SELECT @s2 = REPLICATE(CAST('a' as varchar(max)), @length)

SELECT @i = 1, @dt = GETDATE()

WHILE @i < 1000
IF @s1 = @s2 COLLATE SQL_Latin1_General_CP1_CI_AS
--IF @s1 = @s2 COLLATE Latin1_General_BIN
SET @dummy =0
SET @i = @i + 1


The script was run in a number of scenarios:

· The @length variable was set to 100, 10,000, and 1,000,000 to see the impact of the string length on the comparison method,

· The string comparison was done in the IF clause with either the Latin1_General_BIN collation or the SQL_Latin1_General_CP1_CI_AS collation

The most salient point to note is that as the string length increases, the performance difference between the two comparison methods increases. While there was no performance difference when the strings were 100 bytes long, the binary comparison was about twice as fast as the dictionary comparison when the string length was 10,000 bytes. When the string length increased to 1,000,000, the binary comparison was about seven times faster.

So, if you are comparing two short strings, you probably shouldn’t bother to explicitly cast the comparison into a binary collation. But when the strings can be very long, it can result in a huge performance improvement to explicitly specify a binary collation for string comparison.

Note that string comparison in COLLATE Latin1_General_BIN and string comparison in COLLATE SQL_Latin1_General_CP1_CI_AS have different semantics. But there are many cases where you have control over your data, and the difference in comparison semantics does not matter (e.g. when you know your data is all in lower case).

Nothing is particularly new here. Just want to contribute some data points to the community.

Sunday, February 1, 2009

Indexed View Performance Tuning and Optimization

If your application needs to access views often, consider adding a unique clustered index to your views to significantly improve performance. When a view is created with a unique clustered index, the view is created with the clustered index, which is created and stored in the database the same way as a clustered index on a table.

Once a unique clustered index has been created for a view, you can also create non-clustered indexes for the same view, which can be used by queries against the view to enhance performance.

As the underlying tables of the view are modified, the clustered index, and any non-clustered indexes of the view, is modified so that it is always up-to-date when it is accessed. And just like indexes on tables, indexes on views experience modification overhead. So only add an index to a view if the benefit of its speed increase when running exceeds the time it takes to update the view's index.

Indexed views can be used by SQL Server two different ways. First, the view can be called directly from a query, as conventional views are currently used. But instead of running the view's underlying SELECT statement and creating the view's result set on the fly, it uses the unique clustered index to display the results of the view almost immediately. Second, any query that is run on SQL Server 2000/2005 is automatically evaluated to see if any existing indexed views exist that would fulfill the query. If so, the Query Optimizer uses the indexed query, even though it has not been specified in the query, greatly speeding the query.

To get the most benefit out of indexed views, you need to use the SQL Server 2000/2005 Enterprise Edition. While you can create indexed views in the other editions of SQL Server 2000/2005, they will not be automatically considered by the query optimizer, and they require the use of the NOEXPAND hint to be used. [2000, 2005]