Sunday, December 28, 2008

Indexes for aggregates

It’s well known that indexes on columns used in where clause and for joins is a good thing in SQL, but what about other places. How about on aggregates?

Consider a simple table with an amount and a customerID. It’s a common requirement to calculate the total amount that each customer has paid. No conditions are enforced, so this would seem like a place where an index won’t help. Well, let’s see. (sample code at end)

The clustered index (and hence the physical order of the rows) is on the identity column.Take the following query.
SELECT CustomerID, SUM(Amount) FROM Payments group by customerID

Without any additional indexes, SQL will execute that as a hash match (aggregate) which comprises 63% of the query’s cost. That is because the data is not ordered by the grouped column, and hence cannot be simply be summed.

Now add an index on CustomerID, Amount and run the same query. Now, the aggregate can be satisfied just with the non clustred index and, more importantly, the data is in order of customerID, so just a stream aggregate is required and the data does not have to be resorted or run through a hashing algorithm. Moreover, the stream aggregate is only 15% of the query’s cost

CustomerID int,
InvoiceID int,
PaymentDate datetime,
Amount Numeric(17,5)

;WITH DataPopulate (RowNo, CustomerID ,InvoiceID ,ADate, Amount) AS (
SELECT 1 AS RowNo, FLOOR(RAND()*40) as CustomerID, FLOOR(RAND()*200) as InvoiceID,
DATEADD(dd, FLOOR(RAND()*75454),'1753/01/01'), rand()*856542 AS Amount
SELECT rowNo+1, FLOOR(RAND(RowNo*85455)*40) AS CustomerID,
FLOOR(RAND(RowNo*85455)*500) AS InvoiceID,
DATEADD(dd, FLOOR(RAND(RowNo*96322)*85454),'1753/01/01'),
rand(RowNo*25411)*86542 AS Amount
FROM DataPopulate WHERE RowNo<10000
INSERT INTO Payments (CustomerId, InvoiceID, PaymentDate, Amount)
SELECT CustomerID ,InvoiceID ,ADate, Amount FROM DataPopulate

When did CheckDB last run?

If corruption is found in a database it’s very important to know when the corruption started. In the cases where a restore is needed to fix the corruption, knowing which backup is clean can save a great deal of time in restoring databases and checking them.

On SQL 2000, the only way to know was to to go back through the history of the checkDB jobs (everyone is running checkDB regularly, right?) and see when the last successful run was.

On SQL 2005 and higher, when checkDB runs successfully, ie without finding any errors, it writes the date of the last run into the database’s boot page. Whenever SQL Server then starts that database up, be it during a server start, a database restore or bringing the database online, SQL prints a message into the error log.

CHECKDB for database ‘Testing’ finished without errors on 2008-12-22 10:20:06.007 (local time).

This makes it very easy to see when the database was last known to be clean (without any corruption). Note that SQLL is not running checkDB when it prints that. It’s simply printing out the date that CheckDB last completed without finding any problems. The information can also be read out of the boot page using an undocumented DBCC command.

1. Use Testing
2. GO


This prints out a whole bunch of information, all found in the database’s boot page. The info of interest here is in the second block (on SQL 2008 RTM)

dbi_dbccLastKnownGood = 2008-12-22 10:20:06.007

If the date listed is 1900-01-01 00:00:00, it means that CheckDB has never run successfully on that database. If that is the case, there will be no entries in the error log when the database is started.

Friday, December 26, 2008

Measuring Performance of Web Server

Measuring Web server performance is a skill that can only be refined by repeated experience and experimentation. There are many variables at play, such as the number of clients, speed of client connections, server resources, application code, and so on. It helps to have good tools at your disposal, and fortunately those are available.

Microsoft provides the Web Application Stress (WAS) tool, which simulates multiple HTTP clients hitting your Web site. You can control the client load, number of connections, format of cookies, headers, and several other parameters from the tool's graphical interface. After a test run, WAS provides you with reports containing performance metrics such as response time, throughput, and performance counter data relevant to your application. The goal is simple: to maximize throughput and CPU utilization under high degrees of load. WAS is available from the Microsoft Internet Information Server Resource Kit and is also downloadable separately from

ASP.NET also exposes a number of performance counters that can be used to track the execution of your applications. Unlike traditional ASP, most of these performance counters are exposed per-application, instead of globally for the entire machine. The per-application counters are available under the ASP.NET Framework applications performance object, and you need to select a particular application instance when selecting a counter to monitor. Of course, you can still see the counter values for all applications using a special "__Total__" application instance in System Monitor. ASP.NET also exposes global-only counters which are not bound to a particular application instance. These counters are located under the ASP.NET System performance object. To view all available counters for ASP.NET (on Windows 2000 systems):

1. Select Start->Programs->Administrative Tools->Performance.
2. Click the View Report button in System Monitor.
3. Click the Add button.
4. Select ASP.NET Applications, then choose the All counters radio button. Click OK.
5. Select ASP.NET, then choose the All counters radio button. Click OK.

The ASP.NET Trace feature is also useful for identifying performance bottlenecks in your code. It can show you important timing information between successive trace output statements, as well as information about the server control heierarchy, the amount of viewstate used, and the render size of controls on your page.

Thursday, December 25, 2008

Using xp_ReadErrorLog in SQL Server 2005

I would like to share some interesting parameters I found for the undocumented extended stored procedure xp_ReadErrorLog. In doing some testing with this extended stored procedure I found four very interesting parameters. Adding to some of the articles already on the web that discuss undocumented stored procedures, in this article I will explain my testing, use and some examples of the procedure.

Parameters revealed

While working on some system startup procedures that would be making use of the xp_ReadErrorLog extended stored procedure, I came across some very interesting and useful parameters for the procedure. In testing I discovered some of the hidden parameter options that are similar, but still different in the way the extended stored procedure works from version SQL 2000 to SQL 2005. These are SQL 2005 only options. The parameter data types and size were determined by investigating the undocumented stored procedure sp_ReadErrorLog that uses the extended stored procedure.

Well, the interesting part starts now with the parameters. As in the previous versions parameter 1 reads the error log number passed to it, where the default "0" reads the current log.


LogDate ProcessInfo Text
2008-03-04 12:11:01.450 Server Microsoft SQL Server 2005 - 9.00.3159.00 (Intel X86) Mar 23 2007 16:15:11
2008-03-04 12:11:01.500 Server (c) 2005 Microsoft Corporation.
2008-03-04 12:11:01.500 Server All rights reserved. 2008-03-04 12:11:01.500 Server Server process ID is 1284.
2008-03-04 12:11:01.500 Server Authentication mode is MIXED.
2008-03-04 12:11:01.510 Server Logging SQL Server messages in file 'D:\SRVAPPS\MSSQL.1\MSSQL\LOG\ERRORLOG'.

Now let's investigate Parameter (2). It turns out that a value of 1 (default) instructs the procedure to read the SQL error log. By passing a value of 2, the SQL server Agent log is read. Yes, the Agent log! So for example: xp_ReadErrorLog 0, 2 reads the current SQL server Agent log. Also note when using parameter 2 with the extended stored procedure that the column heading returned also changes.

xp_ReadErrorLog 0,2

LogDate ErrorLevel Text
2008-03-04 12:11:10.000 3 [393] Waiting for SQL Server to recover databases...
2008-03-04 12:11:14.000 3 [100] Microsoft SQLServerAgent version 9.00.3042.00 (x86 unicode retail build) ..
2008-03-04 12:11:14.000 3 [101] SQL Server xxxxxxxx version 9.00.3159 (0 connection limit)
2008-03-04 12:11:14.000 3 [102] SQL Server ODBC driver version 9.00.3042 2008-03-04 12:11:01.450 Server

Now we know that we can read both SQL logs (error and agent) for any log file, so now let's look at parameter (3). For those times when you need to find some value in the logs and have used the old trick/process of reading the extended stored procedure into a table and then searching through the table to find a value, we now have parameter (3). Parameter 3 is a search string that can be used to return just the log entry rows that contain the value in the search string. And to make it even better or to refine the search further, parameter 4 is also a search string.

An extra feature of this new version is that parameters 3 and 4 can be used in conjunction with each other for searching SQL error log (parameter 2 = 1) or SQL Agent log (parameter 2=2). So for example, xp_ReadErrorLog 0,1,'failed' will read the current SQL error log and return only rows that contain "failed". For an example of using parameter (4) example, xp_ReadErrorLog 0,1,'failed','login' will read current SQL error log returning only rows that contain "failed" and "login" in the same row. This makes it quite easy for retrieving those log entries for failed user logins from the SQL error logs, or maybe looking for those failed Agent jobs. Or those occasional times when you need to quickly find the port SQL started on or what machine the cluster is executing on.

xp_ReadErrorLog 0, 1, 'Failed', 'login'

LogDate ProcessInfo Text
2008-03-04 12:11:12.340 Logon Login failed for user 'Domain\xxxxxx'. [CLIENT: ]
2008-03-04 15:29:08.710 Logon Logon failed for login 'NT AUTHORITY\NETWORK SERVICE' due to trigger execution.
2008-03-04 15:29:08.710 spid54 The client was unable to reuse a session with SPID 54, which had been reset...

Parameters defined

* Parameter 1 (int), is the number of the log file you want to read, default is "0" for current log. The extended stored procedure xp_enumerrorlogs will come in handy in determining what SQL server error logs or SQL server Agent logs exist and when they were created. NOTE: extended stored procedure xp_enumerrorlogs parameter (1) works just like xp_ReadErrorLog parameter (2). A value of 1 shows available SQL error logs and a value of 2 shows Agent logs. The default value is 0.
* Parameter 2 (int), value of 1 reads SQL error logs, value of 2 reads SQL Server Agent logs, with a default value of 1.
* Parameter 3 varchar (255), is a search string for the log entry, with a default value of NULL.
* Parameter 4 varchar (255), is another search string for the log entry, with a default value of NULL.


An extended stored procedure is a dynamic link library that runs inside SQL server. It can execute from Query Analyzer or SQL Server Management Studio (SSMS) for example. In most cases these extended stored procedures can only be executed by users with sysadmin privileges. Also note as Microsoft has always said about undocumented processes may change in future release.

Tuesday, December 23, 2008

Extract User-Role Mapping

This script can be used with SQL 2000 and 2005. For example if you have 10 users in current database, it will retrieve all Group information along with default database, if the user has a login account else if will return a blank.

--Create a temp table that will hold the main result. This script will check your version, if SQL 2005 or 2000,
--because sp_helpuser return 6 values in SQL 2000 and 7 values in SQL 2005.
Create table #tmpUserPerm
(UserName varchar(100),
GroupName varchar(100),
LoginName varchar(100),
DefDBName varchar(100),
UserId int,
SID varbinary(100),
DefSchemaName varchar(100) null)

Declare @name varchar(100)
Declare @ver varchar(100)

--Create a temp table that will store all users except DBO and GUEST. If you want all users then
--you can remove "and name not in ('DBO', 'GUEST')" from the following statement.

select uid, name into #TmpUser from sysusers
where issqluser = 1 and hasdbaccess <> 0 and name not in ('DBO', 'GUEST')

--Execute the below query to get current version of SQL SERVER
set @ver = convert(varchar(100),SERVERPROPERTY('productversion'))

if (@ver = '9.00.3054.00') --If SQL 2005 then
--Run a cursor for all users
declare cur Cursor for Select name from #Tmpuser
open cur
fetch next from cur into @name
while @@fetch_Status = 0
--Get data from sp_helpuser for current value of user (@NAME)
insert into #tmpUserPerm (UserName, GroupName, LoginName, DefDBName, DefSchemaName, UserId, SID)
Exec sp_helpuser @name
fetch next from cur into @name
close cur
deallocate cur
drop table #Tmpuser
select * from #tmpUserPerm order by 1
drop table #tmpUserPerm
else --If SQL SERVER 2000 or other
--Run cursor for all the user names
declare cur1 Cursor for Select name from #Tmpuser
open cur1
fetch next from cur1 into @name
while @@fetch_Status = 0
--Get data from sp_helpuser for current value of user (@NAME)
insert into #tmpUserPerm (UserName, GroupName, LoginName, DefDBName, UserId, SID)
Exec sp_helpuser @name
fetch next from cur1 into @name
close cur1
deallocate cur1
drop table #Tmpuser
select username, groupname, loginname, defdbname from #tmpUserPerm order by 1
drop table #tmpUserPerm

Performance Tuning SQL Server Hardware

Generally, it is better to run fewer SQL Server databases on more smaller servers than it is to run many databases on one larger server. Here are some reasons why:

* Purchasing several smaller SQL Servers may be less expensive than buying one huge SQL Server, although this may not seem logical to many people not familiar with server hardware. Generally, smaller servers can be purchased at commodity prices, while very large servers are special order and have a cost premium associated with them.

* In many cases, your current, older SQL Server may not be upgradeable, or if it is, it may not be cost-effective to upgrade it as compared to purchasing new physical servers.

* As physical servers get larger (more CPUs), there is more and more CPU overhead generated, which in effect reduces the overall performance of the server. Each additional CPU adds a decreasing amount of additional CPU power. Two 4-CPU servers are more efficient from a CPU perspective than a single 8-CPU server.

* Some databases need to be tuned differently than other databases. If all of your databases are located on a single server, then you can't take advantage of SQL Server-wide performance tuning techniques to tune each separate database as each database will have to share all SQL Server-wide performance tuning settings.

* If the single, large server goes down, then all the databases go down. If the databases are on separate servers, then fewer databases go down.

* Most SQL Server-based applications grow over time. If each database is located on its own server, then it is easier to add incremental hardware (such as RAM) to the servers that need it. On a large server, you may find that you can't expand beyond a certain point.

[6.5, 7.0, 2000, 2005]

Sunday, December 21, 2008

Enabling vardecimal storage format

First, this feature is only available in EE and Dev SKU.

Enabling vardecimal storage format on a table is a two step process as follows:

First you need to enable database for Vardecimal storage format. This can be done using the stored procedure sp_db_vardecimal_storage_format. The exact command is as follows

exec sp_db_vardecimal_storage_format '', 'ON'

When the above command is executed, SQL Server internally bumps the database version number but no tables are enabled for Vardecimal storage format. The database version needs to be bumped to indicate that the data in this database can potentially have a different storage format (i.e. the Vardecimal storage format). This is used to prevent attaching a Vardecimal enabled database to earlier versions of SQL Server 2005 as those versions don’t know how to interpret the new storage format. You can only enable Vardecimal storage format on user databases.

To find out which database(s) is enabled for Vardecimal storage format, you can use the following command

exec sp_db_vardecimal_storage_format

Once you have enabled the database for Vardecimal storage format, you can now choose to enable one or more tables (based on the potential disk savings using the tool described earlier) with this new storage format as follows

sp_tableoption '', 'vardecimal storage format', 1

This command, potentially an expensive one (same order as creating an index), converts all the rows in the table containing columns of type decimal/numeric to Vardecimal storage format. During this conversion, the table is locked and is not available. If the table has no clustered index, then all non-clustered indexes are rebuilt because the RIDs of the rows will change due to storage format change. However, if you have clustered index on the table, then only the non-clustered indexes containing decimal numeric column as key or included column need to be rebuilt. Note, that you cannot enable vardecimal storage format on all tables. Before enabling vardecimal storage format, SQL Server needs to make sure that we can always revert back to static storage format for decimal data and that update of decimal/numeric data always succeeds. If these conditions are not satisfied, the conversion to vardecimal storage format is denied.

To disable Vardecimal storage format on the table, you can use the following command

sp_tableoption '', 'vardecimal storage format', 0

SQL Server guarantees that you can always revert back to ‘static’ storage format unless you run out of disk space during conversion. Note, that the space overhead, to enable/disable Vardecimal storage format, is of the same order as building an index and it is not an online operation.

You can use the following command to find out which tables(s) has been enabled for Vardecimal storage format

select objectproperty(object_id('’), 'TableHasVarDecimalStorageFormat')


select name, object_id, type_desc
from sys.objects
objectproperty(object_id, N'TableHasVarDecimalStorageFormat') = 1

Reducing the Size of your Database in SQL Server 2005/SP2

An exciting new feature in SQL Server 2005/SP2 is Vardecimal Storage Format. This storage format lets you reduce the size of your table significantly if the table has one of more columns of type decimal or numeric without requiring any changes to your application.

Up until now, the decimal and numeric types are stored as fixed length data in SQL Server. Both of these types are functionally equivalent and have a format of (p, s) where p is the precision (number of decimal digits) and s is the scale representing number of digits after the decimal. Depending on the precision (it can be declared in the range from 1 to 38), the decimal value can take anywhere from 5 bytes to 17 bytes. This can bloat the size of the table, especially when you have small decimal values for a column declared with high precision requirement. This issue is similar to char (17) vs. varchar(17). In this case, if most of your character data is 1 or 2 characters long but the max value is 17 characters long, you can reduce the size of the table by declaring the column to be of type varchar(17) instead of char(17).

The new vardecimal storage format stores the decimal/numeric values in a variable length storage format. It provides efficient storage of decimal/numeric data by eliminating the leading/trailing zeros and only storing the minimum required bytes. Using this format, you can get significant space savings (depending on your data distribution) in the space required to store decimal/numeric data. You can enable vardecimal storage format at a table level.

In our in-house testing, we have seen significant reduction in the size of the FACT table(s) that has large number of decimal columns. FACT tables are typically the largest table in a Data Warehouse. Here are some the numbers from our testing.

Best case reduction in the size of the table




Estimating the space savings with vardecimal storage format

Before enabling Vardecimal storage format, you may want to know the potential reduction in the size of the table. Clearly, if the table has no decimal/numeric columns, there will be no savings. Note, that even if you have a table with decimal/numeric column types, there is no guarantee that you will be able to reduce the size of the table by enabling Vardecimal storage format. Again, this issue is similar to VARCHAR (17) vs. CHAR(17). If all the values in the column type has 17 characters, then average row length will be larger with VARCHAR(17) because it will be stored in the variable portion of the record structure. Recall, you need 2 bytes to store the offset of the variable length column. Also, if VARCAHR(17) is the only variable length column in the table, there is another overhead of 2 bytes to store number of variable length columns in the row. So in this case, the worst case, declaring column type as VARCAHR(17) may cost you 4 bytes more for each row than CHAR(17).

SQL Server 2005/SP2 provides you a tool, a stored procedure, to estimate the ‘reduction in row size’ with Vardecimal storage format. The following example illustrates the reduction in row size for two tables that have same scheme but different data, t_decimal being the best case and t_decimal2 being the worst case (where each decimal value has max 38 digits as allowed by the declared precision)

create table t_decimal (c1 int, c2 decimal(10,2), c3 decimal (38,2), c4 varchar(10))


create table t_decimal2 (c1 int, c2 decimal(10,2), c3 decimal (38,2), c4 varchar(10))


-- insert rows into these tables.

declare @i int

select @i = 0

while (@i < 1000)


insert into t_decimal values (1, 0.0,0.0, 'hello')

insert into t_decimal2 values

(1,12345678.99,123456789012345678901234567890123499.99, 'hello')

set @i = @i + 1


-- Now let us find the potential space savings for each of these tables

-- This is the best case

exec sys.sp_estimated_rowsize_reduction_for_vardecimal 't_decimal'

Here is the output. Note, in this case, you can reduce the size of the row by almost 50%. Also, if you have more decimal/numeric columns, the savings will be proportionally larger.

avg_rowlen_fixed_format avg_rowlen_vardecimal_format row_count

--------------------------------------- --------------------------

46.00 24.00 1000

-- this is worst case. Note in this case, the average row length actually increases

-- with Vardecimal storage format.


exec sys.sp_estimated_rowsize_reduction_for_vardecimal 't_decimal2'

avg_rowlen_fixed_format avg_rowlen_vardecimal_format row_count

------------------------- ---------- ---------------- ------
46.00 48.00 1000

SQL Server Storage Engine -Enabling compression on a HEAP

I often get a question how to do enable compression on a table that is a heap (i.e. it does not have a clustered index). Clearly, one could create the clustered index with compression option set to PAGE and then drop the clustered index. This is an expensive operation because

· Creating a clustered index requires a SORT

· When you drop the clustered index, internal allocation structure, namely the PFS (Page Free Space) needs to updated to reflect the free space on each page. Note, when a clustered index is dropped, the resultant heap retains the compression state.

Instead, in SQL Server 2008, there is a new REBUILD command available that can be used to rebuild a HEAP. You can use this command to enable compression on the HEAP. Also, you can use this command to defragment a HEAP. You may wonder what is there to defrag in a heap because the pages don’t need to be traversed in the key order? Well, it is possible the pages in a HEAP are not full so the defragmentation can be used to reclaim the free space. In any case, let us get back to compression. Let me start with the following example

create table t1_big (c1 int, c2 int, c3 char(1000))


- Load the data

declare @i int

select @i = 0

while (@i < 5000)


insert into t1_big values (@i, @i + 5000,

replicate('a', 500))

set @i = @i + 1


- Look at the current row size

- The following query returns a length of 1015

select max_record_size_in_bytes

from sys.dm_db_index_physical_stats (db_id('compression'),


null, null, 'DETAILED')

where object_name (object_id) like '%t1_big%'

-- Now, you can use the following NEW command to enable

-- PAGE compression. You could have used this command to

-- enable ROW compression or even NONE compression. As

-- part of rebuilidng the HEAP, it is also defragmented.

-- This command can also be executed ONLINE

alter table t1_big rebuild

with (

data_compression = PAGE)

-- compression setting shows the PAGE compression

select object_name (object_id) as table_name, data_compression_desc

from sys.partitions

where object_name (object_id) like '%t1_big%'

-- check the length again and it returns a value 9

-- Note, the example I have used here is not realistic so it

-- show almost 99% compression but you get the idea.

select max_record_size_in_bytes

from sys.dm_db_index_physical_stats (db_id('compression'),


null, null, 'DETAILED')

where object_name (object_id) like '%t1_big%'

There are few restrictions on ALTER TABLE REBUILD command as follows

(1) If you use ONLINE option then HEAP rebuild is single threaded. Note, this restriction is not there when you are rebuilding an index which supports multi-threaded ONLINE build. However, in both cases, OFFLINE build does support multi-threading.

(2) When you rebuild a HEAP and specify compression option, the corresponding non clustered indexes are also rebuilt because RIDs will change but the compression state of the indexes will not change. If you want to alter compression option of existing nonclustered indexes, you will need to execute ‘Alter Index’ command on each such indexes. So effectively, you will need to rebuild the nonclustered index twice if you want to change the compression setting both for the HEAP and its associated nonclustered indexes. Here is the rationale for this

a. Most index rebuild options are not applicable to HEAP. For this reason, we did not add ‘ALL’ keyword like the one with Alter Index command.

b. Since indexes are typically much smaller compare to the table, we think most customers will not want to change compression state on indexes automatically as part of rebuilding the HEAP or clustered index for that matter.

How to track database growth across multiple SQL Server instances

It is easy to track database growth on a single SQL Server instance. We simply just need to store the results of sp_databases or loop through the databases and call sp_spaceused for each database.

If you support hundreds of SQL instances like I do, you'd want to store the database growth information in a central repository. From this central server, you could create a linked server for each SQL Server instance to track, but I hate creating linked servers. I especially hate having to create hundreds of them on one SQL Server instance. Instead of using linked servers, I created a CLR stored procedure. It requires one table.

You can download the code here. It includes the C# source code as well as the dll for the CLR object and a sample SQL script file to get it setup on your central server.

Once you have set it up, you can create a SQL job to call it. If you have a small number of SQL instances to administer, you can simply add multiple calls to isp_DatabaseGrowth, like this:

EXEC dbo.isp_DatabaseGrowth 'Server1\Instance1'

EXEC dbo.isp_DatabaseGrowth 'Server2'

If you have a large number of SQL instances to administer, I recommend looping through a table that contains one row for every SQL instance. Here is what my job step looks like:

DECLARE @serverId int, @serverName sysname, @max int

SET @serverId = 1

SELECT IDENTITY(int, 1, 1) AS ServerId, ServerName
INTO #Server
FROM dbo.Server
WHERE ServerName NOT IN ('Server1\Instance2', 'Server1\Instance3', 'Server3') --exclude certain SQL instances

SELECT @max = MAX(ServerId)
FROM #Server

WHILE @serverId <= @max
SELECT @serverId = ServerId, @serverName = ServerName
FROM #Server
WHERE ServerId = @serverId

EXEC dbo.isp_DatabaseGrowth @serverName

SET @serverId = @serverId + 1


Here's the DDL for the Server table:

CREATE TABLE [dbo].[Server]
[ServerName] [sysname] NOT NULL,
[ServerName] ASC

If any of your databases were upgraded to SQL Server 2005, the data returned from sp_spaceused/sp_databases may contain incorrect data due to row count inaccuracies in the catalog views. Make sure to run DBCC UPDATEUSAGE on your databases after an upgrade to SQL Server 2005. Databases that were created in SQL Server 2005 do not have this issue.

Link Servers

When you create a link between two SQL Servers, SQL Server does its best to perform as much work as it can on the remote server. This way, less data has to be moved from the remote server to the local server, helping to reduce overhead and boosting performance. But for some particular operations, they have to be performed on the local server, not the remote server. Some examples of locally performed operations include:

* Data conversion operations
* Queries that use the bit, timestamp, or uniqueidentifier data types
* Queries that use the TOP clause

Because of this, you may want to try to avoid performing any of the above operations using a remote linked server.

If you are running a remote query against a linked server, and you want to find out which parts are performing on the remote server and which are performing on the local server, run the query from Query Analyzer or Management Studio and take a look at the query plan. It will tell you what part of your query is running where. It should be your goal to create code that runs mostly on the remote server, not the local server. [7.0, 2000, 2005]

Thursday, December 18, 2008

Optimizing SQL Server Non-Clustered Indexes

In some cases, even though a column (or columns of a composite index) has a non-clustered index, the Query Optimizer may not use it (even though it should), instead performing a table scan (if the table is a heap) or a clustered index scan (if there is a clustered index). This, of course, can produce unwanted performance problems.

This particular problem can occur when there is a data correlation between the order of the rows in the table, and the order of the non-clustered index entries. This can occur when there is correlation between the clustered index and the non-clustered index. For example, the clustered index may be created on a date column, and the non-clustered index might be created on an invoice number column. If this is the case, then there is a correlation (or direct relationship) between the increasing dates and the increasing invoice numbers found in each row.

The reason this problem occurs is because the Query Optimizer assumes there is no correlation, and it makes its optimization decisions based on this assumption.

If you run into this problem, there are three potential resolutions to this problem:

* If possible, reorder the non-clustered index column (assuming a composite index) so that the column with the highest cardinality is the first column in the composite index.
* Create covering indexes.
* Add index hints to your queries to overrule the Query Optimizer.

Tips for building a Work Breakdown Structure

A Work Breakdown Structure (WBS) is the best way to understand the detailed work of the project when you have to build a schedule from scratch. It’s used to break the project down into the major phases, deliverables, and work components that will be built by the project. These work components can then be broken down into the activities that are required to build them. The WBS is not the same as the final schedule (which requires sequencing, resources, estimated effort, estimated duration, etc.).

Here are five tips to keep in mind when building your WBS:

#1: Create a WBS dictionary for large projects
Normally you wouldn’t need a WBS dictionary, but if your WBS has hundreds (or thousands) of detailed activities, there may just be too much to keep track of by hand. In this case, it might make sense to place all of the important information in a WBS dictionary. The dictionary helps keep track of all of the summary and detailed activities, including a short description, the WBS numeric identifier (1.1, 1.1.1, 1.1.2, etc.) and the estimated effort. If you enter your WBS dictionary into a specialized tool, the tool can also help to keep track of changes to the WBS as well.

#2: Use the summary activities as milestones
Your WBS should contain both detailed and summary activities. (A summary activity is one that is broken down further; a detailed activity is one that is not broken down further.) Although a schedule usually includes only detailed activities, it makes sense to include the summary activities as milestones (i.e., markers signifying that a deliverable or set of deliverables is complete). A summary activity can be used as a milestone since it would indicate that all of the underlying detailed work has been completed.

#3: Break activities into two or more detailed activities

I’ve seen teams that break one activity in the WBS into only one activity at the next level. In my opinion, this doesn’t make sense because then the detailed activity represents the same work as the prior summary activity. This doesn’t buy you anything.

#4: Make the final detailed activities action oriented
The detailed activities on your WBS (the ones that are not broken down further) are ultimately moved to your schedule. For that reason, it’s easier if the detailed activities in your WBS are action oriented — just as activities in your schedule would be. For example, instead of describing a detailed WBS activity as “meeting,” you should state it as “schedule a weekly meeting.” Instead of having a WBS detailed activity for “Testing Plan,” you should state it instead as “Create Testing Plan.” In this way, the detailed activities can be moved to the schedule with a minimum of wording changes.

#5: Don’t place requirements on the WBS
If you place a deliverable on your WBS, you can break this deliverable down into the activities that are required to create it. You don’t break a deliverable down into the requirements that describe it. Requirements do not belong on a WBS. Only deliverables and activities belong on the WBS.

By using these five techniques, you will save you time and rework on your next WBS.

Wednesday, December 17, 2008

The industry’s 10 best IT certifications

IT certifications boast numerous benefits. They bolster resumes, encourage higher salaries, and assist in job retention. But which IT certifications are best?

Technology professionals generate much debate over just that question. Many claim vendor-specific programs best measure a candidate’s skills, while others propose vendor-independent exams are the only worthy way of measuring real-world expertise. Still other observers believe the highest-level accreditations — Microsoft’s MCSE or new Architect Series certification, Cisco’s CCIE, etc. — are the only credentials that truly hold value.

Myself, I don’t fully subscribe to any of those mindsets. The best IT certification for you, after all, is likely to be different from that for another technology professional with different education, skills, and goals working at a different company in a different industry. For that reason, when pursuing any professional accreditation, you should give much thought and care to your education, experience, skills, goals, and desired career path.

Once a career road map is in place, selecting a potential certification path becomes much easier. And that’s where this list of the industry’s 10 best IT certifications comes into play. While this list may not include the 10 best accreditations for you, it does catalog 10 IT certifications that possess significant value for a wide range of technology professionals.

The new-generation Microsoft Certified IT Professional credential, or MCITP for short, is likely to become the next big Microsoft certification. Available for a variety of fields of expertise — including database developer, database administrator, enterprise messaging administrator, and server administrator — an MCITP validates a professional’s proven job-role capabilities. Candidates must pass several Microsoft exams that track directly to their job role before earning the new designation.

As with Microsoft’s other new-generation accreditations, the MCITP certification will retire when Microsoft suspends mainstream support for the platforms targeted within the MCITP exams. By matching the new certification to popular job roles, as has been done to some extent with CompTIA’s Server+ (server administrator), Project+ (project manager), and A+ (desktop support) certifications, Microsoft has created a new certification that’s certain to prove timely, relevant, and valuable.

#2: MCTS
The new-generation Microsoft Certified Technology Specialist (MCTS) helps IT staff validate skills in installing, maintaining, and troubleshooting a specific Microsoft technology. The MCTS certifications are designed to communicate the skills and expertise a holder possesses on a specific platform.

For example, candidates won’t earn an MCTS on SQL Server 2008. Instead, they’ll earn an MCTS covering SQL Server business intelligence (MCTS: SQL Server 2008 Business Intelligence), database creation (MCTS: SQL Server 2008, Database Development), or SQL server administration (MCTS: SQL Server 2008, Implementation and Maintenance).

These new certifications require passing multiple, tightly targeted exams that focus on specific responsibilities on specific platforms. MCTS designations will expire when Microsoft suspends mainstream support for the corresponding platform. These changes, as with other new-generation Microsoft certifications, add value to the accreditation.

#3: Security+
Security continues to be a critical topic. That’s not going to change. In fact, its importance is only going to grow. One of the quickest ways to lose shareholder value, client confidence, and sales is to suffer a data breach. And no self-respecting technology professional wants to be responsible for such a breach.

CompTIA’s Security+ accreditation provides a respected, vendor-neutral foundation for industry staff (with at least two years of experience) seeking to demonstrate proficiency with security fundamentals. While the Security+ accreditation consists of just a single exam, it could be argued that any IT employee charged with managing client data or other sensitive information should, at a minimum, possess this accreditation. The importance of ensuring staff are properly educated as to systems security, network infrastructure, access control, auditing, and organizational security principles is simply too important to take for granted.

#4: MCPD
There’s more to information technology than just administration, support, and networking. Someone must create and maintain the applications and programs that power organizations. That’s where the new-generation Microsoft Certified Professional Developer (MCPD) credential comes into play.

The MCPD accreditation measures a developer’s ability to build and maintain software solutions using Visual Studio 2008 and Microsoft .NET Framework 3.5. Split into three certification paths (Windows Developer 3.5, ASP.NET Developer 3.5, and Enterprise Applications Developer 3.5), the credential targets IT professionals tasked with designing, optimizing, and operating those Microsoft technologies to fulfill business needs.

A redesigned certification aimed at better-measuring real-world skills and expertise, the MCPD will prove important for developers and programmers. Besides requiring candidates to pass several exams, the MCPD certification will retire when Microsoft suspends mainstream support for the corresponding platform. The change is designed to ensure the MCPD certification remains relevant, which is certain to further increase its value.

#5: CCNA
The Cisco Certified Internetwork Expert (CCIE) accreditation captures most of the networking company’s certification glory. But the Cisco Certified Network Associate (CCNA) might prove more realistic within many organizations.

In a world in which Microsoft and Linux administrators are also often expected to be networking experts, many companies don’t have the budgets necessary to train (or employ) a CCIE. But even small and midsize corporations can benefit from having their technology professionals earn basic proficiency administering Cisco equipment, as demonstrated by earning a CCNA accreditation.

As smaller companies become increasingly dependent upon remote access technologies, basic Cisco systems skills are bound to become more important. Although many smaller organizations will never have the complexity or workload necessary to keep a CCIE busy, Cisco’s CCNA is a strong accreditation for technology professionals with a few years’ experience seeking to grow and improve their networking skills.

#6: A+
Technology professionals with solid hardware and support skills are becoming tougher to find. There’s not much glory in digging elbow-deep into a desktop box or troubleshooting Windows boot errors. But those skills are essential to keeping companies running.

Adding CompTIA’s A+ certification to a resume tells hiring managers and department heads that you have proven support expertise. Whether an organization requires desktop installation, problem diagnosis, preventive maintenance, or computer or network error troubleshooting, many organizations have found A+-certified technicians to be more productive than their noncertified counterparts.

Changes to the A+ certification, which requires passing multiple exams, are aimed at keeping the popular credential relevant. Basic prerequisite requirements are now followed by testing that covers specific fields of expertise (such as IT, remote support, or depot technician). The accreditation is aimed at those working in desktop support, on help desks, and in the field, and while many of these staffers are new to the industry, the importance of an A+ certification should not be overlooked.

#7: PMP
Some accreditations gain value by targeting specific skills and expertise. The Project Management Professional (PMP) certification is a great example.

The Project Management Institute (PMI), a nonprofit organization that serves as a leading membership association for project management practitioners, maintains the PMP exam. The certification measures a candidate’s project management expertise by validating skills and knowledge required to plan, execute, budget, and lead a technology project. Eligible candidates must have five years of project management experience or three years of project management experience and 35 hours of related education.

As organizations battle tough economic conditions, having proven project scheduling, budgeting, and management skills will only grow in importance. The PMI’s PMP credential is a perfect conduit for demonstrating that expertise on a resume.

Even years after their introduction, Microsoft Certified Systems Engineer (MCSE) and Microsoft Certified Systems Administrator (MCSA) credentials remain valuable. But it’s important to avoid interpreting these accreditations as meaning the holders are all-knowing gurus, as that’s usually untrue.

In my mind, the MCSE and MCSA hold value because they demonstrate the holder’s capacity to complete a long and comprehensive education, training, and certification program requiring intensive study. Further, these certifications validate a wide range of relevant expertise (from client and server administration to security issues) on specific, widely used platforms.

Also important is the fact that these certifications tend to indicate holders have been working within the technology field for a long time. There’s no substitute for actual hands-on experience. Many MCSEs and MCSAs hold their certifications on Windows 2000 or Windows Server 2003 platforms, meaning they’ve been working within the industry for many years. While these certifications will be replaced by Microsoft’s new-generation credentials, they remain an important measure of foundational skills on Windows platforms.

As mentioned with the Security+ accreditation earlier, security is only going to grow in importance. Whatever an organization’s mission, product, or service, security is paramount.

(ISC)², which administers the Certified Information Systems Security Professional (CISSP) accreditation, has done well building a respected, vendor-neutral security certification. Designed for industry pros with at least five years of full-time experience, and accredited by the American National Standards Institute (ANSI), the CISSP is internationally recognized for validating a candidate’s expertise with operations and network and physical security, as well as their ability to manage risk and understand legal compliance responsibilities and other security-related elements.

#10: Linux+
While pursuing my first Microsoft certification 10 years ago, I remember debating the importance of Linux with several telecommunications technicians. They mocked the investment I was making in learning Microsoft technologies. These techs were confident Linux was going to displace Windows.

Well, didn’t happen. Linux continues to make inroads, though. The open source alternative is an important platform. Those professionals who have Linux expertise and want to formalize that skill set will do well adding CompTIA’s Linux+ certification to their resumes.

The vendor-neutral exam, which validates basic Linux client and server skills, is designed for professionals with at least six to 12 months of hands-on Linux experience. In addition to being vendor-neutral, the exam is also distribution neutral (meaning the skills it covers work well whether a candidate is administering Red Hat, SUSE, or Ubuntu systems).

Tuesday, December 16, 2008

10 services to turn off in MS Windows XP

As long as Microsoft Windows has been a network capable operating system, it has come with quite a few services turned on by default, and it is a good idea for the security conscious user of Microsoft’s flagship product to shut down any of these that he or she isn’t using.

Each version of MS Windows provides different services, of course, so any list of services to disable for security purposes will be at least somewhat particular to a given version of Microsoft Windows. As such, a list like this one needs to be identified with a specific Microsoft Windows version, though it can still serve as a guide for the knowledgeable MS Windows user to check out the running services on other versions as well.

If you are running Microsoft Windows XP on your desktop system, consider turning off the following services. You may be surprised by what is running without your knowledge.

* IIS – Microsoft’s Internet Information Services provide the capabilities of a Webserver for your computer.

* NetMeeting Remote Desktop Sharing — NetMeeting is primarily a VoIP and videoconferencing client for Microsoft Windows, but this service in particular is necessary to remote desktop access.

* Remote Desktop Help Session Manager – This service is used by the Remote Assistance feature that you can use to allow others remote access to the system to help you troubleshoot problems.

* Remote Registry – The capabilities provided by the Remote Registry service are frightening to consider from a security perspective. They allow remote users (in theory, only under controlled
circumstances) to edit the Windows Registry.

* Routing and Remote Access – This service bundles a number of capabilities together, capabilities that most system administrators would probably agree should be provided separately. It is rare that any of them should be necessary for a typical desktop system such as Microsoft Windows XP, however, so they can all conveniently be turned off as a single service. Routing and Remote Access provides the ability to use the system as a router and NAT device, as a dialup access gateway, and a VPN server.

* Simple File Sharing – When a computer is not a part of a Microsoft Windows Domain, it is assumed by the default settings that any and all filesystem shares are meant to be universally accessible. In the real world, however, we should only want to provide shares to very specific, authorized users. As such, Simple File Sharing, which only provides blanket access to shares without exceptions, is not what we want to use for sharing filesystem resources. It is active by default on both MS Windows XP Professional and MS Windows XP Home editions. Unfortunately, this cannot be disabled on MS Windows XP Home. On MS Windows XP Professional, however, you can disable it by opening My Computer -> Tools -> Folder Options, clicking the View tab, and unchecking the Use simple file sharing (Recommended) checkbox in the Advanced settings: pane.

* SSDP Discovery Service – This service is used to discover UPnP devices on your network, and is required for the Universal Plug and Play Device Host service (see below) to operate.

* Telnet – The Telnet service is a very old mechanism for providing remote access to a computer, most commonly known from its use in the bad ol’ days of security for remote command shell access on Unix servers. These days, using Telnet to remotely manage a Unix system may be grounds for firing, where an encrypted protocol such as SSH should be used instead.

* Universal Plug and Play Device Host – Once you have your “Plug and Play” devices installed on your system, it is often the case that you will not need this service again.

* Windows Messenger Service – Listed in the Services window under the name Messenger, the Windows Messenger Service provides “net send” and “Alerter” functionality. It is unrelated to the Windows Messenger instant messaging client, and is not necessary to use the Windows Messenger IM network.

On your system, these services may not all be turned on, or even installed. Whether a given service is installed and running may depend on whether you installed the system yourself, whether you are using XP Home or XP Professional, and from which vendor you got your computer if MS Windows XP was installed by a vendor.

With the exception of Simple File Sharing, all of the above listed services can be disabled from the same place. Simply click on the Start button, then navigate to Settings -> Control Panel, open Administrative Tools, and from there open the Services window. To disable any service in the list, double-click on its entry in that window and change the Startup type: setting. In general, you should change services you are turning off for security purposes to a “Disabled” state. When in doubt about whether a given service is necessary for other services, check the Dependencies tab in the service’s settings dialog.

Obviously, this is not a comprehensive list of everything running on your computer that you may want to turn off. It is merely a list of ten items that you most likely do not need to have running, and constitute a security vulnerability if left running. Most users will never have need of any of the services in this list, once the computer is up and running. Other services may be disabled without ill effect as well, though you should research each item in the complete services list before you disable it to ensure that you actually do not need it running. Some of them are quite critical to the normal operation of your system, such as the Remote Procedure Call (RPC) service.

Every running — but unused — service on your machine is an unnecessary security vulnerability. If a service is not important at all for authorized users and basic system functionality, turn it off.

Monday, December 15, 2008

Performance Tuning for Views

While views are often convenient to use, especially for restricting users from seeing data they should not see, they aren't always good for performance. So if database performance is your goal, avoid using views (SQL Server 2000/2005 Indexed Views are another story).

Views can slow down queries for several different reasons. For example, let's look at these two SELECT statements:

* SELECT * FROM table_name
* SELECT * FROM view_name

Which is faster? If you test it, you will find that the first SELECT statement is faster, although the execution plan for both of them will be the same. How can that be? This is because it takes SQL Server extra work (such as looking up data in the system tables) before it can execute the view. This extra work is not part of the execution plan, so it appears that the two SELECT statements should run at the same speed, which they don't, because some of the work SQL Server is doing is hidden.

Another way views can hurt performance is when JOINs or UNIONs are used, and you don't intend to use all of the columns. This results in SQL Server performing unnecessary work (such as an unnecessary JOIN or UNION), slowing down the performance.

Views, like stored procedures, once they are run the first time, are optimized and their execution plan is stored in cache in case they need to be reused. But this is not reason enough to use a view.

Views, besides hurting performance, are not all that flexible when you are working with them. For example, they can't be changed on the fly, they can’t be used to sort data, and using them for INSERTs, UPDATEs, and DELETEs is problematic. In addition, while views can be nested, this just compounds their problems, so avoid doing this.

Instead of using views, use stored procedures instead. They are much more flexible and they offer better performance. [7.0, 2000, 2005]

Microsoft Internet Explorer Multiple Vulnerabilities

Some vulnerabilities have been reported in Microsoft Internet Explorer, which can be exploited by malicious people to compromise a user's system.

1) An error when handling parameters passed to unspecified navigation methods can be exploited to corrupt memory via a specially crafted web page.

2) An error when fetching a file with an overly long path from a WebDAV share can be exploited to corrupt heap memory via a specially crafted web page.

3) An unspecified use-after-free error can be exploited to corrupt memory via a specially crafted web page.

4) A boundary error when processing an overly long filename extension specified inside an "EMBED" tag can be exploited to cause a stack-based buffer overflow.

Successful exploitation of the vulnerabilities may allow execution of arbitrary code.

Patches are available for these critical issues.

Microsoft Excel Multiple Vulnerabilities

Some vulnerabilities have been reported in Microsoft Excel, which can be exploited by malicious people to compromise a user's system.

1) An error while validating an index value in a NAME record can be exploited to corrupt memory via a specially crafted Excel Spreadsheet (XLS) file.

2) An unspecified error in the processing of Excel records can be exploited to corrupt memory via a specially crafted XLS file.

3) An unspecified error in the processing of Excel formulas can be exploited to corrupt memory via a specially crafted XLS file.

Successful exploitation of the vulnerabilities may allow execution of arbitrary code.

Patches are available for these critical issues.

Microsoft Office Word Multiple Vulnerabilities

Multiple vulnerabilities have been reported in Microsoft Office Word, which can be exploited by malicious people to compromise a user's system.

1) An error when processing the "lcbPlcfBkfSdt" field within the FIB (File Information Block) can be exploited to corrupt memory via a specially crafted Word file.

2) An integer overflow error exists when calculating the space required for the specified number of points in a polyline or polygon. This can be exploited to cause a heap-based buffer overflow during parsing of objects in Rich Text Format (.rtf) files e.g. when a user opens a specially crafted .rtf file with Word or previews a specially crafted e-mail.

3) An unspecified error when parsing certain records can be exploited to corrupt memory via a specially crafted Word file.

4) An error exists when processing consecutive "\do" drawing object tags encountered in RTF documents. This can be exploited to free a heap buffer twice and corrupt memory.

5) An error when processing mismatched "\dpgroup" and "\dpendgroup" controlwords can be exploited to cause a buffer overflow via an RTF document containing an overly large number of "\dpendgroup" tags.

6) A boundary error when parsing RTF documents containing multiple drawing object tags can be exploited to cause a heap-based buffer overflow.

7) A boundary error when processing RTF documents can be exploited to overflow a static buffer via a document containing an overly large number of "\stylesheet" control words.

8) An error when processing a malformed table property can be exploited to cause a stack-based buffer overflow via a specially crafted Word document.

Successful exploitation of the vulnerabilities may allow execution of arbitrary code.
Patches are available for these critical issues.

Tuesday, December 9, 2008

2008 Index Fragmentation Maintenance

Nothing to it. Run it and it hits every database on the system. You can put it inside a stored procedure or inside a SQL Agent Job.

,@TableName NVARCHAR(255)
,@SchemaName NVARCHAR(255)
,@IndexName NVARCHAR(255)


,TableName NVARCHAR(255)
,SchemaName NVARCHAR(255)
,IndexName NVARCHAR(255)
,AvgFragment DECIMAL)

EXEC sp_msforeachdb 'INSERT INTO #Frag (
) SELECT ''?'' AS DBName
,t.Name AS TableName
,sc.Name AS SchemaName
, AS IndexName
FROM ?.sys.dm_db_index_physical_stats(DB_ID(''?''), NULL, NULL,
NULL, ''Sampled'') AS s
JOIN ?.sys.indexes i
ON s.Object_Id = i.Object_id
AND s.Index_id = i.Index_id
JOIN ?.sys.tables t
ON i.Object_id = t.Object_Id
JOIN ?.sys.schemas sc
ON t.schema_id = sc.SCHEMA_ID
WHERE s.avg_fragmentation_in_percent > 20
AND t.TYPE = ''U''
ORDER BY TableName,IndexName'


OPEN cList
INTO @DBName, @TableName,@SchemaName,@IndexName,@PctFrag
IF @PctFrag BETWEEN 20.0 AND 40.0
SET @Defrag = N'ALTER INDEX ' + @IndexName + ' ON ' + @DBName + '.' + @SchemaName + '.' + @TableName + ' REORGANIZE'
EXEC sp_executesql @Defrag
ELSE IF @PctFrag > 40.0
SET @Defrag = N'ALTER INDEX ' + @IndexName + ' ON ' + @DBName + '.' + @SchemaName + '.' + @TableName + ' REBUILD'
EXEC sp_executesql @Defrag

INTO @DBName, @TableName,@SchemaName,@IndexName,@PctFrag



Performance Tuning for Stored Procedures

For best performance, all objects that are called within the same stored procedure should all be owned by the same object owner or schema, preferably dbo, and should also be referred to in the format of object_owner.object_name or schema_owner.object_name.

If the object owner's or schemas are not specified for objects, then SQL Server must perform name resolution on the objects, which causes a small performance hit.

And if objects referred to in the stored procedure have different owners or schemas, SQL Server must check object permissions before it can access any object in the database, which adds unnecessary overhead. Ideally, the owner or schema of the stored procedure should own all of the objects referred to in the stored procedure.

In addition, SQL Server cannot reuse a stored procedure "in-memory plan" over if the object owner or schema is not used consistently. If a stored procedure is sometime referred to with its object owner's or schema name, and sometimes it is not, then SQL Server must re-execute the stored procedure, which also hinders performance. [7.0, 2000, 2005]

How to change all Object Owners to dbo

CreateALTER proc [dbo].[ChangeAllObjectOwnersTodbo]
set nocount on

declare @uid int
declare @objName varchar(50)
declare @userName varchar(50)
declare @currObjName varchar(50)
declare @outStr varchar(256)
set @uid = user_id('dbo')

declare chObjOwnerCur cursor static
select user_name(uid) as 'username', [name] as 'name' from sysobjects where uid <> @uid

open chObjOwnerCur
if @@cursor_rows = 0
print 'All objects are already owned by dbo!'
close chObjOwnerCur
deallocate chObjOwnerCur
return 1

fetch next from chObjOwnerCur into @userName, @objName
while @@fetch_status = 0
set @currObjName = 'dbo.' + @objName
if (object_id(@currObjName) > 0)
print 'WARNING *** ' + @currObjName + ' already exists ***'
set @outStr = 'sp_changeobjectowner ''' + @userName + '.' + @objName + ''', ''dbo'''
print @outStr
print 'go'
fetch next from chObjOwnerCur into @userName, @objName

close chObjOwnerCur
deallocate chObjOwnerCur
set nocount off
return 0

Sunday, December 7, 2008

SQL injection

SQL injection is a technique that exploits a security vulnerability occurring in the database layer of an application. The vulnerability is present when user input is either incorrectly filtered for string literal escape characters embedded in SQL statements or user input is not strongly typed and thereby unexpectedly executed. It is in fact an instance of a more general class of vulnerabilities that can occur whenever one programming or scripting language is embedded inside another.

Forms of SQL injection vulnerabilities

Incorrectly filtered escape characters
This form of SQL injection occurs when user input is not filtered for escape characters and is then passed into a SQL statement. This results in the potential manipulation of the statements performed on the database by the end user of the application.

The following line of code illustrates this vulnerability:

statement = "SELECT * FROM users WHERE name = '" + userName + "';"

This SQL code is designed to pull up the records of a specified username from its table of users. However, if the "userName" variable is crafted in a specific way by a malicious user, the SQL statement may do more than the code author intended. For example, setting the "userName" variable as

a' or 't'='t

renders this SQL statement by the parent language:

SELECT * FROM users WHERE name = 'a' OR 't'='t';

If this code were to be used in an authentication procedure then this example could be used to force the selection of a valid username because the evaluation of 't'='t' is always true.

While most SQL Server implementations allow multiple statements to be executed with one call, some SQL APIs such as php's mysql_query do not allow this for security reasons. This prevents hackers from injecting entirely separate queries, but doesn't stop them from modifying queries. The following value of "userName" in the statement below would cause the deletion of the "users" table as well as the selection of all data from the "data" table (in essence revealing the information of every user):

a';DROP TABLE users; SELECT * FROM data WHERE name LIKE '%

This input renders the final SQL statement as follows:


Incorrect type handling
This form of SQL injection occurs when a user supplied field is not strongly typed or is not checked for type constraints. This could take place when a numeric field is to be used in a SQL statement, but the programmer makes no checks to validate that the user supplied input is numeric. For example:

statement := "SELECT * FROM data WHERE id = " + a_variable + ";"

It is clear from this statement that the author intended a_variable to be a number correlating to the "id" field. However, if it is in fact a string then the end user may manipulate the statement as they choose, thereby bypassing the need for escape characters. For example, setting a_variable to

1;DROP TABLE users

will drop (delete) the "users" table from the database, since the SQL would be rendered as follows:


Magic String
The magic sting is a simple string of SQL used primarily at login pages. The magic string is


When used at a login page, you will be logged in as the user on top of the SQL table.

Vulnerabilities inside the database server
Sometimes vulnerabilities can exist within the database server software itself, as was the case with the MySQL server's mysql_real_escape_string() function[1]. This would allow an attacker to perform a successful SQL injection attack based on bad Unicode characters even if the user's input is being escaped.

Blind SQL Injection
Blind SQL Injection is used when a web application is vulnerable to SQL injection but the results of the injection are not visible to the attacker. The page with the vulnerability may not be one that displays data but will display differently depending on the results of a logical statement injected into the legitimate SQL statement called for that page. This type of attack can become time-intensive because a new statement must be crafted for each bit recovered. There are several tools that can automate these attacks once the location of the vulnerability and the target information has been established.[2]

Conditional Responses
One type of blind SQL injection forces the database to evaluate a logical statement on an ordinary application screen.

SELECT booktitle FROM booklist WHERE bookId = 'OOk14cd' AND 1=1

will result in a normal page while

SELECT booktitle FROM booklist WHERE bookId = 'OOk14cd' AND 1=2

will likely give a different result if the page is vulnerable to a SQL injection. An injection like this will prove that a blind SQL injection is possible, leaving the attacker to devise statements that evaluate to true or false depending on the contents of a field in another table.[3]

Conditional Errors
This type of blind SQL injection causes a SQL error by forcing the database to evaluate a statement that causes an error if the WHERE statement is true. For example,

SELECT 1/0 FROM users WHERE username='Ralph'

the division by zero will only be evaluated and result in an error if user Ralph exists.

Time Delays
Time Delays are a type of blind SQL injection that cause the SQL engine to execute a long running query or a time delay statement depending on the logic injected. The attacker can then measure the time the page takes to load to determine if the injected statement is true.

Preventing SQL Injection
To protect against SQL injection, user input must not directly be embedded in SQL statements. Instead, parameterized statements must be used (preferred), or user input must be carefully escaped or filtered.

Using Parameterized Statements
In some programming languages such as Java and .NET parameterized statements can be used that work with parameters (sometimes called placeholders or bind variables) instead of embedding user input in the statement. In many cases, the SQL statement is fixed. The user input is then assigned (bound) to a parameter. This is an example using Java and the JDBC API:

PreparedStatement prep = conn.prepareStatement("SELECT * FROM USERS WHERE USERNAME=? AND PASSWORD=?");
prep.setString(1, username);
prep.setString(2, password);

Similarly, in C#:

using (SqlCommand myCommand = new SqlCommand("SELECT * FROM USERS WHERE USERNAME=@username AND PASSWORD=HASHBYTES('SHA1', @password)", myConnection))
myCommand.Parameters.AddWithValue("@username", user);
myCommand.Parameters.AddWithValue("@password", pass);

SqlDataReader myReader = myCommand.ExecuteReader())

In PHP version 5 and MySQL version 4.1 and above, it is possible to use prepared statements through vendor-specific extensions like mysqli[4]. Example[5]:

$db = new mysqli("localhost", "user", "pass", "database");
$stmt = $db -> prepare("SELECT priv FROM testUsers WHERE username=? AND password=?");
$stmt -> bind_param("ss", $user, $pass);
$stmt -> execute();

In ColdFusion, the CFQUERYPARAM statement is useful in conjunction with the CFQUERY statement to nullify the effect of SQL code passed within the CFQUERYPARAM value as part of the SQL clause.[6] [7]. An example is below.


Enforcing the Use of Parameterized Statements
There are two ways to ensure an application is not vulnerable to SQL injection: using code reviews (which is a manual process), and enforcing the use of parameterized statements. Enforcing the use of parameterized statements means that SQL statements with embedded user input are rejected at runtime. Currently only the H2 Database Engine supports this feature.

Using Escaping
A straight-forward, though error-prone way to prevent injections is to escape dangerous characters. One of the reasons for it being error prone is that it is a type of blacklist which is less robust than a whitelist. For instance, every occurrence of a single quote (') in a parameter must be replaced by two single quotes ('') to form a valid SQL string literal. In PHP, for example, it is usual to escape parameters using the function mysql_real_escape_string before sending the SQL query:

$query = sprintf("SELECT * FROM Users where UserName='%s' and Password='%s'",

However, escaping is error-prone as it relies on the programmer to escape every parameter. Also, if the escape function fails to handle a special character correctly, an injection is still possible.

Find Out The Recovery Model For Your Database

You want to quickly find out what the recovery model is for your database but you don’t want to start clicking and right-clicking in SSMS/Enterprise Manager to get that information. This is what you can do, you can use databasepropertyex to get that info. Replace ‘msdb’ with your database name


What if you want it for all databases in one shot? No problem here is how, this will work on SQL Server version 2000

FROM sysdatabases

For SQL Server 2005 and up, you should use the following command

SELECT name,recovery_model_desc
FROM sys.databases

What is deferred name resolution and why do you need to care?


SET @x = 1

IF (@x = 0)

SELECT * FROM #temptable –what does this return

This is the error you get
Server: Msg 2714, Level 16, State 1, Line 12
There is already an object named ‘#temptable’ in the database.

You can do something like this to get around the issue with the temp table


SET @x = 1

IF (@x = 0)
INSERT #temptable
INSERT #temptable

SELECT * FROM #temptable –what does this return

So what is thing called Deferred Name Resolution? Here is what is explained in Books On Line

When a stored procedure is created, the statements in the procedure are parsed for syntactical accuracy. If a syntactical error is encountered in the procedure definition, an error is returned and the stored procedure is not created. If the statements are syntactically correct, the text of the stored procedure is stored in the syscomments system table.

When a stored procedure is executed for the first time, the query processor reads the text of the stored procedure from the syscomments system table of the procedure and checks that the names of the objects used by the procedure are present. This process is called deferred name resolution because objects referenced by the stored procedure need not exist when the stored procedure is created, but only when it is executed.

In the resolution stage, Microsoft SQL Server 2000 also performs other validation activities (for example, checking the compatibility of a column data type with variables). If the objects referenced by the stored procedure are missing when the stored procedure is executed, the stored procedure stops executing when it gets to the statement that references the missing object. In this case, or if other errors are found in the resolution stage, an error is returned.

So what is happening is that beginning with SQL server 7 deferred name resolution was enabled for real tables but not for temporary tables. If you change the code to use a real table instead of a temporary table you won’t have any problem
Run this to see what I mean


SET @x = 1

IF (@x = 0)

SELECT * FROM temptable –what does this return

What about variables? Let’s try it out, run this


SET @x = 1

IF (@x = 0)
SELECT @i = 5
SELECT @i = 6


And you get the follwing error
Server: Msg 134, Level 15, State 1, Line 13
The variable name ‘@i’ has already been declared. Variable names must be unique within a query batch or stored procedure.

Now why do you need to care about deferred name resolution? Let’s take another example
create this proc

SELECT dbo.somefuction(1)

CREATE FUNCTION somefuction(@id INT)
SELECT @id = 1

now run this

SP_DEPENDS ’somefuction’

result: Object does not reference any object, and no objects reference it.

Most people will not create a proc before they have created the function. So when does this behavior rear its ugly head? When you script out all the objects in a database, if the function or any objects referenced by an object are created after the object that references them then sp_depends won’t be 100% correct

SQL Server 2005 makes it pretty easy to do it yourself

SELECT specific_name,*
FROM information_schema.routines
WHERE object_definition(OBJECT_ID(specific_name)) LIKE ‘%somefuction%’
AND routine_type = ‘procedure’

How Do You Check If A Temporary Table Exists In SQL Server

How do you check if a temp table exists?

You can use IF OBJECT_ID(’tempdb..#temp’) IS NOT NULL Let’s see how it works

–Create table
USE Norhtwind


–Check if it exists
IF OBJECT_ID(‘tempdb..#temp’) IS NOT NULL
PRINT ‘#temp exists!’
PRINT ‘#temp does not exist!’

–Another way to check with an undocumented optional second parameter
IF OBJECT_ID(‘tempdb..#temp’,‘u’) IS NOT NULL
PRINT ‘#temp exists!’
PRINT ‘#temp does not exist!’

–Don’t do this because this checks the local DB and will return does not exist
IF OBJECT_ID(‘tempdb..#temp’,‘local’) IS NOT NULL
PRINT ‘#temp exists!’
PRINT ‘#temp does not exist!’

–unless you do something like this
USE tempdb

–Now it exists again
IF OBJECT_ID(‘tempdb..#temp’,‘local’) IS NOT NULL
PRINT ‘#temp exists!’
PRINT ‘#temp does not exist!’

–let’s go back to Norhtwind again
USE Norhtwind

–Check if it exists
IF OBJECT_ID(‘tempdb..#temp’) IS NOT NULL
PRINT ‘#temp exists!’
PRINT ‘#temp does not exist!’

now open a new window from Query Analyzer (CTRL + N) and run this code again

–Check if it exists
IF OBJECT_ID(‘tempdb..#temp’) IS NOT NULL
PRINT ‘#temp exists!’
PRINT ‘#temp does not exist!’

It doesn’t exist and that is correct since it’s a local temp table not a global temp table

Well let’s test that statement

–create a global temp table
CREATE TABLE ##temp(id INT) –Notice the 2 pound signs, that’s how you create a global variable

–Check if it exists
IF OBJECT_ID(‘tempdb..##temp’) IS NOT NULL
PRINT ‘##temp exists!’
PRINT ‘##temp does not exist!’

It exists, right?
Now run the same code in a new Query Analyzer window (CTRL + N)

–Check if it exists
IF OBJECT_ID(‘tempdb..##temp’) IS NOT NULL
PRINT ‘##temp exists!’
PRINT ‘##temp does not exist!’

And yes this time it does exist since it’s a global table

Thursday, December 4, 2008

Finding all data types in user tables

This script queries sys.columns to get the entire list of columns and tables existing in the current database, then maps the columns datatype with a name from sys.systypes. The where clause filters the results for user created databases, less 'sysdiagrams', or you can use the commented out where clause to target a specific table.

This is a great way to hunt down various data types and make sure different development teams are on the same page and don't do silly things like having the data types on their tables not matching other tables and causing frustrations in forgetting to cast the values.

select object_name(c.object_id) "Table Name", "Column Name", "Column Type"
from sys.columns c
join sys.systypes s on (s.xtype = c.system_type_id)
where object_name(c.object_id) in (select name from sys.tables where name not like 'sysdiagrams')
-- where object_name(c.object_id) in (select name from sys.tables where name like 'TARGET_TABLE_NAME')

Wednesday, December 3, 2008

Increment a string

It's good to create a serial number to tickets, or another serie from data type character.

declare @litere nvarchar(3)
declare @litera1 char(1)
declare @litera2 char(1)

set @litere='AQ'

select @litera1=substring(@litere,2,1)
select @litera2=substring(@litere,1,1)

if @litera1='Z'
set @litera1='A'
set @litera2=char(ascii(substring(@litere,1,1))+1)

set @litera1=char(ascii(substring(@litere,2,1))+1)

select @litera1, @litera2
set @litere=@litera2+@litera1
select @litere

Sunday, November 30, 2008

Writing Outer Joins in T-SQL

Occasionally someone will ask for my help with a query and say that both a right outer join and a left outer join was tried, and still the expected results were not achieved. That made me realize that some developers do not completely understand outer joins and that an article explaining how to use them might help.

Inner Join Review
The most commonly used join is an INNER JOIN. This type of join combines rows from two tables only when they match on the joining condition. Usually the primary key from one table matches a foreign key on another table, but join conditions can be more complex than that.

(Note: Most of the information in this article can be applied to views as well as tables. For simplicity, the word "table" will be used to mean table or view unless stated otherwise. Keys are not defined on views, but the underlying table’s key column or columns are often included in the view. To keep things simple, let's assume that is the case.)

INNER JOIN will retrieve a results row only where there is a perfect match between the two tables in the join condition. You will also often see one row from one of the tables matching multiple rows in the other table. For example, one customer can have many orders. One order can have many order details. The data on the one side will be repeated for each row on the many side. The following query is an example showing how the information from the Sales.SalesOrderHeader is repeated on each matching row:

SELECT s.SalesOrderID, OrderDate,ProductID
FROM Sales.SalesOrderHeader AS s
INNER JOIN Sales.SalesOrderDetail AS d ON s.SalesOrderID = d.SalesOrderID
ORDER BY s.SalesOrderID, ProductID

Outer Join Introduction
OUTER JOIN is used to join two tables even if there is not a match. An OUTER JOIN can be used to return a list of all the customers and the orders even if no orders have been placed for some of the customers. A keyword, RIGHT or LEFT, is used to specify which side of the join returns all possible rows. I like using LEFT because it makes sense to me to list the most important table first. Except for one example demonstrating RIGHT OUTER JOIN, this article will use left joins. Just a note: the keywords INNER and OUTER are optional.

The next example returns a list of all the customers and the SalesOrderID for the orders that have been placed, if any.

SELECT c.CustomerID, s.SalesOrderID
FROM Sales.Customer c
LEFT OUTER JOIN Sales.SalesOrderHeader s ON c.CustomerID = s.CustomerID

It uses the LEFT keyword because the Sales.Customer table is located on the left side and we want all rows returned from that table even if there is no match in the Sales.SalesOrderHeader table. This is an important point. Notice also that the CustomerID column is the primary key of the Sales.Customer table and a foreign key in the Sales.SalesOrderHeader table. This means that there must be a valid customer for every order placed. Writing a query that returns all orders and the customers if they match doesn’t make sense. The LEFT table should always be the primary key table when performing a LEFT OUTER JOIN.

If the location of the tables in the query are switched, the RIGHT keyword is used and the same results are returned:

SELECT c.CustomerID, s.SalesOrderID
FROM Sales.SalesOrderHeader s
RIGHT OUTER JOIN Sales.Customer c ON c.CustomerID = s.CustomerID

Notice that I didn’t change the join condition at all. It doesn’t matter which side of the equal sign the columns are listed; only where the tables are named is it important.

If I have a LEFT OUTER JOIN, what is returned from the table on the right side of the join where there is not a match? Each column from the right side will return a NULL. Try this query which lists the non-matching rows first:

SELECT c.CustomerID, s.SalesOrderID
FROM Sales.Customer c
LEFT OUTER JOIN Sales.SalesOrderHeader s ON c.CustomerID = s.CustomerID
ORDER BY s.SalesOrderID

By adding a WHERE clause to check for a NULL SalesOrderID, you can find all the customers who have not placed an order. My copy of AdventureWorks returns 66 customers with no orders:

SELECT c.CustomerID, s.SalesOrderID
FROM Sales.Customer c
LEFT OUTER JOIN Sales.SalesOrderHeader s ON c.CustomerID = s.CustomerID

Occasionally, you will need to be more specific. How can you find all the customers who have not placed an order in 2002? There are several ways to solve this problem. You could create a view of all the orders placed in 2002 and join the view on the Sales.Customer table. Another option is to create a CTE, or Common Table Expression, of the orders placed in 2002. This example shows how to use a CTE to get the required results:

( SELECT SalesOrderID, customerID
FROM Sales.SalesOrderHeader
WHERE OrderDate between '1/1/2002' and '12/31/2002'
SELECT c.CustomerID, s.SalesOrderID
FROM Sales.Customer c
LEFT OUTER JOIN s ON c.customerID = s.customerID

My favorite technique to solve this problem is much simpler. Additional criteria, in this case filtering on the OrderDate, can be added to the join condition. The query joins all customers to the orders placed in 2002. Then the results are restricted to those where there is no match. This query will return exactly the same results as the previous, more complicated query:

SELECT c.CustomerID, s.SalesOrderID
FROM Sales.Customer c
LEFT OUTER JOIN Sales.SalesOrderHeader s ON c.CustomerID = s.CustomerID
and s.OrderDate between '1/1/2002' and '12/31/2002'

Using Aggregates with Outer Joins
Aggregate queries introduce another pitfall watch out for. The following example is an attempt to list all the customers and the count of the orders that have been placed. Can you spot the problem?

SELECT c.CustomerID, count(*) OrderCount
FROM Sales.Customer c
LEFT OUTER JOIN Sales.SalesOrderHeader s ON c.CustomerID = s.CustomerID
GROUP BY c.CustomerID
ORDER BY OrderCount

Now the customers with no orders look like they have placed one order. That is because this query is counting the rows returned. To solve this problem, count the SalesOrderID column. NULL values are eliminated from the count.

SELECT c.CustomerID, count(SalesOrderID) OrderCount
FROM Sales.Customer c LEFT OUTER JOIN Sales.SalesOrderHeader s
ON c.CustomerID = s.CustomerID
GROUP BY c.CustomerID
ORDER BY OrderCount

Multiple Joins
Once more than two tables are involved in the query, things get a bit more complicated. When a table is joined to the RIGHT table, a LEFT OUTER JOIN must be used. That is because the NULL rows from the RIGHT table will not match any rows on the new table. An INNER JOIN causes the non-matching rows to be eliminated from the results. If the Sales.SalesOrderDetail table is joined to the Sales.SalesOrderHeader table and an INNER JOIN is used, none of the customers without orders will show up. NULL cannot be joined to any value, not even NULL.

To illustrate this point, when I add the Sales.SalesOrderDetail table to one of the previous queries that checked for customers without orders, I get back no rows at all.

SELECT c.CustomerID, s.SalesOrderID, d.SalesOrderDetailID
FROM Sales.Customer c
LEFT OUTER JOIN Sales.SalesOrderHeader s ON c.CustomerID = s.CustomerID
INNER JOIN Sales.SalesOrderDetail d ON s.SalesOrderID = d.SalesOrderID

To get correct results, change the INNER JOIN to a LEFT JOIN.

SELECT c.CustomerID, s.SalesOrderID, d.SalesOrderDetailID
FROM Sales.Customer c
LEFT OUTER JOIN Sales.SalesOrderHeader s ON c.CustomerID = s.CustomerID
LEFT OUTER JOIN Sales.SalesOrderDetail d ON s.SalesOrderID = d.SalesOrderID

What about additional tables joined to Sales.Customer, the table on the left? Must outer joins be used? If it is possible that there are some rows without matches, it must be an outer join to guarantee that no results are lost. The Sales.Customer table has a foreign key pointing to the Sales.SalesTerritory table. Every customer’s territory ID must match a valid value in Sales.SalesTerritory. This query returns 66 rows as expected because it is impossible to eliminate any customers by joining to Sales.SalesTerritory:

SELECT c.CustomerID, s.SalesOrderID, t.Name
FROM Sales.Customer c
LEFT OUTER JOIN Sales.SalesOrderHeader s ON c.CustomerID = s.CustomerID
INNER JOIN Sales.SalesTerritory t ON c.TerritoryID = t.TerritoryID

Sales.SalesTerritory is the primary key table; every customer must match a valid territory. If you wanted to write a query that listed all territories, even those that had no customers, an outer join will be used. This time, Sales.Customers is on the right side of the join.

SELECT t.Name, CustomerID
FROM Sales.SalesTerritory t
LEFT OUTER JOIN Sales.Customer c ON t.TerritoryID =c.TerritoryID

SQL Server Connection Strings

Some common and not so common connection strings for the .NET SqlConnection object. The article includes .NET sample code and some tricks to increase the supportability of your application.

Trusted Authentication
Data Source=ServerName; Initial Catalog=DatabaseName; Integrated Security=SSPI;

Trusted authentication uses the security credentials of the current user to make the connection to SQL Server. SQL Server uses Windows (or Active Directory) to validate the current user. ServerName can be the name of a server or the name of a SQL Server instance such as Server1\Instance2. ServerName can also be expressed as an IP address. SSPI stands for Security Support Provider Interface (in you were curious).

SQL Server Security Authentication

Data Source=ServerName; Initial Catalog=DatabaseName; User Id=UserName; Password=UserPassword;

In SQL Server authentication SQL Server stores the username and password. ServerName can be the name of a server or the name of a SQL Server instance such as Server1\Instance2. ServerName can also be expressed as an IP address.

Setting the Application Name
Data Source=ServerName; Initial Catalog=DatabaseName; Integrated Security=SSPI; Application Name=MyAppName;

I often set the Application Name when I construct connections strings. Whatever text you assign to Application Name will appear in a couple of different places:

* It will be displayed in Profiler under the Application Name column.
* It will be shown in the output of sp_who2 in the Program Name column.
* It will be shown in the Activity Monitor in the Application column. You can get to the Activity Monitor in SQL Server Management Studio by Management -> Activity Monitor.
* It will appear in the program_name column if you select from master.dbo.sysprocesses (for SQL Server 2000)
* It will appear int he program_name column if you select from sys.dm_exec_sessions (for SQL Server 2005 and later).

Setting the application name makes it very easy to find out what applications are issuing particular SQL statements against my database. Setting the application name can also lead to an increase in the number of connections to your SQL Server. Each client that uses connection pooling will create one pool inside each application per unique connection string. If you use multiple application names you have the possibility to increase the number of pools and thus the number of connections to SQL Server. I've always found it more beneficial to have the application name than to have a few less connections to my database servers.

Using MARS (Multiple Active Result Sets)

Data Source=ServerName; Initial Catalog=DatabaseName; Integrated Security=SSPI; MultipleActiveResultSets=True;

If you want to use MARS you'll need to enable it in the connection string.

Sample .NET code
There are two common ways to create a connection string in .NET. The first is to use an explicit connection string.

SqlConnection conn = new SqlConnection();
conn.ConnectionString = "Data Source=L40; Initial Catalog=master; Integrated Security=SSPI;";

The second is to use the Connection String Builder object in .NET to construct a connection string.

SqlConnectionStringBuilder csb = new SqlConnectionStringBuilder();
csb.DataSource = "L40";
csb.InitialCatalog = "master";
csb.IntegratedSecurity = true;

SqlConnection conn = new SqlConnection();
conn.ConnectionString = csb.ToString();