Monday, November 10, 2008

Shrinking databases

“Order the pages, shuffle the pages.”

Do you ever shrink your data files? I’ve personally never been fond of it, especially for production databases. After all, they’ll simply have to grow again and, especially if the data files are on independent drives, there’s little difference between free space on the drive and free space in the data file. There is also a more insidious reason for not shrinking a database.

Let’s take a very simple database (the creation code is at the end of the post). I have two tables, both with tens of thousands of rows. Both tables have a clustered index on a uniqueidentifier column and are heavily fragmented (>99%).

DBCC SHOWCONTIG(LargeTable1) -- 99.30%
DBCC SHOWCONTIG(LargeTable2) -- 99.21%
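
DBCC SHOWCONTIG is deprecated from SQL Server 2005 onwards in favour of the sys.dm_db_index_physical_stats function. The following is a minimal sketch of the same check using that function, assuming the TestingShrink database and the table names from the sample code at the end of the post. The column aliases are my own choice.

```sql
USE TestingShrink;
GO

-- Fragmentation of every index on the two test tables.
-- 'LIMITED' is the cheapest scan mode; for an index it reads only
-- the pages above the leaf level.
SELECT OBJECT_NAME(ips.object_id)        AS TableName,
       i.name                            AS IndexName,
       ips.avg_fragmentation_in_percent, -- logical fragmentation for indexes
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
     INNER JOIN sys.indexes AS i
         ON i.object_id = ips.object_id
        AND i.index_id  = ips.index_id
WHERE ips.object_id IN (OBJECT_ID('LargeTable1'), OBJECT_ID('LargeTable2'));
```

The exact percentages will differ from run to run because the uniqueidentifier values are random.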

The obvious fix is to rebuild both indexes. That removes the fragmentation, but now the data file is almost twice the size it needs to be.

DBCC ShowFileStats -- 3363 extents total, 1697 used (215 MB total, 106 MB free)
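
DBCC ShowFileStats is, as far as I know, not a documented command. A rough equivalent using documented features is sketched here, assuming it is run inside the TestingShrink database. The column aliases are illustrative only.

```sql
USE TestingShrink;
GO

-- size and FILEPROPERTY(..., 'SpaceUsed') are both reported in 8 KB pages,
-- so dividing by 128 converts them to MB.
SELECT name                                             AS LogicalFileName,
       size / 128.0                                     AS TotalMB,
       FILEPROPERTY(name, 'SpaceUsed') / 128.0          AS UsedMB,
       (size - FILEPROPERTY(name, 'SpaceUsed')) / 128.0 AS FreeMB
FROM sys.database_files
WHERE type_desc = 'ROWS';  -- data files only, ignoring the log
```

An extent is 8 pages (64 KB), so the extent counts from DBCC ShowFileStats can be compared with these figures by multiplying by 8.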

So, shrink the database to release the wasted space back to the OS.

DBCC SHRINKDATABASE (TestingShrink, 10) -- Shrink to 10% free

That’s fixed the space issue. But now, have another look at those two indexes that were just rebuilt.

DBCC SHOWCONTIG(LargeTable1)
- Logical Scan Fragmentation ………………: 99.99%

DBCC SHOWCONTIG(LargeTable2)
- Logical Scan Fragmentation ………………: 7.08%

Oops. Not exactly a desired outcome.

When SQL Server shrinks a data file, it takes extents that are towards the end of the file and moves them to empty places further forward. It does this with no concern for the logical order of pages or indexes. The net result is that, after shrinking a database, many of the indexes in that database will be badly fragmented.
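
Since the damage comes from moving data within the file, it is worth knowing that DBCC SHRINKFILE has a TRUNCATEONLY option that releases only the free space at the very end of the file and moves nothing. This is not what the test above does; it is a sketch of a gentler alternative. It assumes the data file still has its default logical name, which is the same as the database name.

```sql
USE TestingShrink;
GO

-- Assumption: the data file has the default logical name 'TestingShrink'.
-- Check sys.database_files for the real name before running this.
-- TRUNCATEONLY releases free space at the end of the file without moving
-- any pages, so existing index order is left alone.
DBCC SHRINKFILE (N'TestingShrink', TRUNCATEONLY);
```

The trade-off is that it may release very little. If the last allocated extent sits near the end of the file, as it may well do after an index rebuild, there is almost nothing to truncate.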

Mainly for this reason, I always recommend that data files, especially for production databases, be grown as necessary and not shrunk. The space that can be reclaimed from the data file is not worth what the shrink does to page ordering, especially since, as production databases tend to do, the file will simply grow again sometime in the future.

All too often I hear of maintenance plans that first rebuild all the indexes, then shrink the data files. That kind of maintenance is worse than useless. The index rebuild uses CPU and time to arrange indexes in logical order and in the process often grows the data file. The shrink then uses more time and CPU and will often leave the indexes more fragmented than they were before the rebuild.

Basically, if you’re going to rebuild indexes, don’t shrink the data files. If you’re going to shrink data files, either don’t waste time rebuilding indexes, or rebuild them after the shrink.
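
If a shrink really is required, the order of operations would look something like the sketch below. It reuses the database, table and index names from the sample code at the end of the post and the same 10% target as the earlier shrink.

```sql
USE master;
GO

-- 1. Shrink first, leaving 10% free space in the files.
DBCC SHRINKDATABASE (TestingShrink, 10);
GO

USE TestingShrink;
GO

-- 2. Only then rebuild, to repair the fragmentation the shrink caused.
--    A rebuild needs free space to work in, so the file may grow again.
ALTER INDEX idx_Large1 ON LargeTable1 REBUILD;
ALTER INDEX idx_Large2 ON LargeTable2 REBUILD;
GO
```

Because the rebuild can grow the file back, do not expect to end up with both a minimal file and unfragmented indexes.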

Paul Randal wrote a very nice post on the downsides of shrink, entitled “Turn Auto Shrink Off!” Pretty much says it all.
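
For reference, checking for auto shrink and turning it off takes only a couple of statements. This is a minimal sketch; TestingShrink is simply the sample database from this post, so substitute your own database name.

```sql
-- List the databases that currently have auto shrink enabled.
SELECT name
FROM sys.databases
WHERE is_auto_shrink_on = 1;

-- Turn it off for one database.
ALTER DATABASE TestingShrink SET AUTO_SHRINK OFF;
```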

Caveat: There are cases where shrinking data files does make sense: when a process created lots of tables for processing and then dropped them again, after a massive archiving job, or after changing data types in a table to release a large amount of wasted space (more on that another time). Just be aware of the effect of a shrink on the fragmentation of indexes.

Edit: Some more thoughts from Paul Randal on shrinking databases: Autoshrink. Turn it OFF!

Sample Code:

SET NOCOUNT ON
GO

CREATE DATABASE TestingShrink
GO

ALTER DATABASE TestingShrink SET RECOVERY SIMPLE
GO

USE TestingShrink
GO

CREATE TABLE LargeTable1 ( -- row size of ~700 (10 rows per page)
    ID BIGINT,
    SomeString CHAR(600),
    Row_ID UNIQUEIDENTIFIER,
    AValue NUMERIC(30,8),
    RandomDate DATETIME
)

CREATE TABLE LargeTable2 ( -- row size of ~700 (10 rows per page)
    ID BIGINT,
    SomeString CHAR(600),
    Row_ID UNIQUEIDENTIFIER,
    AValue NUMERIC(30,8),
    RandomDate DATETIME
)
GO

-- ensuring high fragmentation
CREATE CLUSTERED INDEX idx_Large1 ON LargeTable1 (Row_ID)
CREATE CLUSTERED INDEX idx_Large2 ON LargeTable2 (Row_ID)
GO

DECLARE @i SMALLINT
SET @i = 0
WHILE (@i < 8)
BEGIN
    ;WITH DataPopulate (RowNo, Strng, Uniqueid, Num, ADate) AS (
        SELECT 1 AS RowNo, 'abc' AS Strng, NEWID() AS Uniqueid, RAND()*856542 AS Num, DATEADD(dd, FLOOR(RAND()*75454), '1753/01/01')
        UNION ALL
        SELECT RowNo+1, 'abc' AS Strng, NEWID() AS Uniqueid, RAND(RowNo*25411)*856542 AS Num, DATEADD(dd, FLOOR(RAND(RowNo*96322)*85454), '1753/01/01')
        FROM DataPopulate WHERE RowNo < 10000
    )
    INSERT INTO LargeTable1
    SELECT * FROM DataPopulate
    OPTION (MAXRECURSION 10000)

    ;WITH DataPopulate (RowNo, Strng, Uniqueid, Num, ADate) AS (
        SELECT 1 AS RowNo, 'abc' AS Strng, NEWID() AS Uniqueid, RAND()*856542 AS Num, DATEADD(dd, FLOOR(RAND()*75454), '1753/01/01')
        UNION ALL
        SELECT RowNo+1, 'abc' AS Strng, NEWID() AS Uniqueid, RAND(RowNo*25411)*856542 AS Num, DATEADD(dd, FLOOR(RAND(RowNo*96322)*85454), '1753/01/01')
        FROM DataPopulate WHERE RowNo < 10000
    )
    INSERT INTO LargeTable2
    SELECT * FROM DataPopulate
    OPTION (MAXRECURSION 10000)

    SET @i = @i + 1
END
GO

DBCC SHOWCONTIG(LargeTable1) -- 99.30%
DBCC SHOWCONTIG(LargeTable2) -- 99.21%
DBCC SHOWFILESTATS -- 2467 extents total, 2463 used (157 MB total, 256kb free)
GO

-- Rebuild the indexes. This should grow the database quite a bit.
ALTER INDEX idx_Large1 ON LargeTable1 REBUILD
ALTER INDEX idx_Large2 ON LargeTable2 REBUILD
GO

DBCC SHOWCONTIG(LargeTable1) -- 0%
DBCC SHOWCONTIG(LargeTable2) -- 1%
DBCC SHOWFILESTATS -- 3363 extents total, 1697 used (215 MB total, 106 MB free)
GO

USE master
GO
DBCC SHRINKDATABASE (TestingShrink, 10) -- Shrink to 10% free
GO
USE TestingShrink
GO

DBCC SHOWFILESTATS -- 1885 extents total, 1695 used (120 MB total, 12 MB free)
DBCC SHOWCONTIG(LargeTable1) -- 99.99%
DBCC SHOWCONTIG(LargeTable2) -- 7.08%
GO

USE master
GO

DROP DATABASE TestingShrink
GO
