In our next blog post about speeding up your SQL Server database backup and restore processes, we’re going to focus on moving tables into their own filegroups. If you frequently need to restore production databases into QA/Dev environments, but are frustrated with how long it takes, keep reading.
Move Large Tables Into Their Own Filegroups
In last week’s blog post, we discussed how you can speed up your backup and restore processes by moving large tables into a separate database, all while ensuring your applications continue to run smoothly.
If you don’t want to lose referential integrity between the data in your primary database and the new database, you may not want to use that approach, and are looking for an alternative solution.
SQL Server supports the ability to place tables in different Filegroups, to aid in allocation and administration. Backup and restore processes can then target specific Filegroups, and therefore, specific tables.
Why Should I Use Filegroups To Segment My Data?
By placing our tables in dedicated Filegroups, and targeting our backup and restore operations to only those Filegroups that contain the data we want to restore, we can significantly speed up those backup and restore operations.
However, because the tables remain part of the same database, our referential integrity isn’t affected. Foreign Keys can continue to do their job (providing the data is present).
With the approach demonstrated below, applications typically require zero code changes, since application code is agnostic to the Filegroup a table belongs to.
Before our backups and restores can take advantage of Filegroups, we need to create the Filegroups themselves.
For this example, we’re using the 10GB version of the Stack Overflow database, which has been modified to include an extra (fictional) table named ‘Audit’. This table keeps track of activity in the system, and has grown very large over time.
As things stand, the database only has a single Filegroup. When it comes to backing up and restoring, we have no choice but to back up and restore the entire database, including the large Audit table.
SELECT * FROM sys.[filegroups] AS [F]
Step 1 - Create a New Filegroup
Firstly, we’ll create a new Filegroup and add a new data file to it. This Filegroup will eventually hold our Audit table.
/*1. Create a new file group*/
USE [master];
ALTER DATABASE [StackOverflow2010] ADD FILEGROUP [FG_Audit];
GO

/*2. Add a new file to this file group*/
ALTER DATABASE [StackOverflow2010]
ADD FILE
(
    NAME = [StackOverflow_Audit],
    FILENAME = 'C:\Databases\StackOverflow_Audit.ndf',
    SIZE = 1024,
    MAXSIZE = UNLIMITED,
    FILEGROWTH = 256MB
)
TO FILEGROUP [FG_Audit];
GO
Verify that the new Filegroup has been created:
SELECT * FROM sys.[filegroups] AS [F]
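If you also want to confirm which files belong to which Filegroup, you can join sys.filegroups to sys.database_files on data_space_id:

```sql
/*Show each filegroup and the files it contains*/
SELECT [FG].[name] AS [FilegroupName],
       [DF].[name] AS [LogicalFileName],
       [DF].[physical_name]
FROM sys.filegroups AS [FG]
JOIN sys.database_files AS [DF]
    ON [DF].[data_space_id] = [FG].[data_space_id];
```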
Step 2 - Move the Table Into the New Filegroup
One of the most common ways to move a table into a new Filegroup is by dropping the clustered index and recreating it on the new Filegroup.
This is the most time-consuming and impactful stage in the process. Dropping and recreating the clustered index on a large table is not something you do lightly. The table is going to be locked and unavailable for the duration. In an ideal world, we would have planned and implemented our Filegroups before inserting any data into the affected tables.
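Had the Filegroup existed from the start, the table could simply have been created on it directly, avoiding the rebuild entirely. A sketch of what that would look like (the column list here is illustrative, not the real Audit schema):

```sql
/*Create a table directly on a dedicated filegroup at design time (columns are illustrative)*/
CREATE TABLE [dbo].[Audit]
(
    [ID] INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    [Activity] NVARCHAR(200) NOT NULL
) ON [FG_Audit];
```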
The good news is that once it’s done, it’s done! You don’t have to do this each time, but all your future backup and restore operations can benefit.
CREATE UNIQUE CLUSTERED INDEX [PK__Audit__3214EC07F6E60506]
ON [StackOverflow2010].[dbo].[Audit] ([ID])
WITH (DROP_EXISTING = ON) /*Could add ONLINE = ON here if your edition supports it*/
ON [FG_Audit];
Note that you’ll need to repeat this process for all the indexes on the table that you’re moving.
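For example, if the Audit table had a nonclustered index (the index name and column below are hypothetical), it could be moved the same way:

```sql
/*Rebuild a (hypothetical) nonclustered index onto the new filegroup*/
CREATE NONCLUSTERED INDEX [IX_Audit_UserId]
ON [dbo].[Audit] ([UserId])
WITH (DROP_EXISTING = ON)
ON [FG_Audit];
```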
Step 3 - Backup Only the Primary Filegroup
Now that we’ve moved all that data into its own Filegroup, we can target our backups at only the Filegroup we care about: the Primary Filegroup.
BACKUP DATABASE [StackOverflow2010]
FILEGROUP = 'PRIMARY'
TO DISK = 'C:\Databases\Backups\StackOverflowPrimaryFG.bak'
WITH COPY_ONLY; /*optional*/
Our Primary-Filegroup-only backup is now much smaller than the full backup:
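If you want to confirm exactly which database files made it into the backup, RESTORE FILELISTONLY will list them without performing a restore:

```sql
/*List the database files contained in the backup (no restore is performed)*/
RESTORE FILELISTONLY
FROM DISK = 'C:\Databases\Backups\StackOverflowPrimaryFG.bak';
```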
Step 4 - Restore Only the Primary Filegroup
In our restore environment, we’ll restore only the Primary Filegroup:
RESTORE DATABASE [StackOverflow2010]
FILEGROUP = 'PRIMARY'
FROM DISK = 'C:\Databases\Backups\StackOverflowPrimaryFG.bak'
WITH PARTIAL; /*PARTIAL begins a piecemeal restore, allowing the database to come online without the remaining Filegroups*/
Your database is now restored, minus the massive tables that you don’t need in your QA/Dev environments.
But there’s one final point to consider: any applications and database objects that depend on those missing tables will now error.
If we try to query any of the objects that reside in a Filegroup that we did not restore, we receive the following error:
SELECT * FROM [dbo].[Audit] AS [A]
Msg 8653, Level 16, State 1, Line 1
The query processor is unable to produce a plan for the table or view ‘Audit’ because the table resides in a filegroup that is not online.
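You can see which files are not online by checking sys.database_files; files belonging to Filegroups that weren’t restored will typically show a state such as RECOVERY_PENDING rather than ONLINE:

```sql
/*Check the state of each database file; unrestored files will not be ONLINE*/
SELECT [name], [state_desc]
FROM sys.database_files;
```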
Your first thought may well be to simply create a blank copy of those tables, and you’d be right! However, it’s not quite that simple.
The definition of those tables still exists within the metadata of our database. Therefore, any attempt to create the table again will throw an error.
Fortunately, there’s a simple trick to get around this, which is to rename the missing tables:
EXEC sp_rename 'dbo.Audit', '_Audit';
We can then create our missing table in the Primary Filegroup, and even insert some data if required.
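A minimal sketch of that recreation step (the column list here is illustrative, not the real Audit schema):

```sql
/*Recreate an empty Audit table on the Primary Filegroup (columns are illustrative)*/
CREATE TABLE [dbo].[Audit]
(
    [ID] INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    [Activity] NVARCHAR(200) NOT NULL,
    [CreatedDate] DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
) ON [PRIMARY];
```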
We’re now able to back up and restore our production databases into our QA/UAT environments, without bringing over all those GBs of data that we don’t want or need.
In our next blog post, we’ll look at how partitioning the data lets us get even more granular, targeting our backups and restores at specific data within a table, rather than the table as a whole.
Thanks for reading!