TempDB Configuration Matters

TempDB Database

For each SQL Server installation there are several accompanying system databases that help keep SQL Server running and functional.  They store information about security, what the server is doing, and how the system should process certain functions.  The TempDB database doesn’t keep permanent information; in fact, each time SQL Server restarts, the TempDB database is purged and rebuilt as if it were a new database.  Why then should I care about the TempDB database?  Well, I am glad you asked.

Lots of Hats

The TempDB database handles several functions, including sorting, local and global temporary tables, index rebuilds that sort in TempDB, hash comparisons, XML variables, spooling, triggers, snapshot isolation, and other internal functions.  In short, it is the workhorse of the environment, and ALL databases on the system will interact with it at some point, so its configuration becomes very important.

Transaction Dependent

Because table variables are stored in the TempDB database and hash joins spill to it when they run short of memory, queries that use these objects or operators have part of their processing done in TempDB.  When a query needs to move work to TempDB, it creates a worktable and writes data out.  What happens when I have 1,000 transactions per second and all of them have a hash join that spills?  I will have a line of threads out the door, each waiting its turn to write to TempDB.  I can potentially have GAM (Global Allocation Map) and SGAM (Shared Global Allocation Map) contention.  Also, because a transaction cannot finish until this work completes, the speed at which the data can be written and read becomes important.
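One way to see whether that line is really forming is to look at the tasks currently waiting on page latches in TempDB.  The query below is a minimal sketch, assuming you have VIEW SERVER STATE permission; it relies on TempDB always being database_id 2, and the column list is simply the one I find useful.

```sql
-- Tasks currently waiting on page latches for pages that belong to TempDB.
-- resource_description for page latches reads database_id:file_id:page_id.
SELECT wt.session_id,
       wt.wait_type,
       wt.wait_duration_ms,
       wt.resource_description
FROM sys.dm_os_waiting_tasks AS wt
WHERE wt.wait_type LIKE N'PAGELATCH%'
  AND wt.resource_description LIKE N'2:%'   -- database_id 2 is TempDB
ORDER BY wt.wait_duration_ms DESC;
```

In each data file, page 1 is the first PFS page, page 2 the first GAM page, and page 3 the first SGAM page, so rows ending in :1, :2, or :3 point at allocation contention rather than ordinary data page waits.  The view only shows what is waiting at the instant you run it, so sample it several times under load before drawing conclusions.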

Providing some relief

There are two recommendations that can provide immediate relief to your TempDB environment.  The first is to create multiple TempDB data files of the same size.  This will help alleviate the SGAM contention.  The second is to move the TempDB data files to the fastest disks in the system and/or spread them across as many spindles as possible.  TempDB is the first candidate for flash drives if the entire system cannot be placed on those disks.
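As a rough sketch of the second recommendation, the statements below list the current TempDB files and then point the primary data file at a new location.  The T:\TempDB path is a hypothetical placeholder, and tempdev is only the default logical name, so check the output of the first query before running the second.

```sql
-- Current TempDB files: logical name, location, size in MB, and growth settings.
-- size is stored in 8 KB pages, hence the * 8 / 1024.
SELECT name,
       type_desc,
       physical_name,
       size * 8 / 1024 AS size_mb,
       growth,
       is_percent_growth
FROM tempdb.sys.database_files;

-- Point the primary data file at faster storage (hypothetical path).
-- The folder must already exist and be writable by the SQL Server service account.
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, FILENAME = N'T:\TempDB\tempdb.mdf');
```

The new location is only used the next time the SQL Server service restarts, at which point TempDB is re-created there; the old file can be deleted afterwards.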

How large should my TempDB files be? Determining the size of the TempDB files may be a bit of trial and error; however, if you already have databases on the system, one way to help make a decision is to run DBCC CHECKDB WITH ESTIMATEONLY against them to get an estimate of the TempDB space the CHECKDB command would need.
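For illustration, the command looks like this; SalesDB is a hypothetical database name, so substitute your own and repeat for the largest databases on the instance.

```sql
-- Reports the estimated TempDB space CHECKDB would need for this database.
-- With ESTIMATEONLY the actual consistency check is not performed.
DBCC CHECKDB (N'SalesDB') WITH ESTIMATEONLY;
```

Treat the number as a starting point rather than a guarantee, since CHECKDB is only one of the many consumers of TempDB described above.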

How many TempDB Files should I have? While there are varying ideas, the one I hold to is 1 file per CPU core up to 8 files.  Monitoring will help provide insight if more are needed in the future.
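A minimal sketch of applying that rule follows.  The file name, path, size, and growth values are hypothetical and should be replaced with figures that fit your system; also note that cpu_count reports logical CPUs, which can be higher than the number of physical cores when hyper-threading is enabled.

```sql
-- How many logical CPUs does SQL Server see?
SELECT cpu_count
FROM sys.dm_os_sys_info;

-- Add one more data file. Repeat with a new NAME and FILENAME for each
-- additional file, up to the count you settled on (8 at most to start).
-- Give every data file the same SIZE and FILEGROWTH, and make sure the
-- existing primary file matches them as well.
ALTER DATABASE tempdb
ADD FILE (
    NAME = tempdev2,
    FILENAME = N'T:\TempDB\tempdev2.ndf',
    SIZE = 4096MB,
    FILEGROWTH = 512MB
);
```

Unlike moving a file, adding a file takes effect immediately and does not require a restart.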

The Importance of Backups

A re-post of a posting at b2bsol.com.

Too many organizations do not have adequate protection of their data and are susceptible to data loss.  While security is important and you might think this post is about implementing policy to limit your exposure to hacking, I am talking about something much more basic than that–I am talking about database backups.  A database backup is the first step in ensuring data availability and limiting exposure to corruption or human error.

What kind of backup strategy do you need?  Well, what is your tolerance for data loss?

To help you answer this question, there are two components you need to consider.

RPO – Recovery Point Objective

TechTarget defines RPO as “the age of files that must be recovered from backup storage for normal operations to resume if a computer, system, or network goes down as a result of a hardware, program, or communications failure.”  My definition would be something like this: the moment in time you want to be able to restore to.  If you experience corruption or disk failure, how close do you need to get to that point in time?  Defining this metric, which will vary from system to system, will give you your RPO.
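To get a feel for where you stand today, you can compare the age of your most recent backups against the RPO you have in mind.  The query below is a sketch that reads the backup history SQL Server keeps in msdb; it assumes that history has not been purged and that backups are taken natively, and the column aliases are my own.

```sql
-- Most recent full and log backup per database, from msdb backup history.
-- type 'D' = full database backup, 'L' = transaction log backup.
SELECT d.name AS database_name,
       d.recovery_model_desc,
       MAX(CASE WHEN b.type = 'D' THEN b.backup_finish_date END) AS last_full_backup,
       MAX(CASE WHEN b.type = 'L' THEN b.backup_finish_date END) AS last_log_backup
FROM sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
       ON b.database_name = d.name
WHERE d.name <> N'tempdb'          -- TempDB cannot be backed up
GROUP BY d.name, d.recovery_model_desc
ORDER BY d.name;
```

A NULL in either column means no backup of that kind is on record.  The gap between now and the newest backup is roughly the amount of data you would lose if the server failed this minute.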

RTO – Recovery Time Objective

TechTarget defines RTO as “the maximum tolerable length of time that a computer, system, network, or application can be down after a failure or disaster occurs.”  My definition would be: the time needed to restore a system to usable functionality. I should note this time includes the alerting and response time of your team if human intervention is required.
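The restore portion of that time is something you can measure by rehearsing on a test server.  The batch below is only a sketch: every database name, logical file name, and path in it is hypothetical, and it assumes one full backup followed by one log backup.

```sql
-- Run RESTORE FILELISTONLY FROM DISK = N'B:\Backups\SalesDB_full.bak';
-- first to find the real logical file names to use in the MOVE clauses.
DECLARE @start datetime2 = SYSDATETIME();

-- Restore the full backup as a separate copy and leave it ready for log restores.
RESTORE DATABASE SalesDB_RestoreTest
FROM DISK = N'B:\Backups\SalesDB_full.bak'
WITH MOVE N'SalesDB'     TO N'D:\Data\SalesDB_RestoreTest.mdf',
     MOVE N'SalesDB_log' TO N'L:\Logs\SalesDB_RestoreTest.ldf',
     NORECOVERY,
     STATS = 10;

-- Apply the log backup and bring the copy online.
RESTORE LOG SalesDB_RestoreTest
FROM DISK = N'B:\Backups\SalesDB_log.trn'
WITH RECOVERY;

SELECT DATEDIFF(SECOND, @start, SYSDATETIME()) AS restore_seconds;
```

The elapsed seconds cover only the restore itself; add the time it takes to notice the failure, reach the right person, and locate the backup files to arrive at a realistic RTO.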

It would be easy to say, “I want less than 1 second of data loss (RPO) and less than 5 minutes of downtime (RTO)”; however, don’t expect to pay TGI Fridays prices for Ruth’s Chris service.  Microsoft has made great strides in giving SQL Server many options for High Availability and Disaster Recovery and the ability to keep the system up; however, none of these solutions removes the requirement to take backups.  The amount of history you keep will depend on your business requirements and the costs associated with keeping that storage.
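For reference, a bare-bones native backup looks something like the following.  The database name and paths are hypothetical, and in practice these statements would run on a schedule, with unique file names per run, rather than by hand.

```sql
-- Full backup. INIT overwrites any existing backup sets in the target file.
BACKUP DATABASE SalesDB
TO DISK = N'B:\Backups\SalesDB_full.bak'
WITH CHECKSUM, INIT;

-- Transaction log backup. Requires the FULL or BULK_LOGGED recovery model
-- and at least one prior full backup.
BACKUP LOG SalesDB
TO DISK = N'B:\Backups\SalesDB_log.trn'
WITH CHECKSUM, INIT;

-- Confirm the backup file is readable and its checksums are valid.
RESTORE VERIFYONLY
FROM DISK = N'B:\Backups\SalesDB_full.bak'
WITH CHECKSUM;
```

How often the log backup runs largely determines how much data you can lose, so its schedule should come from your RPO.  VERIFYONLY is a useful sanity check, but it is not a substitute for periodically restoring a backup for real.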

If your organization does not have the RPO and RTO points defined, it is definitely time to make it happen.