
SQL Server: How many data files do I need for TempDB? (For heavy-traffic environments)

Starting with SQL Server 2005, tempdb's role has increased significantly. It is heavily used for data processing: MARS, sorts, reindexing, temp tables, table variables, and so on. Performance tuning TempDB is not just an important task, it is a non-ignorable task for any large-scale system (VLDB or evolving VLDBs).
There are several best practices for making overall improvements to a SQL Server instance, from hardware to operating system settings, SQL Server configuration, storage architecture, table structure, indexes, and more. Today, I am going to focus on one aspect that I've been meaning to write about for a long time: optimizing TempDB performance using multiple data files.
As you know, any database has data and log files. Each data file has a dedicated thread to perform IO and other necessary operations. If all your data is in one MDF file, then all table reads and writes have to go through the same thread. Even if the underlying disk subsystem is fast, you are limiting the overall throughput with just one thread. More data files mean more threads to read and write data simultaneously. But there is a right balance to strike. Adding a new data file for each table to increase parallelism may not be the first thing to attempt (it might be a good idea in some cases, but it needs to be properly tested).
When you have one MDF file for TempDB, all the processes that need TempDB space compete for the same limited resources. When you have multiple files, they are managed in parallel with dedicated threads. A larger size means more bitmap allocation pages are available, and multiple files mean parallel execution.
Since SQL Server uses a round-robin, proportional-fill algorithm to distribute the data load across all the data files, it works best when the data files are kept at the same size, which allows them to grow evenly. When you first create the tempdb data files, be generous and allocate a large size. Do not rely on auto-growth: it is a backup plan, not part of the primary plan. So allocate more space than you need for each data file initially, then monitor the usage and adjust the size to what is appropriate.
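As a minimal sketch of what that pre-allocation can look like (the logical names, file paths, sizes, and growth increments below are placeholders you would adjust for your own environment):

    -- Pre-size the existing primary tempdb data file (tempdev is the default logical name)
    ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, SIZE = 8GB, FILEGROWTH = 512MB);

    -- Add extra data files of the same size (names and paths are examples only)
    ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev2, FILENAME = 'T:\TempDB\tempdev2.ndf', SIZE = 8GB, FILEGROWTH = 512MB);

    ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev3, FILENAME = 'T:\TempDB\tempdev3.ndf', SIZE = 8GB, FILEGROWTH = 512MB);

Keeping SIZE and FILEGROWTH identical across all the files is what lets the proportional-fill algorithm spread the load evenly.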
Now, the important question is: 'How many data files do I create for my tempdb?'
This is the logic I follow (and what I learnt from other MVPs).
  • If you have a quad-core CPU, then add 4 data files (all of equal size).
  • If you have dual quad-core CPUs, then add 8 data files.
  • If you have four quad-core CPUs or more (e.g., 64 cores), then start with 8 data files and work your way up, adding 4 data files at a time.
In essence, keep the number of data files the same as the number of cores. Monitor your system and see what works. There is no silver bullet here, only guidelines for making incremental improvements.
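As a quick sanity check against that guideline (a sketch, assuming you just want the two counts side by side), you can compare the number of schedulers SQL Server sees with the number of tempdb data files you already have:

    -- CPU cores (schedulers) visible to SQL Server
    SELECT COUNT(*) AS visible_schedulers
    FROM sys.dm_os_schedulers
    WHERE status = 'VISIBLE ONLINE';

    -- Data files currently in tempdb
    SELECT COUNT(*) AS tempdb_data_files
    FROM tempdb.sys.database_files
    WHERE type_desc = 'ROWS';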
Use these system views to measure tempdb file usage and growth, as in the query after this list:
  • sys.dm_io_virtual_file_stats
  • sys.database_files
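A sketch of how the two views can be combined to see how evenly IO is spread across the tempdb data files (the column selection here is just one reasonable starting point):

    SELECT  df.name,
            df.physical_name,
            df.size * 8 / 1024 AS size_mb,   -- sys.database_files reports size in 8 KB pages
            fs.num_of_reads,
            fs.num_of_writes,
            fs.io_stall_read_ms,
            fs.io_stall_write_ms
    FROM    sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL) AS fs
    JOIN    tempdb.sys.database_files AS df
            ON fs.file_id = df.file_id
    WHERE   df.type_desc = 'ROWS'
    ORDER BY df.file_id;

If one file shows far more reads, writes, or IO stalls than the others, the files are probably not sized equally (or were added long after the original MDF).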