Search This Blog

SQL Server: Data File Structure

In our earlier posts , we have discussed about various types pages in details. In this post let us discuss about the organization of the data file.In other words how these pages are organized in the data file.
As you know, SQL server divides the data file into 8 KB pages and page is the smallest unit of any IO operation. SQL server refer the first page in the data file as page number 0.In all the data file first 9  pages (till page number 8) are in same order as shown below.In the primary data file, the 10th page (Page number 9) will be the boot page which store the metadata about the database.





The first page,Page number 0 the file header (page type 15). It hold the file header information. There will be only one header page for each data file and that reside in the 0th position. The header page store the file level information like file size ,minimum size,max size and file growth.

The second page , Page number 1 is the first PFS page (Page type 11). PFS page is the second page (Page Number 1) in the data file followed by file header (Page Number 0). GAM and SGAM are used to track the extent allocation status where as PFS pages are used to track page level allocation. While allocating pages , database engine identify the extent with free pages using the GAM and SGAM. Once the database engine found extent with free page, it use the PFS page to identify the unallocated page in that extent and amount of free space in the page. Free space is only tracked for pages storing LOB values (ie text/image, varchar(max),nvarchar(max),varbinary(max) ,row overflow data) and heap data pages. By default LOB data are stored in a separate page by keeping only a pointer along with record stored in the data pages. These are the only pages where data can be stored in the available free pages.For index pages, the data should be stored in the order of index and there is no options in the insertion point of the records.So it will be unnecessary overhead of tracking free space in the index pages. PFS page repeat after every 8088 pages. That means page number 1,8088,16176,24264...are PFS page in every data file.Read more about PFS page

The third page, Page Number 2 is the first GAM(Global Allocation Map) page (Page type 8).GAM pages records what extents have been allocated for any use. GAM has bit for every extent. If the bit is 1, the corresponding extent is free, if the bit is 0, the corresponding extent is in use as uniform or mixed extent.A GAM page can hold information of around 64000 extents. That is, a GAM page can hold information of (64000X8X8)/1024 = 4000 MB approximately. In short, a data file of size 7 GB will have two GAM pages.Read more about GAM pages

The fourth page, Page Number 3 is the first SGAM(Shared Global Allocation Map) page (Page type 8).SGAM pages record what extents are currently being used as mixed extent and also have at least one unused page. SGAM has bit for every extent. If the bit is 1, the corresponding extent is used as a mixed extent and has at least one page free to allocate. If the bit is 0, the extent is either not used as a mixed extent or it is mixed extent and with all its pages being used. A SGAM page can hold information of 64000 extents. That is, a SGAM page can hold information of (64000X8X8)/1024 = 4000 MB. In short, a data file of size 7 GB will have two SGAM page.Read more about SGAM page

The fifth and sixth page, page number 4 and 5 are not used in the current architecture of the SQL server. The page type of these pages is 0. DBCC page for these pages will print the header part and it will end with invalid page type error.

The seventh page, Page Number 6 is the first DCM (Differential Change Map) page (Page type 16).SQL Server uses DCM pages  to track extent modified after the last full backup operation. DCM pages track the extents modified after the full backup.DCM has a bit for every extent it tracks. If the bit is 1, the corresponding extent is modified after the last full backup, if the bit is 0, there is no change in the corresponding extent after the last full backup.A DCM page can hold information of around 64000 extents. Precisely DCM pages repeat after every 511232 pages .One DCM page can track the changes of exactly 63904 extents.The second DCM page will be 511238th page. Read more about DCM pages.

The eight page, Page Number 7 is the first BCM (Bulk Change Map) page (Page type 17). SQL Server uses BCM pages to track extent modified by bulk logged operation since last backup log operation. BCM has a bit for every extent it tracks. If the bit is 1, the corresponding extent is modified after the last log backup due to bulk logged operation, if the bit is 0, there is no change in the corresponding extent due to bulk logged operation after the last log backup.A BCM page can hold information of around 64000 extents. Precisely BCM pages repeat after every 511232 pages .One BCM page can track the changes of exactly 63904 extents.The second BCM page will be 511239th page.Read more about BCM pages

The ninth page, Page Number 8 is the first IAM(Index Allocation Map) page (Page type 10). An IAM page track the pages/extents allocation in GAM interval of a partition for specific allocation unit of a table. Read More about IAM pages

The tenth page, Page Number 9 is the boot page Page type 13). Boot page is available as 9th page only in the primary data file. Boot page will not be available in the secondary data file.We can see the content of this page using the DBCC Page command and some of the values stores in this page are self explanatory.If this page is corrupted for some reason, it is not possible to recover the database using DBCC CheckDb. The page restore also will not help in this situation. The only possible way to recover the database is restore from last good backup.

From the 11th page on wards, you can see mix different types pages like data page,index pages,IAM pages,row-overflow pages and LOB pages.Page type of Data pages and index pages is 1 and 2 respectively.Page type of  Row-overflow and LOB pages is 3. You can read more about the data page structure from this post. Data pages and index pages are stored almost in the same structure.

Row overflow pages are used to store the row data that did not fit into the data page.LOB pages are used to store the large objects which are not stored as part of row data.