Q. What is Fragmentation? How to detect fragmentation and how to eliminate it?
A. Storing data non-contiguously on disk is known as fragmentation. Before learning to eliminate fragmentation, you should have a clear understanding of the types of fragmentation. We can classify fragmentation into two types:
- Internal Fragmentation: When records are stored non-contiguously inside the page, then it is called internal fragmentation. In other words, internal fragmentation is said to occur if there is unused space between records in a page. This fragmentation occurs through the process of data modifications (INSERT, UPDATE, and DELETE statements) that are made against the table and therefore, to the indexes defined on the table. As these modifications are not equally distributed among the rows of the table and indexes, the fullness of each page can vary over time. This unused space causes poor cache utilization and more I/O, which ultimately leads to poor query performance.
- External Fragmentation: When on disk, the physical storage of pages and extents is not contiguous. When the extents of a table are not physically stored contiguously on disk, switching from one extent to another causes higher disk rotations, and this is called Extent Fragmentation.
Index pages also maintain a logical order of pages inside the extent. Every index page is linked with previous and next page in the logical order of column data. However, because of Page Split, the pages turn into out-of-order pages. An out-of-order page is a page for which the next physical page allocated to the index is not the page pointed to by the next-page pointer in the current leaf page. This is called Logical Fragmentation.
Ideal non-fragmented pages are given below:
Statistics for table scan are as follows:
- Page read requests: 2
- Extent switches: 0
- Disk space used by table: 16 KB
- avg_fragmentation_in_percent: 0
- avg_page_space_used_in_percent: 100
Following are fragmented pages:
In this case, the statistics for table scan are as follows:
- Page read requests: 6
- Extent switches: 5
- Disk space used by table: 48 KB
- avg_fragmentation_in_percent > 80
- avg_page_space_used_in_percent: 33
How to detect Fragmentation: We can get both types of fragmentation using the DMV: sys.dm_db_index_physical_stats. For the screenshot given below, the query is as follows:
SELECT OBJECT_NAME(OBJECT_ID), index_id,index_type_desc,index_level,
(DB_ID(N'AdventureWorksLT'), NULL, NULL, NULL , 'SAMPLED')
ORDER BY avg_fragmentation_in_percent DESC
Along with other information, there are two important columns that for detecting fragmentation, which are as follows:
- avg_fragmentation_in_percent: This is a percentage value that represents external fragmentation. For a clustered table and leaf level of index pages, this is Logical fragmentation, while for heap, this is Extent fragmentation. The lower this value, the better it is. If this value is higher than 10%, some corrective action should be taken.
- avg_page_space_used_in_percent: This is an average percentage use of pages that represents to internal fragmentation. Higher the value, the better it is. If this value is lower than 75%, some corrective action should be taken.
- Reducing Fragmentation in a Heap: To reduce the fragmentation of a heap, create a clustered index on the table. Creating the clustered index, rearrange the records in an order, and then place the pages contiguously on disk.
- Reducing Fragmentation in an Index: There are three choices for reducing fragmentation, and we can choose one according to the percentage of fragmentation:
- If avg_fragmentation_in_percent > 5% and < 30%, then use ALTER INDEX REORGANIZE: This statement is replacement for DBCC INDEXDEFRAG to reorder the leaf level pages of the index in a logical order. As this is an online operation, the index is available while the statement is running.
- If avg_fragmentation_in_percent > 30%, then use ALTER INDEX REBUILD: This is replacement for DBCC DBREINDEX to rebuild the index online or offline. In such case, we can also use the drop and re-create index method.
- (Update: Please note this option is strongly NOT recommended)Drop and re-create the clustered index: Re-creating a clustered index redistributes the data and results in full data pages. The level of fullness can be configured by using the FILLFACTOR option in CREATE INDEX.
Reference: Pinal Dave (https://blog.sqlauthority.com)