indexes

About fragmentation in SQL Server

- Extent = 8 pages = 64K.
- On a page split, SQL Server generally moves half the total number of rows in the original page to the new page.
- If new rows are added in the order of the clustered index, then the index rows will be always added at the trailing end of the index, preventing the page splits otherwise caused by the INSERT statements.
- For queries that don’t have to traverse a series of pages to retrieve the data, fragmentation can have minimal impact.
- For a table with a clustered index, the fragmentation of the clustered index is the same with the fragmentation of the data pages, since the leaf pages of the clustered index and data pages are the same.
- A small table (or index) with fewer than eight pages is simply unlikely to benefit from efforts to remove the fragmentation because it will be stored on mixed extents.

Internal fragmentation
- When data is fragmented within the same extent (= 8 pages).
- A little internal fragmentation can be beneficial, because it allows you to perform INSERT and UPDATE queries without causing page splits.

External fragmentation
- When data is fragmented over 2 extents.
- A range scan on an index will need more switches between the corresponding extents than ideally required. A range scan on an index will be unable to benefit from read-ahead operations performed on the disk.
- For better performance, it is preferable to use sequential I/O, since this can read a whole extent (8 x 8KB) in a single disk I/O operation. By contrast, a noncontiguous layout of pages requires nonsequential or random I/O operations to retrieve the index pages from the disk, and a random I/O operation can read only 8KB of data in a single disk operation (this may be acceptable, however, if you are retrieving only one.

Columnstore indexes vs indexed views

Columnstore indexes Indexed views
It didn't require Enterprise Edition in SQL Server 2016 SP1+ Limited usage in non-Enterprise Editions
No session setting requirements Requires certain session settings set on
A lot less storage required More storage required
Less administrative overhead / maintenance More administrative overhead / maintenance
Not able to to inserts/updates prior to SQL Server 2014 Most costly to maintain during inserts/updates

Common myths about index fragmentation in SQL Server

Rebuild all the indexes every night.Not necessary as not all indexes are fragmented => waste of server resources. Just thing to transaction log activity and disk activity. Some indexes fragmentation can be done with INDEX REORGANIZE.

Add more memory to the server.More memory doesn't stop index fragmentation happening.

Index fragmentation is irrelevant when using SSD.SSD allow index scans to be faster, but SSD doesn't prevent index fragmentation.

Online index rebuilds do not causes blocking.Online index rebuilds acquire locks that can cause long-term blocking.

Use the same fill factor for all indexes.Some indexes aren't going to have any fragmentation => waste space.

What are the implications of index fragmentation?

Slower index scans.

Increased disk space usage.The density of rows on the pages is lower => store fewer rows per page => wasting space

Increased buffer pool usage.

Causes increased transaction log generation.

[SQL Server] Number of pages in a table

The query below helps you to find data file pages for a table based on sys.allocation_units table.

SELECT
        T.NAME AS table_name,
        P.rows AS no_rows,
        SUM(AU.total_pages) AS total_pages,
        SUM(AU.used_pages) AS used_pages,
        (SUM(AU.total_pages) - SUM(AU.used_pages)) AS unused_pages
FROM
        sys.tables T
        INNER JOIN sys.indexes I ON T.OBJECT_ID = I.object_id
        INNER JOIN sys.partitions P ON I.object_id = P.OBJECT_ID AND I.index_id = P.index_id
        INNER JOIN sys.allocation_units AU ON P.partition_id = AU.container_id
GROUP BY
        T.name, P.rows
ORDER BY total_pages DESC

Using Indexes Effectively in SQL Server

Adding indexes on a specific table doesn't guarantee you the queries will use these indexes effectively. These are some of the query design rules you should follow to improve the use of indexes:

Avoid nonsargable search conditions
A sargable predicate in a query is one in which an index can be used.

Sargable =, >, >=, <, <=
BETWEEN
LIKE 'text%'
Nonsargable <>, !=
NOT EXISTS
NOT IN
OR
LIKE '%text'

Avoid arithmetic operators on the WHERE clause column
For example, assuming an index has been created on NoWeeks column, the use of the multiplication operator column in the first query prevents the optimizer from using (choosing) the index on the column:

-- the index will not be used
WHERE NoWeeks * 7 > @NoDays

-- the index will be used
WHERE NoWeeks > @NoDays / 7

Avoid functions on the WHERE clause column

-- the index will not be taken into account
WHERE SUBSTRING(CustomerName, 1, 1) = 'F' ;

-- the index will be taken into account
WHERE CustomerName LIKE 'F%' ;

Index fragmentation resolutions in SQL Server

After you detected fragmentation of an index, you can "resolve" fragmentation in an index by rearranging the index rows and pages so that their physical and logical orders match.

You achieve this through the following techniques:
1. Dropping and re-creating the index
If the index being dropped is a clustered index, then all the nonclustered indexes on the table have to be rebuilt after the cluster is dropped.


2. Creating the index with the DROP_EXISTING clause


3. Executing the ALTER INDEX REBUILD statement on the index
To be performed in case the fragmentation is over 30%.

ALTER INDEX <IX_IndexName> ON <TableName>
REBUILD;

or rebuild all indexes of a specified table:

ALTER INDEX ALL ON <TableName>
REBUILD;


4. Executing the ALTER INDEX REORGANIZE statement on the index
To be performed in case the index fragmentation is between 5% and 30%

ALTER INDEX <IX_IndexName> ON <TableName>
REORGANIZE;


Characteristics of Four Defragmentation Techniques

Drop and Create Index Create Index with DROP_ EXISTING ALTER INDEX REBUILD ALTER INDEX REORGANIZE
Blocking High High Medium Low
Defragment index with constraints Highly complex Moderately Easy Easy
Defragment multiple indexes together No No Yes Yes
Degree of defragmentation High High High Moderate to low
Statistics are updated Yes Yes Yes No
Apply new fill factor Yes Yes Yes No

Factors affecting index performance in SQL Server

Indexes and query performance are strongly related. It’s essential to ensure that the factors that affect the efficiency of an index are optimal:

  • Fill Factor
  • Statistics
  • Fragmentation
  • I/O Sub-System
  • Compression

Find missing indexes using DMVs

The following query determines which are the missing indexes and displays their column details based on dm_db_missing_index_group_stats DMV.

USE <DatabaseName>;

SELECT 
        REPLACE(ID.[Statement], '[' + DB_NAME(ID.database_id) + '].', '') as table_name,
        CAST(S.avg_total_user_cost * S.avg_user_impact * S.user_seeks as INT) as [score],
        ID.equality_columns, ID.inequality_columns, ID.included_columns,               
        'CREATE INDEX '
        + 'IX_' + OBJECT_NAME(ID.object_id) + '_' + CAST(ID.index_handle as VARCHAR)
        + ' ON ' + REPLACE(ID.[Statement], '[' + DB_NAME(ID.database_id) + '].', '')
        + ' (' + ISNULL (ID.equality_columns,'')
        + CASE
                WHEN ID.equality_columns IS NOT NULL AND ID.inequality_columns IS NOT NULL
                THEN ', ' + ID.inequality_columns
                ELSE ''
          END
        + ')'
        + ISNULL (' INCLUDE (' + ID.included_columns + ')', '') AS create_index_statement
FROM
        sys.dm_db_missing_index_details ID
        INNER JOIN sys.dm_db_missing_index_groups G
                ON ID.index_handle = G.index_handle
        INNER JOIN sys.dm_db_missing_index_group_stats S
                ON G.index_group_handle = S.group_handle
WHERE
        ID.database_id = DB_ID()
        AND OBJECTPROPERTY(ID.[object_id], 'IsMsShipped') = 0  
ORDER BY score DESC;

Improve performance using filtered indexes

A filtered index is a non-clustered index that contains only a subset of the number of rows contained in a table. You add a WHERE clause to reduce the number of rows that are stored at the leaf level.

CREATE NONCLUSTERED INDEX <Index_Name>
ON <Table Name>(<Column(s)>)
WHERE <Filter Condition>

A good real example it is use is on Status column, as you might have a large number of rows with Closed status that are not part of your current searches (the business wants to query for Open, Processing, Invoicing, etc)

CREATE NONCLUSTERED INDEX [IX_Orders_IdStatus_Filterered]
ON [dbo].[Orders] ([IdStatus])
WHERE ([IdStatus] IN ('O', 'P', 'I'))

Another well-suited scenarios is where you frequently have to filter out NULL values.

Pros:

  • By having fewer rows in an index, less I/O is done when that index is used.
  • Index size is dramatically reduced.
  • The statistics on these filtered indexes are filtered as well, which typically results in them being more accurate.
  • Cons:

  • Use certain expressions, such as BETWEEN, NOT IN, or a CASE statement.
  • Parameterized queries doesn't take advantage of filtered indexes.
  • Pages

    Subscribe to RSS - indexes