About fragmentation in SQL Server

- Extent = 8 pages = 64K.
- On a page split, SQL Server generally moves half the total number of rows in the original page to the new page.
- If new rows are added in the order of the clustered index, then the index rows will be always added at the trailing end of the index, preventing the page splits otherwise caused by the INSERT statements.
- For queries that don’t have to traverse a series of pages to retrieve the data, fragmentation can have minimal impact.
- For a table with a clustered index, the fragmentation of the clustered index is the same with the fragmentation of the data pages, since the leaf pages of the clustered index and data pages are the same.
- A small table (or index) with fewer than eight pages is simply unlikely to benefit from efforts to remove the fragmentation because it will be stored on mixed extents.

Internal fragmentation
- When data is fragmented within the same extent (= 8 pages).
- A little internal fragmentation can be beneficial, because it allows you to perform INSERT and UPDATE queries without causing page splits.

External fragmentation
- When data is fragmented over 2 extents.
- A range scan on an index will need more switches between the corresponding extents than ideally required. A range scan on an index will be unable to benefit from read-ahead operations performed on the disk.
- For better performance, it is preferable to use sequential I/O, since this can read a whole extent (8 x 8KB) in a single disk I/O operation. By contrast, a noncontiguous layout of pages requires nonsequential or random I/O operations to retrieve the index pages from the disk, and a random I/O operation can read only 8KB of data in a single disk operation (this may be acceptable, however, if you are retrieving only one.

Columnstore indexes vs indexed views

Columnstore indexes Indexed views
It didn't require Enterprise Edition in SQL Server 2016 SP1+ Limited usage in non-Enterprise Editions
No session setting requirements Requires certain session settings set on
A lot less storage required More storage required
Less administrative overhead / maintenance More administrative overhead / maintenance
Not able to to inserts/updates prior to SQL Server 2014 Most costly to maintain during inserts/updates

Using computed columns in SQL Server

In a database we need some values that were often calculated while generating several reports.

Assuming we have the following table:

CREATE TABLE InvoiceLine
AS
(
       InvoiceLineID int
       , InvoiceID int
       , NumberOfItems int
       , Amount money
)

And we want to find the total amount of an invoice line. We can achieve this by addding a new computed column:

CREATE TABLE InvoiceLine
AS
(
       InvoiceLineID int
       , InvoiceID int
       , NumberOfItems int
       , Amount money
       , TotalAmount AS (Amount*NumberOfItems) PERSISTED
)

Improvements and enhancements in SQL Server 2017

Resumable Online Index Rebuild.This Database Engine feature in the latest edition of SQL Server, allows the user to resume the online index rebuilding operation from exactly where it was hindered due to failure. This includes failures like less disk space, failover to replica, or simple pauses in the operation.

Automatic Database Tuning.This feature allows you to take pre-emptive action for potential query troubles. It gives you insights to query performance troubles, along with suggesting solutions for the same. It can also fix problems once they have been identified.

Graph Database Capabilities.SQL Server 2017 provides Graph Database Capabilities for modeling many – to many relationships. They have been integrated with T- SQL and also allow for easy querying and storing of graph data.

Improved Backup Performance.For all the small databases that are running on high end servers, the backup process is optimized in a way that several iterations of the buffer pool are avoided, this leads to improved performance of backups for small as well as medium databases. As the size of your database keeps on increasing, your gains in terms of performance keep on decreasing.

Query Store Improvements.The query store in SQL Server 2017 can now give you wait stats summary, it tracks all the wait stats categories for all queries in the Query store. This allows for enhanced troubleshooting experience which gives better insights of workload performance and bottlenecks.

Common myths about index fragmentation in SQL Server

Rebuild all the indexes every night.Not necessary as not all indexes are fragmented => waste of server resources. Just thing to transaction log activity and disk activity. Some indexes fragmentation can be done with INDEX REORGANIZE.

Add more memory to the server.More memory doesn't stop index fragmentation happening.

Index fragmentation is irrelevant when using SSD.SSD allow index scans to be faster, but SSD doesn't prevent index fragmentation.

Online index rebuilds do not causes blocking.Online index rebuilds acquire locks that can cause long-term blocking.

Use the same fill factor for all indexes.Some indexes aren't going to have any fragmentation => waste space.

What are the implications of index fragmentation?

Slower index scans.

Increased disk space usage.The density of rows on the pages is lower => store fewer rows per page => wasting space

Increased buffer pool usage.

Causes increased transaction log generation.

Solving the N+1 selects problem

This problems occurs when you have a parent object loading its child data by issuing a separate SQL statement for each child objects that needs to be loaded. So n+1 queries are executed against the database.

Records are being loaded individually because they are being lazy loaded.

What we can do instead if to use eager loading.

Adhoc queries vs Parameterized queries vs Stored procedures

Parameterized queries vs adhoc (dynamic) queries:
- Protects against SQL injection attacks.
- Improves application performance and scalability as the same cached execution plan is reused for each query execution, compared with adhoc queries where an execution plan is generated every time.
- Plan Cache memory is kept low.

Parameterized queries vs stored procedures:
- Similar performance.
- Stored procedures offers advantages when we talk about security or restriction to data.

SQL Server Operations in Query Execution Plan

Clustered Index Scan: Reads all the rows stored in a table stored as a clustered index.
Table Scan: Reads all the rows in table that is stored as a heap structure.
Clustered Index Seek: Traverses the tree structure of table stored as a clustered index to find the needed row(s).
Index Scan: Reads all of the key values of an index to find the matching data.
Index Seek: Traverses the tree structure of an index to find the matching index keys.

Nested Loops Join: For each value in the first data set, SQL Server loops through the second data set looking for matches.
Merge Join: Used to join two data sets that are already sorted using the same key. A row from each source is obtained. If the rows match, they are joined. If the rows don't match, the lower value row is discarded and a new row is obtained from that source.
Hash Match: A hashtable of the smaller data set is created, then SQL Server loops through the larger data set probing the hashtable for matching values. Used when two large datasets must be joined.

Temporal tables - New feature in SQL Server 2016

- System version tables.
- Point-in-time data access.
- Behind the hood: a history table is created with the same schema, but without constraints.
- ALTER TABLE automatically updates the history table.

DECLARE @TenDaysAgo datetime2 = DATEADD(d, -10, SYSDATETIME())

SELECT *
FROM dbo.MyTable
FOR SYSTEM_TIME AS OF @TenDaysAgo

Pages

Subscribe to SQLhint RSS