sql-server

When to use IN operator

Using WHERE IN and WHERE NOT IN clauses in T-SQL code can produce an execution plan involving one or more nested loops. This can significantly increase the number of comparisons SQL Server must perform. Use the WHERE IN clause only if you have a short list of values to evaluate. The following query, by contrast, uses NOT IN with a subquery that can return a large list:

SELECT *
FROM Customers
WHERE CustomerID NOT IN
   (SELECT CustomerID FROM Orders)

Replace the WHERE IN (or WHERE NOT IN) clause with an OUTER JOIN if you're using a subquery to generate a potentially large list. Doing so can improve performance significantly:

SELECT c.*
FROM Customers c
LEFT OUTER JOIN Orders o
ON o.CustomerID = c.CustomerID
WHERE o.CustomerID IS NULL

In this case, the second query uses LEFT OUTER JOIN, producing an execution plan that lets it run about three times faster than the first query.
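
One caveat is worth keeping in mind when comparing the two queries: if Orders.CustomerID can contain NULL, the NOT IN version returns no rows at all, while the LEFT OUTER JOIN version still returns the customers without orders. A NOT EXISTS query behaves like the join in that respect. The following is only a sketch using the same Customers and Orders tables as above; whether it is faster than the join depends on the actual execution plan.

```sql
-- NULL-safe alternative to NOT IN: customers that have no orders
SELECT c.*
FROM Customers c
WHERE NOT EXISTS
   (SELECT 1
    FROM Orders o
    WHERE o.CustomerID = c.CustomerID)
```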

How to update views in SQL Server

You can modify data through a view by using an INSTEAD OF trigger. INSTEAD OF INSERT triggers defined on a view (or table) replace the standard action of the INSERT statement with your own statements.
As an example, I’ve created two tables and a view as follows:

CREATE TABLE Person (
SSN int PRIMARY KEY,
name varchar(100)
)

CREATE TABLE Patient(
SSN int PRIMARY KEY,
blood_group varchar(5),
CONSTRAINT FKPerson FOREIGN KEY (SSN)
REFERENCES Person (SSN)
)
GO

CREATE VIEW PersonPatient as
SELECT Person.SSN, Person.name, Patient.blood_group
FROM Person
INNER JOIN Patient ON Person.SSN = Patient.SSN
GO

Below is the INSTEAD OF trigger that fires when a new row is about to be added to the PersonPatient view. Its logic is very simple: it inserts a row into the Person table and a row into the Patient table. Of course, your logic might be more complicated. For example, you can add logic to handle duplicate patients.

CREATE TRIGGER IOS_PersonPatient ON PersonPatient
INSTEAD OF INSERT
AS
BEGIN
INSERT INTO Person(SSN, name)
SELECT SSN, name from inserted
INSERT INTO Patient(SSN, blood_group)
SELECT SSN, blood_group FROM inserted
END

With the INSTEAD OF trigger in place, you can now write INSERT statements against the PersonPatient view:

INSERT INTO PersonPatient(SSN, name, blood_group)
VALUES (1234567890, 'John Smith', 'B')
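
The duplicate handling mentioned earlier could be sketched as follows. This is a hypothetical variant of the trigger, not the only way to do it: it skips rows whose SSN already exists in Person or Patient instead of failing on the primary key. It does not handle the same SSN appearing twice within a single INSERT statement.

```sql
-- Hypothetical variant: ignore rows that already exist instead of raising a PK violation
ALTER TRIGGER IOS_PersonPatient ON PersonPatient
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON

    -- Add only the persons that are not in the Person table yet
    INSERT INTO Person(SSN, name)
    SELECT i.SSN, i.name
    FROM inserted i
    WHERE NOT EXISTS (SELECT 1 FROM Person p WHERE p.SSN = i.SSN)

    -- Add only the patients that are not in the Patient table yet
    INSERT INTO Patient(SSN, blood_group)
    SELECT i.SSN, i.blood_group
    FROM inserted i
    WHERE NOT EXISTS (SELECT 1 FROM Patient pt WHERE pt.SSN = i.SSN)
END
```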

Another way to modify data through a view is to use updatable partitioned views.

Generate row number column in SQL Server

Sometimes we need our SELECT statement to return the current row number. The following script shows two ways to do this.

-- Sample 1: Using a correlated subquery on the same table
SELECT
        (SELECT COUNT(TerritoryID)
        FROM dbo.Territories T2
        WHERE T2.TerritoryID <= T1.TerritoryID) AS RowNumber
        , T1.*
FROM dbo.Territories T1

-- Sample 2: Using IDENTITY
SELECT
    IDENTITY(int,1,1) as RowNumber, *
    INTO #TempTerritories
FROM dbo.Territories

SELECT * FROM #TempTerritories
DROP TABLE #TempTerritories
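
On SQL Server 2005 and later, the built-in ROW_NUMBER() function covers the same need without a subquery or a temporary table. A minimal sketch, assuming the same dbo.Territories table and that rows should be numbered by TerritoryID:

```sql
-- Sample 3: Using ROW_NUMBER() (SQL Server 2005+)
SELECT
    ROW_NUMBER() OVER (ORDER BY T.TerritoryID) AS RowNumber
    , T.*
FROM dbo.Territories T
ORDER BY T.TerritoryID
```

Unlike the IDENTITY approach, the numbering order here is stated explicitly in the OVER clause.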

That said, most of the time you can avoid this practice and build the row number in the GUI layer. For example, if you have a report and you want to add a row count column, you can insert a special field called “Record Number”: select Insert menu/Special Field/RecordNumber.

Your suggestions are welcomed.

Empty all tables in SQL Server

In day-to-day programming life you often need a magic script to clean up the data you’ve inserted into the application you work on. Of course, you can do this by running ‘CREATE TABLE’ statements and recreating your database from scratch. This is perfectly valid, but most of the time you want to keep the data in some tables, such as configuration tables and catalogue tables, so recreating the database is not appropriate in this case.

The following script queries the sysobjects table to get all user tables. Then it tries to delete the rows from each table. Some tables’ rows cannot be deleted on the first pass, most probably because referential integrity would be broken. If so, I iterate over all tables again and retry deleting the rows. This continues until no error is raised while deleting rows from all tables. I must admit this solution uses brute force, but I found it very simple. I’ve tested it on medium-size databases and the results were very good.

If you need to keep data in some tables, you just need to add a clause to the SQL statement that gets all user tables in the current database:
AND name NOT IN ('author', 'titles', 'table X')

SET NOCOUNT ON
DECLARE @TABLE_NAME NVARCHAR(255), @HAS_IDENTITY NUMERIC(15), @HAS_ERROR BIT

-- Create a cursor with all table names
DECLARE TABLES_CURSOR CURSOR SCROLL FOR
SELECT name
FROM sysobjects
WHERE xtype = 'U' AND name <> 'dtproperties'
--AND name NOT IN ('author', 'titles', ...)

SET @HAS_ERROR = 1
OPEN TABLES_CURSOR
WHILE @HAS_ERROR <> 0
BEGIN
       SET @HAS_ERROR = 0
       FETCH FIRST FROM TABLES_CURSOR INTO @TABLE_NAME
       WHILE @@FETCH_STATUS = 0
       BEGIN        
             EXECUTE ('DELETE ' + @TABLE_NAME)
             IF @@ERROR<>0
                    -- Table rows can't be deleted.
                    SET @HAS_ERROR = 1
             ELSE
                    BEGIN
                           --Reset identity for emptied table
                           SET @HAS_IDENTITY = (SELECT IDENT_CURRENT(@TABLE_NAME))
                           IF @HAS_IDENTITY IS NOT NULL
                           BEGIN
                                 DBCC CHECKIDENT (@TABLE_NAME, reseed, 0) WITH NO_INFOMSGS
                           END
                    END

             FETCH NEXT FROM TABLES_CURSOR INTO @TABLE_NAME
       END
END

CLOSE TABLES_CURSOR
DEALLOCATE TABLES_CURSOR
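
On SQL Server 2005 and later, the table list for the cursor could also come from the sys.tables catalog view. The sketch below is an assumption about how you might adapt the query, not part of the script above: it returns schema-qualified, bracket-quoted names, so tables outside the dbo schema or with spaces in their names can be addressed safely. The excluded table names are the same illustrative ones used above.

```sql
-- Possible replacement for the cursor's SELECT (SQL Server 2005+)
SELECT QUOTENAME(SCHEMA_NAME(schema_id)) + '.' + QUOTENAME(name) AS FullTableName
FROM sys.tables
WHERE is_ms_shipped = 0
--AND name NOT IN ('author', 'titles')
```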

Your suggestions are welcomed.

About fragmentation in SQL Server

- Extent = 8 pages = 64 KB.
- On a page split, SQL Server generally moves half the total number of rows in the original page to the new page.
- If new rows are added in the order of the clustered index, then the index rows will be always added at the trailing end of the index, preventing the page splits otherwise caused by the INSERT statements.
- For queries that don’t have to traverse a series of pages to retrieve the data, fragmentation can have minimal impact.
- For a table with a clustered index, the fragmentation of the clustered index is the same as the fragmentation of the data pages, since the leaf pages of the clustered index and the data pages are the same.
- A small table (or index) with fewer than eight pages is simply unlikely to benefit from efforts to remove the fragmentation because it will be stored on mixed extents.

Internal fragmentation
- When data is fragmented within the same extent (= 8 pages).
- A little internal fragmentation can be beneficial, because it allows you to perform INSERT and UPDATE queries without causing page splits.

External fragmentation
- When data is fragmented across two or more extents.
- A range scan on an index will need more switches between the corresponding extents than ideally required. A range scan on an index will be unable to benefit from read-ahead operations performed on the disk.
- For better performance, it is preferable to use sequential I/O, since this can read a whole extent (8 x 8KB) in a single disk I/O operation. By contrast, a noncontiguous layout of pages requires nonsequential or random I/O operations to retrieve the index pages from the disk, and a random I/O operation can read only 8KB of data in a single disk operation (this may be acceptable, however, if you are retrieving only one.
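
Both kinds of fragmentation can be measured with the sys.dm_db_index_physical_stats function. The query below is a sketch for the current database: avg_fragmentation_in_percent reflects external (logical) fragmentation, and avg_page_space_used_in_percent reflects how full the pages are, that is, internal fragmentation. The 'SAMPLED' mode and the eight-page filter are choices made for this example; page fullness is not reported in the default 'LIMITED' mode.

```sql
-- Fragmentation overview for all indexes in the current database
SELECT
    OBJECT_NAME(ips.object_id) AS TableName
    , i.name AS IndexName
    , ips.index_type_desc
    , ips.avg_fragmentation_in_percent      -- external (logical) fragmentation
    , ips.avg_page_space_used_in_percent    -- page fullness (internal fragmentation)
    , ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') ips
INNER JOIN sys.indexes i
    ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.page_count >= 8   -- skip very small indexes, as noted above
ORDER BY ips.avg_fragmentation_in_percent DESC
```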

Columnstore indexes vs indexed views

Columnstore indexes | Indexed views
Do not require Enterprise Edition in SQL Server 2016 SP1+ | Limited usage in non-Enterprise Editions
No session setting requirements | Require certain session settings to be set on
A lot less storage required | More storage required
Less administrative overhead / maintenance | More administrative overhead / maintenance
Not able to do inserts/updates prior to SQL Server 2014 | More costly to maintain during inserts/updates

Using computed columns in SQL Server

In a database we often need values that are calculated again and again while generating reports.

Assuming we have the following table:

CREATE TABLE InvoiceLine
(
       InvoiceLineID int
       , InvoiceID int
       , NumberOfItems int
       , Amount money
)

And we want to find the total amount of an invoice line. We can achieve this by adding a new computed column:

CREATE TABLE InvoiceLine
(
       InvoiceLineID int
       , InvoiceID int
       , NumberOfItems int
       , Amount money
       , TotalAmount AS (Amount*NumberOfItems) PERSISTED
)
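
If the table already exists, the computed column can be added in place instead of recreating the table. A minimal sketch, assuming the first InvoiceLine definition above is already in the database:

```sql
-- Add the persisted computed column to the existing table
ALTER TABLE InvoiceLine
    ADD TotalAmount AS (Amount * NumberOfItems) PERSISTED
GO

-- Reports can then read the stored value instead of recalculating it
SELECT InvoiceID, SUM(TotalAmount) AS InvoiceTotal
FROM InvoiceLine
GROUP BY InvoiceID
```

The GO separator matters here: a query that references the new column will not compile in the same batch as the ALTER TABLE statement that creates it.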

Common myths about index fragmentation in SQL Server

Rebuild all the indexes every night. Not necessary, as not all indexes are fragmented => waste of server resources. Just think of the transaction log activity and disk activity. Some index fragmentation can be removed with ALTER INDEX ... REORGANIZE instead.

Add more memory to the server. More memory doesn't stop index fragmentation from happening.

Index fragmentation is irrelevant when using SSD. SSDs allow index scans to be faster, but they don't prevent index fragmentation.

Online index rebuilds do not cause blocking. Online index rebuilds acquire locks that can cause long-term blocking.

Use the same fill factor for all indexes. Some indexes aren't going to have any fragmentation => wasted space.
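
The maintenance options mentioned above could look like this for a single index. The index and table names are hypothetical, and the fill factor of 90 is only an illustration; the right value depends on how the individual index is modified.

```sql
-- Lighter option: reorganize the leaf level of one index
ALTER INDEX IX_Example ON dbo.ExampleTable REORGANIZE
GO

-- Heavier option: rebuild one index with its own fill factor
-- (ONLINE = ON is not available in every edition of SQL Server)
ALTER INDEX IX_Example ON dbo.ExampleTable
REBUILD WITH (FILLFACTOR = 90, ONLINE = ON)
GO
```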

What are the implications of index fragmentation?

Slower index scans.

Increased disk space usage. The density of rows on the pages is lower => fewer rows stored per page => wasted space.

Increased buffer pool usage.

Causes increased transaction log generation.

Solving the N+1 selects problem

This problem occurs when you have a parent object loading its child data by issuing a separate SQL statement for each child object that needs to be loaded. So N+1 queries are executed against the database: one for the parents and one more for each of the N parents.

Records are being loaded individually because they are being lazy loaded.

What we can do instead is to use eager loading.
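
At the SQL level, the difference can be sketched with the Customers and Orders tables used earlier on this page. The literal CustomerID values are hypothetical and stand for whatever the first query returned.

```sql
-- Lazy loading: one query for the parents...
SELECT CustomerID FROM Customers

-- ...followed by one query per parent (N more round trips)
SELECT * FROM Orders WHERE CustomerID = 1
SELECT * FROM Orders WHERE CustomerID = 2
-- and so on for every customer

-- Eager loading: parents and children fetched in a single query
SELECT c.CustomerID, o.*
FROM Customers c
LEFT OUTER JOIN Orders o
ON o.CustomerID = c.CustomerID
```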
