Introduction
Database performance can make or break your application. Slow queries lead to poor user experience, increased costs, and scalability limitations. Indexing is the single most impactful technique for improving SQL Server query performance, often providing 10x to 100x speedups for common operations.
In this comprehensive guide, you'll master SQL Server indexing from fundamentals to advanced strategies. We'll cover clustered indexes, non-clustered indexes, covering indexes, filtered indexes, columnstore indexes, and the critical decision-making process for index design. Every concept includes real-world scenarios, performance benchmarks, and actionable implementation steps.
What You'll Learn:
- Index fundamentals and how SQL Server uses them
- Clustered vs non-clustered indexes with performance comparisons
- Creating optimal covering indexes
- Filtered indexes for conditional data
- Index maintenance and fragmentation management
- Query execution plans and index usage analysis
- Common indexing anti-patterns and how to avoid them
- Real-world performance tuning case studies
Index Architecture Overview
SQL Server Index Architecture (simplified)

Query Request
     |
     v
Query Optimizer (analyzes the query and chooses an index strategy)
     |
     v
Execution Plan, which picks one of:
  - Index Seek (navigate directly to matching rows)
  - Index Scan (read the entire index)
  - Table Scan (read the entire table)
     |
     v
Index Structures
  Clustered Index (B-tree)
    Root node -> intermediate nodes -> leaf nodes (the actual data pages)
  Non-Clustered Index (B-tree)
    Root node -> intermediate nodes -> leaf nodes (key + RID or clustering key)
  Columnstore Index (column-oriented)
    Row groups (compressed) -> column segments
     |
     v
Data Pages (8 KB blocks)
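The layers in this diagram can be inspected directly once indexes exist. A minimal sketch, assuming the PerformanceTuning database built in Step 1 (the Orders indexes themselves are created in Steps 3-5):

```sql
-- Inspect B-tree depth and page counts for every index on Orders.
-- index_depth counts root + intermediate + leaf levels;
-- index_level 0 is the leaf level from the diagram above.
SELECT
    i.name AS IndexName,
    ips.index_type_desc,
    ips.index_depth,
    ips.index_level,
    ips.page_count
FROM sys.dm_db_index_physical_stats(
        DB_ID(), OBJECT_ID('Orders'), NULL, NULL, 'DETAILED') ips
INNER JOIN sys.indexes i
    ON ips.object_id = i.object_id AND ips.index_id = i.index_id
ORDER BY i.name, ips.index_level DESC;
```

On a million-row Orders table you should see a depth of roughly 3, with a single root page and the vast majority of pages at level 0.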
Prerequisites
Required Software
- SQL Server 2019 or later (Developer/Express edition sufficient)
- SQL Server Management Studio (SSMS) 19+
- Azure Data Studio (optional, for cross-platform work)
- Windows Performance Monitor (for I/O analysis)
Required Permissions
- db_owner role on the test database
- VIEW SERVER STATE permission for DMV queries
- ALTER permission on the target tables for index creation/modification
Verify Installation
-- Check SQL Server version
SELECT @@VERSION;
-- Should be SQL Server 2019 (15.x) or higher
-- Check edition and features
SELECT
SERVERPROPERTY('ProductVersion') AS Version,
SERVERPROPERTY('ProductLevel') AS ServicePack,
SERVERPROPERTY('Edition') AS Edition;
-- Verify permissions
SELECT
HAS_PERMS_BY_NAME(NULL, NULL, 'VIEW SERVER STATE') AS CanViewServerState,
IS_MEMBER('db_owner') AS IsDbOwner;
Step 1: Create Sample Database and Data
Database Setup
-- Create performance testing database
CREATE DATABASE PerformanceTuning;
GO
USE PerformanceTuning;
GO
-- Create Orders table (will contain millions of rows)
CREATE TABLE Orders (
OrderID INT IDENTITY(1,1) NOT NULL,
CustomerID INT NOT NULL,
OrderDate DATETIME NOT NULL,
ShipDate DATETIME NULL,
TotalAmount DECIMAL(18,2) NOT NULL,
OrderStatus VARCHAR(20) NOT NULL,
PaymentMethod VARCHAR(20) NOT NULL,
ShippingAddress NVARCHAR(200) NOT NULL,
CreatedAt DATETIME NOT NULL DEFAULT GETDATE(),
ModifiedAt DATETIME NULL
);
-- Create OrderDetails table
CREATE TABLE OrderDetails (
OrderDetailID INT IDENTITY(1,1) NOT NULL,
OrderID INT NOT NULL,
ProductID INT NOT NULL,
Quantity INT NOT NULL,
UnitPrice DECIMAL(18,2) NOT NULL,
Discount DECIMAL(5,2) NOT NULL DEFAULT 0.00
);
-- Create Products table
CREATE TABLE Products (
ProductID INT IDENTITY(1,1) NOT NULL,
ProductName NVARCHAR(100) NOT NULL,
CategoryID INT NOT NULL,
UnitPrice DECIMAL(18,2) NOT NULL,
UnitsInStock INT NOT NULL,
Discontinued BIT NOT NULL DEFAULT 0
);
-- Create Customers table
CREATE TABLE Customers (
CustomerID INT IDENTITY(1,1) NOT NULL,
CustomerName NVARCHAR(100) NOT NULL,
Email NVARCHAR(100) NOT NULL,
Country NVARCHAR(50) NOT NULL,
City NVARCHAR(50) NOT NULL,
RegistrationDate DATETIME NOT NULL
);
Generate Large Dataset
-- Generate 100,000 customers
WITH CustomerCTE AS (
SELECT TOP 100000
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS CustomerID
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
)
INSERT INTO Customers (CustomerName, Email, Country, City, RegistrationDate)
SELECT
'Customer' + CAST(CustomerID AS VARCHAR(10)),
'customer' + CAST(CustomerID AS VARCHAR(10)) + '@example.com',
CASE (CustomerID % 5)
WHEN 0 THEN 'USA'
WHEN 1 THEN 'Canada'
WHEN 2 THEN 'UK'
WHEN 3 THEN 'Germany'
ELSE 'France'
END,
'City' + CAST((CustomerID % 100) AS VARCHAR(10)),
DATEADD(DAY, -(CustomerID % 1000), GETDATE())
FROM CustomerCTE;
-- Generate 50,000 products
WITH ProductCTE AS (
SELECT TOP 50000
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ProductID
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
)
INSERT INTO Products (ProductName, CategoryID, UnitPrice, UnitsInStock, Discontinued)
SELECT
'Product' + CAST(ProductID AS VARCHAR(10)),
(ProductID % 100) + 1,
CAST((RAND(CHECKSUM(NEWID())) * 1000) + 10 AS DECIMAL(18,2)),
CAST(RAND(CHECKSUM(NEWID())) * 500 AS INT),
CASE WHEN (ProductID % 20) = 0 THEN 1 ELSE 0 END
FROM ProductCTE;
-- Generate 1,000,000 orders
DECLARE @BatchSize INT = 10000;
DECLARE @TotalOrders INT = 1000000;
DECLARE @CurrentBatch INT = 0;
WHILE @CurrentBatch < @TotalOrders
BEGIN
INSERT INTO Orders (CustomerID, OrderDate, ShipDate, TotalAmount, OrderStatus, PaymentMethod, ShippingAddress)
SELECT TOP (@BatchSize)
(ABS(CHECKSUM(NEWID())) % 100000) + 1,
DATEADD(DAY, -(ABS(CHECKSUM(NEWID())) % 730), GETDATE()),
DATEADD(DAY, (ABS(CHECKSUM(NEWID())) % 7), DATEADD(DAY, -(ABS(CHECKSUM(NEWID())) % 730), GETDATE())),
CAST((RAND(CHECKSUM(NEWID())) * 5000) + 50 AS DECIMAL(18,2)),
CASE (ABS(CHECKSUM(NEWID())) % 4)
WHEN 0 THEN 'Pending'
WHEN 1 THEN 'Shipped'
WHEN 2 THEN 'Delivered'
ELSE 'Cancelled'
END,
CASE (ABS(CHECKSUM(NEWID())) % 3)
WHEN 0 THEN 'Credit Card'
WHEN 1 THEN 'PayPal'
ELSE 'Bank Transfer'
END,
'Address ' + CAST((ABS(CHECKSUM(NEWID())) % 10000) AS VARCHAR(10))
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;
SET @CurrentBatch = @CurrentBatch + @BatchSize;
RAISERROR('Inserted %d orders', 0, 1, @CurrentBatch) WITH NOWAIT;
END;
-- Generate 3,000,000 order details
DECLARE @OrderDetailBatch INT = 10000;
DECLARE @TotalDetails INT = 3000000;
DECLARE @CurrentDetail INT = 0;
WHILE @CurrentDetail < @TotalDetails
BEGIN
INSERT INTO OrderDetails (OrderID, ProductID, Quantity, UnitPrice, Discount)
SELECT TOP (@OrderDetailBatch)
(ABS(CHECKSUM(NEWID())) % 1000000) + 1,
(ABS(CHECKSUM(NEWID())) % 50000) + 1,
(ABS(CHECKSUM(NEWID())) % 10) + 1,
CAST((RAND(CHECKSUM(NEWID())) * 500) + 10 AS DECIMAL(18,2)),
CASE WHEN (ABS(CHECKSUM(NEWID())) % 10) = 0
THEN CAST((RAND(CHECKSUM(NEWID())) * 0.3) AS DECIMAL(5,2))
ELSE 0.00
END
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;
SET @CurrentDetail = @CurrentDetail + @OrderDetailBatch;
RAISERROR('Inserted %d order details', 0, 1, @CurrentDetail) WITH NOWAIT;
END;
-- Verify data counts
SELECT 'Customers' AS TableName, COUNT(*) AS [RowCount] FROM Customers -- ROWCOUNT is a reserved keyword, so bracket the alias
UNION ALL
SELECT 'Products', COUNT(*) FROM Products
UNION ALL
SELECT 'Orders', COUNT(*) FROM Orders
UNION ALL
SELECT 'OrderDetails', COUNT(*) FROM OrderDetails;
Step 2: Understanding Query Performance WITHOUT Indexes
Baseline Performance Tests
-- Clear the buffer pool and plan cache for repeatable testing
-- (test systems only; never run these on production)
CHECKPOINT;
DBCC DROPCLEANBUFFERS; -- Clears data cache
DBCC FREEPROCCACHE; -- Clears execution plan cache
GO
-- Enable execution time and I/O statistics
SET STATISTICS TIME ON;
SET STATISTICS IO ON;
GO
-- Test 1: Find specific order (Table Scan)
SELECT * FROM Orders
WHERE OrderID = 500000;
-- Expected: ~1-2 seconds, Scan entire table
-- Test 2: Find orders by customer (Table Scan)
SELECT * FROM Orders
WHERE CustomerID = 12345;
-- Expected: ~1-3 seconds, Scan entire table
-- Test 3: Find recent orders (Table Scan)
SELECT * FROM Orders
WHERE OrderDate >= DATEADD(DAY, -30, GETDATE());
-- Expected: ~2-4 seconds, Scan entire table
-- Test 4: Aggregate query (Table Scan)
SELECT
CustomerID,
COUNT(*) AS OrderCount,
SUM(TotalAmount) AS TotalSpent
FROM Orders
GROUP BY CustomerID;
-- Expected: ~3-5 seconds, Scan entire table
-- Test 5: JOIN without indexes (Very Slow!)
SELECT
o.OrderID,
o.OrderDate,
c.CustomerName,
c.Email
FROM Orders o
INNER JOIN Customers c ON o.CustomerID = c.CustomerID
WHERE o.OrderDate >= DATEADD(DAY, -7, GETDATE());
-- Expected: ~10-20 seconds, Multiple table scans
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
Analyze Execution Plans
-- SET SHOWPLAN_ALL returns the ESTIMATED plan without executing the query.
-- For the ACTUAL execution plan, enable it in SSMS (Ctrl+M) and run the query normally.
SET SHOWPLAN_ALL ON;
GO
SELECT * FROM Orders WHERE OrderID = 500000;
GO
SET SHOWPLAN_ALL OFF;
GO
-- View execution plan details
SELECT
deqs.execution_count,
deqs.total_worker_time / 1000 AS TotalCPU_ms,
deqs.total_elapsed_time / 1000 AS TotalDuration_ms,
deqs.total_logical_reads,
deqs.total_physical_reads,
SUBSTRING(dest.text, (deqs.statement_start_offset/2)+1,
((CASE deqs.statement_end_offset
WHEN -1 THEN DATALENGTH(dest.text)
ELSE deqs.statement_end_offset
END - deqs.statement_start_offset)/2) + 1) AS QueryText
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE dest.text LIKE '%Orders%'
ORDER BY deqs.total_worker_time DESC;
Step 3: Clustered Indexes - The Foundation
What is a Clustered Index?
Clustered Index B-Tree Structure (simplified)

                    Root Node
               [1-1000 | 1001-2000]
                /              \
     Intermediate Node     Intermediate Node
         [1-500]             [1001-1500]
            |                     |
        Leaf Pages            Leaf Pages
        (data rows)           (data rows)
        OrderID: 1            OrderID: 1001
        CustomerID: 10        CustomerID: 250
        Amount: 100           Amount: 500

Key points:
- The leaf level IS the actual data pages
- Only ONE clustered index per table
- Rows are physically stored in clustered-key order
- A table without a clustered index is a heap
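Before any clustered indexes exist, the Step 1 tables are heaps; a quick catalog check confirms this, and re-running it after the next code block shows the change:

```sql
-- index_id 0 / HEAP means no clustered index yet;
-- after creating the primary keys below, the same tables
-- report index_id 1 / CLUSTERED
SELECT
    OBJECT_NAME(object_id) AS TableName,
    index_id,
    type_desc
FROM sys.indexes
WHERE OBJECT_NAME(object_id) IN ('Orders', 'OrderDetails', 'Products', 'Customers');
```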
Create Clustered Indexes
-- Add clustered index on Orders (Primary Key)
ALTER TABLE Orders
ADD CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID);
GO
-- Add clustered indexes on other tables
ALTER TABLE OrderDetails
ADD CONSTRAINT PK_OrderDetails PRIMARY KEY CLUSTERED (OrderDetailID);
GO
ALTER TABLE Products
ADD CONSTRAINT PK_Products PRIMARY KEY CLUSTERED (ProductID);
GO
ALTER TABLE Customers
ADD CONSTRAINT PK_Customers PRIMARY KEY CLUSTERED (CustomerID);
GO
-- View index information
SELECT
OBJECT_NAME(i.object_id) AS TableName,
i.name AS IndexName,
i.type_desc AS IndexType,
i.is_primary_key,
i.is_unique,
c.name AS ColumnName
FROM sys.indexes i
INNER JOIN sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id
INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
WHERE OBJECT_NAME(i.object_id) IN ('Orders', 'OrderDetails', 'Products', 'Customers')
ORDER BY TableName, i.index_id, ic.key_ordinal;
Test Performance with Clustered Index
SET STATISTICS TIME ON;
SET STATISTICS IO ON;
-- Test 1: Find specific order (Index Seek)
SELECT * FROM Orders WHERE OrderID = 500000;
-- Now: < 1ms, Index Seek (1-2 pages read)
-- Before: ~1-2 seconds, Table Scan (thousands of pages)
-- Test 2: Range query on clustered key
SELECT * FROM Orders
WHERE OrderID BETWEEN 500000 AND 500100;
-- Now: < 5ms, Index Seek + Range Scan
-- Huge improvement!
-- Test 3: Get customer details
SELECT * FROM Customers WHERE CustomerID = 12345;
-- Now: < 1ms, Index Seek
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
Step 4: Non-Clustered Indexes
What is a Non-Clustered Index?
Non-Clustered Index Structure (on CustomerID, simplified)

                    Root Node
          [Cust 1-5000 | Cust 5001-10000]
               /                \
          Leaf Nodes         Leaf Nodes
          key + pointer      key + pointer
          100  -> ptr        5001 -> ptr
          101  -> ptr        5002 -> ptr
          102  -> ptr        5003 -> ptr
               \                /
        the pointer (row locator) leads into the
        clustered index (or a RID for a heap),
        where the full row is fetched

Key points:
- Separate structure from the data itself
- Leaf nodes contain the index key plus a row locator
- A table can have up to 999 non-clustered indexes
- Columns not in the index require a key lookup per row
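That row-locator hop is visible in practice: select a column that is not part of the non-clustered index, and the plan pairs an Index Seek with a Key Lookup. A sketch, assuming the IX_Orders_CustomerID index created in the next code block:

```sql
-- ShippingAddress is not in IX_Orders_CustomerID, so SQL Server
-- seeks the non-clustered index, then performs one Key Lookup
-- per matching row against the clustered index
SET STATISTICS IO ON;
SELECT CustomerID, ShippingAddress
FROM Orders
WHERE CustomerID = 12345;
SET STATISTICS IO OFF;
-- In the actual plan (Ctrl+M), look for the "Key Lookup (Clustered)" operator
```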
Create Non-Clustered Indexes
-- Index for customer lookup queries
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON Orders (CustomerID);
GO
-- Index for date range queries
CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
ON Orders (OrderDate DESC); -- DESC for recent dates first
GO
-- Index for status queries
CREATE NONCLUSTERED INDEX IX_Orders_OrderStatus
ON Orders (OrderStatus);
GO
-- Composite index for customer + date queries
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_OrderDate
ON Orders (CustomerID, OrderDate DESC);
GO
-- Index for order details lookups
CREATE NONCLUSTERED INDEX IX_OrderDetails_OrderID
ON OrderDetails (OrderID);
GO
-- Index for product lookups
CREATE NONCLUSTERED INDEX IX_OrderDetails_ProductID
ON OrderDetails (ProductID);
GO
-- Index for customer email lookups
CREATE NONCLUSTERED INDEX IX_Customers_Email
ON Customers (Email);
GO
-- Index for product category queries
CREATE NONCLUSTERED INDEX IX_Products_CategoryID
ON Products (CategoryID)
WHERE Discontinued = 0; -- Filtered index!
GO
Test Performance with Non-Clustered Indexes
SET STATISTICS TIME ON;
SET STATISTICS IO ON;
-- Test 1: Customer orders (uses IX_Orders_CustomerID)
SELECT * FROM Orders
WHERE CustomerID = 12345;
-- Now: < 10ms, Index Seek
-- Before: ~1-3 seconds, Table Scan
-- Test 2: Recent orders (uses IX_Orders_OrderDate)
SELECT * FROM Orders
WHERE OrderDate >= DATEADD(DAY, -30, GETDATE());
-- Now: < 50ms, Index Seek
-- Before: ~2-4 seconds, Table Scan
-- Test 3: Composite index usage
SELECT * FROM Orders
WHERE CustomerID = 12345
AND OrderDate >= '2024-01-01';
-- Now: < 5ms, Index Seek (uses IX_Orders_CustomerID_OrderDate)
-- Test 4: JOIN performance
SELECT
o.OrderID,
o.OrderDate,
c.CustomerName,
c.Email
FROM Orders o
INNER JOIN Customers c ON o.CustomerID = c.CustomerID
WHERE o.OrderDate >= DATEADD(DAY, -7, GETDATE());
-- Now: < 100ms, Index Seeks on both tables
-- Before: ~10-20 seconds, Table Scans
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
Step 5: Covering Indexes (INCLUDE Clause)
What is a Covering Index?
A covering index contains ALL columns needed by a query, eliminating the need for key lookups.
The Key Lookup Problem

Query: SELECT OrderID, CustomerID, TotalAmount
       FROM Orders WHERE CustomerID = 100

Without INCLUDE:
  1. Index Seek on IX_Orders_CustomerID
  2. For each matching row: Key Lookup into the clustered index
     to fetch TotalAmount
  With 100 matching rows, that is 101 operations.

With INCLUDE (TotalAmount):
  1. Index Seek on IX_Orders_CustomerID
     (the leaf level already holds OrderID, CustomerID, TotalAmount)
  With 100 matching rows, that is a single seek: roughly 100x fewer operations.
Create Covering Indexes
-- Drop existing simple index
DROP INDEX IF EXISTS IX_Orders_CustomerID ON Orders;
GO
-- Create covering index
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_Covering
ON Orders (CustomerID)
INCLUDE (OrderDate, TotalAmount, OrderStatus);
GO
-- Covering index for product queries
CREATE NONCLUSTERED INDEX IX_Products_CategoryID_Covering
ON Products (CategoryID)
INCLUDE (ProductName, UnitPrice, UnitsInStock)
WHERE Discontinued = 0;
GO
-- Covering index for order details
CREATE NONCLUSTERED INDEX IX_OrderDetails_OrderID_Covering
ON OrderDetails (OrderID)
INCLUDE (ProductID, Quantity, UnitPrice, Discount);
GO
Test Covering Index Performance
-- Enable actual execution plan (Ctrl+M in SSMS)
-- Query 1: Force key lookups by selecting a column NOT covered by the index
-- (ShippingAddress is neither a key nor an included column)
SELECT OrderID, CustomerID, TotalAmount, ShippingAddress
FROM Orders
WHERE CustomerID = 12345;
-- Check execution plan: will show "Key Lookup" operators
-- Query 2: With covering index (no key lookups!)
SET STATISTICS IO ON;
SELECT OrderID, CustomerID, OrderDate, TotalAmount, OrderStatus
FROM Orders
WHERE CustomerID = 12345;
-- Check execution plan: Only "Index Seek", NO "Key Lookup"!
-- Performance: 5-10x faster than without covering
SET STATISTICS IO OFF;
Performance Comparison
-- Benchmark: Compare with/without covering index
DECLARE @StartTime DATETIME2;
DECLARE @EndTime DATETIME2;
DECLARE @Iterations INT = 1000;
DECLARE @Counter INT = 0;
-- Test WITHOUT covering (force index that doesn't include TotalAmount)
SET @StartTime = SYSDATETIME();
WHILE @Counter < @Iterations
BEGIN
SELECT OrderID, CustomerID, TotalAmount
FROM Orders WITH (INDEX(IX_Orders_OrderDate))
WHERE OrderDate >= '2024-01-01';
SET @Counter = @Counter + 1;
END
SET @EndTime = SYSDATETIME();
PRINT 'Without Covering Index: ' +
CAST(DATEDIFF(MILLISECOND, @StartTime, @EndTime) AS VARCHAR(10)) + 'ms';
-- Test WITH covering index
SET @Counter = 0;
SET @StartTime = SYSDATETIME();
WHILE @Counter < @Iterations
BEGIN
SELECT OrderID, CustomerID, TotalAmount, OrderDate
FROM Orders WITH (INDEX(IX_Orders_CustomerID_Covering))
WHERE CustomerID BETWEEN 1000 AND 2000;
SET @Counter = @Counter + 1;
END
SET @EndTime = SYSDATETIME();
PRINT 'With Covering Index: ' +
CAST(DATEDIFF(MILLISECOND, @StartTime, @EndTime) AS VARCHAR(10)) + 'ms';
Step 6: Filtered Indexes
Create Filtered Indexes
-- Index only active orders (status != 'Cancelled')
CREATE NONCLUSTERED INDEX IX_Orders_Active
ON Orders (OrderDate DESC)
INCLUDE (CustomerID, TotalAmount, OrderStatus)
WHERE OrderStatus != 'Cancelled';
GO
-- Index only recent orders. Filtered index predicates must be deterministic,
-- so GETDATE() is NOT allowed here; use a literal cutoff date and refresh
-- the index definition periodically (e.g., from a scheduled job)
CREATE NONCLUSTERED INDEX IX_Orders_Recent
ON Orders (CustomerID, OrderDate DESC)
INCLUDE (TotalAmount, OrderStatus)
WHERE OrderDate >= '2024-01-01';
GO
-- Index only high-value orders
CREATE NONCLUSTERED INDEX IX_Orders_HighValue
ON Orders (CustomerID, OrderDate DESC)
INCLUDE (TotalAmount, PaymentMethod)
WHERE TotalAmount >= 1000.00;
GO
-- Index only pending orders (frequently queried)
CREATE NONCLUSTERED INDEX IX_Orders_Pending
ON Orders (OrderDate DESC)
INCLUDE (CustomerID, TotalAmount, ShippingAddress)
WHERE OrderStatus = 'Pending';
GO
Test Filtered Index Performance
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- Query 1: Active orders (uses filtered index)
SELECT OrderID, OrderDate, CustomerID, TotalAmount
FROM Orders
WHERE OrderStatus != 'Cancelled'
AND OrderDate >= '2024-01-01';
-- Uses IX_Orders_Active (smaller, faster)
-- Query 2: Pending orders (uses filtered index)
SELECT OrderID, OrderDate, CustomerID, TotalAmount, ShippingAddress
FROM Orders
WHERE OrderStatus = 'Pending'
AND OrderDate >= DATEADD(DAY, -7, GETDATE());
-- Uses IX_Orders_Pending (optimized for this exact scenario)
-- Query 3: High-value customer analysis
SELECT CustomerID, COUNT(*) AS HighValueOrders, SUM(TotalAmount) AS TotalSpent
FROM Orders
WHERE TotalAmount >= 1000.00
GROUP BY CustomerID
ORDER BY TotalSpent DESC;
-- Uses IX_Orders_HighValue
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
Benefits of Filtered Indexes
-- Compare index sizes
SELECT
i.name AS IndexName,
SUM(ps.used_page_count) * 8 / 1024 AS SizeMB,
SUM(ps.row_count) AS [RowCount] -- bracketed: ROWCOUNT is a reserved keyword
FROM sys.indexes i
INNER JOIN sys.dm_db_partition_stats ps
ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE OBJECT_NAME(i.object_id) = 'Orders'
GROUP BY i.name
ORDER BY SizeMB DESC;
-- Filtered indexes are significantly smaller!
-- IX_Orders_Active: ~60% smaller than full index
-- IX_Orders_Recent: ~90% smaller
-- IX_Orders_Pending: ~75% smaller
Step 7: Index Maintenance
Check Index Fragmentation
-- View fragmentation for all indexes
SELECT
OBJECT_NAME(ips.object_id) AS TableName,
i.name AS IndexName,
ips.index_type_desc,
ips.avg_fragmentation_in_percent,
ips.page_count,
ips.record_count
FROM sys.dm_db_index_physical_stats(
DB_ID(), NULL, NULL, NULL, 'DETAILED') ips
INNER JOIN sys.indexes i
ON ips.object_id = i.object_id AND ips.index_id = i.index_id
WHERE ips.avg_fragmentation_in_percent > 10
AND ips.page_count > 1000
ORDER BY ips.avg_fragmentation_in_percent DESC;
Rebuild Fragmented Indexes
-- Rebuild single index
ALTER INDEX IX_Orders_CustomerID_Covering ON Orders REBUILD;
-- Rebuild all indexes on a table
ALTER INDEX ALL ON Orders REBUILD;
-- Rebuild with online option (Enterprise Edition)
ALTER INDEX IX_Orders_CustomerID_Covering ON Orders
REBUILD WITH (ONLINE = ON);
-- Reorganize (less intrusive; typically used for 10-30% fragmentation)
ALTER INDEX IX_Orders_CustomerID_Covering ON Orders REORGANIZE;
Automated Maintenance Script
-- Create maintenance procedure
CREATE OR ALTER PROCEDURE sp_MaintainIndexes
@FragmentationThreshold FLOAT = 10.0,
@RebuildThreshold FLOAT = 30.0
AS
BEGIN
SET NOCOUNT ON;
DECLARE @TableName NVARCHAR(128);
DECLARE @IndexName NVARCHAR(128);
DECLARE @Fragmentation FLOAT;
DECLARE @SQL NVARCHAR(MAX);
DECLARE index_cursor CURSOR FOR
SELECT
OBJECT_NAME(ips.object_id),
i.name,
ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(
DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
INNER JOIN sys.indexes i
ON ips.object_id = i.object_id AND ips.index_id = i.index_id
WHERE ips.avg_fragmentation_in_percent >= @FragmentationThreshold
AND ips.page_count > 1000
AND i.name IS NOT NULL;
OPEN index_cursor;
FETCH NEXT FROM index_cursor INTO @TableName, @IndexName, @Fragmentation;
WHILE @@FETCH_STATUS = 0
BEGIN
IF @Fragmentation >= @RebuildThreshold
BEGIN
SET @SQL = 'ALTER INDEX ' + QUOTENAME(@IndexName) +
' ON ' + QUOTENAME(@TableName) + ' REBUILD;';
PRINT 'Rebuilding: ' + @IndexName + ' (' +
CAST(@Fragmentation AS VARCHAR(10)) + '%)';
END
ELSE
BEGIN
SET @SQL = 'ALTER INDEX ' + QUOTENAME(@IndexName) +
' ON ' + QUOTENAME(@TableName) + ' REORGANIZE;';
PRINT 'Reorganizing: ' + @IndexName + ' (' +
CAST(@Fragmentation AS VARCHAR(10)) + '%)';
END
EXEC sp_executesql @SQL;
FETCH NEXT FROM index_cursor INTO @TableName, @IndexName, @Fragmentation;
END;
CLOSE index_cursor;
DEALLOCATE index_cursor;
-- Update statistics
EXEC sp_updatestats;
PRINT 'Index maintenance completed.';
END;
GO
-- Execute maintenance
EXEC sp_MaintainIndexes;
Step 8: Analyze Index Usage
Find Missing Indexes
-- SQL Server tracks missing indexes automatically
SELECT
DB_NAME(mid.database_id) AS DatabaseName,
OBJECT_NAME(mid.object_id, mid.database_id) AS TableName,
migs.avg_user_impact AS AvgImpact,
migs.user_seeks + migs.user_scans AS Total_Seeks_Scans,
mid.equality_columns,
mid.inequality_columns,
mid.included_columns,
'CREATE INDEX IX_' + OBJECT_NAME(mid.object_id, mid.database_id) + '_' +
REPLACE(REPLACE(ISNULL(mid.equality_columns, ''), '[', ''), ']', '') +
CASE WHEN mid.inequality_columns IS NOT NULL
THEN '_' + REPLACE(REPLACE(mid.inequality_columns, '[', ''), ']', '')
ELSE '' END +
' ON ' + OBJECT_NAME(mid.object_id, mid.database_id) + ' (' +
ISNULL(mid.equality_columns, '') +
CASE WHEN mid.inequality_columns IS NOT NULL
THEN CASE WHEN mid.equality_columns IS NOT NULL THEN ',' ELSE '' END +
mid.inequality_columns
ELSE '' END + ')' +
CASE WHEN mid.included_columns IS NOT NULL
THEN ' INCLUDE (' + mid.included_columns + ')'
ELSE '' END AS CreateIndexStatement
FROM sys.dm_db_missing_index_details mid
INNER JOIN sys.dm_db_missing_index_groups mig
ON mid.index_handle = mig.index_handle
INNER JOIN sys.dm_db_missing_index_group_stats migs
ON mig.index_group_handle = migs.group_handle
WHERE mid.database_id = DB_ID()
AND migs.avg_user_impact > 50 -- High impact
ORDER BY migs.avg_user_impact DESC;
Find Unused Indexes
-- Indexes that are never used should be dropped
SELECT
OBJECT_NAME(i.object_id) AS TableName,
i.name AS IndexName,
i.type_desc,
ius.user_seeks,
ius.user_scans,
ius.user_lookups,
ius.user_updates,
CASE
WHEN ius.user_seeks + ius.user_scans + ius.user_lookups = 0
THEN 'DROP INDEX ' + QUOTENAME(i.name) + ' ON ' + QUOTENAME(OBJECT_NAME(i.object_id)) + ';'
ELSE 'Index is used'
END AS Recommendation
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats ius
ON i.object_id = ius.object_id AND i.index_id = ius.index_id
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
AND i.type_desc != 'HEAP'
AND i.is_primary_key = 0
AND i.is_unique_constraint = 0
AND (ius.user_seeks + ius.user_scans + ius.user_lookups = 0 OR ius.index_id IS NULL)
ORDER BY TableName, IndexName;
Index Usage Statistics
-- Detailed index usage analysis
SELECT
OBJECT_NAME(ius.object_id) AS TableName,
i.name AS IndexName,
ius.user_seeks AS UserSeeks,
ius.user_scans AS UserScans,
ius.user_lookups AS UserLookups,
ius.user_updates AS UserUpdates,
ius.last_user_seek AS LastSeek,
ius.last_user_scan AS LastScan,
CASE
WHEN ius.user_seeks + ius.user_scans + ius.user_lookups = 0 THEN 'NEVER USED'
WHEN ius.user_updates > (ius.user_seeks + ius.user_scans + ius.user_lookups) * 2
THEN 'WRITE HEAVY (Consider Dropping)'
WHEN ius.user_seeks + ius.user_scans + ius.user_lookups > ius.user_updates * 10
THEN 'READ HEAVY (Excellent)'
ELSE 'BALANCED'
END AS UsagePattern
FROM sys.dm_db_index_usage_stats ius
INNER JOIN sys.indexes i
ON ius.object_id = i.object_id AND ius.index_id = i.index_id
WHERE ius.database_id = DB_ID()
AND OBJECTPROPERTY(ius.object_id, 'IsUserTable') = 1
ORDER BY UserSeeks + UserScans + UserLookups DESC;
Step 9: Advanced Indexing Strategies
Composite Index Column Order
-- Rule: Most selective columns first, then less selective
-- BAD: Low selectivity first
CREATE INDEX IX_Orders_Bad ON Orders (OrderStatus, CustomerID);
-- OrderStatus has only 4 values, CustomerID has 100,000 values
-- GOOD: High selectivity first
CREATE INDEX IX_Orders_Good ON Orders (CustomerID, OrderStatus);
-- CustomerID narrows down quickly, then OrderStatus refines
-- Test the difference
SET STATISTICS IO ON;
-- Using bad index
SELECT * FROM Orders WITH (INDEX(IX_Orders_Bad))
WHERE CustomerID = 12345 AND OrderStatus = 'Pending';
-- Higher logical reads
-- Using good index
SELECT * FROM Orders WITH (INDEX(IX_Orders_Good))
WHERE CustomerID = 12345 AND OrderStatus = 'Pending';
-- Much fewer logical reads!
SET STATISTICS IO OFF;
Index for Sorting (ORDER BY)
-- Create index matching sort order
CREATE INDEX IX_Orders_DateDesc ON Orders (OrderDate DESC, OrderID DESC);
-- Query benefits from sorted index
SELECT TOP 100 OrderID, OrderDate, TotalAmount
FROM Orders
ORDER BY OrderDate DESC, OrderID DESC;
-- No sort operation needed! Index already sorted
-- Check execution plan: should show an ordered Index Scan with NO Sort operator
Partial Match Searches
-- A plain index on Email supports prefix LIKE searches but not leading wildcards
-- (drop first, since IX_Customers_Email was already created in Step 4)
DROP INDEX IF EXISTS IX_Customers_Email ON Customers;
CREATE INDEX IX_Customers_Email ON Customers (Email);
-- This can use index (leading characters known)
SELECT * FROM Customers WHERE Email LIKE 'john%';
-- Index seek possible
-- This CANNOT use index (leading wildcard)
SELECT * FROM Customers WHERE Email LIKE '%@gmail.com';
-- Full index scan required
-- Solution for trailing searches: Computed column + index
ALTER TABLE Customers ADD EmailReversed AS REVERSE(Email);
CREATE INDEX IX_Customers_EmailReversed ON Customers (EmailReversed);
-- Now search reversed
SELECT * FROM Customers WHERE EmailReversed LIKE REVERSE('%@gmail.com') + '%';
-- Can use index seek!
Step 10: Monitoring and Troubleshooting
Create Monitoring Dashboard
-- Index health report
CREATE OR ALTER VIEW vw_IndexHealthReport
AS
SELECT
OBJECT_NAME(i.object_id) AS TableName,
i.name AS IndexName,
i.type_desc AS IndexType,
ips.avg_fragmentation_in_percent AS Fragmentation,
ips.page_count AS Pages,
(ips.page_count * 8) / 1024.0 AS SizeMB,
ius.user_seeks AS Seeks,
ius.user_scans AS Scans,
ius.user_lookups AS Lookups,
ius.user_updates AS Updates,
ius.last_user_seek AS LastSeek,
CASE
WHEN ips.avg_fragmentation_in_percent > 30 THEN 'REBUILD NEEDED'
WHEN ips.avg_fragmentation_in_percent > 10 THEN 'REORGANIZE RECOMMENDED'
ELSE 'HEALTHY'
END AS MaintenanceAction,
CASE
WHEN ius.user_seeks + ius.user_scans + ius.user_lookups = 0 THEN 'UNUSED - CONSIDER DROPPING'
WHEN ius.user_updates > (ius.user_seeks + ius.user_scans + ius.user_lookups) * 3
THEN 'WRITE HEAVY - REVIEW NECESSITY'
ELSE 'OPTIMAL'
END AS UsageStatus
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
ON i.object_id = ips.object_id AND i.index_id = ips.index_id
LEFT JOIN sys.dm_db_index_usage_stats ius
ON i.object_id = ius.object_id AND i.index_id = ius.index_id AND ius.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
AND i.type_desc != 'HEAP';
GO
-- Query the dashboard
SELECT * FROM vw_IndexHealthReport
WHERE MaintenanceAction != 'HEALTHY' OR UsageStatus != 'OPTIMAL'
ORDER BY SizeMB DESC;
Performance Baseline
-- Capture current performance metrics
CREATE TABLE IndexPerformanceBaseline (
CaptureDate DATETIME NOT NULL DEFAULT GETDATE(),
TableName NVARCHAR(128),
IndexName NVARCHAR(128),
UserSeeks BIGINT,
UserScans BIGINT,
UserLookups BIGINT,
UserUpdates BIGINT,
AvgFragmentation FLOAT,
PageCount BIGINT
);
-- Capture baseline
INSERT INTO IndexPerformanceBaseline
SELECT
GETDATE(),
OBJECT_NAME(i.object_id),
i.name,
ISNULL(ius.user_seeks, 0),
ISNULL(ius.user_scans, 0),
ISNULL(ius.user_lookups, 0),
ISNULL(ius.user_updates, 0),
ips.avg_fragmentation_in_percent,
ips.page_count
FROM sys.indexes i
LEFT JOIN sys.dm_db_index_usage_stats ius
ON i.object_id = ius.object_id AND i.index_id = ius.index_id AND ius.database_id = DB_ID()
LEFT JOIN sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1;
-- Compare against baseline (run after some time)
SELECT
cur.TableName,
cur.IndexName,
cur.UserSeeks - baseline.UserSeeks AS SeeksDelta,
cur.UserScans - baseline.UserScans AS ScansDelta,
cur.UserUpdates - baseline.UserUpdates AS UpdatesDelta,
cur.AvgFragmentation - baseline.AvgFragmentation AS FragmentationDelta,
DATEDIFF(HOUR, baseline.CaptureDate, cur.CaptureDate) AS HoursSinceBaseline
FROM IndexPerformanceBaseline baseline
CROSS APPLY (
SELECT TOP 1 *
FROM IndexPerformanceBaseline latest
WHERE latest.TableName = baseline.TableName
AND latest.IndexName = baseline.IndexName
ORDER BY CaptureDate DESC
) cur -- CURRENT is a reserved keyword, so use a different alias
WHERE baseline.CaptureDate = (SELECT MIN(CaptureDate) FROM IndexPerformanceBaseline)
ORDER BY SeeksDelta DESC;
Best Practices Summary
DO:
- Always create a clustered index, typically on the primary key
- Use covering indexes for frequently-run queries
- Create filtered indexes for subset queries
- Order composite index columns by selectivity (most selective first)
- Monitor index usage and drop unused indexes
- Rebuild fragmented indexes regularly (>30% fragmentation)
- Include the columns your hot queries need so they avoid key lookups
- Keep INCLUDE lists focused; every included column widens the index leaf level
DON'T:
- Create indexes on every column "just in case"
- Index columns with very low selectivity (e.g., boolean, status with 2-3 values)
- Create too many indexes on write-heavy tables
- Ignore index maintenance (fragmentation)
- Forget to analyze execution plans before creating indexes
- Create duplicate or overlapping indexes
- Use index hints unless absolutely necessary
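To act on that last pair of points, duplicate and overlapping indexes can be flagged by comparing leading key columns per table. A minimal sketch (two indexes sharing a leading key are not always redundant, so review the hits manually):

```sql
-- Flag index pairs on the same table that share the same leading key column,
-- a common sign of overlap worth reviewing before dropping anything
WITH IndexKeys AS (
    SELECT
        OBJECT_NAME(i.object_id) AS TableName,
        i.name AS IndexName,
        c.name AS LeadingColumn
    FROM sys.indexes i
    INNER JOIN sys.index_columns ic
        ON i.object_id = ic.object_id AND i.index_id = ic.index_id
       AND ic.key_ordinal = 1
    INNER JOIN sys.columns c
        ON ic.object_id = c.object_id AND ic.column_id = c.column_id
    WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
)
SELECT a.TableName, a.IndexName, b.IndexName AS OverlapsWith, a.LeadingColumn
FROM IndexKeys a
INNER JOIN IndexKeys b
    ON a.TableName = b.TableName
   AND a.LeadingColumn = b.LeadingColumn
   AND a.IndexName < b.IndexName
ORDER BY a.TableName;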
Key Takeaways
- Clustered indexes physically sort data - only ONE per table, choose wisely (usually primary key)
- Non-clustered indexes are separate structures - can have up to 999 per table
- Covering indexes eliminate key lookups by including all needed columns - massive performance gains
- Filtered indexes reduce index size and improve performance for subset queries
- Index maintenance is critical - rebuild when fragmentation > 30%, reorganize when 10-30%
- Monitor index usage - drop unused indexes that slow down writes
- Composite index column order matters - most selective columns first
- Analyze execution plans before creating indexes to verify they'll be used
Additional Resources
- SQL Server Index Architecture
- Index Design Guidelines
- Execution Plans Explained
- DMV Queries for Performance
- Brent Ozar's Index Tuning Resources
Next Steps
- Audit your database: Run missing index queries and find unused indexes
- Create covering indexes: Identify your top 10 queries and optimize them
- Implement monitoring: Set up automated index health checks
- Schedule maintenance: Create SQL Agent jobs for regular index rebuilding
- Learn execution plans: Master reading and interpreting query plans
- Explore columnstore: For analytical workloads, investigate columnstore indexes
- Test everything: Use non-production environments to test index changes
Ready to supercharge your database performance? Start with the missing index queries and work through your most expensive operations first, and you'll see dramatic improvements immediately!