SQL Server Integration Services: ETL, Reporting, and Analysis Services

Introduction

SQL Server Integration Services (SSIS), Reporting Services (SSRS), and Analysis Services (SSAS) together form the SQL Server Business Intelligence stack. This guide covers building ETL packages, creating parameterized reports, designing tabular and multidimensional models, and deploying solutions for enterprise data integration and analytics.

SSIS - Extract, Transform, Load

Setting Up SSIS Project

Visual Studio (SQL Server Data Tools):

  • File → New → Project → Integration Services Project
  • Target Server Version: SQL Server 2019/2022

Project Structure:

  • SSIS Packages (.dtsx files)
  • Connection Managers
  • Parameters and Variables
  • Configurations

Basic ETL Package

Control Flow - Task Orchestration:

<!-- Package structure (simplified) -->
<Executable>
  <!-- 1. Execute SQL Task: Truncate staging table -->
  <ExecuteSQLTask>
    <SQL>TRUNCATE TABLE dbo.StagingCustomers;</SQL>
  </ExecuteSQLTask>
  
  <!-- 2. Data Flow Task: Extract and load data -->
  <DataFlowTask>
    <!-- Source: OLE DB Source -->
    <!-- Destination: SQL Server Destination -->
  </DataFlowTask>
  
  <!-- 3. Execute SQL Task: Merge to production -->
  <ExecuteSQLTask>
    <SQL>EXEC dbo.MergeCustomers;</SQL>
  </ExecuteSQLTask>
</Executable>

Data Flow Components:

  1. Source Components:

    • OLE DB Source (SQL Server, Oracle)
    • ADO.NET Source
    • Flat File Source (CSV, TXT)
    • Excel Source
    • XML Source
  2. Transformations:

    • Lookup (reference data)
    • Derived Column (calculations)
    • Conditional Split (routing)
    • Data Conversion (type changes)
    • Aggregate (GROUP BY)
    • Sort
    • Merge Join
    • Union All
  3. Destination Components:

    • OLE DB Destination
    • SQL Server Destination (fastest, local only)
    • Flat File Destination
    • Recordset Destination (memory)

Complete ETL Example: Customer Data Integration

Scenario: Extract customers from source database, validate/transform, load to data warehouse

Step 1: Create Connection Managers

Right-click Connection Managers → New Connection:

  • SourceDB: OLE DB Connection to source SQL Server
  • TargetDB: OLE DB Connection to data warehouse
  • ErrorLog: Flat File Connection for error logging

Step 2: Control Flow Design

┌─────────────────────────┐
│ Execute SQL Task        │
│ Truncate Staging Table  │
└───────────┬─────────────┘
            │ Success (Green)
            ↓
┌─────────────────────────┐
│ Data Flow Task          │
│ Extract & Load Customers│
└───────────┬─────────────┘
            │ Success
            ↓
┌─────────────────────────┐
│ Execute SQL Task        │
│ Merge to Production     │
└───────────┬─────────────┘
            │ Success
            ↓
┌─────────────────────────┐
│ Send Mail Task          │
│ Success Notification    │
└─────────────────────────┘

Step 3: Data Flow Design

OLE DB Source
    ↓
Lookup Transformation (Validate Country Code)
    ↓ Match Output
Derived Column (Calculate FullName, Format Phone)
    ↓
Conditional Split
    ├─ IsValid = True  → SQL Server Destination (Staging)
    └─ IsValid = False → Flat File Destination (Error Log)

Step 4: Transformations Configuration

Derived Column Transformation:

Column Name     Expression                       Data Type
FullName        [FirstName] + " " + [LastName]   DT_WSTR(100)
FormattedPhone  REPLACE([Phone], "-", "")        DT_WSTR(20)
LoadDate        GETDATE()                        DT_DBTIMESTAMP

Conditional Split Transformation:

Output Name     Condition
ValidRecords    !ISNULL([Email]) && LEN([Email]) > 5
InvalidRecords  (Default Output)
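To make the two transformations concrete, this Python sketch simulates the same logic outside SSIS (the sample records and function names are illustrative, not part of the package):

```python
def derive(row):
    """Mimic the Derived Column transformation: add computed fields."""
    row = dict(row)
    row["FullName"] = f"{row['FirstName']} {row['LastName']}"
    row["FormattedPhone"] = row["Phone"].replace("-", "")
    return row

def is_valid(row):
    """Mimic the Conditional Split condition: !ISNULL([Email]) && LEN([Email]) > 5."""
    return row.get("Email") is not None and len(row["Email"]) > 5

records = [
    {"FirstName": "Ada", "LastName": "Lovelace", "Phone": "555-0100", "Email": "ada@contoso.com"},
    {"FirstName": "Bob", "LastName": "Jones", "Phone": "555-0101", "Email": None},
]

# Route rows exactly as the Conditional Split would
valid = [derive(r) for r in records if is_valid(r)]
invalid = [r for r in records if not is_valid(r)]

print(valid[0]["FullName"], valid[0]["FormattedPhone"])  # Ada Lovelace 5550100
print(len(invalid))  # 1
```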

Step 5: Error Handling

<!-- Configure error output on source -->
<OleDbSource ErrorRowDisposition="RedirectRow">
  <!-- Errors redirected to error log file -->
</OleDbSource>

Slowly Changing Dimensions (SCD)

SCD Type 1: Overwrite (No History)

-- Merge statement for Type 1
MERGE dbo.DimCustomer AS target
USING dbo.StagingCustomer AS source
ON target.CustomerID = source.CustomerID

WHEN MATCHED THEN
    UPDATE SET
        target.CustomerName = source.CustomerName,
        target.Email = source.Email,
        target.ModifiedDate = GETDATE()

WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerID, CustomerName, Email, CreatedDate)
    VALUES (source.CustomerID, source.CustomerName, source.Email, GETDATE());

SCD Type 2: Preserve History (Versioning)

-- Add version columns to dimension
ALTER TABLE dbo.DimCustomer ADD
    EffectiveDate DATETIME NOT NULL DEFAULT('1900-01-01'),
    ExpirationDate DATETIME NULL,
    IsCurrent BIT NOT NULL DEFAULT(1);

-- SCD Type 2 merge logic
-- 0. Capture one load timestamp so the expire and insert steps agree
--    (calling GETDATE() separately in each step would never match)
DECLARE @Now DATETIME = GETDATE();

-- 1. Expire current records where values changed
UPDATE target
SET 
    target.IsCurrent = 0,
    target.ExpirationDate = @Now
FROM dbo.DimCustomer target
INNER JOIN dbo.StagingCustomer source ON target.CustomerID = source.CustomerID
WHERE 
    target.IsCurrent = 1
    AND (target.CustomerName <> source.CustomerName OR target.Email <> source.Email);

-- 2. Insert new versions for the records expired in step 1
INSERT INTO dbo.DimCustomer (CustomerID, CustomerName, Email, EffectiveDate, IsCurrent)
SELECT 
    source.CustomerID,
    source.CustomerName,
    source.Email,
    @Now,
    1
FROM dbo.StagingCustomer source
INNER JOIN dbo.DimCustomer target ON source.CustomerID = target.CustomerID
WHERE 
    target.IsCurrent = 0
    AND target.ExpirationDate = @Now;

-- 3. Insert brand-new records (NOT EXISTS is NULL-safe, unlike NOT IN)
INSERT INTO dbo.DimCustomer (CustomerID, CustomerName, Email, EffectiveDate, IsCurrent)
SELECT 
    s.CustomerID,
    s.CustomerName,
    s.Email,
    @Now,
    1
FROM dbo.StagingCustomer s
WHERE NOT EXISTS (SELECT 1 FROM dbo.DimCustomer d WHERE d.CustomerID = s.CustomerID);
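The three steps can be sketched in memory to make the versioning visible. This Python simulation (lists stand in for the dimension and staging tables; the sample rows are hypothetical) uses a single load timestamp for both the expire and insert steps:

```python
from datetime import datetime

def scd_type2_merge(dim, staging, now=None):
    """Apply SCD Type 2: expire changed rows, insert new versions, add new keys."""
    now = now or datetime.now()
    current = {r["CustomerID"]: r for r in dim if r["IsCurrent"]}

    for src in staging:
        cur = current.get(src["CustomerID"])
        if cur is None:
            # Step 3: brand-new customer
            dim.append({**src, "EffectiveDate": now, "ExpirationDate": None, "IsCurrent": True})
        elif (cur["CustomerName"], cur["Email"]) != (src["CustomerName"], src["Email"]):
            # Step 1: expire the current version
            cur["IsCurrent"] = False
            cur["ExpirationDate"] = now
            # Step 2: insert the new version with the same timestamp
            dim.append({**src, "EffectiveDate": now, "ExpirationDate": None, "IsCurrent": True})
    return dim

dim = [{"CustomerID": 1, "CustomerName": "Acme", "Email": "a@acme.com",
        "EffectiveDate": datetime(2024, 1, 1), "ExpirationDate": None, "IsCurrent": True}]
staging = [{"CustomerID": 1, "CustomerName": "Acme Corp", "Email": "a@acme.com"},
           {"CustomerID": 2, "CustomerName": "Globex", "Email": "g@globex.com"}]

scd_type2_merge(dim, staging)
print(len(dim))                           # 3: old Acme, new Acme, Globex
print(sum(r["IsCurrent"] for r in dim))   # 2 current rows
```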

SSIS Slowly Changing Dimension Wizard:

  1. Drag "Slowly Changing Dimension" from Toolbox to Data Flow
  2. Configure:
    • Business Key: CustomerID
    • Changing Attributes:
      • CustomerName → Type 1 (Overwrite)
      • Address → Type 2 (Historical)
    • Fixed Attributes: CustomerID

The wizard generates the transformation logic automatically. Note that the built-in SCD component processes rows one at a time, so the MERGE approach above usually scales better for large dimensions.

Incremental Loads

-- Track last successful load timestamp
CREATE TABLE dbo.ETLControl (
    PackageName NVARCHAR(100) PRIMARY KEY,
    LastLoadDate DATETIME,
    LastLoadStatus NVARCHAR(20)
);

-- SSIS Package variable: @LastLoadDate
-- Set from ETLControl table in Execute SQL Task

-- Incremental extract query
SELECT 
    CustomerID,
    CustomerName,
    Email,
    ModifiedDate
FROM dbo.Customers
WHERE ModifiedDate > ?  -- Parameter from @LastLoadDate variable

-- Update ETLControl after successful load
UPDATE dbo.ETLControl
SET 
    LastLoadDate = GETDATE(),
    LastLoadStatus = 'Success'
WHERE PackageName = 'CustomerETL';
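One subtlety in the pattern above: advancing the watermark to the maximum ModifiedDate actually extracted, rather than to GETDATE(), avoids skipping rows modified while the load is running. A Python sketch of the watermark logic (sample rows are hypothetical):

```python
from datetime import datetime

def incremental_extract(rows, last_load):
    """Return only rows modified since the stored watermark, plus the new watermark."""
    changed = [r for r in rows if r["ModifiedDate"] > last_load]
    # Advance the watermark to the max ModifiedDate we actually saw,
    # so rows modified mid-load are still picked up next run.
    new_watermark = max((r["ModifiedDate"] for r in changed), default=last_load)
    return changed, new_watermark

rows = [
    {"CustomerID": 1, "ModifiedDate": datetime(2025, 1, 10)},
    {"CustomerID": 2, "ModifiedDate": datetime(2025, 2, 5)},
    {"CustomerID": 3, "ModifiedDate": datetime(2025, 2, 20)},
]

changed, watermark = incremental_extract(rows, datetime(2025, 2, 1))
print(len(changed))   # 2 rows changed since the last load
print(watermark)      # the new watermark to store in ETLControl
```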

Package Parameters and Configurations

Project Parameters:

  • Right-click project → Parameters
  • Add: ServerName, DatabaseName, EmailRecipient

Use Parameter in Connection Manager:

  • Right-click Connection Manager → Parameterize
  • Property: ConnectionString
  • Expression: "Data Source=" + @[$Project::ServerName] + ";Initial Catalog=" + @[$Project::DatabaseName]

Environment Variables (SSISDB Catalog):

-- Create environment
EXEC catalog.create_environment 
    @folder_name = 'ETL',
    @environment_name = 'Production';

-- Add environment variables
EXEC catalog.create_environment_variable
    @folder_name = 'ETL',
    @environment_name = 'Production',
    @variable_name = 'ServerName',
    @data_type = 'String',
    @value = 'PROD-SQL01';

-- Map to project parameter
EXEC catalog.create_environment_reference
    @folder_name = 'ETL',
    @project_name = 'CustomerETL',
    @environment_name = 'Production',
    @reference_id = @reference_id OUTPUT;

EXEC catalog.set_object_parameter_value
    @folder_name = 'ETL',
    @project_name = 'CustomerETL',
    @parameter_name = 'ServerName',
    @parameter_value = 'ServerName',  -- name of the environment variable (because @value_type = 'R')
    @object_type = 20,
    @value_type = 'R';  -- R = Reference

SSRS - Reporting Services

Creating a Report

SQL Server Data Tools → New Report:

  1. Data Source:
<DataSource Name="AdventureWorks">
  <ConnectionString>
    Data Source=localhost;Initial Catalog=AdventureWorks
  </ConnectionString>
  <UseSharedDataSource>true</UseSharedDataSource>
</DataSource>
  2. Dataset:
SELECT 
    o.OrderID,
    c.CustomerName,
    o.OrderDate,
    o.TotalAmount,
    o.Status
FROM dbo.Orders o
INNER JOIN dbo.Customers c ON o.CustomerID = c.CustomerID
WHERE 
    (@StartDate IS NULL OR o.OrderDate >= @StartDate)
    AND (@EndDate IS NULL OR o.OrderDate <= @EndDate)
    AND (@Status IS NULL OR @Status = 'All' OR o.Status = @Status)  -- 'All' bypasses the filter
ORDER BY o.OrderDate DESC;
  3. Report Parameters:
Parameter  Data Type  Default Value               Allow Blank  Allow Null
StartDate  DateTime   =DateAdd("m", -1, Today())  ✓            ✓
EndDate    DateTime   =Today()                    ✓            ✓
Status     String     "All"                       ✓            ✓

Status Parameter - Available Values:

  • Dataset query:
SELECT DISTINCT Status FROM dbo.Orders
UNION ALL
SELECT 'All' AS Status;
  4. Report Layout:

Tablix (Table) Control:

┌──────────┬──────────────┬────────────┬─────────────┬──────────┐
│ Order ID │ Customer     │ Order Date │ Total       │ Status   │
├──────────┼──────────────┼────────────┼─────────────┼──────────┤
│ =OrderID │ =CustomerName│ =OrderDate │ =TotalAmount│ =Status  │
└──────────┴──────────────┴────────────┴─────────────┴──────────┘

Expressions:

  • Format currency: =Format(Fields!TotalAmount.Value, "C")
  • Conditional formatting: =IIF(Fields!Status.Value = "Cancelled", "Red", "Black")

Matrix Report (Pivot Table)

-- Dataset for matrix
SELECT 
    YEAR(OrderDate) AS OrderYear,
    MONTH(OrderDate) AS OrderMonth,
    Status,
    SUM(TotalAmount) AS Revenue
FROM dbo.Orders
GROUP BY YEAR(OrderDate), MONTH(OrderDate), Status;

Matrix Layout:

           │ Pending │ Completed │ Cancelled │ Row Total
───────────┼─────────┼───────────┼───────────┼──────────
2024 Jan   │ $10,000 │  $50,000  │   $2,000  │  $62,000
2024 Feb   │ $15,000 │  $60,000  │   $1,500  │  $76,500
Column Tot │ $25,000 │ $110,000  │   $3,500  │ $138,500
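A matrix behaves like a pivot over the dataset: row groups by year/month, column groups by status, with running totals. A minimal Python equivalent (toy revenue numbers, not the figures above):

```python
from collections import defaultdict

def pivot(rows):
    """Group Revenue into a (year, month) x Status matrix, like a Tablix matrix."""
    matrix = defaultdict(lambda: defaultdict(float))
    for r in rows:
        matrix[(r["OrderYear"], r["OrderMonth"])][r["Status"]] += r["Revenue"]
    return matrix

rows = [
    {"OrderYear": 2024, "OrderMonth": 1, "Status": "Pending", "Revenue": 10000},
    {"OrderYear": 2024, "OrderMonth": 1, "Status": "Completed", "Revenue": 50000},
    {"OrderYear": 2024, "OrderMonth": 2, "Status": "Completed", "Revenue": 60000},
]

m = pivot(rows)
print(sum(m[(2024, 1)].values()))  # January row total
print(sum(cell for row in m.values() for cell in row.values()))  # grand total
```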

Chart Reports

Add Chart Control:

  • Toolbox → Chart → Drag to report
  • Chart Type: Column, Bar, Line, Pie, Area

Configure Chart:

Category Groups: =Fields!OrderMonth.Value
Series Groups: =Fields!Status.Value
Values: =Sum(Fields!TotalAmount.Value)

Subscriptions

SQL Server Reporting Services Portal:

  1. Navigate to report
  2. Subscribe → Create subscription

Email Subscription:

  • To: sales@contoso.com
  • Subject: Monthly Sales Report - @ExecutionTime
  • Render Format: PDF
  • Schedule: Monthly (1st day, 8:00 AM)

Data-Driven Subscription (Enterprise Edition):

-- Create subscriber list
CREATE TABLE dbo.ReportSubscribers (
    SubscriberID INT PRIMARY KEY IDENTITY,
    EmailAddress NVARCHAR(100),
    CustomerID INT,
    FileFormat NVARCHAR(20)
);

-- Query for data-driven subscription
SELECT 
    EmailAddress AS [TO],  -- TO is a reserved word; bracket it
    'Customer Report' AS Subject,
    FileFormat,
    CustomerID  -- Passed to report parameter
FROM dbo.ReportSubscribers;

SSAS - Analysis Services

Creating a Tabular Model

SQL Server Data Tools → New Project → Analysis Services Tabular Project

1. Import Data:

  • Model → Import from Data Source
  • Choose SQL Server
  • Select tables: FactSales, DimCustomer, DimProduct, DimDate

2. Create Relationships:

FactSales[CustomerKey]  → DimCustomer[CustomerKey]
FactSales[ProductKey]   → DimProduct[ProductKey]
FactSales[OrderDateKey] → DimDate[DateKey]

3. Create Measures (DAX):

-- Total Sales
TotalSales := SUM(FactSales[SalesAmount])

-- Total Quantity
TotalQuantity := SUM(FactSales[Quantity])

-- Average Order Value
AvgOrderValue := DIVIDE([TotalSales], DISTINCTCOUNT(FactSales[OrderID]))

-- Sales YTD
SalesYTD := TOTALYTD([TotalSales], DimDate[Date])

-- Sales Previous Year
SalesPY := CALCULATE([TotalSales], SAMEPERIODLASTYEAR(DimDate[Date]))

-- Sales Growth %
SalesGrowth% := DIVIDE([TotalSales] - [SalesPY], [SalesPY], 0)

-- Top 10 Products
Top10ProductSales := 
CALCULATE(
    [TotalSales],
    TOPN(10, ALL(DimProduct[ProductName]), [TotalSales], DESC)
)
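DIVIDE deserves a note: it returns the alternate result (here 0) when the denominator is zero or blank, instead of raising a division error. The growth measure's arithmetic, restated in Python for clarity:

```python
def divide(numerator, denominator, alternate=0):
    """Mimic DAX DIVIDE: return `alternate` instead of failing on a zero denominator."""
    return alternate if denominator == 0 else numerator / denominator

def sales_growth(total_sales, sales_py):
    """SalesGrowth% := DIVIDE([TotalSales] - [SalesPY], [SalesPY], 0)"""
    return divide(total_sales - sales_py, sales_py)

print(sales_growth(120000, 100000))  # 0.2 -> 20% year-over-year growth
print(sales_growth(50000, 0))        # 0 -> first year, no prior-year sales
```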

4. Create Hierarchies:

Date Hierarchy:

  • Year → Quarter → Month → Date

Product Hierarchy:

  • Category → Subcategory → Product

Geography Hierarchy:

  • Country → State → City

5. Calculated Columns:

-- Full Name
DimCustomer[FullName] = DimCustomer[FirstName] & " " & DimCustomer[LastName]

-- Sales Category
FactSales[SalesCategory] = 
    SWITCH(
        TRUE(),
        FactSales[SalesAmount] >= 1000, "High",
        FactSales[SalesAmount] >= 500, "Medium",
        "Low"
    )
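SWITCH(TRUE(), …) evaluates its conditions top-down and returns the first one that matches, so the order of the bands matters. It behaves like an if/elif chain:

```python
def sales_category(amount):
    """Mimic the SWITCH(TRUE(), ...) calculated column: first matching band wins."""
    if amount >= 1000:
        return "High"
    elif amount >= 500:
        return "Medium"
    return "Low"

print(sales_category(1500))  # High
print(sales_category(750))   # Medium
print(sales_category(100))   # Low
```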

MDX Queries (Multidimensional Model)

-- Basic MDX query
SELECT 
    [Measures].[Total Sales] ON COLUMNS,
    [DimDate].[Calendar].[Month].MEMBERS ON ROWS
FROM [SalesCube];

-- Filter by year
SELECT 
    [Measures].[Total Sales] ON COLUMNS,
    [DimProduct].[Category].[Category].MEMBERS ON ROWS
FROM [SalesCube]
WHERE [DimDate].[Calendar].[Year].&[2025];

-- Top 10 products
SELECT 
    [Measures].[Total Sales] ON COLUMNS,
    TOPCOUNT([DimProduct].[Product].[Product].MEMBERS, 10, [Measures].[Total Sales]) ON ROWS
FROM [SalesCube];

-- Year-over-year growth
WITH MEMBER [Measures].[Sales Growth] AS
    ([Measures].[Total Sales] - 
     ([Measures].[Total Sales], ParallelPeriod([DimDate].[Calendar].[Year], 1)))
    / ([Measures].[Total Sales], ParallelPeriod([DimDate].[Calendar].[Year], 1))
    , FORMAT_STRING = "Percent"
SELECT 
    {[Measures].[Total Sales], [Measures].[Sales Growth]} ON COLUMNS,
    [DimDate].[Calendar].[Year].MEMBERS ON ROWS
FROM [SalesCube];

Connecting to SSAS from Excel

Excel → Data → Get Data → From Database → From Analysis Services

  1. Server: localhost\TABULAR
  2. Database: SalesModel
  3. Connect

PivotTable:

  • Rows: DimProduct[Category]
  • Columns: DimDate[Year]
  • Values: TotalSales

Excel Formulas with SSAS:

=CUBEVALUE("SalesModel", "[Measures].[TotalSales]", "[DimDate].[Year].&[2025]")

=CUBEMEMBER("SalesModel", "[DimProduct].[Category].&[Electronics]")

Deployment

Deploy SSIS Package to SSISDB

# PowerShell deployment script
$SSISNamespace = "Microsoft.SqlServer.Management.IntegrationServices"
[Reflection.Assembly]::LoadWithPartialName($SSISNamespace) | Out-Null

$sqlConnectionString = "Data Source=localhost;Initial Catalog=master;Integrated Security=SSPI;"
$sqlConnection = New-Object System.Data.SqlClient.SqlConnection $sqlConnectionString

$integrationServices = New-Object $SSISNamespace.IntegrationServices $sqlConnection

# Create catalog if not exists
if (!$integrationServices.Catalogs["SSISDB"]) {
    $catalog = New-Object $SSISNamespace.Catalog ($integrationServices, "SSISDB", "P@ssw0rd123!")
    $catalog.Create()
}

$catalog = $integrationServices.Catalogs["SSISDB"]

# Create folder
$folder = New-Object $SSISNamespace.CatalogFolder ($catalog, "ETL", "ETL Projects")
$folder.Create()

# Deploy project
[byte[]] $projectFile = [System.IO.File]::ReadAllBytes("C:\Projects\CustomerETL\bin\Development\CustomerETL.ispac")
$folder.DeployProject("CustomerETL", $projectFile)

Write-Host "Project deployed successfully"

Execute SSIS Package

-- Execute package from SSISDB
DECLARE @execution_id BIGINT;

EXEC catalog.create_execution 
    @folder_name = 'ETL',
    @project_name = 'CustomerETL',
    @package_name = 'LoadCustomers.dtsx',
    @execution_id = @execution_id OUTPUT;

-- Set parameter values
EXEC catalog.set_execution_parameter_value 
    @execution_id,
    @object_type = 20,  -- 20 = project parameter (30 = package, 50 = system)
    @parameter_name = N'ServerName',
    @parameter_value = 'PROD-SQL01';

-- Start execution
EXEC catalog.start_execution @execution_id;

-- Check execution status
SELECT 
    execution_id,
    folder_name,
    project_name,
    package_name,
    status,  -- 1 = created, 2 = running, 4 = failed, 7 = succeeded
    start_time,
    end_time
FROM catalog.executions
WHERE execution_id = @execution_id;

SQL Server Agent Job for SSIS

-- Create job
EXEC msdb.dbo.sp_add_job
    @job_name = 'Daily Customer ETL',
    @enabled = 1;

-- Add job step
EXEC msdb.dbo.sp_add_jobstep
    @job_name = 'Daily Customer ETL',
    @step_name = 'Execute SSIS Package',
    @subsystem = 'SSIS',
    @command = '/ISSERVER "\"\SSISDB\ETL\CustomerETL\LoadCustomers.dtsx\"" /SERVER "\"localhost\"" /Par "\"$ServerOption::LOGGING_LEVEL(Int16)\"";1 /Par "\"$ServerOption::SYNCHRONIZED(Boolean)\"";True /CALLERINFO SQLAGENT /REPORTING E',
    @retry_attempts = 3,
    @retry_interval = 5;

-- Add schedule (daily at 2 AM)
EXEC msdb.dbo.sp_add_schedule
    @schedule_name = 'Daily at 2 AM',
    @freq_type = 4,  -- Daily
    @freq_interval = 1,
    @active_start_time = 020000;

EXEC msdb.dbo.sp_attach_schedule
    @job_name = 'Daily Customer ETL',
    @schedule_name = 'Daily at 2 AM';

-- Target the local server so the job can run
EXEC msdb.dbo.sp_add_jobserver
    @job_name = 'Daily Customer ETL';

-- Notify an operator by email when the job fails
EXEC msdb.dbo.sp_update_job
    @job_name = 'Daily Customer ETL',
    @notify_level_email = 2,  -- 2 = on failure
    @notify_email_operator_name = 'DBA';

Key Takeaways

  • SSIS provides robust ETL with visual designer and transformations
  • Use Lookup transformation for reference data validation
  • Implement SCD Type 2 for historical dimension tracking
  • Design incremental loads with watermark timestamps
  • SSRS supports parameterized reports with subscriptions
  • Matrix reports enable pivot table-style analysis
  • SSAS Tabular uses DAX for powerful business logic
  • Deploy SSIS packages to SSISDB catalog for management
  • Schedule ETL jobs with SQL Server Agent
  • Monitor executions with catalog views

Next Steps

  • Build first SSIS package with error handling
  • Create parameterized SSRS report with drill-down
  • Design SSAS tabular model with DAX measures
  • Automate deployments with PowerShell
  • Implement logging and auditing for ETL processes

Extract. Transform. Load. Analyze. Decide.