Power Query M: Language Transformations

Introduction

Power Query M is the functional programming language behind Power Query's data transformation engine in Power BI, Excel, and other Microsoft data tools. While the graphical interface handles 80% of common scenarios, mastering M unlocks the remaining 20% that separates basic reports from enterprise-grade, scalable, maintainable data solutions.

M is a case-sensitive, functional language with lazy evaluation semantics that enables sophisticated ETL patterns: dynamic data source connections, complex conditional logic, reusable function libraries, API pagination handling, incremental refresh optimization, and error-resilient data pipelines. Unlike SQL's declarative approach or Python's imperative style, M's functional paradigm requires thinking in terms of immutable transformations and composed expressions.

However, M's power comes with complexity: query folding behavior is often opaque, error messages can be cryptic, performance bottlenecks emerge from innocent-looking transformations, and debugging is challenging without proper techniques. A poorly written M query can slow refreshes from seconds to hours or break when data sources change slightly.

This comprehensive guide covers everything you need to write production-quality M code: from functional programming fundamentals through query folding optimization, custom function development, error handling patterns, parameterization strategies, API integration techniques, incremental refresh implementation, performance profiling, modular design patterns, and troubleshooting methodologies. Whether you're writing your first custom function or optimizing enterprise ETL pipelines, you'll learn battle-tested patterns that balance power, performance, and maintainability.

Prerequisites

  • Power BI Desktop or Excel with Power Query
  • Understanding of basic Power Query UI operations
  • Familiarity with SQL and/or data transformation concepts
  • (Optional) Functional programming background
  • (Optional) Understanding of REST API concepts

M Language Fundamentals

The Functional Programming Paradigm

M is a case-sensitive, functional language with immutable data structures and first-class functions. Every transformation step returns a new value rather than modifying existing data, enabling Power Query to optimize execution through lazy evaluation and query folding.

Key Characteristics:

Immutability: Once created, values cannot be changed. Each transformation step creates a new dataset rather than modifying the original.

// Each step produces a new table
Source = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx")),
SalesSheet = Source{[Item="Sales", Kind="Sheet"]}[Data],            // Navigate to the sheet
FilteredRows = Table.SelectRows(SalesSheet, each [Amount] > 1000),  // New table
TransformedData = Table.TransformColumns(FilteredRows, ...)         // Another new table

Lazy Evaluation: M expressions are not evaluated until their results are actually needed. This allows Power Query to optimize the execution plan and potentially push operations to the data source (query folding).

// These steps define transformations but don't execute yet
Step1 = Table.SelectRows(Source, each [Date] >= #date(2025,1,1)),
Step2 = Table.SelectColumns(Step1, {"ID", "Name", "Amount"}),
Step3 = Table.Sort(Step2, {{"Amount", Order.Descending}})
// Execution only happens when the final result is materialized

First-Class Functions: Functions are values that can be assigned to variables, passed as arguments, and returned from other functions.

// Function as a value
Multiply = (x, y) => x * y,
ApplyOperation = (operation, a, b) => operation(a, b),
Result = ApplyOperation(Multiply, 5, 3)  // Returns 15

Complete M Syntax Guide

Data Types:

// Numbers
IntegerValue = 42,
DecimalValue = 3.14159,
ScientificNotation = 1.5e6,

// Text (immutable strings)
SimpleText = "Hello, Power Query",
MultilineText = "Line 1" & "#(cr,lf)" & "Line 2",
EscapedQuotes = "She said ""Hello""",

// Logical
BooleanTrue = true,
BooleanFalse = false,

// Dates and Times
DateValue = #date(2025, 3, 17),
TimeValue = #time(14, 30, 0),
DateTimeValue = #datetime(2025, 3, 17, 14, 30, 0),
DateTimeZoneValue = #datetimezone(2025, 3, 17, 14, 30, 0, -5, 0),  // EST
DurationValue = #duration(2, 5, 30, 15),  // 2 days, 5 hours, 30 min, 15 sec

// Lists (ordered collections)
SimpleList = {1, 2, 3, 4, 5},
MixedList = {1, "text", true, #date(2025,1,1)},
RangeList = {1..100},  // Creates list from 1 to 100
AccessElement = SimpleList{0},  // Returns 1 (zero-indexed)

// Records (key-value pairs)
PersonRecord = [
    FirstName = "Vladimir",
    LastName = "Luis",
    Age = 35,
    IsActive = true
],
AccessField = PersonRecord[FirstName],  // Returns "Vladimir"

// Tables (typed collections of records)
SampleTable = #table(
    {"ID", "Name", "Amount"},  // Column names
    {
        {1, "Alice", 1500},
        {2, "Bob", 2300},
        {3, "Charlie", 1800}
    }
),

// Null
NullValue = null
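
A note on shorthand used throughout this guide: each introduces a one-parameter function whose parameter is named _, and a bare field reference like [Amount] inside an each is shorthand for _[Amount]. These three filters are equivalent:

// "each" is sugar for a single-parameter function
Filtered1 = Table.SelectRows(SampleTable, each [Amount] > 1500),
Filtered2 = Table.SelectRows(SampleTable, each _[Amount] > 1500),
Filtered3 = Table.SelectRows(SampleTable, (row) => row[Amount] > 1500)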

Query Folding Mastery

Understanding Query Folding

Query folding is Power Query's ability to translate M transformations into native data source queries (SQL, OData, etc.). When folding succeeds, operations execute at the source rather than loading all data into memory, dramatically improving performance for large datasets.

How to Check Query Folding:

  1. Right-click a step in Applied Steps
  2. Look for "View Native Query" option
  3. If available and shows SQL/native code → folding works
  4. If unavailable or greyed out → folding broken at this step

Example of Successful Folding:

// This entire query can fold to SQL Server
let
    Source = Sql.Database("prod-server.database.windows.net", "SalesDB"),
    SalesTable = Source{[Schema="dbo", Item="Sales"]}[Data],
    FilteredRows = Table.SelectRows(SalesTable, each [OrderDate] >= #date(2025,1,1)),
    SelectedColumns = Table.SelectColumns(FilteredRows, {"OrderID", "CustomerID", "Amount"}),
    SortedRows = Table.Sort(SelectedColumns, {{"Amount", Order.Descending}}),
    TopN = Table.FirstN(SortedRows, 1000)
in
    TopN

// Native Query (View Native Query shows):
-- SELECT TOP 1000 OrderID, CustomerID, Amount
-- FROM dbo.Sales
-- WHERE OrderDate >= '2025-01-01'
-- ORDER BY Amount DESC

Operations That Support Folding

| Operation | M Function | Folds? | Notes |
|---|---|---|---|
| Filter rows | Table.SelectRows | ✅ | With simple predicates |
| Select columns | Table.SelectColumns | ✅ | Column pruning |
| Remove columns | Table.RemoveColumns | ✅ | Efficient at source |
| Sort | Table.Sort | ✅ | Pushed to ORDER BY |
| Top N rows | Table.FirstN | ✅ | Becomes LIMIT/TOP |
| Join tables | Table.Join, Table.NestedJoin | ✅ | Both sources must support folding |
| Group by | Table.Group | ✅ | Aggregations at source |
| Rename columns | Table.RenameColumns | ✅ | Alias columns |
| Data type changes | Table.TransformColumnTypes | ✅ | CAST operations |
| Distinct rows | Table.Distinct | ✅ | SELECT DISTINCT |

Operations That Break Folding

| Operation | Why It Breaks | Alternative |
|---|---|---|
| Add index column | No native equivalent | Add at source if possible |
| Custom columns with complex logic | Can't translate to SQL | Use computed columns at source |
| Merge after custom function | Function not foldable | Restructure to fold first |
| Text operations on aggregated data | Post-aggregation transforms | Transform before aggregation |
| Invoked custom functions per row | Row-by-row evaluation | Use table functions or fold first |
| Table.Buffer | Explicit materialization | Use sparingly |
| Some Text.* functions | No SQL equivalent | Apply after folding steps |
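
When a transformation genuinely can't fold, one escape hatch is to hand-write the native query yourself. The sketch below assumes a SQL Server source and a recent Power Query version; with the EnableFolding option, later M steps can still fold on top of the hand-written query:

let
    Source = Sql.Database("server", "db"),
    // Hand-written native query with a named parameter (@minAmount)
    Native = Value.NativeQuery(
        Source,
        "SELECT OrderID, CustomerID, Amount FROM dbo.Sales WHERE Amount > @minAmount",
        [minAmount = 1000],
        [EnableFolding = true]  // Allows subsequent steps to fold
    ),
    // This filter can still fold into the query above
    Filtered = Table.SelectRows(Native, each [CustomerID] <> null)
in
    Filtered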

Query Folding Best Practices

// ❌ BAD: Breaks folding early
let
    Source = Sql.Database("server", "db"),
    Sales = Source{[Schema="dbo", Item="Sales"]}[Data],
    AddedIndex = Table.AddIndexColumn(Sales, "Index", 1, 1),  // BREAKS FOLDING
    FilteredRows = Table.SelectRows(AddedIndex, each [Amount] > 1000),  // No longer folds
    Result = Table.SelectColumns(FilteredRows, {"CustomerID", "Amount"})
in
    Result

// ✅ GOOD: Maximize folding first, then add index
let
    Source = Sql.Database("server", "db"),
    Sales = Source{[Schema="dbo", Item="Sales"]}[Data],
    FilteredRows = Table.SelectRows(Sales, each [Amount] > 1000),  // Folds
    SelectedColumns = Table.SelectColumns(FilteredRows, {"CustomerID", "Amount"}),  // Folds
    AddedIndex = Table.AddIndexColumn(SelectedColumns, "Index", 1, 1)  // Breaks, but minimal data
in
    AddedIndex
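
If you want the folding boundary to be explicit rather than an accident of step order, newer versions of Power Query also provide Table.StopFolding (a sketch; verify the function is available in your client):

let
    Source = Sql.Database("server", "db"),
    Sales = Source{[Schema="dbo", Item="Sales"]}[Data],
    FilteredRows = Table.SelectRows(Sales, each [Amount] > 1000),  // Folds
    // Everything after this point deliberately runs locally
    LocalOnly = Table.StopFolding(FilteredRows),
    AddedIndex = Table.AddIndexColumn(LocalOnly, "Index", 1, 1)
in
    AddedIndex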

Performance Impact Example:

  • Dataset: 10 million rows from SQL Server
  • Filter condition: [OrderDate] >= 2025-01-01 (returns 500K rows)
  • With folding: 2-3 seconds (500K rows transferred)
  • Without folding: 3-5 minutes (10M rows transferred, filtered locally)

Diagnosing Folding Issues

// Create a diagnostic query to test each step
let
    Source = Sql.Database("server", "db"),
    Sales = Source{[Schema="dbo", Item="Sales"]}[Data],
    
    // Test Step 1
    Step1 = Table.SelectRows(Sales, each [Date] >= #date(2025,1,1)),
    // Right-click → View Native Query: ✅ Works
    
    // Test Step 2
    Step2 = Table.AddColumn(Step1, "Custom", each [Amount] * 1.1),
    // Right-click → View Native Query: ✅ Still works (simple math)
    
    // Test Step 3
    Step3 = Table.AddIndexColumn(Step2, "Index", 1, 1),
    // Right-click → View Native Query: ❌ Greyed out (folding broken)
    
    FinalResult = Step3
in
    FinalResult

Advanced Custom Functions

Function Syntax and Structure

// Basic function syntax
(parameter1 as type, parameter2 as type, ...) as returnType => expression

// Named function
let
    fnAddNumbers = (x as number, y as number) as number => x + y
in
    fnAddNumbers

// Multi-line function with let-in expression
let
    fnCalculateDiscount = (amount as number, discountPercent as number) as number =>
        let
            Discount = amount * (discountPercent / 100),
            FinalAmount = amount - Discount
        in
            FinalAmount
in
    fnCalculateDiscount
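
Parameters can also be marked optional; an omitted optional argument arrives as null, so supply the default inside the body. A small sketch (fnCalculateDiscountOpt is an illustrative name):

// Optional parameter with an in-body default
let
    fnCalculateDiscountOpt = (amount as number, optional discountPercent as nullable number) as number =>
        let
            Pct = if discountPercent = null then 10 else discountPercent,  // Default: 10%
            FinalAmount = amount - amount * (Pct / 100)
        in
            FinalAmount
in
    fnCalculateDiscountOpt

// fnCalculateDiscountOpt(200) returns 180; fnCalculateDiscountOpt(200, 25) returns 150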

Error-Resilient Custom Functions

// Function with comprehensive error handling
let
    fnSafeParseNumber = (value as any) as number =>
        let
            Result = try
                if value = null then
                    0
                else if Value.Is(value, type number) then
                    value
                else if Value.Is(value, type text) then
                    Number.From(Text.Trim(value))
                else
                    error Error.Record("InvalidType", "Cannot convert to number", value)
            otherwise
                0
        in
            Result  // try ... otherwise already yields the fallback (0) on error
in
    fnSafeParseNumber

// Usage
Source = Excel.Workbook(File.Contents("C:\Data\Messy.xlsx")),
Data = Source{[Item="Sheet1"]}[Data],
Cleaned = Table.TransformColumns(Data, {{"Amount", fnSafeParseNumber}})

Reusable Table Function Library

// Create a library of table transformation functions
// Save this as a query named "fnTableLibrary"
let
    // Remove columns with all null values
    fnRemoveEmptyColumns = (tbl as table) as table =>
        let
            ColumnNames = Table.ColumnNames(tbl),
            NonEmptyCols = List.Select(ColumnNames, each List.NonNullCount(Table.Column(tbl, _)) > 0)
        in
            Table.SelectColumns(tbl, NonEmptyCols),
    
    // Standardize column names (uppercase, replace spaces with underscores)
    fnStandardizeColumnNames = (tbl as table) as table =>
        let
            OldNames = Table.ColumnNames(tbl),
            NewNames = List.Transform(OldNames, each Text.Upper(Text.Replace(_, " ", "_"))),
            Renamed = Table.RenameColumns(tbl, List.Zip({OldNames, NewNames}))
        in
            Renamed,
    
    // Trim all text columns
    fnTrimAllTextColumns = (tbl as table) as table =>
        let
            TextColumns = Table.ColumnsOfType(tbl, {type text, type nullable text}),
            Trimmed = Table.TransformColumns(tbl, 
                List.Transform(TextColumns, each {_, Text.Trim, type text}))
        in
            Trimmed,
    
    // Unpivot all columns except specified keys
    fnUnpivotExceptKeys = (tbl as table, keyColumns as list) as table =>
        Table.UnpivotOtherColumns(tbl, keyColumns, "Attribute", "Value"),
    
    // Export all functions as record
    Functions = [
        RemoveEmptyColumns = fnRemoveEmptyColumns,
        StandardizeColumnNames = fnStandardizeColumnNames,
        TrimAllTextColumns = fnTrimAllTextColumns,
        UnpivotExceptKeys = fnUnpivotExceptKeys
    ]
in
    Functions

// Usage in other queries:
let
    Lib = fnTableLibrary,
    Source = Excel.Workbook(File.Contents("C:\Data\Source.xlsx")),
    RawData = Source{[Item="Data"]}[Data],
    Step1 = Lib[RemoveEmptyColumns](RawData),
    Step2 = Lib[StandardizeColumnNames](Step1),
    Step3 = Lib[TrimAllTextColumns](Step2)
in
    Step3

Parameterization Patterns

Environment-Specific Parameters

// Create parameters (Home → Manage Parameters)
// ServerName: "prod-server.database.windows.net" (Production)
// DatabaseName: "SalesDB"

let
    Source = Sql.Database(ServerName, DatabaseName),
    SalesData = Source{[Schema="dbo", Item="Sales"]}[Data]
in
    SalesData

// Easily switch between Dev/QA/Prod by changing parameter values
// No code changes required
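
Under the hood, a parameter is just a query whose value carries metadata. If you prefer the Advanced Editor to the dialog, this sketch shows the shape Power BI itself generates (the meta fields shown are the ones it typically writes):

// Query named "ServerName"
"prod-server.database.windows.net" meta [
    IsParameterQuery = true,
    Type = "Text",
    IsParameterQueryRequired = true
]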

Dynamic Date Range Parameters for Incremental Refresh

// For incremental refresh, create RangeStart and RangeEnd parameters
// Power BI manages these automatically during refresh

let
    Source = Sql.Database("server", "db"),
    Sales = Source{[Schema="dbo", Item="Sales"]}[Data],
    
    // Filter using parameters (MUST support folding for incremental refresh)
    FilteredRows = Table.SelectRows(Sales, each 
        [OrderDate] >= RangeStart and [OrderDate] < RangeEnd
    )
in
    FilteredRows

// In Power BI Desktop:
// 1. Define parameters: RangeStart (DateTime), RangeEnd (DateTime)
// 2. Use them in query as shown above
// 3. Configure incremental refresh policy on the table

Function Parameters with Retry Logic

// Function that accepts parameters and includes retry logic
let
    fnGetDataWithRetry = (
        server as text,
        database as text,
        schema as text,
        tableName as text,
        maxRetries as number
    ) as table =>
        let
            GetData = (retryCount as number) as table =>
                try
                    let
                        Source = Sql.Database(server, database),
                        TableData = Source{[Schema=schema, Item=tableName]}[Data]
                    in
                        TableData
                otherwise
                    if retryCount < maxRetries then
                        // Wait 2 seconds, then retry
                        Function.InvokeAfter(() => @GetData(retryCount + 1), #duration(0, 0, 0, 2))
                    else
                        error Error.Record(
                            "MaxRetriesExceeded",
                            "Failed to connect after " & Number.ToText(maxRetries) & " attempts",
                            [Server=server, Database=database, Table=tableName]
                        ),
            
            Result = GetData(0)
        in
            Result
in
    fnGetDataWithRetry

// Usage:
fnGetDataWithRetry("server.database.windows.net", "SalesDB", "dbo", "Sales", 3)

API Integration Patterns

Offset-Based Pagination

// Paginate through API results using offset/limit pattern
let
    BaseUrl = "https://api.contoso.com/v1/customers",
    PageSize = 100,
    
    GetPage = (offset as number) as table =>
        let
            Url = BaseUrl & "?limit=" & Number.ToText(PageSize) & "&offset=" & Number.ToText(offset),
            Response = Json.Document(Web.Contents(Url)),
            Data = Response[data],
            PageTable = if Data = null or List.Count(Data) = 0 
                    then #table({"id", "name", "email"}, {}) 
                    else Table.FromRecords(Data)
        in
            PageTable,
    
    GetAllPages = () as table =>
        let
            AllPages = List.Generate(
                () => [Page = GetPage(0), Offset = 0],
                each [Page] <> null and Table.RowCount([Page]) > 0,  // Emit non-empty pages
                each [
                    // A partial page means the one just emitted was the last
                    Page = if Table.RowCount([Page]) = PageSize
                           then GetPage([Offset] + PageSize)
                           else null,
                    Offset = [Offset] + PageSize
                ],
                each [Page]
            ),
            Combined = if List.Count(AllPages) = 0
                       then #table({"id", "name", "email"}, {})
                       else Table.Combine(AllPages)
        in
            Combined,
    
    Result = GetAllPages()
in
    Result

Cursor-Based Pagination

// Paginate using next page tokens/cursors
let
    BaseUrl = "https://api.contoso.com/v2/orders",
    
    GetPages = (cursor as nullable text) as list =>
        let
            Url = if cursor = null then BaseUrl else BaseUrl & "?cursor=" & Uri.EscapeDataString(cursor),
            Response = Json.Document(Web.Contents(Url)),
            Data = Response[data],
            NextCursor = Record.FieldOrDefault(Response, "next_cursor", null),  // null on the last page
            
            CurrentPage = Table.FromRecords(Data),
            Result = if NextCursor = null then
                {CurrentPage}  // Last page
            else
                {CurrentPage} & @GetPages(NextCursor)  // Recursive call
        in
            Result,
    
    AllPages = GetPages(null),
    Combined = Table.Combine(AllPages)
in
    Combined
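
Recursion is fine for a modest page count, but every nested call deepens the evaluation stack. The same cursor walk can be written iteratively with List.Generate (a sketch against the same assumed endpoint and response shape):

let
    BaseUrl = "https://api.contoso.com/v2/orders",
    
    GetPage = (cursor as nullable text) as record =>
        let
            Url = if cursor = null then BaseUrl
                  else BaseUrl & "?cursor=" & Uri.EscapeDataString(cursor),
            Response = Json.Document(Web.Contents(Url))
        in
            [
                Data = Table.FromRecords(Response[data]),
                NextCursor = Record.FieldOrDefault(Response, "next_cursor", null)
            ],
    
    Pages = List.Generate(
        () => GetPage(null),                      // First page
        each _ <> null,                           // Stop once there is no next state
        each if [NextCursor] = null then null     // Last page already emitted
             else GetPage([NextCursor]),
        each [Data]
    ),
    Combined = Table.Combine(Pages)
in
    Combined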

REST API with OAuth2 Authentication

// Separate authentication into its own query (named "fnAPIAuth")
let
    TokenUrl = "https://login.contoso.com/oauth/token",
    ClientId = "your-client-id",
    ClientSecret = "your-client-secret",
    
    TokenRequest = Web.Contents(
        TokenUrl,
        [
            Headers = [#"Content-Type" = "application/x-www-form-urlencoded"],
            Content = Text.ToBinary(
                "grant_type=client_credentials" &
                "&client_id=" & ClientId &
                "&client_secret=" & ClientSecret
            )
        ]
    ),
    TokenResponse = Json.Document(TokenRequest),
    AccessToken = TokenResponse[access_token]
in
    AccessToken

// Main data query referencing authentication
let
    Token = fnAPIAuth,
    Url = "https://api.contoso.com/v1/data",
    
    Response = Json.Document(Web.Contents(
        Url,
        [Headers = [Authorization = "Bearer " & Token]]
    )),
    
    Data = Table.FromRecords(Response[results])
in
    Data
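
One caution for the Power BI service: scheduled refresh validates data sources against the static URL passed to Web.Contents, so URLs built by string concatenation can block refresh. The commonly recommended pattern keeps the base URL constant and moves the variable parts into the RelativePath and Query options:

let
    Token = fnAPIAuth,
    // Static base URL stays recognizable to the service;
    // variable parts go into RelativePath and Query
    Response = Json.Document(Web.Contents(
        "https://api.contoso.com",
        [
            RelativePath = "v1/data",
            Query = [page = "1", pageSize = "100"],
            Headers = [Authorization = "Bearer " & Token]
        ]
    )),
    Data = Table.FromRecords(Response[results])
in
    Data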

Error Handling Strategies

Try-Otherwise Pattern

// Basic error handling with fallback
let
    Source = try Csv.Document(File.Contents("C:\Data\Sales.csv")) 
             otherwise Excel.Workbook(File.Contents("C:\Data\Sales.xlsx"))
in
    Source

// Handling specific errors requires the record form of try:
// the otherwise branch cannot inspect the error that occurred
let
    Attempt = try Number.From("invalid"),
    Result = if Attempt[HasError] then
        if Attempt[Error][Reason] = "Expression.Error" then
            0  // Return default value
        else
            error Attempt[Error]  // Re-raise anything unexpected
    else
        Attempt[Value]
in
    Result

Error Record Analysis

// Capture and analyze error details
let
    Url = "https://api.contoso.com/data",
    
    Response = try Json.Document(Web.Contents(Url)),
    
    Result = if Response[HasError] then
        let
            ErrorRec = Response[Error],
            ErrorDetails = [
                Reason = ErrorRec[Reason],
                Message = ErrorRec[Message],
                Detail = ErrorRec[Detail]
            ]
        in
            error Error.Record(
                ErrorRec[Reason],
                "API Call Failed: " & ErrorRec[Message],
                ErrorDetails
            )
    else
        Response[Value]
in
    Result

Data Quality Validation with Errors

// Validate data and raise meaningful errors
let
    Source = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx")),
    Data = Source{[Item="Sales"]}[Data],
    
    // Validate required columns exist
    RequiredColumns = {"OrderID", "CustomerID", "Amount", "OrderDate"},
    ActualColumns = Table.ColumnNames(Data),
    MissingColumns = List.Difference(RequiredColumns, ActualColumns),
    
    Validated = if List.Count(MissingColumns) > 0 then
        error Error.Record(
            "ValidationError",
            "Missing required columns",
            [Missing = MissingColumns, Found = ActualColumns]
        )
    else
        Data,
    
    // Validate data types
    // Note: a failed conversion becomes a cell-level error rather than failing
    // this step (table evaluation is lazy), so wrapping this call in try would
    // not catch it; see the error-row check below
    TypedData = Table.TransformColumnTypes(Validated, {
        {"OrderID", Int64.Type},
        {"Amount", Currency.Type},
        {"OrderDate", type date}
    })
in
    TypedData
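
Because failed conversions become cell-level errors, you can surface them explicitly: Table.SelectRowsWithErrors returns only the rows containing errors, and Table.RemoveRowsWithErrors drops them. A self-contained sketch:

let
    // Sample data with one unconvertible Amount value
    Raw = #table({"OrderID", "Amount"}, {{1, "100"}, {2, "oops"}}),
    Typed = Table.TransformColumnTypes(Raw, {{"Amount", type number}}),
    ErrorRows = Table.SelectRowsWithErrors(Typed, {"Amount"}),
    CleanRows = Table.RemoveRowsWithErrors(Typed, {"Amount"}),
    Result = if Table.RowCount(ErrorRows) > 0 then
        error Error.Record(
            "ValidationError",
            Number.ToText(Table.RowCount(ErrorRows)) & " rows failed type conversion"
        )
    else
        CleanRows
in
    Result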

Performance Optimization

Minimize Data Movement

// ❌ BAD: Breaks folding before the filter, so every row is loaded
let
    Source = Sql.Database("server", "db"),
    AllSales = Source{[Schema="dbo", Item="Sales"]}[Data],
    Buffered = Table.Buffer(AllSales),  // Forces the full table into memory
    FilteredSales = Table.SelectRows(Buffered, each [Date] >= #date(2025,1,1))  // Filters locally
in
    FilteredSales

// ✅ GOOD: Pushes filter to source
let
    Source = Sql.Database("server", "db"),
    Sales = Source{[Schema="dbo", Item="Sales"]}[Data],
    FilteredSales = Table.SelectRows(Sales, each [Date] >= #date(2025,1,1))  // Folds to SQL
in
    FilteredSales

Use Table.Buffer Wisely

// Table.Buffer caches a table in memory
// Use ONLY when referencing the same dataset multiple times in non-foldable operations

// ❌ BAD: Unnecessary buffering
let
    Source = Sql.Database("server", "db"),
    Sales = Source{[Schema="dbo", Item="Sales"]}[Data],
    Buffered = Table.Buffer(Sales),  // Breaks folding: the filter below now runs locally instead of at the source
    Filtered = Table.SelectRows(Buffered, each [Amount] > 1000)
in
    Filtered

// ✅ GOOD: Buffer when needed
let
    Source = Sql.Database("server", "db"),
    Sales = Source{[Schema="dbo", Item="Sales"]}[Data],
    Filtered = Table.SelectRows(Sales, each [Amount] > 1000),  // Folds
    
    // Now we need to reference filtered data multiple times in custom columns
    Buffered = Table.Buffer(Filtered),  // Prevents re-querying source multiple times
    
    WithCustom1 = Table.AddColumn(Buffered, "Metric1", each 
        List.Sum(Table.SelectRows(Buffered, (r) => r[CustomerID] = [CustomerID])[Amount])),
    WithCustom2 = Table.AddColumn(WithCustom1, "Metric2", each 
        List.Average(Table.SelectRows(Buffered, (r) => r[Region] = [Region])[Amount]))
in
    WithCustom2

List.Generate for Efficient Iteration

// List.Generate is more efficient than recursive functions for iteration
// Generates lists without building intermediate collections

// Efficient date range generation
let
    StartDate = #date(2025, 1, 1),
    EndDate = #date(2025, 12, 31),
    
    DateList = List.Generate(
        () => StartDate,                        // Initial value
        each _ <= EndDate,                      // Condition
        each Date.AddDays(_, 1),                // Next value
        each _                                  // Transform (optional)
    ),
    
    DateTable = Table.FromList(DateList, Splitter.SplitByNothing(), {"Date"})
in
    DateTable
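
For the common case of contiguous dates, the built-in List.Dates does the same job in one call (start date, count, step duration):

// Equivalent date list using List.Dates
let
    StartDate = #date(2025, 1, 1),
    EndDate = #date(2025, 12, 31),
    DayCount = Duration.Days(EndDate - StartDate) + 1,  // Inclusive of both endpoints
    DateList = List.Dates(StartDate, DayCount, #duration(1, 0, 0, 0)),
    DateTable = Table.FromList(DateList, Splitter.SplitByNothing(), {"Date"}),
    Typed = Table.TransformColumnTypes(DateTable, {{"Date", type date}})
in
    Typed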

Incremental Refresh Implementation

Configure Incremental Refresh Query

// 1. Create parameters in Power BI Desktop:
//    - RangeStart (DateTime)
//    - RangeEnd (DateTime)

let
    Source = Sql.Database("prod-server.database.windows.net", "SalesDB"),
    Sales = Source{[Schema="dbo", Item="FactSales"]}[Data],
    
    // CRITICAL: This filter MUST support query folding for incremental refresh to work
    IncrementalFilter = Table.SelectRows(Sales, each 
        [OrderDateTime] >= RangeStart and [OrderDateTime] < RangeEnd
    ),
    
    // Additional transformations (ensure they maintain folding if possible)
    SelectedColumns = Table.SelectColumns(IncrementalFilter, 
        {"OrderID", "CustomerID", "ProductID", "OrderDateTime", "Amount"}),
    
    TypedColumns = Table.TransformColumnTypes(SelectedColumns, {
        {"OrderID", Int64.Type},
        {"Amount", Currency.Type},
        {"OrderDateTime", type datetime}
    })
in
    TypedColumns

// 2. Configure Incremental Refresh Policy:
//    - Right-click table in Fields pane → Incremental refresh
//    - Archive data starting: X years/months before refresh date
//    - Incrementally refresh data starting: Y days before refresh date
//    - Detect data changes: (Optional) Select date column

Verify Query Folding for Incremental Refresh

// Create a test query to verify folding works with parameters
let
    // Set test parameter values
    TestRangeStart = #datetime(2025, 1, 1, 0, 0, 0),
    TestRangeEnd = #datetime(2025, 3, 31, 23, 59, 59),
    
    Source = Sql.Database("prod-server.database.windows.net", "SalesDB"),
    Sales = Source{[Schema="dbo", Item="FactSales"]}[Data],
    
    IncrementalFilter = Table.SelectRows(Sales, each 
        [OrderDateTime] >= TestRangeStart and [OrderDateTime] < TestRangeEnd
    )
    // Right-click this step → View Native Query
    // Should show: WHERE OrderDateTime >= '2025-01-01 00:00:00' AND OrderDateTime < '2025-03-31 23:59:59'
in
    IncrementalFilter

Modular Query Design

Staging → Transform → Serve Pattern

// STAGE 1: Raw Data Ingestion (query named "stg_Sales")
// Minimal transformation, preserve source fidelity
let
    Source = Sql.Database("server", "db"),
    SalesRaw = Source{[Schema="dbo", Item="Sales"]}[Data]
in
    SalesRaw

// STAGE 2: Transformation (query named "tfm_Sales")
// Business logic, data quality, standardization
let
    Source = stg_Sales,  // Reference staging query
    
    // Data type corrections
    TypedData = Table.TransformColumnTypes(Source, {
        {"OrderDate", type date},
        {"Amount", Currency.Type}
    }),
    
    // Business rules
    ValidRecords = Table.SelectRows(TypedData, each [Amount] > 0),
    
    // Calculated columns
    WithFiscalYear = Table.AddColumn(ValidRecords, "FiscalYear", each 
        if Date.Month([OrderDate]) >= 7 then Date.Year([OrderDate]) + 1 
        else Date.Year([OrderDate])
    )
in
    WithFiscalYear

// STAGE 3: Serve Layer (query named "Sales")
// Final structure for reporting
let
    Source = tfm_Sales,  // Reference transform query
    
    FinalColumns = Table.SelectColumns(Source, {
        "OrderID", "CustomerID", "OrderDate", "FiscalYear", "Amount"
    })
    // No Table.Buffer here: buffered results do not persist across queries or
    // into the model, so buffering the serve layer only adds refresh overhead
in
    FinalColumns

Shared Function Library

// Create a centralized function library query (named "fnLibrary")
let
    Functions = [
        // Date functions
        GetFiscalYear = (dateValue as date) as number =>
            if Date.Month(dateValue) >= 7 
            then Date.Year(dateValue) + 1 
            else Date.Year(dateValue),
        
        GetWeekNumber = (dateValue as date) as number =>
            Number.RoundDown((Date.DayOfYear(dateValue) - 1) / 7) + 1,
        
        // Text functions
        NormalizeText = (txt as text) as text =>
            Text.Upper(Text.Trim(Text.Clean(txt))),
        
        // Number functions
        SafeDivide = (numerator as number, denominator as number, defaultValue as number) as number =>
            if denominator = 0 then defaultValue else numerator / denominator,
        
        // Table functions
        RemoveDuplicateColumns = (tbl as table) as table =>
            // M column names are already unique, but they can collide
            // case-insensitively (e.g. "Amount" vs "AMOUNT"); keep the first
            let
                Cols = Table.ColumnNames(tbl),
                DistinctCols = List.Distinct(Cols, Comparer.OrdinalIgnoreCase),
                Result = Table.SelectColumns(tbl, DistinctCols)
            in
                Result
    ]
in
    Functions

// Use in any query:
let
    Lib = fnLibrary,
    Source = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx")),
    Data = Source{[Item="Sheet1"]}[Data],
    
    WithFiscalYear = Table.AddColumn(Data, "FiscalYear", each 
        Lib[GetFiscalYear]([OrderDate])),
    
    WithNormalizedNames = Table.TransformColumns(WithFiscalYear, {
        {"CustomerName", Lib[NormalizeText]}
    })
in
    WithNormalizedNames

Troubleshooting and Debugging

Performance Profiling with Diagnostics

// Enable query diagnostics:
// Tools → Query Diagnostics → Start Diagnostics
// Refresh your query
// Tools → Query Diagnostics → Stop Diagnostics

// Analyze "Diagnostic" queries created:
// - Look for steps with long durations
// - Identify "Datasource.Query" operations (native queries)
// - Find "Memory.Usage" to see data volume
// - Check "Query.CombineResults" for merge/append overhead

// Example of slow step identification:
// If you see Table.AddColumn taking 30 seconds,
// likely a row-by-row operation that could be optimized
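
For targeted timing of a single step, Diagnostics.Trace writes a message to the diagnostics log when the traced value is evaluated; the message shows up in the diagnostic output, not in the query result. A minimal sketch:

let
    Source = Sql.Database("server", "db"),
    Sales = Source{[Schema="dbo", Item="Sales"]}[Data],
    // Logs when this step is actually evaluated; the traced
    // expression's value is returned unchanged
    Filtered = Diagnostics.Trace(
        TraceLevel.Information,
        "Evaluating filtered Sales",
        () => Table.SelectRows(Sales, each [Amount] > 1000),
        true  // delayed = true: invoke the function when the trace fires
    )
in
    Filtered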

Common Error Messages Decoded

| Error Message | Cause | Resolution |
|---|---|---|
| DataFormat.Error: Invalid format | Data type mismatch during conversion | Use try...otherwise to handle conversion errors; verify source data quality |
| DataSource.Error: Unable to connect | Connection failure (network, credentials, firewall) | Verify credentials, check network connectivity, ensure the firewall allows outbound connections |
| Expression.Error: The column 'X' was not found | Column name mismatch after source schema change | Update the query to handle schema changes dynamically; use Table.ColumnNames() to verify |
| Expression.Error: A cyclic reference was encountered | Query references itself directly or indirectly | Review query dependencies; restructure to eliminate circular references |
| Query.Timeout: The operation timed out | Long-running query exceeds the timeout threshold | Optimize the query (add filters, improve folding); increase the timeout in data source settings |
| Token.Error: Invalid token | Authentication token expired | Refresh credentials in data source settings; implement token refresh logic for APIs |
| Memory.Error: Out of memory | Dataset too large for available memory | Filter data earlier; use incremental refresh; reduce the memory footprint of transformations |

Debugging Techniques

1. Isolate Steps: Comment out transformations to identify which step causes issues.

let
    Source = Sql.Database("server", "db"),
    Sales = Source{[Schema="dbo", Item="Sales"]}[Data],
    
    // Test each step independently
    Step1 = Table.SelectRows(Sales, each [Date] >= #date(2025,1,1)),
    // Step2 = Table.AddColumn(Step1, "Custom", each [Amount] * 1.1),  // Comment out
    // Step3 = Table.Group(Step2, ...),  // Comment out
    
    Result = Step1  // Change to Step1, Step2, Step3 to test each
in
    Result

2. Add Diagnostic Columns: Insert columns that expose intermediate calculations.

let
    Source = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx")),
    Data = Source{[Item="Sheet1"]}[Data],
    
    // Add diagnostic column to see what's happening
    WithDiagnostic = Table.AddColumn(Data, "DEBUG_AmountType", each 
        if [Amount] = null then "NULL"
        else if Value.Is([Amount], type number) then "NUMBER"
        else if Value.Is([Amount], type text) then "TEXT"
        else "OTHER"
    )
in
    WithDiagnostic

3. Use Table.Profile for Data Quality: Analyze column statistics to identify issues.

let
    Source = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx")),
    Data = Source{[Item="Sheet1"]}[Data],
    
    // Generate profile with min, max, distinct count, null count, error count
    Profile = Table.Profile(Data)
in
    Profile

Additional Troubleshooting Scenarios

| Issue | Cause | Resolution |
|---|---|---|
| Slow refresh despite folding | Large result set transferred | Add filters earlier; use incremental refresh; verify indexes exist at source |
| API errors | Rate limits or authentication expiration | Implement pagination with delays; refresh credentials; add retry logic |
| Data type mismatch | Implicit type conversions fail on dirty data | Use try...otherwise; explicitly set types with error handling; validate source data quality |
| Incremental refresh fails | Parameter misconfiguration or non-foldable filter | Verify RangeStart/RangeEnd are used correctly; check "View Native Query" on the filter step; ensure DateTime parameter types |
| "Circular reference" error | Query references itself directly/indirectly | Review dependencies in the Queries pane; restructure to the staging → transform → serve pattern |
| Custom function not reusable | Function uses query-specific references | Convert to a pure function accepting parameters; avoid referencing external queries inside the function body |

Key Takeaways

  • Query folding is the #1 performance optimization: Always verify with "View Native Query" and structure transformations to maximize folding.
  • M is functional, not imperative: Think in terms of immutable transformations and composed expressions, not loops and mutations.
  • Custom functions enable reusability: Build a library of table and value functions; prefix with fn* for easy identification.
  • Error handling is production-critical: Use try...otherwise, validate data quality early, and provide meaningful error messages.
  • Parameterization enables flexibility: Separate environment-specific values; use RangeStart/RangeEnd for incremental refresh.
  • API integration requires pagination: Use List.Generate for offset/cursor-based patterns; implement retry logic and authentication refresh.
  • Modular design improves maintainability: Separate staging (raw ingestion), transformation (business logic), and serve (reporting) layers.
  • Profile and diagnose systematically: Use Query Diagnostics, Table.Profile, and step-by-step isolation to identify bottlenecks.
  • Incremental refresh requires folding: The date filter using RangeStart/RangeEnd parameters MUST push to the source for incremental refresh to work.
  • Documentation and naming conventions matter: Use clear query names (stg_, tfm_, fn), comment complex logic, and maintain a function library.
