.NET 8 Performance: JIT Improvements, Native AOT, and Runtime Optimizations

Introduction

.NET 8 delivers substantial performance improvements across the runtime, libraries, and compiler. This guide explores JIT (Just-In-Time) compiler enhancements, Dynamic Profile-Guided Optimization (PGO), Native Ahead-of-Time (AOT) compilation for minimal startup and memory footprint, new LINQ optimizations, Span enhancements, and overall runtime performance gains that make .NET 8 the fastest .NET yet.

JIT Compiler Improvements

Dynamic PGO (Profile-Guided Optimization)

Enabling PGO (on by default in .NET 8; the properties below only make the defaults explicit):

<PropertyGroup>
  <PublishAot>false</PublishAot>
  <TieredCompilation>true</TieredCompilation>
  <TieredPGO>true</TieredPGO>
</PropertyGroup>

How PGO Works:

// The JIT instruments this method at tier 0, then re-optimizes it using the observed branch frequencies
public decimal CalculateDiscount(Order order)
{
    // Suppose profiling shows this branch is taken 95% of the time
    if (order.CustomerType == CustomerType.Premium)
    {
        return order.Total * 0.15m;  // Hot path - optimized
    }
    else if (order.CustomerType == CustomerType.Gold)
    {
        return order.Total * 0.10m;  // Warm path
    }
    else
    {
        return 0m;  // Cold path - less optimized
    }
}

PGO Benefits:

Without PGO:
- All code paths equally optimized
- Generic branch prediction
- Uniform inlining decisions

With Dynamic PGO:
- Hot paths heavily optimized
- Cold paths minimally optimized
- Smart inlining based on actual usage
- Better register allocation
- Gains of roughly 20-30% in some real scenarios (the actual figure depends on the workload)
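
One optimization that Dynamic PGO makes possible is guarded devirtualization: when profiling shows that an interface or virtual call almost always hits one concrete type, the JIT can emit a type check followed by a direct, inlinable call. The sketch below writes that transformation out by hand to show the idea. The names IShape, Circle, Square, and AreaCalculator are hypothetical, and the code the JIT actually emits will differ in detail.

```csharp
using System;

IShape hot = new Circle(2.0);
IShape cold = new Square(3.0);

// Both versions return the same result; only the call mechanics differ.
Console.WriteLine(AreaCalculator.AreaAsWritten(hot) == AreaCalculator.AreaAsOptimized(hot));   // True
Console.WriteLine(AreaCalculator.AreaAsWritten(cold) == AreaCalculator.AreaAsOptimized(cold)); // True

public interface IShape { double Area(); }

public sealed class Circle(double radius) : IShape
{
    public double Area() => Math.PI * radius * radius;
}

public sealed class Square(double side) : IShape
{
    public double Area() => side * side;
}

public static class AreaCalculator
{
    // What the source code says: an interface call that cannot be resolved statically.
    public static double AreaAsWritten(IShape shape) => shape.Area();

    // Hand-written approximation of the optimized code, assuming the profile
    // showed that "shape" is nearly always a Circle.
    public static double AreaAsOptimized(IShape shape)
    {
        if (shape.GetType() == typeof(Circle))
        {
            return ((Circle)shape).Area(); // Direct call on a sealed type, can be inlined
        }
        return shape.Area();               // Fallback: regular interface dispatch
    }
}
```

There is normally no reason to write this guard by hand; the point is only to show why a profile makes the hot call cheaper.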

Loop Optimizations

Vectorization:

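// The JIT does not auto-vectorize a plain scalar loop; use Vector<T> explicitly
// (requires: using System.Numerics; assumes b and result are at least as long as a)
public static void MultiplyArrays(int[] a, int[] b, int[] result)
{
    int i = 0;
    // Vector<int>.Count is 8 with 256-bit SIMD (AVX2) and 4 with 128-bit SIMD
    for (; i <= a.Length - Vector<int>.Count; i += Vector<int>.Count)
    {
        (new Vector<int>(a, i) * new Vector<int>(b, i)).CopyTo(result, i);
    }
    for (; i < a.Length; i++)  // Scalar remainder
    {
        result[i] = a[i] * b[i];
    }
}

// The JIT maps Vector<T> operations to SIMD instructions (SSE/AVX2 on x64, NEON on Arm64)
// .NET 8 also adds Vector512<T> and AVX-512 support for explicit 512-bit code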

Loop Unrolling:

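// Source loop: one element per iteration
for (int i = 0; i < array.Length; i++)
{
    sum += array[i];
}

// Conceptual 4x unrolled form (the JIT mainly unrolls loops with small,
// constant trip counts, so very hot loops are sometimes unrolled by hand)
int j = 0;
for (; j <= array.Length - 4; j += 4)
{
    sum += array[j] + array[j+1] + array[j+2] + array[j+3];
}
for (; j < array.Length; j++)  // Remainder
{
    sum += array[j];
}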

Inlining Improvements

Aggressive Inlining:

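// Small methods are usually inlined automatically; the attribute is a hint for borderline cases
// (requires: using System.Runtime.CompilerServices;)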
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool IsValid(string input)
{
    return !string.IsNullOrWhiteSpace(input);
}

// Call site becomes:
if (!string.IsNullOrWhiteSpace(input))
{
    // No method call overhead
}

Cross-Assembly Inlining:

// The JIT can inline across assembly boundaries at run time.
// Precompiled ReadyToRun (R2R) code only inlines across assemblies
// that are compiled together in the same version bubble.

// Library.dll
public class Calculator
{
    public static int Add(int a, int b) => a + b;
}

// App.dll
var result = Calculator.Add(5, 10);
// Inlined as: var result = 5 + 10;

Native AOT Compilation

Basic Configuration

Project Setup:

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net8.0</TargetFramework>
    <PublishAot>true</PublishAot>
    <InvariantGlobalization>true</InvariantGlobalization>
    <IlcOptimizationPreference>Speed</IlcOptimizationPreference>
    <IlcGenerateStackTraceData>false</IlcGenerateStackTraceData>
  </PropertyGroup>
</Project>

Publish Command:

dotnet publish -c Release -r win-x64

# Typical output for a small app (figures vary by application):
# - myapp.exe (about 5MB native executable)
# - No .NET runtime installation required on the target machine
# - <10ms startup time
# - ~50% memory reduction vs JIT

AOT-Compatible Code

Supported Scenarios:

// ✅ AOT-compatible
public class Calculator
{
    public int Add(int a, int b) => a + b;
}

// ✅ Generic methods whose instantiations are known at compile time
public T Max<T>(T a, T b) where T : IComparable<T>
{
    return a.CompareTo(b) > 0 ? a : b;
}

// ✅ LINQ with concrete types
var result = numbers
    .Where(n => n > 10)
    .Select(n => n * 2)
    .ToList();

Unsupported Features:

// ❌ Reflection.Emit
var assembly = AssemblyBuilder.DefineDynamicAssembly(...);

// ❌ Dynamic code generation
dynamic obj = new ExpandoObject();
obj.Property = "value";

// ❌ Reflection over an unannotated generic type parameter (produces trim warnings)
public void Process<T>(T item)
{
    var type = typeof(T);
    var method = type.GetMethod("ToString");
}

// ✅ Workaround for reflection-based JSON serialization: source generation
[JsonSerializable(typeof(User))]
partial class UserContext : JsonSerializerContext { }
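
As a sketch of how the source-generated context above is used, the program below serializes and deserializes through the generated type metadata instead of reflection. The User record and its properties are hypothetical stand-ins, since the original snippet does not show the type.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

var user = new User("Ada", 36);

// Passing UserContext.Default.User selects the generated metadata,
// so no reflection-based serializer is needed at run time.
string json = JsonSerializer.Serialize(user, UserContext.Default.User);
Console.WriteLine(json); // {"Name":"Ada","Age":36}

User? roundTripped = JsonSerializer.Deserialize(json, UserContext.Default.User);
Console.WriteLine(roundTripped == user); // True (records compare by value)

// Hypothetical model type used only for this example.
public record User(string Name, int Age);

[JsonSerializable(typeof(User))]
public partial class UserContext : JsonSerializerContext { }
```

The overloads that take a JsonTypeInfo are the ones that stay free of trim and AOT warnings; the plain Serialize(object) overloads fall back to reflection.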

Trimming and Size Optimization

Aggressive Trimming:

<PropertyGroup>
  <PublishAot>true</PublishAot>
  <PublishTrimmed>true</PublishTrimmed>
  <TrimMode>full</TrimMode>
  <EnableTrimAnalyzer>true</EnableTrimAnalyzer>
</PropertyGroup>

Preserving Code:

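// Tell the trimmer which members are reached through reflection
// (the attribute goes on the Type parameter, not on the method)
public static void ProcessType(
    [DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicMethods)] Type type)
{
    var methods = type.GetMethods();
}

// Assembly-level suppression: silences IL2026 for one member but preserves nothing,
// so use it only when the flagged access is known to be safe
[assembly: UnconditionalSuppressMessage(
    "Trimming",
    "IL2026",
    Scope = "member",
    Target = "~M:MyApp.Startup.ConfigureServices")]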

LINQ Performance Enhancements

Order/OrderBy Improvements

Optimized Sorting:

var numbers = Enumerable.Range(1, 1000000);

// .NET 7: ~150ms
// .NET 8: ~80ms (46% faster)
var sorted = numbers
    .Order()
    .ToArray();

// ThenBy optimization
var users = GetUsers();
var ordered = users
    .OrderBy(u => u.LastName)
    .ThenBy(u => u.FirstName)  // Combined with OrderBy into a single sort (composite comparison)
    .ToList();

Count/LongCount Optimization

Smart Counting:

// A filtered sequence always has to be enumerated to be counted
var activeCount = collection
    .Where(x => x.IsActive)
    .Count();

// Equivalent and shorter: pass the predicate to Count
var sameCount = collection.Count(x => x.IsActive);

// Without a filter, TryGetNonEnumeratedCount (available since .NET 6)
// returns the count without enumerating when the source exposes it
List<int> numbers = [1, 2, 3, 4, 5];
if (numbers.TryGetNonEnumeratedCount(out int total))
{
    // total == 5, and the list was not enumerated
}

Index/Range Support

Range Operations:

int[] numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

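// Range on an array copies the elements into a new array
var slice = numbers[2..7];  // Allocates int[5]

// ✅ Range on a span is a view over the same memory
var view = numbers.AsSpan(2..7);  // No allocation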

// LINQ with ranges
var result = numbers
    .Take(5..8)  // Items at indices 5, 6, 7
    .ToArray();

// Reverse ranges
var lastThree = numbers[^3..];  // [8, 9, 10]
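
The difference between slicing an array and slicing a span is easy to check: a span slice aliases the original memory, while an array range is an independent copy. The following is a minimal sketch that reuses the numbers array from above.

```csharp
using System;

int[] numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

int[] copy = numbers[2..7];             // New array holding 3, 4, 5, 6, 7
Span<int> view = numbers.AsSpan(2..7);  // View over numbers[2] through numbers[6]

view[0] = 99;                           // Writes through to the original array

Console.WriteLine(numbers[2]);  // 99
Console.WriteLine(copy[0]);     // 3 (the copy is unaffected)
Console.WriteLine(view.Length); // 5
```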

Span and Memory Enhancements

SearchValues

Efficient Searching:

// ❌ Old approach: Multiple Contains calls
private static readonly char[] Separators = [' ', '\t', '\n', '\r'];

public static int CountWords(string text)
{
    int count = 0;
    foreach (char c in text)
    {
        if (Separators.Contains(c))
            count++;
    }
    return count;
}

// ✅ .NET 8: SearchValues (vectorized search, up to about 10x faster depending on input)
// (requires: using System.Buffers;)
private static readonly SearchValues<char> Separators = 
    SearchValues.Create([' ', '\t', '\n', '\r']);

public static int CountWords(ReadOnlySpan<char> text)
{
    int count = 0;
    int index;
    while ((index = text.IndexOfAny(Separators)) >= 0)
    {
        count++;
        text = text.Slice(index + 1);
    }
    return count;
}
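
Note that both versions above count separator characters rather than words. As a sketch of how the same SearchValues instance could count actual words, the variant below alternates IndexOfAnyExcept (skip separators) with IndexOfAny (skip the word). The WordCounter class name is hypothetical.

```csharp
using System;
using System.Buffers;

Console.WriteLine(WordCounter.CountWords("  hello world\tfoo\n")); // 3
Console.WriteLine(WordCounter.CountWords(""));                      // 0

public static class WordCounter
{
    private static readonly SearchValues<char> Separators =
        SearchValues.Create(" \t\n\r");

    public static int CountWords(ReadOnlySpan<char> text)
    {
        int count = 0;
        while (true)
        {
            // Find the start of the next word (first non-separator character).
            int start = text.IndexOfAnyExcept(Separators);
            if (start < 0) break;
            count++;
            text = text.Slice(start);

            // Find the end of that word (next separator), if any.
            int end = text.IndexOfAny(Separators);
            if (end < 0) break;
            text = text.Slice(end);
        }
        return count;
    }
}
```

Runs of consecutive separators are skipped in one call, so repeated spaces or blank lines do not inflate the count.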

CompositeFormat

Compiled Format Strings:

// ❌ Old: Parses format string every time
for (int i = 0; i < 1000; i++)
{
    var message = string.Format("User {0} logged in at {1}", 
        users[i].Name, DateTime.Now);
}

// ✅ .NET 8: Pre-compiled format (3x faster)
private static readonly CompositeFormat LogFormat = 
    CompositeFormat.Parse("User {0} logged in at {1}");

for (int i = 0; i < 1000; i++)
{
    var message = string.Format(null, LogFormat, 
        users[i].Name, DateTime.Now);
}
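
A self-contained version of the same idea, with a fixed timestamp and the invariant culture so the result is deterministic. The LogMessages class and the format text are illustrative assumptions.

```csharp
using System;
using System.Globalization;
using System.Text;

var loginTime = new DateTime(2024, 1, 15, 9, 30, 0);
Console.WriteLine(LogMessages.Login("ada", loginTime)); // User ada logged in at 09:30

public static class LogMessages
{
    // Parsed once when the type is initialized, then reused for every call.
    private static readonly CompositeFormat LoginFormat =
        CompositeFormat.Parse("User {0} logged in at {1:HH:mm}");

    public static string Login(string userName, DateTime when) =>
        string.Format(CultureInfo.InvariantCulture, LoginFormat, userName, when);
}
```

The generic Format overloads used here also avoid boxing the DateTime argument, which the object-based string.Format overloads cannot do.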

Utf8 String Literals

Zero-Allocation UTF-8:

// ❌ Old: Encodes at run time and allocates a byte[] on every call
byte[] bytes = Encoding.UTF8.GetBytes("Hello, World!");

// ✅ C# 11+: UTF-8 literal (encoded at compile time, no allocation)
ReadOnlySpan<byte> utf8 = "Hello, World!"u8;

// Direct HTTP response (PipeWriter accepts spans; Stream.WriteAsync does not)
response.BodyWriter.Write("Success"u8);
await response.BodyWriter.FlushAsync();

// JSON without allocating a string (JsonDocument.Parse has no span overload)
var reader = new Utf8JsonReader("{\"name\":\"John\"}"u8);
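
To make the JSON case concrete, here is a minimal sketch that reads one property from a UTF-8 literal with Utf8JsonReader and compares the property name against another UTF-8 literal. The Utf8Json helper class is hypothetical.

```csharp
using System;
using System.Text;
using System.Text.Json;

Console.WriteLine(Utf8Json.ReadName("{\"name\":\"John\"}"u8)); // John

// A u8 literal contains the same bytes that Encoding.UTF8 would produce.
Console.WriteLine("Hello"u8.SequenceEqual(Encoding.UTF8.GetBytes("Hello").AsSpan())); // True

public static class Utf8Json
{
    // Returns the string value of the "name" property, or null if it is absent.
    public static string? ReadName(ReadOnlySpan<byte> utf8Json)
    {
        var reader = new Utf8JsonReader(utf8Json);
        while (reader.Read())
        {
            if (reader.TokenType == JsonTokenType.PropertyName &&
                reader.ValueTextEquals("name"u8))
            {
                reader.Read();             // Advance to the property value
                return reader.GetString(); // Allocates only the result string
            }
        }
        return null;
    }
}
```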

Collection Improvements

Frozen Collections

Immutable Optimized Collections:

// ❌ Dictionary lookup: O(1) but hash overhead
var dictionary = new Dictionary<string, int>
{
    ["one"] = 1,
    ["two"] = 2,
    ["three"] = 3
};

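// ✅ FrozenDictionary: Read-only and optimized for lookups (about 40% faster here; varies with keys and size)
var frozen = dictionary.ToFrozenDictionary();

// Chooses a specialized implementation at creation time based on the
// key type and the actual keys. Creation costs more than Dictionary,
// so build once and reuse.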

// Use case: Configuration lookups
private static readonly FrozenDictionary<string, string> Config = 
    new Dictionary<string, string>
    {
        ["ApiEndpoint"] = "https://api.contoso.com",
        ["Timeout"] = "30",
        ["RetryCount"] = "3"
    }.ToFrozenDictionary();
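
A short, runnable sketch of typical lookups, including a custom comparer and the companion FrozenSet type. The keys and values are made up for illustration.

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;

// Build once (relatively expensive), then read many times.
FrozenDictionary<string, int> statusCodes = new Dictionary<string, int>
{
    ["ok"] = 200,
    ["notfound"] = 404,
    ["error"] = 500
}.ToFrozenDictionary(StringComparer.OrdinalIgnoreCase);

Console.WriteLine(statusCodes["NotFound"]);                   // 404 (case-insensitive comparer)
Console.WriteLine(statusCodes.TryGetValue("missing", out _)); // False

FrozenSet<string> safeMethods = new[] { "GET", "HEAD", "OPTIONS" }.ToFrozenSet();
Console.WriteLine(safeMethods.Contains("POST"));              // False
```

Frozen collections expose no Add or Remove, so any change means building a new instance; that trade-off is why they suit static lookup tables.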

PriorityQueue Enhancements

Better Performance:

var queue = new PriorityQueue<string, int>();

// .NET 8: Faster enqueue/dequeue (about 30% in the author's measurements)
queue.Enqueue("Low", 3);
queue.Enqueue("High", 1);
queue.Enqueue("Medium", 2);

// EnqueueRange (bulk operation)
queue.EnqueueRange(
    [("A", 1), ("B", 2), ("C", 3)]);

// TryDequeue with out parameter
while (queue.TryDequeue(out var item, out var priority))
{
    Console.WriteLine($"{item}: {priority}");
}

Regular Expression Improvements

Source Generator

Compile-Time Regex:

// ❌ Old: Runtime compilation overhead
private static readonly Regex EmailRegex = 
    new(@"^[^@]+@[^@]+\.[^@]+$", RegexOptions.Compiled);

// ✅ .NET 7+: Source-generated (faster startup, AOT-friendly; the containing class must be partial)
[GeneratedRegex(@"^[^@]+@[^@]+\.[^@]+$", RegexOptions.IgnoreCase)]
private static partial Regex EmailRegex();

public bool ValidateEmail(string email)
{
    return EmailRegex().IsMatch(email);
}
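
Because the generated method is partial, the containing type must be declared partial as well. A complete minimal sketch, with a hypothetical EmailValidator class:

```csharp
using System;
using System.Text.RegularExpressions;

Console.WriteLine(EmailValidator.IsValid("USER@Example.com")); // True
Console.WriteLine(EmailValidator.IsValid("not-an-email"));     // False

public static partial class EmailValidator
{
    // The regex source generator emits the implementation of this method at build time.
    [GeneratedRegex(@"^[^@]+@[^@]+\.[^@]+$", RegexOptions.IgnoreCase)]
    private static partial Regex EmailRegex();

    public static bool IsValid(string email) => EmailRegex().IsMatch(email);
}
```

The generated method returns a cached singleton, so calling EmailRegex() on every validation does not construct a new Regex.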

NonBacktracking Mode

Guaranteed Performance:

// ❌ Catastrophic backtracking possible
var regex = new Regex(@"(a+)+b");
regex.IsMatch(new string('a', 30));  // Can take seconds!

// ✅ NonBacktracking (.NET 7+): Time linear in the input length
var safeRegex = new Regex(@"(a+)+b", RegexOptions.NonBacktracking);
safeRegex.IsMatch(new string('a', 30));  // Always fast
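
The trade-off is that NonBacktracking supports a smaller pattern language. The sketch below runs the pattern above on a much longer input and shows that constructs such as backreferences are rejected when the Regex is constructed. The RegexDemo helper is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

var safe = new Regex(@"(a+)+b", RegexOptions.NonBacktracking);
string longInput = new string('a', 10_000);

Console.WriteLine(safe.IsMatch(longInput));       // False
Console.WriteLine(safe.IsMatch(longInput + "b")); // True

Console.WriteLine(RegexDemo.SupportsNonBacktracking(@"(\w)\1")); // False (backreference)
Console.WriteLine(RegexDemo.SupportsNonBacktracking(@"\w+@\w+")); // True

public static class RegexDemo
{
    // Returns false when the pattern uses a construct that the
    // non-backtracking engine cannot handle (backreferences, lookarounds, ...).
    public static bool SupportsNonBacktracking(string pattern)
    {
        try
        {
            _ = new Regex(pattern, RegexOptions.NonBacktracking);
            return true;
        }
        catch (NotSupportedException)
        {
            return false;
        }
    }
}
```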

ASP.NET Core Performance

HTTP/3 Support

Configuration:

var builder = WebApplication.CreateBuilder(args);

builder.WebHost.ConfigureKestrel(options =>
{
    options.ListenAnyIP(5001, listenOptions =>
    {
        listenOptions.Protocols = HttpProtocols.Http1AndHttp2AndHttp3;
        listenOptions.UseHttps();
    });
});

// Benefits:
// - Faster connection setup (QUIC combines the transport and TLS handshakes)
// - Independent streams multiplexed over one connection
// - No TCP-level head-of-line blocking

Request Decompression

Automatic Decompression:

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRequestDecompression();

var app = builder.Build();
app.UseRequestDecompression();

// Automatically decompresses:
// - gzip
// - deflate
// - brotli

Rate Limiting

Built-in Rate Limiting:

builder.Services.AddRateLimiter(options =>
{
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(
        context => RateLimitPartition.GetFixedWindowLimiter(
            context.User.Identity?.Name ?? context.Connection.RemoteIpAddress?.ToString() ?? "anonymous",
            _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1)
            }));
});

app.UseRateLimiter();

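// The global limiter already applies to every endpoint. RequireRateLimiting("name")
// is only for a named policy registered inside AddRateLimiter, and no policy
// named "fixed" is registered above, so it is not used here.
app.MapGet("/api/data", () => "Success");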
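
For per-endpoint limits, a named policy has to be registered before an endpoint can reference it. The following is a minimal sketch of that setup, assuming a policy name of "fixed" and the same limits as the global limiter above; returning 429 instead of the default 503 is a choice made for this example.

```csharp
using System;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // Rejected requests get 429 Too Many Requests instead of the default 503.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Register the named policy that endpoints refer to.
    options.AddFixedWindowLimiter("fixed", limiter =>
    {
        limiter.PermitLimit = 100;
        limiter.Window = TimeSpan.FromMinutes(1);
    });
});

var app = builder.Build();

app.UseRateLimiter();

// Only this endpoint is subject to the "fixed" policy.
app.MapGet("/api/data", () => "Success")
    .RequireRateLimiting("fixed");

app.Run();
```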

Benchmarking Results

Startup Time

.NET 6:  250ms
.NET 7:  180ms
.NET 8:  120ms (52% faster than .NET 6)

Native AOT:
.NET 7:  50ms
.NET 8:  8ms (84% faster than .NET 7)

Memory Usage

Hello World App (64-bit):
.NET 6:  28 MB
.NET 7:  24 MB
.NET 8:  18 MB (36% reduction vs .NET 6)

Native AOT:
.NET 7:  12 MB
.NET 8:  6 MB (50% reduction vs .NET 7)

Throughput

JSON Serialization (1M objects):
.NET 6:  2,100 ops/sec
.NET 7:  2,850 ops/sec
.NET 8:  4,200 ops/sec (100% faster than .NET 6)

LINQ OrderBy (1M items):
.NET 6:  180ms
.NET 7:  150ms
.NET 8:  80ms (56% faster than .NET 6)

Best Practices

  1. Keep Dynamic PGO Enabled: It is on by default in .NET 8 and gives significant gains with no code changes
  2. Use Native AOT for Services: Ideal for containers and serverless
  3. Leverage Span: Reduce allocations in hot paths
  4. Frozen Collections: Use for readonly lookup tables
  5. Source-Generated Regex: Better startup and performance
  6. Benchmark Changes: Use BenchmarkDotNet to validate improvements
  7. Profile Production: Use dotnet-trace and Application Insights
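
As a starting point for the benchmarking advice above, here is a minimal BenchmarkDotNet sketch that compares Dictionary and FrozenDictionary lookups. It assumes the BenchmarkDotNet NuGet package is referenced and the project is run in Release mode; the class name, key format, and collection size are arbitrary choices for illustration.

```csharp
using System.Collections.Frozen;
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<LookupBenchmarks>();

[MemoryDiagnoser] // Also report allocations per operation
public class LookupBenchmarks
{
    private Dictionary<string, int> _dictionary = null!;
    private FrozenDictionary<string, int> _frozen = null!;

    [GlobalSetup]
    public void Setup()
    {
        // Setup cost (including freezing) is excluded from the measurements.
        _dictionary = Enumerable.Range(0, 1_000).ToDictionary(i => $"key{i}", i => i);
        _frozen = _dictionary.ToFrozenDictionary();
    }

    [Benchmark(Baseline = true)]
    public int DictionaryLookup() => _dictionary["key500"];

    [Benchmark]
    public int FrozenLookup() => _frozen["key500"];
}
```

Running the same benchmark on each target framework is a more reliable basis for decisions than the headline percentages quoted for any single machine.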

Troubleshooting

AOT Compatibility Issues:

# Analyze trim warnings
dotnet publish -c Release -r win-x64 /p:PublishAot=true

# Review IL2xxx (trimming) and IL3xxx (AOT) warnings
# Fix the flagged code where possible; suppress only when the access is known to be safe

PGO Not Activating:

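# Capture runtime events to confirm tiered compilation is active
dotnet-trace collect --process-id <pid> --providers Microsoft-Windows-DotNETRuntime:0x1E000080018:5

# Check for "TieredCompilation" events, and make sure the environment
# does not set DOTNET_TieredPGO=0 or DOTNET_TieredCompilation=0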

Key Takeaways

  • .NET 8 JIT improvements with Dynamic PGO can deliver 20-30% performance gains, depending on the workload
  • Native AOT can provide sub-10ms startup and about 50% lower memory use for small apps
  • LINQ optimizations make sorting roughly 40-50% faster in the measurements above
  • Span enhancements like SearchValues can provide up to 10x improvements for character searches
  • Frozen collections speed up read-only lookup scenarios by about 40%

Next Steps

  • Migrate to .NET 8 and confirm Dynamic PGO is still enabled (it is on by default)
  • Evaluate Native AOT for containerized services
  • Replace hot-path allocations with Span
  • Use BenchmarkDotNet to measure real improvements
  • Profile with dotnet-counters and dotnet-trace

Faster runtime, faster apps.