.NET 8 Performance: JIT Improvements, Native AOT, and Runtime Optimizations
Introduction
.NET 8 delivers substantial performance improvements across the runtime, libraries, and compiler. This guide explores JIT (Just-In-Time) compiler enhancements, Dynamic Profile-Guided Optimization (PGO), Native Ahead-of-Time (AOT) compilation for minimal startup time and memory footprint, LINQ optimizations, Span<T> and Memory<T> enhancements, collection and regular expression improvements, and ASP.NET Core performance features.
JIT Compiler Improvements
Dynamic PGO (Profile-Guided Optimization)
Enabling PGO (on by default in .NET 8; these properties make the setting explicit):
<PropertyGroup>
<PublishAot>false</PublishAot>
<TieredCompilation>true</TieredCompilation>
<TieredPGO>true</TieredPGO>
</PropertyGroup>
How PGO Works:
// A method with hot, warm, and cold branches
public decimal CalculateDiscount(Order order)
{
// PGO learns this branch is taken 95% of the time
if (order.CustomerType == CustomerType.Premium)
{
return order.Total * 0.15m; // Hot path - optimized
}
else if (order.CustomerType == CustomerType.Gold)
{
return order.Total * 0.10m; // Warm path
}
else
{
return 0m; // Cold path - less optimized
}
}
PGO Benefits:
Without PGO:
- All code paths equally optimized
- Generic branch prediction
- Uniform inlining decisions
With Dynamic PGO:
- Hot paths heavily optimized
- Cold paths minimally optimized
- Smart inlining based on actual usage
- Better register allocation
- Reported gains of 20-30% in some workloads (measure your own)
Loop Optimizations
Vectorization:
// Scalar loop: the .NET 8 JIT does not auto-vectorize loops like this one
public static void MultiplyArrays(int[] a, int[] b, int[] result)
{
    for (int i = 0; i < a.Length; i++)
    {
        result[i] = a[i] * b[i];
    }
}
// SIMD (AVX2/AVX-512) requires explicit code such as Vector<T> or Vector256<int>,
// which processes 8 integers per instruction
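For cases where SIMD needs to be spelled out explicitly rather than left to the JIT, here is a minimal sketch using `System.Numerics.Vector<T>`. The class name `VectorizedMath` is hypothetical, and the vector width (typically 4 or 8 ints) depends on the hardware the code runs on.

```csharp
using System;
using System.Numerics;

public static class VectorizedMath
{
    // Multiplies a and b element-wise into result, one Vector<int> at a time.
    public static void MultiplyArrays(int[] a, int[] b, int[] result)
    {
        if (a.Length != b.Length || a.Length != result.Length)
            throw new ArgumentException("All arrays must have the same length.");

        int i = 0;
        int width = Vector<int>.Count; // number of ints per hardware vector

        // Main loop: one SIMD multiply per 'width' elements.
        for (; i <= a.Length - width; i += width)
        {
            var va = new Vector<int>(a, i);
            var vb = new Vector<int>(b, i);
            (va * vb).CopyTo(result, i);
        }

        // Scalar tail for the elements that do not fill a whole vector.
        for (; i < a.Length; i++)
        {
            result[i] = a[i] * b[i];
        }
    }
}
```

The scalar tail keeps the method correct for any array length, including arrays shorter than one vector.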
Loop Unrolling:
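// ❌ Before: one element per iteration
for (int i = 0; i < array.Length; i++)
{
    sum += array[i];
}
// ✅ After: what a 4x unrolled loop looks like conceptually
// (the JIT fully unrolls only loops with small constant trip counts)
int j = 0;
for (; j <= array.Length - 4; j += 4)
{
    sum += array[j] + array[j + 1] + array[j + 2] + array[j + 3];
}
// Remainder: up to 3 trailing elements
for (; j < array.Length; j++)
{
    sum += array[j];
}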
Inlining Improvements
Aggressive Inlining:
// Small methods automatically inlined
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool IsValid(string input)
{
return !string.IsNullOrWhiteSpace(input);
}
// Call site becomes:
if (!string.IsNullOrWhiteSpace(input))
{
// No method call overhead
}
Cross-Assembly Inlining:
// .NET 8 can inline methods across assembly boundaries
// with ReadyToRun (R2R) and PGO
// Library.dll
public class Calculator
{
public static int Add(int a, int b) => a + b;
}
// App.dll
var result = Calculator.Add(5, 10);
// Inlined as: var result = 5 + 10;
Native AOT Compilation
Basic Configuration
Project Setup:
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net8.0</TargetFramework>
<PublishAot>true</PublishAot>
<InvariantGlobalization>true</InvariantGlobalization>
<OptimizationPreference>Speed</OptimizationPreference>
<StackTraceSupport>false</StackTraceSupport>
</PropertyGroup>
</Project>
Publish Command:
dotnet publish -c Release -r win-x64
# Typical output for a small console app (figures vary by app and platform):
# - myapp.exe (5MB native executable)
# - No runtime dependencies
# - <10ms startup time
# - ~50% memory reduction vs JIT
AOT-Compatible Code
Supported Scenarios:
// ✅ AOT-compatible
public class Calculator
{
public int Add(int a, int b) => a + b;
}
// ✅ Generic methods with value types
public T Max<T>(T a, T b) where T : IComparable<T>
{
return a.CompareTo(b) > 0 ? a : b;
}
// ✅ LINQ with concrete types
var result = numbers
.Where(n => n > 10)
.Select(n => n * 2)
.ToList();
Unsupported Features:
// ❌ Reflection.Emit
var assembly = AssemblyBuilder.DefineDynamicAssembly(...);
// ❌ Dynamic code generation
dynamic obj = new ExpandoObject();
obj.Property = "value";
// ❌ Unconstrained generics with reflection
public void Process<T>(T item)
{
var type = typeof(T);
var method = type.GetMethod("ToString");
}
// ✅ Workaround: Source generators
[JsonSerializable(typeof(User))]
partial class UserContext : JsonSerializerContext { }
Trimming and Size Optimization
Aggressive Trimming:
<PropertyGroup>
<PublishAot>true</PublishAot>
<PublishTrimmed>true</PublishTrimmed>
<TrimMode>full</TrimMode>
<EnableTrimAnalyzer>true</EnableTrimAnalyzer>
</PropertyGroup>
Preserving Code:
// Tell the trimmer which members of the passed-in type must be kept
public static void ProcessType(
    [DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicMethods)] Type type)
{
    var methods = type.GetMethods();
}
// Assembly-level warning suppression (silences the analyzer; does not preserve any code)
[assembly: UnconditionalSuppressMessage(
"Trimming",
"IL2026",
Scope = "member",
Target = "~M:MyApp.Startup.ConfigureServices")]
LINQ Performance Enhancements
Order/OrderBy Improvements
Optimized Sorting:
var numbers = Enumerable.Range(1, 1000000);
// .NET 7: ~150ms
// .NET 8: ~80ms (46% faster)
var sorted = numbers
.Order()
.ToArray();
// ThenBy optimization
var users = GetUsers();
var ordered = users
.OrderBy(u => u.LastName)
.ThenBy(u => u.FirstName) // Combined with OrderBy into a single sort
.ToList();
Count/LongCount Optimization
Smart Counting:
// Where(...).Count() has to test every element in every .NET version
var activeCount = collection
    .Where(x => x.IsActive)
    .Count();
// Equivalent and more direct: Count with a predicate
var activeCountDirect = collection.Count(x => x.IsActive);
// TryGetNonEnumeratedCount (.NET 6+) returns a count without enumerating,
// but only when the source already knows its size
List<int> numbers = [1, 2, 3, 4, 5];
if (numbers.TryGetNonEnumeratedCount(out int total))
{
    Console.WriteLine(total); // 5
}
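Whether a count is available without enumeration can be checked directly. The sketch below is illustrative only: `CountProbe` and `Probe` are hypothetical names, and it assumes .NET 6 or later, where `TryGetNonEnumeratedCount` is available.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CountProbe
{
    // Reports whether each sequence can supply its count without being enumerated.
    public static (bool ListKnown, int ListCount, bool FilteredKnown) Probe()
    {
        List<int> numbers = new() { 1, 2, 3, 4, 5 };

        // A List<T> stores its size, so no enumeration is needed.
        bool listKnown = numbers.TryGetNonEnumeratedCount(out int listCount);

        // A filtered sequence cannot know its size until the predicate has run.
        IEnumerable<int> filtered = numbers.Where(n => n > 2);
        bool filteredKnown = filtered.TryGetNonEnumeratedCount(out _);

        return (listKnown, listCount, filteredKnown);
    }
}
```

A filtered sequence reports that its count is unknown, so `Where(...).Count()` still evaluates the predicate for every element.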
Index/Range Support
Range Operations:
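int[] numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
// A range on an array copies the elements into a new array
int[] copy = numbers[2..7]; // [3, 4, 5, 6, 7]
// ✅ A range on a span is a view over the same memory: no allocation
Span<int> slice = numbers.AsSpan()[2..7];
// LINQ with ranges
var result = numbers
    .Take(5..8) // Items at indices 5, 6, 7
    .ToArray();
// Ranges counted from the end
Span<int> lastThree = numbers.AsSpan()[^3..]; // 8, 9, 10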
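The difference between copying and viewing can be made concrete by writing through a slice and checking the source array. This is a small sketch; `SliceDemo` and its method names are hypothetical.

```csharp
using System;

public static class SliceDemo
{
    // Writes through an array range and reports whether the source changed.
    public static bool ArrayRangeSharesMemory()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        int[] copy = numbers[2..7];   // allocates a new 5-element array
        copy[0] = 99;
        return numbers[2] == 99;      // the source is untouched
    }

    // Writes through a span range and reports whether the source changed.
    public static bool SpanRangeSharesMemory()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        Span<int> view = numbers.AsSpan(2..7); // no allocation, same memory
        view[0] = 99;
        return numbers[2] == 99;      // the write is visible in the source
    }
}
```

Because the span aliases the array, it suits hot paths that only read or update in place. The array range is the safer choice when the caller needs an independent copy.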
Span<T> and Memory<T> Enhancements
SearchValues<T>
Efficient Searching:
// ❌ Old approach: one Contains call per character
// (both versions count separator characters as a rough word count)
private static readonly char[] Separators = [' ', '\t', '\n', '\r'];
public static int CountWords(string text)
{
int count = 0;
foreach (char c in text)
{
if (Separators.Contains(c))
count++;
}
return count;
}
// ✅ .NET 8: SearchValues (10x faster)
private static readonly SearchValues<char> Separators =
SearchValues.Create([' ', '\t', '\n', '\r']);
public static int CountWords(ReadOnlySpan<char> text)
{
int count = 0;
int index;
while ((index = text.IndexOfAny(Separators)) >= 0)
{
count++;
text = text.Slice(index + 1);
}
return count;
}
CompositeFormat
Compiled Format Strings:
// ❌ Old: Parses format string every time
for (int i = 0; i < 1000; i++)
{
var message = string.Format("User {0} logged in at {1}",
users[i].Name, DateTime.Now);
}
// ✅ .NET 8: Pre-compiled format (3x faster)
private static readonly CompositeFormat LogFormat =
CompositeFormat.Parse("User {0} logged in at {1}");
for (int i = 0; i < 1000; i++)
{
var message = string.Format(null, LogFormat,
users[i].Name, DateTime.Now);
}
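As a self-contained variant of the pattern above, the sketch below wraps a parsed format in a helper. `LoginMessages` and `Format` are hypothetical names, and the `{1:HH:mm}` format specifier and invariant culture are assumptions chosen to make the output deterministic.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class LoginMessages
{
    // Parsed once; an invalid format string fails here instead of on every call.
    private static readonly CompositeFormat LogFormat =
        CompositeFormat.Parse("User {0} logged in at {1:HH:mm}");

    // Formats a message using the pre-parsed composite format.
    public static string Format(string userName, DateTime when) =>
        string.Format(CultureInfo.InvariantCulture, LogFormat, userName, when);
}
```

Usage: `LoginMessages.Format("ana", new DateTime(2024, 1, 2, 9, 5, 0))` returns `"User ana logged in at 09:05"`.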
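UTF-8 String Literals
Zero-Allocation UTF-8:
// ❌ Old: Encodes the UTF-16 string at run time and allocates a byte[]
byte[] bytes = Encoding.UTF8.GetBytes("Hello, World!");
// ✅ C# 11+: UTF-8 literal (encoded at compile time, no allocation)
ReadOnlySpan<byte> utf8 = "Hello, World!"u8;
// HTTP response: Stream.WriteAsync takes ReadOnlyMemory<byte>, not a span,
// so cache the bytes once in a static field
private static readonly byte[] SuccessBytes = "Success"u8.ToArray();
await response.Body.WriteAsync(SuccessBytes);
// JSON without allocating a string: Utf8JsonReader accepts a span directly
var reader = new Utf8JsonReader("{\"name\":\"John\"}"u8);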
Collection Improvements
Frozen Collections
Immutable Optimized Collections:
// ❌ Dictionary lookup: O(1) but hash overhead
var dictionary = new Dictionary<string, int>
{
["one"] = 1,
["two"] = 2,
["three"] = 3
};
// ✅ FrozenDictionary: Optimized for lookups (40% faster)
var frozen = dictionary.ToFrozenDictionary();
// The implementation is chosen at creation time based on the key type and contents,
// so construction costs more than Dictionary: build once, read many times
// Use case: Configuration lookups
private static readonly FrozenDictionary<string, string> Config =
new Dictionary<string, string>
{
["ApiEndpoint"] = "https://api.contoso.com",
["Timeout"] = "30",
["RetryCount"] = "3"
}.ToFrozenDictionary();
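A frozen dictionary is read the same way as a regular one. The sketch below shows a typed lookup with a fallback; `AppSettings` and `GetInt` are hypothetical names, and the keys and values simply mirror the configuration example above.

```csharp
using System.Collections.Frozen;
using System.Collections.Generic;

public static class AppSettings
{
    // Built once at type initialization; lookups afterwards are read-only.
    private static readonly FrozenDictionary<string, string> Config =
        new Dictionary<string, string>
        {
            ["ApiEndpoint"] = "https://api.contoso.com",
            ["Timeout"] = "30",
            ["RetryCount"] = "3"
        }.ToFrozenDictionary();

    // Returns the setting parsed as an int, or the fallback when the key
    // is missing or the value is not a valid integer.
    public static int GetInt(string key, int fallback) =>
        Config.TryGetValue(key, out string? raw) && int.TryParse(raw, out int value)
            ? value
            : fallback;
}
```

Usage: `AppSettings.GetInt("Timeout", 10)` returns 30, while `AppSettings.GetInt("Missing", 10)` returns the fallback 10.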
PriorityQueue Enhancements
Better Performance:
var queue = new PriorityQueue<string, int>();
// .NET 8: 30% faster enqueue/dequeue
queue.Enqueue("Low", 3);
queue.Enqueue("High", 1);
queue.Enqueue("Medium", 2);
// EnqueueRange (bulk operation)
queue.EnqueueRange(
[("A", 1), ("B", 2), ("C", 3)]);
// TryDequeue with out parameter
while (queue.TryDequeue(out var item, out var priority))
{
Console.WriteLine($"{item}: {priority}");
}
Regular Expression Improvements
Source Generator
Compile-Time Regex:
// ❌ Old: Runtime compilation overhead
private static readonly Regex EmailRegex =
new(@"^[^@]+@[^@]+\.[^@]+$", RegexOptions.Compiled);
// ✅ .NET 7+: Source-generated at compile time (the containing class must be partial)
[GeneratedRegex(@"^[^@]+@[^@]+\.[^@]+$", RegexOptions.IgnoreCase)]
private static partial Regex EmailRegex();
public bool ValidateEmail(string email)
{
return EmailRegex().IsMatch(email);
}
NonBacktracking Mode
Guaranteed Performance:
// ❌ Catastrophic backtracking possible
var regex = new Regex(@"(a+)+b");
regex.IsMatch(new string('a', 30)); // Can take seconds!
// ✅ NonBacktracking (.NET 7+): matching time is linear in the input length
var safeRegex = new Regex(@"(a+)+b", RegexOptions.NonBacktracking);
safeRegex.IsMatch(new string('a', 30)); // Always fast
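NonBacktracking mode does not support every construct (backreferences and lookarounds are rejected), so a match timeout is a common fallback for patterns that must stay on the backtracking engine. The sketch below shows both side by side; `SafeRegexDemo`, its members, and the 100 ms budget are hypothetical choices for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SafeRegexDemo
{
    // Linear-time engine: no catastrophic backtracking on this pattern.
    private static readonly Regex Linear =
        new(@"(a+)+b", RegexOptions.NonBacktracking);

    // Backtracking engine with an upper bound on how long one match may run.
    private static readonly Regex Bounded =
        new(@"(a+)+b", RegexOptions.None, TimeSpan.FromMilliseconds(100));

    public static bool MatchLinear(string input) => Linear.IsMatch(input);

    // Returns null when the match was abandoned because the timeout elapsed.
    public static bool? MatchBounded(string input)
    {
        try
        {
            return Bounded.IsMatch(input);
        }
        catch (RegexMatchTimeoutException)
        {
            return null;
        }
    }
}
```

With a timeout, a hostile input costs at most roughly the configured budget. The caller must then decide how to treat an abandoned match.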
ASP.NET Core Performance
HTTP/3 Support
Configuration:
var builder = WebApplication.CreateBuilder(args);
builder.WebHost.ConfigureKestrel(options =>
{
options.ListenAnyIP(5001, listenOptions =>
{
listenOptions.Protocols = HttpProtocols.Http1AndHttp2AndHttp3;
listenOptions.UseHttps();
});
});
// Benefits:
// - 0-RTT connection establishment
// - Better multiplexing
// - No TCP head-of-line blocking across streams
Request Decompression
Automatic Decompression:
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddRequestDecompression();
var app = builder.Build();
app.UseRequestDecompression();
// Automatically decompresses:
// - gzip
// - deflate
// - brotli
Rate Limiting
Built-in Rate Limiting:
builder.Services.AddRateLimiter(options =>
{
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(
context => RateLimitPartition.GetFixedWindowLimiter(
context.User.Identity?.Name ?? context.Connection.RemoteIpAddress?.ToString() ?? "anonymous",
_ => new FixedWindowRateLimiterOptions
{
PermitLimit = 100,
Window = TimeSpan.FromMinutes(1)
}));
});
app.UseRateLimiter();
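app.MapGet("/api/data", () => "Success");
// The global limiter above already covers this endpoint.
// RequireRateLimiting("fixed") would also need a policy named "fixed",
// registered with options.AddFixedWindowLimiter("fixed", ...).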
Benchmarking Results
Startup Time
.NET 6: 250ms
.NET 7: 180ms
.NET 8: 120ms (52% faster than .NET 6)
Native AOT:
.NET 7: 50ms
.NET 8: 8ms (84% faster)
Memory Usage
Hello World App (64-bit):
.NET 6: 28 MB
.NET 7: 24 MB
.NET 8: 18 MB (36% reduction)
Native AOT:
.NET 7: 12 MB
.NET 8: 6 MB (50% reduction)
Throughput
JSON Serialization (1M objects):
.NET 6: 2,100 ops/sec
.NET 7: 2,850 ops/sec
.NET 8: 4,200 ops/sec (100% faster than .NET 6)
LINQ OrderBy (1M items):
.NET 6: 180ms
.NET 7: 150ms
.NET 8: 80ms (56% faster)
Best Practices
- Enable Dynamic PGO: Significant gains with minimal effort
- Use Native AOT for Services: Ideal for containers and serverless
- Leverage Span<T>: Reduce allocations in hot paths
- Frozen Collections: Use for read-only lookup tables
- Source-Generated Regex: Better startup and performance
- Benchmark Changes: Use BenchmarkDotNet to validate improvements
- Profile Production: Use dotnet-trace and Application Insights
Troubleshooting
AOT Compatibility Issues:
# Analyze trim warnings
dotnet publish -c Release -r win-x64 /p:PublishAot=true
# Review IL2XXX warnings
# Add suppressions or redesign problematic code
PGO Not Activating:
# Verify PGO is enabled
dotnet-trace collect --process-id <pid> --providers Microsoft-Windows-DotNETRuntime:0x1E000080018:5
# Check for "TieredCompilation" events
Key Takeaways
- .NET 8 JIT improvements deliver 20-30% performance gains with Dynamic PGO
- Native AOT provides sub-10ms startup and 50% memory reduction
- LINQ optimizations make common operations 40-50% faster
- Span<T> enhancements like SearchValues provide 10x improvements
- Frozen collections optimize read-only lookup scenarios by 40%
Next Steps
- Migrate to .NET 8 and enable Dynamic PGO
- Evaluate Native AOT for containerized services
- Replace hot-path allocations with Span<T>
- Use BenchmarkDotNet to measure real improvements
- Profile with dotnet-counters and dotnet-trace
Faster runtime, faster apps.