LINQ Mastery: Advanced Query Techniques in C#
Introduction
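LINQ (Language Integrated Query) is a declarative abstraction for querying data in C#. Compared with hand-written imperative loops, LINQ queries compose into pipelines, are type-checked by the compiler, and read closer to the intent of the code.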
Prerequisites
- C# 12+ knowledge
- Understanding of delegates and lambda expressions
- .NET 9 SDK
LINQ Fundamentals Recap
| Syntax | Type | Use Case |
|---|---|---|
| Query Syntax | SQL-like | Readable for simple queries |
| Method Syntax | Fluent API | Composable, full operator access |
| Mixed Syntax | Hybrid | Query + method chaining |
Step-by-Step Guide
Step 1: Deferred vs Immediate Execution
Deferred (Lazy):
var query = numbers.Where(n => n > 5); // Not executed yet
var result = query.ToList(); // Executes now
Immediate:
var count = numbers.Count(n => n > 5); // Executes immediately
var sum = numbers.Sum(); // Immediate
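To make the difference observable, the following minimal sketch changes the source list after the query is defined. The list contents are invented for illustration; the point is only that the deferred query sees the later change, while the immediate Count does not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 3, 6, 9 };

// Building the query runs nothing; it only records the filter.
var query = numbers.Where(n => n > 5);

// Change the source after the query has been defined.
numbers.Add(12);

// Enumeration happens here, so the new element is included.
var deferredResult = query.ToList();

// Count() executes immediately and returns a snapshot value.
var countNow = numbers.Count(n => n > 5);
numbers.Add(15); // too late to affect countNow

Console.WriteLine(string.Join(", ", deferredResult)); // 6, 9, 12
Console.WriteLine(countNow);                          // 3
```

Enumerating query again after the last Add would also return 15, because every enumeration re-reads the source.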
Step 2: Custom LINQ Operators
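public static class LinqExtensions
{
    // Applies the predicate only when condition is true; otherwise returns the source unchanged.
    public static IEnumerable<T> WhereIf<T>(this IEnumerable<T> source, bool condition, Func<T, bool> predicate)
    {
        return condition ? source.Where(predicate) : source;
    }

    // Splits the source into lists of at most `size` items.
    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int size)
    {
        // Checked on first enumeration, because iterator bodies run lazily.
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));

        var batch = new List<T>(size);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<T>(size);
            }
        }
        if (batch.Count > 0) yield return batch; // final partial batch
    }
}

// Usage
var filtered = products
    .WhereIf(includeDiscounted, p => p.Discount > 0)
    .ToList();
var batches = largeList.Batch(100);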
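Before writing a custom operator, it is worth checking whether the framework already ships one. As a point of comparison, .NET 6 and later include Enumerable.Chunk, which behaves much like the Batch operator; the sketch below uses an invented 250-element list to show the final partial chunk.

```csharp
using System;
using System.Linq;

var largeList = Enumerable.Range(1, 250).ToList();

// Chunk yields arrays of at most the requested size.
var chunks = largeList.Chunk(100).ToList();

foreach (var chunk in chunks)
    Console.WriteLine($"{chunk.Length} items, first = {chunk[0]}");
```

A custom Batch still makes sense when targeting older frameworks or when the batches should be a type other than an array.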
Step 3: Expression Trees
Dynamic Filtering:
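// Declare this inside a static class so it can be called as an extension method.
public static IQueryable<T> ApplyFilter<T>(this IQueryable<T> query, string propertyName, object value)
{
    var parameter = Expression.Parameter(typeof(T), "x");               // x
    var property = Expression.Property(parameter, propertyName);        // x.Property
    // Type the constant as the property type; Equal requires matching operand types.
    var constant = Expression.Constant(value, property.Type);
    var equality = Expression.Equal(property, constant);                // x.Property == value
    var lambda = Expression.Lambda<Func<T, bool>>(equality, parameter); // x => x.Property == value
    return query.Where(lambda);
}
// Usage
var filteredProducts = products.AsQueryable()
    .ApplyFilter("Category", "Electronics");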
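The same building blocks can be tried without any entity classes. The sketch below builds the predicate x => x.Length == 5 by hand, using string.Length as a stand-in for an entity property (the word list is invented), then runs it both through an IQueryable and as a compiled delegate.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Build x => x.Length == 5 node by node.
var parameter = Expression.Parameter(typeof(string), "x");
var property = Expression.Property(parameter, "Length");
var constant = Expression.Constant(5, property.Type);
var body = Expression.Equal(property, constant);
var lambda = Expression.Lambda<Func<string, bool>>(body, parameter);

// The tree is data: it can be printed, inspected, or translated by a provider.
Console.WriteLine(lambda);

// A query provider receives the tree; the in-memory provider compiles and runs it.
var words = new[] { "apple", "kiwi", "mango", "fig" };
var fiveLetters = words.AsQueryable().Where(lambda).ToList();

// Compile() turns the tree into an ordinary delegate.
Func<string, bool> isFive = lambda.Compile();
Console.WriteLine(string.Join(", ", fiveLetters)); // apple, mango
Console.WriteLine(isFive("hello"));                // True
```

With EF Core the same lambda would be translated to SQL instead of being compiled, which is why the filter has to be an expression tree rather than a plain delegate.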
Step 4: Advanced Joins
Left Join:
var leftJoin = from customer in customers
               join order in orders on customer.Id equals order.CustomerId into orderGroup
               from order in orderGroup.DefaultIfEmpty()
               select new { customer.Name, OrderId = order?.Id ?? 0 };
Group Join:
var grouped = from customer in customers
              join order in orders on customer.Id equals order.CustomerId into orderGroup
              select new { customer.Name, Orders = orderGroup };
Multiple Key Join:
var result = from order in orders
             join detail in orderDetails on new { order.OrderId, order.Year } equals new { detail.OrderId, detail.Year }
             select new { order, detail };
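For readers who prefer method syntax, a left join can be sketched as GroupJoin followed by SelectMany with DefaultIfEmpty. The customer and order data below are invented for illustration, and anonymous types stand in for real entity classes.

```csharp
using System;
using System.Linq;

var customers = new[]
{
    new { Id = 1, Name = "Ada" },
    new { Id = 2, Name = "Grace" },
};
var orders = new[]
{
    new { Id = 100, CustomerId = 1 },
    new { Id = 101, CustomerId = 1 },
};

// GroupJoin pairs each customer with its (possibly empty) set of orders.
// SelectMany + DefaultIfEmpty flattens the pairs while keeping customers without orders.
var leftJoin = customers
    .GroupJoin(orders,
        customer => customer.Id,
        order => order.CustomerId,
        (customer, orderGroup) => new { customer, orderGroup })
    .SelectMany(
        pair => pair.orderGroup.DefaultIfEmpty(),
        (pair, order) => new { pair.customer.Name, OrderId = order?.Id ?? 0 })
    .ToList();

foreach (var row in leftJoin)
    Console.WriteLine($"{row.Name}: {row.OrderId}");
// Ada: 100
// Ada: 101
// Grace: 0
```

The null-conditional operator works here because this is LINQ to Objects; inside an expression tree (for example an EF Core query) it is not allowed, and a conditional expression is needed instead.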
Step 5: GroupBy with Complex Keys
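var grouped = products
    // Integer bucket per 100 currency units (0 = below 100, 1 = 100 up to 199.99, ...).
    // Without the cast, a decimal Price produces a distinct key for almost every product.
    .GroupBy(p => new { p.Category, PriceRange = (int)(p.Price / 100) })
    .Select(g => new
    {
        g.Key.Category,
        g.Key.PriceRange,
        Count = g.Count(),
        AvgPrice = g.Average(p => p.Price)
    });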
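A small worked example helps check that a composite key groups the way you expect. This sketch assumes Price is a decimal and truncates the quotient to an integer bucket, since plain decimal division would give nearly every price its own key; the product data is invented.

```csharp
using System;
using System.Linq;

var products = new[]
{
    new { Category = "Audio", Price = 49.99m },
    new { Category = "Audio", Price = 89.50m },
    new { Category = "Audio", Price = 120.00m },
    new { Category = "Video", Price = 250.00m },
};

var grouped = products
    // Bucket 0 = below 100, bucket 1 = 100 up to 199.99, and so on.
    .GroupBy(p => new { p.Category, PriceRange = (int)(p.Price / 100) })
    .Select(g => new
    {
        g.Key.Category,
        g.Key.PriceRange,
        Count = g.Count(),
        AvgPrice = g.Average(p => p.Price)
    })
    .ToList();

foreach (var g in grouped)
    Console.WriteLine($"{g.Category} / bucket {g.PriceRange}: {g.Count} item(s)");
```

Anonymous types compare by value, so two products land in the same group only when both Category and PriceRange match.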
Step 6: Nested LINQ Queries
var result = customers.Select(c => new
{
    c.Name,
    TopOrders = c.Orders
        .OrderByDescending(o => o.Total)
        .Take(3)
        .Select(o => new { o.Id, o.Total })
});
Step 7: Aggregation with Custom Accumulators
var summary = orders.Aggregate(
    new { TotalRevenue = 0m, OrderCount = 0 },
    (acc, order) => new
    {
        TotalRevenue = acc.TotalRevenue + order.Total,
        OrderCount = acc.OrderCount + 1
    }
);
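Aggregate also has an overload that takes a result selector, which is handy for deriving a value from the final accumulator. The sketch below is one possible variation: it uses a value tuple as the accumulator (avoiding an allocation per element) and computes an average at the end. The order data is invented.

```csharp
using System;
using System.Linq;

var orders = new[]
{
    new { Id = 1, Total = 120.00m },
    new { Id = 2, Total = 80.00m },
    new { Id = 3, Total = 40.00m },
};

// Seed, accumulator step, then a result selector applied once to the final accumulator.
var summary = orders.Aggregate(
    (TotalRevenue: 0m, OrderCount: 0),
    (acc, order) => (acc.TotalRevenue + order.Total, acc.OrderCount + 1),
    acc => new
    {
        acc.TotalRevenue,
        acc.OrderCount,
        Average = acc.OrderCount == 0 ? 0m : acc.TotalRevenue / acc.OrderCount
    });

Console.WriteLine($"{summary.OrderCount} orders, revenue {summary.TotalRevenue}");
```

Computing several values in one Aggregate call walks the source once, whereas separate Sum and Count calls would enumerate it twice.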
Step 8: Parallel LINQ (PLINQ)
var result = largeCollection
    .AsParallel()
    .Where(x => ExpensiveOperation(x))
    .Select(x => Transform(x))
    .ToList();
// Control parallelism
var controlled = largeCollection
    .AsParallel()
    .WithDegreeOfParallelism(4)
    .Where(x => x.IsValid)
    .ToList();
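One detail that often surprises people is result ordering: a PLINQ query does not promise source order unless asked to. The following sketch, using an invented range of integers, compares an unordered query, an AsOrdered query, and the sequential equivalent.

```csharp
using System;
using System.Linq;

var input = Enumerable.Range(1, 1_000).ToArray();

// Without AsOrdered, results may come back in any order.
var unordered = input
    .AsParallel()
    .Where(n => n % 2 == 0)
    .Select(n => n * n)
    .ToList();

// AsOrdered preserves source order, at some cost in throughput.
var ordered = input
    .AsParallel()
    .AsOrdered()
    .Where(n => n % 2 == 0)
    .Select(n => n * n)
    .ToList();

var sequential = input.Where(n => n % 2 == 0).Select(n => n * n).ToList();

Console.WriteLine(ordered.SequenceEqual(sequential));                   // True
Console.WriteLine(unordered.OrderBy(n => n).SequenceEqual(sequential)); // True
```

The unordered list always contains the same elements, but comparing it to the sequential result without sorting first may or may not succeed from run to run.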
Performance Optimization
Avoid Multiple Enumerations
Wrong:
if (query.Any())
{
    var first = query.First(); // Re-enumerates: the source is walked (or queried) a second time
}
Correct:
var list = query.ToList(); // Enumerate once and keep the results
if (list.Count > 0)
{
    var first = list[0];
}
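The cost of re-enumeration can be made visible with a counter. In this sketch, LoadNumbers is a hypothetical data source whose every enumeration stands in for one database round trip.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var timesStarted = 0;

// Hypothetical source: the body runs again for every enumeration.
IEnumerable<int> LoadNumbers()
{
    timesStarted++;
    yield return 1;
    yield return 7;
    yield return 9;
}

var query = LoadNumbers().Where(n => n > 5);

// Any() and First() each start their own enumeration of the source.
var firstTwice = query.Any() ? query.First() : 0;
var startsWhenReEnumerating = timesStarted;

// Materialize once, then answer both questions from the list.
timesStarted = 0;
var list = query.ToList();
var firstOnce = list.Count > 0 ? list[0] : 0;
var startsWhenMaterialized = timesStarted;

Console.WriteLine($"{startsWhenReEnumerating} vs {startsWhenMaterialized}"); // 2 vs 1
```

When only the first element is needed, FirstOrDefault answers both questions in a single pass without building a list at all.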
Use Indexed Access for Random Lookups
// Wrong for repeated lookups: O(n) scan every time
var customer = customers.First(c => c.Id == id);
// Correct for repeated lookups: build the index once in O(n), then O(1) per lookup
var lookup = customers.ToLookup(c => c.Id);
var customer = lookup[id].FirstOrDefault();
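When the key is unique, as a customer Id normally is, ToDictionary is a natural alternative to ToLookup. The sketch below assumes unique Ids and uses invented data; TryGetValue handles the missing-key case without exceptions.

```csharp
using System;
using System.Linq;

var customers = new[]
{
    new { Id = 1, Name = "Ada" },
    new { Id = 2, Name = "Grace" },
    new { Id = 3, Name = "Linus" },
};

// Build the index once: O(n). ToDictionary throws on duplicate keys,
// which doubles as a check that Id really is unique.
var byId = customers.ToDictionary(c => c.Id);

// Each lookup is then O(1) on average.
var found = byId.TryGetValue(2, out var customer) ? customer.Name : "(none)";
var missing = byId.TryGetValue(42, out var other) ? other.Name : "(none)";

Console.WriteLine(found);   // Grace
Console.WriteLine(missing); // (none)
```

ToLookup remains the better fit when several items can share a key, since it returns a (possibly empty) sequence per key instead of throwing.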
Project Before Materializing (EF Core)
// Wrong: Fetches all columns of every matching row
var result = dbContext.Products
    .Where(p => p.IsActive)
    .ToList();
// Correct: Select only the needed columns before calling ToList()
var result = dbContext.Products
    .Where(p => p.IsActive)
    .Select(p => new { p.Id, p.Name, p.Price })
    .ToList();
Common Anti-Patterns
Anti-Pattern 1: Nested ToList Calls
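// Wrong: ToList() loads every customer first, so the filter runs in memory
// and each c.Orders access can trigger its own database hit
var result = customers.ToList()
    .Where(c => c.Orders.ToList().Any(o => o.Total > 1000))
    .ToList();
// Correct: Single query, filtered by the database
var result = customers
    .Where(c => c.Orders.Any(o => o.Total > 1000))
    .ToList();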
Anti-Pattern 2: Unnecessary Intermediate Collections
// Wrong: allocates a list that is used once and thrown away
var temp = list.Where(x => x.IsActive).ToList();
var result = temp.Select(x => x.Name).ToList();
// Correct: chain the operators and materialize once
var result = list
    .Where(x => x.IsActive)
    .Select(x => x.Name)
    .ToList();
Advanced Scenarios
Dynamic Sorting
public static IQueryable<T> OrderByProperty<T>(this IQueryable<T> source, string propertyName, bool descending = false)
{
    var parameter = Expression.Parameter(typeof(T), "x");
    var property = Expression.Property(parameter, propertyName);
    var lambda = Expression.Lambda(property, parameter);
    var methodName = descending ? "OrderByDescending" : "OrderBy";
    var resultExpression = Expression.Call(
        typeof(Queryable),
        methodName,
        new Type[] { typeof(T), property.Type },
        source.Expression,
        Expression.Quote(lambda)
    );
    return source.Provider.CreateQuery<T>(resultExpression);
}
Recursive Queries
public static IEnumerable<T> TraverseTree<T>(T root, Func<T, IEnumerable<T>> childSelector)
{
    var stack = new Stack<T>();
    stack.Push(root);
    while (stack.Count > 0)
    {
        var current = stack.Pop();
        yield return current;
        foreach (var child in childSelector(current))
        {
            stack.Push(child);
        }
    }
}
// Usage
var allNodes = TraverseTree(rootCategory, c => c.Subcategories);
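A stack-based traversal visits the last child first, because the last item pushed is the first one popped. If declaration order matters, one option is to push the children in reverse. The sketch below is a variant under that assumption; TraversePreOrder and the dictionary-backed category tree are hypothetical names and data used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical category tree stored as parent -> children.
var children = new Dictionary<string, string[]>
{
    ["Root"] = new[] { "Books", "Music" },
    ["Books"] = new[] { "Fiction", "Science" },
};

// Depth-first, pre-order traversal that visits children in declaration order.
IEnumerable<T> TraversePreOrder<T>(T root, Func<T, IEnumerable<T>> childSelector)
{
    var stack = new Stack<T>();
    stack.Push(root);
    while (stack.Count > 0)
    {
        var current = stack.Pop();
        yield return current;
        // Push in reverse so the first child is popped (visited) first.
        foreach (var child in childSelector(current).Reverse())
            stack.Push(child);
    }
}

var visited = TraversePreOrder(
    "Root",
    name => children.TryGetValue(name, out var kids) ? kids : Array.Empty<string>()).ToList();

Console.WriteLine(string.Join(" > ", visited));
// Root > Books > Fiction > Science > Music
```

Because the traversal is an iterator, it composes with other operators: adding Take(2) before ToList would stop after the first two nodes without walking the rest of the tree.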
Troubleshooting
Issue: Query not executing against database
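Solution: Keep the query typed as IQueryable rather than IEnumerable until filtering is done; calling .ToList() or .AsEnumerable() before the filter moves the remaining operators into memory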
Issue: OutOfMemoryException with large datasets
Solution: Use streaming with yield return; avoid .ToList() on large collections
Issue: Poor performance with complex queries
Solution: Profile with EF Core logging; add indexes; consider raw SQL for edge cases
Best Practices
- Prefer method syntax for complex queries
- Use IQueryable for database queries to enable server-side execution
- Avoid side effects in lambda expressions
- Leverage AsNoTracking() for read-only EF queries
- Use PLINQ only for CPU-bound operations
Key Takeaways
- Deferred execution lets you compose a query in several steps; nothing runs until the result is enumerated, so composition itself adds no extra passes over the data.
- Expression trees power dynamic query building.
- Custom operators extend LINQ for domain-specific needs.
- Performance optimization requires understanding execution context.
Next Steps
- Build expression tree-based dynamic filtering library
- Implement query result caching strategy
- Explore async LINQ (IAsyncEnumerable)
Which LINQ pattern will you apply first?