Testing Azure Container Apps
Introduction
Reliable testing is critical for Azure Container Apps (ACA) workloads that scale dynamically, rely on Dapr sidecars, or integrate event-driven patterns (HTTP, Service Bus, Event Grid). This guide establishes an end-to-end testing strategy covering unit, integration, contract, performance, resilience, and security validation with automation examples.
Prerequisites
- Azure subscription (free trial)
- Azure CLI (az) + containerapp extension
- GitHub Actions or Azure DevOps pipeline access
- Node.js / .NET SDK (sample services/tests)
- k6 (load testing) or Azure Load Testing resource
- Trivy / Microsoft Defender for Cloud (image scanning)
Testing Strategy Overview
| Layer | Scope | Tools | Goal |
|---|---|---|---|
| Unit | Functions, methods | xUnit / Jest | Deterministic logic correctness |
| Integration | Service + dependent resource (Redis, Cosmos) | Testcontainers / Docker Compose | Resource wiring & data behavior |
| Contract | API surface & schemas | OpenAPI diff, Pact | Prevent breaking consumer changes |
| End-to-End | Full workflow (HTTP → event → persistence) | Playwright / REST clients | Validate business scenarios |
| Load / Performance | Cosmos DB RU consumption, latency, concurrency, scale-out | k6 / Azure Load Testing | Capacity & auto-scaling behavior |
| Resilience | Fault injection, timeouts | Chaos Studio (future) / custom scripts | Graceful degradation |
| Security | Image + dependency scan | Trivy, Defender for Cloud | Vulnerability & misconfig detection |
| Observability | Telemetry completeness | Application Insights / OpenTelemetry | Trace coverage & useful metrics |
Architecture Under Test
Local Integration Setup (Testcontainers Example)
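// xUnit fixture that starts Redis in a container and connects to a Cosmos DB emulator.
// The emulator is assumed to be running already on localhost:8081; it is not started here.
using System.Threading.Tasks;
using DotNet.Testcontainers.Builders;
using DotNet.Testcontainers.Containers;
using Microsoft.Azure.Cosmos;
using Xunit;

public class IntegrationFixture : IAsyncLifetime
{
    public string RedisConnection { get; private set; }
    public CosmosClient Cosmos { get; private set; }
    private IContainer _redis;

    public async Task InitializeAsync()
    {
        _redis = new ContainerBuilder()
            .WithImage("redis:7")
            .WithPortBinding(6379, true) // map 6379 to a random free host port
            .Build();
        await _redis.StartAsync();
        RedisConnection = $"{_redis.Hostname}:{_redis.GetMappedPublicPort(6379)}";
        Cosmos = new CosmosClient("https://localhost:8081", "C2F...=="); // emulator key
    }

    public async Task DisposeAsync()
    {
        Cosmos?.Dispose();
        await _redis.DisposeAsync(); // stops and removes the container
    }
}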
Contract Testing (Pact / OpenAPI Diff)
openapi-diff previous.yaml current.yaml --fail-on-changed
Add consumer-driven contract tests for critical event payloads (e.g., JSON schema versions of queue messages).
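One lightweight way to pin an event payload contract is to check sample messages against a versioned field map. The sketch below is only an illustration: the OrderCreated message, its fields, and the schemaVersion property are hypothetical, and a real project would more likely use Pact or a full JSON Schema validator.

```javascript
// Hypothetical contract for an "OrderCreated" queue message, keyed by schema version.
// Field names and versions are illustrative, not taken from a real service.
const orderCreatedContracts = {
  1: { orderId: 'string', total: 'number' },
  2: { orderId: 'string', total: 'number', currency: 'string' },
};

// Returns a list of contract violations; an empty list means the message conforms.
function validateMessage(message, contracts) {
  const contract = contracts[message.schemaVersion];
  if (!contract) {
    return [`unknown schemaVersion: ${message.schemaVersion}`];
  }
  const errors = [];
  for (const [field, expectedType] of Object.entries(contract)) {
    if (!(field in message)) {
      errors.push(`missing field: ${field}`);
    } else if (typeof message[field] !== expectedType) {
      errors.push(`field ${field}: expected ${expectedType}, got ${typeof message[field]}`);
    }
  }
  return errors;
}

// A v2 producer that forgets "currency" would break v2 consumers.
const sampleMessage = { schemaVersion: 2, orderId: 'o-123', total: 42.5 };
console.log(validateMessage(sampleMessage, orderCreatedContracts));
```

Running the same check in both the producer's and the consumer's test suites keeps the two sides honest about which schema version they emit and accept.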
Performance & Load (k6)
import http from 'k6/http';
import { check, sleep } from 'k6';
export const options = {
  stages: [
    { duration: '1m', target: 50 },
    { duration: '3m', target: 200 },
    { duration: '1m', target: 0 },
  ],
  // p95 is an aggregate over all requests, so enforce it as a threshold, not a per-request check.
  thresholds: { http_req_duration: ['p(95)<300'] },
};
export default () => {
  const res = http.get(`${__ENV.BASE_URL}/api/orders`);
  check(res, { 'status 200': (r) => r.status === 200 });
  sleep(1);
};
Run the script against an app that has scale rules configured, and watch the replica count and KEDA scaling events while the load ramps up and down.
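A per-request latency check and a p95 target measure different things: the first reports what share of requests beat a limit, the second is a percentile over the whole run. The sketch below uses the nearest-rank method on made-up durations to show how a run can look healthy on one measure and fail the other. It is a plain JavaScript illustration, not part of the k6 API.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Fraction of requests under a limit, which is what a per-request check reports.
function passRate(samples, limitMs) {
  return samples.filter((d) => d < limitMs).length / samples.length;
}

// Made-up request durations in milliseconds: 18 fast requests and 2 slow outliers.
const durations = [...Array(18).fill(120), 900, 950];

console.log(percentile(durations, 95)); // 900
console.log(passRate(durations, 300));  // 0.9
```

Here 90% of requests pass a 300 ms check, yet the p95 is 900 ms, so a "p95 under 300 ms" target is missed by a wide margin.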
Resilience Testing
Fault injections:
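- Take a revision out of service (az containerapp revision deactivate) or restart it (az containerapp revision restart). Note that deactivate affects the whole revision, not a single replica.
- Introduce latency via a test-only middleware (delay 500ms) to verify timeout & retry policies.
- Simulate a Redis outage by blocking its port locally; confirm the fallback logic.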
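The timeout, retry, and fallback behavior these injections are meant to exercise can be sketched as follows. This is a minimal illustration under stated assumptions: callWithRetry, withTimeout, and slowDependency are hypothetical names, and a production policy would normally add backoff, jitter, and logging.

```javascript
// Rejects if the wrapped promise does not settle within timeoutMs.
function withTimeout(promise, timeoutMs) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('timeout')), timeoutMs);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Tries operation() up to `attempts` times, then resolves with fallback().
async function callWithRetry(operation, { attempts, timeoutMs, fallback }) {
  for (let i = 1; i <= attempts; i++) {
    try {
      return await withTimeout(operation(), timeoutMs);
    } catch (err) {
      // Swallow the failure and retry; a real policy would add backoff and logging.
    }
  }
  return fallback();
}

// Hypothetical dependency that mimics the 500 ms test-only delay.
const slowDependency = () =>
  new Promise((resolve) => setTimeout(() => resolve('fresh'), 500));

// Both attempts exceed the 100 ms timeout, so the fallback value is used.
callWithRetry(slowDependency, { attempts: 2, timeoutMs: 100, fallback: () => 'cached' })
  .then((value) => console.log(value)); // prints "cached"
```

A resilience test then asserts on the observable outcome (the fallback value, the number of attempts, the total elapsed time) while the fault is active.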
Security & Compliance
trivy image ghcr.io/org/app-api:latest --severity HIGH,CRITICAL --exit-code 1
trivy fs . --ignore-unfixed
Integrate Defender for Cloud recommendations; fail pipeline on critical CVEs.
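When the gate needs more nuance than a single exit code, the pipeline can evaluate a JSON report itself. The sketch below assumes a report shaped like the output of trivy image --format json (a top-level Results array whose entries may carry a Vulnerabilities list). The gate policy, the maxHigh parameter, and the CVE IDs are made up for illustration.

```javascript
// Counts vulnerabilities per severity in a report shaped like `trivy image --format json` output.
function countBySeverity(report) {
  const counts = {};
  for (const result of report.Results ?? []) {
    for (const vuln of result.Vulnerabilities ?? []) {
      counts[vuln.Severity] = (counts[vuln.Severity] ?? 0) + 1;
    }
  }
  return counts;
}

// Gate policy (an assumption for this sketch): block on any CRITICAL, tolerate up to maxHigh HIGH.
function shouldBlock(report, maxHigh = 0) {
  const counts = countBySeverity(report);
  return (counts.CRITICAL ?? 0) > 0 || (counts.HIGH ?? 0) > maxHigh;
}

// Made-up report fragment; the CVE IDs are placeholders.
const report = {
  Results: [
    { Target: 'app-api (debian)', Vulnerabilities: [{ VulnerabilityID: 'CVE-0000-0001', Severity: 'HIGH' }] },
    { Target: 'package-lock.json', Vulnerabilities: [{ VulnerabilityID: 'CVE-0000-0002', Severity: 'CRITICAL' }] },
    { Target: 'clean-layer' }, // targets without findings carry no Vulnerabilities list
  ],
};

console.log(countBySeverity(report)); // { HIGH: 1, CRITICAL: 1 }
console.log(shouldBlock(report));     // true
```

In a pipeline step, a true result would typically translate into a non-zero exit code so the job fails.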
CI/CD Pipeline Snippet (GitHub Actions)
name: aca-ci
on: [push]
jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '20' }
      - run: npm ci && npm test
      - name: Unit tests (.NET)
        run: dotnet test src/Api.Tests/Api.Tests.csproj --configuration Release
      # k6 and trivy are assumed to be installed on the runner by an earlier step.
      - name: Load test (k6 smoke)
        run: BASE_URL=${{ secrets.APP_URL }} k6 run tests/load/smoke.js
      - name: Image build
        run: docker build -t ghcr.io/org/app-api:${{ github.sha }} .
      - name: Security scan
        run: trivy image ghcr.io/org/app-api:${{ github.sha }} --severity HIGH,CRITICAL --exit-code 1
      - name: Push image
        env:
          CR_PAT: ${{ secrets.CR_PAT }} # assumed secret name for the registry token
        run: echo "$CR_PAT" | docker login ghcr.io -u USER --password-stdin && docker push ghcr.io/org/app-api:${{ github.sha }}
  deploy:
    needs: build-test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Azure Login
        uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
      - name: Deploy ACA
        run: |
          az containerapp update \
            --name app-api \
            --resource-group rg-aca-prod \
            --image ghcr.io/org/app-api:${{ github.sha }} \
            --set-env-vars APP_ENV=prod
Observability
- Enable Dapr tracing + OpenTelemetry exporter to Application Insights.
- Track custom metrics: queueLag, redisHitRate, replicaCount.
- Alerts: p95 latency, HTTP 5xx rate, throttled Service Bus calls.
Sample Kusto Query (failed requests trend):
requests
| where timestamp > ago(1h)
| where resultCode startswith "5"
| summarize count() by bin(timestamp, 5m)
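The query above counts failures per bin, while an alert on HTTP 5xx rate needs failures relative to total traffic. The sketch below mirrors the 5-minute binning in plain JavaScript to make that ratio explicit. The request objects and their timestamps are made up, and this is an illustration of the calculation rather than how Application Insights evaluates alerts.

```javascript
const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Groups requests into fixed-size time bins and returns the share of 5xx responses per bin.
// Each request is { timestamp: epoch milliseconds, resultCode: string }.
function failureRateByBin(requests, binMs = FIVE_MINUTES_MS) {
  const bins = new Map();
  for (const { timestamp, resultCode } of requests) {
    const binStart = Math.floor(timestamp / binMs) * binMs;
    const bin = bins.get(binStart) ?? { total: 0, failed: 0 };
    bin.total += 1;
    if (String(resultCode).startsWith('5')) bin.failed += 1;
    bins.set(binStart, bin);
  }
  return [...bins.entries()]
    .sort(([a], [b]) => a - b)
    .map(([binStart, { total, failed }]) => ({ binStart, total, failed, rate: failed / total }));
}

// Made-up sample: three requests in the first bin, one in the next.
const sampleRequests = [
  { timestamp: 0, resultCode: '200' },
  { timestamp: 60000, resultCode: '503' },
  { timestamp: 120000, resultCode: '200' },
  { timestamp: 300000, resultCode: '500' },
];

console.log(failureRateByBin(sampleRequests));
```

Alerting on the rate rather than the raw count avoids false alarms when traffic, and therefore the absolute number of errors, rises during a load test.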
Troubleshooting Matrix
| Symptom | Likely Cause | Action | Preventative |
|---|---|---|---|
| High cold start latency | Image size too large | Optimize layers, enable caching | Multistage builds & slim base |
| Replica thrash | Misconfigured KEDA scaling metric | Adjust min/max replicas & cooldown | Define stable threshold |
| 429 / throttling | Under-provisioned backing services | Increase capacity / caching | RU & concurrency monitoring |
| Missing traces | Dapr tracing disabled | Enable tracing config | Version-controlled observability config |
| Failed deploy due to CVE | Critical vulnerability found | Patch dependency / rebuild image | Scheduled image scans |
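The replica thrash row can be made concrete with a small simulation. It assumes the commonly documented scaling rule, desired replicas = ceil(metric value / target per replica), plus a scale-down stabilization window that keeps the highest recent target. The queue depths, threshold, and window size are made up, and actual KEDA and ACA behavior depends on the configured scaler and platform defaults.

```javascript
// Replica target for one metric sample, clamped to the configured bounds.
function desiredReplicas(metricValue, targetPerReplica, { min, max }) {
  const unclamped = Math.ceil(metricValue / targetPerReplica);
  return Math.min(max, Math.max(min, unclamped));
}

// Scale-down stabilization: use the highest target seen in the last `windowSize` samples.
function stabilized(targets, windowSize) {
  return targets.map((_, i) =>
    Math.max(...targets.slice(Math.max(0, i - windowSize + 1), i + 1)));
}

// Made-up queue depths that hover around a threshold of 10 messages per replica.
const queueDepths = [9, 11, 9, 12, 8, 11];
const bounds = { min: 1, max: 10 };

const rawTargets = queueDepths.map((depth) => desiredReplicas(depth, 10, bounds));
console.log(rawTargets);                // [ 1, 2, 1, 2, 1, 2 ]
console.log(stabilized(rawTargets, 3)); // [ 1, 2, 2, 2, 2, 2 ]
```

Without the window the target flips on every sample. With it, the app holds at two replicas, which is the effect a longer cooldown or a threshold set away from the steady-state load is meant to achieve.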
Best Practices
- Keep images lean (distroless or slim base).
- Externalize config via secrets & env vars; rotate regularly.
- Use multiple-revision mode for blue/green deployments; test the new revision under load before shifting traffic to it.
- Automate regression smoke after deploy (status endpoint + key business API).
- Tag images with git sha + semantic version.
- Enforce resource limits (CPU/memory) to avoid noisy neighbor issues.
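The post-deploy regression smoke mentioned above can be sketched as a small check runner. Everything named here is an assumption for illustration: runSmokeChecks is a hypothetical helper, /healthz stands in for whatever status endpoint the app exposes, and the stub replaces the deployed app so the logic can run without a live environment.

```javascript
// Runs each check against baseUrl and returns descriptions of the ones that failed.
// fetchFn is injectable so the logic can be exercised without a live deployment.
async function runSmokeChecks(baseUrl, checks, fetchFn = fetch) {
  const failures = [];
  for (const { path, expectedStatus } of checks) {
    try {
      const res = await fetchFn(`${baseUrl}${path}`);
      if (res.status !== expectedStatus) {
        failures.push(`${path}: expected ${expectedStatus}, got ${res.status}`);
      }
    } catch (err) {
      failures.push(`${path}: request failed (${err.message})`);
    }
  }
  return failures;
}

// Hypothetical endpoints: a status probe plus one key business API.
const smokeChecks = [
  { path: '/healthz', expectedStatus: 200 },
  { path: '/api/orders', expectedStatus: 200 },
];

// Stub standing in for the deployed app; it answers 503 for everything except /healthz.
const stubFetch = async (url) => ({ status: url.endsWith('/healthz') ? 200 : 503 });

runSmokeChecks('https://example.invalid', smokeChecks, stubFetch)
  .then((failures) => console.log(failures));
```

In a pipeline, the step would call the function with the real base URL and the default fetch, then fail the job when the returned list is not empty.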
Key Takeaways
- Multi-layer testing prevents late production surprises.
- Load & resilience tests validate autoscale + failure recovery.
- Security scanning must be gating, not advisory.
- Observability coverage (logs, metrics, traces) enables fast MTTR.
Next Steps
- Add chaos experiments (network latency, replica kill) once Chaos Studio supports ACA.
- Integrate contract tests in CI for event payloads.
- Expand performance benchmarks to scheduled nightly runs.