Architecture Blueprint · Proven

Resilient Cache Architecture with Azure Redis

A production-grade caching blueprint covering cache-aside pattern, write strategies, TTL governance, and the failure handling most teams forget: what happens when Redis itself goes down.

Budhisamvad Research·Jan 2026·11 min read
  • 10–100×: read latency improvement from a well-designed cache (Practitioner range)
  • #1 cause of cache outages: treating cache as a required dependency (Budhisamvad analysis)
  • 60s: typical TTL for volatile data; too long causes stale reads (Practitioner guidance)
  • 100% of cache reads should have a source-of-truth fallback path (Budhisamvad standard)

Caching is the most common performance optimisation in enterprise systems and the one most frequently implemented in a way that creates a new single point of failure. The question that separates a resilient cache from a fragile one is simple: what happens to your application when the cache is unavailable? If the answer is "it goes down too," you've added a dependency, not a cache.


The cache resilience question
Watch out
The anti-pattern: an application that treats the cache as a required dependency. When Redis becomes unavailable — and it will, during failovers, scaling events, or network partitions — the application errors out instead of falling back to the source of truth. The cache was supposed to improve resilience and instead became a new way for the system to fail.
Architecture: Cache-aside pattern with resilient fallback
[Diagram: the Application (cache-aside logic) sends (1) a GET to Azure Redis (primary + replica, TTL-governed keys); (2) on a miss it reads from the Source DB (system of record, SQL / Cosmos DB); (3) it populates the cache. A Circuit Breaker bypasses Redis when Redis is down.]
When Redis is unavailable, the circuit breaker routes reads directly to the source DB: degraded but available.
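The bypass behaviour in the diagram can be sketched as a small circuit breaker. This is a minimal, single-process illustration rather than a definitive implementation: `CacheCircuitBreaker`, `read_with_breaker`, the thresholds, and the `cache_get` / `load_from_source` callables are all hypothetical names chosen for the example, and populating the cache after a miss is left out to keep the focus on the bypass.

```python
import time


class CacheCircuitBreaker:
    """Bypass the cache for a cool-down period after repeated failures."""

    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # assumed value, tune per workload
        self.reset_after_s = reset_after_s          # assumed cool-down, tune per workload
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means closed: cache calls are allowed

    def allow_cache_call(self):
        if self._opened_at is None:
            return True
        # After the cool-down, let a trial call through (half-open state).
        return self._clock() - self._opened_at >= self.reset_after_s

    def record_success(self):
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


def read_with_breaker(key, cache_get, load_from_source, breaker):
    """Read via the cache when the breaker allows it, else go straight to the source."""
    if breaker.allow_cache_call():
        try:
            value = cache_get(key)
            breaker.record_success()
            if value is not None:
                return value
        except Exception:  # in real code, narrow this to the client's connection/timeout errors
            breaker.record_failure()
    # Cache miss, cache error, or breaker open: the source of truth answers.
    return load_from_source(key)
```

While the breaker is open, requests skip the cache call entirely, so a Redis outage costs each request a database read rather than a connection timeout followed by a database read.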

When to Use This Pattern

Use this when
  • Read-heavy workloads where the same data is requested frequently
  • Expensive-to-compute or expensive-to-fetch data (aggregations, joins, API calls)
  • Session storage requiring fast access across distributed application instances
  • Workloads that can tolerate eventual consistency on cached data
Avoid when
  • Write-heavy workloads where cache invalidation overhead exceeds the benefit
  • Data requiring strict real-time consistency (financial balances, inventory counts)
  • Datasets small enough to hold in application memory
  • Cases where you cannot tolerate stale reads even briefly

The Cache-Aside Pattern

Cache-aside (lazy loading) is the default pattern for most enterprise caching. The application checks the cache first; on a miss, it reads from the source, populates the cache, and returns the result. The key resilience property: the source of truth is always the database, never the cache. The cache is an optimisation, and the system functions correctly (if slower) without it.
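As a rough sketch of that read path, assuming a cache object that exposes `get(key)` and `set(key, value, ttl_seconds)`: the `InMemoryCache` class below is a stand-in for illustration only (it ignores the TTL), and `get_with_cache_aside` and `load_from_source` are hypothetical names. A real Redis client would take the place of the stand-in.

```python
import logging

logger = logging.getLogger("cache")


class InMemoryCache:
    """Illustrative stand-in for a Redis client; it ignores the TTL."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, ttl_seconds):
        self._data[key] = value


def get_with_cache_aside(key, cache, load_from_source, ttl_seconds=60):
    """Cache-aside read: try the cache, fall back to the source, then populate."""
    try:
        cached = cache.get(key)
        if cached is not None:
            return cached
    except Exception:
        # A cache failure is treated like a miss, never as a request failure.
        logger.warning("cache read failed for %s; using source of truth", key)

    value = load_from_source(key)  # the database remains the source of truth

    try:
        cache.set(key, value, ttl_seconds)
    except Exception:
        logger.warning("cache write failed for %s; continuing without cache", key)
    return value
```

Both cache calls are wrapped so that a Redis error degrades the request to a plain database read, which is the resilience property described above.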

Framework: The Cache Resilience Test™
Before deploying any cache, answer three questions:
  1. Availability: if the cache is down, does the application still work (degraded) or fail? It must degrade, not fail.
  2. Consistency: when the underlying data changes, how does the cache learn? Define a TTL or explicit invalidation; never rely on "hope."
  3. Stampede: when a popular key expires, do thousands of requests hit the database simultaneously? Use a lock or probabilistic early expiry to prevent the thundering herd.
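For the stampede question, one common formulation of probabilistic early expiry is sketched below. The article does not prescribe a specific formula, so treat this as an assumption: each request draws a random "head start" proportional to how long the value takes to recompute, and refreshes if that head start reaches past the expiry time. The function name and parameters are hypothetical.

```python
import math
import random
import time


def should_refresh_early(expires_at, recompute_seconds, beta=1.0, now=None, rng=random.random):
    """Return True if this request should refresh the key before it expires.

    expires_at:        absolute expiry time of the cached value (epoch seconds)
    recompute_seconds: how long the value took to compute last time
    beta:              values above 1.0 refresh earlier, below 1.0 later
    """
    now = time.time() if now is None else now
    # 1 - rng() lies in (0, 1], so log() is defined and never positive.
    head_start = -recompute_seconds * beta * math.log(1.0 - rng())
    return now + head_start >= expires_at
```

Because the head start is random, requests arriving just before expiry rarely all decide to refresh at once, and expensive values (large `recompute_seconds`) tend to be refreshed earlier than cheap ones.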

Practitioner insight
From the field: A retail platform cached product data with a 1-hour TTL. During a flash sale, a popular product's cache entry expired and 8,000 concurrent requests hit the database in the same second — a cache stampede that took down the database and the sale with it. The fix was trivial in hindsight: a per-key lock so only one request refreshes the cache while others briefly serve stale data. The lesson: TTL expiry is a correlated event, and correlated events at scale are how caches cause the outages they were meant to prevent.
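The per-key lock from this story might look like the following inside a single application process. This is a sketch under stated assumptions: `SingleFlightRefresher` and its method names are hypothetical, the caller is assumed to still hold a stale copy of the value, and the lock is local to one process.

```python
import threading


class SingleFlightRefresher:
    """In-process per-key lock: one caller refreshes, the rest get the stale value."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()  # protects the dictionary of per-key locks

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def refresh_or_stale(self, key, stale_value, recompute):
        lock = self._lock_for(key)
        if not lock.acquire(blocking=False):
            # Another caller is already refreshing this key: serve stale data.
            return stale_value
        try:
            return recompute(key)  # only the lock holder reaches the database
        finally:
            lock.release()
```

With several application instances, an in-process lock only limits the stampede to one refresh per instance. A shared lock, for example a Redis key written with `SET ... NX` and an expiry, would be one way to coordinate across instances.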

Write Strategies

How writes interact with the cache determines your consistency guarantees. Cache-aside with invalidation (delete the cache entry on write, let the next read repopulate) is the safest default. Write-through (update cache and database together) gives stronger consistency at the cost of write latency. Write-behind (update cache, async to database) gives the best write performance but risks data loss — rarely appropriate for enterprise systems of record.

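| Criterion      | Cache-aside             | Write-through                    | Write-behind              |
|----------------|-------------------------|----------------------------------|---------------------------|
| Consistency    | Eventual (safe default) | Strong                           | Weak (async)              |
| Write latency  | Low                     | Higher                           | Lowest                    |
| Data loss risk | None                    | None                             | Possible on crash         |
| Complexity     | Low                     | Medium                           | High                      |
| Best for       | Most enterprise systems | Read-heavy, consistency-critical | High-write, loss-tolerant |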
Practitioner insight
From the field: Write-behind caching looks attractive on a latency benchmark and causes data-loss incidents in production. For any system of record — financial, customer, or compliance data — cache-aside with invalidation is the correct default. Reserve write-behind for genuinely loss-tolerant, high-write workloads like telemetry ingestion, and even then, only with an explicit durability strategy.
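The safe default from this section, cache-aside with invalidation, can be sketched as a write path. The names `update_record` and `write_to_source` are hypothetical, and the cache is assumed to expose a `delete(key)` method.

```python
def update_record(key, new_value, write_to_source, cache):
    """Write to the system of record first, then drop the cached copy.

    Returns True if the cache entry was invalidated, False if the cache
    could not be reached.
    """
    # If this raises, the cache is left untouched and the error propagates.
    write_to_source(key, new_value)
    try:
        cache.delete(key)  # the next read repopulates from the source
        return True
    except Exception:
        # Cache unavailable: the write still succeeded. Until the entry's TTL
        # expires, readers may see the old value, which is why every key
        # needs a TTL even when explicit invalidation is in place.
        return False
```

Deleting the entry rather than updating it avoids writing a value into the cache that a concurrent writer has already superseded in the database.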

Production Checklist

  1. Configure Redis with replication and automatic failover

    Use the Azure Cache for Redis Premium tier with a replica. Without a replica, a primary failure means a cold cache and a database load spike during repopulation.

  2. Implement a circuit breaker around cache calls

    When Redis is unavailable, the circuit breaker routes reads directly to the source database — degraded performance, but the system stays available. This is the single most important resilience pattern for caching.

  3. Set TTLs deliberately, with jitter

    Every cached key needs a TTL appropriate to its data's volatility. Add random jitter to TTLs so that keys created together don't all expire simultaneously and cause a synchronised stampede.

  4. Prevent cache stampedes on hot keys

    For frequently-accessed keys, use a per-key lock or probabilistic early recomputation so that only one request refreshes an expired key while others serve slightly stale data.

  5. Monitor hit rate, latency, and evictions

    A falling hit rate signals a TTL or sizing problem. Rising evictions mean the cache is undersized. Both degrade silently until they cause a database load problem.
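The TTL jitter from the checklist could be as simple as the sketch below. The function name and the default spread of plus or minus 10% are assumptions for illustration; the right spread depends on how many keys are written together and how much staleness the data tolerates.

```python
import random


def ttl_with_jitter(base_ttl_seconds, jitter_fraction=0.1, rng=random.random):
    """Return the base TTL shifted by a random offset of up to +/- jitter_fraction."""
    if not 0 <= jitter_fraction < 1:
        raise ValueError("jitter_fraction must be in [0, 1)")
    spread = base_ttl_seconds * jitter_fraction
    offset = (2 * rng() - 1) * spread  # uniform in [-spread, +spread)
    # Never return less than one second, so the key is always given a TTL.
    return max(1, round(base_ttl_seconds + offset))
```

With a 60-second base TTL and the default spread, keys written in the same batch expire somewhere between 54 and 66 seconds later instead of all in the same second.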
