
Caching in C#: A Comprehensive Technical Guide 

In .NET systems, performance is often won or lost on the read path. Every extra database or API call adds latency and cost. Caching fixes that by keeping frequently used data, like product lists or lookups, close to your code, turning slow trips into instant reads. 

This is not theoretical; it works in the real world. Stack Overflow runs a two-tier cache (in-process + Redis), where a Redis hop takes only 0.2–0.5 ms and local memory reads are effectively instant. That combination keeps requests fast and traffic steady, even during peak load. Many teams report similar gains with managed Redis on Azure, achieving up to 8× higher throughput and 10× lower latency after adding a cache layer.

This guide shows how to do the same. We’ll show you which cache to use, which pattern to apply, and the essentials (keys, expirations, invalidation, monitoring) that make the gains last. 


Types of caching in .NET 

.NET supports four main caching models. Each addresses a different layer of performance: local speed, full-response reuse, cross-node sharing, or a mix of all three. Let’s explore them. 

In-memory caching (IMemoryCache / MemoryCache)

This cache lives inside your process, so reads skip the network and are as fast as it gets. Use it for small, hot objects (feature flags, lookup tables, per-request computations). It fits single-instance apps or server farms that can tolerate node-local state (e.g., with session affinity). Always set expirations and, if you store many items, enforce a size budget. 

Expiration & eviction—how it behaves: 

  • Absolute expiration: the entry expires at a fixed point in time.
  • Sliding expiration: the TTL extends on each access (pair with an absolute cap to avoid “immortal” hot keys).
  • Priority hints: Low / Normal / High / NeverRemove influence what gets evicted first under pressure.
  • Size limits: if you set SizeLimit, each entry must declare a Size; the cache evicts to stay within that budget (units are app-defined). 

Minimal, production-ready usage 

// Program.cs 
builder.Services.AddMemoryCache(); 
 
// Service or endpoint 
public class Products(IMemoryCache cache, IProductRepository repo)
{
    public async Task<Product?> GetAsync(int id)
    {
        var key = $"product:{id}";
        return await cache.GetOrCreateAsync(key, async entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10);
            entry.SlidingExpiration               = TimeSpan.FromMinutes(3);
            entry.Priority                        = CacheItemPriority.High;
            entry.Size                            = 1; // enables size-based eviction when SizeLimit is set
            return await repo.GetAsync(id);
        });
    }
}

For completeness, IMemoryCache provides thread-safe accessors and extension methods (Get, Set, TryGetValue, GetOrCreateAsync) that encapsulate the underlying ICacheEntry lifecycle. 
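If you need more control than GetOrCreateAsync offers (for example, deciding whether a null result should be cached at all), the manual path is a TryGetValue check followed by an explicit Set. A minimal sketch, reusing the same hypothetical repository as above:

// Manual TryGetValue/Set path -- a sketch; Product and IProductRepository are the types assumed above.
public class ProductLookup(IMemoryCache cache, IProductRepository repo)
{
    public async Task<Product?> GetAsync(int id)
    {
        var key = $"product:{id}";

        // Fast path: serve the cached value if present.
        if (cache.TryGetValue(key, out Product? cached))
            return cached;

        // Slow path: load from the source, then cache only non-null results.
        var product = await repo.GetAsync(id);
        if (product is not null)
        {
            cache.Set(key, product, new MemoryCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10),
                Size = 1
            });
        }
        return product;
    }
}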

Output caching (ASP.NET Core) 

Output caching stores the final HTTP response (HTML/JSON) and serves subsequent matching requests without executing your action/handler. You control variation by route, query, or headers and apply policies per endpoint. By default, the built-in store is in-memory; for scale-out, add a Redis store. This is server-side and distinct from client/proxy Response Caching.  

Policy-driven example 

// Program.cs 
builder.Services.AddOutputCache(options =>
{
    options.AddPolicy("Public10m", p => p
        .Cache()
        .Expire(TimeSpan.FromMinutes(10))
        .SetVaryByRouteValue("id")
        .SetVaryByQuery("q"));
});
 
var app = builder.Build(); 
app.UseOutputCache(); 
 
app.MapGet("/posts/{id:int}", async (int id, BlogRepo repo) => 
    Results.Ok(await repo.GetAsync(id))) 
   .CacheOutput("Public10m"); 

Operational levers you should set include:

  • DefaultExpirationTimeSpan (if not set per policy).
  • SizeLimit for the store (default ~100 MB) and MaximumBodySize.
  • Vary rules that reflect personalization and content negotiation. 

Distributed caching 

A distributed cache gives every app instance the same cached view via the IDistributedCache abstraction. Common providers are Redis, SQL Server, and NCache; because call sites depend only on the abstraction, you can swap providers without touching them (see the SQL Server registration sketch after the provider list). Use this whenever you scale beyond a single instance or run blue/green deploys.

Here is an overview of the providers: 

  • Redis (StackExchange.Redis): Most common choice for low-latency, in-memory caching with replication and persistence options (AOF/RDB) or managed platforms (e.g., Azure Cache for Redis). Wired up via AddStackExchangeRedisCache.
  • SQL Server: A durable, database-backed cache (Microsoft.Extensions.Caching.SqlServer) with service extensions to register it in DI. Better for environments that already standardize on SQL Server operations.
  • NCache: A .NET-native distributed cache with first-class IDistributedCache integration (AddNCacheDistributedCache), clustering, and high availability features. Useful when you want rich cache features and .NET-centric ops. 
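Because call sites only see IDistributedCache, moving between providers is a registration change. Here is a sketch of the SQL Server provider wiring, assuming a "CacheDb" connection string and a cache table created up front (for example with the dotnet sql-cache create tool):

// Program.cs -- same IDistributedCache call sites, different provider.
// Assumes Microsoft.Extensions.Caching.SqlServer and a pre-created cache table.
builder.Services.AddDistributedSqlServerCache(o =>
{
    o.ConnectionString = builder.Configuration.GetConnectionString("CacheDb");
    o.SchemaName       = "dbo";
    o.TableName        = "AppCache";
});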

Minimal usage (Redis example) 

// Program.cs 
builder.Services.AddStackExchangeRedisCache(o =>
{
    o.Configuration = builder.Configuration["Redis:ConnectionString"];
    o.InstanceName  = "app1:"; // namespace keys
});
 
public class CatalogService(IDistributedCache cache, CatalogRepo repo)
{
    public async Task<Product?> GetAsync(int id, CancellationToken ct)
    {
        var key = $"product:{id}";
        var json = await cache.GetStringAsync(key, ct);
        if (json is not null) return JsonSerializer.Deserialize<Product>(json);

        var data = await repo.GetAsync(id, ct);
        if (data is null) return null;

        await cache.SetStringAsync(
            key,
            JsonSerializer.Serialize(data),
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(15)
            },
            ct);

        return data;
    }
}

Scalability, synchronization, and HA considerations: 

  • Topology: Use managed Redis with replicas and persistence, and test failover regularly. NCache supports replication and partitioning; SQL Server caching follows database HA setups.
  • Eviction and TTLs: Set explicit TTLs and expect staleness to keep memory and consistency under control.
  • Payload discipline: Keep values compact, compress only when worthwhile, and namespace keys (e.g., product:).
  • Cold start control: Pre-warm critical keys or pair this approach with output caching for faster first responses.
  • Ops hygiene: Monitor hit ratio, memory, and latency; alert on eviction spikes or connection issues. 

HybridCache (.NET 9) 

HybridCache is .NET 9’s built-in two-tier cache: a fast in-process L1 on top of a distributed L2 (e.g., Redis) behind one simple API. Reads check L1 first; on a miss, HybridCache falls through to L2, single-flights the refill (one caller repopulates, others wait), then writes back to both tiers.  

You get local speed and fleet-wide consistency without building your own layering. It also supports tag-based invalidation, separate expiration for each tier, and size limits to keep entries safe. Use it when you run multiple instances and want in-process latency with shared correctness: think hot APIs, catalogs, and dashboards. 

What it adds beyond IMemoryCache + IDistributedCache

  • Stampede protection (single-flight refresh). On expiry/miss, one caller repopulates while others await the same result, reducing thundering herds.
  • Tag-based invalidation. Invalidate related keys as a group (e.g., all prices after a pricing import).
  • Guardrails. Defaults for maximum payload size and key length; supports modern distributed cache APIs. 

Quick start (API shape) 

dotnet add package Microsoft.Extensions.Caching.Hybrid 
// Program.cs -- register HybridCache + your L2 provider (e.g., Redis) and MemoryCache 
builder.Services.AddHybridCache(); 
 
public class PricesService(HybridCache cache, IPriceRepo repo)
{
    public async Task<Price?> GetAsync(string sku, CancellationToken ct) =>
        await cache.GetOrCreateAsync(
            key: $"price:{sku}",
            factory: async token => await repo.GetAsync(sku, token), // executes once on miss
            options: new HybridCacheEntryOptions
            {
                Expiration           = TimeSpan.FromMinutes(5),  // L2 (distributed) lifetime
                LocalCacheExpiration = TimeSpan.FromMinutes(2)   // L1 (in-process) lifetime
            },
            tags: ["prices", $"sku:{sku}"], // tags are a separate argument, not an options property
            cancellationToken: ct);
}

// Elsewhere: await cache.RemoveByTagAsync("prices", ct); // coordinated invalidation 

Now that you know what each caching option offers, let’s look at how to use them effectively. 

Caching patterns and strategies 

The following patterns spell out who talks to the database, when the cache is populated, and how freshness is maintained as data changes. 

The core patterns 

Use these four building blocks to match your read/write behavior: 

  • Cache-Aside (lazy loading). The app checks the cache first; on a miss it reads the database and then populates the cache. On writes, the app updates the DB and either invalidates or refreshes the cache. Simple, flexible, and the default for many teams.
  • Read-Through. The app talks only to the cache; on a miss, the cache layer (or wrapper) fetches from the DB and fills itself. This centralizes loading logic and reduces app boilerplate.
  • Write-Through. Each write updates the cache and the database synchronously. Reads can rely on the cache being current, at the cost of slightly slower writes (a minimal sketch follows this list).
  • Write-Behind (write-back). The app writes to the cache, which asynchronously persists to the DB (often batched). This increases write throughput but introduces temporary inconsistency and a failure window if the cache dies before flush. 
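To make the write path concrete, here is a minimal write-through sketch over IDistributedCache; the ProductRepo and Product types are placeholders, and the point is simply that the database and the cache are updated in the same operation.

// Write-Through sketch: the system of record and the cache are updated together,
// so subsequent reads can trust the cached value. Types are illustrative.
public class ProductWriter(IDistributedCache cache, ProductRepo repo)
{
    public async Task SaveAsync(Product product, CancellationToken ct)
    {
        // 1. Persist to the database first.
        await repo.SaveAsync(product, ct);

        // 2. Refresh the cache synchronously so readers immediately see the new value.
        await cache.SetStringAsync(
            $"product:{product.Id}",
            JsonSerializer.Serialize(product),
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(15)
            },
            ct);
    }
}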

Selection decision matrix 

Situation | Preferred pattern | Rationale
Read-heavy domain; straightforward invalidation | Cache-Aside | Minimal coupling; only cache what's read; easy to reason about.
You want the cache layer to own "load on miss" | Read-Through | Centralizes miss handling; keeps app code thin.
Fresh reads immediately after writes are critical | Write-Through | Cache updated in lockstep with DB; consistent read path.
Write throughput matters more than immediate consistency | Write-Behind | Absorbs spikes; persists later; eventual consistency accepted.

Patterns are one thing; running them well is another. The following best practices keep your cache steady in production:

  • Define freshness: Decide what “fresh enough” means for each object, set TTLs accordingly, and plan refreshes for hot keys.
  • Invalidate precisely: Clear or update cached data on writes by key or tag so stale values don’t linger.
  • Avoid stampedes: Let one request repopulate a missing key while others wait; modern libraries handle this for you (a DIY sketch follows this list).
  • Plan for failures: Use durable queues and idempotent writes for write-behind, and monitor latency for write-through.
  • Keep it lean: Use compact serialization, clear key namespaces, and avoid large objects that trigger evictions.
  • Measure and adjust: Track hit ratios, miss penalties, and latency, then tune TTLs and policies based on real data. 
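If you are not yet on a library with built-in stampede protection, a per-key lock approximates single-flight refresh. A rough sketch (the factory delegate and TTL are whatever your read path supplies):

using System.Collections.Concurrent;

// Rough per-key single-flight sketch: only one caller rebuilds a missing entry; the rest wait.
public class SingleFlightCache(IMemoryCache cache)
{
    private readonly ConcurrentDictionary<string, SemaphoreSlim> _locks = new();

    public async Task<T?> GetOrCreateAsync<T>(string key, Func<Task<T?>> factory, TimeSpan ttl)
    {
        if (cache.TryGetValue(key, out T? value))
            return value;

        var gate = _locks.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));
        await gate.WaitAsync();
        try
        {
            // Re-check after acquiring the lock: another caller may have refilled the entry.
            if (cache.TryGetValue(key, out value))
                return value;

            value = await factory();
            if (value is not null)
                cache.Set(key, value, ttl);
            return value;
        }
        finally
        {
            gate.Release();
        }
    }
}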

While caching often lives at the web or API layer, there’s plenty you can optimize right inside your data layer too. 

ADO.NET caching opportunities 

Caching with ADO.NET is about capturing the repeatable parts of your data access, then refreshing them predictably. Three targets deliver the biggest wins. 

What to cache (and why) 

Start with the following: 

  • Common query results: Frequently requested, slow-to-generate result sets (e.g., “top products,” “recent posts”) can be held in memory to cut round-trips on every repeat read. This is a standard ADO.NET technique and fits scenarios where data changes less often than it’s read.
  • Lookup/reference tables: Country lists, status codes, feature flags, and other small dimensions are classic cache material: they’re read everywhere and updated rarely; cache once, reuse across requests.
  • Disconnected ADO.NET objects (DataSet/DataTable): These objects are designed to hold tabular data in memory and back entire screens or reports without requerying, ideal for read-heavy dashboards and exports. 

For apps that must react quickly to database changes, tie cache entries to SQL Server change notifications so an update in the source invalidates the corresponding cached data (ADO.NET supports this via SqlDependency and related mechanisms).
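As a rough illustration, the sketch below caches a lookup DataTable and registers a SqlDependency so a change in the source table evicts the entry. The connection string, table, and TTL are placeholders; SqlDependency assumes Service Broker is enabled on the database and SqlDependency.Start(connectionString) has been called once at startup.

using System.Data;
using Microsoft.Data.SqlClient;

// Illustrative change-notification cache for a small reference table.
public class CountryCache(IMemoryCache cache, string connectionString)
{
    public DataTable GetCountries()
    {
        if (cache.TryGetValue("ref:countries", out DataTable? cached) && cached is not null)
            return cached;

        using var conn = new SqlConnection(connectionString);
        using var cmd  = new SqlCommand("SELECT Id, Name FROM dbo.Countries", conn);

        // Register for a change notification on this command's result set
        // (the dependency must be created before the command executes).
        var dependency = new SqlDependency(cmd);
        dependency.OnChange += (_, _) => cache.Remove("ref:countries");

        conn.Open();
        var table = new DataTable();
        table.Load(cmd.ExecuteReader());

        cache.Set("ref:countries", table, TimeSpan.FromHours(1)); // TTL as a safety net
        return table;
    }
}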

How to choose: In-memory vs. distributed 

Dimension | In-memory (process-local) | Distributed (Redis / NCache / SQL Server)
Scope | Per node (not shared) | Shared across all instances
Latency | Lowest (no network hop) | Low, but includes a network hop
Best for | Small lookups, hot query results, single-instance apps, or farms with sticky sessions | Cloud/scale-out apps, blue/green deploys, autoscaling, shared dashboards/slices
Consistency model | Node-local; each node may diverge | Fleet-wide view; consistent across nodes (subject to TTLs)
Freshness strategy | Short TTLs; app-driven invalidation on writes | TTLs + coordinated invalidation; optional DB change notifications/dependencies
HA/Resilience | Loses entries on recycle/restart | Provider HA (replicas/clusters/AGs) and persistence options
Ops complexity | Minimal | Moderate (operating Redis/NCache/SQL provider, monitoring, failover)
Memory pressure | Eats app RAM; eviction impacts only that node | Centralized memory budget; eviction impacts all nodes
Payload guidance | Keep tiny; avoid giant DataSets | Can handle larger shared payloads, but keep serialization lean
Failure modes | Cold cache after deploy; per-node cache incoherence | Cache outage adds latency; design for graceful DB fallback
Typical TTL | Short (seconds–minutes) | Short–medium (seconds–minutes), tuned to staleness tolerance
Examples | Country list, status codes, feature flags, small read-mostly results | Popular grids/reports, shared product/pricing slices, cross-node session/state surrogates

Rule of thumb: keep small, universal lookups and tiny result sets in-process for sheer speed; promote shared or heavier ADO.NET payloads (grids, report slices) to a distributed cache so every instance sees the same value. 

When to cache, refresh, or bypass (best-practice checklist) 

Use these quick rules: 

  • Cache when reads dominate writes, the data is reused across requests, and you can state a freshness budget (e.g., “≤5 minutes”). Start with lookups and high-traffic queries.
  • Refresh via short, explicit TTLs for moderately dynamic data; event-driven invalidation (write-path invalidates or SQL change notifications) where freshness matters; and pre-warming before known peaks (reports, promos).
  • Bypass the cache for highly personalized, security-sensitive, or rapidly changing values (balances, stock levels) where the cost of staleness exceeds the latency saved. Use direct ADO.NET reads on those paths. 

Operational guardrails: keep payloads lean (avoid giant DataSets), use clear key namespaces, monitor hit ratio and miss penalty, and instrument invalidations so you can see when and why entries were evicted or refreshed. 

Expiration and eviction policies 

Caching only works if readers and operators can predict how fresh data is and who gets evicted first when memory tightens. 

1. Set freshness explicitly 

Make staleness a conscious choice, not an accident: 

  • Absolute expiration places a hard ceiling on staleness; it’s your SLA guardrail (“never older than 10 minutes”). Microsoft’s guidance treats absolute TTL as the definitive upper bound.
  • Sliding expiration keeps hot keys alive by extending TTL on access. However, a sliding-only policy can keep an entry alive forever; pair it with an absolute cap to prevent immortal keys. 

2. Control who leaves under pressure 

When memory fills up, caches remove entries based on their priority (Low, Normal, High, NeverRemove) and the total size limit. To make these priorities work properly, set a global size cap and record the size of each cached item. 
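A small sketch tying these levers together for IMemoryCache; the size units and TTLs are placeholders:

// Program.cs -- global size budget (units are app-defined; here 1 unit = 1 entry).
builder.Services.AddMemoryCache(o => o.SizeLimit = 10_000);

// Per-entry policy: sliding TTL for hot keys, absolute cap so nothing becomes immortal.
// "cache" is an injected IMemoryCache; "countries" is whatever value you are caching.
var options = new MemoryCacheEntryOptions
{
    SlidingExpiration               = TimeSpan.FromMinutes(3),
    AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10),
    Priority                        = CacheItemPriority.Low, // evicted first under pressure
    Size                            = 1
};
cache.Set("lookup:countries", countries, options);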

3. Know the server-side defaults 

ASP.NET Core Output Caching has conservative defaults: SizeLimit ≈ 100 MB, MaximumBodySize ≈ 64 MB, and DefaultExpiration ≈ 60 seconds. Larger responses won’t be cached, and when storage is full, new entries are blocked until others are evicted. Adjust these settings before assuming your cache works. 
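A configuration sketch that overrides those defaults at registration (the numbers are illustrative, not recommendations):

// Program.cs -- override the output-cache store defaults before relying on them.
builder.Services.AddOutputCache(options =>
{
    options.DefaultExpirationTimeSpan = TimeSpan.FromMinutes(5); // default is 60 seconds
    options.SizeLimit                 = 200 * 1024 * 1024;       // total store budget (default ~100 MB)
    options.MaximumBodySize           = 32 * 1024 * 1024;        // responses above this are not cached
});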

4. Monitor what users feel 

A 90% hit rate means little if cache misses take forever. Track hit/miss ratios, P95/P99 latency for misses, eviction causes, payload sizes, and network delay in distributed caches. Also, focus on reducing latency that users feel, not chasing pretty hit-rate numbers. 
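One lightweight way to expose those numbers is System.Diagnostics.Metrics, which OpenTelemetry can export; the meter and instrument names below are arbitrary, not a standard:

using System.Diagnostics;
using System.Diagnostics.Metrics;

// Illustrative cache metrics: hit/miss counters plus a histogram of the miss penalty.
public static class CacheMetrics
{
    private static readonly Meter Meter = new("MyApp.Cache");
    private static readonly Counter<long> Hits        = Meter.CreateCounter<long>("cache.hits");
    private static readonly Counter<long> Misses      = Meter.CreateCounter<long>("cache.misses");
    private static readonly Histogram<double> MissMs  = Meter.CreateHistogram<double>("cache.miss.duration", unit: "ms");

    public static void RecordHit() => Hits.Add(1);

    public static async Task<T> RecordMissAsync<T>(Func<Task<T>> loadFromSource)
    {
        var sw = Stopwatch.StartNew();
        var result = await loadFromSource(); // the miss penalty users actually feel
        Misses.Add(1);
        MissMs.Record(sw.Elapsed.TotalMilliseconds);
        return result;
    }
}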

Once these foundations are covered, the next challenge is scale.  

Advanced caching techniques 

This section moves beyond “store and TTL” into the operational patterns that keep latency flat under load. Here are the techniques to apply in production: 

  • Stampede prevention: When many requests hit an expired key at once, it can overload your system. Use a single-flight refresh so only one request updates the cache while others wait for the result. .NET 9’s HybridCache handles this automatically, reducing latency spikes.
  • Cache warming: For predictable traffic (like morning dashboards or promo pages), preload data at startup or before peak hours. This ensures faster first loads and less pressure on your backend (a hosted-service sketch follows this list).
  • Multi-level (L1/L2) caching: Combine fast in-memory caching (L1) with a shared distributed cache (L2). .NET 9’s HybridCache supports this model and adds tag-based invalidation to clear related entries efficiently after bulk updates. 
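For warming specifically, a hosted service that preloads known-hot keys before traffic arrives is a simple approach; the key list, IProductRepo, and the HybridCache usage below are illustrative:

// Illustrative warm-up: preload known-hot keys at startup.
public class CacheWarmupService(HybridCache cache, IProductRepo repo) : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        // The "hot" id list would normally come from configuration or usage analytics.
        foreach (var id in new[] { 1, 2, 3 })
        {
            await cache.GetOrCreateAsync(
                $"product:{id}",
                async ct => await repo.GetAsync(id, ct),
                cancellationToken: stoppingToken);
        }
    }
}

// Program.cs
builder.Services.AddHostedService<CacheWarmupService>();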

However, every cache entry starts with a key and ends with data. How you design both makes a big difference. 

Cache key design and serialization 

Here’s how you do it: 

  • Use a compact key pattern like domain:entity:id for consistency.
  • Add tenant or user prefixes when isolation is needed.
  • Keep keys short; most providers cap key length, and long keys add memory and bandwidth overhead on every call.
  • Plan for bulk invalidation with prefixes or tags.
  • Consider HybridCache as it supports tagging natively for easier cleanup. 

Once your key strategy is set, the next step is choosing the right serialization format. Here’s the rationale for your choice: 

  • Binary/compact (e.g., MessagePack): Best for high-traffic, distributed caches where speed and bandwidth matter. It produces smaller, faster payloads but isn’t human-readable. Always version your schemas to avoid breaking changes.
  • JSON: Ideal for readability and cross-language setups. Payloads are larger and slower to parse, so limit fields, compress if useful, and lock serializer settings (case, enums, etc.) for consistency. 

Treat caches like a real data layer: validate inputs, never store secrets, use TLS in transit, and encrypt sensitive data at rest. Define clear versioning and fallback rules for serialization to avoid corruption during updates. 
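A small sketch that combines both ideas: a namespaced key helper and one locked-down JsonSerializerOptions instance reused for every cache write (the names and settings are illustrative):

using System.Text.Json;
using System.Text.Json.Serialization;

public static class CacheConventions
{
    // domain:entity:id, with an optional tenant prefix when isolation is needed.
    public static string Key(string domain, string entity, object id, string? tenant = null) =>
        tenant is null ? $"{domain}:{entity}:{id}" : $"{tenant}:{domain}:{entity}:{id}";

    // One shared, explicitly configured serializer so every node reads and writes the same shape.
    public static readonly JsonSerializerOptions Json = new()
    {
        PropertyNamingPolicy   = JsonNamingPolicy.CamelCase,
        DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull,
        Converters             = { new JsonStringEnumConverter() }
    };
}

// Usage sketch:
// await cache.SetStringAsync(CacheConventions.Key("catalog", "product", 42),
//                            JsonSerializer.Serialize(product, CacheConventions.Json), ct);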

Common use cases for caching 

Caching drives real impact when it targets the right pressure points. Here are the scenarios where it delivers the most value: 

  1. Reducing database and API calls: A few queries usually handle most traffic. Caching results like top items or filtered lists reduces database load and keeps responses consistent.
  2. Storing frequently accessed data: Datasets such as countries, feature flags, or configs rarely change but are read constantly. Keeping them cached ensures instant reads and steady performance. 
  3. Supporting output caching: ASP.NET Core Output Caching can skip endpoint execution for repeated requests. Combined with underlying object caches, it creates a faster, more efficient response path.  

Knowing where caching helps is essential, but knowing how it’s performing is just as important.  

Monitoring and debugging 

To keep caching reliable at scale, manage it like any other production system. Here’s how: 

  • Measure what matters: Track hit and miss rates, latency percentiles (P50, P95, P99), and miss penalties; those are what users feel. Watch eviction counts and reasons (TTL, memory pressure, manual clears) and monitor payload size and (de)serialization time. In distributed setups, include round-trip latency, connection health, and failover events.
  • Use the right tools: Instrument cache calls with .NET diagnostics and structured logs. Add OpenTelemetry metrics and traces to link cache behavior with database or API load. For deeper visibility, use provider tools like Redis CLI or dashboards to monitor memory, keyspace stats, ops/sec, and slowlogs.
  • Fix recurring issues: Low hit rates often mean you’re caching the wrong queries. Sliding-only TTLs can create “immortal” hot keys—set absolute limits. Stale reads point to broken invalidation paths or missing change notifications. And if large objects keep triggering evictions, compress or split them to fit the cache profile. 

The future of caching in .NET

Caching is evolving fast, and the next wave of improvements will focus on simplicity, visibility, and cloud readiness. Here's what to expect, and how to prepare.

  • HybridCache sets the tone: With .NET 9, HybridCache moves caching toward a standardized model. Expect built-in L1/L2 layering, single-flight refreshes, and tag-based invalidation to replace much of today’s custom caching logic.
  • Cloud-native becomes the baseline: The shift to managed distributed caches will deepen, prioritizing scalability, resilience, and reduced ops friction. Stateless apps with shared caches will dominate autoscaling environments, while OpenTelemetry brings end-to-end cache visibility across microservices.
  • Edge and CDN convergence: The line between app and edge caching will continue to blur. Smarter Cache-Control and Vary headers will push more responses to CDNs, leaving server Output Caching for dynamic but repeatable content. 

Preparing for the shift 

Teams that benchmark now, upgrade to .NET 9, and validate distributed configurations will adapt faster. Use this checklist to guide your next iteration: 

  • Benchmark current latency, hit/miss ratios, and miss penalties.
  • Upgrade to .NET 9 and adopt HybridCache with clear expiration and tagging rules.
  • Verify distributed cache provider settings and rehearse failovers.
  • Align CDN and edge headers with server cache policies.
  • Load-test hot paths and enable detailed telemetry before launch. 

Conclusion 

Pick the right type and interaction pattern for your topology and freshness budget: in-memory for single-node speed, output caching for repeatable responses, distributed for fleet coherence, and HybridCache when you want speed and scale without DIY layers. Enforce clear expiration/eviction, disciplined keys/serialization, and a few advanced techniques (single-flight, warming, L1/L2).  

Finally, make observability non-negotiable, prove with metrics and traces that caching cuts latency and offloads your DB/API, and keep those numbers in your migration plan. 

Dereck Mushingairi
I’m a technical content writer who loves turning complex topics—think SQL, connectors, and backend chaos—into content that actually makes sense (and maybe even makes you smile). I write for devs, data folks, and curious minds who want less fluff and more clarity. When I’m not wrangling words, you’ll find me dancing salsa, or hopping between cities.