Why caching matters
Caching is one of the highest ROI optimizations for read-heavy APIs. It reduces latency and saves backend resources.
HTTP provides native cache semantics; using them correctly gives you free performance from browsers, CDNs, and proxies.
Cache-Control basics
Cache-Control tells clients and intermediaries how long a response stays fresh and who may store it. `public` allows shared caches such as CDNs and proxies to store the response, `private` restricts storage to the end user's own browser cache, and `max-age` sets the freshness lifetime in seconds.
Set conservative defaults and tune per endpoint. Not all data should be cached equally.
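One way to keep per-endpoint tuning manageable is a small policy table with a conservative default. The endpoint names and TTLs below are illustrative assumptions, not values from this article:

```python
# Per-endpoint Cache-Control defaults; anything unknown falls back to no-store.
# Endpoint names and TTLs are hypothetical examples.
CACHE_POLICIES = {
    "catalog": "public, max-age=3600",                  # stable reference data
    "search": "public, max-age=60",                     # dynamic but shareable
    "account": "private, max-age=0, must-revalidate",   # user-specific
}

def cache_control_for(endpoint: str) -> str:
    """Return the Cache-Control value for an endpoint.

    Unknown endpoints get the safest policy (no caching at all).
    """
    return CACHE_POLICIES.get(endpoint, "no-store")
```

Defaulting to `no-store` means a newly added endpoint is never cached by accident; caching becomes an explicit, per-endpoint decision.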
ETags and conditional requests
ETags let clients check whether a resource has changed without downloading it again. The server sends an ETag with each response; on revalidation the client echoes it back in If-None-Match, and the server returns 304 Not Modified if the resource is unchanged.
This reduces bandwidth while keeping responses fresh. It is especially useful for large or slow-changing resources.
Last-Modified headers
Last-Modified is simpler than ETags and works when you have a reliable update timestamp.
It is less precise but easy to compute, and often good enough.
Caching with authentication
Authenticated responses can be cached safely with `Cache-Control: private` and the right headers, so only the user's own browser stores them. Shared caches such as CDNs and proxies must never store user-specific data.
Use `Vary` headers when responses depend on auth, locale, or user-specific flags.
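Putting those two rules together, a response builder for user-specific data might look like this sketch (the helper name and the 60-second TTL are illustrative assumptions):

```python
def auth_response_headers(varies_on: tuple[str, ...] = ("Authorization",)) -> dict[str, str]:
    """Headers for a user-specific response.

    `private` keeps the response out of shared caches, and `Vary` tells any
    cache to key separately on the listed request headers.
    """
    return {
        "Cache-Control": "private, max-age=60",
        "Vary": ", ".join(varies_on),
    }
```

For a localized, authenticated response you would vary on both the credential and the locale, e.g. `auth_response_headers(("Authorization", "Accept-Language"))`.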
Invalidation strategies
Caching is only useful if invalidation is correct. Use shorter TTLs for dynamic data and longer TTLs for stable resources.
When data changes, update ETags or purge caches so clients refresh.
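One simple way to make updates self-invalidating is a version-based ETag: every write bumps a version counter, so cached copies fail their next revalidation automatically. This toy in-memory store is a sketch of the idea, not a production implementation:

```python
class VersionedResource:
    """Toy store where every write changes the ETag.

    A cached client revalidating with the old ETag gets a fresh 200
    instead of a 304, because the tag no longer matches.
    """

    def __init__(self, data):
        self.data = data
        self.version = 1

    @property
    def etag(self) -> str:
        return f'"v{self.version}"'

    def update(self, data) -> None:
        self.data = data
        self.version += 1  # old ETags stop matching from here on
```

This avoids explicit purge calls for revalidating clients; a CDN purge is still needed for copies being served without revalidation inside their TTL.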
A practical caching recipe
For read-heavy endpoints: set Cache-Control, return ETags, and support 304. For sensitive data: use private caching with short TTLs.
Measure cache hit rates and adjust over time. Caching is a tuning tool, not a one-time setting.
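Measuring is straightforward if you count revalidation outcomes as responses go out. A minimal counter, assuming 304s are recorded as hits and full 200s as misses:

```python
class CacheStats:
    """Track revalidation outcomes so TTLs can be tuned from real traffic."""

    def __init__(self) -> None:
        self.hits = 0    # 304 Not Modified served
        self.misses = 0  # full responses served

    def record(self, status: int) -> None:
        if status == 304:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A persistently low hit rate on an endpoint suggests its TTL is too short, its ETag changes too often, or the data is simply too dynamic to cache.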
