2026-06-22 · ~21 min read

The CDN decision record: why the CDN you chose determines your cache invalidation latency and your origin shield cost at global scale

CDN selection looks like a routing detail until a pricing bug is patched in four minutes of deploy time but cached API responses keep serving the wrong entitlements to users for the next hour — because the team cached API responses without documenting the invalidation procedure, the propagation latency, or what maximum staleness was acceptable for pricing data. The CDN you chose determines your cache invalidation model, your origin shield architecture, and your per-GB cost floor at scale — and all three are set at selection time, before you have any of the production incidents that make the structural constraints visible.

A 20-person SaaS company chose Amazon CloudFront in year one. They were already on AWS for everything else — RDS, EC2, S3 for asset storage. The engineering lead opened a ChatGPT session ("what CDN should I use for a React frontend on AWS?"), and within an afternoon had a working CloudFront distribution pointing at an S3 bucket for static assets and an ALB for API requests. The tutorial covered creating a distribution, pointing it at S3, setting up an origin access control policy, configuring a custom domain with an ACM certificate, and enabling HTTPS. It was thorough. The session closed when the first CloudFront URL returned the correct HTML file. Nobody asked about the cache invalidation model. Nobody asked about what happened to CDN-cached API responses when a bug was deployed. Nobody asked about the geographic price tiers for bandwidth outside North America, or what the per-path invalidation billing threshold was, or whether CloudFront's origin shield would be in the critical path for invalidation propagation.

Eighteen months later, the product had grown to serve four geographic regions — North America, Western Europe, Southeast Asia, and Australia. The company shipped a React SPA with a standard webpack setup: static assets got content-hash filenames on every build (main.4f2a91c.js, vendor.38bc7d.js). A new developer, tasked with reducing the JavaScript bundle size, modified the webpack code-splitting configuration. The change introduced a new code-splitting strategy that moved a large vendor dependency into a named chunk: chunk-recharts.js. This filename was explicit and fixed — it did not receive a content hash, because the developer had disabled hashing for the new chunk while debugging the split and then committed the change without reverting that flag. On deploy, the new index.html was uploaded to S3 and immediately available via CloudFront. The new chunk-recharts.js was also uploaded. But CloudFront's edge PoPs had a 24-hour TTL on S3-served assets. The old index.html (from the prior build, content-cached with its own hash) expired and rolled over correctly. The new chunk-recharts.js — identical filename to what any future build would also produce — hit a cold cache on first request and loaded correctly. So far so good.

Two weeks later, a second deploy updated chunk-recharts.js with a dependency upgrade. The file was re-uploaded to S3 with the same filename. CloudFront edge PoPs worldwide were still serving the old version from cache, well within the 24-hour TTL. Users who loaded the app after the deploy got the new index.html (the index file had a 0-second CDN TTL and a 60-second browser TTL, which the engineer had configured correctly), and that index.html referenced chunk-recharts.js. The edge served the cached old copy of that chunk, whose interface was incompatible with the newly shipped code. The app crashed on load with a JavaScript runtime error that appeared in Sentry as TypeError: r.useQuery is not a function. One thousand users in Southeast Asia, where the APAC CloudFront edge PoPs were still serving the previous day's cached version, experienced a broken application for two hours before the engineering team identified the cause and issued a cache invalidation.

The cache invalidation took four minutes to propagate across all CloudFront edge locations globally. There was no formal documentation of that figure. The on-call engineer had assumed cache invalidations completed in seconds, because the first few invalidations tested during initial setup had propagated in under 30 seconds to the one US edge location they were testing from. They did not know that Southeast Asian edge locations had different propagation timelines, or that simultaneous invalidations across a large distribution took longer to propagate than single-file invalidations of a small distribution. During those four minutes, they refreshed the monitoring dashboard every 30 seconds, not knowing whether the invalidation was in progress or stuck. No runbook existed for "verify that a cache invalidation has propagated across all regions." The application recovered for all 1,000 affected users at the four-minute mark. The postmortem produced a commitment to use only content-hashed filenames going forward. It did not produce a cache invalidation propagation latency benchmark, a maximum accepted staleness policy by content type, or a documented procedure for verifying that an invalidation had propagated to a specific region.
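The commitment to content-hashed filenames can be enforced by a build check rather than by convention. The following is a minimal sketch, assuming build outputs follow the name.hash.ext pattern of the examples above (main.4f2a91c.js). The function name, the 6-to-20 hex-character rule, and the exemption list are illustrative assumptions, not the team's actual pipeline.

```python
import re

# Assumption: a content hash is 6-20 lowercase hex characters between the
# base name and the extension, as in main.4f2a91c.js. Adjust to your bundler.
HASHED_NAME = re.compile(r"^.+\.[0-9a-f]{6,20}\.(js|css)$")

# Files that are intentionally unhashed and must be served with a short TTL.
EXEMPT = {"index.html"}


def unhashed_assets(filenames):
    """Return build outputs that would be cached under a reusable name."""
    return [
        name
        for name in filenames
        if name not in EXEMPT and not HASHED_NAME.match(name)
    ]


build_output = [
    "index.html",
    "main.4f2a91c.js",
    "vendor.38bc7d.js",
    "chunk-recharts.js",
]
print(unhashed_assets(build_output))  # ['chunk-recharts.js']
```

Failing the build when this list is non-empty would have stopped the fixed-name chunk before it reached a 24-hour edge cache.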

The second incident was independent and more consequential. Six months prior to the broken-bundle incident, the API team had added caching headers to several read-heavy API endpoints to reduce ALB and application server load. Pricing data (plan entitlements, feature flags per tier, per-seat pricing for the Team plan) was served with a Cache-Control: public, s-maxage=3600 header, meaning CloudFront would cache the response for up to one hour before revalidating with the origin. The response time for pricing API calls dropped from 180ms to 4ms for cache hits. Origin request volume for the pricing endpoint fell by 97%. The configuration change was a four-line diff merged without architectural review because it looked like a performance improvement, not a caching architecture decision.

Eight months later, a bug was discovered in the pricing entitlement logic: a recent code change had introduced a conditional that incorrectly granted Pro-tier users access to a Team-tier feature (custom retention policy settings) without upgrading their plan. The bug had been live for 11 days. The fix was deployed in four minutes. Sentry confirmed the error rate went to zero. But the CloudFront cache continued serving cached pricing responses — including the erroneous feature entitlement — to users hitting the cached PoPs for up to one hour after the deploy. Users who received a cache-hit response still saw the Team-tier feature in their UI. Users who received a cache-miss response (after a cache expiry or a first-time cache population) saw the correct restricted behavior. The divergent experience created a support wave: users who had seen the Team feature, then saw it disappear, filed tickets saying features they had been using were suddenly removed.

The incident engineer deployed the fix and confirmed the error in application monitoring went to zero. They did not realize a CDN cache invalidation was also required. An experienced team member on the incident call remembered the caching headers and suggested an invalidation. The invalidation was issued 47 minutes after the deploy. It propagated in 3.5 minutes. For 50.5 minutes after the correct code was live, some users saw wrong pricing entitlements. The postmortem produced a commitment to add a deployment step that invalidated the pricing endpoints on every release. It did not produce documentation of which other endpoints were also cached with multi-minute TTLs, what the maximum accepted staleness was for each cached endpoint category, or what the propagation latency SLA was for CloudFront invalidations under different load conditions. Those artifacts — the decisions about which content is safe to cache for how long and what the invalidation procedure is for each category — were never written down before this incident, and were only partially written down after it.

The three structural properties that CDN selection determines

When teams evaluate CDNs, the conversation focuses on geographic coverage (number of edge PoPs), AWS or GCP ecosystem integration, HTTPS termination, and the time required to get the first cached asset serving correctly. These are real evaluation criteria. The structural properties that determine whether the CDN selection ages well — whether incidents occur, whether costs scale predictably, whether the deploy procedure is reliable under incident pressure — are different, and they are set at selection time.

Cache invalidation model: latency, granularity, and cost

The cache invalidation model of a CDN has four dimensions: propagation latency (how long between issuing a purge and all edge PoPs serving fresh content), granularity (single URL, prefix, wildcard, or cache tag), cost per invalidation at the team's deployment cadence, and the interplay between TTL expiry and explicit purge as the two separate paths to freshness.

Propagation latency is the most consequential and least-documented dimension. Different CDNs have structurally different invalidation architectures. Cloudflare uses anycast routing across its entire network: a purge signal issued to the Cloudflare API propagates to all edge PoPs simultaneously via the anycast control plane. Global propagation completes in under 150 milliseconds. Fastly's instant purge uses a similar control-plane broadcast model and typically propagates in under 150 milliseconds with measured P50 values under 50 milliseconds. Amazon CloudFront uses an eventually-consistent propagation model with no contractual SLA: single-URL invalidations typically propagate across the full edge network in 60 to 90 seconds under normal conditions; wildcard invalidations that touch many edge locations frequently take 3 to 5 minutes; under high invalidation load across a large distribution, propagation can extend further. These latency differences are not footnotes — they determine whether a security patch, a pricing correction, or a GDPR data removal completes in seconds or in minutes, and whether the on-call engineer can confidently verify completion or must wait and refresh.

Granularity determines what can be purged atomically. Single-URL purge (purge one specific path) is available in every CDN. Prefix purge (purge all paths beginning with /api/pricing/) is available in Cloudflare Pro and Business, Fastly, and CloudFront with wildcard path syntax. Cache-tag purge — purging all resources that were served with a specific Cache-Tag: pricing-v1 or Surrogate-Key: user-123 response header — is available in Cloudflare Business and Enterprise, Fastly (via Surrogate-Key headers), and is not available in CloudFront natively. Cache-tag purge is the highest-fidelity option for content-heavy products: you tag a resource with its logical dependency keys at serve time (Cache-Tag: article-789,author-42), and when article 789 is updated, you purge all cached representations of it — including paginated index pages that include the article, search results that include it, related article widgets that reference it — with one API call listing the tag. Without cache-tag support, achieving equivalent precision requires either listing every affected URL explicitly (operationally complex) or purging a broader prefix (over-invalidating and increasing cache miss load on the origin).
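To make the cache-tag mechanism concrete, here is a toy in-memory model of tag-based purge. It is a sketch of the idea only: the TaggedCache class and its methods are hypothetical and do not correspond to any CDN's real API. The tag names follow the article-789 and author-42 example above.

```python
from collections import defaultdict


class TaggedCache:
    """Toy in-memory model of cache-tag purge; not any CDN's real API."""

    def __init__(self):
        self._entries = {}                    # url -> cached body
        self._urls_by_tag = defaultdict(set)  # tag -> urls stored with it

    def store(self, url, body, tags):
        """Cache a response and index it under each of its tags."""
        self._entries[url] = body
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def purge_tag(self, tag):
        """Evict every entry stored with `tag`; return the evicted URLs."""
        urls = self._urls_by_tag.pop(tag, set())
        for url in urls:
            self._entries.pop(url, None)
        return sorted(urls)

    def cached_urls(self):
        return sorted(self._entries)


cache = TaggedCache()
cache.store("/articles/789", "<article>", ["article-789", "author-42"])
cache.store("/index?page=2", "<list>", ["article-789", "article-790"])
cache.store("/articles/790", "<article>", ["article-790", "author-42"])

# One call removes the article page and the index page that embeds it.
print(cache.purge_tag("article-789"))  # ['/articles/789', '/index?page=2']
print(cache.cached_urls())             # ['/articles/790']
```

The point of the model is the index from tag to URLs: the caller names the logical dependency, and the cache resolves which representations to evict.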

The cost per invalidation at deployment cadence is the metric teams discover after it appears on a bill. CloudFront includes 1,000 invalidation path submissions per month at no charge; each additional path costs $0.005. A single wildcard (/*) counts as one path. A list of 300 specific file paths counts as 300 paths. Teams that deploy multiple times per day and invalidate each changed asset path individually can exceed the 1,000-path threshold within the first week of the month. Cloudflare includes unlimited invalidations on Pro and above. Fastly includes unlimited instant purges. Bunny.net includes unlimited purge API calls. For teams with high deploy cadence, CloudFront's per-path billing model after the free tier is a structural cost that needs to be modeled as a function of deployment cadence × assets invalidated per deploy, not evaluated at initial CDN selection time when deploy frequency is low and the free tier is never exhausted.
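The cadence-times-paths model can be written down in a few lines. This sketch uses only the CloudFront figures quoted above (1,000 free paths per month, $0.005 per additional path). The function name and the 30-day month are assumptions for illustration.

```python
FREE_PATHS_PER_MONTH = 1_000   # free tier quoted above
PRICE_PER_EXTRA_PATH = 0.005   # USD per path beyond the free tier


def monthly_invalidation_cost(deploys_per_day, paths_per_deploy, days=30):
    """Estimated monthly invalidation charge in USD."""
    total_paths = deploys_per_day * paths_per_deploy * days
    billable = max(0, total_paths - FREE_PATHS_PER_MONTH)
    return billable * PRICE_PER_EXTRA_PATH


# 4 deploys/day, each invalidating 300 individual paths: 36,000 paths/month.
print(f"${monthly_invalidation_cost(4, 300):.2f}")  # $175.00

# Same cadence with a single wildcard path per deploy stays in the free tier.
print(f"${monthly_invalidation_cost(4, 1):.2f}")    # $0.00
```

The two cases show why the invalidation strategy, not just the deploy frequency, decides whether this line item exists.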

Origin shield topology: isolation, cost, and invalidation complexity

CDN origin shield is a designated caching PoP between the CDN's edge PoPs and the origin server. Without shield, cache misses at each edge PoP result in a direct request to the origin. If 60 edge PoPs worldwide all experience a cache miss for the same uncached resource within a short window (common after a cache invalidation when all edge caches are simultaneously cold), the origin receives up to 60 requests for the same resource in rapid succession. Shield collapses these: all 60 edge PoPs route their cache misses to the shield PoP, which either serves from its own shield cache or makes a single request to the origin. The origin sees one request, not sixty.

The structural consequence for cache invalidation is that the shield PoP is an additional cache layer in the hit path. A cache invalidation must propagate to both the edge PoPs and the shield PoP. With CloudFront origin shield, the invalidation signal propagates to all edge PoPs and to the shield PoP. But the order and timing of propagation across layers is not documented with an SLA. An edge PoP that receives the invalidation signal and experiences a cache miss will route that miss to the shield PoP. If the shield PoP has not yet received the invalidation signal and still has the old cached content, it serves the old content to the edge PoP, which re-caches the old content. The edge PoP shows fresh after the invalidation from the control plane perspective, but serves stale because it re-filled from the shield before the shield was cleared. This creates the false positive — an invalidation that reports success at the edge layer while the shield layer still propagates the old content.
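One way to guard against this false positive is to require several consecutive fresh observations per region before declaring an invalidation complete. The sketch below assumes a hypothetical probe(region) callable that fetches a version marker through a vantage point in that region. The function name, the streak of three, and the timing defaults are illustrative choices, not a documented CloudFront procedure.

```python
import time


def wait_for_propagation(probe, regions, expected, required_streak=3,
                         interval_s=5.0, timeout_s=600.0, sleep=time.sleep):
    """Poll until every region returns `expected` on `required_streak`
    consecutive probes. Returns the regions still unverified at timeout."""
    streak = {region: 0 for region in regions}
    waited = 0.0
    while True:
        for region in regions:
            if probe(region) == expected:
                streak[region] += 1
            else:
                streak[region] = 0  # a stale answer resets the count
        pending = {r for r, n in streak.items() if n < required_streak}
        if not pending or waited >= timeout_s:
            return pending
        sleep(interval_s)
        waited += interval_s


# Scripted stand-in for real probes: the APAC edge flips back to "old" once,
# as it would after re-filling from a shield that has not been cleared yet.
scripted = {
    "us-east": iter(["new"] * 10),
    "ap-southeast": iter(["old", "new", "old", "new", "new", "new"]),
}


def fake_probe(region):
    return next(scripted[region])


pending = wait_for_propagation(fake_probe, list(scripted), expected="new",
                               sleep=lambda _s: None)
print(pending)  # set()
```

A single fresh response from the nearest edge would have passed after the second poll here; the streak requirement is what catches the regression on the third.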

Amazon CloudFront charges origin shield bandwidth at $0.0075 per GB of data transferred from origin to shield PoP, in addition to standard origin bandwidth charges. For products with high cache miss rates (frequently changing content) or large response sizes, origin shield adds measurable cost. For products with high cache hit rates and stable content (static site, asset CDN), origin shield's cost is low and its origin load reduction is significant. The appropriate model is to document the origin shield PoP choice, the cost formula at current and projected bandwidth, and whether origin shield is in the critical path for cache invalidation — then treat that documentation as load-bearing for the deploy procedure.

Cloudflare's Tiered Cache (origin shield equivalent) is included in all plans, with Smart Tiered Cache selecting the optimal upper-tier PoP automatically based on network topology. Fastly's shielding is included and configured in VCL with a set req.backend = shield; declaration for cache misses. The structural invalidation behavior — that the shield layer is an additional cache that must be invalidated — applies across CDNs; the difference is whether it costs extra (CloudFront) and whether the invalidation propagation model accounts for the shield in its latency guarantee (Cloudflare and Fastly's instant purge models propagate through all cache layers simultaneously).

Cost model: per-GB, per-request, flat, and geographic variance

CDN cost models divide into four structural types, and the type determines how costs scale with different growth vectors. Per-GB bandwidth billing (CloudFront, Fastly, Bunny.net) scales with data transferred — a video streaming product or large file delivery service pays proportionally to bytes served. Per-request billing (CloudFront adds this on top of bandwidth) scales with request count regardless of payload size — a high-request-volume API with small responses pays proportionally to request count. Flat monthly billing (Cloudflare Pro and Business for most sites) scales with neither — the cost is fixed regardless of bandwidth or request volume, which is a structural advantage for products with unpredictable traffic spikes. Enterprise committed-traffic pricing (Cloudflare Enterprise, Fastly volume discounts, CloudFront's savings bundles) fixes a monthly bandwidth commitment at a lower per-GB rate.

Geographic pricing variance is the dimension most often missing from CDN cost models at selection time. CloudFront bills bandwidth by the region of the edge location that serves the request: $0.085 per GB for the first 10TB served from US and European edge locations, $0.14 per GB from South America, $0.09–0.14 per GB from APAC regions, and $0.19 per GB from Australia. A product with 40% of traffic in APAC pays the APAC rate for that fraction of bandwidth, meaningfully higher than the US rate that tutorials and pricing calculators default to. CloudFront also charges per-HTTPS-request at $0.012 per 10,000 requests, meaning a high-request-volume API with small responses accumulates request charges independently of bandwidth charges. Cloudflare Pro charges a flat $20 per month with no bandwidth or request charges beyond WAF and Workers usage. Bunny.net charges $0.01 per GB in the US and EU and $0.03 per GB in APAC, lower than CloudFront in both geographies. Fastly charges $0.12 per GB with volume discounts starting at $10,000/month committed spend.
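A blended-rate calculation makes the geographic effect visible. The sketch below uses the CloudFront figures quoted above and takes the upper end of the APAC range as a conservative assumption. The function name, the traffic mix, and the request volume are invented for illustration.

```python
import math

# USD per GB, from the figures quoted above (first 10TB tier).
RATES_PER_GB = {
    "us_eu": 0.085,
    "south_america": 0.14,
    "apac": 0.14,        # assumption: upper end of the 0.09-0.14 range
    "australia": 0.19,
}
REQUEST_PRICE_PER_10K = 0.012  # USD per 10,000 HTTPS requests


def monthly_cost(total_gb, traffic_share, https_requests):
    """Bandwidth plus request charges for a given regional traffic mix."""
    if not math.isclose(sum(traffic_share.values()), 1.0):
        raise ValueError("traffic shares must sum to 1.0")
    bandwidth = sum(
        total_gb * share * RATES_PER_GB[region]
        for region, share in traffic_share.items()
    )
    requests = https_requests / 10_000 * REQUEST_PRICE_PER_10K
    return bandwidth + requests


us_only = monthly_cost(10_000, {"us_eu": 1.0}, 500_000_000)
mixed = monthly_cost(10_000, {"us_eu": 0.6, "apac": 0.4}, 500_000_000)
print(f"US/EU only:        ${us_only:,.2f}")  # $1,450.00
print(f"60/40 US-EU/APAC:  ${mixed:,.2f}")    # $1,670.00
```

In this example the request charges ($600) are a large share of the total, which is the case the bandwidth-only estimate misses.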

The cost model intersects with the multi-region deployment decision directly: the CDN geographic coverage and pricing tier structure should inform which origin regions to deploy to, because placing an origin closer to an edge PoP reduces cross-region bandwidth charges between origin and shield, reduces CDN-to-origin latency for cache misses, and can shift traffic into a lower-priced bandwidth tier. A team that has documented both the CDN cost model and the multi-region deployment strategy can optimize the origin placement to minimize the combined cost; a team that documented neither makes both decisions independently and discovers the interaction at the first bill that includes significant APAC traffic.

The options and their structural tradeoffs

Cloudflare

Cloudflare operates one of the largest anycast networks, with over 310 PoPs in 120 countries. The anycast architecture means any request to a Cloudflare-fronted domain routes to the nearest Cloudflare PoP based on BGP routing, and cache invalidations propagate to all PoPs simultaneously via the same anycast control plane. Global cache purge propagation completes in under 150 milliseconds across the network; this is a structural property of the architecture, not a service level agreement that averages out under load. Cloudflare supports single-URL purge, prefix purge on Pro plans and above, and cache-tag purge (via Cache-Tag response headers or the cf-cache-tag header) on Business plans and above. Cache-tag purge allows purging all cached representations sharing a tag with one API call, enabling precise invalidation of logically related resources without over-invalidating unrelated content.

Cloudflare's bandwidth billing model is structurally different from all other major CDNs: Cloudflare Pro ($20/month) and Business ($200/month) include unmetered bandwidth for HTTP/HTTPS traffic. There are no per-GB charges for bandwidth on paid plans. This makes Cloudflare's cost model insensitive to traffic spikes — a viral product moment that generates 100× normal traffic produces no CDN bandwidth overage. The cost ceiling is the plan price. The exceptions are Cloudflare Workers (edge compute, billed at $0.30 per million requests beyond the included 10 million), Cloudflare Stream (video streaming, billed per minute stored and per minute delivered), and Cloudflare R2 (object storage, billed per GB stored and per operation — though R2 has no egress charges between R2 and Cloudflare's network). DDoS protection and WAF are included in all plans, including the Free plan for basic DDoS and the Pro plan for the managed WAF ruleset. Teams that would otherwise pay separately for CDN bandwidth and a WAF/DDoS product (CloudFront + AWS Shield Advanced, or CloudFront + Imperva) often find Cloudflare's all-included pricing model favorable once total infrastructure cost is calculated.

Cloudflare Workers provide edge compute running JavaScript (and other runtimes via Wasm) at the Cloudflare edge layer. Workers can perform request routing, A/B routing, authentication header injection, URL normalization for cache key stability, and response transformation without an origin request. Workers are billed at $5 per month for the first 10 million requests, then $0.30 per million. The Workers model is appropriate for lightweight edge logic that should execute before the cache or origin is consulted; it is not appropriate for long-running compute or stateful operations. The API gateway selection interacts with CDN edge compute — Cloudflare Workers can serve as a lightweight API gateway layer for routing, rate limiting, and auth header injection, reducing the need for a separate managed API gateway product for some use cases. Documenting whether Workers are in scope at CDN selection time prevents the later discovery that edge compute requirements push the CDN tier from Pro to Business.

Amazon CloudFront

CloudFront is the appropriate CDN for teams with deep AWS integration requirements: native origin access control for S3 (restricting bucket access to CloudFront only, without making the bucket public), direct integration with ALB and API Gateway as HTTP origins, Lambda@Edge for server-side rendering and request manipulation at the edge (running Node.js or Python Lambda functions at CloudFront edge locations, billed at $0.60 per million requests plus compute time), CloudFront Functions for lightweight JavaScript at the edge ($0.10 per million invocations, 6× cheaper per request than Lambda@Edge for simple URL normalization or header manipulation), and native integration with AWS WAF, AWS Shield, and ACM for certificate management. Teams that prefer to manage all infrastructure within a single AWS account (unified billing, IAM-based access control, CloudTrail audit logging) benefit from CloudFront's deep integration with the AWS service catalog.

CloudFront's per-GB bandwidth pricing ($0.085/GB for US/EU, higher for other regions) plus per-HTTPS-request pricing ($0.012 per 10,000 requests) makes cost modeling straightforward but requires explicit projection. A product serving 10TB/month from US edge locations pays $850 in CloudFront bandwidth plus request charges. The same product on Cloudflare Pro pays $20/month flat. At 1TB/month the CloudFront cost is $85, so the gap to Cloudflare Pro's $20 is small in absolute terms and may be worth paying if the AWS integration advantages are real. The breakeven where CloudFront's per-GB model crosses Cloudflare's flat model depends on the plan tier and the bandwidth volume; for most teams, the crossover sits well below typical production traffic volumes, so Cloudflare's flat model represents lower total cost at production scale. The build-versus-buy analysis for CloudFront versus Cloudflare should include this crossover calculation, not assume that AWS ecosystem fit justifies whichever CDN cost the company is paying without modeling the alternative.
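The crossover itself is a one-line division. This sketch uses the prices quoted above and deliberately ignores CloudFront's per-request charges, which would push the breakeven lower still. The function name is an illustrative choice.

```python
def breakeven_gb(flat_monthly_usd, per_gb_usd):
    """Monthly bandwidth at which per-GB billing equals a flat plan price."""
    return flat_monthly_usd / per_gb_usd


# Cloudflare Pro ($20) and Business ($200) against CloudFront US/EU ($0.085/GB).
print(f"Pro breakeven:      {breakeven_gb(20, 0.085):.0f} GB/month")   # 235
print(f"Business breakeven: {breakeven_gb(200, 0.085):.0f} GB/month")  # 2353
```

Under these assumptions a product serving more than roughly a quarter of a terabyte per month is already past the Pro-plan crossover on bandwidth alone.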

CloudFront origin shield is a paid feature at $0.0075/GB from origin to shield PoP. The 13 available shield regions are AWS regions (us-east-1, us-east-2, us-west-2, ap-northeast-1, ap-southeast-1, ap-southeast-2, eu-central-1, eu-west-1, eu-west-2, sa-east-1, us-gov-west-1, ca-central-1, ap-northeast-2). The appropriate shield region is the one with the lowest latency to the origin server. Shield enables the cache collapse behavior described above and meaningfully reduces origin request rate for products with high cache miss rates. The origin shield cost should be modeled as a fraction of total bandwidth at the current cache hit rate and projected traffic: at a 90% cache hit rate, origin shield handles 10% of total bandwidth, so origin shield cost is approximately 10% × total bandwidth × $0.0075/GB. The documentation should make this formula explicit — including the fact that the cache hit rate will vary by endpoint category (static assets approach 99%+ hit rate; authenticated API responses with user-specific data approach 0% hit rate and must be excluded from CDN caching entirely).
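The shield cost formula is easier to keep honest when it is computed per endpoint category rather than with one global hit rate. The sketch below applies the formula stated above, (1 - hit rate) × bandwidth × $0.0075/GB. The category names, bandwidth figures, and hit rates are invented for illustration.

```python
SHIELD_PRICE_PER_GB = 0.0075  # USD, figure quoted above


def shield_cost(total_gb, hit_rate):
    """Shield charge for the traffic that misses at the edge."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return (1.0 - hit_rate) * total_gb * SHIELD_PRICE_PER_GB


# Hypothetical monthly traffic by category: (GB served, edge cache hit rate).
categories = {
    "static assets": (8_000, 0.99),
    "cacheable API": (2_000, 0.90),
}

total = 0.0
for name, (gb, hit_rate) in categories.items():
    cost = shield_cost(gb, hit_rate)
    total += cost
    print(f"{name}: ${cost:.2f}")
print(f"total: ${total:.2f}")
```

Splitting by category shows where the misses come from: in this example the smaller API category accounts for most of the shield charge.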

Fastly

Fastly is the CDN with the most powerful programmatic caching model. Edge logic is written in VCL (Varnish Configuration Language), a domain-specific language purpose-built for HTTP caching that can express cache key construction, request and response manipulation, cache freshness policy, and backend routing with granularity that CDN-level configuration UI cannot match. A Fastly VCL configuration can set cache TTLs differently based on response status code (200 cached for 10 minutes, 404 cached for 30 seconds, 503 not cached), construct cache keys from custom header combinations (cache responses per user language header and device type, but not per session cookie), implement stale-while-revalidate with explicit stale window durations (serve stale content up to 30 seconds while fetching fresh from origin in the background), and route requests to different origin pools based on URL pattern or request headers. This level of control is appropriate for teams with complex caching requirements that would otherwise require application-level workarounds or a custom reverse-proxy layer.

Fastly's instant purge API propagates globally in under 150 milliseconds, frequently under 50 milliseconds measured. Fastly Surrogate-Key headers (equivalent to Cloudflare Cache-Tag) are included at all pricing tiers — attaching a Surrogate-Key: article-789 author-42 header to responses allows purging all representations tagged article-789 with a single API call. Fastly's real-time log streaming delivers CDN access logs to external observability platforms — Splunk, Datadog, BigQuery, GCS, S3, Kafka — with under 1-second latency, making CDN access log data available in the same analysis pipeline as application logs without a delay. This integrates directly with the logging infrastructure decision: teams that have already selected a centralized log platform get CDN access logs in the same destination without additional tooling.

Fastly charges per-GB at $0.12/GB with volume discounts beginning around $10,000/month committed spend, plus per-request charges for some regions. This pricing model is appropriate for content-heavy products where VCL's caching control justifies the higher per-GB cost versus Cloudflare's flat model or Bunny.net's lower per-GB rate. Fastly is a poor fit for high-bandwidth/low-complexity use cases (video streaming, large file delivery) where bandwidth cost is the primary variable and VCL control is not needed — Bunny.net's $0.005–0.03/GB pricing represents substantially lower total cost for equivalent bandwidth with simpler caching requirements. Fastly is also a poor fit for teams with no VCL knowledge and no time to develop it — the configuration model requires more upfront investment than CloudFront's managed distribution UI or Cloudflare's dashboard-driven rules.

Bunny.net (BunnyCDN)

Bunny.net is a bandwidth-cost-optimized CDN with a simple product model. Pricing is per-GB by region: $0.01/GB in North America and Europe, $0.03/GB in Asia-Pacific, $0.06/GB in South America and Africa — consistently 3–8× lower than CloudFront at equivalent geographies and 4–12× lower than Fastly. There are no minimum commitments, no per-request charges beyond the per-GB rate, and no separate charges for HTTPS, custom certificates, or DDoS protection (network-level DDoS filtering is included). The purge API supports single-URL and full-zone purge and propagates instantly. Bunny.net covers 120 PoPs across all major geographies. The product is appropriate for high-bandwidth, low-complexity use cases: video delivery, large file downloads, static asset serving for products with straightforward caching requirements and high user sensitivity to bandwidth cost.

Bunny.net does not include edge compute, WAF managed rulesets, or cache-tag purge at the granularity of Cloudflare or Fastly. Configuration is through a dashboard and a simple edge scripting API (BunnyScript), not VCL. Teams that need WAF, DDoS mitigation beyond network-level filtering, complex edge routing logic, or programmatic caching rules based on response content will find Bunny.net insufficient and need to layer another service in front of it or choose a CDN with those capabilities included. Bunny.net's appropriate position is as the bandwidth layer behind a WAF/security proxy (Cloudflare in proxy mode with Bunny.net as the origin, for teams that want Cloudflare's security model without Cloudflare's asset serving — though this arrangement is architecturally complex and should be explicitly documented if used), or as the primary CDN for products where simplicity and bandwidth cost are the dominant evaluation criteria and the product's security and caching requirements are met by application-level controls.

The AI chat sessions that produced undocumented decisions

CDN decisions are made across a cluster of sessions that feel like deployment configuration rather than architecture decisions. The initial setup session selects the vendor and gets the first cached asset serving correctly. Subsequent sessions add features — API response caching, origin shield, new content types, deploy automation — each solving an immediate performance or cost problem without revisiting the structural constraints established in earlier sessions. The decisions about cache freshness policy, invalidation procedure, and cost model accumulate silently, each individually reasonable, until a deploy-day incident makes the structural gaps visible.

The initial CDN selection session — "what CDN should I use for a React app on AWS?" — produces the vendor choice without any invalidation latency benchmarking, any cost modeling at 10× current traffic, or any documentation of what cache invalidation would be required when a breaking change was deployed. In 2022, the session would likely have recommended CloudFront for an AWS-native team and produced a working distribution within an afternoon. The session solved the immediate problem — static assets served globally with low latency — and closed. The constraints of the solution (eventually-consistent invalidation with no propagation SLA, per-path billing after 1,000 paths/month) were not externalized into a document. See the structural pattern in decisions never written down: the session closes when the first asset loads from the CDN, and the structural properties of the CDN choice are never externalized.

The API response caching session — "how do I cache API responses at CloudFront to reduce origin load?" — produces Cache-Control headers and CloudFront behavior configurations that cache API responses with multi-minute TTLs. The ChatGPT response explains s-maxage and how to configure CloudFront cache behaviors per path pattern. It may correctly explain that pricing endpoints should be cached with shorter TTLs than static content. What it does not produce is documentation of the maximum accepted staleness per endpoint category, what the cache invalidation procedure is for that endpoint on deploy, what the propagation latency of a CloudFront invalidation is for that distribution, or which endpoints are included in the automatic deploy-triggered invalidation step of the CI/CD pipeline. The session treats "API responses are now cached" as the success criterion. The operational consequence — that a bug fix deploy requires both a code deploy and a cache invalidation, and that the cache invalidation has a measurable propagation delay — is not in scope because the session was a performance optimization session, not a deploy procedure session. The CI/CD pipeline decision record should reference the CDN invalidation step explicitly, and the CDN ADR should document which endpoints require invalidation on deploy; in practice, neither document exists and the invalidation step is discovered during the incident.
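The missing artifact can be as small as a table of cache policies kept in the repository next to the deploy script. The sketch below is one possible shape for it. The CachePolicy class, the category names, the staleness values, and the path patterns are all hypothetical examples, not the company's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachePolicy:
    category: str
    path_pattern: str
    max_staleness_s: int        # longest the business accepts stale data
    invalidate_on_deploy: bool  # must the deploy pipeline purge this?

    def cache_control(self):
        """Header value a shared cache such as a CDN should receive."""
        if self.max_staleness_s == 0:
            return "no-store"
        return f"public, s-maxage={self.max_staleness_s}"


# Illustrative values only; each row is a decision that should be reviewed.
POLICIES = [
    CachePolicy("hashed static assets", "/static/*", 31_536_000, False),
    CachePolicy("pricing and entitlements", "/api/pricing/*", 60, True),
    CachePolicy("user-specific API", "/api/me/*", 0, False),
]


def deploy_invalidation_paths(policies):
    """Paths the CI/CD pipeline must invalidate on every release."""
    return [p.path_pattern for p in policies if p.invalidate_on_deploy]


for policy in POLICIES:
    print(f"{policy.category}: {policy.cache_control()}")
print(deploy_invalidation_paths(POLICIES))  # ['/api/pricing/*']
```

Keeping the headers and the deploy-time invalidation list derived from the same table means a new cached endpoint cannot be added without someone stating its accepted staleness.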

The origin shield setup session — "how do I set up CloudFront origin shield to reduce origin requests?" — produces a working origin shield configuration. The session explains shield PoP selection and the origin load reduction benefit. It does not document that origin shield adds an additional cache layer to the invalidation propagation path, that stale content can persist at the shield layer after edge PoPs show the invalidation as complete, how to verify that a cache invalidation has propagated through the shield layer, or what the origin shield bandwidth charge is as a fraction of total CDN cost. The session solves the immediate problem (high origin request volume) and closes. The structural consequence — that the deploy procedure must now verify invalidation at the shield layer, not just at the nearest edge PoP — is only discovered when an invalidation appears to succeed (tested from a PoP that hit the edge cache directly) while users on a different routing path continue seeing stale content via the shield. The WhyChose extractor run on the origin shield setup session surfaces the original context — the high-origin-request problem that motivated shield, the shield region chosen, the cost rationale — which is the information needed to evaluate whether origin shield should be removed if its invalidation complexity is causing incidents, or whether the invalidation procedure should be updated to explicitly verify shield propagation. The raw session is in the chat history; the decision record that connects the origin problem to the shield decision to the invalidation consequence is not in any document.

The CDN cost review session — "our AWS bill has a large CloudFront line item, how do we reduce it?" — produces the discovery that Cloudflare's flat pricing model might be cheaper at current bandwidth volumes, or that origin shield is adding measurable cost that may not be justified by origin load reduction at the current traffic level. The session produces a cost comparison and possibly a migration recommendation. What it does not produce is documentation of why CloudFront was chosen originally (AWS integration, tutorial familiarity, IAM-based origin access control for S3), which CloudFront-specific features are currently in use (Lambda@Edge for server-side rendering, CloudFront Functions for URL normalization, origin access control with S3 bucket policies that restrict access to CloudFront), and which of those features would require re-engineering for a Cloudflare migration. Without that audit, the cost optimization session produces "Cloudflare would be cheaper" — which may be true — without the nuance of "we use Lambda@Edge for server-side rendering of the marketing pages; migrating to Cloudflare would require rewriting those edge functions to Cloudflare Workers, which is a meaningful engineering investment that needs to be valued against the bandwidth cost savings." The original CDN ADR would have made the AWS-specific feature dependencies explicit, giving the cost optimization session the information it needs to evaluate migration accurately rather than reactively.

What to actually document in the CDN ADR

A CDN ADR that prevents cache-freshness incidents and cost surprises does not document the distribution configuration — the Terraform state or the CDN dashboard captures that. It documents why this CDN was chosen, what structural properties that choice imposes on the deploy procedure, the maximum accepted staleness by content type, and what decisions were made during configuration that a future engineer cannot infer from the CDN configuration alone.

The cache invalidation policy is the most operationally critical section. Document the invalidation method (explicit API purge versus TTL expiry as the two independent freshness paths), the measured propagation latency at the P99 (not the average — P99 propagation is what the on-call engineer will wait for during an incident), the cost formula at the team's deployment cadence (for CloudFront: deployments per month × paths invalidated per deploy versus the 1,000-path monthly free tier), and the maximum accepted staleness by content type. The staleness policy by content type is the section most consistently absent from CDN configurations: static assets with content-hash filenames (no explicit invalidation needed — filename changes with every build); static assets with fixed filenames (invalidation required on every deploy; zero staleness tolerance; TTL must be short or invalidation must be in the deploy pipeline); API responses with pricing or entitlement data (near-zero staleness tolerance; explicit invalidation must be part of the deploy procedure for any change to pricing logic; propagation latency must be documented and verified); API responses with general read data (define acceptable staleness window explicitly, e.g., 60 seconds, and document that this window is the maximum users may see stale data after a deploy); authenticated user-specific API responses (should not be cached at the CDN layer — cache must be bypassed for responses with session-specific content, and the bypass mechanism must be explicitly configured and audited). The ADR format guidance for the Consequences section is where the negative consequence of CDN caching API responses — that the deploy procedure must now include a cache invalidation step and verify propagation — should be stated explicitly, not discovered during an incident.
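One way to keep the staleness policy from drifting away from the headers the application actually sends is to hold it as data and derive the headers from it. The following is a minimal Python sketch under stated assumptions: the category names, the TTL values, and the choice to model pricing data as a CDN bypass (rather than caching it and purging on deploy) are illustrative, not taken from any real configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StalenessPolicy:
    """One row of the ADR's 'maximum accepted staleness' table."""
    max_staleness_s: Optional[int]  # None means: never cache at the CDN
    invalidate_on_deploy: bool      # True means: include in the deploy purge batch
    immutable: bool = False         # True means: the URL changes when content changes


# Hypothetical categories and values; replace with the ADR's real figures.
POLICIES = {
    "static_hashed": StalenessPolicy(31_536_000, False, immutable=True),
    "static_fixed_name": StalenessPolicy(60, True),
    "api_pricing": StalenessPolicy(None, False),       # modeled as CDN bypass
    "api_general_read": StalenessPolicy(60, False),
    "api_user_specific": StalenessPolicy(None, False),
}


def cache_control(category: str) -> str:
    """Derive a Cache-Control header value from the documented policy."""
    policy = POLICIES[category]
    if policy.max_staleness_s is None:
        return "private, no-store"
    if policy.immutable:
        return f"public, max-age={policy.max_staleness_s}, immutable"
    # s-maxage governs shared caches such as a CDN; max-age=0 makes
    # browsers revalidate instead of adding their own staleness on top.
    return f"public, max-age=0, s-maxage={policy.max_staleness_s}"


def deploy_purge_categories() -> list[str]:
    """Categories whose paths belong in the deploy-triggered invalidation."""
    return sorted(name for name, p in POLICIES.items() if p.invalidate_on_deploy)


if __name__ == "__main__":
    for name in POLICIES:
        print(f"{name}: {cache_control(name)}")
    print("purge on deploy:", deploy_purge_categories())
```

The design point is that the table has one owner: a new endpoint cannot get a header without first being assigned a category, which forces the staleness decision to be made explicitly.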

The origin shield configuration section must document the shield PoP designation (or "tiered cache" terminology for Cloudflare), whether shield is in the invalidation propagation critical path, the cost formula for origin shield bandwidth at current and projected traffic, and the procedure for verifying that a cache invalidation has propagated through the shield layer. For CloudFront: "Origin shield is enabled in us-east-1 (the same AWS region as the origin ALB). Origin shield bandwidth cost: $0.0075/GB × (1 − cache hit rate) × total CDN bandwidth. At a 92% cache hit rate and 8 TB (8,000 GB) of monthly bandwidth, origin shield adds approximately $4.80/month. Cache invalidation verification must include a test request from an IP that routes through the shield PoP: test from a North American IP on a path that hits the shield, not from an EU edge PoP that may have hit the edge cache directly. Invalidation is complete when a cache miss on the shield PoP returns the updated response." Without this documentation, the on-call engineer verifying a cache invalidation tests from the nearest edge PoP, gets the updated response, declares the invalidation complete, and closes the incident, while users in geographies that route through the shield continue seeing stale content for an additional period until the shield cache expires.
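The shield cost formula quoted above can be checked with a few lines of Python. This sketch uses the rate and traffic figures exactly as they appear in the text; the function name is hypothetical, and the rate and its billing unit should be confirmed against the vendor's current price list before the number goes into an ADR.

```python
def origin_shield_cost(total_gb: float, hit_rate: float,
                       rate_per_gb: float = 0.0075) -> float:
    """Monthly shield cost: rate × (1 − cache hit rate) × total CDN bandwidth.

    The default rate is the figure quoted in the text, not a verified price.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return rate_per_gb * (1.0 - hit_rate) * total_gb


if __name__ == "__main__":
    # 8 TB taken as 8,000 GB, 92% cache hit rate, as in the example above.
    print(f"${origin_shield_cost(8_000, 0.92):.2f}/month")  # prints $4.80/month
```

Because the cost scales with the miss rate, rerunning the function with a lower hit rate and the projected bandwidth is the quickest way to see when the shield line item stops being negligible.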

The cost model at current and projected traffic scale must include the geographic breakdown. Document the top user geographies and their fraction of traffic (e.g., 55% North America, 30% Western Europe, 15% APAC), the per-GB rate for each geography on the chosen CDN, the projected monthly bandwidth at current traffic, the projected cost at 5× and 10× current bandwidth in each geography, and whether the cost model changes structure at a specific volume tier (CloudFront's tiered pricing reducing the per-GB rate at volume thresholds; Fastly's committed-traffic discount starting at $10,000/month). Document whether edge compute (Workers, Lambda@Edge, CloudFront Functions) is in use, what it does, and its cost model separate from CDN bandwidth. Document whether WAF is included (Cloudflare Pro and above) or separately billed (CloudFront requires AWS WAF at $5/month baseline + $1/million evaluated), because teams that add WAF reactively (after a security incident) discover both the WAF product and its cost simultaneously under incident pressure. Consult the guidance on documenting architecture decisions for the cost-tradeoff framing — the ADR's Consequences section should state the cost implication of staying on the chosen CDN at projected traffic versus switching to an alternative model, making future cost review sessions start from a documented baseline rather than a blank-slate audit.
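The geographic projection described above is simple arithmetic once the split and rates are written down. Below is a hedged Python sketch: the traffic split is the example from the text, while the per-GB rates and the 8,000 GB baseline are placeholders chosen only to show the shape of the calculation. It also assumes one flat rate per region, so it ignores volume tiers, request charges, edge compute, and WAF.

```python
def project_cdn_cost(monthly_gb, traffic_split, rate_per_gb, multipliers=(1, 5, 10)):
    """Return {multiplier: total monthly bandwidth cost}.

    Simplification: one flat per-GB rate per region, no volume tiers.
    """
    if abs(sum(traffic_split.values()) - 1.0) > 1e-9:
        raise ValueError("traffic split must sum to 1.0")
    projection = {}
    for m in multipliers:
        projection[m] = sum(
            monthly_gb * m * share * rate_per_gb[region]
            for region, share in traffic_split.items()
        )
    return projection


# Traffic split from the example in the text.
SPLIT = {"north_america": 0.55, "western_europe": 0.30, "apac": 0.15}
# Placeholder rates in $/GB, NOT real vendor prices.
RATES = {"north_america": 0.08, "western_europe": 0.08, "apac": 0.20}

if __name__ == "__main__":
    for multiplier, cost in project_cdn_cost(8_000, SPLIT, RATES).items():
        print(f"{multiplier}x: ${cost:,.2f}/month")
```

With these placeholder rates, APAC carries 15% of the traffic but roughly 31% of the bandwidth cost, which is the kind of skew the geographic breakdown is meant to surface before the APAC share grows.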

The infrastructure-as-code source of truth for CDN configuration must be explicit. "CloudFront distribution configuration is managed in Terraform in the infra/cloudfront/ module. Distribution settings, cache behavior rules per path pattern, origin configurations, origin shield settings, and viewer certificate are all in Terraform state. Direct console edits to distribution configuration are not permitted — the next terraform apply will overwrite console changes. Cache invalidation is invoked via the scripts/cdn-invalidate.sh script, which calls the CloudFront invalidation API and polls for propagation completion. Lambda@Edge function versions are managed in infra/edge-functions/ and require a CloudFront distribution update to deploy — they cannot be updated independently of the distribution." Without this documentation, a developer who edits a cache behavior TTL in the CloudFront console during an incident creates drift between the console state and the Terraform state. The next infrastructure deployment silently reverts the console change. The infrastructure-as-code strategy decision record covers the IaC tool choice and source-of-truth policy broadly; the CDN ADR specifies it for CDN configuration specifically, including the edge function deployment model which is a CDN-specific complication not covered in general IaC guidance.
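The invalidate-then-poll behavior attributed to scripts/cdn-invalidate.sh can be sketched in Python. This is an illustration, not that script: the function name, polling interval, and timeout are assumptions. The client is passed in so the logic can be exercised without AWS credentials; in a real pipeline it would be a boto3 CloudFront client, whose `create_invalidation` and `get_invalidation` calls take the arguments used here.

```python
import time
import uuid


def invalidate_and_wait(client, distribution_id, paths,
                        poll_s=5.0, timeout_s=600.0, sleep=time.sleep):
    """Submit a CloudFront invalidation and poll until it reports Completed.

    `client` is expected to behave like boto3.client("cloudfront").
    Returns the invalidation ID; raises TimeoutError if polling runs out.
    """
    response = client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch={
            "Paths": {"Quantity": len(paths), "Items": list(paths)},
            # CallerReference must be unique per request.
            "CallerReference": str(uuid.uuid4()),
        },
    )
    invalidation_id = response["Invalidation"]["Id"]

    waited = 0.0
    while True:
        status = client.get_invalidation(
            DistributionId=distribution_id, Id=invalidation_id
        )["Invalidation"]["Status"]
        if status == "Completed":
            return invalidation_id
        if waited >= timeout_s:
            raise TimeoutError(
                f"invalidation {invalidation_id} still {status} after {waited:.0f}s"
            )
        sleep(poll_s)
        waited += poll_s


# Hypothetical usage in a deploy step (requires boto3 and credentials):
#   import boto3
#   invalidate_and_wait(boto3.client("cloudfront"), "EXAMPLEDISTID",
#                       ["/index.html", "/chunk-recharts.js"])
```

A "Completed" status only reports what the invalidation API says. Per the shield discussion above, the deploy procedure may still need a separate request-level check before signaling deploy-complete.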

The CDN selection ADR template

The template below follows the Nygard format extended with CDN-specific sections. The sections whose absence produced the incidents above — the cache invalidation policy with propagation latency, the staleness policy by content type, the origin shield configuration in the invalidation path, and the cost model at projected traffic — are all present. Adapt field values to the chosen CDN.

# ADR-NNN: CDN selection

## Status
Accepted / Proposed / Superseded by ADR-NNN

## Context
[What content is being served? Static SPA assets, API responses, video,
large file downloads, or a mix? What are the invalidation requirements?
Is there zero-tolerance staleness for any content type (pricing, security)?
What are the geographic traffic distribution and top user regions? What is
the current and projected monthly bandwidth? Is edge compute required for
routing, auth, or SSR? What is the team's AWS/GCP ecosystem dependency?]

## Decision
We will use [Cloudflare / CloudFront / Fastly / Bunny.net / other] as
the primary CDN for [scope: all public-facing content / static assets only /
API responses for specific paths / etc].

## Cache invalidation policy
Invalidation method: [explicit API purge / TTL expiry / both]
Propagation latency (P99): [< 150ms for Cloudflare/Fastly /
  typically 60–90s for CloudFront single-URL / 3–5min CloudFront wildcard /
  no formal SLA — document the measured or vendor-documented figure]
Invalidation cost formula: [N deployments/month × M paths/deploy =
  K total paths/month; free tier: F paths/month; overage cost: $X/path above F]
Maximum accepted staleness by content type:
  Static assets with content-hash filenames: no explicit invalidation needed
    (filename changes on every deploy; TTL can be long)
  Static assets with fixed filenames (e.g., favicon.ico, robots.txt):
    TTL = [short, e.g., 60s] OR explicit invalidation in deploy pipeline
  API responses — pricing and entitlement data: zero tolerance;
    cache bypass (no CDN caching) OR invalidation in deploy pipeline
    with propagation verification before deploy is considered complete
  API responses — general read data: [N] seconds maximum staleness;
    s-maxage = [N]; not in deploy-triggered invalidation batch
  API responses — user-specific or session-authenticated: no CDN caching;
    cache bypass configured via [Cache-Control: no-store / Vary: Cookie /
    CDN bypass header — document the mechanism]
Deploy-triggered invalidation procedure:
  [list of paths or path patterns invalidated on every deploy;
   script or CI/CD step that runs the invalidation;
   verification method for confirming propagation complete]

## Origin shield configuration
Shield enabled: [yes / no]
Shield PoP: [AWS region / Cloudflare Smart Tiered Cache / Fastly shield PoP /
  not applicable — document which]
Shield in invalidation propagation critical path: [yes for CloudFront /
  included in anycast purge for Cloudflare/Fastly — document behavior]
Origin shield bandwidth cost: [$X/GB × (1 − hit rate) × total bandwidth/month]
Shield invalidation verification: [procedure for verifying that a cache
  invalidation has propagated through the shield layer, not just the edge PoPs]
Origin load reduction (measured or estimated): [N% reduction in origin
  requests at current cache hit rate — justify the shield cost]

## Cost model
CDN product: [Cloudflare Pro / CloudFront / Fastly / Bunny.net]
Primary billing metric: [flat monthly / per-GB / per-GB + per-request]
Current monthly bandwidth: [N] GB total
  Geographic breakdown: [US: N%, EU: M%, APAC: K% — use real traffic split]
Current CDN cost: $[amount/month] itemized by component
  [Bandwidth by region, request charges, edge compute, WAF, origin shield]
Edge compute: [Workers/Lambda@Edge/CloudFront Functions — yes/no; cost model]
WAF: [included in plan / separately billed — $X/month baseline + $Y/million]
Projected cost at 5× current bandwidth: $[amount/month]
Projected cost at 10× current bandwidth: $[amount/month]
If alternative exists (e.g., Cloudflare vs CloudFront for AWS-native team):
  Alternative: [CDN name and plan]
  Alternative cost at current bandwidth: $[amount/month]
  Features preventing migration: [Lambda@Edge / CloudFront Functions /
    origin access control policy for S3 / specific AWS integration —
    document which are required vs. inherited by default]
  Migration engineering cost estimate: [dev-days to re-implement feature
    dependencies in the alternative CDN]
  Decision: [migrate by [date] / stay because [specific dependency] /
    re-evaluate at [bandwidth threshold] — explicit, not default]

## Infrastructure-as-code source of truth
CDN distribution config source of truth: [Terraform module path /
  Pulumi stack / CloudFormation — no console edits permitted]
Edge function source of truth: [IaC path and deployment model]
Cache behavior rules source of truth: [IaC path]
Console edit policy: [not permitted — next apply reverts /
  permitted for emergencies if PRed within 24h — document which]
Cache invalidation script: [path to the script; invocation in deploy pipeline]

## Consequences
Positive: [capabilities this CDN provides — instant purge, AWS integration,
  VCL control, bandwidth cost, geographic coverage]
Negative: [CloudFront eventually-consistent invalidation with no SLA;
  CloudFront per-path invalidation billing above 1,000/month;
  CloudFront origin shield cost at low cache hit rates;
  Cloudflare Business required for cache-tag purge;
  Fastly VCL learning curve and per-GB cost at high bandwidth;
  Bunny.net absence of WAF and edge compute;
  deploy procedure now requires cache invalidation step;
  cache invalidation propagation must be verified before deploy-complete signal]
Risks: [stale content served during invalidation propagation window;
  origin shield in invalidation critical path creates two-layer verification
  requirement; CDN cost at APAC traffic growth faster than US growth]

The sections that teams consistently skip are the maximum accepted staleness by content type (most teams know they have caching; few have explicitly stated that pricing endpoint staleness tolerance is zero), the origin shield in the invalidation critical path (most teams enable origin shield for performance and don't revisit it when an invalidation appears to succeed at the edge but not the shield), and the cost model at 10× current bandwidth split by geography (APAC bandwidth costs 2–3× US costs on CloudFront and Fastly, and the APAC fraction of traffic typically grows faster than the US fraction as a product matures). Those three sections are the ones whose absence produces the stale-pricing incident, the false-positive invalidation verification, and the cost surprise when APAC traffic becomes significant. Write them before the third content type is added to the CDN, not after the pricing bug is live for an hour post-deploy.
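Since the skipped sections are known in advance, their presence can be checked mechanically, for example in CI on the ADR directory. The sketch below is one possible approach in Python, assuming ADRs are markdown files that use the `##` headings from the template above. It only checks that a heading exists, not that the section has been filled in.

```python
import re

# Section headings taken from the template above.
REQUIRED_SECTIONS = (
    "Status",
    "Context",
    "Decision",
    "Cache invalidation policy",
    "Origin shield configuration",
    "Cost model",
    "Infrastructure-as-code source of truth",
    "Consequences",
)

_H2 = re.compile(r"^##[ \t]+(.+?)[ \t]*$", flags=re.MULTILINE)


def missing_sections(adr_markdown: str) -> list[str]:
    """Return required level-2 headings that do not appear in the ADR text."""
    found = {match.group(1).lower() for match in _H2.finditer(adr_markdown)}
    return [name for name in REQUIRED_SECTIONS if name.lower() not in found]


if __name__ == "__main__":
    draft = "# ADR-NNN: CDN selection\n\n## Status\nProposed\n\n## Decision\nTBD\n"
    for name in missing_sections(draft):
        print("missing section:", name)
```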

CDN decisions share the structural characteristic of other infrastructure platform decisions in this series: the initial choice is made quickly, at low traffic, when the staleness and cost consequences are invisible. The performance optimization decision record covers the caching strategy at the application layer — in-memory caching, Redis caching, database query caching — while the CDN ADR covers the caching strategy at the edge layer. The two records reference each other: caching at the CDN layer for API responses only makes sense for endpoints that are not cached at the application layer with shorter TTLs, and the combined caching strategy should document which layer is responsible for freshness for each content type. Writing the CDN ADR forces that explicit statement, which in turn forces the explicit statement of the application-layer caching strategy for each endpoint. The two decisions are made in separate sessions and documented in separate ADRs, but they are not independent. The CDN ADR is where their relationship is stated — so the next engineer adding an endpoint knows whether to add CDN caching, application caching, both, or neither, without having to reverse-engineer the intent from the existing configuration.

Frequently asked questions

What is the difference between Cloudflare and CloudFront for cache invalidation?

The fundamental difference is propagation latency and cost model. Cloudflare cache purges propagate globally in under 150 milliseconds across all 310+ PoPs simultaneously, because Cloudflare's anycast architecture broadcasts purge signals to all PoPs via the same control plane used for routing. Single-URL purge, prefix purge (Pro+), and cache-tag purge (Business+) all propagate at the same speed. Cloudflare does not charge per invalidation on paid plans. Amazon CloudFront uses eventually-consistent propagation with no contractual SLA: single-URL invalidations typically propagate in 60–90 seconds; wildcard or large-scope invalidations typically take 3–5 minutes; under high invalidation load the window extends further. CloudFront provides 1,000 free invalidation path submissions per month; additional paths cost $0.005 each. CloudFront origin shield adds a second cache layer that must receive the invalidation signal, and stale content can persist at the shield layer after edge PoPs report the invalidation as complete. For use cases where cache freshness SLA matters — security patches, pricing corrections, feature flag rollouts, GDPR data removals — Cloudflare's sub-150ms propagation is a structurally different operational profile than CloudFront's eventual consistency. For static asset delivery with content-hash filenames (where explicit invalidation is not needed because the filename changes with every deploy), the propagation latency difference is irrelevant and CloudFront's AWS integration advantages — origin access control for S3, native Lambda@Edge, IAM-based access, unified billing — dominate the decision.
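The CloudFront billing rule described above (1,000 free invalidation paths per month, $0.005 per additional path) reduces to a short formula. A minimal Python sketch follows, using those two figures as defaults; the deploy cadences in the example are hypothetical.

```python
def cloudfront_invalidation_cost(deploys_per_month: int, paths_per_deploy: int,
                                 free_paths: int = 1_000,
                                 price_per_path: float = 0.005) -> float:
    """Monthly invalidation charge: paths above the free tier × per-path price.

    Defaults are the figures quoted in the text above.
    """
    total_paths = deploys_per_month * paths_per_deploy
    billable_paths = max(0, total_paths - free_paths)
    return billable_paths * price_per_path


if __name__ == "__main__":
    # Hypothetical cadences: a weekday-deploy team and a continuous-deploy team.
    print(f"${cloudfront_invalidation_cost(20, 30):.2f}")   # 600 paths, inside free tier
    print(f"${cloudfront_invalidation_cost(120, 50):.2f}")  # 6,000 paths, 5,000 billable
```

The dollar amounts are small either way; the value of writing the formula into the ADR is knowing in advance which deploy cadence crosses the free tier.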

What is CDN origin shield and when does it add cost?

CDN origin shield is a designated intermediate caching PoP between the CDN's edge PoPs and the origin server. Without shield, cache misses at each edge PoP result in a direct request to the origin — if 60 edge PoPs worldwide all experience a cold cache for the same resource simultaneously (common after a cache invalidation), the origin receives up to 60 concurrent requests. Origin shield collapses these: all edge PoPs route cache misses to the shield PoP, which either serves from its shield cache or makes one request to the origin. Amazon CloudFront charges origin shield bandwidth at $0.0075/GB from origin to shield PoP, in addition to standard origin transfer costs. At a 90% cache hit rate, origin shield handles 10% of total CDN bandwidth, so shield cost is approximately 10% × total bandwidth × $0.0075/GB. Cloudflare's Tiered Cache (equivalent function) is included in all plans. Fastly shielding is included. Origin shield introduces a complication for cache invalidation: the shield PoP maintains its own cache, and a CloudFront invalidation must propagate through both the edge PoPs and the shield PoP. An invalidation verified from a PoP that hit the edge cache directly may appear complete while the shield PoP continues serving stale content to edge PoPs that experience a cache miss and route to the shield for a fill. Verifying invalidation completion at the shield layer — not just the nearest edge PoP — is the step that prevents false-positive invalidation verification during incidents. The origin shield configuration must be documented in the CDN ADR with the invalidation verification procedure, not just as a cost-reduction setting that was enabled and forgotten.
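The request-collapsing effect described above can be shown with a toy model. This Python sketch is a deliberate simplification: the class names are hypothetical, requests are processed one after another rather than concurrently, and nothing expires. It counts only how many requests reach the origin when every edge PoP starts cold.

```python
class Origin:
    """Counts how many requests actually reach the origin server."""

    def __init__(self):
        self.requests = 0

    def fetch(self, path):
        self.requests += 1
        return f"body of {path}"


class CacheNode:
    """A cache (edge PoP or shield) that fills misses from its upstream."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.store = {}

    def fetch(self, path):
        if path not in self.store:           # cache miss: go upstream once
            self.store[path] = self.upstream.fetch(path)
        return self.store[path]


def origin_requests_after_cold_start(edge_count, use_shield):
    """Request one path from every cold edge PoP; return the origin request count."""
    origin = Origin()
    upstream = CacheNode(origin) if use_shield else origin
    edges = [CacheNode(upstream) for _ in range(edge_count)]
    for edge in edges:
        edge.fetch("/chunk-recharts.js")
    return origin.requests


if __name__ == "__main__":
    print(origin_requests_after_cold_start(60, use_shield=False))  # 60
    print(origin_requests_after_cold_start(60, use_shield=True))   # 1
```

The same structure also shows why the shield matters for invalidation: it is one more `store` that holds a copy of the object, so clearing only the edge stores leaves a stale copy available for the next fill.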

When should a team choose Fastly instead of Cloudflare or CloudFront?

Fastly is appropriate when the team needs fine-grained, programmable caching logic at the edge that CDN configuration rules cannot express. Fastly's VCL (Varnish Configuration Language) allows complex cache key construction from arbitrary header combinations, different TTL policies per response status code, stale-while-revalidate with precise stale window control, and request routing to different origin pools based on request properties. These capabilities exceed what Cloudflare's dashboard-driven cache rules and CloudFront's cache behavior policies can express without edge compute (Workers or Lambda@Edge). Fastly's Surrogate-Key purge (equivalent to Cloudflare cache-tag purge) is included at all pricing tiers, not gated to a Business plan. Fastly's real-time log streaming delivers CDN access logs to external observability platforms with under 1-second latency, making CDN traffic data available in the same analysis pipeline as application logs without polling delay. Fastly's instant purge propagates in under 150ms (often under 50ms), equivalent to Cloudflare. Fastly's per-GB pricing ($0.12/GB with volume discounts) is higher than Cloudflare's flat model for high-bandwidth static sites and higher than Bunny.net for high-volume downloads. Fastly is well-suited for API-heavy products with complex caching rules, media platforms with per-asset TTL policies derived from content metadata, and teams with existing Varnish knowledge who want managed edge Varnish. Fastly is less suited for teams prioritizing configuration simplicity over control, for high-bandwidth products where per-GB cost dominates the decision (Cloudflare flat or Bunny.net low-per-GB), or for teams with neither VCL knowledge nor the time to develop it.

What should a CDN selection ADR document that teams typically skip?

Teams typically document the CDN vendor and the distribution or zone configuration settings. The ADR sections that prevent cache-freshness incidents and cost surprises are: (1) the maximum accepted staleness by content type — not just that caching is configured, but what the longest acceptable staleness window is for each content category (pricing data: zero tolerance; static assets with hashed filenames: long TTL is safe; general API read responses: defined window, e.g., 60 seconds); (2) the deploy-triggered invalidation procedure — which paths are invalidated on every deploy, which CI/CD step runs the invalidation, what verification confirms propagation is complete before the deploy-complete signal is issued; (3) the origin shield configuration in the invalidation critical path — whether shield is enabled, whether it is in the propagation path for cache invalidations, and the procedure for verifying invalidation through the shield layer versus just at the edge PoPs; (4) the cost model at 5× and 10× current bandwidth with geographic breakdown — APAC bandwidth costs 2–3× US/EU rates on CloudFront and Fastly, and APAC traffic fraction typically grows faster than US as a product matures; and (5) the AWS or GCP feature dependencies that would need re-engineering before a CDN migration — Lambda@Edge, CloudFront Functions, origin access control for S3, unified AWS billing — versus features that are CDN-agnostic and portable. These sections are not derivable from the CDN console or Terraform state. The state shows current configuration; the ADR documents the freshness policy, the cost projection, and the structural constraints that the CDN choice imposes on every future deploy procedure.