The day you need to re-issue everything is the day you hit Let's Encrypt's rate limits

Ask whoever signs off on the DR runbook how long it takes to re-issue every certificate you own, and you usually get a shrug: automation renews them, so re-issuing the lot must be the same job aimed at every host at once. That instinct fails under load. Day-to-day renewal is a slow trickle the CA has sized its limits for; mass re-issuance is a spike, and refusing spikes is part of what a CA does. You never notice the gap until something rips the schedule from your hands: a CA distrust, a compromised key, a forced migration off a vendor. Suddenly you need thousands of fresh certificates today, and a rate limit nobody ever measured governs the whole incident.

When you're forced to re-issue everything at once

The triggers all land on the days you have the least room. Distrust hands you a deadline the browsers set. A key compromise makes every certificate tied to that key suspect, with revocation already counting down. Walking off one CA leaves the old certificates working but on a clock. In each case "re-issue everything" is the recovery step itself, and a stall partway through leaves you holding a half-trusted estate mid-incident.

What stops you is rarely a clean error. It is throughput. The first orders go through, then the rest come back as 429 Too Many Requests, and your automation either waits for the budget to refill or keeps pounding an endpoint that will keep saying no.
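Backing off cleanly can be sketched in a few lines. This is an illustration under stated assumptions, not any ACME client's actual behavior: `retry_delay` is a hypothetical helper, the response headers are assumed to arrive as a plain dict keyed exactly `Retry-After`, and the base and cap values are arbitrary. It honors the server's hint when one is present and falls back to jittered exponential backoff when it is not.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay(headers, attempt, now=None, base=2.0, cap=3600.0, rng=random.random):
    """Seconds to wait after a 429 before retrying an order (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    value = headers.get("Retry-After")
    if value is not None:
        value = value.strip()
        # Retry-After may be a whole number of seconds...
        if value.isdigit():
            return float(value)
        # ...or an HTTP date.
        try:
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            when = None
        if when is not None:
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            return max(0.0, (when - now).total_seconds())
    # No usable hint: exponential backoff with full jitter, capped.
    return rng() * min(cap, base * 2 ** attempt)
```

The point of the design is that the server's hint always wins. The fallback exists only so a missing or malformed header does not turn into a tight retry loop.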

ACME rate limits, CAA records, and validation as throughput limits

Let's Encrypt caps issuance per registered domain per week, and that ceiling does not soften mid-incident. In 2025 they reworked the New Orders limit into a token-bucket scheme with a per-account budget that refills slowly. Renewals and brand-new issuance are metered differently, so the comfortable numbers you live under day to day say nothing about the burst. A forced re-issue looks like new issuance, and if most of your certificates share one parent domain, that weekly cap is the first wall you hit.
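To see why a slow refill matters more than the headline number, here is a generic token bucket in Python. It is a textbook model, not Let's Encrypt's implementation; the class name, capacity, and refill interval are all made up for illustration.

```python
class TokenBucket:
    """Generic token bucket: starts full, regains one token per refill_seconds."""

    def __init__(self, capacity, refill_seconds, now=0.0):
        self.capacity = capacity
        self.refill_seconds = refill_seconds
        self.tokens = float(capacity)
        self.updated = now

    def _refill(self, now):
        # Credit tokens for the time elapsed, never exceeding capacity.
        gained = (now - self.updated) / self.refill_seconds
        self.tokens = min(self.capacity, self.tokens + gained)
        self.updated = now

    def try_take(self, now):
        """Spend one token if available; return whether the request is allowed."""
        self._refill(now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def seconds_until_token(self, now):
        """How long until the next request would be allowed."""
        self._refill(now)
        return max(0.0, (1.0 - self.tokens) * self.refill_seconds)


# Illustrative numbers only: a bucket of 3 that regains one token every 10 s.
bucket = TokenBucket(capacity=3, refill_seconds=10.0)
burst = [bucket.try_take(now=0.0) for _ in range(4)]
```

In this toy run the first three requests succeed and the fourth is refused. Once the burst is spent, throughput is one request per refill interval however urgent the incident is.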

Two more constraints sit underneath. CAA records are checked on every issuance, so if recovery routes through a backup CA, every zone that publishes CAA needs a record that already names that CA, and DNS changes do not propagate on demand. Validation is its own throughput problem: a few seconds of propagation per challenge, times thousands of orders, and the choke point is your challenge plumbing, not the CA.
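The CAA half of that can be checked long before an incident. The sketch below is a simplified reading of the CAA rules and makes several assumptions: `ca_permitted` is a hypothetical helper, the records are passed in as already-resolved `(tag, value)` pairs for the relevant name, and `backup-ca.example` is a placeholder issuer domain. It does no DNS lookups, does not climb the DNS tree, and ignores critical flags and parameters such as `accounturi`.

```python
def ca_permitted(records, ca_domain, wildcard=False):
    """Return True if the given CAA records do not forbid ca_domain.

    records: iterable of (tag, value) pairs, e.g. ("issue", "letsencrypt.org").
    """
    def issuers(tag):
        # Keep only the issuer domain, dropping any ";"-separated parameters.
        values = [v for t, v in records if t.lower() == tag]
        return [v.split(";", 1)[0].strip().lower() for v in values]

    # Wildcard requests use issuewild records when any exist, else issue records.
    relevant = issuers("issuewild") if wildcard else []
    if not relevant:
        relevant = issuers("issue")

    # No restricting records at all: CAA does not limit issuance.
    if not relevant:
        return True
    # An empty issuer (value ";") matches no CA, so it forbids everyone.
    return ca_domain.lower() in relevant


zone = [("issue", "letsencrypt.org")]
backup_ready = ca_permitted(zone, "backup-ca.example")
```

Here `backup_ready` is False: the zone names only its primary CA, so a failover to the placeholder backup would be refused until the record changes and propagates. Running a check like this across every zone in the inventory turns that surprise into a to-do list.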

Staging, batching, and a backup CA for surge issuance

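Run the surge against the staging ACME endpoint first. Its limits are far looser, and it exists so you can prove the runbook without spending production budget. Then build the real run as a paced queue: batches sized to what the production limits refill, with the overflow routed to a backup CA your CAA records already allow.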
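One way to picture a paced queue is below. It is a minimal sketch, not a production scheduler: `interleave_by_domain` and `run_paced` are hypothetical names, orders are assumed to be `(parent_domain, hostname)` pairs, and `submit` stands in for whatever your ACME client does to place an order. The interleaving keeps one busy parent domain from burning its budget while the others sit idle; the pacing keeps a fixed gap between submissions.

```python
import time
from collections import defaultdict, deque


def interleave_by_domain(orders):
    """Round-robin (parent_domain, hostname) pairs across parent domains."""
    groups = defaultdict(deque)
    for domain, host in orders:
        groups[domain].append((domain, host))
    queue = []
    while groups:
        for domain in list(groups):
            queue.append(groups[domain].popleft())
            if not groups[domain]:
                del groups[domain]
    return queue


def run_paced(orders, submit, interval, sleep=time.sleep, clock=time.monotonic):
    """Call submit(order) for each order, at least `interval` seconds apart."""
    results = []
    last_start = None
    for order in orders:
        if last_start is not None:
            delay = last_start + interval - clock()
            if delay > 0:
                sleep(delay)
        last_start = clock()
        results.append(submit(order))
    return results


queue = interleave_by_domain([
    ("a.example", "www.a.example"),
    ("a.example", "api.a.example"),
    ("b.example", "www.b.example"),
])
```

With that input, `queue` alternates `a.example`, `b.example`, `a.example`. Passing `sleep` and `clock` as parameters is a deliberate choice: it lets you rehearse the schedule with a fake clock instead of waiting out real intervals.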

Rehearse mass re-issuance before the incident does it for you

The figure that matters is wall-clock time to re-issue the whole estate at the CA's real refill rate, validation included. Measure it once in staging with the run paced to production limits, write it on the runbook, and whoever signs off on DR gets a recovery-time number instead of a button that does not exist. If it comes back in hours, decide the backup CA and the batching order now, not while the pager is going off.
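Before the rehearsal, a back-of-envelope estimate tells you roughly what to expect. The function below is a rough model under loud assumptions: `estimate_reissue_seconds` is a hypothetical name, the rate limit is treated as a single bucket with an initial burst and a fixed refill interval, validation is treated as a fixed time per challenge run at a fixed concurrency, and the slower of the two is taken as the bottleneck. Every number in the example is a placeholder; substitute your CA's currently published limits and your own measured validation time.

```python
import math


def estimate_reissue_seconds(total, burst, refill_seconds,
                             validation_seconds, concurrency):
    """Rough wall-clock estimate for re-issuing `total` certificates."""
    # After the initial burst, each further order waits for one refill.
    rate_limited = max(0, total - burst) * refill_seconds
    # Validation runs in waves of `concurrency` challenges at a time.
    validation = math.ceil(total / concurrency) * validation_seconds
    # The slower stage sets the pace for the whole run.
    return max(rate_limited, validation)


# Placeholder inputs: 5000 certificates, a burst of 300, one order per 36 s
# afterwards, 10 s per validation, 20 validations in parallel.
hours = estimate_reissue_seconds(5000, 300, 36, 10, 20) / 3600  # 47.0
```

With these placeholder inputs the rate limit, not validation, dominates: about 47 hours against roughly 42 minutes of validation. That tells you which constraint to attack first, for example by splitting the load across a second CA.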

Automate Certificates keeps a live inventory per Environment, grouped by registered domain and CA, so you can see how many certificates share a parent before a burst drives you into a 429, and stage a paced, multi-CA re-issue against the limits that bind. Map your throughput against those constraints on the features page.