By Yair Knijn · January 1, 2026
The day you need to re-issue everything is the day you hit Let's Encrypt's rate limits
Ask whoever signs off on the DR runbook how long it takes to re-issue every certificate you own, and you usually get a shrug: automation renews them, so re-issuing the lot must be the same job aimed at every host at once. That instinct fails under load. Day-to-day renewal is a slow trickle the CA has sized its limits for; mass re-issuance is a spike, and refusing spikes is part of what a CA does. You never notice the gap until something rips the schedule from your hands: a CA distrust, a compromised key, a forced migration off a vendor. Suddenly you need thousands of fresh certificates today, and a rate limit nobody ever measured governs the whole incident.
When you're forced to re-issue everything at once
The triggers all land on the days you have the least room. Distrust hands you a deadline the browsers set. A key compromise makes every certificate signed with that key suspect, revocation already counting down. Walking off one CA leaves the old certificates working but on a clock. In each case "re-issue everything" is the recovery step itself, and a stall partway through leaves you holding a half-trusted estate mid-incident.
What stops you is rarely a clean error. It is throughput. Orders go through, then come back as 429 Too Many Requests, and your automation either backs off cleanly or keeps pounding the endpoint.
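Backing off cleanly can be sketched in a few lines. This is a minimal illustration, not any particular ACME client's behavior: `send` is a hypothetical callable standing in for one order submission, and the defaults for `base` and `cap` are arbitrary. It honors a `Retry-After` header when the server sends one and falls back to jittered exponential backoff when it does not.

```python
import random
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Parse a Retry-After header: either delta-seconds or an HTTP-date."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable header: caller falls back to its own backoff
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def submit_with_backoff(send, max_attempts=5, base=2.0, cap=600.0, sleep=time.sleep):
    """Call send() until it stops returning 429 or attempts run out.

    send is a hypothetical callable returning (status_code, headers_dict);
    headers_dict is assumed to use the exact key "Retry-After".
    """
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status
        delay = retry_after_seconds(headers.get("Retry-After"))
        if delay is None:
            # No usable hint from the server: exponential backoff with full jitter.
            delay = random.uniform(0, min(cap, base * 2 ** attempt))
        sleep(min(delay, cap))
    return 429
```

The point of the design is that the server's hint wins over the client's guess; the automation that "keeps pounding" is usually the one that ignores the header and retries on a fixed short interval.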
ACME rate limits, CAA records, and validation as throughput limits
Let's Encrypt caps issuance per registered domain per week, and that ceiling does not soften mid-incident. In 2025 they reworked the New Orders limit into a token-bucket scheme with a per-account budget that refills slowly. Renewals and brand-new issuance are metered differently, so the comfortable numbers you live under day to day say nothing about the burst. A forced re-issue looks like new issuance, and if most of your certificates share one parent domain, that weekly cap is the first wall you hit.
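To see why a burst behaves so differently from a trickle, it helps to simulate a token bucket. The sketch below is a generic model, not Let's Encrypt's implementation; the `TokenBucket` class is illustrative, and the capacity and refill rate in the usage note are made-up values, so substitute the figures your CA publishes.

```python
class TokenBucket:
    """Generic token bucket driven by a simulated clock (seconds)."""

    def __init__(self, capacity, refill_per_sec):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = 0.0

    def next_slot(self):
        """Return the earliest time the next order is accepted.

        Assumes the caller submits as fast as the bucket allows, so the
        clock only moves forward while waiting for a token to refill.
        """
        if self.tokens < 1:
            wait = (1 - self.tokens) / self.refill_per_sec
            self.clock += wait
            self.tokens = 1.0
        self.tokens -= 1
        return self.clock
```

With a hypothetical bucket of 3 tokens refilling at one token every 2 seconds, `[bucket.next_slot() for _ in range(5)]` gives `[0.0, 0.0, 0.0, 2.0, 4.0]`: the burst capacity is spent immediately, and everything after it moves at the refill rate.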
Two more constraints sit underneath. CAA records are checked on every issuance, so if recovery routes through a backup CA, each zone that publishes CAA records needs one that already names that CA, and DNS does not propagate on demand. Validation is its own throughput problem: a few seconds of propagation per challenge, times thousands of orders, and the choke point is your challenge plumbing, not the CA.
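The CAA check is easy to rehearse ahead of time. The function below is a simplified sketch of the `issue` property rule: it takes a zone's CAA records as plain strings and says whether a given CA would be allowed to issue. It deliberately ignores `issuewild`, the critical flag, and the climb up to parent zones, and `backup-ca.example` in the usage note is a hypothetical issuer name.

```python
def ca_may_issue(caa_records, ca_domain):
    """Simplified CAA check for non-wildcard names.

    caa_records: strings such as '0 issue "letsencrypt.org"'.
    Ignores issuewild, the critical flag, and parent-zone lookup.
    """
    issuers = []
    for record in caa_records:
        _flags, tag, value = record.split(None, 2)
        if tag.lower() != "issue":
            continue
        # Drop the quotes and any parameters after the first ';'.
        issuers.append(value.strip().strip('"').split(";")[0].strip().lower())
    if not issuers:
        return True  # no issue property present: CAA does not restrict issuance
    return ca_domain.lower() in issuers
```

For a zone that publishes only `0 issue "letsencrypt.org"`, `ca_may_issue(zone, "backup-ca.example")` returns `False`, which is the failure you want to find in a rehearsal and not in the middle of a failover.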
Staging, batching, and a backup CA for surge issuance
Run the surge against the staging ACME endpoint first. Its limits are far looser, and it exists so you can prove the runbook without spending production budget. Then build the real run as a paced queue.
- Batch by registered domain so no single parent gets shoved past its weekly cap, and spread the rest across the refill window.
- Pre-stage CAA records that authorize a second CA, so failover is a config switch and not a DNS edit made at 2am.
- Keep a second ACME account live at a different CA, so recovery does not hinge on the one provider that may be the reason you are re-issuing.
- Order by what hurts most when down: public ingress and revenue paths first, internal and low-traffic names last.
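The batching and ordering rules above can be sketched as a small planner. Everything here is an assumption made for illustration: `registered_domain` is a naive last-two-labels stand-in (real code should consult the Public Suffix List, since this gets names like `example.co.uk` wrong), each certificate is treated as covering one hostname, and `weekly_cap` is whatever per-domain limit your CA documents.

```python
from collections import defaultdict


def registered_domain(hostname):
    """Naive stand-in: last two labels. Use the Public Suffix List in real code."""
    return ".".join(hostname.lower().rstrip(".").split(".")[-2:])


def plan_batches(certs, weekly_cap):
    """Assign certificates to weekly windows, capped per registered domain.

    certs: iterable of (hostname, priority); lower priority is issued sooner.
    Returns a list of windows, each a list of hostnames in issue order.
    """
    used = defaultdict(int)  # certificates already planned per registered domain
    windows = []
    for hostname, _priority in sorted(certs, key=lambda c: c[1]):
        domain = registered_domain(hostname)
        index = used[domain] // weekly_cap  # first window with room for this domain
        used[domain] += 1
        while len(windows) <= index:
            windows.append([])
        windows[index].append(hostname)
    return windows
```

With a hypothetical cap of 2, three hosts under `example.com` and one under `example.org` split into two windows: the two highest-priority `example.com` names and the `example.org` name go first, and the third `example.com` name waits for the next window.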
Rehearse mass re-issuance before the incident does it for you
The figure that matters is wall-clock time to re-issue the whole estate at the CA's real refill rate, validation included. Measure it once in staging, write it on the runbook, and the director gets a recovery-time number instead of a button that does not exist. If it comes back in hours, decide the backup CA and the batching order now, not while the pager is going off.
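A rough first estimate of that number fits in one function. This is a back-of-envelope model under stated assumptions, not a substitute for the staging measurement: it assumes orders and validations are pipelined, so the slower of the two governs, and every parameter value in the usage note is a placeholder to replace with your CA's published limits and your own measured validation time.

```python
def estimate_reissue_hours(orders, burst, refill_per_hour, validation_secs, workers):
    """Estimate wall-clock hours to re-issue `orders` certificates.

    burst / refill_per_hour: the CA's order budget (placeholder values).
    validation_secs / workers: measured per-challenge time and parallelism.
    Assumes ordering and validation overlap, so the slower stage governs.
    """
    rate_limited_hours = max(0, orders - burst) / refill_per_hour
    validation_hours = orders * validation_secs / workers / 3600
    return max(rate_limited_hours, validation_hours)
```

For example, 5,000 orders against a hypothetical budget of 300 up front and 100 per hour, with 30-second validations across 10 workers, comes out at 47 hours, all of it spent waiting on the refill rate. If the estimate and the staging measurement disagree badly, trust the measurement and find out which assumption was wrong.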
Automate Certificates keeps a live inventory per Environment, grouped by registered domain and CA, so you can see how many certificates share a parent before a burst drives you into a 429, and stage a paced, multi-CA re-issue against the limits that bind. Map your throughput against those constraints on the features page.