The engineer who left owned the ACME account. The outage arrived three months later.

Offboarding goes by the book. You revoke the departing engineer's SSO, rotate their cloud keys, take back the laptop, sign the ticket. Clean exit. What hides in the word "clean" is the assumption that nothing load-bearing was still running as that person. Certificate renewal usually is, because someone wired it up in an afternoon two years ago, it never broke, and so nobody ever touched it again.

Every TLS handshake you serve now has a bus factor of one, and you will not learn that for weeks.

How renewals end up owned by one human identity

Nobody sits down and decides to hang renewals off a personal account. It accretes. Whoever first ran certbot registered the ACME account with their own work email so the expiry notices had somewhere to land. The DNS-01 plugin wanted an API token, so they minted one under the cloud login already open in their browser. The renewal job went into their user crontab on a jump host, authenticating with their personal access token. Each shortcut was reasonable on its own. Together they add up to a single point of failure with a pulse.

The tell is that no shared inventory names an owner. Ask "who renews api.example.com?" and the answer comes back as a person, not a service.

ACME account keys, DNS-01 tokens, and the bus factor of one

Three artifacts decide whether a renewal survives a departure, and they break independently. The ACME account key is an asymmetric keypair the CA ties your authorizations to; lose access to wherever it lives and you re-register from scratch. The DNS-01 credential is whatever writes the _acme-challenge TXT record, and when it is a personal API token, deactivating the human deactivates your proof of control. The renewal trigger is the cron, timer, or pipeline that runs the client on schedule, and it inherits the permissions of whoever set it up.
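One way to make those three dependencies visible is to record them per certificate and ask what breaks when a given person leaves. The sketch below is only an illustration: the class names, the owner labels, and the sample inventory entry are all hypothetical, and a real inventory would likely live in a CMDB or a repo rather than in code.

```python
from dataclasses import dataclass
from enum import Enum


class OwnerKind(Enum):
    PERSON = "person"    # tied to one employee's login, inbox, or token
    SERVICE = "service"  # role mailbox, service principal, machine identity


@dataclass(frozen=True)
class Artifact:
    name: str        # "acme_account_key", "dns01_credential", or "renewal_trigger"
    owner: str       # who or what holds it today
    kind: OwnerKind


@dataclass(frozen=True)
class CertRecord:
    hostname: str
    artifacts: tuple[Artifact, ...]


def at_risk(record: CertRecord, departing: str) -> list[str]:
    """Return the artifacts that stop working once `departing` is offboarded."""
    return [
        a.name
        for a in record.artifacts
        if a.kind is OwnerKind.PERSON and a.owner == departing
    ]


# Hypothetical inventory entry, for illustration only.
record = CertRecord(
    hostname="api.example.com",
    artifacts=(
        Artifact("acme_account_key", "alice", OwnerKind.PERSON),
        Artifact("dns01_credential", "svc-acme-dns", OwnerKind.SERVICE),
        Artifact("renewal_trigger", "alice", OwnerKind.PERSON),
    ),
)

print(at_risk(record, "alice"))  # ['acme_account_key', 'renewal_trigger']
```

Because the three artifacts break independently, the check reports each one separately instead of collapsing the record into a single "owned by" field.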

Why the failure shows up months after offboarding

A certificate stays valid for its whole lifetime no matter who can still renew it. Revoke the engineer's access today and every cert keeps serving traffic right up to its own expiry. The renewal that should have fired can no longer authenticate, so it writes an error to a log nobody reads while the clock on the existing cert keeps running. The outage is scheduled for the day that cert expires, not the day of the offboarding, which is exactly why nobody connects the two. And the gap between them is shrinking. The CA/Browser Forum's Ballot SC-081v3, passed in April 2025, steps maximum TLS validity down to 47 days by 2029 and cuts domain-validation reuse to 10 days. Shorter lifetimes spring this trap faster and more often, collapsing the window between "access revoked" and "cert dead" from a quarter to a few weeks.
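The arithmetic behind that delay is simple enough to sketch. The dates below are made up, and the 90-day lifetime is an assumption standing in for "a quarter"; the 47-day figure is the SC-081v3 ceiling mentioned above. The example takes the worst case for detection, a certificate renewed just before the engineer's last day.

```python
from datetime import date, timedelta


def outage_date(issued: date, lifetime_days: int) -> date:
    """Day the currently served certificate stops being valid."""
    return issued + timedelta(days=lifetime_days)


def silent_gap_days(issued: date, lifetime_days: int, offboarded: date) -> int:
    """Days between revoking access and the served certificate expiring."""
    return (outage_date(issued, lifetime_days) - offboarded).days


# Hypothetical dates: the last successful renewal ran five days before the exit.
issued = date(2025, 1, 10)
offboarded = date(2025, 1, 15)

print(silent_gap_days(issued, 90, offboarded))  # 85 days of silence
print(silent_gap_days(issued, 47, offboarded))  # 42 days of silence
```

Nothing visible happens during that gap: the failed renewal attempts are the only signal, and they land in the log nobody reads.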

Service identities, shared vaults, and ownership that outlives people

The fix is to make the automation belong to the system, not to a person. Register the ACME account under a role mailbox or distribution list so expiry notices outlive anyone's inbox. Keep the account key and the DNS-01 credential in a shared secrets vault with team-scoped access, and issue the DNS token to a service principal whose lifecycle has nothing to do with any one employee. Run the renewal under a machine identity instead of a user crontab. Then prove it works: your offboarding runbook should include "force a renewal with this person's access removed" as a real check, not an article of faith.
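That last runbook step can be scripted. The sketch below assumes certbot is the ACME client and uses its `renew --dry-run --cert-name` invocation; the hostnames and the `fake_runner` stand-in are hypothetical and exist only so the example runs without certbot installed. A dry run goes through certbot's staging setup, so treat a pass as evidence that the DNS-01 credential and the job's identity still work, not as proof that the production account key is reachable.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def run_command(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def dry_run_command(cert_name: str) -> list[str]:
    # Assumption: certbot is the client. --dry-run attempts the renewal
    # without replacing the certificate that is currently being served.
    return ["certbot", "renew", "--dry-run", "--cert-name", cert_name]


def renewal_survives(
    cert_names: Sequence[str], runner: Runner = run_command
) -> dict[str, bool]:
    """Map each certificate name to whether its dry-run renewal exited cleanly."""
    return {name: runner(dry_run_command(name)) == 0 for name in cert_names}


def fake_runner(cmd: Sequence[str]) -> int:
    # Stand-in for illustration: pretend the renewal for api.example.com
    # fails now that the departing engineer's token is gone.
    return 1 if "api.example.com" in cmd else 0


results = renewal_survives(
    ["api.example.com", "static.example.com"], runner=fake_runner
)
print(results)  # {'api.example.com': False, 'static.example.com': True}
```

Run it with the default runner on the renewal host, after the person's access has been removed and before the offboarding ticket is signed; any `False` is an outage you have found early.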

Automate Certificates is built so no renewal is ever load-bearing on a single human. Account keys, DNS-01 credentials, and renewal schedules belong to the team and survive any departure, with expiry and challenge-failure alerts routed to a role rather than someone who may already be gone. See how we model ownership in our security overview before your next offboarding turns into an outage nobody can trace.