Why this still happens in 2026
Every operations engineer has at least one expired-cert story. It's a rite of passage. Surely with Let's Encrypt, automated renewal, and a decade of TLS being mainstream, this should be a solved problem. It is not.
Why expired certs still take sites down regularly:
- Renewal automation runs on a schedule that itself can fail (cron daemon dies, container gets cycled, IAM permissions change).
- Let's Encrypt rate limits silently throttle renewals when an incident sends retries into a loop.
- DNS-01 challenges break when DNS records get cleaned up by another team.
- Certs on intermediate infrastructure (load balancers, reverse proxies, internal CAs) get forgotten.
- Manually renewed certs from years-old vendor relationships get missed when the responsible person leaves.
- The cert renews fine but doesn't get deployed to all the places it needs to be.
Cert expiry is the kind of incident where everyone afterward says "we should have known." With monitoring, you do.
Why standard uptime checks miss it
Most monitoring tools check HTTP endpoints. By default, many of them ignore TLS errors — they're checking whether your application responds, not whether the connection itself is valid.
So your monitor happily reports "200 OK" while customers see this:
```
Your connection is not private
NET::ERR_CERT_DATE_INVALID
```
Customers can't click past it (without scary "advanced" steps). Your monitor doesn't know anything is wrong. You find out when the support tickets arrive.
The fix is configuring your monitor to either:
- Reject connections with invalid/expired TLS, or
- Use a dedicated SSL certificate check that inspects the cert directly.
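To see the difference concretely, here's a minimal sketch using Python's standard library. expired.badssl.com is a public test host that deliberately serves an expired cert; everything else is stdlib.

```python
import ssl
import urllib.error
import urllib.request

URL = "https://expired.badssl.com/"  # public test host with an expired cert

# The lax check, mimicking a monitor that ignores TLS errors: the expired
# cert sails through and the check reports success.
lax = ssl.create_default_context()
lax.check_hostname = False
lax.verify_mode = ssl.CERT_NONE
with urllib.request.urlopen(URL, context=lax, timeout=10) as resp:
    print("lax check:", resp.status)  # 200, despite the expired cert

# The strict check: the default context validates the chain, hostname, and
# validity dates, so the expired cert fails the handshake.
try:
    urllib.request.urlopen(URL, timeout=10)
except urllib.error.URLError as exc:
    print("strict check failed:", exc.reason)
```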
What real cert monitoring looks like
A proper SSL cert monitor doesn't just check whether the connection works — it inspects the certificate itself.
What it should look at:
- Expiry date. The "not after" timestamp on the cert.
- Issuer. Did the cert change CAs unexpectedly? Sometimes the only signal of a misconfigured renewal.
- Subject and SANs. Does the cert cover the hostname you're hitting?
- Chain validity. Is the cert signed by a trusted root and is the chain complete?
- OCSP status. Has the cert been revoked?
- Algorithm. Is it using deprecated crypto (e.g. SHA-1)?
Most teams just need expiry monitoring. Cert chain validation is a useful belt-and-suspenders for catching misconfigurations.
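As a rough illustration, the expiry, issuer, and SAN fields can all be read off a validated handshake with Python's standard library. inspect_cert below is a hypothetical helper, and note the stdlib alone won't get you OCSP or full algorithm checks:

```python
import socket
import ssl
from datetime import datetime, timezone

def inspect_cert(hostname, port=443):
    """Connect, validate the chain, and pull the fields worth alerting on."""
    ctx = ssl.create_default_context()  # verifies chain and hostname for us
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=hostname) as tls:
            cert = tls.getpeercert()  # dict form of the validated leaf cert
    not_after = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(cert["notAfter"]), tz=timezone.utc
    )
    return {
        "days_left": (not_after - datetime.now(timezone.utc)).days,
        "issuer": dict(pair[0] for pair in cert["issuer"]),
        "sans": [v for k, v in cert.get("subjectAltName", ()) if k == "DNS"],
    }

print(inspect_cert("example.com"))
```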
The 30/14/3 alert pattern
The standard cadence we recommend is three escalating alerts:
- 30 days out: low-priority. Email or Slack notice. "Heads up, you should plan to renew."
- 14 days out: medium-priority. Slack ping. "Why hasn't this renewed yet? Investigate."
- 3 days out: high-priority. SMS or PagerDuty. "This is going to break customer connections in 72 hours. Drop what you\'re doing."
The 30-day window is most important: it gives you time to investigate and fix without panic. The 3-day alert is the safety net for when you missed the earlier ones.
Some teams add a 1-day alert as a final escalation. We'd argue that if you still haven't acted one day out, another alert on the same channel won't help.
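If you're wiring the cadence up yourself, it reduces to a threshold table. A sketch, feeding in the days_left value from the hypothetical inspect_cert helper above:

```python
# Hypothetical mapping from days-to-expiry to the 30/14/3 escalation channels.
THRESHOLDS = [(3, "page"), (14, "slack"), (30, "email")]

def alert_channel(days_left):
    for limit, channel in THRESHOLDS:
        if days_left <= limit:
            return channel  # first (most urgent) threshold that applies
    return None  # more than 30 days out: stay quiet

assert alert_channel(29) == "email"
assert alert_channel(2) == "page"
```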
Common renewal-automation failures
Patterns we've seen break otherwise-working renewal pipelines:
The IAM permissions drift
A renewal job uses an IAM role to update Route53 records or write to S3. Six months later someone tightens the IAM policy and removes a permission. Renewal cron starts failing silently. Three months later: outage.
The container that was never restarted
Renewal succeeds. New cert is written to the right path. But the load balancer or web server is running in a container that loaded the cert at startup — it's still serving the old one until restarted. Outage when the old cert expires.
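One cheap guard against this: compare what the server is actually serving with what renewal wrote to disk. A sketch with a hypothetical cert path; if the file holds a full chain, compare only the leaf block:

```python
import ssl

# Fetch the PEM cert the server is serving right now (no validation needed;
# we only care whether it matches the file).
served = ssl.get_server_certificate(("example.com", 443))

# Hypothetical path: wherever your renewal job writes the new cert.
with open("/etc/ssl/certs/example.com.pem") as f:
    on_disk = f.read()

if served.strip() not in on_disk:
    print("served cert differs from on-disk cert; did anything reload after renewal?")
```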
The DNS record someone deleted
DNS-01 challenge needs a TXT record. A team cleaning up "stale" DNS records deletes it. Renewal fails. The error is buried in a log nobody reads.
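If your setup relies on a persistent challenge record (for example a delegation record another team might mistake for cruft), it's cheap to probe it directly. A sketch assuming the third-party dnspython package and a hypothetical domain:

```python
import dns.resolver  # pip install dnspython

try:
    answers = dns.resolver.resolve("_acme-challenge.example.com", "TXT")
    print([rdata.to_text() for rdata in answers])
except dns.resolver.NXDOMAIN:
    print("challenge record is gone: the next DNS-01 renewal will fail")
```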
The Let's Encrypt rate limit
You hit a transient deploy issue and cert renewal retries 50 times in an hour, tripping Let's Encrypt's rate limit. The next legitimate renewal attempt is throttled. Cert expires.
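The defensive pattern is to cap retries and back off instead of hammering the CA. A minimal sketch, where renew is a stand-in for whatever invokes your actual ACME client:

```python
import random
import time

def retry_renewal(renew, max_attempts=4, base_delay=900):
    """Cap attempts and back off exponentially so a transient failure
    can't burn through the CA's failed-validation rate limit."""
    for attempt in range(max_attempts):
        if renew():  # stand-in for the real ACME client call
            return True
        time.sleep(base_delay * 2**attempt + random.uniform(0, 60))
    return False
```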
The forgotten manual cert
One service uses a manually purchased EV cert from a year ago. The person who bought it left. Nobody renews it because nobody knows it's there.
Certs people forget to monitor
The certs most likely to bite you:
- Internal-only services (admin tools, dashboards) — hidden from customers, but expiry breaks employee workflows.
- Mail server certs (SMTP, IMAP). Mail clients show terrible errors when these expire.
- Custom domain certs on status pages. Status page is the worst place to have a cert error during an incident.
- Webhook receiver endpoints. Webhooks can fail silently if your cert breaks.
- API endpoints behind separate hostnames. Easy to forget if api.acme.com isn't in your main monitoring config.
- Marketing campaign landing pages on subdomains.
The pattern: anything with a public hostname that does TLS needs a cert monitor. Inventory all of them. Add monitoring for each.
A practical setup checklist
- Inventory every public hostname your business owns. Subdomains too.
- For each, add an SSL certificate check, separate from your HTTP uptime check (see the sketch after this checklist).
- Configure 30/14/3-day expiry alerts.
- Route 30-day alerts to email/Slack; 3-day alerts to SMS or pager.
- Quarterly: audit the inventory. New subdomains? New services? New marketing sites?
- Set up a calendar reminder to verify your renewal automation actually ran in the last 30 days.
- For wildcards, set 45-day alerts — the blast radius justifies the earlier warning.
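Put together, the whole checklist reduces to a loop over the inventory. A sketch with hypothetical hostnames, reusing the inspect_cert and alert_channel helpers from the earlier sketches:

```python
# Hypothetical inventory; in practice this list is the output of your audit.
INVENTORY = ["www.acme.com", "api.acme.com", "status.acme.com", "mail.acme.com"]

for hostname in INVENTORY:
    days = inspect_cert(hostname)["days_left"]
    channel = alert_channel(days)
    if channel:
        print(f"{hostname}: {days} days to expiry, alert via {channel}")
```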
This is one of the lowest-effort, highest-value monitoring patterns. The day it saves you from a public TLS error, it pays for the entire monitoring stack.