“Warning: Potential Security Risk Ahead! Your browser detected an issue and did not continue”.
Expired certificates are unfortunately a very common issue that affects a wide range of services. Websites, DNS-over-HTTPS, DNS-over-TLS, SMTPS, IMAPS: pretty much anything that depends on certificates suddenly becomes inaccessible once those certificates are no longer valid for the current date.
Why does it happen?
Mainly because installing and updating certificates used to be a manual and tedious process.
CAs used to be awful, automation was complicated, and once a certificate had been successfully installed, the need to replace it looked like a distant future that wasn’t worth worrying about.
Things have improved since, thanks to Let’s Encrypt and vendors such as SSLMate finally offering a decent way to deal with certificates programmatically.
Still, certificates keep expiring, sometimes with catastrophic implications. What recently happened to Firefox illustrates that, even in 2019, this is not a solved problem.
DNSCrypt and certificates
The DNSCrypt protocol also uses certificates. The way they are validated is simple and secure.
A regular DNS query is used to retrieve the server’s certificates, which are verified using a public key already known by the client. Among the certificates that match the preferred cipher suite and are still valid for the current date, the one with the highest serial number is chosen automatically.
This mechanism allows for seamless key rotation (validity periods can overlap), revocation, and cryptographic agility.
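To make the selection rule concrete, here is a minimal sketch. It is not the code of any actual client: the `Cert` fields and the `select_cert` function are hypothetical names, signature verification is assumed to have already happened, and the client is assumed to have a single preferred cipher suite.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Cert:
    # Hypothetical, simplified view of a certificate whose signature
    # has already been verified against the known public key.
    serial: int
    cipher_suite: str
    not_before: int  # Unix timestamp, inclusive
    not_after: int   # Unix timestamp, inclusive


def select_cert(certs: Iterable[Cert], preferred_suite: str, now: int) -> Optional[Cert]:
    """Return the currently valid certificate with the highest serial, or None."""
    candidates = [
        c for c in certs
        if c.cipher_suite == preferred_suite and c.not_before <= now <= c.not_after
    ]
    # Overlapping validity periods are fine: the highest serial wins.
    return max(candidates, key=lambda c: c.serial, default=None)
```

Because the highest valid serial always wins, a server can publish a new certificate before the old one expires, and clients move over to it without any coordination.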
I designed DNSCrypt in 2011. Certificate management was no different than in today’s version, with one exception: how long a certificate should be valid for was not specified. In fact, there was no specification yet.
But dnscrypt-wrapper, the server-side proxy, created certificates valid for 365 days by default.
Most resolvers just used that default value.
Time passed, and virtually all resolvers failed exactly 365 days after they had initially been set up. Including servers operated by large companies such as Yandex.
Unlike with CAs, certificate creation was a simple process that only required a couple of shell commands. Still, people eventually forgot about them, causing service disruption. And more than once, when it happened, the secret key originally used for signing had been lost, requiring a client update.
How the expired certificate problem has been solved
The DNSCrypt protocol version 2 was specified in 2015.
Besides introducing a new cipher suite (XChaCha20-Poly1305) and solving the DNS amplification problem without requiring TCP, it introduced a surprising requirement:
Certificates should not be valid for more than 24 hours.
dnscrypt-wrapper was updated to generate short-term certificates only. The dnscrypt server Docker image rotates certificates every 8 hours.
The reference client implementation, dnscrypt-proxy, accepts certificates with any expiration date, but prints the warning “The key rotation period for this server is excessively long” if a certificate is valid for more than 24 hours.
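A check of this kind fits in a few lines. The sketch below is only an illustration of the idea, not the dnscrypt-proxy source: `rotation_warning` and `MAX_ROTATION_PERIOD` are hypothetical names, and only the warning text comes from the client.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Maximum validity period recommended by version 2 of the protocol.
MAX_ROTATION_PERIOD = timedelta(hours=24)


def rotation_warning(not_before: datetime, not_after: datetime) -> Optional[str]:
    """Return a warning if the certificate is valid for more than 24 hours."""
    if not_after - not_before > MAX_ROTATION_PERIOD:
        return "The key rotation period for this server is excessively long"
    # The certificate is accepted either way; short-lived ones just stay silent.
    return None
```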
Having very short-lived certificates has multiple advantages.
First, it tremendously helps with forward security: if a server is compromised or a key leaked today, yesterday’s traffic cannot be decrypted. The key has already changed.
More importantly, it forces operators to set up automation from day one. When a new server is put online, if the scripts to rotate the keys haven’t been set up or don’t work, this will be immediately discovered.
The service will not fail a year from now. It will fail a couple of hours later.
On the other hand, if automation has been set up and works, rotating the key every year, every quarter or 3 times a day doesn’t make any operational difference. If it works for more than 24 hours, it will keep working forever.
This requirement, along with a ready-to-use Docker image implementing it, has been effective at reducing the number of servers with expired certificates, while improving security.
Dealing with long-term certificates client-side
Still, some DNS operators chose to create keys lasting for more than 24 hours, maybe due to operational constraints.
This is fine, but how can we prevent service disruption if these operators forget about key renewal?
That problem was largely solved with a handful of lines of code in the dnscrypt-proxy client.
Users get an informational warning 30 days before the expiration of a certificate required by a server they use, another message at a higher severity level 7 days before the expiration, and a critical message if the certificate has less than 24 hours left.
This, of course, only applies to certificates originally having a long validity period.
These messages give users a chance to inform DNS operators about the forthcoming certificate expiration before it’s too late. Not after the fact.
Users can also preventively switch to other servers in order to avoid losing service.
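The notification tiers described above can be sketched as follows. This is an illustration under stated assumptions rather than the actual client code: the function name, the severity labels and the `long_validity` threshold are all hypothetical. The threshold defaults to 30 days here only so that a freshly issued certificate does not trigger a notice right away.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def expiry_severity(
    not_before: datetime,
    not_after: datetime,
    now: datetime,
    long_validity: timedelta = timedelta(days=30),  # assumed threshold
) -> Optional[str]:
    """Map the remaining lifetime of a long-lived certificate to a severity."""
    # Short-lived certificates are expected to be rotated automatically.
    if not_after - not_before <= long_validity:
        return None
    remaining = not_after - now
    if remaining <= timedelta(0):
        return "expired"
    if remaining < timedelta(hours=24):
        return "critical"
    if remaining <= timedelta(days=7):
        return "warning"
    if remaining <= timedelta(days=30):
        return "notice"
    return None
```

Checking the most urgent tier first keeps the function a simple cascade, and a client can call it every time it refreshes a server's certificates.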
Combining strict requirements with conforming implementations and preventive client-side notifications has been incredibly effective.
I don’t recall any DNSCrypt server from the list of public DNS servers having had an expired certificate in the past 2 or 3 years.