On user authentication

If you are asking yourself what parameters are best for Argon2id, whether scrypt would be a better choice, or whether PBKDF2-* would be good enough after all, stop worrying. No matter what you choose, chances are that your overall security is already far better than that of most companies recently featured on haveibeenpwned.

However, the current state of application/website authentication remains suboptimal.

Avoiding storage of cleartext passwords is a no-brainer. However, password stretching alone only mitigates the consequences of a database leak, and only if the passwords had enough entropy to begin with.

Encrypting the hashed passwords before storing them is a slight improvement. As long as the key is not stored in the very same database, this makes a set of leaked hashed passwords useless. Passwords can also be frequently reencrypted using a new key, so that if the key ever gets leaked, newly created/changed passwords will not be affected by the database leak.

Still, the threat model completely ignores at least two very likely scenarios: application server compromise and employees.

Password hashing and encryption typically imply knowing the cleartext password to start with. If application servers get compromised, passwords may be exfiltrated before they are hashed.

A related threat doesn’t require any kind of compromise: employees. Anyone with legitimate access to servers can potentially exfiltrate passwords as well, then monetize them, or use them after leaving the company to keep using the service for free on behalf of customers.

In this context, “password” refers to any kind of pre-shared, long-term secret. API keys naturally fit into the same category, and being less predictable than regular passwords doesn’t help at all against server compromises or insider leaks.

A public key authentication system solves this. Servers can store public keys as-is: a public key cannot be used for anything but verification. Forging a signature is assumed to be impossible without knowing the secret key.

Client certificates have been supported by web browsers and libraries for ages, so why aren’t they used much, except to authenticate traffic between servers within the same organization?

Long gone are the days when users accessed applications from a single device. People naturally expect to be able to log in from their phone, their tablet, and multiple computers simultaneously. And even though technically better alternatives exist, passwords remain by far the most practical way to achieve this.

Ignoring the practical aspect for a moment, and even if it might sound counter-intuitive, certificate-based authentication can actually reduce privacy.

The go-to way to create a secure channel, and use certificate-based client authentication, is to use TLS.

However, TLS versions up to 1.2 don’t encrypt certificates: neither server nor client certificates. TLS 1.3 does encrypt them, but older versions remain in use.

A pretty bad implication is that passive attackers can take advantage of this to reliably fingerprint users and devices, and link them to IP addresses. Disabling session resumption doesn’t help much against this.

On the other hand, passwords sent over an established secure channel are not affected by this.

So, no matter how you slice it, common authentication schemes definitely have room for improvement.

Would it be somehow possible to get the best of both worlds?

A solution

As it happens, yes. And solutions are obvious and have been known for a long time. But they haven’t been widely deployed, because sticking to well-established (and possibly bad or outdated) standards rightfully feels like the safest thing to do, for any definition of “safe”.

Establishing a secure channel is a solved problem. Use TLS. But since sending a non-ephemeral client public key before the key exchange is not that great, don’t do it. Establish the encrypted channel first, and use it to send the client public key for authentication next.

Passive attackers cannot distinguish between clients using their public key any more. Server-side, leaking public keys doesn’t have any catastrophic implications any more.

From a usability perspective, this doesn’t solve the problem of public key distribution across multiple devices. Unless key pairs are derived from domains and passwords.

In this scenario, password stretching can be done client-side and the result can then be used to generate the key pair. By design, password hashing functions are very taxing on memory and CPU cycles, making them a nice DoS attack vector when run server-side. Or, at the very least, provisioning servers accordingly represents a non-negligible expense.

Delegating the computation to clients solves this.

To authenticate a client, the server sends a nonce over the secure channel. The client signs it using the private key derived from its password and the server domain, then sends its public key along with the signature to the server. All the server has to do is verify that signature.

Implementing this is trivial. Here is an illustration using the libhydrogen API:

Client-side: generate the key pair

// Generate a deterministic seed from a password `domain|username|password`
// of length `passwd_len`, and an optional shared secret key `master_key`
// using a password stretching function.
// `domain` is a static application uuid/context/personalization string.

uint8_t kp_seed[hydro_sign_SEEDBYTES];
hydro_pwhash_deterministic(kp_seed, sizeof kp_seed, passwd, passwd_len,
                           CONTEXT, master_key, OPSLIMIT, 0, 1);

// Generate a key pair from the seed

hydro_sign_keypair kp;
hydro_sign_keygen_deterministic(&kp, kp_seed);

Server-side: send a nonce to the client

uint8_t nonce[32];
hydro_random_buf(nonce, sizeof nonce);

And send nonce to the client.

Client-side: build the challenge and compute its signature

// Compute a signature for `domain|username|nonce` (with proper separation,
// for example by prefixing components with their length) of length
// `challenge_len` using the secret key.

uint8_t signature[hydro_sign_BYTES];
hydro_sign_create(signature, challenge, challenge_len, CONTEXT, kp.sk);

And send username, kp.pk and signature to the server.
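
The "proper separation" mentioned in the comment above matters: without it, the pair ("ab", "c") and the pair ("a", "bc") would produce the same bytes to sign. Here is a minimal sketch of one way to build the challenge, assuming a 4-byte big-endian length prefix in front of each component. The names `build_challenge` and `put_component` and the exact encoding are illustrative assumptions, not part of libhydrogen. Whatever encoding is chosen, the client and the server must use the same one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write `len` as a 4-byte big-endian prefix, then the
 * bytes themselves. Returns a pointer to the first byte after the output. */
static uint8_t *put_component(uint8_t *p, const void *data, size_t len)
{
    p[0] = (uint8_t) (len >> 24);
    p[1] = (uint8_t) (len >> 16);
    p[2] = (uint8_t) (len >> 8);
    p[3] = (uint8_t) len;
    if (len > 0) {
        memcpy(p + 4, data, len);
    }
    return p + 4 + len;
}

/* Hypothetical helper: serialize domain|username|nonce into `out`.
 * Returns the challenge length, or 0 if `out` is too small or if a
 * component doesn't fit in a 32-bit length. */
size_t build_challenge(uint8_t *out, size_t out_max, const char *domain,
                       const char *username, const uint8_t *nonce,
                       size_t nonce_len)
{
    const size_t domain_len   = strlen(domain);
    const size_t username_len = strlen(username);
    size_t       remaining    = out_max;
    uint8_t     *p            = out;

    /* Reserve room for the three 4-byte length prefixes first */
    if (remaining < 12) return 0;
    remaining -= 12;
    if (domain_len > UINT32_MAX || domain_len > remaining) return 0;
    remaining -= domain_len;
    if (username_len > UINT32_MAX || username_len > remaining) return 0;
    remaining -= username_len;
    if (nonce_len > UINT32_MAX || nonce_len > remaining) return 0;
    remaining -= nonce_len;

    p = put_component(p, domain, domain_len);
    p = put_component(p, username, username_len);
    p = put_component(p, nonce, nonce_len);

    /* Sanity check: bytes written must match the space accounted for */
    assert((size_t) (p - out) == out_max - remaining);
    return (size_t) (p - out);
}
```

The returned length is what would be passed as challenge_len to hydro_sign_create() on the client and to hydro_sign_verify() on the server.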

Server-side: verify the signature of the challenge

Reconstruct the challenge from username and the nonce that was sent. If checking that kp.pk maps to a valid user is faster than a scalar multiplication, do that check first. Then verify the signature of the challenge:

if (hydro_sign_verify(signature, challenge, challenge_len,
                      CONTEXT, kp.pk) != 0) {
    abort(); // bail out
}
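
The cheap pre-check that runs before the signature verification could look like the sketch below. It assumes an in-memory table of registered users. The `user_record` type, the `pk_matches_user` function and the `PK_BYTES` constant (assumed to match hydro_sign_PUBLICKEYBYTES) are hypothetical names introduced for illustration. A real application would most likely query its user database instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PK_BYTES 32 /* assumption: same size as hydro_sign_PUBLICKEYBYTES */

/* Hypothetical user record: a name and the public key registered for it */
typedef struct {
    const char *username;
    uint8_t     pk[PK_BYTES];
} user_record;

/* Returns 1 if `username` exists and `pk` is the key registered for it,
 * 0 otherwise. Public keys are not secret, so a plain memcmp() is fine. */
int pk_matches_user(const user_record *users, size_t users_count,
                    const char *username, const uint8_t pk[PK_BYTES])
{
    size_t i;

    assert(users != NULL || users_count == 0);
    for (i = 0; i < users_count; i++) {
        if (strcmp(users[i].username, username) == 0) {
            return memcmp(users[i].pk, pk, PK_BYTES) == 0;
        }
    }
    return 0; /* unknown user */
}
```

When this returns 0, the server can reject the request right away and skip the signature verification.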

Done

If multiple users sharing the same name and password within the same service is part of your threat model, the server can assign a random master_key to each user, and send it along with the nonce. This requires an initial packet from the client to the server carrying the user name. In this context, to prevent enumeration, a pseudorandom master_key should be returned for a nonexistent user.

The simplicity of this protocol compared to PAKEs resides in the fact that we assume the existence of an already established secure channel. This is a very reasonable assumption for any modern application.

Note that the OPAQUE draft proposes integrating the OPAQUE scheme with TLS 1.3. If it ever gets actually implemented, it will be a game changer.

Even though they should be augmented with a second factor, passwords are not dead. They won’t be anytime soon.

However, the way they are currently stored is not great. I would encourage new applications to consider schemes that don’t require servers to get a copy of a password, API key, or anything that resembles a non-ephemeral shared key, at any point in time.

OPAQUE introduces a very interesting idea: performing a two-party computation in order to blind the salt. The server doesn’t know the salt, and the client requires an interaction with the server to compute it.

I implemented that idea in a complete authentication example using WebAssembly: Access control example.