Although they are omnipresent in constrained environments and lightweight protocols, small (32-bit, 64-bit) block ciphers have a bad reputation. They are perceived as antiquated, insecure, and something to stay away from in any new application, especially in software implementations.
To some extent, thinking that way is safe: as a general building block, larger block ciphers are more versatile and offer better security and more generous usage limits than small block ciphers. In fact, driven by modern applications and protocols, the trend in some contexts is to recognize that 128 bits are not enough, and to shift to large block ciphers such as Rijndael-256 and Vistrutah (a NIST submission), as part of NIST’s wide-block standardization effort, or to public permutations such as Keccak.
The main problem with small block ciphers is that, well… they are small. Generally, we want the set of possible inputs and outputs to be large enough that it’s practically impossible to enumerate. If that set is small, that property no longer holds. We also want the output of a cipher to be indistinguishable from random for an adversary. But with a 32-bit block cipher, collisions become likely after only about 2^16 blocks for a given key, which is ridiculously low, and the distinguishing advantage only grows from there. Using them securely is possible, but requires very careful design of application-specific protocols.
Nonetheless, small block ciphers can remain very useful, even in modern applications because in spite of their limitations, they remain block ciphers.
Block ciphers
A block cipher is a fundamental building block in symmetric cryptography.
Imagine a list of N elements. Then, you put them all in a bag, shake the bag really well, and randomly grab all the elements one by one, to form a new list. There are still N elements, just in a completely different order. The first one may now be the 23948234th one, etc. We just applied a random-looking permutation.
If we know what the permutation is, we can easily invert the process, and map element 23948234 back to the first one.
A block cipher is a set of two functions:
- A function `P(k, x) -> x'` that takes a secret key `k`, generates a random-looking permutation from it, and returns the image of `x` under that permutation.
- A function `P'(k, x') -> x` that applies the inverse permutation and recovers `x` from `P(k, x)`.
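To make those two functions concrete, here is a toy 32-bit keyed permutation built from a 4-round Feistel network with a hash-based round function. All names (`P`, `P_inv`, `_round`) are illustrative, and this construction is deliberately simplistic, not a vetted cipher:

```python
import hashlib

def _round(key: bytes, i: int, half: int) -> int:
    # Toy round function: derive 16 pseudo-random bits from the key,
    # the round index, and one 16-bit half of the block.
    digest = hashlib.sha256(key + bytes([i]) + half.to_bytes(2, "big")).digest()
    return int.from_bytes(digest[:2], "big")

def P(key: bytes, x: int) -> int:
    # Split the 32-bit input into two 16-bit halves and run 4 Feistel rounds.
    l, r = x >> 16, x & 0xFFFF
    for i in range(4):
        l, r = r, l ^ _round(key, i, r)
    return (l << 16) | r

def P_inv(key: bytes, y: int) -> int:
    # Undo the rounds in reverse order to recover the original input.
    l, r = y >> 16, y & 0xFFFF
    for i in reversed(range(4)):
        l, r = r ^ _round(key, i, l), l
    return (l << 16) | r
```

For every key, `P(key, ·)` is a permutation of the 32-bit integers, and `P_inv(key, P(key, x)) == x` for all `x`.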
Without the secret key k, the permutation cannot be recovered, so given x', guessing what x is requires brute-forcing the key space or, for small block sizes, exhausting the (limited) set of possible inputs, as discussed below.
Counters and other small inputs
A common way to use block ciphers is to encrypt a counter. With a 32-bit block cipher, you can safely encrypt a counter from 0 to 2^32 - 1 without getting a collision, as long as the counter doesn’t wrap around and the same key isn’t reused for a different counter domain. This is still counting, just in a shuffled order.
And in real-world applications, counters are everywhere: packet/frame indices, account identifiers, message numbers, version numbers, and anything using a monotonically incrementing key in a database.
The value of some of these counters may reveal sensitive information. Creating accounts on a website and observing changes in account identifiers is a pretty reliable way to guess how many customers the company has, and if that number is increasing or decreasing. Quite sensitive data for a public company. Similarly, opening support tickets and observing the ticket number leaks information about the company’s business.
A common mitigation is to replace counters with UUID or random identifiers, even in environments providing strong consistency guarantees.
That works, with large enough identifiers.
With UUIDv4’s 122 random bits, generating ~2^48 identifiers gives a collision probability around 2^-27, which is small but not negligible at scale.
But with a 64-bit block cipher and a counter, up to 2^64 identifiers can be safely generated with a collision probability of exactly 0.
If your application doesn’t need more than 2^32 identifiers, a 32-bit block cipher will do the same job while keeping storage requirements to a minimum.
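The numbers above follow from the standard birthday-bound approximation p ≈ n² / 2^(b+1) for n values drawn uniformly from a space of 2^b. A quick sanity check (the helper name is mine, not a standard function):

```python
def birthday_collision_prob(n: int, bits: int) -> float:
    # Birthday-bound approximation: p ~= n^2 / 2^(bits + 1).
    # Accurate when the resulting probability is small.
    return n * n / 2 ** (bits + 1)

# ~2^48 UUIDv4 identifiers over 122 random bits:
# collision probability ~2^-27, small but nonzero.
p_uuid = birthday_collision_prob(2 ** 48, 122)

# An encrypted counter, by contrast, is a permutation of its input space:
# below the block size, the collision probability is exactly 0.
```

The contrast is the point: random identifiers accumulate collision risk as n grows, while an encrypted counter has none until the counter space is exhausted.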
Keep using a regular counter, encrypt it with a block cipher before making it public, and decrypt that later if needed to recover the original counter value.
No trusted source of randomness is necessary, and no accurate clock is necessary. The encrypted values themselves won’t be ordered, but you can keep the original counter for internal indexing and only expose the encrypted form publicly.
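A sketch of that workflow, using the same kind of toy 4-round Feistel permutation for illustration (the `encrypt_id`/`decrypt_id` names are mine, and the construction is a stand-in for a real small block cipher, not a vetted one):

```python
import hashlib

def _f(key: bytes, i: int, half: int) -> int:
    # Toy round function (illustrative only, not a vetted cipher).
    d = hashlib.sha256(key + bytes([i, half >> 8, half & 0xFF])).digest()
    return (d[0] << 8) | d[1]

def encrypt_id(key: bytes, counter: int) -> int:
    # 4-round Feistel over 32 bits: maps each counter value to a
    # unique, opaque public identifier.
    l, r = counter >> 16, counter & 0xFFFF
    for i in range(4):
        l, r = r, l ^ _f(key, i, r)
    return (l << 16) | r

def decrypt_id(key: bytes, public_id: int) -> int:
    # Invert the rounds to recover the original counter value.
    l, r = public_id >> 16, public_id & 0xFFFF
    for i in reversed(range(4)):
        l, r = r ^ _f(key, i, l), l
    return (l << 16) | r
```

Internally, customer n keeps the counter value n as its key; only `encrypt_id(key, n)` is ever exposed publicly, and the original counter can always be recovered with `decrypt_id`.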
UUIDs and small block ciphers
What about UUIDs?
UUIDv1 allows adversaries to recover the exact creation time (down to 100 ns resolution), the clock sequence behavior, and a unique identifier of the machine that generated it. The timestamp alone is enough to leak traffic patterns, record creation rates, and sometimes even enable correlation across systems. It’s also predictable. Because UUIDv1 is basically “timestamp + sequence,” once an attacker sees a few UUIDs, they can often predict nearby values. That makes UUIDv1 completely unsuitable for public identifiers, and for anything used in URLs or APIs where guessing matters.
UUIDv4 is 122 random bits (the remaining 6 are fixed version and variant markers).
UUIDv6 is “ordered UUIDv1,” mostly for legacy. RFC 9562 basically says: v6 exists to improve DB locality for environments already using v1, but if you’re not tied to v1, you should use v7 instead.
So what about UUIDv7? It’s explicitly time-ordered using a Unix-epoch millisecond timestamp, with the remaining bits used for randomness and/or a counter to keep monotonicity within the same millisecond.
This gives you much better insertion locality than UUIDv4 in many B-tree-style indexes. However, this requires properly synchronized clocks, and v7 leaks creation time. RFC 9562 calls out that timestamps create a (small) attack surface, and if UUIDs are used for security-sensitive purposes, it recommends UUIDv4 instead.
So, you can use UUIDv7 as the database primary key, but don’t treat it as secret.
Meanwhile, the output of a block cipher hides the underlying value, while being much smaller than a UUID, which definitely matters at scale in databases. Being deterministic, it still leaks equality (same input produces same output), distinctness, and frequency patterns, but the actual values remain hidden.
Small block ciphers work well with counters, but they work equally well with anything else that fits within the block size.
And timestamps are no exception: if they represent sensitive information, a small block cipher is a simple, efficient, usually format-preserving way to encrypt and decrypt them.
If you need UUIDs, why not combine them with a small block cipher, so that the timestamp is not leaked, while the UUIDs’ properties are retained?
UUIDv7 is essentially a 48-bit timestamp followed by randomness plus version/variant bits, and optionally a counter to ensure monotonicity within the same millisecond.
Concatenate a 48-bit timestamp, 16 bits of counter or randomness, and 64 bits of extra randomness to form your UUIDs.
Then, encrypt the first 64 bits with a 64-bit block cipher before exposing them publicly.
The encrypted portion won’t be lexically ordered, but the original UUIDv7 can still serve as the internal primary key for indexing. Expose only the UUID with the encrypted prefix publicly, and decrypt when you need the original back.
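A minimal sketch of that scheme, assuming a 64-bit toy Feistel permutation as the block cipher (the `make_id`, `encrypt64`, and `decrypt64` names are mine; a real deployment would use an actual 64-bit cipher, and this sketch ignores the UUID version/variant marker bits for simplicity):

```python
import hashlib
import os
import time

def _f(key: bytes, i: int, half: int) -> int:
    # Toy round function over 32-bit halves (illustrative, not a vetted cipher).
    d = hashlib.sha256(key + bytes([i]) + half.to_bytes(4, "big")).digest()
    return int.from_bytes(d[:4], "big")

def encrypt64(key: bytes, x: int) -> int:
    # 4-round Feistel permutation over 64 bits.
    l, r = x >> 32, x & 0xFFFFFFFF
    for i in range(4):
        l, r = r, l ^ _f(key, i, r)
    return (l << 32) | r

def decrypt64(key: bytes, y: int) -> int:
    l, r = y >> 32, y & 0xFFFFFFFF
    for i in reversed(range(4)):
        l, r = r ^ _f(key, i, l), l
    return (l << 32) | r

def make_id(key: bytes) -> tuple[int, int]:
    # UUIDv7-style layout: 48-bit millisecond timestamp, 16 random bits,
    # then 64 extra random bits.
    ms = int(time.time() * 1000) & ((1 << 48) - 1)
    prefix = (ms << 16) | int.from_bytes(os.urandom(2), "big")
    suffix = int.from_bytes(os.urandom(8), "big")
    internal = (prefix << 64) | suffix                 # time-ordered, internal PK
    public = (encrypt64(key, prefix) << 64) | suffix   # timestamp hidden
    return internal, public
```

The `internal` value keeps its insertion locality for indexing; only `public` leaves the system, and `decrypt64` on its top 64 bits recovers the original time-ordered prefix.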
Downsides
So far, small block ciphers sound pretty useful. But they do come with real limitations that you need to understand before reaching for one.
The obvious downside of a block cipher is that, for a given key, after observing the encryptions of M distinct inputs, we know that the encryption of any input we haven’t seen before cannot be one of the M outputs we already observed.
So, for a set of N possible outputs, after having observed M distinct outputs, the probability of guessing the encryption of a new input is 1/(N-M) per attempt.
This is where small block ciphers fall short. If an adversary’s goal is to decrypt a ciphertext, and they have an oracle telling them whether a guess is right, depending on the protocol, they may be able to do it.
Small block ciphers are thus generally a bad idea against active adversaries.
However, they can be very useful against passive adversaries whose capability is limited to observing identifiers, who are then unable to map them to the original value.
Usage of small block ciphers is also not incompatible with traditional mechanisms for tamper detection such as MACs.
Another downside is that, as with any cipher, the secret key must remain secret. With random identifiers, only the random number generator needs to be secure.
So, what small block ciphers should I use?
There are plenty of small block ciphers designed for hardware that don’t perform as well in software.
In software on general-purpose CPUs, and as scary as it may sound, the best options were designed by the NSA: the SIMON and SPECK Families of Lightweight Block Ciphers.
While their design is not radically different from well-known, well-trusted ciphers, there was originally a lot of pushback due to their origin.
And, as a side effect, they got a ton of 3rd-party analysis. Due to their simplicity, they are also the first targets of new tools and techniques for cryptanalysis. However, after more than 13 years of 3rd-party analysis, there are still zero practical attacks against them, and they still have a decent security margin. New cryptanalysis techniques have been developed, but they are generic techniques that aren’t vastly more effective on these block ciphers than other ciphers of the same size.
And if the NSA really knows of completely novel techniques to break these ciphers that no one in academia ever explored, which is very unlikely, they can apply the same techniques to other widely used ciphers sharing the same construction.
Unlike the Dual_EC_DRBG backdoor, there’s little room to hide a backdoor in plain sight in such ciphers. SIMON’s round constants are generated by a simple LFSR with a fixed primitive polynomial. The base value doesn’t come with a justification like “digits of pi”, but in that context, anything not showing signs of symmetry is fine. SPECK avoids constants almost entirely.
TLDR: after more than 13 years, extensive cryptanalysis exists, no practical backdoors have been found, and attacks generally align with what the designers claimed. The only expectations they don’t meet are cultural. After all that time, it’s time to move on and consider them for their properties and applications.
SIMON and SPECK are small, simple, and fast. They can be implemented in ~20 lines of code in any programming language, and will have good performance on a wide range of CPUs.
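As a sketch of that claim, here is SPECK32/64 (32-bit block, 64-bit key) in roughly twenty lines of Python, checked against the test vector from the designers’ paper; function names are mine:

```python
def rol(v: int, r: int) -> int:
    # 16-bit rotate left
    return ((v << r) | (v >> (16 - r))) & 0xFFFF

def ror(v: int, r: int) -> int:
    # 16-bit rotate right
    return ((v >> r) | (v << (16 - r))) & 0xFFFF

def expand_key(k3: int, k2: int, k1: int, k0: int) -> list:
    # SPECK32/64 key schedule: 22 round keys from four 16-bit key words.
    k, l = [k0], [k1, k2, k3]
    for i in range(21):
        l.append((((k[i] + ror(l[i], 7)) & 0xFFFF)) ^ i)
        k.append(rol(k[i], 2) ^ l[i + 3])
    return k

def speck_encrypt(rk: list, x: int, y: int) -> tuple:
    for key in rk:
        x = ((ror(x, 7) + y) & 0xFFFF) ^ key
        y = rol(y, 2) ^ x
    return x, y

def speck_decrypt(rk: list, x: int, y: int) -> tuple:
    for key in reversed(rk):
        y = ror(y ^ x, 2)
        x = rol(((x ^ key) - y) & 0xFFFF, 7)
    return x, y
```

With key words `(0x1918, 0x1110, 0x0908, 0x0100)`, plaintext `(0x6574, 0x694C)` should encrypt to `(0xA868, 0x42F2)`, matching the published test vector.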
They are versatile, supporting block sizes of 32, 48, 64, 96 and 128 bits, with a key size from 64 to 256 bits. For a 128-bit block cipher, you’d better use AES. But for something smaller, they are a solid choice.
And if, in spite of existing analysis, you still feel nervous about structural attacks and some edge-case distinguishers, there’s a simple, cheap addition you can make: key whitening (the FX/Even-Mansour construction). XOR additional key material into the block before the first round and after the last round. This has been formally shown to improve security bounds under idealized models. Done.
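The whitening wrapper itself is two one-liners. Here it is sketched around a stand-in 32-bit permutation (multiplication by an odd constant is invertible modulo 2^32, but it is NOT a cipher; it only serves to demonstrate the wrapper, and all names are mine):

```python
def fx_encrypt(encrypt_block, pre: int, post: int, x: int) -> int:
    # FX / Even-Mansour-style whitening: XOR extra key material
    # into the block before and after the core cipher.
    return encrypt_block(x ^ pre) ^ post

def fx_decrypt(decrypt_block, pre: int, post: int, y: int) -> int:
    return decrypt_block(y ^ post) ^ pre

# Stand-in 32-bit permutation for demonstration only (not a cipher):
ODD = 0x9E3779B1
perm = lambda x: (x * ODD) & 0xFFFFFFFF
perm_inv = lambda y: (y * pow(ODD, -1, 1 << 32)) & 0xFFFFFFFF
```

In a real deployment, `encrypt_block`/`decrypt_block` would be the small block cipher keyed as usual, and `pre`/`post` would be independent whitening keys.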
Want to give them a spin? Try, for example, these easy-to-use implementations of SPECK, SIMON and SIMECK in Zig and Rust.
Should you use small block ciphers?
Generally, no. They are not generic ciphers that are safe to use in a wide range of scenarios. If they are large enough and the random number generator can be trusted, random identifiers are also a more secure option. Maybe dates, frame numbers, account identifiers and other publicly exposed numbers are not considered sensitive data, so you don’t have to worry about them at all. It all depends on your applications and protocols.
However, small block ciphers are something to keep in your toolbox. You don’t have to systematically reach for AES-GCM. But even if you do, maybe the public nonce leaks information? In that case, there’s a simple way to hide it: use a small block cipher.