Notes on cryptography and information theory

The birthday paradox in cryptanalysis

Entry 005 · 2026-03-21

The birthday paradox is the everyday observation that in a room of 23 people, the probability of a shared birthday is over 50%. The underlying mathematics: for N equally likely outcomes, collisions in random samples become likely once the sample size approaches √N. The intuition fails because we mentally privilege our own birthday; the question is about any pair.

In cryptography the paradox sets the difficulty of finding collisions in any function with N-sized output. If H outputs 128 bits, generic collision search costs roughly 2^64 evaluations — much less than the 2^128 cost of inverting H on a specific target. This is why 128 bits of hash output gives 128-bit preimage resistance but only 64-bit collision resistance, and why 256-bit hashes are the default for any setting where collision resistance matters.

The classic example is the chosen-prefix collision attack on MD5, which made it possible to forge X.509 certificates in 2008. MD5's 128-bit output gives 64-bit generic collision resistance, and structural weaknesses brought the effective complexity down to roughly 2^39, within a single workstation's reach. The Flame malware (2012) exploited a related weakness for code-signing.

The paradox also drives the choice of nonce size. A 96-bit nonce gives 2^48 birthday-bound collision resistance — fine for a single device generating nonces, less fine for a population of devices each picking nonces independently from the same pool. AES-GCM's 96-bit IV size is the most-debated example: with random IVs, the recommendation is to limit each key to fewer than 2^32 encryptions, well below the per-IV nominal limit.