Backdoored PNRGs from the NSA

Bruce Schneier has an article at about the new government-sponsored official standards for random number generators in NIST Special Publication 800-90.  Apparently, it’s possible that one of them contains a back-door for the NSA; depending on how the constants in the algorithm were chosen, the NSA may have another set of constants that let them predict the “random” numbers generated by the algorithm.

To people not very familiar with cryptography, it may seem odd that random number generators are very significant.  However, all modern key-based cryptography is based on having a source of entropy (true randomness) — somewhere it can get a key that is unlikely to be guessed or otherwise determined.  When we talk about “40-bit” or “128-bit” encryption, we’re really talking about the key length, which provides an upper bound on available entropy.  Ideally, cryptography would be based on true random numbers, for which every bit of number is a bit of entropy.  However, true random numbers have to be generated physically — we have devices that do it based on radioactive decay, but you can also get it by asking a human to move a mouse around or bang on a keyboard, as PGP does when generating keys.  Thus, for most applications, we settle for pseudo-random number generators — programs which generate a stream of numbers that are unrelated to each other, have a uniform distribution, and are for most purposes entirely random.

However, a psuedo-random number generator usually needs a seed — a starting point for the generator.  If you use the same seed, you’ll get the same stream of “random” numbers.  Thus, the seeds chosen are usually very large numbers.  Cryptographic pseudo-random number generators are considerably more processor-intensive than the regular “random” number generators used in non-security applications, as they’re usually based on multiple iterations of a hashing algorithm.

What happens if your pseudo-random number generator isn’t very good?  Well, in the early 2000s, an online casino in the Caribbean (I wish I could remember the name of it to provide a link to the news coverage) lost several million dollars.  Apparently, a player realized that to shuffle the decks of cards, they used a standard, non-cryptographic random number generator — the sort of thing that’s built into Windows and Linux and such.  A shuffled deck of cards is very random — there are 8×1067 ways to shuffle a deck, which is about 225 bits of entropy. However, the random number generator used only a 32-bit seed!  There are only 4×109 32-bit numbers.  This is still a lot, but with modern computer aids, it’s a manageable number.  So what did this player do?  He had his computer generate shuffled decks for each of the four billion 32-bit seeds.  He then wrote a program that let him enter specific cards that were drawn (e.g. “fourth card was a queen of spades, fifth card was a 9 of diamonds…”) based on the draws he could see (such as his own cards in poker, or the up cards in blackjack) and it would pare down the four billion decks to the ones that could have potentially produced those draws.

It turns out that when you know that almost all decks are invalid (not able to be generated by the random number generator in use), there aren’t many decks that can produce a given set of cards.  Thus, within 3-5 known cards, his program would spit out the entire deck, and that player could now predict the future.  He would know exactly what cards would be coming out, and what ones already had.  Thus, poker and blackjack were trivial, and he won a ton of money.

Many things in cryptography operate similarly.  If you can predict the random numbers being used, you drastically simplify cracking the code.  It is generally still not what a layman would call simple — but it brings a message from “even the National Security Agency with its thousand acres of supercomputers couldn’t crack it in our lifetime” to “it’s still out of reach for you and I, but, well, the NSA could probably crack it in a day or two.”  Well-funded, skilled adversaries can use any small defect in a cryptosystem that lowers entropy to shorten the time to break codes.

And that’s why the NSA would be interested in putting a back-door in a pseudo-random number generator.  Did they actually do this?  In my opinion, the evidence Schneier presents is pretty convincing, and while Schneier is today best known as a popularizer of security rather than a technical expert, one would do well to remember that he also wrote Applied Cryptography, a very technical book that sits on the bookshelf of basically every security developer, including mine.  The NIST publication presents four random number generators, based on different algorithms, and then recommends the use of one, Dual_EC_DRBG, that is about 1,000 times slower than the other three.  Unlike the others (Hash_DRBG, HMAC_DRBG, and CTR_DRBG), however, with this particular algorithm it would be possible to craft a set of input constants that are defective in a specific way — such that someone armed with a corresponding set of constants could predict the output of the generator.

Now, we don’t have proof that the NSA actually did this.  It’s possible that the input constants in the NIST publication are truly random, chosen arbitrarily, and the NSA does not have a matching key that will break the generator.  But the NSA is pretty smart, and almost certainly knew about the flaw in the algorithm — in general, people in the cryptographic industry assume that the NSA is a few years ahead of them and just hasn’t said so.  The old adage about not attributing to malice what simple incompetence will explain usually applies to government pretty well, but not to the NSA.

Really, this is a rather ingenious way to backdoor a crypto algorithm.  The normal method — just make a cryptosystem with a mathematical flaw or known backdoor key — has a serious issue: if you can figure out the mathematical flaw, so can someone else.  The NSA wants to be able to listen to our phone calls — it doesn’t also want every other country to be able to do so.  To backdoor a cryptosystem requires making it so you can read messages without also weakening it for everyone else.  This method does exactly that — without the specific numbers that match the provided input constants, the system isn’t flawed at all.  The NSA has the key (if, indeed, they do), and no one else does.  Putting it in the random number generator rather than the cryptosystem itself is a good way to draw attention away from it, too.

And if the NSA didn’t choose the constants to have a backdoor, why recommend an elliptic-curve based generator that’s three orders of magnitude slower than several other generators, all believed to be just as secure, that are based on much more easily understood mathematics like hashing?  It just doesn’t seem to make much sense.

crypto, legal, privacy, society

If you enjoyed this post, please consider to leave a comment or subscribe to the feed and get future articles delivered to your feed reader.