Securing Data at Rest with Cryptography

Over at Schneier on Security, Bruce Schneier has a post today about securing data on disk. Encryption is often sold as a panacea for security problems — which it’s not — but keeping people from reading your data if they steal your laptop is one thing encryption is really good at, and it’s an area where the real complexities of encryption (key management, key rotation, public key infrastructure) aren’t terribly important and can be safely neglected.

Schneier mentions Microsoft’s BitLocker in passing, and I wanted to add some detail. BitLocker is a whole-disk encryption system built into Windows Vista, and it integrates with the Trusted Platform Module if one is available (the TPM is a tamper-resistant chip on the mainboard that stores keys and performs cryptographic operations securely.) You tell BitLocker to encrypt your drive, then choose one of several options for how to store the key. The simplest mode stores the key in the TPM and retrieves it automatically on boot, which prevents someone from mounting the drive in another machine or under another operating system (and actually does make it significantly harder to get at the data on the disk without your password.) More secure modes also store the key in the TPM, but require either a PIN code from you or a startup key stored on a USB drive before it will be released. Thus, on booting your PC you enter your PIN or insert the USB drive, and the disk is unlocked.

The PGP product Schneier advocates encrypts the drive similarly to BitLocker, though rather than storing the key in the TPM it relies on a user-supplied passphrase to decrypt the key. While this is theoretically less secure (with the TPM, even the encrypted key is stored in tamper-resistant hardware and is difficult to access), in practice it makes little difference — it’s still quite secure, and unlike BitLocker it will let you encrypt drives other than the system drive.

However, one feature BitLocker has and PGP lacks is key escrow. Privacy activists normally regard key escrow as an anti-feature, remembering the Clipper Chip fiasco of the mid-’90s. But the purpose of BitLocker’s key escrow is not to give the government a back-door key; it is to make the system palatable for enterprise deployment. Large corporations have traditionally been unwilling to embrace whole-disk encryption products like PGP even on laptops carrying sensitive data, for fear that the person holding the key will forget the passphrase or simply leave the company and refuse to disclose it. Escrowing the BitLocker keys with the domain controller, where appropriate corporate officers can retrieve them, makes BitLocker “safe” for corporate use. If you’re not a domain member (i.e. it’s your home computer), the keys aren’t escrowed with anyone — there’s no government back-door.

Schneier rightly points out that an issue with any sort of whole-drive encryption is that it does not protect your data from government subpoena. If the government seizes your computer as evidence, it can (in the United States at least) subpoena the keys, and if you don’t turn them over you can be fined or jailed for contempt of court. This is not an issue for most (legal) data, but if you have something to hide from everyone, there are solutions other than the one Schneier posits (“just don’t keep data on your laptop that you don’t want subpoenaed.”) One option is the open-source disk-encryption tool TrueCrypt.

The problem with encrypted data on your disk is that it’s really obvious. It is not plausible to say “Oh, I don’t have any encrypted data” if served with a subpoena. For one thing, you probably have encryption software on your computer, and links to data that can’t be followed without decryption. But beyond that, encrypted data is provably, mathematically distinguishable from almost everything else.

Encrypted data consists of a binary blob with a uniform distribution across its entire data space — that is, any given byte is just as likely to be 00 as it is to be 01, 02, 03, or any other value. If you plotted it on a histogram, given enough data the graph would be approximately flat (subject to the variation and “clumpiness” always present in random data), and there would be no more repetition than expected by random chance. This is unlike every other type of data — executable programs, graphics, sound, word processor files, spreadsheets, etc. all have their own characteristic histograms and repeated patterns. Even compressed files have specific, recognizable headers and certain characteristic patterns (though they come closest to looking like encrypted data, since they have high entropy.) Thus, encrypted data stands out because it is “more random” than any other data on your hard drive. Since no one keeps large blobs of totally random noise on their hard drive, if one is found, it’s pretty certain to be encrypted data, and the courts know this (or at least can be convinced of it by expert witnesses.)
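This distinction is easy to demonstrate with a few lines of Python. The sketch below uses os.urandom as a stand-in for ciphertext, since the output of a good cipher is designed to be indistinguishable from uniform random bytes:

```python
import collections
import math
import os

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte; uniform random data approaches 8.0."""
    counts = collections.Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Ordinary English text has a lopsided byte histogram and low entropy.
text = b"the quick brown fox jumps over the lazy dog " * 1000

# Random noise (standing in for ciphertext) has a nearly flat histogram.
noise = os.urandom(len(text))

print(f"text:  {shannon_entropy(text):.2f} bits/byte")   # well under 8
print(f"noise: {shannon_entropy(noise):.2f} bits/byte")  # very close to 8
```

Run against real files, the same measurement separates ordinary documents from high-entropy blobs; compressed archives come close to 8 bits per byte too, which is why a forensic examiner falls back on their recognizable headers to tell them apart from ciphertext.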

TrueCrypt can place one encrypted volume inside another. Combined with the fact that it pads encrypted volumes with random noise, this gives you plausible deniability for your encrypted data. Essentially, it works as follows:

  1. You create a TrueCrypt volume on your hard drive with a specified size, say 10 GB. TrueCrypt reserves that much space, and fills it with random noise.
  2. You create a second TrueCrypt volume, with a different key, inside the first volume, with a smaller specified size, say 2 GB. TrueCrypt takes that space and fills it with different random noise.
  3. When you want to access encrypted data, you mount both volumes. You put really secret stuff on the inner volume, and moderately secret stuff (e.g. pirated MP3s) on the outer volume.
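The layout above can be sketched in a few lines of Python. This is a toy model, not how TrueCrypt is actually implemented: os.urandom stands in for real ciphertext, and the sizes and offset are made-up illustrative values.

```python
import os

VOLUME_SIZE = 1 << 20     # 1 MiB container (standing in for the 10 GB volume)
HIDDEN_SIZE = 1 << 18     # 256 KiB hidden region (standing in for the 2 GB one)
HIDDEN_OFFSET = 600_000   # hypothetical secret offset of the inner volume

# Step 1: the outer container is filled end-to-end with random noise.
container = bytearray(os.urandom(VOLUME_SIZE))

# Step 2: the inner volume's ciphertext (for a good cipher, itself
# indistinguishable from noise) is written at the secret offset.
inner_ciphertext = os.urandom(HIDDEN_SIZE)
container[HIDDEN_OFFSET:HIDDEN_OFFSET + HIDDEN_SIZE] = inner_ciphertext

# Step 3: to an examiner without the key and offset, every region of the
# container is just uniform noise; nothing marks where (or whether) the
# inner volume lives.
```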

Now, if someone gets your laptop, they can see that you have TrueCrypt installed, and that there is a 10 GB encrypted volume (as there’s a 10 GB blob of random noise on your hard drive.)  They force you to give them the key, and you do so.  This unlocks the outer volume, revealing the moderately secret files.  However, there is no sign that the inner volume exists.  Unless you know it’s there, and know the key, there is no way to distinguish the random noise of its encrypted files from the random noise TrueCrypt filled the outer volume with anyway.  There could be a dozen hidden volumes, or none — it’s impossible for anyone to know, and indeed, most people without a security mindset would never even think of such a thing.

Now, there are drawbacks to this technology.  If you mount the outer volume but not the inner one, neither TrueCrypt nor your operating system knows the inner volume exists.  This means that writing files to the outer volume may overwrite and destroy the inner volume if you’ve not mounted it.  This isn’t a major problem, but it is inconvenient, especially if you have many volumes (as you need to type in the passphrases of all of them every time you want to write to any of them.)  And no automation will help you, because having any would defeat the purpose — the existence of automation scripts would tip off a smart forensic investigator that your outer volume contains inner volumes.
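The clobbering hazard falls out of the same kind of toy model (again with os.urandom standing in for ciphertext, and made-up sizes and a hypothetical offset): a write to the outer volume that strays into the hidden region silently destroys the inner volume.

```python
import os

container = bytearray(os.urandom(1 << 20))       # outer volume: all noise
hidden_offset, hidden_size = 600_000, 1 << 18    # hypothetical inner volume
inner = os.urandom(hidden_size)                  # stand-in ciphertext
container[hidden_offset:hidden_offset + hidden_size] = inner

# With only the outer volume mounted, the filesystem sees the inner region
# as ordinary free space, so a large enough write reuses it.
big_file = os.urandom(500_000)
container[200_000:200_000 + len(big_file)] = big_file

# The write ran from offset 200,000 to 700,000, trampling the start of the
# inner volume at 600,000; its ciphertext is unrecoverable.
assert bytes(container[hidden_offset:hidden_offset + hidden_size]) != inner
```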

It’s an interesting solution to the problem of plausible deniability — using steganography to hide encrypted data in encrypted data.  Admittedly, Schneier’s solution (just don’t have the data at all) is even safer, but sometimes that’s not good enough.

