Archive for November, 2007
Why Hackers Love Wi-Fi
Hackers love wireless networking. At DefCon 15, it was easy to predict which sessions would have lines running out the door and require getting there well in advance for a seat - it was the sessions with “wireless” or “Wi-Fi” in the title. The Wireless Village was very popular, and many of the hacking contests involved wireless access points.
Why do hackers love wireless networks? Really, there are two reasons, and those two together have some scary implications for risk on the modern Internet.
1.) Wireless Networks Use Shared Media
Back in the 80’s and 90’s, most wired Ethernet networks were based on shared media topologies. In principle, when you plugged into an Ethernet network and sent a packet, the packet on the wire (the actual electrical impulses) went to every other machine on the network. Hubs were simple repeaters, broadcasting everything they received. Only when your signals reached the router at the Internet edge were they actually intelligently processed. Thus, every computer on the LAN got every packet - the network cards just threw away any packets whose destination address specified another computer. However, a hacker wanting to eavesdrop on others had an easy job - just toggle the network card into “promiscuous mode” (a hard task on some network cards and OSs, but completely trivial on others) and it will receive every packet, giving you a god’s-eye view into the network. Protocols were mostly unencrypted then, too - so you saw everyone’s email, their passwords as they logged into Telnet or IMAP, etc. You could also spoof traffic - since you saw the packets sent by others, you could simply send responses back claiming to be the recipient. So long as your response arrived before the real one, yours would be accepted and the actual response discarded as out of sequence. It was the golden age of network-protocol hacking. Such easy access to passwords made other types of hacking easy, too - once you had the password to someone’s UNIX account or email box, there was a very good chance it would work on all their other accounts, too.
Then it all changed. Shared media has significant disadvantages as it scales - since everyone is dumping packets onto what essentially amounts to a single wire, collisions occur when two systems transmit simultaneously. Both then have to back off, slow down, and retransmit their garbled packets. The packets are tiny (Ethernet frames are normally restricted to 1500 bytes or less), but if you have 100 systems communicating at once, collisions can become quite frequent. Plus, even in the late 90’s people were not totally unaware of the security risks - the fact that any student could read all the network traffic of everyone else in their dorm was not considered desirable by universities, for instance. Thus, Ethernet was converted over to switched media. Switches, unlike hubs, do not treat all ports as equal. Instead, they remember which ports they have received traffic from an address on, and only forward traffic to an address to those ports. Traffic is only broadcast to all ports when a switch has no idea for which port it is intended, or when a packet is actually marked as a broadcast. Now, when you put your Ethernet card in promiscuous mode, all you hear is traffic meant for you - everything else has been blocked by the switch. Suddenly, packet sniffers went dead - there was nothing to see anymore. Ethernet became a lot more secure.
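To make the hub-versus-switch difference concrete, here is a minimal sketch in Python. The class names, the port numbering, and the short "aa"/"bb" addresses are all made up for illustration; a real switch also ages out table entries and handles many cases this toy ignores.

```python
class Hub:
    """Repeats every frame out of every port except the one it arrived on."""

    def __init__(self, num_ports):
        self.num_ports = num_ports

    def forward(self, in_port, src, dst):
        return [p for p in range(self.num_ports) if p != in_port]


class LearningSwitch(Hub):
    """Remembers which port each source address was last seen on."""

    BROADCAST = "ff:ff:ff:ff:ff:ff"

    def __init__(self, num_ports):
        super().__init__(num_ports)
        self.table = {}  # address -> port

    def forward(self, in_port, src, dst):
        self.table[src] = in_port  # learn where the sender lives
        if dst != self.BROADCAST and dst in self.table:
            return [self.table[dst]]  # known destination: one port only
        return super().forward(in_port, src, dst)  # unknown or broadcast: flood


switch = LearningSwitch(4)
first = switch.forward(0, "aa", "bb")   # "bb" not learned yet, so the frame is flooded
switch.forward(1, "bb", "aa")           # the reply teaches the switch that "bb" is on port 1
second = switch.forward(0, "aa", "bb")  # now only port 1 gets the frame
print(first, second)
```

A sniffer plugged into port 2 or 3 sees the first frame and then nothing more, which is exactly why the packet sniffers went quiet.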
But wireless changes things again. Wireless networks are shared media, and they are shared inherently, in a way that cannot be changed. Radio waves fly in all directions. There is no way for your laptop to transmit only to another laptop or an access point - all radio is broadcast. Thus, when you sit down in a coffee shop and turn on wireless, you begin broadcasting everything to everyone within range (about a mile, for attackers who have good antennas and high-power network cards.) The shared media nature can be mitigated somewhat via cryptography - if all the traffic to the access point is encrypted, it hardly matters if someone can eavesdrop since they can’t understand it anyway. But open access points are, by their nature, open - they’re either not encrypted at all, or they’re encrypted in such a way that everyone is using the same key. Once the hacker has the key (either by cracking it, which is not hard on most Wi-Fi networks, or by simply paying as a legitimate user of the wireless hotspot), they can read all the traffic just like in the hub-based glory days of old.
There are solid wireless encryption systems. A network based on WPA2 with a strong passcode is quite secure, about as good as a wired connection (keeping in mind that “as good as a wired connection” is not an absolute guarantee of safety, either.) Modern encryption systems like AES coupled with 802.1x certificate-based authentication can make a well-engineered corporate wireless LAN quite safe.
But hackers don’t love well-engineered corporate wireless LANs. They love the terrible ones in coffee shops and bookstores and your house. On these networks, they can listen to all traffic, they can spoof traffic, and they can even kick people off and hijack their connections, or edit their connections on the fly. The “airpwn” attack from a DefCon 2-4 years ago was particularly amusing; using two wireless cards, it would sniff everyone’s HTTP traffic on one connection, then on the other card spoof responses to all requests for images, substituting other images (such as the hacker group’s logo, or more unsavory fare like the infamous goatse.cx site; that is not a hyperlink on purpose, do not navigate to that URL as it is not safe for work or, indeed, for anywhere else.) The result was that one laptop at a security conference was able to dynamically edit the HTTP streams of everyone else there - hundreds of people. That’s the kind of power a hacker can have on a shared-media network. In addition, on these sorts of networks, it’s trivially easy to hijack sessions. This means that on any site that uses HTTPS for authentication only, but then HTTP for the actual service (a category that includes all of the Google apps like GMail, as well as all the Yahoo! and Windows Live services), a hacker gains full access to your account if they overhear any of your wireless traffic.
The only truly safe way to use a public wireless hotspot is to use it only to VPN to a network you trust. Anything else is dangerous.
2.) Wireless Networks Provide Plausible Deniability
The legal system is not terribly friendly to hackers. Even innocuous and non-destructive activity, when applied to networks you don’t own, is often illegal. Now, for the most part hackers don’t worry overmuch about getting caught - if you don’t cause more than $5,000 in damages, the FBI won’t get involved, and the average local police department is about as capable of investigating sorcery as computer crime. However, when a hacker does worry about legal prosecution, a public wireless network is the next best thing to Siberia as a place to commit a crime from.
When you do anything on the Internet, a host of servers are recording your activity based on your IP address. IP address, however, is not necessarily long-lived. Depending on how you access the Internet, your IP address might change every time you plug your computer in, or reboot, or move from building to building. Thus, investigators must be able to tie the IP address they know committed a crime to a specific, physical person.
With wireless, this is a problem. All the sites being attacked don’t see the IP address of the hacker - they see the IP address of the wireless access point. Thus, they have to subpoena the owner of the access point and demand to know who was using it. In the case of a well-designed corporate wireless LAN, they can check their logs to see which 802.1x certificate was using that IP at that time, and uniquely identify you. But in the case of a public hotspot, there probably aren’t any logs at all! They’re completely incapable of giving you up. And even should someone who was there say “I saw a shifty guy in the corner using a laptop!” to the police, that’s not going to be enough evidence. And if there are logs, they will tie your traffic to your MAC address, a unique code assigned to your network card at the factory.
Most people think a MAC address cannot be changed, so that it uniquely identifies your network card: if the police get hold of your network card, they’ve caught you. This is actually totally untrue. Many network cards will allow you to change the MAC address to whatever you want (in Windows, it’s on Connection Properties -> Configure -> Advanced -> Physical Address), though this is entirely up to the network driver. Many Windows drivers block this functionality, thinking that users don’t need it. On Linux, however, the network drivers have been written by geeks, who operate under the impression that users need everything. Thus, on Linux systems changing your MAC address is as simple as typing one command (“macchanger -m 00:11:22:33:44:55 eth0”), and you can even configure the network stack to give you a new, random MAC address every time you connect to a network.
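For the curious, a "random" MAC address is nothing more exotic than six random bytes with two bits adjusted. The sketch below only generates such an address; the function name is my own, and actually applying the address is up to the operating system and driver as described above.

```python
import secrets


def random_mac():
    """Return a random unicast, locally administered MAC address as a string."""
    octets = bytearray(secrets.token_bytes(6))
    # Set the "locally administered" bit and clear the "multicast" bit
    # in the first octet, so the result is a valid station address.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{b:02x}" for b in octets)


print(random_mac())
```

Setting the locally administered bit marks the address as one that was not assigned at the factory, which is the polite way to make one up.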
As a result, a trail that leads to a wireless hotspot is basically a dead end for investigators. They get nothing but a fake MAC address that could correspond to any computer within a 1-mile radius - the hacker might not have even been in the building. Hard to get “beyond a reasonable doubt” out of that.
And those are why hackers love wireless networking. It’s like the 80’s phone networks, where a hacker can be a ghost in the machine, undetectable, and with tremendous power. It’s a dangerous place.
You might wonder, if wireless networks are so anonymous, how hackers ever get caught. Actually, there are three main ways:
- They get stupid, and brag about what they did.
- They get stupid, and while performing their illegal activities they also do something that identifies them, like log into their email account.
- Investigators follow the money. We don’t catch you breaking into the bank, we see where you sent the money to. We don’t catch you stealing credit card numbers, we catch you using them.
Luckily for those of us in the business of investigating and preventing computer crime, wireless networks won’t save criminals from their own stupidity, and you can’t send cash through the airwaves.
SMB Reflection Made Way Too Easy
Windows file sharing operates via an old protocol called SMB (Server Message Block). In modern Windows operating systems, it operates over TCP/445, though older versions of Windows also made use of NetBIOS (UDP/137, UDP/138, and TCP/139). Due to the ubiquity of Windows file shares on corporate Intranets, in general these ports are open to basically everyone on the internal network, though they are blocked at edge firewalls. Even UNIX/Linux machines often use these ports, due to a Windows-file-sharing-compatibility package called Samba.
There have been many security vulnerabilities in NetBIOS in the past, and some in SMB, so these protocols are (rightly) considered moderately risky by network administrators. However, scarier than any of these patched vulnerabilities is the flaw in the design — SMB is subject to a sort of replay attack, called SMB Relay or SMB Reflection.
SMB at first appears safe from replay attacks. After all, it uses challenge-response authentication (normally; there is a protocol for SMB with cleartext, but basically no client or server will accept this protocol now), whose whole purpose is to prevent eavesdropping and replay. If you try to replay the same response to a server, it won’t work, because the challenge is different. There are three ways SMB allows challenge-response authentication - LANMAN, NTLM, and NTLMv2. In any case, the principle is the same - the client asks to authenticate, the server sends a challenge, the client encrypts the challenge with a password, and sends the encrypted result as a response.
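The general shape of challenge-response can be sketched in a few lines of Python. To be clear about the assumptions: this uses HMAC-SHA256 as a stand-in and is not the LANMAN or NTLM computation; the function names and the password are invented for the example.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Server side: a fresh random challenge for every login attempt."""
    return secrets.token_bytes(8)


def respond(password, challenge):
    """Client side: prove knowledge of the password without sending it."""
    key = hashlib.sha256(password.encode()).digest()
    return hmac.new(key, challenge, hashlib.sha256).digest()


def verify(password, challenge, response):
    """Server side: recompute the expected response and compare."""
    return hmac.compare_digest(respond(password, challenge), response)


password = "correct horse"
c1 = make_challenge()
r1 = respond(password, c1)
fresh_ok = verify(password, c1, r1)      # the response to the current challenge is accepted

c2 = make_challenge()
replay_ok = verify(password, c2, r1)     # the same response replayed against a new challenge
print(fresh_ok, replay_ok)
```

An eavesdropper who records r1 gains nothing for a later login, because the server will never issue c1 again.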
So how do you perform a replay attack on SMB? Via SMB Relay (to attack another host) or SMB Reflection (to attack the client.) It goes like this:
1. Client (C) connects to you (a malicious server, M) and asks for a challenge.
2. M connects to C in a separate session and asks for a challenge. It still has the connection from (1) on hold, having not responded yet.
3. C receives the request from M, and responds with a challenge (challenge_C).
4. M takes challenge_C, modifies it to appear to be coming from M (challenge_M), and responds to the connection from (1) with it.
5. C finally receives the challenge (challenge_M) that it asked for. It uses its credentials to respond to it (response_M).
6. M receives response_M, which is correct for challenge_M, and so grants C the access it requested. Of course, this response also matches up with the challenge it’s holding onto (challenge_C). It forwards it right back to C as response_C.
7. C receives response_C, which is correct for challenge_C, and so grants M the access it requested.
C never realizes that it just received and responded to the challenge that it, itself, sent out moments before. By requesting access to M, C has unwittingly given M all it needs to authenticate against C at the same time. The attack can’t be carried out later - the reflection has to happen at the moment C is trying to connect.
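The exchange above can be simulated in a few lines. This is a toy model under loud assumptions: HMAC-SHA256 stands in for the real NTLM computation, the Host class and its method names are hypothetical, and the "network" is just function calls. The point it illustrates is that the attacker never touches the password.

```python
import hashlib
import hmac
import secrets


class Host:
    """A machine that acts as both an SMB-style client and server."""

    def __init__(self, password):
        self._key = hashlib.sha256(password.encode()).digest()
        self._pending = None

    def issue_challenge(self):
        """Server role: hand out a fresh challenge and remember it."""
        self._pending = secrets.token_bytes(8)
        return self._pending

    def check_response(self, response):
        """Server role: accept the response if it matches the pending challenge."""
        expected = hmac.new(self._key, self._pending, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)

    def answer(self, challenge):
        """Client role: answer whatever challenge the remote server sent."""
        return hmac.new(self._key, challenge, hashlib.sha256).digest()


victim = Host("s3cret")  # this is C; the attacker M only passes messages along

challenge_c = victim.issue_challenge()    # M opens its own session and asks C for a challenge
response = victim.answer(challenge_c)     # M presents that same challenge to C as if it were M's own
granted = victim.check_response(response) # M reflects C's answer back into the session it opened
print(granted)
```

The only fix available inside this toy would be for the host to refuse to answer a challenge it has itself just issued, which is the kind of change the text suggests Microsoft could make.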
Note that this is a design flaw — there’s no “bug” to patch here (I suppose Microsoft could modify SMB to ensure that it’s not responding to its own challenges; it would be possible but not trivial), this is the behavior of SMB since time immemorial. No buffer overrun is exploited; SMB is acting precisely as it’s intended to. This issue has been known since 2001.
Host firewalls like Windows Firewall and ZoneAlarm help some — at least, they keep SMB Reflection attacks out. However, they don’t help if the attack is coming from a zone you trust — i.e. if you’re sharing files with your corporate Intranet, someone on the corporate Intranet can attack you this way if you try to authenticate against them. And it also doesn’t stop SMB Relay – if I connect to M with my firewall up, he can’t use reflection to attack me, but he can relay to yet another host and impersonate me, passing me its challenge and relaying my response to the remote host.
It actually gets a bit worse than this, because the NTLM package used to authenticate SMB is the same one used to authenticate you against Intranet websites in Internet Explorer. The attacker might not need to get you to try to make a file-share connection to him; a web connection can be sufficient.
This attack sounds relatively tricky to carry out, and it is… by hand. However, the ever-popular Metasploit Framework contains an SMB Relay module (which also works for SMB Reflection) that makes it quite quick and easy (you need the live-tree version of Metasploit out of CVS, not the release build, as the module is relatively new and was just demonstrated at Defcon 15 this August.) You do have to disable SMB on your computer to use it, though (which is simple on Linux, but on Windows involves unbinding the Server service.)
This makes SMB reflection trivial. You load the module, tell it your IP address, load your choice of many payloads (such as having a shell started and passed to you, or simply having an SMB connection opened that you can do with as you will), and then wait for someone to connect to you. You can either specify a target server if you want SMB Relay, or leave that unspecified for an SMB Reflection back to the person connecting. The only thing Metasploit won’t do for you is make people connect to you.
So the moral of the story is apparently to be careful who you connect to, especially on local networks where your file-sharing ports are open. That’s a pretty good moral in general… but it’s really not enough. People who run Windows Firewall often have a blanket exception for File & Printer Sharing, which opens port TCP/445… if you’re not behind a home firewall or router, this sort of attack could be carried out against you from any site on the Internet. And tomorrow, I’ll be talking about why nothing of any kind is ever safe on a public wireless hotspot.
Backdoored PRNGs from the NSA
Bruce Schneier has an article at wired.com about the new government-sponsored official standards for random number generators in NIST Special Publication 800-90. Apparently, it’s possible that one of them contains a back-door for the NSA; depending on how the constants in the algorithm were chosen, the NSA may have another set of constants that let them predict the “random” numbers generated by the algorithm.
To people not very familiar with cryptography, it may seem odd that random number generators are very significant. However, all modern key-based cryptography is based on having a source of entropy (true randomness) — somewhere it can get a key that is unlikely to be guessed or otherwise determined. When we talk about “40-bit” or “128-bit” encryption, we’re really talking about the key length, which provides an upper bound on available entropy. Ideally, cryptography would be based on true random numbers, for which every bit of number is a bit of entropy. However, true random numbers have to be generated physically — we have devices that do it based on radioactive decay, but you can also get it by asking a human to move a mouse around or bang on a keyboard, as PGP does when generating keys. Thus, for most applications, we settle for pseudo-random number generators — programs which generate a stream of numbers that are unrelated to each other, have a uniform distribution, and are for most purposes entirely random.
However, a pseudo-random number generator usually needs a seed - a starting point for the generator. If you use the same seed, you’ll get the same stream of “random” numbers. Thus, the seeds chosen are usually very large numbers. Cryptographic pseudo-random number generators are considerably more processor-intensive than the regular “random” number generators used in non-security applications, as they’re usually based on multiple iterations of a hashing algorithm.
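The seed behavior is easy to see for yourself. A small sketch using Python's standard library (the seed value is arbitrary): the random module is an ordinary, non-cryptographic generator, while the secrets module draws from the operating system's cryptographic source.

```python
import random
import secrets

# Two ordinary generators started from the same seed...
a = random.Random(12345)
b = random.Random(12345)
stream_a = [a.randrange(100) for _ in range(5)]
stream_b = [b.randrange(100) for _ in range(5)]

# ...produce exactly the same "random" stream.
print(stream_a == stream_b)

# For keys, ask the operating system's cryptographic generator instead.
key = secrets.token_bytes(16)  # 16 bytes = 128 bits
print(len(key) * 8)
```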
What happens if your pseudo-random number generator isn’t very good? Well, in the early 2000s, an online casino in the Caribbean (I wish I could remember the name of it to provide a link to the news coverage) lost several million dollars. Apparently, a player realized that to shuffle the decks of cards, they used a standard, non-cryptographic random number generator - the sort of thing that’s built into Windows and Linux and such. A shuffled deck of cards is very random - there are 8×10^67 ways to shuffle a deck, which is about 225 bits of entropy. However, the random number generator used only a 32-bit seed! There are only 4×10^9 32-bit numbers. This is still a lot, but with modern computer aids, it’s a manageable number. So what did this player do? He had his computer generate shuffled decks for each of the four billion 32-bit seeds. He then wrote a program that let him enter specific cards that were drawn (e.g. “fourth card was a queen of spades, fifth card was a 9 of diamonds…”) based on the draws he could see (such as his own cards in poker, or the up cards in blackjack) and it would pare down the four billion decks to the ones that could have potentially produced those draws.
It turns out that when you know that almost all decks are invalid (not able to be generated by the random number generator in use), there aren’t many decks that can produce a given set of cards. Thus, within 3-5 known cards, his program would spit out the entire deck, and that player could now predict the future. He would know exactly what cards would be coming out, and what ones already had. Thus, poker and blackjack were trivial, and he won a ton of money.
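Here is a scaled-down sketch of that attack. The assumptions are mine: the seed space is shrunk to 16 bits so the search finishes in a few seconds, Python's own generator and shuffle stand in for whatever the casino really used, and cards are just the numbers 0 to 51.

```python
import math
import random

SEED_BITS = 16  # assumption: reduced from 32 bits so the demo runs quickly


def shuffled_deck(seed):
    """Shuffle a 52-card deck the way a seed-driven dealer would."""
    deck = list(range(52))
    random.Random(seed).shuffle(deck)
    return deck


def candidate_seeds(observed):
    """Return every seed whose deck agrees with the cards seen so far.

    `observed` maps a position in the deck to the card seen there.
    """
    matches = []
    for seed in range(2 ** SEED_BITS):
        deck = shuffled_deck(seed)
        if all(deck[pos] == card for pos, card in observed.items()):
            matches.append(seed)
    return matches


# Entropy of a truly random shuffle, in bits: log2(52!)
deck_entropy_bits = math.log2(math.factorial(52))

secret_seed = 40321                       # known only to the "casino"
secret_deck = shuffled_deck(secret_seed)
observed = {pos: secret_deck[pos] for pos in range(5)}  # the player sees five cards

matches = candidate_seeds(observed)
print(round(deck_entropy_bits, 1), len(matches), secret_seed in matches)
```

Every surviving seed yields a complete candidate deck, so once the list is down to one entry the player knows every card still to come.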
Many things in cryptography operate similarly. If you can predict the random numbers being used, you drastically simplify cracking the code. It is generally still not what a layman would call simple - but it brings a message from “even the National Security Agency with its thousand acres of supercomputers couldn’t crack it in our lifetime” to “it’s still out of reach for you and me, but, well, the NSA could probably crack it in a day or two.” Well-funded, skilled adversaries can use any small defect in a cryptosystem that lowers entropy to shorten the time to break codes.
And that’s why the NSA would be interested in putting a back-door in a pseudo-random number generator. Did they actually do this? In my opinion, the evidence Schneier presents is pretty convincing, and while Schneier is today best known as a popularizer of security rather than a technical expert, one would do well to remember that he also wrote Applied Cryptography, a very technical book that sits on the bookshelf of basically every security developer, including mine. The NIST publication presents four random number generators, based on different algorithms, and then recommends the use of one, Dual_EC_DRBG, that is about 1,000 times slower than the other three. Unlike the others (Hash_DRBG, HMAC_DRBG, and CTR_DRBG), however, with this particular algorithm it would be possible to craft a set of input constants that are defective in a specific way — such that someone armed with a corresponding set of constants could predict the output of the generator.
Now, we don’t have proof that the NSA actually did this. It’s possible that the input constants in the NIST publication are truly random, chosen arbitrarily, and the NSA does not have a matching key that will break the generator. But the NSA is pretty smart, and almost certainly knew about the flaw in the algorithm — in general, people in the cryptographic industry assume that the NSA is a few years ahead of them and just hasn’t said so. The old adage about not attributing to malice what simple incompetence will explain usually applies to government pretty well, but not to the NSA.
Really, this is a rather ingenious way to backdoor a crypto algorithm. The normal method — just make a cryptosystem with a mathematical flaw or known backdoor key — has a serious issue: if you can figure out the mathematical flaw, so can someone else. The NSA wants to be able to listen to our phone calls — it doesn’t also want every other country to be able to do so. To backdoor a cryptosystem requires making it so you can read messages without also weakening it for everyone else. This method does exactly that — without the specific numbers that match the provided input constants, the system isn’t flawed at all. The NSA has the key (if, indeed, they do), and no one else does. Putting it in the random number generator rather than the cryptosystem itself is a good way to draw attention away from it, too.
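The idea of "public constants with a hidden relationship" can be shown with a toy. This is emphatically not Dual_EC_DRBG: modular exponentiation stands in for elliptic-curve point multiplication, no output bits are truncated, and every number and name below is made up for illustration.

```python
# Toy analogue only, NOT Dual_EC_DRBG. All parameters are hypothetical.
p = 2**61 - 1        # a prime modulus
Q = 5                # first public constant
d = 123456789        # the designer's secret trapdoor, never published
P = pow(Q, d, p)     # second public constant, secretly equal to Q^d mod p


class ToyGenerator:
    def __init__(self, seed):
        self.state = seed

    def next_output(self):
        output = pow(Q, self.state, p)      # the number the user sees
        self.state = pow(P, self.state, p)  # the hidden internal state advances
        return output


gen = ToyGenerator(seed=987654321)
seen = gen.next_output()

# Whoever knows d can turn one observed output into the next hidden state:
# seen^d = Q^(s*d) = (Q^d)^s = P^s (mod p)
recovered_state = pow(seen, d, p)
predicted = pow(Q, recovered_state, p)

actual = gen.next_output()
print(predicted == actual)
```

Without d, recovering the state from an output means solving a discrete logarithm; with d, it is one exponentiation. To everyone else, P and Q look like two arbitrary constants.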
And if the NSA didn’t choose the constants to have a backdoor, why recommend an elliptic-curve based generator that’s three orders of magnitude slower than several other generators, all believed to be just as secure, that are based on much more easily understood mathematics like hashing? It just doesn’t seem to make much sense.
The Trouble with Copy Protection
SecurityFocus reports that a patch has been issued for a vulnerability in the Macrovision SafeDisc driver. Apparently, due to a flaw in how the driver handles configuration parameters (which probably means a garden-variety buffer overflow), it’s possible for a local user to use the driver to elevate privilege all the way to the kernel.
This sort of security flaw is a major problem with copy-protection drivers like SafeDisc; this is also the same basic issue as caused all the controversy over the “Sony Rootkit” of 2005. Fundamentally, the purpose of any copy-protection or DRM system is to protect data from the user. Thus, it is attempting to create a security boundary where none exists — to prevent the user, possibly a user with administrative privileges, from performing certain manipulations of data entirely under his control while allowing other manipulations (e.g. watching a film, playing a game, listening to a CD) to continue unhindered. The problem is that it’s just data — what copy-protection and DRM vendors are doing is the equivalent to my trying to write a book, with normal ink on normal paper, that you can read but not copy, even by hand. It can’t be done; there is no inherent difference between reading-to-read and reading-to-copy.
So instead, DRM and copy-protection vendors, like Macrovision, create a system that runs at a level of privilege above what the user can normally achieve — on a Windows machine, at least NT AUTHORITY\SYSTEM privileges, but often kernel mode drivers. This driver then sits, Big Brother-like, above the user, watching his activities, and preventing “illicit” operations. Meanwhile, while being immune to manipulations by the user, this supervisor must take orders from data — that is, Macrovision SafeDisc must be told by a game that it should check for copy protection and stop the game if it fails, while the Sony “rootkit” must be told by a CD that it should allow playing but stop copying.
Thus, the user’s computer is put into a rather odd state — the user doesn’t control it, a piece of supervisory code does. And if that piece of code is flawed (as it was in both the Macrovision and Sony cases), attackers can write malware that issues instructions to that supervisory code, imitating “protected” media.
If you’re a non-Administrative user (such as almost all Vista or UNIX/Linux users, but only a few Windows XP-and-before users), you are protected from running code that does certain potentially-harmful things to your system. You can’t write to the Windows directory, or modify installed programs, or register a driver. However, these copy-protection drivers supply an end-run around this protection — you can supply data to the copy-protection driver (after all, you have to be able to tell it to check up on you), which means that any malware you run can also supply data to the copy-protection driver. And since it runs with greater privilege than you, it can do all the harmful things you supposedly can’t. Copy-protection drivers, to make content more secure for the copyright-holder, make your computer less secure for you.
From a theory perspective, the problem here is that there is no security boundary (a line which code and data cannot cross without being subjected to a security policy), on a general-purpose computer, between an administrative user and all the data on the system. This is what the copyright-holders want, but it’s not really possible for them to get it. All of these systems can be circumvented by simply placing a new supervisor above the one added by the copyright holder (e.g. run the system in a virtual machine, or with a kernel debugger attached, or in the most extreme scenario, just walk through the code execution by hand, choosing to ignore instructions you don’t like until you get a fully unprotected data stream.) Thus, they fake it, in ways that make the system less secure, simply to make it more difficult for a nontechnical user to get the unencrypted stream. The result is a simple arms race between copyright-holders and hackers, which has a side effect of harming innocent users by making them increasingly vulnerable to malware.
Social Engineering For Hire
There’s an article in PC Magazine about a company called TraceSecurity that performs audits of physical security via social engineering. Essentially, companies hire them to steal data, and they do so by simply talking their way into the facility and getting unrestricted physical access to the servers.
If a skilled attacker has unrestricted physical access to a machine, they can acquire all the data on the machine. Database encryption can help quite a bit — unless they also get the system that contains the key to your database. Since in many cases the database server sits in the server room right next to the middle-tier server that encrypts it, this is not necessarily much of a protection against true physical access.
To most people, it seems like it would be difficult to simply talk your way into a private facility and get left alone with the mission-critical servers, but really, I’m not surprised that TraceSecurity reports no difficulty getting abandoned anywhere short of a bank vault. Daily life is based on trust - we assume that people are what they say they are and appear to be, because life is impossible otherwise. In addition, we encounter legitimate people so much more often than criminals that in a sense a criminal is a surprise every time.
Anyone who’s worked in a corporate office with badge-based security knows how easy tailgating is. Wait for someone to swipe their badge and walk in right behind him — your chances of being challenged are very low, since people do it all the time and even people who originally challenged tailgaters have usually gotten tired of it within a few months (since it’s basically always just someone too lazy to get their badge out.) What TraceSecurity does is pretty similar, with a dose of social engineering — just dress up as someone who belongs there, pretend to be someone who belongs there, and walk right in.
They tend to prefer pest-control services or fire marshals for their disguises (they have to jump through a few legal hoops to dress up as a federal agent without committing a crime), though other penetration testers I’ve encountered favor telecom vendors. If a company’s ISP is Verizon, they will think little of a Verizon technician showing up, and probably happily let him into a wiring closet or server room.
The bigger difficulty than getting in is getting left alone. This is one area where simple surreptitious entry, like tailgating, is better than dressing as someone like a pest inspector or fire marshal who, in their normal jobs, you would not likely leave alone anyway. Still, people at corporate offices are busy. If one is following you around, dawdle long enough in non-sensitive areas and I’m not terribly surprised they get tired of wasting their day escorting you. By the time you get to the server room, they swipe you in and get back to work.
This sort of penetration test makes the news, though, because it’s interesting and unusual. Even TraceSecurity, which the article makes sound like a specialist in this sort of assessment, offers a wide array of other security services. A career exclusively performing on-site physical/social penetration tests may be limited to characters in Sneakers. The main reason such tests are unusual, though, is the perception of risk.
People see the security measures around physical intrusion. The servers are in a locked room, in their locked building, surrounded by people who know each other, so getting in must be difficult. On the other hand, most people have no idea how to hack into a server from the Internet, and thus have no way to gauge the risk other than the availability heuristic — and we hear about online break-ins and data leaks in the news all the time, so it must be easy. This makes people inclined to overestimate the risk from network attacks (though, honestly, the risk is pretty high) as compared to from physical intrusion.
This said, another thing preventing physical attacks on servers is not the difficulty of the attack, but the simple dearth of people willing to carry it out. Breaking into a building to steal something “feels” like crime, while just typing code into your keyboard is probably more easily rationalized — it’s the same reason why people who would never shoplift a CD happily copy music, despite the acts being legally similar. Of course, there’s probably also a higher likelihood of getting caught in the physical intrusion — people have seen you. This is a case where prevention is very hard but detection is less difficult. It takes a special sort of person to be caught red-handed trespassing in a server room and still keep their cool well enough to get out of the situation without arrest. Admittedly, this lowers the actual risk of attack — it reduces the threat, despite the presence of the vulnerability.
The usual solution posited to this sort of attack is user education — just teach people to be vigilant, ask to see badges of people they don’t recognize, verify the identity of service providers, call the fire department and ask if the fire marshal should really be here, etc. However, in truth, this just won’t work. TraceSecurity couldn’t get the bank manager to leave them alone in the vault — because people standing in a vault think about security, and know that a normal person might be tempted to steal when surrounded by cash. But in a server room, where the potential theft may actually be much greater, it’s not what’s on their minds, and simple user education isn’t likely to change that. Human beings trust each other, and criminals learn how to cultivate and play on that trust — a security awareness program isn’t going to change human nature. What is necessary here is to worry less about prevention and more about detection and response.
When data is extremely valuable — say, personally identifiable information with credit card numbers, in bulk (20,000 records or more) — it shouldn’t be stored in a corporate office server room anyway. You wouldn’t store $200,000 in cash in a closet in your office building, so don’t store something of equivalent value and easier to carry there, either. Colocate the server in a secure datacenter, where it’s surrounded by people who are aware of security and under guard and camera.
However, for less-valuable data, instead of thinking about how to keep people out — a task that may be impossible — think about how to know they’re there and recover from the breach. Methods like camera surveillance deter crime by making intruders believe themselves (rightly) more likely to be caught. Use monitoring tools on computers to be able to determine if someone has gained physical access to them (an action which tends to result in the server going down for a short time) and investigate such alerts immediately. Even procedural efforts like requiring people to sign in and out of server rooms can be helpful — if the sysadmin has to write down that he admitted three people to the server room and left them there, he’s more inclined to have security come to mind, and more likely to speak up later when you realize a theft has occurred. In addition, do use encryption on valuable data — this ensures that if an intruder does walk off with the database file (or the hard drive it’s on), they’re less likely to be able to make use of it. It may not be enough in the case of someone who breaks into your building and has all night to figure out where the key is, but it may be enough for the person who has 5 minutes to copy everything they can to a thumb drive before you come back with their cup of coffee.
Secure P2P for Pirates
According to a recent Reuters article, the unrepentant pirates of Sweden’s The Pirate Bay are working on developing their own peer-to-peer networking system. It turns out that this is a relatively fascinating security problem, even though in this case it’s the criminals needing the security, vs. the law-abiding companies trying to break it — a bit of a reversal, to say the least.
Currently, the Pirate Bay is probably the world’s most popular BitTorrent tracker for downloading pirated media, receiving 1.5 million unique visitors a day. With a quick trip to the Pirate Bay, you can quickly acquire any piece of music, any episode of any recent television show (usually within a couple hours of its first airing), any movie (generally while it’s still in theaters), etc. Membership is required to enforce ratios (i.e. ensure you upload as well as download), but is free and open to all. However, they’re unsatisfied with the BitTorrent protocol for a variety of reasons — chiefly the legal risk that their “customers” take. Downloading from the Pirate Bay via BitTorrent runs two risks — first, that a copyright holder will grab your IP address and send a cease-and-desist order to your ISP, or worse, a subpoena which under the DMCA in the United States could carry a fine of tens of thousands of dollars, and second, that your ISP itself will cancel your subscription for using too much upstream bandwidth. Comcast, in particular, is notorious for doing this without being willing to admit how much “too much” is, even as they cut you off for using it.
BitTorrent is an ingenious protocol. The idea is to prevent massive load on single servers for downloading popular files by ensuring that everyone who downloads the file also shares it with others, even as the download occurs. You don’t need the entire file to start sharing it - you register with a BitTorrent “tracker” (like The Pirate Bay) as working on a file, and all the other peers who either have or want that file are notified of your existence. Peers then communicate with each other, swapping whatever parts of the file they have for the parts they don’t. Thus, everyone’s upload bandwidth is being used at the same time as the download, unlike some previous P2P protocols. This is used for many legal purposes - for one, Blizzard’s World of Warcraft uses it to update the game, to get around the obvious difficulty of having about 4 million of its 6 million subscribers all trying to download a 450-meg content update on the same day. Thanks to BitTorrent, these updates go smoothly every time.
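The piece-swapping idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the real BitTorrent wire protocol: the peer names, the `swap_round` and `rounds_until_complete` functions, and the rule that each peer fetches one piece per round are all invented for illustration.

```python
def swap_round(peers):
    """Run one exchange round over a dict of peer name -> set of piece indexes.

    Hypothetical rule: each peer fetches at most one missing piece per round,
    taking the lowest-numbered piece that some other peer held when the
    round started.
    """
    snapshot = {name: set(pieces) for name, pieces in peers.items()}
    for name, have in snapshot.items():
        # Everything the other peers could offer at the start of the round.
        available = set().union(*(p for n, p in snapshot.items() if n != name))
        wanted = sorted(available - have)
        if wanted:
            peers[name].add(wanted[0])
    return peers


def rounds_until_complete(peers, total_pieces):
    """Count rounds until every peer holds every piece of the file."""
    full = set(range(total_pieces))
    rounds = 0
    while any(pieces != full for pieces in peers.values()):
        before = {name: set(pieces) for name, pieces in peers.items()}
        swap_round(peers)
        if peers == before:
            # Some piece is held by nobody, so the swarm can never finish.
            raise ValueError("swarm is missing at least one piece")
        rounds += 1
    return rounds


# Neither peer has the whole file, yet both finish by trading halves.
swarm = {"peer_a": {0, 1}, "peer_b": {2, 3}}
print(rounds_until_complete(swarm, total_pieces=4))
```

In this model both peers upload and download in every round, which is the property the paragraph above describes: nobody has to hold the complete file before they start contributing.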
The problem, however, comes when the files being shared are illegal. In the United States, uploading copyrighted media can result in rather substantial fines and statutory damages, and the RIAA and MPAA are actively suing people by the thousand to get them charged. People want to download copyrighted media, so sites like the Pirate Bay exist. But RIAA and MPAA agents can connect to these trackers, too — they’re open to all — and the tracker shares everyone’s IP address with them. Since with BitTorrent, downloading and uploading go hand in hand, there’s no way to download copyrighted material without not only breaking the law but also advertising your IP to anyone who wants it. There are blacklists of known RIAA/MPAA peers that will protect a pirate from the most ham-fisted detection, but it would be trivial for the copyright holders to evade this sort of blocking. The Pirate Bay itself is largely immune to prosecution — they are located in Sweden, where copyright law subjects them to at worst a $300 fine every time they’re arrested (which has happened more than once.) For the most part, legal threats just amuse them. However, they’re concerned about their downloaders — as without people sharing files, they cannot exist.
In addition to the legal issues, there is the issue with ISPs. “Unlimited” low-cost home broadband survives because people generally use only the tiniest fraction of their upstream bandwidth. Comcast allocates me, and everyone else in my area, 384 kbit/sec. If I fully used this bandwidth for an entire month, I’d have uploaded 118 gigabytes. This is actually quite a lot - by way of comparison, playing World of Warcraft 24/7 for an entire month would use only 1.2 gigabytes, or 1% as much. This is fine by Comcast, because most of their users are only surfing the web, using only a few hundred kilobytes per month. If everyone used their entire allotment of 118 gigabytes, Comcast would have to raise rates tremendously - from the current $50 or so per month to probably 5 times as much (or more.) Compare business Internet rates (which assume you are hosting servers, and thus upload a lot) with residential ones (which assume you almost always download and upload very little) to see the difference. Instead, the many light users subsidize the few heavy users. BitTorrent, in which everyone helps take load off servers by uploading everything they download, often many times over, threatens this model - if everyone uploads, Internet rates will have to go way up.
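The 118-gigabyte figure can be checked with a short calculation. The sketch below assumes a 30-day month and binary units (1 kbit = 1024 bits, 1 gigabyte = 2^30 bytes); the function name `monthly_upload_gib` is made up for this example. With decimal units the same link comes out closer to 124 gigabytes.

```python
def monthly_upload_gib(kbit_per_sec, days=30):
    """Gigabytes (2**30 bytes) uploaded by a link saturated for `days` days."""
    bytes_per_sec = kbit_per_sec * 1024 / 8   # assumes 1 kbit = 1024 bits
    seconds = days * 24 * 60 * 60
    return bytes_per_sec * seconds / 2**30


total = monthly_upload_gib(384)
print(f"saturated 384 kbit/sec uplink: {total:.1f} GB per month")
print(f"one percent of that:           {total * 0.01:.2f} GB per month")
```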
Thus, ISPs often try to stop BitTorrent and other peer-to-peer systems. They use copyright as an excuse, but really, they don’t care about copyright - they care about cost. Your downloading costs very little. Your uploading to other customers on the same ISP costs very little. Your uploading to the Internet costs them quite a lot by comparison. The most primitive way they’ve tried this is simple port-blocking - they ban connections to the port TCP/6881 (BitTorrent’s default) on all their customers’ PCs. This doesn’t work very well - for one, it’s obvious (BitTorrent simply fails to function), and for another, BitTorrent doesn’t need to use any port in particular. Due to the tracker, other peers can find you no matter what port you choose, so simply changing the default in your BitTorrent client gets around this. Slightly less primitive is “traffic shaping” - the ISP slows traffic to the default port, or it inspects all traffic for BitTorrent headers and slows any packets showing them. (The latter approach is much more expensive for the ISP, since it requires a deep inspection firewall on all traffic.) Once again, changing port is easy. In addition, some BitTorrent clients have added a header encryption feature to evade traffic shaping, at the cost of limiting which peers are usable (specifically, to only other peers that support the header encryption). Comcast has recently been using the Sandvine intelligent traffic management system, which has caused some controversy since it actually impersonates the user and sends forged traffic on their behalf, in a further attempt to limit BitTorrent and other P2P traffic.
The above problems are inherent to BitTorrent, and at first, they seem inherent to all peer-to-peer systems. However, the buccaneers of the Pirate Bay have come up with a rather ambitious plan to improve on BitTorrent, developing their own protocol to better suit their needs. They’re still working on the specification (there’s a wiki up for suggestions), but I find the security and privacy issues they need to overcome interesting. At first glance, it seems the problems they must solve are the following:
- How can people upload pirated files without their IP addresses being detected by groups like the MPAA and RIAA?
- How can people hide the use of a file-sharing application so their ISP does not detect it and cut them off?
But that’s actually rather short-sighted, and the suggestions on the wiki seem to indicate that they’ve realized that, too. Creating a new peer-to-peer protocol to replace BitTorrent for pirates requires not looking at the current attacks, but rather at the threats themselves. The problem they really want to solve is simply to defend against these two threats:
- Legal prosecution for uploading pirated files
- ISP retribution for uploading large amounts of data
This is rather different! What they want to avoid is not detection per se, but rather the current consequences of that detection. In addition, they seek to address several technical/functional shortcomings of the BitTorrent protocol while they’re at it (such as that the tracker software does not scale to their traffic volume, and that upload bandwidth use in BitTorrent is suboptimal — many peers are not uploading anything.)
Right now, ISPs face no legal liability for transferring all this pirated media, since they are only content-indifferent carriers. Thus, a system that allowed users to also be content-indifferent carriers (i.e. sharing data they did not choose to download as well as the files they acquire on purpose) might provide some legal protection. The problem is that right now, users are from a legal standpoint sharing media they have, not simply transmitting media. Thus, a system of “reflector nodes”, where the aforementioned suboptimal bandwidth use instead has the empty bandwidth filled by data relayed from other peers, might work. The ideal from an anonymity perspective would be onion routing, as performed by the Tor Project. Unfortunately, this causes a serious growth in bandwidth requirements for all peers - basically defeating the purpose of BitTorrent. Some balance must be found between true anonymity, as can be provided by an encrypted onion-routing network like Tor, and simple obfuscation, or even juggling around what is transmitted to be able to stick to the letter of the law while violating its spirit. No one would believe that pirates don’t mean to transmit pirated software and that the relay network just makes it look that way, but it doesn’t matter whether anyone believes it so long as intent can’t be proven beyond a reasonable doubt in a court of law.
Avoiding ISP retribution is a bit harder. You can encrypt and use random ports, thus making detection impossible. However, this causes a problem — if everyone does this, and everyone uses P2P, then everyone’s Internet rates go up! This is hardly the desired outcome. An ISP administrator has contributed some novel suggestions regarding changing the protocol to help ISPs save costs. If the peer-to-peer system would deliberately prioritize other peers on the same ISP (ideally using WHOIS/ARIN data, though even simple CIDR subnets would help) for uploads, it could drastically reduce the ISP’s costs. Napster provides a good example — during their heyday, when Napster pirated transfers were killing college networks, they worked with universities to institute just this type of solution. The Napster client would look for other users at the same university to share with, only going to the Internet when this failed. This type of solution — not fighting the method by which ISPs hurt P2P but rather fighting its motivation — is bound to work better. It’s a good example of thinking about the threat, not about the particular vulnerability. In addition, it’s probably the only way to fight things like Sandvine (which, due to the way it works, can’t be stopped by a BitTorrent client unless it went to full encryption with all the negative effects that has — lightweight ways to evade Sandvine require patching the TCP/IP stack and altering RFC-mandated behavior, which is doable by people willing to hack their OS but not something you can just bundle into your P2P software.)
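The same-ISP preference is easy to sketch with CIDR subnets. The Python below is only an illustration of the idea, not anything from the Pirate Bay wiki: the function name `prioritize_local_peers` is hypothetical, the addresses come from ranges reserved for documentation, and a real client would need some way to learn its ISP’s address blocks (for instance from WHOIS/ARIN data, as suggested above).

```python
import ipaddress


def prioritize_local_peers(peer_ips, isp_cidrs):
    """Return peer_ips reordered so peers inside the ISP's subnets come first.

    peer_ips  -- list of IP address strings, e.g. as handed out by a tracker
    isp_cidrs -- list of CIDR strings assumed to belong to the user's own ISP
    """
    networks = [ipaddress.ip_network(cidr) for cidr in isp_cidrs]

    def is_local(ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in networks)

    # sorted() is stable and False sorts before True, so local peers move
    # to the front while each group keeps its original order.
    return sorted(peer_ips, key=lambda ip: not is_local(ip))


peers = ["203.0.113.5", "198.51.100.7", "192.0.2.9", "198.51.100.200"]
print(prioritize_local_peers(peers, ["198.51.100.0/24"]))
```

A client using this ordering would try the same-ISP peers first and go out to the wider Internet only when they cannot supply the data, which is the behavior described for the Napster university arrangement.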
Another issue that the Pirate Bay has is with fake files. Sometimes, a user (either an RIAA/MPAA shill or just someone who likes being obnoxious) will upload a file of the approximate right size with a filename matching something new and popular (like a just-released movie or album) that contains no or bad data. With nothing but the filename to go on, users download the fakes, causing the seed count to go up and making the fake appear even more “realistic” on the tracker — and hundreds of gigabytes of bandwidth are wasted. Currently, the only thing to be done about this is to look at the uploader and ensure he is someone trusted, but identity is impossible to verify. Some sort of digital signature/PKI system would be very helpful here.
Overall, it will be very interesting to see what they come up with. Like all open-source projects, it may or may not actually get off the ground, and pirates are of course not well-known for their altruistic contributions. However, it’s not likely the BitTorrent creators (who don’t get any money from pirates) will work on these problems, so it falls to people like the Pirate Bay to try. Even if you don’t want pirated media, the resultant system could be useful for a host of purposes — the same technologies being used for fighting piracy and cutting ISP bills in the United States are used for hunting down dissidents and limiting free access to information in totalitarian nations. In addition, a sufficiently large peering system with deep storage and forced reflectors (i.e. people sharing data they did not specifically choose to download or share) could result in a sort of distributed information well in which any human knowledge could be stored for easy access and rendered almost indestructible. Criminals have been putting legitimate technologies to underhanded uses for centuries — an illegitimate technology can be put to beneficial uses as well.
The New York Times reports that people will be able to sign up for “do-not-track” lists to prevent online advertisers from monitoring their activities. It is not clear from the article if they’re expecting a government solution, along the lines of the National Do Not Call Registry for telemarketers, or merely solutions from ISPs and advertisers themselves.
Unfortunately, there is a slight problem with either solution: it’s pretty much impossible.
First, a bit about how ad networks work. Whenever your browser loads a page with a banner or text ad on it, the page contains a link to the ad network’s web server telling it to load the ad. As it does with any site, your browser first checks to see if it has a cookie recorded for that site. If it’s the first time you’ve ever visited that ad network, then it does not; if you have visited before, then there is a unique ID number for you in the cookie. The browser then sends a request to the ad network, along with a cookie (if any) and a referrer header (saying what page the ad was loaded from.)
The ad network site then looks up the ID in the cookie. This ID is linked with a list of all the referrer headers it’s ever received from you — this is the “tracking” component. It adds the new referrer header to the list, and then uses the list to try to puzzle out what sort of things you like and pick the ad it thinks you’re most likely to click on. It then returns that ad. If no cookie was received from you, it also creates an ID for you and sends that so as to set the cookie for next time.
That’s pretty much all it does. There are variants, which also use script to inspect the pages you linked from and use that to make better predictions of what you want to see ads for, but the overall effect is the same. The ad network doesn’t know who you are, or any demographic info about you - all it knows is that some person with a random ID has visited a specific list of sites. In addition, there’s a simple way to dump all that tracking information - tell your browser to delete all the cookies (or just the ones for ad networks.) Whenever you do this, the ad networks will all think you’re a “new” person and provide you with a new ID number.
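To make the mechanics concrete, here is a small Python model of the tracking loop described above. It is a sketch under loose assumptions, not how any real ad network is implemented: the `AdNetwork` class, the site-to-topic table, and the “most-visited topic wins” rule are all invented for illustration.

```python
import uuid
from collections import Counter


class AdNetwork:
    """Toy ad server: one random ID per cookie, one referrer list per ID."""

    def __init__(self, ads_by_topic, site_topics):
        self.ads_by_topic = ads_by_topic  # topic -> ad; must include "default"
        self.site_topics = site_topics    # referrer site -> topic (assumed known)
        self.history = {}                 # cookie ID -> list of referrers seen

    def serve(self, referrer, cookie_id=None):
        """Handle one ad request and return (ad, cookie_id_to_store)."""
        if cookie_id is None or cookie_id not in self.history:
            # No cookie (or an unknown one): this looks like a new person.
            cookie_id = uuid.uuid4().hex
            self.history[cookie_id] = []
        self.history[cookie_id].append(referrer)  # the "tracking" step

        topics = Counter(
            self.site_topics[site]
            for site in self.history[cookie_id]
            if site in self.site_topics
        )
        if topics:
            best_topic = topics.most_common(1)[0][0]
            return self.ads_by_topic.get(best_topic, self.ads_by_topic["default"]), cookie_id
        return self.ads_by_topic["default"], cookie_id


network = AdNetwork(
    ads_by_topic={"cars": "car ad", "default": "generic ad"},
    site_topics={"cars.example": "cars"},
)
ad, cookie = network.serve("news.example")          # first visit, no cookie
ad, cookie = network.serve("cars.example", cookie)  # same ID, now targeted
print(ad)
ad, new_cookie = network.serve("news.example")      # cookies deleted
print(ad, new_cookie != cookie)
```

Note that the only thing stored against the ID is the list of referrers. Calling `serve` without a cookie, which is what happens after the browser deletes its cookies, starts a fresh history under a new ID.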
So, how do we stop the ad tracking (should you even really want to)? I can see a few possibilities, but all have some significant difficulties associated with them:
1.) Set a cookie that essentially sets your ID as “don’t track me, use random ads instead.” Whenever you visit an ad network, this “do-not-track” ID is sent, and the ad network sends you back a random ad without bothering to record your referrer. Issues: due to the same-site rule, this cookie must be set by each ad network itself. So there’s no common registry — you have to opt out with each ad network, and then trust each ad network to continue to obey the opt-out.
2.) Install an app or modify the browser to dump cookies. Works great; no more tracking. Issues: also breaks half of the Web. If you allow even per-session cookies, some limited tracking is possible, and if you don’t allow session cookies, you break pretty much all of the Web.
3.) Have your ISP scan all your web traffic, find cookies that are going to ad networks, and strip only those. This makes the web work normally while killing ad networks. Issues: requires all the ISPs offering this sort of technology to keep track of every ad network in the world so they know which cookies to block. What about single-site ad networks? (e.g. the New York Times tracking which articles on their site you read and targeting ads based on those.) There are probably tens of thousands of them.
Also, the above three examples are only pointing out issues when ad networks are not malicious – that is, they want to allow you to opt out if you so desire. If they are hostile, then they can work around any of the above options. They can simply disregard the do-not-track cookies and set a different ID, or track you via codes embedded in image tags. The latter method is inferior, since it does not persist across sessions (it forgets who you are whenever you close your browser) without the cooperation of the actual sites the ads are on, but it does still allow some tracking capability. Affiliate networks are constantly advertising and improving their “cookieless traffic” capabilities.
Of course, if the government cares to get involved, it can simply mandate that all ad networks offer an opt-out, and pursue legal action against any who don’t, or who evade their own opt-out systems. However, what it can’t do is offer a centralized list like the Do Not Call Registry. After all, the ad networks do not know who you are – they only know you are some random ID number who has visited various sites in the past. Thus, they have no way to check against a list and see if you’re on it. And since cookies can only be sent to the site they came from, the government site can’t set some kind of master “do-not-track” cookie — your browser would refuse to send the cookie to any ad networks!
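The reason a master cookie cannot work is visible in a very small model of a browser’s cookie store. The Python below is a simplified sketch: the `CookieJar` class here is hypothetical (it is not the standard library’s `http.cookiejar`), the domain names are placeholders, and real browsers apply further rules (paths, secure flags, expiry) that are left out.

```python
class CookieJar:
    """Toy browser cookie store keyed by the domain that set each cookie."""

    def __init__(self):
        self._cookies = {}  # domain -> {cookie name: value}

    def set_cookie(self, domain, name, value):
        self._cookies.setdefault(domain, {})[name] = value

    def cookies_for(self, request_host):
        """Return only the cookies whose domain matches the request host."""
        sent = {}
        for domain, cookies in self._cookies.items():
            if request_host == domain or request_host.endswith("." + domain):
                sent.update(cookies)
        return sent


jar = CookieJar()
# A hypothetical central registry sets an opt-out cookie on its own domain.
jar.set_cookie("donottrack.example", "optout", "1")

print(jar.cookies_for("www.donottrack.example"))  # the registry sees it
print(jar.cookies_for("ads.adnetwork.example"))   # the ad network never does
```

Because the ad network’s host never matches the registry’s domain, the opt-out cookie is simply never included in requests to it, so each ad network would have to set its own.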
However, before instituting a system like this at all, we should perhaps consider the unintended consequences. The reason that ad networks institute tracking is that targeted ads are more valuable to advertisers than random ones. A car company would rather show ads to car buffs than to people who don’t drive, and it will pay more for ads it knows are going to interested parties. Thus, if ad networks cannot target ads with tracking, they will have to charge less for ads. This means that sites will get paid less per ad for placing ad network links on their sites. Therefore, eliminating ad network tracking means sites will have to carry more ads. Is “more ads” really what we want here? Are we willing to accept more ads to ditch the tracking? How big a privacy threat is this, anyway? There are people I don’t want to track my web surfing, certainly, but DoubleClick and aQuantive are not the people I’m thinking of here. Perhaps what we need is not a way to opt out of ad tracking, but more limits on who can get that data? Were ad tracking data illegal to resell and not admissible in court, would we care about it at all? I’m not sure that I would.
Of course, much of this is moot if instead of opting out of the tracking systems, you just “opt out” of the ad networks altogether, either with a plugin like AdBlock (which advertisers hate) or a custom hosts file. It doesn’t get 100% of the networks, of course, but it sure gets a lot of them.
The War on the Unexpected
Bruce Schneier has a good post today called “The War on the Unexpected,” about the unintended results of asking the general population to report anything suspicious. Even discounting deliberate malfeasance (reporting the neighbor you don’t like as “suspicious”), people find a lot of things suspicious, and the gatekeepers have no motivation to apply intelligent filtering to public reports. When someone makes a specious report and the police overreact, they’re praised for their vigilance, while the real victim in the situation is lucky to escape without prison time. The result is a paranoid society where merely being unusual can get you into trouble — the very opposite of a free society where your actions are none of anyone else’s business unless you’re directly harming them.
Of course, there’s not much motivation for government to reduce these overzealous “awareness” programs, either. A paranoid populace is always supportive of more government intervention to “protect” them, and making everyone into a criminal makes social control quite easy, since there is no one not subject to arrest, only the people you haven’t chosen to arrest yet.
Terrorism can never be absolutely prevented because terrorism is easy — it is a sad fact of chemistry that many things explode, and there are many ways of being dead. A free society can only prevent crime because criminals have something to lose — people acting in self-interest do not want to die or go to prison, and a free society must fight crime via punishing criminals after the crime has been committed. Since terrorists of the current radical Islamic model aren’t deterred in this way, we are deprived of our normal security responses and forced to try to fight with prevention only, rather than the standard responses of detection & punishment. To truly eliminate this sort of terrorism requires changing the culture from which it emerges — removing the “feed stock” of terrorist organizations by giving people something to live for. This is not a short-term project.
The proper response of a free society to terrorism is not “prevention at all costs,” but rather prevention where the cost is justified and resilience where it is not. Western society is distributed, and has a phenomenal depth of resources that is absent in many other societies - our culture is, in short, extremely hard to destroy. As catastrophic as the September 11th attacks were, your chances of dying in a terrorist attack remain smaller than your chances of dying of heatstroke, inhalation of a foreign object, or drowning in a swimming pool; our society is threatened not by the direct damage of terrorist attacks but by the response those attacks cause in us. Some threats are direct and obvious enough that mitigating them makes sense, but for many threats the rational response is to accept the risk; that is, recognize that the risk is there, understand that the chances of it affecting you, personally, are nearly nil, and that absolute safety does not exist. We need to go on about our lives, and work to recover from attacks in the same way that we recover from natural disasters. When a disaster happens, we mourn, we help the people affected, we rebuild the damage - but we do not change our way of life because of it. Somehow, we think that human-caused disasters should be entirely different, but this is not necessarily the case.
Stripping for CAPTCHAs
Spammers want email accounts. Free email services like Yahoo! Mail, GMail, and Windows Live Hotmail want to give people free email accounts, but they don’t want to help spammers. Thus, they try to make sure that it is easy for one person to sign up for an email account, but hard for a spam system to sign up for 1000 email accounts.
Thus, when you sign up for an email account, such as on this Yahoo! page, you’re required to complete a CAPTCHA (“Completely Automated Public Turing test to tell Computers and Humans Apart”; the acronym probably came first) to prove you’re not a computer. These are relatively easy for humans to read, but relatively difficult for computers to - although as OCR software gets better at reading them, they get harder and harder for people to read. Eventually these will stop working altogether when the crossover error rate for computers reading them is equal to or lower than the one for humans, though this is a good way off. We’re already at the point where when a major online service increases their CAPTCHA difficulty, they notice a significant drop-off in sign-ups as users find themselves unable to complete them (many users, if they can’t complete it in 1-2 tries, consider it to be not worth the effort and go on to another site.)
In the meantime, though, spammers keep trying to find ways to bypass them. Automation doesn’t work so well — that being the whole point — so they’ve come up with rather innovative ways to do this.
One option: just pay people to solve them for you. Spamming makes money. One email account can send thousands of spams before being shut down. In the global economy, you can hire someone for $0.60/hr. to solve CAPTCHAs for you without asking questions like “why are you doing this?” At $50/week, you can have all the email accounts you need to make rather more than $50 sending spam.
A newer option: make people think it’s a game. Yes, there’s a piece of malware floating around that has a digitized woman stripping for CAPTCHAs. It’s like digital strip poker, only instead of winning a hand of cards you just have to correctly answer a CAPTCHA. You fill them out, the app signs up for an email account and sends it to the spammer, and it shows you porn. It’s considered malware (Trend Micro calls it TROJ_CAPTCHAR.A) because it’s being used for spamming, but the app does exactly what it says it does — it doesn’t harm its user, it just helps spammers in the background.
Of course, in a sense CAPTCHA is still serving its purpose — it is stopping purely automated attacks. Neither paying people nor tricking them with porn games scales nearly as well as straight automation — without a CAPTCHA you could create thousands of email accounts per hour rather than per week. However, it still serves as a good illustration of the ingenuity of attackers, and the fact that no countermeasure makes an app “secure” — they make it secure from something. In this case, with pure automation foreclosed to them, attackers have simply found an end-run around the problem. CAPTCHAs are dependent on making it not worth the spammer’s time to fake sign-ups, and in that they succeed… where they fail is that some other people value their time far less than spammers do, and spammers are learning to exploit that fact.