In my last post about finding a job in information security, when discussing application security, I off-handedly mentioned several mitigation technologies — GS, DEP, SAL, and ASLR. These are technologies developed by OS vendors to provide system-wide protection against common attacks, and are things every application developer should know about when dealing with native (unmanaged) code.
The scourge of C and C++ apps for the last decade and a half has been the stack buffer overflow. This is an attack wherein the attacker discovers that an application copies some piece of user input into a spot in memory without first checking that it will fit. In the most common scenario, the spot in memory is a local variable, which means that carefully-crafted input can overwrite the return pointer on the stack with a user-selected value. When the function finishes, it then transfers execution to the attacker's chosen address (typically a spot inside the input itself), which can take control of the running process and do anything that that process's owner is capable of. If the process is an OS service, running with a privileged account like root on a UNIX/Linux system or Administrator/SYSTEM on a Windows system, the attacker may be able to take full control of the system. I first learned about this attack from Aleph One's classic 1996 Phrack article, Smashing the Stack for Fun and Profit.
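To make this concrete, here's a minimal C sketch of the pattern (the function names are made up for illustration): the unsafe version copies input into a fixed-size stack buffer with no length check, while the safe version rejects anything that won't fit.

```c
#include <stdio.h>
#include <string.h>

/* The classic vulnerable pattern: user input copied into a
   fixed-size stack buffer with no length check. If `input` is
   16 bytes or longer, strcpy writes past `buf`, over the saved
   EBP and saved EIP. */
void greet_unsafe(const char *input) {
    char buf[16];
    strcpy(buf, input);            /* no bounds check -- can overflow */
    printf("Hello, %s\n", buf);
}

/* The fix: check the input against the buffer size first.
   Returns 0 on success, -1 if the input would not fit. */
int greet_safe(const char *input) {
    char buf[16];
    if (strlen(input) >= sizeof buf)
        return -1;                 /* reject oversized input */
    strcpy(buf, input);            /* now known to be in bounds */
    printf("Hello, %s\n", buf);
    return 0;
}
```

With `greet_unsafe`, a 32-byte "name" would smash the frame; `greet_safe` simply refuses it.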
Application developers have been told for many years now to be very careful when allocating memory and copying data, especially strings, to prevent these exploits. However, it’s relatively difficult, so developers continue to make the same mistakes. In addition, the attackers get more creative, and have found variations on this attack that are even harder to avoid. Luckily, OS developers have also been busy trying to find global mitigations for these attacks, so that developers can’t make these mistakes, and the whole computing ecosystem becomes safer.
The first common OS-based mitigation technology is the stack canary. On Windows, this mitigation is activated via the /GS compiler option (for Guard Stack); OpenBSD incorporates a similar kernel-level mechanism on SPARC called StackGhost, while recent GCC releases include ProPolice-based stack protection, enabled with -fstack-protector. Of the major OSes currently in use, only Mac OS X is missing a stack canary feature.
Whenever a function is called, a stack frame is created in memory for the function call. The stack frame is arranged as follows:
|Local Variables||Saved EBP||Saved EIP||Arguments|
Each portion of the frame is just large enough for its contents. EIP is the instruction pointer — whatever EIP points to, the processor executes. The Saved EIP is the return pointer — when the function returns, that saved value is placed into EIP.
A buffer overflow occurs when the attacker tricks the application into placing something into a local variable that is too large to fit. It thus overflows its bounds, overwriting the saved registers. Since the saved EIP has been overwritten, when the function returns, execution jumps to whatever value the attacker wants. However, in a /GS-compiled binary, this is much more difficult, as the stack frame instead looks like this:
|Local Variables||Canary||Saved EBP||Saved EIP||Arguments|
The canary is basically an arbitrary random number. The compiler emits code that places it on the stack when the frame is created and, before returning to the saved EIP, checks that its value hasn't changed. This poses a problem for the attacker, because it's in the way! Any value large enough to overwrite the saved EIP will also overwrite the canary, and the attacker doesn't know what the canary value is. In order to get it, he would need to execute some code to read it… and he can't execute code with the canary in the way. Thus, stack buffer overflows are prevented.
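The mechanics can be simulated in plain C. This is only a toy model (the real canary is inserted by the compiler's prologue and epilogue, and the cookie value here is made up): a struct stands in for the stack frame, and an unchecked copy that runs past the locals is caught because it trashes the canary.

```c
#include <string.h>

/* A toy model of a /GS stack frame. Real canaries are placed by
   the compiler; this struct just makes the layout visible. */
struct frame {
    char          locals[16];   /* the overflowable buffer          */
    unsigned long canary;       /* sits between locals and the EIP  */
    unsigned long saved_eip;    /* what the attacker wants to reach */
};

/* Stand-in for the process-wide random cookie chosen at startup. */
static const unsigned long COOKIE = 0xBB40E64EUL;

/* Simulate a copy into the local buffer, then the epilogue check.
   Returns 1 if the frame is intact, 0 if corruption was detected. */
int run_frame(const void *input, size_t len) {
    struct frame f;
    f.canary    = COOKIE;
    f.saved_eip = 0x401000UL;   /* legitimate return address       */
    memcpy(&f, input, len);     /* the unchecked copy              */
    return f.canary == COOKIE;  /* the /GS-style epilogue check    */
}
```

A copy that fits within the 16-byte buffer leaves the canary alone; one that reaches toward the saved EIP necessarily overwrites it first and is detected.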
Some creative attackers figured out that you could still sometimes do some damage by overwriting not the saved EIP, but the function arguments. If a function makes use of delegation and receives function pointers in arguments, you could sometimes still execute code this way, because they would be used during the function, and /GS only checks the canary when the function returns. Thus, in recent versions of Visual Studio, /GS also causes the system to make a copy of the arguments when a stack frame is created, placed before the local variables. The copy is used until the function exits; thus, overwriting the arguments does nothing until the function returns, at which time the canary is checked, and any corruption is detected.
Hardware Data Execution Protection
Another mitigation added for buffer overflow prevention is what Microsoft calls Data Execution Prevention (DEP), which makes use of Intel and AMD's NX (No-Execute) flag on recent CPUs. On NX-enabled CPUs, each memory page is marked as either code (executable) or data (not executable), and a fatal error occurs if EIP ever points into a data page. A compiler flag in Visual Studio 2005 and greater (/NXCOMPAT) enables this feature on an application; Linux compilers have also added a similar feature.
The entire stack is marked as data, which normally stops stack overflows from being exploited: the attacker can still overwrite the saved EIP, but he can't jump execution into his own input, so he can't run his own code, only code already in the process. However, once again, enterprising hackers have found a way around it — the “return to libc” attack. They overwrite the saved EIP with an address pointing to kernel32!VirtualProtect(), the very function that marks pages as code or data! With carefully crafted arguments, they can instruct VirtualProtect to mark the stack as code, then return into their payload. On the bright side, this is very difficult, and won't work if the exploitable buffer is a string, because the required arguments are full of null bytes.
A more elaborate attack can call into ntdll!NtSetInformationProcess and disable NX for the entire process. The advantage of this is that it can be done without null bytes (though it's very complicated), so it works even when the exploitable buffer is a string. The disadvantage, though, is that it won't likely work on a securely-configured production server: if NX is globally enabled in boot.ini, ntdll!NtSetInformationProcess cannot override it.
Though I’ve mentioned Windows-specific function names here, there are Linux equivalents that can be used in attacks. (Indeed, it’s called the “return to libc” attack because of the name of the UNIX/Linux C runtime library.)
Address Space Layout Randomization
All of these evasions of NX protection require being able to instruct the system to jump directly into system functions. Doing this requires address prediction — you have to know where in memory the system functions are so you can jump to them. Even in the simple stack-smashing exploit, the attacker still needs to know where the stack is in order to place that value into the saved EIP. Address Space Layout Randomization (ASLR) is a relatively new technology that makes address prediction nearly impossible by making libraries load into different locations on every reboot. If the attacker doesn’t know where the libraries are, he generally cannot jump to them with any reliability.
ASLR is enabled on Windows using the linker flag /DYNAMICBASE. OpenBSD has ASLR by default; Linux implementations have a weak form of ASLR but can be upgraded to full ASLR using popular kernel patches such as PaX. Once again, Mac OS X is the only major OS missing this mitigation, though changes in OS X 10.5 imply they are preparing to add it in a future version.
ASLR randomizes where libraries are found, so that it is very difficult to predict where they are. It does, however, have a few weaknesses:
- In many cases, executable files themselves are not randomized. Thus, attackers are prevented from jumping to system functions, but can still jump to functions in the executable file.
- Only the high-order bits of addresses are randomized; the low 16 bits are left alone, so the attacker can still jump to anything within 16 memory pages (64 KB) of known address space.
- It may be possible to brute-force the location of a library by simply trying all the addresses if you have a section of code that will permit this.
The third case is very difficult on Windows, since there are no forking daemons, and if a service is made to crash several times in a row it will stop restarting (precisely to prevent this sort of brute-forcing). However, on UNIX/Linux systems this is possible, and it may be possible on Windows, too, if the code being exploited eats exceptions (i.e. it has an exception handler that discards errors and keeps the service running).
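A quick way to see ASLR in action is to print a couple of addresses and run the program a few times (a small illustrative sketch, not a test of any particular OS version):

```c
#include <stdio.h>

/* Print the address of a stack variable and of a libc function.
   On an ASLR-enabled system, both addresses change from run to
   run; without ASLR they stay stable, which is exactly what makes
   address prediction possible for an attacker. */
void show_addresses(void) {
    int local = 0;
    printf("stack variable: %p\n", (void *)&local);
    printf("libc printf:    %p\n", (void *)printf);
}
```

On a system without ASLR, every run prints the same values, and an attacker can hard-code them into an exploit.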
Safe Structured Exception Handling
In Windows C++ applications, there's another way around the stack canaries — exploiting Structured Exception Handling. When SEH is used, the stack looks like this:
|Local Variables||SEH Next||SEH Ret||Canary||Saved EBP||Saved EIP||Arguments|
Those SEH pointers are found before the canary, and thus can be overwritten. It’s possible to craft values for those pointers that point into the stack, and then force an exception to occur. When the exception happens, the pointers are followed and arbitrary code is run. Stack canaries don’t help with this (and the canary can’t be put before the SEH pointers, because in a sense they are local variables, just not ones declared by the programmer), though NX still does. However, since NX is not available on all processors (nor enabled for all processes), Microsoft also introduced the /SAFESEH linker flag.
When a binary is linked with /SAFESEH, the linker records all of its legitimate exception handlers in a table inside the image. Before ever jumping to an SEH handler pointer, the system verifies that the pointer points to something in that table. Thus, if the attacker overwrites the pointer, it does him no good — he can’t run anything that isn’t a registered exception handler.
There is a problem with this, though — it only works if every library used by the application was linked with /SAFESEH and records its exception handlers in a table. If even one library wasn’t, the system can’t verify that library’s pointers — they might well point to an exception handler that just wasn’t registered.
There are no non-Windows equivalents to /SafeSEH, as the SEH method of exception handling is a Windows-specific construct.
Security Annotation Language
Ideally, we wouldn’t need all these mitigations because we wouldn’t be writing buffer overflows in the first place. However, in complex code, overflows can be very hard to spot. We would prefer that the compiler just detect them at compile time, but the compiler doesn’t always know how our variables will be used, and thus cannot determine where an overflow may lie.
Microsoft’s Security Annotation Language (discussed in detail on Michael Howard’s blog here) allows the developer to “hint” to the compiler how all the arguments to a function are used. The developer adds SAL annotations to each function declaration, specifying whether arguments are input or output, whether they can be NULL, how long their buffers are, and so on. These “hints” (actually macros) give the compiler and its static analysis tools enough information to verify that no buffer overruns are being introduced.
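Here's a rough sketch of what annotated code looks like, using the _In_z_/_Out_writes_z_ style of annotations from Microsoft's sal.h (the annotations are stubbed out below so the sketch compiles on any compiler, and CopyName is a made-up example function, not a real API):

```c
#include <string.h>

/* SAL annotations are macros defined in Microsoft's <sal.h>.
   On other compilers, stub them out to nothing so this builds
   anywhere; under MSVC with /analyze the real macros drive the
   static analysis. */
#ifndef _In_z_
#define _In_z_
#define _Out_writes_z_(n)
#endif

/* The annotations say: `src` is a NUL-terminated input string,
   and `dst` is an output buffer holding `dstLen` characters.
   That is exactly the information an analyzer needs to flag a
   copy that could overflow `dst`. Returns 0 on success, -1 if
   the source would not fit. */
int CopyName(_Out_writes_z_(dstLen) char *dst,
             size_t dstLen,
             _In_z_ const char *src) {
    if (strlen(src) >= dstLen)
        return -1;              /* would not fit: refuse */
    strcpy(dst, src);
    return 0;
}
```

Note that the annotations cost nothing at runtime — they either expand to nothing or are consumed purely at analysis time.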
It’s more work for the developer, as they have to put some thought into the annotations, and a company making use of SAL has to enforce its use (i.e. no checking in functions that aren’t annotated.) However, while it’s work, it’s not difficult — unlike checking for buffer overruns manually, which is very difficult. With properly-annotated functions, most buffer overruns can be caught at compile time, and fixed before the application is ever released. Unfortunately, SAL has not seen much use outside of Microsoft itself, due to the extra developer overhead. It’s easier to get people to add a few compiler & linker flags than to change the way they program.