Buffer overflow protection: Difference between revisions

Content deleted Content added
rm already linked in body
Mark section no references, remove example section for WP:NOTGUIDE (and no references)
Line 19:
 
==Canaries==
{{Unreferenced section|date=November 2023}}
''Canaries'' or ''canary words'' are known values that are placed between a buffer and control data on the stack to monitor buffer overflows. When the buffer overflows, the first data to be corrupted will usually be the canary, so a failed verification of the canary data signals an overflow, which can then be handled, for example, by invalidating the corrupted data. A canary value should not be confused with a [[sentinel value]].
 
Line 88 ⟶ 89:
===StackGhost (hardware-based)===
Invented by [[Mike Frantzen]], StackGhost is a simple tweak to the register window spill/fill routines which makes buffer overflows much more difficult to exploit. It uses a unique hardware feature of the [[Sun Microsystems]] [[SPARC]] architecture (namely, deferred on-stack in-frame register window spill/fill) to detect modifications of return [[Pointer (computer programming)|pointers]] (a common way for an [[exploit (computer security)|exploit]] to hijack execution paths) transparently, automatically protecting all applications without requiring binary or source modifications. The performance impact is negligible, less than one percent. The resulting [[gdb]] issues were resolved by [[Mark Kettenis]] two years later, allowing the feature to be enabled. Following this event, the StackGhost code was integrated (and optimized) into [[OpenBSD]]/SPARC.
 
==A canary example==
Normal buffer allocation for [[x86]] architectures and other similar architectures is shown in the [[buffer overflow]] entry. Here, we will show the modified process as it pertains to StackGuard.
 
When a function is called, a stack frame is created. A stack frame is built from the end of memory to the beginning, and each stack frame is placed on the top of the stack, closest to the beginning of memory. Thus, running off the end of a piece of data in a stack frame alters data previously entered into the stack frame, and running off the end of a stack frame places data into the previous stack frame. A typical stack frame may look as below, having a [[return statement|return address]] (RETA) placed first, followed by other control information (CTLI).
 
(CTLI)(RETA) <!-- replace these PsOS with images -->
 
In [[C (programming language)|C]], a function may contain many different per-call data structures. Each piece of data created on call is placed in the stack frame in order, and is thus ordered from the end to the beginning of memory. Below is a hypothetical function and its stack frame.
<syntaxhighlight lang="c">
#include <string.h>

/* Hypothetical input sources, assumed to be defined elsewhere. */
const char *get_c(void);
const char *get_d(void);

int foo(void) {
    int a;       /* integer */
    int *b;      /* pointer to integer */
    char c[10];  /* character arrays */
    char d[3];

    b = &a;               /* initialize b to point to location of a */
    strcpy(c, get_c());   /* get c from somewhere, write it to c (unbounded) */
    *b = 5;               /* the data at the point in memory b indicates is set to 5 */
    strcpy(d, get_d());   /* unbounded copy into d */
    return *b;            /* read from b and pass it to the caller */
}
</syntaxhighlight>
(d..)(c.........)(b...)(a...)(CTLI)(RETA)
 
In this hypothetical situation, if more than ten bytes are written to the array {{code|c}}, or more than 13 to the character array {{code|d}}, the excess will overflow into integer pointer {{code|b}}, then into integer {{code|a}}, then into the control information, and finally the return address. By overwriting {{code|b}}, the pointer is made to reference any position in memory, so the later assignment through {{code|b}} writes to an arbitrary address, or the final read through it reads from one. By overwriting ''RETA'', the function can be made to execute other code (when it attempts to return), either existing functions ([[return-to-libc attack|ret2libc]]) or code written into the stack during the overflow.
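The overflow order described above can be sketched with a simulated frame. This is only a model under stated assumptions: the frame is a plain byte array, the offsets and 4-byte field sizes are illustrative, and the names <code>ORIG_*</code>, <code>naive_unbounded_write</code> and <code>naive_region_changed</code> are hypothetical. A real compiler is free to choose a different layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of (d..)(c.........)(b...)(a...)(CTLI)(RETA).
   Offsets and 4-byte field sizes are assumptions for illustration. */
enum {
    ORIG_OFF_D      = 0,   /* char d[3]                     */
    ORIG_OFF_C      = 3,   /* char c[10]                    */
    ORIG_OFF_B      = 13,  /* pointer b, modeled as 4 bytes */
    ORIG_OFF_A      = 17,  /* int a, modeled as 4 bytes     */
    ORIG_OFF_CTLI   = 21,  /* control information           */
    ORIG_OFF_RETA   = 25,  /* return address                */
    ORIG_FRAME_SIZE = 29
};

/* Models an unbounded copy such as strcpy(): the write does not stop at
   the end of the target buffer, only at the end of the simulated frame. */
static void naive_unbounded_write(unsigned char *frame, size_t off,
                                  const unsigned char *src, size_t n)
{
    assert(off + n <= (size_t)ORIG_FRAME_SIZE); /* keep the model in bounds */
    memcpy(frame + off, src, n);
}

/* Returns 1 if any byte in [off, off + len) differs from `fill`. */
static int naive_region_changed(const unsigned char *frame, size_t off,
                                size_t len, unsigned char fill)
{
    for (size_t i = 0; i < len; i++) {
        if (frame[off + i] != fill) {
            return 1;
        }
    }
    return 0;
}
```

In this model, writing ten bytes at <code>ORIG_OFF_C</code> leaves the bytes of {{code|b}} untouched, while writing fourteen bytes replaces all four of them.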
 
In a nutshell, poor handling of {{code|c}} and {{code|d}}, such as the unbounded [[strcpy]]() calls above, may allow an attacker to control a program by influencing the values assigned to {{code|c}} and {{code|d}} directly. The goal of buffer overflow protection is to detect this issue in the least intrusive way possible. This is done by moving what can be moved out of harm's way and placing a sort of tripwire, or '''canary''', after the buffer.
 
Buffer overflow protection is implemented as a change to the compiler. As such, it is possible for the protection to alter the structure of the data on the stack frame. This is exactly the case in systems such as ''ProPolice''. The above function's automatic variables are rearranged more safely: arrays {{code|c}} and {{code|d}} are allocated first in the stack frame, which places integer {{code|a}} and integer pointer {{code|b}} before them in memory. So the stack frame becomes
 
(b...)(a...)(d..)(c.........)(CTLI)(RETA)
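A minimal sketch of the reordered layout, assuming a struct of byte arrays stands in for the frame. The type <code>struct reordered_frame</code>, the helper <code>reordered_copy_into_c</code> and the 4-byte field sizes are hypothetical, and the model assumes the compiler inserts no padding between byte-array members.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of (b...)(a...)(d..)(c.........)(CTLI)(RETA).
   All members are byte arrays so that the layout has no padding in
   practice; field sizes are assumptions for illustration. */
struct reordered_frame {
    unsigned char b[4];
    unsigned char a[4];
    unsigned char d[3];
    unsigned char c[10];
    unsigned char ctli[4];
    unsigned char reta[4];
};

/* Models an unbounded copy into c: the write starts at c and runs
   toward the end of the frame for n bytes. */
static void reordered_copy_into_c(struct reordered_frame *f,
                                  const unsigned char *src, size_t n)
{
    size_t start = offsetof(struct reordered_frame, c);
    assert(start + n <= sizeof *f); /* keep the model in bounds */
    memcpy((unsigned char *)f + start, src, n);
}
```

Copying fourteen bytes into {{code|c}} in this model leaves {{code|a}}, {{code|b}} and {{code|d}} intact but still overwrites ''CTLI'', which is the gap the reordering alone does not close.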
 
As it is impossible to move ''CTLI'' or ''RETA'' without breaking the produced code, another tactic is employed. An extra piece of information, called a "canary" (CNRY), is placed after the buffers in the stack frame. When the buffers overflow, the canary value is changed. Thus, to effectively attack the program, an attacker must leave a definite indication of the attack. The stack frame is
 
(b...)(a...)(d..)(c.........)(CNRY)(CTLI)(RETA)
 
At the end of every function there is an instruction which continues execution from the memory address indicated by ''RETA''. Before this instruction is executed, a check of ''CNRY'' ensures it has not been altered. If the value of ''CNRY'' fails the test, program execution is ended immediately. In essence, both deliberate attacks and inadvertent programming bugs result in a program abort.
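The check before the return can be modeled as follows. This is a simulation rather than compiler-generated code: <code>struct guarded_frame</code>, <code>guarded_enter</code>, <code>guarded_copy_into_c</code> and <code>guarded_leave</code> are hypothetical names, the 4-byte canary is an assumption, and a real implementation aborts the program instead of returning a status.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of (c.........)(CNRY)(CTLI)(RETA); sizes are
   assumptions for illustration. */
struct guarded_frame {
    unsigned char c[10];
    unsigned char cnry[4];
    unsigned char ctli[4];
    unsigned char reta[4];
};

/* Models the function prologue: clear the frame and store the canary. */
static void guarded_enter(struct guarded_frame *f,
                          const unsigned char canary[4])
{
    memset(f, 0, sizeof *f);
    memcpy(f->cnry, canary, 4);
}

/* Models an unbounded copy into c, which sits at the start of the frame. */
static void guarded_copy_into_c(struct guarded_frame *f,
                                const unsigned char *src, size_t n)
{
    assert(n <= sizeof *f); /* keep the model in bounds */
    memcpy((unsigned char *)f, src, n);
}

/* Models the function epilogue: returns 1 if CNRY is unchanged and it is
   safe to continue at RETA, 0 if execution should be ended instead. */
static int guarded_leave(const struct guarded_frame *f,
                         const unsigned char canary[4])
{
    return memcmp(f->cnry, canary, 4) == 0;
}
```

A ten-byte copy passes the check in this model. A copy long enough to reach ''RETA'' necessarily passes over ''CNRY'' first, so the check fails before the altered return address would be used.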
 
The canary technique adds a few instructions of overhead for every function call with an automatic array, immediately before all dynamic buffer allocation and after dynamic buffer deallocation. This overhead is not significant. The technique works unless the canary remains unchanged despite an overflow. If the attacker knows that the canary is there and can determine its value, they may simply overwrite it with its own value. This is usually difficult to arrange intentionally, and highly improbable in unintentional situations.
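The limitation described above can be illustrated with the same kind of simulated frame. The names <code>struct known_frame</code> and <code>known_overflow_undetected</code> are hypothetical, and the model simply assumes the attacker supplies a guess for the 4-byte canary. It does not show how such a value might be obtained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of (c.........)(CNRY)(CTLI)(RETA); sizes are
   assumptions for illustration. */
struct known_frame {
    unsigned char c[10];
    unsigned char cnry[4];
    unsigned char ctli[4];
    unsigned char reta[4];
};

/* Simulates an overflow of the whole frame in which the bytes landing on
   CNRY are the attacker's guess. Returns 1 if the epilogue check still
   passes (the overflow goes unnoticed), 0 if the check detects it. */
static int known_overflow_undetected(struct known_frame *f,
                                     const unsigned char canary[4],
                                     const unsigned char guess[4])
{
    unsigned char payload[sizeof(struct known_frame)];

    /* Prologue: clear the frame and store the real canary. */
    memset(f, 0, sizeof *f);
    memcpy(f->cnry, canary, 4);

    /* Payload: filler everywhere, with the guess placed where CNRY lies. */
    memset(payload, 'A', sizeof payload);
    memcpy(payload + offsetof(struct known_frame, cnry), guess, 4);

    /* Unbounded copy starting at c that runs through RETA. */
    memcpy(f, payload, sizeof payload);

    /* Epilogue check. */
    return memcmp(f->cnry, canary, 4) == 0;
}
```

When the guess equals the real canary, the check passes even though ''RETA'' has been overwritten. Any other guess is detected.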
 
The position of the canary is implementation-specific, but it is always between the buffers and the protected data. Different positions and lengths offer different benefits.
 
==See also==