Buffer overflow protection

{{Short description|Software security techniques}}
'''Buffer overflow protection''' is any of various techniques used during software development to enhance the security of executable programs by detecting [[buffer overflow]]s on [[call stack|stack]]-allocated variables, and preventing them from causing program misbehavior or from becoming serious [[computer security|security]] vulnerabilities. A stack buffer overflow occurs when a program writes to a [[memory address]] on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, which could lead to program crashes, incorrect operation, or security issues.
 
Typically, buffer overflow protection modifies the organization of stack-allocated data so it includes a ''[[stack canary|canary]]'' value that, when destroyed by a stack buffer overflow, shows that a buffer preceding it in memory has been overflowed. By verifying the canary value, execution of the affected program can be terminated, preventing it from misbehaving or from allowing an attacker to take control over it. Other buffer overflow protection techniques include ''[[bounds checking]]'', which checks accesses to each allocated block of memory so they cannot go beyond the actually allocated space, and ''tagging'', which ensures that memory allocated for storing data cannot contain executable code.
Stack buffer overflow can be caused deliberately as part of an attack known as [[stack smashing]]. If the affected program is running with special privileges, or if it accepts data from untrusted network hosts (for example, a public [[webserver]]), then the bug is a potential security vulnerability that allows an [[hacker (computer security)|attacker]] to inject executable code into the running program and take control of the process. This is one of the oldest and most reliable methods for attackers to gain unauthorized access to a computer.<ref>{{cite journal |last=Levy |first=Elias |author-link=Elias Levy |title=Smashing The Stack for Fun and Profit |journal=[[Phrack]] |volume=7 |issue=49 |page=14 |date=1996-11-08 |url=http://www.phrack.org/issues/49/14.html#article }}</ref>
 
Typically, buffer overflow protection modifies the organization of data in the [[stack frame]] of a [[function call]] to include a "canary" value that, when destroyed, shows that a buffer preceding it in memory has been overflowed. This provides the benefit of preventing an entire class of attacks. According to some researchers,<ref>{{cite web|url=https://www.cpe.ku.ac.th/~mcs/courses/2005_02/214573/papers/buffer_overflows.pdf |title=Buffer Overflows: Attacks and Defenses for the Vulnerability of the Decade* |url-status=dead |archive-url=https://web.archive.org/web/20130309083252/http://tmp-www.cpe.ku.ac.th/~mcs/courses/2005_02/214573/papers/buffer_overflows.pdf |archive-date=2013-03-09 }}</ref> the performance impact of these techniques is negligible.
 
Stack-smashing protection is unable to protect against certain forms of attack. For example, it cannot protect against buffer overflows in the heap. There is no sane way to alter the layout of data within a [[Data structure|structure]]; structures are expected to be the same between modules, especially with shared libraries. Any data in a structure after a buffer is impossible to protect with canaries; thus, programmers must be very careful about how they organize their variables and use their structures.
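
The in-structure case can be illustrated with a minimal sketch. The {{code|account}} record and both helper functions are hypothetical names introduced only for this illustration; the copy is clamped to the size of the structure so that the sketch itself stays within defined behavior, while still writing past the end of {{code|name}}.

<syntaxhighlight lang="c">
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: a fixed-size buffer followed by a sensitive field. */
struct account {
    char name[8];
    int  is_admin;
};

/*
 * Simulates an unbounded copy into acct->name. The write goes through a
 * byte pointer to the whole object and is clamped to the space between
 * name and the end of the structure, so it can spill out of name but
 * never out of the structure.
 */
static void careless_set_name(struct account *acct,
                              const unsigned char *src, size_t len)
{
    unsigned char *bytes = (unsigned char *)acct;
    size_t room = sizeof *acct - offsetof(struct account, name);

    assert(acct != NULL && src != NULL);
    if (len > room)
        len = room;
    memcpy(bytes + offsetof(struct account, name), src, len);
}

/* Returns 1 if an oversized name changed is_admin, 0 otherwise. */
static int overflow_reaches_flag(void)
{
    struct account acct = { "guest", 0 };
    unsigned char input[sizeof acct];

    memset(input, 0x41, sizeof input);  /* more bytes than name can hold */
    careless_set_name(&acct, input, sizeof input);
    return acct.is_admin != 0;
}
</syntaxhighlight>

Because {{code|is_admin}} lies between {{code|name}} and any canary the compiler places after the structure, the flag is corrupted without the canary ever being touched.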
 
==Canaries==
{{Unreferenced section|date=November 2023}}
''Canaries'' or ''canary words'' or ''stack cookies'' are known values that are placed between a buffer and control data on the stack to monitor buffer overflows. When the buffer overflows, the first data to be corrupted will usually be the canary, and a failed verification of the canary data will therefore alert of an overflow, which can then be handled, for example, by invalidating the corrupted data. A canary value should not be confused with a [[sentinel value]].
 
The terminology is a reference to the historic practice of using [[animal sentinel#Toxic gases|canaries in coal mines]], since they would be affected by toxic gases earlier than the miners, thus providing a biological warning system. Canaries are alternatively known as ''stack cookies'', a name meant to evoke the image of a "broken cookie" when the value is corrupted.
 
There are three types of canaries in use: ''terminator'', ''random'', and ''random [[XOR]]''. Current versions of StackGuard support all three, while ProPolice supports ''terminator'' and ''random'' canaries.
''Random canaries'' are randomly generated, usually from an [[entropy (computing)|entropy]]-gathering [[daemon (computer software)|daemon]], in order to prevent an attacker from knowing their value. Usually, it is not logically possible or plausible for an attacker to read the canary in order to exploit it; the canary is a secret value known only to those who need to know it, in this case the buffer overflow protection code.
 
Normally, a random canary is generated at program initialization and stored in a [[global variable]]. This variable is usually [[Padding (cryptography)|padded]] by unmapped pages, so that an attempt to read it by exploiting a bug that discloses memory causes a [[segmentation fault]], terminating the program. It may still be possible to read the canary if the attacker knows where it is, or can get the program to read from the stack.
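
As a rough sketch of the initialization step, the following assumes a Unix-like system where {{code|/dev/urandom}} serves as the entropy source; the variable {{code|process_canary}} and both function names are hypothetical, and real implementations set the value up in the runtime startup code rather than in application code.

<syntaxhighlight lang="c">
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical global holding the per-process canary. */
static uintptr_t process_canary;

/*
 * Fills process_canary from /dev/urandom (an assumed entropy source).
 * Returns 0 on success, -1 if the source could not be read.
 */
static int init_process_canary(void)
{
    uintptr_t value = 0;
    FILE *src = fopen("/dev/urandom", "rb");

    if (src == NULL)
        return -1;
    if (fread(&value, sizeof value, 1, src) != 1) {
        fclose(src);
        return -1;
    }
    fclose(src);
    process_canary = value;
    return 0;
}

/* Returns 1 if a copy saved in a stack frame still matches the global. */
static int canary_intact(uintptr_t saved_copy)
{
    return saved_copy == process_canary;
}
</syntaxhighlight>

A function prologue would store a copy of {{code|process_canary}} in its frame, and the epilogue would pass that copy to {{code|canary_intact}} before returning.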
 
===Random XOR canaries===
''Random XOR canaries'' are random canaries that are [[Exclusive or|XOR]]-scrambled using all or part of the control data. In this way, once the canary or the control data is clobbered, the canary value is wrong.
 
Random XOR canaries have the same vulnerabilities as random canaries, except that the "read from stack" method of getting the canary is a bit more complicated. The attacker must get the canary, the algorithm, and the control data in order to re-generate the original canary needed to spoof the protection.
 
In addition, random XOR canaries can protect against a certain type of attack involving overflowing a buffer in a structure into a [[Pointer (computer programming)|pointer]] to change the pointer to point at a piece of control data. Because of the XOR encoding, the canary will be wrong if the control data or return value is changed. Because of the pointer, the control data or return value can be changed without overflowing over the canary.
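
The encoding can be sketched as follows, under the assumption that the canary is scrambled with the return address alone; the {{code|xor_frame}} structure and function names are hypothetical and model the frame as plain data rather than a real call stack.

<syntaxhighlight lang="c">
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two protected slots in a stack frame. */
struct xor_frame {
    uintptr_t stored_canary;   /* secret XOR return address */
    uintptr_t return_address;
};

/* Prologue: record the return address and the scrambled canary. */
static void xor_frame_enter(struct xor_frame *f, uintptr_t secret,
                            uintptr_t return_address)
{
    assert(f != NULL);
    f->return_address = return_address;
    f->stored_canary = secret ^ return_address;
}

/*
 * Epilogue: returns 1 only if the stored canary and the return address
 * are still consistent with the secret, 0 otherwise.
 */
static int xor_frame_check(const struct xor_frame *f, uintptr_t secret)
{
    assert(f != NULL);
    return (f->stored_canary ^ f->return_address) == secret;
}
</syntaxhighlight>

Changing either slot alone makes the check fail, which is why a write through a clobbered pointer that alters only the return address is still detected.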
 
Although these canaries protect the control data from being altered by clobbered pointers, they do not protect any other data or the pointers themselves. Function pointers especially are a problem here, as they can be overflowed into and can execute [[shellcode]] when called.
{{Main|Bounds checking}}
 
Bounds checking is a compiler-based technique that adds run-time bounds information for each allocated block of memory, and checks all pointers against those at run-time. For C and C++, bounds checking can be performed at pointer calculation time<ref name="joneskelly">{{cite web |url=http://www.doc.ic.ac.uk/~phjk/BoundsChecking.html |title=Bounds Checking for C |publisher=Doc.ic.ac.uk |access-date=2014-04-27 |archive-url=https://web.archive.org/web/20160326081542/https://www.doc.ic.ac.uk/~phjk/BoundsChecking.html |archive-date=2016-03-26 |url-status=dead }}</ref> or at [[Dereferencing|dereference]] time.<ref name="safecodesva">{{cite web|url=http://sva.cs.illinois.edu/sva.html |title=SAFECode: Secure Virtual Architecture |publisher=Sva.cs.illinois.edu |date=2009-08-12 |access-date=2014-04-27}}</ref><ref name="asan">{{cite web|url=https://code.google.com/p/address-sanitizer/|title=google/sanitizers|date=19 June 2021}}</ref><ref name="failsafec">{{cite web |url=http://staff.aist.go.jp/y.oiwa/FailSafeC/index-en.html |title=Fail-Safe C: Top Page |publisher=Staff.aist.go.jp |date=2013-05-07 |access-date=2014-04-27 |archive-url=https://web.archive.org/web/20160707163127/https://staff.aist.go.jp/y.oiwa/FailSafeC/index-en.html |archive-date=2016-07-07 |url-status=dead }}</ref>
 
Implementations of this approach use either a central repository, which describes each allocated block of memory,<ref name="joneskelly"/><ref name="safecodesva"/><ref name="asan"/> or [[fat pointer]]s,<ref name="failsafec"/> which contain both the pointer and additional data, describing the region that they point to.
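
The fat-pointer variant can be sketched in a few lines. The {{code|fat_ptr}} layout and helper names below are hypothetical and are not taken from any of the cited systems; the check shown is performed at dereference time.

<syntaxhighlight lang="c">
#include <assert.h>
#include <stddef.h>

/* Hypothetical fat pointer: an address plus the bounds of its region. */
struct fat_ptr {
    char  *base;    /* start of the allocated block */
    size_t length;  /* size of the block in bytes */
    size_t offset;  /* current position within the block */
};

/* Pointer arithmetic moves the offset but keeps the bounds attached. */
static struct fat_ptr fat_add(struct fat_ptr p, size_t delta)
{
    p.offset += delta;
    return p;
}

/*
 * Dereference-time check: stores value only if the offset is in bounds.
 * Returns 0 on success, -1 on a bounds violation.
 */
static int fat_store(struct fat_ptr p, char value)
{
    assert(p.base != NULL);
    if (p.offset >= p.length)
        return -1;
    p.base[p.offset] = value;
    return 0;
}
</syntaxhighlight>

A central-repository design would instead keep {{code|base}} and {{code|length}} in a separate table and look them up from the raw address, leaving the pointer representation unchanged.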
Tagging<ref>{{cite web |url=http://www.feustel.us/Feustel%20&%20Associates/Advantages.pdf |title=Tuesday, April 05, 2005 |website=Feustel.us |access-date=2016-09-17 |archive-url=https://web.archive.org/web/20160623195112/http://www.feustel.us/Feustel%20%26%20Associates/Advantages.pdf |archive-date=June 23, 2016 |url-status=dead }}</ref> is a compiler-based or hardware-based (requiring a [[tagged architecture]]) technique for tagging the type of a piece of data in memory, used mainly for type checking. By marking certain areas of memory as non-executable, it effectively prevents memory allocated to store data from containing executable code. Also, certain areas of memory can be marked as non-allocated, preventing buffer overflows.
 
Historically, tagging has been used for implementing high-level programming languages;<ref>{{cite journal |url=https://dl.acm.org/citation.cfm?id=36183&dl=ACM&coll=DL&CFID=488926714&CFTOKEN=95195479 |title=Tags and type checking in LISP: hardware and software approaches |year=1987 |publisher=ACM|doi=10.1145/36204.36183 |last1=Steenkiste |first1=Peter |last2=Hennessy |first2=John |journal=ACM SIGOPS Operating Systems Review |volume=21 |issue=4 |pages=50–59 |doi-access=free }}</ref> with appropriate support from the [[operating system]], tagging can also be used to detect buffer overflows.<ref>{{cite web |url=http://public.support.unisys.com/aseries/docs/clearpath-mcp-11.0/pdf/38347639-000.pdf |title=ClearPath Enterprise Servers MCP Security Overview |publisher=Public.support.unisys.com |access-date=2014-04-27 |archive-url=https://web.archive.org/web/20130124070111/http://public.support.unisys.com/aseries/docs/clearpath-mcp-11.0/pdf/38347639-000.pdf |archive-date=2013-01-24 |url-status=dead }}</ref> An example is the [[NX bit]] hardware feature, supported by [[Intel]], [[AMD]] and [[ARM architecture|ARM]] processors.
 
==Implementations==
In 2012, [[Google]] engineers implemented the <kbd>-fstack-protector-strong</kbd> flag to strike a better balance between security and performance.<ref>{{cite web|url=https://gcc.gnu.org/ml/gcc-patches/2012-06/msg00974.html |title=Han Shen(沈涵) - [PATCH&#93; Add a new option "-fstack-protector-strong" (patch / doc inside) |publisher=Gcc.gnu.org |date=2012-06-14 |access-date=2014-04-27}}</ref> This flag protects more kinds of vulnerable functions than <kbd>-fstack-protector</kbd> does, but not every function, providing better performance than <kbd>-fstack-protector-all</kbd>. It has been available in GCC since version 4.9.<ref>{{cite web|last1=Edge|first1=Jake|title="Strong" stack protection for GCC|url=https://lwn.net/Articles/584225/|website=Linux Weekly News|access-date=28 November 2014|date=February 5, 2014|quote=It has made its way into GCC 4.9}}</ref>
 
All [[Fedora (operating system)|Fedora]] packages are compiled with <kbd>-fstack-protector</kbd> since Fedora Core 5, and <kbd>-fstack-protector-strong</kbd> since Fedora 20.<ref>{{cite web|url=https://fedoraproject.org/wiki/Security_Features#Stack_Smash_Protection.2C_Buffer_Overflow_Detection.2C_and_Variable_Reordering |title=Security Features |publisher=FedoraProject |date=2013-12-11 |access-date=2014-04-27}}</ref><ref>{{cite web|url=https://fedorahosted.org/fesco/ticket/1128 |title=#1128 (switching from "-fstack-protector" to "-fstack-protector-strong" in Fedora 20) – FESCo |publisher=Fedorahosted.org |access-date=2014-04-27}}</ref> Most packages in [[Ubuntu (operating system)|Ubuntu]] are compiled with <kbd>-fstack-protector</kbd> since 6.10.<ref>{{cite web|url=https://wiki.ubuntu.com/Security/Features#stack-protector |title=Security/Features - Ubuntu Wiki |publisher=Wiki.ubuntu.com |access-date=2014-04-27}}</ref> Every [[Arch Linux]] package is compiled with <kbd>-fstack-protector</kbd> since 2011.<ref>{{cite web|url=https://bugs.archlinux.org/task/18864 |title=FS#18864 : Consider enabling GCC's stack-smashing protection (ProPolice, SSP) for all packages |publisher=Bugs.archlinux.org |access-date=2014-04-27}}</ref> All Arch Linux packages built since 4 May 2014 use <kbd>-fstack-protector-strong</kbd>.<ref>{{cite web |url=https://projects.archlinux.org/svntogit/packages.git/commit/trunk?h=packages/pacman&id=695ca25d4c24f3bd3b8c350d64f2697c733d5169 |archive-url=https://archive.today/20140718035407/https://projects.archlinux.org/svntogit/packages.git/commit/trunk?h=packages/pacman&id=695ca25d4c24f3bd3b8c350d64f2697c733d5169 |url-status=dead |archive-date=July 18, 2014 |title=svntogit/packages.git - Git clone of the 'packages' repository }}</ref> Stack protection is only used for some packages in [[Debian]],<ref>{{cite web |url=http://outflux.net/debian/hardening/ |title=Debian Security Hardening Statistics |publisher=Outflux.net |access-date=2014-04-27 
|archive-date=2014-04-28 |archive-url=https://web.archive.org/web/20140428012424/http://outflux.net/debian/hardening/ |url-status=dead }}</ref> and only for the [[FreeBSD]] base system since 8.0.<ref>{{cite web|url=http://www.freebsd.org/releases/8.0R/relnotes.html |title=FreeBSD 8.0-RELEASE Release Notes |publisher=Freebsd.org |date=2013-11-13 |access-date=2014-04-27}}</ref> Stack protection is standard in certain operating systems, including [[OpenBSD]],<ref>{{cite web| url = https://man.openbsd.org/gcc-local.1| title = OpenBSD's gcc-local(1) manual page| quote = gcc comes with the ''ProPolice'' stack protection extension, which is enabled by default.}}</ref> [[Hardened Gentoo]]<ref>{{cite web|url=https://wiki.gentoo.org/wiki/Hardened/Toolchain#Default_addition_of_the_Stack_Smashing_Protector_.28SSP.29|title=Hardened/Toolchain - Gentoo Wiki|quote=The Gentoo hardened GCC switches on the stack protector by default unless explicitly requested not to.|date=2016-07-31}}</ref> and [[DragonFly BSD]].{{Citation needed|date=September 2013}}
 
StackGuard and ProPolice cannot protect against overflows in automatically allocated structures that overflow into function pointers. ProPolice at least will rearrange the allocation order to get such structures allocated before function pointers. A separate mechanism for [[buffer Overflow#Pointer protection|pointer protection]] was proposed in PointGuard<ref>{{cite web|url=http://www.usenix.org/events/sec03/tech/full_papers/cowan/cowan_html/index.html|title=12th USENIX Security Symposium — Technical Paper}}</ref> and is available on [[Microsoft Windows]].<ref>{{cite web|url=http://blogs.msdn.com/michael_howard/archive/2006/08/16/702707.aspx|title=MSDN Blogs – Get the latest information, insights, announcements, and news from Microsoft experts and developers in the MSDN blogs.|date=6 August 2021 }}</ref>
 
===Microsoft Visual Studio===
 
===Clang/[[LLVM]]===
Clang supports the same <kbd>-fstack-protector</kbd> options as GCC<ref>{{cite web|url=https://lists.llvm.org/pipermail/cfe-dev/2017-April/053662.html |publisher=Clang.llvm.org |title=Clang mailing list |date=28 April 2017 |access-date=2022-11-16}}</ref> and a stronger "safe stack" ({{tt|1=-fsanitize=safe-stack}}) system with similarly low performance impact.<ref>{{cite web |title=SafeStack — Clang 17.0.0git documentation |url=https://releases.llvm.org/15.0.0/tools/clang/docs/SafeStack.html |website=clang.llvm.org}}</ref> Clang also has three buffer overflow detectors, namely [[AddressSanitizer]] (<code>-fsanitize=address</code>),<ref name="asan"/> UBSan (<code>-fsanitize=bounds</code>),<ref>{{cite web|url=http://clang.llvm.org/docs/UsersManual.html |title=Clang Compiler User's Manual — Clang 3.5 documentation |publisher=Clang.llvm.org |access-date=2014-04-27}}</ref>
and the unofficial SafeCode (last updated for LLVM 3.0).<ref>{{cite web|url=http://safecode.cs.illinois.edu/ |title=SAFECode |publisher=Safecode.cs.illinois.edu |access-date=2014-04-27}}</ref>
These systems have different tradeoffs in terms of performance penalty, memory overhead, and classes of detected bugs. Stack protection is standard in certain operating systems, including [[OpenBSD]].<ref>{{cite web| url = https://man.openbsd.org/clang-local.1| title = OpenBSD's clang-local(1) manual page| quote = clang comes with stack protection enabled by default, equivalent to the ''-fstack-protector-strong'' option on other systems.}}</ref>
 
===StackGhost (hardware-based)===
Invented by [[Mike Frantzen]], StackGhost is a simple tweak to the register window spill/fill routines which makes buffer overflows much more difficult to exploit. It uses a unique hardware feature of the [[Sun Microsystems]] [[SPARC]] architecture (deferred on-stack in-frame register window spill/fill) to detect modifications of return [[Pointer (computer programming)|pointers]] (a common way for an [[exploit (computer security)|exploit]] to hijack execution paths) transparently, automatically protecting all applications without requiring binary or source modifications. The performance impact is negligible, less than one percent. The issues it caused for [[gdb]] were resolved by [[Mark Kettenis]] two years later, allowing the feature to be enabled. The StackGhost code was then integrated (and optimized) into [[OpenBSD]]/SPARC.
 
==A canary example==
Normal buffer allocation for [[x86]] architectures and other similar architectures is shown in the [[buffer overflow]] entry. Here, we will show the modified process as it pertains to StackGuard.
 
When a function is called, a stack frame is created. A stack frame is built from the end of memory to the beginning, and each stack frame is placed on the top of the stack, closest to the beginning of memory. Thus, running off the end of a piece of data in a stack frame alters data previously entered into the stack frame, and running off the end of a stack frame places data into the previous stack frame. A typical stack frame may look as below, having a [[return statement|return address]] (RETA) placed first, followed by other control information (CTLI).
 
 (CTLI)(RETA) <!-- replace these PsOS with images -->
 
In [[C (programming language)|C]], a function may contain many different per-call data structures. Each piece of data created on call is placed in the stack frame in order, and is thus ordered from the end to the beginning of memory. Below is a hypothetical function and its stack frame.
<syntaxhighlight lang="c">
#include <string.h>

const char *get_c(void);  /* hypothetical input sources, defined elsewhere */
const char *get_d(void);

int foo(void) {
    int a;       /* integer */
    int *b;      /* pointer to integer */
    char c[10];  /* character arrays */
    char d[3];

    b = &a;              /* initialize b to point to the location of a */
    strcpy(c, get_c());  /* get c from somewhere, write it to c (no bounds check) */
    *b = 5;              /* the data at the point in memory b indicates is set to 5 */
    strcpy(d, get_d());  /* likewise unbounded */
    return *b;           /* read from b and pass it to the caller */
}
</syntaxhighlight>
 (d..)(c.........)(b...)(a...)(CTLI)(RETA)
 
In this hypothetical situation, if more than ten bytes are written to the array {{code|c}}, or more than 13 to the character array {{code|d}}, the excess will overflow into integer pointer {{code|b}}, then into integer {{code|a}}, then into the control information, and finally the return address. By overwriting {{code|b}}, the pointer can be made to reference any position in memory, so that the later accesses through {{code|b}} write to or read from an arbitrary address. By overwriting ''RETA'', the function can be made to execute other code (when it attempts to return), either existing functions ([[return-to-libc attack|ret2libc]]) or code written into the stack during the overflow.
 
In a nutshell, poor handling of {{code|c}} and {{code|d}}, such as the unbounded [[strcpy]]() calls above, may allow an attacker to control a program by influencing the values assigned to {{code|c}} and {{code|d}} directly. The goal of buffer overflow protection is to detect this issue in the least intrusive way possible. This is done by moving whatever can be moved out of harm's way and placing a sort of tripwire, or '''canary''', after the buffer.
 
Buffer overflow protection is implemented as a change to the compiler. As such, it is possible for the protection to alter the structure of the data on the stack frame. This is exactly the case in systems such as ''ProPolice''. The above function's automatic variables are rearranged more safely: arrays {{code|c}} and {{code|d}} are allocated first in the stack frame, which places integer {{code|a}} and integer pointer {{code|b}} before them in memory. So the stack frame becomes
 
 (b...)(a...)(d..)(c.........)(CTLI)(RETA)
 
As it is impossible to move ''CTLI'' or ''RETA'' without breaking the produced code, another tactic is employed. An extra piece of information, called a "canary" (CNRY), is placed after the buffers in the stack frame. When the buffers overflow, the canary value is changed. Thus, to effectively attack the program, an attacker must leave a definite indication of the attack. The stack frame is
 
 (b...)(a...)(d..)(c.........)(CNRY)(CTLI)(RETA)
 
At the end of every function there is an instruction which continues execution from the memory address indicated by ''RETA''. Before this instruction is executed, a check of ''CNRY'' ensures it has not been altered. If the value of ''CNRY'' fails the test, program execution is ended immediately. In essence, both deliberate attacks and inadvertent programming bugs result in a program abort.
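
The check can be modelled in plain C as a sketch. The {{code|guarded_frame}} structure stands in for the part of the frame from {{code|c}} upward, and its layout, the function names, and the clamped copy are all assumptions made so that the example stays within defined behavior; a real compiler emits the equivalent check in the function epilogue.

<syntaxhighlight lang="c">
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the protected frame: buffer, then CNRY, then RETA. */
struct guarded_frame {
    char      c[10];
    uintptr_t cnry;
    uintptr_t reta;
};

/*
 * Copies len bytes into the frame starting at c, as an unbounded copy
 * would. The length is clamped to the structure so the model never
 * writes outside its own storage.
 */
static void unchecked_fill(struct guarded_frame *f,
                           const unsigned char *src, size_t len)
{
    assert(f != NULL && src != NULL);
    if (len > sizeof *f)
        len = sizeof *f;
    memcpy((unsigned char *)f, src, len);
}

/*
 * Epilogue check: returns 1 if it is safe to continue at reta,
 * 0 if the program should be aborted instead.
 */
static int epilogue_ok(const struct guarded_frame *f, uintptr_t expected_cnry)
{
    assert(f != NULL);
    return f->cnry == expected_cnry;
}
</syntaxhighlight>

A copy of at most ten bytes leaves {{code|cnry}} intact and the check passes; a copy long enough to reach {{code|reta}} must first pass over {{code|cnry}}, so the check fails before the corrupted return address is used.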
 
The canary technique adds a few instructions of overhead for every call to a function with an automatic array, immediately before all dynamic buffer allocation and after dynamic buffer deallocation. This overhead is not significant. The protection fails only if the canary remains unchanged after an overflow: an attacker who knows that the canary is there and can determine its value may simply overwrite it with itself. This is usually difficult to arrange intentionally, and highly improbable in unintentional situations.
 
The position of the canary is implementation specific, but it is always between the buffers and the protected data. Varied positions and lengths have varied benefits.
 
==See also==
{{Portal|Computer programming}}
 
* [[Sentinel value]] (which is not to be confused with a canary value)
* [[Control-flow integrity]]
* [[Address space layout randomization]]
 
==References==
{{Reflist|30em}}
 
==External links==