Defensive programming: Difference between revisions

Content deleted Content added
JChMathae (talk | contribs)
Trusting internal data validity: Optimizing compilers are (too) smart
BrainStack (talk | contribs)
Link suggestions feature: 3 links added.
 
(37 intermediate revisions by 28 users not shown)
Line 1:
{{Short description|Software development methodology}}
{{Use American English|date=November 2020}}
{{Multiple issues|
{{no footnotes|date=March 2009}}
{{howto|date=March 2012}}
 
}}
'''Defensive programming''' is a form of [[defensive design]] intended to develop programs that are capable of detecting potential security abnormalities and making predetermined responses.<ref>{{Citation |last=Boulanger |first=Jean-Louis |title=6 - Technique to Manage Software Safety |date=2016-01-01 |url=https://www.sciencedirect.com/science/article/pii/B9781785481178500064 |work=Certifiable Software Applications 1 |pages=125–156 |editor-last=Boulanger |editor-first=Jean-Louis |publisher=Elsevier |language=en |isbn=978-1-78548-117-8 |access-date=2022-09-02}}</ref> It ensures the continuing function of a piece of [[software]] under unforeseen circumstances. Defensive programming practices are often used where [[high availability]], [[safety]], or [[computer security|security]] is needed.
 
Defensive programming is an approach to improve software and [[source code]], in terms of:
Line 11 ⟶ 10:
* Making the software behave in a predictable manner despite unexpected inputs or user actions.
 
Overly defensive programming, however, may safeguard against errors that will never be encountered, thus incurring run-time and maintenance costs. There is also a risk that code traps prevent too many [[Exception handling|exceptions]], potentially resulting in unnoticed, incorrect results.
 
== Secure programming ==
{{main|Secure coding}}
 
Secure programming is the subset of defensive programming concerned with [[computer security]]. Security is the concern, not necessarily safety or availability (the [[software]] may be allowed to fail in certain ways). As with all kinds of defensive programming, avoiding bugs is a primary objective; however, the motivation is not as much to reduce the likelihood of failure in normal operation (as if safety were the concern), but to reduce the [[attack surface]] – the programmer must assume that the software might be misused actively to reveal bugs, and that bugs could be exploited maliciously.
 
<syntaxhighlight lang="c">int risky_programming(char *input) {
Line 27 ⟶ 26:
// ...
}</syntaxhighlight>
The function will result in undefined behavior when the input is over 1000 characters. Some novice programmers may not feel that this is a problem, supposing that no user will enter such a long input. This particular bug demonstrates a vulnerability which enables [[buffer overflow]] [[exploit (computer security)|exploit]]s. Here is a solution to this example:
 
<syntaxhighlight lang="c">int secure_programming(char *input) {
Line 35 ⟶ 34:
 
// Copy input without exceeding the length of the destination.
strncpy(str, input, sizeof(str));
 
// If strlen(input) >= sizeof(str) then strncpy won't null terminate.
Line 63 ⟶ 62:
}
return "black"; // To be handled as a dead traffic light.
// Warning: This last 'return' statement may be dropped by an optimizing
// compiler if all possible values of 'traffic_light_color' are listed in
// the previous 'switch' statement...
}
</syntaxhighlight>
Line 78 ⟶ 74:
}
assert(0); // Assert that this section is unreachable.
// Warning: This 'assert' function call may be dropped by an optimizing
// compiler if all possible values of 'traffic_light_color' are listed in
// the previous 'switch' statement...
}
</syntaxhighlight>
Line 115 ⟶ 108:
If existing code is tested and known to work, reusing it may reduce the chance of bugs being introduced.
 
However, reusing code is not ''always'' good practice. Reuse of existing code, especially when widely distributed, can allow for exploits to be created that target a wider audience than would otherwise be possible, and brings with it all the security vulnerabilities of the reused code.
 
When considering using existing source code, a quick review of the modules (sub-sections such as classes or functions) will help eliminate, or make the developer aware of, any potential vulnerabilities and ensure the code is suitable for use in the project. {{Citation needed|reason=Cannot find source, Was from a video viewed~April 2015|date=November 2021}}
 
==== Legacy problems ====
Line 125 ⟶ 120:
* [[Legacy code]] may not have been designed under a defensive programming initiative, and might therefore be of much lower quality than newly designed source code.
* Legacy code may have been written and tested under conditions which no longer apply. The old quality assurance tests may have no validity any more.
** '''Example 1''': legacy code may have been designed for ASCII input but now the input is [[UTF-8]].
** '''Example 2''': legacy code may have been compiled and tested on 32-bit architectures, but when compiled on 64-bit architectures, new arithmetic problems may occur (e.g., invalid signedness tests, invalid type casts, etc.).
** '''Example 3''': legacy code may have been targeted for offline machines, but becomes vulnerable once network connectivity is added.
Line 131 ⟶ 126:
 
Notable examples of the legacy problem:
* [[BIND|BIND 9]], presented by Paul Vixie and David Conrad as "BINDv9 is a [[Rewrite (programming)|complete rewrite]]", "Security was a key consideration in design",<ref>{{Cite web|url=http://impressive.net/archives/fogo/20001005080818.O15286@impressive.net|title=fogo archive: Paul Vixie and David Conrad on BINDv9 and Internet Security by Gerald Oskoboiny <gerald@impressive.net>|website=impressive.net|access-date=2018-10-27}}</ref> naming security, robustness, scalability and new protocols as key concerns for rewriting old legacy code.
* [[Microsoft Windows]] suffered from "the" [[Windows Metafile vulnerability]] and other exploits related to the WMF format. Microsoft Security Response Center describes the WMF-features as ''"Around 1990, WMF support was added... This was a different time in the security landscape... were all completely trusted"'',<ref>{{Cite news|url=http://blogs.technet.com/msrc/archive/2006/01/13/417431.aspx|title=Looking at the WMF issue, how did it get there?|work=MSRC|access-date=2018-10-27|language=en-US|archive-url=https://web.archive.org/web/20060324152626/http://blogs.technet.com/msrc/archive/2006/01/13/417431.aspx|archive-date=2006-03-24|url-status=dead}}</ref> not being developed under the security initiatives at Microsoft.
* [[Oracle Corporation|Oracle]] is combating legacy problems, such as old source code written without addressing concerns of [[SQL injection]] and [[privilege escalation]], resulting in many security vulnerabilities which have taken time to fix and also generated incomplete fixes. This has given rise to heavy criticism from security experts such as [[David Litchfield]], [[Alexander Kornbrust]], and [[Cesar Cerrudo]].<ref>{{Cite web|url=http://seclists.org/lists/bugtraq/2006/May/0039.html|title=Bugtraq: Oracle, where are the patches???|last=Litchfield|first=David|website=seclists.org|access-date=2018-10-27}}</ref><ref>{{Cite web|url=http://seclists.org/lists/bugtraq/2006/May/0045.html|title=Bugtraq: RE: Oracle, where are the patches???|last=Alexander|first=Kornbrust|website=seclists.org|access-date=2018-10-27}}</ref><ref>{{Cite web|url=http://seclists.org/lists/bugtraq/2006/May/0083.html|title=Bugtraq: Re: [Full-disclosure] RE: Oracle, where are the patches???|last=Cerrudo|first=Cesar|website=seclists.org|access-date=2018-10-27}}</ref> An additional criticism is that default installations (largely a legacy from old versions) are not aligned with their own security recommendations, such as [http://www.oracle.com/technology/deploy/security/database-security/pdf/twp_security_checklist_database.pdf Oracle Database Security Checklist], which is hard to amend as many applications require the less secure legacy settings to function correctly.
 
=== Canonicalization ===
Line 141 ⟶ 136:
Assume that code constructs that appear to be problem prone (similar to known vulnerabilities, etc.) are bugs and potential security flaws. The basic rule of thumb is: "I'm not aware of all types of [[security exploit]]s. I must protect against those I ''do'' know of and then I must be proactive!".
 
===Other ways of securing code===
* One of the most common problems is unchecked use of constant-size or pre-allocated structures for dynamic-size data{{cn|date=December 2023}} such as inputs to the program (the [[buffer overflow]] problem). This is especially common for [[string (computer programming)|string]] data in [[C (programming language)|C]]{{cn|date=December 2023}}. C library functions like <code>gets</code> should never be used since the maximum size of the input buffer is not passed as an argument. C library functions like <code>scanf</code> can be used safely, but require the programmer to take care with the selection of safe format strings, by sanitizing them before use.
<!-- Please expand this article. These random notes should be changed to a more coherent article. -->
* Encrypt/authenticate all important data transmitted over networks. Do not attempt to implement your own encryption scheme, but use a [[Cryptography standards|proven one]] instead. Message checking with a hash or similar technology will also help secure data sent over a network.
 
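The advice on <code>gets</code> and <code>scanf</code> can be illustrated with a minimal sketch. The helper names <code>read_line</code> and <code>parse_name</code>, the return-value convention and the 16-byte name buffer are assumptions made for this example and do not come from any standard library; only <code>fgets</code>, <code>getc</code>, <code>sscanf</code> and <code>strlen</code> are standard C functions.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into `buf` (capacity `size`).
 * Returns 0 if the whole line fit, 1 if it was truncated (the rest of the
 * line is discarded), and -1 on end of file, read error or bad arguments. */
int read_line(FILE *in, char *buf, size_t size) {
    if (in == NULL || buf == NULL || size == 0) {
        return -1;
    }
    /* Unlike gets, fgets is told the buffer size and never writes past it. */
    int limit = size > (size_t)INT_MAX ? INT_MAX : (int)size;
    if (fgets(buf, limit, in) == NULL) {
        buf[0] = '\0';
        return -1;
    }
    size_t len = strlen(buf);
    int truncated = 0;
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';            /* complete line: strip the newline */
    } else {
        int c;                          /* no newline stored: skip the rest */
        while ((c = getc(in)) != EOF && c != '\n') {
            truncated = 1;
        }
    }
    assert(strlen(buf) < size);         /* internal check: result fits */
    return truncated;
}

/* Hypothetical helper: extracts the first whitespace-delimited word of
 * `input`. The field width in "%15s" limits the write to 15 characters plus
 * the terminating null byte, so `name` must hold at least 16 bytes. */
int parse_name(const char *input, char name[16]) {
    if (input == NULL || name == NULL) {
        return -1;
    }
    return sscanf(input, "%15s", name) == 1 ? 0 : -1;
}
```

A caller would typically treat a return value of 1 from <code>read_line</code> as a rejected input rather than silently processing the truncated text.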
====The three rules of data security====
* All [[data]] is important until proven otherwise.
* All data is tainted until proven otherwise.
* All code is insecure until proven otherwise.
** You cannot prove the security of any code in [[userland (computing)|userland]], a principle more commonly known as ''"never trust the client"''.
These three rules about data security describe how to handle any data, internally or externally sourced:
 
'''All data is important until proven otherwise''' - means that all data must be verified as garbage before being destroyed.
 
'''All data is tainted until proven otherwise''' - means that all data must be handled in a way that does not expose the rest of the runtime environment without verifying integrity.
 
'''All code is insecure until proven otherwise''' - while a slight misnomer, does a good job reminding us to never assume our code is secure as bugs or [[undefined behavior]] may expose the project or system to attacks such as common [[SQL injection]] attacks.
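
One way to treat data as tainted until proven otherwise is to accept only values that match an explicit description of valid input, instead of searching for known-bad patterns. The following is a minimal sketch of that idea. The function name <code>is_valid_identifier</code>, the 32-character limit and the permitted character set are assumptions chosen for illustration, and the range checks assume an ASCII-compatible character set.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_IDENT_LEN 32  /* assumed limit, chosen only for this example */

/* Hypothetical validator: returns true only if `s` is 1 to MAX_IDENT_LEN
 * characters long and consists solely of ASCII letters, digits and '_'.
 * Anything not explicitly allowed is rejected. */
bool is_valid_identifier(const char *s) {
    if (s == NULL) {
        return false;
    }
    size_t len = 0;
    for (; s[len] != '\0'; len++) {
        if (len >= MAX_IDENT_LEN) {
            return false;               /* too long */
        }
        char c = s[len];
        bool allowed = (c >= 'a' && c <= 'z') ||
                       (c >= 'A' && c <= 'Z') ||
                       (c >= '0' && c <= '9') ||
                       c == '_';
        if (!allowed) {
            return false;               /* character outside the allowlist */
        }
    }
    return len > 0;                     /* the empty string is not valid */
}
```

Because the check describes what is correct, an input such as <code>robert'); DROP TABLE students;--</code> is rejected without the validator needing any knowledge of SQL syntax.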
 
====More Information====
* If data is to be checked for correctness, verify that it is correct, not that it is incorrect.
* [[Design by contract]]
** Design by contract uses [[precondition]]s, [[postcondition]]s and [[Invariant (computer science)|invariants]] to ensure that provided data (and the state of the program as a whole) is sanitized. This allows code to document its assumptions and make them safely. This may involve checking arguments to a function or method for validity before executing the body of the function. After the body of a function, doing a check of state or other held data, and the return value before exits (break/return/throw/error code), is also wise.
* [[Assertion (computing)|Assertions]] (also called '''assertive programming''')
** Within functions, you may want to check that you are not referencing something that is not valid (i.e., null) and that array lengths are valid before referencing elements, especially on all temporary/local instantiations. A good heuristic is to not trust the libraries you did not write either. So any time you call them, check what you get back from them. It often helps to create a small library of "asserting" and "checking" functions to do this along with a logger so you can trace your path and reduce the need for extensive [[debugging]] cycles in the first place. With the advent of logging libraries and [[aspect oriented programming]], many of the tedious aspects of defensive programming are mitigated.
* Prefer [[Exception handling|exceptions]] to return codes
** Generally speaking, it is preferable{{According to whom|date=December 2023}} to throw exception messages that enforce part of your [[application programming interface|API]] [[Design by contract|contract]] and guide the developer instead of returning error code values that do not point to where the exception occurred or what the program stack looked like. Better logging and exception handling will increase robustness and security of your software{{cn|date=December 2023}}, while minimizing developer stress{{cn|date=December 2023}}.
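The design by contract and assertion items above can be sketched in C with the standard <code>assert</code> macro. The <code>IntStack</code> type, its function names and the capacity of 8 are hypothetical and exist only for this example. Note that <code>assert</code> is compiled out when <code>NDEBUG</code> is defined, so conditions that can legitimately occur at run time (a full or empty stack) are reported through the return value rather than asserted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STACK_CAPACITY 8  /* assumed capacity, chosen only for this example */

typedef struct {
    int items[STACK_CAPACITY];
    size_t count;
} IntStack;

/* Invariant: the stack exists and never holds more than its capacity. */
static bool stack_invariant(const IntStack *s) {
    return s != NULL && s->count <= STACK_CAPACITY;
}

void stack_init(IntStack *s) {
    assert(s != NULL);                  /* precondition */
    s->count = 0;
    assert(stack_invariant(s));         /* invariant established */
}

/* Returns false if the stack is full; that is an expected condition,
 * not a programming error, so it is not asserted. */
bool stack_push(IntStack *s, int value) {
    assert(stack_invariant(s));         /* precondition */
    if (s->count == STACK_CAPACITY) {
        return false;
    }
    size_t old_count = s->count;
    s->items[s->count++] = value;
    assert(s->count == old_count + 1);  /* postcondition */
    assert(s->items[old_count] == value);
    assert(stack_invariant(s));         /* invariant preserved */
    return true;
}

/* Returns false if the stack is empty; otherwise stores the top in *out. */
bool stack_pop(IntStack *s, int *out) {
    assert(stack_invariant(s));         /* precondition */
    assert(out != NULL);                /* precondition */
    if (s->count == 0) {
        return false;
    }
    *out = s->items[--s->count];
    assert(stack_invariant(s));         /* invariant preserved */
    return true;
}
```

In this sketch a violated assertion indicates a bug in the caller or in the stack code itself, while a <code>false</code> return value indicates a situation the caller is expected to handle.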
 
==See also==
* [[Computer security]]
* [[Immunity-aware programming]]
 
== References ==