Content deleted Content added
m Reverted 2 edits by 172.59.185.141 (talk) to last revision by Kvng |
I simplified most of the extra words, fixed some uppercase mistakes, and fixed punctuation. |
||
Line 2:
[[File:Db null.png|300px|thumb|right|The Greek lowercase [[Omega|omega (ω)]] character is used to represent Null in [[database theory]].]]
In [[SQL]], '''null''' or '''NULL''' is a special marker used to indicate that a data value does not exist in the [[database]]. Introduced by the creator of the [[Relational model|relational]] database model, [[E. F. Codd]], SQL null serves to
A null should not be confused with a value of [[0]]. A null indicates a lack of a value, which is not the same
In SQL, null is a marker, not a value. This usage is quite different from most programming languages, where a [[Null pointer|null value]] of a reference means it is not pointing to any [[Object (computer science)|object]].
== History ==
E. F. Codd mentioned nulls as a method of representing missing data in the [[relational model]] in a 1975 paper in the ''FDT Bulletin of [[Association for Computing Machinery|ACM]]-[[SIGMOD]]''. Codd's paper that is most commonly cited
The 1986 SQL standard basically adopted Codd's proposal after an implementation prototype in [[IBM System R]]. Although [[Don Chamberlin]] recognized nulls (alongside duplicate rows) as one of the most controversial features of SQL, he defended the design of Nulls in SQL invoking the pragmatic arguments that it was the least expensive form of system support for missing information, saving the programmer from many duplicative application-level checks (see [[semipredicate problem]]) while at the same time providing the database designer with the option not to use Nulls if they so desire; for example,
Codd indicated in his 1990 book ''The Relational Model for Database Management, Version 2'' that the single Null mandated by the SQL standard was inadequate, and should be replaced by two separate Null-type markers to indicate
=== Challenges ===
Line 29:
}}.</ref> In the following example, multiplying 10 by Null results in Null:
<syntaxhighlight lang="
10 * NULL -- Result is NULL
</syntaxhighlight>
Line 35:
This can lead to unanticipated results. For instance, when an attempt is made to divide Null by zero, platforms may return Null instead of throwing an expected "data exception{{snd}} division by zero".<ref name="SQL2003-Part2-Sec626"/> Though this behavior is not defined by the ISO SQL standard many DBMS vendors treat this operation similarly. For instance, the Oracle, PostgreSQL, MySQL Server, and Microsoft SQL Server platforms all return a Null result for the following:
<syntaxhighlight lang="
NULL / 0
</syntaxhighlight>
Line 53:
'Fish ' || NULL || 'Chips' -- Result is NULL
</syntaxhighlight>
This is not true for all database implementations. In an Oracle RDBMS, for example, NULL and the empty string are considered the same thing and therefore 'Fish ' || NULL || 'Chips' results in 'Fish Chips'.<ref>{{Cite web |date=2022-05-23 |title=Handle empty strings when migrating from Oracle to PostgreSQL {{!}} AWS Database Blog |url=https://aws.amazon.com/blogs/database/handle-empty-strings-when-migrating-from-oracle-to-postgresql/ |access-date=2023-12-30 |website=aws.amazon.com |language=en-US}}</ref>
== Comparisons with NULL and the three-valued logic (3VL) ==
Line 68:
}}</ref> The logical result of the expression below, which compares the value 10 to Null, is Unknown:
<syntaxhighlight lang="
SELECT 10 = NULL -- Results in Unknown
</syntaxhighlight>
Line 74:
However, certain operations on Null can return values if the absent value is not relevant to the outcome of the operation. Consider the following example:
<syntaxhighlight lang="
SELECT NULL OR TRUE -- Results in True
</syntaxhighlight>
Line 80:
In this case, the fact that the value on the left of OR is unknowable is irrelevant, because the outcome of the OR operation would be True regardless of the value on the left.
SQL implements three logical results, so SQL implementations must provide for a specialized [[Ternary logic|three-valued logic (3VL)]]. The rules governing SQL three-valued logic are shown in the tables below ('''p''' and '''q''' represent logical states)"<ref name="fourrules">{{cite journal | last = Coles, Michael | title = Four Rules for Nulls | journal = SQL Server Central | publisher = Red Gate Software | date = June 27, 2005 | url = http://www.sqlservercentral.com/columnists/mcoles/fourrulesfornulls.asp }}</ref> The truth tables SQL uses for AND, OR, and NOT correspond to a common fragment of the Kleene and Łukasiewicz three-valued logic (which differ in their definition of implication, however, SQL defines no such operation).<ref name="Klein">{{Cite book | last1 = Hans-Joachim | first1 = K. | chapter = Null Values in Relational Databases and Sure Information Answers | doi = 10.1007/3-540-36596-6_7 | title = Semantics in Databases. Second International Workshop Dagstuhl Castle, Germany, January 7–12, 2001. Revised Papers | series = Lecture Notes in Computer Science | volume = 2582 | pages = 119–138| year = 2003 | isbn = 978-3-540-00957-3 | chapter-url = http://www.is.informatik.uni-kiel.de/~hjk/sqlni.ps}}</ref>
{| class="wikitable"
Line 117:
SQL three-valued logic is encountered in [[Data Manipulation Language]] (DML) in comparison predicates of DML statements and queries. The <code>[[Where (SQL)|WHERE]]</code> clause causes the DML statement to act on only those rows for which the predicate evaluates to True. Rows for which the predicate evaluates to either False or Unknown are not acted on by <code>[[Insert (SQL)|INSERT]]</code>, <code>[[Update (SQL)|UPDATE]]</code>, or <code>[[Delete (SQL)|DELETE]]</code> DML statements, and are discarded by <code>[[Select (SQL)|SELECT]]</code> queries. Interpreting Unknown and False as the same logical result is a common error encountered while dealing with Nulls.<ref name="fourrules"/> The following simple example demonstrates this fallacy:
<syntaxhighlight lang="
SELECT *
FROM t
Line 123:
</syntaxhighlight>
The example query above logically always returns zero rows because the comparison of the ''i'' column with Null always returns Unknown, even for those rows where ''i'' is Null. The Unknown result causes the <code>SELECT</code> statement to summarily discard
=== Null-specific and 3VL-specific comparison predicates ===
Line 153:
The addition of IS UNKNOWN to the other operators of SQL's three-valued logic makes the SQL three-valued logic [[functionally complete]],<ref>C. J. Date, ''Relational database writings, 1991-1994'', Addison-Wesley, 1995, p. 371</ref> meaning its logical operators can express (in combination) any conceivable three-valued logical function.
On systems
=== Law of the excluded fourth (in WHERE clauses) ===
Line 168:
</syntaxhighlight>
if the column x contains any Nulls; in that case, the second query would return some rows the first one does not return, namely all those in which x is Null. In classical two-valued logic, the law of the excluded middle would allow the simplification of the WHERE clause predicate, in fact its elimination. Attempting to apply the law of the excluded middle to SQL's 3VL is effectively a [[False dilemma|false dichotomy]]. The second query is actually equivalent with:
<syntaxhighlight lang="sql">
|