Functional dependency: Difference between revisions

OAbot (talk | contribs)
m Open access bot: doi added to citation with #oabot.
fix typo
Tags: Mobile edit Mobile web edit Advanced mobile edit
 
(14 intermediate revisions by 9 users not shown)
Line 1:
{{Short description|Relational database theory concept}}
{{about|a concept in relational database theory|function dependencies in the Haskell programming language|type class}}
{{refimprove|date=October 2012}}
In [[relational database]] theory, a '''functional dependency''' ('''FD''') is a [[Relational database#Constraints|constraint]] between two attribute sets, whereby values in one set (the ''determinant'' set) determine the values of the other set (the ''dependent'' set). A functional dependency between a determinant set ''X'' and a dependent set ''Y'' can be described as follows:
Given a relation ''R'' and sets of attributes <math>X,Y \subseteq R</math>, ''X'' is said to '''functionally determine''' ''Y'' (written ''X'' → ''Y'') if and only if each ''X'' value in ''R'' is associated with precisely one ''Y'' value in ''R''; ''R'' is then said to ''satisfy'' the functional dependency ''X'' → ''Y''. Equivalently, the [[projection (relational algebra)|projection]] <math>\Pi_{X,Y}R</math> is a [[Function (mathematics)|function]], i.e. ''Y'' is a function of ''X''.<ref name="HalpinMorgan2008">{{cite book |author1=Terry Halpin |title=Information Modeling and Relational Databases |url=https://books.google.com/books?id=puO_VlbR_x4C&pg=PA140 |year=2008 |publisher=Morgan Kaufmann |isbn=978-0-12-373568-3 |page=140 |edition=2nd}}</ref><ref name="Date2012">{{cite book |author=Chris Date |title=Database Design and Relational Theory: Normal Forms and All That Jazz |url=https://books.google.com/books?id=8jAGhpMSjAcC&pg=PA21 |year=2012 |publisher=O'Reilly Media, Inc. |isbn=978-1-4493-2801-6 |page=21}}</ref> In simple words, if the values for the ''X'' attributes are known (say they are ''x''), then the values for the ''Y'' attributes corresponding to ''x'' can be determined by looking them up in ''any'' [[Tuple#Relational model|tuple]] of ''R'' containing ''x''. Customarily ''X'' is called the ''determinant'' set and ''Y'' the ''dependent'' set. A functional dependency FD: ''X'' → ''Y'' is called ''trivial'' if ''Y'' is a [[subset]] of ''X''.
 
Given a [[Relation (database)|relation]] ''R'' and attribute sets ''X'', ''Y'' <math>\subseteq</math> ''R'', ''X'' is said to '''functionally determine''' ''Y'' (written ''X'' → ''Y'') if and only if each ''X'' value in ''R'' is associated with precisely one ''Y'' value in ''R''. ''R'' is then said to ''satisfy'' the functional dependency ''X'' → ''Y''. Equivalently, the [[projection (relational algebra)|projection]] <math>\Pi_{X,Y}R</math> is a [[Function (mathematics)|function]], that is, ''Y'' is a function of ''X''.<ref name="HalpinMorgan2008">{{cite book |author1=Terry Halpin |title=Information Modeling and Relational Databases |url=https://books.google.com/books?id=puO_VlbR_x4C&pg=PA140 |year=2008 |publisher=Morgan Kaufmann |isbn=978-0-12-373568-3 |page=140 |edition=2nd}}</ref><ref name="Date2012">{{cite book |author=Chris Date |title=Database Design and Relational Theory: Normal Forms and All That Jazz |url=https://books.google.com/books?id=8jAGhpMSjAcC&pg=PA21 |year=2012 |publisher=O'Reilly Media, Inc. |isbn=978-1-4493-2801-6 |page=21}}</ref>
In other words, a dependency FD: ''X'' → ''Y'' means that the values of ''Y'' are determined by the values of ''X''. Two tuples sharing the same values of ''X'' will necessarily have the same values of ''Y''.
 
In other words:
The determination of functional dependencies is an important part of designing databases in the [[relational model]], and in [[database normalization]] and [[denormalization]]. A simple application of functional dependencies is ''Heath's theorem''; it says that a relation ''R'' over an attribute set ''U'' and satisfying a functional dependency ''X'' → ''Y'' can be safely split in two relations having the [[Lossless-Join Decomposition|lossless-join decomposition]] property, namely into <math>\Pi_{XY}(R)\bowtie\Pi_{XZ}(R) = R</math> where ''Z'' = ''U'' − ''XY'' are the rest of the attributes. ([[set union|Union]]s of attribute sets are customarily denoted by there juxtapositions in database theory.) An important notion in this context is a [[candidate key]], defined as a minimal set of attributes that functionally determine all of the attributes in a relation. The functional dependencies, along with the [[attribute domain]]s, are selected so as to generate constraints that would exclude as much data inappropriate to the [[user domain]] from the system as possible.
* when ''X'' attributes have known values (here, ''x''), the values for their corresponding ''Y'' attributes can be determined by looking them up in ''any'' [[Tuple#Relational model|tuple]] of ''R'' containing ''x''.
* two tuples sharing the same values of ''X'' will necessarily have the same values of ''Y''.
 
A dependency FD: ''X'' → ''Y'' means that the values of ''Y'' are determined by the values of ''X''. A functional dependency FD: ''X'' → ''Y'' is called ''trivial'' if ''Y'' is a [[subset]] of ''X''.
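
The definition can be checked mechanically against a concrete relation. The following is a minimal Python sketch, not a reference implementation: the helper names <code>satisfies_fd</code> and <code>is_trivial</code> and the sample rows are assumptions made for illustration. Only the attribute names StudentID, Semester and Lecture come from the examples in this article.

```python
from typing import Dict, Hashable, Iterable, Sequence

Row = Dict[str, Hashable]


def satisfies_fd(rows: Iterable[Row], X: Sequence[str], Y: Sequence[str]) -> bool:
    """Return True if each X value in rows is associated with exactly one Y value."""
    seen = {}  # maps an X value to the first Y value observed with it
    for row in rows:
        x_val = tuple(row[a] for a in X)
        y_val = tuple(row[a] for a in Y)
        # setdefault stores y_val on first sight, otherwise returns the stored value
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


def is_trivial(X: Sequence[str], Y: Sequence[str]) -> bool:
    """An FD X -> Y is trivial when Y is a subset of X."""
    return set(Y) <= set(X)


# Hypothetical sample rows (illustrative values only).
enrolment = [
    {"StudentID": 1, "Semester": 6, "Lecture": "Numerical Methods"},
    {"StudentID": 1, "Semester": 6, "Lecture": "Visual Computing"},
    {"StudentID": 2, "Semester": 4, "Lecture": "Numerical Methods"},
]

holds = satisfies_fd(enrolment, ["StudentID"], ["Semester"])
fails = satisfies_fd(enrolment, ["Lecture"], ["StudentID"])
```

With these rows, StudentID → Semester holds, while Lecture → StudentID does not, because the same lecture appears with two different students. Such a check only shows that a particular set of rows satisfies a dependency; it cannot establish that the dependency is intended as a constraint on the schema.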
A notion of [[logical implication]] is defined for functional dependencies in the following way: a set of functional dependencies <math>\Sigma</math> logically implies another set of dependencies <math>\Gamma</math>, if any relation ''R'' satisfying all dependencies from <math>\Sigma</math> also satisfies all dependencies from <math>\Gamma</math>; this is usually written <math>\Sigma \models \Gamma</math>. The notion of logical implication for functional dependencies admits a [[soundness|sound]] and [[completeness (logic)|complete]] finite [[axiomatization]], known as ''Armstrong's axioms''.
 
The determination of functional dependencies is an important part of designing databases in the [[relational model]], and in [[database normalization]] and [[denormalization]]. A simple application of functional dependencies is ''[[Heath's theorem]]''; it says that a relation ''R'' over an attribute set ''U'' and satisfying a functional dependency ''X'' → ''Y'' can be safely split in two relations having the [[Lossless-Join Decomposition|lossless-join decomposition]] property, namely into <math>\Pi_{XY}(R)\bowtie\Pi_{XZ}(R) = R</math> where ''Z'' = ''U'' − ''XY'' are the rest of the attributes. ([[set union|Union]]s of attribute sets are customarily denoted by their juxtapositions in database theory.) An important notion in this context is a [[candidate key]], defined as a minimal set of attributes that functionally determine all of the attributes in a relation. The functional dependencies, along with the [[attribute domain]]s, are selected so as to generate constraints that would exclude as much data inappropriate to the [[user domain]] from the system as possible.
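
Heath's theorem can be illustrated on a small relation. The Python sketch below is an assumption-laden illustration: the helpers <code>relation</code>, <code>project</code> and <code>natural_join</code> and the sample tuples are hypothetical. It represents a relation as a set of tuples, each stored as a frozenset of (attribute, value) pairs, and compares the join of the two projections with the original relation.

```python
def relation(rows):
    """Build a relation (a set of tuples) from a list of dicts."""
    return {frozenset(r.items()) for r in rows}


def project(rel, attrs):
    """Relational projection: keep only attrs; duplicates collapse in the set."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}


def natural_join(left, right):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                out.add(frozenset({**ld, **rd}.items()))
    return out


# Hypothetical relation R over U = {StudentID, Semester, Lecture}
# in which StudentID -> Semester holds.
R = relation([
    {"StudentID": 1, "Semester": 6, "Lecture": "Numerical Methods"},
    {"StudentID": 1, "Semester": 6, "Lecture": "Visual Computing"},
    {"StudentID": 2, "Semester": 4, "Lecture": "Numerical Methods"},
])
X, Y, Z = {"StudentID"}, {"Semester"}, {"Lecture"}

# Split along X -> Y as in Heath's theorem: the join gives back R.
rejoined = natural_join(project(R, X | Y), project(R, X | Z))

# Split on Lecture instead, which determines neither remaining attribute here.
lossy = natural_join(project(R, {"Lecture", "StudentID"}),
                     project(R, {"Lecture", "Semester"}))
```

Here <code>rejoined</code> equals <code>R</code>, whereas <code>lossy</code> contains spurious tuples in addition to those of <code>R</code>. The theorem gives a sufficient condition for a lossless split, so the second result illustrates what can go wrong without a suitable dependency rather than something the theorem itself predicts.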
 
A notion of [[logical implication]] is defined for functional dependencies in the following way: a set of functional dependencies <math>\Sigma</math> logically implies another set of dependencies <math>\Gamma</math>, if any relation ''R'' satisfying all dependencies from <math>\Sigma</math> also satisfies all dependencies from <math>\Gamma</math>; this is usually written <math>\Sigma \models \Gamma</math>. The notion of logical implication for functional dependencies admits a [[soundness|sound]] and [[completeness (logic)|complete]] finite [[axiomatization]], known as ''[[Armstrong's axioms]]''.
 
== Examples ==
Line 40 ⟶ 46:
* StudentID → Semester.
 
If a row were added in which the student had a different value of semester, then the functional dependency FD would no longer hold. This means that the FD is implied by the data, as it is possible to have values that would invalidate the FD.
 
Other nontrivial functional dependencies can be identified, for example:
Line 48 ⟶ 54:
The latter expresses the fact that the set {StudentID, Lecture} is a [[superkey]] of the relation.
 
=== Employee department model ===
 
A classic example of functional dependency is the employee department model.
Line 115 ⟶ 121:
:''X'' → ''Y'' and ''X'' → ''Z'' [[if and only if]] ''X'' → ''YZ''
 
== Closure ==

=== Closure of functional dependency ===
The closure of a set of values is essentially the full set of attributes that can be determined using its functional dependencies for a given relationship. One uses [[Armstrong's axioms]] (reflexivity, augmentation, transitivity) to provide a proof.
 
Given <math>R</math> and <math>F</math> a set of FDs that holds in <math>R</math>:
The closure of <math>F</math> in <math>R</math> (denoted <math>F</math><sup>+</sup>) is the set of all FDs that are logically implied by <math>F</math>.<ref>{{Cite journal|last=Saiedian|first=H.|date=1996-02-01|title=An Efficient Algorithm to Compute the Candidate Keys of a Relational Database Schema|url=https://academic.oup.com/comjnl/article-lookup/doi/10.1093/comjnl/39.2.124|journal=The Computer Journal|language=en|volume=39|issue=2|pages=124–132|doi=10.1093/comjnl/39.2.124|issn=0010-4620|url-access=subscription}}</ref>
 
=== Closure of a set of attributes ===
Closure of a set of attributes X with respect to <math>F</math> is the set X<sup>+</sup> of all attributes that are functionally determined by X using <math>F</math><sup>+</sup>.
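
An attribute closure is commonly computed by a fixed-point iteration rather than by enumerating <math>F</math><sup>+</sup>. The Python sketch below shows one such iteration; the function names and the FD list are hypothetical and chosen only for illustration, and they are not the list used in the example that follows. It relies on the standard fact that <math>F</math> implies X → Y exactly when Y is contained in X<sup>+</sup>.

```python
def attribute_closure(attrs, fds):
    """Return the closure of attrs under fds.

    fds is a list of (lhs, rhs) pairs; each side is an iterable of attribute names.
    """
    closure = set(attrs)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already determined, so is the right side.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """fds logically implies lhs -> rhs iff rhs lies in the closure of lhs."""
    return set(rhs) <= attribute_closure(lhs, fds)


def is_superkey(attrs, all_attrs, fds):
    """attrs is a superkey iff its closure covers every attribute of the relation."""
    return attribute_closure(attrs, fds) >= set(all_attrs)


# Hypothetical schema R(A, B, C, D, E) with FDs A -> B, B -> C, CD -> E.
fds = [("A", "B"), ("B", "C"), ("CD", "E")]

closure_a = attribute_closure("A", fds)    # {'A', 'B', 'C'}
closure_ad = attribute_closure("AD", fds)  # {'A', 'B', 'C', 'D', 'E'}
```

In this hypothetical schema AD is a superkey, while A alone is not, because its closure lacks D and E. Single-character attribute names are used so that a string such as <code>"CD"</code> can stand for the attribute set {C, D}.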
 
==== Example ====
Imagine the following list of FDs. We are going to calculate a closure for A (written as A<sup>+</sup>) from this relationship.
 
# ''A'' → ''B''
Line 139 ⟶ 146:
| A → ABCD (by (c), and 2)
}}
Therefore, A<sup>+</sup> = ABCD. Because A<sup>+</sup> includes every attribute in the relationship, A is a [[superkey]].
The closure is therefore A → ABCD. By calculating the closure of A, we have validated that A is also a good candidate key as its closure is every single data value in the relationship.
 
== Covers and equivalence ==