Indicator function: Difference between revisions

Content deleted Content added
Shwobopho (talk | contribs)
m Improved wording in Definition description
m Fixed punctuation
 
(2 intermediate revisions by 2 users not shown)
Line 1:
{{Short description|Mathematical function characterizing set membership}}
{{About|the 0-–1 indicator function|the 0-–infinity indicator function|characteristic function (convex analysis)}}
{{More footnotes|date=December 2009}}
{{Use American English|date = March 2019}}
 
[[Image:Indicator function illustration.png|right|thumb|A three-dimensional plot of an indicator function, shown over a square two-dimensional ___domain (set {{mvar|X}}): the "raised" portion overlays those two-dimensional points which are members of the "indicated" subset ({{mvar|A}}).]]
In [[mathematics]], an '''indicator function''' or a '''characteristic function''' of a [[subset]] of a [[Set (mathematics)|set]] is a [[Function (mathematics)|function]] that maps elements of the subset to one, and all other elements to zero. That is, if {{mvar|A}} is a subset of some set {{mvar|X}}, then the indicator function of {{mvar|A}} is the function <math>\mathbf{1}_A</math> defined by <math>\mathbf{1}_{A}\!(x) \equiv= 1\ </math> if <math>\ x \in A\ ,</math> and <math>\ \mathbf{1}_{A}\!(x) \equiv= 0\ </math> otherwise, where <math>\ \mathbf{1}_A\ </math> is one common notation for the indicator function;. otherOther common notations are <{{math>\|𝟙{{sub|''A''}}}} I_A\!( x )\ ,</math>and <math>\ \chi_A\!( x )\ ,.</math>{{efn|name=χαρακτήρ}} and <math> \ \operatorname\mathbb{I}\!\bigl(\ x \in A\ \bigr) ~.</math>
 
The indicator function of {{mvar|A}} is the [[Iverson bracket]] of the property of belonging to {{mvar|A}}; that is,
 
<math display="block">\ \mathbf{1}_{A}(x) = \left[\ x\in A\ \right] ~.</math>
 
For example, the [[Dirichlet function]] is the indicator function of the [[rational number]]s as a subset of the [[real number]]s.
 
==Definition==
Given an arbitrary set {{mvar|X}}, anthe indicator function of a subset {{mvar|A}} forof {{mvar|X}} is defined by the function (::)
<math display=block>\ \mathbf{1}_A \colon X \mapsto \{ 0, 1 \}\ </math>
 
defined by
<math display=block>\ \mathbf{1}_A \colon X \mapsto \{ 0, 1 \}\ </math>
<math display="block" qid="Q371983">\operatorname\mathbf{1}_A\!( x ) \equiv=
 
or, more specifically by (::)
 
<math display="block" qid="Q371983">\operatorname\mathbf{1}_A\!( x ) \equiv
\begin{cases}
1 \quad & \mathsftext{ if }\quad x \in A\ , \\
0 \quad & \mathsftext{ if }\quad x \notin A ~\,.
\end{cases}
</math>
 
The [[Iverson bracket]] provides the equivalent notation, <math>\ \left[\ x\in A\ \right]\ </math> or {{nobr|{{math|⟦&thinsp;''x'' ∈ ''A''&thinsp;⟧}} ,}} that can be used instead of <math>\ \mathbf{1}_{A}\!( x ) ~.</math>
 
The function <math>\ \mathbf{1}_A\ </math> is sometimes denoted {{mvarmath|I<𝟙{{sub>|''A</sub>''}}}}, {{mvar|&chi;I<sub>A</sub>}}, {{mvar|K&chi;<sub>A</sub>}}, or even just {{mvar|A}}.{{efn|name=χαρακτήρ|
The [[Greek alphabet|Greek letter]] {{mvar|&chi;}} appears because it is the initial letter of the Greek word {{lang|grc|{{math|χαρακτήρ}}}}, which is the ultimate origin of the word ''characteristic''.
}} or even just {{mvar|A}}.{{efn|
The set of all indicator functions on {{mvar|X}} can be identified with the set operator <math>\ \mathcal{P}( X )\ ,</math> the [[power set]] of {{mvar|X}}. Consequently, both sets are denoted by the conventional [[abuse of notation]] as <math>\ 2^X\ ,</math> in analogy to the relation for the count of elements in the powerset and the original set. This is a special case <math>\left(\ Y = \{\ 0,\, 1\ \}\ \right)</math> of the notation <math>\ Y^X\ </math> for the set of all functions <math>\ f\ </math> such that <math>\ f: X \mapsto Y ~\,.</math>
}}
 
==Notation and terminology==
The notation <math>\ \chi_A\ </math> is also used to denote the [[Characteristic function (convex analysis)|characteristic function]] in [[convex analysis]], which is defined as if using the [[Multiplicative inverse|reciprocal]] of the standard definition of the indicator function.
 
A related concept in [[statistics]] is that of a [[dummy variable (statistics)|dummy variable]]. (This must not be confused with "dummy variables" as that term is usually used in mathematics, also called a [[free variables and bound variables|bound variable]].)
 
The term "[[characteristic function (probability theory)|characteristic function]]" has an unrelated meaning in [[probability theory|classic probability theory]]. For this reason, [[List of probabilists|traditional probabilists]] use the term '''indicator function''' for the function defined here almost exclusively, while mathematicians in other fields are more likely to use the term ''characteristic function''{{efn|name=χαρακτήρ}} to describe the function that indicates membership in a set.
 
In [[fuzzy logic]] and [[Many-valued logic|modern many-valued logic]], predicates are the [[characteristic function (probability theory)|characteristic functions]] of a [[probability distribution]]. That is, the strict true/false valuation of the predicate is replaced by a quantity interpreted as the degree of truth.
 
==Basic properties==
The ''indicator'' or ''characteristic'' [[function (mathematics)|function]] of a subset {{mvar|A}} of some set {{mvar|X}} [[Map (mathematics)|maps]] elements of {{mvar|X}} to the [[codomain]] <math>\ \bigl\{\ 0,\, 1\ \bigr\} ~.</math>
 
This mapping is [[surjective]] only when {{mvar|A}} is a non-empty [[proper subset]] of {{nobr|&hairsp; {{mvar|X}} .}} If <math>\ A = X\ ,</math> then <math>\ \mathbf{1}_A \equiv 1 ~.</math> By a similar argument, if <math>\ A = \emptyset\ </math> then <math>\ \mathbf{1}_A \equiv 0 ~.</math>
 
If <math>\ A\ </math> and <math>\ B\ </math> are two subsets of <math>\ X\ ,</math> then
<math display=block>\begin{align}
\mathbf{1}_{A\cap B}(x) ~&=~ \min\bigl\{\ \mathbf{1}_A(x),\ \mathbf{1}_B(x)\ \bigr\} ~~=~ \mathbf{1}_A(x) \cdot\mathbf{1}_B(x)\ , \\
\mathbf{1}_{A\cup B}(x) ~&=~ \max\bigl\{\ \mathbf{1}_A(x),\ \mathbf{1}_B(x)\ \bigr\} ~=~ \mathbf{1}_A(x) + \mathbf{1}_B(x) - \mathbf{1}_A(x) \cdot \mathbf{1}_B(x)\ ,,
\end{align}</math>
 
and the indicator function of the [[Complement (set theory)|complement]] of <math>\ A\ </math> i.e. <math>\ A^\complement\ </math> is:
<math display=block>\mathbf{1}_{A^\complement} = 1 - \mathbf{1}_A ~.</math>
 
More generally, suppose <math>\ A_1, \dotsc, A_n\ </math> is a collection of subsets of {{mvar|X}}. For any <math>\ x \in X\ :</math>
 
<math display=block> \prod_{k \in I} \left(\ 1 - \mathbf{1}_{A_k}\!\left( x \right)\ \right)\ </math>
 
is clearly a product of {{math|0}}s and {{math|1}}s. This product has the value {{math|1}} at precisely those <math>\ x \in X\ </math> that belong to none of the sets <math>\ A_k\ </math> and is 0 otherwise. That is
 
<math display=block> \prod_{k \in I} ( 1 - \mathbf{1}_{A_k}) = \mathbf{1}_{X - \bigcup_{k} A_k} = 1 - \mathbf{1}_{\bigcup_{k} A_k} ~.</math>
 
Expanding the product on the left hand side,
 
<math display=block>\ \mathbf{1}_{\bigcup_{k} A_k}= 1 - \sum_{F \subseteq \{1, 2, \dotsc, n\}} (-1)^{|F|} \mathbf{1}_{\bigcap_F A_k} = \sum_{\emptyset \neq F \subseteq \{1, 2, \dotsc, n\}} (-1)^{|F|+1} \mathbf{1}_{\bigcap_F A_k}\ </math>
 
where <math>|F|</math> is the [[cardinality]] of {{mvar|F}}. This is one form of the principle of [[inclusion-exclusion]].
 
As suggested by the previous example, the indicator function is a useful notational device in [[combinatorics]]. The notation is used in other places as well, for instance in [[probability theory]]: if {{mvar|X}} is a [[probability space]] with probability measure <math>\ \operatorname\mathbb{P}\ </math> and {{mvar|A}} is a [[Measure (mathematics)|measurable set]], then <math>\ \mathbf{1}_A\ </math> becomes a [[random variable]] whose [[expected value]] is equal to the probability of {{mvar|A}}:
 
<math display=block> \operatorname\mathbb{E}_X\left\{\ \mathbf{1}_A(x)\ \right\}\ =\ \int_{X} \mathbf{1}_A( x )\ \operatorname{d\ \mathbb{P} }(x) = \int_{A} \operatorname{d\ \mathbb{P} }(x) = \operatorname\mathbb{P}( A ) ~.</math>
 
This identity is used in a simple proof of [[Markov's inequality]].
Line 85 ⟶ 82:
;[[Mean]]: <math>\ \operatorname\mathbb{E}(\mathbf{1}_A (\omega)) = \operatorname\mathbb{P}(A)\ </math> (also called "Fundamental Bridge").
 
;[[Variance]]: <math>\ \operatorname{Var}(\mathbf{1}_A (\omega)) = \operatorname\mathbb{P}(A)(1 - \operatorname\mathbb{P}(A)) ~.</math>
 
;[[Covariance]]: <math>\ \operatorname{Cov}(\mathbf{1}_A (\omega), \mathbf{1}_B (\omega)) = \operatorname\mathbb{P}(A \cap B) - \operatorname\mathbb{P}(A) \operatorname\mathbb{P}(B) ~.</math>
 
==Characteristic function in recursion theory, Gödel's and Kleene's representing function==
[[Kurt Gödel]] described the ''representing function'' in his 1934 paper "On undecidable propositions of formal mathematical systems" (the symbol "{{math|¬}}" indicates logical inversion, i.e. "NOT"):<ref name=Martin-1965>{{cite book |pages=41–74 |editor-link=Martin Davis (mathematician) |editor-first=Martin |editor-last=Davis |year=1965 |title=The Undecidable |publisher=Raven Press Books |place=New York, NY}}</ref>{{rp|page=42}}
 
{{blockquote|1=There shall correspond to each class or relation {{mvar|R}} a representing function <math>\ \phi(x_1, \ldots x_n) = 0\ </math> if <math>\ R(x_1,\ldots x_n)\ </math> and <math>\ \phi(x_1,\ldots x_n) = 1\ </math> if <math>\ \neg R(x_1,\ldots x_n) ~.</math>}}
 
[[Stephen Kleene|Kleene]] offers up the same definition in the context of the [[primitive recursive function]]s as a function {{mvar|φ}} of a predicate {{mvar|P}} takes on values {{math|0}} if the predicate is true and {{math|1}} if the predicate is false.<ref name=Kleene1952>{{cite book |last=Kleene |first=Stephen |author-link=Stephen Kleene |year=1971 |orig-year=1952 |title=Introduction to Metamathematics |page=227 |publisher=Wolters-Noordhoff Publishing and North Holland Publishing Company |___location=Netherlands |edition=Sixth reprint, with corrections}}</ref>
 
For example, because the product of characteristic functions <math>\ \phi_1 * \phi_2 * \cdots * \phi_n = 0\ </math> whenever any one of the functions equals {{math|0}}, it plays the role of logical OR: IF <math>\ \phi_1 = 0\ </math> OR <math>\ \phi_2 = 0\ </math> OR ... OR <math>\ \phi_n = 0\ </math> THEN their product is {{math|0}}. What appears to the modern reader as the representing function's logical inversion, i.e. the representing function is {{math|0}} when the function {{mvar|R}} is "true" or satisfied", plays a useful role in Kleene's definition of the logical functions OR, AND, and IMPLY,<ref name=Kleene1952 />{{rp|228}} the bounded-<ref name=Kleene1952 />{{rp|228}} and unbounded-<ref name=Kleene1952 />{{rp|279 ff}} [[mu operator]]s and the CASE function.<ref name=Kleene1952 />{{rp|229}}
 
==Characteristic function in fuzzy set theory==
Line 103 ⟶ 100:
==Smoothness==
{{See also|Laplacian of the indicator}}
In general, the indicator function of a set is not smooth; it is continuous if and only if its [[support (math)|support]] is a [[connected component (topology)|connected component]]. In the [[algebraic geometry]] of [[finite fields]], however, every [[affine variety]] admits a ([[Zariski topology|Zariski]]) continuous indicator function.<ref>{{Cite book|title=Course in Arithmetic|last=Serre|pages=5}}</ref> Given a [[finite set]] of functions <math>\ f_\alpha \in \mathbb{F}_q\left[\ x_1, \ldots, x_n\ \right]\ </math> let <math>\ V = \bigl\{\ x \in \mathbb{F}_q^n : f_\alpha(x) = 0\ \bigr\}\ </math> be their vanishing locus. Then, the function <math display="inline">\ \mathbb{P}( x ) = \prod\left(\ 1 - f_\alpha(x)^{q-1}\ \right)\ </math> acts as an indicator function for <math>\ V ~.</math> If <math>\ x \in V\ </math> then <math>\ \mathbb{P}(x) = 1\ ,</math> otherwise, for some <math>\ f_\alpha\ ,</math> we have <math>\ f_\alpha(x) \neq 0\ </math> which implies that <math>\ f_\alpha(x)^{q-1} = 1\ ,</math> hence <math>\ \mathbb{P}(x) = 0 ~.</math>
 
Although indicator functions are not smooth, they admit [[weak derivative]]s. For example, consider [[Heaviside step function]] <math display="block">\ H(x) \equiv \operatorname\mathbb{I}\!\bigl(\ x > 0\ \bigr)\ </math> The [[distributional derivative]] of the Heaviside step function is equal to the [[Dirac delta function]], i.e. <math display=block>\frac{\ \mathrm{d}\ H(x)\ }{\ \mathrm{d}\ x\ }= \delta(x)\ </math>
and similarly the distributional derivative of <math display="block">\ G(x) := \operatorname\mathbb{I}\!\bigl(\ x < 0\ \bigr)\ </math> is <math display=block>\ \frac{\ \mathrm{d}\ G(x)\ }{\ \mathrm{d}\ x\ } = -\delta(x) ~.</math>
 
Thus the derivative of the Heaviside step function can be seen as the ''inward normal derivative'' at the ''boundary'' of the ___domain given by the positive half-line. In higher dimensions, the derivative naturally generalises to the inward normal derivative, while the Heaviside step function naturally generalises to the indicator function of some ___domain {{mvar|D}}. The surface of {{mvar|D}} will be denoted by {{mvar|S}}. Proceeding, it can be derived that the inward [[Laplacian of the indicator#Dirac surface delta function|inward normal derivative]] of the indicator]] gives rise to a ''[[surface delta function]]'', which can be indicated by <math>\ \delta_S(\mathbf{x})\ </math>:
<math display=block>\delta_S(\mathbf{x}) = -\mathbf{n}_x \cdot \nabla_x \operatorname\mathbb{I}\!\bigl(\ \mathbf{x}\in D\ \bigr)\ </math>
where {{mvar|n}} is the outward [[Normal (geometry)|normal]] of the surface {{mvar|S}}. This 'surface delta function' has the following property:<ref>{{cite journal |last=Lange |first=Rutger-Jan |year=2012 |title=Potential theory, path integrals and the Laplacian of the indicator |journal=Journal of High Energy Physics |volume=2012 |issue=11 |pages=29–30 |arxiv=1302.0864 |bibcode=2012JHEP...11..032L |doi=10.1007/JHEP11(2012)032|s2cid=56188533 }}</ref>
<math display=block>-\int_{\R^n}f(\mathbf{x})\,\mathbf{n}_x\cdot\nabla_x \operatorname\mathbb{I}\!\bigl(\ \mathbf{x}\in D\ \bigr) \; \operatorname{d}^{n}\mathbf{x} = \oint_{S}\,f(\mathbf{\beta}) \; \operatorname{d}^{n-1}\mathbf{\beta}\ .</math>
 
By setting the function {{mvar|f}} equal to one, it follows that the [[Laplacian of the indicator#Dirac surface delta function|inward normal derivative of the indicator]] integrates to the numerical value of the [[surface area]] {{mvar|S}}.