Submodular set function: Difference between revisions

Content deleted Content added
lowercased section headers
A bunch of changes to improve readability, remove ordered lists where no order was necessary, clarify sentences which include both math and English, etc.
Line 1:
In mathematics, a '''submodular functionsset function''' are(also known as a '''submodular function''') is a [[set functionsfunction]] whichwhose usuallyvalue, appearinformally, inhas approximationthe algorithmsproperty andthat the difference in gamethe theoryvalue of the function that a single element makes when added to an input set decreases (as functionsthe modelingsize userof preferences)the input set increases. TheseSubmodular functions have a natural [[diminishing returns]] property which makes them suitable for many applications, including [[approximation algorithms]] and [[game theory]] (as functions modeling user preferences).
 
== Definition ==
'''SubmodularIf <math>\Omega</math> is a [[set]], a submodular function''' is a set function <math>f:2^{\Omega}\rightarrow \mathbb{R}</math>, where <math>2^\Omega</math> denotes the [[Power set#Representing subsets as functions|power set]] of <math>\Omega</math>, which satisfies one of the following equivalent definitions<ref>{{Harvard citations|last = Schrijver|year = 2003|loc = §44, p. 766|nb = }}</ref>.
# For every <math>X\subseteq, Y \subseteq \Omega</math> with <math> X \subseteq Y</math> and every <math>x \in \Omega \backslash Y</math> we have that <math>f(X\cup \{x\})-f(X)\geq f(Y\cup \{x\})-f(Y)</math>.
# For every <math>S, T \subseteq \Omega</math> we have that <math>f(S)+f(T)\geq f(S\cup T)+f(S\cap T)</math>.
# For every <math>X\subseteq \Omega</math> and <math>x_1,x_2\in \Omega\backslash X</math> we have that <math>f(X\cup \{x_1\})+f(X\cup \{x_2\})\geq f(X\cup \{x_1,x_2\})+f(X)</math>.
 
A nonnegative submodular function is also a [[subadditive]] function, but a subadditive function need not be submodular.
 
== Types of submodular functions ==
===Monotone submodularMonotone function===
A submodular function <math>f</math> is said to be ''monotone'' if for every <math>T\subseteq S</math> we have that <math>f(T)\leq f(S)</math>. Examples of monotone submodular functions include:
#; Linear functions : Any function of the form <math>f(S)=\sum_{i\in S}w_i</math> is called a linear function. Additionally if <math>\forall i,w_i\geq 0</math> then f is monotone.
:Examples of Monotone Submodular function
#; Budget-additive functions : Any function of the form <math>f(S)=\min(B,\sum_{i\in S}w_i)</math> for each <math>w_i\geq 0</math> and <math>B\geq 0</math> is called budget additive.
# Linear functions
#; Coverage functions : Let <math>\Omega=\{E_1,E_2,\ldots,E_n\}</math> be a collection of subsets of some ground set <math>\Omega'</math>. The function <math>f(S)=|\cup_{E_i\in S}E_i|</math> for <math>S\subseteq \Omega</math> is called a coverage function.
#: Any function of the form <math>f(S)=\sum_{i\in S}w_i</math> is called a linear function. Additionally if <math>\forall i,w_i\geq 0</math> then f is monotone.
#; [[Entropy (information theory)|Entropy]] : Let <math>\Omega=\{X_1,X_2,\ldots,X_n\}</math> be a set of [[random variables]]. Then for any <math>S\subseteq \Omega</math> we have that <math>H(S)</math> is a submodular function, where <math>H(S)</math> is the entropy of the set of random variables <math>S</math>
# Budget-additive functions
#; [[Matroid]] [[matroid rank|rank functions]] : Let <math>\Omega=\{e_1,e_2,\dots,e_n\}</math> be the ground set on which a matroid is defined. Then the rank function of the matroid is a submodular function.
#: Any function of the form <math>f(S)=\min(B,\sum_{i\in S}w_i)</math> for each <math>w_i\geq 0</math> and <math>B\geq 0</math> is called budget additive.
# Coverage function
#:Let <math>\Omega=\{E_1,E_2,\ldots,E_n\}</math> be a collection of subsets of some ground set <math>\Omega'</math>. The function <math>f(S)=|\cup_{E_i\in S}E_i|</math> for <math>S\subseteq \Omega</math> is called a coverage function.
# [[Entropy (information theory)|Entropy]]
#:Let <math>\Omega=\{X_1,X_2,\ldots,X_n\}</math> be a set of [[random variables]]. Then for any <math>S\subseteq \Omega</math> we have that <math>H(S)</math> is a submodular function, where <math>H(S)</math> is the entropy of the set of random variables <math>S</math>
# [[Matroid]] [[matroid rank|rank functions]]
#:Let <math>\Omega=\{e_1,e_2,\dots,e_n\}</math> be the ground set on which a matroid is defined. Then the rank function of the matroid is a submodular function.
 
=== Non-monotone submodular function===
A submodular function <math>f</math> which is not necessarily monotone is called as Non''non-monotone Submodular function''.
====Symmetric non-monotone submodular function====
A submodular function <math>f</math> is called symmetric if for every <math>S\subseteq \Omega</math> we have that <math>f(S)=f(\Omega-S)</math>
:Examples of Symmetric Non-Monotone Submodular function
# Graph cuts
#:Let <math>\Omega=\{v_1,v_2,\dots,v_n\}</math> be the vertices of a [[graph (mathematics)|graph]]. For any set of vertices <math>S\subseteq \Omega</math> let <math>f(S)</math> denote the number of edges <math>e=(u,v)</math> such that <math>u\in S</math> and <math>v\in \Omega-S</math>.
# [[Mutual information]]
#:Let <math>\Omega=\{X_1,X_2,\ldots,X_n\}</math> be a set of [[random variable]]s. Then for any <math>S\subseteq \Omega</math> we have that <math>f(S)=I(S;\Omega-S)</math> is a submodular function, where <math>I(S;\Omega-S)</math> is the mutual information.
====Asymmetric non-monotone submodular function====
A submodular function <math>f</math> is called Asymmetric if it is not necessarily symmetric.
:Examples of Symmetric Non-Monotone Submodular function
# Directed graph cuts
#:Let <math>\Omega=\{v_1,v_2,\dots,v_n\}</math> be the vertices of a [[directed graph]]. For any set of vertices <math>S\subseteq \Omega</math> let <math>f(S)</math> denote the number of edges <math>e=(u,v)</math> such that <math>u\in S</math> and <math>v\in \Omega-S</math>.
 
==== Symmetric ====
==Continuous extensions==
A non-monotone submodular function <math>f</math> is called ''symmetric'' if for every <math>S\subseteq \Omega</math> we have that <math>f(S)=f(\Omega-S)</math>.
===Lovasz extension===
Examples of symmetric non-monotone submodular functions include:
This extension has been named after [[László Lovász]]. Consider any vector <math>\bold{x}=\{x_1,x_2,\dots,x_n\}</math> such that each <math>0\leq x_i\leq 1</math>. Then the Lovasz extension is defined as <math>f^L(\bold{x})=\mathbb{E}(f(\{i|x_i\geq \lambda\}))</math> where the expectation is over choosing <math>\lambda</math> uniformly in <math>[0,1]</math>. It can be shown that Lovasz extension is a convex function.
#; Graph cuts : Let <math>\Omega=\{v_1,v_2,\dots,v_n\}</math> be the vertices of a [[graph (mathematics)|graph]]. For any set of vertices <math>S\subseteq \Omega</math> let <math>f(S)</math> denote the number of edges <math>e=(u,v)</math> such that <math>u\in S</math> and <math>v\in \Omega-S</math>.
#; [[Mutual information]] : Let <math>\Omega=\{X_1,X_2,\ldots,X_n\}</math> be a set of [[random variable]]s. Then for any <math>S\subseteq \Omega</math> we have that <math>f(S)=I(S;\Omega-S)</math> is a submodular function, where <math>I(S;\Omega-S)</math> is the mutual information.
 
===Multilinear= extensionAsymmetric ====
A non-monotone submodular function which is not symmetric is called asymmetric.
Consider any vector <math>\bold{x}=\{x_1,x_2,\ldots,x_n\}</math> such that each <math>0\leq x_i\leq 1</math>. Then the multilinear extension is defined as <math>F(\bold{x})=\sum_{S\subseteq \Omega} f(S) \prod_{i\in S} x_i \prod_{i\notin S} (1-x_i)</math>
===Convex Closure===
Consider any vector <math>\bold{x}=\{x_1,x_2,\dots,x_n\}</math> such that each <math>0\leq x_i\leq 1</math>. Then the convex closure is defined as <math>f^-(\bold{x})=\min(\sum_S \alpha_S f(S):\sum_S \alpha_S 1_S=\bold{x},\sum_S \alpha_S=1,\alpha_S\geq 0)</math>. It can be shown that <math>f^L(\bold{x})=f^-(\bold{x})</math>
 
#:For example, a directed graph cut is an asymmetric non-monotone submodular function. Let <math>\Omega=\{v_1,v_2,\dots,v_n\}</math> be the vertices of a [[directed graph]]. For any set of vertices <math>S\subseteq \Omega</math> let <math>f(S)</math> denote the number of edges <math>e=(u,v)</math> such that <math>u\in S</math> and <math>v\in \Omega-S</math>.
===Concave closure===
 
== Continuous extensions ==
===Lovasz Lovász extension ===
This extension has beenis named after mathematician [[László Lovász]]. Consider any vector <math>\bold{x}=\{x_1,x_2,\dots,x_n\}</math> such that each <math>0\leq x_i\leq 1</math>. Then the LovaszLovász extension is defined as <math>f^L(\bold{x})=\mathbb{E}(f(\{i|x_i\geq \lambda\}))</math> where the expectation is over choosing <math>\lambda</math> uniformlychosen infrom the [[uniform distribution]] on the interval <math>[0,1]</math>. ItThe can be shown that LovaszLovász extension is a convex function.
 
=== Multilinear extension ===
Consider any vector <math>\bold{x}=\{x_1,x_2,\ldots,x_n\}</math> such that each <math>0\leq x_i\leq 1</math>. Then the multilinear extension is defined as <math>F(\bold{x})=\sum_{S\subseteq \Omega} f(S) \prod_{i\in S} x_i \prod_{i\notin S} (1-x_i)</math>.
 
=== Convex Closureclosure ===
Consider any vector <math>\bold{x}=\{x_1,x_2,\dots,x_n\}</math> such that each <math>0\leq x_i\leq 1</math>. Then the convex closure is defined as <math>f^-(\bold{x})=\min(\sum_S \alpha_S f(S):\sum_S \alpha_S 1_S=\bold{x},\sum_S \alpha_S=1,\alpha_S\geq 0)</math>. It can be shown that <math>f^L(\bold{x})=f^-(\bold{x})</math>.
 
=== Concave closure ===
Consider any vector <math>\bold{x}=\{x_1,x_2,\dots,x_n\}</math> such that each <math>0\leq x_i\leq 1</math>. Then the concave closure is defined as <math>f^+(\bold{x})=\max(\sum_S \alpha_S f(S):\sum_S \alpha_S 1_S=\bold{x},\sum_S \alpha_S=1,\alpha_S\geq 0)</math>.
 
== Closure properties ==
==Properties==
The class of submodular functions is [[closure (mathematics)|closed]] under non-negative [[linear combination]]s. Consider any submodular function <math>f_1,f_2,\ldots,f_k</math> and non-negative numbers <math>\alpha_1,\alpha_2,\ldots,\alpha_k</math>. Then the function <math>g</math> defined by <math>g(S)=\sum_{i=1}^k \alpha_i f_i(S)</math> is submodular. Furthermore, for any submodular function <math>f</math>, the functions defined by <math>g(S)=\min(f(S),c)</math>, where <math>c</math> is a non-negative real number and <math>g(S)=f(\Omega \setminus S)</math> are also submodular.
===Operations which preserve submodularity===
 
# Non-negative linear combinations. Consider any submodular function <math>f_1,f_2,\ldots,f_k</math> and non negative numbers <math>\alpha_1,\alpha_2,\ldots,\alpha_k</math>. Then <math>g(S)=\sum_{i=1}^k \alpha_i f_i(S)</math> is a submodular function.
== Optimization problems ==
# Consider any monotone submodular function <math>f</math> and a non negative number <math>c</math>. Then <math>g(S)=min(f(S),c)</math> is also a submodular function.
Submodular functions have properties which are very similar to [[convex function|convex]] and [[concave functionsfunction]]s. HenceFor athis lotreason, ofan [[optimization problemsproblem]] which concerns optimizing a convex or concave function can also be castdescribed as the problem of maximizing or minimizing a submodular functionsfunction subject to varioussome constraints.
# Consider any submodular function <math>f</math>. Then <math>g(S)=f(\Omega-S)</math> is also a submodular function.
 
#:Under theThe simplest case theminimization problem is to find a set <math>S\subseteq \Omega</math> which minimizes a submodular function subject to no constraints. AThis seriesproblem ofis resultscomputable in [[polynomial time]].<ref name="GLS" /><ref name="Cunningham" /><ref name="IFF" /><ref name="Schrijver" /> have establishedComputing the polynomial time solvability of this problem. Finding [[minimum cut]] in a graph is a special case of this general minimization problem.
 
Unlike minimization, maximization of submodular functions is usually [[NP-hard]]. Many problems, such as [[max cut]] and the [[maximum coverage problem]], can be cast as special cases of this general maximization problem under suitable constraints. Typically, the approximation algorithms for these problems are based on either [[greedy algorithm]]s or [[local search (optimization)|local search algorithm]]s. The problem of maximizing a symmetric non-monotone submodular function subject to no constraints admits a 1/2 approximation algorithm.<ref name="FMV" /> Computing the [[maximum cut]] of a graph is a special case of this problem. The problem of maximizing a monotone submodular function subject to a cardinality constraint admits a <math>1 - \frac{1}{e}</math> approximation algorithm.<ref name="NVF" /> The [[maximum coverage problem]] is a special case of this problem.
==Optimization problems==
Submodular functions have properties which are very similar to convex and concave functions. Hence a lot of optimization problems can be cast as maximizing or minimizing submodular functions subject to various constraints.
# Minimization of submodular functions.
#:Under the simplest case the problem is to find set <math>S\subseteq \Omega</math> which minimizes submodular function subject to no constraints. A series of results <ref name="GLS" /><ref name="Cunningham" /><ref name="IFF" /><ref name="Schrijver" /> have established the polynomial time solvability of this problem. Finding [[minimum cut]] in a graph is a special case of this problem.
# Maximization of submodular functions.
#:Unlike minimization, maximization of submodular functions is usually [[NP-hard]]. A host of problems such as [[max cut]], [[maximum coverage problem]] can be cast as special cases of this problem under suitable constraints. Typically the approximation algorithms for these problems are based on either greedy or local search type of algorithms.
## Maximizing a Symmetric Non-monotone Submodular function subject to no constraint. This problem admits a 1/2 approximation algorithm<ref name="FMV" />. Finding [[max cut]] is a special case of this problem.
## Maximizing a Monotone Submodular function subject to cardinality constraint. This problem admits a 1-1/e approximation algorithm<ref name="NVF" />. [[Maximum coverage problem]] is a special case of this problem.
 
== See also ==
{{Div col|cols=1}}
* [[Supermodular function]]
* [[Polymatroid]]
* [[Matroid]]
{{Div col end}}
 
== Citations ==
{{reflist|
refs=
Line 84 ⟶ 71:
}}
 
== References ==
 
*{{Citation|last=Schrijver|first=Alexander|authorlink=Alexander Schrijver|year=2003|title=Combinatorial Optimization|___location=|publisher=[[Springer Publishing|Springer]]|isbn=3-540-44389-4}}