Quadratic pseudo-Boolean optimization: Difference between revisions

Content deleted Content added
No edit summary
 
(18 intermediate revisions by 12 users not shown)
Line 1:
{{short description|Combinatorial optimization method for pseudo-Boolean functions}}
'''Quadratic pseudo-Boolean optimisation''' ('''QPBO''') is a [[combinatorial optimization]] method for minimizing quadratic [[pseudo-Boolean function]]s in the form
 
:<math> f(\mathbf{x}) = w_0 + \sum_{p \in V} w_p(x_p) + \sum_{(p, q) \in E} w_{pq}(x_p, x_q) </math>
 
in the [[binary data|binary variables]] <math>x_p \in \{0, 1\} \; \forall p \in V = \{1, \dots, n\}</math>, with <math>E \subseteq V \times V</math>. If <math>f</math> is [[Pseudo-Boolean_function#Submodularity|submodular]] then QPBO produces a global optimum equivalently to [[graph cut optimization]], while if <math>f</math> contains non-submodular terms then the algorithm produces a partial solution with specific optimality properties, in both cases in [[polynomial time]].<ref name="review" />
 
QPBO is a useful tool for inference on [[Markov random field]]s and [[conditional random field]]s, and has applications in [[computer vision]] problems such as [[image segmentation]] and [[stereo cameras|stereo matching]].<ref name="rother" />
Line 21:
 
== Properties ==
QPBO produces a solution where each variable assumes one of three possible values,: ''true'', ''false'', and ''undefined'', noted in the following as 1, 0, and <math>\emptyset</math> respectively. The solution satisfieshas the two following two properties:.
 
* ''Partial optimality'': if <math>f</math> is submodular, then QPBO produces a global minimum exactly, equivalentlyequivalent to [[graph cut optimization|graph cut]], and all variables have a non-undefined value.; Ifif submodularity is not satisfied, the result will be a partial solution <math>\mathbf{x}</math> where a subset <math>\hat{V} \subseteq V</math> of the variables have a non-undefined value. SuchA partial solution is always part of a global solution, i.e. there existexists a global minimum point <math>\mathbf{x^*}</math> for <math>f</math> such that <math>x_i = x_i^*</math> for each <math>i \in \hat{V}</math>.
* ''persistencePersistence'': given a solution <math>\mathbf{x}</math> generated by QPBO and an arbitrary assignment of values <math>\mathbf{y}</math> to the variables, if a new solution <math>\hat{\mathbf{y}}</math> is constructed by replacing <math>y_i</math> with <math>x_i</math> for each <math>i \in \hat{V}</math>, then <math>f(\hat{\mathbf{y}}) \le f(\mathbf{y})</math>.<ref name="review" />
 
* ''persistence'': given a solution <math>\mathbf{x}</math> generated by QPBO and an arbitrary assignment of values <math>\mathbf{y}</math> to the variables, if a new solution <math>\hat{\mathbf{y}}</math> is constructed by replacing <math>y_i</math> with <math>x_i</math> for each <math>i \in \hat{V}</math>, then <math>f(\hat{\mathbf{y}}) \le f(\mathbf{y})</math>.<ref name="review" />
 
== Algorithm ==
Line 32 ⟶ 31:
The algorithm can be divided in three steps: graph construction, max-flow computation, and assignment of values to the variables.
 
When constructing the graph, the set of vertices <math>V</math> contains the source and sink nodes <math>s</math> and <math>t</math>, and a couplepair of nodes <math>p</math> and <math>p'</math> for each variable. After re-parametrising the function to normal form,<ref name="normal form" group="note" /> a pair of edges is added to the graph for each term <math>w</math>:
* for each term <math>w_p(0)</math> the edges <math>p \rightarrow t</math> and <math>s \rightarrow p'</math>, with weight <math>\frac{1}{2} w_p(0)</math>;
* for each term <math>w_p(1)</math> the edges <math>s \rightarrow p</math> and <math>p' \rightarrow t</math>, with weight <math>\frac{1}{2} w_p(1)</math>;
Line 44 ⟶ 43:
Once the minimum cut is known, each variable receives a value depending upon the position of its corresponding nodes <math>p</math> and <math>p'</math>: if <math>p</math> belongs to the connected component containing the source and <math>p'</math> belongs to the connected component containing the sink then the variable will have value of 0. Vice versa, if <math>p</math> belongs to the connected component containing the sink and <math>p'</math> to the one containing the source, then the variable will have value of 1. If both nodes <math>p</math> and <math>p'</math> belong to the same connected component, then the value of the variable will be undefined.<ref name="rother" />
 
The way undefined variables can be handled is dependantdependent upon the context of the problem. In the general case, given a [[partition of a set|partition]] of the graph in two sub-graphs and two solutions, each one optimal for one of the sub-graphs, then it is possible to combine the two solutions into one solution optimal for the whole graph in polynomial time.<ref name="billionnet" /> However, computing an optimal solution for the subset of undefined variables is still a [[NP-hard]] problem. In the context of iterative algorithms such as <math>\alpha</math>-expansion, a reasonable approach is to leave the value of undefined variables unchanged, since the persistence property guarantees that the target function will have non-increasing value.<ref name="review" /> Different exact and approximate strategies to minimise the number of undefined variables exist.<ref name="rother" />
 
== Higher order terms ==
 
Reducing high-order terms to quadratics can be done by a process called "quadratization", and several dozen quadratization methods are given in [[Nike Dattani]]'s 2019 book "Quadratization in disctete optimization and quantum mechanics"<ref name="Dattani"></ref>. The problem of optimizing higher-order pseudo-boolean functions is generally difficult. It is always possible to reduce a higher-order function to a quadratic function which is equivalent with respect to the optimisation, problem known as "higher-order [[clique (graph theory)|clique]] reduction" (HOCR), and the result of such reduction can be optimized with QPBO. Generic methods for reduction of arbitrary functions rely on specific substitution rules and in the general case they require the introduction of auxiliary variables.<ref name="fix" /> In practice most terms can be reduced without introducing additional variables, resulting in a simpler optimization problem, and the remaining terms can be reduced exactly, with addition of auxiliary variables, or approximately, without addition of any new variable.<ref name="elc" />
 
==Notes==
<references>
<ref name="Dattani">Dattani (2019) [https://arxiv.org/pdf/1901.04405 Quadratization in discrete optimization and quantum mechanics] (Book).</ref>
<ref name="review">Kolmogorov and Rother (2007).</ref>
<ref name="fix">Fix et al. (2011).</ref>
Line 61 ⟶ 59:
 
==References==
* {{cite journal|first1=Alain|last1=Billionnet|first2=Brigitte|last2=Jaumard|author2-link= Brigitte Jaumard |title=A decomposition method for minimizing quadratic pseudo-boolean functions|journal= [[Operations Research Letters]] |volume=8|number=3| pages=161–163|year=1989|url=https://www.sciencedirect.com/science/article/pii/0167637789900436|doi=10.1016/0167-6377(89)90043-6}}
* {{cite conference|first1=Alexander|last1=Fix|first2=Aritanan|last2=Gruber|first3=Endre|last3=Boros|first4=Ramin|last4=Zabih|title=A graph cut algorithm for higher-order Markov random fields|conference= [[International Conference on Computer Vision]] |year=2011|pages=1020–1027|url=https://www.cs.cornell.edu/~afix/Papers/ICCV11.pdf}}
* {{cite conference|first1=Hiroshi|last1=Ishikawa|title=Higher-Order Clique Reduction Without Auxiliary Variables|conference= [[Conference on Computer Vision and Pattern Recognition]]|year=2014|publisher=IEEE|pages=1362–1269|url=https://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Ishikawa_Higher-Order_Clique_Reduction_2014_CVPR_paper.pdf}}
* {{cite journal|first1=Vladimir|last1=Kolmogorov|first2=Carsten|last2=Rother|title=Minimizing Nonsubmodular Functions: A Review|journal= [[IEEE Transactions on Pattern Analysis and Machine Intelligence]] |volume=29|number=7|year=2007|pages=1274–1279|publisher=IEEE|doi=10.1109/tpami.2007.1031|pmid=17496384}}
* {{cite conference|first1=Carsten|last1=Rother|first2=Vladimir|last2=Kolmogorov|first3=Victor|last3=Lempitsky|first4=Martin|last4=Szummer|title=Optimizing binary MRFs via extended roof duality|conference= [[Conference on Computer Vision and Pattern Recognition]] |pages=1–8|year=2007|url=https://www.microsoft.com/en-us/research/wp-content/uploads/2007/06/cvpr07-QPBOpi-TR.pdf}}
 
== Notes ==
<references group="note">
<ref name="normal form">The representation of a pseudo-Boolean function with coefficients <math>\mathbf{w} = (w_0, w_1, \dots, w_{nn})</math> is not unique, and if two coefficient vectors <math>\mathbf{w}</math> and <math>\mathbf{w}'</math> represent the same function then <math>\mathbf{w}'</math> is said to be a reparametrisation of <math>\mathbf{w}</math> and vice versa. In some constructions it is useful to ensure that the function has a specific form, called ''noramlnormal form'', which is always defindeddefined for any function, and it is not unique. A function <math>f</math> is in normal form if the two following conditions hold (Kolmogorov and Rother (2007)):
# <math>\min \{ w_p^0, w_p^1 \} = 0</math> for each <math>p \in V</math>;
# <math>\min \{ w_{pq}^{0j}, w_{pq}^{1j} \} = 0</math> for each <math>(p, q) \in E</math> and for each <math>j \in \{0, 1\}</math>.