Numerical certification: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 20:34, 9 November 2018 edit Ofloveandhate (talk \| contribs) 132 edits →Interval Newton: corrected link ← Previous edit		Latest revision as of 11:50, 19 February 2025 edit undo Deepgut (talk \| contribs) 64 edits Link suggestions feature: 3 links added. Tags: Visual edit Newcomer task Suggested: add links
(26 intermediate revisions by 6 users not shown)
Line 1: '''Numerical certification''' is the process of verifying the correctness of a candidate solution to a [[system of equations~~, often a result of algorithmic calculation~~]]. In (numerical) computational mathematics, such as [[numerical algebraic geometry]], candidate solutions are computed algorithmically, but there is the possibility that errors have ~~been~~corrupted ~~introduced~~the candidates. ~~Beyond~~For instance, in addition to the inexactness of ~~candidates,~~input ~~they~~data ~~may~~and ~~also~~candidate ~~simply~~solutions, benumerical ~~grossly~~errors ~~incorrect~~or aserrors in the ~~result~~discretization of ~~any~~the ofproblem amay ~~number~~result ofin ~~modes~~corrupted ofcandidate ~~failure~~solutions. The goal of numerical certification is to provide a certificate which proves which of these candidates are, indeed, approximate solutions.▼ ~~{{Draft article}}~~ ▲'''Numerical certification''' is the process of verifying the correctness of a candidate solution to a system of equations, often a result of algorithmic calculation. In numerical computational mathematics, such as [[numerical algebraic geometry]], candidate solutions are computed, but there is the possibility that errors have been introduced. Beyond the inexactness of candidates, they may also simply be grossly incorrect as the result of any of a number of modes of failure. The goal of numerical certification is to provide a certificate which proves which of these candidates indeed approximate solutions. Methods for certification can be divided into two flavors: ''a priori'' certification and ''a posteriori'' certification. ''A posteriori'' certification confirms the correctness of the final answers (regardless of how they are generated), while ''a priori'' certification confirms the correctness of each step of a specific computation. A typical example of ''a posteriori'' certification is [[~~Stephen_Smale~~Stephen Smale\|Smale]]'s alpha theory, while a typical example of ''a priori'' certification is [[interval arithmetic]]. == Certificates == A '''certificate''' for a root is a computational proof of the correctness of a candidate solution. For instance, a certificate may consist of an approximate solution <math>x</math>, a region <math>R</math> containing <math>x</math>, and a proof that <math>R</math> contains exactly one solution to the system of equations. ~~A certificate in~~In this context, ~~for~~an ''a ~~candidate~~priori'' ~~solution~~numerical certificate is ~~similar to~~ a certificate ~~for~~in the sense of [[Correctness (computer science)\|correctness in computer science]]. ~~Whereas a computer science certificate proves that a function produces~~On the ~~correct~~other ~~result~~hand, ~~numerical certification operates on solutions. Between the two flavors,~~an ''a ~~priori~~posteriori'' numerical ~~certification~~certificate ~~bears~~operates ~~the~~only ~~greatest~~on ~~similarity~~solutions, ~~to certification~~regardless of anhow ~~algorithm, in that the results~~they are ~~mathematically guaranteed to be correct -- much like '''total correctness'''~~computed. ~~On the other hand~~Hence, ''a posteriori'' ~~numerical~~ certification ~~operates on solution candidates, regardless of how they are computed, and hence~~ is ~~rather~~ different from algorithmic correctness --– for an extreme example, ~~one~~an algorithm could randomly generate candidates and attempt to certify them as approximate roots using ''a posteriori'' certification. == ''A posteriori'' certification methods == There are a variety of methods for ''a posteriori'' certification, including * Alpha theory (Smale) * Krawczyk's method/Interval Newton (Moore) * Miranda Test (Yap, Vegter, Sharma)▼ === Alpha theory === The cornerstone of '''Smale's alpha theory''' is ~~error~~ bounding the error for [[Newton's method]]. Smale's 1986 work<ref>{{cite journal \|last1=Smale \|first1=Steve \|title=Newton’s method estimates from data at one point \|journal=The merging of disciplines: new directions in pure, applied, and computational mathematics \|date=1986 \|pages=~~185-196~~185–196}}</ref> introduced the quantity <math>\alpha</math>, which quantifies the convergence of Newton's method. More precisely, let <math>F</math> be a ~~measurement~~system onof analytic functions in the ~~distance~~variables to<math>x</math>, a<math>D</math> ~~true~~the ~~solution~~[[derivative]] operator, and <math>N</math> the Newton operator. The quantities <math display="block">\beta(f,x) = \|\\|x - N(x)\|\\| = \|\\|Df(x)^{-1} f(x)\|\\|</math>▼ <math display="block">\gamma(f,x) = \sup_{k\geq 2}\|\left\\| \frac{Df(x)^{-1}D^kf(x)}{k!} \|\right\\|^\frac{1}{k-1}</math>▼ ~~More precisely, let <math>f</math> be a system of polynomials with variables <math>x</math>, with <math>D</math> the derivative operator, <math>N</math> a Newton step. The quantities~~ ▲<math display="block">\beta(f,x) = \|\|x - N(x)\|\| = \|\|Df(x)^{-1} f(x)\|\|</math> ▲<math display="block">\gamma(f,x) = \sup_{k\geq 2}\|\| \frac{Df(x)^{-1}D^kf(x)}{k!} \|\|^\frac{1}{k-1}</math> and <math display="block">\alpha(f,x) = \beta(f,x) \gamma(f,x) </math> ~~can be computed and~~are used to certify a candidate solution. In particular, ~~since~~ if <math display="block">\alpha(f,x) < \frac{13 - 3\sqrt{17}}{4},</math> then <math>x</math> is an '''approximate solution''' for <math>f</math>, ~~meaning that~~i.e., the candidate is in the ___domain of [[Rate of convergence\|quadratic convergence]] for Newton's method.~~<ref~~ ~~name="alphacertified">{{cite~~ ~~journal~~In ~~\|last1=Hauenstein~~other ~~\|first1=Jonathan~~words, ~~\|last2=Sottile~~if ~~\|first2=Frank~~this ~~\|title=Algorithm~~inequality ~~921:~~holds, ~~alphaCertified:~~then ~~certifying~~there ~~solutions~~is toa ~~polynomial~~root ~~systems~~<math>x^\ast</math> ~~\|journal=ACM~~of ~~Transactions~~<math>F</math> onso ~~Mathematical~~that ~~Software~~iterates ~~(TOMS)~~of ~~\|date=2012~~the ~~\|volume=38~~Newton ~~\|issue=4~~operator ~~\|page=28~~converge ~~\|doi=10.1145/2331130.2331136}}</ref>~~as <math display="block">\left\\|N^k(x)-x^\ast\right\\|\leq\frac{1}{2^{2^k-1}}\\|x-x^\ast\\|.</math> The software package [http://www.math.tamu.edu/~sottile/research/stories/alphaCertified/ alphaCertified] provides an implementation of the alpha test for polynomials by estimating <math>\beta</math> and <math>\gamma</math>.<ref name="alphacertified">{{cite journal \|last1=Hauenstein \|first1=Jonathan \|last2=Sottile \|first2=Frank \|title=Algorithm 921: alphaCertified: certifying solutions to polynomial systems \|journal=ACM Transactions on Mathematical Software \|date=2012 \|volume=38 \|issue=4 \|page=28 \|doi=10.1145/2331130.2331136}}</ref> === Interval Newton ===▼ === Interval Newton and Krawczyck methods === The fixed points of the Newton operator [[Newton's method]] for a square system of functions <math>F:\mathbb{C}^n\rightarrow\mathbb{C}^n</math> correspond to the roots of <math>f</math>. In more generality, suppose that <math>G:\mathbb{C}^n\rightarrow\mathbb{C}^n</math> is a function whose fixed points correspond to the roots of <math>F</math>. Interval Newton {{citation needed\|reason=don't have time to figure out how to reference a book right now\|November 9, 2018}}, Krawczyk {{citation needed\|reason=don't have time to figure out how to reference a book right now\|November 9, 2018}}, and related methods use such functions to establish the existence and uniqueness of roots within a region <math>I</math> using topological methods: ~~# 1 If~~Suppose <math>G:\mathbb{R}^n\rightarrow\mathbb{R}^n</math> ~~maps~~is a ~~region~~function ~~<math>I</math>~~whose ~~into~~fixed ~~itself,~~points ~~then~~correspond byto ~~[[Brouwer~~the ~~fixed-point~~roots ~~theorem]],~~of <math>GF</math>. ~~has~~ atFor ~~least one fixed point in <math>I</math>~~example, ~~and,~~the ~~hence~~Newton ~~<math>F</math>~~operator has atthis ~~least~~property. ~~one~~ ~~root~~Suppose inthat <math>I</math>. is a region, then, # 2 If <math>FG</math> ismaps a<math>I</math> ~~[[contraction~~into ~~mapping]]~~itself, ~~in a region~~i.e., <math>G(I)\subseteq I</math>, then ~~there~~by is[[Brouwer fixed-point theorem]], <math>G</math> has at ~~most~~least one ~~root~~fixed ofpoint in <math>I</math>, and, hence <math>F</math> has at least one root in <math>I</math>. # If <math>G</math> is [[contractive]] in a region containing <math>I</math>, then there is at most one root in <math>I</math>. There are versions of the following methods over the complex numbers, but both the interval arithmetic and conditions must be adjusted to reflect this case. ~~The basic form of Newton's method replaces the input in the iterator with an interval approximation. In other words~~ ▲=== =Interval Newton method==== ~~<math display="block">N_f(I)=m(I)+J_F(I)^{-1}F(m(I))</math>~~ isIn the univariate case, Newton's method can ~~contained~~be ~~within~~directly generalized to certify a root over an interval. For an interval <math>IJ</math>, ~~where~~let <math>m(IJ)</math> isbe the midpoint of <math>IJ</math>. ~~and~~ ~~<math>J_f(I)</math> is an over-approximation to~~Then, the ~~set-image~~interval ofNewton ~~the~~operator ~~[[Jacobian~~applied ~~matrix and determinant\|Jacobian]] of~~to <math>FJ</math> ~~on <math>I</math>.~~is :<math>IN(J)=m(J)-F(m(J))/F'(J).</math> In practice, any interval containing <math>F'(J)</math> can be used in this computation. If <math>x</math> is a root of <math>F</math>, then by the [[mean value theorem]], there is some <math>c\in J</math> such that <math>F(m(J))-F'(c)(m(J)-x)=F(x)=0</math>. In other words, <math>F(m(J))=F'(c)(m(J)-x)</math>. Since <math>F'(J)</math> contains the inverse of <math>F</math> at all points of <math>J</math>, it follows that <math>m(J)-x\in F(m(J))/F'(J)</math>. Therefore, <math>x=m(J)-(m(J)-x)\in IN(J)</math>. Furthermore, if <math>0\not\in F'(J)</math>, then either <math>m(J)</math> is a root of <math>F</math> and <math>IN(J)=\{m(J)\}</math> or <math>m(J)\not\in IN(J)</math>. Therefore, <math>J\cap N(J)</math> is at most half the width of <math>J</math>. Therefore, if there is some root of <math>F</math> in <math>J</math>, the iterative procedure of replacing <math>J</math> by <math>J\cap IN(J)</math> will converge to this root. If, on the other hand, there is no root of <math>F</math> in <math>J</math>, this iterative procedure will eventually produce an empty interval, a witness to the nonexistence of roots. === Miranda test ===▼ See [[Interval arithmetic#Interval Newton method\|interval Newton method]] for higher dimensional analogues of this approach. ====Krawczyck method==== Let <math>Y</math> be any <math>n\times n</math> [[invertible matrix]] in <math>GL(n,\mathbb{R})</math>. Typically, one takes <math>Y</math> to be an approximation to <math>F'(y)^{-1}</math>. Then, define the function <math>G(x)=x-YF(x).</math> We observe that <math>x</math> is a fixed of <math>G</math> if and only if <math>x</math> is a root of <math>F</math>. Therefore the approach above can be used to identify roots of <math>F</math>. This approach is similar to a multivariate version of Newton's method, replacing the derivative with the fixed matrix <math>Y</math>. We observe that if <math>J</math> is a compact and convex region and <math>y\in J</math>, then, for any <math>x\in J</math>, there exist <math>c_1,\dots,c_n\in J</math> such that :<math>G(y)-G(x)=\begin{bmatrix}\nabla g_1(c_1)^T\\\vdots\\\nabla g_n(c_n)^T\end{bmatrix}(y-x).</math> Let <math>G'(J)</math> be the Jacobian matrix of <math>G</math> evaluated on <math>J</math>. In other words, the entry <math>(G'(J))_{ij}</math> consists of the image of <math>\frac{\partial g_i}{\partial x_j}</math> over <math>J</math>. It then follows that <math>G(x)\in G(y)+\nabla G(J)(x-y),</math> where the matrix-vector product is computed using interval arithmetic. Then, allowing <math>x</math> to vary in <math>J</math>, it follows that the image of <math>G</math> on <math>J</math> satisfies the following containment: <math>G(J)\subset G(y)+G'(J)(J-y),</math> where the calculations are, once again, computed using interval arithmetic. Combining this with the formula for <math>G</math>, the result is the Krawczyck operator :<math>K_{y,Y}(J)=y-YF(y)+(I-F'(J))(J-y),</math> where <math>I</math> is the [[identity matrix]]. If <math>K_{y,Y}(J)\subset J</math>, then <math>G</math> has a fixed point in <math>J</math>, i.e., <math>F</math> has a root in <math>J</math>. On the other hand, if the maximum [[matrix norm]] using the [[Uniform norm\|supremum norm for vectors]] of all matrices in <math>I-F'(J)</math> is less than <math>1</math>, then <math>G</math> is contractive within <math>J</math>, so <math>G</math> has a unique fixed point. A simpler test, when <math>J</math> is an axis-aligned parallelepiped, uses <math>y=m(J)</math>, i.e., the midpoint of <math>J</math>. In this case, there is a unique root of <math>F</math> if :<math>\\|K(X)-m(X)\\|<\frac{w(X)}{2},</math> where <math>w(X)</math> is the length of the longest side of <math>J</math>. ▲=== Miranda test === ▲* Miranda ~~Test~~test (Yap, Vegter, Sharma) == ''A priori'' certification methods == * Interval arithmetic (Moore, Arb, Mezzarobba) * Condition numbers (~~Beltran-Leykin~~Beltran–Leykin) === Interval arithmetic === {{Main\|Interval arithmetic}} Interval arithmetic can be used to provide an ''a priori'' numerical certificate by computing intervals containing unique solutions. By using intervals instead of plain numeric types during path tracking, resulting candidates are represented by intervals. The candidate solution-interval is itself the certificate, in the sense that the solution is guaranteed to be inside the interval. === Condition numbers === {{Main\|Condition number}} [[Numerical algebraic geometry]] solves polynomial systems using [[homotopy continuation]] and path tracking methods. By monitoring the [[condition number]] for a tracked homotopy at every step, and ensuring that no two solution paths ever intersect, one can compute a numerical certificate along with a solution. This scheme is called ''a priori'' path tracking.<ref>{{cite journal \|last1=Beltran \|first1=Carlos \|last2=Leykin \|first2=Anton \|title=Certified numerical homotopy tracking \|journal=Experimental Mathematics \|date=2012 \|volume=21 \|issue=1 \|pages=~~69-83~~69–83}}</ref> Non-certified numerical path tracking relies on heuristic methods for controlling time step size and precision.<ref>{{cite journal \|last1=Bates \|first1=Daniel \|last2=Hauenstein \|first2=Jonathan \|last3=Sommese \|first3=Andrew \|last4=Wampler \|first4=Charles \|title=Stepsize control for path tracking \|journal=Contemporary Mathematics \|date=2009 \|volume=496 \|issue=21}}</ref> In contrast, ''a priori'' certified path tracking goes beyond heuristics to provide step size control that ''guarantees'' that for every step along the path, the current point is within the ___domain of [[quadratic convergence]] for the current path. == ~~See also~~ References== {{Reflist}} [[:Category:Algebraic geometry]] [[:Category:Nonlinear algebra]]