{{Short description|Mathematical method of risk analysis}}
{{Use dmy dates|date=October 2022}}
'''Probability bounds analysis''' ('''PBA''') is a collection of methods of uncertainty propagation for making qualitative and quantitative calculations in the face of uncertainties of various kinds. It is used to project partial information about random variables and other quantities through mathematical expressions. For instance, it computes sure bounds on the distribution of a sum, product, or more complex function, given only sure bounds on the distributions of the inputs. Such bounds are called [[probability box]]es, and constrain [[cumulative distribution function|cumulative probability distributions]] (rather than [[probability density function|densities]] or [[probability mass function|mass functions]]).
This [[upper and lower bounds|bounding]] approach permits analysts to make calculations without requiring overly precise assumptions about parameter values, dependence among variables, or even distribution shape. Probability bounds analysis is essentially a combination of the methods of standard [[interval analysis]] and classical [[probability theory]]. Probability bounds analysis gives the same answer as interval analysis does when only range information is available. It also gives the same answers as [[Monte Carlo simulation]] does when information is abundant enough to precisely specify input distributions and their dependencies. Thus, it is a generalization of both interval analysis and probability theory.
The diverse methods comprising probability bounds analysis provide algorithms to evaluate mathematical expressions when there is uncertainty about the input values, their dependencies, or even the form of mathematical expression itself. The calculations yield results that are guaranteed to enclose all possible distributions of the output variable if the input [[probability box|p-boxes]] were also sure to enclose their respective distributions. In some cases, a calculated p-box will also be best-possible in the sense that the bounds could be no tighter without excluding some of the possible distributions.
P-boxes are usually merely bounds on possible distributions. The bounds often also enclose distributions that are not themselves possible. For instance, the set of probability distributions that could result from adding random values without the independence assumption from two (precise) distributions is generally a proper [[subset]] of all the distributions enclosed by the p-box computed for the sum. That is, there are distributions within the output p-box that could not arise under any dependence between the two input distributions. The output p-box will, however, always contain all distributions that are possible, so long as the input p-boxes were sure to enclose their respective underlying distributions. This property often suffices for use in [[Probabilistic risk assessment|risk analysis]] and other fields requiring calculations under uncertainty.
==History of bounding probability==
The idea of bounding probability has a very long tradition throughout the history of probability theory. Indeed, in 1854 [[George Boole]] used the notion of interval bounds on probability in his ''[[The Laws of Thought]]''.<ref name="BOOLE1854">{{cite book|url= https://www.gutenberg.org/ebooks/15114 |last=Boole |first=George |title=An Investigation of the Laws of Thought on which are Founded the Mathematical Theories of Logic and Probabilities |publisher=Walton and Maberly |year=1854 |___location=London}}</ref><ref name=Hailperin86>{{cite book |last=Hailperin |first=Theodore |title=Boole's Logic and Probability |publisher=North-Holland |year=1986 |___location=Amsterdam |isbn=978-0-444-11037-4 }}</ref> Also dating from the latter half of the 19th century, the [[Chebyshev inequality|inequality]] attributed to [[Chebyshev]] described bounds on a distribution when only the mean and variance of the variable are known, and the related [[Markov inequality|inequality]] attributed to [[Andrey Markov|Markov]] found bounds on a positive variable when only the mean is known. [[Henry E. Kyburg, Jr.|Kyburg]]<ref name="kyburg99">Kyburg, H.E., Jr. (1999). [https://sipta.org/documentation/interval_prob/kyburgnew.pdf Interval valued probabilities]. SIPTA Documentation on Imprecise Probability.</ref> reviewed the history of interval probabilities and traced the development of the critical ideas through the 20th century, including the important notion of incomparable probabilities favored by [[John Maynard Keynes|Keynes]].
Of particular note is [[Maurice René Fréchet|Fréchet]]'s derivation in the 1930s of bounds on calculations involving total probabilities without dependence assumptions. Bounding probabilities has continued to the present day (e.g., Walley's theory of [[imprecise probability]]<ref name="WALLEY1991">{{cite book|url= https://archive.org/details/statisticalreaso0000wall |last=Walley |first=Peter |title=Statistical Reasoning with Imprecise Probabilities |url-access=registration |publisher=Chapman and Hall |year=1991 |___location=London |isbn=978-0-412-28660-5 }}</ref>).
The methods of probability bounds analysis that could be routinely used in
risk assessments were developed in the 1980s. Hailperin<ref name=Hailperin86 /> described a computational scheme for bounding logical calculations extending the ideas of Boole. Yager<ref name=Yager>Yager, R.R. (1986). Arithmetic and other operations on Dempster–Shafer structures. ''International Journal of Man-machine Studies'' '''25''': 357–366.</ref> described the elementary procedures by which bounds on [[convolution of probability distributions|convolutions]] can be computed under an assumption of independence. At about the same time, Makarov<ref name=Makarov /> solved the problem of finding the best-possible bounds on the distribution of a sum of random variables when only their marginal distributions, and not their dependence, are known.
==Arithmetic expressions==
Arithmetic expressions involving operations such as additions, subtractions, multiplications, divisions, minima, maxima, powers, exponentials, logarithms, square roots, absolute values, etc., are commonly used in [[Probabilistic risk assessment|risk analyses]] and uncertainty modeling. Convolution is the operation of finding the probability distribution of a sum of independent random variables specified by probability distributions. We can extend the term to finding distributions of other mathematical functions (products, differences, quotients, and more complex functions) and other assumptions about the intervariable dependencies. There are convenient algorithms for computing these generalized convolutions under a variety of assumptions about the dependencies among the inputs.<ref name=Yager /><ref name=WilliamsonDowns /><ref name=Fersonetal03 /><ref name=Fersonetal04 />
===Mathematical details===
Let <math>\mathbb{D}</math> denote the space of distribution functions on the [[real number]]s <math>\R,</math> i.e.,
:<math> \mathbb{D} = \{ D | D: \R \to [0,1], D(x) \leq D(y) \text{ for all } x < y \}.</math>
A p-box is a quintuple
:<math>\left \{ \overline{F}, \underline{F}, m, v, \mathbf{F} \right \},</math>
where <math>\overline{F} </math> and <math>\underline{F} \in \mathbb{D}</math>, <math>m</math> and <math>v</math> are real intervals, and <math>\mathbf{F} \subset \mathbb{D}</math>. This quintuple denotes the set of distribution functions <math>F \in \mathbf{F} \subset \mathbb{D}</math> such that:
:<math>\begin{align}
\forall x \in \R: \qquad &\overline{F}(x) \leq F(x) \leq \underline{F}(x) \\[6pt]
&\int_\R x dF(x) \in m && \text{expectation condition} \\
&\int_\R x^2 dF(x) - \left ( \int_\R x dF(x) \right )^2 \in v && \text{variance condition}
\end{align}</math>
If a distribution function satisfies all the conditions above, it is said to be ''inside'' the p-box. In some cases, there may be no information about the moments or distribution family other than what is encoded in the two distribution functions that constitute the edges of the p-box. Then the quintuple representing the p-box <math>\{B_1, B_2, [-\infty, \infty], [0, \infty], \mathbb{D}\}</math> can be denoted more compactly as [''B''<sub>1</sub>, ''B''<sub>2</sub>]. This notation harks back to that of intervals on the real line, except that the endpoints are distributions rather than points.
The notation <math>X \sim F</math> denotes the fact that <math>X \in \R</math> is a random variable governed by the distribution function ''F'', that is,
:<math>\begin{cases} F: \R \to [0,1] \\ x \mapsto \Pr (X \leq x) \end{cases}</math>
Let us generalize the tilde notation for use with p-boxes. We will write ''X'' ~ ''B'' to mean that ''X'' is a random variable whose distribution function is unknown except that it is inside ''B''. Thus, ''X'' ~ ''F'' ∈ ''B'' can be contracted to ''X'' ~ ''B'' without mentioning the distribution function explicitly.
If ''X'' and ''Y'' are independent random variables with distributions ''F'' and ''G'' respectively, then ''X'' + ''Y'' = ''Z'' ~ ''H'' given by
:<math>H(z) = \Pr(X + Y \leq z) = \int_\R F(z-y) \, dG(y) = \int_\R G(z-x) \, dF(x) = (F * G)(z).</math>
This operation is called a [[convolution]] on ''F'' and ''G''. The analogous operation on p-boxes is straightforward for sums. Suppose
:<math>X \sim A = [A_1, A_2], \quad \text{and} \quad Y \sim B = [B_1, B_2].</math>
If ''X'' and ''Y'' are independent, then the distribution of ''Z'' = ''X'' + ''Y'' is inside the p-box
:<math> \left [A_1 * B_1, A_2 * B_2 \right ].</math>
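For discrete distributions, the edge-by-edge convolution can be sketched in Python. The helper names below are hypothetical, and the sketch only illustrates the pairing of edges; published algorithms use discretizations designed to preserve the bounding property rigorously.

```python
from collections import defaultdict

def convolve(p, q):
    """Distribution of X + Y for independent discrete X ~ p and Y ~ q,
    each given as a {value: probability} mapping."""
    out = defaultdict(float)
    for x, px in p.items():
        for y, qy in q.items():
            out[x + y] += px * qy
    return dict(out)

# A p-box is a pair of edge distributions; under independence the sum's
# p-box pairs the corresponding edges: [A1 * B1, A2 * B2].
A1, A2 = {1: 1.0}, {2: 1.0}        # p-box for the interval [1, 2]
B1, B2 = {1: 1.0}, {2: 1.0}
Z1, Z2 = convolve(A1, B1), convolve(A2, B2)   # p-box for Z = X + Y
```

For these degenerate (interval-like) p-boxes the result reduces to interval addition: ''Z'' is the interval [2, 4].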
Finding bounds on the distribution of sums ''Z'' = ''X'' + ''Y'' ''without making any assumption about the dependence'' between ''X'' and ''Y'' is actually easier than the problem assuming independence. Makarov<ref name=Makarov/><ref name=Franketal87/><ref name=WilliamsonDowns/> showed that
:<math>Z \sim \left [ \sup_{z=x+y} \max ( F(x) +G(y) -1, 0), \inf_{z=x+y} \min (F(x)+G(y), 1) \right ]</math>
These bounds are implied by the [[copula (probability theory)#Fr.C3.A9chet.E2.80.93Hoeffding copula bounds|Fréchet–Hoeffding]] [[copula (probability theory)|copula]] bounds. The problem can also be solved using the methods of [[mathematical programming]].<ref name=BerleantGoodmanStrauss />
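The dependency-free bounds can be evaluated numerically for given marginal CDFs. In the sketch below (hypothetical function name), the supremum and infimum over the real line are approximated by a search over a finite grid; this only loosens the bounds, so they still enclose the true distribution of the sum.

```python
from statistics import NormalDist

def frechet_sum_bounds(F, G, z, xs):
    """Bounds on Pr(X + Y <= z) valid for *any* dependence between
    X ~ F and Y ~ G, from the Makarov / Frechet-Hoeffding result;
    the sup/inf over x + y = z is searched over the grid xs."""
    lower = max(max(F(x) + G(z - x) - 1.0, 0.0) for x in xs)
    upper = min(min(F(x) + G(z - x), 1.0) for x in xs)
    return lower, upper

F = G = NormalDist(0.0, 1.0).cdf
xs = [i / 100.0 for i in range(-800, 801)]   # grid on [-8, 8]
lo, hi = frechet_sum_bounds(F, G, 1.0, xs)
```

Because the bounds hold under any dependence, the CDF of the sum of two independent standard normals, N(0,&nbsp;2), must lie between them.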
The convolution under the intermediate assumption that ''X'' and ''Y'' have [[positive quadrant dependence|positive dependence]] is likewise easy to compute, as is the convolution under the extreme assumptions of [[Comonotonicity|perfect positive]] or [[countermonotonicity|perfect negative]] dependency between ''X'' and ''Y''.<ref name=Fersonetal04 />
Generalized convolutions for other operations such as subtraction, multiplication, division, etc., can be derived using transformations. For instance, p-box subtraction ''A'' − ''B'' can be defined as ''A'' + (−''B''), where the negative of a p-box ''B'' = [''B''<sub>1</sub>, ''B''<sub>2</sub>] is [''B''<sub>2</sub>(−''x''), ''B''<sub>1</sub>(−''x'')].
==Logical expressions==
Logical or [[Boolean algebra|Boolean]] expressions involving [[logical conjunction|conjunctions]] (AND operations), [[logical disjunction|disjunctions]] (OR operations), and other logical operations on event probabilities arise commonly in risk assessments. If the probabilities of events A and B are characterized by the intervals P(A) = ''a'' = [0.2, 0.25] and P(B) = ''b'' = [0.1, 0.3], then the probability of the conjunction is surely in the interval
: P(A & B) = ''a'' × ''b''
:::: = [0.2, 0.25] × [0.1, 0.3]
:::: = [0.2 × 0.1, 0.25 × 0.3]
:::: = [0.02, 0.075]
so long as A and B can be assumed to be independent events. If they are not independent, we can still bound the conjunction using the classical [[Fréchet inequalities|Fréchet inequality]]:
: P(A & B) = env(max(0, ''a''+''b''−1), min(''a'', ''b''))
:::: = env(max(0, [0.2, 0.25]+[0.1, 0.3]−1), min([0.2, 0.25], [0.1, 0.3]))
:::: = env(max(0, [−0.7, −0.45]), min([0.2, 0.25], [0.1, 0.3]))
:::: = env([0, 0], [0.1, 0.25])
:::: = [0, 0.25]
where env([''x''<sub>1</sub>,''x''<sub>2</sub>], [''y''<sub>1</sub>,''y''<sub>2</sub>]) is [min(''x''<sub>1</sub>,''y''<sub>1</sub>), max(''x''<sub>2</sub>,''y''<sub>2</sub>)]. Likewise, the probability of the [[logical disjunction|disjunction]] is surely in the interval
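The interval computations above are simple enough to check directly. The following Python sketch (hypothetical function names) represents probability intervals as (lower, upper) pairs and implements both the independent conjunction and the Fréchet bounds with the env operation:

```python
def env(x, y):
    """Envelope of two intervals: [min of lowers, max of uppers]."""
    return (min(x[0], y[0]), max(x[1], y[1]))

def and_independent(a, b):
    """P(A & B) = a * b for independent events; endpoints are in [0, 1],
    so interval multiplication is endpoint-by-endpoint."""
    return (a[0] * b[0], a[1] * b[1])

def and_frechet(a, b):
    """Frechet bounds on P(A & B): no dependence assumption."""
    lower = (max(0.0, a[0] + b[0] - 1.0), max(0.0, a[1] + b[1] - 1.0))
    upper = (min(a[0], b[0]), min(a[1], b[1]))
    return env(lower, upper)

a, b = (0.2, 0.25), (0.1, 0.3)
```

For ''a'' = [0.2, 0.25] and ''b'' = [0.1, 0.3] these reproduce the intervals [0.02, 0.075] and [0, 0.25] computed above.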
: P(A v B) = ''a'' + ''b'' − ''a'' × ''b''
:::: = 1 − (1 − [0.2, 0.25]) × (1 − [0.1, 0.3])
:::: = 1 − [0.75, 0.8] × [0.7, 0.9]
:::: = 1 − [0.525, 0.72]
:::: = [0.28, 0.475]
if A and B are independent events. If they are not independent, the disjunction can still be bounded using the classical [[Fréchet inequalities|Fréchet inequality]]:
: P(A v B) = env(max(''a'', ''b''), min(1, ''a'' + ''b''))
:::: = env(max([0.2, 0.25], [0.1, 0.3]), min(1, [0.2, 0.25] + [0.1, 0.3]))
:::: = env([0.2, 0.3], min(1, [0.3, 0.55]))
:::: = env([0.2, 0.3], [0.3, 0.55])
:::: = [0.2, 0.55].
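The disjunction calculations can be checked the same way (hypothetical function names; probability intervals again as (lower, upper) pairs):

```python
def env(x, y):
    """Envelope of two intervals: [min of lowers, max of uppers]."""
    return (min(x[0], y[0]), max(x[1], y[1]))

def or_independent(a, b):
    """P(A v B) = 1 - (1-a)(1-b) for independent events, evaluated
    with interval arithmetic (complement reverses the endpoints)."""
    na = (1.0 - a[1], 1.0 - a[0])
    nb = (1.0 - b[1], 1.0 - b[0])
    prod = (na[0] * nb[0], na[1] * nb[1])
    return (1.0 - prod[1], 1.0 - prod[0])

def or_frechet(a, b):
    """Frechet bounds on P(A v B): env(max(a, b), min(1, a + b))."""
    lower = (max(a[0], b[0]), max(a[1], b[1]))
    upper = (min(1.0, a[0] + b[0]), min(1.0, a[1] + b[1]))
    return env(lower, upper)

a, b = (0.2, 0.25), (0.1, 0.3)
```

These reproduce the intervals [0.28, 0.475] (independence) and [0.2, 0.55] (no dependence assumption) computed above.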
It is also possible to compute interval bounds on the conjunction or disjunction under other assumptions about the dependence between A and B. For instance, one might assume they are positively dependent, in which case the resulting interval is not as tight as the answer assuming independence but tighter than the answer given by the [[Fréchet inequalities|Fréchet inequality]].
==Magnitude comparisons==
The probability that an uncertain number represented by a p-box ''D'' is less than zero is the interval Pr(''D'' < 0) = [<u>''F''</u>(0), {{overline|''F''}}(0)], where the bounds are the two edges of the p-box ''D'' evaluated at zero. Magnitude comparisons between two uncertain numbers ''A'' and ''B'' can be defined in terms of such probabilities:
:''A'' < ''B'' = Pr(''A'' − ''B'' < 0),
:''A'' > ''B'' = Pr(''B'' − ''A'' < 0),
:''A'' ≤ ''B'' = Pr(''A'' − ''B'' ≤ 0), and
:''A'' ≥ ''B'' = Pr(''B'' − ''A'' ≤ 0).
Thus the probability that ''A'' is less than ''B'' is the same as the probability that their difference is less than zero, and this probability can be said to be the value of the expression ''A'' < ''B''.
==Sampling-based computation==
Some analysts<ref>Alvarez, D. A., 2006. On the calculation of the bounds of probability of events using infinite random sets. ''International Journal of Approximate Reasoning'' '''43''': 241–267.</ref><ref>Baraldi, P., Popescu, I. C., Zio, E., 2008. Predicting the time to failure of a randomly degrading component by a hybrid Monte Carlo and possibilistic method. ''IEEE Proc. International Conference on Prognostics and Health Management''.</ref><ref>Batarseh, O. G., Wang, Y., 2008. Reliable simulation with input uncertainties using an interval-based approach. ''IEEE Proc. Winter Simulation Conference''.</ref><ref>Roy, Christopher J., and Michael S. Balch (2012). A holistic approach to uncertainty quantification with application to supersonic nozzle thrust. ''International Journal for Uncertainty Quantification''.</ref> have used sampling-based approaches to compute bounds on probabilities. Because such simulations are approximations, they cannot guarantee that the computed bounds rigorously enclose the true output distributions, although their accuracy generally improves as the number of samples increases.
==Relationship to other uncertainty propagation approaches==
PBA belongs to a class of methods that use [[imprecise probability|imprecise probabilities]] to simultaneously represent [[Uncertainty quantification|aleatoric and epistemic uncertainties]].
==Applications==
==References==
{{Reflist|30em}}
==Further references==
* {{cite book | last1 = Bernardini | first1 = Alberto | last2 = Tonon | first2 = Fulvio | title = Bounding Uncertainty in Civil Engineering: Theoretical Background | publisher = Springer | ___location = Berlin | year = 2010 | isbn = 978-3-642-11189-1 }}
* {{cite book | last = Ferson | first = Scott | title = RAMAS Risk Calc 4.0 Software : Risk Assessment with Uncertain Numbers | publisher = Lewis Publishers | ___location = Boca Raton, Florida | year = 2002 | isbn = 978-1-56670-576-9 }}
* {{cite journal |first=G. |last=Gerla |title=Inferences in Probability Logic |journal=Artificial Intelligence |volume=70 |issue=1–2 |pages=33–52 |year=1994 |doi=10.1016/0004-3702(94)90102-3 }}
* {{cite book | last1 = Oberkampf | first1 = William L. | last2 = Roy | first2 = Christopher J. | title = Verification and Validation in Scientific Computing | publisher = Cambridge University Press | ___location = New York | year = 2010 | isbn = 978-0-521-11360-1 }}<!-- In an email dated 28 March 2011, William Oberkampf stated "PBA is the only UQ method we discuss and apply in our examples in the book." -->
==External links==
* [http://www.ramas.com/pbawhite.pdf Probability bounds analysis in environmental risk assessments]
* [http://ualr.edu/jdberleant/intprob/ Intervals and probability distributions]
* [https://web.archive.org/web/20120210155925/http://www.sandia.gov/epistemic/ Epistemic uncertainty project]
* [http://www.sipta.org/ The Society for Imprecise Probability: Theories and Applications]
[[Category:Probability bounds analysis| ]]
[[Category:Mathematical analysis]]