Notation in probability and statistics

{{Short description|none}}
{{ProbabilityTopicsTOC}}
{{StatsTopicTOC}}
 
==Probability theory==
{{Unreferenced section|date=March 2021}}
* [[Random variable]]s are usually written in [[upper case]] Roman letters, such as <math display="inline">X</math> or <math display="inline">Y</math>. A random variable usually stands for a quantity described in words, such as "the height of a subject" for a continuous variable, "the number of cars in the school car park" for a discrete variable, or "the colour of the next bicycle" for a categorical variable. It does not represent a single number or a single category. For instance, <math>P(X = x) </math> is the probability that a particular realisation of the random variable <math display="inline">X</math> (e.g., height, number of cars, or bicycle colour) equals a particular value or category <math display="inline">x</math> (e.g., 1.735 m, 52, or purple). <math display="inline">X</math> and <math display="inline">x</math> must not be confused: <math display="inline">X</math> is the variable itself, whereas <math display="inline">x</math> is a value it may take. They are related, but they do not have identical meanings.
* Particular realisations of a random variable are written in the corresponding [[lower case]] letters. For example, <math display="inline">x_1,x_2, \ldots,x_n</math> could be a [[random sample|sample]] corresponding to the random variable <math display="inline">X</math>. A cumulative probability is formally written <math>P(X\le x) </math> to distinguish the random variable from its realisation.<ref>{{Cite web |date=2021-08-09 |title=Calculating Probabilities from Cumulative Distribution Function |url=https://analystprep.com/cfa-level-1-exam/quantitative-methods/calculating-probabilities-from-cumulative-distribution-function/ |access-date=2024-02-26}}</ref>
* The probability is sometimes written <math>\mathbb{P} </math> to distinguish it from other functions and from a general measure ''P'', which avoids having to state that "''P'' is a probability". <math>\mathbb{P}(X\in A) </math> is short for <math>P(\{\omega \in\Omega: X(\omega) \in A\})</math>, where <math>\Omega</math> is the [[sample space]], <math>X</math> is a random variable that is a function of <math>\omega</math> (i.e., it depends upon <math>\omega</math>), and <math>\omega</math> is some outcome of interest within the domain specified by <math>\Omega</math> (say, a particular height, or a particular colour of a car). The notation <math>\Pr(A)</math> is used as an alternative.
*<math>\mathbb{P}(A \cap B)</math> or <math>\mathbb{P}[B \cap A]</math> indicates the probability that events ''A'' and ''B'' both occur. The [[joint probability distribution]] of random variables ''X'' and ''Y'' is denoted <math>P(X, Y)</math>, the joint probability mass function or probability density function <math>f(x, y)</math>, and the joint cumulative distribution function <math>F(x, y)</math>.
*<math>\mathbb{P}(A \cup B)</math> or <math>\mathbb{P}[B \cup A]</math> indicates the probability of either event ''A'' or event ''B'' occurring ("or" in this case means [[inclusive or|one or the other or both]]).
*[[sigma-algebra|σ-algebras]] are usually written with uppercase [[Calligraphy|calligraphic]] letters (e.g. <math>\mathcal F</math> for the set of sets on which we define the probability ''P'').
*[[Probability density function]]s (pdfs) and [[probability mass function]]s are denoted by lowercase letters, e.g. <math>f(x)</math>, or <math>f_X(x)</math>.
*[[Cumulative distribution function]]s (cdfs) are denoted by uppercase letters, e.g. <math>F(x)</math>, or <math>F_X(x)</math>.
* [[Survival function]]s or complementary cumulative distribution functions are often denoted by placing an [[overbar]] over the symbol for the cumulative distribution function, <math>\overline{F}(x) =1-F(x)</math>, or are denoted <math>S(x)</math>.
*In particular, the pdf of the [[standard normal distribution]] is denoted by <math display="inline">\varphi(z)</math>, and its cdf by <math display="inline">\Phi(z)</math>.
*Some common operators:
:* <math display="inline">\mathrm{E}[X] </math>: [[expected value]] of ''X''
:* <math display="inline">\operatorname{var}[X] </math>: [[variance]] of ''X''
:* <math display="inline">\operatorname{cov}[X, Y] </math>: [[covariance]] of ''X'' and ''Y''
* "''X'' is independent of ''Y''" is often written <math>X \perp Y</math> or <math>X \perp\!\!\!\perp Y</math>, and "''X'' is independent of ''Y'' given ''W''" is often written
:<math>X \perp\!\!\!\perp Y \,|\, W </math> or
:<math>X \perp Y \,|\, W</math>.
* <math>\textstyle P(A\mid B)</math>, the ''[[conditional probability]]'', is the probability of <math>\textstyle A</math> ''given'' <math>\textstyle B</math>, i.e., the probability of <math>\textstyle A</math> ''after'' <math>\textstyle B</math> is observed.<ref>{{Citation |title=Probability and stochastic processes |date=2013-07-22 |url=http://dx.doi.org/10.1201/b15257-3 |work=Applied Stochastic Processes |pages=9–36 |access-date=2023-12-08 |publisher=Chapman and Hall/CRC |doi=10.1201/b15257-3 |isbn=978-0-429-16812-3 |url-access=subscription}}</ref>
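
The notation in this section can be illustrated with a small numerical sketch. The example below assumes a single roll of a fair six-sided die; the sample space, the events ''A'' and ''B'', and the helper name <code>prob</code> are illustrative choices, not part of any standard.

```python
from fractions import Fraction

# Sample space Omega for one roll of a fair die (an assumed example).
omega = {1, 2, 3, 4, 5, 6}


def prob(event):
    """P(event) under the uniform probability on omega (hypothetical helper)."""
    return Fraction(len(event & omega), len(omega))


def X(w):
    """The random variable X as a function of the outcome w; here X(w) = w."""
    return w


# Events written as sets of outcomes {w in Omega : X(w) in ...}.
A = {w for w in omega if X(w) % 2 == 0}   # "X is even"
B = {w for w in omega if X(w) <= 3}       # "X <= 3"

p_joint = prob(A & B)                     # P(A ∩ B): both occur
p_union = prob(A | B)                     # P(A ∪ B): inclusive "or"
p_cond = prob(A & B) / prob(B)            # P(A | B): A given B

cdf_at_3 = prob({w for w in omega if X(w) <= 3})   # F(3) = P(X <= 3)
survival_at_3 = 1 - cdf_at_3                       # S(3) = 1 - F(3)

# E[X] and var[X] for the uniform distribution on omega.
weight = Fraction(1, len(omega))
expectation = sum(X(w) * weight for w in omega)
variance = sum((X(w) - expectation) ** 2 * weight for w in omega)
```

Exact fractions are used so that each quantity can be compared directly with a hand calculation.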
 
==Statistics==
{{Unreferenced section|date=March 2021}}
*Greek letters (e.g. ''&theta;'', ''&beta;'') are commonly used to denote unknown parameters (population parameters).<ref>{{Cite web |date=1999-02-13 |title=Letters of the Greek Alphabet and Some of Their Statistical Uses |url=https://lesn.appstate.edu/olson/EDL7150/Components/Other%20useful%20links/Greek%20Alphabet%20and%20Statistics.htm |access-date=2024-02-26 |website=les.appstate.edu/}}</ref>
*A tilde (~) denotes "has the probability distribution of".
*Placing a hat, or caret (also known as a circumflex), over a true parameter denotes an [[estimator]] of it, e.g., <math>\widehat{\theta}</math> is an estimator for <math>\theta</math>.
*The [[arithmetic mean]] of a series of values <math display="inline">x_1,x_2, \ldots, x_n</math> is often denoted by placing an "[[overbar]]" over the symbol, e.g. <math>\bar{x}</math>, pronounced "<math display="inline">x</math> bar".
*Some commonly used symbols for [[Sample (statistics)|sample]] statistics are given below:
**the [[sample mean]] <math>\bar{x}</math>,
**the [[sample variance]] <math display="inline">s^2</math>,
**the [[sample standard deviation]] <math display="inline">s</math>,
**the [[Pearson correlation coefficient|sample correlation coefficient]] <math display="inline">r</math>,
**the sample cumulants <math display="inline">k_r</math>.
*Some commonly used symbols for [[Statistical population|population]] parameters are given below:
**the population mean <math display="inline">\mu</math>,
**the population variance <math display="inline">\sigma^2</math>,
**the population standard deviation <math display="inline">\sigma</math>,
**the population [[Pearson product-moment correlation coefficient|correlation]] <math display="inline">\rho</math>,
**the population [[cumulant]]s <math display="inline">\kappa_r</math>.
*<math>x_{(k)}</math> is used for the <math>k^\text{th}</math> [[order statistic]], where <math>x_{(1)}</math> is the sample minimum and <math>x_{(n)}</math> is the sample maximum from a total sample size <math display="inline">n</math>.<ref>{{Cite web |title=Order Statistics |url=https://www.colorado.edu/amath/sites/default/files/attached-files/order_stats.pdf |access-date=2024-02-26 |website=colorado.edu}}</ref>
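
As a rough illustration of the sample symbols above, the following sketch computes <math>\bar{x}</math>, <math display="inline">s^2</math>, <math display="inline">s</math> and the order statistics for a small made-up sample using Python's standard <code>statistics</code> module. The data values and the helper name <code>order_statistic</code> are assumptions chosen for the example.

```python
import statistics

# A small made-up sample x_1, ..., x_n (illustrative values only).
x = [3, 1, 5, 2, 4]
n = len(x)

x_bar = statistics.mean(x)       # sample mean, "x bar"
s2 = statistics.variance(x)      # sample variance s^2 (divides by n - 1)
s = statistics.stdev(x)          # sample standard deviation s

# Order statistics: x_(1) <= x_(2) <= ... <= x_(n).
ordered = sorted(x)


def order_statistic(k):
    """Return x_(k), the k-th smallest value, with k counted from 1."""
    return ordered[k - 1]


x_min = order_statistic(1)       # x_(1), the sample minimum
x_max = order_statistic(n)       # x_(n), the sample maximum
```

Note that <code>statistics.variance</code> uses the <math display="inline">n-1</math> divisor, whereas <code>statistics.pvariance</code> divides by <math display="inline">n</math> and so corresponds to a population variance.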
 
==Critical values==
{{Unreferenced section|date=March 2021}}
The ''α''-level upper [[critical value (statistics)|critical value]] of a [[probability distribution]] is the value exceeded with probability <math display="inline">\alpha</math>, that is, the value <math display="inline">x_\alpha</math> such that <math display="inline">F(x_\alpha) = 1-\alpha</math>, where <math display="inline">F</math> is the cumulative distribution function. There are standard notations for the upper critical values of some commonly used distributions in statistics:
*<math display="inline">z_\alpha</math> or <math display="inline">z(\alpha)</math> for the [[standard normal distribution]]
*<math display="inline">t_{\alpha,\nu}</math> or <math display="inline">t(\alpha,\nu)</math> for the [[Student's t-distribution|''t''-distribution]] with <math display="inline">\nu</math> [[Degrees of freedom (statistics)|degrees of freedom]]
*<math>{\chi_{\alpha,\nu}}^2</math> or <math>{\chi}^{2}(\alpha,\nu)</math> for the [[chi-squared distribution]] with <math display="inline">\nu</math> degrees of freedom
*<math>F_{\alpha,\nu_1,\nu_2}</math> or <math display="inline">F(\alpha,\nu_1,\nu_2)</math> for the [[F-distribution]] with <math display="inline">\nu_1</math> and <math display="inline">\nu_2</math> degrees of freedom
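
The defining relation <math display="inline">F(x_\alpha) = 1-\alpha</math> means that an upper critical value is the inverse cdf evaluated at <math display="inline">1-\alpha</math>. The sketch below shows this for the standard normal case only, using <code>NormalDist</code> from Python's standard library; the function name <code>upper_critical_value_normal</code> is a hypothetical helper. The ''t'', chi-squared and ''F'' cases need an inverse cdf from an external library and are not shown.

```python
from statistics import NormalDist


def upper_critical_value_normal(alpha):
    """Return z_alpha, the value exceeded with probability alpha under
    the standard normal distribution, i.e. the solution of
    F(z_alpha) = 1 - alpha."""
    return NormalDist(mu=0.0, sigma=1.0).inv_cdf(1.0 - alpha)


z_05 = upper_critical_value_normal(0.05)     # roughly 1.645
z_025 = upper_critical_value_normal(0.025)   # roughly 1.960

# Sanity check of the definition: F(z_alpha) should equal 1 - alpha.
check = NormalDist().cdf(z_05)
```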
 
==Linear algebra==
{{Unreferenced section|date=March 2021}}
*[[Matrix (mathematics)|Matrices]] are usually denoted by boldface capital letters, e.g. <math display="inline">\bold{A}</math>.
*[[Column vector]]s are usually denoted by boldface lowercase letters, e.g. <math display="inline">\bold{x}</math>.
*The [[transpose]] operator is denoted by either a superscript T (e.g. <math display="inline">\bold{A}^\mathrm{T}</math>) or a [[prime (symbol)|prime symbol]] (e.g. <math display="inline">\bold{A}'</math>).
*A [[row vector]] is written as the transpose of a column vector, e.g. <math display="inline">\bold{x}^\mathrm{T}</math> or <math display="inline">\bold{x}'</math>.
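
A minimal sketch of the transpose convention, assuming matrices are stored as plain Python lists of rows; the helper <code>transpose</code> and the example values are illustrative only.

```python
def transpose(matrix):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


# A 2 x 3 matrix A and a 3 x 1 column vector x (made-up values).
A = [[1, 2, 3],
     [4, 5, 6]]
x = [[7], [8], [9]]

A_T = transpose(A)   # the 3 x 2 matrix A^T
x_T = transpose(x)   # the 1 x 3 row vector x^T
```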
 
==Abbreviations==
{{Unreferenced section|date=March 2021}}
Common abbreviations include:
*'''a.e.''' [[almost everywhere]]
== See also ==
*[[Glossary of probability and statistics]]
*[[Combination]]s and [[permutation]]s
*[[Typographical conventions in mathematical formulae]]
*[[History of mathematical notation]]
 
== References ==
{{Reflist}}
{{refbegin}}
* {{Citation |title = Recommended Standards for Statistical Symbols and Notation. COPSS Committee on Symbols and Notation |first1 = Max |last1 = Halperin |first2 = H. O. |last2 = Hartley |first3 = P. G. |last3 = Hoel |journal = The American Statistician |volume = 19 |year = 1965 |pages = 12–14 |issue = 3 |doi = 10.2307/2681417 |jstor = 2681417 }}
{{refend}}
 
== External links ==
* [http://jeff560.tripod.com/stat.html Earliest Uses of Symbols in Probability and Statistics], maintained by Jeff Miller.
 
{{Mathematical symbols notation language}}
 
[[Category:Probability and statistics| Notation]]