Notation in probability and statistics: Difference between revisions

{{Short description|none}}
{{ProbabilityTopicsTOC}}
{{StatsTopicTOC}}
 
==Probability theory==
{{Unreferenced section|date=March 2021}}
* [[Random variable]]s are usually written in [[upper case]] Roman letters, such as <math display="inline">X</math> or <math display="inline">Y</math>.<ref name=":0">{{Cite web|date=2020-04-26|title=List of Probability and Statistics Symbols|url=https://mathvault.ca/hub/higher-math/math-symbols/probability-statistics-symbols/|access-date=2020-09-10|website=Math Vault|language=en-US}}</ref> A random variable in this context usually stands for a quantity described in words, such as "the height of a subject" for a continuous variable, "the number of cars in the school car park" for a discrete variable, or "the colour of the next bicycle" for a categorical variable. It does not represent a single number or a single category. For instance, <math>P(X = x)</math> denotes the probability that a particular realisation of the random variable <math display="inline">X</math> (e.g., height, number of cars, or bicycle colour) is equal to a particular value or category <math display="inline">x</math> (e.g., 1.735 m, 52, or purple). It is important not to confuse <math display="inline">X</math> with <math display="inline">x</math>: <math display="inline">X</math> is an idea, while <math display="inline">x</math> is a value. The two are related, but they do not have identical meanings.
* Particular realisations of a random variable are written in corresponding [[lower case]] letters. For example, <math display="inline">x_1,x_2, \ldots,x_n</math> could be a [[random sample|sample]] corresponding to the random variable <math display="inline">X</math>. A cumulative probability is formally written <math>P(X\le x) </math> to distinguish the random variable from its realisation.<ref>{{Cite web |date=2021-08-09 |title=Calculating Probabilities from Cumulative Distribution Function |url=https://analystprep.com/cfa-level-1-exam/quantitative-methods/calculating-probabilities-from-cumulative-distribution-function/ |access-date=2024-02-26}}</ref>
* The probability is sometimes written <math>\mathbb{P} </math> to distinguish it from other functions and from a measure ''P'', so as to avoid having to define "''P'' is a probability". <math>\mathbb{P}(X\in A) </math> is short for <math>P(\{\omega \in\Omega: X(\omega) \in A\})</math>, where <math>\Omega</math> is the sample space, <math>X</math> is a random variable that is a function of <math>\omega</math> (i.e., it depends upon <math>\omega</math>), and <math>\omega</math> is some outcome of interest within the domain specified by <math>\Omega</math> (say, a particular height, or a particular colour of a car). The notation <math>\Pr(A)</math> is used alternatively.<ref name=":0" />
*<math>\mathbb{P}(A \cap B)</math> or <math>\mathbb{P}[B \cap A]</math> indicates the probability that events ''A'' and ''B'' both occur. The [[joint probability distribution]] of random variables ''X'' and ''Y'' is denoted as <math>P(X, Y)</math>, while joint probability mass function or probability density function as <math>f(x, y)</math> and joint cumulative distribution function as <math>F(x, y)</math>.<ref name=":0" />
*<math>\mathbb{P}(A \cup B)</math> or <math>\mathbb{P}[B \cup A]</math> indicates the probability of either event ''A'' or event ''B'' occurring ("or" in this case means [[inclusive or|one or the other or both]]).
*[[sigma-algebra|σ-algebras]] are usually written with uppercase [[Calligraphy|calligraphic]] letters (e.g. <math>\mathcal F</math> for the set of sets on which we define the probability ''P'').
*[[Probability density function]]s (pdfs) and [[probability mass function]]s are denoted by lowercase letters, e.g. <math>f(x)</math>, or <math>f_X(x)</math>.<ref name=":0" />
*[[Cumulative distribution function]]s (cdfs) are denoted by uppercase letters, e.g. <math>F(x)</math>, or <math>F_X(x)</math>.
* [[Survival function]]s or complementary cumulative distribution functions are often denoted by placing an [[overbar]] over the symbol for the cumulative distribution function, <math>\overline{F}(x) =1-F(x)</math>, or are denoted as <math>S(x)</math>.<ref name=":0" />
*In particular, the pdf of the [[standard normal distribution]] is denoted by <math display="inline">\varphi(z)</math>, and its cdf by <math display="inline">\Phi(z)</math>.
*Some common operators:<ref name=":0" />
:* <math display="inline">\mathrm{E}[X] </math>: [[expected value]] of ''X''
:* <math display="inline">\operatorname{var}[X] </math>: [[variance]] of ''X''
:* <math display="inline">\operatorname{cov}[X, Y] </math>: [[covariance]] of ''X'' and ''Y''
* "''X'' is independent of ''Y''" is often written <math>X \perp Y</math> or <math>X \perp\!\!\!\perp Y</math>, and "''X'' is independent of ''Y'' given ''W''" is often written
:<math>X \perp\!\!\!\perp Y \,|\, W </math> or
:<math>X \perp Y \,|\, W</math>.<ref name=":0" />
* <math>\textstyle P(A\mid B)</math>, the ''[[conditional probability]]'', is the probability of <math>\textstyle A</math> ''given'' <math>\textstyle B</math>.<ref>{{Citation |title=Probability and stochastic processes |date=2013-07-22 |url=http://dx.doi.org/10.1201/b15257-3 |work=Applied Stochastic Processes |pages=9–36 |access-date=2023-12-08 |publisher=Chapman and Hall/CRC |doi=10.1201/b15257-3 |isbn=978-0-429-16812-3 |url-access=subscription}}</ref>
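
The notation in this section can be illustrated on a small finite example. The following Python sketch assumes a fair six-sided die as the sample space, with the random variable ''X'' equal to the face shown. The names <code>omega</code>, <code>prob</code>, <code>prob_X_in</code>, <code>cdf</code>, <code>survival</code>, <code>expectation</code> and <code>conditional</code> are hypothetical helpers chosen for this illustration and are not part of any library. Probabilities are kept exact with <code>fractions.Fraction</code>.

```python
from fractions import Fraction

# Sample space Ω for one roll of a fair die (an assumption of this example).
omega = [1, 2, 3, 4, 5, 6]


def prob(event):
    """P(event) for an event given as a collection of outcomes in omega."""
    return Fraction(len(set(event) & set(omega)), len(omega))


def X(w):
    """Random variable X: a function of the outcome ω (here the face value)."""
    return w


def prob_X_in(A):
    """P(X ∈ A), i.e. P({ω ∈ Ω : X(ω) ∈ A})."""
    return prob([w for w in omega if X(w) in A])


def cdf(x):
    """Cumulative distribution function F(x) = P(X ≤ x)."""
    return prob([w for w in omega if X(w) <= x])


def survival(x):
    """Survival function 1 - F(x) = P(X > x)."""
    return 1 - cdf(x)


def expectation(g=lambda v: v):
    """E[g(X)] as a probability-weighted sum over the equally likely outcomes."""
    return sum(g(X(w)) * Fraction(1, len(omega)) for w in omega)


def conditional(A, B):
    """P(A | B) = P(A ∩ B) / P(B); only meaningful when P(B) > 0."""
    return prob(set(A) & set(B)) / prob(B)


mean = expectation()                               # E[X]
variance = expectation(lambda v: (v - mean) ** 2)  # var[X]

A = {2, 4, 6}  # event "the roll is even"
B = {4, 5, 6}  # event "the roll is at least 4"

p_A_and_B = prob(A & B)          # P(A ∩ B): both events occur
p_A_or_B = prob(A | B)           # P(A ∪ B): inclusive "or"
p_A_given_B = conditional(A, B)  # P(A | B)
```

Here the lower-case argument <code>x</code> of <code>cdf</code> plays the role of a realisation, while the function <code>X</code> plays the role of the random variable itself.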
 
==Statistics==
{{Unreferenced section|date=March 2021}}
*Greek letters (e.g. ''&theta;'', ''&beta;'') are commonly used to denote unknown parameters (population parameters).<ref>{{Cite web |date=1999-02-13 |title=Letters of the Greek Alphabet and Some of Their Statistical Uses |url=https://lesn.appstate.edu/olson/EDL7150/Components/Other%20useful%20links/Greek%20Alphabet%20and%20Statistics.htm |access-date=2024-02-26 |website=les.appstate.edu/}}</ref>
*A tilde (~) denotes "has the probability distribution of".
*Placing a hat, or caret (also known as a circumflex), over a true parameter denotes an [[estimator]] of it, e.g., <math>\widehat{\theta}</math> is an estimator for <math>\theta</math>.<ref name=":0" />
*The [[arithmetic mean]] of a series of values <math display="inline">x_1,x_2, \ldots, x_n</math> is often denoted by placing an "[[overbar]]" over the symbol, e.g. <math>\bar{x}</math>, pronounced "<math display="inline">x</math> bar".
*Some commonly used symbols for [[Sample (statistics)|sample]] statistics are given below:
**the [[sample mean]] <math>\bar{x}</math>,
**the [[sample variance]] <math display="inline">s^2</math>,
**the [[sample standard deviation]] <math display="inline">s</math>,
**the [[Pearson correlation coefficient|sample correlation coefficient]] <math display="inline">r</math>,
**the sample cumulants <math display="inline">k_r</math>.
*Some commonly used symbols for [[Statistical population|population]] parameters are given below:
**the population mean <math display="inline">\mu</math>,
**the population variance <math display="inline">\sigma^2</math>,
**the population standard deviation <math display="inline">\sigma</math>,
**the population [[Pearson product-moment correlation coefficient|correlation]] <math display="inline">\rho</math>,
**the population [[cumulant]]s <math display="inline">\kappa_r</math>.
*<math>x_{(k)}</math> is used for the <math>k^\text{th}</math> [[order statistic]],<ref name=":0" /> where <math>x_{(1)}</math> is the sample minimum and <math>x_{(n)}</math> is the sample maximum from a total sample size <math display="inline">n</math>.<ref>{{Cite web |title=Order Statistics |url=https://www.colorado.edu/amath/sites/default/files/attached-files/order_stats.pdf |access-date=2024-02-26 |website=colorado.edu}}</ref>
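
As a rough illustration of the sample statistics listed above, the following Python sketch computes them for two small made-up samples. The data values and the helper <code>order_statistic</code> are hypothetical. The sample variance is computed with the common ''n'' − 1 denominator, which is an assumption here, since the list above does not fix a convention.

```python
import math

# Hypothetical paired samples x_1, ..., x_n and y_1, ..., y_n (made-up data).
x = [2.0, 4.0, 4.0, 5.0, 7.0]
y = [1.0, 3.0, 2.0, 5.0, 6.0]
n = len(x)

# Sample means, written x-bar and y-bar.
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squared deviations and of cross products about the means.
ss_x = sum((xi - x_bar) ** 2 for xi in x)
ss_y = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Sample variance s^2 (n - 1 denominator assumed) and standard deviation s.
s2 = ss_x / (n - 1)
s = math.sqrt(s2)

# Sample correlation coefficient r.
r = ss_xy / math.sqrt(ss_x * ss_y)

# Order statistics: x_(1) <= x_(2) <= ... <= x_(n).
ordered = sorted(x)


def order_statistic(k):
    """Return the k-th order statistic x_(k), with k counted from 1."""
    return ordered[k - 1]


sample_min = order_statistic(1)  # x_(1)
sample_max = order_statistic(n)  # x_(n)
```

The quantities <code>x_bar</code>, <code>s2</code>, <code>s</code> and <code>r</code> are statistics computed from the data, in contrast to the population parameters written with Greek letters, which are generally unknown.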
 
==Critical values==
{{Unreferenced section|date=March 2021}}
The ''α''-level upper [[critical value (statistics)|critical value]] of a [[probability distribution]] is the value exceeded with probability <math display="inline">\alpha</math>, that is, the value <math display="inline">x_\alpha</math> such that <math display="inline">F(x_\alpha) = 1-\alpha</math>, where <math display="inline">F</math> is the cumulative distribution function. There are standard notations for the upper critical values of some commonly used distributions in statistics:<ref name=":0" />
*<math display="inline">z_\alpha</math> or <math display="inline">z(\alpha)</math> for the [[standard normal distribution]]
*<math display="inline">t_{\alpha,\nu}</math> or <math display="inline">t(\alpha,\nu)</math> for the [[Student's t-distribution|''t''-distribution]] with <math display="inline">\nu</math> [[Degrees of freedom (statistics)|degrees of freedom]]
*<math>{\chi_{\alpha,\nu}}^2</math> or <math>{\chi}^{2}(\alpha,\nu)</math> for the [[chi-squared distribution]] with <math display="inline">\nu</math> degrees of freedom
*<math>F_{\alpha,\nu_1,\nu_2}</math> or <math display="inline">F(\alpha,\nu_1,\nu_2)</math> for the [[F-distribution]] with <math display="inline">\nu_1</math> and <math display="inline">\nu_2</math> degrees of freedom
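
The defining relation for the upper critical value can be checked numerically for the standard normal case. The sketch below uses <code>statistics.NormalDist</code> from the Python standard library; the function name <code>upper_critical_value</code> is a hypothetical helper for this illustration. The standard library does not provide the ''t'', chi-squared or ''F'' distributions, so those cases are not covered here.

```python
from statistics import NormalDist


def upper_critical_value(alpha, dist=NormalDist()):
    """Return x_alpha such that F(x_alpha) = 1 - alpha.

    dist defaults to the standard normal distribution, so the result is
    z_alpha in the notation above.
    """
    return dist.inv_cdf(1 - alpha)


z_05 = upper_critical_value(0.05)    # z_{0.05}, approximately 1.645
z_025 = upper_critical_value(0.025)  # z_{0.025}, approximately 1.960

# The value z_alpha is exceeded with probability alpha.
exceed_prob = 1 - NormalDist().cdf(z_05)
```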
 
==Linear algebra==
{{Unreferenced section|date=March 2021}}
*[[Matrix (mathematics)|Matrices]] are usually denoted by boldface capital letters, e.g. <math display="inline">\bold{A}</math>.
*[[Column vector]]s are usually denoted by boldface lowercase letters, e.g. <math display="inline">\bold{x}</math>.
*The [[transpose]] operator is denoted by either a superscript T (e.g. <math display="inline">\bold{A}^\mathrm{T}</math>) or a [[prime (symbol)|prime symbol]] (e.g. <math display="inline">\bold{A}'</math>).
*A [[row vector]] is written as the transpose of a column vector, e.g. <math display="inline">\bold{x}^\mathrm{T}</math> or <math display="inline">\bold{x}'</math>.
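
The column and row vector conventions can be made concrete with a small sketch. The Python code below stores a matrix as a list of rows, so that a column vector is an ''n'' × 1 matrix and its transpose is a 1 × ''n'' row vector. The helpers <code>transpose</code> and <code>matmul</code> are hypothetical names written for this illustration, not library functions.

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(P, Q):
    """Return the matrix product PQ for matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]


A = [[1, 2, 3],
     [4, 5, 6]]      # a 2 x 3 matrix A
x = [[1], [2], [3]]  # a 3 x 1 column vector x

A_T = transpose(A)   # the 3 x 2 transpose of A
x_T = transpose(x)   # the 1 x 3 row vector, i.e. the transpose of x

Ax = matmul(A, x)            # a 2 x 1 column vector
x_T_x = matmul(x_T, x)       # a 1 x 1 matrix holding the sum of squares of x
```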
 
==Abbreviations==
{{Unreferenced section|date=March 2021}}
Common abbreviations include:
*'''a.e.''' [[almost everywhere]]
*'''a.s.''' [[almost surely]]
* '''cdf''' [[cumulative distribution function]]<ref name=":0" />
* '''cmf''' [[cumulative mass function]]
*'''df''' [[degrees of freedom (statistics)|degrees of freedom]] (also <math>\nu</math>)<ref name=":0" />
*'''i.i.d.''' [[Independent and identically distributed random variables|independent and identically distributed]]<ref name=":0" />
*'''pdf''' [[probability density function]]<ref name=":0" />
*'''pmf''' [[probability mass function]]<ref name=":0" />
* '''r.v.''' [[random variable]]<ref name=":0" />
* '''w.p.''' with probability; '''wp1''' [[with probability 1]]
* '''i.o.''' infinitely often, i.e. <math> \{ A_n\text{ i.o.} \} = \bigcap_N\bigcup_{n\geq N} A_n </math>
== See also ==
*[[Glossary of probability and statistics]]
*[[Combination]]s and [[permutation]]s
*[[Typographical conventions in mathematical formulae]]
*[[History of mathematical notation]]
 
== References ==
{{Reflist}}
{{refbegin}}
* {{Citation |title = Recommended Standards for Statistical Symbols and Notation. COPSS Committee on Symbols and Notation |first1 = Max |last1 = Halperin |first2 = H. O. |last2 = Hartley |first3 = P. G. |last3 = Hoel |journal = The American Statistician |volume = 19 |year = 1965 |pages = 12–14 |issue = 3 |doi = 10.2307/2681417 |jstor = 2681417 }}
{{refend}}
 
== External links ==
* [http://jeff560.tripod.com/stat.html Earliest Uses of Symbols in Probability and Statistics], maintained by Jeff Miller.

{{Mathematical symbols notation language}}
 
[[Category:Probability and statistics| Notation]]