Notation in probability and statistics: Difference between revisions

Revision as of 03:08, 31 March 2020 by 2001:8003:3000:5a00:7059:1f14:e931:f91e. Edit summary: Was previously P(A \cup B) or P(A \cup B) which is same thing. Should be P(A \cup B) or P(B \cup A)
Latest revision as of 07:32, 24 June 2025 by OAbot (minor edit). Edit summary: Open access bot: url-access=subscription updated in citation with #oabot.
(44 intermediate revisions by 22 users not shown)
Line 1:
{{Short description|none}}
{{ProbabilityTopicsTOC}}
{{StatsTopicTOC}}
Line 4 ⟶ 5:
==Probability theory==
{{Unreferenced section|date=March 2021}}
* [[Random variable]]s are usually written in [[upper case]] Roman letters, such as <math display="inline">X</math> or <math display="inline">Y</math>. In this context a random variable usually stands for a quantity described in words, such as "the height of a subject" for a continuous variable, "the number of cars in the school car park" for a discrete variable, or "the colour of the next bicycle" for a categorical variable. It does not represent a single number or a single category. For instance, <math>P(X = x) </math> denotes the probability that a particular realisation of the random variable <math display="inline">X</math> (e.g., height, number of cars, or bicycle colour) equals a particular value or category <math display="inline">x</math> (e.g., 1.735 m, 52, or purple). <math display="inline">X</math> and <math display="inline">x</math> must not be confused: <math display="inline">X</math> is an idea, <math display="inline">x</math> is a value. They are related, but they do not have identical meanings.
* Particular realisations of a random variable are written in corresponding [[lower case]] letters. For example, <math display="inline">x_1,x_2, \ldots,x_n</math> could be a [[random sample|sample]] corresponding to the random variable <math display="inline">X</math>.
A cumulative probability is formally written <math>P(X\le x) </math> to distinguish the random variable from its realisation.<ref>{{Cite web |date=2021-08-09 |title=Calculating Probabilities from Cumulative Distribution Function |url=https://analystprep.com/cfa-level-1-exam/quantitative-methods/calculating-probabilities-from-cumulative-distribution-function/ |access-date=2024-02-26}}</ref>
* The probability measure is sometimes written <math>\mathbb{P} </math> to distinguish it from other functions and from a generic measure ''P'', which avoids having to state "''P'' is a probability". <math>\mathbb{P}(X\in A) </math> is short for <math>P(\{\omega \in\Omega: X(\omega) \in A\})</math>, where <math>\Omega</math> is the sample space, <math>X</math> is a random variable that is a function of <math>\omega</math> (i.e., it depends upon <math>\omega</math>), and <math>\omega</math> is some outcome of interest within the domain specified by <math>\Omega</math> (say, a particular height, or a particular colour of a car). The notation <math>\Pr(A)</math> is used as an alternative.
* <math>\mathbb{P}(A \cap B)</math> or <math>\mathbb{P}[B \cap A]</math> indicates the probability that events ''A'' and ''B'' both occur. The [[joint probability distribution]] of random variables ''X'' and ''Y'' is denoted by <math>P(X, Y)</math>, the joint probability mass function or probability density function by <math>f(x, y)</math>, and the joint cumulative distribution function by <math>F(x, y)</math>.
* <math>\mathbb{P}(A \cup B)</math> or <math>\mathbb{P}[B \cup A]</math> indicates the probability of either event ''A'' or event ''B'' occurring ("or" in this case means [[inclusive or|one or the other or both]]).
* [[sigma-algebra|σ-algebras]] are usually written with uppercase [[Calligraphy|calligraphic]] letters (e.g.
<math>\mathcal F</math> for the set of sets on which we define the probability ''P'').
* [[Probability density function]]s (pdfs) and [[probability mass function]]s are denoted by lowercase letters, e.g. <math>f(x)</math>, or <math>f_X(x)</math>.
* [[Cumulative distribution function]]s (cdfs) are denoted by uppercase letters, e.g. <math>F(x)</math>, or <math>F_X(x)</math>.
* [[Survival function]]s or complementary cumulative distribution functions are often denoted by placing an [[overbar]] over the symbol for the cumulative: <math>\overline{F}(x) =1-F(x)</math>, or denoted as <math>S(x)</math>.
* In particular, the pdf of the [[standard normal distribution]] is denoted by <math display="inline">\varphi(z)</math>, and its cdf by <math display="inline">\Phi(z)</math>.
* Some common operators:
:* <math display="inline">\mathrm{E}[X]</math>: [[expected value]] of ''X''
:* <math display="inline">\operatorname{var}[X]</math>: [[variance]] of ''X''
:* <math display="inline">\operatorname{cov}[X, Y]</math>: [[covariance]] of ''X'' and ''Y''
* The statement that ''X'' is independent of ''Y'' is often written <math>X \perp Y</math> or <math>X \perp\!\!\!\perp Y</math>, and the statement that ''X'' is independent of ''Y'' given ''W'' is often written
:<math>X \perp\!\!\!\perp Y \,|\, W </math> or
:<math>X \perp Y \,|\, W</math>
* <math>\textstyle P(A\mid B)</math>, the ''[[conditional probability]]'', is the probability of <math>\textstyle A</math> ''given'' <math>\textstyle B</math>.<ref>{{Citation |title=Probability and stochastic processes |date=2013-07-22 |url=http://dx.doi.org/10.1201/b15257-3 |work=Applied Stochastic Processes |pages=9–36 |access-date=2023-12-08 |publisher=Chapman and Hall/CRC |doi=10.1201/b15257-3 |isbn=978-0-429-16812-3 |url-access=subscription}}</ref>
==Statistics==
{{Unreferenced section|date=March 2021}}
* Greek letters (e.g.
''θ'', ''β'') are commonly used to denote unknown parameters (population parameters).<ref>{{Cite web |date=1999-02-13 |title=Letters of the Greek Alphabet and Some of Their Statistical Uses |url=https://lesn.appstate.edu/olson/EDL7150/Components/Other%20useful%20links/Greek%20Alphabet%20and%20Statistics.htm |access-date=2024-02-26 |website=les.appstate.edu/}}</ref>
* A tilde (~) denotes "has the probability distribution of".
* Placing a hat, or caret (also known as a circumflex), over a true parameter denotes an [[estimator]] of it, e.g., <math>\widehat{\theta}</math> is an estimator for <math>\theta</math>.
* The [[arithmetic mean]] of a series of values <math display="inline">x_1,x_2, \ldots, x_n</math> is often denoted by placing an "[[overbar]]" over the symbol, e.g. <math>\bar{x}</math>, pronounced "<math display="inline">x</math> bar".
* Some commonly used symbols for [[Sample (statistics)|sample]] statistics are given below:
** the [[sample mean]] <math>\bar{x}</math>,
** the [[sample variance]] <math display="inline">s^2</math>,
** the [[sample standard deviation]] <math display="inline">s</math>,
** the [[Pearson correlation coefficient|sample correlation coefficient]] <math display="inline">r</math>,
** the sample cumulants <math display="inline">k_r</math>.
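As a rough illustration of the sample statistics listed above, the following sketch computes the sample mean, sample variance, sample standard deviation, and sample correlation coefficient with Python's standard library. The data values and the variable names (<code>x_bar</code>, <code>s2</code>, <code>s</code>, <code>r</code>) are assumptions chosen to mirror the symbols; they do not come from the article.

```python
import math
import statistics

# Hypothetical paired sample (x_i, y_i), i = 1..n; values are illustrative only.
x = [2.0, 4.0, 4.0, 5.0, 7.0]
y = [1.0, 3.0, 2.0, 5.0, 6.0]

x_bar = statistics.mean(x)      # sample mean, written x-bar
s2 = statistics.variance(x)     # sample variance s^2 (divides by n - 1)
s = statistics.stdev(x)         # sample standard deviation s

# Sample correlation coefficient r, computed from its definition
# as the cross-product sum over the root of the two sums of squares.
y_bar = statistics.mean(y)
sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
sxx = sum((a - x_bar) ** 2 for a in x)
syy = sum((b - y_bar) ** 2 for b in y)
r = sxy / math.sqrt(sxx * syy)

print(x_bar, s2, s, r)
```

Note that <code>statistics.variance</code> uses the <math display="inline">n-1</math> divisor, which matches the usual meaning of <math display="inline">s^2</math>; the population variance <math display="inline">\sigma^2</math> would use <code>statistics.pvariance</code> instead.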
* Some commonly used symbols for [[Statistical population|population]] parameters are given below:
** the population mean <math display="inline">\mu</math>,
** the population variance <math display="inline">\sigma^2</math>,
** the population standard deviation <math display="inline">\sigma</math>,
** the population [[Pearson product-moment correlation coefficient|correlation]] <math display="inline">\rho</math>,
** the population [[cumulant]]s <math display="inline">\kappa_r</math>,
* <math>x_{(k)}</math> is used for the <math>k^\text{th}</math> [[order statistic]], where <math>x_{(1)}</math> is the sample minimum and <math>x_{(n)}</math> is the sample maximum from a total sample size <math display="inline">n</math>.<ref>{{Cite web |title=Order Statistics |url=https://www.colorado.edu/amath/sites/default/files/attached-files/order_stats.pdf |access-date=2024-02-26 |website=colorado.edu}}</ref>
==Critical values==
{{Unreferenced section|date=March 2021}}
The ''α''-level upper [[critical value (statistics)|critical value]] of a [[probability distribution]] is the value exceeded with probability <math display="inline">\alpha</math>, that is, the value <math display="inline">x_\alpha</math> such that <math display="inline">F(x_\alpha) = 1-\alpha</math>, where <math display="inline">F</math> is the cumulative distribution function.
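The definition above can be checked numerically. The sketch below is a minimal example, assuming a standard normal distribution and an illustrative level of 0.05 (neither is prescribed by the text); it finds the upper critical value by inverting the cumulative distribution function and then confirms that the value is exceeded with probability <math display="inline">\alpha</math>.

```python
from statistics import NormalDist

alpha = 0.05                        # illustrative level, an assumption
F = NormalDist(mu=0.0, sigma=1.0)   # standard normal distribution

# Upper critical value: the point z_alpha with F(z_alpha) = 1 - alpha.
z_alpha = F.inv_cdf(1 - alpha)

# Defining property: the value is exceeded with probability alpha.
tail = 1 - F.cdf(z_alpha)

print(z_alpha, tail)
```

For other distributions the same idea applies with the corresponding inverse cumulative distribution function (the quantile function) evaluated at <math display="inline">1-\alpha</math>.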
There are standard notations for the upper critical values of some commonly used distributions in statistics:
* <math display="inline">z_\alpha</math> or <math display="inline">z(\alpha)</math> for the [[standard normal distribution]]
* <math display="inline">t_{\alpha,\nu}</math> or <math display="inline">t(\alpha,\nu)</math> for the [[Student's t-distribution|''t''-distribution]] with <math display="inline">\nu</math> [[Degrees of freedom (statistics)|degrees of freedom]]
* <math>{\chi_{\alpha,\nu}}^2</math> or <math>{\chi}^{2}(\alpha,\nu)</math> for the [[chi-squared distribution]] with <math display="inline">\nu</math> degrees of freedom
* <math>F_{\alpha,\nu_1,\nu_2}</math> or <math display="inline">F(\alpha,\nu_1,\nu_2)</math> for the [[F-distribution]] with <math display="inline">\nu_1</math> and <math display="inline">\nu_2</math> degrees of freedom
==Linear algebra==
{{Unreferenced section|date=March 2021}}
* [[Matrix (mathematics)|Matrices]] are usually denoted by boldface capital letters, e.g. <math display="inline">\bold{A}</math>.
* [[Column vector]]s are usually denoted by boldface lowercase letters, e.g. <math display="inline">\bold{x}</math>.
* The [[transpose]] operator is denoted by either a superscript T (e.g. <math display="inline">\bold{A}^\mathrm{T}</math>) or a [[prime (symbol)|prime symbol]] (e.g.
<math display="inline">\bold{A}'</math>).
* A [[row vector]] is written as the transpose of a column vector, e.g. <math display="inline">\bold{x}^\mathrm{T}</math> or <math display="inline">\bold{x}'</math>.
==Abbreviations==
{{Unreferenced section|date=March 2021}}
Common abbreviations include:
* '''a.e.''' [[almost everywhere]]
Line 72 ⟶ 78:
== See also ==
* [[Glossary of probability and statistics]]
* [[Combination]]s and [[permutation]]s
* [[Typographical conventions in mathematical formulae]]
* [[History of mathematical notation]]
== References ==
{{Reflist}}
{{refbegin}}
* {{Citation |title = Recommended Standards for Statistical Symbols and Notation. COPSS Committee on Symbols and Notation |first1 = Max |last1 = Halperin |first2 = H. O. |last2 = Hartley |first3 = P. G. |last3 = Hoel |journal = The American Statistician |volume = 19 |year = 1965 |pages = 12–14 |issue = 3 |doi = 10.2307/2681417 |jstor = 2681417 }}
{{refend}}
== External links ==
* [http://jeff560.tripod.com/stat.html Earliest Uses of Symbols in Probability and Statistics], maintained by Jeff Miller.
{{Mathematical symbols notation language}}
[[Category:Probability and statistics| Notation]]