Talk:Cumulative distribution function: Difference between revisions

Wiki me (talk | contribs)
(104 intermediate revisions by 52 users not shown)
{{WikiProject banner shell|class=C|vital=yes|1=
{{WikiProject Mathematics|importance = high}}
{{WikiProject Statistics|importance = high}}
}}
{{annual readership}}
{{archive box|auto=yes}}
 
==Distribution function==
This page had a redirect from [[distribution function]], which I've now made into its own article describing a related but distinct concept in physics. I'll try to modify the pages pointing here through that redirect so that the net change in the wikipedia is minimal. [[User:SMesser|SMesser]] 16:12, 24 Feb 2005 (UTC)
: I added a reference to here on the "distribution" page so that "distribution function" appears separately for statistics and for physics [[User:Melcombe|Melcombe]] ([[User talk:Melcombe|talk]]) 16:35, 21 February 2008 (UTC)

== Cumulative density function ==

I originally created the redirect [[cumulative density function]] in March to point to this article. Why? A simple google test for [http://www.google.com/search?hl=en&lr=&q=%22cumulative+density+function%22&btnG=Search cumulative density function] shows 41,000 hits while [http://www.google.com/search?hl=en&lr=&q=%22cumulative+distribution+function%22&btnG=Search cumulative distribution function] shows 327,000 hits. Michael Hardy's contention is that "cumulative density" is patent nonsense (see deletion log) and a redirect shouldn't exist.

Regardless of the correctness of "cumulative density", there still is significant usage of it in reference to this article and its content. "Cumulative density function" is even [http://www.ccl.rutgers.edu/~ssi/thesis/thesis-node52.html used in a doctoral thesis]. Hardly patent nonsense.

Even if "cumulative density function" is incorrect, someone still may look for it, find nothing, and create an article paralleling this article. If you don't buy the "it's not patent nonsense, or even just nonsense" argument, then I invoke (from [[WP:R#When should we delete a redirect?]]) that it increases accidental linking and therefore should not be deleted.

Michael, if you have a problem with the correctness of "cumulative density" then by all means add a section here or change the redirect to an article and explain it there. Either way, [[cumulative density function]] needs to be a valid link. [[User:Cburnett|Cburnett]] 14:42, 14 December 2005 (UTC)
: I just saw this debate now. I've changed the redirect page into a navigation page explaining the severe confusion. [[User:Michael Hardy|Michael Hardy]] 21:59, 20 July 2007 (UTC)

== Logistic? ==

[[User:Thatsme314|Thatsme314]] asserted in an edit https://en.wikipedia.org/w/index.php?title=Cumulative_distribution_function&oldid=904343161...

"In the case of a [[continuous distribution]], it gives the area under the [[probability density function]] from minus infinity to <math>x</math>, and is a [[Logistic distribution|logistic]] [[Logistic distribution|distribution]]. Cumulative distribution functions are also used to specify the distribution of [[multivariate random variable]]s."

Am I misunderstanding what is meant here? The obvious reading is clearly wrong, but perhaps there is a less obvious reading that I am missing. Best case, it still needed clarification, so I deleted it.

--[[User:Livingthingdan|Livingthingdan]] ([[User talk:Livingthingdan|talk]]) 07:36, 7 August 2019 (UTC)

== Serious Error ==

It is written that if the cumulative distribution function is continuous then X is absolutely continuous.

This is just false: you need F continuous and with a continuous derivative!
 
== How is this a debate? ==

Please be consistent! In probability theory the integral of the "probability density function" (PDF) is called the "cumulative density function" (CDF) or simply the "distribution function". Thus the adjective cumulative.
See http://mathworld.wolfram.com/DistributionFunction.html

The term "Cumulative distribution function" is nonsense because it implies the integral of the integral of the PDF. Utter nonsense! Please correct this link! [[User:lese]] 4 Nov 2007.

The word "cumulative distribution function" is used in many elementary books. It is a pretty stupid term, but we are stuck with it. The best we can do is acknowledge that the term is out there, that it should simply be "distribution function", and that its definition MUST be with <= or else many tables, software routines, etc. will be incorrectly used. <small><span class="autosigned">—Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Jmsteele|Jmsteele]] ([[User talk:Jmsteele|talk]] • [[Special:Contributions/Jmsteele|contribs]]) </span></small><!-- Template:Unsigned -->
: I don't think its a stupid term and I have no problem with it. On the other hand "cumulative density function" is a horribly stupid term. [[User:Michael Hardy|Michael Hardy]] 21:59, 20 July 2007 (UTC)
:"Cumulative distribution function" appears in Everitt's Dictionary of Statistics while "cumulative density function" does not. Similarly in the Unwin Dictionary of Mathematics. [[User:Melcombe|Melcombe]] ([[User talk:Melcombe|talk]]) 16:42, 21 February 2008 (UTC)

EDIT: My sincere apologies, but I don't know where else to report this. Unlike other Wikipedia pages, when this page is googled, its title shows up with the first letter of the first word uncapitalized. Try it. Cheers. -- Anonymous user

== Consistency cadlag ==

while the distribution is required to be cadlag? a discussion section on this will be valuable. [[User:Jackzhp|Jackzhp]] ([[User talk:Jackzhp|talk]]) 18:46, 15 August 2009 (UTC)
: Moreover there is a tradition here (I suppose because of [[Kolmogorov]]'s original notation, but I'm not sure) that the CDF should be left continuous... [[User:Drkazmer|Drkazmer]] [[Image:Crystal 128 penguin.png|17px]] <sup>[[User vita:Drkazmer|Just tell me...]]</sup> 23:01, 2 January 2012 (UTC)

==Complementary Cumulative Distribution function==

I assume there is an error after "Proof: Assuming X has density function f, we have for any c > 0", regarding integration limits for E(X)? <small><span class="autosigned">—Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Amir bike|Amir bike]] ([[User talk:Amir bike|talk]] • [[Special:Contributions/Amir bike|contribs]]) 05:54, 19 May 2011 (UTC)</span></small><!-- Template:Unsigned --> <!--Autosigned by SineBot-->

It is said that [[Markov's inequality]] states that: <math>\bar F(x) \leq \frac{\mathbb E(X)}{x} </math>
However it is only correct in the continuous case, as in the discrete case <math>P(X \geq x) = \bar F(x) + P(X=x)</math>. Although the inequality still holds, the current version is weaker than the proper Markov's inequality. <small><span class="autosigned">— Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Colinfang|Colinfang]] ([[User talk:Colinfang|talk]] • [[Special:Contributions/Colinfang|contribs]]) 18:59, 4 March 2012 (UTC)</span></small><!-- Template:Unsigned --> <!--Autosigned by SineBot-->
:The current version is the standard statement of Markov's inequality found in reference books. If there is a stronger result, it could be stated with a citation. If the stronger result is still generally known as Markov's inequality, then the [[Markov's inequality]] article could be updated as well. But the version in the article (now) states valid conditions under which the results hold. [[User:Melcombe|Melcombe]] ([[User talk:Melcombe|talk]]) 16:58, 15 April 2012 (UTC)
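For readers following the Markov's-inequality point in this thread, the one-line derivation behind both versions (a sketch, assuming <math>X \ge 0</math> and <math>x > 0</math>):

```latex
% For X >= 0 and x > 0:
\operatorname{E}(X) \;\ge\; \operatorname{E}\!\left[X \,\mathbf{1}_{\{X \ge x\}}\right]
\;\ge\; x \,\operatorname{P}(X \ge x)
\;=\; x\left(\bar F(x) + \operatorname{P}(X = x)\right)
\;\ge\; x \,\bar F(x).
```

The final step simply drops the point mass <math>\operatorname{P}(X = x)</math>, which is why the version stated in terms of <math>\bar F</math> alone is slightly weaker in the discrete case, exactly as noted above.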
 
== Utility ==

It would be helpful if the entry included a discussion of the utility of performing a CDF plot: when to perform one, and what information is learned from performing the CDF plot. What real-world applications would this include? Maybe an example would be helpful. <span style="font-size: smaller;" class="autosigned">— Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[Special:Contributions/209.252.149.162|209.252.149.162]] ([[User talk:209.252.149.162|talk]]) 14:10, 1 August 2011 (UTC)</span><!-- Template:Unsigned IP --> <!--Autosigned by SineBot-->

== Doesn't make sense ==

"Note that in the definition above, the "less or equal" sign, '≤' could be replaced with "strictly less" '<'. This would yield a different function, but either of the two functions can be readily derived from the other. The only thing to remember is to stick to either definition as mixing them will lead to incorrect results. In English-speaking countries the convention that uses the weak inequality (≤) rather than the strict inequality (<) is nearly always used."

Surely it doesn't matter at all! Since the probability of one single value is 0, the two interval boundaries can be included or excluded.
:If you're only interested in integrals. [[User:Gerbrant|Shinobu]] 22:50, 7 June 2006 (UTC)
::The convention in the entire world is to use '≤' and it matters HUGELY for the binomial, poisson, negative binomial, etc. To use anything else and to rely upon the formulas in any text would lead to substantial errors, say when one is using a table of the binomial distribution. [[User:Jmsteele|Jmsteele]] 01:18, 21 October 2006 (UTC)
:I'm not sure about that. The definition: F(x) = P(X <= x)
:Because P(X <= x) = P(X < x) + P(X = x), F(x) = P(X < x) + P(X = x)
:Now for normal functions (the kind of functions you mention) P(X = x) = 0.
:Of course, there are things like delta functions, but that's not what you're talking about. [[User:Gerbrant|Shinobu]] 16:27, 27 October 2006 (UTC)
 
Please consider some very important distributions: the Binomial, Poisson, Hypergeometric. You simply MUST use the definition F(x) = P(X <= x) or else all software packages and all tables will be misunderstood. PS I am a professor of statistics, so give me some slack here. This is not a matter of delta functions, it is a matter of sums of coin flips ... very basic stuff.

== Table of cdfs ==
 
I have moved the recently added table of cdfs to here for discussion/revision. The version added was ...

{|class="wikitable"
|-
! Distribution
! Cumulative Density function <math>\! F_X(x)</math>
|-
| [[Binomial distribution|Binomial]] B(''n, p'')
| &nbsp; <math>\! \textstyle I_{1-p}(n - k, 1 + k)</math>
|-
| [[Negative binomial distribution|Negative binomial]] NB(''r, p'')
| &nbsp; <math>\! 1 - I_{p}(k+1, r) </math>
|-
| [[Poisson distribution|Poisson]] Pois(<math>\lambda</math>)
| &nbsp; <math>\! e^{-\lambda} \sum_{i=0}^{k} \frac{\lambda^i}{i!}</math>
|-
| [[Uniform distribution (continuous)|Uniform]] U(''a, b'')
| &nbsp; <math>\! \frac{x-a}{b-a}</math> for <math>x \in (a,b)</math>
|-
| [[Normal distribution|Normal]] ''N''(''µ, <math>\sigma^2</math>'')
| &nbsp; <math>\! \frac12\left[1 + \operatorname{erf}\left( \frac{x-\mu}{\sqrt{2\sigma^2}}\right)\right] </math>
|-
| [[Chi-squared distribution|Chi-squared]] <math>\Chi_k^2</math>
| &nbsp; <math>\! \frac{1}{\Gamma(k/2)}\;\gamma(k/2,\,x/2)</math>
|-
| [[Cauchy distribution|Cauchy]] Cauchy(''µ, <math>\theta</math>'')
| &nbsp; <math>\! \frac{1}{\pi} \arctan\left(\frac{x-x_0}{\gamma}\right)+\frac{1}{2}</math>
|-
| [[Gamma distribution|Gamma]] G(''k, <math>\theta</math>'')
| &nbsp; <math>\!\frac{\gamma(k, x/\theta)}{\Gamma(k)}</math>
|-
| [[Exponential distribution|Exponential]] Exp(''<math>\lambda</math>'')
| &nbsp; <math>\! 1 - e^{-\lambda x}</math>
|}

There are several problems here, particularly with inconsistent notations. But there are structural problems in defining the cdfs of the discrete distributions, as the formulae given are only valid at the integer points (within the range of the distribution) and would give incorrect values of the cdf at non-integer values. Also several of the functions involved require definitions/wikilinks. So, if the table is to be included, thought needs to be given to possibly dividing it into discrete/continuous tables and/or adding extra columns. [[User:Melcombe|Melcombe]] ([[User talk:Melcombe|talk]]) 09:25, 21 October 2011 (UTC)
:: {{ping|Melcombe}} I certainly hope we won't see tables that say "cumulative density function" instead of "cumulative distribution function". [[User:Michael Hardy|Michael Hardy]] ([[User talk:Michael Hardy|talk]]) 18:40, 23 February 2019 (UTC)

== F(x) vs Phi(x) ==

I completely disagree with "It is conventional to use a capital F for a cumulative distribution function, in contrast to the lower-case f used for probability density functions and probability mass functions." From all the literature I have read, <math>\Phi(x) \!</math> is the cumulative distribution function and <math>\phi(x) \!</math> is used for probability density/mass functions. Where's the reference to make such a bold claim that F and f are convention? See the [[probit]] article which uses <math>\Phi^{-1}(x) \!</math> for the inverse of the cdf. -- [[User:Thoreaulylazy|Thoreaulylazy]] 19:13, 3 October 2006 (UTC)
:There is no such convention - you can pick any symbol you like, of course. It is common practice to use the capital for the cdf, because it's the primitive of the df. I've seen phi in quantum mechanical books, but I've also seen f and rho. [[User:Gerbrant|Shinobu]] 22:58, 3 October 2006 (UTC)
:From ''all'' the literature I have read, the pair of F and f was the convention. I don't mean to say that Φ and φ are wrong, but how can you be so sure to declare something else as a bold claim? Many different fields have different notational conventions, and we just have to accept it. [[User:Musiphil|Musiphil]] 07:03, 3 December 2006 (UTC)
:This is a collapsed distinction. One uses Phi for the normal distribution and phi for the normal density. These are reserved symbols for these purposes --- see any statistics book. One uses F and f for the generic distributions and densities, but these are not reserved. In many books and papers one will find G g, H h, etc. Each time the capital represents the distribution and the lower case the density.

== citation needed.... really? ==

I think it is a little bit ridiculous to expect a citation that a CDF is càdlàg. It is an almost trivial observation that follows directly from the probability space axioms and the definition of a càdlàg function. Surely this is a routine calculation. --[[Special:Contributions/217.84.60.220|217.84.60.220]] ([[User talk:217.84.60.220|talk]]) 11:32, 3 November 2012 (UTC)

== Please, for didactic, show area relation ==

EXAMPLE

http://beyondbitsandatomsblog.stanford.edu/spring2010/files/2010/04/CdfAndPdf.gif <span style="font-size: smaller;" class="autosigned">— Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[Special:Contributions/187.66.187.183|187.66.187.183]] ([[User talk:187.66.187.183|talk]]) 07:25, 3 February 2013 (UTC)</span><!-- Template:Unsigned IP --> <!--Autosigned by SineBot-->

== Programming algorithm ==

I've been looking for a better algorithm to generate a random value based on an arbitrary CDF (better than the one I wrote). For example, if one would like to obtain a random value with a "flat" distribution, one can use the 'rand()' function in C's math.h. However, I wrote this function to use an arbitrary function to generate the random value:

 #include <stdlib.h>  /* rand, RAND_MAX */
 
 /* xmin and xmax are the range of outputs you want
    ymin and ymax are the actual limits of the function you want
    function is a function pointer that points to the CDF */
 long double randfunc(double xmin, double xmax,
                      long double (*function)(long double),
                      double ymin, double ymax)
 {
     long double val;
     while (1) {
         if ((ymax - ymin) * (rand() / ((long double)RAND_MAX + 1)) + ymin <
             function(val = (xmax - xmin) * (rand() / ((long double)RAND_MAX + 1)) + xmin))
             return val;
     }
 }

I was trying to find a way to do it faster/better. If anyone knows of anything.. let me know. [[User:Fresheneesz|Fresheneesz]] 07:53, 27 December 2006 (UTC)

:Wikipedia really isn't the place to ask these sorts of questions. The talk pages are more for discussion on the articles themselves. Anyway, i will tell you that Donald Knuth's textbook Numerical Recipes in C has a good dissertation on random number generation and also includes algorithms. I would further advise that you read the text, not just implement the algorithms listed there, it's quite good! [[User:User A1|User A1]] 13:48, 12 March 2007 (UTC)

:: I think it'd be nice to have something on algorithms on this page. I have actually found a better answer. It involves either integrating the CDF and using the definite integral instead of an indefinite integral, or, if no definite integral is possible, preintegrating the function and using the numbers prerendered in memory. [[User:Fresheneesz|Fresheneesz]] 02:39, 13 March 2007 (UTC)

::: This is a well-understood problem that has been solved many times over. However, to really appreciate the solutions, I recommend that you pick up a graduate-level textbook on random variables (for example, the book by Papoulis and Pillai). I think you'll find that given an arbitrary CDF and a random variable that is uniformly distributed from 0 to 1, the inverse of the CDF will transform the uniformly distributed random variable into a random variable with that CDF. That is, if your desired CDF is F, the function F<sup>-1</sup> will transform a random variable distributed between 0 and 1 to a random variable distributed by F. This can be used to motivate such algorithms. HOWEVER, I think you'll find that if you know more about the particular random variable you are generating, there are much more efficient ways to generate that random variable from a uniform random variable. Again, an understanding of the underlying probability will greatly simplify the generation of such algorithms. (students of probability are often asked to generate such algorithms as homework problems in, for example, [[MATLAB]]) --[[User:TedPavlic|TedPavlic]] 21:19, 8 April 2007 (UTC)
 
== cdf vs pdf ==

Hello,

i removed the comment that '''probability distribution function''' is the same as CDF, which i assert to be wrong. My reference is "Probability and Statistics for Engineering and the Sciences", p. 140 (J. Devore). The PDF is the same as the probability density function, not the CDF. The CDF is the integral of the PDF, not the PDF itself.

Please comment. [[User:129.78.208.4|129.78.208.4]] 05:28, 12 March 2007 (UTC)

:The "[[probability distribution function]]" is the same as the "[[cumulative distribution function]]" (CDF) and the "[[distribution function]]". The "[[probability density function]]" (PDF) is the [[derivative]] of the probability distribution function. See e.g. [http://mathworld.wolfram.com/DistributionFunction.html]. --[[User:X-Bert|X-Bert]] 22:22, 7 April 2007 (UTC)
:For more information, you should look into [[measure theory]], which is the basis for probability. Originally, the word '''probability''' was prepended to measure-theoretic concepts to imply a special structure of the [[measure]] being used. However, because probability is now used by many who do not have the mathematical sophistication for measure theory, lots of other terms have been introduced (often accidentally) to hide the roots of probability. Thus, the language is now quite sloppy. Reviewing the measure-theoretic roots of probability clears up any confusion about why the terms that are used in probability have the names that they do. --[[User:TedPavlic|TedPavlic]] 21:10, 8 April 2007 (UTC)

I would also like to see ''probability distribution function'' removed as an alternative name for CDF, as it only confuses readers, since ''[[probability distribution function]]'' can refer to the probability mass/density function<ref>[http://cnx.org/content/m13466/latest/]</ref><ref>http://mathworld.wolfram.com/Probability.html</ref> or the cumulative distribution function depending on the author. If it is included as an alternative name, then I think the different interpretations need to be pointed out in the same paragraph. [[User:Wiki me|Wiki me]] ([[User talk:Wiki me|talk]]) 17:31, 3 March 2009 (UTC)

== CDF is definitely LEFT-continuous. ==

CDF must be '''left-continuous''', not right as stated on the wiki page.

Source: current ongoing University studies, 3 separate professors, books from 4 different authors. <span style="font-size: smaller;" class="autosigned">— Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[Special:Contributions/213.181.200.159|213.181.200.159]] ([[User talk:213.181.200.159|talk]]) 07:46, 6 March 2013 (UTC)</span><!-- Template:Unsigned IP --> <!--Autosigned by SineBot-->

:This article follows the convention reached via the link [[right-continuous]]. This is "continuous from the right". Perhaps you are thinking of "continuous to the left". [[Special:Contributions/81.98.35.149|81.98.35.149]] ([[User talk:81.98.35.149|talk]]) 11:34, 6 March 2013 (UTC)

I'd like to mention that some textbooks define the CDF at a given point, F_X(a), as the probability that X is ''strictly'' less than the constant a, rather than less than or equal to a. This is where the confusion of this talk section is coming from. --[[Special:Contributions/128.42.66.81|128.42.66.81]] ([[User talk:128.42.66.81|talk]]) 20:23, 25 August 2016 (UTC)

== Not that redundant? ==

The passage that I deleted but which was restored said
 
:'''Point probability'''

:''The "point probability" that ''X'' is exactly ''b'' can be found as''

::<math>\operatorname{P}(X=b) = F(b) - \lim_{x \to b^{-}} F(x).</math>

:''This equals zero if ''F'' is continuous at ''x''.''

However, at the end of the section "Definition" it says

:''In the case of a random variable ''X'' which has distribution having a discrete component at a value ''x''<sub>0</sub>,''

::<math> \operatorname{P}(X=x_0) =F(x_0)-F(x_0-) ,</math>

:''where F(''x''<sub>0</sub>-) denotes the limit from the left of F at ''x''<sub>0</sub>: i.e. lim ''F''(''y'') as ''y'' increases towards ''x''<sub>0</sub>.''

What I deleted looks identical to that, except that it doesn't include the sentence ''This equals zero if ''F'' is continuous at ''x''.''

I propose that we re-delete it but put the last-mentioned sentence into the existing section.

::Okay, no problem. [[User:Nijdam|Nijdam]] ([[User talk:Nijdam|talk]]) 07:08, 19 April 2013 (UTC)

== Properties Notation ==

The formula after 'If X is a discrete random variable, then it attains values x1, x2, ... with probability pi = P(xi)' has the final sum of p(xi); why would this not be the sum of pi, since we have already introduced pi? Also, I think pi = P(xi) should be introduced as pi = P(X=xi) for clarity. [[User:Chrislawrence5|Chrislawrence5]] 17:31, 16 April 2007 (UTC)

== Limits of integration of F(x) ==

I'll preface my comment by saying that I am not a mathematician, so I may be off base. However, should there be some sort of reminder statement that the limits (especially the lower limit, <math>-\infty</math>) of integration for the expression

:<math>F(x) = \int_{-\infty}^x f(t)\,dt</math>

should also be compatible with the range of applicability of <math>f(t)</math>? Some distributions are not defined over the entire range of <math>t</math>. I was scratching my head confirming the CDF for the [[Pareto distribution]] starting with the PDF and couldn't get the listed answer until I realized this. Perhaps this would be obvious to some, but I suggest it to others who are more up on this stuff as a possible point of clarification. I will defer this change, however, to someone who is more of an authority on this. --[[User:Lacomj|Lacomj]] ([[User talk:Lacomj|talk]]) 21:07, 20 December 2008 (UTC)

:The way it is defined, the Pareto distribution's pdf is zero below ''x''<sub>''m''</sub> in the notation of the wikipedia page on the distribution, so you have to see it as

::<math>F(x) = \int_{-\infty}^x f(t)\,dt,</math>
::<math> = \int_{-\infty}^{x_m} 0 \,dt + \int_{x_m}^x \frac{k\,x_\mathrm{m}^k}{t^{k+1}}\,dt,</math>
::<math> = \int_{x_m}^x \frac{k\,x_\mathrm{m}^k}{t^{k+1}}\,dt.</math>

:I think it would be most correct to point out that the Pareto is zero off its support on that page, but this might appear to be redundant to others. [[User:Pdbailey|PDBailey]] ([[User talk:Pdbailey|talk]]) 00:04, 3 March 2009 (UTC)

==The reasons==

What is the reason to define F(x)=P[X<=x]? Why not F(x)=P[X>=x]? Just convention, or what? [[User:Jackzhp|Jackzhp]] ([[User talk:Jackzhp|talk]]) 15:24, 2 March 2009 (UTC)
:Yes, convention. [[User:Pdbailey|PDBailey]] ([[User talk:Pdbailey|talk]]) 23:57, 2 March 2009 (UTC)

== cdf notation ==

I went through and changed the notation <math>F_X(x)</math> to <math>F(x)</math> everywhere in the definition section to try to obtain notational consistency through the article, but the change was reverted by Nijdam with edit summary "Difference between cdf of X and just a cdf". But that conflicts with much notation in the article that uses F(x) for the cdf of X. In the Properties section:

:<i>the CDF of ''X'' will be discontinuous at the points ''x''<sub>''i''</sub> and constant in between:</i>

::<math>F(x) = \operatorname{P}(X\leq x) = \sum_{x_i \leq x} \operatorname{P}(X = x_i) = \sum_{x_i \leq x} p(x_i).</math>

:<i>If the CDF ''F'' of ''X'' is [[continuous function|continuous]], then ''X'' is a [[continuous random variable]]; if furthermore ''F'' is [[absolute continuity|absolutely continuous]], then there exists a [[Lebesgue integral|Lebesgue-integrable]] function ''f''(''x'') such that</i>

::<math>F(b)-F(a) = \operatorname{P}(a< X\leq b) = \int_a^b f(x)\,dx</math>

:<i>for all real numbers ''a'' and ''b''. The function ''f'' is equal to the [[derivative]] of ''F'' [[almost everywhere]], and it is called the [[probability density function]] of the distribution of ''X''.</i>
 
In the Examples section:
 
:<i>As an example, suppose ''X'' is [[uniform distribution (continuous)|uniformly distributed]] on the unit interval [0,&nbsp;1]. Then the CDF of X is given by</i>
 
::<math>F(x) = \begin{cases}
0 &:\ x < 0\\
x &:\ 0 \le x < 1\\
1 &:\ 1 \le x.
\end{cases}</math>
 
In the Derived functions section:
 
::<math>\bar F(x) = \operatorname{P}(X > x) = 1 - F(x).</math>
 
In the multivariate case section:
 
:''for a pair of random variables ''X,Y'', the joint CDF <math>F</math> is given by''
 
::<math>F(x,y) = \operatorname{P}(X\leq x,Y\leq y),</math>
 
So we need to establish consistency of notation -- either use F<sub>X</sub> ''every'' time we mention a cdf "of X", or else never. Your thoughts? [[User:Duoduoduo|Duoduoduo]] ([[User talk:Duoduoduo|talk]]) 15:08, 17 May 2013 (UTC)
 
:In the literature both <math>F_X</math> and <math>F</math> are used, the latter for ease of notation, and only if there is no confusion about the random variable. In the examples you mention there is not always an inconsistency. In the first one you're right, but if it reads: If the CDF ''F'' of ''X'' ..., it merely states <math>F_X=F</math>, where <math>F</math> is some specified function. [[User:Nijdam|Nijdam]] ([[User talk:Nijdam|talk]]) 06:11, 18 May 2013 (UTC)
 
::But there's no confusion anywhere in the article regardless of which is used. So why not use the same one everywhere? [[User:Duoduoduo|Duoduoduo]] ([[User talk:Duoduoduo|talk]]) 12:30, 18 May 2013 (UTC)
 
== Continuous Random Variables ==
 
Elsewhere on Wikipedia, and in many published books, a continuous random variable has an ''absolutely'' continuous c.d.f., not merely continuous as stated in the properties section. I suggest that this page should also state that the c.d.f. is absolutely continuous so that there is a p.d.f. [[User:Paulruud|Paulruud]] ([[User talk:Paulruud|talk]]) 17:26, 30 March 2015 (UTC)
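For reference, the standard counterexample behind that distinction is the Cantor function: continuity of the c.d.f. alone does not guarantee a density (a sketch of the well-known fact):

```latex
% Let F be the Cantor function (continuous, nondecreasing, F(0)=0, F(1)=1).
% Then F'(t) = 0 for almost every t, so
\int_{-\infty}^{x} F'(t)\,dt = 0 \;\neq\; F(x) \qquad \text{for } 0 < x < 1,
% hence there is no f with F(x) = \int_{-\infty}^{x} f(t)\,dt:
% F is continuous but not absolutely continuous.
```

So stating "absolutely continuous" in the properties section, as suggested, is the condition that actually delivers a p.d.f.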
 
== Definition as expectation value ==
 
I found this in the introduction of [[Characteristic function (probability theory)|Characteristic function]]:
 
''The characteristic function provides an alternative way for describing a [[random variable]]. Similarly to the [[cumulative distribution function]]''
:<math>F_X(x) = \operatorname{E} \left [\mathbf{1}_{\{X\leq x\}} \right],</math>
 
<i>( where '''1'''<sub>{''X ≤ x''}</sub> is the [[indicator function]] — it is equal to 1 when {{nowrap|''X ≤ x''}}, and zero otherwise), which completely determines behavior and properties of the probability distribution of the random variable ''X'', the '''characteristic function'''</i>
: <math> \varphi_X(t) = \operatorname{E} \left [ e^{itX} \right ]</math>
 
''also completely determines behavior and properties of the probability distribution of the random variable ''X''. The two approaches are equivalent in the sense that knowing one of the functions it is always possible to find the other, yet they both provide different insight for understanding the features of the random variable.''
 
The notation <math>F_X(x) = \operatorname{E} \left [\mathbf{1}_{\{X\leq x\}} \right]</math> is so confusing that I ask the community for a clarification on that page ([[Characteristic function (probability theory)|Characteristic function]]).
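For what it's worth, the notation unwinds in one line, since the indicator takes only the values 0 and 1:

```latex
\operatorname{E}\left[\mathbf{1}_{\{X \le x\}}\right]
= 1 \cdot \operatorname{P}(X \le x) + 0 \cdot \operatorname{P}(X > x)
= \operatorname{P}(X \le x) = F_X(x).
```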
 
== Kind of reciprocity ==
 
There is a theorem stating that for each F nondecreasing, right-continuous, with limits 0 at -∞ and 1 at +∞, there exists a ''unique'' probability measure μ such that F is the cdf of a random variable X distributed according to μ. I think this important theorem should be displayed in this page.
 
[[User:GizTwelve|GizTwelve]] ([[User talk:GizTwelve|talk]]) 12:28, 5 October 2017 (UTC)
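A sketch of the usual construction behind that theorem (the quantile transform on the unit interval, with <math>\lambda</math> Lebesgue measure):

```latex
\text{On } \big((0,1),\, \mathcal{B}(0,1),\, \lambda\big) \text{ define } \quad
X(\omega) = \inf\{x \in \mathbb{R} : F(x) \ge \omega\}.
% Right-continuity of F gives {omega : X(omega) <= x} = (0, F(x)], hence
\lambda\big(\{\omega : X(\omega) \le x\}\big) = F(x),
% and the distribution (pushforward) of X is the desired measure mu.
```

Uniqueness follows because the half-lines <math>(-\infty, x]</math> generate the Borel σ-algebra and form a π-system.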
 
== Colors in the first graph are too similar ==
 
Please use clearly distinguishable colors, that work also with color vision deficiency. <!-- Template:Unsigned IP --><small class="autosigned">—&nbsp;Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[Special:Contributions/88.219.233.164|88.219.233.164]] ([[User talk:88.219.233.164#top|talk]]) 12:31, 11 December 2018 (UTC)</small> <!--Autosigned by SineBot-->
 
== Connection to measure theory ==
 
I would like to know the full connection to measure theory. It seems if I have a probabilistic measure space <math>(A,\mathcal{A},P_A)</math> and a measurable function <math>X\colon A\to \mathbb{R}</math> it might be the pushforward measure <math>X_* P_A= P_A \circ X^{-1}</math> which I guess would be a mapping <math> (\mathbb{R}, \mathcal{B}(\mathbb{R}), ||\cdot||_2)\mapsto \mathbb{R}</math> which for some reason is then restricted to half open sets <math>F_X \overset{?}{:=} (X_* P_A)\restriction [-\infty , r)\times \mathbb{R}</math>. But this is original research and feels a bit patchy (unclear to me how to generalize to random variables with values in <math>\mathbb{R}^n</math>) so please if someone who has connected the dots could add the connection. <!-- Template:Unsigned --><small class="autosigned">—&nbsp;Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Rostspik|Rostspik]] ([[User talk:Rostspik#top|talk]] • [[Special:Contributions/Rostspik|contribs]]) 05:58, 27 April 2019 (UTC)</small> <!--Autosigned by SineBot-->
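The standard way those dots are connected, for what it's worth: the distribution of <math>X</math> is the pushforward measure on the Borel σ-algebra, and the CDF is simply that measure evaluated on half-lines (no restriction of the measure itself is needed):

```latex
\mu_X := X_* P_A = P_A \circ X^{-1}
\quad\text{is a probability measure on } \big(\mathbb{R},\, \mathcal{B}(\mathbb{R})\big),
\qquad
F_X(x) := \mu_X\big((-\infty, x]\big) = P_A\big(\{a \in A : X(a) \le x\}\big).
% For X with values in R^n one takes
% F_X(x_1, \dots, x_n) = \mu_X\big((-\infty, x_1] \times \cdots \times (-\infty, x_n]\big).
```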
 
== Check Required ==
 
Hello! I am a new editor of Wikipedia and I have tried to do some edits and give some citations. I hope I have done things right; if not, please let me know my mistakes. This would help me a lot. Thank you. [[User:Stobene45|Stobene45]] ([[User talk:Stobene45|talk]]) 09:49, 2 July 2021 (UTC)
 
:Hello! Along with your addition to my talk page I have taken a look at some of your edits. I would like to direct you to the teahouse to actually ask a question about what to add to the article and what not to add to articles (most of which can be found in the [[WP:MOS|Manual of Style]]). I have reverted your edits; however, please note that I do see your edits as being in good faith considering you contacted me first saying that you are a new user wanting to learn. Otherwise, I probably would've given you warnings. [[User:Blaze The Wolf|Blaze The Wolf &#124; Proud Furry and Wikipedia Editor]] ([[User talk:Blaze The Wolf#top|talk]]) 18:17, 2 July 2021 (UTC)
:{{re|Stobene45}} this is mostly a matter of Wikipedia's accepted style conventions. We don't artificially underline subheadings, but rather use the pre-defined styles generated by the staggered classes of headings; and while italics can be used for emphasis, this should be done very sparingly (see [[Wikipedia:Manual_of_Style/Text_formatting#Emphasis]]). As for the citations, references to Google results are not suitable anywhere on Wikipedia - the cited source must be the specific work used, not just a search result containing the work's title. Cheers! --<span style="font-family:Courier">[[User:Elmidae|Elmidae]]</span> <small>([[User talk:Elmidae|talk]] · [[Special:contributions/Elmidae|contribs]])</small> 16:26, 3 July 2021 (UTC)
 
== Mild error in Examples section ==
 
In the paragraph beginning "Suppose <math>X</math> is [[Normal distribution|normally distributed]]. Then the CDF of <math>X</math> is given by" there is an error. The dummy variable "x" is used as the function argument F(x) rather than the upper limit of integration, "t". I will edit to make this F(t). [[Special:Contributions/2601:1C2:B00:C640:D541:DF77:FC10:64EF|2601:1C2:B00:C640:D541:DF77:FC10:64EF]] ([[User talk:2601:1C2:B00:C640:D541:DF77:FC10:64EF|talk]]) 23:56, 23 November 2024 (UTC)
 
:@[[User:Joe Gatt|Joe Gatt]] This edit and thread was created by me, I just hadn't logged in. Your edit was incorrect. I think you were confused by the notation - it's OK to have F(x) = int ^ x. What wouldn't be OK would be to have F(X) = int ^ X as a CDF definition (see [[Probability integral transform]]). Technically, X is a function of a probability space. Under certain restrictions, X(w) = x where w is in some original psp and x is in the support or range of the function X.
:As written, the CDF definition is back to being correct. [[User:Chjacamp|Chjacamp]] ([[User talk:Chjacamp|talk]]) 00:20, 24 November 2024 (UTC)
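For the record, the convention at issue: the argument <math>x</math> appears only as the upper limit of integration, and the dummy variable is <math>t</math>:

```latex
F(x) \;=\; \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x}
  \exp\!\left(-\frac{(t-\mu)^2}{2\sigma^2}\right) dt .
```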
 
== Please clarify how area inequalities follow from diagram ==
 
The "Properties" section states the inequalities <math display="block">
x (1-F_X(x)) \leq \int_x^{\infty} t\,dF_X(t)
</math>
and
<math display="block">
x F_X(-x) \leq \int_{-\infty}^{-x} (-t)\,dF_X(t)
</math>
for a random variable <math>X</math> with a finite <math>L_1</math>-norm. These inequalities are straightforward to derive from the formula for the expected value of <math>X</math> using integration by parts. But the article claims that we can instead see why the inequalities are true from the included diagram by "consider[ing] the areas of the two red rectangles and their extensions to the right or left up to the graph of <math>F_X</math>". I don't see how these inequalities follow from the diagram; please clarify. [[User:Ted.tem.parker|Ted.tem.parker]] ([[User talk:Ted.tem.parker|talk]]) 00:06, 24 March 2025 (UTC)
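Independently of the diagram, both inequalities follow from bounding the integrand below on the tail (a sketch, for <math>x > 0</math>):

```latex
% On [x, infinity) the integrand satisfies t >= x, so
\int_x^{\infty} t \, dF_X(t) \;\ge\; x \int_x^{\infty} dF_X(t)
  \;=\; x\big(1 - F_X(x)\big);
% on (-infinity, -x] one has -t >= x, so
\int_{-\infty}^{-x} (-t)\, dF_X(t) \;\ge\; x \int_{-\infty}^{-x} dF_X(t)
  \;=\; x\, F_X(-x).
```

The rectangle reading is presumably the same bound stated geometrically: the rectangle of area <math>x(1-F_X(x))</math> sits under the region whose area is the tail integral; but the one-line derivation above does not depend on the figure.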
 
== Hatnote to pdf ==
 
There is [[Talk:Cumulative density function#RfC on redirect, disambig, article, or deletion|a discussion]] (RfC) on what to do with the page [[cumulative density function]]. One option is that it should redirect to this page. Some editors suggest that there should then be a hatnote here linking to [[probability density function]]. Please join the discussion! —[[User:St.nerol|St.Nerol]] ([[User talk:St.nerol|talk]], [[Special:Contributions/St.nerol|contribs]]) 16:59, 1 May 2025 (UTC)