In [[statistics]], a '''linear probability model''' is a special case of a [[binomial regression]] model. Here the [[dependent and independent variables|observed variable]] for each observation takes values which are either 0 or 1. The probability of observing a 0 or 1 in any one case is treated as depending on one or more [[dependent and independent variables|explanatory variables]]. For the linear probability model, this relationship is a particularly simple one, and allows the model to be fitted by [[simple linear regression]].
==The model==
The model assumes that, for a binary outcome ([[Bernoulli trial]]), ''Y'', and its associated vector of explanatory variables, ''X'',<ref name=Cox>Cox, D.R. (1970) ''Analysis of Binary Data'', Methuen. ISBN 0-416-10400-2 (Section 2.2)</ref>

: <math> \Pr(Y=1 | X=x) = x'\beta . </math>
For this model,

:<math> E[Y|X] = \Pr(Y=1|X) = x'\beta ,</math>

and hence the vector of parameters &beta; can be estimated using [[least squares]]. This method of fitting would be [[Efficiency (statistics)|inefficient]],<ref name=Cox/> but it can be improved by adopting an iterative scheme based on [[weighted least squares]], in which the model from the previous iteration is used to supply estimates of the conditional variances, var(''Y''|''X''=''x''), which would vary between observations. This approach can be related to fitting the model by [[maximum likelihood]].<ref name=Cox/>

One situation where the linear probability model is commonly used is when the data set is so large that [[maximum likelihood]] estimation of a logit or probit model is computationally difficult, whereas the linear probability model can be fitted directly by [[least squares]].
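The least-squares fit and one weighted-least-squares refinement step can be sketched as follows; this is an illustrative Python example with simulated data (the variable names and simulation settings are not from the article), not a prescribed implementation:

```python
# Sketch: fit a linear probability model Pr(Y=1|X=x) = x'beta by ordinary
# least squares, then refine with one weighted-least-squares iteration,
# weighting by the inverse of the fitted conditional variance p(1 - p).
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.uniform(0.1, 0.9, size=n)
X = np.column_stack([np.ones(n), x])      # design matrix with intercept
beta_true = np.array([0.1, 0.8])          # chosen so x'beta stays in (0, 1)
y = rng.binomial(1, X @ beta_true)        # Bernoulli outcomes

# Ordinary least squares, using E[Y|X] = x'beta.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# One weighted-least-squares step: the previous fit supplies estimates of
# the conditional variances var(Y|X=x) = p(1 - p), which vary by observation.
p = np.clip(X @ beta_ols, 1e-6, 1 - 1e-6)
w = np.sqrt(1.0 / (p * (1.0 - p)))
beta_wls, *_ = np.linalg.lstsq(w[:, None] * X, w * y, rcond=None)

print(beta_ols, beta_wls)                 # both should be near beta_true
```

The clipping of the fitted probabilities is only there to keep the weights finite; it is a practical device for this sketch, not part of the model.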
 
A drawback of this model for the parameter of the [[Bernoulli distribution]] is that, unless restrictions are placed on <math> \beta </math>, the estimated coefficients can imply probabilities outside the [[unit interval]] <math> [0,1] </math>. For this reason, models such as the [[logit model]] or the [[probit model]] are more commonly used. A further drawback is that the model's marginal effects are constant, which can be unrealistic at the low and high ends of the distribution.
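The out-of-interval problem can be demonstrated directly; in this illustrative Python sketch (simulated data, settings not from the article), the true response is logistic, and the unrestricted least-squares fit produces fitted "probabilities" below 0 and above 1:

```python
# Sketch: an unrestricted linear probability fit to a binary outcome can
# yield fitted values outside the unit interval [0, 1].
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
x = rng.normal(size=n)                    # unbounded regressor
p_true = 1.0 / (1.0 + np.exp(-2.0 * x))   # true model is logistic
y = rng.binomial(1, p_true)

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta

print(fitted.min(), fitted.max())         # extremes fall outside [0, 1]
```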
 
==See also==
* [[Binomial regression]]
 
==References==
{{reflist}}
{{DEFAULTSORT:Linear Probability Model}}
[[Category:Regression analysis]]