Binary regression: Difference between revisions

Browse history interactively

Content deleted Content added

VisualWikitext

Revision as of 05:36, 23 April 2019 edit Nbarth (talk \| contribs) Autopatrolled, Extended confirmed users, Pending changes reviewers, Rollbackers 35,278 edits Init		Latest revision as of 20:28, 27 March 2022 edit undo Nbarth (talk \| contribs) Autopatrolled, Extended confirmed users, Pending changes reviewers, Rollbackers 35,278 edits m →top: link
(15 intermediate revisions by 4 users not shown)
Line 1: {{regression bar}} In [[statistics]], specifically [[regression analysis]], a '''binary regression''' estimates a relationship between one or more [[explanatory variable]]s and a single output [[binary variable]]. Generally the probability of the two alternatives is modeled, instead of simply outputting a single value, as in [[linear regression]]. Binary regression is usually analyzed as a special case of [[binomial regression]], with a single outcome (~~{{tmath\|~~<math>n = 1}}</math>), and one of the two alternatives considered as "success" and coded as 1: the value is the [[Count data\|count]] of successes in 1 trial, either 0 or 1. The most common binary regression models are the [[logit model]] ([[logistic regression]]) and the [[probit model]] ([[probit regression]]). ==Applications== Line 9 ⟶ 10: Binary regression models can be interpreted as [[latent variable model]]s, together with a measurement model; or as probabilistic models, directly modeling the probability. === Latent variable model === The latent variable interpretation has traditionally been used in [[bioassay]], yielding the [[probit model]], where normal variance and a cutoff are assumed. The latent variable interpretation is also used in [[item response theory]] (IRT). Formally, the latent variable interpretation posits that the outcome ''y'' is related to a vector of explanatory variables ''x'' by The simplest direct probabilistic model is the [[logit model]]], which models the [[log-odds]] as a linear function of the explanatory variable or variables. The logit model is "simplest" in the sense of [[generalized linear model]]s (GLIM): the log-odds are the natural parameter for the [[exponential family]] of the Bernoulli distribution, and thus it is the simplest to use for computations.▼ : <math>y=1 [y^>0]</math> where <math>y^=x\beta +\varepsilon </math> and <math>\varepsilon \mid x\sim G</math>, {{math\|''β''}} is a vector of [[statistical parameter\|parameters]] and ''G'' is a [[probability distribution]]. This model can be applied in many economic contexts. For instance, the outcome can be the decision of a manager whether invest to a program, <math>y^</math> is the expected net [[discounted cash flow]] and ''x'' is a vector of variables which can affect the cash flow of this program. Then the manager will invest only when she expects the net discounted cash flow to be positive.<ref>For a detailed example, refer to: Tetsuo Yai, Seiji Iwakura, Shigeru Morichi, Multinomial probit with structured covariance for route choice behavior, Transportation Research Part B: Methodological, Volume 31, Issue 3, June 1997, Pages 195–207, ISSN 0191-2615</ref> Often, the [[errors and residuals\|error term]] <math>\varepsilon</math> is assumed to follow a [[normal distribution]] conditional on the explanatory variables ''x''. This generates the standard [[probit model]].<ref>Bliss, C. I. (1934). "The Method of Probits". Science 79 (2037): 38–39.</ref> === Probabilistic model === ▲The simplest direct probabilistic model is the [[logit model]]], which models the [[log-odds]] as a linear function of the explanatory variable or variables. The logit model is "simplest" in the sense of [[generalized linear model]]s (GLIM): the log-odds are the natural parameter for the [[exponential family]] of the Bernoulli distribution, and thus it is the simplest to use for computations. Another direct probabilistic model is the [[linear probability model]], which models the probability itself as a linear function of the explanatory variables. A drawback of the linear probability model is that, for some values of the explanatory variables, the model will predict probabilities less than zero or greater than one. ==See also == {{sectionlink\|Generalized linear model#Binary data}} [[Fractional model]] ==References== {{reflist}} {{refbegin}} {{cite ~~chapter~~book \|title=Regression Models for Categorical Dependent Variables Using Stata, Second Edition \|chapter=4. Models for binary outcomes: 4.1 The statistical model Line 25 ⟶ 45: \|year=2006 \|isbn=978-1-59718011-5 ~~\|ref=harv~~ }} * {{cite ~~chapter~~book \|last=Agresti \|first=Alan \|chapter=3.2 Generalized Linear Models for Binary Data \|year=2007 \|title=Categorical Data Analysis \|url=https://archive.org/details/introductiontoca00agre \|url-access=limited \|edition=2nd ~~\|edition=2nd~~ \|pages=[https://archive.org/details/introductiontoca00agre/page/n88 68]–73 ~~\|pages=68–73~~ ~~\|ref=harv~~ }} {{refend}} Line 41 ⟶ 59: {{statistics-stub}} [[Category:~~Generalized~~Regression ~~linear models~~analysis]]