Dynamic unobserved effects model

{{Short description|Statistical model used in econometrics}}
{{Multiple issues|
{{Technical|date=January 2018}}
{{lead missing|date=November 2015}}
}}
 
A '''dynamic unobserved effects model''' is a [[statistical model]] used in [[econometrics]] for [[panel analysis]]. It is characterized by the influence of previous values of the [[dependent variable]] on its present value, and by the presence of unobservable [[explanatory variable]]s.
The term “dynamic” here means the dependence of the dependent variable on its past history; this is usually used to model “state dependence” in economics. For instance, for a person who cannot find a job this year, it will be harder to find a job next year, because her present lack of a job will be a very negative signal to potential employers. “Unobserved effects” means that one or more of the explanatory variables are unobservable: for example, the consumption choice of one flavor of ice cream over another is a function of personal preference, but preference is unobservable.
 
==Continuous dependent variable==
{{further|Panel analysis#Dynamic panel models|Arellano–Bond estimator}}
 
==Censored dependent variable==
In a panel data [[tobit model]],<ref>{{cite book |last=Greene |first=W. H. |year=2003 |title=Econometric Analysis |publisher=Prentice Hall |location=Upper Saddle River, NJ }}</ref><ref>The model framework comes from {{cite book |last=Wooldridge |first=J. |year=2002 |title=Econometric Analysis of Cross Section and Panel Data |url=https://archive.org/details/econometricanaly0000wool |url-access=registration |publisher=MIT Press |location=Cambridge, Mass |page=[https://archive.org/details/econometricanaly0000wool/page/542 542] |isbn=9780262232197 }} The model is presented here in a more general form.</ref> if the outcome <math>Y_{i,t}</math> partially depends on the previous outcome history <math>Y_{i,0},\ldots,Y_{i,t-1}</math>, the tobit model is called "dynamic". For instance, for a person who finds a job with a high salary this year, it will be easier to find a job with a high salary next year, because having a high-wage job this year is a very positive signal to potential employers. The essence of this type of dynamic effect is the state dependence of the outcome. The "unobservable effects" here refer to factors which partially determine the outcome of an individual but cannot be observed in the data. For instance, the ability of a person is very important in job-hunting, but it is not observable to researchers. A typical dynamic unobserved effects tobit model can be represented as
 
: <math>Y_{i,t}=Y_{i,t}^{*}\,1[Y_{i,t}^{*}>0]; </math>

: <math>Y_{i,t}^{*}=z_{i,t}\delta+\rho y_{i,t-1}+c_{i}+u_{i,t};</math>

: <math>c_i\mid y_{i,0},\ldots, y_{i,t-1} \sim F (y_{i,0},x_i);</math>

: <math>u_{i,t}\mid z_{i,t},y_{i,0},\ldots,y_{i,t-1}\sim N(0,1).</math>
 
In this specific model, <math>\rho y_{i,t-1}</math> is the dynamic effect part and <math>c_{i}</math> is the unobserved effect part whose distribution is determined by the initial outcome of individual ''i'' and some exogenous features of individual ''i.''
 
Based on this setup, the likelihood function conditional on <math>\{y_{i,0}\}_{i=1}^N</math> can be given as
 
: <math>\prod_{i=1}^N \int f_\theta(c_i \mid y_{i,0},x_i) \left[ \prod_{t=1}^T \bigl[1-\Phi(z_{i,t}\delta+\rho y_{i,t-1}+c_i)\bigr]^{1[y_{i,t}=0]} \, \bigl[\varphi(y_{i,t}-z_{i,t}\delta-\rho y_{i,t-1}-c_i)\bigr]^{1[y_{i,t}>0]} \right] \, dc_i </math>
 
For the initial values <math>\{y_{i,0}\}_{i=1}^N</math>, there are two different ways to treat them in the construction of the likelihood function: treating them as constants, or imposing a distribution on them and calculating the unconditional likelihood function. Whichever way is chosen, the integration inside the likelihood function cannot be avoided when estimating the model by maximum likelihood estimation (MLE). The expectation-maximization (EM) algorithm is usually a good solution for this computational issue.<ref>For more details, refer to: {{cite book |last1=Cappé |first1=O. |last2=Moulines |first2=E. |last3=Ryden |first3=T. |year=2005 |title=Inference in Hidden Markov Models |publisher=Springer-Verlag |location=New York |chapter=Part II: Parameter Inference |isbn=9780387289823 |chapter-url=https://books.google.com/books?id=4d_oEYn8Fl0C&pg=PA347 }}</ref> Based on the consistent point estimates from MLE, the average partial effect (APE)<ref>{{cite book |last=Wooldridge |first=J. |year=2002 |title=Econometric Analysis of Cross Section and Panel Data |url=https://archive.org/details/econometricanaly0000wool |url-access=registration |publisher=MIT Press |location=Cambridge, Mass |page=[https://archive.org/details/econometricanaly0000wool/page/22 22] |isbn=9780262232197 }}</ref> can then be calculated.<ref>For more details, refer to: {{cite journal |first=Takeshi |last=Amemiya |year=1984 |title=Tobit models: A survey |journal=Journal of Econometrics |volume=24 |issue=1–2 |pages=3–61 |doi=10.1016/0304-4076(84)90074-5 }}</ref>
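As a rough illustration of why the integral over ''c<sub>i</sub>'' is the computational bottleneck, the following Python sketch evaluates one individual's likelihood contribution by direct numerical integration (a midpoint rule), which is a simpler substitute for the EM algorithm mentioned above. It assumes, purely for illustration, a scalar regressor and that ''c<sub>i</sub>'' given the initial outcome and exogenous features is normal with mean <code>mu_c</code> and standard deviation <code>sigma_c</code>; the function and parameter names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal CDF (Phi)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    """Standard normal density (phi)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def tobit_individual_likelihood(y, z, delta, rho, mu_c, sigma_c, n_grid=401):
    """Likelihood of y[1..T] given y[0] and z[1..T] for one individual.

    Assumption (not from the article): c | y_0, x ~ N(mu_c, sigma_c^2),
    where the caller may compute mu_c from y_0 and x. The integral over c
    is approximated by a midpoint rule on mu_c +/- 6 * sigma_c.
    """
    lower = mu_c - 6.0 * sigma_c
    step = 12.0 * sigma_c / n_grid
    total = 0.0
    for k in range(n_grid):
        c = lower + (k + 0.5) * step
        contrib = 1.0
        for t in range(1, len(y)):
            index = z[t] * delta + rho * y[t - 1] + c
            if y[t] == 0:
                # Censored observation: probability that the latent value is <= 0.
                contrib *= 1.0 - normal_cdf(index)
            else:
                # Uncensored observation: density of the unit-variance error.
                contrib *= normal_pdf(y[t] - index)
        total += contrib * normal_pdf((c - mu_c) / sigma_c) / sigma_c * step
    return total
```

The sample log-likelihood would be the sum of the logarithms of these contributions over individuals, which could then be passed to a numerical optimizer.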
 
==Binary dependent variable==
===Formulation===
A typical dynamic unobserved effects model with a [[binary data|binary]] dependent variable is represented<ref>Wooldridge, J. (2002): Econometric Analysis of Cross Section and Panel Data, MIT Press, Cambridge, Mass, pp 300.</ref> as:
 
:<math>P(y_{it} = 1 \mid y_{i,t-1}, \dots , y_{i,0}, z_i, c_i) = G (z_{it} \delta + \rho y_{i,t-1} + c_i)</math>
 
where ''c<sub>i</sub>'' is an unobservable explanatory variable, ''z<sub>it</sub>'' are explanatory variables which are exogenous conditional on ''c<sub>i</sub>'', and ''G''(∙) is a [[cumulative distribution function]].
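To make the formulation concrete, the following Python sketch simulates a small panel from this model. It is only an illustration under stated assumptions that are not part of the article: ''G'' is taken to be the standard normal CDF (a probit link), ''z<sub>it</sub>'' is a scalar standard normal draw, ''c<sub>i</sub>'' is normal with standard deviation <code>sigma_c</code> and independent of ''z'', and the initial outcome is a fair coin flip. All function and parameter names are hypothetical.

```python
import math
import random


def normal_cdf(x):
    """Standard normal CDF, used here as the link function G."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simulate_panel(n, T, delta, rho, sigma_c, seed=0):
    """Simulate outcomes y[i][0..T] and regressors z[i][0..T].

    Assumptions for illustration only: probit link, scalar z_it ~ N(0, 1),
    c_i ~ N(0, sigma_c^2) independent of z, and y_i0 drawn as a fair coin flip.
    """
    rng = random.Random(seed)
    ys, zs = [], []
    for _ in range(n):
        c = rng.gauss(0.0, sigma_c)  # unobserved individual effect
        z = [rng.gauss(0.0, 1.0) for _ in range(T + 1)]
        y = [1 if rng.random() < 0.5 else 0]  # initial condition y_i0
        for t in range(1, T + 1):
            # P(y_it = 1 | y_i,t-1, z_i, c_i) = G(z_it * delta + rho * y_i,t-1 + c_i)
            p = normal_cdf(z[t] * delta + rho * y[t - 1] + c)
            y.append(1 if rng.random() < p else 0)
        ys.append(y)
        zs.append(z)
    return ys, zs
```

With a large positive <code>rho</code>, an individual who reaches state 1 tends to stay there, which is the state dependence the model is meant to capture.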
 
===Estimates of parameters===
In this type of model, economists have a special interest in ''ρ'', which is used to characterize the state dependence. For example, ''y<sub>i,t</sub>'' can be a woman's choice whether to work or not, and ''z<sub>it</sub>'' includes the ''i''-th individual's age, education level, number of children, and other factors. ''c<sub>i</sub>'' can be some individual-specific characteristic which cannot be observed by economists.<ref>James J. Heckman (1981): Studies in Labor Markets, University of Chicago Press, Chapter Heterogeneity and State Dependence</ref> It is a reasonable conjecture that one's labor choice in period ''t'' should depend on his or her choice in period ''t''&nbsp;&minus;&nbsp;1 due to habit formation or other reasons. This dependence is characterized by the parameter ''ρ''.
 
There are several [[Maximum likelihood|MLE]]-based approaches to estimate ''δ'' and ''ρ'' consistently. The simplest way is to treat ''y<sub>i,0</sub>'' as non-stochastic and assume ''c<sub>i</sub>'' is [[Independent variable#Use in statistics|independent]] of ''z<sub>i</sub>''. Then, by integrating ''P(y<sub>i,t</sub> , y<sub>i,t-1</sub> , … , y<sub>i,1</sub> | y<sub>i,0</sub> , z<sub>i</sub> , c<sub>i</sub>)'' against the density of ''c<sub>i</sub>'', we can obtain the conditional density ''P(y<sub>i,t</sub> , y<sub>i,t-1</sub> , … , y<sub>i,1</sub> | y<sub>i,0</sub> , z<sub>i</sub>)''. The objective function for the conditional MLE can be represented as: ''<math> \sum_{i=1}^N </math> log (P (y<sub>i,t</sub> , y<sub>i,t-1</sub>, … , y<sub>i,1</sub> | y<sub>i,0</sub> , z<sub>i</sub>)).''
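The simplest approach can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation, and it rests on assumptions not stated in the article: ''G'' is the standard normal CDF, ''z<sub>it</sub>'' is scalar, and ''c<sub>i</sub>'' is normal with mean zero and standard deviation <code>sigma_c</code>. The integral over ''c<sub>i</sub>'' is approximated with a plain midpoint rule; all names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, used here as the link function G."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def individual_likelihood(y, z, delta, rho, sigma_c, n_grid=201):
    """P(y_1, ..., y_T | y_0, z) with c integrated out by a midpoint rule.

    Assumption (not from the article): c ~ N(0, sigma_c^2), independent of z.
    y[0] is treated as non-stochastic; z[0] is not used.
    """
    lower = -6.0 * sigma_c
    step = 12.0 * sigma_c / n_grid
    total = 0.0
    for k in range(n_grid):
        c = lower + (k + 0.5) * step
        prob = 1.0
        for t in range(1, len(y)):
            p = normal_cdf(z[t] * delta + rho * y[t - 1] + c)
            prob *= p if y[t] == 1 else 1.0 - p
        total += prob * normal_pdf(c / sigma_c) / sigma_c * step
    return total


def log_likelihood(ys, zs, delta, rho, sigma_c):
    """Objective function: sum over individuals of the log conditional density."""
    return sum(
        math.log(individual_likelihood(y, z, delta, rho, sigma_c))
        for y, z in zip(ys, zs)
    )
```

Maximizing <code>log_likelihood</code> over <code>delta</code>, <code>rho</code> and <code>sigma_c</code> with any numerical optimizer would give the conditional MLE under these assumptions.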
 
Treating ''y<sub>i,0</sub>'' as non-stochastic implicitly assumes that ''y<sub>i,0</sub>'' is independent of ''z<sub>i</sub>''. In most real cases, however, ''y<sub>i,0</sub>'' depends on ''c<sub>i</sub>'', and ''c<sub>i</sub>'' also depends on ''z<sub>i</sub>''. An improvement on the approach above is to assume a density of ''y<sub>i,0</sub>'' conditional on (''c<sub>i</sub>, z<sub>i</sub>''), from which the conditional likelihood ''P(y<sub>i,t</sub> , y<sub>i,t-1</sub> , … , y<sub>i,1</sub> , y<sub>i,0</sub> | c<sub>i</sub>, z<sub>i</sub>)'' can be obtained. By integrating this likelihood against the density of ''c<sub>i</sub>'' conditional on ''z<sub>i</sub>'', we can obtain the conditional density ''P(y<sub>i,t</sub> , y<sub>i,t-1</sub> , … , y<sub>i,1</sub> , y<sub>i,0</sub> | z<sub>i</sub>)''. The objective function for the [[conditional MLE]]<ref>Greene, W. H. (2003), Econometric Analysis, Prentice Hall, Upper Saddle River, NJ.</ref> is ''<math> \sum_{i=1}^N </math> log (P (y<sub>i,t</sub> , y<sub>i,t-1</sub>, … , y<sub>i,1</sub> , y<sub>i,0</sub> | z<sub>i</sub>)).''
 
Based on the estimates for (''δ, ρ'') and the corresponding variance, values of the coefficients can be tested<ref>Whitney K. Newey, Daniel McFadden, Chapter 36 Large sample estimation and hypothesis testing, In: Robert F. Engle and Daniel L. McFadden, Editor(s), Handbook of Econometrics, Elsevier, 1994, Volume 4, Pages 2111–2245, {{ISSN|1573-4412}}, {{ISBN|9780444887665}}.</ref> and the average partial effect can be calculated.<ref>Chamberlain, G. (1980), “Analysis of Covariance with Qualitative Data,” Journal of Econometrics 18, 5–46</ref>
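As one possible illustration of the average partial effect, the Python sketch below computes the effect of the lagged outcome switching from 0 to 1. It assumes a probit link, a scalar regressor, and ''c<sub>i</sub>'' normal with mean zero and standard deviation <code>sigma_c</code>, independent of ''z''; none of these choices come from the article. Under these assumptions, averaging Φ(''a'' + ''c'') over ''c'' gives Φ(''a'' / √(1 + σ<sub>c</sub><sup>2</sup>)), so no numerical integration is needed. The function name is hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal CDF (Phi)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ape_lagged_outcome(z_values, delta, rho, sigma_c):
    """Average partial effect of y_{t-1} going from 0 to 1.

    Assumptions (illustrative only): probit link and c ~ N(0, sigma_c^2)
    independent of z, so that E_c[Phi(a + c)] = Phi(a / sqrt(1 + sigma_c^2)).
    The effect is averaged over the supplied regressor values.
    """
    scale = math.sqrt(1.0 + sigma_c ** 2)
    effects = [
        normal_cdf((z * delta + rho) / scale) - normal_cdf(z * delta / scale)
        for z in z_values
    ]
    return sum(effects) / len(effects)
```

In practice the estimates of ''δ'', ''ρ'' and the variance of ''c<sub>i</sub>'' from the MLE step would be plugged in for the arguments.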
 
==References==