Difference in differences: Difference between revisions

Revision as of 22:46, 14 July 2024 by Fgnievinski (no edit summary)
Latest revision as of 14:48, 24 July 2025 by Msrasnw (edit summary: →Further reading: Timothy G. Conley)
(8 intermediate revisions by 6 users not shown)
Line 2:
'''Difference in differences''' ('''DID'''<ref>{{cite journal |last=Abadie |first=A. |year=2005 |title=Semiparametric difference-in-differences estimators |journal=[[Review of Economic Studies]] |volume=72 |issue=1 |pages=1–19 |doi=10.1111/0034-6527.00321 |citeseerx=10.1.1.470.1475 |s2cid=8801460 }}</ref> or '''DD'''<ref name=Bertrand>{{cite journal |last1=Bertrand |first1=M. |last2=Duflo |first2=E. |author-link2=Esther Duflo |last3=Mullainathan |first3=S. |year=2004 |title=How Much Should We Trust Differences-in-Differences Estimates? |journal=[[Quarterly Journal of Economics]] |volume=119 |issue=1 |pages=249–275 |doi=10.1162/003355304772839588 |s2cid=470667 |url=http://www.nber.org/papers/w8841.pdf }}</ref>) is a [[statistics|statistical technique]] used in [[econometrics]] and [[quantitative research]] in the social sciences that attempts to mimic an [[experiment|experimental research design]] using [[observational study|observational study data]], by studying the differential effect of a treatment on a 'treatment group' versus a '[[control group]]' in a [[natural experiment]].<ref>{{cite book |last1=Angrist |first1=J. D. |last2=Pischke |first2=J. S. |year=2008 |title=Mostly Harmless Econometrics: An Empiricist's Companion |publisher=Princeton University Press |isbn=978-0-691-12034-8 |pages=227–243 |url=https://books.google.com/books?id=ztXL21Xd8v8C&pg=PA227 }}</ref> It calculates the effect of a treatment (i.e., an explanatory variable or an [[independent variable]]) on an outcome (i.e., a response variable or [[dependent variable]]) by comparing the average change over time in the outcome variable for the treatment group to the average change over time for the control group.
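The comparison described above (average change in the treatment group minus average change in the control group) can be illustrated with a minimal sketch for the two-group, two-period case. The function name <code>did_estimate</code> and all data values are hypothetical and chosen only for illustration; they are not taken from any study cited in the article.

```python
from statistics import mean


def did_estimate(treat_pre, treat_post, control_pre, control_post):
    """Two-group, two-period difference in differences.

    Returns (mean change in the treatment group)
          - (mean change in the control group).
    """
    treat_change = mean(treat_post) - mean(treat_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treat_change - control_change


# Hypothetical outcome measurements (illustrative values only).
treat_pre = [10.0, 12.0, 11.0]     # treatment group, before treatment
treat_post = [15.0, 16.0, 14.0]    # treatment group, after treatment
control_pre = [9.0, 10.0, 11.0]    # control group, before
control_post = [11.0, 12.0, 13.0]  # control group, after

effect = did_estimate(treat_pre, treat_post, control_pre, control_post)
print(effect)  # treated change (4.0) minus control change (2.0)
```

In this toy data the treated group's mean rises by 4 and the control group's by 2, so the sketch attributes the remaining 2 to the treatment. That interpretation is only valid if the two groups would otherwise have followed parallel trends.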
Although it is intended to mitigate the effects of extraneous factors and [[selection bias]], depending on how the treatment group is chosen, this method may still be subject to certain biases (e.g., [[regression to the mean|mean regression]], [[Reverse causality bias|reverse causality]] and [[omitted variable bias]]). In contrast to a [[time series|time-series estimate]] of the treatment effect on subjects (which analyzes differences over time) or a [[cross-section study|cross-section estimate]] of the treatment effect (which measures the difference between treatment and control groups), the difference in differences uses [[panel data]] to measure the differences, between the treatment and control group, of the changes in the outcome variable that occur over time.

==General definition==

Line 54:
[[File:Parallel Trend Assumption.png|right|thumb|320px| Illustration of the parallel trend assumption]]
All the assumptions of the [[Ordinary least squares#Assumptions|OLS model]] apply equally to DID. In addition, DID requires a '''parallel trend assumption'''. The parallel trend assumption says that <math>\lambda_2 - \lambda_1</math> is the same in both <math>s=1</math> and <math>s=2</math>. If the [[#Formal Definition|formal definition]] above accurately represents reality, this assumption holds automatically. However, a model with <math>\lambda_{st} ~:~ \lambda_{22} - \lambda_{21} \neq \lambda_{12} - \lambda_{11}</math> may well be more realistic.

In order to increase the likelihood of the parallel trend assumption holding, a difference-in-differences approach is often combined with [[Matching (statistics)|matching]].<ref>{{cite journal |first1=Pallavi |last1=Basu |author2-link=Dylan S.
Small |first2=Dylan |last2=Small |year=2020 |title=Constructing a More Closely Matched Control Group in a Difference-in-Differences Analysis: Its Effect on History Interacting with Group Bias |journal=[[Observational Studies]] |volume=6 |pages=103–130|doi=10.1353/obs.2020.0011 |s2cid=221702893 |url=https://muse.jhu.edu/article/793352/pdf |arxiv=2009.06935 }}</ref> This involves 'matching' known 'treatment' units with simulated counterfactual 'control' units: characteristically equivalent units which did not receive treatment. By defining the outcome variable as a temporal difference (the change in observed outcome between the pre- and post-treatment periods), and matching multiple units in a large sample on the basis of similar pre-treatment histories, the resulting [[Average_treatment_effect|ATE]] (i.e. the ATT: Average Treatment Effect for the Treated) provides a robust difference-in-differences estimate of treatment effects. This serves two statistical purposes: firstly, conditional on pre-treatment covariates, the parallel trends assumption is likely to hold; and secondly, this approach reduces dependence on associated ignorability assumptions necessary for valid inference.

As illustrated to the right, the treatment effect is the difference between the observed value of ''y'' and what the value of ''y'' would have been with parallel trends, had there been no treatment. The Achilles' heel of DID is the case where something other than the treatment changes in one group but not the other at the same time as the treatment, implying a violation of the parallel trend assumption.

Line 109:
But this is the expression for the treatment effect that was given in the [[#Formal Definition|formal definition]] and in the above table.

Variants of the difference-in-differences framework include designs for staggered implementation of treatment, as well as an estimator for multiple time periods and other variations introduced by Brantly Callaway and [[Pedro H.C. 
Sant'Anna]].<ref>{{Cite journal |last=Callaway |first=Brantly |last2=Sant’Anna |first2=Pedro H. C. |date=2021-12-01 |title=Difference-in-Differences with multiple time periods |url=https://www.sciencedirect.com/science/article/abs/pii/S0304407620303948 |journal=Journal of Econometrics |series=Themed Issue: Treatment Effect 1 |volume=225 |issue=2 |pages=200–230 |doi=10.1016/j.jeconom.2020.12.001 |issn=0304-4076|url-access=subscription }}</ref>

==Example==

The [[David Card|Card]] and [[Alan Krueger|Krueger]] article on [[minimum wage]] in [[New Jersey]], published in 1994,<ref name="David"/> is considered one of the most famous DID studies; Card was later awarded the 2021 [[Nobel Memorial Prize in Economic Sciences]] in part for this and related work. Card and Krueger compared [[Unemployment|employment]] in the [[fast food]] sector in New Jersey and in [[Pennsylvania]] in February 1992 and in November 1992, that is, before and after New Jersey's minimum wage rose from $4.25 to $5.05 in April 1992. Observing a change in employment in New Jersey only, before and after the treatment, would fail to control for [[Omitted-variable bias|omitted variables]] such as weather and macroeconomic conditions of the region. By including Pennsylvania as a control in a difference-in-differences model, any bias caused by variables common to New Jersey and Pennsylvania is implicitly controlled for, even when these variables are unobserved. Assuming that New Jersey and Pennsylvania have parallel trends over time, Pennsylvania's change in employment can be interpreted as the change New Jersey would have experienced had it not increased the minimum wage, and vice versa. The evidence suggested that the increased minimum wage did not induce a decrease in employment in New Jersey, contrary to what some economic theory would suggest. The table below shows Card & Krueger's estimates of the treatment effect on employment, measured as [[Full-time equivalent|FTEs (or full-time equivalents)]].
Card and Krueger estimate that the $0.80 minimum wage increase in New Jersey led to an average 2.75 FTE increase in employment per store.

{| class="wikitable"

Line 137 ⟶ 139:
==Further reading==
{{cite book |last1=Angrist |first1=J. D. |last2=Pischke |first2=J. S. |year=2008 |title=Mostly Harmless Econometrics: An Empiricist's Companion |publisher=Princeton University Press |isbn=978-0-691-12034-8 |pages=227–243 |url=https://books.google.com/books?id=ztXL21Xd8v8C&pg=PA227 }}
Andrew Baker, Brantly Callaway, Scott Cunningham, Andrew Goodman-Bacon and Pedro H. C. Sant'Anna. 2025. "[[arxiv:2503.13323|Difference-in-Differences Designs: A Practitioner’s Guide.]]" ''Journal of Economic Literature''.
{{cite book | first1 = Arthur C. |last1=Cameron |first2=Pravin K. |last2=Trivedi |year=2005 |title=Microeconometrics: Methods and Applications |publisher=Cambridge University Press |isbn=9780521848053 |doi=10.1017/CBO9780511811241 |pages=768–772 |s2cid=120313863 }}
{{cite journal |last1=Imbens |first1=Guido W. |first2=Jeffrey M. |last2=Wooldridge |year=2009 |title=Recent Developments in the Econometrics of Program Evaluation |journal=[[Journal of Economic Literature]] |volume=47 |issue=1 |pages=5–86 |doi=10.1257/jel.47.1.5 |url=http://nrs.harvard.edu/urn-3:HUL.InstRepos:3043416 }}
{{cite journal |first1=Jon |last1=Bakija |first2=Bradley |last2=Heim |title=How Does Charitable Giving Respond to Incentives and Income? Dynamic Panel Estimates Accounting for Predictable Changes in Taxation |journal=NBER Working Paper No. 14237 |date=August 2008 |doi=10.3386/w14237 |doi-access=free }}
{{cite journal |first1=T. |last1=Conley |first2=C. |authorlink1=Timothy G. Conley|last2=Taber |title=Inference with 'Difference in Differences' with a Small Number of Policy Changes |journal=NBER Technical Working Paper No. 312 |date=July 2005 |doi=10.3386/t0312 |doi-access=free }}
==External links==