[[File:Parallel Trend Assumption.png|right|thumb|320px| Illustration of the parallel trend assumption]]
All the assumptions of the [[Ordinary least squares#Assumptions|OLS model]] apply equally to DID. In addition, DID requires a '''parallel trend assumption'''. The parallel trend assumption says that <math>\lambda_2 - \lambda_1</math> is the same in both <math>s=1</math> and <math>s=2</math>. If the [[Difference in differences#Formal Definition|formal definition]] above accurately represents reality, this assumption holds automatically. However, a model with group-specific time effects <math>\lambda_{st}</math>, in which <math>\lambda_{22} - \lambda_{21} \neq \lambda_{12} - \lambda_{11}</math>, may well be more realistic. To increase the likelihood that the parallel trend assumption holds, a difference-in-difference approach is often combined with [[Matching (statistics)|matching]].<ref>{{cite journal |first1=Pallavi |last1=Basu |first2=Dylan |last2=Small |year=2020 |title=Constructing a More Closely Matched Control Group in a Difference-in-Differences Analysis: Its Effect on History Interacting with Group Bias |journal=[[Observational Studies]] |volume=6 |pages=103–130 |url=https://www.obsstudies.org/wp-content/uploads/2020/09/basu_small_2020-1.pdf }}</ref> This involves matching known treatment units with simulated counterfactual control units: units with equivalent characteristics that did not receive treatment. By defining the outcome variable as a temporal difference (the change in observed outcome between the pre- and post-treatment periods), and matching multiple units in a large sample on the basis of similar pre-treatment histories, the resulting [[Average_treatment_effect|ATE]] (here the ATT, the average treatment effect for the treated) provides a robust difference-in-difference estimate of treatment effects.
This serves two statistical purposes: first, conditional on pre-treatment covariates, the parallel trend assumption is more likely to hold; and second, the approach reduces dependence on the associated ignorability assumptions necessary for valid inference.
As illustrated to the right, the treatment effect is the difference between the observed value of ''y'' and the value ''y'' would have taken under parallel trends had there been no treatment. The Achilles' heel of DID is that something other than the treatment may change in one group but not the other at the same time as the treatment, which violates the parallel trend assumption.
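
The effect of a violation can be illustrated with a minimal numerical sketch. It assumes an additive model in which each group-period mean is a group effect plus a group-specific time effect <math>\lambda_{st}</math>, with a treatment effect added only for group <math>s=2</math> in period <math>t=2</math>. All numbers are made up for illustration, and the function names <code>cell_means</code> and <code>did_estimate</code> are hypothetical, not part of any library.

```python
def cell_means(group_effect, lam, treatment_effect):
    """Mean outcome for each (group s, period t) under an assumed additive model.

    Assumption for this sketch: only group 2 in period 2 is treated.
    """
    means = {}
    for s in (1, 2):
        for t in (1, 2):
            treated = (s == 2 and t == 2)
            means[(s, t)] = group_effect[s] + lam[(s, t)] + (treatment_effect if treated else 0)
    return means


def did_estimate(means):
    """Change over time in the treated group minus change in the control group."""
    change_treated = means[(2, 2)] - means[(2, 1)]
    change_control = means[(1, 2)] - means[(1, 1)]
    return change_treated - change_control


group_effect = {1: 10, 2: 14}   # illustrative group levels
true_effect = 3                 # illustrative treatment effect

# Parallel trends: lambda_22 - lambda_21 == lambda_12 - lambda_11 (both equal 2).
lam_parallel = {(1, 1): 0, (1, 2): 2, (2, 1): 0, (2, 2): 2}
did_parallel = did_estimate(cell_means(group_effect, lam_parallel, true_effect))

# Violation: group 2 would have grown by 3 rather than 2 even without treatment.
lam_violated = {(1, 1): 0, (1, 2): 2, (2, 1): 0, (2, 2): 3}
did_violated = did_estimate(cell_means(group_effect, lam_violated, true_effect))

print(did_parallel)  # recovers the true effect
print(did_violated)  # true effect plus the difference in trends
```

Under these assumptions the estimate equals the true effect when the trends are parallel, and is otherwise off by exactly <math>(\lambda_{22} - \lambda_{21}) - (\lambda_{12} - \lambda_{11})</math>, the difference between the two groups' untreated trends.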