'''Dynamic causal modelling''' ('''DCM''') is a methodology and software framework for specifying models of neural dynamics, estimating their parameters and comparing their evidence. It enables the interaction of neural populations (effective connectivity) to be inferred from functional neuroimaging data, e.g., [[functional magnetic resonance imaging]] (fMRI), [[magnetoencephalography]] (MEG) or [[electroencephalography]] (EEG).
== Bayesian model reduction ==
Bayesian model reduction <ref name=":0">{{Cite journal|last=Friston|first=Karl|last2=Penny|first2=Will|date=June 2011|title=Post hoc Bayesian model selection|url=https://doi.org/10.1016/j.neuroimage.2011.03.062|journal=NeuroImage|volume=56|issue=4|pages=2089–2099|doi=10.1016/j.neuroimage.2011.03.062|issn=1053-8119|pmc=PMC3112494|pmid=21459150|via=}}</ref><ref name=":1">{{Cite journal|last=Friston|first=Karl J.|last2=Litvak|first2=Vladimir|last3=Oswal|first3=Ashwini|last4=Razi|first4=Adeel|last5=Stephan|first5=Klaas E.|last6=van Wijk|first6=Bernadette C.M.|last7=Ziegler|first7=Gabriel|last8=Zeidman|first8=Peter|date=March 2016|title=Bayesian model reduction and empirical Bayes for group (DCM) studies|url=https://doi.org/10.1016/j.neuroimage.2015.11.015|journal=NeuroImage|volume=128|pages=413–431|doi=10.1016/j.neuroimage.2015.11.015|issn=1053-8119|pmc=PMC4767224|pmid=26569570|via=}}</ref> is a method for computing the [[Marginal likelihood|evidence]] and [[Posterior probability|posterior]] over the parameters of [[Bayesian statistics|Bayesian]] models that differ in their [[Prior probability|priors]]. A full model is fitted to data using standard approaches. Hypotheses are then tested by defining one or more 'reduced' models with alternative (and usually more restrictive) priors, which usually – in the limit – switch off certain parameters. The evidence and parameters of the reduced models can then be computed from the evidence and estimated ([[Posterior probability|posterior]]) parameters of the full model using Bayesian model reduction. If the priors and posteriors are [[Normal distribution|normally distributed]], then there is an analytic solution which can be computed rapidly. This has multiple scientific and engineering applications: these include scoring the evidence for large numbers of models very quickly and facilitating the estimation of hierarchical models ([[Empirical Bayes method|Parametric Empirical Bayes]]).
 
== Theory and motivation ==
The aim of dynamic causal modelling (DCM) is to infer the causal architecture of coupled nonlinear dynamical systems using a Bayesian model comparison procedure that rests on comparing models of how data were generated. Dynamic causal models are formulated as nonlinear state-space models in continuous time and describe the dynamics of hidden states in the nodes of a probabilistic graphical model, where conditional dependencies are parameterised in terms of directed effective connectivity. Unlike Bayesian networks, the graphs used in DCM can be cyclic, and unlike structural equation modelling and Granger causality, DCM does not depend on the theory of martingales, i.e., it does not assume that random fluctuations are serially uncorrelated.
Consider some model with parameters <math>\theta</math> and a prior probability density on those parameters, <math>p(\theta)</math>. The posterior belief about <math>\theta</math> after seeing the data <math>y</math>, written <math>p(\theta|y)</math>, is given by [[Bayes' theorem|Bayes' rule]]:
 
DCM was developed for (and is applied principally to) estimating coupling among brain regions and how that coupling is influenced by experimental changes (e.g., time or context). The basic idea is to construct reasonably realistic models of interacting (cortical) regions or nodes. These models are then supplemented with a forward model of how the hidden states of each node (e.g., neuronal activity) map to measured responses. This enables the best model and its parameters (i.e., effective connectivity) to be identified from observed data. Bayesian model comparison is used to select the best model in terms of its evidence (inference on model space), which can then be characterised in terms of its parameters (inference on parameter space). This enables one to test hypotheses about how nodes communicate; e.g., whether activity in a given neuronal population modulates the coupling between other populations in a task-specific fashion.
{{NumBlk|:|
<math>\begin{align}
p(\theta|y) & = \frac{p(y|\theta)p(\theta)}{p(y)} \\
p(y) & = \int p(y|\theta)p(\theta) d\theta
\end{align}</math>
|1}}
 
In functional neuroimaging, the data may be functional magnetic resonance imaging (fMRI) measurements or electrophysiological measurements (e.g., magnetoencephalography or electroencephalography; MEG/EEG). Brain responses are evoked by known deterministic inputs (experimentally controlled stimuli) that embody designed changes in sensory stimulation or cognitive set. These experimental or exogenous variables can change hidden states in one of two ways. First, they can elicit responses through direct influences on specific network nodes. This would be appropriate, for example, in modelling sensory evoked responses in the early visual cortex. The second class of inputs exerts its effects vicariously, through a modulation of the coupling among nodes; for example, the influence of attention on the processing of sensory information. The hidden states cover any neurophysiological or biophysical variables needed to form the observed outputs. These outputs are the measured (hemodynamic or electromagnetic) responses over the sensors considered. Bayesian inversion furnishes the marginal likelihood (evidence) of the model and the posterior distribution of its parameters (e.g., neuronal coupling strengths). The evidence is used for Bayesian model selection (BMS) to disambiguate between competing models, while the posterior distribution of the parameters is used to characterise the model selected.
The second line of Equation 1 is the model evidence, which is the probability of observing the data given the model. In practice, the posterior cannot usually be computed analytically due to the difficulty in computing the integral over the parameters. Therefore, the posteriors are estimated using approaches such as [[Markov chain Monte Carlo|MCMC sampling]] or [[Variational Bayesian methods|variational Bayes]]. A reduced model can then be defined with an alternative set of priors <math>\tilde{p}(\theta)</math>:
== DCM for fMRI ==
 
DCM for fMRI uses a deterministic, low-order approximation (derived using a Taylor series) of neural dynamics in a network or graph of ''n'' interacting brain regions or nodes (Friston ''et al.'' 2003). The activity of each cortical region in the model is governed by a single neuronal state variable, collected over regions in the state vector ''x'', whose evolution in time is given by the following bilinear differential equation:
{{NumBlk|:|
<math>\begin{align}
\tilde{p}(\theta|y) & = \frac{p(y|\theta)\tilde{p}(\theta)}{\tilde{p}(y)} \\
\tilde{p}(y) & = \int p(y|\theta)\tilde{p}(\theta) d\theta
\end{align}</math>
|2}}
 
:<math> \dot{x}=f(x,u,\theta)= Ax + \sum_{j=1}^m u_j B^{(j)} x + Cu </math>
The objective of Bayesian model reduction is to compute the posterior <math>\tilde{p}(\theta|y)</math> and evidence <math>\tilde{p}(y)</math> of the reduced model from the posterior <math>p(\theta|y)</math> and evidence <math>p(y)</math> of the full model. Combining Equation 1 and Equation 2 and re-arranging, the reduced posterior <math>\tilde{p}(\theta|y)</math> can be expressed as the product of the full posterior, the ratio of priors and the ratio of evidences:
:<math> A= \frac{\partial f}{\partial x}\bigg|_{u=0} \; \quad\; B= \frac{\partial^2 f}{\partial x\,\partial u} \; \quad\; C= \frac{\partial f}{\partial u}\bigg|_{x=0} </math>
 
where <math>\dot{x}= dx/dt\ .</math> The bilinear model is a parsimonious low-order approximation that accounts for both endogenous and exogenous causes of system dynamics. The matrix ''A'' represents the average coupling among nodes in the absence of exogenous input <math>u(t)\ .</math> This can be thought of as the latent coupling in the absence of experimental perturbations. The ''B'' matrices are effectively the change in latent coupling induced by the ''j''-th input. They encode context-sensitive changes in ''A'' or, equivalently, the modulation of coupling by experimental manipulations. Because the <math>B^{(j)}</math> are second-order derivatives they are referred to as ''bilinear''. Finally, the matrix ''C'' embodies the influences of exogenous inputs that ''cause'' perturbations of hidden states. The connectivity or coupling matrices to be estimated, <math>\theta \supset \{A, B, C\}\ ,</math> define the functional architecture and interactions among brain regions at a neuronal level. <figref>Fig1A.png</figref> summarises this bilinear state equation and shows the model in graphical form.
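For illustration, the bilinear state equation can be integrated numerically. The sketch below simulates a toy two-region network in Python; the coupling values, input timing and integration scheme are arbitrary choices made for this illustration and are not taken from the DCM literature.

<syntaxhighlight lang="python">
import numpy as np

# Bilinear neural state equation: dx/dt = (A + sum_j u_j * B^(j)) x + C u
# Toy two-region network; all coupling values and the input are illustrative.
A = np.array([[-1.0,  0.0],
              [ 0.8, -1.0]])        # latent coupling; region 1 drives region 2
B = np.zeros((1, 2, 2))
B[0, 1, 0] = 0.4                    # the input strengthens the region 1 -> region 2 connection
C = np.array([[1.0],
              [0.0]])               # driving input enters region 1 only

dt, T = 0.01, 20.0                  # Euler step and duration (seconds)
t = np.arange(0.0, T, dt)
u = np.zeros((1, t.size))
u[0, (t > 2.0) & (t < 12.0)] = 1.0  # boxcar experimental input

x = np.zeros((2, t.size))           # neuronal states over time
for k in range(1, t.size):
    J = A + sum(u[j, k - 1] * B[j] for j in range(B.shape[0]))
    dx = J @ x[:, k - 1] + C @ u[:, k - 1]
    x[:, k] = x[:, k - 1] + dt * dx # forward Euler integration

print(x[:, -1])                     # neuronal states at the end of the simulation
</syntaxhighlight>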
{{NumBlk|:|
<math>\begin{align}
\frac{\tilde{p}(\theta|y)\tilde{p}(y)}{p(\theta|y)p(y)} &=\frac{p(y|\theta)\tilde{p}(\theta)}{p(y|\theta)p(\theta)} \\
\Rightarrow \tilde{p}(\theta|y) &= p(\theta|y)\frac{\tilde{p}(\theta)}{p(\theta)}\frac{p(y)}{\tilde{p}(y)}
\end{align}</math>
|3}}
 
[[Image:Fig1A.png|thumb|400px|left| (A) The bilinear state equation of DCM for fMRI. (B) An example of a DCM describing the dynamics in a simple hierarchical system of visual areas. This system consists of two areas, each represented by a single state variable <math>(x_1, x_2)\ .</math> Black arrows represent connections, grey arrows represent exogenous inputs and thin dotted arrows indicate the transformation from neural states (blue colour) into hemodynamic observations (red colour); see <figref>Fig2A.png</figref> for the hemodynamic forward model. The state equation for this particular model is shown on the right. Adapted from (Stephan ''et al.'', 2007a).]]
The evidence for the reduced model is obtained by integrating over the parameters of each side of the equation:
 
DCM for fMRI combines this bilinear model of neural dynamics with an empirically validated hemodynamic model that describes the transformation of neuronal activity into a BOLD response. This so-called “Balloon model” was initially formulated by Buxton ''et al.'' (1998) and later extended (Friston ''et al.'', 2000; Stephan ''et al.'', 2007c). In the hemodynamic model, changes in neural activity elicit a vasodilatory signal that leads to increases in blood flow and subsequently to changes in blood volume and deoxyhemoglobin content, as summarised schematically in <figref>Fig2A.png</figref>.
{{NumBlk|:|<math>\int \tilde{p}(\theta|y)d\theta = \int p(\theta|y)\frac{\tilde{p}(\theta)}{p(\theta)}\frac{p(y)}{\tilde{p}(y)}d\theta =1</math>|4}}
 
[[Image:Fig2A.png|thumb|400px|right|Fig2A| Schematic of the hemodynamic model used by DCM for fMRI. Neuronal activity induces a vasodilatory and activity-dependent signal ''s'' that increases blood flow ''f''. Blood flow causes changes in volume and deoxyhemoglobin (<math>v</math> and <math>q</math>). These two hemodynamic states enter an output nonlinearity, which results in a predicted BOLD response ''y''. In recent versions, this model has six hemodynamic parameters (Stephan ''et al.,'' 2007c): the rate constant of the vasodilatory signal decay (<math>\kappa</math>), the rate constant for auto-regulatory feedback by blood flow (<math>\gamma</math>), transit time (<math>\tau</math>), Grubb’s vessel stiffness exponent (<math>\alpha</math>), capillary resting net oxygen extraction (<math>E_0</math>), and ratio of intra-extravascular BOLD signal (<math>\epsilon</math>). <math>E</math> is the oxygen extraction function. This figure encodes graphically the transformation from neuronal states to hemodynamic responses; adapted from (Friston ''et al.'', 2003).]]
And by re-arrangement:
 
Together, the neuronal and hemodynamic state equations furnish a deterministic DCM. For any given combination of parameters <math>\theta</math> and inputs <math>u\ ,</math> the measured BOLD response <math>y</math> is modelled as the predicted BOLD signal (the generalised convolution of inputs; <math>h(x,u,\theta)</math>) plus a linear mixture of confounds <math>X\beta</math> (''e.g.'' signal drift) and Gaussian observation error <math>\epsilon\ :</math>
{{NumBlk|:|
<math> \begin{align}
1 &= \int p(\theta|y)\frac{\tilde{p}(\theta)}{p(\theta)}\frac{p(y)}{\tilde{p}(y)}d\theta \\
&= \frac{p(y)}{\tilde{p}(y)}\int p(\theta|y)\frac{\tilde{p}(\theta)}{p(\theta)}d\theta \\
\Rightarrow \tilde{p}(y) &= p(y) \int p(\theta|y)\frac{\tilde{p}(\theta)}{p(\theta)}d\theta
\end{align}</math>
|5}}
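Equation 5 can be illustrated numerically in a toy conjugate-Gaussian setting, in which the full posterior and both evidences are available in closed form. In the sketch below, the reduced evidence computed via Equation 5 (by averaging the prior ratio under samples from the full posterior) is compared against the directly computed value; all numbers are illustrative.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Toy conjugate-Gaussian setting (numbers are illustrative): one observation
# y ~ N(theta, sigma^2), full prior N(mu0, s0^2), reduced prior N(mu0r, s0r^2).
y, sigma = 1.5, 1.0
mu0, s0 = 0.0, 1.0                  # full prior
mu0r, s0r = 0.0, 0.5                # reduced (more restrictive) prior

# Closed-form evidence and posterior of the full model
evidence_full = norm.pdf(y, mu0, np.sqrt(sigma**2 + s0**2))
post_var = 1.0 / (1.0 / sigma**2 + 1.0 / s0**2)
post_mean = post_var * (y / sigma**2 + mu0 / s0**2)

# Equation 5: reduced evidence = full evidence * E_posterior[ reduced prior / full prior ]
theta = rng.normal(post_mean, np.sqrt(post_var), size=1_000_000)
ratio = norm.pdf(theta, mu0r, s0r) / norm.pdf(theta, mu0, s0)
evidence_reduced_bmr = evidence_full * ratio.mean()

# Direct closed-form reduced evidence, for comparison
evidence_reduced_direct = norm.pdf(y, mu0r, np.sqrt(sigma**2 + s0r**2))

print(evidence_reduced_bmr, evidence_reduced_direct)  # the two values should agree closely
</syntaxhighlight>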
 
<math>y=h(x,u,\theta) + X\beta + \epsilon</math>
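To give a sense of the hemodynamic part of the forward model <math>h(x,u,\theta)</math>, the sketch below integrates a balloon-type hemodynamic model (after Buxton ''et al.'', 1998; Friston ''et al.'', 2000) driven by a toy neuronal time course. The parameter values and BOLD output coefficients are common illustrative defaults rather than the exact quantities used in any particular DCM implementation.

<syntaxhighlight lang="python">
import numpy as np

# Balloon-type hemodynamic model (after Buxton et al., 1998; Friston et al., 2000).
# Parameter values and BOLD coefficients are common illustrative defaults.
kappa, gamma, tau, alpha, E0, V0 = 0.65, 0.41, 0.98, 0.32, 0.34, 0.04

def bold(q, v):
    """Classic BOLD output nonlinearity (illustrative coefficients)."""
    k1, k2, k3 = 7.0 * E0, 2.0, 2.0 * E0 - 0.2
    return V0 * (k1 * (1.0 - q) + k2 * (1.0 - q / v) + k3 * (1.0 - v))

dt, T = 0.01, 30.0
t = np.arange(0.0, T, dt)
x = ((t > 1.0) & (t < 3.0)).astype(float)    # toy neuronal activity (boxcar)

s, f, v, q = 0.0, 1.0, 1.0, 1.0              # vasodilatory signal, flow, volume, dHb at rest
y = np.zeros_like(t)
for k in range(t.size):
    E_f = 1.0 - (1.0 - E0) ** (1.0 / f)      # oxygen extraction as a function of flow
    ds = x[k] - kappa * s - gamma * (f - 1.0)
    df = s
    dv = (f - v ** (1.0 / alpha)) / tau
    dq = (f * E_f / E0 - v ** (1.0 / alpha) * q / v) / tau
    s, f, v, q = s + dt * ds, f + dt * df, v + dt * dv, q + dt * dq
    y[k] = bold(q, v)                        # predicted BOLD signal

print(y.max())                               # peak of the predicted BOLD response
</syntaxhighlight>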
== Gaussian priors and posteriors ==
Under Gaussian prior and posterior densities, as are used in the context of [[Variational Bayesian methods|variational Bayes]], Bayesian model reduction has a simple analytical solution <ref name=":0" />. First define normal densities for the priors and posteriors:
 
{{NumBlk|:|
<math>\begin{align}
p(\theta) &= N(\theta;\mu_0,\Sigma_0)\\
\tilde{p}(\theta) &= N(\theta;\tilde{\mu}_0,\tilde{\Sigma}_0)\\
p(\theta|y) &= N(\theta;\mu,\Sigma)\\
\tilde{p}(\theta|y) &= N(\theta;\tilde{\mu},\tilde{\Sigma})
\end{align}</math>
|6}}

A schematic representation of the hierarchical structure of DCM is

:<math> u \overset{f}{\longrightarrow} x \overset{g}{\longrightarrow} y </math>
 
where ''u'' influences the dynamics of the hidden (neuronal) states of the system ''x'' through the evolution function ''f''; ''x'' is then mapped to the predicted data ''y'' through the observation function ''g''. The combined neural and hemodynamic parameters <math>\theta \supseteq \{A,B,C\}</math> are estimated from the measured BOLD data, using a Bayesian scheme with empirical priors for the hemodynamic parameters and conservative shrinkage priors for the coupling parameters (see below). Once the parameters of a DCM have been estimated, the posterior distributions of the parameters can be used to test hypotheses about connection strengths (''e.g.'', Ethofer ''et al.'', 2006; Fairhall and Ishai, 2007; Grol ''et al.'', 2007; Kumar ''et al.'', 2007; Posner ''et al.'', 2006; Stephan ''et al.'', 2006; Stephan ''et al.'', 2007b; Stephan ''et al.'', 2005).
Here the tilde symbol (~) indicates quantities relating to the reduced model, and a subscript zero, such as <math>\mu_{0}</math>, indicates parameters of the priors. For convenience we also define precision matrices, which are the inverses of the covariance matrices:
== DCM for evoked responses ==
 
DCM for evoked responses is a biologically plausible model of how event-related responses arise from the dynamics of coupled neural populations. It describes a network of coupled neuronal sources, each modelled with a neural mass model, using established connectivity rules for hierarchical brain systems (David and Friston, 2003; David ''et al.'', 2005; Jansen and Rit, 1995). The neural mass model emulates the activity of a cortical area using three neuronal subpopulations, assigned to granular and agranular layers. A population of excitatory pyramidal (output) cells receives inputs from inhibitory and excitatory populations of [[interneurons]] via intrinsic connections (which are confined to the cortical sheet). Within this model, excitatory interneurons can be regarded as spiny stellate cells, found predominantly in layer four and in receipt of forward connections. Excitatory pyramidal cells and inhibitory interneurons are considered to occupy agranular layers and receive backward and lateral inputs.
{{NumBlk|:|
<math>\begin{align}
\Pi&=\Sigma^{-1}\\
\Pi_0&=\Sigma_0^{-1}\\
\tilde{\Pi}&=\tilde{\Sigma}^{-1}\\
\tilde{\Pi}_0&=\tilde{\Sigma}_0^{-1}\\
\end{align}</math>
|7}}
 
[[Image:Fig3A.png|thumb|400px|right| Schematic of the DCM used to model evoked electrophysiological responses. This schematic shows the state equations describing the dynamics of sources or regions. Each neuronal source is modelled with three subpopulations (pyramidal, spiny stellate and inhibitory interneurons) which are connected by four intrinsic connections with weights <math>\gamma_{1,2,3,4}\ ,</math> as described in (Jansen and Rit, 1995) and (David and Friston, 2003). These have been assigned to granular and agranular cortical layers, which receive forward <math>A^{F}</math>, backward <math>A^B</math> and lateral <math>A^L</math> connections respectively. Adapted from (Kiebel ''et al.'', 2008).]]
The free energy of the full model <math>F</math> is an approximation (lower bound) on the log model evidence, <math>F\approx \ln{p(y)}</math>, which is optimised explicitly in variational Bayes (or can be recovered from sampling approximations). The reduced model's free energy <math>\tilde{F}</math> and parameters <math>(\tilde{\mu},\tilde{\Sigma})</math> are then given by the expressions:
 
To model event-related responses, the network receives exogenous inputs via input connections. These connections are exactly the same as forward connections and deliver inputs to the spiny stellate cells. In the present context, inputs <math>u(t)</math> model sub-cortical auditory inputs. The vector <math>C\subset\theta</math> controls the influence of the input on each source. The lower, upper and leading diagonal matrices <math>A^{F},A^{B},A^{L}\subset\theta</math> encode forward, backward and lateral connections, respectively. The DCM here is specified in terms of the state equations and a linear output equation
{{NumBlk|:|
<math>\begin{align}
\tilde{F} &= \frac{1}{2}\ln|\tilde{\Pi}_0\cdot\Pi\cdot\tilde{\Sigma}\cdot\Sigma_0| \\
&- \frac{1}{2}(\mu^T\Pi\mu + \tilde{\mu}_0^T\tilde{\Pi}_0\tilde{\mu}_0 - \mu_0^T\Pi_0\mu_0 - \tilde{\mu}^T\tilde{\Pi}\tilde{\mu}) + F\\
\tilde{\mu} &= \tilde{\Sigma}(\Pi\mu + \tilde{\Pi}_0\tilde{\mu}_0 - \Pi_0\mu_0) \\
\tilde{\Sigma} &= (\Pi+\tilde{\Pi}_0-\Pi_0)^{-1} \\
\end{align}</math>
|8}}
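A direct implementation of Equation 8 is straightforward. The sketch below mirrors the role of the SPM routine spm_log_evidence_reduce.m, but it is written purely for illustration; it includes a sanity check in a one-dimensional conjugate-Gaussian setting with made-up numbers, where the reduced log evidence is also available in closed form.

<syntaxhighlight lang="python">
import numpy as np

def bayesian_model_reduction(mu0, S0, mu, S, mu0r, S0r, F=0.0):
    """Analytic Bayesian model reduction for Gaussian priors and posteriors (Equation 8).

    mu0, S0   : prior mean and covariance of the full model
    mu,  S    : posterior mean and covariance of the full model
    mu0r, S0r : prior mean and covariance of the reduced model
    F         : free energy (approximate log evidence) of the full model
    """
    P0, P = np.linalg.inv(S0), np.linalg.inv(S)      # full prior and posterior precisions
    P0r = np.linalg.inv(S0r)                         # reduced prior precision
    Pr = P + P0r - P0                                # reduced posterior precision
    Sr = np.linalg.inv(Pr)                           # reduced posterior covariance
    mur = Sr @ (P @ mu + P0r @ mu0r - P0 @ mu0)      # reduced posterior mean
    _, logdet = np.linalg.slogdet(P0r @ P @ Sr @ S0)
    Fr = (0.5 * logdet
          - 0.5 * (mu @ P @ mu + mu0r @ P0r @ mu0r
                   - mu0 @ P0 @ mu0 - mur @ Pr @ mur)
          + F)
    return Fr, mur, Sr

# Sanity check in a 1-D conjugate setting (made-up numbers): one observation y ~ N(theta, 1)
# with full prior N(0, 1) and y = 1.5 gives the posterior N(0.75, 0.5) in closed form.
y = 1.5
mu0, S0 = np.array([0.0]), np.array([[1.0]])
mu, S = np.array([0.75]), np.array([[0.5]])
F_full = -0.5 * np.log(2 * np.pi * 2.0) - y**2 / (2 * 2.0)      # log N(y; 0, 1 + 1)

mu0r, S0r = np.array([0.0]), np.array([[0.25]])                 # reduced prior N(0, 0.5^2)
F_red, mu_red, S_red = bayesian_model_reduction(mu0, S0, mu, S, mu0r, S0r, F_full)

F_direct = -0.5 * np.log(2 * np.pi * 1.25) - y**2 / (2 * 1.25)  # log N(y; 0, 1 + 0.25)
print(F_red, F_direct)   # the two log evidences should agree
</syntaxhighlight>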
 
:<math> \dot{x}=f(x,u,\theta) </math>
:<math> y= L(\theta)x_0+\epsilon </math>

where <math>x_0</math> represents the trans-membrane potential of pyramidal cells and <math>L(\theta)</math> is a lead field matrix coupling electrical sources to the EEG channels (Kiebel ''et al.'', 2006).

== Example ==
[[File:Example full and reduced priors.png|thumb|Example priors. In a 'full' model, left, a parameter has a Gaussian prior with mean 0 and standard deviation 0.5. In a 'reduced' model, right, the same parameter has prior mean zero and standard deviation 1/1000. Bayesian model reduction enables the evidence and parameter(s) of the reduced model to be derived from the evidence and parameter(s) of the full model.]]
Consider a model with a parameter <math>\theta</math> and Gaussian prior <math>p(\theta)=N(0,0.5^2)</math>, which is the Normal distribution with mean zero and standard deviation 0.5 (illustrated in the Figure, left). This prior says that without any data, the parameter is expected to have value zero, but we are willing to entertain positive or negative values (with a 99% confidence interval [-1.16 1.16]). The model with this prior is fitted to the data, to provide an estimate of the parameter <math>q(\theta)</math> and the model evidence <math>p(y)</math>.
 
Within each subpopulation, the evolution of neuronal states rests on two operators. The first transforms the average density of pre-synaptic inputs into the average post-synaptic [[membrane potential]]. This is modelled by a linear transformation with excitatory and inhibitory kernels parameterised by <math>H_{e,i}</math> and <math>\tau_{e,i}\ .</math> <math>H_{e,i}\subset\theta</math> control the maximum post-synaptic potential, and <math>\tau_{e,i}\subset\theta</math> represent lumped rate-constants. The second operator ''S'' transforms the average potential of each subpopulation into an average firing rate. This is assumed to be an instantaneous process that follows a sigmoid function (Marreiros ''et al.'', 2008b). Interactions among the subpopulations depend on the constants <math>\gamma_{1,2,3,4}\ ,</math> which control the strength of intrinsic connections and reflect the total number of [[synapses]] expressed by each subpopulation.
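The two operators can be sketched in code. Below, the post-synaptic kernel is integrated as an equivalent second-order differential equation and the resulting potential is passed through a sigmoid of the Jansen and Rit (1995) form; the parameter values are illustrative and the input is an arbitrary brief pulse, not the defaults of any DCM release.

<syntaxhighlight lang="python">
import numpy as np

# The two operators of the neural mass model; all values are illustrative,
# in the spirit of Jansen and Rit (1995).
H, tau = 3.25, 0.01        # maximum post-synaptic potential (mV) and rate constant (s)
r, v0, e0 = 0.56, 6.0, 2.5 # sigmoid slope, half-maximum potential and half-maximum rate

def sigmoid(v):
    """Operator 2: average membrane potential -> average firing rate."""
    return 2.0 * e0 / (1.0 + np.exp(r * (v0 - v)))

dt, T = 1e-4, 0.3
t = np.arange(0.0, T, dt)
inp = (t < 0.01) * 100.0   # brief pre-synaptic input (arbitrary units)

v, dv = 0.0, 0.0           # post-synaptic potential and its first derivative
rate = np.zeros_like(t)
for k in range(t.size):
    # Operator 1: second-order dynamics equivalent to convolving the input
    # with the kernel (H / tau) * t * exp(-t / tau)
    ddv = (H / tau) * inp[k] - (2.0 / tau) * dv - v / tau**2
    dv += dt * ddv
    v += dt * dv
    rate[k] = sigmoid(v)   # firing rate passed on to other subpopulations

print(rate.max())
</syntaxhighlight>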
To assess whether the parameter contributed to the model evidence, i.e. whether we learnt anything about this parameter, an alternative 'reduced' model is specified in which the parameter has a prior with a much smaller variance, e.g. <math>\tilde{p}(\theta)=N(0,0.001^2)</math>. This is illustrated in the Figure (right). This prior effectively 'switches off' the parameter, saying that we are almost certain that it has value zero. The parameter <math>\tilde{q}(\theta)</math> and evidence <math>\tilde{p}(y)</math> for this reduced model are rapidly computed from the full model using Bayesian model reduction.
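Assuming the Gaussian reduction function sketched after Equation 8, this comparison could be scripted as follows. The full-model posterior and free energy below are made-up placeholders for the output of model fitting; only the priors correspond to the example in the figure.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical output of fitting the full model; these numbers are placeholders.
mu0, S0 = np.array([0.0]), np.array([[0.5**2]])   # full prior N(0, 0.5^2)
mu, S = np.array([0.3]), np.array([[0.1**2]])     # full posterior (illustrative)
F_full = -12.0                                    # full-model free energy (illustrative)

# Reduced prior that effectively switches the parameter off
mu0r, S0r = np.array([0.0]), np.array([[0.001**2]])

F_red, mu_red, S_red = bayesian_model_reduction(mu0, S0, mu, S, mu0r, S0r, F_full)
log_bf = F_full - F_red        # log Bayes factor in favour of the full model
print(log_bf, np.exp(log_bf))
</syntaxhighlight>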
 
== Model evidence and selection ==
The hypothesis that the parameter contributed to the model is then tested by comparing the full and reduced models via the [[Bayes factor]], which is the ratio of model evidences:
 
<math>BF=\frac{p(y)}{\tilde{p}(y)}</math>

The larger this ratio, the greater the evidence for the full model, which included the parameter as a free parameter. Conversely, the stronger the evidence for the reduced model, the more confident we can be that the parameter did not contribute. Note that this method is not specific to comparing 'switched on' or 'switched off' parameters, and any intermediate setting of the priors could also be evaluated.

Bayesian model selection (BMS) is a technique for determining the most likely among a set of competing hypotheses about the mechanisms that generated observed data. In the context of DCM, BMS is used to distinguish between different system architectures. Model comparison and selection rest on the model evidence <math>p(y|m)\ ;</math> ''i.e.'' the probability of observing the data ''y'' under a particular model ''m''. The model evidence is obtained by integrating out dependencies on the model parameters:

<math>
p(y|m)=\int p(y|\theta,m)p(\theta|m)d\theta
</math>
In DCM, model inversion, comparison and reduction are carried out using a computationally tractable approximation to the (negative) log-evidence, namely the free energy ''F'' (see the equation for ''F'' below), which properly accounts for dependencies among the priors and posteriors.
 
For a given DCM, say model ''m'', inversion corresponds to approximating the moments of the posterior or conditional distribution given by Bayes rule
<math>
p(\theta|y,m)= \frac{ p(y|\theta,m)p(\theta|m)}{p(y|m)}
</math>

== Applications ==

=== Neuroimaging ===
Bayesian model reduction was initially developed for use in neuroimaging analysis <ref name=":0" /><ref>{{Cite journal|last=Rosa|first=M.J.|last2=Friston|first2=K.|last3=Penny|first3=W.|date=June 2012|title=Post-hoc selection of dynamic causal models|url=https://doi.org/10.1016/j.jneumeth.2012.04.013|journal=Journal of Neuroscience Methods|volume=208|issue=1|pages=66–78|doi=10.1016/j.jneumeth.2012.04.013|issn=0165-0270|pmc=PMC3401996|pmid=22561579|via=}}</ref>, in the context of modelling brain connectivity, as part of the [[Dynamic causal modelling]] framework (where it was originally referred to as post-hoc Bayesian model selection<ref name=":0" />). Dynamic causal models (DCMs) are differential equation models of brain dynamics <ref>{{Cite journal|last=Friston|first=K.J.|last2=Harrison|first2=L.|last3=Penny|first3=W.|date=August 2003|title=Dynamic causal modelling|url=https://doi.org/10.1016/S1053-8119(03)00202-7|journal=NeuroImage|volume=19|issue=4|pages=1273–1302|doi=10.1016/s1053-8119(03)00202-7|issn=1053-8119|via=}}</ref>. The experimenter specifies multiple competing models which differ in their priors, e.g. in the choice of parameters which are fixed at their prior expectation of zero. Having fitted a single 'full' model with all parameters of interest informed by the data, Bayesian model reduction enables the evidence and parameters for competing models to be rapidly computed, in order to test hypotheses. These models can be specified manually by the experimenter, or searched over automatically, in order to 'prune' any redundant parameters which do not contribute to the evidence.
 
Inversion of a DCM involves minimizing the free energy, ''F'', in order to maximize the model evidence or marginal likelihood (''cf.'' "type-II likelihood"; Good, 1965). The posterior moments (mean and [[covariance]]) are updated iteratively using variational Bayes under a fixed-form Laplace (''i.e.'', Gaussian) approximation <math> q(\theta) </math> to the conditional density. This can be regarded as an Expectation-Maximization (EM) algorithm (Dempster ''et al.'', 1977) that employs a local linear approximation of the predicted responses around the current conditional expectation. This [[Bayesian]] method was developed for dynamic system models based on differential equations. In contrast, conventional inversions of state-space models typically use maximum likelihood methods and operate in discrete time (''cf.'' Valdes ''et al.'', 1999). Generalisations of this variational (Laplace) scheme extend the scope of DCM to cover models based on stochastic differential equations and difference equations (Friston ''et al.'' 2008; Daunizeau ''et al.'' 2009a).
Bayesian model reduction was subsequently generalised and applied to other forms of Bayesian models, for example [[Empirical Bayes method|Parametric Empirical Bayes (PEB)]] models of group effects<ref name=":1" />. Here, it is used to compute the evidence and parameters for any given level of a hierarchical model under constraints (empirical priors) imposed by the level above.
The basic Variational scheme for DCM can be summarized as follows (where ''λ'' is the error variance and ''q'' is the conditional density):
 
:<math>\text{E-step: } q \leftarrow \arg\min_{q} F(q,\lambda,m)</math>
:<math>\text{M-step: } \lambda \leftarrow \arg\min_{\lambda} F(q,\lambda,m)</math>

:<math>\begin{align}
F(q,\lambda,m) &= \Big\langle \ln q(\theta)-\ln p(y|\theta,\lambda)-\ln p(\theta|m) \Big\rangle_q \\
&= \operatorname{KL}\Big(q \,\|\, p(\theta|y,\lambda,m)\Big) - \ln p(y|\lambda,m)
\end{align}</math>

The free-energy is the Kullback–Leibler divergence (denoted by ''KL'') between the true and approximate conditional densities, minus the log-evidence. This means that when the free-energy is minimised, the discrepancy between the true and approximate conditional density is suppressed. At this point the free-energy approximates the negative log-evidence: <math> F \approx -\ln p(y|\lambda,m) </math> (Friston ''et al.'', 2007; Penny ''et al.'', 2004). Model selection is based on this approximation; the best model is characterised by the greatest log-evidence (''i.e.'' the smallest free-energy). Pairwise model comparisons can be conveniently described by [[Bayes factor|Bayes factors]] (Kass and Raftery, 1995).

=== Neurobiology ===
Bayesian model reduction has been used to explain functions of the brain. By analogy to its use in eliminating redundant parameters from models of experimental data, it has been proposed <ref>{{Cite journal|last=Friston|first=Karl J.|last2=Lin|first2=Marco|last3=Frith|first3=Christopher D.|last4=Pezzulo|first4=Giovanni|last5=Hobson|first5=J. Allan|last6=Ondobaka|first6=Sasha|date=October 2017|title=Active Inference, Curiosity and Insight|url=https://doi.org/10.1162/neco_a_00999|journal=Neural Computation|language=en|volume=29|issue=10|pages=2633–2683|doi=10.1162/neco_a_00999|issn=0899-7667|via=}}</ref> that the brain eliminates redundant parameters of internal models of the world while offline (e.g. during sleep <ref>{{Cite journal|last=Tononi|first=Giulio|last2=Cirelli|first2=Chiara|date=February 2006|title=Sleep function and synaptic homeostasis|url=https://doi.org/10.1016/j.smrv.2005.05.002|journal=Sleep Medicine Reviews|volume=10|issue=1|pages=49–62|doi=10.1016/j.smrv.2005.05.002|issn=1087-0792|via=}}</ref>).
== Software implementations ==
 
Bayesian model reduction is implemented in the [[Statistical parametric mapping|Statistical Parametric Mapping]] toolbox, in the [[MATLAB|Matlab]] function [https://github.com/spm/spm12/blob/master/spm_log_evidence_reduce.m spm_log_evidence_reduce.m].
<math>
BF_{i,j} = \frac {p(y|m_i)}{p(y|m_j)}
</math>
 
Raftery (1995) presents an interpretation of the BF as providing weak (BF < 3), positive (3 ≤ BF < 20), strong (20 ≤ BF < 150) or very strong (BF ≥ 150) evidence for preferring one model over another. Strong evidence in favor of one model thus requires the difference in log-evidence to be three or more (Penny ''et al.'' 2004). Under flat priors on models, this corresponds to a conditional confidence that the winning model is exp(3) ≈ 20 times more likely than the alternative. From the equations above, it can be seen that the Bayes factor is simply the exponential of the difference in log-evidences.
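As a worked illustration, free-energy approximations to the log-evidence can be converted into Bayes factors and, under flat priors over models, into posterior model probabilities; the log-evidence values below are made up.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical free-energy approximations to the log-evidence for three models.
log_evidence = np.array([-250.0, -247.0, -253.0])

# Bayes factor of model 2 over model 1: exponential of the log-evidence difference.
bf_21 = np.exp(log_evidence[1] - log_evidence[0])   # exp(3), roughly 20: strong evidence

# Posterior model probabilities under flat priors over models (a softmax of log-evidences).
p_model = np.exp(log_evidence - log_evidence.max())
p_model /= p_model.sum()

print(bf_21, p_model)
</syntaxhighlight>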
 
The search for the best model precedes (and is often more important than) inference on the parameters of the model selected. Many studies have used BMS to adjudicate among competing DCMs for fMRI (Acs and Greenlee, 2008; Allen ''et al.'', 2008; Grol ''et al.'', 2007; Heim ''et al.'', 2009; Kumar ''et al.'', 2007; Leff ''et al.'', 2008; Smith ''et al.'', 2006; Stephan ''et al.'', 2007c; Summerfield and Koechlin, 2008) and EEG data (Garrido ''et al.'', 2008; Garrido ''et al.'', 2007). This approach, to search for a single best model (amongst those deemed plausible ''a priori'') and then proceed to inference on its parameters, is pursued most often and could be complemented with diagnostic model checking procedures as, for example, suggested by Box (1980). However, alternatives to this single-model approach exist. For example, one can partition model space and make inferences about model families (Stephan ''et al.'' 2009; Penny ''et al.'' 2010). Alternatively, one can use Bayesian model averaging, where the parameter estimates of each model considered are weighted by the posterior probability of the model (Hoeting ''et al.'' 1999; Penny ''et al.'' 2010).
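The averaging idea can be illustrated with a minimal sketch of Bayesian model averaging over posterior means; the log-evidences and parameter estimates below are invented, and a full treatment would average posterior densities rather than just their means.

<syntaxhighlight lang="python">
import numpy as np

# Invented log-evidences and posterior means of one coupling parameter for three models.
log_evidence = np.array([-240.0, -238.0, -245.0])
param_means = np.array([0.35, 0.42, 0.10])

# Posterior model probabilities under flat priors over models
p_model = np.exp(log_evidence - log_evidence.max())
p_model /= p_model.sum()

# Bayesian model average of the parameter: estimates weighted by model probabilities
bma_mean = np.sum(p_model * param_means)
print(p_model, bma_mean)
</syntaxhighlight>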
 
 
== Applications: fMRI ==
 
The use of DCM for fMRI is demonstrated by analysing data acquired under a study of attentional modulation during ''[[visual motion]]'' ''processing'' (Büchel and Friston, 1997). These data have been used previously to validate DCM (Friston ''et al.'', 2003) and are available from http://www.fil.ion.ucl.ac.uk/spm/data. The experimental manipulations were encoded as three exogenous inputs: A ''photic stimulation'' input indicated when dots were presented on a screen, a ''motion'' variable indicated that the dots were moving and the ''attention'' variable indicated that the subject was attending to possible velocity changes. The activity was modelled in three regions V1, V5 and superior parietal [[cortex]] (SPC).
 
Three different DCMs are specified, each of which embodies different assumptions about how attention modulates connectivity between V1 and V5. Model 1 assumes that attention modulates the forward connection from V1 to V5, model 2 assumes that attention modulates the backward connection from SPC to V5 and model 3 assumes attention modulates both connections. Each model assumes that the effect of motion is to modulate the connection from V1 to V5 and uses the same reciprocal hierarchical intrinsic connectivity. The models were fitted and the Bayes factors provided consistent evidence in favour of the hypothesis embodied in model 1, that attention modulates the forward connection from V1 to V5.
 
 
{|
|[[Image:Fig4A.png|thumb|400px|center|Fig4A|DCM applied to data from a study on attention to visual motion by (Büchel and Friston, 1997). In all models, photic stimulation enters V1 and motion modulates the connection from V1 to V5. All models have reciprocal and hierarchically organised connectivity. They differ in how attention (red) modulates the connectivity to V5; with model 1 assuming modulation of the forward connection (V1 to V5), model 2 assuming modulation of the backward connection (SPC to V5) and model 3 assuming both. The broken lines indicate the modulatory connections, adapted from (Penny ''et al.'', 2004).]]
 
|[[Image:Fig5A.png|thumb|400px|center|Fig5A|Nonlinear DCM for fMRI applied to the attention to motion paradigm. Left panel: Numbers alongside the connections indicate the ''maximum a posteriori'' (MAP) parameter estimates. Right panel: Posterior density of the estimate for the nonlinear modulation parameter for the V1→V5 connection. Given the mean and variance of this posterior density, we can be 99.1% confident that the true parameter value is larger than zero or, in other words, that there is an increase in gain of V5 responses to V1 inputs that are mediated by parietal activity. Adapted from (Stephan ''et al.'', 2008).]]
|}
 
Note that this model does not specify the source of the attentional top-down effect. This becomes possible with nonlinear dynamic causal models (Stephan ''et al.'' 2008). Nonlinear DCM for fMRI enables one to model how activity in one population gates connection strengths among others. <figref>Fig5A.png</figref> shows an application to the previous example where parietal activity, induced by attention to motion, modulates the connection from V1 to V5.
 
== Applications: Evoked responses ==
 
To illustrate DCM for event-related responses (ERPs), data acquired under a mismatch negativity (MMN) paradigm (http://www.fil.ion.ucl.ac.uk/spm/data) are used. In this example, various models over twelve subjects are compared. The results shown are part of a program that considered the MMN and its underlying mechanisms (Garrido ''et al.'', 2007). Three plausible models were specified under an architecture motivated by electrophysiological and neuroimaging MMN studies (Doeller ''et al.'', 2003; Opitz ''et al.'', 2002). Each has five sources, modelled as equivalent current dipoles (ECDs) (Kiebel ''et al.'', 2006), over left and right primary auditory cortex (A1), left and right superior temporal gyrus (STG) and right inferior frontal gyrus (IFG). An exogenous (auditory) input enters bilaterally at A1, and the A1 sources are connected to their ipsilateral STG. Right STG is connected to the right IFG. Inter-hemispheric (lateral) connections are placed between left and right STG. All connections are reciprocal (''i.e.'', connected with forward and backward connections or with bilateral connections).
 
Three models were tested, which differed in the connections which could show putative repetition-dependent changes, ''i.e.'', differences between listening to standard or deviant tones. Models F, B and FB allowed changes in forward, backward and both, respectively. All three models were compared against a baseline or null model, which had the same architecture but precluded any coupling changes between standard and deviant trials.
 
 
{|
|[[Image:Fig6A.png|thumb|400px|center| Model specification. Sources are connected with forward (dark grey), backward (grey) or lateral (light grey) connections. A1: primary auditory cortex, STG: superior temporal gyrus, IFG: inferior frontal gyrus. Three different models were tested within the same architecture, allowing for repetition-related changes in forward F, backward B and forward and backward FB connections, respectively. The broken lines indicate the connections that were allowed to change, adapted from (Garrido ''et al.'', 2007).]]
 
|[[Image:Fig7A.png|thumb|400px|center| Bayesian model selection among DCMs for the three models, F, B and FB, expressed relative to a null model in which no connections were allowed to change across conditions. The graphs show the negative free-energy approximation to the log-evidence. ('''Left''') Log-evidence for models F, B, and FB for each subject (relative to the null). The diamond attributed to each subject identifies the best model on the basis of the subject’s highest log-evidence. ('''Right''') Log-evidence at the group level, ''i.e.'', pooled over subjects, for the three models, adapted from (Garrido ''et al.'', 2007).]]
|}
 
Bayesian model selection based on the increase in log-evidence over the null model was performed for all subjects. The log-evidences of the three models, relative to the null model (for each subject), reveal that they are substantially better than the null model in all subjects. In particular, the FB-model was best in seven out of eleven subjects. The sum of the log-evidences over subjects (which is equivalent to the log group Bayes factor, see below) showed that there was very strong evidence in favour of model FB at the group level.
 
== Hierarchical model comparison ==
 
Comparison at the between-subject level has been used extensively in previous group studies using the group Bayes factor (GBF). The GBF is simply the product of Bayes factors over subjects and constitutes a fixed-effects analysis. It has been used to decide between competing DCMs for fMRI (Acs and Greenlee, 2008; Allen ''et al.'', 2008; Grol ''et al.'', 2007; Heim ''et al.'', 2009; Kumar ''et al.'', 2007; Leff ''et al.'', 2008; Smith ''et al.'', 2006; Stephan ''et al.'', 2007c; Summerfield and Koechlin, 2008) and EEG data (Garrido ''et al.'', 2008; Garrido ''et al.'', 2007).
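Because the group Bayes factor is a product of per-subject Bayes factors, it is most conveniently computed by summing log-evidence differences over subjects, as in the following sketch with invented numbers.

<syntaxhighlight lang="python">
import numpy as np

# Invented per-subject log-evidences for two models (rows: subjects; columns: models).
log_evidence = np.array([[-300.0, -296.0],
                         [-310.0, -305.0],
                         [-290.0, -291.0]])

# Fixed-effects comparison: the group Bayes factor is the product of per-subject
# Bayes factors, i.e. the exponential of the summed log-evidence differences.
log_gbf = np.sum(log_evidence[:, 1] - log_evidence[:, 0])
print(log_gbf, np.exp(log_gbf))
</syntaxhighlight>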
 
When the functional architecture is unlikely to differ across subjects, the conventional GBF is both sufficient and appropriate. However, subjects may exhibit different models or functional architectures; for example, due to different cognitive strategies or pathology. In this case, a hierarchical random effects procedure is required (Stephan ''et al.'', 2009). This rests on treating the model as a random variable and estimating the parameters of a Dirichlet distribution describing the probabilities of all models considered. These probabilities then define a multinomial distribution over model space, allowing one to compute how likely it is that a specific model generated the data of a randomly chosen subject (and the exceedance probability that one model is more likely than any other).
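Given estimated Dirichlet parameters (however they are obtained from the scheme of Stephan ''et al.'', 2009), expected model frequencies and exceedance probabilities can be approximated by sampling; the Dirichlet parameters below are made up for illustration.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Invented Dirichlet parameters over three models, standing in for the output of the
# random-effects scheme of Stephan et al. (2009).
alpha = np.array([8.0, 3.0, 2.0])

# Expected model frequencies, and exceedance probabilities (the probability that each
# model is more frequent in the population than any other), estimated by sampling.
samples = rng.dirichlet(alpha, size=100_000)
expected_freq = alpha / alpha.sum()
exceedance = np.bincount(samples.argmax(axis=1), minlength=alpha.size) / samples.shape[0]

print(expected_freq, exceedance)
</syntaxhighlight>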
 
== DCM developments ==
DCM combines a biophysical model of the hidden (latent) dynamics with a forward model that translates hidden states into predicted measurements, furnishing an explicit generative model of how observed data were caused (Friston, 2009). This means the exact form of the DCM changes with each application and speaks to their progressive refinement:
 
Since its inception (Friston ''et al.'', 2003), a number of developments have improved and extended DCM: For fMRI, models of precise temporal sampling (Kiebel ''et al.'', 2007), multiple hidden states per region (Marreiros ''et al.'', 2008a), a refined hemodynamic model (Stephan ''et al.'', 2007c) and a nonlinear neuronal model (Stephan ''et al.'', 2008) have been introduced. DCM for EEG/MEG (David ''et al.'', 2006) has also seen rapid developments: DCM with lead-field parameterization (Kiebel ''et al.'', 2006), DCM for induced responses (Chen ''et al.'', 2008), DCM for neural-mass and mean-field models (Marreiros ''et al.'', 2009), DCM for spectral responses (Moran ''et al.'', 2009), stochastic DCMs (Daunizeau ''et al.'', 2009b) and DCM for phase-coupling (Penny ''et al.'', 2009). A review on developments for M/EEG data can be found in (Kiebel ''et al.'', 2008).
 
In relation to model selection, a hierarchical variational Bayesian framework (Stephan ''et al.'', 2009) accounts for random effects at the between-subjects level, ''e.g.'' when dealing with group heterogeneity or outliers. This work was extended by (Penny ''et al.'', 2010) to allow for comparisons between model families of arbitrary size and for Bayesian model averaging within model families.
 
== Recommended reading ==
 
Friston, K., Ashburner, J., Kiebel, S., Nichols, T., Penny, W., 2006. Statistical Parametric Mapping: The Analysis of Functional Brain Images. ''Elsevier'', London.
 
Friston, K., 2009. Causal modelling and brain connectivity in functional magnetic resonance imaging. ''PLoS Biol'' 7, e33.
 
David, O., Guillemain, I., Baillet, S., Reyt, S., Deransart, C., Segebarth, C., Depaulis, A., 2008. Identifying neural drivers with functional MRI: an electrophysiological validation. ''PLoS Biol'' 6, 2683-2697.
 
Penny, W.D., Stephan, K.E., Mechelli, A., Friston, K.J., 2004. Modelling functional integration: a comparison of structural equation and dynamic causal models. ''Neuroimage'' 23: S264-274.
 
Kiebel, S.J., Garrido, M.I., Moran, R.J., Friston, K.J., 2008. Dynamic causal modelling for EEG and MEG. ''Cogn Neurodyn'' 2, 121-136.
 
Stephan, K.E., Harrison, L.M., Kiebel, S.J., David, O., Penny, W.D., Friston, K.J., 2007. Dynamic causal models of neural system dynamics: current state and future extensions. ''J Biosci'' 32, 129-144.
 
 
'''Internal references'''
 
* Lawrence M. Ward (2008) [[Attention]]. Scholarpedia, 3(10):1538.
 
* Jan A. Sanders (2006) [[Averaging]]. Scholarpedia, 1(11):1760.
 
* David Spiegelhalter and Kenneth Rice (2009) [[Bayesian statistics]]. Scholarpedia, 4(8):5230.
 
* Valentino Braitenberg (2007) [[Brain]]. Scholarpedia, 2(11):2918.
 
* Olaf Sporns (2007) [[Brain connectivity]]. Scholarpedia, 2(10):4695.
 
* Olaf Sporns (2007) [[Complexity]]. Scholarpedia, 2(10):1623.
 
* Julia Berzhanskaya and Giorgio Ascoli (2008) [[Computational neuroanatomy]]. Scholarpedia, 3(3):1313.
 
* James Meiss (2007) [[Dynamical systems]]. Scholarpedia, 2(2):1629.
 
* Paul L. Nunez and Ramesh Srinivasan (2007) [[Electroencephalogram]]. Scholarpedia, 2(2):1348.
 
* Tomasz Downarowicz (2007) [[Entropy]]. Scholarpedia, 2(11):3901.
 
* Giovanni Gallavotti (2008) [[Fluctuations]]. Scholarpedia, 3(6):5893.
 
* William D. Penny and Karl J. Friston (2007) [[Functional imaging]]. Scholarpedia, 2(5):1478.
 
* Seiji Ogawa and Yul-Wan Sung (2007) [[Functional magnetic resonance imaging]]. Scholarpedia, 2(10):3105.
 
* Anil Seth (2007) [[Granger causality]]. Scholarpedia, 2(7):1667.
 
* Tamas Freund and Szabolcs Kali (2008) [[Interneurons]]. Scholarpedia, 3(9):4720.
 
* Rodolfo Llinas (2008) [[Neuron]]. Scholarpedia, 3(8):1490.
 
* Brian N. Pasley and Ralph D. Freeman (2008) [[Neurovascular coupling]]. Scholarpedia, 3(3):5340.
 
* Marco M Picchioni and Robin Murray (2008) [[Schizophrenia]]. Scholarpedia, 3(4):4132.
 
* David H. Terman and Eugene M. Izhikevich (2008) [[State space]]. Scholarpedia, 3(3):1924.
 
* Anthony T. Barker and Ian Freeston (2007) [[Transcranial magnetic stimulation]]. Scholarpedia, 2(10):2936.
 
== References ==
 
<references />
*Acs, F., Greenlee, M.W., 2008. Connectivity modulation of early visual processing areas during covert and overt tracking tasks. ''Neuroimage'' 41, 380-388.
 
*Akaike, H., 1985. Prediction and Entropy. In A. C. Atkinson and S. E. Feinberg (eds.), A Celebration of Statistics. New York: ''Springer''. 1-24.
 
*Allen, P., Mechelli, A., Stephan, K.E., Day, F., Dalton, J., Williams, S., McGuire, P.K., 2008. Fronto-temporal interactions during overt verbal initiation and suppression. ''J Cogn Neurosci'' 20, 1656-1669.
 
*Box, G.E.P., 1980. ‘Sampling and Bayes’ Inference in Scientific Modelling and Robustness, ''J. Roy. Stat. Soc.'', Series A, Vol.143, 383-430.
 
*Büchel, C., Friston, K.J., 1997. Modulation of connectivity in visual pathways by attention: cortical interactions evaluated with structural equation modelling and fMRI. ''Cereb Cortex'' 7, 768-778.
 
*Buxton, R.B., Wong, E.C., Frank, L.R., 1998. Dynamics of blood flow and oxygenation changes during brain activation: the balloon model. ''Magn. Reson. Med.'' 39, 855-864.
 
*Chen, C.C., Kiebel, S.J., Friston, K.J., 2008. Dynamic causal modelling of induced responses. ''Neuroimage'' 41, 1293-1312.
 
*Daunizeau, J., Friston, K.J., 2007. A mesostate-space model for EEG and MEG. ''Neuroimage'' 38:67–81.
 
*Daunizeau, J., Friston, K.J., Kiebel, S.J., 2009a. Variational Bayesian identification and prediction of stochastic nonlinear dynamic causal models. ''Physica'', D 238, 2089–2118.
 
*Daunizeau, J., Kiebel, S.J., Friston, K.J., 2009b. Dynamic causal modelling of distributed electromagnetic responses. ''Neuroimage'' 47, 590-601.
 
*Daunizeau, J., David, O., Stephan, K.E., 2010. Dynamic Causal Modelling: a critical review of the biophysical and statistical foundations. ''Neuroimage'', in press.
 
*David, O., Friston, K.J., 2003. A neural mass model for MEG/EEG: coupling and neuronal dynamics. ''Neuroimage'' 20, 1743-1755.
 
*David, O., Harrison, L., Friston, K.J., 2005. Modelling event-related responses in the brain. ''Neuroimage'' 25, 756-770.
 
*David, O., Kiebel, S.J., Harrison, L.M., Mattout, J., Kilner, J.M., Friston, K.J., 2006. Dynamic causal modeling of evoked responses in EEG and MEG. ''Neuroimage'' 30, 1255-1272.
 
*Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete data via EM algorithm. ''Journal of the Royal Statistical Society Series B-Methodological'' 39, 1-38.
 
*Doeller, C.F., Opitz, B., Mecklinger, A., Krick, C., Reith, W., Schroger, E., 2003. Prefrontal cortex involvement in preattentive auditory deviance detection: neuroimaging and electrophysiological evidence. ''Neuroimage'' 20, 1270-1282.
 
*Ethofer, T., Anders, S., Erb, M., Herbert, C., Wiethoff, S., Kissler, J., Grodd, W., Wildgruber, D., 2006. Cerebral pathways in processing of affective prosody: a dynamic causal modeling study. ''Neuroimage'' 30, 580-587.
 
*Fairhall, S.L., Ishai, A., 2007. Effective connectivity within the distributed cortical network for face perception. ''Cereb Cortex'' 17, 2400-2406.
 
*Felleman, D.J., Van Essen, D.C., 1991. Distributed hierarchical processing in the primate cerebral cortex. ''Cereb Cortex'' 1, 1-47.
 
*Friston, K.J., Mechelli, A., Turner, R., Price, C.J., 2000. Nonlinear responses in fMRI: the Balloon model, Volterra kernels, and other hemodynamics. ''Neuroimage'' 12, 466-477.
 
*Friston, K.J., 2002. Bayesian estimation of dynamical systems: an application to fMRI. ''Neuroimage'' 16, 513-530.
 
*Friston, K.J., Harrison, L., Penny, W., 2003. Dynamic causal modelling. ''Neuroimage'' 19, 1273-1302.
 
*Friston, K., Mattout, J., Trujillo-Barreto, N., Ashburner, J., Penny, W., 2007. Variational free energy and the Laplace approximation. ''Neuroimage'' 34, 220-234.
 
*Friston, K.J., Trujillo-Barreto, N., Daunizeau, J., 2008. DEM: a variational treatment of dynamic systems. ''Neuroimage'' 41(3):849-85.
 
*Friston, K., 2009. Causal modelling and brain connectivity in functional magnetic resonance imaging. ''PLoS Biol'' 7, e33.
 
*Garrido, M.I., Kilner, J.M., Kiebel, S.J., Stephan, K.E., Friston, K.J., 2007. Dynamic causal modelling of evoked potentials: a reproducibility study. ''Neuroimage'' 36, 571-580.
 
*Garrido, M.I., Friston, K.J., Kiebel, S.J., Stephan, K.E., Baldeweg, T., Kilner, J.M., 2008. The functional anatomy of the MMN: a DCM study of the roving paradigm. ''Neuroimage'' 42, 936-944.
 
*Good, I.J., 1965. “The Estimation of Probabilities: An Essay on Modern Bayesian Methods”, Cambridge, Mass, ''MIT Press''.
 
*Grol, M.J., Majdandzic, J., Stephan, K.E., Verhagen, L., Dijkerman, H.C., Bekkering, H., Verstraten, F.A., Toni, I., 2007. Parieto-frontal connectivity during visually guided grasping. ''J Neurosci'' 27, 11877-11887.
 
*Heim, S., Eickhoff, S.B., Ischebeck, A.K., Friederici, A.D., Stephan, K.E., Amunts, K., 2009. Effective connectivity of the left BA 44, BA 45, and inferior temporal gyrus during lexical and phonological decisions identified with DCM. ''Hum Brain Mapp'' 30, 392-402.
 
*Hoeting, J.A., Madigan, D., Raftery, A.E., Volinsky, C.T., 1999. Bayesian model averaging: a tutorial. ''Stat. Sci.'' 14, 382–401.
 
*Jansen, B.H., Rit, V.G., 1995. Electroencephalogram and visual evoked potential generation in a mathematical model of coupled cortical columns. ''Biol Cybern'' 73, 357-366.
 
*Kass, R., Raftery, A., 1995. Bayes factors. ''Journal of the American Statistical Association'', 773-795.
 
*Kiebel, S.J., David, O., Friston, K.J., 2006. Dynamic causal modelling of evoked responses in EEG/MEG with lead field parameterization. ''Neuroimage'' 30, 1273-1284.
 
*Kiebel, S.J., Kloppel, S., Weiskopf, N., Friston, K.J., 2007. Dynamic causal modeling: a generative model of slice timing in fMRI. ''Neuroimage'' 34, 1487-1496.
 
*Kiebel, S.J., Garrido, M.I., Moran, R.J., Friston, K.J., 2008. Dynamic causal modelling for EEG and MEG. ''Cogn Neurodyn'' 2, 121-136.
 
*Kumar, S., Stephan, K.E., Warren, J.D., Friston, K.J., Griffiths, T.D., 2007. Hierarchical processing of auditory objects in humans. ''PLoS Comput Biol'' 3, e100.
 
*Leff, A.P., Schofield, T.M., Stephan, K.E., Crinion, J.T., Friston, K.J., Price, C.J., 2008. The cortical dynamics of intelligible speech. ''J Neurosci'' 28, 13209-13215.
 
*Marreiros, A.C., Kiebel, S.J., Friston, K.J., 2008a. Dynamic causal modelling for fMRI: a two-state model. ''Neuroimage'' 39, 269-278.
 
*Marreiros, A.C., Daunizeau, J., Kiebel, S.J., Friston, K.J., 2008b. Population dynamics: variance and the sigmoid activation function. ''Neuroimage'' 42, 147-157.
 
*Marreiros, A.C., Kiebel, S.J., Daunizeau, J., Harrison, L.M., Friston, K.J., 2009. Population dynamics under the Laplace assumption. ''Neuroimage'' 44, 701-714.
 
*Moran, R.J., Stephan, K.E., Seidenbecher, T., Pape, H.C., Dolan, R.J., Friston, K.J., 2009. Dynamic causal models of steady-state responses. ''Neuroimage'' 44, 796-811.
 
*Opitz, B., Rinne, T., Mecklinger, A., von Cramon, D.Y., Schroger, E., 2002. Differential contribution of frontal and temporal cortices to auditory change detection: fMRI and ERP results. ''Neuroimage'' 15, 167-174.
 
*Penny, W.D., Stephan, K.E., Mechelli, A., Friston, K.J., 2004. Comparing dynamic causal models. ''Neuroimage'' 22, 1157-1172.
 
*Penny, W.D., Litvak, V., Fuentemilla, L. Duzel, E., Friston, K., 2009. Dynamic Causal Models for Phase Coupling. ''J Neurosci Methods'', 183(1):19-30.
 
*Penny, W.D., Stephan, K.E., Daunizeau, J., Joao, M., Friston, K., Schofield, T., Leff, A.P., 2010. Comparing Families of Dynamic Causal Models. ''PLoS Computational Biology'', in press.
 
*Posner, M.I., Sheese, B.E., Odludas, Y., Tang, Y., 2006. Analyzing and shaping human attentional networks. ''Neural Netw'' 19, 1422-1429.
 
*Raftery, A.E., 1995. Bayesian model selection in social research. ''Sociological Methodology'' 1995, Vol 25, 111-163.
 
*Schwarz, G.E., 1978. "Estimating the dimension of a model". ''Annals of Statistics'' 6 (2): 461–464.
 
*Smith, A.P., Stephan, K.E., Rugg, M.D., Dolan, R.J., 2006. Task and content modulate amygdala-hippocampal connectivity in emotional retrieval. ''Neuron'' 49, 631-638.
 
*Stephan, K.E., Penny, W.D., Marshall, J.C., Fink, G.R., Friston, K.J., 2005. Investigating the functional role of callosal connections with dynamic causal models. ''Ann N Y Acad Sci'' 1064, 16-36.
 
*Stephan, K.E., Baldeweg, T., Friston, K.J., 2006. Synaptic plasticity and dysconnection in schizophrenia. ''Biol Psychiatry'' 59, 929-939.
 
*Stephan, K.E., Harrison, L.M., Kiebel, S.J., David, O., Penny, W.D., Friston, K.J., 2007a. Dynamic causal models of neural system dynamics: current state and future extensions. ''J Biosci'' 32, 129-144.
 
*Stephan, K.E., Marshall, J.C., Penny, W.D., Friston, K.J., Fink, G.R., 2007b. Interhemispheric integration of visual processing during task-driven lateralization. ''J Neurosci'' 27, 3512-3522.
 
*Stephan, K.E., Weiskopf, N., Drysdale, P.M., Robinson, P.A., Friston, K.J., 2007c. Comparing hemodynamic models with DCM. ''Neuroimage'' 38, 387-401.
 
*Stephan, K.E., Kasper, L., Harrison, L.M., Daunizeau, J., den Ouden, H.E., Breakspear, M., Friston, K.J., 2008. Nonlinear dynamic causal models for fMRI. ''Neuroimage'' 42, 649-662.
 
*Stephan, K.E., Penny, W.D., Daunizeau, J., Moran, R.J., Friston, K.J., 2009. Bayesian model selection for group studies. ''Neuroimage'' 46: 1004-1017.
 
*Stephan, K.E., Penny, W.D., Moran, R.J., Den Ouden, H.E., Daunizeau, J., Friston, K.J., 2010. Ten simple rules for dynamic causal modelling. ''Neuroimage'' 49: 3099-3109.
 
*Summerfield, C., Koechlin, E., 2008. A neural representation of prior information during perceptual inference. ''Neuron'' 59, 336-347.
 
== External links ==
 
http://www.fil.ion.ucl.ac.uk/spm/
 
http://www.fmrib.ox.ac.uk/fsl/
 
http://www.sccn.ucsd.edu/eeglab/
 
http://afni.nimh.nih.gov/afni/
 
http://www.humanbrainmapping.org/
 
http://www.elsevier.com/wps/find/journaldescription.cws_home/622925/description#description
 
http://www3.interscience.wiley.com/cgi-bin/jhome/38751
 
== See also ==
 
[[Computational Neuroanatomy]], [[Event-Related Brain Dynamics]], [[fMRI]], [[MEG]], [[MRI]], [[Models of Neurons]], [[Neurovascular Coupling]], [[Neural Networks]], [[Transcranial Magnetic Stimulation]]