Dynamic causal modeling: Difference between revisions

 
# Experimental design. Formulate specific hypotheses and conduct a neuroimaging experiment to test those hypotheses.
# Data preparation. Pre-process the acquired data (e.g. select relevant data features and remove confounds).
# Model specification. Specify one or more forward models (DCMs) of how the data were caused.
# Model estimation. Fit the model(s) to the data to determine their evidence and parameters.
# Model comparison. Compare the evidence for the models using Bayesian Model Comparison, at the single-subject or group level, and inspect the parameters of the model(s).
 
Each of these steps is briefly reviewed below.
 
=== 1. Experimental design ===
Functional neuroimaging experiments are typically task-based or [[Resting state fMRI|resting state]]. In task-based experiments, brain responses are evoked by known deterministic inputs (experimentally controlled stimuli) that embody designed changes in sensory stimulation or cognitive set. These experimental or exogenous variables can change neural activity in one of two ways. First, they can elicit responses through direct influences on specific brain regions. This would include, for example, sensory evoked responses in the early visual cortex. The second class of inputs exerts its effects vicariously, through a modulation of the coupling among nodes; for example, the influence of attention on the processing of sensory information. These two types of input, driving and modulatory, are parameterized separately in DCM. To enable efficient estimation of driving and modulatory effects, a 2x2 [[Factorial experiment|factorial experimental design]] is often used, with one factor modelled as the driving input and the other as the modulatory input.
 
Resting state experiments have no experimental manipulations within the period of the neuroimaging recording. Instead, the quantities of interest are endogenous fluctuations in brain connectivity during the scan, or differences in connectivity between scans or subjects. The DCM framework includes models and procedures for resting state, described below.
 
=== 2. Data preparation ===
For fMRI analysis, summary timeseries are generated for each brain region of interest. For MEG or EEG analysis, the desired data features are selected - e.g. [[Evoked potential|evoked potentials]] or induced responses.
 
=== 3. Model specification ===
Dynamic Causal Models (DCMs) are nonlinear state-space models in continuous time, parameterized in terms of directed effective connectivity between brain regions. Unlike [[Bayesian network|Bayesian networks]], DCMs can be cyclic, and unlike [[Structural equation modeling|structural equation modelling]] and [[Granger causality]], DCM does not depend on the theory of Martingales, i.e., it does not assume that random fluctuations are serially uncorrelated. Various models have been developed for use with DCM, and experimenters select a model based on the type of hypothesis they wish to address and the type of data they have collected.
 
The predominant model is DCM for evoked responses. It is a biologically plausible neural mass model, which emulates the activity of a cortical area using three neuronal subpopulations assigned to granular and agranular layers. A population of excitatory pyramidal (output) cells receives inputs from inhibitory and excitatory populations of [[interneurons]], via intrinsic connections (which are confined to the cortical sheet). Within this model, excitatory interneurons can be regarded as spiny stellate cells found predominantly in layer four and in receipt of forward connections. Excitatory pyramidal cells and inhibitory interneurons are considered to occupy agranular layers and receive backward and lateral inputs.
 
 
== 5. Model comparison ==
Model inversion or estimation is implemented in DCM using [[Variational Bayesian methods|variational Bayesian]] methods and provides two useful quantities. The log marginal likelihood or log model evidence <math>\ln{p(y|m)}</math> is the log probability of observing the data under the model. This cannot be calculated exactly, and in DCM it is approximated by a quantity called the negative variational free energy <math>F</math>. Hypotheses are tested using Bayesian model comparison, which involves comparing the evidence for different models based on their free energy. Model estimation also provides the posterior distribution over the parameters <math>p(\theta|y)</math> (for example, the connection strengths) that maximises the negative free energy.
Bayesian inversion furnishes the marginal likelihood (evidence) of the model and the posterior distribution of its parameters (e.g., neuronal coupling strengths). The evidence is used for Bayesian model selection (BMS) to disambiguate between competing models, while the posterior distribution of the parameters is used to characterise the model selected.
 
Neuroimaging studies typically investigate effects which are conserved at the group level, or which differ between subjects. There are two predominant approaches for group-level analysis: random effects Bayesian Model Selection (BMS) and Parametric Empirical Bayes (PEB). Random effects BMS posits that subjects differ in terms of which model generated their data - e.g. drawing a random subject from the population, there would be a 25% chance their data were generated by model 1 and a 75% chance their data were generated by model 2. The PEB approach is a hierarchical model over parameters (connection strengths). It eschews the notion of different models at the level of individual subjects, and posits that people differ in the (continuous) strength of their individual connections.
 
== DCM for fMRI ==
 
DCM for fMRI uses a deterministic low-order approximation (derived using a Taylor series) of neural dynamics in a network or graph of ''n'' interacting brain regions or nodes (Friston ''et al.'' 2003). The activity of each cortical region in the model is summarised by a single neuronal state, and the evolution of the resulting state vector ''x'' over time is given by the following bilinear differential equation:
 
:<math> \dot{x}=f(x,u,\theta)= Ax + \sum_{j=1}^m u_j B^{(j)} x + Cu </math>
:<math> </math>
:<math> A= \frac{\partial f}{\partial x}\bigg|_{u=0} \; \quad\; B^{(j)}= \frac{\partial^2 f}{\partial x\,\partial u_j} \; \quad\; C= \frac{\partial f}{\partial u}\bigg|_{x=0} </math>
 
where <math>\dot{x}= dx/dt\ .</math> The bilinear model is a parsimonious low-order approximation that accounts both for endogenous and exogenous causes of system dynamics. The matrix ''A'' represents the average coupling among nodes in the absence of exogenous input <math>u(t)\ .</math> This can be thought of as the latent coupling in the absence of experimental perturbations. The ''B'' matrices are effectively the change in latent coupling induced by the ''j-th'' input. They encode context-sensitive changes in ''A'' or, equivalently, the modulation of coupling by experimental manipulations. Because <math>B^{(j)}</math> are second-order derivatives they are referred to as ''Bilinear''. Finally, the matrix ''C'' embodies the influences of exogenous input that ''Cause'' perturbations of hidden states. The connectivity or coupling matrices to be estimated, <math>\theta \supset \{A, B, C\}</math>, define the functional architecture and interactions among brain regions at a neuronal level. <figref>Fig1A.png</figref> summarises this bilinear state-equation and shows the model in graphical form.
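
The roles of ''A'', ''B'' and ''C'' in the bilinear state equation can be illustrated with a minimal sketch. This is not the SPM implementation: the function names, the forward Euler integrator and the two-region coupling values are all assumptions chosen for illustration.

```python
def bilinear_rhs(x, u, A, B, C):
    """Evaluate dx/dt = A x + sum_j u_j B[j] x + C u for plain Python lists."""
    n, m = len(x), len(u)
    dx = []
    for i in range(n):
        total = 0.0
        for k in range(n):
            # Effective coupling k -> i: latent coupling plus input-dependent modulation
            coupling = A[i][k] + sum(u[j] * B[j][i][k] for j in range(m))
            total += coupling * x[k]
        # Direct (driving) influence of the inputs on region i
        total += sum(C[i][j] * u[j] for j in range(m))
        dx.append(total)
    return dx


def integrate_euler(x0, inputs, A, B, C, dt):
    """Forward Euler integration; `inputs` holds one input vector per time step."""
    x = list(x0)
    trajectory = [list(x)]
    for u in inputs:
        dx = bilinear_rhs(x, u, A, B, C)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        trajectory.append(list(x))
    return trajectory


# Hypothetical two-region model: region 1 drives region 2.
A = [[-1.0, 0.0],
     [0.5, -1.0]]            # latent coupling (self-decay on the diagonal)
B = [[[0.0, 0.0],
      [0.0, 0.0]],           # input 1 is purely driving, no modulation
     [[0.0, 0.0],
      [0.4, 0.0]]]           # input 2 strengthens the 1 -> 2 connection
C = [[1.0, 0.0],
     [0.0, 0.0]]             # input 1 enters region 1 only

rate_without_modulation = bilinear_rhs([1.0, 0.0], [0.0, 0.0], A, B, C)
rate_with_modulation = bilinear_rhs([1.0, 0.0], [0.0, 1.0], A, B, C)
trajectory = integrate_euler([0.0, 0.0], [[1.0, 0.0]] * 100, A, B, C, dt=0.01)
```

With the modulatory input switched on, the rate of change of the second region for the same state rises from 0.5 to 0.9, which is the context-sensitive change in coupling that the ''B'' matrices encode.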
 
[[Image:Fig1A.png|thumb|400px|left| (A) The bilinear state equation of DCM for fMRI. (B) An example of a DCM describing the dynamics in a simple hierarchical system of visual areas. This system consists of two areas, each represented by a single state variable <math>(x_1, x_2)\ .</math> Black arrows represent connections, grey arrows represent exogenous inputs and thin dotted arrows indicate the transformation from neural states (blue colour) into hemodynamic observations (red colour); see <figref>Fig2A.png</figref> for the hemodynamic forward model. The state equation for this particular model is shown on the right. Adapted from (Stephan ''et al.'', 2007a).]]
 
DCM for fMRI combines this bilinear model of neural dynamics with an empirically validated hemodynamic model that describes the transformation of neuronal activity into a BOLD response. This so-called “Balloon model” was initially formulated by Buxton ''et al.'' (1998) and later extended (Friston ''et al.'', 2000; Stephan ''et al.'', 2007c). In the hemodynamic model, changes in neural activity elicit a vasodilatory signal that leads to increases in blood flow and subsequently to changes in blood volume and deoxyhemoglobin content. The model is summarised schematically in <figref>Fig2A.png</figref>.
 
[[Image:Fig2A.png|thumb|400px|right|Fig2A| Schematic of the hemodynamic model used by DCM for fMRI. Neuronal activity induces a vasodilatory and activity-dependent signal ''s'' that increases blood flow ''f''. Blood flow causes changes in volume and deoxyhemoglobin (<math>v</math> and <math>q</math>). These two hemodynamic states enter an output nonlinearity, which results in a predicted BOLD response ''y''. In recent versions, this model has six hemodynamic parameters (Stephan ''et al.,'' 2007c): the rate constant of the vasodilatory signal decay (<math>\kappa</math>), the rate constant for auto-regulatory feedback by blood flow (<math>\gamma</math>), transit time (<math>\tau</math>), Grubb’s vessel stiffness exponent (<math>\alpha</math>), capillary resting net oxygen extraction (<math>E_0</math>), and ratio of intra-extravascular BOLD signal (<math>\epsilon</math>). <math>E</math> is the oxygen extraction function. This figure encodes graphically the transformation from neuronal states to hemodynamic responses; adapted from (Friston ''et al.'', 2003).]]
 
Together, the neuronal and hemodynamic state equations furnish a deterministic DCM. For any given combination of parameters <math>\theta</math> and inputs <math>u\ ,</math> the measured BOLD response <math>y</math> is modelled as the predicted BOLD signal (the generalised convolution of inputs; <math>h(x,u,\theta)</math>) plus a linear mixture of confounds <math>X\beta</math> (''e.g.'' signal drift) and Gaussian observation error <math>\epsilon\ :</math>
 
<math>y=h(x,u,\theta) + X\beta + \epsilon</math>
 
A schematic representation of the hierarchical structure of DCM is
<math> u \overset{f}{\longrightarrow} x \overset{g}{\longrightarrow} y </math>
 
where ''u'' influences the dynamics of hidden (neuronal) states of the system ''x'', through the evolution function ''f''; ''x'' is then mapped to the predicted data ''y'' through the observation function ''g''. The combined neural and hemodynamic parameters <math>\theta \supseteq \{A,B,C,\theta_h\}</math>, where <math>\theta_h</math> denotes the hemodynamic parameters, are estimated from the measured BOLD data, using a Bayesian scheme with empirical priors for the hemodynamic parameters and conservative shrinkage priors for the coupling parameters (see below). Once the parameters of a DCM have been estimated, the posterior distributions of the parameters can be used to test hypotheses about connection strengths (''e.g.'', Ethofer ''et al.'', 2006; Fairhall and Ishai, 2007; Grol ''et al.'', 2007; Kumar ''et al.'', 2007; Posner ''et al.'', 2006; Stephan ''et al.'', 2006; Stephan ''et al.'', 2007b; Stephan ''et al.'', 2005).
== DCM for evoked responses ==
 
DCM for evoked responses is a biologically plausible model for understanding how event-related responses result from the dynamics of coupled neural populations. It uses established connectivity rules in hierarchical brain systems to describe the dynamics of a network of coupled neuronal sources, each of which is modelled using a neural mass model (David and Friston, 2003; David ''et al.'', 2005; Jansen and Rit, 1995). The neural mass model emulates the activity of a cortical area using three neuronal subpopulations, assigned to granular and agranular layers. A population of excitatory pyramidal (output) cells receives inputs from inhibitory and excitatory populations of [[interneurons]], via intrinsic connections (which are confined to the cortical sheet). Within this model, excitatory interneurons can be regarded as spiny stellate cells found predominantly in layer four and in receipt of forward connections. Excitatory pyramidal cells and inhibitory interneurons are considered to occupy agranular layers and receive backward and lateral inputs.
 
[[Image:Fig3A.png|thumb|400px|right| Schematic of the DCM used to model evoked electrophysiological responses. This schematic shows the state equations describing the dynamics of sources or regions. Each neuronal source is modelled with three subpopulations (pyramidal, spiny stellate and inhibitory interneurons) which are connected by four intrinsic connections with weights <math>\gamma_{1,2,3,4}\ ,</math> as described in (Jansen and Rit, 1995) and (David and Friston, 2003). These have been assigned to granular and agranular cortical layers which receive forward <math>A^{F}</math>, backward <math>A^B</math> and lateral <math>A^L</math> connections respectively. Adapted from (Kiebel ''et al.'', 2008).]]
 
To model event-related responses, the network receives exogenous inputs via input connections. These connections are exactly the same as forward connections and deliver inputs to the spiny stellate cells. In the present context, inputs <math>u(t)</math> model sub-cortical auditory inputs. The vector <math>C\subset\theta</math> controls the influence of the input on each source. The lower, upper and leading diagonal matrices <math>A^{F},A^{B},A^{L}\subset\theta</math> encode forward, backward and lateral connections, respectively. The DCM here is specified in terms of the state equations and a linear output equation
 
:<math> \dot{x}=f(x,u,\theta) </math>
:<math> y= L(\theta)x_0+\epsilon </math>
 
where <math>x_0</math> represents the trans-membrane potential of pyramidal cells and <math>L(\theta)</math> is a lead field matrix coupling electrical sources to the EEG channels (Kiebel ''et al.'', 2006).
 
Within each subpopulation the evolution of neuronal states rests on two operators. The first transforms the average density of pre-synaptic inputs into the average postsynaptic [[membrane potential]]. This is modelled by a linear transformation with excitatory and inhibitory kernels parameterised by <math>H_{e,i}</math> and <math>\tau_{e,i}\ .</math> <math>H_{e,i}\subset\theta</math> control the maximum post-synaptic potential, and <math>\tau_{e,i}\subset\theta</math> represent lumped rate-constants. The second operator ''S'' transforms the average potential of each subpopulation into an average firing rate. This is assumed to be an instantaneous process that follows a sigmoid function (Marreiros ''et al.'', 2008b). Interactions among the subpopulations depend on constants <math>\gamma_{1,2,3,4}\ ,</math> which control the strength of intrinsic connections and reflect the total number of [[synapses]] expressed by each subpopulation.
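
The two operators can be sketched as follows. The text above does not give their functional forms, so both are assumptions here: the kernel uses the alpha-function form common to Jansen and Rit type neural mass models, and the sigmoid is a generic zero-centred logistic function. The parameter values and function names are illustrative only.

```python
import math


def synaptic_kernel(t, H, tau):
    """Assumed kernel h(t) = (H / tau) * t * exp(-t / tau) for t >= 0, else 0.

    H scales the peak post-synaptic potential; tau is the lumped time constant.
    """
    if t < 0:
        return 0.0
    return (H / tau) * t * math.exp(-t / tau)


def sigmoid_firing_rate(v, slope=1.0):
    """Assumed zero-centred logistic mapping from potential to firing rate."""
    return 1.0 / (1.0 + math.exp(-slope * v)) - 0.5


# Illustrative excitatory parameters (hypothetical values, not fitted estimates)
H_e, tau_e = 4.0, 0.008

kernel_peak = synaptic_kernel(tau_e, H_e, tau_e)   # the kernel peaks at t = tau
rates = [sigmoid_firing_rate(v) for v in (-2.0, 0.0, 2.0)]
```

Under this assumed kernel the response peaks at ''t'' = ''τ'' with height ''H''/''e'', so ''H'' sets the amplitude and ''τ'' the time scale of the post-synaptic response.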
 
== Model evidence and selection ==
 
Bayesian model selection (BMS) is a technique for determining the most likely among a set of competing hypotheses about the mechanisms that generated observed data. In the context of DCM, BMS is used to distinguish between different systems architectures. Model comparison and selection rests on the model evidence <math>p(y|m)\ ;</math> ''i.e.'' the probability of observing the data ''y'' under a particular model ''m''. The model evidence is obtained by integrating out dependencies on the model parameters
 
<math>
p(y|m)=\int p(y|\theta,m)p(\theta|m)d\theta
</math>
In DCM, model inversion, comparison and reduction are carried out using a computationally tractable approximation to the model evidence (or the log-evidence) called the (negative) free-energy ''F'' (see equation for F below), which properly accounts for dependencies among parameters in the prior and posterior.
 
For a given DCM, say model ''m'', inversion corresponds to approximating the moments of the posterior or conditional distribution given by Bayes rule
 
<math>
p(\theta|y,m)= \frac{ p(y|\theta,m)p(\theta|m)}{p(y|m)}
</math>
 
Inversion of a DCM involves minimizing the free energy, ''F'', in order to maximize the model evidence or marginal likelihood (''c.f.'' “type-II likelihood”; Good 1965). The posterior moments (mean and [[covariance]]) are updated iteratively using Variational Bayes under a fixed-form Laplace (''i.e.'', Gaussian) approximation <math> q(\theta) </math> to the conditional density. This can be regarded as an Expectation-Maximization (EM) algorithm (Dempster ''et al.'', 1977) that employs a local linear approximation of the predicted responses around the current conditional expectation. This [[Bayesian]] method was developed for dynamic system models based on differential equations. In contrast, conventional inversions of state space models typically use maximum likelihood methods and operate in discrete time (''c.f.'' Valdes ''et al.'', 1999). Generalisations of this Variational (Laplace) scheme extend the scope of DCM to cover models based on stochastic differential equations and difference equations (Friston ''et al.'' 2008; Daunizeau ''et al.'' 2009a).
The basic Variational scheme for DCM can be summarized as follows (where ''λ'' is the error variance and ''q'' is the conditional density):
 
:<math> \text{E-step: } q \leftarrow \arg\min_{q} F(q,\lambda,m)</math>
:<math> \text{M-step: } \lambda \leftarrow \arg\min_{\lambda} F(q,\lambda,m)</math>
:<math> </math>
:<math> F(q,\lambda,m)= \Big \langle \ln q(\theta)-\ln p(y|\theta,\lambda)-\ln p(\theta|m) \Big \rangle_q</math>
:<math> \qquad \qquad \ \ =KL \Big(q||p(\theta|y,\lambda)\Big) - \ln \Big(p(y|\lambda,m)\Big)</math>
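
The decomposition of ''F'' into a KL divergence minus the log-evidence can be checked numerically on a toy problem where everything is available in closed form. The sketch below is not a DCM: it assumes a one-parameter model with a zero-mean Gaussian prior on ''θ'' and an observation ''y'' equal to ''θ'' plus Gaussian noise, and all names and values are hypothetical.

```python
import math


def free_energy(mu, s2, y, prior_var, noise_var):
    """F = <ln q - ln p(y|theta) - ln p(theta)>_q for a Gaussian q = N(mu, s2)."""
    exp_ln_q = -0.5 * math.log(2 * math.pi * s2) - 0.5
    exp_ln_lik = (-0.5 * math.log(2 * math.pi * noise_var)
                  - ((y - mu) ** 2 + s2) / (2 * noise_var))
    exp_ln_prior = (-0.5 * math.log(2 * math.pi * prior_var)
                    - (mu ** 2 + s2) / (2 * prior_var))
    return exp_ln_q - exp_ln_lik - exp_ln_prior


def log_evidence(y, prior_var, noise_var):
    """Exact ln p(y): marginally, y ~ N(0, prior_var + noise_var)."""
    total_var = prior_var + noise_var
    return -0.5 * math.log(2 * math.pi * total_var) - y ** 2 / (2 * total_var)


def exact_posterior(y, prior_var, noise_var):
    """Mean and variance of the exact Gaussian posterior over theta."""
    post_var = 1.0 / (1.0 / prior_var + 1.0 / noise_var)
    return post_var * y / noise_var, post_var


y, prior_var, noise_var = 2.0, 1.0, 1.0
mu_post, s2_post = exact_posterior(y, prior_var, noise_var)

# At the exact posterior the KL term vanishes, so F equals -ln p(y)
F_exact = free_energy(mu_post, s2_post, y, prior_var, noise_var)
# Any other q pays a KL penalty, so F is larger
F_off = free_energy(0.0, 1.0, y, prior_var, noise_var)
```

In this toy case the approximate density can match the true posterior exactly, so the bound is tight. In a nonlinear DCM the Gaussian form of ''q'' is itself an approximation and some KL divergence generally remains.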
 
The free-energy is the Kullback–Leibler divergence (denoted by ''KL'') between the approximate and true conditional densities, minus the log-evidence. This means that when the free-energy is minimised, the discrepancy between the true and approximate conditional density is suppressed. At this point the free-energy approximates the negative log-evidence: <math> F \approx -\ln \Big ( p(y|\lambda,m) \Big ) </math> (Friston ''et al.'', 2007; Penny ''et al.'', 2004). Model selection is based on this approximation, where the best model is characterised by the greatest log-evidence (''i.e.'' the smallest free-energy). Pairwise model comparisons can be conveniently described by [http://en.wikipedia.org/wiki/Bayes_factor Bayes factors] (Kass and Raftery, 1995):
 
<math>
BF_{i,j} = \frac {p(y|m_i)}{p(y|m_j)}
</math>
 
Raftery (1995) presents an interpretation of the BF as providing weak (BF < 3), positive (3 ≤ BF < 20), strong (20 ≤ BF < 150) or very strong (BF ≥ 150) evidence for preferring one model over another. Strong evidence in favor of one model thus requires the difference in log-evidence to be three or more (Penny ''et al.'' 2004). Under flat priors on models, this corresponds to a conditional confidence that the winning model is exp(3) ≈ 20 times more likely than the alternative. From the equations above, it can be seen that the Bayes factor is simply the exponential of the difference in log-evidences.
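
As a small worked example, the Bayes factor and its verbal category can be computed directly from two log-evidences (in practice, their free-energy approximations). The function names and the log-evidence values are hypothetical; the thresholds are the ones quoted above.

```python
import math


def bayes_factor(log_evidence_i, log_evidence_j):
    """BF_ij is the exponential of the difference in log-evidences."""
    return math.exp(log_evidence_i - log_evidence_j)


def evidence_category(bf):
    """Verbal label for a Bayes factor, using the thresholds quoted in the text."""
    if bf < 3:
        return "weak"
    if bf < 20:
        return "positive"
    if bf < 150:
        return "strong"
    return "very strong"


# Hypothetical log-evidences for two models fitted to the same data
log_ev_1, log_ev_2 = -1200.0, -1203.0
bf_12 = bayes_factor(log_ev_1, log_ev_2)   # exp(3), roughly 20
label = evidence_category(bf_12)
```

Working with differences of log-evidences, rather than exponentiating each evidence separately, avoids numerical underflow when the log-evidences are large and negative.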
 
The search for the best model precedes (and is often more important than) inference on the parameters of the model selected. Many studies have used BMS to adjudicate among competing DCMs for fMRI (Acs and Greenlee, 2008; Allen ''et al.'', 2008; Grol ''et al.'', 2007; Heim ''et al.'', 2009; Kumar ''et al.'', 2007; Leff ''et al.'', 2008; Smith ''et al.'', 2006; Stephan ''et al.'', 2007c; Summerfield and Koechlin, 2008) and EEG data (Garrido ''et al.'', 2008; Garrido ''et al.'', 2007). This approach, to search for a single best model (amongst those deemed plausible ''a priori'') and then proceed to inference on its parameters, is pursued most often and could be complemented with diagnostic model checking procedures as, for example, suggested by Box (1980). However, alternatives to this single-model approach exist. For example, one can partition model space and make inferences about model families (Stephan ''et al.'' 2009; Penny ''et al.'' 2010). Alternatively, one can use Bayesian model averaging, where the parameter estimates of each model considered are weighted by the posterior probability of the model (Hoeting ''et al.'' 1999; Penny ''et al.'' 2010).
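
The weighting used in Bayesian model averaging can be sketched as follows. The sketch assumes flat priors over models, so that posterior model probabilities follow from the log-evidences alone, and it averages only the posterior means of one parameter. This is a simplification of averaging over the full posterior densities, and all names and numbers are illustrative.

```python
import math


def posterior_model_probabilities(log_evidences):
    """p(m_k | y) under flat model priors: a softmax of the log-evidences."""
    top = max(log_evidences)                    # subtract the max for stability
    weights = [math.exp(le - top) for le in log_evidences]
    total = sum(weights)
    return [w / total for w in weights]


def model_averaged_mean(param_means, model_probs):
    """Average one parameter's posterior mean over models."""
    return sum(p * m for p, m in zip(model_probs, param_means))


# Hypothetical example: model 2 has three times the evidence of model 1
log_evidences = [0.0, math.log(3.0)]
probs = posterior_model_probabilities(log_evidences)   # approximately [0.25, 0.75]

# Hypothetical posterior means of the same connection under each model
averaged_strength = model_averaged_mean([0.2, 0.6], probs)
```

The averaged estimate is pulled towards the better model in proportion to its posterior probability, so no single model has to be declared the winner before making inferences about parameters.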
 
 
== Applications: fMRI ==
 
The use of DCM for fMRI is demonstrated by analysing data acquired in a study of attentional modulation during ''[[visual motion]]'' ''processing'' (Büchel and Friston, 1997). These data have been used previously to validate DCM (Friston ''et al.'', 2003) and are available from http://www.fil.ion.ucl.ac.uk/spm/data. The experimental manipulations were encoded as three exogenous inputs: a ''photic stimulation'' input indicated when dots were presented on a screen, a ''motion'' variable indicated that the dots were moving and the ''attention'' variable indicated that the subject was attending to possible velocity changes. The activity was modelled in three regions: V1, V5 and superior parietal [[cortex]] (SPC).
 
Three different DCMs are specified, each of which embodies different assumptions about how attention modulates connectivity between V1 and V5. Model 1 assumes that attention modulates the forward connection from V1 to V5, model 2 assumes that attention modulates the backward connection from SPC to V5 and model 3 assumes attention modulates both connections. Each model assumes that the effect of motion is to modulate the connection from V1 to V5 and uses the same reciprocal hierarchical intrinsic connectivity. The models were fitted and the Bayes factors provided consistent evidence in favour of the hypothesis embodied in model 1, that attention modulates the forward connection from V1 to V5.
 
 
{|
|[[Image:Fig4A.png|thumb|400px|center|Fig4A|DCM applied to data from a study on attention to visual motion by (Büchel and Friston, 1997). In all models, photic stimulation enters V1 and motion modulates the connection from V1 to V5. All models have reciprocal and hierarchically organised connectivity. They differ in how attention (red) modulates the connectivity to V5; with model 1 assuming modulation of the forward connection (V1 to V5), model 2 assuming modulation of the backward connection (SPC to V5) and model 3 assuming both. The broken lines indicate the modulatory connections, adapted from (Penny ''et al.'', 2004).]]
 
|[[Image:Fig5A.png|thumb|400px|center|Fig5A|Nonlinear DCM for fMRI applied to the attention to motion paradigm. Left panel: Numbers alongside the connections indicate the ''maximum a posteriori'' (MAP) parameter estimates. Right panel: Posterior density of the estimate for the nonlinear modulation parameter for the V1→V5 connection. Given the mean and variance of this posterior density, we can be 99.1% confident that the true parameter value is larger than zero or, in other words, that there is an increase in gain of V5 responses to V1 inputs that are mediated by parietal activity. Adapted from (Stephan ''et al.'', 2008).]]
|}
 
Note that this model does not specify the source of the attentional top-down effect. This becomes possible with nonlinear dynamic causal models (Stephan ''et al.'' 2008). Nonlinear DCM for fMRI enables one to model how activity in one population gates connection strengths among others. <figref>Fig5A.png</figref> shows an application to the previous example where parietal activity, induced by attention to motion, modulates the connection from V1 to V5.
 
== Applications: Evoked responses ==
 
To illustrate DCM for event-related responses (ERPs), data acquired under a mismatch negativity (MMN) paradigm (http://www.fil.ion.ucl.ac.uk/spm/data) are used. In this example, various models over twelve subjects are compared. The results shown are part of a program that considered the MMN and its underlying mechanisms (Garrido ''et al.'', 2007). Three plausible models were specified under an architecture motivated by electrophysiological and neuroimaging MMN studies (Doeller ''et al.'', 2003; Opitz ''et al.'', 2002). Each has five sources, modelled as equivalent current dipoles (ECDs; Kiebel ''et al.'', 2006), over left and right primary auditory cortex (A1), left and right superior temporal gyrus (STG) and right inferior frontal gyrus (IFG). An exogenous (auditory) input enters bilaterally at A1, which are connected to their ipsilateral STG. Right STG is connected to the right IFG. Inter-hemispheric (lateral) connections are placed between left and right STG. All connections are reciprocal (''i.e.'', connected with forward and backward connections or with bilateral connections).
 
Three models were tested, which differed in the connections which could show putative repetition-dependent changes, ''i.e.'', differences between listening to standard or deviant tones. Models F, B and FB allowed changes in forward, backward and both, respectively. All three models were compared against a baseline or null model, which had the same architecture but precluded any coupling changes between standard and deviant trials.
 
 
{|
|[[Image:Fig6A.png|thumb|400px|center| Model specification. Sources are connected with forward (dark grey), backward (grey) or lateral (light grey) connections. A1: primary auditory cortex, STG: superior temporal gyrus, IFG: inferior frontal gyrus. Three different models were tested within the same architecture, allowing for repetition-related changes in forward F, backward B and forward and backward FB connections, respectively. The broken lines indicate the connections that were allowed to change, adapted from (Garrido ''et al.'', 2007).]]
 
|[[Image:Fig7A.png|thumb|400px|center| Bayesian model selection among DCMs for the three models, F, B and FB, expressed relative to a null model in which no connections were allowed to change across conditions. The graphs show the negative free-energy approximation to the log-evidence. ('''Left''') Log-evidence for models F, B, and FB for each subject (relative to the null). The diamond attributed to each subject identifies the best model on the basis of the subject’s highest log-evidence. ('''Right''') Log-evidence at the group level, ''i.e.'', pooled over subjects, for the three models, adapted from (Garrido ''et al.'', 2007).]]
|}
 
Bayesian model selection based on the increase in log-evidence over the null model was performed for all subjects. The log-evidences of the three models, relative to the null model (for each subject), reveal that they are substantially better than the null model in all subjects. In particular, the FB-model was best in seven out of eleven subjects. The sum of the log-evidences over subjects (which is equivalent to the log group Bayes factor, see below) showed that there was very strong evidence in favour of model FB at the group level.
 
== Hierarchical model comparison ==
 
Comparison at the between-subject level has been used extensively in previous group studies using the group Bayes factor (GBF). The GBF is simply the product of Bayes factors over subjects and constitutes a fixed-effects analysis. It has been used to decide between competing DCMs for fMRI (Acs and Greenlee, 2008; Allen ''et al.'', 2008; Grol ''et al.'', 2007; Heim ''et al.'', 2009; Kumar ''et al.'', 2007; Leff ''et al.'', 2008; Smith ''et al.'', 2006; Stephan ''et al.'', 2007c; Summerfield and Koechlin, 2008) and EEG data (Garrido ''et al.'', 2008; Garrido ''et al.'', 2007).
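
Because the GBF is a product of subject-wise Bayes factors, its logarithm is the sum of the subject-wise differences in log-evidence. The sketch below uses hypothetical log-evidences for three subjects and two models; the function name is illustrative.

```python
def log_group_bayes_factor(log_ev_model_i, log_ev_model_j):
    """Log GBF: sum over subjects of the per-subject log Bayes factors."""
    return sum(a - b for a, b in zip(log_ev_model_i, log_ev_model_j))


# Hypothetical per-subject log-evidences (one entry per subject)
log_ev_model_1 = [-100.0, -200.0, -310.0]
log_ev_model_2 = [-101.0, -201.0, -300.0]

log_gbf_12 = log_group_bayes_factor(log_ev_model_1, log_ev_model_2)

# Number of subjects whose own data favour model 1
subjects_favouring_1 = sum(
    1 for a, b in zip(log_ev_model_1, log_ev_model_2) if a > b
)
```

In this made-up example two of the three subjects favour model 1, yet the log GBF is -8 in favour of model 2 because a single subject dominates the sum. This sensitivity to outlying subjects is one reason a fixed-effects summary may not suit heterogeneous groups.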
 
When the functional architecture is unlikely to differ across subjects, the conventional GBF is both sufficient and appropriate. However, subjects may exhibit different models or functional architectures; for example, due to different cognitive strategies or pathology. In this case, a hierarchical random effects procedure is required (Stephan ''et al.'', 2009). This rests on treating the model as a random variable and estimating the parameters of a Dirichlet distribution describing the probabilities of all models considered. These probabilities then define a multinomial distribution over model-space, allowing one to compute how likely it is that a specific model generated the data of a randomly chosen subject (and the exceedance probability that one model is more likely than any other).
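
Given the parameters of the Dirichlet distribution, the two summaries mentioned above can be computed as in the following sketch. It does not perform the variational estimation of the Dirichlet parameters; the values of `alpha` are simply assumed, and the exceedance probabilities are approximated by Monte Carlo sampling using only the Python standard library.

```python
import random


def expected_model_probabilities(alpha):
    """Expected probability of each model: the mean of the Dirichlet."""
    total = sum(alpha)
    return [a / total for a in alpha]


def exceedance_probabilities(alpha, n_samples=20000, seed=0):
    """Monte Carlo estimate of the probability that each model is the most likely."""
    rng = random.Random(seed)
    wins = [0] * len(alpha)
    for _ in range(n_samples):
        # Normalised independent gamma draws form a Dirichlet sample; the
        # normalisation does not change which entry is largest, so skip it.
        draws = [rng.gammavariate(a, 1.0) for a in alpha]
        wins[draws.index(max(draws))] += 1
    return [w / n_samples for w in wins]


# Hypothetical Dirichlet parameters for two models
alpha = [3.0, 9.0]
expected = expected_model_probabilities(alpha)   # [0.25, 0.75]
exceedance = exceedance_probabilities(alpha)
```

The expected probabilities answer how often each model generates a randomly chosen subject's data, whereas the exceedance probabilities express the confidence that one model is more frequent than the others. Here the second model has an expected probability of 0.75 but a much higher exceedance probability.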
 
== DCM developments ==
DCM combines a biophysical model of the hidden (latent) dynamics with a forward model that translates hidden states into predicted measurements, to furnish an explicit generative model of how observed data were caused (Friston, 2009). This means the exact form of the DCM changes with each application, which speaks to the progressive refinement of these models.
 
Since its inception (Friston ''et al.'', 2003), a number of developments have improved and extended DCM: For fMRI, models of precise temporal sampling (Kiebel ''et al.'', 2007), multiple hidden states per region (Marreiros ''et al.'', 2008a), a refined hemodynamic model (Stephan ''et al.'', 2007c) and a nonlinear neuronal model (Stephan ''et al.'', 2008) have been introduced. DCM for EEG/MEG (David ''et al.'', 2006) has also seen rapid developments: DCM with lead-field parameterization (Kiebel ''et al.'', 2006), DCM for induced responses (Chen ''et al.'', 2008), DCM for neural-mass and mean-field models (Marreiros ''et al.'', 2009), DCM for spectral responses (Moran ''et al.'', 2009), stochastic DCMs (Daunizeau ''et al.'', 2009b) and DCM for phase-coupling (Penny ''et al.'', 2009). A review on developments for M/EEG data can be found in (Kiebel ''et al.'', 2008).
 
In relation to model selection, a hierarchical variational Bayesian framework (Stephan ''et al.'', 2009) accounts for random effects at the between-subjects level, ''e.g.'' when dealing with group heterogeneity or outliers. This work was extended by (Penny ''et al.'', 2010) to allow for comparisons between model families of arbitrary size and for Bayesian model averaging within model families.
 
== Recommended reading ==