The Kolmogorov forward equation is used to evolve the state of a system forward in time. Given an initial probability distribution $p_t(x)$ for a system being in state $x$ at time $t$, the forward PDE is integrated to obtain $p_s(x)$ at later times $s>t$. A common case takes the initial value $p_t(x)$ to be a Dirac delta function centered on the known initial state $x$.
The Kolmogorov backward equation is used to estimate the probability of the current system evolving so that its future state at time $s>t$ is given by some fixed probability function $p_s(x)$. That is, the probability distribution in the future is given as a boundary condition, and the backward PDE is integrated backward in time.
A common boundary condition is to ask that the future state is contained in some subset of states $B$, the target set. Writing the set membership function as $1_B$, so that $1_B(x)=1$ if $x\in B$ and zero otherwise, the backward equation expresses the hit probability $p_t(x)$ that in the future, the set membership will be sharp, given by $p_s(x)=1_B(x)/\Vert B\Vert$. Here, $\Vert B\Vert$ is just the size of the set $B$, a normalization so that the total probability at time $s$ integrates to one.
Kolmogorov backward equation
Let $\{X_t\}_{0\leq t\leq T}$ be the solution of the stochastic differential equation

$$dX_t = \mu(t,X_t)\,dt + \sigma(t,X_t)\,dW_t,\quad 0\leq t\leq T,$$

where $W_t$ is a (possibly multi-dimensional) Wiener process (Brownian motion), $\mu$ is the drift coefficient, and $\sigma$ is related to the diffusion coefficient $D$ as $D=\sigma^2/2$.
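To make the stochastic differential equation above concrete, here is a minimal sketch that simulates one sample path with the Euler–Maruyama scheme, a standard discretization that is not part of the original text. The function name `euler_maruyama`, the step count, and the Ornstein–Uhlenbeck-style coefficients are assumptions chosen purely for illustration.

```python
import math
import random


def euler_maruyama(mu, sigma, x0, t0, T, n_steps, rng):
    """Simulate one path of dX = mu(t, X) dt + sigma(t, X) dW on [t0, T].

    mu and sigma are callables (t, x) -> float; rng is a random.Random.
    Returns the list of n_steps + 1 path values, starting with x0.
    """
    dt = (T - t0) / n_steps
    t, x = t0, x0
    path = [x]
    for _ in range(n_steps):
        # Brownian increment over one step: dW ~ N(0, dt)
        dW = rng.gauss(0.0, math.sqrt(dt))
        x = x + mu(t, x) * dt + sigma(t, x) * dW
        t += dt
        path.append(x)
    return path


# Hypothetical example: drift -x and constant noise 0.5 (assumed values)
rng = random.Random(0)
path = euler_maruyama(lambda t, x: -x, lambda t, x: 0.5, 1.0, 0.0, 1.0, 1000, rng)
```

Setting `sigma` to zero reduces the scheme to the explicit Euler method for the ordinary differential equation $dx/dt = \mu(t,x)$, which gives a quick sanity check.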
Define the transition density (or fundamental solution) $p(t,x;\,T,y)$ by

$$p(t,x;\,T,y) = \frac{\mathbb{P}[\,X_T\in dy \mid X_t=x\,]}{dy},\quad t<T.$$
Then the usual Kolmogorov backward equation for $p$ is

$$\frac{\partial p}{\partial t}(t,x;\,T,y) + A\,p(t,x;\,T,y) = 0,\quad \lim_{t\to T} p(t,x;\,T,y) = \delta_y(x),$$
where $\delta_y(x)$ is the Dirac delta in $x$ centered at $y$, and $A$ is the infinitesimal generator of the diffusion:

$$A\,f(x) = \sum_i \mu_i(x)\,\frac{\partial f}{\partial x_i}(x) + \frac{1}{2}\sum_{i,j}\bigl[\sigma(x)\,\sigma(x)^{\mathsf T}\bigr]_{ij}\,\frac{\partial^2 f}{\partial x_i\,\partial x_j}(x).$$
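As a small worked example of the generator above, consider the one-dimensional case, where $\sigma\sigma^{\mathsf T}$ reduces to $\sigma^2 = 2D$. The constant coefficients and the test function $f(x)=x^2$ are assumptions chosen only for illustration:

$$A\,f(x) = \mu\,f'(x) + \tfrac{1}{2}\sigma^2 f''(x) = \mu\,f'(x) + D\,f''(x), \qquad f(x)=x^2 \;\Longrightarrow\; A\,f(x) = 2\mu x + \sigma^2.$$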
The backward Kolmogorov equation can be used to derive the Feynman–Kac formula. Given a function $F$ that satisfies the boundary value problem

$$\frac{\partial F}{\partial t}(t,x) + \mu(t,x)\,\frac{\partial F}{\partial x}(t,x) + \frac{1}{2}\,\sigma^2(t,x)\,\frac{\partial^2 F}{\partial x^2}(t,x) = 0,\quad 0\leq t\leq T,\quad F(T,x) = \Phi(x)$$
and given $\{X_t\}_{0\leq t\leq T}$ that, just as before, is a solution of

$$dX_t = \mu(t,X_t)\,dt + \sigma(t,X_t)\,dW_t,\quad 0\leq t\leq T,$$

then, provided the integrability condition

$$\int_0^T \mathbb{E}\Bigl[\bigl(\sigma(t,X_t)\,\frac{\partial F}{\partial x}(t,X_t)\bigr)^2\Bigr]\,dt < \infty$$

holds, the Feynman–Kac formula is obtained:

$$F(t,x) = \mathbb{E}\bigl[\Phi(X_T) \,\big|\, X_t=x\bigr].$$
Proof. Apply Itô’s formula to $F(s,X_s)$ for $t\leq s\leq T$:

$$F(T,X_T) = F(t,X_t) + \int_t^T \Bigl\{\frac{\partial F}{\partial s}(s,X_s) + \mu(s,X_s)\,\frac{\partial F}{\partial x}(s,X_s) + \tfrac{1}{2}\,\sigma^2(s,X_s)\,\frac{\partial^2 F}{\partial x^2}(s,X_s)\Bigr\}\,ds + \int_t^T \sigma(s,X_s)\,\frac{\partial F}{\partial x}(s,X_s)\,dW_s.$$
Because $F$ solves the PDE, the first integral is zero. Taking the conditional expectation given $X_t=x$ and using the martingale property of the Itô integral (which the integrability condition guarantees) gives

$$\mathbb{E}\bigl[F(T,X_T) \,\big|\, X_t=x\bigr] = F(t,x).$$
Substitute $F(T,X_T)=\Phi(X_T)$ to conclude

$$F(t,x) = \mathbb{E}\bigl[\Phi(X_T) \,\big|\, X_t=x\bigr].$$
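The Feynman–Kac formula can be checked numerically in a case where both sides are known. The sketch below assumes zero drift, a constant $\sigma$, and $\Phi(x)=x^2$, none of which come from the original text. Under those assumptions $F(t,x)=x^2+\sigma^2(T-t)$ solves the boundary value problem, and $X_T$ given $X_t=x$ is exactly Gaussian. The function names `feynman_kac_mc` and `exact_F` are hypothetical.

```python
import math
import random


def feynman_kac_mc(phi, x, t, T, sigma, n_samples, rng):
    """Monte Carlo estimate of E[phi(X_T) | X_t = x] for dX = sigma dW.

    Assumes zero drift and constant sigma, so X_T = x + sigma*sqrt(T-t)*Z
    with Z standard normal can be sampled exactly (no time stepping).
    """
    scale = sigma * math.sqrt(T - t)
    total = 0.0
    for _ in range(n_samples):
        total += phi(x + scale * rng.gauss(0.0, 1.0))
    return total / n_samples


def exact_F(x, t, T, sigma):
    """Closed-form solution of F_t + 0.5*sigma^2*F_xx = 0 with F(T, x) = x^2."""
    return x * x + sigma * sigma * (T - t)


rng = random.Random(42)
estimate = feynman_kac_mc(lambda y: y * y, 0.5, 0.0, 1.0, 0.8, 200_000, rng)
exact = exact_F(0.5, 0.0, 1.0, 0.8)
```

With this many samples, `estimate` and `exact` should agree to within a few multiples of the Monte Carlo standard error.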
Derivation of the backward Kolmogorov equation
The Feynman–Kac representation can be used to find the PDE solved by the transition densities of solutions to SDEs. Suppose

$$dX_t = \mu(t,X_t)\,dt + \sigma(t,X_t)\,dW_t.$$
For any set $B$, define

$$p_B(t,x;\,T) \triangleq \mathbb{P}\bigl[X_T\in B \mid X_t=x\bigr] = \mathbb{E}\bigl[\mathbf{1}_B(X_T) \,\big|\, X_t=x\bigr].$$
By Feynman–Kac (under integrability conditions), taking $\Phi=\mathbf{1}_B$ gives

$$\frac{\partial p_B}{\partial t}(t,x;\,T) + A\,p_B(t,x;\,T) = 0,\quad p_B(T,x;\,T) = \mathbf{1}_B(x),$$
where

$$A\,f(t,x) = \mu(t,x)\,\frac{\partial f}{\partial x}(t,x) + \tfrac{1}{2}\,\sigma^2(t,x)\,\frac{\partial^2 f}{\partial x^2}(t,x).$$
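The backward equation for $p_B$ can be sanity-checked in a case with a closed form. The sketch below assumes constant $\mu$ and $\sigma$ and an interval $B=[a,b]$, choices made only for illustration. Then $X_T$ given $X_t=x$ is normal with mean $x+\mu(T-t)$ and variance $\sigma^2(T-t)$, so $p_B$ is a difference of normal distribution functions. The helper names and the finite-difference step `h` are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_B(t, x, T, a, b, mu, sigma):
    """P[X_T in [a, b] | X_t = x] for dX = mu dt + sigma dW (constant coefficients)."""
    tau = T - t
    s = sigma * math.sqrt(tau)
    m = x + mu * tau
    return norm_cdf((b - m) / s) - norm_cdf((a - m) / s)


def backward_residual(t, x, T, a, b, mu, sigma, h=1e-3):
    """Central-difference estimate of dp/dt + mu*dp/dx + 0.5*sigma^2*d2p/dx2."""
    f = lambda tt, xx: p_B(tt, xx, T, a, b, mu, sigma)
    d_t = (f(t + h, x) - f(t - h, x)) / (2 * h)
    d_x = (f(t, x + h) - f(t, x - h)) / (2 * h)
    d_xx = (f(t, x + h) - 2 * f(t, x) + f(t, x - h)) / (h * h)
    return d_t + mu * d_x + 0.5 * sigma ** 2 * d_xx


# Should be close to zero, up to finite-difference error
residual = backward_residual(0.2, 0.1, 1.0, 0.0, 1.0, 0.3, 0.7)
```

As $t \to T$ the same function approaches the indicator $\mathbf{1}_B(x)$, matching the terminal condition.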
Assuming Lebesgue measure as the reference, write $|B|$ for the measure of $B$. The transition density $p(t,x;\,T,y)$ is

$$p(t,x;\,T,y) \triangleq \lim_{B\to y}\frac{1}{|B|}\,\mathbb{P}\bigl[X_T\in B \mid X_t=x\bigr].$$
Then

$$\frac{\partial p}{\partial t}(t,x;\,T,y) + A\,p(t,x;\,T,y) = 0,\quad p(t,x;\,T,y)\to\delta_y(x)\quad\text{as } t\to T.$$
Derivation of the forward Kolmogorov equation
The Kolmogorov forward equation is

$$\frac{\partial}{\partial T}\,p(t,x;\,T,y) = A^{*}\bigl[p(t,x;\,T,y)\bigr],\quad \lim_{T\to t} p(t,x;\,T,y) = \delta_y(x).$$
For $T>r>t$, the Markov property implies

$$p(t,x;\,T,y) = \int_{-\infty}^{\infty} p(t,x;\,r,z)\,p(r,z;\,T,y)\,dz.$$
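The identity above, commonly called the Chapman–Kolmogorov equation, can be verified numerically for a process whose transition density is known. The sketch below assumes Brownian motion with constant drift and volatility, whose transition density is Gaussian, and uses the trapezoidal rule for the $z$ integral. The parameter values, integration limits, and function names are illustrative assumptions.

```python
import math


def transition_density(t, x, T, y, mu=0.3, sigma=0.7):
    """Gaussian transition density of dX = mu dt + sigma dW (constant coefficients)."""
    var = sigma ** 2 * (T - t)
    return math.exp(-(y - x - mu * (T - t)) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def chapman_kolmogorov_rhs(t, x, r, T, y, z_min=-10.0, z_max=10.0, n=4000):
    """Trapezoidal approximation of the integral over the intermediate state z."""
    dz = (z_max - z_min) / n
    total = 0.0
    for k in range(n + 1):
        z = z_min + k * dz
        weight = 0.5 if k in (0, n) else 1.0  # half weight at the two endpoints
        total += weight * transition_density(t, x, r, z) * transition_density(r, z, T, y)
    return total * dz


lhs = transition_density(0.0, 0.1, 1.0, 0.4)
rhs = chapman_kolmogorov_rhs(0.0, 0.1, 0.35, 1.0, 0.4)
```

The two values should agree for any intermediate time `r` strictly between `t` and `T`, which is exactly why the derivative with respect to $r$ vanishes in the next step.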
Differentiate both sides with respect to $r$. The left-hand side does not depend on $r$, so

$$0 = \int_{-\infty}^{\infty}\Bigl[\frac{\partial}{\partial r}\,p(t,x;\,r,z)\cdot p(r,z;\,T,y) + p(t,x;\,r,z)\cdot\frac{\partial}{\partial r}\,p(r,z;\,T,y)\Bigr]\,dz.$$
From the backward Kolmogorov equation:

$$\frac{\partial}{\partial r}\,p(r,z;\,T,y) = -\,A\,p(r,z;\,T,y).$$
Substitute into the integral:

$$0 = \int_{-\infty}^{\infty}\Bigl[\frac{\partial}{\partial r}\,p(t,x;\,r,z)\cdot p(r,z;\,T,y) - p(t,x;\,r,z)\cdot A\,p(r,z;\,T,y)\Bigr]\,dz.$$
By definition of the adjoint operator $A^{*}$:

$$\int_{-\infty}^{\infty}\Bigl[\frac{\partial}{\partial r}\,p(t,x;\,r,z) - A^{*}\,p(t,x;\,r,z)\Bigr]\,p(r,z;\,T,y)\,dz = 0.$$
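The adjoint step above amounts to integration by parts in $z$. As a sketch in one dimension, write $g(z)=p(t,x;\,r,z)$ and $f(z)=p(r,z;\,T,y)$, suppress the time argument of the coefficients, and assume that $g$, $f$ and their derivatives decay fast enough for all boundary terms at $\pm\infty$ to vanish (an assumption not stated in the text):

$$\int_{-\infty}^{\infty} g\,(A f)\,dz = \int_{-\infty}^{\infty} g\,\Bigl[\mu\,\frac{\partial f}{\partial z} + \tfrac{1}{2}\sigma^2\,\frac{\partial^2 f}{\partial z^2}\Bigr]\,dz = \int_{-\infty}^{\infty} f\,\Bigl[-\frac{\partial}{\partial z}(\mu\,g) + \tfrac{1}{2}\,\frac{\partial^2}{\partial z^2}(\sigma^2 g)\Bigr]\,dz = \int_{-\infty}^{\infty} f\,(A^{*}g)\,dz.$$

The drift term is integrated by parts once, which produces the minus sign, and the diffusion term twice, which leaves its sign unchanged.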
Since $p(r,z;\,T,y)$ can be arbitrary, the bracket must vanish:

$$\frac{\partial}{\partial r}\,p(t,x;\,r,z) = A^{*}\bigl[p(t,x;\,r,z)\bigr].$$
Relabel $r\to T$ and $z\to y$, yielding the forward Kolmogorov equation:

$$\frac{\partial}{\partial T}\,p(t,x;\,T,y) = A^{*}\bigl[p(t,x;\,T,y)\bigr],\quad \lim_{T\to t} p(t,x;\,T,y) = \delta_y(x).$$
Finally, the adjoint operator is given explicitly by

$$A^{*}\,g(x) = -\sum_i \frac{\partial}{\partial x_i}\bigl[\mu_i(x)\,g(x)\bigr] + \frac{1}{2}\sum_{i,j}\frac{\partial^2}{\partial x_i\,\partial x_j}\Bigl[\bigl(\sigma(x)\,\sigma(x)^{\mathsf T}\bigr)_{ij}\,g(x)\Bigr].$$
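In one dimension with constant coefficients, the adjoint reduces to $A^{*}g = -\mu\,g' + \tfrac{1}{2}\sigma^2 g''$, and the forward equation can be checked against the Gaussian transition density of Brownian motion with drift. The sketch below makes exactly those assumptions, which are not part of the original text, and estimates the residual $\partial_T p - A^{*}p$ by central finite differences. The function names, step size `h`, and parameter values are hypothetical.

```python
import math


def transition_density(t, x, T, y, mu, sigma):
    """Gaussian transition density of dX = mu dt + sigma dW (constant coefficients)."""
    var = sigma ** 2 * (T - t)
    return math.exp(-(y - x - mu * (T - t)) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def forward_residual(t, x, T, y, mu, sigma, h=1e-3):
    """Central-difference estimate of dp/dT - A* p in the forward variables (T, y)."""
    p = lambda TT, yy: transition_density(t, x, TT, yy, mu, sigma)
    d_T = (p(T + h, y) - p(T - h, y)) / (2 * h)
    d_y = (p(T, y + h) - p(T, y - h)) / (2 * h)
    d_yy = (p(T, y + h) - 2 * p(T, y) + p(T, y - h)) / (h * h)
    adjoint = -mu * d_y + 0.5 * sigma ** 2 * d_yy  # A* p with constant mu, sigma
    return d_T - adjoint


# Should be close to zero, up to finite-difference error
residual = forward_residual(0.0, 0.1, 1.0, 0.4, 0.3, 0.7)
```

Note that the derivatives are taken in the final time $T$ and final state $y$, whereas the backward equation differentiates in the initial time $t$ and initial state $x$.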
Etheridge, A. (2002). A Course in Financial Calculus. Cambridge University Press.
Kolmogorov, Andrei (1931). "Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung" (On Analytical Methods in the Theory of Probability).