
Chapter 5: Multivariate Itô Integral and Stochastic Differential Equations

In previous chapters, we established the theory of the Itô integral driven by one-dimensional Brownian motion. In practice, however, systems in financial mathematics and statistical physics are often driven by multiple independent noise sources. In this chapter, we extend the Itô integral to higher dimensions, establish the multivariate Itô formula, and formally introduce the core tool for describing continuous-time stochastic dynamical systems: Stochastic Differential Equations (SDEs).

1. Multivariate Itô Integral and Isometry

We first define multi-dimensional Brownian motion and its corresponding filtration.

Definition: Multivariate Brownian Motion and Adapted Matrix Process

Let \(W(t) = (W^1(t), W^2(t), \dots, W^m(t))^T\) be an \(m\)-dimensional standard Brownian motion, where each component \(W^j(t)\) is an independent one-dimensional standard Brownian motion.

Define the filtration generated by it as \(\mathcal{F}(t) = \sigma\{W(s) \mid 0 \le s \le t\}\). For any \(s > t\), the future increment \(W(s) - W(t)\) is independent of \(\mathcal{F}(t)\).

Let \(G(t, \omega) = (G^{ij}(t, \omega))_{n \times m}\) be an \(n \times m\) stochastic matrix process. If each component \(G^{ij}(t)\) of \(G(t)\) is \(\mathcal{F}(t)\)-adapted and satisfies the square-integrable condition:

\[ E\left[ \int_0^T \sum_{i=1}^n \sum_{j=1}^m (G^{ij}(t))^2 dt \right] < \infty \]

Then the matrix process is denoted as \(G \in L_{n \times m}^2(0, T)\).

For such matrix processes, the multivariate Itô integral is naturally defined in vector form.

Definition and Theorem: Multivariate Itô Integral

For \(G \in L_{n \times m}^2(0, T)\), we define the multivariate Itô integral \(\int_0^T G dW\) as an \(n\)-dimensional column vector, where its \(i\)-th component is defined as:

\[ \left( \int_0^T G dW \right)^i = \sum_{j=1}^m \int_0^T G^{ij} dW^j \]

Under this definition, the multivariate Itô integral retains the zero-expectation (martingale) and isometry properties of the one-dimensional case:

(1) Zero Expectation: \(E\left[ \int_0^T G dW \right] = 0\) (i.e., the expectation of each component is 0).

(2) Multivariate Itô Isometry: Define the norm of the matrix as \(|G| = \sqrt{\sum_{i,j} (G^{ij})^2}\), then we have:

\[ E\left[ \left| \int_0^T G dW \right|^2 \right] = E\left[ \int_0^T |G(t)|^2 dt \right] \]
Rigorous Proof of Multivariate Itô Isometry

By the definition of the vector norm, the squared magnitude of the integral can be written as the sum of the squares of its components:

\[ E\left[ \left| \int_0^T G dW \right|^2 \right] = E\left[ \sum_{i=1}^n \left( \sum_{j=1}^m \int_0^T G^{ij} dW^j \right)^2 \right] \]

Expand the internal squared term into a double summation:

\[ = \sum_{i=1}^n E\left[ \sum_{j=1}^m \int_0^T G^{ij} dW^j \sum_{k=1}^m \int_0^T G^{ik} dW^k \right] \]

Since summation and expectation are linear operators, we swap their order and focus on the cross terms:

\[ = \sum_{i=1}^n \sum_{j=1}^m \sum_{k=1}^m E\left[ \int_0^T G^{ij} dW^j \int_0^T G^{ik} dW^k \right] \]

Key Point: Independence between Brownian motion components. When \(j \neq k\), \(W^j\) and \(W^k\) are independent Brownian motions. Using the polarization identity of the one-dimensional stochastic integral and the independent increments property, the cross expectation of this part is \(0\).

Therefore, non-zero terms only exist on the diagonal where \(j = k\):

\[ = \sum_{i=1}^n \sum_{j=1}^m E\left[ \left( \int_0^T G^{ij} dW^j \right)^2 \right] \]

Apply the one-dimensional Itô isometry \(E[(\int G^{ij} dW^j)^2] = E[\int (G^{ij})^2 dt]\) to each term:

\[ = \sum_{i=1}^n \sum_{j=1}^m E\left[ \int_0^T (G^{ij})^2 dt \right] = E\left[ \int_0^T \sum_{i,j} (G^{ij})^2 dt \right] = E\left[ \int_0^T |G(t)|^2 dt \right] \]

The proof is complete. \(\square\)
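
As a numerical sanity check, the isometry can be verified by Monte Carlo simulation. The sketch below approximates the vector integral by a left-endpoint Riemann sum; the grid sizes, the random seed, and the particular \(2 \times 3\) integrand are arbitrary illustrative choices.

```python
import numpy as np

# Monte Carlo check of the multivariate Ito isometry for a simple
# deterministic integrand G(t) (an illustrative sketch, not the general case).
rng = np.random.default_rng(0)
n, m, T, steps, paths = 2, 3, 1.0, 200, 10000
dt = T / steps
t = np.linspace(0.0, T, steps, endpoint=False)   # left endpoints of the grid

def G(s):
    """A fixed 2x3 matrix-valued integrand, chosen arbitrarily."""
    return np.array([[1.0, s, 0.5],
                     [np.sin(s), 1.0, s**2]])

Gt = np.stack([G(s) for s in t])                 # shape (steps, n, m)
dW = rng.normal(0.0, np.sqrt(dt), size=(paths, steps, m))

# Vector Ito integral: I^i = sum_j int G^{ij} dW^j, as a Riemann sum
I = np.einsum('skj,psj->pk', Gt, dW)             # shape (paths, n)

lhs = np.mean(np.sum(I**2, axis=1))              # E |int G dW|^2
rhs = np.sum(Gt**2) * dt                         # int_0^T |G(t)|^2 dt
print(lhs, rhs)  # the two numbers should agree to within a few percent
```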


2. Multivariate Itô's Formula and Multiplication Table

Before proceeding to solve stochastic differential equations, we must master the "chain rule" in the multi-dimensional case.

Definition: Multivariate Itô Process

The differential form of an \(n\)-dimensional Itô process \(X(t) = (X^1(t), \dots, X^n(t))^T\) is defined as:

\[ dX(t) = F(t) dt + G(t) dW(t) \]

Written in component form, it is:

\[ dX^i(t) = F^i(t) dt + \sum_{j=1}^m G^{ij}(t) dW^j(t), \quad i = 1, \dots, n \]

where the drift term \(F \in L_n^1(0, T)\) and the diffusion term \(G \in L_{n \times m}^2(0, T)\).

The core of handling multivariate stochastic differentials lies in the extended version of the Itô Multiplication Table. Since different Brownian motion components are independent, their "quadratic cross-variation" is 0.

Theorem: Multivariate Itô Multiplication Rule

For the time differential \(dt\) and Brownian motion differentials \(dW^j, dW^k\) of different dimensions, we have the following multiplication rules:

  • \((dt)^2 = 0\)

  • \(dt \cdot dW^j = 0\)

  • \(dW^j \cdot dW^k = \delta_{jk} dt = \begin{cases} dt, & j = k \\ 0, & j \neq k \end{cases}\)
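
The table can be checked numerically: over a fine partition, the accumulated increment products \(\sum_i \Delta W^j_i \Delta W^k_i\) approach \(\delta_{jk}\, t\). A minimal sketch (step count and seed are arbitrary):

```python
import numpy as np

# Numerical sketch of the multiplication table: on a fine grid, the sum of
# increment products sum_i dW^j_i dW^k_i approaches delta_{jk} * t.
rng = np.random.default_rng(1)
T, steps = 1.0, 100000
dt = T / steps
dW = rng.normal(0.0, np.sqrt(dt), size=(2, steps))  # two independent components

qv_11 = np.sum(dW[0] * dW[0])   # "dW^1 dW^1" accumulated, approaches t = 1
qv_12 = np.sum(dW[0] * dW[1])   # "dW^1 dW^2" accumulated, approaches 0
print(qv_11, qv_12)
```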

Using the above rules, we can directly derive the Product Rule for two one-dimensional Itô processes \(X_1, X_2\):

\[ d(X_1 X_2) = X_1 dX_2 + X_2 dX_1 + dX_1 dX_2 \]
Derivation of the Cross Term \(dX_1 dX_2\) in the Product Rule

Let the two processes be \(dX_1 = F_1 dt + \sum_j G_1^j dW^j\) and \(dX_2 = F_2 dt + \sum_k G_2^k dW^k\). Expand their product:

\[ dX_1 dX_2 = \left( F_1 dt + \sum_j G_1^j dW^j \right) \left( F_2 dt + \sum_k G_2^k dW^k \right) \]

According to the multivariate multiplication table, all quadratic terms containing \(dt\) (such as \((dt)^2, dt \cdot dW\)) are treated as higher-order infinitesimals and are directly eliminated:

\[ = \sum_{j=1}^m \sum_{k=1}^m G_1^j G_2^k dW^j dW^k \]

Since \(dW^j dW^k = \delta_{jk} dt\), which is non-zero only when \(j=k\), the summation collapses into a single sum:

\[ = \sum_{j=1}^m G_1^j G_2^j dt \]

Writing this as a vector inner product yields \((G_1^T G_2)\, dt\). This is the correction term contributed by the quadratic covariation of the two processes. \(\square\)
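
The vanishing of the correction term for independent driving noises can be illustrated directly: taking \(X_1 = W^1\) and \(X_2 = W^2\), we have \(G_1^T G_2 = 0\), so the Itô sums of \(W^1 dW^2 + W^2 dW^1\) should reproduce \(W^1(t) W^2(t)\) with no \(dt\) term. A sketch (grid and seed are arbitrary):

```python
import numpy as np

# Product rule sketch for X1 = W^1, X2 = W^2 (independent components):
# the cross term G_1^T G_2 vanishes, so the left-endpoint Ito sums of
# W^1 dW^2 + W^2 dW^1 should match W^1(T) W^2(T) with no dt correction.
rng = np.random.default_rng(2)
T, steps = 1.0, 200000
dt = T / steps
dW = rng.normal(0.0, np.sqrt(dt), size=(2, steps))
W = np.cumsum(dW, axis=1)
W_left = np.hstack([np.zeros((2, 1)), W[:, :-1]])   # left-endpoint values (Ito)

lhs = W[0, -1] * W[1, -1]
rhs = np.sum(W_left[0] * dW[1] + W_left[1] * dW[0])
print(lhs, rhs)   # agree up to a small discretization error
```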

Based on the multivariate multiplication table, we can generalize the Taylor expansion to obtain the most important cornerstone of modern stochastic analysis.

Theorem: Multivariate Itô's Formula

Let \(X(t)\) be an \(n\)-dimensional Itô process, and \(U(t, x_1, \dots, x_n)\) be a multivariate scalar function from \(\mathbb{R}^+ \times \mathbb{R}^n \to \mathbb{R}\) with \(C^{1,2}\) continuous partial derivatives. Then the composite stochastic process \(U(t, X(t))\) is still an Itô process, and its differential is:

\[ dU(t, X(t)) = \frac{\partial U}{\partial t} dt + \sum_{i=1}^n \frac{\partial U}{\partial x^i} dX^i + \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n \frac{\partial^2 U}{\partial x^i \partial x^j} dX^i dX^j \]

If we expand \(dX^i\) and substitute \(dW^j dW^k = \delta_{jk} dt\), the above formula can be written in a more compact matrix trace form:

\[ dU = \left( U_t + \nabla U^T F + \frac{1}{2} \text{Tr}(G^T D_x^2 U G) \right) dt + (\nabla U^T G) dW \]

where \(\nabla U\) is the spatial gradient vector, and \(D_x^2 U\) is the Hessian matrix.
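
As a concrete instance of the trace form, take \(U(x) = |x|^2\) with \(X = W\) (so \(F = 0\), \(G = I_n\)); then \(\nabla U = 2x\), \(D_x^2 U = 2I\), and the formula gives \(d|W|^2 = n\,dt + 2W \cdot dW\). The sketch below checks this identity pathwise on a fine grid (dimension, grid, and seed are arbitrary choices):

```python
import numpy as np

# Sketch: the trace form applied to U(x) = |x|^2 with X = W gives
#   d|W|^2 = n dt + 2 W . dW,
# so |W(T)|^2 - n T should match the Ito sum 2 * sum(W . dW) path by path.
rng = np.random.default_rng(3)
n, T, steps = 3, 1.0, 200000
dt = T / steps
dW = rng.normal(0.0, np.sqrt(dt), size=(n, steps))
W = np.cumsum(dW, axis=1)
W_left = np.hstack([np.zeros((n, 1)), W[:, :-1]])   # left endpoints (Ito)

lhs = np.sum(W[:, -1]**2) - n * T
rhs = 2.0 * np.sum(W_left * dW)
print(lhs, rhs)
```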


3. Practical Solving of Linear Stochastic Differential Equations (LSDE)

We now formally consider stochastic differential equations of the form \(dX(t) = b(t, X(t)) dt + \sigma(t, X(t)) dW(t)\). Since non-linear SDEs rarely admit explicit solutions, this section focuses on a class of equations that dominates mathematical finance: Linear Stochastic Differential Equations (Linear SDEs).

3.1 Exponential Martingale and Hermite Polynomials

Consider the simplest homogeneous linear equation, where its drift term is 0:

\[ \begin{cases} dY(t) = \lambda Y(t) dW(t) \\ Y(0) = 1 \end{cases} \]

That is, the random fluctuation of the system is proportional to its current state.

Solving Method (Reverse Construction using Itô's Formula)

We guess a solution of exponential form \(Y(t) = e^{f(t, W(t))}\). To cancel the Itô correction term arising from \(\frac{1}{2} (dW)^2 = \frac{1}{2}dt\), we build a "compensation term" into the exponent in advance.

Define the function \(U(x, t) = e^{\lambda x - \frac{\lambda^2}{2}t}\), and replace \(x\) with \(W(t)\). Apply the one-dimensional Itô's formula:

\[ dU(W(t), t) = \frac{\partial U}{\partial t} dt + \frac{\partial U}{\partial x} dW + \frac{1}{2} \frac{\partial^2 U}{\partial x^2} (dW)^2 \]

Substitute the partial derivatives \(U_t = -\frac{\lambda^2}{2} U\), \(U_x = \lambda U\), \(U_{xx} = \lambda^2 U\):

\[ dY(t) = -\frac{\lambda^2}{2} Y(t) dt + \lambda Y(t) dW(t) + \frac{1}{2} \lambda^2 Y(t) dt = \lambda Y(t) dW(t) \]

This perfectly matches the target equation. Therefore, its solution is the Exponential Martingale:

\[ Y(t) = e^{\lambda W(t) - \frac{\lambda^2}{2}t} \]
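
The closed form can be checked against a direct Euler-Maruyama discretization of \(dY = \lambda Y\, dW\); the sketch below (step count, path count, \(\lambda\), and seed are arbitrary) also confirms the martingale property \(E[Y(t)] = 1\) in sample mean.

```python
import numpy as np

# Sketch: Euler-Maruyama for dY = lam * Y dW should track the closed form
# Y(t) = exp(lam*W(t) - lam^2 t / 2) on a fine grid; E[Y(T)] stays near 1.
rng = np.random.default_rng(4)
lam, T, steps, paths = 0.8, 1.0, 1000, 5000
dt = T / steps
dW = rng.normal(0.0, np.sqrt(dt), size=(paths, steps))
W = np.cumsum(dW, axis=1)

Y = np.ones(paths)
for i in range(steps):
    Y = Y + lam * Y * dW[:, i]           # Euler-Maruyama step

exact = np.exp(lam * W[:, -1] - 0.5 * lam**2 * T)
print(np.mean(np.abs(Y - exact)), np.mean(exact))
```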

This solution has profound implications in probability theory; its Taylor expansion is directly linked to orthogonal polynomials.

Extension: Recurrence Relation between Hermite Polynomials and Stochastic Integrals

Recall the generating function formula for Hermite polynomials in probability theory:

\[ e^{\lambda x - \frac{\lambda^2}{2}t} = \sum_{n=0}^{\infty} \frac{\lambda^n}{n!} H_n(x, t) = \sum_{n=0}^{\infty} \lambda^n h_n(x, t) \]

We compare the integral form of the solution \(Y(t)\) with its series expansion form. Given \(Y(t) = 1 + \lambda \int_0^t Y(s) dW(s)\), substitute the expansion:

\[ \sum_{n=0}^{\infty} \lambda^n h_n(W(t), t) = 1 + \lambda \int_0^t \sum_{n=0}^{\infty} \lambda^n h_n(W(s), s) dW(s) \]

By comparing the coefficients of the same power \(\lambda^{n+1}\) on both sides of the equation, we obtain an extremely elegant integral recurrence formula regarding Brownian motion:

\[ h_{n+1}(W(t), t) = \int_0^t h_n(W(s), s) dW(s) \]

This formula shows how repeated stochastic integration against Brownian motion naturally generates higher-order Hermite polynomials: starting from the constant \(h_0 = 1\), we obtain \(h_1 = W(t)\) and then \(h_2 = \frac{1}{2}(W(t)^2 - t)\), in perfect agreement with the Riemann-sum limit computed in the previous chapter.
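
The recurrence can be verified numerically at the first nontrivial order: the Itô sum of \(h_1 = W\) against \(dW\) should reproduce \(h_2(W(t), t) = \frac{1}{2}(W(t)^2 - t)\) path by path. A sketch (grid and seed are arbitrary):

```python
import numpy as np

# Sketch of h_{n+1}(W(t), t) = int_0^t h_n(W(s), s) dW(s) at n = 1:
# the left-endpoint Ito sum of W dW should reproduce (W(T)^2 - T) / 2.
rng = np.random.default_rng(5)
T, steps = 1.0, 200000
dt = T / steps
dW = rng.normal(0.0, np.sqrt(dt), size=steps)
W = np.cumsum(dW)
W_left = np.concatenate([[0.0], W[:-1]])     # left-endpoint values (Ito)

h2_integral = np.sum(W_left * dW)            # int_0^T h_1 dW
h2_closed = 0.5 * (W[-1]**2 - T)             # h_2(W(T), T)
print(h2_integral, h2_closed)
```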

3.2 Black-Scholes Model

Adding a drift term proportional to the state to the above equation yields Geometric Brownian Motion (GBM), the most famous asset-pricing model in finance.

\[ \frac{dS(t)}{S(t)} = \mu dt + \sigma dW(t) \]

where \(\mu\) is called the expected rate of return (drift term), and \(\sigma\) is called the asset volatility.

Exact Solution of Geometric Brownian Motion

We rewrite the equation as \(dS(t) = \mu S(t) dt + \sigma S(t) dW(t)\). Inspired by the previous method, we consider a logarithmic transformation. Let \(U(x) = \ln x\). Differentiate \(\ln S(t)\) according to Itô's formula:

\[ d(\ln S(t)) = U'(S(t)) dS(t) + \frac{1}{2} U''(S(t)) (dS(t))^2 \]

Substitute the derivatives \(U'(x) = 1/x\) and \(U''(x) = -1/x^2\), along with \((dS)^2 = \sigma^2 S^2 dt\):

\[ d(\ln S(t)) = \frac{1}{S(t)} (\mu S(t) dt + \sigma S(t) dW(t)) - \frac{1}{2} \frac{1}{S(t)^2} (\sigma^2 S(t)^2 dt) \]

Simplifying yields:

\[ d(\ln S(t)) = \left(\mu - \frac{\sigma^2}{2}\right) dt + \sigma dW(t) \]

Now the right side no longer depends on \(S(t)\) and becomes an ordinary integral! Integrate both sides of the equation from \(0\) to \(t\):

\[ \ln S(t) - \ln S(0) = \left(\mu - \frac{\sigma^2}{2}\right) t + \sigma W(t) \]

Take the exponential of both sides to get the ultimate solution:

\[ S(t) = S_0 e^{\left(\mu - \frac{\sigma^2}{2}\right)t + \sigma W(t)} \]

This result tells us that although the expected price \(E[S(t)] = S_0 e^{\mu t}\) grows at the rate \(\mu\), the almost-sure growth rate of the log-price is only \(\mu - \frac{\sigma^2}{2}\): the variance penalty lowers the long-term compound growth rate of the asset. This is the mathematical essence of "Volatility Drag".
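
The drag is easy to observe by simulation: sampling the exact solution, the mean log-return concentrates near \((\mu - \sigma^2/2)T\) while the mean price ratio stays near \(e^{\mu T}\). A sketch (parameter values and seed are arbitrary):

```python
import numpy as np

# Sketch: sample the exact GBM solution and observe the volatility drag:
# mean log(S(T)/S0) is near (mu - sigma^2/2)*T, while mean S(T)/S0 is
# near exp(mu*T).
rng = np.random.default_rng(6)
mu, sigma, T, paths = 0.10, 0.40, 1.0, 200000
W_T = rng.normal(0.0, np.sqrt(T), size=paths)
S_ratio = np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T)   # S(T)/S0

print(np.mean(np.log(S_ratio)), (mu - 0.5 * sigma**2) * T)  # log-growth, with drag
print(np.mean(S_ratio), np.exp(mu * T))                     # raw expectation
```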


4. Brownian Bridge

Finally, we explore a case of an SDE with strong boundary constraints.

In statistics, we often need to study a random walk that is "pinned" between two fixed points (e.g., the limit process in the Kolmogorov-Smirnov test). This is the Brownian bridge process.

Definition: Standard Brownian Bridge Equation

Consider the following stochastic differential equation with a singular drift term, with the initial condition \(B(0) = 0\), and time restricted to \(t \in [0, 1)\):

\[ dB(t) = -\frac{B(t)}{1-t} dt + dW(t) \]

Intuitively: as time \(t\) approaches \(1\), the denominator \((1-t) \to 0\) and the drift exerts an ever-stronger restoring force that pulls the trajectory of \(B(t)\) back toward \(0\).

We can solve this SDE using the integrating factor method from classical ODEs combined with the Itô product rule.

Explicit Solution of the Brownian Bridge

We move the term containing \(B\) to the left side of the equation:

\[ dB(t) + \frac{B(t)}{1-t} dt = dW(t) \]

Multiply both sides by the integrating factor \(\frac{1}{1-t}\):

\[ \frac{1}{1-t} dB(t) + \frac{B(t)}{(1-t)^2} dt = \frac{dW(t)}{1-t} \]

Observing the left side, it is exactly the expansion of the differential \(d\left(\frac{B(t)}{1-t}\right)\): since the integrating factor \(\frac{1}{1-t}\) is deterministic and of bounded variation, its quadratic covariation with \(B(t)\) vanishes, so the ordinary product rule applies with no Itô correction term:

\[ d\left(\frac{B(t)}{1-t}\right) = \frac{dW(t)}{1-t} \]

Integrate both sides from \(0\) to \(t\) (note that \(B(0) = 0\)):

\[ \frac{B(t)}{1-t} - 0 = \int_0^t \frac{dW(s)}{1-s} \]

Rearranging yields the explicit integral solution of the Brownian bridge:

\[ B(t) = (1-t) \int_0^t \frac{dW(s)}{1-s}, \quad 0 \le t < 1 \]
Framework for Proving Continuity and Zero Value at the Endpoint \(t=1\)

The denominator of the above integral exhibits a singularity as \(t \to 1\). To prove that almost all paths can continuously "land" at \(B(1) = 0\), advanced probability tools are needed for bounding.

First, analyze the variance of the integral part. Let \(M(t) = \int_0^t \frac{dW(s)}{1-s}\). Since this is an Itô integral with a deterministic integrand, its expectation is \(0\), and its variance is given by the isometry:

\[ E[M(t)^2] = \int_0^t \frac{1}{(1-s)^2} ds = \frac{1}{1-t} - 1 = \frac{t}{1-t} \]

Therefore, the variance of \(B(t) = (1-t) M(t)\) is \(E[B(t)^2] = (1-t)^2 \frac{t}{1-t} = t(1-t)\). Obviously, as \(t \to 1\), the variance approaches 0. But this only shows convergence in probability or in \(L^2\).

To prove almost sure (a.s.) continuous convergence, one must employ the Borel-Cantelli Lemma and maximal inequalities. Take dyadic grid points \(t_n = 1 - 2^{-n}\). For the interval \([t_n, t_{n+1}]\), use Chebyshev's inequality and the martingale maximal inequality to estimate the probability bound of deviating from the target value within that interval:

\[ P\left( \max_{t \in [t_n, t_{n+1}]} |B(t)| > \varepsilon \right) \le \frac{C}{\varepsilon^2} 2^{-\eta n} \quad \text{for some constants } C, \eta > 0 \]

Since the right side constitutes a convergent geometric series \(\sum 2^{-\eta n} < \infty\), according to the First Borel-Cantelli Lemma, the probability of large deviation events occurring is 0. Therefore, almost all Brownian bridge paths are forced to converge to \(0\) as \(t \to 1\). \(\square\)
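
The variance formula \(E[B(t)^2] = t(1-t)\) can also be checked by discretizing the bridge SDE directly with Euler-Maruyama; the sketch below evaluates the variance at the interior time \(t = 1/2\) (grid, path count, and seed are arbitrary):

```python
import numpy as np

# Sketch: Euler-Maruyama for dB = -B/(1-t) dt + dW on [0, 1), checking
# the variance E[B(t)^2] = t(1-t) at t = 0.5 (where it equals 0.25).
rng = np.random.default_rng(7)
steps, paths = 2000, 50000
dt = 1.0 / steps
B = np.zeros(paths)
for i in range(steps // 2):                  # integrate up to t = 0.5
    t = i * dt
    dW = rng.normal(0.0, np.sqrt(dt), size=paths)
    B = B + (-B / (1.0 - t)) * dt + dW

print(np.var(B), 0.5 * (1.0 - 0.5))          # sample variance vs t(1-t)
```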
