Probability Theory
This module covers core inequalities in probability theory, convergence theory of random variables, and large sample limit theorems.
1. Core Probability Inequalities
Markov's Inequality
For a non-negative random variable \(X \ge 0\) and any \(a > 0\):
\[ P(X \ge a) \le \frac{E[X]}{a} \]
Chebyshev's Inequality
For any random variable \(X\) with finite variance and any \(a > 0\):
\[ P(|X - E[X]| \ge a) \le \frac{Var(X)}{a^2} \]
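Both bounds can be spot-checked by simulation. The sketch below (an assumed example, not from the source) samples \(X \sim Exp(1)\), where \(E[X] = 1\) and \(Var(X) = 1\), and compares the empirical tail probabilities against the Markov and Chebyshev bounds at \(a = 3\):

```python
import random

# Monte Carlo check of Markov's and Chebyshev's inequalities for X ~ Exp(1),
# where E[X] = 1 and Var(X) = 1 (assumed example).
random.seed(0)
n = 100_000
samples = [random.expovariate(1.0) for _ in range(n)]

a = 3.0
# Empirical P(X >= a) vs Markov bound E[X] / a
p_markov = sum(x >= a for x in samples) / n
markov_bound = 1.0 / a
# Empirical P(|X - E[X]| >= a) vs Chebyshev bound Var(X) / a^2
p_cheb = sum(abs(x - 1.0) >= a for x in samples) / n
cheb_bound = 1.0 / a**2
```

For this heavy-ish tailed example the true probabilities (\(e^{-3} \approx 0.05\) and \(e^{-4} \approx 0.02\)) sit well below the bounds, illustrating that Markov and Chebyshev are loose but universal.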
Kolmogorov's Maximal Inequality
If \(X_1, \dots, X_n\) are independent with mean 0 and finite variances, and \(S_k = \sum_{i=1}^k X_i\) denotes the partial sums, then for any \(a > 0\):
\[ P\left(\max_{1 \le k \le n} |S_k| \ge a\right) \le \frac{Var(S_n)}{a^2} = \frac{\sum_{i=1}^n Var(X_i)}{a^2} \]
(Only independence is required here, not identical distribution.)
Borel-Cantelli Lemma
For a sequence of events \(\{A_n\}\), write the event that \(A_n\) occurs infinitely often (i.o.) as \(\limsup_{n \to \infty} A_n = \bigcap_{n=1}^\infty \bigcup_{k \ge n} A_k\):
- First Lemma: If \(\sum_{n=1}^\infty P(A_n) < \infty\), then \(P(A_n \text{ i.o.}) = 0\).
- Second Lemma: If \(\{A_n\}\) are mutually independent, and \(\sum_{n=1}^\infty P(A_n) = \infty\), then \(P(A_n \text{ i.o.}) = 1\).
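The dividing line between the two lemmas is the convergence of \(\sum P(A_n)\). A small numeric sketch (assumed example): \(P(A_n) = 1/n^2\) is summable (the series tends to \(\pi^2/6\)), so such events occur only finitely often a.s.; \(P(A_n) = 1/n\) diverges, so if the \(A_n\) are independent they occur infinitely often a.s.

```python
import math

# Partial sums illustrating the two Borel-Cantelli regimes:
# sum 1/n^2 converges (first lemma applies), sum 1/n diverges (second lemma
# applies when the events are independent).
N = 1_000_000
sum_summable = sum(1.0 / n**2 for n in range(1, N + 1))
sum_divergent = sum(1.0 / n for n in range(1, N + 1))

basel = math.pi ** 2 / 6  # exact value of the full series sum of 1/n^2
```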
2. Convergence of Random Variables
Definitions of Four Modes of Convergence
- Convergence in Probability (\(X_n \xrightarrow{P} X\)): For any \(\epsilon > 0\), \(\lim_{n \to \infty} P(|X_n - X| > \epsilon) = 0\).
- Almost Sure Convergence (\(X_n \xrightarrow{a.s.} X\)): \(P(\lim_{n \to \infty} X_n = X) = 1\).
- Convergence in \(L^p\) (\(X_n \xrightarrow{L^p} X\)): \(\lim_{n \to \infty} E[|X_n - X|^p] = 0\).
- Convergence in Distribution (\(X_n \xrightarrow{d} X\)):
- Definition: For all continuity points \(x\) of the limit distribution function \(F(x)\), \(\lim_{n \to \infty} F_n(x) = F(x)\).
- Lévy's Continuity Theorem: \(X_n \xrightarrow{d} X\) if and only if the characteristic functions converge pointwise, \(\phi_{X_n}(t) \to \phi_X(t)\) for every \(t\), with the limit \(\phi_X\) continuous at \(t = 0\).
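A classical instance of the continuity theorem is the Poisson limit \(Bin(n, \lambda/n) \xrightarrow{d} Pois(\lambda)\): the binomial characteristic function \((1 - p + pe^{it})^n\) converges pointwise to the Poisson one \(e^{\lambda(e^{it}-1)}\). The sketch below (assumed example, \(\lambda = 2\)) checks this numerically:

```python
import cmath

# Pointwise convergence of characteristic functions: Bin(n, lam/n) -> Pois(lam).
lam = 2.0

def cf_binomial(t, n, p):
    # phi(t) = (1 - p + p e^{it})^n
    return (1 - p + p * cmath.exp(1j * t)) ** n

def cf_poisson(t, lam):
    # phi(t) = exp(lam (e^{it} - 1))
    return cmath.exp(lam * (cmath.exp(1j * t) - 1))

t = 1.3  # arbitrary evaluation point
gaps = [abs(cf_binomial(t, n, lam / n) - cf_poisson(t, lam))
        for n in (10, 100, 10_000)]
```

The gap shrinks as \(n\) grows, consistent with convergence in distribution by Lévy's theorem.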
Relationships Between Modes of Convergence (Important!)
- Implication Chains: \(L^p \implies P \implies d\) and \(a.s. \implies P \implies d\).
- Reverse Implications (Conditional):
- \(P \to L^p\): If \(X_n \xrightarrow{P} X\) and \(\{|X_n|^p\}\) is Uniformly Integrable (UI), then \(X_n \xrightarrow{L^p} X\).
- \(P \to a.s.\): If \(X_n \xrightarrow{P} X\), then there exists a subsequence \(\{X_{n_k}\}\) such that \(X_{n_k} \xrightarrow{a.s.} X\).
- \(d \to P\): If the limit \(X = c\) is a degenerate constant, then convergence in distribution to \(c\) implies convergence in probability to \(c\).
- Slutsky's Theorem: If \(X_n \xrightarrow{d} X\) and \(Y_n \xrightarrow{P} c\) (a constant), then \(X_n Y_n \xrightarrow{d} cX\) and \(X_n + Y_n \xrightarrow{d} X + c\).
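Slutsky's theorem can be illustrated by Monte Carlo (assumed example, not from the source): with \(n\) fair coin flips per replication, \(X_n\) = the standardized sum \(\xrightarrow{d} N(0,1)\) by the CLT, and \(Y_n\) = the sample mean \(\xrightarrow{P} 0.5\) by the WLLN, so \(X_n + Y_n \xrightarrow{d} N(0.5, 1)\) and \(X_n Y_n \xrightarrow{d} 0.5\,N(0,1)\):

```python
import math
import random
import statistics

# Monte Carlo sketch of Slutsky's theorem with n Bernoulli(0.5) flips
# per replication.
random.seed(0)
n, reps = 1_000, 4_000
sums, prods = [], []
for _ in range(reps):
    s = sum(random.random() < 0.5 for _ in range(n))
    x = (s - n * 0.5) / math.sqrt(n * 0.25)  # standardized sum -> N(0, 1)
    y = s / n                                # sample mean -> 0.5
    sums.append(x + y)
    prods.append(x * y)

mean_sum = statistics.mean(sums)   # should be near 0.5
std_prod = statistics.stdev(prods) # should be near 0.5
```

Note that \(X_n\) and \(Y_n\) here are dependent (both are functions of the same flips); Slutsky's theorem still applies precisely because \(Y_n\) converges to a constant.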
3. Law of Large Numbers and Central Limit Theorem
Khinchin's Weak Law of Large Numbers (WLLN)
Let \(X_1, X_2, \dots\) be a sequence of independent and identically distributed (i.i.d.) random variables with finite expectation \(E[X_i] = \mu\) (i.e., \(E[|X_1|] < \infty\)). Then the sample mean converges in probability to \(\mu\):
\[ \bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i \xrightarrow{P} \mu \]
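The defining limit \(P(|\bar{X}_n - \mu| > \epsilon) \to 0\) can be seen directly by simulation. The sketch below (assumed example) uses \(X_i \sim U(0,1)\) with \(\mu = 0.5\) and estimates, across replications, how often the sample mean misses \(\mu\) by more than \(\epsilon\):

```python
import random

# Monte Carlo sketch of the WLLN for X_i ~ U(0,1), mu = 0.5.
random.seed(0)
eps, reps = 0.02, 1_500

def freq_far(n):
    # Fraction of replications whose sample mean misses mu by more than eps.
    count = 0
    for _ in range(reps):
        xbar = sum(random.random() for _ in range(n)) / n
        if abs(xbar - 0.5) > eps:
            count += 1
    return count / reps

f_small, f_large = freq_far(50), freq_far(2000)
```

At \(n = 50\) the miss frequency is substantial; at \(n = 2000\) it is nearly zero, as the WLLN predicts.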
Kolmogorov's Strong Law of Large Numbers (SLLN)
If the \(X_i\) are i.i.d., then \(E[|X_1|] < \infty\) is a necessary and sufficient condition for \(\bar{X}_n \xrightarrow{a.s.} E[X_1]\).
Central Limit Theorem (CLT) and Conditions
For a sequence of independent but not necessarily identically distributed random variables \(X_1, X_2, \dots\), let \(E[X_i]=\mu_i\), \(Var(X_i)=\sigma_i^2\), and \(S_n^2 = \sum_{i=1}^n \sigma_i^2\). The standardized sum is \(Z_n = \frac{1}{S_n} \sum_{i=1}^n (X_i - \mu_i)\).
- Lindeberg's Condition (necessary and sufficient, given the Feller negligibility condition \(\max_{1 \le i \le n} \sigma_i^2 / S_n^2 \to 0\)): For every \(\epsilon > 0\),
\[ \lim_{n \to \infty} \frac{1}{S_n^2} \sum_{i=1}^n E\left[(X_i - \mu_i)^2 \, \mathbf{1}\{|X_i - \mu_i| > \epsilon S_n\}\right] = 0, \]
in which case \(Z_n \xrightarrow{d} N(0,1)\). (Intuition: no single variable's extreme tail values contribute a non-negligible share of the total variance.)
- Lyapunov's Condition (sufficient): There exists a \(\delta > 0\) such that
\[ \lim_{n \to \infty} \frac{1}{S_n^{2+\delta}} \sum_{i=1}^n E\left[|X_i - \mu_i|^{2+\delta}\right] = 0. \]
(Note: Lyapunov's condition is often easier to verify, since it only requires a \((2+\delta)\)-th moment bound; on the event \(\{|X_i - \mu_i| > \epsilon S_n\}\) one has \((X_i - \mu_i)^2 \le |X_i - \mu_i|^{2+\delta} / (\epsilon S_n)^{\delta}\), so Lyapunov's condition implies Lindeberg's, and thus \(Z_n \xrightarrow{d} N(0,1)\).)
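A Monte Carlo sketch of the non-identically-distributed CLT (assumed example, not from the source): take independent \(X_i \sim U(-i, i)\), so \(\mu_i = 0\), \(\sigma_i^2 = i^2/3\), and Lyapunov's condition with \(\delta = 1\) holds; the standardized sum \(Z_n\) should then look approximately \(N(0,1)\):

```python
import math
import random

# CLT check for independent, non-identical X_i ~ U(-i, i): mean 0,
# variance i^2/3, and Z_n = (1/S_n) * sum X_i should be ~ N(0, 1).
random.seed(0)
n, reps = 200, 5_000
s_n = math.sqrt(sum(i * i / 3 for i in range(1, n + 1)))  # S_n

zs = []
for _ in range(reps):
    total = sum(random.uniform(-i, i) for i in range(1, n + 1))
    zs.append(total / s_n)

frac_neg = sum(z <= 0 for z in zs) / reps            # near Phi(0) = 0.5
frac_within_1 = sum(abs(z) <= 1 for z in zs) / reps  # near 0.6827
```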
4. Quick Reference Table for Common Distributions
| Distribution Name | Notation | Expectation \(E[X]\) | Variance \(Var(X)\) | PDF/PMF | Characteristic Function \(\phi_X(t)\) | Moment Generating Function \(M_X(t)\) |
|---|---|---|---|---|---|---|
| Normal Distribution | \(N(\mu, \sigma^2)\) | \(\mu\) | \(\sigma^2\) | \(\frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\) | \(e^{it\mu - \frac{1}{2}\sigma^2 t^2}\) | \(e^{t\mu + \frac{1}{2}\sigma^2 t^2}\) |
| Exponential Distribution | \(Exp(\lambda)\) | \(\frac{1}{\lambda}\) | \(\frac{1}{\lambda^2}\) | \(\lambda e^{-\lambda x} \quad (x \ge 0)\) | \(\frac{\lambda}{\lambda - it}\) | \(\frac{\lambda}{\lambda - t} \quad (t < \lambda)\) |
| Uniform Distribution | \(U(a, b)\) | \(\frac{a+b}{2}\) | \(\frac{(b-a)^2}{12}\) | \(\frac{1}{b-a} \quad (a \le x \le b)\) | \(\frac{e^{itb} - e^{ita}}{it(b-a)}\) | \(\frac{e^{tb} - e^{ta}}{t(b-a)}\) |
| Poisson Distribution | \(Pois(\lambda)\) | \(\lambda\) | \(\lambda\) | \(e^{-\lambda} \frac{\lambda^k}{k!} \quad (k = 0, 1, 2, \dots)\) | \(e^{\lambda(e^{it} - 1)}\) | \(e^{\lambda(e^t - 1)}\) |
| Binomial Distribution | \(Bin(n, p)\) | \(np\) | \(np(1-p)\) | \(\binom{n}{k} p^k (1-p)^{n-k}\) | \((1 - p + pe^{it})^n\) | \((1 - p + pe^t)^n\) |
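Rows of the table can be spot-checked by simulation. A minimal sketch for the \(Exp(\lambda)\) row with \(\lambda = 2\) (an assumed value), verifying \(E[X] = 1/\lambda = 0.5\) and \(Var(X) = 1/\lambda^2 = 0.25\):

```python
import random
import statistics

# Simulation spot-check of the Exp(lambda) table row (lambda = 2).
random.seed(0)
lam, n = 2.0, 200_000
xs = [random.expovariate(lam) for _ in range(n)]

emp_mean = statistics.fmean(xs)   # should be near 1/lam = 0.5
emp_var = statistics.variance(xs) # should be near 1/lam^2 = 0.25
```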