Probability Theory
This module covers core inequalities in probability theory, convergence theory of random variables, and large sample limit theorems.
1. Core Probability Inequalities
Markov's Inequality
For a non-negative random variable \(X \ge 0\) and any \(a > 0\):
\[ P(X \ge a) \le \frac{E[X]}{a} \]
Chebyshev's Inequality
For any random variable \(X\) with finite variance and any \(a > 0\):
\[ P(|X - E[X]| \ge a) \le \frac{Var(X)}{a^2} \]
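Both bounds can be spot-checked by simulation. The sketch below (an assumed example, not from the source) samples \(X \sim Exp(1)\), where \(E[X] = 1\) and \(Var(X) = 1\), and compares the empirical tail probabilities against the Markov and Chebyshev bounds at \(a = 3\):

```python
import random

# Monte Carlo check of Markov's and Chebyshev's inequalities for X ~ Exp(1),
# where E[X] = 1 and Var(X) = 1 (assumed example).
random.seed(0)
n = 100_000
samples = [random.expovariate(1.0) for _ in range(n)]

a = 3.0
# Empirical P(X >= a) vs Markov bound E[X] / a
p_markov = sum(x >= a for x in samples) / n
markov_bound = 1.0 / a
# Empirical P(|X - E[X]| >= a) vs Chebyshev bound Var(X) / a^2
p_cheb = sum(abs(x - 1.0) >= a for x in samples) / n
cheb_bound = 1.0 / a**2
```

For this heavy-ish tailed example the true probabilities (\(e^{-3} \approx 0.05\) and \(e^{-4} \approx 0.02\)) sit well below the bounds, illustrating that Markov and Chebyshev are loose but universal.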
Kolmogorov's Maximal Inequality
If \(X_1, \dots, X_n\) are independent with mean 0 and finite variances, and \(S_k = \sum_{i=1}^k X_i\) denotes the partial sums, then for any \(a > 0\):
\[ P\left(\max_{1 \le k \le n} |S_k| \ge a\right) \le \frac{Var(S_n)}{a^2} = \frac{\sum_{i=1}^n Var(X_i)}{a^2} \]
(Only independence is required here, not identical distribution.)
Borel-Cantelli Lemma
For a sequence of events \(\{A_n\}\), write the event that \(A_n\) occurs infinitely often (i.o.) as \(\limsup_{n \to \infty} A_n = \bigcap_{n=1}^\infty \bigcup_{k \ge n} A_k\):
- First Lemma: If \(\sum_{n=1}^\infty P(A_n) < \infty\), then \(P(A_n \text{ i.o.}) = 0\).
- Second Lemma: If \(\{A_n\}\) are mutually independent, and \(\sum_{n=1}^\infty P(A_n) = \infty\), then \(P(A_n \text{ i.o.}) = 1\).
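The dividing line between the two lemmas is the convergence of \(\sum P(A_n)\). A small numeric sketch (assumed example): \(P(A_n) = 1/n^2\) is summable (the series tends to \(\pi^2/6\)), so such events occur only finitely often a.s.; \(P(A_n) = 1/n\) diverges, so if the \(A_n\) are independent they occur infinitely often a.s.

```python
import math

# Partial sums illustrating the two Borel-Cantelli regimes:
# sum 1/n^2 converges (first lemma applies), sum 1/n diverges (second lemma
# applies when the events are independent).
N = 1_000_000
sum_summable = sum(1.0 / n**2 for n in range(1, N + 1))
sum_divergent = sum(1.0 / n for n in range(1, N + 1))

basel = math.pi ** 2 / 6  # exact value of the full series sum of 1/n^2
```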
2. Convergence of Random Variables
Definitions of Four Modes of Convergence
- Convergence in Probability (\(X_n \xrightarrow{P} X\)): For any \(\epsilon > 0\), \(\lim_{n \to \infty} P(|X_n - X| > \epsilon) = 0\).
- Almost Sure Convergence (\(X_n \xrightarrow{a.s.} X\)): \(P(\lim_{n \to \infty} X_n = X) = 1\).
- Convergence in \(L^p\) (\(X_n \xrightarrow{L^p} X\)): \(\lim_{n \to \infty} E[|X_n - X|^p] = 0\).
- Convergence in Distribution (\(X_n \xrightarrow{d} X\)):
- Definition: For all continuity points \(x\) of the limit distribution function \(F(x)\), \(\lim_{n \to \infty} F_n(x) = F(x)\).
- Lévy's Continuity Theorem: \(X_n \xrightarrow{d} X\) if and only if the characteristic functions converge pointwise, \(\phi_{X_n}(t) \to \phi_X(t)\) for every \(t\), with the limit \(\phi_X\) continuous at \(t = 0\).
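A classical instance of the continuity theorem is the Poisson limit \(Bin(n, \lambda/n) \xrightarrow{d} Pois(\lambda)\): the binomial characteristic function \((1 - p + pe^{it})^n\) converges pointwise to the Poisson one \(e^{\lambda(e^{it}-1)}\). The sketch below (assumed example, \(\lambda = 2\)) checks this numerically:

```python
import cmath

# Pointwise convergence of characteristic functions: Bin(n, lam/n) -> Pois(lam).
lam = 2.0

def cf_binomial(t, n, p):
    # phi(t) = (1 - p + p e^{it})^n
    return (1 - p + p * cmath.exp(1j * t)) ** n

def cf_poisson(t, lam):
    # phi(t) = exp(lam (e^{it} - 1))
    return cmath.exp(lam * (cmath.exp(1j * t) - 1))

t = 1.3  # arbitrary evaluation point
gaps = [abs(cf_binomial(t, n, lam / n) - cf_poisson(t, lam))
        for n in (10, 100, 10_000)]
```

The gap shrinks as \(n\) grows, consistent with convergence in distribution by Lévy's theorem.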
Relationships Between Modes of Convergence (Important!)
- Implication Chains: \(L^p \implies P \implies d\) and \(a.s. \implies P \implies d\).
- Reverse Implications (Conditional):
- \(P \to L^p\): If \(X_n \xrightarrow{P} X\) and \(\{|X_n|^p\}\) is Uniformly Integrable (UI), then \(X_n \xrightarrow{L^p} X\).
- \(P \to a.s.\): If \(X_n \xrightarrow{P} X\), then there exists a subsequence \(\{X_{n_k}\}\) such that \(X_{n_k} \xrightarrow{a.s.} X\).
- \(d \to P\): If the limit \(X = c\) is a degenerate constant, then convergence in distribution to \(c\) implies convergence in probability to \(c\).
- Slutsky's Theorem: If \(X_n \xrightarrow{d} X\) and \(Y_n \xrightarrow{P} c\) (a constant), then \(X_n Y_n \xrightarrow{d} cX\) and \(X_n + Y_n \xrightarrow{d} X + c\).
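Slutsky's theorem can be illustrated by Monte Carlo (assumed example, not from the source): with \(n\) fair coin flips per replication, \(X_n\) = the standardized sum \(\xrightarrow{d} N(0,1)\) by the CLT, and \(Y_n\) = the sample mean \(\xrightarrow{P} 0.5\) by the WLLN, so \(X_n + Y_n \xrightarrow{d} N(0.5, 1)\) and \(X_n Y_n \xrightarrow{d} 0.5\,N(0,1)\):

```python
import math
import random
import statistics

# Monte Carlo sketch of Slutsky's theorem with n Bernoulli(0.5) flips
# per replication.
random.seed(0)
n, reps = 1_000, 4_000
sums, prods = [], []
for _ in range(reps):
    s = sum(random.random() < 0.5 for _ in range(n))
    x = (s - n * 0.5) / math.sqrt(n * 0.25)  # standardized sum -> N(0, 1)
    y = s / n                                # sample mean -> 0.5
    sums.append(x + y)
    prods.append(x * y)

mean_sum = statistics.mean(sums)   # should be near 0.5
std_prod = statistics.stdev(prods) # should be near 0.5
```

Note that \(X_n\) and \(Y_n\) here are dependent (both are functions of the same flips); Slutsky's theorem still applies precisely because \(Y_n\) converges to a constant.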
3. Law of Large Numbers and Central Limit Theorem
Khinchin's Weak Law of Large Numbers (WLLN)
Let \(X_1, X_2, \dots\) be a sequence of independent and identically distributed (i.i.d.) random variables with finite expectation \(E[X_i] = \mu\) (i.e., \(E[|X_1|] < \infty\)). Then the sample mean converges in probability to \(\mu\):
\[ \bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i \xrightarrow{P} \mu \]
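The defining limit \(P(|\bar{X}_n - \mu| > \epsilon) \to 0\) can be seen directly by simulation. The sketch below (assumed example) uses \(X_i \sim U(0,1)\) with \(\mu = 0.5\) and estimates, across replications, how often the sample mean misses \(\mu\) by more than \(\epsilon\):

```python
import random

# Monte Carlo sketch of the WLLN for X_i ~ U(0,1), mu = 0.5.
random.seed(0)
eps, reps = 0.02, 1_500

def freq_far(n):
    # Fraction of replications whose sample mean misses mu by more than eps.
    count = 0
    for _ in range(reps):
        xbar = sum(random.random() for _ in range(n)) / n
        if abs(xbar - 0.5) > eps:
            count += 1
    return count / reps

f_small, f_large = freq_far(50), freq_far(2000)
```

At \(n = 50\) the miss frequency is substantial; at \(n = 2000\) it is nearly zero, as the WLLN predicts.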
Kolmogorov's Strong Law of Large Numbers (SLLN)
If the \(X_i\) are i.i.d., then \(E[|X_1|] < \infty\) is a necessary and sufficient condition for \(\bar{X}_n \xrightarrow{a.s.} E[X_1]\).
Central Limit Theorem (CLT) and Conditions
For a sequence of independent but not necessarily identically distributed random variables \(X_1, X_2, \dots\), let \(E[X_i]=\mu_i\), \(Var(X_i)=\sigma_i^2\), and \(S_n^2 = \sum_{i=1}^n \sigma_i^2\). The standardized sum is \(Z_n = \frac{1}{S_n} \sum_{i=1}^n (X_i - \mu_i)\).
- Lindeberg's Condition (necessary and sufficient, given the Feller negligibility condition \(\max_{1 \le i \le n} \sigma_i^2 / S_n^2 \to 0\)): For every \(\epsilon > 0\),
\[ \lim_{n \to \infty} \frac{1}{S_n^2} \sum_{i=1}^n E\left[(X_i - \mu_i)^2 \, \mathbf{1}\{|X_i - \mu_i| > \epsilon S_n\}\right] = 0, \]
in which case \(Z_n \xrightarrow{d} N(0,1)\). (Intuition: no single variable's extreme tail values contribute a non-negligible share of the total variance.)
- Lyapunov's Condition (sufficient): There exists a \(\delta > 0\) such that
\[ \lim_{n \to \infty} \frac{1}{S_n^{2+\delta}} \sum_{i=1}^n E\left[|X_i - \mu_i|^{2+\delta}\right] = 0. \]
(Note: Lyapunov's condition is often easier to verify, since it only requires a \((2+\delta)\)-th moment bound; on the event \(\{|X_i - \mu_i| > \epsilon S_n\}\) one has \((X_i - \mu_i)^2 \le |X_i - \mu_i|^{2+\delta} / (\epsilon S_n)^{\delta}\), so Lyapunov's condition implies Lindeberg's, and thus \(Z_n \xrightarrow{d} N(0,1)\).)
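A Monte Carlo sketch of the non-identically-distributed CLT (assumed example, not from the source): take independent \(X_i \sim U(-i, i)\), so \(\mu_i = 0\), \(\sigma_i^2 = i^2/3\), and Lyapunov's condition with \(\delta = 1\) holds; the standardized sum \(Z_n\) should then look approximately \(N(0,1)\):

```python
import math
import random

# CLT check for independent, non-identical X_i ~ U(-i, i): mean 0,
# variance i^2/3, and Z_n = (1/S_n) * sum X_i should be ~ N(0, 1).
random.seed(0)
n, reps = 200, 5_000
s_n = math.sqrt(sum(i * i / 3 for i in range(1, n + 1)))  # S_n

zs = []
for _ in range(reps):
    total = sum(random.uniform(-i, i) for i in range(1, n + 1))
    zs.append(total / s_n)

frac_neg = sum(z <= 0 for z in zs) / reps            # near Phi(0) = 0.5
frac_within_1 = sum(abs(z) <= 1 for z in zs) / reps  # near 0.6827
```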
4. Quick Reference Table for Common Distributions
| Distribution Name | Notation | Expectation \(E[X]\) | Variance \(Var(X)\) | PDF/PMF | Characteristic Function \(\phi_X(t)\) | Moment Generating Function \(M_X(t)\) |
|---|---|---|---|---|---|---|
| Normal Distribution | \(N(\mu, \sigma^2)\) | \(\mu\) | \(\sigma^2\) | \(\frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\) | \(e^{it\mu - \frac{1}{2}\sigma^2 t^2}\) | \(e^{t\mu + \frac{1}{2}\sigma^2 t^2}\) |
| Exponential Distribution | \(Exp(\lambda)\) | \(\frac{1}{\lambda}\) | \(\frac{1}{\lambda^2}\) | \(\lambda e^{-\lambda x} \quad (x \ge 0)\) | \(\frac{\lambda}{\lambda - it}\) | \(\frac{\lambda}{\lambda - t} \quad (t < \lambda)\) |
| Uniform Distribution | \(U(a, b)\) | \(\frac{a+b}{2}\) | \(\frac{(b-a)^2}{12}\) | \(\frac{1}{b-a} \quad (a \le x \le b)\) | \(\frac{e^{itb} - e^{ita}}{it(b-a)}\) | \(\frac{e^{tb} - e^{ta}}{t(b-a)}\) |
| Poisson Distribution | \(Pois(\lambda)\) | \(\lambda\) | \(\lambda\) | \(e^{-\lambda} \frac{\lambda^k}{k!} \quad (k = 0, 1, 2, \dots)\) | \(e^{\lambda(e^{it} - 1)}\) | \(e^{\lambda(e^t - 1)}\) |
| Binomial Distribution | \(Bin(n, p)\) | \(np\) | \(np(1-p)\) | \(\binom{n}{k} p^k (1-p)^{n-k}\) | \((1 - p + pe^{it})^n\) | \((1 - p + pe^t)^n\) |
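Rows of the table can be spot-checked by simulation. A minimal sketch for the \(Exp(\lambda)\) row with \(\lambda = 2\) (an assumed value), verifying \(E[X] = 1/\lambda = 0.5\) and \(Var(X) = 1/\lambda^2 = 0.25\):

```python
import random
import statistics

# Simulation spot-check of the Exp(lambda) table row (lambda = 2).
random.seed(0)
lam, n = 2.0, 200_000
xs = [random.expovariate(lam) for _ in range(n)]

emp_mean = statistics.fmean(xs)   # should be near 1/lam = 0.5
emp_var = statistics.variance(xs) # should be near 1/lam^2 = 0.25
```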