MLE Parameter Estimator
Fit a Normal distribution using Maximum Likelihood Estimation, visualize the likelihood surface, and see how MLE connects to least-squares regression.
Feb 23, 2026, Eric
The Core Idea
Maximum Likelihood Estimation finds the parameters (μ, σ) that make the observed data most probable under the model. For a Normal distribution, the MLE solution is closed-form: μ̂ = sample mean, σ̂² = sample variance (dividing by N, not N−1). The likelihood surface shows how the log-likelihood changes as you move away from this optimal point — and why the MLE is a unique global maximum.
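As a concrete check, here is a minimal sketch in Python. The synthetic returns are a stand-in for the dataset used below; scipy.stats.norm.fit recovers the same closed-form estimates:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0005, scale=0.0107, size=200)  # stand-in for 200 daily returns

# Closed-form MLE for a Normal: sample mean and N-denominator standard deviation.
mu_hat = x.mean()
sigma_hat = x.std(ddof=0)   # ddof=0 divides by N, the MLE convention

# scipy.stats.norm.fit maximizes the likelihood and agrees exactly.
mu_fit, sigma_fit = norm.fit(x)
assert np.isclose(mu_hat, mu_fit) and np.isclose(sigma_hat, sigma_fit)

# Maximized log-likelihood, evaluated directly from the density.
ll_max = norm.logpdf(x, loc=mu_hat, scale=sigma_hat).sum()
print(mu_hat, sigma_hat, ll_max)
```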
The connection to regression: minimizing mean squared error in OLS regression is exactly equivalent to maximizing likelihood under a Normal error assumption. MLE and least-squares are the same problem in disguise.
Fitted results (N = 200):
MLE μ̂ (daily): 0.0497%
MLE σ̂ (daily): 1.0682%
Annualized μ̂: 12.5%
Annualized σ̂: 17.0%
Max log-likelihood at (μ̂, σ̂): ℒ = 623.31
Equivalent closed form at the optimum: ℒ(μ̂, σ̂) = −N·log(σ̂) − (N/2)·log(2π) − N/2
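The closed form works because, at the optimum, Σ(xi − μ̂)² = N·σ̂² by definition of the MLE variance, so the residual term collapses to N/2. A quick numerical check of that identity, using synthetic stand-in data rather than the app's actual 200 returns:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0005, 0.0107, size=200)   # illustrative data, not the app's dataset
N = len(x)
mu_hat, sigma_hat = x.mean(), x.std(ddof=0)

# Full log-likelihood at the optimum...
ll_full = -N*np.log(sigma_hat) - (N/2)*np.log(2*np.pi) \
          - ((x - mu_hat)**2).sum() / (2*sigma_hat**2)

# ...and the simplified form: the residual term collapses to N/2
# because Σ(xi − μ̂)² = N·σ̂² when σ̂² is the N-denominator variance.
ll_simple = -N*np.log(sigma_hat) - (N/2)*np.log(2*np.pi) - N/2

assert np.isclose(ll_full, ll_simple)
```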
Data (200 points)
Log-Likelihood Surface
X-axis = μ, Y-axis = σ. Brighter cells = higher log-likelihood. The white dot marks the MLE optimum. Hover over a cell to inspect its (μ, σ, ℒ).
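The surface itself is cheap to compute: evaluate ℒ on a grid of (μ, σ) candidates. A minimal NumPy sketch; the grid ranges and synthetic data are illustrative, not the app's actual values:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0005, 0.0107, size=200)   # stand-in data
N = len(x)

# Grid of candidate parameters around the sample mean (ranges are illustrative).
mus = np.linspace(x.mean() - 0.003, x.mean() + 0.003, 80)
sigmas = np.linspace(0.006, 0.018, 80)
M, S = np.meshgrid(mus, sigmas)

# Vectorized log-likelihood over the whole (μ, σ) grid.
sq = ((x[:, None, None] - M[None, :, :])**2).sum(axis=0)
LL = -N*np.log(S) - (N/2)*np.log(2*np.pi) - sq / (2*S**2)

# The brightest cell sits at the grid point nearest the MLE (the white dot).
i, j = np.unravel_index(LL.argmax(), LL.shape)
print(M[i, j], S[i, j])
```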
Log-Likelihood vs. σ (at MLE μ̂)
The single peak confirms the MLE σ̂ is a unique global maximum: shrinking σ inflates the residual term Σ(xi − μ̂)²/(2σ²), while growing σ inflates N·log(σ), and the two balance exactly at σ̂² = (1/N)·Σ(xi − μ̂)². (The least-squares equivalence is in μ: at fixed σ, maximizing ℒ over μ means minimizing Σ(xi − μ)², the sum of squared residuals.)
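To reproduce the profile numerically, pin μ at μ̂ and maximize ℒ over σ alone; the optimizer should land on the closed-form σ̂. A sketch using scipy.optimize.minimize_scalar, again on stand-in data:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
x = rng.normal(0.0005, 0.0107, size=200)   # stand-in data
N = len(x)
mu_hat = x.mean()

# Negative profile log-likelihood in σ, with μ pinned at its MLE.
def neg_ll(sigma):
    return (N*np.log(sigma) + (N/2)*np.log(2*np.pi)
            + ((x - mu_hat)**2).sum() / (2*sigma**2))

res = minimize_scalar(neg_ll, bounds=(1e-4, 0.1), method='bounded')
assert abs(res.x - x.std(ddof=0)) < 1e-4   # numeric peak matches the closed form
```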
MLE ↔ Least-Squares Connection
Log-likelihood for Normal:
ℒ(μ,σ) = −N·log(σ) − (N/2)·log(2π) − Σ(xi − μ)² / (2σ²)
Maximizing over μ means minimizing:
Σ(xi − μ)² ← this is the sum of squared residuals
Which gives:
μ̂ = (1/N) Σxi = sample mean
Setting ∂ℒ/∂σ = 0 likewise gives σ̂² = (1/N) Σ(xi − μ̂)², the N-denominator sample variance.
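Both results fall out symbolically. A small SymPy sketch with five generic data points standing in for the full sample, setting each partial derivative to zero:

```python
import sympy as sp

x = sp.symbols('x0:5')                      # five generic data points
mu = sp.symbols('mu')
sigma = sp.symbols('sigma', positive=True)
N = len(x)

ll = (-N*sp.log(sigma) - sp.Rational(N, 2)*sp.log(2*sp.pi)
      - sum((xi - mu)**2 for xi in x) / (2*sigma**2))

# ∂ℒ/∂μ = 0  →  μ̂ is the sample mean
mu_hat = sp.solve(sp.diff(ll, mu), mu)[0]
print(sp.simplify(mu_hat - sum(x)/N))       # prints 0

# ∂ℒ/∂σ = 0  →  σ² = (1/N)·Σ(xi − μ)², evaluated at μ = μ̂ for the estimate
sigma_hat = sp.solve(sp.diff(ll, sigma), sigma)[0]
print(sp.simplify(sigma_hat**2 - sum((xi - mu)**2 for xi in x)/N))  # prints 0
```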
OLS regression does the same thing: minimize Σ(yi − ŷi)² under a Normal error assumption → identical MLE solution.
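A sketch of that equivalence on a toy linear model: fitting by least squares and by maximizing a Normal log-likelihood recovers the same intercept and slope (the data and starting values are made up for illustration):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 100)
y = 2.0 + 3.0*t + rng.normal(0, 0.5, size=100)   # toy linear data

# Least squares: minimize the sum of squared residuals.
def sse(params):
    a, b = params
    return ((y - (a + b*t))**2).sum()

# MLE: maximize the Normal log-likelihood (σ parameterized on the log scale).
def neg_ll(params):
    a, b, log_sigma = params
    return -norm.logpdf(y, loc=a + b*t, scale=np.exp(log_sigma)).sum()

ols = minimize(sse, x0=[0.0, 0.0]).x
mle = minimize(neg_ll, x0=[0.0, 0.0, 0.0]).x[:2]
print(ols, mle)   # intercept and slope agree to optimizer tolerance
```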
For this dataset: μ̂ = 0.04972% (annualized: 12.53%), σ̂ = 1.06823% (annualized: 16.96%). The MLE coincides exactly with the sample moments, a convenient property of the Normal model; for most other distributions the MLE has no closed form and must be found numerically.