
MLE Parameter Estimator

Fit a Normal distribution using Maximum Likelihood Estimation, visualize the likelihood surface, and see how MLE connects to least-squares regression.

Feb 23, 2026, Eric

The Core Idea

Maximum Likelihood Estimation finds the parameters (μ, σ) that make the observed data most probable under the model. For a Normal distribution, the MLE solution is closed-form: μ̂ = sample mean, σ̂² = sample variance (dividing by N, not N−1). The likelihood surface shows how the log-likelihood changes as you move away from this optimal point — and why the MLE is a unique global maximum.
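The closed-form estimates are one line each in code. A minimal sketch on synthetic daily returns (the 0.05% / 1.07% parameters and the seed are illustrative, not the article's dataset):

```python
import numpy as np

# Synthetic stand-in for a column of daily returns, in percent
rng = np.random.default_rng(0)
x = rng.normal(loc=0.05, scale=1.07, size=200)

mu_hat = x.mean()                                # MLE for mu: the sample mean
sigma_hat = np.sqrt(np.mean((x - mu_hat) ** 2))  # MLE for sigma: divide by N, not N-1

print(f"mu_hat = {mu_hat:.4f}%, sigma_hat = {sigma_hat:.4f}%")
```

Note that `np.mean((x - mu_hat) ** 2)` is the biased (divide-by-N) variance, i.e. `x.var(ddof=0)`, which is slightly smaller than the unbiased `ddof=1` version.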

The connection to regression: minimizing mean squared error in OLS regression is exactly equivalent to maximizing likelihood under a Normal error assumption. MLE and least-squares are the same problem in disguise.

Fitted values for this dataset:

- MLE μ̂ (daily): 0.0497%
- MLE σ̂ (daily): 1.0682%
- Annualized μ̂: 12.5%
- Annualized σ̂: 17.0%
- Max log-likelihood at (μ̂, σ̂): ℒ = 623.31

Equivalent formula: ℒ(μ̂,σ̂) = −N·log(σ̂) − N/2·log(2π) − N/2
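The shortcut works because Σ(xi − μ̂)² = N·σ̂² at the MLE, so the residual term collapses to N/2. A quick numerical check on synthetic stand-in data (seed and sample are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200)  # synthetic stand-in data
N = len(x)

mu_hat = x.mean()
sigma_hat = np.sqrt(np.mean((x - mu_hat) ** 2))

# Full log-likelihood evaluated at the MLE ...
ll_full = (-N * np.log(sigma_hat)
           - N / 2 * np.log(2 * np.pi)
           - np.sum((x - mu_hat) ** 2) / (2 * sigma_hat ** 2))

# ... and the shortcut: sum((x - mu_hat)**2) == N * sigma_hat**2,
# so the last term is exactly N/2
ll_short = -N * np.log(sigma_hat) - N / 2 * np.log(2 * np.pi) - N / 2

print(ll_full, ll_short)  # identical up to floating-point error
```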

Data: one column of 200 daily returns.

Log-Likelihood Surface

X-axis = μ, Y-axis = σ. Brighter cells indicate higher log-likelihood; the white dot marks the MLE optimum.
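A surface like this is a straightforward grid evaluation of the log-likelihood. A sketch with synthetic stand-in data (the real returns column is not reproduced here; grid ranges are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=200)  # synthetic stand-in data
N = len(x)

def log_lik(mu, sigma):
    """Normal log-likelihood of the sample at (mu, sigma)."""
    return (-N * np.log(sigma)
            - N / 2 * np.log(2 * np.pi)
            - np.sum((x - mu) ** 2) / (2 * sigma ** 2))

mu_hat = x.mean()
sigma_hat = np.sqrt(np.mean((x - mu_hat) ** 2))

# Evaluate the log-likelihood on a (mu, sigma) grid around the MLE
mus = np.linspace(mu_hat - 1.0, mu_hat + 1.0, 51)
sigmas = np.linspace(0.5, 2.0, 51)
surface = np.array([[log_lik(m, s) for m in mus] for s in sigmas])

# The brightest cell (grid argmax) should sit at the closed-form MLE
i, j = np.unravel_index(surface.argmax(), surface.shape)
print(mus[j], sigmas[i])
```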

(The μ axis spans −3.155% to 3.254%.)

Log-Likelihood vs. σ (at MLE μ̂)

The single-peak shape confirms that the MLE σ̂ is a unique global maximum. And for any fixed σ, maximizing ℒ over μ is equivalent to minimizing Σ(xi − μ)², i.e., the sum of squared residuals.
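The single peak can be checked directly: holding μ at μ̂, solving dℒ/dσ = −N/σ + SSR/σ³ = 0 gives σ² = SSR/N, the analytic maximizer. A sketch on synthetic stand-in data:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=200)  # synthetic stand-in data
N = len(x)

mu_hat = x.mean()
ssr = np.sum((x - mu_hat) ** 2)  # sum of squared residuals at mu_hat
sigma_hat = np.sqrt(ssr / N)     # analytic maximizer over sigma

def profile_ll(sigma):
    """Log-likelihood as a function of sigma, with mu fixed at mu_hat."""
    return -N * np.log(sigma) - N / 2 * np.log(2 * np.pi) - ssr / (2 * sigma ** 2)

# Scan sigma over a grid: the curve peaks at sigma_hat and nowhere else
sigmas = np.linspace(0.5, 2.0, 200)
vals = profile_ll(sigmas)
print(sigmas[np.argmax(vals)], sigma_hat)
```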

MLE ↔ Least-Squares Connection

Log-likelihood for Normal:

ℒ(μ,σ) = −N·log(σ) − N/2·log(2π) − Σ(xi−μ)² / (2σ²)

Maximizing over μ means minimizing:

Σ(xi − μ)² ← this is the sum of squared residuals

Which gives:

μ̂ = (1/N) Σxi = sample mean

OLS regression does the same thing: minimize Σ(yi − ŷi)² under a Normal error assumption → identical MLE solution.
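The simplest case of this equivalence is an intercept-only regression, where the least-squares fit recovers exactly the sample mean. A minimal check on synthetic data (seed and parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
y = rng.normal(loc=0.05, scale=1.07, size=200)  # synthetic daily returns, in percent

# Intercept-only OLS: the design matrix is a single column of ones,
# so the fitted coefficient is the b minimizing sum((y - b)**2)
X = np.ones((len(y), 1))
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta[0], y.mean())  # the least-squares intercept equals the sample mean
```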

For this dataset: μ̂ = 0.04972% (annualized: 12.53%), σ̂ = 1.06823% (annualized: 16.96%). The MLE and sample moments are identical — this is a unique property of the Normal distribution.
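The annualized figures appear to follow the usual 252-trading-day convention (the mean scales linearly with time, volatility with its square root); the day count is an assumption, not stated in the article:

```python
import numpy as np

# Daily MLE estimates quoted above (percent -> decimal)
mu_daily = 0.04972 / 100
sigma_daily = 1.06823 / 100

TRADING_DAYS = 252  # assumed annualization convention
mu_ann = mu_daily * TRADING_DAYS                 # mean scales linearly with time
sigma_ann = sigma_daily * np.sqrt(TRADING_DAYS)  # volatility scales with sqrt(time)

print(f"annualized mu = {mu_ann:.2%}, annualized sigma = {sigma_ann:.2%}")
```

Under that assumption the numbers reproduce the quoted 12.53% and 16.96%.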