Model Performance Metrics

1. MAE — Mean Absolute Error

Formula:

\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|

Meaning:

  • It measures the average magnitude of errors in a set of predictions, without considering their direction (positive or negative).

  • Lower is better.

  • It’s in the same units as the target variable.

Intuitive: “On average, my predictions are off by X units.”
⚠️ Downside: every unit of error is weighted the same, so large errors receive no extra penalty (unlike MSE/RMSE).
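A minimal plain-Python sketch of the formula above, using made-up numbers (scikit-learn's `mean_absolute_error` gives the same result):

```python
def mae(y_true, y_pred):
    # Average of the absolute errors |y_i - y_hat_i|
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

y_true = [100, 150, 200, 250]  # actual values (made up)
y_pred = [110, 140, 195, 260]  # model predictions (made up)
print(mae(y_true, y_pred))  # 8.75 -> "off by ~8.75 units on average"
```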


2. MSE — Mean Squared Error

Formula:

\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2

Meaning:

  • It measures the average squared difference between predicted and actual values.

  • Larger errors carry quadratically more weight because of the squaring: an error of 10 contributes 100 times more than an error of 1.

Highlights large errors — useful when large deviations are especially undesirable.
⚠️ Less interpretable since it’s in squared units of the target.


3. RMSE — Root Mean Squared Error

Formula:

\text{RMSE} = \sqrt{\text{MSE}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}

Meaning:

  • It’s the square root of MSE, converting it back to the same units as the target variable.

  • Still penalizes large errors more heavily than MAE.

Common and intuitive metric — “average size of the error.”
⚠️ Sensitive to outliers.
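Continuing the same toy example, taking the square root brings the error back to the target's units:

```python
import math

def rmse(y_true, y_pred):
    # Square root of MSE, expressed in the target's own units
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)

y_true = [100, 150, 200, 250]
y_pred = [110, 140, 195, 260]
print(rmse(y_true, y_pred))  # ~9.01, slightly above the MAE of 8.75
```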


4. R² Score — Coefficient of Determination

Formula:

R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}

where ȳ is the mean of the actual values.

Meaning:

  • Measures how well the model explains the variability of the target variable.

  • Range: −∞ to 1

    • R² = 1: perfect prediction

    • R² = 0: model predicts no better than the mean

    • R² < 0: model is worse than just predicting the mean

Explains “goodness of fit”
⚠️ Can be misleading for non-linear models or small sample sizes.
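A short sketch of the formula above on the same made-up data (scikit-learn's `r2_score` computes this too):

```python
def r2_score(y_true, y_pred):
    # 1 - SS_res / SS_tot, per the formula above
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

y_true = [100, 150, 200, 250]
y_pred = [110, 140, 195, 260]
print(r2_score(y_true, y_pred))  # ~0.974: the model explains ~97% of the variance
```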


Quick Summary Table

| Metric | Formula | Penalizes Large Errors | Same Units as Target | Interpretation |
|--------|---------|------------------------|----------------------|----------------|
| MAE | (1/n) Σ\|yᵢ − ŷᵢ\| | ❌ No | ✅ Yes | Avg. absolute error |
| MSE | (1/n) Σ(yᵢ − ŷᵢ)² | ✅ Yes (squared) | ❌ No | Avg. squared error |
| RMSE | √[(1/n) Σ(yᵢ − ŷᵢ)²] | ✅ Yes (squared) | ✅ Yes | Avg. error magnitude |
| R² | 1 − SS_res/SS_tot | N/A | Unitless | Variance explained |


What values of these metrics indicate a good model?

There are no universal thresholds: the answer depends on your data's scale and context. But here is how you can interpret these metrics in general.


⚙️ 1. MAE (Mean Absolute Error)

  • No fixed “good” value, because it depends on your target variable’s scale.

  • As a rule of thumb:

    • MAE should be as low as possible.

    • A good model usually has MAE that’s <10% of the average value of your target variable.

      \text{MAE ratio} = \frac{\text{MAE}}{\bar{y}} < 0.1
  • Example: If your target values average around 100, an MAE of ≤10 is generally quite good.
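The example above can be sketched numerically; the MAE value here is hypothetical, standing in for some model's score:

```python
# Hypothetical numbers illustrating the <10% rule of thumb
y_true = [90, 100, 110, 100]  # target averages 100 (made up)
mae = 8.0                     # assumed MAE from some model
mae_ratio = mae / (sum(y_true) / len(y_true))
print(mae_ratio)  # 0.08 -> under 0.1, so "generally quite good"
```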


⚙️ 2. MSE (Mean Squared Error)

  • MSE’s magnitude depends on the square of your target variable, so it’s hard to interpret directly.

  • Use it mainly for comparison between models — the smaller, the better.

  • Often replaced with RMSE, which is easier to interpret.


⚙️ 3. RMSE (Root Mean Squared Error)

  • Interpreted in the same units as your target.

  • A good RMSE is typically 10–20% of the range or mean of the target variable.

  • RMSE close to MAE suggests your errors are fairly uniform, with few outliers. (RMSE can never be lower than MAE.)

Good model:

\frac{\text{RMSE}}{\bar{y}} < 0.2

⚠️ Watch out: If RMSE is much larger than MAE, it means you have outliers or large errors that need attention.
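A quick diagnostic sketch of that warning, using made-up residuals: a single outlier inflates RMSE far more than MAE.

```python
import math

# Made-up residuals: four small errors and one large outlier
errors = [1, 1, 1, 1, 20]
mae = sum(abs(e) for e in errors) / len(errors)              # 4.8
rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))  # ~8.99
print(rmse / mae)  # ratio well above 1 flags outliers worth investigating
```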


⚙️ 4. R² (Coefficient of Determination)

This one has more universal interpretation:

| R² Value | Interpretation |
|----------|----------------|
| 1.0 | Perfect predictions |
| ≥ 0.9 | Excellent: model explains most variability |
| 0.75–0.9 | Good: strong predictive power |
| 0.5–0.75 | Moderate: acceptable depending on context |
| < 0.5 | Weak: model doesn't explain much variance |
| < 0 | Model worse than predicting the mean |

💡 Rule of thumb:

  • For engineering / physical systems, expect R² ≥ 0.9.

  • For social sciences or human behavior data, R² ≥ 0.6 can already be quite good due to inherent noise.


🧭 Quick Summary

| Metric | Ideal Direction | Good Rule of Thumb |
|--------|-----------------|--------------------|
| MAE | ↓ Lower better | <10% of mean(target) |
| MSE | ↓ Lower better | Compare between models |
| RMSE | ↓ Lower better | <20% of mean(target) |
| R² | ↑ Higher better | >0.75 good, >0.9 excellent |
