Learn to evaluate predictions using MAE, RMSE, R², and Adjusted R².
So you've built a regression model, perhaps using Simple Linear Regression, Multiple Linear Regression, or even a powerful Random Forest Regressor. It makes predictions! But... how *good* are those predictions? How close are they to the actual values? We need ways to measure this – we need Regression Metrics.
Evaluating your model is crucial. It tells you if the model is useful, helps you compare different models, and guides you on how to improve it. Today, we'll explore the most common metrics used to evaluate regression models.
Simply building a model isn't enough; we need to know if it actually works!
Let's dive into the most frequently used metrics:
Mean Absolute Error (MAE)
MAE = (1/n) Σ|yᵢ - ŷᵢ|
The absolute difference between the actual values (y) and the predicted values (ŷ). Sum up the absolute 'miss distances' for all points (n), then divide by the number of points.
Mean Squared Error (MSE)
MSE = (1/n) Σ(yᵢ - ŷᵢ)²
Sum up the squared 'miss distances' for all points (n), then divide by the number of points.
Root Mean Squared Error (RMSE)
RMSE = √MSE
Calculate MSE first, then take its square root.
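To make these formulas concrete, here is a minimal NumPy sketch that computes all three by hand; the `y_true` and `y_pred` arrays are made-up placeholder values, not from any real dataset.
import numpy as np
# Made-up actual and predicted values, purely for illustration
y_true = np.array([150, 200, 130, 300])
y_pred = np.array([160, 190, 150, 280])
errors = y_true - y_pred              # the 'miss distances' for each point
mae = np.mean(np.abs(errors))         # MAE  = (1/n) Σ|yᵢ - ŷᵢ|  -> 15.0
mse = np.mean(errors ** 2)            # MSE  = (1/n) Σ(yᵢ - ŷᵢ)² -> 250.0
rmse = np.sqrt(mse)                   # RMSE = √MSE              -> ~15.81
print(mae, mse, rmse)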
R-squared (R²)
R² = 1 - [ Σ(yᵢ - ŷᵢ)² / Σ(yᵢ - ȳ)² ]
Compares the errors of your model (Σ(yᵢ - ŷᵢ)²) to the errors you'd get by just predicting the average y (Σ(yᵢ - ȳ)²). Closer to 1 means your model explains much more variance than just predicting the average.
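Using the same made-up values as above, R² can be computed directly from its two sums of squares; this is only an illustrative sketch, not a replacement for scikit-learn's `r2_score`.
import numpy as np
y_true = np.array([150, 200, 130, 300])   # made-up values from the sketch above
y_pred = np.array([160, 190, 150, 280])
ss_res = np.sum((y_true - y_pred) ** 2)           # Σ(yᵢ - ŷᵢ)²: your model's squared errors
ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # Σ(yᵢ - ȳ)²: squared errors of always predicting the mean
r2 = 1 - ss_res / ss_tot
print(r2)                                         # ≈ 0.942 for these numbers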
Adjusted R-squared (Adjusted R²)
Adjusted R² = 1 - [ (1 - R²)(n - 1) / (n - k - 1) ]
Where:
R² = the standard R-squared value
n = number of data points (samples)
k = number of independent variables (predictors)
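As a quick sanity check with made-up numbers (R² = 0.70, n = 100 samples, k = 5 predictors), the penalty nudges the score down slightly:
r2, n, k = 0.70, 100, 5                          # made-up example values
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(round(adj_r2, 3))                          # 0.684 — a bit below R², as expected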
After training your model (like a `RandomForestRegressor`) and making predictions, Scikit-learn makes calculating these metrics easy.
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
import numpy as np
# --- Assume the model is trained and X_test, y_test, y_pred are available ---
# Example placeholder values (replace with your actual data)
# y_test = np.array([150, 200, 130, 300])
# y_pred = np.array([160, 190, 150, 280])
# --- Calculate Metrics ---
# Mean Absolute Error (MAE)
mae = mean_absolute_error(y_test, y_pred)
print(f"Mean Absolute Error (MAE): {mae:.4f}")
# Mean Squared Error (MSE)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error (MSE): {mse:.4f}")
# Root Mean Squared Error (RMSE)
rmse = np.sqrt(mse)
# In scikit-learn >= 1.4 you can also use sklearn.metrics.root_mean_squared_error(y_test, y_pred);
# older versions offer mean_squared_error(y_test, y_pred, squared=False) instead.
print(f"Root Mean Squared Error (RMSE): {rmse:.4f}")
# R-squared (R²)
r2 = r2_score(y_test, y_pred)
print(f"R-squared (R²): {r2:.4f}")
# Adjusted R-squared (Requires number of samples 'n' and predictors 'k')
n = len(y_test) # Number of samples in the test set
k = X_test.shape[1] # Number of predictors/features used in the model
if n - k - 1 != 0:  # Avoid division by zero
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    print(f"Adjusted R-squared: {adj_r2:.4f}")
else:
    print("Adjusted R-squared: Cannot calculate (n - k - 1 is zero)")
Okay, you have the numbers, but what makes a "good" score?
Metric | Goal | Interpretation Notes |
---|---|---|
MAE / RMSE | Minimize (Closer to 0) | - Indicates the average prediction error magnitude. - Units are the same as the target variable (easier to relate). - RMSE penalizes large errors more than MAE. - "Good" depends heavily on the context and scale of your target variable (an RMSE of 10 might be great for predicting house prices in millions, but terrible for predicting age). Compare to baseline models or business needs. |
R² | Maximize (Closer to 1) | - Proportion of the target's variance explained by the model. - An R² of 0.7 means 70% of the variance is explained. - Useful for assessing overall model fit. - Be wary: adding *any* predictor, useful or not, tends to increase R². |
Adjusted R² | Maximize (Closer to 1) | - Like R², but penalizes for adding useless predictors. - Always ≤ R². - Best used for comparing models with different numbers of features. - If Adjusted R² is much lower than R², it might indicate overfitting or inclusion of irrelevant features. |
Always consider multiple metrics and the context of your specific problem when evaluating a model.
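One practical way to decide whether an MAE or RMSE is "good" is to compare it against a naive baseline that always predicts the training mean. Here is a minimal sketch on synthetic data; the dataset, the `LinearRegression` model, and all numbers are placeholders for illustration only.
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
# Synthetic regression data, purely for illustration
X, y = make_regression(n_samples=200, n_features=5, noise=20.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
baseline = DummyRegressor(strategy="mean").fit(X_train, y_train)  # always predicts the training mean
model = LinearRegression().fit(X_train, y_train)
print(f"Baseline MAE: {mean_absolute_error(y_test, baseline.predict(X_test)):.2f}")
print(f"Model MAE:    {mean_absolute_error(y_test, model.predict(X_test)):.2f}")
# If your model's error is close to the baseline's, it adds little value
# no matter how small the absolute number looks.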
Interview Question
Question 1: What is the main difference in interpretation between MAE and RMSE?
Both measure average prediction error in the original units of the target variable. However, RMSE squares errors before averaging, so it penalizes large errors much more heavily than MAE. MAE treats all errors linearly based on their magnitude. Therefore, a model with a few large errors will have a significantly higher RMSE than MAE compared to a model with many small errors.
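To see the difference in action, here is a small sketch with two made-up error patterns that share the same MAE but have very different RMSE values:
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error
y_true = np.zeros(10)                 # made-up 'actual' values
pred_a = np.full(10, 10.0)            # Model A: ten small errors of 10
pred_b = np.zeros(10)                 # Model B: nine perfect predictions...
pred_b[0] = 100.0                     # ...and one large error of 100
for name, pred in [("A (many small errors)", pred_a), ("B (one large error)", pred_b)]:
    mae = mean_absolute_error(y_true, pred)
    rmse = np.sqrt(mean_squared_error(y_true, pred))
    print(f"Model {name}: MAE = {mae:.1f}, RMSE = {rmse:.1f}")
# Both models have MAE = 10.0, but Model B's RMSE jumps to ~31.6 because
# squaring amplifies the single large error.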
Question 2: If you add more features to a Multiple Linear Regression model, what will likely happen to the R² score, and what will likely happen to the Adjusted R² score?
The R² score will almost always either increase or stay the same, even if the added features are irrelevant. It doesn't penalize model complexity.
The Adjusted R² score will only increase if the added features significantly improve the model's explanatory power more than expected by chance. If the added features are useless, the Adjusted R² will likely decrease due to the penalty for added complexity.
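A small experiment shows this effect. The sketch below fits a linear model on two informative features, then appends a column of pure random noise and refits; everything here is synthetic, and results vary slightly with the random seed.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
def adjusted_r2(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)
rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=1.0, size=n)  # target depends only on the 2 real features
X_noisy = np.hstack([X, rng.normal(size=(n, 1))])              # add one completely irrelevant feature
for name, features in [("2 real features", X), ("+ noise feature", X_noisy)]:
    model = LinearRegression().fit(features, y)
    r2 = r2_score(y, model.predict(features))
    print(f"{name}: R² = {r2:.4f}, Adjusted R² = {adjusted_r2(r2, n, features.shape[1]):.4f}")
# R² creeps up (or stays the same) when the noise feature is added,
# while Adjusted R² typically drops because of the complexity penalty.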
Interview Question
Question 3: You have two models. Model A has RMSE = 50. Model B has RMSE = 100. Can you definitively say Model A is better?
Not definitively without context. While a lower RMSE generally indicates better performance (predictions are closer to actual values on average), the scale matters. If predicting house prices in millions, an RMSE of 50 might be excellent, while an RMSE of 100 is still very good. If predicting age in years, both might be poor. You need to compare the RMSE relative to the scale and variability of the target variable, or compare it to a baseline model.
Question 4: What does an R² value of 0.65 mean?
An R² of 0.65 means that 65% of the variance (spread) observed in the dependent variable (the target you are trying to predict) can be explained by the independent variables included in your regression model.
Interview Question
Question 5: Why might Adjusted R² be a more useful metric than R² when comparing models during feature selection?
Because Adjusted R² penalizes the inclusion of extra predictors that do not significantly improve the model fit. R² will always increase or stay the same as you add more predictors, potentially leading you to select an overly complex model with irrelevant features. Adjusted R² helps identify if adding a feature actually provides meaningful improvement relative to the added complexity, making it better for comparing models with different numbers of features.