![Mean Squared Error (MSE)](uploads/mean-squared-error-mse-66558fa77af1d.png)
Mean Squared Error (MSE) is a commonly used metric for evaluating the performance of a regression model. It measures the average of the squared errors, that is, the average squared difference between the actual and predicted values of the target variable.
The formula for Mean Squared Error is:
MSE = (1/n) * Σ(yi - ŷi)^2
where:

- n is the number of data points,
- yi is the actual value for the i-th data point, and
- ŷi is the predicted value for the i-th data point.
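To make the formula concrete, here is a minimal sketch of it in plain Python (the function name and arguments are our own choices for illustration, not a standard API):

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    n = len(actual)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / n
```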
The MSE is always non-negative, and a smaller value indicates a better fit of the model to the data. A value of 0 indicates a perfect fit, where the predicted values match the actual values exactly. However, a very low MSE on the training data may be a sign of overfitting, where the model is fitting noise in the data rather than the underlying pattern.
Conversely, a high MSE indicates a poor fit, meaning the model's predictions are far from the actual values. In practice, a balance must be struck between achieving a low MSE and building a model that generalizes well to new, unseen data.
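As a rough sketch of how this trade-off shows up in practice, the snippet below fits a deliberately flexible model and compares training and test MSE. The synthetic data and the degree-15 polynomial are illustrative assumptions, not part of this article's example; NumPy and scikit-learn are assumed to be installed:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error

# Synthetic data: a noisy quadratic relationship (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = 0.5 * X[:, 0] ** 2 + rng.normal(scale=1.0, size=100)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A very flexible model can drive training MSE down while test MSE stays high.
model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
model.fit(X_train, y_train)

print("train MSE:", mean_squared_error(y_train, model.predict(X_train)))
print("test MSE: ", mean_squared_error(y_test, model.predict(X_test)))
```

A large gap between the two numbers is a typical symptom of overfitting: the model is chasing noise in the training set rather than the underlying pattern.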
Let's consider a simple example to calculate the Mean Squared Error. Suppose we have a dataset with the following actual and predicted values:
| Data Point | Actual Value (yi) | Predicted Value (ŷi) |
|---|---|---|
| 1 | 5 | 4 |
| 2 | 3 | 5 |
| 3 | 6 | 6 |
| 4 | 8 | 7 |
Using the formula for MSE, we first calculate the squared error for each data point:

- (5 - 4)^2 = 1
- (3 - 5)^2 = 4
- (6 - 6)^2 = 0
- (8 - 7)^2 = 1

Summing these squared errors gives a total of 1 + 4 + 0 + 1 = 6. Dividing by the number of data points (n = 4) gives MSE = 6/4 = 1.5.
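As a quick sanity check, the same calculation in a few lines of plain Python, with the values taken straight from the table above:

```python
actual = [5, 3, 6, 8]     # yi values from the table
predicted = [4, 5, 6, 7]  # ŷi values from the table

squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
print(squared_errors)                     # [1, 4, 0, 1]
print(sum(squared_errors) / len(actual))  # 1.5
```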
Mean Squared Error is widely used in various fields for evaluating regression models. Some common use cases include: