After a model (predictive or evolutive) is trained, DISCOVER generates metrics that describe the model's quality.

These metrics help evaluate the performance of a model in different ways. Some are calculated with the final model over the training data (training metrics), and others are calculated using a cross-validation (CV) framework (cross-validation metrics).

The metrics calculated over the training data give a sense of how well the model can reproduce the training observations, while the CV metrics tell us what can be expected from the model when it is applied to new, unseen samples or designs.
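
To make the difference concrete, here is a minimal sketch in plain Python. It is not DISCOVER's implementation: the toy nearest-neighbor model, the helper names (predict_1nn, training_mae, cv_mae) and the way samples are assigned to folds are all assumptions chosen for illustration. The model memorizes its training data, so its training error is zero, while its CV error shows what happens on samples it has not seen.

```python
def predict_1nn(train_x, train_y, x):
    """Toy model (assumption): return the target of the closest training point."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def mae(y_true, y_pred):
    """Average absolute difference between real and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def training_mae(xs, ys):
    """Training metric: fit on all data, then predict the same data."""
    preds = [predict_1nn(xs, ys, x) for x in xs]
    return mae(ys, preds)


def cv_mae(xs, ys, k):
    """CV metric: each fold is predicted by a model that never saw it."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Hypothetical fold assignment: sample i goes to fold i % k.
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        preds = [predict_1nn(train_x, train_y, xs[i]) for i in test_idx]
        truth = [ys[i] for i in test_idx]
        fold_errors.append(mae(truth, preds))
    return sum(fold_errors) / k


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]

train_error = training_mae(xs, ys)  # 0.0: the model reproduces what it memorized
cv_error = cv_mae(xs, ys, k=3)      # 2.0: the error on held-out samples
print(train_error, cv_error)
```

In this toy case the training MAE alone would suggest a perfect model, and only the CV MAE reveals the error to expect on new samples.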

Regression metrics

  • MAE: Mean Absolute Error. It is the average of the absolute distances between the predicted and the real values. MAE units are the same as the target’s.

  • RMSE: Root Mean Square Error. It is the square root of the average of the squared errors.

  • R2 score: Coefficient of determination. It is the proportion of the variation in the real target that is predictable from the model's predictions. It normally ranges from 0 to 1, but can be negative when the fit is worse than a horizontal prediction located at the target's average. A value of 1 indicates that the model's predictions fit the data perfectly.
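
The three definitions above can be sketched in a few lines of plain Python. This is an illustrative sketch using the standard formulas, not DISCOVER's own code; the function names and the sample values are assumptions.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |real - predicted|, in the target's units."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the average squared error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual variation / total variation)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]

print(mae(y_true, y_pred))       # 1.0
print(rmse(y_true, y_pred))      # sqrt(1.5), about 1.22
print(r2_score(y_true, y_pred))  # about 0.7

# A horizontal prediction at the target's average scores 0,
# and a fit worse than that scores below 0.
print(r2_score(y_true, [6.0, 6.0, 6.0, 6.0]))  # 0.0
print(r2_score(y_true, [9.0, 7.0, 5.0, 3.0]))  # -3.0
```

Note that RMSE is larger than MAE here because squaring gives the single largest error (2.0) more weight.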

The CV MAE is easy to interpret: because it is in the same units as the target, it indicates the average absolute error we can expect on predictions. RMSE and R2 can be more helpful when comparing the performance of two or more models on similar datasets.

FAQ

  • What can I do if a model has low training errors but very high CV errors?

This can have several causes. Usually, not enough data was provided, or the features do not describe the problem well. We recommend contacting the DS team for a quick evaluation of your problem.

