auc

getml.pipeline.metrics.auc = 'auc'

Area under the curve (AUC): the area under the receiver operating characteristic (ROC) curve.

Used for classification problems.

When handling a classification problem, the ROC curve maps the relationship between two conflicting goals:

On the one hand, we want a high true positive rate (TPR). The true positive rate, sometimes referred to as recall, measures the share of true positive predictions over all positives:

\[TPR = \frac{number \; of \; true \; positives}{number \; of \; all \; positives}\]

In other words, we want our classification algorithm to “catch” as many positives as possible.

On the other hand, we also want a low false positive rate (FPR). The false positive rate measures the share of false positives over all negatives:

\[FPR = \frac{number \; of \; false \; positives}{number \; of \; all \; negatives}\]

In other words, we want as few “false alarms” as possible.

However, unless we have a perfect classifier, these two goals conflict with each other.
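The two rates, and the tension between them, can be illustrated with a minimal sketch in plain Python. The helper `tpr_fpr`, the toy labels, the scores, and the two thresholds are all hypothetical and chosen for illustration only; this is not how getML computes its metrics internally.

```python
def tpr_fpr(y_true, y_score, threshold):
    """Return (TPR, FPR) when every score >= threshold is predicted positive.

    Hypothetical helper for illustration; assumes binary labels (0 or 1)
    and at least one positive and one negative sample.
    """
    true_positives = sum(
        1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold
    )
    false_positives = sum(
        1 for y, s in zip(y_true, y_score) if y == 0 and s >= threshold
    )
    all_positives = sum(1 for y in y_true if y == 1)
    all_negatives = len(y_true) - all_positives
    return true_positives / all_positives, false_positives / all_negatives


# Toy data (made up): four positives and four negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]

# A strict threshold avoids false alarms but misses half of the positives.
strict = tpr_fpr(y_true, y_score, threshold=0.75)

# A lenient threshold catches every positive but raises a false alarm.
lenient = tpr_fpr(y_true, y_score, threshold=0.35)

print("strict: ", strict)
print("lenient:", lenient)
```

Lowering the threshold raises the TPR, but in this toy example it also raises the FPR, which is the conflict described above.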

The ROC curve plots the TPR against the FPR as the classification threshold varies. We then measure the area under that curve (AUC). A higher AUC implies that the trade-off between TPR and FPR is more beneficial. A perfect model would have an AUC of 1. An AUC of 0.5 implies that the model has no predictive value, meaning it does no better than random guessing.
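As a rough sketch of the idea, the ROC curve can be traced by sweeping the threshold over the sorted scores, and the area under it approximated with the trapezoidal rule. The function names `roc_points` and `area_under_curve` and all data below are assumptions made for this example; they are not part of the getML API, and getML's own computation may differ in its details.

```python
def roc_points(y_true, y_score):
    """Return the (FPR, TPR) points of the ROC curve, starting at (0, 0).

    Hypothetical helper; assumes binary labels (0 or 1) with at least
    one positive and one negative sample.
    """
    all_positives = sum(1 for y in y_true if y == 1)
    all_negatives = len(y_true) - all_positives

    # Visit samples from the highest score to the lowest, which is
    # equivalent to lowering the threshold step by step.
    pairs = sorted(zip(y_score, y_true), reverse=True)

    points = [(0.0, 0.0)]
    true_positives = 0
    false_positives = 0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Samples with identical scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                true_positives += 1
            else:
                false_positives += 1
            i += 1
        points.append(
            (false_positives / all_negatives, true_positives / all_positives)
        )
    return points


def area_under_curve(points):
    """Approximate the area under a piecewise-linear curve (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data (made up): a reasonably good, but imperfect, classifier.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
toy_auc = area_under_curve(roc_points(y_true, y_score))

# A perfect ranking: every positive scores higher than every negative.
perfect_auc = area_under_curve(roc_points([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]))

# A constant score carries no information about the label.
constant_auc = area_under_curve(roc_points([1, 1, 0, 0], [0.5, 0.5, 0.5, 0.5]))

print(toy_auc, perfect_auc, constant_auc)
```

In this sketch the perfect ranking reaches an area of 1, while the constant score collapses the curve to the diagonal from (0, 0) to (1, 1), whose area is 0.5.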