Understand sklearn.metrics.roc_curve() with Examples – Sklearn Tutorial

By | August 4, 2022

sklearn.metrics.roc_curve() can allow us to compute receiver operating characteristic (ROC) easily. In this tutorial, we will use some examples to show you how to use it.

sklearn.metrics.roc_curve()

It is defined as:

sklearn.metrics.roc_curve(y_true, y_score, *, pos_label=None, sample_weight=None, drop_intermediate=True)

It will return: fpr, tpr and thresholds

Understand TPR, FPR, Precision and Recall Metrics in Machine Learning – Machine Learning Tutorial

In order to use this function to compute ROC, we should use these three important parameters:

y_true: true labels, such as [1, 0, 0, 1]

y_score: the score predicted by your model.

pos_label: int or str, the true label of class.

For example: pos_label = 1 or “1”, which means label = 1 or “1” will be the positive class.

How to determine pos_label?

There is an easy way:

If the score of a sample is bigger than a threshold, it will be positive class.

How to use sklearn.metrics.roc_curve()?

There is an easy example.

from sklearn.metrics import roc_curve
labels = [1,0,1,0,1,1,0,1,1,1,1]
score = [-0.2,0.1,0.3,0,0.1,0.5,0,0.1,1,0.4,1]

fpr, tpr, thresholds = roc_curve(labels,score, pos_label=1)
print(fpr, tpr, thresholds)

This example means:

labels score
1 -0.2
0 0.1
1 0.3
0 0
1 0.1
1 0.5
0 0
1 0.1
1 1
1 0.4
1 1

Run this code, we will get:

[0.         0.         0.         0.33333333 1.         1.        ] [0.    0.25  0.625 0.875 0.875 1.   ] [ 2.   1.   0.3  0.1  0.  -0.2]

Then, we can compute EER to choose a best threshold.

In order to compute EER, you can read:

How to Compute EER Metrics in Voiceprint and Face Recognition – Machine Leaning Tutorial

Leave a Reply