sklearn.metrics.roc_curve() makes it easy to compute the receiver operating characteristic (ROC) curve of a binary classifier. In this tutorial, we will use some examples to show you how to use it.
sklearn.metrics.roc_curve()
It is defined as:
sklearn.metrics.roc_curve(y_true, y_score, *, pos_label=None, sample_weight=None, drop_intermediate=True)
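It returns three arrays: fpr (false positive rates), tpr (true positive rates) and thresholds (the decreasing score thresholds at which fpr and tpr are computed).
To understand TPR and FPR, you can read:
Understand TPR, FPR, Precision and Recall Metrics in Machine Learning – Machine Learning Tutorial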
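As a quick illustration of the three return values, here is a minimal sketch on a small made-up dataset (the labels and scores are assumptions chosen only for demonstration). It checks that the three arrays have the same length, that the curve runs from (0, 0) to (1, 1), and that the thresholds decrease.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical toy data: 1 is the positive class, a higher score means "more likely positive"
y_true = [0, 1, 0, 1, 1]
y_score = [0.2, 0.3, 0.6, 0.7, 0.9]

fpr, tpr, thresholds = roc_curve(y_true, y_score)

# There is one (fpr, tpr) point per threshold
print(len(fpr), len(tpr), len(thresholds))
# The curve starts at (0, 0) and ends at (1, 1)
print(fpr[0], tpr[0], fpr[-1], tpr[-1])
# Thresholds are sorted from high to low
print(np.all(np.diff(thresholds) < 0))
```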
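To compute a ROC curve with this function, we need to understand three important parameters:
y_true: the true binary labels, such as [1, 0, 0, 1]
y_score: the scores predicted by your model, one per sample, such as probabilities of the positive class or decision function values.
pos_label: int or str, the label of the positive class. If it is None (the default) and the labels are in {0, 1} or {-1, 1}, label 1 is used as the positive class; for any other labels you must set it.
For example, pos_label = 1 or "1" means that samples labeled 1 or "1" are the positive class.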
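To make the role of pos_label concrete, here is a small sketch with string labels. The "spam" and "ham" labels and the scores are hypothetical; the scores are assumed to be the model's confidence that a sample is "spam".

```python
from sklearn.metrics import roc_curve

# Hypothetical labels and scores; each score is the confidence that the sample is "spam"
y_true = ["spam", "ham", "spam", "ham"]
y_score = [0.9, 0.2, 0.6, 0.4]

# The labels are not in {0, 1} or {-1, 1}, so pos_label has to be given explicitly
fpr, tpr, thresholds = roc_curve(y_true, y_score, pos_label="spam")
print(fpr, tpr)
```

Because every "spam" sample here scores higher than every "ham" sample, the curve passes through the point fpr = 0, tpr = 1.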
How to determine pos_label?
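There is a simple rule:
roc_curve() predicts a sample as the positive class when its score is greater than or equal to a threshold. So pos_label should be the class that your model gives higher scores to.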
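One way to sanity-check the choice of pos_label is to compare the area under the curve for both candidates. The sketch below uses sklearn.metrics.auc() on made-up data in which the model is assumed to give higher scores to class 1.

```python
from sklearn.metrics import auc, roc_curve

# Hypothetical data: the model gives higher scores to class 1
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# pos_label=1 matches the direction of the scores
fpr1, tpr1, _ = roc_curve(y_true, y_score, pos_label=1)
# pos_label=0 treats high scores as evidence for class 0, the wrong direction here
fpr0, tpr0, _ = roc_curve(y_true, y_score, pos_label=0)

auc_pos1 = auc(fpr1, tpr1)
auc_pos0 = auc(fpr0, tpr0)
print(auc_pos1, auc_pos0)
```

Here the area is 0.75 with pos_label=1 and 0.25 with pos_label=0. An area clearly below 0.5 is a hint that pos_label points at the wrong class.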
How to use sklearn.metrics.roc_curve()?
Here is a simple example.
from sklearn.metrics import roc_curve

# True labels and the scores predicted by the model
labels = [1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1]
score = [-0.2, 0.1, 0.3, 0, 0.1, 0.5, 0, 0.1, 1, 0.4, 1]

# Label 1 is the positive class
fpr, tpr, thresholds = roc_curve(labels, score, pos_label=1)
print(fpr, tpr, thresholds)
This example means:
labels | score |
1 | -0.2 |
0 | 0.1 |
1 | 0.3 |
0 | 0 |
1 | 0.1 |
1 | 0.5 |
0 | 0 |
1 | 0.1 |
1 | 1 |
1 | 0.4 |
1 | 1 |
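Running this code prints fpr, tpr and thresholds, in that order:
[0. 0. 0. 0.33333333 1. 1. ] [0. 0.25 0.625 0.875 0.875 1. ] [ 2. 1. 0.3 0.1 0. -0.2]
The first threshold, 2, is not a real score. It is max(score) + 1, an extra point added so that the curve starts at (0, 0). In scikit-learn 1.3 and later this first threshold is reported as inf instead.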
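To see where these numbers come from, the following sketch recomputes each point by hand, using the rule that a sample is predicted positive when its score is greater than or equal to the threshold. The helper rates_at() is a hypothetical name introduced only for this illustration.

```python
import numpy as np
from sklearn.metrics import roc_curve

labels = np.array([1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1])
score = np.array([-0.2, 0.1, 0.3, 0, 0.1, 0.5, 0, 0.1, 1, 0.4, 1])

def rates_at(threshold):
    """Return (fpr, tpr) when samples with score >= threshold are predicted positive."""
    predicted_pos = score >= threshold
    tp = np.sum(predicted_pos & (labels == 1))  # positives correctly predicted
    fp = np.sum(predicted_pos & (labels == 0))  # negatives wrongly predicted positive
    return fp / np.sum(labels == 0), tp / np.sum(labels == 1)

fpr, tpr, thresholds = roc_curve(labels, score, pos_label=1)

# Recompute every point of the curve from its threshold
manual = [rates_at(t) for t in thresholds]
print(manual)
```

For example, at threshold 0.3 five of the eight positive samples and none of the three negative samples have a score of at least 0.3, so tpr = 5/8 = 0.625 and fpr = 0.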
Then, we can compute the equal error rate (EER) to choose the best threshold.
To learn how to compute EER, you can read:
How to Compute EER Metrics in Voiceprint and Face Recognition – Machine Learning Tutorial
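As a rough sketch of the idea, EER is the operating point where the false positive rate equals the false negative rate (1 - tpr). The code below uses a common approximation, which is an assumption here and not necessarily the method of the article above: it picks the ROC point where the two rates are closest and averages them.

```python
import numpy as np
from sklearn.metrics import roc_curve

labels = [1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1]
score = [-0.2, 0.1, 0.3, 0, 0.1, 0.5, 0, 0.1, 1, 0.4, 1]

fpr, tpr, thresholds = roc_curve(labels, score, pos_label=1)
fnr = 1 - tpr  # false negative rate

# Approximation: take the ROC point where FPR and FNR are closest to each other
idx = np.argmin(np.abs(fnr - fpr))
eer = (fpr[idx] + fnr[idx]) / 2
eer_threshold = thresholds[idx]
print(eer, eer_threshold)
```

On this data the closest point is at threshold 0.1, where fpr is about 0.333 and fnr is 0.125, giving an approximate EER of about 0.229. With so few samples the two rates never become exactly equal, so interpolating between neighbouring points would be a possible refinement.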