numpy.percentile() computes the q-th percentile of an array. In this tutorial, we will use some examples to show how to use this function.
Syntax
numpy.percentile(a, q, axis=None, out=None, overwrite_input=False, interpolation='linear', keepdims=False)
It computes the q-th percentile of the data along the specified axis and returns the q-th percentile(s) of the array elements.
Parameters
a: the input array
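q: array_like of float, the percentile or sequence of percentiles to compute, each between 0 and 100. For example: q = 50.0 is the median value, q = 25.0 is the first quartile.
axis: the axis or axes along which the percentile is computed. The default, None, computes it over the flattened array.
overwrite_input: boolean. If overwrite_input = True, numpy is allowed to modify the input array a during the computation to save memory, and the contents of a after the call are undefined.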
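interpolation: It determines how the value is computed when the q-th percentile falls between two data points i and j. It can be one of {‘linear’, ‘lower’, ‘higher’, ‘midpoint’, ‘nearest’}. In NumPy 1.22 and later this parameter is named method, and the interpolation keyword is deprecated.
- ‘linear’: i + (j - i) * fraction, where fraction is the fractional part of the index surrounded by i and j.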
- ‘lower’: i.
- ‘higher’: j.
- ‘nearest’: i or j, whichever is nearest.
- ‘midpoint’: (i + j) / 2.
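To make the formulas above concrete, here is a minimal sketch that reimplements them for a 1-D list in plain Python and compares the 'linear' case with numpy.percentile(). The helper percentile_1d is hypothetical (it is not part of numpy), and its tie-breaking for 'nearest' is simplified, so treat it as an illustration rather than numpy's actual implementation.

```python
import math
import numpy as np

def percentile_1d(values, q, mode="linear"):
    # Hypothetical helper for illustration only; not a numpy function.
    s = sorted(values)
    pos = (q / 100.0) * (len(s) - 1)   # fractional index into the sorted data
    lower_idx = math.floor(pos)
    upper_idx = math.ceil(pos)
    i, j = s[lower_idx], s[upper_idx]  # the two neighbouring data points
    fraction = pos - lower_idx
    if mode == "linear":
        return i + (j - i) * fraction
    if mode == "lower":
        return i
    if mode == "higher":
        return j
    if mode == "midpoint":
        return (i + j) / 2
    if mode == "nearest":
        # Simplified: an exact tie (fraction == 0.5) goes to j here.
        return i if fraction < 0.5 else j
    raise ValueError("unknown mode: " + mode)

sample = [1, 2, 3, 4]
modes = ("linear", "lower", "higher", "midpoint", "nearest")
# For q = 40 the index is 0.4 * 3 = 1.2, so i = 2, j = 3, fraction = 0.2.
results = {mode: percentile_1d(sample, 40, mode) for mode in modes}
print(results)
print(np.percentile(sample, 40))  # numpy's default is the linear rule
```

For this sample, 'lower' gives 2, 'higher' gives 3, 'midpoint' gives 2.5, 'nearest' gives 2, and 'linear' gives approximately 2.2.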
You can find more details on the parameters here:
https://numpy.org/doc/stable/reference/generated/numpy.percentile.html
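Since q is array_like, several percentiles can be requested in one call. The short sketch below assumes a small sample array (chosen here only for illustration) and shows that the percentiles are placed along the first axis of the result.

```python
import numpy as np

data = np.array([[10, 7, 4], [3, 2, 1]])

# Three percentiles of the flattened array in a single call.
quartiles = np.percentile(data, [25, 50, 75])
print(quartiles)

# With an axis, the first dimension of the result indexes the percentiles.
per_row = np.percentile(data, [25, 50, 75], axis=1)
print(per_row.shape)  # (3, 2): 3 percentiles for each of the 2 rows
```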
How to use numpy.percentile()?
We will use some examples to show you how to use it.
Compute the median value using numpy
import numpy as np
data = np.array([[10, 7, 4], [3, 2, 1]])
q = 50.0
With the default axis = None, the array is flattened and we compute the median of all values in data.
v = np.percentile(data, q)
print(data)
print(v)
v will be 3.5
With axis = 1, we compute the median along axis 1, which gives one median for each row.
v = np.percentile(data, q, axis = 1)
v will be:
[7. 2.]
The shape of v is (2,) and its rank is 1, which differs from the rank of the input data. We can use keepdims = True to keep the rank the same.
v = np.percentile(data, q, axis = 1, keepdims = True)
v will be:
[[7.]
 [2.]]
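One practical reason to keep the rank is broadcasting. As a small sketch (the variable names are chosen here for illustration), a result of shape (2, 1) can be subtracted directly from the (2, 3) input, whereas a result of shape (2,) would not broadcast against it.

```python
import numpy as np

data = np.array([[10, 7, 4], [3, 2, 1]])

# keepdims=True gives shape (2, 1), which broadcasts against shape (2, 3).
row_median = np.percentile(data, 50.0, axis=1, keepdims=True)
centered = data - row_median  # subtract each row's median from that row
print(row_median.shape)
print(centered)
```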
How about overwrite_input = True?
Look at the example below:
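import numpy as np
data = np.array([[11, 7, 4, 1], [2, 2, 1, 3]])
q = 50.0
v = np.percentile(data, q, overwrite_input = True, axis = 1, keepdims = True)
print(data)
print(v)
Running this code gives this result:
[[ 1  4  7 11]
 [ 1  2  2  3]]
[[5.5]
 [2. ]]
Here each row of data has been reordered in place. numpy only states that the contents of the input are undefined after the call, so you should not rely on the array being fully sorted.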
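If the original order of data still matters, one option is to pass a scratch copy when using overwrite_input = True. This is only a sketch of that idea, and the copy gives up the memory saving that the flag is meant to provide.

```python
import numpy as np

data = np.array([[11, 7, 4, 1], [2, 2, 1, 3]])
original = data.copy()  # kept only to verify that data is untouched

# numpy may reorder the scratch copy; data itself is never passed in.
scratch = data.copy()
v = np.percentile(scratch, 50.0, axis=1, overwrite_input=True)

print(v)
print(np.array_equal(data, original))  # data keeps its original order
```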
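What if axis is a sequence of axes?
For example:
axis = (1, 2)
The numpy documentation describes axis as an int or a tuple of ints, so a tuple is used here.
import numpy as np
data = np.array([[[11, 7], [4, 1]], [[2, 2], [1, 3]]])
q = 50.0
v = np.percentile(data, q, axis = (1, 2), keepdims = True)
print(v)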
The median value v is:
[[[5.5]]

 [[2. ]]]
If axis = 2:
v = np.percentile(data, q, axis = 2, keepdims = True)
print(v)
The value of v is:
[[[9. ]
  [2.5]]

 [[2. ]
  [2. ]]]
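One way to read a multi-axis call is that the listed axes are merged before the percentile is taken. The sketch below checks this for the 3-D array used above by comparing axis = (1, 2) against a manual reshape. The variable names are illustrative.

```python
import numpy as np

data = np.array([[[11, 7], [4, 1]], [[2, 2], [1, 3]]])

# Median over axes 1 and 2 together: one value per entry along axis 0.
v_tuple = np.percentile(data, 50.0, axis=(1, 2))

# Same computation after merging axes 1 and 2 into a single axis by hand.
merged = data.reshape(data.shape[0], -1)  # shape (2, 4)
v_flat = np.percentile(merged, 50.0, axis=1)

print(v_tuple)
print(np.array_equal(v_tuple, v_flat))
```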