Understand numpy.percentile(): Compute the q-th Percentile – NumPy Tutorial

December 3, 2020

numpy.percentile() allows us to compute the q-th percentile of an array. In this tutorial, we will use some examples to show you how to use this function.

Syntax

numpy.percentile(a, q, axis=None, out=None, overwrite_input=False, interpolation='linear', keepdims=False)

It computes the q-th percentile of the data along the specified axis and returns the q-th percentile(s) of the array elements.

Parameters

a: the input array

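q: array_like of float, the percentile(s) to compute, in the range 0 to 100. For example, q = 50.0 is the median and q = 25.0 is the first quartile.

axis: the axis or axes along which the percentile is computed. The default, None, computes it over the flattened array.

overwrite_input: boolean. If overwrite_input = True, NumPy is allowed to modify the input array a during the calculation to save memory, so the contents of a afterwards should not be relied on.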

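interpolation: It determines how the value is computed when the q-th percentile falls between two data points i < j. It can be one of: {‘linear’, ‘lower’, ‘higher’, ‘midpoint’, ‘nearest’}. Since NumPy 1.22 this parameter is named method, and the name interpolation is deprecated.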

  • ‘linear’: i + (j - i) * fraction, where fraction is the fractional part of the index surrounded by i and j.
  • ‘lower’: i.
  • ‘higher’: j.
  • ‘nearest’: i or j, whichever is nearest.
  • ‘midpoint’: (i + j) / 2.
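
To make the five rules above concrete, here is a minimal sketch that applies each of them to the same small array. The helper name percentile_with is hypothetical and only exists to cope with the keyword being called method in NumPy 1.22 and later and interpolation in older releases. The array and q are chosen so that the percentile falls between the values 2 and 3.

```python
import numpy as np

def percentile_with(data, q, how):
    """Call np.percentile with the given rule (hypothetical helper).

    NumPy 1.22+ names the keyword `method`; older releases only know
    `interpolation`, so we fall back to it on a TypeError.
    """
    try:
        return np.percentile(data, q, method=how)
    except TypeError:
        return np.percentile(data, q, interpolation=how)

data = np.array([1, 2, 3, 4])
q = 40.0  # index = 0.4 * (4 - 1) = 1.2, so i = 2, j = 3, fraction = 0.2

results = {how: float(percentile_with(data, q, how))
           for how in ("linear", "lower", "higher", "nearest", "midpoint")}

for how, value in results.items():
    print(how, value)
# linear is about 2.2, lower is 2.0, higher is 3.0,
# nearest is 2.0 and midpoint is 2.5
```

Only ‘linear’ and ‘midpoint’ can return a value that is not present in the data; the other three always pick an existing element.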

You can find more details on the parameters here:

https://numpy.org/doc/stable/reference/generated/numpy.percentile.html
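
Because q is array_like, several percentiles can be requested in one call. The sketch below is illustrative only (the array values are made up for this example) and shows that the result gets one leading entry per requested percentile.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])

# Ask for the three quartiles at once.
quartiles = np.percentile(data, [25, 50, 75])
print(quartiles)        # [2. 3. 4.]
print(quartiles.shape)  # (3,)

# With an axis, the first dimension of the result indexes the percentiles.
data2d = np.array([[10, 7, 4], [3, 2, 1]])
per_row = np.percentile(data2d, [25, 75], axis=1)
print(per_row.shape)    # (2, 2): 2 percentiles x 2 rows
```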

How to use numpy.percentile()?

We will use some examples to show you how to use it.

Compute the median value using numpy

import numpy as np

data = np.array([[10, 7, 4], [3, 2, 1]])

q = 50.0

With the default axis = None, we compute the median of all values in data.

v = np.percentile(data, q)
print(data)
print(v)

v will be 3.5
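
As a quick sketch of how the axis argument changes what is reduced, the snippet below uses the same data three ways. Leaving axis unset flattens the array, axis = 0 gives one value per column and axis = 1 gives one value per row. The variable names are chosen for this illustration only.

```python
import numpy as np

data = np.array([[10, 7, 4], [3, 2, 1]])
q = 50.0

overall = np.percentile(data, q)              # all six values together
per_column = np.percentile(data, q, axis=0)   # reduce over rows
per_row = np.percentile(data, q, axis=1)      # reduce over columns

print(overall)      # 3.5
print(per_column)   # [6.5 4.5 2.5]
print(per_row)      # [7. 2.]

# For q = 50 the result should agree with np.median.
print(np.allclose(per_column, np.median(data, axis=0)))  # True
```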

With axis = 1, we compute the median along axis 1, that is, one median per row.

v = np.percentile(data, q, axis = 1)

v will be:

[7. 2.]

The shape of v is (2,), so its rank is 1, which differs from the rank of the input data. We can use keepdims to keep the rank the same as the input.

v = np.percentile(data, q, axis = 1, keepdims = True)

v will be:

[[7.]
 [2.]]
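
One practical reason to keep the reduced axis is broadcasting. The following is a minimal sketch, not the only way to do it, that subtracts each row's median from that row; it works because the (2, 1) result lines up against the (2, 3) input.

```python
import numpy as np

data = np.array([[10, 7, 4], [3, 2, 1]])

# keepdims=True keeps the reduced axis with length 1: shape (2, 1).
row_medians = np.percentile(data, 50.0, axis=1, keepdims=True)

# Broadcasting stretches the (2, 1) medians across the 3 columns.
centered = data - row_medians
print(row_medians.shape)  # (2, 1)
print(centered)
# [[ 3.  0. -3.]
#  [ 1.  0. -1.]]
```

Without keepdims the medians would have shape (2,), which does not broadcast against a (2, 3) array.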

How about overwrite_input = True?

Look at the example below:

import numpy as np

data = np.array([[11, 7, 4, 1], [2, 2, 1, 3]])

q = 50.0

v = np.percentile(data, q, overwrite_input = True, axis = 1, keepdims = True)
print(data)
print(v)

Running this code produces the following result:

[[ 1  4  7 11]
 [ 1  2  2  3]]
[[5.5]
 [2. ]]

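The values in data have been rearranged in place. Here each row ends up sorted, but NumPy documents the contents of the input after such a call as undefined, so do not rely on a particular order.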
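
If the original order of the array still matters, one hedged approach is to leave overwrite_input at its default, or to hand the function a scratch copy. The sketch below checks both; the names original and scratch are just for this illustration.

```python
import numpy as np

data = np.array([[11, 7, 4, 1], [2, 2, 1, 3]])
original = data.copy()

# Default (overwrite_input=False): the input is left untouched.
v_default = np.percentile(data, 50.0, axis=1, keepdims=True)
print(np.array_equal(data, original))    # True

# overwrite_input=True on a scratch copy: data itself stays intact,
# and the computed percentiles are the same.
scratch = data.copy()
v_fast = np.percentile(scratch, 50.0, axis=1, keepdims=True,
                       overwrite_input=True)
print(np.array_equal(data, original))    # True
print(np.array_equal(v_default, v_fast)) # True
```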

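What if axis is a list?

A sequence of axes computes the percentile over all of those axes at once. The NumPy documentation describes this argument as a tuple of ints, such as (1, 2); the list form below also works in this example. For example:

axis = [1, 2]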

import numpy as np

data = np.array([[[11, 7], [4, 1]], [[2, 2], [1, 3]]])

q = 50.0

v = np.percentile(data, q, axis = [1, 2], keepdims = True)
print(v)

The median value v is:

[[[5.5]]

 [[2. ]]]

If axis = 2:

v = np.percentile(data, q, axis = 2, keepdims = True)
print(v)

The value of v is:

[[[9. ]
  [2.5]]

 [[2. ]
  [2. ]]]
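
One way to read a multi-axis reduction is as a reshape followed by a single-axis reduction. The sketch below, which uses the documented tuple form axis=(1, 2), merges the last two axes by hand and checks that both routes give the same medians.

```python
import numpy as np

data = np.array([[[11, 7], [4, 1]], [[2, 2], [1, 3]]])

# Reduce over axes 1 and 2 together.
v_axes = np.percentile(data, 50.0, axis=(1, 2))

# Same thing by hand: merge axes 1 and 2 into one, then reduce over it.
flat = data.reshape(data.shape[0], -1)   # shape (2, 4)
v_flat = np.percentile(flat, 50.0, axis=1)

print(v_axes)                        # [5.5 2. ]
print(np.allclose(v_axes, v_flat))   # True
```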
