Understand numpy.split(): Split an Array into Sub-Arrays – NumPy Tutorial

By | December 6, 2019

numpy.split() splits a NumPy array into several sub-arrays. However, there are a few pitfalls you should be aware of when using this function. In this tutorial, we will use some examples to discuss them.

Syntax

numpy.split(ary, indices_or_sections, axis=0)

Split an array into multiple sub-arrays.

Parameters explained

ary: the array you plan to split

indices_or_sections: an int or a 1-D array, which determines how the array is split

axis: the axis along which the array is split; the default is 0
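To make the axis parameter concrete, here is a minimal sketch. The array x below is a made-up example with shape (2, 4); it only shows how the shapes of the sub-arrays change with axis.

```python
import numpy as np

# Hypothetical example array with shape (2, 4).
x = np.arange(8).reshape(2, 4)

rows = np.split(x, 2, axis=0)  # cut along the first axis (rows)
cols = np.split(x, 2, axis=1)  # cut along the second axis (columns)

print([r.shape for r in rows])  # [(1, 4), (1, 4)]
print([c.shape for c in cols])  # [(2, 2), (2, 2)]
```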

There are a few things to watch out for when you are using this function.

When indices_or_sections is an int

For example, indices_or_sections = 3 means the array will be split into 3 equally sized sub-arrays along the given axis. We will use an example to discuss it.

Create a 2 x 4 NumPy array

import numpy as np

# x has shape (2, 4)
x = np.array([[1, 2, 3, 4],[2, 3, 4, 5]], dtype = np.float32)

Split this array into 4 sub-arrays on axis = 1

xs = np.split(x,indices_or_sections = 4, axis = 1)
print(xs)

The result xs is:

[array([[1.],
       [2.]], dtype=float32), array([[2.],
       [3.]], dtype=float32), array([[3.],
       [4.]], dtype=float32), array([[4.],
       [5.]], dtype=float32)]

In the example above, the second dimension of x has size 4, so we can split it into 4 sub-arrays. However, if we try to split it into 3 sub-arrays, will it succeed?

xs = np.split(x,indices_or_sections = 3, axis = 1)
print(xs)

You will get an error: ValueError: array split does not result in an equal division

This is because 4 is not evenly divisible by 3.
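If an unequal division is acceptable, one option is numpy.array_split(), which takes the same arguments but does not require equal sizes. The sketch below is only an illustration: it checks divisibility first and falls back to array_split() otherwise. The variable name sections is an assumption made for this example.

```python
import numpy as np

x = np.array([[1, 2, 3, 4], [2, 3, 4, 5]], dtype=np.float32)
sections = 3  # hypothetical number of pieces we want

if x.shape[1] % sections == 0:
    # Equal division is possible, so np.split() works.
    parts = np.split(x, sections, axis=1)
else:
    # np.array_split() allows pieces of different sizes.
    parts = np.array_split(x, sections, axis=1)

print([p.shape for p in parts])  # [(2, 2), (2, 1), (2, 1)]
```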

When indices_or_sections is a 1-D list

If indices_or_sections is a 1-D list, for example indices_or_sections = [1, 2], it does not mean that the array is split into two arrays where the first contains 1 element and the second contains 2 elements. This differs from tf.split(), where a list gives the size of each piece.

Remember: if indices_or_sections is a 1-D list, its values are the indices along the given axis at which the array is cut.

With indices_or_sections = [1, 2], the array is always split into 3 sub-arrays: the first covers indices [0, 1), the second covers [1, 2), and the third covers everything from index 2 to the end.

Here is an example:

import numpy as np

x = np.array([1, 2, 3, 4, 2, 3, 4, 5], dtype = np.float32)

xs = np.split(x,indices_or_sections = [1,2], axis = 0)
print(xs)

In this example, the NumPy array x is split into 3 sub-arrays. The result is:

[array([1.], dtype=float32), array([2.], dtype=float32), array([3., 4., 2., 3., 4., 5.], dtype=float32)]

From the result we can see:

The first sub-array is: x[0:1]

The second sub-array is: x[1:2]

The third sub-array is: x[2:]
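The relationship between the index list and ordinary slicing can be checked directly. The sketch below rebuilds the pieces by hand from consecutive boundaries and compares them with the output of np.split(); the names bounds and manual are introduced only for this illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 2, 3, 4, 5], dtype=np.float32)
indices = [1, 2]

xs = np.split(x, indices, axis=0)

# Boundaries are 0, then each index, then the length of the array.
bounds = [0] + indices + [len(x)]
manual = [x[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]

for a, b in zip(xs, manual):
    print(np.array_equal(a, b))  # True for each pair
```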

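Meanwhile, if indices_or_sections = [n1, n2, n3, ..., nk], you should make sure that n1 < n2 < n3 < ... < nk. NumPy does not raise an error when the indices are out of order, but the result is probably not what you want.

Consider this example: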

xs = np.split(x,indices_or_sections = [2, 3, 2], axis = 0)
print(xs)

Here the indices are not in ascending order: 2 < 3, but 3 > 2.

The sub-arrays are:

[array([1., 2.], dtype=float32), array([3.], dtype=float32), array([], dtype=float32), array([3., 4., 2., 3., 4., 5.], dtype=float32)]

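The third sub-array is array([], dtype=float32) because it corresponds to the slice x[3:2], which is empty since 3 > 2. The fourth sub-array is x[2:], so x[2] appears in both the second and the fourth sub-array.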
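Since NumPy accepts out-of-order indices silently, it can help to check the order yourself before splitting. The function split_checked() below is a hypothetical helper written for this tutorial, not part of NumPy. It is a minimal sketch that rejects index lists that are not in ascending order.

```python
import numpy as np

def split_checked(ary, indices, axis=0):
    """Hypothetical helper: refuse indices that are not in ascending order."""
    idx = np.asarray(indices)
    # np.diff() gives the difference between neighbouring indices;
    # a negative value means the list goes backwards somewhere.
    if np.any(np.diff(idx) < 0):
        raise ValueError(f"indices must be in ascending order, got {list(indices)}")
    return np.split(ary, indices, axis=axis)

x = np.array([1, 2, 3, 4, 2, 3, 4, 5], dtype=np.float32)

print(split_checked(x, [2, 3]))  # three sub-arrays: x[0:2], x[2:3], x[3:]

try:
    split_checked(x, [2, 3, 2])
except ValueError as err:
    print(err)
```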

How to split an array into two sub arrays easily?

We can use indices_or_sections = [N,], where N is the number of elements in the first sub-array.

In the example below, we split the NumPy array x into two sub-arrays, and the first sub-array contains 4 elements.

import numpy as np

x = np.array([1, 2, 3, 4, 2, 3, 4, 5], dtype = np.float32)

xs = np.split(x,indices_or_sections = [4,], axis = 0)
print(xs)

Here we set indices_or_sections = [4,]. The sub-arrays are:

[array([1., 2., 3., 4.], dtype=float32), array([2., 3., 4., 5.], dtype=float32)]
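The same idea extends to more than two pieces. If you prefer to think in sizes, as with tf.split(), the cut indices are the cumulative sums of the sizes without the last one. The function split_by_sizes() below is a hypothetical helper for illustration only, and it assumes the sizes add up to the length of the axis.

```python
import numpy as np

def split_by_sizes(ary, sizes, axis=0):
    """Hypothetical helper: split ary into pieces with the given lengths."""
    if sum(sizes) != ary.shape[axis]:
        raise ValueError("sizes must add up to the length of the axis")
    # Sizes [1, 2, 5] become cut indices [1, 3].
    indices = np.cumsum(sizes)[:-1]
    return np.split(ary, indices, axis=axis)

x = np.array([1, 2, 3, 4, 2, 3, 4, 5], dtype=np.float32)

xs = split_by_sizes(x, [1, 2, 5])
print([len(p) for p in xs])  # [1, 2, 5]
```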
