The TensorFlow tf.nn.max_pool() function is one of the building blocks of a convolutional network. In this tutorial, we will introduce how to use it correctly.
Syntax
tf.nn.max_pool() is defined as:
tf.nn.max_pool( value, ksize, strides, padding, data_format='NHWC', name=None )
It performs max pooling on value, taking the maximum inside a sliding window whose size is given by ksize.
Parameters
value: the input tensor. With the default data_format, its shape should be [batch, h, w, channels]. It can be the tensor returned by tf.nn.conv2d().
ksize: the size of the pooling window. Its shape should be [1, k_h, k_w, 1]. The first and last entries are 1 because we do not pool across the batch and channel dimensions.
strides: the step by which the pooling window moves. Its shape should be [1, stride, stride, 1], the same as strides in tf.nn.conv2d().
padding: 'VALID' or 'SAME'. It works the same way as padding in tf.nn.conv2d().
data_format: a string, one of 'NHWC', 'NCHW' or 'NCHW_VECT_C'. The default is 'NHWC'. It plays the same role as data_format in tf.nn.conv2d().
Understand tf.nn.conv2d(): Compute a 2-D Convolution in TensorFlow
Return
A tensor with the same data type as value. With data_format='NHWC', its shape is [batch, out_height, out_width, channels].
To understand how to compute out_height and out_width, you can read:
Understand the Shape of Tensor Returned by tf.nn.conv2d()
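As a rough sketch of that calculation, the helper below computes the output size along one spatial dimension for both padding modes, using the formulas given in the TensorFlow documentation. The name pool_output_size is hypothetical and is not part of TensorFlow.

```python
import math


def pool_output_size(in_size, k, stride, padding):
    """Output length along one spatial dimension (hypothetical helper)."""
    if padding == 'VALID':
        # Only windows that fit entirely inside the input are used.
        return math.ceil((in_size - k + 1) / stride)
    if padding == 'SAME':
        # The input is padded so that every stride position yields an output.
        return math.ceil(in_size / stride)
    raise ValueError("padding must be 'VALID' or 'SAME'")


# A 4 x 4 input with a 2 x 2 window and stride 1.
print(pool_output_size(4, 2, 1, 'VALID'))
print(pool_output_size(4, 2, 1, 'SAME'))
```

Apply it once with the height and k_h to get out_height, and once with the width and k_w to get out_width.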
Then we will use an example to show you how to use tf.nn.max_pool().
How to use tf.nn.max_pool()?
Look at the example code below:
import tensorflow as tf

a = tf.constant([[
    [[1., 17.], [2., 18.], [3., 19.], [4., 20.]],
    [[5., 21.], [6., 22.], [7., 23.], [8., 24.]],
    [[9., 25.], [10., 26.], [11., 27.], [12., 28.]],
    [[13., 29.], [14., 30.], [15., 31.], [16., 32.]]
]])

pooling = tf.nn.max_pool(a, [1, 2, 2, 1], [1, 1, 1, 1], padding='VALID')

with tf.Session() as sess:
    print('image: ')
    print(sess.run(a))
    print('\n')
    print('result: ')
    print(sess.run(pooling))
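Channel 1 of a is:
[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
Channel 2 of a is:
[[17, 18, 19, 20], [21, 22, 23, 24], [25, 26, 27, 28], [29, 30, 31, 32]]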
The shape of a is [1, 4, 4, 2]: one image of 4 * 4 pixels with 2 channels. ksize is [1, 2, 2, 1], which means we take the max value in each 2 * 2 window, separately for each channel.
For example, in channel 1 we get 6 from [1, 2, 5, 6] and 7 from [2, 3, 6, 7].
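The result, shown per channel, is:
Channel 1: [[6, 7, 8], [10, 11, 12], [14, 15, 16]]
Channel 2: [[22, 23, 24], [26, 27, 28], [30, 31, 32]]
The shape of pooling is [1, 3, 3, 2].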
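To check these numbers without running TensorFlow, here is a minimal pure-Python sketch of max pooling with 'VALID' padding on a nested list in NHWC layout. The function max_pool_valid is a hypothetical reference helper written for this example. It is not how TensorFlow implements the operation, and it does not handle 'SAME' padding.

```python
def max_pool_valid(x, k_h, k_w, s_h, s_w):
    """Max pooling over a nested list in NHWC layout with 'VALID' padding."""
    batch, h, w, channels = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])
    # Number of window positions that fit entirely inside the input.
    out_h = (h - k_h) // s_h + 1
    out_w = (w - k_w) // s_w + 1
    return [[[[max(x[n][i * s_h + di][j * s_w + dj][c]
                   for di in range(k_h) for dj in range(k_w))
               for c in range(channels)]
              for j in range(out_w)]
             for i in range(out_h)]
            for n in range(batch)]


# Same values as the tensor a above: channel 1 holds 1..16, channel 2 holds 17..32.
a = [[[[float(4 * r + c + 1), float(4 * r + c + 17)] for c in range(4)]
      for r in range(4)]]

pooled = max_pool_valid(a, 2, 2, 1, 1)
print([[px[0] for px in row] for row in pooled[0]])  # channel 1
print([[px[1] for px in row] for row in pooled[0]])  # channel 2
```

Each channel is pooled independently, which is why the channel count stays at 2 while the height and width shrink from 4 to 3.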