Understand TensorFlow tf.nn.max_pool(): Implement Max Pooling for Convolutional Network – TensorFlow Tutorial

By | August 10, 2020

The TensorFlow tf.nn.max_pool() function is a common building block of convolutional networks. In this tutorial, we will introduce how to use it correctly.

Syntax

tf.nn.max_pool() is defined as:

tf.nn.max_pool(
    value,
    ksize,
    strides,
    padding,
    data_format='NHWC',
    name=None
)

It performs max pooling on value: for each window of size ksize, it keeps only the largest element.

Parameters

value: the input tensor. With the default data_format, its shape should be [batch, h, w, channels]. It can be the tensor returned by tf.nn.conv2d().

ksize: the size of the pooling window, given as [1, k_h, k_w, 1]. The first and last entries are 1 because we do not pool over the batch and channel dimensions.

strides: the step by which the pooling window moves, given as [1, stride, stride, 1]. It has the same meaning as strides in tf.nn.conv2d().

padding: ‘VALID’ or ‘SAME’. It has the same meaning as padding in tf.nn.conv2d().

data_format: a string, one of ‘NHWC’, ‘NCHW’ or ‘NCHW_VECT_C’. It has the same meaning as data_format in tf.nn.conv2d().

Understand tf.nn.conv2d(): Compute a 2-D Convolution in TensorFlow

Return

A tensor with the same data type as value. With data_format=‘NHWC’, its shape is [batch, out_height, out_width, channels].

To understand how to compute out_height and out_width, you can read:

Understand the Shape of Tensor Returned by tf.nn.conv2d()
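As a quick check, the output size along one spatial dimension can be computed by hand. The sketch below is a small plain-Python helper, not part of TensorFlow; the name pool_output_size is our own. It assumes the usual TensorFlow rules for ‘VALID’ and ‘SAME’ padding.

```python
import math

def pool_output_size(in_size, k_size, stride, padding):
    """Output length along one spatial dimension (height or width).

    Hypothetical helper for illustration; it is not a TensorFlow function.
    """
    if padding == 'VALID':
        # Only windows that fit entirely inside the input are used.
        return math.ceil((in_size - k_size + 1) / stride)
    if padding == 'SAME':
        # The input is padded so every stride position yields a window.
        return math.ceil(in_size / stride)
    raise ValueError("padding must be 'VALID' or 'SAME'")

print(pool_output_size(4, 2, 1, 'VALID'))  # 3
print(pool_output_size(4, 2, 2, 'VALID'))  # 2
print(pool_output_size(4, 2, 1, 'SAME'))   # 4
```

For a 4 * 4 input with a 2 * 2 window and stride 1, ‘VALID’ padding gives out_height = out_width = 3.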

Then we will use an example to show you how to use tf.nn.max_pool().

How to use tf.nn.max_pool()?

Look at the example code below:

import tensorflow as tf

a = tf.constant([[
            [[1., 17.],
             [2., 18.], 
             [3., 19.],
             [4., 20.]],
            [[5., 21.],
             [6., 22.],
             [7., 23.],
             [8., 24.]],
            [[9., 25.],
             [10., 26.],
             [11., 27.],
             [12., 28.]],
            [[13., 29.],
             [14., 30.],
             [15., 31.],
             [16., 32.]]
        ]])
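# 2 * 2 pooling window, stride 1, no padding
pooling = tf.nn.max_pool(a, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1], padding='VALID')
# tf.Session() is the TensorFlow 1.x way to evaluate tensors
with tf.Session() as sess:
    print('image: ')
    print(sess.run(a))
    print('\n')
    print('result: ')
    print(sess.run(pooling))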

Channel 1 of a:

[[ 1.,  2.,  3.,  4.],
 [ 5.,  6.,  7.,  8.],
 [ 9., 10., 11., 12.],
 [13., 14., 15., 16.]]

Channel 2 of a:

[[17., 18., 19., 20.],
 [21., 22., 23., 24.],
 [25., 26., 27., 28.],
 [29., 30., 31., 32.]]

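The shape of a is [1, 4, 4, 2]: one image in the batch, 4 * 4 positions and 2 channels. ksize is [1, 2, 2, 1], which means we take the max value in each 2 * 2 window. Because strides is [1, 1, 1, 1], the window moves one step at a time.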

For example, in channel 1 we get 6 from [1, 2, 5, 6] and 7 from [2, 3, 6, 7].
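To make the window logic concrete, here is a minimal plain-Python sketch of ‘VALID’ max pooling over a single channel. It is only an illustration of the idea, not how TensorFlow implements it; the helper name max_pool_2d and the variable image are our own. image holds the same values as the tensor a, without the batch dimension.

```python
def max_pool_2d(channel, k_h, k_w, stride):
    """Max pooling with 'VALID' padding over one channel (a list of rows).

    Hypothetical helper for illustration; it is not a TensorFlow function.
    """
    h, w = len(channel), len(channel[0])
    out = []
    for top in range(0, h - k_h + 1, stride):
        row = []
        for left in range(0, w - k_w + 1, stride):
            # Collect the k_h * k_w values covered by the window.
            window = [channel[top + i][left + j]
                      for i in range(k_h) for j in range(k_w)]
            row.append(max(window))
        out.append(row)
    return out

# Same values as a[0], laid out as [h][w][channels].
image = [[[float(4 * r + c + 1), float(4 * r + c + 17)] for c in range(4)]
         for r in range(4)]

# Pooling is applied to each channel independently.
channel_1 = [[pixel[0] for pixel in row] for row in image]
channel_2 = [[pixel[1] for pixel in row] for row in image]

pooled_1 = max_pool_2d(channel_1, 2, 2, 1)
pooled_2 = max_pool_2d(channel_2, 2, 2, 1)
print(pooled_1)  # [[6.0, 7.0, 8.0], [10.0, 11.0, 12.0], [14.0, 15.0, 16.0]]
print(pooled_2)  # [[22.0, 23.0, 24.0], [26.0, 27.0, 28.0], [30.0, 31.0, 32.0]]
```

The first window of channel 1 covers [1, 2, 5, 6], so the first output value is 6.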

The result is:

Channel 1 of pooling:

[[ 6.,  7.,  8.],
 [10., 11., 12.],
 [14., 15., 16.]]

Channel 2 of pooling:

[[22., 23., 24.],
 [26., 27., 28.],
 [30., 31., 32.]]

The shape of pooling is [1, 3, 3, 2].
