The tf.nn.max_pool() function applies a max pooling operation to input data. In this tutorial, we will show how to use it to compress an image.
To understand how tf.nn.max_pool() works, you can read this tutorial:
Understand TensorFlow tf.nn.max_pool(): Implement Max Pooling for Convolutional Network
Read an image using tensorflow
First, we read the image data using TensorFlow.
Here is an example:
import tensorflow as tf
import numpy as np
from PIL import Image

fpath = 'logo.png'
img = tf.read_file(fpath)
img_arr = tf.image.decode_png(img, channels=3)  # rgb

with tf.Session() as sess:
    imgdata = sess.run(img_arr)
    print(imgdata.shape)
Running this code, we get:
(248, 250, 3)
This means logo.png is 248 pixels high and 250 pixels wide, with 3 color channels.
Here we use TensorFlow to read the image. You can also use Python Pillow to read an image into a NumPy array. Here is the tutorial:
Python Pillow Read Image to NumPy Array: A Step Guide – Python Pillow Tutorial
tf.nn.max_pool() expects a 4-D float tensor with the shape [batch, height, width, channels], so we reshape the image and cast it to tf.float32.
size = tf.shape(img_arr)
img_4d = tf.cast(tf.reshape(img_arr, [1, size[0], size[1], 3]), tf.float32)
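The same shape change can be sketched with NumPy alone. This is only an illustration: the zero-filled array below is a hypothetical stand-in for the decoded logo.png, not the real image data.

```python
import numpy as np

# Hypothetical stand-in for the decoded image: (height, width, channels)
img = np.zeros((248, 250, 3), dtype=np.uint8)

# Add a leading batch axis and cast to float32,
# mirroring the tf.reshape() + tf.cast() step above
img_4d = img[np.newaxis, ...].astype(np.float32)

print(img_4d.shape)
```

The leading 1 is the batch size: a single image is treated as a batch containing one item.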
Then we can use tf.nn.max_pool() to process the image.
Compress images using tf.nn.max_pool()
pool = tf.nn.max_pool(img_4d, [1, 5, 5, 1], [1, 1, 1, 1], 'SAME')
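Here ksize = [1, 5, 5, 1], so each output pixel is the maximum value in a 5 * 5 window around it, computed separately for each color channel. Because strides = [1, 1, 1, 1] and the padding is 'SAME', the output has the same height and width as the input.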
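To see what this window operation does, here is a minimal NumPy sketch of stride-1 max pooling with 'SAME'-style padding. The function name max_pool_same is hypothetical, and the sketch assumes that padded positions never win the maximum (they are filled with -inf). It is an illustration of the idea, not TensorFlow's actual implementation.

```python
import numpy as np

def max_pool_same(image, k):
    """Stride-1 max pooling over a k x k window for an (H, W, C) array."""
    h, w, _ = image.shape
    # 'SAME'-style padding for stride 1: k - 1 extra rows/columns in total,
    # with the smaller half placed before and the rest after
    pad_before = (k - 1) // 2
    pad_after = (k - 1) - pad_before
    padded = np.pad(
        image.astype(np.float32),
        ((pad_before, pad_after), (pad_before, pad_after), (0, 0)),
        mode='constant',
        constant_values=-np.inf,  # padding must never be the maximum
    )
    out = np.full(image.shape, -np.inf, dtype=np.float32)
    # Take the element-wise maximum over every offset inside the window
    for dy in range(k):
        for dx in range(k):
            out = np.maximum(out, padded[dy:dy + h, dx:dx + w, :])
    return out

# Tiny single-channel example with values 0..8
tiny = np.arange(9, dtype=np.float32).reshape(3, 3, 1)
pooled_tiny = max_pool_same(tiny, 3)
print(pooled_tiny[:, :, 0])
```

The output keeps the 3 * 3 shape of the input, and each value is replaced by the largest value in its neighborhood.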
Then we can convert the pooled image data to an image file.
Save image
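with tf.Session() as sess:
    # Fetch the result into a new name so the pool tensor is not overwritten
    imgdata, pooled = sess.run([img_arr, pool])
    print(imgdata.shape)
    print(pooled.shape)
    # Drop the batch dimension and convert to uint8 before saving
    Image.fromarray(np.uint8(pooled.reshape(pooled.shape[1:4]))).save('logo-processed.png')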
We can use the Python Pillow Image.fromarray() function to convert a NumPy array to an image, then save it to a file.
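The conversion back to an image array can be sketched on its own with NumPy. The helper name to_uint8_image and the constant test array are hypothetical. The rounding and clipping are an extra safety step that the code above does not perform; they make no difference for max-pooled values that are already whole numbers in the 0 to 255 range.

```python
import numpy as np

def to_uint8_image(pooled):
    """Convert a (1, H, W, C) float array to an (H, W, C) uint8 array."""
    # Drop the batch axis that was added before pooling
    image = np.squeeze(pooled, axis=0)
    # Round and clip so the cast to uint8 cannot wrap around
    return np.clip(np.rint(image), 0, 255).astype(np.uint8)

# Hypothetical pooled output: one 4 x 6 RGB image filled with 200.0
pooled = np.full((1, 4, 6, 3), 200.0, dtype=np.float32)
image = to_uint8_image(pooled)
print(image.shape, image.dtype)
```

An array in this form can be passed to Image.fromarray() as in the code above.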
Running this code saves the processed image as logo-processed.png.
The source logo.png has a file size of 5,403 bytes.
We can use different ksize values to compress this image. The height and width stay the same; only the file size changes.
ksize | file size | height * width |
[1, 5, 5, 1] | 3,377 bytes | 248 * 250 |
[1, 10, 10, 1] | 3,207 bytes | 248 * 250 |
[1, 15, 15, 1] | 3,021 bytes | 248 * 250 |
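One plausible reason the file shrinks while the dimensions stay fixed is that max pooling makes neighboring pixels more alike, and PNG stores pixel data with DEFLATE compression, which benefits from repetition. The sketch below illustrates this on synthetic random noise, not on logo.png. The helper names and the 64 * 64 test image are assumptions, and zlib on raw bytes only approximates what a PNG encoder does.

```python
import zlib
import numpy as np

def max_pool_same(image, k):
    """Stride-1 k x k max pooling with 'SAME'-style padding, (H, W, C) input."""
    h, w, _ = image.shape
    before = (k - 1) // 2
    after = (k - 1) - before
    padded = np.pad(image.astype(np.float32),
                    ((before, after), (before, after), (0, 0)),
                    mode='constant', constant_values=-np.inf)
    out = np.full(image.shape, -np.inf, dtype=np.float32)
    for dy in range(k):
        for dx in range(k):
            out = np.maximum(out, padded[dy:dy + h, dx:dx + w, :])
    return out

def deflate_size(arr):
    """Number of bytes after compressing the raw array bytes with zlib."""
    return len(zlib.compress(arr.tobytes()))

# Synthetic noisy RGB image (an assumption, not the tutorial's logo.png)
rng = np.random.default_rng(0)
noisy = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

raw_size = deflate_size(noisy)
sizes = {k: deflate_size(max_pool_same(noisy, k).astype(np.uint8))
         for k in (5, 10, 15)}
print(raw_size, sizes)
```

Every pooled array has the same shape as the input, yet compresses to fewer bytes than the raw noise. Real images will behave differently in detail, so treat the numbers as illustrative only.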