Calculate the MD5 Value of a Big File in Python – Python Tutorial

By | January 10, 2021

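The md5 hash value acts as a fingerprint of a file's contents, so it is often used to check whether two files are identical or whether a file has changed. In this tutorial, we will introduce how to calculate it for a big file.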

Preliminary

If you only want to compute the md5 value of a Python string, you can read:

Generate Python String MD5 Value for Python Beginners

How to generate the md5 value of a file?

A file may be small or huge, and reading a huge file into memory all at once can be slow or may fail. To calculate the md5 value, we can instead read the file block by block and update the hash with each block.

Here is an example:

import hashlib

filename = 'data.txt'
md5_hash = hashlib.md5()
with open(filename, "rb") as f:
    # Read and update hash in chunks of 4K
    # f.read() returns b"" at the end of the file, which stops the loop
    for byte_block in iter(lambda: f.read(4096), b""):
        md5_hash.update(byte_block)

print(md5_hash.hexdigest())

In this example code, we read the file 4K (4*1024 = 4096 bytes) at a time and update the md5 hash with each block, so only one block is held in memory at once.

Run this code and you will get an md5 value for your file, for example:

b76f7031ca6f31266668a00d81a3f5c1
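
If you need the md5 value of more than one file, the same logic can be wrapped in a function. The sketch below is one possible way to do it: the function name file_md5 and the chunk_size parameter are our own choices for illustration, and the temporary file is only there so the code runs without data.txt. It also checks that hashing block by block gives the same result as hashing the whole content at once.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=4096):
    """Return the hex md5 digest of the file at path, read in chunks.

    file_md5 and chunk_size are illustrative names, not part of hashlib.
    """
    md5_hash = hashlib.md5()
    with open(path, "rb") as f:
        # f.read() returns b"" at the end of the file, which stops the loop
        for byte_block in iter(lambda: f.read(chunk_size), b""):
            md5_hash.update(byte_block)
    return md5_hash.hexdigest()


# Write a small temporary file so this sketch runs without data.txt
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world\n" * 1000)
    tmp_path = tmp.name

# Hash the file block by block
chunked = file_md5(tmp_path, chunk_size=4096)

# Hash the same file in one go, only for comparison on this small file
with open(tmp_path, "rb") as f:
    whole = hashlib.md5(f.read()).hexdigest()

os.remove(tmp_path)

print(chunked == whole)  # True
```

The block size does not change the md5 value, it only changes how much data is read into memory each time. A larger chunk_size, such as 1024*1024, may reduce the number of read calls for very big files.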
