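An MD5 hash value works as a fingerprint of a file's contents: identical files always produce the same value, and in practice different files almost always produce different ones. In this tutorial, we will show how to calculate it for a large file.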
Preliminary
If you only want to compute the MD5 value of a Python string, see:
Generate Python String MD5 Value for Python Beginners
How to generate the md5 value of a file?
A file may be small or very large. To avoid loading a large file into memory all at once, we can read it and update the hash block by block.
Here is an example:
import hashlib

filename = 'data.txt'
md5_hash = hashlib.md5()
with open(filename, "rb") as f:
    # Read and update hash in chunks of 4K
    for byte_block in iter(lambda: f.read(4096), b""):
        md5_hash.update(byte_block)
print(md5_hash.hexdigest())
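In this example, the file is read and hashed in 4 KB blocks (4 * 1024 = 4096 bytes). The loop stops when f.read() returns an empty bytes object, which marks the end of the file.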
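Run this code and it prints the MD5 value of the file as a hexadecimal string. The value depends on the contents of your data.txt; for example: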
b76f7031ca6f31266668a00d81a3f5c1
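The chunked read above can also be wrapped in a reusable function. The following is only a minimal sketch: the function name file_md5 and its chunk_size parameter are hypothetical names chosen for illustration, and the demo writes a temporary file of random bytes so it runs without data.txt. It also checks that hashing block by block gives the same result as hashing all the bytes at once.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=4096):
    """Return the hex MD5 digest of the file at `path`, read in chunks.

    `file_md5` and `chunk_size` are illustrative names, not a library API.
    """
    md5_hash = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns b"" (end of file)
        for byte_block in iter(lambda: f.read(chunk_size), b""):
            md5_hash.update(byte_block)
    return md5_hash.hexdigest()


# Demo on a throwaway file; 10000 is deliberately not a multiple of 4096,
# so the last block is shorter than the others.
data = os.urandom(10000)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    tmp_path = tmp.name

chunked = file_md5(tmp_path)
one_shot = hashlib.md5(data).hexdigest()
print(chunked == one_shot)  # True: both approaches hash the same bytes

os.remove(tmp_path)
```

A larger chunk_size (for example 1024 * 1024) means fewer read calls on very large files, and the resulting digest is the same whatever block size is used.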