In this tutorial, we will show how to use the Python PyPDF2 library to split a large PDF file into a smaller one by extracting selected pages.
Preliminary
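We should install PyPDF2 first. The examples in this tutorial use the PdfFileReader and PdfFileWriter classes, which PyPDF2 3.0 removed in favor of PdfReader and PdfWriter, so install a release earlier than 3.0 to run them unchanged.
pip install "pypdf2<3.0"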
Read a pdf file using pypdf2
Here is an example:
from PyPDF2 import PdfFileReader, PdfFileWriter

pdf_input = r'2022010614181218.pdf'
pdf = PdfFileReader(pdf_input)
Get the pages you want from the source PDF
Here is an example:
pdf_writer = PdfFileWriter()
output_filename = "fengyijun.pdf"
for page in range(2, 3):
    pdf_writer.addPage(pdf.getPage(page))
In this example, we create a PdfFileWriter instance to hold the pages we want to extract from the source PDF.
Note that the page index starts from 0: the first page has index 0, the second page has index 1, and so on.
Because range(2, 3) yields only the index 2, this example extracts the third page of 2022010614181218.pdf so it can be saved to a new PDF.
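Since page indices are zero-based while people usually think in one-based page numbers, it can help to make the conversion explicit. The following is a minimal sketch; pages_to_indices is a hypothetical helper written for this tutorial and is not part of PyPDF2.

```python
def pages_to_indices(first_page, last_page):
    """Convert an inclusive, 1-based page span into a 0-based range.

    pages_to_indices is a hypothetical helper for this tutorial,
    not a PyPDF2 function.
    """
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    # Page N lives at index N - 1. range() excludes its stop value,
    # so last_page itself is the correct stop.
    return range(first_page - 1, last_page)


# The third page only, equivalent to range(2, 3).
print(list(pages_to_indices(3, 3)))  # [2]
```

The returned range can be used directly in the for loop that adds pages to the writer.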
Save pages to a new PDF
Finally, we can save the pages extracted from the source PDF to a new PDF file.
with open(output_filename, 'wb') as out:
    pdf_writer.write(out)
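The read, extract, and save steps can also be gathered into one function. The sketch below is an illustration rather than the method used in this tutorial: split_pdf is a hypothetical helper, it takes one-based page numbers, and it assumes PyPDF2 3.x, where the classes are named PdfReader and PdfWriter and pages are added with add_page.

```python
def split_pdf(input_path, output_path, first_page, last_page):
    """Copy pages first_page..last_page (1-based, inclusive) to a new PDF.

    split_pdf is a hypothetical helper; it assumes PyPDF2 3.x naming.
    """
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")

    # Imported lazily so the argument check above works even
    # where PyPDF2 is not installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(input_path)
    if last_page > len(reader.pages):
        raise ValueError("last_page is beyond the end of the document")

    writer = PdfWriter()
    # Convert 1-based page numbers to 0-based indices.
    for index in range(first_page - 1, last_page):
        writer.add_page(reader.pages[index])

    with open(output_path, "wb") as out:
        writer.write(out)
```

Under these assumptions, calling split_pdf("2022010614181218.pdf", "fengyijun.pdf", 3, 3) would write the third page of the source file to fengyijun.pdf.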
You can also use PyMuPDF to split a PDF file. Here is the tutorial:
Python Split and Merge PDF with PyMUPDF: A Completed Guide