Fix PyMuPDF RuntimeError: cycle in page tree – Python PDF Operation

By | August 7, 2019

PyMuPDF can raise RuntimeError: cycle in page tree when you iterate over a PDF page by page. In this tutorial, we will show you how to fix this problem.

RuntimeError - cycle in page tree

Example Code:

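import fitz  # PyMuPDF

pdf = "F:\\114848.pdf"

doc = fitz.open(pdf)

# Extract plain text and HTML from every page.
# Newer PyMuPDF releases name this method get_text().
for page in doc:
    text = page.getText("text")
    html_text = page.getText("html")
    #print(text)
    #print(html_text)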

Running this code raises RuntimeError: cycle in page tree.

Locate the error page

page_num = 0
for page in doc:
    page_num += 1
    print(page_num)
    text = page.getText("text")
    html_text = page.getText("html")

From the output, we can see that the error is reported at page 110.

Checking the PDF file, we find that page 110 itself is fine. However, the next page, 111, is broken: it contains nothing.
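The search for the failing page can also be wrapped in a small helper. The sketch below is only an illustration: first_failing_page and fake_loader are hypothetical names that are not part of PyMuPDF, and the stand-in loader simulates a damaged page so the example runs without a PDF file.

```python
def first_failing_page(page_count, load_page_text):
    """Return (page_number, error_message) for the first page that fails.

    page_number is 1-based. Returns None if every page loads.
    """
    for index in range(page_count):
        try:
            load_page_text(index)
        except RuntimeError as err:
            return index + 1, str(err)
    return None


# Stand-in loader for demonstration only: pretends page index 2 is damaged.
def fake_loader(index):
    if index == 2:
        raise RuntimeError("cycle in page tree")
    return f"text of page {index + 1}"


result = first_failing_page(5, fake_loader)
print(result)
```

With a real document, the call might look like first_failing_page(len(doc), lambda i: doc[i].getText("text")), assuming the page count itself can still be read from the damaged file.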

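To work around this error, we can wrap the loop in a try...except statement. This does not repair the PDF file: the loop stops at the first page that fails, but the script no longer crashes.

The fixed code is shown below: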

page_num = 0
try:
    for page in doc:
        page_num += 1
        print(page_num)

        text = page.getText("text")
        html_text = page.getText("html")
        #print(text)
        #print(html_text)

except Exception as e:
    # The loop stops at the first page that cannot be read.
    print(e)
print("end")
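A possible variation is to catch the error per page instead of around the whole loop, so that pages after the broken one are still attempted. This is a minimal sketch under assumptions: extract_all_pages and fake_loader are hypothetical names, and whether later pages of a file with a damaged page tree can actually be loaded is not something this tutorial has verified.

```python
def extract_all_pages(page_count, load_page_text):
    """Collect text per page and record failures instead of stopping.

    Returns (texts, failed): texts maps 1-based page numbers to text,
    failed is a list of (page_number, error_message) tuples.
    """
    texts = {}
    failed = []
    for index in range(page_count):
        try:
            texts[index + 1] = load_page_text(index)
        except RuntimeError as err:
            failed.append((index + 1, str(err)))
    return texts, failed


# Stand-in loader for demonstration only: pretends page index 2 is damaged.
def fake_loader(index):
    if index == 2:
        raise RuntimeError("cycle in page tree")
    return f"text of page {index + 1}"


texts, failed = extract_all_pages(4, fake_loader)
print(sorted(texts))
print(failed)
```

With a real document, the loader might be lambda i: doc[i].getText("text"). Loading pages by index keeps one bad page from ending the iteration, at the cost of an extra list of failures to check afterwards.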
