Python Parse RSS Feed with feedparser – A Beginner Guide – Python Tutorial

March 8, 2021

An RSS feed is an important source for capturing website content. In this tutorial, we will show how to parse an RSS feed XML file and extract the information we want using the Python feedparser library.

Install feedparser

We can install it with the pip command.

pip install feedparser


feedparser online documentation

The detailed feedparser documentation is here:

https://feedparser.readthedocs.io/en/latest/

Common RSS Elements

To parse an RSS XML file, we should know which elements are commonly used in RSS. They are:

title, link, description, publication date, and entry ID.

You can find more RSS elements here:

https://www.rssboard.org/rss-profile

Here is an example of an RSS XML file.

Figure: the structure of a WordPress RSS feed

How to parse an RSS feed using feedparser?

We will use an example to show how to do it.

import feedparser
d = feedparser.parse('https://www.tutorialexample.com/feed/')

In this example, we will parse our blog feed.

Print the number of articles

print(len(d['entries']))

For our blog feed, this prints 10.

Parse the first article

Note that d['entries'] is a Python list, and each element is a Python dictionary.

for k, v in d['entries'][0].items():
    print(k + " = " + str(v))

Run this code and you will get output like the following.

Figure: the output of an RSS feed entry
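Printing every key is useful for exploring, but usually we only want a few fields, and a feed may omit some of them. The sketch below uses .get() so that a missing key yields None instead of raising KeyError. pick_fields is a hypothetical helper, and the plain dictionary stands in for d['entries'][0], since feedparser entries support the same dictionary interface.

```python
def pick_fields(entry, keys=("title", "link", "published", "id")):
    """Return only the wanted keys from an entry, with None for missing ones."""
    return {key: entry.get(key) for key in keys}


# Stand-in for d['entries'][0]; the values are made up.
sample_entry = {
    "title": "First post",
    "link": "https://example.com/first-post",
    "summary": "Short summary of the first post.",
}

fields = pick_fields(sample_entry)
for k, v in fields.items():
    print(k + " = " + str(v))
```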

We can then extract the information we want, process it, and save it into our database. Here is a tutorial on that:

Python Select, Insert, Update and Delete Data from MySQL: A Completed Guide – Python Tutorial
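As a minimal sketch of the saving step, the example below stores a few entry fields in SQLite using Python's built-in sqlite3 module. SQLite is a stand-in chosen so the example runs without a database server; it is not the MySQL approach from the tutorial linked above. The table name articles, its columns, and the save_entries helper are assumptions, and the entries are made-up dictionaries in the shape feedparser returns.

```python
import sqlite3


def save_entries(conn, entries):
    """Insert entries into an 'articles' table and return the total row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        "id TEXT PRIMARY KEY, title TEXT, link TEXT, published TEXT)"
    )
    rows = [
        # Fall back to the link when an entry has no id.
        (e.get("id") or e.get("link"), e.get("title"), e.get("link"), e.get("published"))
        for e in entries
    ]
    # INSERT OR IGNORE skips rows whose id is already stored,
    # so re-running on the same feed does not create duplicates.
    with conn:
        conn.executemany("INSERT OR IGNORE INTO articles VALUES (?, ?, ?, ?)", rows)
    return conn.execute("SELECT COUNT(*) FROM articles").fetchone()[0]


# Made-up entries standing in for d['entries'].
entries = [
    {"id": "https://example.com/?p=1", "title": "First post",
     "link": "https://example.com/first-post",
     "published": "Mon, 08 Mar 2021 10:00:00 +0000"},
    {"id": "https://example.com/?p=2", "title": "Second post",
     "link": "https://example.com/second-post",
     "published": "Tue, 09 Mar 2021 10:00:00 +0000"},
]

conn = sqlite3.connect(":memory:")
total = save_entries(conn, entries)
total_again = save_entries(conn, entries)  # second run adds nothing new
print(total, total_again)
```

Using the entry id as the primary key is one simple way to avoid storing the same article twice when the feed is polled repeatedly.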
