Atoma
Atom, RSS and JSON feed parser for Python 3.
Quickstart
Install Atoma with pip:
pip install atoma
Load and parse an RSS XML file:
>>> import atoma
>>> feed = atoma.parse_rss_file('rss-feed.xml')
>>> feed.description
'The blog relating the daily life of web agency developers'
>>> len(feed.items)
5
Parsing feeds from the Internet is easy as well:
>>> import atoma, requests
>>> response = requests.get('http://lucumr.pocoo.org/feed.atom')
>>> feed = atoma.parse_atom_bytes(response.content)
>>> feed.title.value
"Armin Ronacher's Thoughts and Writings"
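The example above calls the Atom parser because the URL is known to serve Atom. When the format is not known in advance, one option is to pick the parser from the response's Content-Type header. The following is a minimal sketch under stated assumptions: `guess_feed_format` and `parse_feed_bytes` are hypothetical helpers, not part of Atoma, and `parse_rss_bytes` and `parse_json_feed_bytes` are assumed to mirror `parse_atom_bytes`, so check the names against the installed version. Servers often send a generic type such as `text/xml`, in which case the helper deliberately gives up rather than guessing.

```python
from typing import Optional

# Media types commonly used for each feed format (not an exhaustive list).
_MEDIA_TYPES = {
    'application/atom+xml': 'atom',
    'application/rss+xml': 'rss',
    'application/feed+json': 'json_feed',
    'application/json': 'json_feed',
}


def guess_feed_format(content_type: Optional[str]) -> Optional[str]:
    """Return 'atom', 'rss' or 'json_feed', or None if the header is inconclusive."""
    if not content_type:
        return None
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(';', 1)[0].strip().lower()
    return _MEDIA_TYPES.get(media_type)


def parse_feed_bytes(data: bytes, content_type: Optional[str]):
    """Hypothetical helper: dispatch to an Atoma parser based on the header."""
    import atoma  # imported lazily so guess_feed_format works without it

    # Assumption: these three functions exist with a single bytes argument.
    parsers = {
        'atom': atoma.parse_atom_bytes,
        'rss': atoma.parse_rss_bytes,
        'json_feed': atoma.parse_json_feed_bytes,
    }
    feed_format = guess_feed_format(content_type)
    if feed_format is None:
        raise ValueError('cannot determine feed format from %r' % content_type)
    return parsers[feed_format](data)
```

With requests, this could be called as `parse_feed_bytes(response.content, response.headers.get('Content-Type'))`.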
Features
- RSS 2.0 (RSS 2.0 Specification)
- Atom Syndication Format v1 (RFC 4287)
- JSON Feed v1 (JSON Feed specification)
- OPML 2.0, to share lists of feeds (OPML 2.0 specification)
- Typed: feeds decomposed into meaningful Python objects
- Secure: uses defusedxml to load untrusted feeds
- Compatible with Python 3.6+
Security warning
If you use this library to display content from feeds in a web page, you MUST sanitize the HTML contained in the feeds to prevent cross-site scripting (XSS). The bleach library is recommended for cleaning feeds.
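As an illustration of the warning above, here is a minimal sketch of a sanitizing step to run on feed HTML before rendering it. `clean_feed_html` is a hypothetical helper, not part of Atoma. It relies on `bleach.clean` with its default allowlist, and falls back to escaping all markup when bleach is not installed. The fallback is an assumption made to keep the example self-contained. A real application would probably configure the allowed tags and attributes to suit its pages.

```python
import html

try:
    import bleach  # third-party: pip install bleach
except ImportError:
    bleach = None  # fall back to escaping everything


def clean_feed_html(fragment: str) -> str:
    """Return a version of `fragment` that is safe to embed in an HTML page."""
    if bleach is not None:
        # With default settings, bleach keeps a small allowlist of tags
        # and escapes everything else, including <script>.
        return bleach.clean(fragment)
    # Without bleach, escape all markup so it renders as literal text.
    return html.escape(fragment)


untrusted = '<p>Hi</p><script>alert(1)</script>'
safe = clean_feed_html(untrusted)
```

Fields that typically carry HTML, such as an RSS item description or Atom entry content, are the ones to pass through a function like this before they reach a template.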
Useful Resources
To use this library, a basic understanding of feeds is required. For Atom, the Introduction to Atom is a must-read. RFC 4287 can help resolve some ambiguities. Finally, the feed validator is great for testing hand-crafted feeds.
For RSS, the RSS specification and rssboard.org have a ton of information and examples.
For OPML, the OPML specification has a paragraph dedicated to its usage for syndication.
Non-implemented Features
Some seldom-used features are not implemented:
- XML signature and encryption
- Some Atom and RSS extensions
- Atom content other than text, html and xhtml
License
MIT