
PyRSS2Gen

A Python library for generating RSS 2.0 feeds.

[Download PyRSS2Gen-1.1.tar.gz]

Requires Python 2.3 or later. (Uses the datetime module for timestamps.) Also works under Python 3.x.

To install:

  % python setup.py install

This uses the standard Python installer. For more details, read the installer guide. (And there's only one file, so you could just copy it wherever you need it.)

The documentation was written in 2003, which is why the examples are a bit dated. Don't let that dissuade you! It's now 2012 and many people are still using the package. There have been (minor) bug fixes during that time, and even a port to Python 3.


I've finally decided to catch up with 1999 and play around a bit with RSS. I looked around, and while there are many ways to read RSS there are remarkably few which write it. I could use a DOM or other construct, but I want the code to feel like Python. There are more Pythonic APIs I might use, like the effbot's ElementTree, but I also wanted integers, dates, and lists to be real integers, dates, and lists. (And I want bug-eyed monsters from Alpha Centauri to be *real* bug-eyed monsters from Alpha Centauri - is that too much, I ask you?)

The RSS generators I found were built around print statements. Workable, but they almost invariably left out proper HTML escaping, the sort of thing which led Mark Pilgrim to write feed_parser to make sense of documents which are neither XML nor HTML. Annoying, but sadly all too common.

So I messed around a bit with the spec.

The result looks like this:

import datetime
import PyRSS2Gen

rss = PyRSS2Gen.RSS2(
    title = "Andrew's PyRSS2Gen feed",
    link = "http://www.dalkescientific.com/Python/PyRSS2Gen.html",
    description = "The latest news about PyRSS2Gen, a "
                  "Python library for generating RSS2 feeds",

    lastBuildDate = datetime.datetime.now(),

    items = [
       PyRSS2Gen.RSSItem(
         title = "PyRSS2Gen-0.0 released",
         link = "http://www.dalkescientific.com/news/030906-PyRSS2Gen.html",
         description = "Dalke Scientific today announced PyRSS2Gen-0.0, "
                       "a library for generating RSS feeds for Python.  ",
         guid = PyRSS2Gen.Guid("http://www.dalkescientific.com/news/"
                          "030906-PyRSS2Gen.html"),
         pubDate = datetime.datetime(2003, 9, 6, 21, 31)),
       PyRSS2Gen.RSSItem(
         title = "Thoughts on RSS feeds for bioinformatics",
         link = "http://www.dalkescientific.com/writings/diary/"
                "archive/2003/09/06/RSS.html",
         description = "One of the reasons I wrote PyRSS2Gen was to "
                       "experiment with RSS for data collection in "
                       "bioinformatics.  Last year I came across...",
         guid = PyRSS2Gen.Guid("http://www.dalkescientific.com/writings/"
                               "diary/archive/2003/09/06/RSS.html"),
         pubDate = datetime.datetime(2003, 9, 6, 21, 49)),
    ])

rss.write_xml(open("pyrss2gen.xml", "w"))

The output does not contain newlines, so if you want to read it, you'll need to use your favorite XML tools to reformat it.
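
As one way to do that reformatting, here is a minimal sketch using the standard library's xml.dom.minidom. The XML string is a hand-written stand-in for the single-line output, not actual PyRSS2Gen output, and the two-space indent is an arbitrary choice.

```python
import xml.dom.minidom

# Hand-written stand-in for the single-line XML a feed generator emits.
compact = ('<rss version="2.0"><channel>'
           '<title>Example feed</title>'
           '</channel></rss>')

# Parse the document, then re-serialize it with one element per line.
pretty = xml.dom.minidom.parseString(compact).toprettyxml(indent="  ")
print(pretty)
```

In practice you would read the string from the file you wrote instead of using a literal.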

RSS is not a fixed format. People are free to add various metadata, like Dublin Core elements.

The RSS objects are converted to XML using the 'publish' method, which takes a SAX2 ContentHandler. If you want different output, implement your own 'publish'. The "simple" data types, which take a string, int, or date, can be replaced with a publishable object, so you can add metadata to, say, the "description" field. To support new elements for RSS2 and RSSItem, derive from them and use the 'publish_extensions' hook. To add your own attributes (needed for namespace declarations), redefine 'element_attrs' or 'rss_attrs' in your subclass.
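
To make the idea of a publishable object concrete, here is a rough sketch of one. The class name LangDescription and the xml:lang attribute are made up for illustration and are not part of PyRSS2Gen. The object is driven directly by the standard library's XMLGenerator (a SAX2 ContentHandler), so the snippet runs without the package installed.

```python
import io
from xml.sax.saxutils import XMLGenerator

class LangDescription:
    """Hypothetical publishable object: a description with an xml:lang attribute."""
    def __init__(self, text, lang):
        self.text = text
        self.lang = lang

    def publish(self, handler):
        # The object emits its own element, so it controls the attributes.
        handler.startElement("description", {"xml:lang": self.lang})
        handler.characters(self.text)  # the handler escapes &, < and >
        handler.endElement("description")

out = io.StringIO()
LangDescription("Fish & chips", "en").publish(XMLGenerator(out))
xml_text = out.getvalue()
print(xml_text)
```

Going by the description above, an instance like this could be passed as the "description" argument in place of a plain string.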

To use a different encoding, create your own ContentHandler instead of using the helper methods 'to_xml' and 'write_xml'. You'll need to make sure the 'characters' method in the handler does the appropriate translation.
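
A minimal sketch of that approach, assuming the standard library's XMLGenerator is an acceptable ContentHandler for your needs. The helper write_encoded and the stand-in class TitleOnly are hypothetical names used only for this example; a real feed object would take the place of TitleOnly.

```python
import io
from xml.sax.saxutils import XMLGenerator

class TitleOnly:
    """Hypothetical stand-in for any object with a 'publish' method."""
    def publish(self, handler):
        handler.startElement("title", {})
        handler.characters("Caf\u00e9 news")
        handler.endElement("title")

def write_encoded(publishable, outfile, encoding):
    # XMLGenerator is a ContentHandler that encodes text as it writes it
    # to a binary stream and records the encoding in the XML declaration.
    handler = XMLGenerator(outfile, encoding)
    handler.startDocument()
    publishable.publish(handler)
    handler.endDocument()

buf = io.BytesIO()
write_encoded(TitleOnly(), buf, "utf-8")
data = buf.getvalue()
print(data)
```

Here XMLGenerator does the character translation itself; a hand-written handler would have to do the same work in its 'characters' method.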

The "categories" list is somewhat special. It needs to be a list and doesn't have a publish method. That's because the RSS spec doesn't have an explicit concept for the set of categories -- an RSS2 channel can have 0 or more 'category' elements, but doesn't have a "list of categories" -- my "categories" attribute is an API fiction.

BUGS:

Several people have used this package since its first release in September of 2003 and reported a couple of bugs. All those are fixed. There are no known bugs.

The name PyRSS2Gen is a mouthful. I didn't think it was useful to come up with a cute name. You might consider having

   import PyRSS2Gen as RSS2

in any code which uses this module. I'm not changing the name because anyone who reads "RSS2" will likely think it's a parser and not a generator. Plus, the current name is very easy to find via a web search.

LICENSE

This is copyright (c) by Andrew Dalke Scientific, AB (previously Dalke Scientific Software, LLC) and released under the BSD license. See the file LICENSE in the distribution for details.

CHANGES for 1.1

Released August 25, 2012.

CHANGES for 1.0

Released November 6, 2005.

CHANGES for 0.1.1

Released in September 2003.



Copyright © 2001-2020 Andrew Dalke Scientific AB