mail2rss.pl

mail2rss.pl is a rewrite of a script by Nick Gerakines.

His script was a good starting point, but Feedvalidator.org reported that the RSS it produced was invalid. The areas in which it failed seemed important enough to fix so that the feed would adhere to the standard.

Disclaimer

I am not responsible for any ill effects on the privacy or security of you or any system on which you run this script. Don't place the RSS file in a publicly accessible web directory without realizing the implications of doing so.

Modifications

Nick's version of mail2rss.pl produced a feed that failed validation on four major points:

  1. A missing version attribute in the <rss> tag. This attribute is required for a feed to be valid, and adding it to the code was trivial.

  2. An invalid <guid> tag. The GUID must be a full URL unless the isPermaLink attribute is set to false. This was also an easy fix. While I was at it, I changed the GUID to an MD5 hash of the sender and the value of time().

  3. An invalid URI in the <link> tag. The original set the <link> to "1", and that just didn't sit right with me. The link needed to begin with an IANA-registered URI scheme, so I sifted through them for a while until I settled on the mid: URI scheme for message IDs. I had to add another variable for procmail to pull the Message-ID (the mid) and a corresponding flag for the script. The mid had to be scrubbed of spaces and other extraneous characters before mid: was prepended. I have to admit that this approach is a bit kludgy, given that it produces a link to the message referenced by the mid: URI. These links won't be handled properly by most RSS readers, and I'd dare say that they won't be handled properly by any. Nonetheless, it's a valid reference to the message in question.

  4. An invalid <pubDate>, i.e. one not in RFC 822 format. The original script just called time(), which returns the current epoch time. After a quick Google search I came across a code snippet that converts the result of time() into an RFC 822-compliant date.
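
For illustration, fixes 2 through 4 might look roughly like the sketch below. This is not the code from mail2rss.pl itself: the helper names (make_guid, make_mid_link, rfc822_date) are hypothetical, the scrubbing rules for the Message-ID are an assumption about what "extraneous characters" covers, and the use of isPermaLink="false" is one way to satisfy the validator with a non-URL GUID. The date is built by hand rather than with strftime so that day and month names do not depend on the system locale.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: hash the sender and a timestamp into an opaque GUID.
# The result is not a URL, so the <guid> element needs isPermaLink="false".
sub make_guid {
    my ($sender, $when) = @_;
    return md5_hex($sender . $when);
}

# Hypothetical helper: turn a raw Message-ID header value into a mid: URI.
# Assumption: scrubbing means removing whitespace and the surrounding angle
# brackets. A fuller version would also percent-encode unsafe characters.
sub make_mid_link {
    my ($message_id) = @_;
    $message_id =~ s/\s+//g;      # drop spaces, tabs, newlines
    $message_id =~ s/^<|>$//g;    # drop the enclosing < and >
    return "mid:$message_id";
}

my @DAYS   = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical helper: format an epoch time as an RFC 822 date in GMT.
# A four-digit year is used, which the RSS 2.0 specification prefers.
sub rfc822_date {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($epoch);
    return sprintf('%s, %02d %s %04d %02d:%02d:%02d GMT',
        $DAYS[$wday], $mday, $MONTHS[$mon], $year + 1900,
        $hour, $min, $sec);
}

# Example use with made-up values for the sender and Message-ID.
my $now = time();
printf "<guid isPermaLink=\"false\">%s</guid>\n",
    make_guid('sender@example.com', $now);
printf "<link>%s</link>\n", make_mid_link('<abc123@example.com>');
printf "<pubDate>%s</pubDate>\n", rfc822_date($now);
```

Keeping each fix in its own small function makes it easy to check the output of one element against the validator without touching the others.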

In addition, I cleaned up the code here and there, but nothing worth mentioning overall. If you're really curious, run diff on the two scripts.

Usage

Copy the Perl script and procmail script to the appropriate server. The example procmail script calls bmf (Bayesian Mail Filter) first, which helps to weed out the spam. After moving identified spam to the appropriate mailbox, it reads data from the message and calls the Perl script.

Files