  • pybhl: Accessing the Biodiversity Heritage Library's Data Using OpenURL and Python

    Via Twitter, I heard about the Biodiversity Heritage Library's relatively new OpenURL Resolver, announced on their blog about a month ago. More specifically, I heard about Matt Yoder's new Ruby library, rubyBHL, which exploits the BHL OpenURL Resolver to provide metadata about items in their holdings and does some additional screen scraping to return things like links to the OCRed version of the text. In typical fashion, I've ported Matt's library to Python and have released my code. pybhl is available from my site, PyPI, and GitHub. Use should be fairly straightforward, as seen below:

        >>> import pybhl
        >>> import pprint
        >>> b = pybhl.BHLOpenURLRequest(genre='book', aulast='smith', aufirst='john', date='1900', spage='5', volume='4')
        >>> r = b.get_response()
        >>> len(r['citations'])
        3
        >>> pprint.pprint(r['citations'][1])
        {u'ATitle': u'',
         u'Authors': [u'Smith, John Donnell,'],
         u'Date': u'1895',
         u'EPage': u'',
         u'Edition': u'',
         u'Genre': u'Journal',
         u'Isbn': u'',
         u'Issn': u'',
         u'ItemUrl': u'',
         u'Language': u'Latin',
         u'Lccn': u'',
         u'Oclc': u'10330096',
         u'Pages': u'',
         u'PublicationFrequency': u'',
         u'PublisherName': u'H.N. Patterson,',
         u'PublisherPlace': u'Oquawkae [Ill.] :',
         u'SPage': u'Page 5',
         u'STitle': u'',
         u'Subjects': [u'Central America', u'Guatemala', u'Plants', u''],
         u'Title': u'Enumeratio plantarum Guatemalensium imprimis a H.
  • API Fun: Visualizing Holdings Locations

    In my previous post, I included a screenshot of a prototype, but glossed over what it actually does. Given an OCLC record number and a ZIP code, it plots the locations of the nearest holdings of that item on a Google Map. Pulled off in Python (as all good mashups should be), along with SIMILE Exhibit, it uses the following modules: geopy, simplejson, and, of course, worldcat. If you want to try it out, head on over here. The current version of the code will soon be available as part of the examples directory in the distribution for worldcat, which can be found in my Subversion repository.
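
    To make the "nearest holdings" step concrete, here is a minimal sketch in plain Python. It is not the prototype's code: it calls neither geopy nor worldcat, and it assumes the ZIP code and the holding libraries have already been turned into latitude/longitude pairs. The function names and the sample libraries are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_holdings(origin, holdings, limit=3):
    """Return up to `limit` holdings sorted by distance from `origin`."""
    def distance(holding):
        return haversine_km(origin[0], origin[1],
                            holding["lat"], holding["lon"])
    return sorted(holdings, key=distance)[:limit]


# Hypothetical, already-geocoded sample data (not real WorldCat output).
origin = (40.7532, -73.9822)  # roughly midtown Manhattan
holdings = [
    {"name": "Library A", "lat": 42.3601, "lon": -71.0589},  # Boston
    {"name": "Library B", "lat": 40.7295, "lon": -73.9965},  # lower Manhattan
    {"name": "Library C", "lat": 39.9526, "lon": -75.1652},  # Philadelphia
]

for holding in nearest_holdings(origin, holdings, limit=2):
    print(holding["name"])
```

    The sorted list is what would then be handed to the map layer; how the prototype passes points to SIMILE Exhibit is not shown here.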
  • Lightening the load: Drupal and Python

    Man, if this isn't a "you got your peanut butter in my chocolate" thing or what! As I wrote over on the NYPL Labs blog, we've been up to our necks in Drupal at MPOW, and I've found that one of the great advantages of using it is rapid prototyping without having to write a whole lot of code. Again, that's how I feel about Python, too, but you knew that already. Once you've got a prototype built, how do you start piping stuff into it? In Drupal 6, a lot of the contrib modules to do this need work - most notably, I'm thinking about node_import, which as of yet still has no (official) CCK support for Drupal 6 and CCK 2. In addition, you could be stuck with having to write PHP code for the heavy lifting, but where's the joy in that? Well, it so happens that the glue becomes the solvent in this slow, slow dance.
  • deliciouscopy: a dumb solution for a dumb problem

    You'd think there was some sort of tried and true script for Delicious users to repost bookmarks from their inboxes into their accounts, especially given that there are often shared accounts where multiple people will tag things as "for:foo" to have them show up on foo's Delicious account. Well, there wasn't, until now (at least as far as I could tell). Enter deliciouscopy. It uses pydelicious, as well as the Universal Feed Parser and simplejson. It reads a user's inbox, checks to see if the poster of the for:whomever tag has been added to your network, and reposts accordingly, adding a via: tag for attribution. It even does some dead simple logging if you need that sort of thing. The code's all there, and GPL license blah blah blah. I hacked this together in about an hour for something at MPOW - namely to repost things to our shared account. It's based on Michael Noll's but diverges from it fairly quickly. Enjoy, and give any feedback if you must.
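
    The inbox-to-repost logic described above can be sketched without touching the network. This is an illustration only, not deliciouscopy's actual code: it does not use pydelicious or the Universal Feed Parser, the function name and record fields are hypothetical, and dropping the for: tag from the repost is my assumption.

```python
def select_reposts(inbox, network, recipient):
    """Build repost records for inbox items sent by users in `network`.

    `inbox` is a list of dicts with 'url', 'title', 'poster' and 'tags'
    keys (a hypothetical shape, not the real feed format).
    """
    for_tag = "for:" + recipient
    reposts = []
    for item in inbox:
        # Skip anything posted by someone outside the trusted network.
        if item["poster"] not in network:
            continue
        # Assumption: the for: tag is dropped and a via: tag is appended.
        tags = [tag for tag in item["tags"] if tag != for_tag]
        tags.append("via:" + item["poster"])
        reposts.append({"url": item["url"],
                        "title": item["title"],
                        "tags": tags})
    return reposts


# Hypothetical sample inbox.
inbox = [
    {"url": "http://example.org/a", "title": "A", "poster": "alice",
     "tags": ["for:foo", "python"]},
    {"url": "http://example.org/b", "title": "B", "poster": "mallory",
     "tags": ["for:foo", "spam"]},
]

for repost in select_reposts(inbox, network={"alice"}, recipient="foo"):
    print(repost["url"], repost["tags"])
```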
  • Python WorldCat Module v0.1.2 Now Available

    In preparation for the upcoming WorldCat Hackathon starting this Friday, I've made a few changes to worldcat, my Python module for interacting with OCLC's APIs. Most notably, I've added iterators for SRU and OpenSearch requests, which (like the rest of the module) painfully need documentation. It's available either via download from my site or via PyPI; please submit bug reports to the issue tracker as they arise. EDIT: I've bumped up the version number another micro number to 0.1.2 as I've just added the improvements mentioned by Xiaoming Liu on the WorldCat DevNet Blog (LCCN query support, support for tab-delimited and CSV responses for xISSNRequests, and support for PHP object responses for all xIDRequests). EDIT: Thanks to Thomas Dukleth, I was told that code for the Hackathon was to be licensed under the BSD License. Accordingly, I've now dual licensed the module under both GPL and BSD.
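
    For readers wondering what an iterator over a paged search buys you, here is a rough sketch of the general pattern. It is not the worldcat module's implementation or API: iter_records and fake_fetch are hypothetical names, and the fetch function stands in for a real SRU or OpenSearch request. The only protocol detail borrowed is that SRU record positions start at 1.

```python
def iter_records(fetch_page, page_size=10):
    """Yield records one at a time, requesting further pages as needed.

    `fetch_page(start, count)` is a caller-supplied function returning a
    list of at most `count` records beginning at 1-based position `start`.
    """
    start = 1
    while True:
        page = fetch_page(start, page_size)
        if not page:
            return
        for record in page:
            yield record
        # A short page means the result set is exhausted.
        if len(page) < page_size:
            return
        start += len(page)


# Stand-in for a remote service: 25 fake records served in slices.
catalog = ["record-%d" % n for n in range(1, 26)]
calls = []


def fake_fetch(start, count):
    calls.append(start)
    return catalog[start - 1:start - 1 + count]


for record in iter_records(fake_fetch, page_size=10):
    pass  # each record arrives here without the caller tracking offsets
```

    The caller just loops; the bookkeeping about where the next page starts stays inside the generator.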
  • Introducing djabberdjaw

    djabberdjaw is an alpha-quality Jabber bot written in Python that uses Django as an administrative interface to manage bot and user profiles. I've included a couple of plugins out of the box that will allow you to perform queries against Z39.50 targets and OCLC's xISBN API (assuming you have the requisite modules). djabberdjaw requires Django 1.0 or later, jabberbot, and xmpppy. It's available either from PyPI (including using easy_install) or via Subversion. You can browse the Subversion repository, too.
  • Python WorldCat API module now available

    I'd like to humbly announce that I've written a pre-pre-alpha Python module for working with the WorldCat Search API and the xID APIs. The code needs a fair amount of work, namely unit tests and documentation. I've released the code under the GPL. The module, called "worldcat", is available from the Python Package Index. You can also check out a copy of the code from my Subversion repository.
  • Easy Peasy: Using the Flickr API in Python

    Since I'm often required to hit the ground running at $MPOW on projects, I was a little concerned when I roped myself into assisting our photo archives with a Flickr project. The first goal was to get a subset of the photos uploaded, and quickly. Googling and poking around the Cheeseshop led me to Beej's FlickrAPI for Python. Little did I know that it would be dead simple to get this project going. To authenticate:

        import flickrapi

        def create_session(api_key, api_secret):
            """Creates a session using FlickrAPI."""
            session = flickrapi.FlickrAPI(api_key, api_secret)
            # Request write permission; without a token the user must authorize first
            (token, frob) = session.get_token_part_one(perms='write')
            if not token:
                raw_input("Hit return after authorizing this program with Flickr")
            session.get_token_part_two((token, frob))
            return session

    That was less painful than the PPD test for tuberculosis. Oh, and uploading?

        flickr.upload(filename=fn, title=title, description=desc, tags=tags, callback=status)

    Using this little bit of code plus a few other tidbits, I created an uploader that parses CSV files of image metadata exported from an Access database. And when done, the results look a little something like this.
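
    The CSV-parsing half of such an uploader might look something like the sketch below. It is not the original uploader: the column names (filename, title, description, tags), the semicolon-separated tag column, and both function names are assumptions made for illustration, and no call to Flickr is made. It is written for Python 3, unlike the Python 2 snippets above.

```python
import csv
import io


def format_tags(raw):
    """Turn 'a;b c' into a space-separated tag string, quoting multi-word tags."""
    tags = [tag.strip() for tag in raw.split(";") if tag.strip()]
    return " ".join('"%s"' % tag if " " in tag else tag for tag in tags)


def rows_to_uploads(csv_file):
    """Yield one dict of upload keyword arguments per CSV row."""
    for row in csv.DictReader(csv_file):
        yield {
            "filename": row["filename"].strip(),
            "title": row["title"].strip(),
            "description": row["description"].strip(),
            "tags": format_tags(row["tags"]),
        }


# Hypothetical export; a real one would be opened from disk instead.
sample = (
    "filename,title,description,tags\n"
    "img001.jpg,Main Reading Room,Interior view,library;reading room\n"
    "img002.jpg,Front Steps,Exterior view,library\n"
)

for kwargs in rows_to_uploads(io.StringIO(sample)):
    print(kwargs)
```

    Each dict lines up with the keyword arguments of the upload call shown above, so it could be passed along as flickr.upload(callback=status, **kwargs).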