Posts

  • You're All Sheep

    Made by Twittersheep, a new project built (in part) by my acquaintance Ted Roden, a creative technologist for New York Times Research & Development.
  • A Bird's Eye View of Archival Collections

    Mitchell Whitelaw is a Senior Lecturer in the Faculty of Design and Creative Practice at the University of Canberra and the 2008 winner of the National Archives of Australia's Ian Maclean Award. According to the NAA's site, the Ian Maclean Award commemorates archivist Ian Maclean, and is awarded to individuals interested in conducting research that will benefit the archival and historical profession in Australia and promote the important contribution that archives make to society. Dr. Whitelaw has been keeping the world up to date on his work using his blog, The Visible Archive. His work fits well with my colleague Jeanne Kramer-Smyth's archival data visualization project, ArchivesZ, as well as the multidimensional visualization projects underway at the Humanities Advanced Technology & Information Institute at the University of Glasgow. However, his project fascinates me for a few specific reasons. First of all, the datasets he's working with are astronomically larger than those that any other archival visualization project has tried to tackle so far.
  • API Fun: Visualizing Holdings Locations

    In my previous post, I included a screenshot of a prototype, but glossed over what it actually does. Given an OCLC record number and a ZIP code, it plots the locations of the nearest holdings of that item on a Google Map. Pulled off in Python (as all good mashups should be), along with SIMILE Exhibit, it uses the following modules: geopy, simplejson, web.py, and, of course, worldcat. If you want to try it out, head on over here. The current version of the code will soon be available as part of the examples directory in the distribution for worldcat, which can be found in my Subversion repository.
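    The "nearest holdings" step boils down to sorting geocoded libraries by distance from the starting point. Here is a minimal sketch of that one step, not the prototype's actual code: it assumes the holdings have already been fetched and geocoded, it does not call worldcat or geopy, and the names `haversine_km` and `nearest_holdings`, the library labels, and the hand-entered coordinates are all made up for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_holdings(origin, holdings, limit=3):
    """Return up to `limit` holdings, closest to `origin` first."""
    return sorted(holdings, key=lambda h: haversine_km(origin, h["latlon"]))[:limit]


# Hypothetical data: a point in midtown Manhattan standing in for a geocoded
# ZIP code, and three invented holding libraries with rough city coordinates.
origin = (40.7532, -73.9822)
holdings = [
    {"library": "Library A (Boston)", "latlon": (42.3601, -71.0589)},
    {"library": "Library B (Philadelphia)", "latlon": (39.9526, -75.1652)},
    {"library": "Library C (Chicago)", "latlon": (41.8781, -87.6298)},
]

for holding in nearest_holdings(origin, holdings, limit=2):
    print(holding["library"], round(haversine_km(origin, holding["latlon"])), "km")
```

    In a real mashup the sorted list would be handed to the map layer as JSON; the sorting itself needs nothing beyond the standard library.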
  • This Is All I'm Going To Say On This Here Blogsite Concerning The Brouhaha About The Policy for Use and Transfer of WorldCat Records Because I Have Other, More Interesting And More Complex Problems To Solve (And So Do You)

    The moderated discussion hosted and sponsored by Nylink went pretty well. Also, I don't need the records to have fun with the data - I just need robust APIs. (In fact, as I said today, I'd prefer not to have to deal with the MARC records directly.) Robust APIs would help make prototypes like this one I hacked together in a few hours into a real, usable service.
  • Lightening the load: Drupal and Python

    Man, if this isn't a "you got your peanut butter in my chocolate" thing or what! As I wrote over on the NYPL Labs blog, we've been up to our necks in Drupal at MPOW, and I've found that one of the great advantages of using it is rapid prototyping without having to write a whole lot of code. Again, that's how I feel about Python, too, but you knew that already. Once you've got a prototype built, how do you start piping stuff into it? In Drupal 6, a lot of the contrib modules to do this need work - most notably, I'm thinking about node_import, which as of yet still has no (official) CCK support for Drupal 6 and CCK 2. In addition, you could be stuck with having to write PHP code for the heavy lifting, but where's the joy in that? Well, it so happens that the glue becomes the solvent in this slow, slow dance.
  • dEAD Reckoning #1: A FaTHEADed Failure For Faceted Terms and Headings in EAD

    A while back, I wrote a Bad MARC Rant, and I considered titling this a Bad Metadata Rant. However, as the kids say, I got mad beef with a little metadata standard called Encoded Archival Description. Accordingly, I figured I should begin a new series of posts discussing some of these issues that I have with something that is, for better or for worse, a technological fixture of our profession. This is in part prompted by thoughts that I've had as a result of participating in EAD@10 and attending the Something New for Something Old conference sponsored by the PACSCL Consortial Survey Initiative. Anyhow, onto my first bone to pick with EAD. I'm incredibly unsatisfied with the controlled access heading tag <controlaccess/>. First of all, it can occur within itself, and because of this, I fear that there will be some sort of weird instance where I have to end up parsing a series of these tags 3 levels deep. Also, it can contain a <chronlist/>, which also seems pretty strange given that I've never seen any example of events being used as controlled access terms in this way.
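    To make the nesting worry concrete, here is a small sketch of what parsing self-nested <controlaccess> elements can look like. It assumes un-namespaced, DTD-style EAD 2002 markup; the sample fragment, the `TERM_TAGS` whitelist, and the `flatten_terms` helper are invented for illustration and are not taken from any real finding aid or tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with <controlaccess> nested three levels deep.
SAMPLE = """
<controlaccess>
  <head>Index Terms</head>
  <controlaccess>
    <head>Subjects</head>
    <subject>Railroads</subject>
    <controlaccess>
      <head>Places</head>
      <geogname>Philadelphia (Pa.)</geogname>
    </controlaccess>
  </controlaccess>
  <persname>Doe, Jane</persname>
</controlaccess>
"""

# Heading elements treated as access terms in this sketch.
TERM_TAGS = {
    "subject", "persname", "corpname", "famname", "geogname",
    "genreform", "function", "occupation", "name", "title",
}


def flatten_terms(element, depth=1):
    """Yield (depth, tag, text) for each term, recursing into nested controlaccess."""
    for child in element:
        if child.tag == "controlaccess":
            yield from flatten_terms(child, depth + 1)
        elif child.tag in TERM_TAGS:
            yield depth, child.tag, (child.text or "").strip()


terms = list(flatten_terms(ET.fromstring(SAMPLE)))
for depth, tag, text in terms:
    print(depth, tag, text)
```

    The recursion handles any depth, but the depth value has to be carried along only because the element is allowed to contain itself. Anything outside the whitelist, such as <head> or <chronlist>, is silently skipped here, which is itself a judgment call.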
  • Going off the Rails: Really Rapid Prototyping With Drupal

    Previously posted on http://labs.nypl.org/. The other Labs denizens and I are going off the rails on a crazy train deeper down the rabbit hole of reimplementing the NYPL site in Drupal. As I pile my work on the fire, I've found that building things in Drupal is easier than I'd ever thought it to be. It's a scary thought, in part because I'm no fan of PHP (the language of Drupal's codebase). Really, though, doing some things can be dead simple. It's a bit of a truism in the Drupal world at this point that you can build a heck of a lot just by using the CCK and Views modules. The important part is that you can build a heck of a lot without really having to know a whole lot of code. This is what threw me off for so long - I didn't realize that I was putting too much thought into building a model like I normally would with another application framework.
  • Does SAA Need To Support Who I Am?

    There's been a whole lot of discussion in the archivoblogosphere about the perceived need for quasi-informal interest groups that are fundamentally driven by identity. While I agree with this in theory, I must register my opposition to having SAA promote, support, or provide any sort of infrastructure for such groups. Fundamentally, I am against this because I believe it poses a strong threat to the privacy of archivists.
  • deliciouscopy: a dumb solution for a dumb problem

    You'd think there was some sort of tried and true script for Delicious users to repost bookmarks from their inboxes into their accounts, especially given that there are often shared accounts where multiple people will tag things as "for:foo" to have them show up on foo's Delicious account. Well, there wasn't, until now (at least as far as I could tell). Enter deliciouscopy. It uses pydelicious, as well as the Universal Feed Parser and simplejson. It reads a user's inbox, checks to see if the poster of the for:whomever tag was added to your network, and reposts accordingly, adding a via: tag for attribution. It even does some dead simple logging if you need that sort of thing. The code's all there, and GPL license blah blah blah. I hacked this together in about an hour for something at MPOW - namely to repost things to our shared account. It's based on Michael Noll's deliciousmonitor.py but diverges from it fairly quickly. Enjoy, and give any feedback if you must.
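    The filtering logic described above can be sketched without touching the network. This is not the deliciouscopy source: the `plan_reposts` function, the dictionary layout of an inbox entry, and the sample users are all assumptions, and no pydelicious or feedparser calls are shown. Dropping the for: tags before reposting is also an assumption about what you would want.

```python
def plan_reposts(inbox, network):
    """Pick inbox entries posted by people in `network` and tag each with via:<poster>."""
    reposts = []
    for entry in inbox:
        if entry["poster"] not in network:
            continue  # ignore anything sent by someone outside the network
        # Assumption: the for:<user> routing tags are not wanted on the repost.
        tags = [tag for tag in entry["tags"] if not tag.startswith("for:")]
        tags.append("via:" + entry["poster"])
        reposts.append({"url": entry["url"], "title": entry["title"], "tags": tags})
    return reposts


# Hypothetical inbox contents and network membership.
inbox = [
    {"poster": "alice", "url": "http://example.org/a", "title": "A",
     "tags": ["for:shared", "python"]},
    {"poster": "mallory", "url": "http://example.org/b", "title": "B",
     "tags": ["for:shared", "spam"]},
]
network = {"alice", "bob"}

planned = plan_reposts(inbox, network)
for repost in planned:
    print(repost["url"], " ".join(repost["tags"]))
```

    Keeping the decision in a pure function like this makes it easy to log or dry-run the plan before anything is actually posted to the shared account.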
  • Idle Hands Are The Devil's Plaything

    I've had my hands full lately. Two weeks ago I was at the MCN conference (wherein, among other things, I have continued my dominion as Archduke of Archival Description by taking over the MCN Standards SIG chair position from The Bancroft Library's Mary Elings), and next week I'm off to Philadelphia for the PACSCL Something New for Something Old conference. I hammered out the coherent, written version of the paper I gave at EAD@10. I prepared a proposal for next February's code4lib conference in Providence (ahem, vote for mine, if you're so inclined): Building on Galen Charlton's investigations into distributed version control systems for metadata management, I offer a prototype system for managing archival finding aids in EAD (Encoded Archival Description). My prototype relies on distributed version control to help archivists maintain transparency in their work and uses post-commit hooks to initiate indexing and publishing processes. In addition, this prototype can be generalized for any XML-based metadata schema. On top of that, I'm working with a fine group of folks on the RLG Programs project to analyze EAD editing and creation tools, doing hardcore schema mapping at work, and somehow finding enough time to play a little Doukutsu Monogatari to unwind.