
Drupal For Archivists: Documenting the Asian/Pacific American Community with Drupal

Over the course of the last academic year, I have been part of a team working on a survey project aimed at identifying and describing archival collections relating to the Asian and Pacific American community in the New York City metropolitan area. The results of our survey of fifty-plus collections have been posted on our Drupal-powered website. Drupal has been an excellent fit for the needs of this project and has also helped us address many of the challenges the project has presented.

By way of introduction, this survey project seeks to address the underrepresentation of East Coast Asian/Pacific Americans in historical scholarship and archival repositories. It does so by working with community-based organizations and individuals to survey their records and raise awareness within the community about the importance of documenting and preserving their histories. Funded by a Documentary Heritage Project grant from METRO: Metropolitan New York Library Council, the project is a collaborative effort between the Asian/Pacific/American Institute and the Tamiment Library/Robert F. Wagner Labor Archives at NYU. Three graduate students (I-Ting Emily Chu, Nancy Ng Tam and I) were hired to do the survey work.

For close to nine months, we dug through the basements, closets and storage facilities of artists, activists, scholars and collectors. We visited the offices of arts organizations, theatre companies and social service agencies. We looked at paper files, stage props, moving image materials, digital photographs and emails. Despite the diversity of institutions, people and materials we encountered, a common theme began to emerge.

Due to the nature of many of the organizations we worked with and the cost of space in New York City, many of the collections were spread across several different locations, including private apartments and other spaces inaccessible to the public. This problem is even more acute on a larger level: there is no significant archival repository in the NYC area dedicated to collecting documentation of the Asian/Pacific American community.

The website began primarily as a way to publicize the project and fulfill the grant requirements. However, as we began thinking about the site's structure, content and audience, we realized that we had the potential to do something far more interesting: to build a research center for scholars and members of the Asian/Pacific American community, and to bring together these physically dispersed collections via standardized descriptions. I was introduced, through Mark's timely prodding, to the wonders of Drupal at DrupalCamp NYC and quickly realized that this project would be a perfect application for Drupal, since we were dealing largely with structured data and wanted the ability to present that data in a variety of ways.

With Mark's good advice and the assistance of Brian Hoffman of NYU's Digital Library Technology Services, I was able to get a site up and running in a few weeks. The majority of the site's content is based on three content types, built with the Content Construction Kit module. The main content type, Archival Resource, contains collection-level information including dates, extent, language, arrangement, an abstract and a scope and content note. The Archival Resource content type also links to an Entity content type via a node reference field. This Entity content type describes the person or corporate body responsible for creating the records, including dates of existence, authorized form of name, and a historical/biographical note. A Location content type, with repository-specific information such as address, hours and contact person, is also tied to the Archival Resource content type via a node reference field.
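For readers who think more easily in code, the relationships among the three content types can be sketched outside Drupal as plain data structures. The Python below is only an illustration, not Drupal or CCK code: the class names, field names and sample values are hypothetical, chosen to mirror the fields described above, and it assumes a single creator and a single location per collection for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """Person or corporate body responsible for creating the records."""
    authorized_name: str
    dates_of_existence: str
    history_note: str = ""


@dataclass
class Location:
    """Repository-specific details for the place a collection is held."""
    name: str
    address: str
    hours: str = ""
    contact_person: str = ""


@dataclass
class ArchivalResource:
    """Collection-level description that points at a creator and a location."""
    title: str
    dates: str
    extent: str
    language: str
    abstract: str
    creator: Entity      # stands in for a node reference field
    location: Location   # stands in for a node reference field


# Placeholder data, not real survey records.
creator = Entity("Example Theatre Company", "1975-")
office = Location("Example Office", "123 Example St., New York, NY")

papers = ArchivalResource(
    "Administrative files", "1975-2005", "12 linear feet",
    "English", "Placeholder abstract.", creator, office,
)
props = ArchivalResource(
    "Stage props", "1980-1999", "40 items",
    "English", "Placeholder abstract.", creator, office,
)

# The location is entered once and shared, so one correction
# is visible from every collection that references it.
office.hours = "By appointment"
print(papers.location.hours, "/", props.location.hours)
```

Because both collections refer to the same Entity and Location objects rather than carrying their own copies, a change made in one place shows up everywhere, which is the practical benefit of the node reference approach.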

Taken together, the three content types amount to the front matter for a finding aid. Separating the content into three types avoided repeated entry of the same data, which in turn prevented wasted effort and data inconsistency. Drupal also allows for field-level data validation and formatting, which dramatically reduces the chance of human error in data entry. This was especially important because a number of people were responsible for creating content. The display of the content is controlled through the Views module, which lets us programmatically create displays ranging from a collection list with brief abstracts to a complete view of the survey data, all from the same data.
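The idea of several displays driven by one set of records can be illustrated with a small Python sketch. This is not how the Views module is implemented or configured; the function names, record keys and sample values are all hypothetical and exist only to show the principle.

```python
# Placeholder records, not real survey data.
records = [
    {
        "title": "Example Collection B",
        "dates": "1990-2001",
        "extent": "3 linear feet",
        "abstract": "Placeholder abstract for collection B.",
    },
    {
        "title": "Example Collection A",
        "dates": "1980-1995",
        "extent": "5 linear feet",
        "abstract": "Placeholder abstract for collection A.",
    },
]


def brief_list(items):
    """One line per collection (title plus abstract), sorted by title."""
    ordered = sorted(items, key=lambda r: r["title"])
    return [f"{r['title']}: {r['abstract']}" for r in ordered]


def full_view(record):
    """Every field of a single record, one labelled line per field."""
    return "\n".join(
        f"{key.capitalize()}: {value}" for key, value in record.items()
    )


# Two different displays, generated from the same underlying records.
print("\n".join(brief_list(records)))
print()
print(full_view(records[0]))
```

Neither function stores its own copy of the content, so a correction to a record appears in both the list and the full view the next time they are generated.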

We also created four very simple taxonomies - ethnic context, geographic coverage, organization type and person type - and applied these to the collection-level description in the Archival Resource content type. These taxonomies allow users to browse the collections via facets, a critical function on a site that aims to expose hidden collections.
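As a rough illustration of what browsing by facet involves, the following Python sketch counts how many collections carry each term in a vocabulary and then narrows the list to one term. It is not Drupal taxonomy code: the function names, the dictionary layout and the term values are assumptions made up for the example, and the site's actual vocabulary terms are not reproduced here.

```python
from collections import Counter

# Placeholder collections tagged with invented terms.
collections = [
    {
        "title": "Example Collection A",
        "terms": {
            "geographic_coverage": ["Manhattan"],
            "organization_type": ["Theatre company"],
        },
    },
    {
        "title": "Example Collection B",
        "terms": {
            "geographic_coverage": ["Queens", "Manhattan"],
            "organization_type": ["Social service agency"],
        },
    },
    {
        "title": "Example Collection C",
        "terms": {
            "geographic_coverage": ["Queens"],
            "person_type": ["Artist"],
        },
    },
]


def facet_counts(items, vocabulary):
    """Count how many collections carry each term in one vocabulary."""
    counts = Counter()
    for item in items:
        counts.update(item["terms"].get(vocabulary, []))
    return counts


def filter_by_term(items, vocabulary, term):
    """Keep only the collections tagged with the given term."""
    return [
        item for item in items
        if term in item["terms"].get(vocabulary, [])
    ]


# Show the available terms with counts, then narrow to one of them.
for term, count in sorted(facet_counts(collections, "geographic_coverage").items()):
    print(f"{term} ({count})")

for item in filter_by_term(collections, "geographic_coverage", "Queens"):
    print(item["title"])
```

A collection with no terms in a given vocabulary is simply skipped, so sparse tagging does not break the browse list.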

For this project, the real strength of Drupal has been its ability to handle structured data in flexible and powerful ways via customizable content types. Having developed a number of static HTML sites in the hazy past, I've also been grateful for the way Drupal separates the development of infrastructure and function from the generation of content. This has given others a significant hand in creating the site's content, and has spared me the dubious responsibility of being the only person who can update the site.

The site is still very new, and we're looking for ways to publicize it, generate more content, and create a user community. The survey project will continue for another year, thanks to a funding extension, and additional descriptions will be added during this time.