Ever to Excel: Towards an Apologetics of the Spreadsheet
This is the written version of my presentation from Code4lib 2016 in Philadelphia, on March 8, 2016. My presentation was part of a panel with my friends Christina Harlow, Ted Lawless, and Matt Zumwalt, after which we had some discussion moderated by Matt Miller. My slides are available, as is the video of all talks from the panel.
Spreadsheet software is a ubiquitous technology. This ubiquity encourages us to use spreadsheets as a default starting point for lots of different kinds of tasks. In fact, let’s consider how you use spreadsheets. How many of you use spreadsheets on, say, a monthly basis? How many of you use them on a weekly basis? How about on a daily basis? And, more specifically, how do spreadsheets make you feel when you use them? And how do other people talk about spreadsheets, even if they know that you use them regularly? I have noticed that spreadsheets are often seen as the wrong tool for lots of jobs, particularly in libraries, archives and museums. I certainly know that I’ve been guilty of holding this view on previous software projects I’ve worked on, and I’ve started to realize that it wasn’t a particularly generous way of approaching user needs. So, if you’ve ever felt slighted by me in this regard, please accept my sincerest apologies.
Before we get too much further, I should probably articulate what I mean by a spreadsheet. A spreadsheet is a file with a specific form, which represents data in a tabular format, and within which formulas can be applied to that data. Spreadsheets are read and interpreted within the environment of spreadsheet software, such as Google Sheets or Microsoft Excel. In the context of spreadsheet software, the data in tabular format is addressable through coordinate-based references to cells, or through address- or category-based ranges. I want to emphasize, however, that tabular data on its own does not qualify as a spreadsheet, meaning that a freestanding CSV-based dataset doesn’t qualify either. Nonetheless, such datasets can easily be “converted” to, or understood using, the paradigm of the spreadsheet by loading them into spreadsheet software, or into an environment that provides somewhat similar functionality, such as OpenRefine.
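To make the idea of coordinate-based references concrete, here is a minimal sketch in Python of how an address like B3 or a range like A1:B2 might be resolved to positions in a grid. The function names `parse_cell` and `expand_range` are hypothetical, chosen for illustration; this is not how any particular spreadsheet application implements addressing.

```python
import re

# Hypothetical helpers for A1-style addressing; the names are illustrative only.
_CELL = re.compile(r"^([A-Z]+)([1-9][0-9]*)$")


def parse_cell(address):
    """Convert an address such as 'B3' into zero-based (row, column) coordinates."""
    match = _CELL.match(address.upper())
    if match is None:
        raise ValueError(f"not a cell address: {address!r}")
    letters, digits = match.groups()
    column = 0
    for letter in letters:
        # Column letters behave like a base-26 number with no zero digit:
        # A=1 ... Z=26, AA=27, and so on.
        column = column * 26 + (ord(letter) - ord("A") + 1)
    return int(digits) - 1, column - 1


def expand_range(cell_range):
    """List the (row, column) pairs covered by a range such as 'A1:B2'."""
    start, end = cell_range.split(":")
    (r1, c1), (r2, c2) = parse_cell(start), parse_cell(end)
    return [
        (row, col)
        for row in range(min(r1, r2), max(r1, r2) + 1)
        for col in range(min(c1, c2), max(c1, c2) + 1)
    ]
```

Under these assumptions, `parse_cell("B3")` yields row 2, column 1, and `expand_range("A1:B2")` lists the four cells of the top-left square, row by row. A bare CSV file has the rows and columns, but nothing in the file itself gives its cells these addresses or lets formulas refer to them.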
This specific consideration allows us to see spreadsheets as the unification of tabularly expressed information and formulas. To put it plainly, as a form of information, a spreadsheet can be seen as both data and a program. This combination is a handy packaging format: it makes it easy to ship your data, together with the calculations and methods you use to process or analyze it, to others who can then work with both. The increased ubiquity of spreadsheet software on personal computers over time has expanded the potential for end users to work with data directly in flexible and independent ways, and the unification of data and program has allowed for effective collaboration within groups of users working in a common domain.
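The notion of a spreadsheet as both data and program can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a description of any real spreadsheet engine: the `evaluate` function and the `budget` sheet are hypothetical, and formulas are written as Python functions rather than in a formula language.

```python
# A hypothetical, minimal "sheet": a dict mapping cell addresses to either
# plain values (the data) or functions of the sheet (the formulas).
def evaluate(sheet, address):
    """Return a cell's value, computing formulas on demand."""
    cell = sheet[address]
    return cell(sheet) if callable(cell) else cell


budget = {
    "A1": 1200.0,  # rent (data)
    "A2": 300.0,   # utilities (data)
    "A3": lambda s: evaluate(s, "A1") + evaluate(s, "A2"),  # like =A1+A2
    "A4": lambda s: evaluate(s, "A3") * 12,                 # like =A3*12
}

annual_before = evaluate(budget, "A4")
budget["A2"] = 350.0                   # change one input value...
annual_after = evaluate(budget, "A4")  # ...and the dependent cells follow
```

Because the values and the formulas live in the same object, handing someone `budget` hands them both the figures and the method for deriving the totals, which is the packaging property described above.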
This is a longstanding trend, fostered by the underlying design of spreadsheets as an application tailored towards end user needs from their inception, as becomes clear when we review the history of spreadsheet software development for personal computers. Arguably, the spreadsheet revolution originated with the development and release of VisiCalc by Software Arts. In his 1984 article in Harper’s on the emergence of spreadsheet software, Steven Levy describes the idea as originating with Dan Bricklin, VisiCalc’s co-creator, while he was working on a class assignment in an MBA program that required him to complete ledger sheets to analyze the implications of a merger between businesses.1 More specifically, Burton Grad describes Bricklin’s motivation as also stemming from his frustration that then-current methods of programming were failing to meet his real-time needs, given the interruption and time it took to run even a simple program in a timesharing environment.2
In addition, Levy notes that Bricklin’s desire to make an end-user-focused version of a spreadsheet, designed to run on a personal computer as a potential means to simplify this analysis, was met with derision by his professors. In Levy’s words: “Why would a manager want to do a spreadsheet on one of those ‘toy’ computers? What were secretaries and accountants and the people down in [data processing] for?”3
VisiCalc’s co-creators, Bricklin and Bob Frankston, have acknowledged many times over that the product was designed in part to give more power to the user. While Bricklin himself has noted that neither he nor Frankston was the originator of computer-based spreadsheets in general, he describes the value that VisiCalc provided as a what-you-see-is-what-you-get model of interactivity, wherein a user could point to change values, and see figures automatically recalculated based upon stored formulas.4
More broadly, VisiCalc also heralded a change in how users could approach computers. Martin Campbell-Kelly describes a paper given by Frankston in which he compared the potential of VisiCalc to the case of the telephone system, whose projected growth between the 1930s and the 1950s would lead to an unsustainable number of operators. Ultimately, the resolution in the case of the telephone was to provide telephone users with the ability to dial telephones themselves; similarly, Frankston realized that “VisiCalc made everyone a programmer.”5 Likewise, Bricklin cites Jean-Louis Gassée’s description of his encounter with VisiCalc: “That was the day I realized that you didn’t have to be a programmer any more to use a computer. … Approximations, trial and error, simulations – Visicalc is an intellectual modeling clay. It lets you program without knowing it.”6 The power of VisiCalc, and arguably of spreadsheets overall, then lies in this fundamental tension, wherein programming is perceived to be both fundamentally accessible and simultaneously unnecessary to leverage the potential of the technology.
In this case, another strength of spreadsheet software on personal computers, particularly early applications like VisiCalc, is that it provided this power and control to end users with a degree of constraint within the user interface. Bonnie Nardi and James Miller have identified two specific factors that explain why spreadsheets work so well: in their words, “computational techniques that match users’ tasks and that shield users from the low-level details of traditional programming; [and] a table-oriented interface that serves as a model for users’ applications.”7 In other words, these are the formula languages targeted at users trying to get stuff done and the constraint of the grid-based format. Specifically, the visual constraint of the spreadsheet’s tabular format for data and formulas gives users a clearer understanding of how to complete their work at hand. In Nardi and Miller’s words,
As a user begins developing a spreadsheet, the tabular grid provides an overarching structure into which the parameters and variables of a model are cast. As the spreadsheet begins to take shape, the user views the emerging model and evaluates its accuracy and completeness. Within the framework of the rows and columns the user can restructure the model by re-arranging rows and columns and by adding new parameters as they become known. A spreadsheet model is grounded in the distinct tabular format of rows and columns, and is constructed in successive approximations as the user critiques the emerging model.8
Frankston himself notes that “the grid was key”9 to the usability of VisiCalc, which provided a simplified representation of the structure in which users would be working:
If anything, the big break with the VisiCalc was the grid that actually reduced the amount of interactivity. The original design was much more flexible and powerful. The key was the grid, which gave you a framework for reference and simplified everything. It’s almost the opposite of what people think—people think that more features make a design easier. It’s actually a reduction in features that make it feasible. Later, as people became used to it, you could add features, but the basic idea was reducing the grid.10
In her recent book Calm Technology, Amber Case states that “a product that utilizes the right amount of technology becomes invisible more quickly.”11 By constraining options in terms of data representation up front, spreadsheets let users and developers address the work at hand more directly.
Looking at VisiCalc and beyond, Martin Campbell-Kelly describes the personal computer spreadsheet sector as being notably responsive to user needs and usability concerns as the market emerged. For example, in describing Lotus 1-2-3, Campbell-Kelly writes that usability-related development was most often targeted at providing user support in environments that lacked an in-house information technology support staff. Not only did Lotus 1-2-3 provide macros, which assisted end users in automating repetitive tasks, but Lotus also provided extensive documentation and infrastructure to support tutorials, as well as a “development kit” that allowed 1-2-3 users to easily spin up user groups intended to be broader than those previously targeted at “computer professionals.”
In addition, Lotus ultimately encouraged the development of “add-ons” for 1-2-3, such as spreadsheet templates and additional enhancement applications that augmented the functionality of the application. Campbell-Kelly indicates that while Lotus was originally resistant to encouraging the development of these add-ons, the company came to recognize that these complementary products addressed user needs that a single company was unlikely to meet, and that a regional “spreadsheet industry” developed in the metropolitan Boston area over time. Campbell-Kelly thus demonstrates that “by the end of 1986, Lotus had a very definite vision of 1-2-3 as a technological system,” that is, “a platform consisting of the spreadsheet ‘engine’ that could support ‘an entire system of products.’”12
By extension, spreadsheets and the other actors that interact with them as boundary objects can also be seen as a socio-technical system. This is visible not only in Lotus 1-2-3’s ecosystem of add-ons and the vendors that developed and marketed them, but also in the collaborative nature of spreadsheet development as described by Bonnie Nardi and James Miller. Using ethnographic methods, Nardi and Miller establish in their research that the creation of spreadsheets is most often a site of “co-development”, wherein spreadsheets come about through the efforts of a group of individuals working closely together. In addition, Nardi and Miller view spreadsheets as an excellent medium for supporting such co-development, as they serve as a conduit for communication, particularly in terms of how they “support design, development and use by people with different levels of both programming and domain knowledge.”13 I believe that this “bridging” function across levels and areas of domain knowledge is incredibly important, in that it improves the potential for dialog around the work at hand, and such co-development can allow for sustained collaboration in the longer term, allowing trust to be built across disparate teams.
In summary, spreadsheets as they’ve developed over time offer a useful lesson about the types of interaction and collaboration we should be considering around the software, products and services we develop. I am deeply curious to hear why we haven’t been able to address this in our profession, and why we’re unable to develop transformative tools that empower users to work more closely together to solve common problems. Spreadsheets remain present in our domain of cultural heritage precisely because they provide an adequate degree of constraint and power, particularly in institutions with small staffs, lower resourced institutions, or institutions where it’s frankly not feasible to develop custom solutions that are seen as well-architected. I think that these same kinds of considerations in terms of collaboration and usability can account for the success of other applications like OpenRefine, which is not, strictly speaking, a spreadsheet application itself. Such tools do just enough to allow us to do our work; they make us feel powerful and productive; and we can work closely with others to establish a shared network of solutions. Conversely, denying users the ability to work with the tools in which they feel the most productive demonstrates a lack of empathy and understanding, and can frankly hinder efforts to develop a healthy collaborative environment between users and developers. I expect this community to carefully consider how productive and empowering tools are developed, when constraints themselves add value and the likelihood of comprehension, and how to ensure that we can build on this legacy developed over more than 35 years.
Steven Levy, “A Spreadsheet Way of Knowledge.” Harper’s, November 1984. Republished online at https://backchannel.com/a-spreadsheet-way-of-knowledge-8de60af7146e.
Burton Grad, “The Creation and the Demise of VisiCalc,” IEEE Annals of the History of Computing, July-September 2007, 21.
Dan Bricklin, “Was VisiCalc the ‘first’ spreadsheet?” http://www.bricklin.com/firstspreadsheetquestion.htm. Accessed March 6, 2016.
Quoted in Martin Campbell-Kelly, “Number Crunching without Programming: The Evolution of Spreadsheet Usability.” IEEE Annals of the History of Computing, July-September 2007, 7.
Jean-Louis Gassée, The Third Apple, quoted in Bricklin.
Bonnie A. Nardi and James R. Miller, “The spreadsheet interface: A basis for end-user programming.” In D. Diaper et al. (Eds.), Human-Computer Interaction: INTERACT ’90. Amsterdam: North-Holland, 1990. Republished online at http://www.miramontes.com/writing/spreadsheet-eup/.
Nardi and Miller 1990.
Grad, 25.
Software History Center Oral History Project. “Personal computer (PC) software workshop: VisiCalc.” Ed Bride, Moderator. May 6, 2004, Needham, Mass. Computer History Museum, 2004. CHM Reference number: X4276.2008. Available at http://www.computerhistory.org/collections/catalog/102658146.
Amber Case, Calm Technology: Principles and Patterns for Non-Intrusive Design. Sebastopol, CA: O’Reilly, 2015.
Campbell-Kelly, 13.
Bonnie A. Nardi and James R. Miller. “Twinkling lights and nested loops: Distributed problem solving and spreadsheet development.” International Journal of Man-Machine Studies 34 (1991), 161-184. Republished online at http://www.miramontes.com/writing/twinklinglights/.