How Not to Kill People With Spreadsheets


David Gerard at Foreign Policy: “The U.K.’s response to COVID-19 is widely regarded as scattershot and haphazard. So how did they get here?

Excel is a top-of-the-line spreadsheet tool. A spreadsheet is good for quickly modeling a problem—but too often, organizations cut corners and press the cardboard-and-string mock-up into production, instead of building a robust and unique system based on the Excel proof of concept.

Excel is almost universally misused for complex data processing, as in this case—because it’s already present on your work computer and you don’t have to spend months procuring new software. So almost every business has at least one critical process that relies on a years-old spreadsheet set up by past staff members that nobody left at the company understands.

That’s how the U.K. went wrong. An automated process at Public Health England (PHE) transformed the incoming private laboratory test data (which was in text-based CSV files) into Excel-format files, to pass to the Serco Test and Trace teams’ dashboards.

Unfortunately, the process produced XLS files—an outdated Excel format that went extinct in 2003—which had a limit of 65,536 rows, rather than the around 1 million-row limit in the more recent XLSX format. With several lines of data per patient, this meant a sheet could only hold 1,400 cases. Further cases just fell off the end.

Technicians at PHE monitoring the dashboards noticed on Oct. 2 that not all data that had been sent in was making it out the other end. The data was corrected the next day, and PHE announced the issue the day after.

It’s not clear if the software at PHE was an Excel spreadsheet or an in-house program using the XLS format for data interchange—the latter would explain why PHE stated that replacing it might take months—but the XLS format would have been used on the assumption that Excel was universal.

And even then, a system based on Excel-format files would have been an improvement over earlier systems—the system for keeping a count of COVID-19 cases in the U.K. was, as of May, still based on data handwritten on cards….

The process that went wrong was a workaround for a contract issue: The government’s contract with Deloitte to run the testing explicitly stipulated that the company did not have to report “Pillar 2” (general public testing) positive cases to PHE at all.

Since a test-and-trace system is not possible without this data, PHE set up feeds for the data anyway, as CSV text files directly from the testing labs. The data was then put into this system—the single system that serves as the bridge between testing and tracing, for all of England. PHE had to put in place technological duct tape to make a system of life-or-death importance work at all….

The Brookings Institution report Doomed: Challenges and solutions to government IT projects lists factors to consider when outsourcing government information technology. The outsourcing of tracking and tracing is an example where the government has assumed all of the risk, and the contractor assumes all of the profit. PHE did one thing that you should never do: It outsourced a core function. Running a call center or the office canteen? You can outsource it. Tracing a pandemic? You must run it in-house.

If you need outside expertise for a core function, use contractors working within a department. Competing with the private sector on pay can be an issue, but a meaningful project can be a powerful incentive….(More)”.