Excel logjam skews latest Covid-19 results
The sudden rise in Covid-19 cases reported in the UK on Friday was caused by the storage limitations of an Excel spreadsheet used to feed results to the government’s public dashboard.
[Updated at 6pm on 5 Oct to include the secretary of state's explanation to the House of Commons, material about the Excel versions used and some corrections to clarify elements of the Pillar 2 operation.]
Health secretary Matt Hancock confirmed to Parliament on Monday afternoon that as a result of an “ongoing incident”, 15,841 outstanding test results had not been included in the government’s reported total of confirmed UK Covid-19 cases until Sunday morning.
The cause of the discrepancy was a “failure in the automated daily upload” from test labs to Public Health England (PHE) databases, the minister said.
PHE clarified that the data blockage was caused by data from the public Pillar 2 side of the Test & Trace system. The volume of results overwhelmed the data limits of the Excel spreadsheet set up to receive it.
The files have now been split into smaller multiple files to prevent the issue happening again, PHE said. Since the Excel 2007 edition, Microsofts spreadsheet was expanded to allow for just over 1m+ rows and 16,000+ columns. According to some sources, the spreadsheets within the collection chain may have been on an even older version of Excel that could only contain 65,000 rows per sheet.
Pushed for more detail by shadow health secretary John Ashworth, Hancock said: “The problem emerged in a Public Health England legacy system. We had already decided in July to replace this system. I commissioned a new system to replace it and the contract was awarded in August. Work is already underway. We have to immediately solve the problem and upgrade the system.”
“Every person tested was told in the normal way in the normal time frame and told to self-isolate,” Hancock told Parliament. Because the records had not been recorded in official data, contact tracing for those cases did not start until Sunday. Within two days, 51% of those affected had been reached by contact tracers, he added.
A note on the government coronavirus dashboard explained offered further background: “An issue was identified overnight on Friday 2 October in the automated process that transfers positive cases data to PHE. It has now been resolved.
“The cases by publish date for 3 and 4 October include 15,841 additional cases with specimen dates between 25 September and 2 October — they are therefore artificially high for England and the UK.”
The ministerial update followed on the heels of what the Daily Mail called a furious blame game over the weekend between PHE and testing supremo Dido Harding.
For those who have been watching closely, however, the news was less of a surprise than the result of a poorly managed initiative that has been causing concern in medical and Westminster circles since early on in the crisis.
Shadow health secretary John Ashworth asked Hancock whether he had signed off a system “that was made up of fragmented systems passing data back and forth between his department, Public Health England and outsourcing companies like Serco and Deloitte and it’s costing us £12bn”.
The capacity, efficiency and reliability of the Test & Trace system has been a persistent concern since March, when the government decided to bring in private contractors to overcome testing bottlenecks within the NHS and public health infrastructure.
The regime is split into two halves: Pillar 1 testing is undertaken on behalf of PHE and the health service in NHS labs.
Pillar 2 represents the community-based testing effort outsourced to Deloitte. This side of the testing fence takes in home-testing kits and the 50 drive-in and walk-in centres established around the country in the past few months that send their results to seven “lighthouse labs” managed by Serco.
Leaving aside the widely reported under-capacity and booking problems, those involved in Pillar 2 testing found it difficult to track results as more tests were undertaken during the summer, Sky News reported back in July.
The British Medical Association bemoaned the unnecessary complexity and chaos surrounding the Test & Trace regime last month, complaining in its journal that GPs and local authorities were unable to receive timely, detailed information on tests conducted in privately-run sites, despite the government’s commitment to link Pillar 2 data with patient medical records.
“The Deloitte contract does not oblige the company to share detailed data with PHE or local authorities,” the BMJ article noted.
Sir David Norgrove, the chair of the UK Statistics Authority, warned the health secretary in June “the figures (on testing) are still far from complete and comprehensible”.
According to Sky, Deloitte built a database to connect the test laboratories to the National Pathology Exchange (NPEx) so that data could be shared with the NHS. A government document on testing information data privacy confirmed that Deloitte “hosts and maintains the digital platform that the test tracking system sits on” and is responsible for making data available to the NHS.
In a page on the firm’s website, Deloitte Digital partner Joel Bellman commented: “The booking system was designed, built and live within three weeks.
“The digital systems have to link up with every other part of the programme – the public, the courier firms who deliver kits, the testing sites, the labs who process the tests, the parts of the NHS that report your result back to you and on to other parts of the NHS.”
For a firm that should know better, having a spreadsheet involved anywhere in a data consolidation and reporting process represents a fundamental internal control weak spot. In an age of application programming interfaces and Power BI, any halfway competent practitioner should be able to set up an end-to-end reporting dashboard.
A Deloitte source denied any involvement in the “spreadsheet thing” and, like PHE, directed AccountingWEB to the Department of Health.
So who designed the reporting and dashboard preparation mechanisms? And who would have reviewed them to ensure they were robust enough to handle such important and sensitive data?
Matt Hancock avoided answering such questions in Parliament, but before giving the OK for the replacement consolidation system, perhaps he and his team should have a look at AccountingWEB’s Power BI tutorials, which explain how Microsoft’s query and analysis tools can be used to overcome Excel’s data integrity shortcomings.
Our Power BI correspondent Hugh Johnson offered a few pointers: “If each of these 50 units are the same and produce the same output, you would just have an Excel output from each system and consolidate them into Power BI in a twinkle. It’s not a problem – it’s easy.
“It’s easy on outside to look in and draw conclusions. But Excel is the most popular software ever built by anybody. It’s probably what whoever was doing it at the time was familiar with, because Excel can do everything. But it was just not designed for that.”
You might also be interested in
AccountingWEB’s interim Editor in Chief has been with the site since 1999 and returned to the editorial hot seat in March 2020 to lead the hunt for a long-term successor... Send a DM if you're interested! When not tending to the needs of AccountingWEB members and geeking out on their technology habits, he devotes much of his time to his oddball...