Save content
Have you found this content useful? Use the button above to save it to your profile.
Sharp spike in Covid-19 cases looks worse due to Excel data logjam
Coronavirus dashboard_DHSC

Excel logjam skews latest Covid-19 results


The sudden rise in Covid-19 cases reported in the UK on Friday was caused by the storage limitations of an Excel spreadsheet used to feed results to the government’s public dashboard. 

5th Oct 2020
Save content
Have you found this content useful? Use the button above to save it to your profile.

[Updated at 6pm on 5 Oct to include the secretary of state's explanation to the House of Commons, material about the Excel versions used and some corrections to clarify elements of the Pillar 2 operation.]

Health secretary Matt Hancock confirmed to Parliament on Monday afternoon that as a result of an “ongoing incident”, 15,841 outstanding test results had not been included in the government’s reported total of confirmed UK Covid-19 cases until Sunday morning.

The cause of the discrepancy was a “failure in the automated daily upload” from test labs to Public Health England (PHE) databases, the minister said.

PHE clarified that the data blockage was caused by data from the public Pillar 2 side of the Test & Trace system. The volume of results overwhelmed the data limits of the Excel spreadsheet set up to receive it.

The files have now been split into smaller multiple files to prevent the issue happening again, PHE said. Since the Excel 2007 edition, Microsofts spreadsheet was expanded to allow for just over 1m+ rows and 16,000+ columns. According to some sources, the spreadsheets within the collection chain may have been on an even older version of Excel that could only contain 65,000 rows per sheet.

Pushed for more detail by shadow health secretary John Ashworth, Hancock said: “The problem emerged in a Public Health England legacy system. We had already decided in July to replace this system. I commissioned a new system to replace it and the contract was awarded in August. Work is already underway. We have to immediately solve the problem and upgrade the system.”

“Every person tested was told in the normal way in the normal time frame and told to self-isolate,” Hancock told Parliament. Because the records had not been recorded in official data, contact tracing for those cases did not start until Sunday. Within two days, 51% of those affected had been reached by contact tracers, he added.

A note on the government coronavirus dashboard explained offered further background: “An issue was identified overnight on Friday 2 October in the automated process that transfers positive cases data to PHE. It has now been resolved.

“The cases by publish date for 3 and 4 October include 15,841 additional cases with specimen dates between 25 September and 2 October — they are therefore artificially high for England and the UK.”

The ministerial update followed on the heels of what the Daily Mail called a furious blame game over the weekend between PHE and testing supremo Dido Harding.

For those who have been watching closely, however, the news was less of a surprise than the result of a poorly managed initiative that has been causing concern in medical and Westminster circles since early on in the crisis.

Shadow health secretary John Ashworth asked Hancock whether he had signed off a system “that was made up of fragmented systems passing data back and forth between his department, Public Health England and outsourcing companies like Serco and Deloitte and it’s costing us £12bn”.

Persistent concerns

The capacity, efficiency and reliability of the Test & Trace system has been a persistent concern since March, when the government decided to bring in private contractors to overcome testing bottlenecks within the NHS and public health infrastructure.

The regime is split into two halves: Pillar 1 testing is undertaken on behalf of PHE and the health service in NHS labs.

Pillar 2 represents the community-based testing effort outsourced to Deloitte. This side of the testing fence takes in home-testing kits and the 50 drive-in and walk-in centres established around the country in the past few months that send their results to seven “lighthouse labs” managed by Serco.

Leaving aside the widely reported under-capacity and booking problems, those involved in Pillar 2 testing found it difficult to track results as more tests were undertaken during the summer, Sky News reported back in July.

The British Medical Association bemoaned the unnecessary complexity and chaos surrounding the Test & Trace regime last month, complaining in its journal that GPs and local authorities were unable to receive timely, detailed information on tests conducted in privately-run sites, despite the government’s commitment to link Pillar 2 data with patient medical records.

“The Deloitte contract does not oblige the company to share detailed data with PHE or local authorities,” the BMJ article noted.

Sir David Norgrove, the chair of the UK Statistics Authority, warned the health secretary in June “the figures (on testing) are still far from complete and comprehensible”.

Deloitte’s role

According to Sky, Deloitte built a database to connect the test laboratories to the National Pathology Exchange (NPEx) so that data could be shared with the NHS. A government document on testing information data privacy confirmed that Deloitte “hosts and maintains the digital platform that the test tracking system sits on” and is responsible for making data available to the NHS.

In a page on the firm’s website, Deloitte Digital partner Joel Bellman commented: “The booking system was designed, built and live within three weeks.

“The digital systems have to link up with every other part of the programme – the public, the courier firms who deliver kits, the testing sites, the labs who process the tests, the parts of the NHS that report your result back to you and on to other parts of the NHS.”

For a firm that should know better, having a spreadsheet involved anywhere in a data consolidation and reporting process represents a fundamental internal control weak spot. In an age of application programming interfaces and Power BI, any halfway competent practitioner should be able to set up an end-to-end reporting dashboard.

A Deloitte source denied any involvement in the “spreadsheet thing” and, like PHE, directed AccountingWEB to the Department of Health.

So who designed the reporting and dashboard preparation mechanisms? And who would have reviewed them to ensure they were robust enough to handle such important and sensitive data?

Matt Hancock avoided answering such questions in Parliament, but before giving the OK for the replacement consolidation system, perhaps he and his team should have a look at AccountingWEB’s Power BI tutorials, which explain how Microsoft’s query and analysis tools can be used to overcome Excel’s data integrity shortcomings.

Our Power BI correspondent Hugh Johnson offered a few pointers: “If each of these 50 units are the same and produce the same output, you would just have an Excel output from each system and consolidate them into Power BI in a twinkle. It’s not a problem – it’s easy.

“It’s easy on outside to look in and draw conclusions. But Excel is the most popular software ever built by anybody. It’s probably what whoever was doing it at the time was familiar with, because Excel can do everything. But it was just not designed for that.”

Replies (7)

Please login or register to join the discussion.

Ivor Windybottom
By Ivor Windybottom
05th Oct 2020 14:35

Or more accurately put about using spreadsheets for this purpose:

Thanks (1)
Replying to Ivor Windybottom:
By Richard Grant
05th Oct 2020 15:16

A good link but I would have to point out that I doubt very much that Dido Harding had anything to do with the use of spreadsheets, she's only been there a few weeks and PHE have been cocking this up for the last six months.

Thanks (0)
By Richard Grant
05th Oct 2020 14:45

Actually the "daily" cases shown above of 22,961 is still wrong. Those apparently occurred over a week or two, so not only can't they use Excel, they can't interpret or present the data either.
Maybe time for a mass sacking?

Thanks (0)
Hugh Johnson
By Hugh Johnson
06th Oct 2020 23:29

I gather that the problem was even worse than I first imagined when the news broke. It seems that the Excel version used at each of the 50 centres was Excel 2003 (or earlier). I suspect this was the result of penny-pinching procurement. This would limit each centre spreadsheet to report about 14,000 cases (based on a 65k row limit of Excel 2003). This is no where near enough. The latest versions of Excel can handle 1 million rows (enough I would expect for each centre, but not enough for a national view). For the final consolidation, Power BI would be able to handle up to 2 billion rows. Anyone in public procurement, worried about the cost of an Excel desktop licence might perhaps be interested to learn that the equivalent Power BI Desktop licence is FREE :)

Thanks (1)
Replying to hughjohnson:
By philaccountant
07th Oct 2020 11:53

Proprietary software such as excel should always be avoided in the public sphere where possible. You will always end up in a situation where the people charged with procurement won't want to spend money on the upgrades and will eek years of life out of something well past its sell by date.

Microsoft has had far too much of a grip on our public IT infrastructure and is among the most bloated, unreliable and insecure software around.

Thanks (1)
Replying to hughjohnson:
By PChapman
07th Oct 2020 13:11

actually its much simpler than that
they've just been saving as .xls rather than the newer .xlsx format
excel automatically keeps compatibility with the older standard when you do
OR they are using something else that saves as .xls - same limitation applies!

Either way it is a rediculous oversite!

Thanks (1)
By nekillim
07th Oct 2020 10:11

We need to know the 'names' of the people who made these mistakes, so they can be held to account, and if necessary dismissed! Not the Departments or companies!

Thanks (0)