TAPAS.network | 1 August 2024 | Editorial Opinion | Peter Stonham

The Machine Stops

Peter Stonham

THERE ARE certain industries that, due to their time-critical nature, service delivery structure, and user characteristics and expectations, are particularly susceptible to any system downtime or unpredictable interruptions to service.

Transport and logistics have become a prime example in our modern digital world, meaning everything from passenger transport services to traffic control and freight distribution is in the front line for any IT system failure. This is ever more so with the growth of digital dependency and reliance on the internet for communication, messaging and data transfer, both to manage service provision and to support interaction with customers.

Last week a global IT outage brought chaos across airlines, airports, some train operators and other retail, financial and healthcare services. The cybersecurity company CrowdStrike has admitted responsibility for a faulty software update that affected 8.5 million Microsoft Windows computers, which displayed ‘blue screens of death’.

Shortly after, incidents of a different kind, but similar consequence, occurred on the TGV rail network in France, when arson attacks on lineside communications infrastructure brought services to a halt. This was a physical intervention, but no less related to the dependence on digital capacity for systems to operate.

In our digital-dependent and highly connected society, such incidents are increasingly likely. One failure begets others and a break in connectivity disrupts and devalues the whole network. Last week we found out what happens if systems go offline, with the highly publicised incident demonstrating how a major system failure can cripple operations, frustrate passengers and tarnish reputations.

Whilst the cause in this case was specific, it was hardly unique. Under the right (or rather wrong) conditions, similar scenarios could unfold in many situations at weak points within the digital chain or, more worryingly, through an organised contamination across key system providers.

Consequences can be more than just inconvenience, extending to safety, protection of transactions, critical security data, and contamination and decay of core support systems, not to mention public confidence in future operations, with all those involved incurring reputational risk.

A core truth is that Information Technology has radically changed the nature of transport and travel. Driverless cars and trains, automated traffic controls, and connected highways are just some examples of how the transport sector has adopted the Internet of Things, AI systems in place of human ones, and customer self-managed bookings, payments and journey planning at an astonishing rate. However, with each advancement and investment comes increased reliance on the Internet, which those familiar with its inner workings can affirm introduces another level of unpredictability that needs proactive management.

Transport companies have learned that they must have diagnostic indicators to predict failures before they happen, substitute systems in place to kick in when the main ones stop working, and insurance both to spread risk and to offer redress when customers are left out in the cold. One key issue is the question of liability, and the associated matter of denial of responsibility through the loophole of treating cyber disruption as outwith human control, an ‘Act of God’.

If a supplier — commercial or a public body — chooses to dispense with human actors and implement technological and artificial intelligence mechanisms, that is a decision with consequences, but not a removal of responsibility. A digital disruption has another significant impact in disabling the ability to monitor the performance and positioning of transport assets — and their payloads — i.e. planes, trains, cars, and trucks. These are moving assets which in normal times are increasingly tracked and visible on display systems, but suddenly ‘lost’ to view. This lack of connectivity means it becomes very difficult to determine if there is an operational issue with one or more vehicles, and to take appropriate action.

The transport sector is already susceptible to shocks and stresses (e.g., extreme weather, protests, and pandemics) that make resiliency fundamental to managing risk in operations. On top of that, any unreliability of the internet or communications-carrying systems means potentially severe disruption.

But it’s not just disconnection and suspension of operations and the corruption of internal systems that can be costly. When Google Maps went down in August 2022, it took out several apps that rely on its API to deliver directions. For example, Uber and Lyft rely on Google Maps data to provide real-time information about traffic conditions and other factors impacting drivers’ ability to pick up riders. With these apps affected by the outage, their drivers could not pick up customers.

One problem clearly demonstrated by last week’s outage is establishing the cause of the disruption. On this occasion, it was relatively easy to point the finger at CrowdStrike for its faulty update. But it might not always be so simple. The problem is compounded when it is not clear whether the issue lies in communicating data, a contamination of the operating system, a network connectivity fault, or some other mysterious or malignant cause.

In a detailed review of the incident, CrowdStrike said there was a “bug” in a system designed to ensure software updates work properly. CrowdStrike said the glitch meant “problematic content data” in a file went undetected. The company said it could prevent the incident from happening again with better software testing and checks, including more scrutiny from developers.

This comes as affected businesses and customers are asking what financial compensation those impacted by the outage will be able to claim. According to insurance firm Parametrix, the top 500 US companies by revenue, excluding Microsoft, faced some $5.4bn (£4.1bn) in financial losses from the outage. It said that only $540m (£418m) to $1.08bn (£840m) of these losses were insured.

“This incident must serve as a broader warning about the national security risks associated with network dependency,” wrote the US House Committee on Homeland Security in a letter quickly sent to CrowdStrike, which it has called to a hearing.

Many IT experts have been drawing some obvious conclusions. Professor Omer Rana, Cardiff University Academic Centre of Excellence in Cyber Security Research & Education, said the outage had “clearly indicated that we need to consider the impact of wider ‘cyber disturbances’ – rather than just cyber attacks”. It is the impact on systems that is important, not just what has caused it, he said. “This shows how vulnerable we are to cloud-hosted services that we all rely on every day. This reliance has increased even more significantly since the Covid pandemic, when many workers were connected on-line and cloud-hosted services played a key role.” The cyber disturbances now occurring have come in the context of ‘edge computing systems’, such as Internet of Things devices, as our reliance on these continues to increase.

Much other valuable insight has been offered by similar IT and cyber systems experts in the wake of the outage, as we report in this issue, but who will really sit up and listen?

Steve Sands, Chair of the BCS Information Security Specialist Group, said: “Working IT systems are a prerequisite for almost every aspect of modern life and indeed the global economy. We have made a number of key recommendations to improve service and software resilience to government. I sincerely hope that this CrowdStrike issue raises awareness and creates some much-needed urgency to continue this vital conversation.”

Dr Inah Omoronyia, based in the Bristol Cybersecurity Research Group at the University of Bristol’s School of Computer Science, said: “This outage points to the need to be constantly vigilant of the cloud infrastructures and other critical systems that we now depend on daily. Today’s infrastructures are a lot more complex, with extensive dependencies. Currently, our risk mitigation approaches are too reactive and therefore unsustainable for the current pace of technological innovation. Unless precautions are pro-actively taken to detect and mitigate risks throughout the whole software and systems supply chain our best effort may remain a security theatre.”

Beyond these experts, those involved in the transport sector in more traditional roles, or with responsibilities in planning and policy rather than operations, might be acknowledging that there are indeed some challenging and concerning matters of digital and cyber vulnerability and resilience, and even have worries about the implications. But perhaps they believe that ‘someone’ is looking after those problems on behalf of their organisations, and the country/society at large. That’s what they probably also thought about the preparation for a pandemic, and plans for responding to major weather events like flooding or extreme heatwaves.

Events have demonstrated that such confidence in other people and systems is often misplaced. It is in human nature to be most concerned about things that are immediately apparent, or deeply seated inside human experience — not to imagine and plan for the arrival of new threats and consequences that come unannounced alongside what look like unlimited beneficial advances in human ingenuity and technological invention.

Digital dependence, and the possibilities of system corruption or collapse, may be an existential threat, or at least a trigger towards a set of circumstances beyond anyone’s perception or control, bringing serious immediate and long-term damage to society. Coupled with the unknowns brought by climate change, the coming widespread substitution of Artificial Intelligence for core human functions, and management systems that remove discretion from individuals in the field (if indeed there are any left), raise the prospect of more dramatic and chaotic chain reactions whose potentially very unpleasant consequences will not have been thought through.

We have changed our way of life unrecognisably in just a few decades. In many ways for the better, but in other ways worse. Our ‘new order’ is potentially constructed upon expectations and beliefs of predictability and reliability of underpinning systems, which are, to a greater or lesser extent, misplaced.

The endemic weaknesses are not very visible, and are hard to identify. Once upon a time, the nuclear threat brought massive attention to civil defence, preparations for the collapse of government, and measures to deal with panic and injury that many ridiculed as unlikely to even scratch the surface of the feared catastrophe. It now seems that no current ‘external’ threat to our way of life, of which digital dependence is just one, is having the same chilling effect. Or perhaps these threats are having that effect, but we prefer to embrace the comforts of believing that ‘somebody’ will be looking after things, or that it’s logical to simply ‘hope for the best’, as matters are now beyond anyone’s real control.

Peter Stonham is the Editorial Director of TAPAS Network

This article was first published in LTT magazine, LTT897, 1 August 2024.
