TAPAS.network | 1 August 2024 | Editorial Opinion | Peter Stonham

The Machine Stops

Peter Stonham

THERE ARE certain industries that, due to their time-critical nature, service delivery structure, and user characteristics and expectations, are particularly susceptible to any system downtime or unpredictable interruptions to service.

Transport and logistics have become a prime example in our modern digital world, meaning everything from passenger transport services to traffic control and freight distribution is in the front line for any IT system failure. This is ever more so with the growth of digital dependency and reliance on the internet for communication, messaging and data transfer, both to manage service provision and to support interaction with customers.

Last week a global IT outage brought chaos across airlines, airports, some train operators and other retail, financial and healthcare services. The cybersecurity company CrowdStrike has admitted responsibility for a faulty update to its software running on Microsoft Windows, which affected 8.5 million computers that displayed ‘blue screens of death’.

Shortly after, incidents of a different kind, but similar consequence, occurred on the TGV rail network in France when arson attacks were made on lineside communications infrastructure bringing services to a halt. This was a physical intervention, but no less related to the dependence on digital capacity for systems to operate.

In our digital-dependent and highly connected society, such incidents are increasingly likely. One failure begets others and a break in connectivity disrupts and devalues the whole network. Last week we found out what happens if systems go offline, with the highly publicised incident demonstrating how a major system failure can cripple operations, frustrate passengers and tarnish reputations.

Whilst the cause in this case was specific, it was hardly unique. Under the right - or rather wrong - conditions, similar scenarios could potentially unfold in many situations at weak points within the digital chain or, more worryingly, from an organised contamination across key system providers.

Consequences can be more than just inconvenience, extending to safety, protection of transactions, critical security data, and contamination and decay of core support systems, not to mention public confidence in future operations, with all those involved incurring reputational risk.

A core truth is that Information Technology has radically changed the nature of transport and travel. Driverless cars and trains, automated traffic controls, and connected highways are just some examples of how the transport sector has, at an astonishing rate, adopted the Internet of Things, AI systems in place of human ones, and customer self-managed bookings, payments and journey planning. However, with each advancement and investment comes increased reliance on the Internet, which those familiar with its inner workings can affirm introduces another level of unpredictability that needs proactive management.

Transport companies have learned that they must have diagnostic indicators to predict failures before they happen, substitute systems in place to kick in when the main ones stop working, and insurance both to spread risk and to offer redress when customers are left out in the cold. One key issue is the question of liability, and the associated matter of denial of responsibility through the loophole of treating cyber disruption as an ‘Act of God’ outwith human control.

If a supplier, whether commercial or a public body, chooses to dispense with human actors and implement technological and artificial intelligence mechanisms, that is a decision with consequences, but not a removal of responsibility.

A digital disruption has another significant impact in disabling the ability to monitor the performance and positioning of transport assets, and their payloads: planes, trains, cars and trucks. These are moving assets which in normal times are increasingly tracked and visible on display systems, but which are suddenly ‘lost’ to view. This lack of connectivity makes it very difficult to determine whether there is an operational issue with one or more vehicles, and to take appropriate action.

The transport sector is already susceptible to shocks and stresses (e.g. extreme weather, protests and pandemics) that make resilience fundamental to managing risk in operations. On top of that, any unreliability of the internet or communications-carrying systems means potential severe disruption.

But it’s not just disconnection and suspension of operations and the corruption of internal systems that can be costly. When Google Maps went down in August 2022, it took out several apps that rely on its API to deliver directions. For example, Uber and Lyft rely on Google Maps data to provide real-time information about traffic conditions and other factors impacting drivers’ ability to pick up riders. With these apps affected by the outage, their drivers could not pick up customers.

One problem clearly demonstrated by last week’s outage is establishing the cause of the disruption. On this occasion, it was relatively easy to point the finger at CrowdStrike for its faulty update. But it might not always be so simple. The problem is compounded when it is not clear if the issue is in communicating data, a contamination of the operating system, a network connectivity issue, or some other mysterious or malignant cause.

In a detailed review of the incident, CrowdStrike said there was a “bug” in a system designed to ensure software updates work properly. CrowdStrike said the glitch meant “problematic content data” in a file went undetected. The company said it could prevent the incident from happening again with better software testing and checks, including more scrutiny from developers.

This comes as affected businesses and customers are asking what financial compensation they will be able to claim. According to insurance firm Parametrix, the top 500 US companies by revenue, excluding Microsoft, faced some $5.4bn (£4.1bn) in financial losses from the outage. It said that only $540m (£418m) to $1.08bn (£840m) of these losses were insured.

“This incident must serve as a broader warning about the national security risks associated with network dependency,” wrote the US House Committee on Homeland Security in a letter quickly sent to CrowdStrike, which it has called to a hearing.

Many IT experts have been drawing some obvious conclusions. Professor Omer Rana, of Cardiff University’s Academic Centre of Excellence in Cyber Security Research & Education, said the outage had “clearly indicated that we need to consider the impact of wider ‘cyber disturbances’ – rather than just cyber attacks”. It is the impact on systems that is important, not just what has caused it, he said. “This shows how vulnerable we are to cloud-hosted services that we all rely on every day. This reliance has increased even more significantly since the Covid pandemic, when many workers were connected on-line and cloud-hosted services played a key role.” The cyber disturbances now occurring have come in the context of ‘edge computing systems’, such as Internet of Things devices, as our reliance on these continues to increase.

Much other valuable insight has been offered by similar IT and cyber systems experts in the wake of the outage, as we report in this issue - but who will really sit up and listen?

Steve Sands, Chair of the BCS Information Security Specialist Group, said: “Working IT systems are a prerequisite for almost every aspect of modern life and indeed the global economy. We have made a number of key recommendations to improve service and software resilience to government. I sincerely hope that these CrowdStrike issues raise awareness and create some much-needed urgency to continue this vital conversation.”

Dr Inah Omoronyia, based in the Bristol Cybersecurity Research Group at the University of Bristol’s School of Computer Science, said: “This outage points to the need to be constantly vigilant of the cloud infrastructures and other critical systems that we now depend on daily. Today’s infrastructures are a lot more complex, with extensive dependencies. Currently, our risk mitigation approaches are too reactive and therefore unsustainable for the current pace of technological innovation. Unless precautions are pro-actively taken to detect and mitigate risks throughout the whole software and systems supply chain our best effort may remain a security theatre.”

Beyond these experts, those involved in the transport sector in more traditional roles, or with responsibilities in planning and policy rather than operations, might be acknowledging that there are indeed some challenging and concerning matters of digital and cyber vulnerability and resilience, and even have worries about the implications. But perhaps they believe that ‘someone’ is looking after those problems on behalf of their organisations, and the country/society at large. That’s what they probably also thought about the preparation for a pandemic, and plans for responding to major weather events like flooding or extreme heatwaves.

Events have demonstrated that such confidence in other people and systems is often misplaced. It is in human nature to be most concerned about things that are immediately apparent, or deeply seated inside human experience — not to imagine and plan for the arrival of new threats and consequences that come unannounced alongside what look like unlimited beneficial advances in human ingenuity and technological invention.

Digital dependence, and the possibilities of system corruption or collapse, may be an existential threat - or at least a trigger towards a set of circumstances beyond anyone’s perception or control, bringing serious immediate and long-term damage to society. Combined with the unknowns brought by climate change, the coming of widespread substitution of Artificial Intelligence for core human functions, and management systems that remove discretion from individuals in the field (if indeed there are any left), it raises the prospect of more dramatic and chaotic chain reactions that will not have been thought through for their potentially very unpleasant consequences.

We have changed our way of life unrecognisably in just a few decades, in many ways for the better, but in other ways for the worse. Our ‘new order’ is potentially constructed upon expectations and beliefs of predictability and reliability of underpinning systems, which are, to a greater or lesser extent, misplaced.

The endemic weaknesses are not very visible, and hard to identify. Once upon a time, the nuclear threat brought massive attention to civil defence, preparations for the collapse of government, and measures to deal with panic and injury that many ridiculed as unlikely to even scrape the surface of the serious nature of the feared catastrophe. It now seems that no current ‘external’ threat to our way of life - of which digital dependence is just one - is having the same chilling effect. Or perhaps such threats are, but we prefer to embrace the comforts of believing that ‘somebody’ will be looking after things, or that it’s logical to simply ‘hope for the best’, as matters are now beyond anyone’s real control.

Peter Stonham is the Editorial Director of TAPAS Network

This article was first published in LTT magazine, LTT897, 1 August 2024.
