Blame It On One Person... Seriously?

12 June 2017

It is tempting to comment on the UK political chaos right now... but as a welcome diversion, we thought we’d make a few observations on the business disaster that has recently placed British Airways at the forefront of the international news. British Airways (BA) experienced a catastrophic IT system failure on a busy bank holiday weekend, resulting in direct financial losses of over £150M. Over 75,000 passengers were affected worldwide and BA were subject to highly critical press coverage with regard to their handling of the incident. It has been widely reported that BA attributes this failure to the actions of an individual IT maintenance contractor. This contractor allegedly failed to follow proper procedures, which led to a 15-minute power outage and catastrophic damage to BA servers across the world.

Given BA’s high and mighty status in the aviation world, how can a 15-minute power outage (resulting in over £150M losses) happen because a single contractor inadvertently flicked the on/off switch incorrectly?!

We’re not IT or aviation system experts, but we do know a thing or two about risk management. If we applied a simple 5-WHY approach to this incident, some of the key questions that we would seek answers to would be:

1. Management of Change/cost control: It is clear that BA have undertaken a significant process of organisational change and re-structuring in order to increase efficiency and manage costs. This has resulted in major job losses in the UK and USA and the transfer of critical business activities (i.e. IT) to far-flung locales. Have these changes and the way they have been implemented increased the risk of critical failure within core business processes?

2. Risk management: Like any company today, BA is highly dependent on robust IT systems. As such, they have to understand what could go wrong: how and when. Established control mechanisms, back-up systems, etc. should be the foundation for sound decisions that are critical to managing business risk. How could inadvertently switching off the power create such mayhem? Why did the back-up system fail? Why was this major risk seemingly unforeseen or uncontrolled?

3. Contractor management: Contractors and contracted organisations are critical to big business on a global basis. BA, like all major organisations, engage thousands of contractors. The way they select them, communicate with them, coach them, work with them and control their activities is fundamental to business efficiency and success. Without even going into potential competency and supervision issues, why was an individual contractor placed in a situation where a single action disrupted the power and caused a catastrophic IT system failure?

4. Business recovery and customer care: For many years now, the issue of business recovery from a major event has been a core activity within all major international companies. For a service provider like BA, the requirement for excellent customer care must be a priority. We all saw the press coverage, the photos and the interviews with stranded passengers. We saw and heard their anger as they voiced their frustration about the lack of communication and support throughout and after this event. Catastrophic IT failures are not unheard of in the aviation industry: Southwest Airlines suffered one in 2016, Delta was hit twice in the past year, and BA itself had a widespread IT system failure last September. These undesired events resulted in losses adding up to hundreds of millions of dollars: ouch! Customer care failures likewise recur again and again. There were plenty of opportunities for BA to learn and appreciate the criticality of effective business restoration plans and genuine and meaningful customer care. How can this happen to a supposed ‘world-leading’ organisation in 2017? What actually are BA’s priorities?

5. Blame culture: According to press coverage, BA’s internal investigation identified human error as the cause of this event. BA were actually quoted in the press as saying ‘it is very much human error to blame’. This means, according to them, that the entire catastrophe is down to a mistake made by this unfortunate contractor. Hello? Did BA really say that? Industry is littered with examples of people who have been set up to fail. This is a prime example. The fact that a simple slip could cause such a business disaster is undoubtedly not down to the erroneous actions of one contractor, but to the cumulative errors of an organisation which failed to manage a critical business risk. Doesn’t the blame attached to a single contractor reflect more on BA’s culture than on the capability of an individual?

Like we said, we’re not IT or aviation experts, but we do know that accountable organisations do not blame individuals unless they can prove sabotage, which has never been publicly mentioned in relation to this disastrous incident. It’s hard to imagine the impact on the individual involved, or the longer-term impact on BA’s reputation. Customers have long memories and the media will remind us when it happens again in this industry... because, sadly, it will. We will continue to comment when the big boys fail to manage risk, but in this instance it looks like we may be in danger of bidding a fond farewell to our BA frequent flyer miles!

PS: Guess what?

Apple have thought about this too! Take a look at the attached video excerpt, released as part of the introduction to Apple’s 2017 WWDC, which took place at the start of this month.

WWDC 2017 - APPOCALYPSE - Apple

Also see more postings in the Risk Dimensions Blog.