In this post, I’ll be explaining why poor safety assurance behaviours are a lot like growing a comb-over for an emerging bald spot.
No, seriously!… please bear with me.
Please take a seat
Donald Trump’s four-year tenure was accompanied by much ribbing about his intricate ‘comb-over’ hairstyle. Crystallising his traits of misplaced vanity and dishonesty, it was ripe for lampooning. But this was one of the rare times when I actually had some sympathy for ‘the Donald.’ A little combing and teasing is understandable for those who, like me, suffer from the dread of male pattern baldness.
And to be honest, trawling my memory, Trump’s intricate hairstyle was a historically mild example of this curious styling choice. As recently as the 1990s, I would regularly marvel at the solid, swirling lids of hair that fell miraculously back into place after being raised by a sudden gust of wind. These constructions were as mysterious and imponderable to me as the Pyramids of Giza.
Visiting my local barber as a young man, I came to understand the mystery. Skegness’s most experienced and trusted hairdresser, Bill Brown, ran a comb through my hair with a smile and quietly asked:
“Would you like me to leave it a little longer at the front?”
Briefly pondering this question (and the vague notion that there was something conspiratorial about it), I realised two things. The first was that my hairline was receding. The second was that this was how the ‘comb-over’ started. Bald men didn’t suddenly wake one morning and decide to grow six inches of intricately trained hair to cover an existing bald patch. The ‘comb-over’ occurred by increments, one haircut at a time, as the tide gradually went out.
The safety ‘comb-over’
I have pondered that youthful trip to the barber many times. It has served as a useful insight into human nature throughout my career as a safety engineer. The key lesson was that extreme behaviours don’t happen suddenly. They occur by initial accident, incremental evolution, and gradual cover-up.
When the VW emissions scandal broke in 2015, it immediately sounded to me like a ‘comb-over.’ In ‘dieselgate,’ the US Environmental Protection Agency (EPA) found that Volkswagen had intentionally programmed turbocharged direct injection (TDI) diesel engines to activate their emissions controls only during laboratory emissions testing. This caused the vehicles’ NOx output to meet US standards during regulatory testing, whilst the cars emitted up to forty times more NOx in real-world driving.
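To see how such a thing can work, and how ordinary each ingredient looks in isolation, here is a deliberately simplified sketch. To be clear, the signal names and thresholds are my own inventions for illustration, not VW’s actual logic: software that infers it is on a rolling-road dynamometer and quietly switches calibration.

```python
# Purely illustrative sketch of 'defeat device' logic. All signal names
# and thresholds are invented; this is not Volkswagen's implementation.

def looks_like_dyno_test(steering_angle_deg: float,
                         driven_wheel_speed_kph: float,
                         undriven_wheel_speed_kph: float) -> bool:
    """On a rolling-road emissions test, the driven wheels spin while the
    steering wheel stays dead straight and the undriven wheels do not
    move: a pattern that rarely occurs on a real road."""
    steering_fixed = abs(steering_angle_deg) < 1.0
    driven_moving = driven_wheel_speed_kph > 10.0
    undriven_still = undriven_wheel_speed_kph < 1.0
    return steering_fixed and driven_moving and undriven_still

def select_emissions_calibration(on_test: bool) -> str:
    # Full NOx after-treatment only when the software believes it is
    # being observed; a dirtier, higher-performance map otherwise.
    return "test_compliant_map" if on_test else "road_performance_map"
```

Each check, taken alone, could pass for an innocent calibration decision; it is the combination, and the intent behind it, that crosses the line.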
Reading about the scandal, it seemed highly unlikely to me that a professional engineer would wake up one morning and deliberately set out to cheat a test so blatantly. Surely the behaviour had evolved through a succession of seemingly innocuous steps?
Sure enough, it became apparent that ‘defeat devices’ had a long history in the automotive sector. As long ago as 1973, the big three Detroit automakers and Toyota were ordered by the EPA to stop using ambient temperature switches, which disabled pollution controls at low temperatures. The EPA said this function was in violation of the Clean Air Act, but the car companies insisted that the switches were not intended to evade the rules; they argued that the function improved engine efficiency and therefore actually reduced pollution.
From initial misunderstandings like this, a succession of other mechanisms evolved over the decades that followed, as guilty behaviours slowly became ingrained and tacitly accepted. VW’s censure finally came at a time when its ‘comb-over’ was so convoluted that any outsider unacquainted with it couldn’t help but stop, open-mouthed, and point.
Where are the roots?
As engineered transport systems and vehicles have become more complex and automated, the need to consider safety hazards and requirements has moved ever deeper into design activity. Extreme discipline and planning are needed to foresee safety hazards and proactively manage them, and this must be evidenced.
However, as complexity increases and deep technical expertise becomes more thinly spread, the paperwork and committees that provide assurance seem to expand, like lengthening locks swept across a growing forehead. When the resulting appendage is lifted and the threads of the audit trail are inspected, it is often very difficult to find firm roots: clear evidence that the right work has been done.
This was very much in evidence in the fatal crashes of the Boeing 737 Max. In 2018, a Lion Air Max crashed into the Java Sea thirteen minutes after takeoff, killing all one hundred and eighty-nine passengers and crew. Less than five months later, in strikingly similar circumstances, an Ethiopian Airlines Max crashed just after takeoff on a flight from Addis Ababa, killing one hundred and fifty-seven. The focus of the initial investigations was the Maneuvering Characteristics Augmentation System (MCAS), which could trigger flight control movements placing the airplane into a dangerous nose-down attitude. The MCAS function depended on input from a single angle-of-attack sensor, externally mounted on the fuselage. The American Society for Engineering Education reported:
That the MCAS apparently was vulnerable to a single point of failure should never have been permitted.
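The textbook defence is to cross-check redundant inputs before acting on them. Here is a minimal sketch of the idea; it is emphatically not Boeing’s implementation, and the sensor names and thresholds are assumptions chosen for illustration:

```python
# Illustrative sketch only: not Boeing's code. The names and thresholds
# below are assumptions made for the example.

AOA_DISAGREE_LIMIT_DEG = 5.5   # assumed maximum tolerated sensor disagreement
AOA_ACTIVATION_DEG = 14.0      # assumed angle of attack that triggers intervention

def pitch_down_permitted(aoa_left_deg: float, aoa_right_deg: float) -> bool:
    """Permit an automated nose-down command only when two independent
    angle-of-attack sensors broadly agree and both indicate a high angle."""
    if abs(aoa_left_deg - aoa_right_deg) > AOA_DISAGREE_LIMIT_DEG:
        # Sensors disagree: trust neither reading, inhibit the automation
        # and alert the crew instead.
        return False
    return min(aoa_left_deg, aoa_right_deg) > AOA_ACTIVATION_DEG
```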
This was something that an undergraduate reliability engineer would have known. How had something so obvious and fundamental been missed? The US House Transportation and Infrastructure Committee conducted a detailed investigation, looking broadly at the circumstances in which the accidents occurred, and concluded that there was a ‘culture of concealment’:
The Max crashes were not the result of a singular failure, technical mistake or mismanaged event… They were the horrific culmination of a series of faulty technical assumptions by Boeing’s engineers, a lack of transparency on the part of Boeing’s management and grossly insufficient oversight by the Federal Aviation Administration.
In other words, a classic ‘comb-over’ had occurred. It has been a horrifically costly lesson for all, and the 737 Max has only just recovered its European certification after extensive rework and review.
A little off the top, please
A particular interest of mine is how the safety of software is assured, and how it will continue to be assured in the coming years. Systematic software failures can’t be exhaustively tested out: even a trivial function taking two 32-bit inputs has 2^64 possible input combinations, and at a billion tests per second an exhaustive check would take over five hundred years. Assurance therefore comes from robust review and checking of the development of technical assets.
The process industry safety standard IEC 61508 set the template for how to do this, and it has been mirrored by standards such as EN 50128 in rail and ISO 26262 in automotive. These standards require the setting of Safety Integrity Levels (SILs), which constrain the system architecture and require robust verification of safety requirements throughout the whole asset lifecycle.
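For readers unfamiliar with SILs, each level corresponds to a band of tolerable dangerous-failure rates, with the architectural and process constraints tightening as the level rises. The sketch below encodes the IEC 61508 high-demand-mode bands as I recall them; do check the standard itself before relying on the figures:

```python
# Sketch of the IEC 61508 high-demand-mode SIL bands, expressed as average
# frequency of dangerous failure per hour (PFH). Figures quoted from
# memory; consult the standard before using them.
SIL_PFH_BANDS = {
    1: (1e-6, 1e-5),
    2: (1e-7, 1e-6),
    3: (1e-8, 1e-7),
    4: (1e-9, 1e-8),
}

def sil_for_pfh(pfh: float) -> int | None:
    """Return the SIL whose band contains the given failure rate, if any."""
    for sil, (low, high) in SIL_PFH_BANDS.items():
        if low <= pfh < high:
            return sil
    return None

# Example: a function with a dangerous-failure rate of 5e-8 per hour
# falls in the SIL 3 band.
assert sil_for_pfh(5e-8) == 3
```

The higher the SIL, the more onerous the demands on architecture, development process and verification evidence.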
But the problem is that the standards were built with the mindset of 1980s and 1990s computing, where software was a limited adjunct to electro-mechanical systems. At that time, supply chains were simpler and more local. Cyber security threats were also much less of a concern and AI was in its infancy. The underpinning models on which these standards are built are therefore looking increasingly shaky.
We need to ensure that this growing challenge is not obscured by bureaucracy and the blurring of accountabilities. Perhaps it’s time to take a careful look in the mirror and consider making a change. Perhaps a more modern style is needed?
Something (to think about) for the weekend?
So there are a couple of morals to this story that resonate in the world of technological safety. One is the need to challenge behaviours or practices that you aren’t certain about or don’t understand, and to always question their broader implications, even if others seem content. The other is the need to be honest about hazards and vulnerabilities as early as possible. That’s not easy, of course, and in the security world it needs to be done in the right environment. But the longer it is left, the harder it gets. Unlike hairdressing, in the world of transportation the consequences can be fatal.
The next issue
In the next issue I’ll be taking you through another of my ramblings on the safety of modern transportation. Please subscribe now so you don’t miss it.
Thanks for reading
I hope you enjoyed this edition of Tech Safe Transport. If you did, please share it with someone else who might. All views are my own, and I reserve the right to change my opinion when the facts change (or even just when I think a bit harder). If you have any thoughts or comments, please feel free to send me a message on Twitter. Many thanks to Lee Bearfield for the accompanying artwork, and thanks also to my ever-discerning editor, Nicola Gray.