The main reason for IT failures

My latest BBC column, republished here for UK readers, looks at some of the dispiritingly enduring human reasons behind IT project failures.

The UK’s National Health Service may seem like a parochial subject for this column. But with 1.7 million employees and a budget of over £100 billion, it is the world’s fifth biggest employer – beaten only by McDonald’s, Walmart, the Chinese Army, and the US Department of Defence. And this means its successes and failures tend to provide salutary lessons for institutions of all sizes.

Take the recent revelation that an abandoned attempt to upgrade its computer systems will cost over £9.8 billion – described by the Public Accounts Committee as one of the “worst and most expensive contracting fiascos” in the history of the public sector.

This won’t come as a surprise to anyone who has worked on large computing projects. Indeed, there’s something alarmingly monotonous about most litanies of tech project failure. Planning tends to be inadequate, with projected timings and budgets reflecting wishful thinking rather than a robust analysis of requirements. Communication breaks down, with side issues dominating discussions to the exclusion of core functions. And the world itself moves on, turning yesterday’s technical marvel into tomorrow’s white elephant, complete with endless administrative headaches and little scope for technical development.

Statistically, there are few fields more prone to extravagant failure. According to a 2011 study of 1,471 ICT projects by Alexander Budzier and Bent Flyvbjerg of Oxford’s Saïd Business School, one in every six ICT projects costs at least three times as much as initially estimated: around twenty times the rate at which projects in fields like construction go this wrong.

But if costly IT failures are a grimly unsurprising part of 21st-century life, what’s revealing is not so much what went wrong this time as why the same mistakes continue to be repeated. Similar factors were, for example, in evidence during one of the first and most famous project management failures in computing history: the IBM 7030 Stretch supercomputer.

Begun in 1956, the project aimed to build a machine at least one hundred times more powerful than IBM’s previous system, the IBM 704. This target won a prestigious contract with the Los Alamos National Laboratory – and, in 1960, the machine’s price was set at $13.5 million, with negotiation beginning for other orders.

The only problem was that, when a working version was actually tested in 1961, it turned out to be just 30 times faster than its predecessor. Despite containing a number of innovations that would prove instrumental in the future of computing, the 7030 had dismally failed to meet its target – and IBM had failed to realise what was going on until it was too late. The company’s CEO announced that the price of the nine systems already ordered would be cut by almost $6 million each – below cost price – and that no further machines would be made or sold. Cheaper, nimbler competitors stepped into the gap.

Are organisations prone to a peculiar blindness around all things digital? Is there something special about information technology that invites unrealistic expectations?

I would suggest that there is – and that one reason is the disjunction between problems as a business sees them, and problems seen in terms of computer systems. Consider the health service. The idea of moving towards an entirely electronic system of patient records makes excellent sense – but bridging the gap between this pristine goal and the varied, interlocking ways in which 1.7 million employees currently work is a fiendish challenge. IBM faced a far simpler proposition, on paper: make a machine one hundred times faster than its previous best. But the transition from paper to reality entailed difficulties that didn’t even exist until new components had been built, complete with new dead ends and frustrations.

All projects face such challenges. With digital systems, though, the frame of reference is not so much the real world as an abstracted vision of what may be possible. The sky is the limit – and big talk has a good chance of winning contracts. Yet there’s an inherent divide between the real-world complexities of any situation and what’s required to get these onscreen. Computers rely on models, systems and simplifications which we have built in order to render ourselves comprehensible to them. And the great risk is that we simply don’t understand ourselves, or our situation, well enough to explain it to them.

We may think we do, of course, and propose astounding solutions to complex problems – only to discover that what we’ve “solved” looks very little like what we wanted or needed. For almost every sufficiently large computing project, in fact, the very notion of solving a small number of enormous problems is a near-certain recipe for disaster, given that beneath such grandeur lurk countless conflicting requirements just waiting to be discovered.

If there is hope, it lies not in endlessly anatomizing those failures we seem fated to repeat, but in better understanding the fallibilities that push us towards them. And this means acknowledging that people often act like idiots when asked to explain themselves in terms machines can understand.

You might call it artificial stupidity: the tendency to scrawl our hopes and biases across a digital canvas without pausing to ask what reality itself will support. We, not our machines, are the problem – and any solution begins with embracing this.

Such modesty is a tough sell, especially when it’s up against polished solutionism and obfuscation – both staples of debate between managers and technicians since well before the digital era. The alternative, though, doesn’t bear thinking about: an eternity of over-promising and under-delivering. Not to mention wondering why the most powerful tools we’ve ever built only seem to offer more opportunities for looking stupid.