11 of the most costly software errors in history
Posted Jan 26, 2022 | 10 min. (1969 words)
The mere mention of a serious software error can strike fear into the heart of any developer, project manager or tech leader.
The wrong error in the wrong system can be incredibly expensive and difficult to resolve, not to mention humiliatingly public. Catastrophic software errors are mercifully rare these days, but the potential for chaos, PR disasters and spiraling costs still remains.
Error monitoring solutions have become more and more of a necessity in recent years, but plenty of businesses still view monitoring as an optional extra, despite the fact that poor quality software continues to cost US organizations over $2 trillion annually.
We’ve put together the biggest software disasters of all time, detailing exactly what happened and why, as well as letting you know how to avoid a breathtakingly expensive mistake.
How can a simple software error end up costing so much?
The financial toll of software errors depends on a few factors. Firstly, there’s the direct cost of paying developers and software engineers to sort out the mess. Then there’s downtime, lost data, and squandered transactions. And in the aftermath, there’s the reputational damage to consider. Any organization that suffers a catastrophic software error will lose credibility with customers and the broader market, and might even violate their service agreements. This can result in long-term financial losses as people lose trust in the brand itself and avoid it in the future.
However, not all costs are measured in dollars. Software is omnipresent and affects virtually everything we do. As well as monetary repercussions, there can also be negative effects on people’s privacy, valuable data and even their safety.
To demonstrate the devastating consequences of a simple error, let’s see just how bad it can get. Without further ado, here are the all-time most costly software errors.
1. The Mariner 1 Spacecraft, 1962
The first entry in our rundown goes right back to the sixties.
Before the summer of love or the invention of the lava lamp, NASA launched a data-gathering unmanned space mission to fly past Venus. It did not go to plan.
The Mariner 1 space probe barely made it out of Cape Canaveral before the rocket veered dangerously off course. Worried that the rocket was heading towards a crash-landing on Earth, NASA engineers issued a self-destruct command and the craft was obliterated about 290 seconds after launch.
An investigation revealed the cause to be a very simple software error. A hyphen was omitted in a line of code, which meant that incorrect guidance signals were sent to the spacecraft. The overall cost of the omission was reported to be more than $18 million at the time (about $169 million in today’s money).
2. The Morris Worm, 1988
Not all costly software errors are borne by big companies or government organizations. In fact, one of the most costly software bugs ever was caused by a single student. A Cornell University student created a worm as part of an experiment, which ended up spreading like wildfire and crashing tens of thousands of computers due to a coding error.
The computers were all connected through a very early version of the internet, making the Morris worm essentially the first infectious computer virus. Graduate student Robert Tappan Morris was eventually charged and convicted of criminal hacking and fined $10,000, although the cost of the mess he created was estimated to be as high as $10 million.
History has forgiven Morris though, with the incident now widely credited for exposing a vulnerability and improving digital security. These days, Morris is a professor at MIT and the worm’s source code has been kept as a museum piece on a floppy disc at the University of Boston.
3. Pentium FDIV Bug, 1994
The Pentium FDIV bug is a curious case of a minor problem that snowballed due to mass hysteria.
Thomas Nicely, a math professor, discovered a flaw in the Pentium processor and reported it to Intel. Their response was to offer a replacement chip to anyone who could prove they were affected by it.
The original error was relatively simple, with a problem in the lookup table of the chip’s algorithm. This could cause tiny inaccuracies in calculations, but only very rarely. In fact, the chance of a miscalculation occurring was calculated to be just 1 in 360 billion.
Although the actual effects of the software error were negligible, when details of the bug hit the international press, millions of people requested a new chip, costing Intel upwards of $475 million.
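To get a feel for how small the flaw was, here is a minimal Python sketch of a division sanity check. The operands 4195835 and 3145727 are not from this article; they come from a check that was widely circulated at the time, and the helper name `fdiv_residual` is made up for illustration. On a correct floating-point unit the residual is essentially zero, whereas affected Pentium chips were reported to return 256 for the same calculation.

```python
def fdiv_residual(x: float, y: float) -> float:
    """Return x - (x / y) * y, which should be ~0 if division is accurate."""
    return x - (x / y) * y


# Operands from the widely circulated check (an assumption, not from the article).
residual = fdiv_residual(4195835.0, 3145727.0)

# Allow for ordinary floating-point rounding rather than demanding exactly 0.
division_looks_correct = abs(residual) < 1e-6
print(f"residual = {residual!r}, division looks correct: {division_looks_correct}")
```

A check like this only covers one pair of operands, so it is a smoke test rather than proof that a chip is unaffected.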
4. Bitcoin Hack, Mt. Gox, 2011
Mt. Gox was the biggest bitcoin exchange in the world in the 2010s, until they were hit by a software error that ultimately proved fatal.
The glitch led to the exchange creating transactions that could never be fully redeemed, costing up to $1.5 million in lost bitcoins.
But Mt. Gox’s woes didn’t end there. In 2014, they lost more than 850,000 bitcoins (valued at roughly half a billion USD at the time) in a hacking incident. Around 200,000 bitcoins were recovered, but the financial loss was still overwhelming and the exchange ended up declaring bankruptcy.
5. EDS Child Support System, 2004
Back in 2004, the UK government introduced a new and complex system to manage the operations of the Child Support Agency (CSA). The contract was awarded to IT services company Electronic Data Systems (EDS). The system was called CS2, and there were problems as soon as it went live.
A leaked internal memo at the time revealed that the system was “badly designed, badly tested and badly implemented”. The agency reported that CS2 “had over 1,000 reported problems, of which 400 had no known workaround”, resulting in “around 3,000 IT incidents a week”. The system was budgeted to cost around £450 million, but ended up costing an estimated £768 million altogether. EDS, a Texas-based contractor, also announced a $153 million loss in their subsequent financial results.
6. Heathrow Terminal 5 Opening, 2008
Imagine prepping to jet off on your eagerly-awaited vacation or important business trip, only to find that your flight is grounded and your luggage is nowhere to be seen.
This was exactly what happened to thousands of travelers when Heathrow’s Terminal 5 opened back in March 2008, and it was all caused by buggy software. The problem lay with a new baggage handling system that performed well on test runs, but failed miserably in real life. This caused massive disruptions like malfunctioning luggage belts and thousands of items being lost or sent to the wrong destinations.
British Airways also revealed that faults with the wireless network compounded the disruption at the airport. Over the next 10 days, some 42,000 bags were lost and more than 500 flights canceled, costing more than £16 million.
7. NASA’s Mars Climate Orbiter, 1998
Losing $20 from your wallet is probably enough to ruin your day — how would it feel to lose a $125 million spacecraft? NASA engineers found out back in 1998 when the Mars Climate Orbiter burned up after getting too close to the surface of Mars.
It took engineers several months to work out what went wrong. It turned out to be an embarrassingly simple mistake in converting imperial units to metric. According to the investigation report, the ground control software produced by Lockheed Martin used imperial measurements, while the software onboard, produced by NASA, was programmed with SI metric units. The overall cost of the failed mission was more than $320 million.
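A unit mismatch like this is easy to reproduce and, with a little structure, easy to prevent. The sketch below is purely illustrative and is not the mission software: the `Impulse` type, the `apply_thruster_impulse` function and the choice of impulse as the quantity are all assumptions made for the example. The idea is that a value carries its unit with it, so a bare number in the wrong unit cannot slip through unnoticed.

```python
from dataclasses import dataclass

# 1 pound-force second expressed in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """An impulse that is always stored internally in SI units (N*s)."""

    newton_seconds: float

    @classmethod
    def from_pound_force_seconds(cls, value: float) -> "Impulse":
        # Convert from imperial units at the boundary, exactly once.
        return cls(value * LBF_S_TO_N_S)

    @classmethod
    def from_newton_seconds(cls, value: float) -> "Impulse":
        return cls(value)


def apply_thruster_impulse(impulse: Impulse) -> float:
    """Hypothetical consumer that refuses bare numbers with unknown units."""
    if not isinstance(impulse, Impulse):
        raise TypeError("expected an Impulse, not a bare number")
    return impulse.newton_seconds


ground_value = 10.0  # produced by the imperial side, in lbf*s

# The bug: reading the imperial number as if it were already in N*s.
misread = ground_value
# The fix: state the unit when the value enters the system.
correct = apply_thruster_impulse(Impulse.from_pound_force_seconds(ground_value))

print(f"misread as {misread} N*s, actually {correct:.2f} N*s")
```

The misread value is off by a factor of roughly 4.45, which is the kind of error that compounds quietly over many small trajectory corrections.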
8. Soviet Gas Pipeline Explosion, 1982
This error is a little bit different to the others, as it was deliberate (or so rumor has it). In fact, the Soviet gas pipeline explosion is alleged to be a cunning example of cyber-espionage, carried out by the CIA.
Back in 1982, at the height of Cold War tensions between the USA and USSR, the Soviet government built a gas pipeline that ran on advanced automated control software. The Soviets planned to steal this software from a Canadian company that specialized in this kind of programming.
According to accounts, the CIA was tipped off and began plotting some counter-espionage. They worked with the Canadians to place deliberate bugs in the software (also known as a Trojan Horse) to compromise the Soviet pipeline.
The unknowing Soviets went ahead and stole the compromised software and applied it to the pipeline. In June 1982, the resulting explosion was reportedly large enough to be visible from space. It severely damaged the pipeline, which had cost tens of millions to construct and was intended to produce $8 billion in natural gas revenue.
9. Knight’s $440M in bad trades, 2012
Losing $440 million is a bad day at the office by anyone’s standards. Even more so when it happens in just 30 minutes due to a software error that wipes 75% off the value of one of the biggest capital groups in the world.
Knight Capital Group had invested in new trading software that was supposed to help them make a killing on the stock markets. Instead, it ended up killing their firm. Several software errors combined to send Knight on a crazy buying spree, spending more than $7 billion on 150 different stocks.
The unintended trades ended up costing the company $440 million, and Goldman Sachs had to step in to rescue them. Knight never really recovered, and was ultimately acquired by a competitor less than a year later.
10. ESA Ariane 5 Flight V88, 1996
Given the complexity and expense of space exploration, it’s no wonder there are several failed space missions on our list of all-time software errors. However, the European Space Agency’s Ariane 5 failure is an even harsher cautionary tale than the rest, as it was caused by more than one error.
Just 36 seconds after its maiden launch, the rocket engines failed due to the engineers reusing incompatible code from Ariane 4 and a conversion error from 64-bit to 16-bit data.
The failure resulted in a $370 million loss for the ESA, and a whole host of recommendations came out of the subsequent investigation, including calls for improved software analysis and evaluation.
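The 64-bit to 16-bit conversion problem mentioned above is a general hazard: a 16-bit signed integer can only hold values from -32768 to 32767, so a larger 64-bit value simply does not fit. The Python sketch below is not the flight code, and the function names are invented for illustration. It contrasts an unchecked conversion that silently wraps around (just one way such a conversion can go wrong) with a checked conversion that either raises a clear error or clamps to the nearest representable value.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_unchecked(value: float) -> int:
    """Truncate, then wrap like a raw 16-bit two's-complement store would."""
    wrapped = int(value) & 0xFFFF
    return wrapped - 0x10000 if wrapped >= 0x8000 else wrapped


def to_int16_checked(value: float) -> int:
    """Truncate, but refuse values that a 16-bit signed integer cannot hold."""
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit signed integer")
    return truncated


def to_int16_saturating(value: float) -> int:
    """Truncate and clamp to the nearest representable 16-bit value."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


reading = 40000.0  # a 64-bit float that is too large for 16 bits

print(to_int16_unchecked(reading))   # -25536: silently wrong
print(to_int16_saturating(reading))  # 32767: clamped to the maximum
try:
    to_int16_checked(reading)
except OverflowError as error:
    print(f"rejected: {error}")
```

Which behavior is right depends on the system. The point is that the out-of-range case gets a deliberate, tested decision instead of being left to chance.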
11. The Millennium Bug, 2000
The Millennium Bug, AKA the notorious Y2K, was a massive concern in the lead-up to the year 2000. The concern was that computer systems around the world would not be able to cope with dates after December 31, 1999, due to the fact that most computers and operating systems only used two digits to represent the year, disregarding the 19 prefix for the twentieth century. Dire predictions were made about the implosion of banks, airlines, power suppliers and critical data storage. How would systems deal with the 00 digits?
The anticlimactic answer was “pretty well, actually”. The millennium bug was a bit of a non-starter and didn’t cause too many real-life problems, as most systems made adjustments in advance. However, the fear caused by the potential fallout throughout late 1999 cost thousands of organizations considerable amounts of money in contingency planning and preparations, with institutions, businesses and even families expecting the worst. The USA spent vast quantities to address the issue, with some estimates putting the cost at $100 billion.
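To see why two-digit years worried people, consider a simple date calculation. The sketch below is illustrative only: the function names are made up, and the pivot value of 50 is an assumption. It shows the naive subtraction going wrong at the rollover, then applies "windowing", one common remediation in which two-digit years are mapped onto a fixed 100-year range.

```python
def years_between_two_digit(start_yy: int, end_yy: int) -> int:
    """Naive difference between two-digit years, as legacy code might compute it."""
    return end_yy - start_yy


def expand_year(yy: int, pivot: int = 50) -> int:
    """Map a two-digit year to four digits using a fixed window.

    Years at or above the pivot are read as 19xx, the rest as 20xx.
    The pivot of 50 is an assumption chosen for this example.
    """
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year between 0 and 99")
    return 1900 + yy if yy >= pivot else 2000 + yy


def years_between_windowed(start_yy: int, end_yy: int, pivot: int = 50) -> int:
    """Difference in years after expanding both values to four digits."""
    return expand_year(end_yy, pivot) - expand_year(start_yy, pivot)


print(years_between_two_digit(65, 99))  # 34: fine in 1999
print(years_between_two_digit(65, 0))   # -65: wrong once the year rolls over to 00
print(years_between_windowed(65, 0))    # 35: correct with windowing
```

Windowing only postpones the problem until dates fall outside the chosen window, which is why storing four-digit years is the more durable fix.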
Avoiding the nightmare of software errors
Right now you’re probably thinking: yikes. How can I avoid an eye-wateringly expensive software error?
The good news is that, these days, it’s pretty easy.
While errors are an accepted fact of programming, you can access monitoring software that scans your applications and code to make sure that you won’t get caught out by a disastrous oversight.
Error monitoring solutions like Raygun are the ideal weapon to protect you from costly software errors. You get real-time insights into the health of your software with automatic detection and diagnosis of errors, so your team can resolve them before the damage is done.
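To make the idea of automatic error detection more concrete, here is a minimal, generic Python sketch of the underlying mechanism: a global hook that captures unhandled exceptions and records their details. This is not Raygun’s API. The `captured_reports` list stands in for a real monitoring backend, and `build_report` and `report_unhandled` are hypothetical names used only for this example.

```python
import sys
import traceback
from datetime import datetime, timezone

# Stand-in for a real monitoring backend (hypothetical, for illustration only).
captured_reports = []


def build_report(exc_type, exc_value, exc_tb) -> dict:
    """Collect the details an engineer would need to diagnose the error."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "type": exc_type.__name__,
        "message": str(exc_value),
        "stack": traceback.format_exception(exc_type, exc_value, exc_tb),
    }


def report_unhandled(exc_type, exc_value, exc_tb) -> None:
    """Record the unhandled exception, then fall back to the default handler."""
    captured_reports.append(build_report(exc_type, exc_value, exc_tb))
    sys.__excepthook__(exc_type, exc_value, exc_tb)


# From here on, any exception that reaches the top level is recorded first.
sys.excepthook = report_unhandled
```

A production tool does far more than this, such as grouping duplicates, attaching user and environment context, and alerting the team, but the principle is the same: errors get reported automatically instead of waiting for a customer to complain.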