11 of the most costly software errors in history

No doubt Raygun is saving thousands of developers from embarrassing or even catastrophic software errors every day, but what was life like without such an awesome (and automatic) error monitoring solution? We’ve looked into some of the biggest disasters over the years to see what happens when software errors cause chaos!

What are costly software errors?

Costly software errors come in many shapes and sizes, and the cost isn’t always monetary. Software is in almost everything we do, and the consequences when things go wrong can be devastating. Broken software has cost billions of dollars and, in the worst cases, human lives.

NASA’s Mars Climate Orbiter

Launched in 1998, the Mars Climate Orbiter was ultimately lost when it reached Mars the following year. The failure puzzled engineers for some time, until it emerged that a subcontractor on the engineering team had failed to make a simple conversion from imperial units to metric. This embarrassing lapse sent the $125 million craft fatally close to Mars’ surface as it attempted to stabilize its orbit at too low an altitude. Flight controllers believe the spacecraft plowed into Mars’ atmosphere, where the associated stresses crippled its communications, leaving it hurtling on through space in an orbit around the sun.
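One common defence against this kind of mix-up is to attach units to values at the boundary between systems, so a bare number can never be passed along by accident. The sketch below is illustrative only: the `Impulse` class and `apply_thruster_impulse` function are hypothetical names, and the choice of pound-force seconds versus newton-seconds is an assumption about which imperial and metric units were involved, not something stated in this article.

```python
from dataclasses import dataclass

# Conversion factor: 1 pound-force second expressed in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """An impulse value that is always stored in newton-seconds (SI)."""
    newton_seconds: float

    @classmethod
    def from_pound_force_seconds(cls, value: float) -> "Impulse":
        # Convert once, at the boundary, so only SI values circulate internally.
        return cls(value * LBF_S_TO_N_S)

    @classmethod
    def from_newton_seconds(cls, value: float) -> "Impulse":
        return cls(value)


def apply_thruster_impulse(impulse: Impulse) -> float:
    """Hypothetical consumer that accepts an Impulse, never a bare float."""
    if not isinstance(impulse, Impulse):
        raise TypeError("expected an Impulse, got a bare number without units")
    return impulse.newton_seconds


reading = Impulse.from_pound_force_seconds(10.0)
print(round(apply_thruster_impulse(reading), 3))  # 44.482
```

Passing a plain `10.0` to `apply_thruster_impulse` raises a `TypeError` instead of silently being treated as a metric value.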

Ariane 5 Flight 501

Europe’s newest unmanned satellite-launching rocket reused working software from its predecessor, the Ariane 4. Unfortunately, the Ariane 5’s faster engines triggered a bug that had never surfaced in previous models. Thirty-six seconds into its maiden launch, multiple computer failures sent the rocket off course and it self-destructed. In essence, the software had tried to cram a 64-bit floating-point number into a 16-bit integer that was too small to hold it. The resulting overflow crashed both the primary and backup computers (which were both running the exact same software).

The Ariane 5 had cost nearly $8 billion to develop, and was carrying a $500 million satellite payload when it exploded.
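The overflow described above can be illustrated with a small sketch of a narrowing conversion that checks its range before converting. This is not the Ariane flight software; the function names are hypothetical, and the two policies shown (reject or clamp) are just common options for handling a value that does not fit.

```python
INT16_MIN = -(2 ** 15)   # -32768
INT16_MAX = 2 ** 15 - 1  # 32767


def to_int16_checked(value: float) -> int:
    """Convert a float to a 16-bit signed integer, raising if it does not fit."""
    truncated = int(value)  # drops the fractional part, like a narrowing cast
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit signed integer")
    return truncated


def to_int16_saturating(value: float) -> int:
    """Alternative policy: clamp to the nearest representable value."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


print(to_int16_checked(1234.5))      # 1234
print(to_int16_saturating(50000.0))  # 32767
try:
    to_int16_checked(50000.0)
except OverflowError as exc:
    print("rejected:", exc)
```

Either policy turns an unexpected input into a defined outcome that the caller can handle, rather than a crash that takes down every computer running the same code.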

EDS Child Support System

In 2004, EDS introduced a highly complex IT system to the UK’s Child Support Agency (CSA). At the exact same time, the Department for Work and Pensions (DWP) decided to restructure the entire agency. The two pieces of software were completely incompatible, and irreversible errors were introduced as a result. The system managed to overpay 1.9 million people and underpay another 700,000. It also left US$7 billion in uncollected child support payments, a backlog of 239,000 cases, and 36,000 new cases “stuck” in the system, and it has cost UK taxpayers over US$1 billion to date.

Soviet Gas Pipeline Explosion

The Soviet gas pipeline was complex enough to require advanced automated control software. The CIA was tipped off to Soviet intentions to steal the control system’s plans. Working with the Canadian firm that designed the pipeline control software, the CIA had the designers deliberately create flaws in the programming so that the Soviets would receive a compromised program. It is claimed that in June 1982, flaws in the stolen software led to a massive explosion along part of the pipeline, the largest non-nuclear explosion in the planet’s history.

Bitcoin Hack, Mt. Gox

Launched in 2010, the Japanese bitcoin exchange Mt. Gox grew to become the largest in the world. The exchange was hacked in June 2011, and in February 2014 Mt. Gox stated that it had lost around 850,000 bitcoins (worth around half a billion US dollars at the time).

Although around 200,000 of the bitcoins were later recovered, CEO Mark Karpeles admitted, “We had weaknesses in our system, and our bitcoins vanished.”

Heathrow Terminal 5 Opening

Just before the opening of Heathrow’s Terminal 5 in the UK, engineers thoroughly tested the brand new baggage handling system, built to carry the vast amounts of luggage checked in each day, using over 12,000 test pieces of luggage. It worked flawlessly on every test run, yet on the terminal’s opening day the system simply could not cope. It is thought that “real life” scenarios, such as manually removing a bag from the system when a passenger had left an important item in their luggage, confused the entire system and caused it to shut down.

Over the following 10 days some 42,000 bags failed to travel with their owners, and over 500 flights were cancelled.

The Mariner 1 Spacecraft

On a mission to fly by Venus in 1962, this spacecraft had barely made it out of Cape Canaveral when a software coding error caused the rocket to veer dangerously off course, threatening to crash back to Earth. Alarmed, NASA engineers on the ground issued a self-destruct command. A review board later determined that the omission of a hyphen in coded computer instructions allowed the transmission of incorrect guidance signals to the spacecraft. The cost of the rocket was reportedly more than $18 million at the time.

The Morris Worm

A program developed by a Cornell University student for what he said was supposed to be a harmless experiment wound up spreading wildly and crashing thousands of computers in 1988 because of a coding error. It was the first widespread worm attack on the fledgling Internet. The graduate student, Robert Tappan Morris, was convicted of a criminal hacking offense and fined $10,000. Morris’s lawyer claimed at the trial that his client’s program helped improve computer security.

Costs for cleaning up the mess may have gone as high as $100 million. Morris, who interestingly co-founded the startup incubator Y Combinator, is now a professor at the Massachusetts Institute of Technology. A disk with the worm’s source code is now housed at the Boston Museum of Science.

Patriot Missile Error

Sometimes, the cost of a software glitch can’t be measured in dollars. In February of 1991, a U.S. Patriot missile defence system in Saudi Arabia failed to detect an attack on an Army barracks. A government report found that a software problem led to an inaccurate tracking calculation that became worse the longer the system operated. On the day of the incident, the system had been operating for more than 100 hours, and the inaccuracy was serious enough to cause the system to look in the wrong place for the incoming missile. The attack killed 28 American soldiers. Prior to the incident, Army officials had fixed the software to improve the Patriot system’s accuracy. That modified software reached the base the day after the attack.
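To see how an error can grow with uptime, here is a rough sketch of the arithmetic. It relies on details from commonly cited analyses of the incident rather than from this article, so treat them as assumptions: that the system counted time in tenths of a second, and that the value 1/10 was chopped to 23 binary digits after the point in a 24-bit fixed-point register. The function names are hypothetical.

```python
from fractions import Fraction

# Assumption: number of binary digits kept after the point when storing 1/10.
FRACTION_BITS = 23


def truncated_tenth(bits: int = FRACTION_BITS) -> Fraction:
    """Return 1/10 chopped (not rounded) to `bits` binary fractional digits."""
    scale = 2 ** bits
    return Fraction(scale // 10, scale)


def clock_drift_seconds(hours: float, bits: int = FRACTION_BITS) -> float:
    """Accumulated timing error after `hours` of continuous operation."""
    ticks = int(hours * 3600 * 10)  # the clock counts tenths of a second
    error_per_tick = Fraction(1, 10) - truncated_tenth(bits)
    return float(ticks * error_per_tick)


print(round(clock_drift_seconds(8), 4))    # 0.0275
print(round(clock_drift_seconds(100), 4))  # 0.3433
```

Under these assumptions the clock is off by roughly a third of a second after 100 hours. That sounds tiny, but an incoming missile covers a lot of ground in that time, which is consistent with the system looking in the wrong place.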

Pentium FDIV bug

When a math professor discovered and publicized a flaw in Intel’s popular Pentium processor in 1994, the company’s initial response was to replace chips only for users who could prove they were affected. Intel calculated that the error caused by the flaw would happen so rarely that the vast majority of users wouldn’t notice. Angry customers demanded a replacement for anyone who asked, and Intel agreed. The episode cost Intel $475 million.

Knight’s $440 Million Error 

Knight Capital Group, one of the biggest American market makers for stocks, struggled to stay afloat in 2012 after a software bug triggered a $440 million loss in just 30 minutes. The firm’s shares lost 75 percent in two days after the faulty software flooded the market with unintended trades. Knight’s trading algorithms reportedly started pushing erratic trades through on nearly 150 different stocks, sending them into spasms.

Honourable mention: NOAA-19 Satellite

Although not a software error, on September 6, 2003, this satellite was badly damaged while being worked on at the Lockheed Martin Space Systems factory. The satellite fell to the floor as a team was turning it to a horizontal position. An inquiry into the mishap determined that it was caused by a lack of procedural discipline throughout the facility. Turns out that while the turn-over cart used during the procedure was in storage, a technician removed twenty-four bolts securing an adapter plate to it without documenting the action. The team subsequently using the cart to turn the satellite failed to check the bolts, as specified in the procedure, before attempting to move the satellite.

Repairs to the satellite cost $135 million.

Solving your software errors

As you can see, software errors can have devastating consequences. In this day and age, though, they don’t have to lurk in the dark. There are now a number of tools on the market that give you visibility into that darkness, so that you can avoid being caught off guard by unwanted software errors. Using error monitoring software makes it far less likely that you’ll end up on the next one of these lists. Error monitoring keeps a watchful eye over your application and reports any errors or crashes that your users have experienced, which is a crucial step in ensuring that an error doesn’t happen again.
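As a rough idea of what “keeping a watchful eye” means in practice, the sketch below hooks Python’s uncaught-exception handler and hands a small report to a callback. It is a generic illustration, not how any particular monitoring product works: `build_report` and `install_error_reporter` are hypothetical names, and a real tool would send the report to a service rather than append it to a list.

```python
import sys
import traceback
from typing import Callable, List


def build_report(exc_type, exc_value, exc_tb) -> dict:
    """Collect the basic facts about an exception into a plain dictionary."""
    return {
        "type": exc_type.__name__,
        "message": str(exc_value),
        "stack": traceback.format_exception(exc_type, exc_value, exc_tb),
    }


def install_error_reporter(report: Callable[[dict], None]) -> None:
    """Route uncaught exceptions to `report`, then run the previous handler."""
    previous_hook = sys.excepthook

    def hook(exc_type, exc_value, exc_tb):
        report(build_report(exc_type, exc_value, exc_tb))
        previous_hook(exc_type, exc_value, exc_tb)

    sys.excepthook = hook


# Stand-in for a network call to a monitoring service (assumption).
captured: List[dict] = []
install_error_reporter(captured.append)
```

Any exception that escapes the program now lands in `captured` along with its stack trace, before the usual traceback is printed.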

Don’t want to get caught out by your software bugs? Get automatically notified of your software errors with instant notifications. Book a demo with an experienced team member or sign up for a free trial.