Software_Engineering_Disasters

Software in our lives, then and now
“I think there is a world market for maybe five computers.” - IBM Chairman Thomas Watson, 1943

Medical (processing and analysis, computer-aided surgery, various other equipment)

Financial and business (banking, trading)

Transportation (trains, cars, planes, auto-pilot)

Home (security / fire)

Leisure

Military
Murphy’s law
“Anything that can go wrong, will go wrong.”
Previously in CS 577

Mars 2 Rover crash-landing (1971)

dust storm caused incorrect landing angle computations?

Ariane 5 self-destruct (1996)

Data conversion from 64-bit floating point to 16-bit signed integer: overflow

Cost: $370,000,000
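
The conversion problem above can be sketched in a few lines. This is an illustration in Python, not the flight software; the function names and the sample value are hypothetical. It shows why a large floating-point value cannot fit in a 16-bit signed integer, and the two usual outcomes: silent wraparound, or an error that the caller has to handle.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def to_int16_wrapping(value: float) -> int:
    """Keep only the low 16 bits of the truncated value (two's complement)."""
    raw = int(value) & 0xFFFF
    return raw - 0x10000 if raw >= 0x8000 else raw

def to_int16_checked(value: float) -> int:
    """Convert only when the truncated value fits; otherwise raise."""
    n = int(value)
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in a 16-bit signed integer")
    return n

sample = 40000.0  # hypothetical reading, larger than INT16_MAX
print(to_int16_wrapping(sample))  # -25536
```

A checked conversion only helps if something sensible happens when it fails; an error that nobody handles can still bring the system down.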

Therac-25

Beta radiation overdose (10,000%)

Replacing hardware interlocks with software interlock mechanisms

Frequent overflow in a one-byte counter. Operator input arriving while the counter overflowed caused the interlock mechanism to fail (a race condition)

3 deaths, 3 injured

Unrealistic risk assessment, inadequate testing
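
A minimal sketch of the one-byte counter problem, assuming a flag that is incremented on every pass and treated as "nothing to check" when it is zero. The function names and the pass counts are hypothetical and only illustrate the mechanism, not the actual machine code.

```python
def interlock_runs(passes: int) -> bool:
    """Return True if the safety check would run after the given number of passes."""
    flag = 0
    for _ in range(passes):
        flag = (flag + 1) & 0xFF  # one-byte counter: 255 + 1 wraps to 0
    return flag != 0              # zero is misread as "nothing to check"

def interlock_runs_fixed(passes: int) -> bool:
    """Same logic, but the flag is set to a constant instead of incremented."""
    flag = 0
    for _ in range(passes):
        flag = 1                  # cannot overflow
    return flag != 0

# Passes on which the check is silently skipped
skipped = [n for n in range(1, 1025) if not interlock_runs(n)]
print(skipped)  # [256, 512, 768, 1024]
```

If operator input lands on one of the skipped passes, nothing stops the machine from proceeding, which is why a rare timing coincidence was enough.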

AMR / Budget Rent-A-Car / Hilton Hotels / Marriott International “Confirm”

Bank of America “MasterNet”
Disasters at the people (not company) level



Panama Radiation Therapy Overdose (2000)

18 deaths, 10 injured

Double counting, overreliance on automation

Various military vehicle crashes

Chinook Helicopter Crash, 29 deaths (1994): uncommanded run up and run
down of the engines (analysis shows 486 anomalies in 18% of the code)

V-22 Osprey Crash, 4 deaths (2000): software causes aircraft to decelerate
when pilot attempts to reset software

Failed missile interception, 28 deaths, 94 injured (1991): system clock drift from accumulated rounding error
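
The slide only names the system clock. Public analyses of the 1991 incident attribute the failure to a tenth-of-a-second tick stored in a truncated binary fixed-point format, so the error grew with uptime. The sketch below reproduces that arithmetic; the 23 fractional bits and the 100-hour uptime are assumptions taken from those analyses, not from this deck, and the names are hypothetical.

```python
from fractions import Fraction

FRACTION_BITS = 23        # assumed number of bits kept after the binary point
TICK = Fraction(1, 10)    # the clock counts tenths of a second

# 1/10 has no finite binary expansion, so the stored value is chopped
stored_tick = Fraction(int(TICK * 2**FRACTION_BITS), 2**FRACTION_BITS)
error_per_tick = TICK - stored_tick

def clock_drift_seconds(hours_up: int) -> float:
    """Accumulated clock error after the given number of hours of uptime."""
    ticks = hours_up * 3600 * 10
    return float(ticks * error_per_tick)

print(round(clock_drift_seconds(100), 4))  # 0.3433
```

A third of a second sounds small, but at missile speeds it corresponds to a tracking error of hundreds of meters.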
Y2K (2000)

Abbreviating year with 2 digits

$300,000,000,000 cost
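
A small sketch of why two-digit years break date arithmetic, and of the "windowing" workaround that was commonly applied. The function names and the pivot value of 50 are hypothetical choices for illustration.

```python
def age_two_digit(birth_yy: int, current_yy: int) -> int:
    """Age computed directly from two-digit years, as legacy code often did."""
    return current_yy - birth_yy

def expand_year(yy: int, pivot: int = 50) -> int:
    """Windowing: two-digit years at or above the pivot are read as 19xx, the rest as 20xx."""
    return 1900 + yy if yy >= pivot else 2000 + yy

print(age_two_digit(65, 99))                # 34  (1999, correct)
print(age_two_digit(65, 0))                 # -65 (2000, wrong)
print(expand_year(0) - expand_year(65))     # 35  (2000, correct)
```

Windowing only moves the problem to the pivot year; storing four-digit years removes it.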
Toyota Anti-Lock Brake recalls (2010)

~150,000 vehicles recalled

Reason: 1 second lag

At 60 mph (96.5 km/h), that is ~90 feet (27.5 m) of travel

Enough to cause accidents
Bad PR

$1.1 billion in repairs

$770-880 million in lost sales

Endangering people’s lives
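
The distance covered during the 1-second lag is simple unit conversion. The helper below is a hypothetical sketch that checks the order of magnitude of the figure quoted above; it gives roughly 88 feet, consistent with the rounded "~90 feet".

```python
METERS_PER_MILE = 1609.344
METERS_PER_FOOT = 0.3048

def lag_distance(speed_mph: float, lag_s: float) -> tuple[float, float]:
    """Distance travelled at constant speed during the lag, as (feet, meters)."""
    meters = speed_mph * METERS_PER_MILE / 3600 * lag_s
    return meters / METERS_PER_FOOT, meters

feet, meters = lag_distance(60, 1.0)
print(f"{feet:.1f} ft, {meters:.1f} m")  # 88.0 ft, 26.8 m
```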
Toyota: "Moving forward"... even when you don't want to.
Stock Market Flash Crash (2010)

Dow Jones Industrial Average (a very closely watched U.S. benchmark index tracking stock market activity).

Biggest intraday point decline up to that time, 998.5 points

Cost: $1,000,000,000,000

Procter & Gamble, Accenture: share prices fell as low as a penny or rose as high as $100,000.

The index recovered a large part of the point drop
Cold War Nuclear Missile False Alarm

Very sensitive period

Strategy was an immediate nuclear counter-attack to
guarantee “Mutually Assured Destruction”

How it was mitigated: the soldier on duty judged the alert to be a computer error

The bug: false alarm created by a rare alignment of sunlight
on high-altitude clouds and the satellites’ orbits.

Potential cost: nuclear World War 3
What’s next?
Just as Thomas Watson couldn’t guess what was coming up in
the next 40 years, it is pretty hard for us to estimate how
computers and technology will evolve in the near future.
However, we know for sure that software systems will get
MUCH larger and more complex, more tasks will be automated,
and reliance on software will greatly increase.
Do more testing?
Testing will only catch ~80% of the bugs.
“Program testing can be used to show the presence of bugs, but
never to show their absence!” - Edsger Dijkstra
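
A small illustration of Dijkstra's point, using a hypothetical and deliberately incomplete leap-year function. Every test in the chosen suite passes, yet the function is still wrong for an input the suite never tries.

```python
import calendar

def is_leap_year(year: int) -> bool:
    """Deliberately incomplete: ignores the century rules."""
    return year % 4 == 0

# A plausible-looking test suite: all four cases pass
test_cases = {1996: True, 2000: True, 2004: True, 2023: False}
all_pass = all(is_leap_year(y) == expected for y, expected in test_cases.items())

# An input nobody tested: 1900 was not a leap year
missed = is_leap_year(1900) != calendar.isleap(1900)

print(all_pass, missed)  # True True
```

Passing tests show that the tested inputs work; they say nothing about the inputs that were left out.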
Conclusion: our role

Our responsibility increases as the need for reliability in our
systems increases

Proper process / practices in architecting, managing risks,
developing and testing.


As we were taught in various SE classes (577, 578…)
Good communication between stakeholders

To ensure all sides are talking about the same thing