Bad Software
Greg Hoglund
CTO, Cenzic, Inc.
greg@cenzic.com

What is Bad Software?
• Software that exposes confidential data to unauthenticated users
• Software that crashes or grinds to a halt when exposed to faulty inputs
• Software that allows an attacker to inject code and execute it
• Software that executes privileged commands for an attacker

Denver Airport Baggage
• Unmanned carts on a track
• Bad failure recovery/detection
  – Piles of fallen bags would not stop the unloaders
• Carts got out of sync
  – Full carts continued to get loaded
  – Empty carts got unloaded
• Delayed airport opening for 11 months
  – $1 million a day in costs due to interest on bond issues

The last photo taken by the Mars Lander before it plunged to its death.*
*This photo was found on the Internet. It has not been independently verified.

NASA Mars Lander
• Failed translation
  – English units into metric units
  – Major error in spacecraft's path as it approached Mars
• Crashed into the planet
  – Shut off descent engines prematurely
• Taxpayer cost: $165 million

4 Marines Killed
• MV-22 Osprey Helicopter Crash
• Burst hydraulic failure
• Software caused backup system to fail

Do these look alike?

Navy shoots down Civilian Airliner
• In 1988, the USS Vincennes shoots down an Airbus A300
• 290 human lives lost
• “cryptic and misleading output displayed by the tracking software”

Microsoft’s $8.5 billion mistake
• I LOVE YOU was only possible because Microsoft Outlook was designed to execute programs that were mailed to it.

Why we have Bad Software
• Networked software is not designed to withstand a hostile environment
• Development tools do not prevent simple security bugs (e.g., buffer overflows)
• QA testing methods do not address security
• Customers pay for bad software

Getting Worse
• In order to compete, new services must be delivered
• New technology is not being properly tested for failures
• More connections, devices, and code

More Devices
What happens when buffer overflows and poor access controls lead to mobile code attacks on cellular phones?
Mobile code can affect distributed systems in a matter of hours

More Connections
• New protocols, delivery mediums
• A high degree of connectivity makes it possible for small failures to propagate and lead to massive outages
  – Telephone network outages
  – Power system grid failures

More Code
• Technology is being ‘glued’ together
• More feature rich, more drivers and libraries
  – In 1983, Microsoft Word was only 27,000 LOC

Code Size (lines of code)
• Solaris 7: 400,000
• Netscape: 17 million
• Space Station: 40 million
• Space Shuttle: 10 million
• Boeing 777: 7 million
• NT5: 35 million
• Windows 95: under 5 million
• Linux: 1.5 million

More Exposure
• Massive increase in connectivity
• A vast network of relationships
  – Arpanet started with 12 nodes
• Machines that used to work behind closed doors are now exposed
  – Computers are now worn on belt-loops

5 Million Backdoors
• 5 – 50 bugs per 1000 lines of code [Voas/McGraw]*
• 1 LOC ~ 10 bytes
• ~100K per EXE = 10,000 LOC / EXE
• 5 bugs / 1000 LOC = 50 bugs / EXE
• 3000 EXEs per host = 150,000 bugs / host
• X 30,000 hosts = 4.5 billion bugs
• 4.5 billion X 10% = 500 million security bugs
• 500 million X 10% = 5 million remote security bugs

Software is always in the “bleeding edge” phase
• Windows 2000 shipped with 63,000 known bugs

Software sucks because you buy it
• Yes, YOU the CONSUMER play a part in demanding bad software
• Demanding new features in a very short time frame creates a time-to-market problem for reliable software
  – Will you wait two years for the features you want?
  – Will you pay 10 times as much to get those features?

Deja Vu
• The same software bugs just keep hanging around
  – We knew about buffer overflows 15 years ago
• We are slow to adopt ideas
  – When will customers hold vendors liable for buffer overflows?
  – Is it reasonable to accept buffer overflows in production code?

Other Industries Get Sued
• Software shops gather around to defer bugs, decide which ones to ‘patch later’, and which ones to ignore
• In other industries, safety flaws that are not corrected result in major class-action suits

How come vendors don’t fix this stuff?
• They can afford not to!
• Hardware is expensive to replace
  – So huge investments are placed into testing hardware prior to release
  – Intel F00F bug cost $500 million
• Software bugs can be patched and downloaded from a web-site
  – They pass the cost of a bug to the customer

Software is not a Steel Bridge
• The methods used for testing in traditional analog systems do not apply to software
• With a bridge, you can interpolate results
  – What happens in between a 1000 kg test and a 10,000 kg test?
  – The system is continuous
  – State changes are gradual and predictable

Discrete systems
• State changes are not predictable
• Numbers can change between 00001111 and 11110000 in an instant

Let the compiler do the Diagnostics
• If programmers had to book time on the mainframe two weeks in advance, they would invest countless hours checking their work
• Code hackers today just bounce code off the compiler until all the errors go away
  – This puts the responsibility of “code review” on the compiler

Form follows Failure
• Sub-synchronous resonance in power systems
  – The addition of series AC capacitors in high energy power systems increases electrical stability
  – However, due to line inductance, the capacitors create electrical oscillations that affect the mechanical generator
• Mohave Generating Station, Southern Nevada, 1971
  – This snapped the drive shaft on a generator twice before it was properly diagnosed
  – This phenomenon is now a serious consideration in any power system design

How to Fix Bad Software
• Better compilers and languages
  – More formal, more tractable
• Failure analysis and fault-injection
• Hold vendors liable
• Stop buying it

Security Testing
Security testing requires attacking the software. The software should be tested for the unexpected and the unknown. Software will never be placed or deployed into a trusted or predictable environment.

The Missing Leg of Software Reliability
[Diagram: Reliability resting on Functional and Performance; a second diagram adds Security as the third leg]
Traditional QA testing methods have never addressed security. Software systems cannot be reliable unless they are secure.
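
The contrast drawn on the "Software is not a Steel Bridge" and "Discrete systems" slides can be made concrete with a small sketch. It is an illustration only, not taken from the talk: the one-byte length field and the names declared_length and invert_byte are hypothetical. It shows that a test at one input size says little about the next size up, because a discrete value can jump rather than drift.

```python
def declared_length(payload: bytes) -> int:
    """Length as it would be stored in a hypothetical one-byte length field.

    Only the low 8 bits survive, so the stored value wraps modulo 256.
    """
    return len(payload) & 0xFF


def invert_byte(value: int) -> int:
    """Flip every bit of an 8-bit value."""
    return ~value & 0xFF


if __name__ == "__main__":
    # 00001111 and 11110000 are exact bitwise complements of each other.
    print(format(invert_byte(0b00001111), "08b"))

    # Growing the input by one byte at a time: the stored length climbs
    # smoothly up to 255, then drops to 0 at 256 bytes.
    for size in (254, 255, 256):
        print(size, declared_length(b"A" * size))
```

A bridge tested at two loads can be trusted in between; here, passing tests at 254 and 255 bytes gives no assurance about 256, which is why the boundary values themselves have to be attacked.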

Security Testing History
• Attack and Pen
• Source Code Review
• Network Scanning
• Fault Injection
• Full Disclosure

Fault Injection
• Source code changes require recompile
• Binary instrumentation requires host agent
• API input testing requires test harness
• Network input testing requires additional network node

Black Box
• Can be automated
• Can easily find ‘low hanging fruit’
• Automated Tools:
  – ISICS
  – Spike
  – Hailstorm™
  – PROTOS

MS-SQL Overflow with Spike
    s_binary("12 01 00 34 00 00 00 00 00 00 15 00 06 01 00 1b");
    s_binary("00 01 02 00 1c 00 0c 03 00 28 00 04 ff 08 00 02");
    //this is probably a length field
    s_binary("10 00 00 00");
    //make this big
    s_string_variable("MSSQLServer");
    s_binary("00 24 01 00 00");

UDP-1434 SQL Overflow
Buffer Attack Injected Into Protocol Statement
             0040e890  e87b8cffff  call
             0040e895  c3          ret
             0040e896  8bc0        mov
    FAULT -> 0040e898  8b10        mov
             0040e89a  33c9        xor

White Box
• IDA-Pro (disassembler)
• More expensive and requires an expert
• Very time consuming

IDA reverse of popular app-server’s “CanonicalizeURIPath”

A Fusion – Grey Box
• Combines:
  – A runtime debugger
    • SoftIce
    • GDB
  – A white box tool
    • IDA
  – A black box tool
    • Hailstorm™

Using Instrumentation
• Using Rational Purify™
• Using API call hooks
• Using code coverage (gcov, etc.)
  – Canonicalization routines
  – Filtering routines
  – Decision logic
  – Parsers

Hailstorm™ crashes MS-SQL 7

Input Path Tracing
• Path tracing
  – ltrace
  – truss
• Data tracing
  – GDB breakpoints
  – Modified ltrace
• Where is user data getting placed?
  – Trusted API calls?
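
The data-tracing question on the "Input Path Tracing" slide ("Where is user data getting placed?") can be sketched as follows. This is a minimal illustration under stated assumptions, not the tooling used in the talk: the helper names make_tag and calls_touching_tag are hypothetical, and the sample trace lines are invented in an ltrace-like "name(args) = result" shape. The idea is to inject a recognizable marker as input, then scan a call trace for the calls whose arguments contain it.

```python
import re
import uuid


def make_tag(prefix: str = "TAG") -> str:
    """Return a marker string that is unlikely to appear in normal traffic."""
    return f"{prefix}_{uuid.uuid4().hex[:12]}"


# Matches lines shaped like: strcpy(0xbffff000, "/docs/TAG_ab12") = 0xbffff000
# (assumed trace format; adjust the pattern to the real tracer's output)
CALL_RE = re.compile(r"^\s*(?P<name>[A-Za-z_][\w.:]*)\((?P<args>.*)\)\s*=")


def calls_touching_tag(trace_lines, tag):
    """Yield (function_name, line) for each traced call whose arguments contain the tag."""
    for line in trace_lines:
        match = CALL_RE.match(line)
        if match and tag in match.group("args"):
            yield match.group("name"), line.strip()


if __name__ == "__main__":
    tag = "TEST_STRING"
    # Hypothetical trace output, invented for this example.
    sample_trace = [
        'getenv("HOME") = "/root"',
        'strlen("/iplanet/servers/TEST_STRING") = 28',
        'strcpy(0xf7400710, "/iplanet/servers/TEST_STRING") = 0xf7400710',
        "malloc(64) = 0x9f8b10",
    ]
    for name, line in calls_touching_tag(sample_trace, tag):
        print(name, "<-", line)
```

Calls that show up in the result are the places where user-controlled data reaches an API, which is the starting point for asking whether that API trusts its input.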

Boron Tagging with GDB
    .text:00056140 INTutil_uri_is_evil_internal:
    .text:00056140     ldsb   [%o0], %o1
    .text:00056144     mov    1, %o3
    .text:00056148     mov    2, %o4
    .text:0005614C     cmp    %o1, 0
    .text:00056150     be,pn  %icc, loc_561F4
    .text:00056154     mov    %o0, %o5
    .text:00056158     mov    %o2, %o0
    .text:0005615C     mov    0, %o2
    .text:00056160     cmp    %o1, 0x2F
    .text:00056164
    .text:00056164 loc_56164:
    .text:00056164     bne,a  %icc, loc_561DC

    (gdb) x/8s $o0
    0x97f030: "/iplanet/servers/TEST_STRING"
    0x97f064: "ervers/docs"
    0x97f070: "/usr/local/iplanet/docs"
    0x97f090: ""
    0x97f091: "\227ð\230"
    0x97f095: ""
    0x97f096: ""
    0x97f097: ""

Tag: TEST_STRING

Using TRUSS on Solaris
    # truss -u *:: -vall -xall -p 2307 2>&1 | grep -v read | grep -v poll
The 2>&1 redirection is required since truss does not deliver all of its data on stdout. The output of the command will look something like:
    /67: <- libns-httpd40:__0FT_util_strftime_convPciTCc() = 50
    /67: -> libns-httpd40:__0FT_util_strftime_convPciTCc(0xff2ed342, 0x2, 0x2, 0
    /67: <- libns-httpd40:__0FT_util_strftime_convPciTCc() = 0xff2ed345
    /67: <- libns-httpd40:INTutil_strftime() = 20
    /67: -> libns-httpd40:INTsystem_strdup(0xff2ed330, 0x9, 0x41, 0x50)
    /67: -> libns-httpd40:INTpool_strdup(0x9e03a0, 0xff2ed330, 0x0, 0x0)
    /67: -> libc:strlen(0xff2ed330, 0x0, 0x0, 0x0)
    /67: <- libc:strlen() = 20
    /67: <- libns-httpd40:INTpool_strdup() = 0x9f8b10
    /67: <- libns-httpd40:INTsystem_strdup() = 0x9f8b10
    /67: <- libns-httpd40:time_cache_curr_strftime_logfmt() = 0x9f8b10
    /67: -> libc:strcpy(0xf7400710, 0x9f8b10, 0x0, 0x7efefeff)
    /67: <- libc:strcpy() = 0xf7400710
    /67: -> libc:strlen(0xf7400710, 0x9f8b28, 0xf7400710, 0x0)
    /67: <- libc:strlen() = 20
    /67: -> libc:strlen(0x9f4f48, 0x34508f, 0x0, 0x7efefeff)
    /67: <- libc:strlen() = 25

Win32 hook on strcpy

If there is code for it…
• What if?
• Assume filters fail
• Assume API call input can be controlled
• Map the capability of every DLL
• Controlled by process permissions and access control
• Every DLL that calls SetSecurityDescriptorDACL

User Input
• What can the user directly control in terms of API calls?
  – Authentication calls
  – Filesystem
  – Database
  – Command shell

Remote Capability
• Do any of the native calls operate over the network?
  – Domain specification
  – Data source specification
  – IP address
  – NTFS path name

Authentication
• Response aggregation
  – User/password enumeration when errors differ
• No lockout
  – Brute force
• Failed logging
  – Alternative requests
• Can you specify a remote domain or target?
  – Proxied attacks

Filesystem
• Can you control a filesystem path?
  – What is the entire set of characters?
• Can you create files in a target directory?
  – Create files that will be interpreted in a server context
• Can you use a remote pathname?
  – //machine_name/etc

Execution Flaws
• Code Insertion
• State Corruption
• Fatal Exception

Architecture Flaws
• Lack of randomness
  – Hijacking keys
• No authentication
  – Bad configuration or design
• No compartments
  – Use the same buffer for crypto and clear
• Race conditions

Take control of the Problem
• Test before you buy
• Perform independent testing on the software
• Perform internal testing on the software
• Cooperate and create a shared testing lab
• Create acceptance criteria

Make the vendor responsible
• Vote with your dollars
• Force vendors into a comparison against competitive products
• Make the vendor produce a technically credible security audit
• Force vendors to accept liability associated with a security bug
  – Make the vendor pay the cost of a bug

It’s Ultimately Your Decision
• As the customers of technology, you have the right to demand safety and reliability
• Security knowledge is widespread
• Reliable software is secure
• Security testing is the only way to eliminate the bugs that undermine your systems

Thank You
Greg Hoglund
greg@cenzic.com
www.cenzic.com