Buffer Overflow – “Don't go out of Bounds” CS2

Background

Summary: Data that is stored outside the bounds of an array or other block of memory can interfere with the operation of a program, causing it to crash or creating a vulnerability that hackers might exploit.

Description: Computers store code, data, and information describing the state of running programs in memory, so the data that a program manipulates may sit next to instructions and configuration data that are needed to run the program properly. This causes a problem if the limits of the program's data are not respected: code that writes data outside of these limits can modify the state of the program. Attackers can use this to change program state, forcing a program to run their malicious code instead of the instructions that were intended.

[Figure: program memory when running normally (program instructions; buffer storing data; pointer that controls which line of the program runs next) and after an attack (program instructions; buffer containing the hacker's code; corrupted pointer that executes the malicious code). Caption: Attacker plants code that overflows the buffer and corrupts the program pointer, causing malicious code to run.]

Risk – How Can It Happen?: Buffer overflows occur when data is written outside of the bounds of a block of memory that has a fixed size. In many programming languages, an array with n items has valid indices in the range 0 ... n-1. Writing through an index outside of this range (less than zero or greater than n-1) will cause a buffer overflow. Typically, this will happen in one of two ways:

1. Unbounded array access: Some functions write data into an array without providing any means of indicating the size of the array. For example, the C/C++ gets function reads characters from standard input into a character array, but it does not have any arguments indicating the size of the array. It is very hard to write code that uses gets without introducing a buffer overflow vulnerability, and the function was removed from the C standard in C11.

2. Inadequately bounded copying: Imagine a program that copies a block of data from one spot to another without verifying that it will fit. In the worst case, the data will overflow into adjacent storage. This is exactly what happens in some C and C++ functions, such as strcpy, that do not check bounds before copying data.

C and C++ are the languages that are most at risk for buffer overflows. Many languages, most notably Java, have facilities such as out-of-bounds exceptions or automatic buffer resizing that reduce risks. However, buffer overflow problems have been found in the runtime systems for these languages, which are often written in C or C++.

Example of Occurrence: A buffer overflow in the Java Runtime Environment (JRE)'s code for handling web applets led to a vulnerability that might have been used by untrusted applets to inappropriately gain access to read and write files, or to execute applications, on a client computer (see the "Unpack200" advisory listed below). Other buffer overflow vulnerabilities that have been found in the Java system include problems with Java Web Start and GIF image processing code.

A Buffer Overflow Vulnerability in the Java Runtime Environment (JRE) "Unpack200" JAR Unpacking Utility May Lead to Escalation of Privileges, http://sunsolve.sun.com/search/document.do?assetkey=1-66-244992-1
Sun Java Web Start JNLP File Processing Buffer Overflow, http://secunia.com/advisories/25981/
Sun Java JRE GIF Image Processing Buffer Overflow Vulnerability, http://secunia.com/advisories/23757/

Example in Code:

    public class BufferOverflow {
        public static void main(String[] args) {
            int[] vals = new int[10];        // valid indices are 0 through 9
            for (int i = 0; i < 20; i++) {   // loop limit is larger than the array
                vals[i] = i;
            }
        }
    }

When this program is run, the loop counter will increment past the last valid index of the array. As soon as the assignment statement tries to store a value in vals[10], the automatic array bounds checking in Java's run-time system will throw an ArrayIndexOutOfBoundsException.
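To see where the run-time check fires, the same loop can be wrapped in a small harness. This is only an observation aid under stated assumptions: the class name BufferOverflowDemo and the method firstRejectedIndex are hypothetical and not part of the original example, and catching the exception here is for demonstration, not a recommended way to handle bad indices.

```java
public class BufferOverflowDemo {
    // Hypothetical helper: runs the overflowing loop and returns the first
    // index that Java's bounds checking rejects, or -1 if no index is rejected.
    static int firstRejectedIndex(int arraySize, int loopLimit) {
        int[] vals = new int[arraySize];
        int i = 0;
        try {
            for (i = 0; i < loopLimit; i++) {
                vals[i] = i;   // throws once i reaches arraySize
            }
        } catch (ArrayIndexOutOfBoundsException e) {
            return i;          // i still holds the offending index
        }
        return -1;
    }

    public static void main(String[] args) {
        // Same sizes as the example: a 10-element array and a loop limit of 20.
        System.out.println("First rejected index: " + firstRejectedIndex(10, 20));
        // prints "First rejected index: 10"
    }
}
```

With an array of 10 elements, indices 0 through 9 are written successfully and index 10 is the first one rejected, which matches the description above.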
In C/C++, similar code has unpredictable results, depending on the compiler and the specific nature of the overflow: some overflows will not cause any apparent problems, while others will cause the program to crash. Regardless, any buffer overflow is undefined behavior and needs to be corrected.

How can I avoid buffer overflow problems?

Validate indices: If you have an integer variable, verify that it is within the proper bounds before you use it as an index to an array. This validation is particularly important for any values that might come from untrusted sources such as user input, network data, or untrusted files.

Try to avoid allocating storage until you know how much you need: Allocating an array or other block of storage before you know how much space you need can lead you into a dangerous situation. If you end up needing more space than you originally thought, you might overflow the buffer. When at all possible, wait to allocate memory until after you know how much space you need. In some cases, this may mean allocating a new buffer instead of reusing an old one.

Use alternative data structures that reduce the risk of overflows: Many buffer overflow vulnerabilities can be avoided by using vectors or other structures instead of traditional arrays. When possible, use ArrayLists and iterators instead of arrays and integer-indexed loops. Note that these tools will not by themselves keep you out of trouble: you will still have to write your code carefully and correctly. However, they can reduce your risk of buffer overflow vulnerabilities.

When possible, use buffer-size accessors: Loops that iterate over arrays need to know the size of the array. Using a variable with the wrong value, or an incorrect constant, can lead to buffer overflows. Using language-provided accessors, such as num.length for an array named num in a Java program, can help you avoid some of these problems.
Avoid risky functions: Some languages have a variety of library functions that may lead to buffer overflow vulnerabilities. If you are using any library functions for reading data from the user, copying data, or allocating or freeing blocks of data, make sure you understand the appropriate use of these functions. In many cases, more secure versions of risky functions are available; use these instead.

Use your tools: Many compilers provide warnings in cases of potential buffer overflows. Use high warning settings, and fix your code to avoid warnings. Use static analysis tools to analyze your source code, or use dynamic analysis tools to examine the state of your program while it is running and report possible problems. These tools are available for a variety of programming languages.

Handle exceptions with care: Java and other languages will provide run-time exceptions in many cases that would lead to buffer overflow in C and C++. Although you can use exception-handling facilities to work with these errors, a better approach is to validate input and avoid situations that may cause buffer overflows. Exception handling can then be used to handle any additional situations where overflows might occur. Checking for and responding to potential overflows in your code, instead of relying on the exception-handling mechanism, will make your code more robust and secure. This is particularly true if underlying vulnerabilities exist in the run-time system.
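Several of the strategies above can be sketched together in Java. This is a minimal illustration rather than a complete solution: the class name SafeAccess and the methods storeAt, fillWithIndices, and toCharList are hypothetical names chosen for this sketch, and returning false on a bad index is just one possible way to respond to a validation failure.

```java
import java.util.ArrayList;
import java.util.List;

public class SafeAccess {
    // Validate indices: check the index against the array's bounds before
    // using it. Returns false (and writes nothing) if the index is invalid.
    static boolean storeAt(int[] vals, int index, int value) {
        if (vals == null || index < 0 || index >= vals.length) {
            return false;
        }
        vals[index] = value;
        return true;
    }

    // Use buffer-size accessors: the loop limit is vals.length, not a
    // separately maintained constant that could drift out of date.
    static void fillWithIndices(int[] vals) {
        for (int i = 0; i < vals.length; i++) {
            vals[i] = i;
        }
    }

    // Use alternative data structures, and allocate only once the needed
    // size is known: the list is sized from the input and grows on demand.
    static List<Character> toCharList(String s) {
        List<Character> chars = new ArrayList<>(s.length());
        for (char c : s.toCharArray()) {
            chars.add(c);
        }
        return chars;
    }

    public static void main(String[] args) {
        int[] vals = new int[10];
        fillWithIndices(vals);
        System.out.println(storeAt(vals, 10, 99)); // prints "false": 10 is out of range
        System.out.println(storeAt(vals, 9, 99));  // prints "true"
        System.out.println(toCharList("a string longer than ten characters").size());
    }
}
```

Because storeAt rejects a bad index before any write happens, no exception handling is needed for the out-of-range case; the caller decides how to respond to the false return value.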
Laboratory/Homework Assignment: Consider this program:

    import java.util.*;

    public class Overflow {
        static final int INPUT_SIZE = 10;

        public static void main(String[] args) {
            char[] vals = new char[INPUT_SIZE];
            Scanner scan = new Scanner(System.in);
            String s1 = getString(scan);
            copyVals(s1, vals);
            String sub = getSubstring(scan, vals);
            System.out.println("sub string: " + sub);
        }

        public static String getString(Scanner scan) {
            System.out.print("Please type a string: ");
            String s = scan.nextLine();
            return s;
        }

        public static void copyVals(String s, char[] vals) {
            for (int i = 0; i < s.length(); i++) {
                vals[i] = s.charAt(i);
            }
        }

        public static String getSubstring(Scanner scan, char[] vals) {
            System.out.print("Starting point: ");
            int start = scan.nextInt();
            System.out.print("Ending point: ");
            int end = scan.nextInt();
            char[] newChars = getChars(start, end, vals);
            return new String(newChars);
        }

        public static char[] getChars(int start, int end, char[] vals) {
            int sz = end - start;
            char[] result = new char[sz];
            for (int i = 0; i < sz; i++) {
                result[i] = vals[start + i];
            }
            return result;
        }
    }

1. Complete the following checklist for this program.
2. List the potential buffer overflow errors.
3. Provide example inputs that might cause buffer overflow problems.
4. What strategies might you use to remove potential buffer overflow vulnerabilities from this program?
5. Revise the program to eliminate potential buffer overflow problems. You should be able to do this without adding any exception handling code.
6. Write a procedure that will copy an arbitrary range of one array of integers into another array. Your procedure will take the following arguments:
   a. The source array of integers
   b. A starting point and ending point in the source array
   c. The destination array of integers (the array that you will be copying numbers to)
   d. An integer indicating the index in the destination array where copying should start.
   Be sure to validate all input, responding appropriately to any validation problems.
Security Checklist:

4. Underline all occurrences of variables that are used as array indices (as marked in step 3).

For each underlined index variable:
5. Mark with a V any assignment, operation, or input to the index variable.
6. Mark with a V any function arguments that are used to send array indices in, and write the legal range next to each argument. If your function does not verify that the argument is indeed within those limits, you may have a vulnerability.
7. Identify counted loops that modify the index. For any index that occurs as part of a loop conditional, underline the loop limit. For example, if i < max is the conditional in a for loop, underline max.
8. Write the legal range next to the loop limit as you did in step 3.
9. Underline all occurrences of variables that are used as loop limits.

For each underlined loop limit variable:
10. Mark with a V any assignment, operation, or input to the loop limit variable.
11. Mark with a V any function arguments that are used to send loop limits in, and write the legal range next to each argument. If your function does not verify that the argument is indeed within those limits, you may have a vulnerability.

For each function that has arguments marked with a V:
12. Mark with a V any calls to the function. If the function does not verify that parameters are within the required range, the calls should do so.

Shaded areas indicate vulnerabilities!

Discussion Questions:

1. Buffer overflows are more troublesome for some programming languages than for others. For example, C and C++ lack the built-in bounds checking facilities that Java provides. Some people have argued that this is a good reason to avoid C and C++ in favor of Java or other “safer” languages. Do you think this is a good idea? Why or why not?
2. Countless currently running programs were built using C and C++. Buffer overflow vulnerabilities are often found in these programs, often after they have been in use for many years. Why should it be so difficult to find and fix buffer overflow flaws in software?
3. Buffer overflows can be troublesome if they are used by hackers to run their own code. What sort of things might an attacker try to do if he or she were able to run arbitrary code on a computer?