Buffer Overflow – “Don't go out of Bounds”
CS2

Background

Summary: Data that is stored outside the bounds of an array or other block of memory can interfere with the operation of a program, causing it to crash or creating a vulnerability that hackers might exploit.

Description: Because computers store code, data, and information describing the state of running programs in memory, the data that a program manipulates might sit adjacent to the instructions and configuration data needed to run that program. This can cause a problem if the limits of the program's data are not properly respected: code that writes data outside of these limits can modify the state of the program. Hackers can use this to change program state, forcing a program to run their malicious code instead of the instructions that were intended.

Risk – How Can It Happen?: Buffer overflows occur when data is written outside of the bounds of a block of memory that has a fixed size. There are several ways that this might occur:

1. Arrays: In most programming languages, an array with n items will have valid indices in the range 0 to n-1. Using an index outside of this range (less than zero or greater than n-1) can cause a buffer overflow.
2. Heap-allocated memory: Some languages, such as C/C++, provide tools for allocating arbitrary blocks of memory in the heap. Instructions that write data outside the limits of these blocks can cause overflows.
3. Unbounded copying: Imagine a program that copies a block of data from one spot to another without verifying that it will fit. In the worst case, the data will overflow into adjacent storage. This is exactly what happens in some C/C++ procedures that fail to check bounds before copying data.

C and C++ are the languages that are most at risk for buffer overflows. Many languages, most notably Java, have facilities such as out-of-bounds exceptions or automatic buffer resizing that reduce risks.
However, buffer overflow problems have been found in the runtime systems for these languages, which are often written in C or C++.

Example of Occurrence: “On 19 Oct 2000, hundreds of flights were grounded or delayed because of a software problem in the Los Angeles air-traffic control system. The cause was attributed to a controller in Mexico typing 9 (instead of 5) characters of flight-description data, resulting in a buffer overflow.” Peter Neumann, RISKS DIGEST 21(9) http://catless.ncl.ac.uk/Risks/21.09.html#subj1.1

Example in Code:

    public class BufferOverflow {
        public static void main(String[] args) {
            int[] vals = new int[10];
            // The loop bound (20) is larger than the array size (10).
            for (int i = 0; i < 20; i++) {
                vals[i] = i;
            }
        }
    }

When this program is run, the loop counter will go past the last legitimate index in the array. As soon as the assignment statement tries to store a value in vals[10], the automatic array bounds checking in Java's run-time system will trigger an ArrayIndexOutOfBoundsException. In C/C++, similar code will have unpredictable results, depending on the operating system and the specific nature of the overflow: some overflows will not cause any visible problems, while others will cause the program to crash.

How can I avoid buffer overflow problems?

Validate indices: If you have an integer variable, verify that it is within the proper bounds before you use it as an index to an array. This validation is particularly important for any values that might have been provided as user input.

Don't allocate storage until you know how much you need: Allocating an array or other block of storage before you know how much space you need can lead you into a dangerous situation. If you end up needing more space than you originally thought, you might overflow the buffer. When at all possible, wait to allocate memory until after you know how much space you need. In some cases, this may mean allocating a new buffer instead of reusing an old one.
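The two guidelines above might be sketched in Java as follows. This is a minimal illustration and not part of the original module: the class name SafeAccess and the helper methods getOrDefault and parseAll are hypothetical names chosen for this example.

```java
public class SafeAccess {
    // Hypothetical helper: checks the index against the array bounds
    // before using it, and returns a fallback value if it is out of range.
    public static int getOrDefault(int[] values, int index, int fallback) {
        if (index < 0 || index >= values.length) {
            return fallback;
        }
        return values[index];
    }

    // Hypothetical helper: the result array is allocated only after the
    // number of tokens is known, so it is exactly as large as it needs to be.
    public static int[] parseAll(String[] tokens) {
        int[] result = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            result[i] = Integer.parseInt(tokens[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] vals = {10, 20, 30};
        System.out.println(getOrDefault(vals, 1, -1));  // prints 20
        System.out.println(getOrDefault(vals, 3, -1));  // prints -1 (index too large)
        System.out.println(getOrDefault(vals, -5, -1)); // prints -1 (negative index)

        int[] parsed = parseAll(new String[] {"4", "8", "15"});
        System.out.println(parsed.length);              // prints 3
    }
}
```

Returning a fallback value is only one possible response to a bad index. Depending on the program, printing an error message or asking the user for a new value may be more appropriate.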

When possible, use buffer-size accessors: Loops that iterate over arrays need to know the size of the array. Using a variable with the wrong value, or an incorrect constant, can lead to buffer overflows. Using language-provided accessors, such as num.length for an array named num in a Java program, can help you avoid some of these problems.

Avoid risky functions: Some languages have a variety of library functions that may lead to buffer overflow vulnerabilities. If you are using any library functions for reading data from the user, copying data, or allocating or freeing blocks of data, read the documentation to understand the appropriate use of these procedures. In many cases, more secure versions of risky functions may be available. Use those instead.

Use your tools: Many compilers will provide warnings in cases of potential buffer overflows. Various tools can examine your source code (so-called “static analysis”) or the state of your program while running (“run-time analysis”) to identify possible problems. These tools are available for a variety of programming languages.

Handle exceptions with care: Java and other languages will provide run-time exceptions in many cases that would lead to buffer overflow in C/C++. Although you can use exception-handling facilities to work with these errors, a better approach will often be to validate input and avoid situations that may cause buffer overflows. Exception handling can then be used to handle any additional situations where overflows might occur. Checking for and responding to potential overflows in your code, instead of relying on the exception-handling mechanism, will make your code more robust and secure. This will be particularly true if underlying vulnerabilities in the run-time system are identified.
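As a rough sketch of the buffer-size accessor guideline, the loop from the earlier BufferOverflow example could be bounded by the array's own length instead of a separate constant. The class name LengthAccessor and the methods fill and sum are hypothetical names used only for this illustration.

```java
public class LengthAccessor {
    // Uses vals.length as the loop bound, so the loop cannot run past
    // the end of the array even if the array's size is changed later.
    public static void fill(int[] vals) {
        for (int i = 0; i < vals.length; i++) {
            vals[i] = i;
        }
    }

    // An enhanced for loop visits every element without using an index,
    // which removes the chance of an out-of-bounds index entirely.
    public static int sum(int[] vals) {
        int total = 0;
        for (int v : vals) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] vals = new int[10];
        fill(vals);
        System.out.println(sum(vals)); // prints 45 (0 + 1 + ... + 9)
    }
}
```

Because neither loop depends on a hard-coded size, changing new int[10] to a different size requires no other changes to the code.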

Laboratory/Homework Assignment:

Consider this program:

    import java.util.*;

    public class Overflow {
        static final int INPUT_SIZE = 10;

        public static void main(String[] args) {
            char[] vals = new char[INPUT_SIZE];
            Scanner scan = new Scanner(System.in);
            String s1 = getString(scan);
            copyVals(s1, vals);
            String sub = getSubstring(scan, vals);
            System.out.println("sub string: " + sub);
        }

        public static String getString(Scanner scan) {
            System.out.print("Please type a string: ");
            String s = scan.nextLine();
            return s;
        }

        public static void copyVals(String s, char[] vals) {
            for (int i = 0; i < s.length(); i++) {
                vals[i] = s.charAt(i);
            }
        }

        public static String getSubstring(Scanner scan, char[] vals) {
            System.out.print("Starting point: ");
            int start = scan.nextInt();
            System.out.print("Ending point: ");
            int end = scan.nextInt();
            char[] newChars = getChars(start, end, vals);
            return new String(newChars);
        }

        public static char[] getChars(int start, int end, char[] vals) {
            int sz = end - start;
            char[] result = new char[sz];
            for (int i = 0; i < sz; i++) {
                result[i] = vals[start + i];
            }
            return result;
        }
    }

1. Complete the following checklist for this program.
2. List the potential buffer overflow errors.
3. Provide example inputs that might cause buffer overflow problems.
4. What strategies might you use to remove potential buffer overflow vulnerabilities from this program?
5. Revise the program to eliminate potential buffer overflow problems. You should be able to do this without adding any exception handling code.
6. Write a procedure that will copy an arbitrary subrange of one array of integers into another array. Your procedure will take four arguments:
   1. The source array of integers
   2. A starting point and ending point in the source array
   3. The destination array of integers (the array that you will be copying numbers to)
   4. An integer indicating the index of the position in the destination array where copying should start.
   Be sure to validate all input, responding appropriately to any validation problems.

Security Checklist:

Vulnerability: Buffer Overflow
Course: CS0

Task – Check each line of code                                        Completed
1. Underline each occurrence of an array.                             ____
For each underlined array:
2. Mark with a V any array assignment that could lead to overflow:    ____
   Indices less than zero
   Indices greater than or equal to the size of the array
3. Check any loop that iterates an array index. Mark with a V any
   loop boundary that could exceed the array max.                     ____

Shaded areas indicate vulnerabilities!

Discussion Questions:

1. Countless currently running programs were built using C and C++. Buffer overflow vulnerabilities are often found in these programs, often after they have been in use for many years. Why is it so difficult to find and fix buffer overflow flaws in software?
2. Text input boxes in graphical user interfaces present the possibility of a different kind of buffer overflow. Specifically, user input can fill the box and (in some cases) cause some of the input to be obscured. What are the possible problems that this type of overflow might cause? How do they differ from the problems associated with a buffer overflow in memory?
3. Buffer overflows can be troublesome if they are used by hackers to run their own code. What sort of things might a hacker try to do if they were able to run any code they wanted on a computer?