Achieving Trusted Systems by Providing Security and Reliability
A Finite State Machine Methodology for Analyzing Security Vulnerabilities
Shuo Chen, Zbigniew Kalbarczyk, Jun Xu, Ravishankar K. Iyer

Motivations
– Identifying the root causes of security vulnerabilities can help us prevent and detect them.

General Objectives
– Understand the characteristics of security vulnerabilities.
– Identify reasoning flaws as root causes of the analyzed security vulnerabilities.

Specific Objectives
– How are security vulnerabilities distributed among different categories?
– What are the limitations of existing techniques of security vulnerability analysis?
– How can a new analysis technique be developed to overcome these limitations?

Overview of the Analysis Approach

Major Data Source: Bugtraq
– Data in Bugtraq are well organized and suitable for statistical analysis.

Three steps of the analysis
– Statistical study of the Bugtraq database.
– In-depth study of vulnerability reports and the corresponding source code.
– Development of an FSM (finite state machine) methodology to model the vulnerabilities, based on the observations from the analyzed data.

Statistical Analysis: Bugtraq Vulnerability Classification
5925 reports of security vulnerabilities (Nov. 30, 2002)
[Figure: pie chart of report share by category. Legible slices: Input Validation Error 23%, Boundary Condition Error 21%, Design Error 18%, Failure to Handle Exceptional Conditions 11%, Access Validation Error 10%, Configuration Error 5%. The remaining slices (6%, 3%, 2%, 1%) belong to the other legend categories.]
Legend categories: Access Validation Error, Atomicity Error, Boundary Condition Error, Configuration Error, Design Error, Environment Error, Failure to Handle Exceptional Conditions, Input Validation Error, Origin Validation Error, Race Condition Error, Serialization Error, Unknown.

In-depth Analysis of Vulnerability Reports
• Observations
– Observation 1: exploits must pass through multiple elementary activities.
– Observation 2: exploiting a vulnerability involves multiple vulnerable operations on several objects.
– Observation 3: for each elementary activity, the vulnerability data and corresponding code inspections allow us to define a predicate which, if violated, naturally results in a security vulnerability.
• These observations motivate the development of an FSM model to depict security vulnerabilities.

Case 1: Sendmail Debugging Function Signed Integer Overflow
[Figure: chain of pFSMs. Each pFSM has a SPEC Check State, a Reject State and an Accept State; transitions are labeled SPEC_ACPT, SPEC_REJ, IMPL_REJ and IMPL_ACPT.]

Operation 1: Write debug level i to tTvect[x]
– Elementary Activity 1: get text strings str_x and str_i
  pFSM1
  SPEC_REJ: (integer represented by str_x) > 2^31
  SPEC_ACPT: (integer represented by str_x) ≤ 2^31
  IMPL_REJ: ?
  On accept: convert str_i and str_x to integers i and x
– Elementary Activity 2
  pFSM2
  SPEC_REJ: x < 0 or x > 100
  SPEC_ACPT: 0 ≤ x ≤ 100
  IMPL_REJ: x > 100
  IMPL_ACPT: x ≤ 100
  On accept: tTvect[x] = i
– FIN: the .GOT entry of function setuid (i.e., pSetuid) points to Mcode

Operation 2: Manipulate the GOT entry of function setuid (i.e., pSetuid)
– Load pSetuid into memory during program initialization (starting the sendmail program)
– Elementary Activity 3
  pFSM3
  SPEC_REJ: pSetuid changed
  SPEC_ACPT: pSetuid unchanged
  IMPL_REJ: ?
  On accept: execute the code referred to by pSetuid
– FIN: execute Mcode

Case 2: NULL HTTPD Heap Overflow

Heap Layout: free chunk A; used chunk PostData; free chunk B (fd=A, bk=C); free chunk C.
Note: addr_free is the .GOT entry of function free.

Operation 1: Read postdata from socket to an allocated buffer PostData
– get (contentLen, input); contentLen is an integer, input is a text string to be read from a socket
  pFSM1
  SPEC_REJ: contentLen < 0
  SPEC_ACPT: contentLen ≥ 0
  On accept: calloc is called to allocate PostData[1024+contentLen]
– Copy input from the socket to PostData by recv() call
  pFSM2
  SPEC_REJ: length(input) > Size(PostData)
  SPEC_ACPT: length(input) ≤ Size(PostData)
– FIN: B->fd = &addr_free - (offset of field bk), B->bk = Mcode

Operation 2: Allocate and free the buffer PostData
  pFSM3
  SPEC_REJ: B->fd = &addr_free - (offset of field bk), B->bk = Mcode
  SPEC_ACPT: B->fd and B->bk unchanged (B->fd = A, B->bk = C)
  On accept: when the buffer is freed, execute B->fd->bk = B->bk
– FIN: the .GOT entry of function free points to Mcode

Operation 3: Manipulate the .GOT entry of function free (i.e., addr_free)
– Load addr_free into memory during program initialization
  pFSM4
  SPEC_REJ: addr_free changed
  SPEC_ACPT: addr_free unchanged
  On accept: execute addr_free when function free is called
– FIN: Mcode is executed

Common pFSM Types
Three common pFSM types are identified, corresponding to three common reasoning flaws in programs: Object Type Check, Content and Attribute Check, and Reference Consistency Check.

Example vulnerabilities and their pFSMs, by type:

Sendmail Signed Integer Overflow
– Object Type Check: pFSM1: Does the input represent a long integer?
– Content and Attribute Check: pFSM2: Is the integer in the interval [0, 100]?
– Reference Consistency Check: pFSM3: Is the GOT entry of setuid() unchanged?

NULL HTTPD Heap Overflow
– Content and Attribute Check: pFSM1: contentLen ≥ 0? pFSM2: length(input) ≤ size(buffer)?
– Reference Consistency Check: pFSM3: Are the free-chunk links unchanged? pFSM4: Is the GOT entry of free() unchanged?

Rwall File Corruption
– Object Type Check: pFSM2: Is the target file a terminal?
– Content and Attribute Check: pFSM1: Does the user have root privilege?

IIS Filename Decoding Vulnerability
– Content and Attribute Check: pFSM1: Does the filename contain "../"?

Xterm File Race Condition
– Content and Attribute Check: pFSM1: Does the user have write permission to the file?
– Reference Consistency Check: pFSM2: Does the filename refer to another unverified file?

GHTTPD Buffer Overflow on Stack
– Content and Attribute Check: pFSM1: size(message) ≤ 200?
– Reference Consistency Check: pFSM2: Is the return address unchanged?

rpc.statd Format String Vulnerability
– Content and Attribute Check: pFSM1: Does the filename contain format directives (e.g., %n, %d)?
– Reference Consistency Check: pFSM2: Is the return address unchanged?

Effectiveness of the FSM methodology
– Enables modeling a variety of security vulnerabilities, including stack overflow, heap overflow, signed integer overflow, format string vulnerability, and file race conditions.
– Helps uncover application vulnerabilities.
  • E.g., a new remotely exploitable heap overflow vulnerability, now published in Bugtraq, was discovered using this approach.

Future Directions
• Automate the FSM analysis of vulnerabilities.
• Each pFSM indicates a vulnerability and also an opportunity for detection. How can protection mechanisms be built based on the FSM?
• Study the common impacts of security vulnerabilities, e.g., what are the common activities of viruses?
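
Illustrative sketch: the Sendmail case as a chain of pFSMs

A minimal Python sketch of how the pFSM idea could be expressed for the Sendmail case. It is not the authors' tooling. The class `PFSM`, the helper `to_int32` and the function `write_debug_level` are hypothetical names introduced here. The "?" on the pFSM1 implementation transition is read as "no check is performed", which is an assumption, and 32-bit wraparound is assumed for the string-to-integer conversion.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class PFSM:
    """One primitive FSM: a specification predicate plus the check the code really performs."""
    name: str
    spec_accepts: Callable[[int], bool]   # predicate the specification requires
    impl_accepts: Callable[[int], bool]   # predicate the implementation enforces

    def transition(self, value: int) -> str:
        if self.spec_accepts(value):
            return "SPEC_ACPT"            # object satisfies the specification
        if self.impl_accepts(value):
            return "IMPL_ACPT"            # hidden path: spec rejects, code accepts
        return "IMPL_REJ"                 # code correctly rejects the object


def to_int32(n: int) -> int:
    """Wrap an arbitrary integer into a signed 32-bit value (assumed conversion behavior)."""
    return ((n + 2**31) % 2**32) - 2**31


# pFSM1: assumed to have no implementation check (the "?" in the diagram).
PFSM1 = PFSM("pFSM1", lambda n: n <= 2**31, lambda n: True)
# pFSM2: the specification wants 0 <= x <= 100, the code only rejects x > 100.
PFSM2 = PFSM("pFSM2", lambda x: 0 <= x <= 100, lambda x: x <= 100)


def write_debug_level(str_x: str) -> Tuple[int, List[str]]:
    """Return the index x that would be used for tTvect[x] and the path through the pFSMs."""
    raw = int(str_x)
    path = [PFSM1.transition(raw)]
    x = to_int32(raw)
    path.append(PFSM2.transition(x))
    return x, path


benign = write_debug_level("50")
exploit = write_debug_level("4294966296")   # wraps to a negative index
rejected = write_debug_level("101")
```

An input that takes an IMPL_ACPT transition at any pFSM has followed a hidden path, which is the point at which a detection check could be placed.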
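
Illustrative sketch: NULL HTTPD predicates pFSM1 and pFSM2

The first operation of the NULL HTTPD case can be checked with two small predicates. The sketch below only does the arithmetic: it computes the size of PostData as 1024 + contentLen, as in the diagram, and reports which specification predicates a request violates. The function names and the sample values (contentLen = -800, 1024 bytes of input) are assumptions chosen for illustration.

```python
from typing import List, Tuple


def postdata_size(content_len: int) -> int:
    """Size of the PostData buffer as allocated in the diagram: 1024 + contentLen."""
    return 1024 + content_len


def pfsm1_spec(content_len: int) -> bool:
    """pFSM1 predicate: contentLen >= 0."""
    return content_len >= 0


def pfsm2_spec(input_len: int, buffer_size: int) -> bool:
    """pFSM2 predicate: length(input) <= Size(PostData)."""
    return input_len <= buffer_size


def check_request(content_len: int, input_len: int) -> Tuple[int, List[str]]:
    """Return the buffer size and the names of the violated predicates."""
    violations = []
    if not pfsm1_spec(content_len):
        violations.append("pFSM1")
    size = postdata_size(content_len)
    if not pfsm2_spec(input_len, size):
        violations.append("pFSM2")
    return size, violations


well_formed = check_request(content_len=100, input_len=100)
malicious = check_request(content_len=-800, input_len=1024)  # hypothetical request
```

A negative contentLen shrinks the buffer below 1024 bytes, so an input that would otherwise fit now violates pFSM2 as well.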
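
Illustrative sketch: the free-chunk unlink write (pFSM3 and pFSM4)

The step "B->fd->bk = B->bk" can be simulated with a dictionary standing in for memory. This is a toy model, not real allocator code. The addresses, the value of `BK_OFFSET` and all function names are made up for illustration, and only the single write named in the diagram is modeled.

```python
BK_OFFSET = 12              # assumed offset of field bk inside a chunk
ADDR_FREE = 0x0804A010      # hypothetical address of the .GOT entry of free
ORIGINAL_FREE = 0x08048500  # hypothetical address of the real free()
MCODE = 0x0BADC0DE          # hypothetical address of the attacker's code
CHUNK_A = 0x09000000        # hypothetical addresses of free chunks A and C
CHUNK_C = 0x09002000


def links_unchanged(chunk: dict) -> bool:
    """pFSM3 predicate: B->fd and B->bk still point to A and C."""
    return chunk["fd"] == CHUNK_A and chunk["bk"] == CHUNK_C


def unlink_write(memory: dict, chunk: dict) -> None:
    """Model the write B->fd->bk = B->bk performed when the buffer is freed."""
    memory[chunk["fd"] + BK_OFFSET] = chunk["bk"]


def got_unchanged(memory: dict) -> bool:
    """pFSM4 predicate: the .GOT entry of free still holds its original value."""
    return memory[ADDR_FREE] == ORIGINAL_FREE


# Chunk B after the overflow in Operation 1 has rewritten its links.
memory = {ADDR_FREE: ORIGINAL_FREE}
chunk_b = {"fd": ADDR_FREE - BK_OFFSET, "bk": MCODE}

pfsm3_ok = links_unchanged(chunk_b)   # specification predicate is violated
unlink_write(memory, chunk_b)         # implementation performs the write anyway
pfsm4_ok = got_unchanged(memory)      # .GOT entry now points to Mcode
```

Either predicate could serve as a detection point: checking the links before the unlink, or checking the .GOT entry before free is called.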