Detecting Format String Vulnerabilities with Type Qualifiers
Umesh Shankar, Kunal Talwar, Jeffrey S. Foster, David Wagner
University of California at Berkeley

Format String Bugs
I/O functions in C use format strings:
    printf("%s", buf);    // print buf as a string
But you can also write:
    printf(buf);

Format String Bugs
An attacker may control the contents of the buffer:
    buf = <data-from-network>
    printf(buf);
Set buf = "%s%s%s%s%s" to crash the program.
Set buf = "…%n…" to write to memory.
This may yield exploits that gain root access.

Format String Bugs
Application        Found by          Impact        Years
---------------------------------------------------------
wu-ftpd 2.*        security.is       remote root   >6
Linux rpc.statd    security.is       remote root   >4
IRIX telnetd       LSD               remote root   >8
Apache + PHP3      security.is       remote user   >2
NLS / locale       CORE SDI          local root    ?
screen             Jouko Pynnonen    local root    >5
BSD chpass         TESO              local root    ?
OpenBSD fstat      ktwo              local root    ?

Traditional Techniques
Testing: how do we ensure coverage?
Manual code review: the bugs are too subtle.
    while (fgets(buf, sizeof buf, f)) {
        lreply(200, buf);
    }
    void lreply(int n, char *fmt, ...) {
        vsnprintf(buf, sizeof buf, fmt, ap);
    }
Re-implementation: not possible for legacy systems.

Using Type Qualifiers
Add qualifier annotations:
    int printf(untainted char *fmt, ...);
    tainted int getchar();
    int main(int argc, tainted char *argv[]);
tainted = may be controlled by the attacker
untainted = must not be controlled by the attacker

The Basic Idea
    void f(tainted int);        void g(untainted int);
    untainted int a;            untainted int a;
    tainted int b;              tainted int b;
    f(a);  // OK                g(a);  // OK
    f(b);  // OK                g(b);  // Error

Subtyping
    void f(tainted int);
    void g(untainted int);
    untainted int a;
    tainted int b;
    f(a);
    g(b);
f accepts both tainted and untainted data.
g accepts only untainted data.
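The acceptance rules for f and g above can be modeled in ordinary C. This is a minimal sketch, not how Cqual is implemented: the names qual_t, qual_leq, call_ok, and basic_idea_demo are hypothetical and exist only for illustration. It encodes the two qualifiers as an ordered enum and treats a call as well-typed when the argument's qualifier is a subtype of the parameter's qualifier.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two-point lattice: untainted < tainted. */
typedef enum { UNTAINTED = 0, TAINTED = 1 } qual_t;

/* Subtyping on qualifiers: q1 <= q2 in the lattice. */
bool qual_leq(qual_t q1, qual_t q2)
{
    return q1 <= q2;
}

/* A call type-checks when the argument's qualifier is a subtype of
   the qualifier declared on the parameter. */
bool call_ok(qual_t param, qual_t arg)
{
    return qual_leq(arg, param);
}

/* Replays the four calls from the slide: f takes a tainted int,
   g takes an untainted int. */
void basic_idea_demo(void)
{
    const qual_t a = UNTAINTED, b = TAINTED;

    assert(call_ok(TAINTED, a));      /* f(a): OK    */
    assert(call_ok(TAINTED, b));      /* f(b): OK    */
    assert(call_ok(UNTAINTED, a));    /* g(a): OK    */
    assert(!call_ok(UNTAINTED, b));   /* g(b): Error */
}
```

Calling basic_idea_demo() from a main function runs the four checks; only the g(b) case is rejected, which matches the slide.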
untainted < tainted

Type System
The type system proves judgments of the form
\[ \Gamma \vdash e : \tau \]
"In type environment \(\Gamma\), expression \(e\) has type \(\tau\)."

Type Rules
\[ \frac{\Gamma \vdash e_1 : \tau_1 \to \tau_2 \qquad \Gamma \vdash e_2 : \tau_1}{\Gamma \vdash e_1\, e_2 : \tau_2} \]
"In type environment \(\Gamma\), if expression \(e_1\) has the function type from \(\tau_1\) to \(\tau_2\) and \(e_2\) has type \(\tau_1\), then the application of \(e_1\) to \(e_2\) has type \(\tau_2\)."

Partial Order
A partial order is a relation \(\le\) that satisfies the following three properties:
1. reflexivity: \(a \le a\)
2. antisymmetry: if \(a \le b\) and \(b \le a\), then \(a = b\)
3. transitivity: if \(a \le b\) and \(b \le c\), then \(a \le c\)

Lattice
A lattice is a partial order in which any two elements \(x\) and \(y\) have a least upper bound, \(x \sqcup y\), and a greatest lower bound, \(x \sqcap y\).
[Figure: two lattice diagrams, one with elements a, b, c, d and one with tainted above untainted]

Qualifier Subtyping Rule
\[ \frac{Q_1 \le Q_2}{Q_1\ \mathrm{int} \le Q_2\ \mathrm{int}} \qquad \frac{\mathrm{untainted} \le \mathrm{tainted}}{\mathrm{untainted\ int} \le \mathrm{tainted\ int}} \]
int can be replaced by any C primitive data type: char, double, …
What about pointers?

Pointer Qualifier Subtyping Rule
\[ \frac{Q_1 \le Q_2 \qquad T_1 \le T_2}{Q_1\ \mathrm{ptr}(T_1) \le Q_2\ \mathrm{ptr}(T_2)} \]
Wrong!
    tainted char *t;        // T2 = tainted char
    untainted char *u;      // T1 = untainted char
    t = u;                  // allowed by the wrong rule
    *t = <tainted data>;    // t is an alias of u, so u now points to tainted data

Pointer Aliasing
We have multiple names for the same memory location.
But they have different types.
And we can write into memory at different types.
[Figure: pointers t (tainted) and u (untainted) referring to the same memory location]

Pointer Qualifier Subtyping Rule
The right rule:
\[ \frac{Q_1 \le Q_2 \qquad T_1 = T_2}{Q_1\ \mathrm{ptr}(T_1) \le Q_2\ \mathrm{ptr}(T_2)} \]

Type Qualifier Inference
Recall the format string vulnerabilities.
We have legacy C programs that carry no information about qualifiers.
We add qualifier annotations to the standard library functions.
Then we check whether there are any contradictions.
This requires type qualifier inference.

Qualifier Inference
Place a small number of annotations at key places in the program.
Generate a fresh qualifier variable at every position in a type.
Analyze the program and generate subtyping constraints.
Check whether the constraints have a valid solution.

Qualifier Inference Example
    tainted char *getenv(const char *name);
    int printf(untainted const char *fmt, ...);
    char *s, *t;
    s = getenv("LD_LIBRARY_PATH");
getenv_ret_p = tainted
printf_arg0_p = untainted
    (call to getenv)    getenv_ret ≤ s       getenv_ret_p ≤ s_p
    t = s;              s ≤ t                s_p ≤ t_p
    printf(t);          t ≤ printf_arg0      t_p ≤ printf_arg0_p
tainted = getenv_ret_p ≤ s_p ≤ t_p ≤ printf_arg0_p = untainted
tainted ≤ untainted: Error!

Type Rules So Far
Primitive types:
\[ \frac{Q_1 \le Q_2}{Q_1\ \mathrm{int} \le Q_2\ \mathrm{int}} \]
Pointers:
\[ \frac{Q_1 \le Q_2 \qquad T_1 = T_2}{Q_1\ \mathrm{ptr}(T_1) \le Q_2\ \mathrm{ptr}(T_2)} \]

Qualifier Inference Extension 1
Leaf Polymorphism
    char id(char x) { return x; } …
    tainted char t;
    untainted char u;
    char a, b;
    a = id(t);    // x is tainted, id_ret is tainted
    b = id(u);    // b is tainted
However, id() is just an identity function. It preserves the qualifiers.

Polymorphism
Add a qualifier variable α to the id() function:
    α char id(α char x)
Instantiate α to the correct qualifier for each function call:
    a = id(t)    tainted char id(tainted char x)
    b = id(u)    untainted char id(untainted char x)

Polymorphism - Subtyping
Use a naming trick to specify the subtyping relation for polymorphism:
    $_1_2 char* strcat($_1_2 char*, $_1 const char*)
    {1} ⊆ {1, 2}
    $_1 ≤ $_1_2

Qualifier Inference Extension 2
Explicit Type Casts
A type cast should preserve the qualifier:
    void *y;
    char *x = (char*) y;    // if y is tainted, x should be tainted
Preserve it by collapsing qualifiers:
    char **s, **t;
    void *v = (void *) s;
    t = (char **) v;
    s_p = s_p_p = v_p
    v_p = t_p = t_p_p
If either *s or **s is tainted, then *v is tainted, and both *t and **t are tainted.

Cast Type Qualifier
Allow qualifier casts:
    void *y;
    char *x = (untainted char*) y;

Qualifier Inference Extension 3
Variable Argument Functions
Annotate the varargs specifier (...):
    int printf(untainted char*, untainted ...);
Use the naming trick to specify the subtyping relation:
    int sprintf($_1_2 char*, untainted char *, $_2 ...);

Qualifier Inference Extension 4
Const
    f(const char *x);
    char *unclean, *clean;
    unclean = getenv("PATH");    // returns tainted
    f(unclean);
    f(clean);
Pointer Rule
\[ \frac{Q_1 \le Q_2 \qquad T_1 = T_2}{Q_1\ \mathrm{ptr}(T_1) \le Q_2\ \mathrm{ptr}(T_2)} \]
unclean is tainted
clean is tainted (because T1 = T2 forces both arguments to share the qualifier of what x points to)

Pointer Rule for Const
const makes sure the input data is not modified.
If the pointer has a const qualifier, the pointer rule becomes:
\[ \frac{Q_1 \le Q_2 \qquad T_1 \le T_2}{Q_1\ \mathrm{ptr}(T_1) \le Q_2\ \mathrm{ptr}(T_2)} \]

Evaluation

Conclusion
Type qualifiers are a generic static analysis technique. They also apply to:
    User-space/kernel-space trust errors
    Deadlock detection
The analysis has low false positive and false negative rates.
Demo of Cqual
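
The qualifier inference example from earlier in the deck (getenv, s, t, printf) can be replayed with a very small constraint solver. This is only a sketch under stated assumptions: the variable names in the enum mirror the slide, but constraint_t, solve, and example_is_safe are hypothetical helpers, and Cqual's actual solver is not described in these slides. The sketch propagates taint along each constraint lo ≤ hi until nothing changes, then reports an error if a variable annotated untainted ended up tainted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Qualifier variables from the slide's example. */
enum { GETENV_RET_P, S_P, T_P, PRINTF_ARG0_P, NUM_VARS };

/* One subtyping constraint: qualifier lo <= qualifier hi. */
typedef struct { int lo; int hi; } constraint_t;

/* Computes the least solution by pushing taint from lo to hi until a
   fixed point is reached. Returns true if no variable that must be
   untainted becomes tainted. is_tainted is updated in place. */
bool solve(const constraint_t *cs, size_t n,
           bool is_tainted[NUM_VARS],
           const bool must_be_untainted[NUM_VARS])
{
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t i = 0; i < n; i++) {
            assert(cs[i].lo >= 0 && cs[i].lo < NUM_VARS);
            assert(cs[i].hi >= 0 && cs[i].hi < NUM_VARS);
            if (is_tainted[cs[i].lo] && !is_tainted[cs[i].hi]) {
                is_tainted[cs[i].hi] = true;
                changed = true;
            }
        }
    }
    for (int v = 0; v < NUM_VARS; v++) {
        if (is_tainted[v] && must_be_untainted[v]) {
            return false;   /* tainted <= untainted: contradiction */
        }
    }
    return true;
}

/* Replays s = getenv(...); t = s; printf(t); */
bool example_is_safe(void)
{
    const constraint_t cs[] = {
        { GETENV_RET_P, S_P },      /* s = getenv(...) */
        { S_P, T_P },               /* t = s           */
        { T_P, PRINTF_ARG0_P },     /* printf(t)       */
    };
    bool tainted[NUM_VARS] = { false };
    bool untainted_req[NUM_VARS] = { false };

    tainted[GETENV_RET_P] = true;          /* annotation on getenv */
    untainted_req[PRINTF_ARG0_P] = true;   /* annotation on printf */

    return solve(cs, sizeof cs / sizeof cs[0], tainted, untainted_req);
}
```

For the slide's program, example_is_safe() returns false, which corresponds to the "tainted ≤ untainted" error. Propagating to a fixed point is a design choice made here for brevity; it works because the lattice has only two points.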