CPS110: General security Landon Cox

Intro: general security
- Hardware reality
  - Processor, memory, disks, NIC
  - Components can be used by anyone
- OS abstraction
  - Controlled access to hardware

Security abstractions
- What HW primitives exist for access control?
  - Processor mode bit (kernel/user mode)
  - Protected instructions (I/O instructions, halt)
  - Protected data (page tables, interrupt vector)
- Want to build two abstractions on these
  - Identity (authentication): who are you?
  - Security policy (authorization): what are you allowed to do?

Building up from the hardware
- One of the themes of this class
- Threads
  - Atomic test&set, interrupt enable/disable
- Transactions
  - Single disk sector write
- Reliable communication
  - Atomic packet send

Building up from the hardware
- Already built on top of HW security primitives
- Secure sharing of physical memory
  - Use protection between address spaces
  - (fault on unauthorized access)
- Secure sharing of kernel services
  - Use system calls to safely transition to privileged mode
  - (trap after syscall instruction)

Authentication: who are you?
- Prove your identity to the OS
- Many ways to authenticate
  1. Passwords
  2. Physical tokens
  3. Biometrics

Password authentication
- Password
  - Shared secret between you and the OS
  - Several weaknesses
- How should we store passwords?
  - Cryptographic hashes (MD5, SHA)
  - Hashes are one-way functions
  - Check that hash(input) = hash(password)

Cryptographic hashes
[Figure: arbitrarily large input -> SHA1 hash -> 160-bit "hash digest"]
- Infeasible to reverse (even for the system!)
  - I.e. cannot compute plaintext from digest
- Collision-resistant hash function
  - Probability (collision) < probability (HW error)
- Extremely useful tool in lots of domains
  - Can be used as a shorthand for naming objects (e.g. files)

Password authentication
- What's the weakest link in password systems?
  - People!
- People choose short passwords
  - (brute force attacks take less time)
- People choose easy-to-remember passwords
  - (narrows the search space: dictionary attack)
- People choose the same password for everything
  - (break the weakest web site, gain access to the strongest)
- How could you use cryptographic hashes to solve this?
  - User remembers one password, system sends Hash(pass + site name)

Weak passwords
- Consider an 8-character password
  - 256^8 possible passwords
  - 256^8 = 2^64 ~ 16 * billion * billion
- If you only choose lower-case letters
  - 26^8 ~ 200 billion
- If you choose a word from the dictionary
  - 320,000 words in Webster's unabridged
  - At 1000 guesses/second, find a password in about 5 minutes
- Researchers ran an attack on Michigan passwords
  - Recovered thousands; some of the most popular: "beer", "hockey"
  - Similar result at Berkeley CS in the 80s: "gandolph"

Physical token authentication
- Require something physical
  - E.g. a ticket to a dance performance
- What if your token is stolen/forged?
  - Require a physical token + password
  - E.g. your ATM card + PIN

Biometric authentication
- Essentially a physical token
  - One that should be hard to steal/forge
- Examples
  - Thumbprint, retina scan, signature
- Most have accuracy problems
  - False positives (accept invalid person)
  - False negatives (reject valid person)

Real-world authentication
- How to authenticate to your credit card company?
  - Give them your social security number
  - They ask over the phone and check
- How is this vulnerable?
  - Company has my SSN, so they can pretend to be me
  - (or someone who breaks into them can pretend to be me)
  - Phone line snoop can pretend to be me

Authorization: what can you do?
- Fundamental structure
  - Access control matrix
  - Rows = authentication domains
  - Columns = objects

             File 1   File 2   File 3
    User 1   RW       RW       RW
    User 2   --       R        RW

- Two approaches to storing data:
  - Access control lists and capabilities

Access control lists
- Standard for file systems
- At each object
  - Store who can access the object
  - Store how the object can be accessed
- On each access
  - Check that the user has proper permissions
- Use groups to make things more convenient
  - E.g. forbes, chase, and lpcox are all in group "prof"

Access control lists
- How can you defeat ACL authorization?
  - Villain convinces the OS it is someone else
- Example
  - Sendmail runs as root (full privileges)
  - Attacker compromises sendmail process
  - Can now run arbitrary code under root ID
  - This is what you will be doing in Project 3
- What if you get the ACL wrong?

Frighteningly common
- "Usability and privacy: a study of Kazaa…"
  - Good and Krekelberg, SIGCHI 2003
  - In 12 hours, found 150 inboxes on Kazaa
  - Observed people downloading them
- Many other examples: web, NFS, etc
- People don't understand software interactions
  - Affects even diligent users

[Figure labels: Alice, Bob, snoop, data breach]
Despite her best efforts, Alice's data is only as secure as her least competent confidant.

What to do?
- Two goals
  1. Identify sensitive files
  2. Infer who is allowed to view them
- Many more constraints
  1. Don't impact performance
  2. Don't bother the user
  3. Remain backwards-compatible

Identifying sensitive files
- Approach: identify the common case
  - I download sensitive docs from many sources
  - [Figure: example sources, including mail.cs.duke.edu]
- What do the examples have in common? Encryption!

Identifying sensitive files
- Approach: identify the common case
  - I download sensitive docs from many sources
- Insight: data is often encrypted for a reason
  - In our examples, servers allow clients to cache sensitive files
  - To protect those files in the network, communication is encrypted
- How can we tell if network communication is encrypted?
  - Use the port (e.g., 443 is used for HTTPS)
  - Ciphertext exhibits high "entropy" or randomness
- Entropy = information density, measured in bits/byte
  - Can inexpensively measure the entropy of every socket
  - If data stream exhibits high entropy, see if "resulting data" hits FS
  - If so, tag the file as sensitive

Download entropy scores
[Figure: bar chart of bits-per-byte (0 to 8) for plaintext vs. HTTPS downloads of a gradesheet, email, bank statement, and tax return]
- How can we distinguish compression from encryption?
  - If compressed format, use data-similarity
  - If actual compression, measure relative data volumes

Capabilities
- At each user
  - Store a list of objects they may access
  - Store how those objects may be accessed
- Example
  - At user2, store ‹file2, r: file3, rw›
- On each access
  - User presents capability
  - Capability proves they are authorized

Capabilities
- Possession of capability provides authorization
- Like car keys
  - If you have the door key, you can get in the car
  - If you have the ignition key, you can drive it
- How can you defeat a capability system?
  - Forge an authorized user's capabilities
  - Like making a copy of someone's car keys

Capabilities
- Solution to forged capabilities?
  - "Self-authenticating" capabilities
  - E.g. include {user, filename}_system-key (signed or encrypted with the system's key)
- Mastercard number, expiration date
  - Have these, you can charge to account
- What makes these hard to forge?
  - Random numbers in a sparse space

Revocation
- How to revoke permissions with an ACL?
  - Just delete/change the ACL entry
- How to revoke permissions with capabilities?
  - Not nearly as easy
  - Could change the car door's lock
  - This revokes access for everyone with the key

Dealing with security
- Principle of least privilege
  - Users get minimum privilege to do their work
- Defense in depth
  - Create multiple levels
  - Lower-levels monitor higher ones
- Minimize trusted computing base
  - Every system has a trusted part (e.g. the thing that checks the ACL)
  - This part has special privileges
  - It cannot be subject to the security mechanism

Reducing the trusted base
- Why reduce the size of the trusted base?
  - Small and simple = easier to reason about
  - Hopefully fewer bugs and security loopholes
- Most systems
  - Kernel is trusted
  - Special user (root, Administrator) is trusted
- Emerging systems
  - Virtual machines: trust the hypervisor
  - Trusted Platform Modules: hardware attests to software

Common attacks
- Hidden channels
- Buffer overflow
- Trojan horse

Hidden channels
- Get system to communicate in unintended ways
- Example: Tenex (supposedly secure OS)
  - Created a team to break in
  - Team had all passwords within 48 hours … oops.
- Password checker

    /* Compare one character at a time; stop at the first mismatch. */
    for (i = 0; i < 8; i++) {
        if (input[i] != password[i]) {
            break;
        }
    }

- Goal: require 256^8 tries to see if password is right

Hidden channels: Tenex
- Password checker: the same early-exit loop as on the previous slide
- How to break? (hint: VM faults are visible)
  - Specially arrange the input's layout in memory
  - Force a page fault if second character is read
  - If you get a fault, the first character was right
  - Do again for third, fourth, … eighth character
- Can check the password in 256*8 tries

Buffer overflow
- Suppose you have an on-stack buffer
  - You read input into the buffer
  - You don't check the length of the input
- A villain's long input can corrupt the stack
  - If they're smart, they can overwrite the return address
  - Allows villain to jump to their own code
- First Internet worm did this
  - Creator was a Cornell grad student, Robert Morris
  - Son of a famous computer security guru
  - He is now a professor at MIT

Trojan horse
- Basic idea
  - Give somebody something useful
  - Make it do something evil
- Examples
  - Replace login with something that emails passwords
  - Send a Word document with a malicious macro
  - Download Kazaa and install spyware
  - Install a Sony CD and have it install a buggy DRM root-kit

Why security is so hard
- Ken Thompson's self-replicating code
  - Goal: put a backdoor into login
  - Allow "ken" to login as root w/o password
  1. Make it possible (easy)
  2. Hide it (tricky)

Login backdoor
- Step 1: modify login.c
  - (code A) if (name == "ken") login as root
  - This is obvious so how do we hide it?
- Step 2: modify C compiler
  - (code B) if (compiling login.c) compile A into binary
  - Removes code A from login.c, keeps backdoor
  - This is now obvious in the compiler, how do we hide it?
- Step 3: distribute a buggy C compiler binary
  - (code C) if (compiling C compiler) compile code B into binary
  - No trace of attack in any source code (only in C compiler's binary)
- Upshot: everything is bootstrapped, everything trusts something

Conficker

Conficker and DNS
- Problem: how to update your malware?
- Why not just host at a web site, have malware check?
  - Easy to catch
  - Monitor malware, see which web sites they connect to
  - Follow the paper trail from that site
- Instead, each malware client computes random DNS names

Conficker and DNS
The name of each generated domain is 4 to 10 characters, to which a randomly selected TLD is appended from the following list of 116 suffixes (mapping to 110 TLDs):

[ "ac", "ae", "ag", "am", "as", "at", "be", "bo", "bz", "ca", "cd", "ch", "cl", "cn",
  "co.cr", "co.id", "co.il", "co.ke", "co.kr", "co.nz", "co.ug", "co.uk", "co.vi", "co.za",
  "com.ag", "com.ai", "com.ar", "com.bo", "com.br", "com.bs", "com.co", "com.do", "com.fj",
  "com.gh", "com.gl", "com.gt", "com.hn", "com.jm", "com.ki", "com.lc", "com.mt", "com.mx",
  "com.ng", "com.ni", "com.pa", "com.pe", "com.pr", "com.pt", "com.py", "com.sv", "com.tr",
  "com.tt", "com.tw", "com.ua", "com.uy", "com.ve", "cx", "cz", "dj", "dk", "dm", "ec", "es",
  "fm", "fr", "gd", "gr", "gs", "gy", "hk", "hn", "ht", "hu", "ie", "im", "in", "ir", "is",
  "kn", "kz", "la", "lc", "li", "lu", "lv", "ly", "md", "me", "mn", "ms", "mu", "mw", "my",
  "nf", "nl", "no", "pe", "pk", "pl", "ps", "ro", "ru", "sc", "sg", "sh", "sk", "su", "tc",
  "tj", "tl", "tn", "to", "tw", "us", "vc", "vn" ]

- Each client checks whether it can download from a generated site, and repeats until one hits
- Why is it doing this?
- What aspects of DNS does this take advantage of?

Coming up
- More security (project 3)
- Storage and files
- Google

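Worked sketch: measuring entropy in bits/byte

The "Identifying sensitive files" slides propose measuring the entropy of each socket's data stream. A minimal sketch of that measurement follows, assuming byte-level Shannon entropy is the intended metric. The function names and the 7.5 bits/byte threshold are assumptions for illustration, not values from the slides.

```python
import math
from collections import Counter


def bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte distribution, from 0.0 to 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
    """Flag a stream as high-entropy. The threshold is an assumed cutoff."""
    return bits_per_byte(data) >= threshold
```

As the slides point out, compressed data also scores high, so a check like this only narrows the candidates; telling compression from encryption needs the extra steps listed on the "Download entropy scores" slide.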