Traffic Morphing: An Efficient Defense Against Statistical Traffic Analysis
  Charles V. Wright, MIT Lincoln Laboratory
  Scott E. Coull, Johns Hopkins University
  Fabian Monrose, University of North Carolina
  Presented by Yang Gao, 11/2/2011

Outline
  Potential Hazards
  Countermeasures and Traffic Morphing
  How It Works
  Evaluation and Results

Privacy, Security
  Packet size and timing information
  Classification tools
  Language of a VoIP call
  Password in SSH
  Web browsing habits
  ...
  Privacy leakage

How Does the Attack Happen?
  Web page browsing
    Statistical Identification of Encrypted Web Browsing Traffic (Q. Sun et al., Stanford University)
      Only the number of objects and their sizes are recorded
      A sample of 2,000 pages from 100,000 web pages
      Jaccard's coefficient
      Trained classifier
    Inferring the Source of Encrypted HTTP Connections (Marc Liberatore and Brian Neil Levine, UMass Amherst)
  Identification of Encrypted VoIP Traffic
  Results of the classifiers

Countermeasures
  Padding
  Mimicking
  Morphing
  Sending at fixed time intervals (counters timing analysis)

Comparison

Traffic Morphing
  [Figure: one traffic class is morphed to look like another]

How Does the Morphing Work?
  [Figure: a stream of L1- and L2-sized packets with N_L1 : N_L2 = 2 : 1 is morphed into a stream with N_L1 : N_L2 = 1 : 2]

Traffic Morphing
  Goals
    Good resemblance in packet size distribution
    Less overhead
  Steps
    Morphing matrix construction

Morphing Matrix
  [Figure: morphing matrix relating source sizes x1 ... xn to target sizes y1 ... yn]
  2n equations and n^2 unknowns
  How to solve these equations? We won't solve them directly.

Convex Optimization
  Cost function
  Restrictions

Example
  [Figure: packet sequences of sizes L1 and L2 before and after morphing]

Example
  [Figure: a second pair of packet sequences of sizes L1 and L2]
  Reduce a packet's size? Add more constraints to avoid this situation.
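
The two-size example above (N_L1 : N_L2 = 2 : 1 morphed to 1 : 2) can be worked through in a few lines of Python. This is only a minimal sketch under stated assumptions, not the solver used in the paper: it assumes the cost is the expected absolute change in packet size, and it swaps the general convex-optimization step for a greedy monotone coupling, which gives the minimum-cost matrix for that particular cost when sizes are sorted in ascending order. It does not handle additional constraints. The function names and the byte values 100 and 200 for L1 and L2 are hypothetical.

```python
import random

def morphing_matrix(x, y, tol=1e-12):
    """Return a column-stochastic matrix A with y = A x.

    x[j]: probability that the source emits size s_j (sizes sorted ascending)
    y[i]: probability that the target emits size s_i
    A[i][j]: probability of sending size s_i when the source produced s_j
    """
    n = len(x)
    joint = [[0.0] * n for _ in range(n)]  # joint[i][j] = P(target i, source j)
    rem_x, rem_y = list(x), list(y)        # probability mass not yet matched
    i = j = 0
    while i < n and j < n:
        mass = min(rem_y[i], rem_x[j])     # match as much mass as possible
        joint[i][j] += mass
        rem_y[i] -= mass
        rem_x[j] -= mass
        done_i = rem_y[i] <= tol
        done_j = rem_x[j] <= tol
        if done_i:
            i += 1
        if done_j:
            j += 1
    A = [[0.0] * n for _ in range(n)]
    for col in range(n):
        for row in range(n):
            if x[col] > 0:
                A[row][col] = joint[row][col] / x[col]
            else:
                # Size never produced by the source: leave it unchanged.
                A[row][col] = 1.0 if row == col else 0.0
    return A

def expected_cost(A, x, sizes):
    """Expected absolute change in bytes per packet (assumed cost function)."""
    n = len(x)
    return sum(A[i][j] * x[j] * abs(sizes[i] - sizes[j])
               for i in range(n) for j in range(n))

def morph_packet(A, sizes, size, rng=random):
    """Sample the output size for one packet from the column for its size."""
    j = sizes.index(size)
    u = rng.random()
    cumulative = 0.0
    last = size
    for i, s in enumerate(sizes):
        if A[i][j] > 0:
            last = s
        cumulative += A[i][j]
        if u < cumulative:
            return s
    return last  # guard against floating-point rounding

sizes = [100, 200]   # hypothetical byte values for L1 and L2
x = [2 / 3, 1 / 3]   # source: N_L1 : N_L2 = 2 : 1
y = [1 / 3, 2 / 3]   # target: N_L1 : N_L2 = 1 : 2
A = morphing_matrix(x, y)
```

For this input the sketch pads about half of the L1 packets up to L2 and leaves L2 packets unchanged, so no packet has to be reduced. A real deployment would replace `morphing_matrix` with a constrained optimizer once extra restrictions are added.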

Steps for Traffic Morphing
  Matrix construction
    1. Select the source process and calculate the probability distribution of its packet sizes.
    2. Select the target process and calculate the probability distribution of its packet sizes.
    3. Solve for the morphing matrix with an optimization method that minimizes the cost while satisfying the restrictions.
  Traffic morphing
    1. Get the packet to send.
    2. Draw a random number to select an element in the matrix.
    3. Calculate the corresponding packet size.
    4. Pad or reduce the packet to that size.
    5. Transmit the new packet.

Additional Morphing Constraints
  Pitfall 1: the system Y = AX is over-specified
  Solution: multi-level programming
    1. Find the Z that is closest to Y: Z = A'X, minimize f_d(Y, Z)
    2. Find the A that most efficiently maps X to Z: Z = AX, minimize f_0(A)

Dealing with Large Sample Spaces
  Pitfall 2: poor scalability
    A 2.8 GHz Pentium 4 runs for 1 hour on an 80x80 matrix with 6,560 constraints
    Packet sizes from 40 to 1500 bytes (up to the MTU) mean a 1460x1460 matrix
  Solution: multi-level method with sub-matrix morphing

Traffic Morphing in Sum
  Goals
    Good resemblance in packet size distribution
    Less overhead
  Steps
    Morphing matrix construction: convex optimization
    Additional morphing constraints: 2-level multi-level programming
    Dealing with large sample spaces: sub-matrix morphing

Evaluation
  Encrypted Voice over IP
  Web Page Identification
  Defeating the original classifier
  Evaluating indistinguishability

Encrypted Voice over IP
  Language Identification of Encrypted VoIP Traffic: Alejandra y Roberto or Alice and Bob?
  Charles V. Wright, Lucas Ballard, Fabian Monrose, Gerald M.
  Masson, Department of Computer Science, Johns Hopkins University
  White-box encoding
  Why even encrypted voice packets leak information
    Unigram frequencies of bit rates
    2-gram resemblance
  Black box
  Results for the original classifier
  Results for indistinguishability
  Overhead

Web Page Identification
  Overhead

Practical Considerations
  Short network sessions
    Too few packets generated by the source? Keep generating packets until a distance threshold is reached.
  Variations in source distribution
    Packet sizes differ between training and use? Divide and conquer.
  Reduced packet sizes
    How to deal with reduced packet sizes in HTTP: pack the remainder into the next packet.

Traffic Morphing in a Nutshell
  Resemblance
  Overhead minimization
  Morphing matrix
    Convex optimization
    Additional morphing constraints
    Dealing with large sample spaces
  Practical considerations
    Short network sessions
    Variations in source distribution
    Reduced packet sizes

Conclusion
  User privacy is vulnerable even when traffic is protected by encryption.
  Traffic morphing is effective and robust.
  Traffic morphing is applicable.
  Traffic morphing is much more efficient than padding.

Discussion
  The other side of morphing: evading intrusion detection
  Mimicry attack
  [Figure: a malicious call combination is checked against a library of system call sequences; denied without morphing, accepted with morphing]

Thank you!