Information Security and Digital Forensics Research in CS, HKU
Drs KP Chow (邹锦沛), Lucas Hui (许志光), SM Yiu (姚兆明)
Center for Information Security & Cryptography (CISC), The University of Hong Kong

Research Directions in CISC (research projects)
• Security and cryptography research
• Computer forensics research

Security and cryptography research: overview
• Applications & implementation: hybrid systems (software + hardware token), GPU (graphics processing unit card) ………
• Cryptographic protocols
• VANETs (vehicular ad hoc networks)
• Database systems (e.g. data mining with privacy)
• Smart (power) grid systems
• Anonymous authentication (credentials) in discussion groups
• Cryptographic primitives
• Signature/encryption schemes …..
• Leakage resilience
• Infrastructure (identity-based, PKI-based, etc.) and different security models …….. ……….

(1) Leakage Resilience
• Old belief: encryption protects your data well, and the attacker has no information (not even 1 bit) about your secret key (e.g. passwords). This is WRONG!
• The "new" assumption: the attacker may get partial information about the secret key, e.g. by measuring the running time of the CPU, the temperature of the CPU, the sound of keystrokes, etc.
• Impact: old security schemes are not guaranteed to be secure!

The model
• To formalize these attacks, we model the leakage as an efficiently computable leakage function f, which represents how much information the attacker can obtain.
• By restricting the power of f, we restrict how much information is leaked. E.g. f outputs only x bits, with x < key (password) length.
• [Diagram: security scheme A leaks f(key), together with all other messages/info, to the attacker.]
• Can we still prove that scheme A is secure?
• Selected publication: "ID-based encryption scheme on continual auxiliary leakage model", Eurocrypt 2012.

(2) Dynamic Birthmark Generation for JavaScript (dynamic JavaScript software birthmark)
• Question addressed: given two JavaScript programs, does one program copy the other? [Plagiarism? IP court cases: software theft?]
• One may change the source code.
• Our research approach: run the two programs and, after some time, dump the objects in memory (heap area) of the two programs. This is the birthmark of the programs (like a birthmark on a pig).
• If the data structures (heap graphs in this case) of the two programs are similar, one is likely to be a copy of the other.
• [Figure: heap graph example]

Selected publications
• Preliminary ideas:
  - "Dynamic Software Birthmark for Java Based on Heap Memory Analysis", CMS 2011.
  - "JSBiRTH: Dynamic JavaScript Birthmark Based on the Runtime Heap", COMPSAC 2011.
• A more mature methodology:
  - "Heap graph based software theft detection", IEEE Transactions on Information Forensics and Security (IEEE TIFS) 2012.

(3) Android security: DroidChecker
• Issue: unlike Apple's App Store, the Android market has no screening process for the apps being published.
• Privilege escalation attack: an app can perform a function that it is NOT supposed to perform.
• Our technique: identify risky paths in the control-flow graph.
• "DroidChecker: Analyzing Android Applications for Capability Leak", ACM WiSec 2012.
• 1,179 Android apps scanned => 23 found to be risky.
• Adobe Photoshop Express 1.31: a malicious app can make use of it to retrieve all email contacts on the phone.
• Still ongoing……

Research Directions in CISC (research projects)
• Security and cryptography research
• Computer forensics research

Computer Forensics Research Group
• Software tools development: digital investigation and forensics
  - DESK (digital evidence search tool)
  - BTM (also known as the network line monitoring system, 网线监察系统)
  - Auction site monitoring
  - Internet monitoring platform
• Research
  - Digital identity profiling
    • Behavior profiling: digital profiles of criminals on the Internet
    • Visual profiling: digital visual profiles
  - Cybercrime model
  - …..

Our research: digital profiling
• Digital profiles of Internet criminals (digital identity profiling)
  - Behavior profiling
    • Digital profiles of online infringement offenders
    • Digital profiles of Internet auction fraud
In the physical world, we (e.g.
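FBI) use it a lot for: investigating serial crimes, e.g. sexual assault, homicide, sexual homicide.
• Cybercrime is serial in nature: its serial (repeating) nature allows criminal behavior to be identified and consistently classified.

Characteristics of online users (preliminary study)
• An online identity has no link to the user's real identity.
• It is very easy to hide one's real identity and behavior on the Internet.
• In many cases, one person has multiple user accounts.
• Determining whether a series of online actions was caused by one user or involves several users is complicated.

User digital profile analysis
• From each user's postings, compute a vector of the weights of feature words.
• How do we compute the weight of a feature word t w.r.t. a user u? Use the TF-IDF weight (Salton et al.):

$$W(t,u) = \mathrm{TF}(t,u) \times \log \frac{|U|}{|\{u' \in U : t \in u'\}|}$$

  - TF(t,u): frequency of t in u's postings
  - |U|: total number of users
  - |{u' ∈ U : t ∈ u'}|: number of users having t in their postings
• The fewer users who have the word, the larger the weight.

A profile (user digital profile)
• User dow_jones on uwants.com

Rank  Feature word                         Weight
1     80后 (post-80s)                      0.21761
2     社民连 (League of Social Democrats)  0.14349
3     五区 (five districts)                0.12547
4     泛民 (pan-democrats)                 0.11357
5     西九 (West Kowloon)                  0.10983
6     黄毓民 (Wong Yuk-man)                0.08671
7     功能组别 (functional constituencies) 0.08433
8     总辞 (mass resignation)              0.08296
9     八十后 (post-eighties)               0.08194
10    社民 (Social Democrats, short form)  0.08126

Using user digital profiles for prediction
• Were these postings on the discuss.com.hk forum published by the uwants.com user dow_jones?

Example: users that are similar
• To be used on a trial basis by the Hong Kong Police.

Digital camera SD card case
• [Timeline figure: Photo 1, Photo 2, Photo 3, Photo 10, Photo 11, Photo 80 and the victim's statement placed along a time axis: Jan 2005, Jan 2006, Oct 2006, Dec 2006 (breakup), Jan 2007 (offence).]
• Is the victim lying? Or are the creation dates incorrect?

Publications: IEEE Transactions, Eurocrypt, ACNS, ACISP, ….
E.g.
• TW Chim et al., "OPQ: OT-based Private Querying in VANETs", to appear in IEEE TITS, 2011.
• TW Chim et al., "VSPN: VANET-based Secure and Privacy-preserving Navigation", IEEE TC, 2012.
• TW Chim et al., "PAPB: Privacy-preserving Advance Power Reservation", IEEE Communications Magazine (CM), 2012.
• Patrick Chan et al., "Heap graph based software theft detection", IEEE TIFS, 2012.
• Zoe L. Jiang et al., "Maintaining hard disk integrity with digital legal professional privilege (LPP) data", IEEE TIFS, 2013.
Quite a few received a "Best Paper Award".

Funding: research funding, e.g. GRF, AoE, ITF, CRF.

Due to the time limit, maybe we can share other projects next time.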
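
Supplementary sketch: the TF-IDF weight W(t,u) from the user digital profile analysis slide, and the question of whether two accounts belong to the same user, can be illustrated in Python. This is only a minimal sketch under stated assumptions, not the CISC implementation: the names `build_profiles` and `cosine_similarity`, the whitespace tokenization, the natural logarithm, the toy postings, and the choice of cosine similarity for comparing profiles are all hypothetical; the slides do not specify them.

```python
import math
from collections import Counter


def build_profiles(postings):
    """Return {user: {word: weight}} with W(t,u) = TF(t,u) * log(|U| / n_t).

    postings: dict mapping a user name to a list of posting strings.
    Assumption: words are separated by whitespace. Real Chinese forum text
    would need a word segmenter, which is out of scope for this sketch.
    """
    # TF(t,u): raw frequency of word t across all of user u's postings.
    tf = {
        user: Counter(word for post in posts for word in post.split())
        for user, posts in postings.items()
    }
    # n_t: number of users having word t in their postings.
    users_with = Counter(word for counts in tf.values() for word in counts)
    total_users = len(postings)  # |U|
    return {
        user: {
            word: freq * math.log(total_users / users_with[word])
            for word, freq in counts.items()
        }
        for user, counts in tf.items()
    }


def cosine_similarity(p, q):
    """Cosine similarity of two sparse weight vectors (an assumed measure)."""
    dot = sum(w * q.get(word, 0.0) for word, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


# Hypothetical toy data, not taken from any real forum.
toy_postings = {
    "alice": ["budget election budget", "election debate"],
    "alice_alt": ["budget election", "debate budget"],
    "bob": ["football match", "football debate"],
}
profiles = build_profiles(toy_postings)
same_author_score = cosine_similarity(profiles["alice"], profiles["alice_alt"])
other_author_score = cosine_similarity(profiles["alice"], profiles["bob"])
```

In this toy data the word "debate" appears in every user's postings, so its weight is zero for all users, which matches the slide's remark that the fewer users who have a word, the larger its weight. The two accounts with overlapping vocabulary score higher than the unrelated pair; any threshold for declaring two accounts the same user would have to be chosen empirically.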