Applying Security Techniques to 3D Fabrication Methods for General Purpose Processors
CDR Mike Bilzor

Simple Overview
3D Fabrication Techniques + Execution Monitor Theory = 3D Processor Execution Monitor
Develop a "recipe" for making this, from a processor's specification

Proposed Dissertation Abstract
Hardware malicious inclusions, or hardware Trojans, in microprocessors present an increasing threat to U.S. high-assurance computing systems, particularly those of the Department of Defense, due to vulnerabilities at several stages in the acquisition chain. Existing testing techniques are limited in their ability to detect these maliciously modified integrated circuits. We propose a novel method, based on the evolution of three-dimensional integrated circuit fabrication techniques and on execution monitor theory, by which malicious inclusions, including those not detectable by existing means, may be detected and potentially mitigated in the lab and in fielded, real-time operation. The proposed work will develop and implement techniques for detecting and mitigating hardware malicious inclusions by utilizing 3D connections to monitor the control and data flows in an untrusted, target commodity processor from a trusted attached processor called the "control plane".

Outline
• Threat to General Purpose Processors
• 3D Fabrication Techniques
• Processor Execution
• Malicious Inclusions and Existing Detection Methods
• Example of a 3D Execution Monitor
• Planned Experiments and Tools
• Related Work and Limitations
• Proposed Contributions

The Threat to General Purpose Processors

Microprocessor Threat
• High-assurance customers in DoD need reliable processors
  • Classified systems, weapons, aircraft
• The most advanced processors are still designed in the U.S., but almost none are manufactured here
  • China, Taiwan, Korea, Philippines
• U.S. companies becoming "fabless" – design only
• DoD's Trusted Foundry program cannot support all high-assurance needs

Microprocessor Threat
• "The Hunt for the Kill Switch" [Adee08]
  • In 2007, Israel bombs a suspected Syrian nuclear facility, but Syrian radars are not functioning – were they disabled using a "kill switch"? A New York Times article cites a source claiming knowledge of the operation
  • In 2008, an anonymous defense contractor reports that a European manufacturer has designed a chip that can be remotely disabled, and claims that French contractors have used the chip in military equipment
• "The Hacker in Your Hardware"
  • Scientific American, August 2010 [Villa10]

Microprocessor Threat
• High-assurance supply chain contains counterfeit processors [King10, Grow08]
  • Over 400 fake Cisco routers seized in 2007-2008, many sold to DoD, some for classified systems
  • January 2007 – counterfeit chip discovered in an F-15 flight computer during maintenance at Warner-Robins AFB, GA
  • DoD discovered 9,356 fake electronic parts in 2008 alone
  • Estimate: as many as 15% of all replacement processors purchased by DoD may be counterfeit

Microprocessor Threat
• Where processors can be subverted
  • Design and Fabrication
    • Changes to HDL design
    • Insider threat
    • Use/re-use of compromised modular, publicly available HDL design components
    • Changes to low-level, optimized layout (netlist)
  • Shipping, distribution, component assembly
    • Replacing bona fide parts with counterfeit parts
    • Advanced techniques like FIB milling

Processor Development
• Architectural Design Specification
  • Instruction Set
  • Registers
  • Cache
  • Interrupts
  • Privilege Levels
• High Level Processor Design
  • Libraries
  • Packages and Entities
  • Logic Design
  • VHDL, Verilog
• Low Level Processor Design
  • Optimization
  • Place and Route
  • Netlist and Schematics
• Fabrication
  • Wafer Mask Generation
  • Wafer Production and Test
  • Processor Finishing
• Assembly and Distribution
  • Processor Batch Testing
  • Printed Circuit Board Packaging and Test
  • Shipping
• Installation and Operation
  • System Integration
  • Operational Test

Processor Development
[Figure: the same development flow, with each stage marked by a legend: Trusted, Either / Mix, Untrusted]
Source: DARPA TRUST in Integrated Circuits Program, Industry Day Brief, 26 March 2007

Processor Development
[Figure: the same development flow, annotated for the 3D execution monitor. Annotations: 3D Execution Monitor Source Data; Parallel Design; 3D Fabrication; Targeted Threats; Assembly; 3D Execution Monitor Implementation]

3D Fabrication Techniques

3D Fabrication
• Manufacturers (Intel, AMD) are motivated to develop 3D techniques for performance
  • 2D feature size is near its physical performance limit
  • Heat, timing, and leakage issues dominate
• Early uses
  • 3D cache memory with lower latency due to proximity
  • 3D stacked coprocessor-type modules
  • Special uses like image sensors
• Future implementations
  • Many-core CPU stacks
  • Performance monitoring, verification, and security?
[Image: Synopsis]

3D Fabrication
• 3D methods
  • MCMs – "Multi-Chip Modules"
  • TSVs – "Through Silicon Vias"
  • Wire-bonded connections
  • Micro-RF relays
• If you could connect to any circuit in the adjacent plane directly, how might that be useful?
[Image: [Dav05]]

3D Fabrication
Possible Layouts
[Figure: two alternative stacks, labels in slide order.
 Layout 1: Optional Control Plane – Custom, Trusted IC; "3D" Interconnect; Connection from Printed Circuit Board to Integrated Circuit; Computation Plane – Commodity, Untrusted IC.
 OR
 Layout 2: Computation Plane – Commodity, Untrusted IC; "3D" Interconnect; Connection from Printed Circuit Board to Integrated Circuit; Optional Control Plane – Custom, Trusted IC]

Processor Execution

Processor Elements
• How can we categorize what operations are performed in a general purpose processor?
[Figure: bus-based MIPS datapath (instruction register, A and B registers, ALU, register file, memory address register MA, on-chip memory, shared bus, control signals), with four callouts: Instruction Execution; Movement of Data; Storage and Retrieval; Arithmetic and Logic]

Processor Elements
• The basic elements that characterize what happens in a general processor:
  • Execution Flow (following instructions)
  • Data Paths
  • Storage and Retrieval
  • Arithmetic and Logic Computation
• Conjecture: Everything that happens in a general purpose processor falls into one (or more) of these categories

Processor Operation
• How do we trust the basic types of operations in a general-purpose processor?
  • Execution Flow
  • Data Paths
  • Storage and Retrieval
  • Arithmetic and Logic Computation
• Must also add "Functional Protection" – ensuring continued processor availability
  • Disabling direct vulnerabilities (zero-cycle attacks)

Processor Operation
• Execution Flow
  • Turing Machine analogy:
    • Read an input symbol, write to the tape, move left or right
    • Need to perform these functions in the specified order
• Data Paths
  • Turing Machine analogy:
    • After reading an input symbol (the data), its value must not change before a transition is identified and executed, or correctness is violated
• Storage and Retrieval
  • Turing Machine analogy:
    • Read and write – need to ensure that when we write something to the tape, it's the same value when we go back to read it later

Processor Operation
• Math and Logic Computation
  • Turing Machine analogy:
    • If you change the encoding of the TM while it's running, or modify the input during execution, correctness is violated
• Functional Protection (Keep-Alive)
  • Turing Machine analogy:
    • If you turn the machine off, it no longer works correctly (cannot fulfill its obligations)

Correct Processor Operation
• Processor operation is correct only if:
  • Execution flow is correct, i.e. it proceeds according to the architectural design specification with respect to the instruction opcode, and the instruction opcode itself is correct
  • Data path integrity is preserved
  • Arithmetic and logic computations in the processor all function correctly, according to their specifications
  • Storage and retrieval in the processor always function correctly (to include storage and retrieval of instruction opcodes, either from inputs or from local memory)
  • There is no direct impairment: no uncommanded halts, resets, shutdowns, or other functional attacks

Malicious Inclusions and Existing Detection Methods

Malicious Inclusions
• "Malicious Inclusion" = "Hardware Trojan"
• Physical change to a processor that causes a deviation from its specified functionality
• Details of actual attacks may be classified
• Several academic investigations have demonstrated malicious inclusions in the last few years
• Subverted hardware cannot be corrected using software

Malicious Inclusion Taxonomy
Taxonomy slightly modified from [Tehr10]
Classification
• Characteristics: Distribution; Structure; Size; Type
• Activation: External Trigger; Internal Trigger; Always Active After Trigger; Conditionally Active
• Action*: Leak Information; Modify Data; Modify Functionality; Disable Functionality
*May include more than one type of malicious action

Malicious Inclusion Taxonomy
Same classification, with each action matched to a 3D detection or mitigation technique:

Action                   3D Detection or Mitigation Technique
Leak Information         Datapath Integrity
Modify Data              Math/Logic Verification; Load/Store Verification
Modify Functionality     Execution Monitor
Disable Functionality    Keep-Alive Protections

Malicious Inclusion Taxonomy
[Figure: the same taxonomy and technique mapping, with a "Research Focus" callout]

Malicious Inclusion: HDL Example

    -- r.d.inst: instruction register
    IF (r.d.inst(conv_integer(r.d.set)) = X"80082000") THEN
        hackStateM1 <= '1';
    END IF;
    -- r.w.s.s: privilege bit in the control register
    IF (hackStateM1 = '1' and r.d.inst(conv_integer(r.d.set)) = X"80102000") THEN
        r.w.s.s <= '1';
    END IF;

Trigger: Instruction codes 80082000 and 80102000 translate to:
    AND R0, #0
    OR R0, #0
[VHDL code for Leon3 processor. Example from King, Hicks]

Execution Flow
3D Execution Monitor Example

Execution Flow
• With respect to the processor only, execution flow is correct only if:
  • The opcode is correct: not modified since being input to the chip, or retrieved from local memory (cache)
  • The control flow precisely follows the design specification for that instruction opcode
• In the following example, we assume the instruction opcode is correct, and look to verify that the execution follows the design specification
• Consider an example bus-based MIPS architecture...

A MIPS Bus-Based Architecture
[Figure: bus-based MIPS datapath. Control inputs: Opcode, zero?, busy?, ALU Op., load IR, load A, load B, load MA, Reg. Sel., Reg. Wrt., Mem. Wrt., Ex. Sel., En. Imm., En. ALU, En. Reg., En. Mem. Register file entries selected by rs, rt, rd, 30 (Link), 31 (PC). Blocks: IR, Immediate Extended, A, B, ALU, MA, Registers, On-Chip Memory, Bus]
Source: MIT Open Courseware 6.823

State: Micro Code           ld IR  Reg Sel  Reg Write  en Reg  ld A  ld B  ALU Op   en ALU  ld MA  Mem Write  en Mem  Ex Sel  en Imm  μBranch Type  Next State
FETCH_0: MA <- PC; A <- PC  0      PC       0          1       1     *     *        0       1      *          0       *       0       Next
         IR <- Mem          1      *        *          0       0     *     *        0       0      0          1       *       0       Next
         PC <- A + 4        0      PC       1          1       0     *     INC_A_4  1       *      *          0       *       0       Dispatch      per IR
NOP_0:   branch to fetch    0      *        *          0       *     *     *        0       *      *          0       *       0       Jump          FETCH_0

Register-Register Add
[Figure: the same datapath, shown on three successive slides, one per microstep of the ADD below]

State: Micro Code           ld IR  Reg Sel  Reg Write  en Reg  ld A  ld B  ALU Op   en ALU  ld MA  Mem Write  en Mem  Ex Sel  en Imm  μBranch Type  Next State
ADD_0:   A <- rs            0      rs       0          1       1     *     *        0       *      *          0       *       0       Next
         B <- rt            0      rt       0          1       *     1     *        0       *      *          0       *       0       Next
         rd <- result       0      rd       1          0       *     *     ADD      1       *      *          0       *       0       Jump          FETCH_0

Malicious Inclusion
Upon observing an ALU operation with special operands as triggers, this malicious inclusion commands the ALU result to be written from the bus to a specified "secret" address in local memory (in addition to being written to the destination register). As a result, the result value is "leaked".
[Figure: trigger circuit. Equality comparators check the A and B operand registers against the trigger values FF0A and 13D0; when both match, Mem Write and load MA are asserted and the address 99FF is placed in MA]

Register-Register Add
[Figure: the compromised datapath, shown on three successive slides. Step 1: A is loaded with FF0A from rs. Step 2: B is loaded with 13D0 from rt. Step 3: the ALU result 12DA is driven onto the bus and written to rd; MA holds 99FF and Mem. Wrt. is asserted, so 12DA is also written to memory]

In the third microstep, ld MA, Mem Write, and en Mem are driven to 1 1 1 instead of the specified * * 0:

State: Micro Code           ld IR  Reg Sel  Reg Write  en Reg  ld A  ld B  ALU Op   en ALU  ld MA  Mem Write  en Mem  Ex Sel  en Imm  μBranch Type  Next State
ADD_0:   A <- rs            0      rs       0          1       1     *     *        0       *      *          0       *       0       Next
         B <- rt            0      rt       0          1       *     1     *        0       *      *          0       *       0       Next
         bus <- result      0      rd       1          0       *     *     ADD      1       1      1          1       *       0       Jump          FETCH_0

But What if We Can Monitor the Bus and the Control Signals?
[Figure: the same datapath with "Control Plane Monitor Point" markers on the bus and control signals]

Register-Register Add
[Figure: the monitored datapath, shown on three successive slides with the same operand values (FF0A, 13D0, 12DA, 99FF)]

With every control signal observed, the expected values contain no don't-cares. The last row shows what the monitor sees in the third microstep of the compromised ADD:

State: Micro Code           ld IR  Reg Sel  Reg Write  en Reg  ld A  ld B  ALU Op   en ALU  ld MA  Mem Write  en Mem  Ex Sel  en Imm  μBranch Type  Next State
ADD_0:   A <- rs            0      rs       0          1       1     0     0        0       0      0          0       0       0       Next
         B <- rt            0      rt       0          1       0     1     0        0       0      0          0       0       0       Next
         rd <- result       0      rd       1          0       0     0     ADD      1       0      0          0       0       0       Jump          FETCH_0
observed (step 3)           0      rd       1          0       0     0     ADD      1       1      1          1       0       0       Jump          FETCH_0

Incorrect flow based on current state

[Figure: finite automaton with states FETCH_0, ADD_0, ADD_1, ADD_2, LOAD_0, MULT_0, ..., and FAULT. From FETCH_0 the opcode selects a branch (No_Op, Add, Load, Mult). The Add branch passes through ADD_0, ADD_1, and ADD_2 and returns to FETCH_0; its edges carry the signal vectors 0,0,0,1,1,0,0,0,0,0,0,0,0 and 0,1,0,1,0,1,0,0,0,0,0,0,0 and 0,2,1,0,0,0,1,1,0,0,0,0,0, which match the three rows of the table below. "All Else" edges lead to FAULT]

Hypothesis: The totality of control-type signals forms a stateful representation of a circuit's control flow, which can be modeled by a finite automaton

State: Micro Code           ld IR  Reg Sel  Reg Write  en Reg  ld A  ld B  ALU Op   en ALU  ld MA  Mem Write  en Mem  Ex Sel  en Imm  μBranch Type  Next State
ADD_0:   A <- rs            0      rs/0     0          1       1     0     0        0       0      0          0       0       0       Next
ADD_1:   B <- rt            0      rt/1     0          1       0     1     0        0       0      0          0       0       0       Next
ADD_2:   bus <- result      0      rd/2     1          0       0     0     ADD/1    1       0      0          0       0       0       Jump          FETCH_0

Execution Flow
• Since we are describing an execution monitor in this section, we compare its claimed properties to the execution monitor (EM) definitions from [Schn00]
  • Ψ: the universe of all possible sequences
  • Σ_S: a subset of Ψ corresponding to the execution of some target S
• Definition: a security policy is specified by giving a predicate on sets of executions. A target S satisfies security policy P if and only if P(Σ_S) is true.

Execution Flow
• Execution monitors (EMs) [Schn00]
  • Any security policy P that can be enforced by an enforcement mechanism on an execution set Π must be able to be specified by a predicate of the form:
      P(Π) : (∀σ ∈ Π : P̂(σ))
    where P̂ is a predicate on individual executions σ
  • DFAEM meets this criterion
  • DFAEM must also terminate an unauthorized execution after some finite time
    • The system would do this by halting on the FAULT state, then executing some prescribed corrective action

Execution Flow
• DFAEM meets the criteria for a security automaton, as described in [Schn00].
• Needs to be able to accept infinite-length inputs
• Acceptance requires revisiting at least one accepting state an infinite number of times (in this case, the FETCH_0 state)
[Figure: the execution-flow automaton again: FETCH_0 branching on No_Op, Add, and Load; the ADD_0, ADD_1, ADD_2 path labeled with the signal vectors 0,0,0,1,1,0,0,0,0,0,0,0,0 and 0,1,0,1,0,1,0,0,0,0,0,0,0 and 0,2,1,0,0,0,1,1,0,0,0,0,0; LOAD_0; "All Else" edges to FAULT]

Execution Flow
• Malicious Changes or Errors?
  • Though malicious inclusions may motivate 3D security designs, at this level a malicious change may be indistinguishable from a design flaw or a transient electrical error
  • A 3D execution monitor detects any deviation from the specified behavior, and therefore would detect malicious inclusions, design flaws, and transient errors in the same manner
  • Though malicious inclusions are the motivation, a processor EM might also be useful for detecting transient errors or flaws

Execution Flow
• Primary research tasks
  • Examine techniques for identification of the control signals which must be monitored
  • Show the connection between established "execution monitor" theory and the 3D monitoring of a processor's instruction execution flow
  • Design a working execution monitor in HDL for one or more simple processors
  • Demonstrate detection of execution-flow malicious inclusions, in software simulation
  • Demonstrate 3D execution monitor cost metrics – number of gates, time requirements, number of automaton states required

Planned Experiments and Supporting Techniques

Processor Execution Monitor
• Simple Processor Demonstration First
  • Yes: Uniprocessor, simple cache or memory model
  • No: Multicore, superscalar, CISC, advanced features
  • Maybe: Pipelining and interrupt support
• Candidates from OpenCores.org
  • ZPU – stack-based 32-bit reduced MIPS
  • Plasma – register-based MIPS-1 with interrupts, cache
  • OpenRISC1200 – register-based, fuller MIPS ISA

Processor Execution Monitor
• Tools for generating 3D monitor
  • VHDL and Verilog FSMs – integrate with existing HDL
  • Run in ModelSim or Xilinx ISE (ZPU in ModelSim)

Research Areas
• Identify Control Signals
  • Analyze HDL for dependency on the opcode value in the instruction register
  • Modify existing lex/yacc or other HDL parser to obtain control signal dependency graphs
• Identify Malicious Inclusions (or Design Errors)
  • Examine the HDL control signal dependency graph for anomalous dependencies; see if they represent malicious inclusion HDL code or design errors
  • Examine the HDL code for exceptionally rare control-circuit events (on the order of 1 in 2^32, for example) to identify possible "triggers", like the Leon3 example

Research Areas
• Touching on the other areas
  • Control/Test/Debug and "Keep Alive" circuits
    • Discussion of the vulnerabilities created by JTAG, debug, and control circuits
    • Discussion of moving these circuits to the control plane
    • No simulation experiments
  • Data Paths, Storage and Retrieval, Arithmetic and Logic
    • General discussion of potential 3D security implementations
    • No simulation experiments

Related Work, Limitations, and Concerns

Related Work
• 3D Introspective Monitor [Mys06]
  • Demonstrated physical feasibility of 3D-attached monitoring chip with ~1,000 TSV attachments, in terms of area, heat, and power
• Traditional Detection of Malicious Inclusions
  • Power and timing analysis
  • Comparison against "golden" sample
  • Works if difference in number of gates is sufficiently large
  • Does not account for design phase subversions
  • May or may not detect untriggered subversions

Recent Closely-Related Work
• Analyzing HDL for Malicious Inclusions
  • "Blue Chip" paper, Oakland 2010 [Hicks10]
  • Uses interrupts, hardware-software handoff, "code coverage" analysis
• Self-monitoring logic in processors
  • Each modular processor unit uses custom logic to monitor another processor unit [Waks10]
• Reprogrammable on-chip monitoring [Abram09]
  • Add reprogrammable security blocks to ASICs
  • Build state machines to monitor processor components
  • Proposed in general terms only; 2-page industry paper
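
Illustration: a state-machine monitor for the ADD path
The execution-flow monitor from the register-register ADD example is itself a small state machine, and its behavior can be sketched in software. The Python below is a minimal sketch under stated assumptions, not the proposed HDL monitor: the names TRANSITIONS, check_add, CLEAN_ADD, and LEAKY_ADD are hypothetical, the opcode dispatch out of FETCH_0 is not modeled, and the 13-value signal vectors are copied from the ADD_0, ADD_1, and ADD_2 microcode rows in that example (Reg Sel encoded 0/1/2 for rs/rt/rd, ALU Op 1 for ADD).

```python
"""Sketch of a DFA-style execution monitor for one register-register ADD."""

FETCH_0, ADD_0, ADD_1, ADD_2, FAULT = "FETCH_0", "ADD_0", "ADD_1", "ADD_2", "FAULT"

# Signal order: ld IR, Reg Sel, Reg Write, en Reg, ld A, ld B, ALU Op,
# en ALU, ld MA, Mem Write, en Mem, Ex Sel, en Imm.
# Assumed encodings: Reg Sel rs/rt/rd = 0/1/2, ALU Op ADD = 1.
# Each state maps to (expected signal vector, next state).
TRANSITIONS = {
    ADD_0: ((0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0), ADD_1),    # A <- rs
    ADD_1: ((0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0), ADD_2),    # B <- rt
    ADD_2: ((0, 2, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0), FETCH_0),  # bus <- result
}


def check_add(trace):
    """Walk one ADD instruction, starting in ADD_0.

    Returns (final_state, fault_cycle). Any vector that differs from the
    expected one takes the "All Else" edge to FAULT.
    """
    state = ADD_0
    for cycle, observed in enumerate(trace):
        expected, next_state = TRANSITIONS[state]
        if tuple(observed) != expected:
            return FAULT, cycle
        state = next_state
        if state == FETCH_0:  # instruction complete; dispatch is not modeled
            break
    return state, None


# A trace that follows the specification exactly.
CLEAN_ADD = [TRANSITIONS[s][0] for s in (ADD_0, ADD_1, ADD_2)]

# The compromised third microstep: ld MA, Mem Write, en Mem are 1 1 1.
LEAKY_ADD = CLEAN_ADD[:2] + [(0, 2, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0)]

if __name__ == "__main__":
    print(check_add(CLEAN_ADD))  # ('FETCH_0', None)
    print(check_add(LEAKY_ADD))  # ('FAULT', 2)
```

The table-driven form keeps the monitor's size proportional to the number of microcode states, which is one of the cost metrics the research tasks propose to measure. A hardware version would compare the signals every clock cycle rather than walk a recorded trace.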
Research Limitations
• 3D Design Tools
  • Physical layout (floorplanning) tools for 3D are under development
  • 3D HDL design tools not available, and VHDL and Verilog do not have 3D-specific language extensions
  • Use existing design tools, but assume 1-2 clock cycle latency getting signals to the monitor, based on literature
• 3D Monitor Complexity
  • For multicore, deeply pipelined, complex instruction set processors, a full-featured 3D monitor would also be very complex
  • Start small and simple

Research Concerns
• No way to find all malicious inclusions or design flaws
  • Proposed approach is complementary to existing methods
• An attacker who is knowledgeable of the monitoring scheme could bypass the monitored signals
  • We would want to isolate the 3D monitor development from the target processor development to avoid this – develop them in parallel

Research Concerns
• Opinion of some HOST 2010 participants: detecting all MIs using a real-time monitor is difficult, since the MI designer could bypass the monitor
• In 2D design space, the target and monitor are designed together; a 3D monitor could be developed semi-independently, and therefore isolated better from an attacker
• We're not looking to detect all malicious inclusions, just those from one of the four identified categories

Preliminary Work
• ZPU: open source, stack-based, 32-bit, MIPS subset
• Monitor mirrors ZPU execution state flow
• Monitor imports necessary signals, compares the ZPU's actual state against the expected state, based on the signals observed
• Sets a "predicate" signal to false if a difference is detected
[Figure: simulation traces labeled "Expected:" and "Observed:"]

Proposed Contributions

Proposed Contributions
• Establish the connection between existing "execution monitor" theory and the 3D monitoring of a processor's instruction execution flow
• Design a working execution monitor in HDL for one or more simple processors
• Demonstrate detection of execution-flow malicious inclusions, in software simulation
• Demonstrate 3D execution monitor cost metrics – number of gates, time requirements, number of automaton states required
• Explore techniques for automatically identifying which processor control signals should be monitored

Proposed Contributions
• Supporting contributions
  • Map the common microprocessor architectural features to the types of malicious inclusions to which they are vulnerable
  • Identify the key computation-plane architectural components which must be monitored by posts for successful 3D detection of malicious inclusions, and explain their importance
  • Discuss the limits of what can be reasonably detected using these methods; identify the tradeoffs of monitor size and number of monitor points against performance, and describe an optimum balance of performance and security

Applying Security Techniques to 3D Fabrication Methods for General Purpose Processors
Questions and Comments?

References
• [Schn00] Fred Schneider. Enforceable Security Policies. ACM Transactions on Information and System Security, Vol. 3, No. 1, February 2000, pages 30-50.
• [Tehr10] Mohammad Tehranipoor and Farinaz Koushanfar. A Survey of Hardware Trojan Taxonomy and Detection. IEEE Design and Test of Computers, Vol. 27, No. 1, January/February 2010.
• [TIIC07] Brian Sharky. DARPA TRUST in Integrated Circuits Program, Industry Day Brief. 26 March 2007.
• [Dav05] W. Rhett Davis, John Wilson, Stephen Mick, Jian Xu, Hao Hua, Christopher Mineo, Ambarish M. Sule, Michael Steer, and Paul D. Franzon. Demystifying 3D ICs: The Pros and Cons of Going Vertical. IEEE Design & Test of Computers, November-December 2005.
• [King10] Rachael King. Fighting a Flood of Counterfeit Tech Products. Business Week, 1 March 2010.
• [Grow08] Brian Grow, Chi-Chu Tschang, Cliff Edwards, and Brian Burnsed. Dangerous Fakes. Business Week, 2 October 2008.
• [Mys06] Mysore, Agrawal, Srivastava, Lin, Banerjee, and Sherwood. Introspective 3D Chips. International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 21-25, 2006, San Jose, CA.
• [Hicks10] Matt Hicks, Murph Finnicum, Sam King, Milo Martin, and Jonathan Smith. Overcoming an Untrusted Computing Base: Detecting and Removing Malicious Hardware Automatically. Proceedings of the 31st IEEE Symposium on Security & Privacy (Oakland), May 2010.
• [Adee08] Sally Adee. The Hunt for the Kill Switch. IEEE Spectrum, May 2008. http://spectrum.ieee.org/semiconductors/design/the-hunt-for-the-kill-switch
• [Villa10] John Villasenor. The Hacker in Your Hardware. Scientific American, August 2010.
• [Waks10] Adam Waksman and Simha Sethumadhavan. Tamper Evident Microprocessors. IEEE Security and Privacy, May 2010.
• [Abram09] Miron Abramovici and Paul Bradley. Integrated Circuit Security – New Threats and Solutions. CSIRW, April 2009.