ESE535: Electronic Design Automation
Day 13: February 27, 2013
Dataflow
Penn ESE535 Spring 2013 -- DeHon

Previously
• Scheduling of concurrent operations
[Figure: task graph of operations A1–A13 and B1–B11]

Want to See
• Abstract compute model
  – Natural for parallelism and hardware
• Describe computation abstracted from implementation
  – Defines correctness

Today
• Dataflow
• SDF
  – Single rate
  – Multirate
• Dynamic Dataflow
• Expression
[Figure: design flow with stages Behavioral (C, MATLAB, …), Arch. Select, Schedule, RTL, FSM assign, Two-level/Multilevel opt., Covering, Retiming, Gate Netlist, Placement, Routing, Layout, Masks]

Parallelism Motivation

Producer-Consumer Parallelism
[Figure: Stock predictions → encrypt]
• Can run concurrently
• Just let consumer know when producer is sending data

Pipeline Parallelism
[Figure: ME → DCT → VQ → code]
• Can potentially all run in parallel
• Like a physical pipeline
• Useful to think about a stream of data between operators

DAG Parallelism
[Figure: Decode video, Check/decode block, Decode audio, synchronize]
• Doesn’t need to be a linear pipeline
• Synchronize inputs

Graphs with Feedback
[Figure: graph of adders (+) and ×k multipliers with feedback]
• In general may hold state
• Very natural for many tasks

Definitions

Dataflow / Control Flow
Dataflow
• Program is a graph of operators
• Operator consumes tokens and produces tokens
• All operators run concurrently
Control flow (e.g. C)
• Program is a sequence of operations
• Operator reads inputs and writes outputs into a common store
• One operator runs at a time
  – Defines its successor

Token
• Data value with presence indication
  – May be conceptual
    • Only exists in the high-level model
    • Not kept around at runtime
  – Or may be physically represented
    • One bit represents presence/absence of data

Token Examples?
• What are familiar cases where data may come with presence tokens?
  – Network packets
  – Memory references from processor
    • Variable latency depending on cache presence
  – Start bit on serial communication

Operator
• Takes in one or more inputs
• Computes on the inputs
• Produces results
• Logically self-timed
  – “Fires” only when input set is present
  – Signals availability of output

Dataflow Graph
• Represents
  – Computation sub-blocks
  – Linkage
• Abstractly
  – Controlled by data presence

Dataflow Graph Example

In-Class Dataflow Example

Stream
• Logical abstraction of a persistent point-to-point communication link
  – Has a (single) source and sink
  – Carries data presence / flow control
  – Provides in-order (FIFO) delivery of data from source to sink
[Figure: stream]

Streams
• Capture communications structure
  – Explicit producer-consumer link up
• Abstract communications away from
  – Physical resources or implementation
  – Delay from source to sink

Register Transfer Level (RTL)
• Describe computation as logic and registers
• Equations (logic) define values to be clocked into the next register
• Typically what you write in VHDL, Verilog
[Figure: design-flow diagram, as on the Today slide]
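The token, operator, and stream definitions above can be made concrete with a small simulation. This is a minimal sketch, not the course's implementation: the class names `Stream` and `Operator` and the helper `run` are hypothetical, and a token is assumed to be "present" exactly when the stream's FIFO is non-empty.

```python
from collections import deque
from operator import add, mul, sub


class Stream:
    """Point-to-point FIFO link with a single source and a single sink."""

    def __init__(self):
        self._fifo = deque()

    def has_token(self):
        # Presence indication: a token exists iff the FIFO is non-empty.
        return len(self._fifo) > 0

    def write(self, value):
        self._fifo.append(value)

    def read(self):
        return self._fifo.popleft()


class Operator:
    """Self-timed operator: fires only when every input holds a token."""

    def __init__(self, fn, inputs, output):
        self.fn, self.inputs, self.output = fn, inputs, output

    def can_fire(self):
        return all(s.has_token() for s in self.inputs)

    def fire(self):
        args = [s.read() for s in self.inputs]
        self.output.write(self.fn(*args))


def run(operators):
    """Keep firing any ready operator until no operator can fire."""
    fired = True
    while fired:
        fired = False
        for op in operators:
            if op.can_fire():
                op.fire()
                fired = True


# Graph for (a + b) * (a - b). Each stream has one sink, so the
# environment writes a and b once per consumer.
a_add, b_add, a_sub, b_sub, s, d, out = (Stream() for _ in range(7))
ops = [
    Operator(mul, [s, d], out),      # listed first on purpose: order is irrelevant
    Operator(add, [a_add, b_add], s),
    Operator(sub, [a_sub, b_sub], d),
]
for a, b in [(3, 1), (5, 2)]:
    a_add.write(a); a_sub.write(a)
    b_add.write(b); b_sub.write(b)
run(ops)

results = []
while out.has_token():
    results.append(out.read())
```

The operators are deliberately listed out of dependency order: because firing is controlled by data presence alone, the result does not depend on the order in which the scheduler visits them.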

Dataflow Abstracts Timing
• Doesn’t say
  – On which cycle a calculation occurs [contrast RTL]
• Does say
  – What order operations occur in
  – How data interacts
    • i.e. which inputs get mixed together
• Permits
  – Scheduling on different numbers of resources
  – Operators with variable delay [examples?]
  – Variable delay in interconnect [examples?]

Examples
• Operators with variable delay
  – Cached memory or computation
  – Shift-and-add multiply
  – Iterative divide or square-root
• Variable delay interconnect
  – Shared bus
  – Distance changes
    • Wireless, longer/shorter cables
  – Computation placed on different cores?

Difference: Dataflow Graph/Pipeline

Clock Independent Semantics
[Figure: interconnect takes n clocks]

Semantics
• Need to implement semantics
  – i.e. get the same result as if computed as indicated
• But can implement any way we want
  – That preserves the semantics
  – Exploit freedom of implementation

Dataflow Variants

Synchronous Dataflow (SDF)
• Particular, restricted form of dataflow
• Each operator
  – Consumes a fixed number of input tokens
  – Produces a fixed number of output tokens
  – When the full set of inputs is available
    • Can produce output
  – Can fire any (all) operators with inputs available at any point in time

Synchronous Dataflow
[Figure: graph of adders (+) and ×k multipliers]

SDF: Execution Semantics
    while (true)
        Pick up any operator
        If operator has full set of inputs
            Compute operator
            Produce outputs
            Send outputs to consumers

Multirate Synchronous Dataflow
• Rates can be different
  – Allow lower frequency operations
  – Communicates rates to CAD
    • Something not clear in RTL
    • Use in scheduling, provisioning
  – Rates must be constant
    • Data independent
[Figure: 2 → decimate → 1]

SDF
• Can validate flows to check that the graph is legal
  – Like KCL: token flow must be conserved
  – No node should
    • Be starved of tokens
    • Collect tokens
• Schedule onto processing elements
  – Provisioning of operators
• Provide real-time guarantees
• Simulink is an SDF model

SDF: good/bad graphs
[Figures: two slides of example SDF graphs annotated with per-edge token rates of 1 and 2]

Dynamic Rates?
• When might static rates be limiting?
  – Compress/decompress
    • Lossless
    • Even Run-Length-Encoding
  – Filtering
    • Discard all packets from Geraldo
  – Anything data dependent

Data Dependence
• Add Two Operators
  – Switch
  – Select

Switch

Filtering Example
[Figure: "Geraldo?" test, dup, switch, discard]

Select

Constructing If-Then-Else

Select Example
In-Order Merge of Streams (smallest to largest)
[Figure: merge graph built from select, switch, and > (compare) operators]

Looping
• for (i=0;i<Limit;i++)

Universal
• Once we add switch and select, the dataflow model is as powerful as any other
  – E.g. can do anything we could do in C
  – “Turing Complete” in formal CS terms

Dynamic Challenges
• In general, cannot say
  – If a graph is well formed
    • Will not deadlock
  – How many tokens may have to be buffered in a stream
  – Right proportion of operators for the computation

[Instructor note: this part feels a bit tacked-on … does not flow that well with the rest. Might think about how to do better.]

Expression
How would we capture this in a Programming Language?
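Looking back at the SDF validation slide, the KCL-like rule that token flow must be conserved can be checked mechanically by solving the balance equations: for every edge, (firings of source) × (tokens produced) must equal (firings of sink) × (tokens consumed). The sketch below is an illustration under stated assumptions, not the course's tool: the function name `repetition_vector` and the edge-tuple layout are hypothetical, and the graph is assumed to be connected.

```python
from fractions import Fraction
from math import gcd


def repetition_vector(nodes, edges):
    """Solve the SDF balance equations.

    edges holds (src, produced, dst, consumed) tuples. Returns the smallest
    positive integer firing counts that conserve tokens on every edge, or
    None when no such counts exist. Assumes the graph is connected.
    """
    rate = {nodes[0]: Fraction(1)}
    changed = True
    while changed:  # propagate relative firing rates along edges
        changed = False
        for src, produced, dst, consumed in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produced / consumed
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consumed / produced
                changed = True
    if len(rate) != len(nodes):
        raise ValueError("graph must be connected")
    # Every edge must balance, including edges that close a cycle or reconverge.
    for src, produced, dst, consumed in edges:
        if rate[src] * produced != rate[dst] * consumed:
            return None
    # Scale the fractional rates up to integers.
    scale = 1
    for r in rate.values():
        scale = scale * r.denominator // gcd(scale, r.denominator)
    return {n: int(r * scale) for n, r in rate.items()}


# Multirate example: decimate consumes 2 tokens and produces 1.
multirate = repetition_vector(
    ["src", "decimate", "sink"],
    [("src", 1, "decimate", 2), ("decimate", 1, "sink", 1)],
)

# Inconsistent example: C needs 2 firings per A firing via B,
# but the direct A -> C edge only supports 1, so C would be starved.
inconsistent = repetition_vector(
    ["A", "B", "C"],
    [("A", 1, "B", 1), ("B", 2, "C", 1), ("A", 1, "C", 1)],
)
```

A graph that returns `None` is one where some node is starved or some stream accumulates tokens without bound. A consistent result gives the relative operator usage rates that scheduling and provisioning can use.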
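One hedged way to see what switch and select do is to model them on finite streams. In this sketch, streams are approximated as Python lists consumed in order, which is an assumption made for brevity; the function names `switch`, `select`, and `if_then_else` are illustrative, not taken from the slides' figures.

```python
def switch(ctrl, data):
    """Route each data token to the true or false output, as chosen by
    the matching control token. Output rates are data dependent."""
    true_out, false_out = [], []
    for c, d in zip(ctrl, data):
        (true_out if c else false_out).append(d)
    return true_out, false_out


def select(ctrl, true_in, false_in):
    """For each control token, consume one token from the chosen input
    and forward it. Input rates are data dependent."""
    t_it, f_it = iter(true_in), iter(false_in)
    return [next(t_it) if c else next(f_it) for c in ctrl]


def if_then_else(pred, then_fn, else_fn, xs):
    """If-then-else built from one switch, two branch operators, one select."""
    ctrl = [pred(x) for x in xs]
    t, f = switch(ctrl, xs)
    return select(ctrl, [then_fn(x) for x in t], [else_fn(x) for x in f])


# Each token takes only the branch its predicate chooses,
# and select restores the original order.
result = if_then_else(lambda x: x % 2 == 0,
                      lambda x: x // 2,
                      lambda x: 3 * x + 1,
                      [1, 2, 3, 4])

# Filtering: a switch whose false output is simply discarded.
senders = ["alice", "geraldo", "bob"]
kept, _discarded = switch([s != "geraldo" for s in senders], senders)
```

Because the number of tokens on each switch output depends on the control values, the per-edge rates can no longer be fixed ahead of time, which is why the static SDF checks stop applying.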

Expression
• Could express operators in C/Java
  – Each is its own thread
• Link together with Streams
• E.g. SystemC

C Example
    // Loop until either input stream reaches end-of-stream
    while (!eos(stream_a) && !eos(stream_b)) {
        a = stream_a.read();
        b = stream_b.read();
        out = (a + b) * (a - b);
        stream_out.write(out);
    }

Connecting up Dataflow
    stream stream1 = new stream();
    operator prod = new stock(stream1);
    operator cons = new encrypt(stream1);
[Figure: Stock predictions → encrypt]

What’s the Point?
• Seen repeatedly: exploit freedom
  – To reduce costs
• A1: How do we capture the freedom that exists in a computation?
  – Higher-level than an implementation
  – Perhaps as a useful intermediate
• A2: How do we allow freedom for implementations (or instances) to take variable time?

Summary
• Dataflow Models
  – Simple pipelines
  – DAGs
  – SDF (single, multi)-rate
  – Dynamic Dataflow
• Allow
  – Expressing parallelism
  – Freedom of implementation

Big Ideas:
• Dataflow
  – Natural model for capturing computations
  – Communicates useful information for optimization
    • Linkage, operator usage rates
• Abstract representations
  – Leave freedom to implementation

Admin
• Homework 4 Due Today
• Spring Break next week
• Back on Monday 3/11
  – Reading on Blackboard
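As a closing sketch of the Expression slides (each operator as its own thread, linked by streams), the stock/encrypt connection can be approximated with Python's standard `threading` and `queue` modules. The `EOS` marker, the function bodies, and the XOR "encryption" are placeholders invented for illustration; only the producer/consumer wiring mirrors the Connecting up Dataflow slide.

```python
import queue
import threading

EOS = object()  # hypothetical end-of-stream marker


def stock(out_stream, predictions):
    """Producer operator: emits one token per prediction."""
    for p in predictions:
        out_stream.put(p)  # blocks when the stream is full (flow control)
    out_stream.put(EOS)


def encrypt(in_stream, out_stream, key=0x5A):
    """Consumer operator: XOR stands in for a real cipher."""
    while True:
        token = in_stream.get()  # blocks until a token is present
        if token is EOS:
            out_stream.put(EOS)
            return
        out_stream.put(token ^ key)


# A bounded queue models a stream with finite buffering.
stream1 = queue.Queue(maxsize=2)
stream2 = queue.Queue()

prod = threading.Thread(target=stock, args=(stream1, [10, 20, 30]))
cons = threading.Thread(target=encrypt, args=(stream1, stream2))
prod.start()
cons.start()
prod.join()
cons.join()

results = []
while True:
    token = stream2.get()
    if token is EOS:
        break
    results.append(token)
```

Neither thread knows how fast the other runs; the blocking `put` and `get` calls carry the data-presence and flow-control signalling, so the output order is fixed by the FIFO regardless of thread timing.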