ECE 551 Digital System Design & Synthesis
Lecture 09
Synthesis of Common Verilog Constructs

Overview
  Don’t cares and z’s in synthesis
  Unintentional latch synthesis
  Synthesis & flip-flops
  Gating the clock… and alternatives
  Loops in synthesis

Synthesis Of x And z
  The only allowable use of x is as a “don’t care”, since x cannot actually exist in hardware
    in casex
    in casez
    in defaults of conditionals such as if, case, ?
  Only allowable uses of z:
    As a “don’t care”
    Constructs implying a 3-state output

Don’t Cares
  x assigned as a default in a conditional, case, or casex
  x, ?, or z within a case item expression in casex
    Does not actually output “don’t cares”!
    Values for which the input comparison is ignored
  Lets the synthesizer choose between 0 or 1
    It chooses whichever helps meet design goals
  z, ? within a case item expression in casez
    Same conditions as listed above, except that it does not apply to x

Don’t Cares Example [1]
  Give the synthesizer flexibility in optimization

    case (state)
      3'b000: out = 1'b0;
      3'b001: out = 1'b1;
      3'b010: out = 1'b1;
      3'b011: out = 1'b0;
      3'b100: out = 1'b1;
      // below states not used
      3'b101: out = 1'b0;
      3'b110: out = 1'b0;
      3'b111: out = 1'b0;
    endcase

  VS

    case (state)
      3'b000: out = 1'b0;
      3'b001: out = 1'b1;
      3'b010: out = 1'b1;
      3'b011: out = 1'b0;
      3'b100: out = 1'b1;
      default: out = 1'bx;
    endcase

Don’t Cares Example [2]
  Synthesis results for JUST the output calculation:
    Area = 21.04  VS  Area = 15.24

Don’t Cares Example [3]
  casex can reduce the number of conditions specified and may aid synthesis

    casex (state)
      3'b000: out = 1'b1;
      3'b001: out = 1'b1;
      3'b010: out = 1'b1;
      3'b011: out = 1'b1;
      3'b100: out = 1'b0;
      3'b101: out = 1'b0;
      3'b110: out = 1'b1;
      3'b111: out = 1'b1;
    endcase

  VS

    casex (state)
      3'b0??: out = 1'b1;
      3'b10?: out = 1'b0;
      3'b11?: out = 1'b1;
    endcase

Synthesis Of z
  Example 1: (Simple Tri-State Bus)

    assign bus_line = (en) ?
        data_out : 32'bz;

  Example 2: (Bidirectional Bus)

    // bus_line receives from data when en is 0
    assign bus_line = (!en) ? data : 32'bz;
    // bus_line sends to data when en is 1
    assign data = (en) ? bus_line : 32'bz;

Combinational Logic Hazards
  Avoid structural feedback in continuous assignments and combinational always blocks

    assign z = a | y;
    assign y = b | z; // feedback on y and z signals!

  Avoid incomplete sensitivity lists in combinational always blocks

    always @(*) // helps to avoid latches

  For conditional assignments, either:
    Set default values before the statement
    Make sure the LHS has a value in every branch/condition
  For a warning, set hdlin_check_no_latch to true before compiling

Synthesis Example [1]

    module latch2(input a, b, c, d, output reg out);
      always @(a, b, c, d) begin
        if (a) out = c | d;
        else if (b) out = c & d;
      end
    endmodule

  Area = 44.02

Synthesis Example [2]

    module no_latch2(input a, b, c, d, output reg out);
      always @(a, b, c, d) begin
        if (a) out = c | d;
        else if (b) out = c & d;
        else out = 1'b0;
      end
    endmodule

  Area = 16.08

Synthesis Example [3]

    module no_latch2_dontcare(input a, b, c, d, output reg out);
      always @(a, b, c, d) begin
        if (a) out = c | d;
        else if (b) out = c & d;
        else out = 1'bx;
      end
    endmodule

  When can this be done?
  Area = 12.99

Registered Combinational Logic
  Combinational logic followed by a register

    always @(posedge clk)
      y <= (a & b) | (c & d);

    always @(negedge clk)
      case (select)
        3'd0: y <= a;
        3'd1: y <= b;
        3'd2: y <= c;
        3'd3: y <= d;
        default: y <= 8'bx;
      endcase

  The above behaviors synthesize to combinational logic followed by D flip-flops.
  Combinational inputs (e.g., a, b, c, and d) should not be included in the sensitivity list.

Flip-Flop Synthesis
  Each variable assigned in an edge-controlled always block implies a flip-flop.

Gated Clocks
  Use only if necessary (e.g., for low power)
  Clock gating is for experts!

    module gated_dff(clk, en, data, q);
      input clk, en, data;
      output reg q;
      wire my_clk = clk & en;
      always @(posedge my_clk)
        q <= data;
    endmodule

  When en is zero, the flip-flop won’t update.
  What problems might this cause?

Alternative to Gated Clocks

    module enable_dff(clk, en, data, q);
      input clk, en, data;
      output reg q;
      always @(posedge clk)
        if (en) q <= data;
    endmodule

  The flip-flop output can only change when en is asserted.
  How might this be implemented?

Clock Generators
  Useful when you need to gate large chunks of your circuit.
  The cost of adding individual enable signals adds up.

    module clock_gen(pre_clk, clk_en, post_clk);
      input pre_clk, clk_en;
      output post_clk;
      assign post_clk = clk_en & pre_clk;
    endmodule

  Key: If you do this, you cannot use pre_clk on any flip-flops! Exclusively use post_clk.
  This is still mostly recommended only for experts.

Loops
  A loop is static (data-independent) if the number of iterations is fixed at compile time
  Loop Types
    Static without internal timing control
      Combinational logic
    Static with internal timing control
      Sequential logic
    Non-static without internal timing control
      Not synthesizable
    Non-static with internal timing control
      Sometimes synthesizable, sequential logic

Static Loops w/o Internal Timing
  Combinational logic results from “loop unrolling”
  Example

    always @(a) begin
      andval[0] = 1;
      for (i = 0; i < 4; i = i + 1)
        andval[i + 1] = andval[i] & a[i];
    end

  What would this look like?
  How can we register the outputs?

Static Loops with Internal Timing
  If a static loop contains an internal edge-sensitive event control expression, then activity is distributed over multiple cycles of the clock

    always begin
      for (i = 0; i < 4; i = i + 1)
        @(posedge clk) sum <= sum + i;
    end

  What does this loop do?
  This might synthesize, but why take your chances?

Non-Static Loops w/o Internal Timing
  Number of iterations is variable
    Not known at compile time
  Can be simulated, but not synthesized!
  Essentially an iterative combinational circuit of data-dependent size!

    always @(a, n) begin
      andval[0] = 1;
      for (i = 0; i < n; i = i + 1)
        andval[i + 1] = andval[i] & a[i];
    end

  What if n is a parameter?

Non-Static Loops with Internal Timing
  Number of iterations determined by
    Variable modified within the loop
    Variable that can’t be determined at compile time
  Due to internal timing control:
    Distributed over multiple cycles
    Number of cycles determined by the variable above
    Variable must still be bounded

    always begin
      continue = 1'b1;
      for (; continue; ) begin
        @(posedge clk) sum = sum + in;
        if (sum > 8'd42) continue = 1'b0;
      end
    end

  Timing controls inside the always block make design guidelines fuzzy…
  Confusing, and best to avoid.

Unnecessary Calculations
  Expressions that are fixed in a for loop are replicated due to “loop unrolling.”
  Solution: Move fixed (unchanging) expressions outside of all loops.

    for (x = 0; x < 5; x = x + 1) begin
      for (y = 0; y < 9; y = y + 1) begin
        index = x*9 + y;
        value = (a + b)*c;
        mem[index] = value;
      end
    end

  Which expressions should be moved?

Unnecessary Calculations
  After moving the fixed expressions outside the loops:

    value = (a + b)*c;
    for (x = 0; x < 5; x = x + 1) begin
      index = x*9;
      for (y = 0; y < 9; y = y + 1) begin
        mem[index+y] = value;
      end
    end

  Not so different from loop optimization in software.

FSM Replacement for Non-Static Loops
  Not all loop structures are supported by vendors
  Can always implement a loop with internal timing using an FSM
    Can make a “while” loop easily by rewriting it as an FSM
    Often use counters along with the FSM
  All (good) synthesizers support FSMs!
  Synopsys supports for-loops with a static number of iterations
  Don’t bother with “fancy” loops outside of testbenches

Review Questions
  Draw the Karnaugh map for each statement, assuming each variable is one bit.

    if (c) z = a | b;
    else z = 0;

    if (c) z = a | b;
    else z = 1'bx;

  Give a simplified Boolean expression for each statement.
  What would you expect to get from the following expression?

    if (c) z = a & b;

Review Questions
  Rewrite the following code so that it takes two clock cycles and can use only a single multiplier and adder to compute z.
  Draw potential hardware for each piece of code.

    reg [15:0] z;
    reg [7:0] a, b, c, d;
    always @(posedge clock)
      z = a*b + c*d;

Review Questions
  What hardware would you expect from the following code?

    module what(input clk, input [7:0] b, output [7:0] c);
      reg [7:0] a[0:3];
      integer i;
      always @(posedge clk) begin
        a[0] <= b;
        for (i = 1; i < 4; i = i + 1)
          a[i] <= a[i-1] + b;
      end
      assign c = a[3];
    endmodule
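
Supplementary sketch: FSM Replacement for Non-Static Loops
  The slide on FSM replacement says a loop with internal timing can always be rewritten as an FSM but gives no code. Below is one possible rewrite of the earlier "accumulate in until sum exceeds 42" loop. It is a minimal sketch, not the lecture's own solution: the module name sum_until_fsm, the rst, start, and done ports, the state names, and the choice to clear sum when start is seen are all assumptions made for illustration.

```verilog
// Hypothetical FSM version of the non-static loop with internal timing.
// All names other than clk, in, and sum are illustrative assumptions.
module sum_until_fsm (
    input  wire       clk,
    input  wire       rst,    // assumed synchronous, active-high reset
    input  wire       start,  // assumed request to begin accumulating
    input  wire [7:0] in,
    output reg  [7:0] sum,
    output reg        done    // high once sum has exceeded 42
);
    localparam IDLE  = 1'b0;
    localparam ACCUM = 1'b1;

    reg state;

    // Value sum will take on this clock edge; wraps at 8 bits like the original.
    wire [7:0] next_sum = sum + in;

    always @(posedge clk) begin
        if (rst) begin
            state <= IDLE;
            sum   <= 8'd0;
            done  <= 1'b0;
        end else begin
            case (state)
                IDLE: begin
                    if (start) begin
                        sum   <= 8'd0;   // assumption: each run starts from zero
                        done  <= 1'b0;
                        state <= ACCUM;
                    end
                end
                ACCUM: begin
                    // One loop iteration per clock edge.
                    sum <= next_sum;
                    // Exit test replaces the "continue" flag of the loop.
                    if (next_sum > 8'd42) begin
                        done  <= 1'b1;
                        state <= IDLE;
                    end
                end
            endcase
        end
    end
endmodule
```

  Every assignment sits in a single edge-controlled always block with no embedded timing controls, so each reg maps to flip-flops in the usual way. The number of cycles spent in ACCUM still depends on the data: if in stays at zero, done is never raised, which is the "variable must still be bounded" concern from the non-static loop slide.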