What is an “SoC”?
• SoC = SOC = System on Chip = System on a Chip
• Wider use:
a Chip that implements a Complete System
• More common use:
a Chip with
one or more CPU cores,
Peripheral Interface Blocks,
and Dedicated HW Blocks
around a System Bus
What is ASIC, FPGA, SoC?
[Figure: chip types placed along a scale of how much is pre-designed]
Scale:
• Layout Not Pre-designed
• Individual Gates and Memory Pre-designed
• All Layout except Wires Pre-designed
• Whole Chip Pre-designed
Chip types: Full-custom, ASIC, Gate-array (Structured ASIC), SoC, FPGA
Inside an FPGA
CLB: Configurable Logic Block
An example of
an ARM based MCU SoC
From a Designer’s Perspective
• ASIC, FPGA, SoC: all the same from a designer’s point of view
• We are in the SoC age =>
• Shop for IP blocks
(IP block = Library block)
• Integrate them with each other and your design
What is ASIC?
• IC
• Full-custom IC
• IC = SP or ASSP
• SP
= Standard Product
= Memory chip, Processor
• ASSP = Application Specific Standard Product
= a USB interface chip, for example
• ASIC =>
Think of Vestel or Cisco – an equipment = box = system maker
that buys ICs (SP or ASSP) and puts them on a PCB.
They sometimes need extra logic =>
hence ASIC (Application Specific Integrated Circuit)
Contemporary (wider)
meaning of ASIC
• Previous slide described the original (narrow) meaning of ASIC
(how the word ASIC came about)
• Such chips required quick methods for design because:
• constraints in design time
• constraints in design personnel
• designs were not so aggressive
• This resulted in what we call: ASIC Design Flow
• Hence: an “ASIC Designer” doing “ASIC Design”
may be working on an SP
done in ASIC Design Flow
as opposed to Full-Custom Flow.
Why/when design your own chip
or customize an SoC?
As opposed to taking a CPU and writing code that runs on it
BECAUSE:
• CPU solution is not fast enough
(FPGA is slower but offers more parallelism)
• CPU is too expensive
• CPU sucks too much power
• CPU cannot meet the exact I/O timing requirements (no later, no earlier)
• CPU does not have the right number and mix of I/O pins
• Form-factor: CPU is too big and/or requires a heat sink, fan, and/or chip-set
a LOOK at the SECTOR
Top Semi Companies (2011)
1. Intel (USA): $50B
2. Samsung (Korea): $29B/$260B+
3. TSMC (Taiwan): $15B
4. TI (USA): $14B
5. Toshiba (Japan): $13B/$80B
6. Renesas (Japan): $11B
7. Qualcomm (USA): $10B
8. STMicro (Fr-Ita): $10B
9. Hynix (Korea): $9B
10. Micron (USA): $7B
11. Broadcom (USA): $7B
12. AMD (USA): $6B
13. Infineon (Germany): $5B
14. Sony (Japan): $5B/$90B
15. Freescale (USA): $4B
16. Elpida (Japan): $4B
17. NXP (Holland): $4B
18. UMC (Taiwan): $4B
19. NVIDIA (USA): $4B
20. Globalfoundries (USA): $4B
FPGA market size: $5B
Legend: Fabless semi; Fab = Foundry
Top FPGA (=PLD=CPLD) Companies
(all with HQs in the USA)
1. Xilinx: 49%
2. Altera: 40%
3. Lattice: 6%
4. Microsemi (was Actel): 4%
5. Quicklogic: 1%
DESIGN ISSUES
ASIC Implementation Flow
SW tools = $100K - $1M
• ASIC Design: 3-12 months
• Fabrication: ~ 2 months
• Package/Test: ~ 1 month
• Validation: ~ 1 month
NRE = $100K - $4M
ASIC:
• NRE
• Lower unit cost in high volume
• Faster
• Lower power
• Higher levels of integration
• More analog integration
FPGA:
• No NRE
• Lower unit cost in low volume
• Cheaper or free design tools
• Fast time to market
• Low barrier to entry
• Programmable
- Next few slides are Courtesy of Xilinx (DAC 2001)
ASIC Design Flow
• Specification & Arch. => spec (behav. code)
• Front-End Design => HDL RTL
• Front-End Verification (of the HDL RTL)
• Synthesis/Timing => HDL gates
• Back-End Design => Layout in GDSII
• Back-End Verification (Timing, GateSim, Formal, DRC, LVS)
ASIC Design Tool-set
• Front-End Design (HDL RTL): Editor
• Front-End Verification (HDL RTL): Simulator SW
• Synthesis/Timing (HDL gates): Synthesis SW, Stdcell Library
• Back-End Design (Layout in GDSII) and Back-End Verification (Timing, GateSim, Formal, DRC, LVS): Physical design, verif., DFT/ATPG SWs
Top EDA Companies
(all with HQs in the USA)
1. Synopsys: $1500M
2. Mentor Graphics: $900M
3. Cadence: $850M
4. Other: 27%
(Above are my 2010 estimates.
Total market size: $4.5B)
FPGA Design Flow
• Specification & Arch. => spec (behav. code)
• Front-End Design => HDL RTL
• Front-End Verification (of the HDL RTL)
• Synthesis, Back-end, Timing => Bitfile
• Back-End Verification (Timing, GateSim, Formal, DRC, LVS)
FPGA Design Tool-set for Xilinx
Xilinx ISE: Editor, Simulator, Synthesis, all in one IDE
• Front-End Design => HDL RTL
• Front-End Verification (of the HDL RTL)
• Synthesis, Back-end, Timing => Bitfile
MODERN DIGITAL DESIGN
- BASICS -
You hardly need anything you learned in your Logic course
in Modern (HDL and Synthesis based) Digital Design
because:
• We write code
• We don’t design circuits
• At least no gate-level circuits
• We don’t care about theorems in Boolean Algebra 
• We don’t care about Karnaugh-maps
• The synthesis SW (compiler)
does the logic minimization for us
• The FPGA has 1000s of gates anyway
• (OK, in some extreme cases we may need to care)
• Before we care about area minimization
we need to care about meeting timing
We write RTL code
What is RTL code?
What is the RTL programming paradigm?
What does RTL mean in the first place?
RTL = RT-Level = Register Transfer Level
What is RT-Level digital (logic) design?
Everything is a STATE MACHINE!
Your (RTL) code describes the logic cloud
[Figure: Inputs and storedVars (Flop outputs) feed a Cloud of Logic (Combinational); the cloud drives Outputs and storedVars_next, which goes back into the Flop (and more Flops)]
for ex. INCREMENTER
[Figure sequence: an INCREMENTER as the logic cloud, with flops clocked by clk; successive frames show the stored bit values changing over time]
Key points in this programming paradigm:
• What are we programming?
• How will we program?
(Any guidelines?)
• What is a “flop” by the way?
Flop: What is it?
Edge-Triggered D-Type Flip Flop
= D-Type Flip Flop
= Flip-Flop
= Flop
Edge-Triggered => Flip-Flop
as opposed to:
Level-Sensitive => Transparent Latch = Latch
[Figure: flop symbol with D, clk, and Q; waveforms of clk, D, and Q with each posedge of clk marked]
Flop = 1-bit DigiCam
Flop: explained with WAVEFORMS
2 Flops back to back = Shift Register
[Figure: two flops in series (D into flop1, Q1 into flop2, Q2 out), both clocked by clk, with waveforms of clk, D, Q1, Q2]
How a FLOP behaves (shown with a SHIFT REGISTER)
[Figure sequence, three frames:]
• t = before posedge clk: D = 1, flop1 holds 0 (Q1 = 0), flop2 holds 1 (Q2 = 1)
• t = posedge clk: flop1 captures 1, flop2 captures 0; the outputs are still Q1 = 0, Q2 = 1
• t = posedge clk + C2Q delay: Q1 = 1, Q2 = 0
C2Q delay: like good cholesterol
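The two-flop shift register from the frames above can be written in a few lines of RTL. This is only a sketch in the coding style used later in these slides; the module name shiftReg2 and the port names are made up for illustration.

```verilog
// Two flops back to back = a 2-bit shift register.
// Both flops sample on the same posedge clk, so flop2 captures
// the value Q1 had BEFORE the edge, not the value flop1 is capturing.
module shiftReg2(q1, q2, d, clk);
output q1, q2;
input d, clk;
reg q1, q2;

always @(posedge clk) begin
  q1 <= #1 d;   // flop1: snapshot of D
  q2 <= #1 q1;  // flop2: snapshot of the old Q1
end
endmodule
```

The #1 plays the role of the C2Q delay in simulation: the outputs change slightly after the clock edge, never at the edge itself.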
SWITCH = LATCH
Latch = Transparent Latch
clk
clk (= enable)
D
Q
SWITCH = LATCH
Latch = Transparent Latch
0
D
1
clk (= enable)
Q
FLOP = 2 back-to-back LATCHes
clk
clk1
flop
clk2
NON-OVERLAPPING
clk
latch (master)
C2Q delay
clk1
ClockToQ (C2Q) delay
latch (slave)
clk2
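For simulation purposes, the master/slave structure can be modeled with two level-sensitive blocks. This is a simplified sketch: it uses clk and its inverse as the two enables instead of the non-overlapping clk1/clk2 of the figure, and the names flopFromLatches and m are hypothetical.

```verilog
// A flop modeled as two back-to-back transparent latches.
// Latches are inferred ON PURPOSE here (no default assignment),
// which is exactly what normal RTL code should avoid.
module flopFromLatches(q, d, clk);
output q;
input d, clk;
reg m, q;

// master latch: transparent while clk is low
always @(*) begin
  if(~clk)
    m = d;
end

// slave latch: transparent while clk is high
always @(*) begin
  if(clk)
    q = m;
end
endmodule
```

While clk is low the master follows D and the slave holds; when clk goes high the master freezes and the slave passes the frozen value to Q, which gives the edge-triggered behavior.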
Key points in this programming paradigm:
• What are we programming?
Your program
DESCRIBES
this
clk
Key points in this programming paradigm:
• What are we programming?
Your program
DESCRIBES
ONE CYCLE
clk
Key points in this programming paradigm:
• How will we program?
Any guidelines?
That brings us to…
VERILOG TUTORIAL
- BASICS -
example design: counter
module counter();
endmodule
counter
example design: counter
[Figure: counter block with a 4-bit output cnt]
module counter(
cnt
);
output [3:0] cnt;
endmodule
example design: counter
btn
1
4
counter
cnt
module counter(
cnt,
btn
);
output [3:0] cnt;
input btn;
endmodule
example design: counter
[Figure: counter block with 1-bit input btn, input clk, and 4-bit output cnt]
module counter(cnt, btn, clk);
output [3:0] cnt;
input btn, clk;
endmodule
example design: counter
[Figure: 1-bit btn and 4-bit cnt feed an always @(*) block that produces 4-bit cntNxt; a flop clocked by clk turns cntNxt into cnt]
module counter(cnt, btn, clk);
output [3:0] cnt;
input btn, clk;
endmodule
example design: counter
[Figure: 1-bit btn, 4-bit cntNxt, and a flop clocked by clk producing 4-bit cnt]
module counter(cnt, btn, clk);
output [3:0] cnt;
input btn, clk;
reg [3:0] cnt, cntNxt;
always @(posedge clk) begin
cnt <= #1 cntNxt;
end
endmodule
example design: counter
[Figure: 1-bit btn and 4-bit cnt feed an always @(*) block that produces 4-bit cntNxt; a flop clocked by clk turns cntNxt into cnt]
module counter(cnt, btn, clk);
output [3:0] cnt;
input btn, clk;
reg [3:0] cnt, cntNxt;
always @(posedge clk) begin
cnt <= #1 cntNxt;
end
always @(*) begin
if(btn)
cntNxt = cnt +1;
end
endmodule
example design: counter
[Figure: mux with inputs 0 and 1 selected by btn, a +1 incrementer, and a 4-bit flop clocked by clk with output cnt]
module counter(cnt, btn, clk);
output [3:0] cnt;
input btn, clk;
reg [3:0] cnt, cntNxt;
always @(posedge clk) begin
cnt <= #1 cntNxt;
end
always @(*) begin
if(btn)
cntNxt = cnt +1;
end
endmodule
example design: counter
[Figure: mux with inputs 0 and 1 selected by btn, a +1 incrementer, and a 4-bit flop clocked by clk with output cnt]
module counter(cnt, btn, clk);
output [3:0] cnt;
input btn, clk;
reg [3:0] cnt, cntNxt;
always @(posedge clk) begin
cnt <= #1 cntNxt;
end
always @(*) begin
cntNxt = cnt;
if(btn)
cntNxt = cnt +1;
end
endmodule
cnt
example design: counter
prevBtn
btn
module counter(cnt, btn, clk);
output [3:0] cnt;
input btn, clk;
reg [3:0] cnt, cntNxt;
reg prevBtn, posedgeBtn;
always @(posedge clk) begin
cnt <= #1 cntNxt;
prevBtn <= #1 btn;
end
always @(*) begin
cntNxt = cnt;
posedgeBtn = ~prevBtn & btn;
if(posedgeBtn)
cntNxt = cnt +1;
end
endmodule
cnt
example design: counter
prevBtn
btn
module counter(cnt, btn, clk);
output [3:0] cnt;
input btn, clk;
reg [3:0] cnt, cntNxt;
reg prevBtn;
always @(posedge clk) begin
cnt <= #1 cntNxt;
prevBtn <= #1 btn;
end
always @(*) begin
cntNxt = cnt;
if(~prevBtn & btn)
cntNxt = cnt +1;
end
endmodule
example design: counter
cnt
assign
prevBtn
btn
always @(*)
module counter(cnt, btn, clk);
output [3:0] cnt;
input btn, clk;
reg [3:0] cnt, cntNxt;
reg prevBtn;
wire posedgeBtn;
always @(posedge clk) begin
cnt <= #1 cntNxt;
prevBtn <= #1 btn;
end
assign posedgeBtn = ~prevBtn & btn;
always @(*) begin
cntNxt = cnt;
if(posedgeBtn)
cntNxt = cnt +1;
end
endmodule
cnt
example design: counter
btn
posDet
clk
module counter(cnt, btn, clk);
output [3:0] cnt;
input btn, clk;
reg [3:0] cnt, cntNxt;
reg prevBtn;
wire posedgeBtn;
always @(posedge clk) begin
cnt <= #1 cntNxt;
end
posDet posDet(clk, btn, posedgeBtn);
always @(*) begin
cntNxt = cnt;
if(posedgeBtn)
cntNxt = cnt +1;
end
endmodule
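The posDet submodule instantiated above is not listed on the slides. A minimal sketch of what it might contain, assuming the port order (clk, input, output) implied by the positional instantiation; the port names in and out and the register prevIn are hypothetical:

```verilog
// Rising-edge detector: out is 1 for one cycle when in goes 0 -> 1.
// Same logic as prevBtn / posedgeBtn in the flat version of the counter.
module posDet(clk, in, out);
input clk, in;
output out;
reg prevIn;

always @(posedge clk) begin
  prevIn <= #1 in;        // remember last cycle's input
end

assign out = ~prevIn & in; // high only when in is 1 now and was 0 before
endmodule
```

With this submodule in place, the prevBtn register in the counter is no longer needed there, since the detector keeps its own copy.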
example design: counter
cnt
assign
prevBtn
btn
always @(*)
module counter(cnt, btn, clk);
output [3:0] cnt;
input btn, clk;
reg [3:0] cnt, cntNxt;
reg prevBtn;
reg posedgeBtn; // assigned inside an always block, so it must be declared reg, not wire
always @(posedge clk) begin
cnt <= #1 cntNxt;
prevBtn <= #1 btn;
end
always @(*) posedgeBtn = ~prevBtn & btn;
always @(*) begin
cntNxt = cnt;
if(posedgeBtn)
cntNxt = cnt +1;
end
endmodule
cnt
example design: counter
prevBtn
btn
module counter(cnt, btn, clk);
output [3:0] cnt;
input btn, clk;
reg [3:0] cnt, cntNxt;
reg prevBtn, posedgeBtn;
always @(posedge clk) begin
cnt <= #1 cntNxt;
prevBtn <= #1 btn;
end
always @(*) begin
cntNxt = cnt;
posedgeBtn = ~prevBtn & btn;
if(posedgeBtn)
cntNxt = cnt +1;
end
endmodule
Expressing ALGORITHMs in RT-Level paradigm?
1. Think of your HW module as a netlist of HW
submodules.
2. Each submodule can in turn be a netlist of
subsubmodules.
3. Leaf modules can be expressed by behavior that
can be synthesized: (what we call) RTL code.
4. RTL is how we express an algorithm in HW.
5. Break your algorithm into clock cycles.
6. You have to specify what is done in each cycle.
Expressing ALGORITHMs in RT-Level paradigm? – cont’d
7. Think of it as a STATE MACHINE where every state is
executed in a different cycle.
8. Store everything that needs to be remembered
between states (= cycles) in explicitly coded REGISTERs.
9. Store also the STATE in an explicitly coded register.
10. At the top of the always @(*) block, put a case(STATE).
11. What you will really code other than the registers is
actually a Truth Table coded with a high-level language.
12. That is: Outputs depend only on inputs, which are
external inputs plus register outputs.
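As an illustration of the recipe above, here is a small sketch that adds up four consecutive 8-bit inputs, one per clock cycle. Everything about it is an assumption made for the example: the module name sum4, the ports, the state names IDLE/ADD/DONE, and the helper register idx.

```verilog
// Algorithm "sum = in0 + in1 + in2 + in3", broken into clock cycles.
module sum4(sum, done, start, in, clk);
output [9:0] sum;
output done;
input start, clk;
input [7:0] in;

parameter IDLE = 2'd0, ADD = 2'd1, DONE = 2'd2;

reg [1:0] state, stateNxt; // the STATE, in an explicit register
reg [9:0] sum, sumNxt;     // running sum, remembered between cycles
reg [1:0] idx, idxNxt;     // how many inputs were added so far
reg done;                  // combinational output (no flop)

// registers only: Flop <= #1 FlopNxt
always @(posedge clk) begin
  state <= #1 stateNxt;
  sum <= #1 sumNxt;
  idx <= #1 idxNxt;
end

// the "truth table": depends only on inputs and register outputs
always @(*) begin
  stateNxt = state;  // default assignments: hold everything
  sumNxt = sum;
  idxNxt = idx;
  done = 1'b0;
  case(state)
    IDLE: if(start) begin
      sumNxt = 10'd0;
      idxNxt = 2'd0;
      stateNxt = ADD;
    end
    ADD: begin
      sumNxt = sum + in;     // one addition per cycle
      idxNxt = idx + 1;
      if(idx == 2'd3)
        stateNxt = DONE;
    end
    DONE: begin
      done = 1'b1;
      stateNxt = IDLE;
    end
    default: stateNxt = IDLE; // unused or unknown state: go to IDLE
  endcase
end
endmodule
```

Like the counter in the tutorial, this sketch has no reset input; the default branch is what brings the machine to IDLE from an unknown state.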
GOLDEN RULES
GOLDEN RULE 1
NO COMBINATIONAL LOOP
always @(*)
GOLDEN RULE 1
NO COMBINATIONAL LOOP
always @(*)
cntNxt
always @(*) begin
if(cntNxt)
cntNxt = cnt - 1;
end
GOLDEN RULE 1
NO COMBINATIONAL LOOP
always @(*)
always @(*) begin
if(cnt)
cntNxt = cnt - 1;
end
GOLDEN RULE 1
NO COMBINATIONAL LOOP
always @(*)
always @(*) begin
if(cnt)
cntNxt = cnt - 1;
else
cntNxt = cntNxt;
end
GOLDEN RULE 1
NO COMBINATIONAL LOOP
always @(*)
always @(*) begin
if(cnt)
cntNxt = cnt - 1;
else
cntNxt = cnt;
end
GOLDEN RULE 1 – IMPLICATION
Always have DEFAULT ASSIGNMENTS
at the top of always @(*)
always @(*) begin
cntNxt = cnt;
if(cnt)
cntNxt = cnt - 1;
end
GOLDEN RULE 1 – IMPLICATION
Always have DEFAULT ASSIGNMENTS
at the top of always @(*)
always @(*) begin
cntNxt = cnt;
if(cntNxt)
cntNxt = cnt - 1;
end
GOLDEN RULE 2
NO INDIRECT COMBINATIONAL LOOPS
always @(*)
always @(*)
always @(*) and assign are equivalent
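An indirect loop is harder to spot because no single block reads its own output. The sketch below uses made-up signals a, b, x, y and a made-up register bReg; the looping version is kept as a comment, followed by one possible way to break the loop with a flop.

```verilog
module indirectLoop(a, b, x, y, clk);
output a, b;
input x, y, clk;
reg a, b, bReg;

// BAD: a depends on b, and b depends on a.
// Neither block loops by itself, but together they do.
//   always @(*) a = b & x;
//   always @(*) b = a | y;

// One possible fix: put a flop in the path from b back to a,
// so a uses LAST cycle's b.
always @(posedge clk) begin
  bReg <= #1 b;
end

always @(*) a = bReg & x;
always @(*) b = a | y;
endmodule
```

Note that the flop changes the behavior (a now sees b one cycle late), so whether this is the right fix depends on what the design is supposed to do.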
GOLDEN RULE 3
NO MULTIPLE DRIVERS
always @(*)
sameVar
always @(*)
sameVar
GOLDEN RULE 3
NO MULTIPLE DRIVERS
always @(*) begin
cntNxt = cnt;
if(btn1)
cntNxt = cnt +1;
end
always @(*) begin
cntNxt = cnt;
if(btn2)
cntNxt = cnt - 1;
end
GOLDEN RULE 3
NO MULTIPLE DRIVERS
// Merge in a single always
always @(*) begin
cntNxt = cnt;
if(btn1)
cntNxt = cnt +1;
if(btn2)
cntNxt = cnt - 1;
end
GOLDEN RULE 3
NO MULTIPLE DRIVERS
Extra input may be needed
always @(*)
var_v1
always @(*)
always @(*)
var_v2
var
Arbiter
(~~~ Priority Encoder)
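The arbiter idea from the figure can be sketched as follows, assuming the btn1/btn2 example from the previous slides. The module name arbiter2 and the _v1/_v2 suffixes follow the figure's var_v1/var_v2 naming but are otherwise hypothetical.

```verilog
// Each block drives its OWN version of the variable;
// a separate arbiter block is the single driver of cntNxt.
module arbiter2(cntNxt, cnt, btn1, btn2);
output [3:0] cntNxt;
input [3:0] cnt;
input btn1, btn2;
reg [3:0] cntNxt, cntNxt_v1, cntNxt_v2;

always @(*) begin
  cntNxt_v1 = cnt;
  if(btn1)
    cntNxt_v1 = cnt + 1;
end

always @(*) begin
  cntNxt_v2 = cnt;
  if(btn2)
    cntNxt_v2 = cnt - 1;
end

// arbiter (a priority encoder): the last matching if wins,
// so btn2 has priority over btn1
always @(*) begin
  cntNxt = cnt;
  if(btn1)
    cntNxt = cntNxt_v1;
  if(btn2)
    cntNxt = cntNxt_v2;
end
endmodule
```

The priority order is a design decision; here it matches the merged single-always version, where the btn2 assignment comes last.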
GOLDEN RULE 4
SINGLE CLOCK DOMAIN
- unless really necessary
- extra care needed for signals
between different clock domains
in
clk’ = derived clk
= divided clk
= gated clk
clk
GOLDEN RULE 4
Do NOT Write Anything in always @(posedge clk) blocks
other than flop definitions
i.e. Flop <= #1 FlopNxt;
GOLDEN RULE 5
SINGLE CLOCK DOMAIN
- unless really necessary
- extra care needed for signals
between different clock domains
0
in
1
clk
clk
GOLDEN RULE 6
Do NOT Ignore Warning Messages
other than the ones for #1’s.
GOLDEN RULE 7
Write a Testbench and Simulate!
It is well worth the time.
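A minimal self-checking testbench sketch for the counter from the tutorial (the version with prevBtn edge detection, repeated here so the example is self-contained). The testbench name counter_tb, the clock period, and the stimulus are assumptions. Because the slides' counter has no reset, the testbench sets the initial flop values through hierarchical references, which works in simulation only.

```verilog
`timescale 1ns/1ps
// Design under test: counter from the tutorial slides.
module counter(cnt, btn, clk);
output [3:0] cnt;
input btn, clk;
reg [3:0] cnt, cntNxt;
reg prevBtn;
always @(posedge clk) begin
  cnt <= #1 cntNxt;
  prevBtn <= #1 btn;
end
always @(*) begin
  cntNxt = cnt;
  if(~prevBtn & btn)
    cntNxt = cnt +1;
end
endmodule

module counter_tb;
reg btn, clk;
wire [3:0] cnt;

counter dut(cnt, btn, clk);

always #5 clk = ~clk;   // 10 ns clock period

initial begin
  clk = 0;
  btn = 0;
  #1;                   // let all always blocks start waiting first
  dut.cnt = 4'd0;       // no reset in the design: initialize by hand
  dut.prevBtn = 1'b0;
  // press the button 3 times, holding it for 2 cycles each time
  repeat(3) begin
    @(negedge clk) btn = 1;
    repeat(2) @(negedge clk);
    btn = 0;
    @(negedge clk);
  end
  // each press should count exactly once, however long it is held
  if(cnt !== 4'd3)
    $display("FAIL: cnt = %d", cnt);
  else
    $display("PASS: cnt = %d", cnt);
  $finish;
end
endmodule
```

Changing the inputs on negedge clk keeps them away from the sampling edge, so the simulation does not depend on the order in which the simulator evaluates blocks.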
HANDLING MULTIPLE CLOCKS
• Clocks with different frequencies
• Clocks with same frequency but
different phases between them.
HANDLING MULTIPLE CLOCKS
• Setup Time and Hold Time violations
• Metastability
[Figure: Setup time and Hold Time of D around the Clock edge; a Metastable state between Stable 0 and Stable 1]
HANDLING MULTIPLE CLOCKS
• Clock nomenclature
• Design partitioning
  • One module should work on one clock only
  • A synchronizer module should be made for all
    signals that cross from one clock domain to
    another
[Figure: Clock1 logic in the Clock1 domain and Clock2 logic in the Clock2 domain; "Sync 1 to 2" takes Clk1_SigA to Clk2_SigC, and "Sync 2 to 1" takes Clk2_SigD to Clk1_SigB]
HANDLING MULTIPLE CLOCKS
• Transfer of Control Signals
[Figure: src_ctrl from the Src clock domain goes through a Two-stage synchronizer clocked by dest_clk and becomes dest_ctrl in the Dest clock domain]
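A two-stage synchronizer is just two flops in series, both clocked by the destination clock. A minimal sketch using the signal names from the figure (src_ctrl, dest_ctrl, dest_clk); the module name sync2 and the intermediate register meta are hypothetical.

```verilog
// Two-stage synchronizer for a 1-bit control signal.
module sync2(dest_ctrl, src_ctrl, dest_clk);
output dest_ctrl;
input src_ctrl, dest_clk;
reg meta, dest_ctrl;

always @(posedge dest_clk) begin
  meta <= #1 src_ctrl;   // stage 1: may go metastable
  dest_ctrl <= #1 meta;  // stage 2: gives stage 1 one cycle to settle
end
endmodule
```

Only dest_ctrl should be used by the logic in the destination domain; meta should feed nothing but the second flop.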
HANDLING MULTIPLE CLOCKS
• Transfer of Data Signals
  • Handshake signaling method
[Figure: the X clock domain (xclk) sends xreq and data to the Y clock domain (yclk)]
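A sketch of the receiving (Y domain) side of such a handshake, under several assumptions that are not on the slide: the data bus is 8 bits wide, the sender keeps data stable for as long as xreq is high, and an acknowledge signal yack is returned to the X domain. The module name hsReceiver and the names yData, yValid, and yack are hypothetical.

```verilog
// Y-domain receiver: synchronize xreq, then capture data once per request.
module hsReceiver(yData, yValid, yack, xreq, data, yclk);
output [7:0] yData;
output yValid, yack;
input xreq, yclk;
input [7:0] data;
reg [7:0] yData, yDataNxt;
reg yValid, yValidNxt;
reg reqMeta, reqSync, reqPrev;

always @(posedge yclk) begin
  reqMeta <= #1 xreq;      // two-stage synchronizer for xreq
  reqSync <= #1 reqMeta;
  reqPrev <= #1 reqSync;   // previous value, for edge detection
  yData <= #1 yDataNxt;
  yValid <= #1 yValidNxt;
end

always @(*) begin
  yDataNxt = yData;        // default: hold
  yValidNxt = 1'b0;
  if(reqSync & ~reqPrev) begin
    yDataNxt = data;       // data is stable by now, so no synchronizer on it
    yValidNxt = 1'b1;      // one-cycle pulse: new data captured
  end
end

assign yack = reqSync;     // must be synchronized again in the X domain
endmodule
```

Only the 1-bit xreq goes through a synchronizer; the multi-bit data is safe to sample because the handshake guarantees it is not changing at that time.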
HANDLING MULTIPLE CLOCKS
• Transfer of Data Signals
  • Asynchronous FIFO
[Figure: a FIFO between the X clock domain (xclk, write, fifo_full) and the Y clock domain (yclk, read, fifo_empty), with a Two-stage synchronizer]