New Strategies for System Level Design Daniel Gajski

advertisement
New Strategies
for
System Level Design
Daniel Gajski
Center for Embedded Computer Systems (CECS)
University of California, Irvine
gajski@uci.edu
Overview
VLSIDAT 2006
•
Introduction
•
Issues
•
Models
•
Platforms
•
Tools
•
Benefits
•
Conclusion
Copyright 2006 Daniel D. Gajski
Closing the System Gap
Capture &
Simulate
Describe &
Synthesize
Specify, Explore
& Refine
Specs
Specs
Executable
Spec
Algorithms
Algorithms
Algorithms
Functionality
Algorithms
Architecture
System Gap
SW?
SW?
Network
Design
Describe
Design
SW/HW
Logic
Simulate
Logic
Logic
Physical
Physical
Physical
Manufacturing
Manufacturing
Manufacturing
1960's
1980's
2000's
Simulate
Real gap: behavior and structure (semantics and syntax)
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
Connectivity
Protocols
Performance
Timing
Simulation Based Methodology
Ambiguous semantics of hardware/system level languages
3.415
2.715
--
Finite state machine
Controller
case X is
when X1=>
.
.
.
when X2=>
--
Look-up table
Memory
Simuletable but not synthesizable or verifiable
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
In Search of a Solution
Algebra: < objects, operations>
a*(b+c) = a*b + a*c
Arithmetic algebra allows creation
of expressions and equivalences
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
Model Algebra
Model algebra: <objects, compositions>
B1
B1
=
B2
B3
B3
B2
PE1
Model algebra allows creation of models and model equivalences
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
PE2
Specify-Explore-Refine Methodology
System specification
model
Design decisions
SER
Model refinement
Intermediate models
Replacement or
re-composition
Cycle accurate
implementation model
FPGA board
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
How many models?
Minimal set for any methodology
(3 is enough)
•
System specification model (application designers)
•
Transaction-level model (system designers)
•
Pin&Cycle accurate model (implementation designers)
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
Three Models (with Respect to OSI)
Pin / Cycle Accurate Model
Transaction Level Model
Specification Model
7.
Application
7.
Application
6.
Presentation
6.
Presentation
5.
Session
5.
Session
4.
Transport
4.
Transport
3.
Network
3.
Network
Spec
2b. Link + Stream
2a. Media Access Ctrl
2b. Link + Stream
TLM
2a. Media Access Ctrl
2a. Protocol
2a. Protocol
1.
1.
Physical
Physical
Address lines
Data lines
Control lines
P/CAM
Source: G Schirner
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
System Specification
CPU
Mem
Computation
•
B1
B2
v1
Communication
Channels (in C)
Variables (in C)
Bridge
C1
Arbiter
•
•
C2
Behaviors (in C)
B3
HW
B4
IP
System Definition = (Partial) Platform + (Partial) Spec
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
CPU
Transaction-Level Model (TLM)
Mem
B1
B2
OS
Drivers
HAL
CPU Bus
IP Bus
B3
HW
VLSIDAT 2006
B4
IP
Copyright 2006 Daniel D. Gajski
CPU
Pin/Cycle Accurate Model (P/CAM)
Mem
Program
EXE
RTOS
HAL
IC
Bridge
Arbiter
HW
P/CAM is downloaded
automatically for fast prototyping
with FPGAs or ASIC design
IP
Source: D. Gajski et al.
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
How many components?
Minimal set for any design
(4 is enough?)
VLSIDAT 2006
•
Processing element (PE)
•
Memory
•
Transducer / Bridge
•
Arbiter
Copyright 2006 Daniel D. Gajski
General System Model
Arbiter 2
Arbiter 1
Interrupt2.1
PE 2.1
(Master)
Interrupt1.1
PE 1.1
Transducer1-2
Interrupt2.2
PE 2.2
(Slave)
Arbiter 3
PE 1.2
PE 3.1
Interrupt3.1
Interrupt3.2
Transducer2-3
Memory 1
Memory 3
Bus1
VLSIDAT 2006
Bus2
Bus3
Copyright 2006 Daniel D. Gajski
Transducer Model
Addr bus1
Data bus1
PE1
Addr bus2
Data bus2
PE2
Transducer
Ready1
Ack_ready1
Interrupt1
Interrupt2
Ready2
Ack_ready2
Processor1
<clk1>
FSMD1
<clk1>
FSMD2
<clk2>
Data1
Memory1
Data2
Memory2
Queue
<clk3>
Source: H. Cho
VLSIDAT 2006
Processor2
<clk2>
Copyright 2006 Daniel D. Gajski
Processing Element: NISC technology
•
•
•
Direct compilation of C to HW (fastest possible execution)
Statically and dynamically reconfigurable (anytime, anywhere)
Designed for manufacturability (solving timing closure)
RF / Scratch pad
const
CW
PC
CMem
B1
B2
offset
status
Status
AG
MUL
ALU
Memory
address
B3
Programmable
controller
VLSIDAT 2006
Datapath
Multi-cycle
units
Pipelined
units
Controller
pipelining
Datapath
pipelining
Copyright 2006 Daniel D. Gajski
Data
forwarding
General System Design Environment
Model A
GUI
Estimation
tool
Refinement
tool
Synthesis
tool
Transforms:
t1
t2
.
.
.
tn
Component
library
Verify
tool
Simulation
tool
ti
Model B
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
How many tools?
Minimal set for any methodology
(2 is enough?)
•
•
VLSIDAT 2006
Front-End (for application developers)
–
Input:
C, C++, Mathlab, UML, …
–
Output:
TLM
Back-End (for SW/HW system designers)
–
Input :
TLM
–
Output:
Pin/Cycle accurate Verilog/VHDL
Copyright 2006 Daniel D. Gajski
ES Environment
Validation
User
Interface
(VUI)
Decision
User
Interface
(DUI)
ESE Front – End
Create
Compiler
Debugger
System Capture + Platform Development
Select
Stimulate
Partition
Verify
Map
TIMED
Compile
CYCLE
ACCURATE
Replace
ESE Back – End
SW Development + HW Development
Compile
Check
Simulate
Verify
Application Tools : Compilers/Debuggers
VLSIDAT 2006
Commercial Tools : FPGA, ASIC
Copyright 2006 Daniel D. Gajski
Benefit: Spec-to-Prototype in 1 Week
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
Does it work?
• Intuitively it does
– Well defined models, rules, transformations, refinements
– Worked in the past: layout, logic, RTL?
– System level complexity simplified
• Proof of concept demonstrated
–
–
–
–
–
Embedded System Environment (ESE)
Automatic model generation
Model synthesis and verification
Universal IP technology (NISC)
Productivity gains greater then 1000
• Benefits
–
–
–
–
Large productivity gains
Easy design management
Easy derivatives
Shorter TTM
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
Design flow with NISC technology
for(int i=0; i<8; i++)
for(int j=0; j<8; j++){
sum=0;
for(int k=0; k<8; k++)
sum = sum + A[i][k] ×B[k][j];
C[i][j] = sum;
Code
Refinement
}
for(int i=0; i<8; i++)
for(int j=0; j<8; j++){
i8 = i × 8;
sum = *(A + i8) × *(B + j);
sum += *(A + i8 + 1) × *(B + 8 + j);
...
C[i][j] = sum;
}
Application
NISC
Compiler
const
CW
PC
CMem
RF
B1
B2
ALU
MUL
Mul
CW
PC
CMem
OR
offset
Memory
AG
status
address
NISC
bL
aL
const
NISC
Refinement
status
status
NISC
Compiler
RF
offset
AG
Application
Results
B3
NISC
ALU
Sum
AR
Mem
P
Add
DR
Iterative design & refinement
Source: M. Reshadi
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
Results
DCT with NISC technology
Execution Time
Power
Energy
Area
1.4
1.2
1
0.8
0.6
0.4
0.2
0
MIPS
NMIPS
Performance
CDCT1 CDCT2 CDCT3 CDCT4 CDCT5 CDCT6 CDCT7 Manual
Power saving
Energy saving
Area reduction
1.25X
NA
NA
NA
CDCT3 vs. NMIPS
5.3X
2.1X
11.6X
2.5X
CDCT7 vs. NMIPS
10X
1.3X
12.8X
3X
CDCT7 vs. Manual
0.83X
1.3X
0
2.1X
NMIPS vs. MIPS
Source: B. Gorjiara
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
100
90
80
70
60
50
40
30
20
10
0
0.35
0.3
0.25
0.2
0.15
seconds
% chip utilization
MP3 on Xillinx with ESE
%Slices
%BRAMs
Exec. time
0.1
0.05
0
SW+0 SW+1 SW+2 SW+4
Design Points
•
Area
•
•
% of FPGA slices and BRAMS
Performance
•
Time to decode 1 frame of MP3 data
Source: S. Abdi
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
100
90
80
70
60
50
40
30
20
10
0
0.35
0.3
0.25
0.2
0.15
seconds
% chip utilization
MP3 on Xillinx with ESE using NISC
%Slices
%BRAMs
Exec. time
0.1
0.05
0
SW+0 SW+1 SW+2 SW+4 NISC+0
Design Points
•
Area
•
•
NISC uses fewer FPGA slices and more BRAMs than manual HW
Performance
•
NISC comparable to manual HW and much faster than SW
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
Development
Manual Development
Time withTime
ESE
70
person-days
60
50
SW+0
40
SW+1
30
SW+2
ESE
20
SW+4
10
0
Spec.
TLM
RTL
Board
models
•
Model Development time
•
•
Includes time for C, TLM and RTL Verilog coding and debugging
ESE drastically cuts RTL and Board development time
Source: S. Abdi
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
seconds
hours
Validation
Validation
TimeTime
with ESE
X
18.06 hrs
17.71 hrs
17.56 hrs
15.93 hrs
10
10
99
88
77
66
55
44
33
22
11
00
SW+0
SW+0
SW+1
SW+1
SW+2
SW+2
SW+4
SW+4
ESE
Spec.
Spec.
TLM
RTL
Board
Board
models
•
•
•
Simulation time measured on 3.3 GHz processor
Emulation time measured on board with Timer
ESE cuts validation time from hours to seconds
Source: S. Abdi
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
Conclusions
• Extreme makeover is necessary for a new paradigm,
where
– SW = HW = SOC = Embedded Systems
– Simulation based flow is not acceptable
– Design methodology is based on scientific principles
• Model algebra is enabling technology for
– System design, modeling and simulation
– System synthesis, verification, and test
• What is next?
– Change of mind
– Application oriented EDA
– Looking for early adapters
VLSIDAT 2006
Copyright 2006 Daniel D. Gajski
Thank You
Daniel Gajski
Center for Embedded Computer Systems (CECS)
www.cecs.uci.edu
Download