Digital Signal Processors for Mobile Phone Terminals

Katsuhiko Ueda
Matsushita Electric Ind. Co., Ltd.
K. Ueda, '99 VLSI Circuits Short Course
Abstract
- A DSP is one of the key components in a digital
cellular phone terminal.
- This talk will discuss the role of the DSP in the
terminal and how to achieve high performance with
low power consumption.
- Mobile phone systems are moving rapidly into the mobile
multimedia era. A DSP architecture suitable for this
new era will also be discussed.
Outline
1. The role of the DSP in a cellular phone terminal.
2. How to achieve high performance with low power
consumption.
3. Issues for DSPs in next generation mobile phone
systems.
4. Mobile multimedia DSP.
5. Summary.
Architecture of Portable Phone Terminal
[Figure: block diagram of the terminal. Blocks: Receiver, Synthesizer, Transmitter, Demodulator, Equalizer, Channel CODEC, Speech CODEC, Modulator, Speaker, Microphone, Microcomputer, Keypad/Display.]
Role of DSP
- Speech CODEC
- Channel CODEC
- Equalizer
- Mod/Demodulator
Relationship between Speech sig. and Tx sig. (PDC)
[Figure: terminal block diagram (Receiver, Demodulator, Synthesizer, Transmitter, Speech CODEC, Channel CODEC, Modulator, Speaker, Microphone) with two marked points, Point A and Point B.]
Point A (speech signal)
- 8 bits (u-Law) every 125 us (8 kHz)
- -> 2,560 bits/40 ms -> 64 kbps (8 bits * 8 kHz)
Coded data per frame
- Speech: 138 bits/40 ms -> 3.45 kbps
- FEC: 86 bits/40 ms -> 2.15 kbps
- Total: 5.6 kbps
Point B (Tx signal)
- 42 kbps
- 1 TDMA Frame (40 ms) carries six slots: User 1, User 2, User 3, User 4, User 5, User 6
- 1 slot (~6.7 ms, 7 kbps)
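The numbers on this slide can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration; the names `FRAME_MS`, `USERS_PER_FRAME` and `rate_kbps` are introduced here and are not from the talk.

```python
# Cross-check of the PDC bit-rate figures quoted on the slide.
FRAME_MS = 40          # one TDMA frame lasts 40 ms (from the slide)
USERS_PER_FRAME = 6    # six user slots per frame (from the slide)


def rate_kbps(bits_per_frame, frame_ms=FRAME_MS):
    """Convert bits per frame to kbps (bits per millisecond equal kbit/s)."""
    return bits_per_frame / frame_ms


# Point A: 8-bit u-Law samples at 8 kHz.
pcm_kbps = 8 * 8                           # 64 kbps
pcm_bits_per_frame = pcm_kbps * FRAME_MS   # 2,560 bits per 40 ms

# Coded data per frame.
speech_kbps = rate_kbps(138)
fec_kbps = rate_kbps(86)
total_kbps = rate_kbps(138 + 86)

# Point B: 42 kbps shared by six slots.
slot_ms = FRAME_MS / USERS_PER_FRAME
slot_kbps = 42 / USERS_PER_FRAME

print(pcm_bits_per_frame, speech_kbps, fec_kbps, total_kbps, slot_ms, slot_kbps)
```

The slide does not break down the difference between the 7 kbps slot rate and the 5.6 kbps of coded speech plus FEC, so the sketch leaves that gap unexplained as well.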
Speech CODEC
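[Figure: speech CODEC block diagram. Blocks: Digitized Input Speech Signal, Spectrum Analysis, Code Book, Gain, Synthesis Filter, Parameter. The parameters are found by minimizing the difference between the input speech signal and the synthesized speech signal.]
[Chart: vertical axis [MOPS] (0 to 20), horizontal axis Bit rate [kbps] (0 to 20). Labels: PSI-CELP, VSELP, 11.2 kbps VSELP, RPE-LTP, 13 kbps.]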
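The "minimizing the difference" step can be pictured as a code book search: each candidate excitation is passed through the synthesis filter, the best gain is computed, and the candidate with the smallest squared error wins. The sketch below is a heavily simplified illustration under stated assumptions, not the VSELP or PSI-CELP procedure: the all-pole filter form, the tiny code book, and the names `synthesize` and `search_codebook` are all hypothetical, and real coders add perceptual weighting and further code books.

```python
def synthesize(excitation, lpc):
    """All-pole synthesis filter (assumed form): s[n] = e[n] + sum_k lpc[k] * s[n-1-k]."""
    out = []
    for n, e in enumerate(excitation):
        feedback = sum(a * out[n - 1 - k]
                       for k, a in enumerate(lpc) if n - 1 - k >= 0)
        out.append(e + feedback)
    return out


def search_codebook(target, codebook, lpc):
    """Return (index, gain, error) of the code vector that best matches target."""
    best = None
    for index, code in enumerate(codebook):
        synth = synthesize(code, lpc)
        energy = sum(v * v for v in synth)
        if energy == 0:
            continue  # an all-zero vector cannot match anything
        # Least-squares optimal gain for this code vector.
        gain = sum(t * v for t, v in zip(target, synth)) / energy
        error = sum((t - gain * v) ** 2 for t, v in zip(target, synth))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


# Hypothetical toy data: three code vectors and a one-tap filter.
codebook = [[1, 0, 0, 0], [0, 1, 0, 0], [1, -1, 1, -1]]
lpc = [0.5]
target = [2 * v for v in synthesize(codebook[1], lpc)]
index, gain, error = search_codebook(target, codebook, lpc)
```

Every candidate costs one filtering pass plus two inner products, which is one way to see why the MOPS figures on the chart depend so strongly on the coding scheme.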
Dedicated DSP for Portable Phone
[Figure: BASIC DSP block diagram. Labels: POINTER UNIT, POINTER (x2), ADR CTRL (ex. modulo, bit reverse), CONTROL, ADR CTRL (ex. repeat, loop), INST MEMORY, DECODER, DATA MEMORY, MEMORY X, MEMORY Y, I/O, DMA CTRL, SERIAL, PARALLEL, DPU, MUL, ALU, SAT, REG (ACC).]
Dedicated DSP for Portable Phone
- Increase performance by adding accelerators
- Reduce power consumption
  P ∝ f*c*v^2
  f: frequency, c: capacitance, v: voltage
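A short worked example shows why the v^2 term matters most. The sketch below only evaluates the proportionality given on the slide; the function name `relative_dynamic_power` and the example scale factors are assumptions chosen for illustration, not measurements from the talk.

```python
def relative_dynamic_power(freq_scale, cap_scale, volt_scale):
    """Power relative to a baseline, using P proportional to f * c * v^2."""
    return freq_scale * cap_scale * volt_scale ** 2


# Halving the clock frequency alone halves the power.
half_clock = relative_dynamic_power(0.5, 1.0, 1.0)

# Halving the supply voltage alone cuts the power to one quarter.
half_voltage = relative_dynamic_power(1.0, 1.0, 0.5)

# Hypothetical example: moving from a 3.3 V to a 1.8 V supply at the same clock.
lower_supply = relative_dynamic_power(1.0, 1.0, 1.8 / 3.3)
```

An accelerator that finishes the same work in fewer cycles lets the clock, and possibly the supply voltage, come down, which is how added hardware can end up reducing power.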
A DSP Architecture for Portable Phone Terminal
[Figure: DSP-CORE block diagram. Labels: DATA MEMORY, Data ROM, Data RAM, PROGRAM CONTROL, INST ROM, IR, DEC, IP, STACK, PU, AMA, AMB, da, sp, cc, I/O, DMA CONT, SERIAL, PARALLEL, EXT CLK, PLL, CLK GEN, M BUS 16, A BUS 16, B BUS 16, BSFT, DATA REGS, RB-MAC, ALU, SAT, ACS, DPU, ACC, MAC.]
Callouts
- Double Access: special memory scheme to realize double speed MAC
- Viterbi accelerator
- Dedicated MAC unit
- Double speed MAC scheme
- Redundant binary number system
Double Speed MAC Scheme
[Figure: double speed MAC datapath. Labels: POINTER X (PX), POINTER Y (PY), MEMORY X (MX) and MEMORY Y (MY), each with an EVEN SIDE and an ODD SIDE of 16 bits, TEMP REG (16-bit), A-BUS and B-BUS (16-bit). MAC UNIT: MULTIPLIER (1/2 machine cycle), 32-bit PIPELINE REG, ADDER (1/2 machine cycle), BARREL SHIFTER, 40-bit ACC.]
[Timing: one memory access takes 1 machine cycle; each bus transfer and each multiply takes 0.5 cycle.]
  Output of MX (even side): D(2x), D(2x+2), D(2x+4)
  Output of MX (odd side):  D(2x+1), D(2x+3), D(2x+5)
  A-BUS:      D(2x), D(2x+1), D(2x+2), D(2x+3), D(2x+4), D(2x+5)
  B-BUS:      D(2y), D(2y+1), D(2y+2), D(2y+3), D(2y+4), D(2y+5)
  MULTIPLIER: D(2x)*D(2y), D(2x+1)*D(2y+1), D(2x+2)*D(2y+2), D(2x+3)*D(2y+3), D(2x+4)*D(2y+4), D(2x+5)*D(2y+5)
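The idea behind the scheme, as far as the slide shows it, is that one memory access per machine cycle delivers an even/odd pair of words, so the MAC unit can be fed a fresh operand pair every half cycle. The sketch below is a behavioural model only, under stated assumptions: the class `BankedMemory`, the function `double_speed_mac`, and the cycle counting are hypothetical, the pipeline registers are ignored, and the word count is assumed to be even.

```python
class BankedMemory:
    """Memory split into an even side and an odd side (assumes an even word count)."""

    def __init__(self, words):
        self.even = words[0::2]   # D(0), D(2), D(4), ...
        self.odd = words[1::2]    # D(1), D(3), D(5), ...
        self.accesses = 0

    def read_pair(self, k):
        """One access in one machine cycle returns D(2k) and D(2k+1) together."""
        self.accesses += 1
        return self.even[k], self.odd[k]


def double_speed_mac(mx, my, pairs):
    """Accumulate sum(x[i] * y[i]) with two MAC operations per machine cycle."""
    acc = 0
    machine_cycles = 0
    for k in range(pairs):
        x_even, x_odd = mx.read_pair(k)
        y_even, y_odd = my.read_pair(k)
        acc += x_even * y_even   # first half cycle: even-side operands
        acc += x_odd * y_odd     # second half cycle: odd-side operands held back
        machine_cycles += 1
    return acc, machine_cycles


mx = BankedMemory([1, 2, 3, 4])
my = BankedMemory([5, 6, 7, 8])
result, cycles = double_speed_mac(mx, my, pairs=2)
```

In this model four multiply-accumulates take two machine cycles and two accesses per memory, which is the doubling the slide title refers to.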
Accelerator for Viterbi Decoding
[Figure: ACS datapath. Inputs: PM0(t-1), BMa(t), PM1(t-1), BMb(t). Blocks: ALU (Lower 8-bits, Upper 8-bits), COMPARATOR, REG, SHIFT REG. Steps: Add, Compare, Select. Output: PM0(t).]
PM0(t) = min[(PM0(t-1)+BMa(t)), (PM1(t-1)+BMb(t))]
Two Adds, one Compare and one Select -> ACS operation
- Normal operation: The ALU is used as
a 16-bit processing unit.
- ACS operation: The ALU is used as
two 8-bit adders.
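The two bullets above can be illustrated by modelling the 16-bit ALU as two independent 8-bit lanes. This is a minimal sketch under stated assumptions: which lane holds which path metric, the modulo-256 wrap-around, and the names `packed_add8` and `acs` are choices made here for illustration and are not taken from the slide.

```python
def packed_add8(a, b):
    """Add two 16-bit words as two independent 8-bit lanes (no carry between lanes)."""
    low = ((a & 0xFF) + (b & 0xFF)) & 0xFF
    high = (((a >> 8) & 0xFF) + ((b >> 8) & 0xFF)) & 0xFF
    return (high << 8) | low


def acs(pm0_prev, bm_a, pm1_prev, bm_b):
    """Add-Compare-Select: returns (new path metric, decision bit).

    Assumed packing: lower lane = PM0(t-1) + BMa(t), upper lane = PM1(t-1) + BMb(t).
    All inputs are assumed to fit in 8 bits.
    """
    metrics = (pm1_prev << 8) | pm0_prev
    branches = (bm_b << 8) | bm_a
    sums = packed_add8(metrics, branches)      # two Adds in one ALU pass
    candidate0 = sums & 0xFF
    candidate1 = sums >> 8
    if candidate0 <= candidate1:               # Compare
        return candidate0, 0                   # Select the PM0 path
    return candidate1, 1                       # Select the PM1 path


metric, decision = acs(10, 3, 7, 9)
```

A real decoder also has to keep the metrics from overflowing eight bits, for example by periodic normalization; the sketch simply lets each lane wrap.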
Effect of Accelerators
Comparison of the number of clock cycles needed to realize
an 11.2 kbps VSELP CODEC.
[Chart: stacked bars, vertical axis 0 to 100 [%]. Bars: "DSP w/o MAC & Viterbi Accelerators" and "DSP w/ MAC & Viterbi Accelerators". Segments: ALU, MAC, Error Correction, Block Floating, Misc. Reductions marked on the chart: -11.4%, -8%, -4.7%, -9.0%. Total: -33.1%.]
Dual MAC Scheme
FIR Filtering: two outputs in parallel with delay register
y(0)=c(0)x(0)+c(1)x(-1)+c(2)x(-2)+c(3)x(-3)+...
y(1)=c(0)x(1)+c(1)x(0)+c(2)x(-1)+c(3)x(-2)+...
y(2)=c(0)x(2)+c(1)x(1)+c(2)x(0)+c(3)x(-1)+...
y(3)=c(0)x(3)+c(1)x(2)+c(2)x(1)+c(3)x(0)+...
[Figure: dual MAC datapath. Labels: c(i), x(n-i+1), REG, x(n-i), MAC1, MAC0, Acc1, Acc0.]

                      Single MAC   Dual MAC   Dual MAC with REG
# of MAC operations   N            N          N
# of Memory reads     2N           2N         N

-> Low power consumption
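The saving in memory reads comes from the fact that adjacent outputs y(n) and y(n+1) use the same coefficient c(i) and input samples that are one position apart, so a sample read for one MAC can be held in a register and reused by the other MAC at the next tap. The sketch below counts reads under that reading of the slide; the function names, the zero-padding of samples outside the buffer, and the counting convention are assumptions made for illustration.

```python
def sample(x, k):
    """x(k), with samples outside the buffer treated as zero (an assumption)."""
    return x[k] if 0 <= k < len(x) else 0


def fir_single_mac(c, x, n):
    """One output y(n); every MAC fetches both c(i) and x(n-i) from memory."""
    acc, reads = 0, 0
    for i in range(len(c)):
        acc += c[i] * sample(x, n - i)
        reads += 2
    return acc, reads


def fir_dual_mac_with_reg(c, x, n):
    """Two outputs y(n) and y(n+1) in parallel, sharing c(i) and a delay register."""
    acc0 = acc1 = 0
    reg = sample(x, n + 1)            # preload x(n+1) into the delay register
    reads = 1
    for i in range(len(c)):
        coeff = c[i]                  # one coefficient read feeds both MACs
        older = sample(x, n - i)      # one sample read: x(n-i)
        reads += 2
        acc1 += coeff * reg           # MAC1: c(i) * x(n+1-i), taken from REG
        acc0 += coeff * older         # MAC0: c(i) * x(n-i)
        reg = older                   # x(n-i) is what MAC1 needs at tap i+1
    return acc0, acc1, reads


c = [1, 2, 3, 4]
x = [1, 2, 3, 4, 5, 6]
y3, reads3 = fir_single_mac(c, x, 3)
y4, reads4 = fir_single_mac(c, x, 4)
d3, d4, dual_reads = fir_dual_mac_with_reg(c, x, 3)
```

For eight MAC operations the single-MAC version performs sixteen reads while the register version performs nine, close to the N versus 2N relation in the table (the one extra read is the register preload).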
MAC Unit using Redundant Binary Number
[Figure: MAC unit datapath. Labels: A-BUS 16 b, B-BUS 16 b, RBMU, MUL (0.5 cycle), 24 b, P-Reg 1, RBAU, RBA 1, RBA 2, RBA 3, ACC (0.5 cycle), P-Reg 2, 40 b, RTBC, RB->B CNV (0.5 cycle), 40 b ACC.]
RBMU : Redundant Binary Multiply Unit
RBAU : Redundant Binary Accumulation Unit
RTBC : Redundant Binary Digit to Binary Digit Conversion Unit
[Chart: horizontal bars, axis 0 to 100 [%]. Bars: BW-MAC, RB-MAC. Components: Encoder, Partial Product Gen., RBA Tree1(RB), FA Tree1(BW), RBA Tree2(RB), FA Tree2(BW), Preg1, Preg2, Mmux1, Conv.]
Power Consumption Ratio normalized to a BW-MAC
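The slide does not give the digit encoding or the adder rules, so the following is a sketch of a textbook redundant binary (signed-digit, digits -1, 0, 1) scheme rather than the circuit in the talk. It shows the property that makes such adders attractive: each sum digit depends only on a few neighbouring digit positions, so no carry ripples across the word, and a single ordinary subtraction at the end converts back to binary. The names `int_to_rb`, `rb_add` and `rb_to_binary` are hypothetical.

```python
def int_to_rb(value, width):
    """Encode an integer as signed digits, least significant first (|value| < 2**width)."""
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    return [sign * ((magnitude >> i) & 1) for i in range(width)]


def rb_add(x, y):
    """Carry-propagation-free addition of two equal-length signed-digit numbers."""
    n = len(x)
    interim = [0] * n
    carry = [0] * (n + 1)                  # carry[i] is the carry INTO position i
    for i in range(n):
        p = x[i] + y[i]
        # Looking one position down decides how to split +1 / -1 so that
        # interim[i] + carry[i] always stays within -1..1.
        lower_nonneg = i == 0 or (x[i - 1] >= 0 and y[i - 1] >= 0)
        if p == 2:
            c, s = 1, 0
        elif p == 1:
            c, s = (1, -1) if lower_nonneg else (0, 1)
        elif p == 0:
            c, s = 0, 0
        elif p == -1:
            c, s = (0, -1) if lower_nonneg else (-1, 1)
        else:
            c, s = -1, 0
        interim[i] = s
        carry[i + 1] = c
    return [interim[i] + carry[i] for i in range(n)] + [carry[n]]


def rb_to_binary(digits):
    """Conversion step: (positions holding +1) minus (positions holding -1)."""
    plus = sum(1 << i for i, d in enumerate(digits) if d == 1)
    minus = sum(1 << i for i, d in enumerate(digits) if d == -1)
    return plus - minus


total = rb_to_binary(rb_add(int_to_rb(13, 5), int_to_rb(-6, 5)))
```

Only `rb_to_binary` contains a full-width carry-propagating operation, which matches the slide's placement of one conversion stage after the accumulation.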
A System LSI realizing Base Band Processing & Control
[Figure: system LSI block diagram. Labels: Receiver, Synthesizer, Transmitter, Demodulator, Digital Signal Processor, Modulator, Misc, TDMA Controller, Audio I/F, Speaker, Microphone, VCO, Microcomputer, Keypad/Display.]
Features of the LSI
  Process Tech.       0.35 um CMOS
  # of Transistors    2.5 Million
  Die Size            9.26 x 10.0 mm
  Package             11 x 11 mm CSP
  Integrated IP       16b MCU MN102L (3 MIPS)
                      16b DSP MN1930 (40 MIPS)
                      Demodulator, TDMA Controller, VCO, etc.
Goal of the next generation Mobile Phone System
Current System (PDC)
- 5.6 / 11.2 kbps
Next Generation System (W-CDMA)
- 8 kbps ~ 2 Mbps
- High Speed Wireless Network, Video Phone
System Requirements
- High Bit Rate Data Transfer
  -> MORE cycles for error correction
  -> MORE data input/output to/from DSP
- Video CODEC Capability
- Of course, LOW POWER
Increasing Capability of Access Systems
[Figure: three time/frequency diagrams showing how users A and B share the channel.]
CDMA (Code Division Multiple Access)
- Users share frequency f1 and are separated by code: A uses (f1,c1), B uses (f1,c2)
- Digital System, ex. IS-95, W-CDMA
TDMA (Time Division Multiple Access)
- Users share frequency f1 and are separated by time slot: (f1,s1), (f1,s2)
- Digital System, ex. PDC, GSM, IS-54, PHS
FDMA (Frequency Division Multiple Access)
- Users are separated by frequency: f1, f2
- Analog System
Trend (FDMA -> TDMA -> CDMA)
- Increase: Channel Capacity, Data Speed
- -> System Complexity (DSP Performance)
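To make the CDMA case concrete, the sketch below lets users A and B transmit at the same time on the same frequency and separates them by code. It is a toy model under stated assumptions: the two length-4 orthogonal codes, the bipolar (+1/-1) bit representation, the noiseless synchronous channel, and the names `spread` and `despread` are all chosen here for illustration and do not describe IS-95 or W-CDMA parameters.

```python
C1 = [1, -1, 1, -1]   # hypothetical code for user A
C2 = [1, 1, -1, -1]   # hypothetical code for user B, orthogonal to C1


def spread(bits, code):
    """Replace every +1/-1 data bit by the code multiplied by that bit."""
    return [bit * chip for bit in bits for chip in code]


def despread(chips, code):
    """Correlate each code-length block with the code and take the sign."""
    n = len(code)
    bits = []
    for start in range(0, len(chips), n):
        correlation = sum(chips[start + j] * code[j] for j in range(n))
        bits.append(1 if correlation > 0 else -1)
    return bits


bits_a = [1, -1, 1]
bits_b = [-1, -1, 1]

# Both users occupy the same frequency at the same time: the channel adds them.
channel = [a + b for a, b in zip(spread(bits_a, C1), spread(bits_b, C2))]

recovered_a = despread(channel, C1)
recovered_b = despread(channel, C2)
```

The correlation in `despread` is a multiply-accumulate per chip, which hints at why the higher chip rates of CDMA systems raise the processing demand noted on the slide.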
Next Generation Mobile Phone Terminal and Issues to LSIs
[Figure: next generation terminal block diagram. Labels: RF unit, DUP, Receiver, Transmitter, A/D, D/A, Base band Signal Processing unit, De-spread, Rake, Spread, Channel CODEC, Speech/Video CODEC, Control unit, DSP, Data, Voice, Video.]
Issues to LSIs
- Low-power & High speed A/D
- Low-power & High speed correlator
- High speed Viterbi/Turbo decoder
- Low-power Video/Audio CODEC LSI
- High speed & Low-power LSI
DSP Architecture for the Next Generation System
[Figure: DSP block diagram. Labels: CORE, PU, PCU, DPU, ICU, IOU, DPP, CKU, Instruction Memory, Data Memory, regs, AMA, AMB, AB, Adrs, Data, M BUS, 16.]
Callouts
- Trace back, Modulo, Bit reverse
- Wide bandwidth to/from DSP: RF, ADC/DAC, Spreading/Despreading
- High Performance Processing Unit for Viterbi Decoding
Abbreviations
  CKU: Clock control Unit
  DPP: Direct Parallel Port
  DPU: Data Processing Unit
  ICU: Interrupt Control Unit
  IOU: data I/O Unit
  PCU: Program flow Control Unit
  PU:  Pointer Unit
Dual ACS Operation
2 path metrics are updated in 1 cycle
[Figure: butterfly from PM0, PM1 at time T-1 to PM0', PM1' at time T.]
[Figure: dual ACS datapath. Labels: Data Memory, register, 32-bit {PM1,PM0}, 32-bit {BM1,BM0}, AU1 comparing {BM1+PM0} with {BM0+PM1}, AU0 (ALU) comparing {BM0+PM0} with {BM1+PM1}, CMPR, COMP, ASR1, ASR2, to Data Memory.]
[Chart: [MIPS] (axis 20 to 80) versus Data Rate: 8 Kbps (VOICE), 32 Kbps (DATA), 64 Kbps (DATA). Series: Conventional, This scheme.]
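The four sums named in the figure form one Viterbi butterfly: two old path metrics and two branch metrics produce two new path metrics and two decision bits. The sketch below models that step in software. The pairing of AU0 with PM0' and AU1 with PM1', the choice of the smaller sum as survivor (as on the earlier ACS slide), and the function name `dual_acs` are assumptions for illustration.

```python
def dual_acs(pm0, pm1, bm0, bm1):
    """Update both path metrics of a butterfly in one step.

    Returns (new_pm0, new_pm1, decision0, decision1); a decision bit of 0 means
    the survivor came from PM0, and 1 means it came from PM1.
    """
    # AU0 candidates: {BM0+PM0} versus {BM1+PM1}
    from_pm0_a, from_pm1_a = pm0 + bm0, pm1 + bm1
    # AU1 candidates: {BM1+PM0} versus {BM0+PM1}
    from_pm0_b, from_pm1_b = pm0 + bm1, pm1 + bm0

    if from_pm0_a <= from_pm1_a:
        new_pm0, decision0 = from_pm0_a, 0
    else:
        new_pm0, decision0 = from_pm1_a, 1

    if from_pm0_b <= from_pm1_b:
        new_pm1, decision1 = from_pm0_b, 0
    else:
        new_pm1, decision1 = from_pm1_b, 1

    return new_pm0, new_pm1, decision0, decision1


updated = dual_acs(pm0=3, pm1=5, bm0=1, bm1=2)
```

In this model the decision bits are what a trace-back stage would later read, which is presumably why the figure routes shift-register outputs (ASR1, ASR2) back to data memory.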
DSP for Wireless Video Phone
[Figure: chip block diagram. Labels: SDRAM, DMA Controller, Video In, Video Out, Video I/F, Double Buffer (x2), DSP Core (x2), Dedicated Engine (x2), Local Memory (x2), Shared Memory, Shared Register, Host I/F, CPU, AD, DA, Sub-Processor, Main-Processor.]
Features
  Clock frequency     67.5 MHz (14.8 nsec)
  Technology          0.25 um CMOS (4-Metal)
  Number of devices   7,670 KTr.
  Die size            9.41 x 9.22 (= 86.76) mm^2
  Supply voltage      1.8 V (Internal), 3.3 V (IO)
  Performance         4 GOPS
                      15 frames/sec (CIF CODEC)
Summary
1. The DSP is one of the key components in a portable phone
terminal, and high performance with low power
consumption is an essential factor.
2. In the mobile multimedia era, new DSPs with
higher performance and increased functionality will
be necessary.
3. DSPs for portable phones must keep on achieving
MORE MIPS and LESS power consumption.
Mobile Phone for Internet & Data Communication
Ex. i-mode system provided by NTT DoCoMo
Applications
- E-mail
- Web Browsing
- Banking
- Location finding combined with a car navigation system
- etc.
Panasonic P502i