Problem 1 – Inverter Sizing


Problem 5 – Sequential circuits

a) Would the sequential circuit from the figure above be considered a latch, a master-slave latch pair or a pulse-triggered latch? Briefly explain your answer.

This is a pulse-triggered latch. The data is sampled only while both CLK = 1 and CLK2 = 1, a window that lasts only $3t_{inv}$.

b) All transistors in this circuit are unit-sized, with equivalent resistances R and gate capacitances C (ignore diffusion capacitances). Calculate the propagation delay $t_{Clk-Q}$ for high-to-low and low-to-high transitions. The load on the output Q is equal to 12C. Ignore the signal slopes in the delay calculation.

For L→H: $3R \cdot 2C + R \cdot C_Q = 18RC$ (with $C_Q = 12C$)

For H→L: $2R \cdot C_Q = 24RC$

c) This circuit does not strictly follow the rules for designing sequential logic discussed in the class. List three major problems in the operation of this circuit.

(1) There is no latch at the output to hold the Q value when CLK=0

(2) There is a feedforward path from CLK to Q which may cause overshoot at the output.

(3) When CLK switches from 0 to 1, no matter what value D is, Q will be pulled down first before the first stage settles.

Functionality? Why is this circuit good? Main reason for inverter I1?

Flip-flop / C2MOS (no skew) / to avoid loading the state node (buffer)

Also, review:

1. Brief overview on timing.

2. Question 5 on http://bwrc.eecs.berkeley.edu/classes/icdesign/ee141_f05/Exams/Final-f05-sol.pdf

3. Problem 3 on http://bwrc.eecs.berkeley.edu/classes/icdesign/ee141_s06/Homeworks/ee141_hw10_sp06.pdf

Problem 2 – Variable Block Carry Bypass Adder

Consider a 24-bit, 6-stage carry-bypass adder with the following delays: $t_{setup} = 4$, $t_{carry} = 1$, $t_{sum} = 4$, $t_{bypass} = 2$.

[Figure: 24-bit carry-bypass adder, bits 0 to 23 in six stages, each stage with a Setup block and a Sum block, with carry input $C_{in}$ and carry output $C_{out}$]

a) Identify the critical path through the adder. List the delays for each block along the critical path and give the total delay. Assume that each stage bypasses the same number of bits.

The critical path is through the setup of the first stage (specifically, the first bit of the first stage), through all four bits of the first stage’s carry chain, through the first five bypass multiplexers, three bits of the last stage’s carry chain, then through the final stage’s sum.

The last carry bit of the final stage does not affect the sum, only the carry out. Carry out through the final multiplexer is not on the critical path as the sum is slower.

The delays for each component of this are $t_{setup}$, $4t_{carry}$, $3t_{bypass} + 2t_{bypass}$, $3t_{carry}$, and $t_{sum}$ respectively. These add up to the expression given in equation 11.9 in the text, and substituting the given delays gives a total of $4 + 4 + 10 + 3 + 4 = 25$. The critical path is grayed in figure 11-14.

b) Consider the setup delay and carry propagation of the second and third stages. These are not on the critical path and can be made slower without affecting performance. If we allow each stage to handle a different number of bits, what is the relationship between the number of bits per stage and the respective carry propagation delay? How many bits would you assign to each of the first three stages to minimize the delay from inputs to the carry output for the first 12 bits of the adder?

The worst case delay from the first stage’s inputs through the setup, carry propagation and bypass to the start of the fourth stage is $t_{setup} + M_0 t_{carry} + 3t_{bypass}$, where $M_0$ is the number of bits in the first stage.

The delays for the second and third stages are similarly $t_{setup} + M_1 t_{carry} + 2t_{bypass}$ and $t_{setup} + M_2 t_{carry} + t_{bypass}$.

Making all of these equal, we get that $M_1 - M_0 = M_2 - M_1 = t_{bypass}/t_{carry} = 2$. Since the three stages together cover 12 bits ($M_0 + M_1 + M_2 = 12$), the first, second, and third stages should add 2, 4 and 6 bits respectively.

The original critical path is now 2 carries shorter, for a total delay of 23 at the final sum output (it is also acceptable to just give the delay to the end of the third stage carry, which is now 12 instead of 14). The paths starting in the second and third stages are now critical paths as well, with the same delay.
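The part (a) total and the part (b) stage widths can be cross-checked numerically. The sketch below is only an illustration: the function names are hypothetical, and it assumes the delay expressions stated above, searching every split of 12 bits over three stages for the one with the smallest worst-case delay.

```python
from itertools import product

# Delays given in the problem statement (arbitrary units).
T_SETUP, T_CARRY, T_SUM, T_BYPASS = 4, 1, 4, 2


def uniform_bypass_delay(n_bits, n_stages):
    """Critical-path delay of part (a), with equal-width stages."""
    m = n_bits // n_stages
    return (T_SETUP + m * T_CARRY + (n_stages - 1) * T_BYPASS
            + (m - 1) * T_CARRY + T_SUM)


def first_half_delays(widths):
    """Delay from the inputs of stages 1..3 to the carry in of stage 4.

    Stage i (0-based) is charged (3 - i) bypass delays, as in the text.
    """
    return [T_SETUP + m * T_CARRY + (3 - i) * T_BYPASS
            for i, m in enumerate(widths)]


def best_first_half(total_bits=12):
    """Exhaustive search over (M0, M1, M2) for the smallest worst-case delay."""
    candidates = [w for w in product(range(1, total_bits), repeat=3)
                  if sum(w) == total_bits]
    return min(candidates, key=lambda w: max(first_half_delays(w)))


print(uniform_bypass_delay(24, 6))        # 25
widths = best_first_half()
print(widths, first_half_delays(widths))  # (2, 4, 6) [12, 12, 12]
```

The search agrees with the algebra: the three paths are balanced at a delay of 12 only for widths of 2, 4 and 6 bits.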

c) How many bits would you assign to each stage in the second half of the adder? What is/are the delays along the critical path(s) now?

Same approach as for part b, except the critical paths are now from the carry in from the third stage, to the sum outputs. Delays for each path are:

$2t_{bypass} + (M_5 - 1)t_{carry} + t_{sum}$

$t_{bypass} + (M_4 - 1)t_{carry} + t_{sum}$

$(M_3 - 1)t_{carry} + t_{sum}$

Making all of these equal, we get that $M_4 - M_5 = M_3 - M_4 = t_{bypass}/t_{carry} = 2$. Thus, the fourth, fifth and sixth stages should add 6, 4 and 2 bits respectively.

The critical path to the final stage sum output is now another 2 carries shorter, for a total delay of 21. The fourth and fifth stage outputs are now also critical paths.

Note: parts a and c assumed that the sum logic for a bit has a delay of $t_{sum}$ from its carry in to the sum out. From the structure of the mirror adder, one might consider $t_{sum}$ to refer to the delay from the bit’s own carry to its output, in which case the critical path delays would have one extra $t_{carry}$ ($M$ instead of $M - 1$). This does not affect the choice of stage widths and is acceptable for the answers.
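The second-half sizing in part (c) can be checked the same way. This is a rough sketch with hypothetical function names; it assumes the three path delays listed in part (c), a carry arriving at the fourth stage at time 12 (from part b), and an optional `extra_carry` flag for the alternative reading of $t_{sum}$ described in the note.

```python
from itertools import product

# Delays given in the problem statement (arbitrary units).
T_CARRY, T_SUM, T_BYPASS = 1, 4, 2
CARRY_IN_ARRIVAL = 12  # delay to the end of the third stage, from part (b)


def second_half_delays(widths, extra_carry=0):
    """Delay from the carry in of stage 4 to the last sum of stages 4, 5, 6.

    widths is (M3, M4, M5); stage 4 sees no bypass, stage 5 one, stage 6 two.
    extra_carry=1 models the 'M instead of M-1' reading from the note.
    """
    return [i * T_BYPASS + (m - 1 + extra_carry) * T_CARRY + T_SUM
            for i, m in enumerate(widths)]


def best_second_half(total_bits=12, extra_carry=0):
    """Exhaustive search over (M3, M4, M5) for the smallest worst-case delay."""
    candidates = [w for w in product(range(1, total_bits), repeat=3)
                  if sum(w) == total_bits]
    return min(candidates,
               key=lambda w: max(second_half_delays(w, extra_carry)))


widths = best_second_half()
total = CARRY_IN_ARRIVAL + max(second_half_delays(widths))
print(widths, second_half_delays(widths), total)  # (6, 4, 2) [9, 9, 9] 21
print(best_second_half(extra_carry=1))            # (6, 4, 2)
```

Adding one $t_{carry}$ to every path shifts all three delays equally, which is why the chosen widths do not change.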

Problem 1 – Inverter Sizing

Consider a standard CMOS inverter shown above driving a capacitive load $C_L = 80$ fF with a relatively fast step at its input. Assume that a minimum size “unit” inverter has symmetric high and low drive strength $R_{eq,u} = 20$ kΩ, intrinsic output capacitance $C_{int,u} = 3$ fF, and input capacitance $C_{in,u} = 4$ fF. Also assume that inverter resistances and capacitances scale linearly with size.

a) What is the shortest $t_p$ that can possibly be attained for the above circuit by sizing the inverter, and how would it be sized? Call this delay $t_{pmin}$.

Soln: The delay is given by $t_p = 0.69 R_{eq}(C_{int} + C_L)$, with $R_{eq} = R_{eq,u}/k$ and $C_{int} = C_{int,u} \cdot k$, where $k$ is the size of the inverter relative to the unit inverter. From this equation it is seen that $t_p$ is minimized by letting $k \to \infty$, in which case the delay is just equal to the intrinsic self-loading delay of $t_{pmin} = 0.69 R_{eq} C_{int} = 0.69 R_{eq,u} C_{int,u} = 41.4$ ps.

b) What size should the inverter be, relative to the unit inverter, to obtain $t_p = 1.3 t_{pmin}$? What are the input and intrinsic output capacitances of this inverter?

Soln: $t_p = 0.69 (R_{eq,u}/k)(C_{int,u} \cdot k + C_L) = 1.3 \times 41.4$ ps $\Rightarrow k = 89$ times larger than a unit inverter. The capacitances are $C_{in} = 89 C_{in,u} = 356$ fF, $C_{int} = 89 C_{int,u} = 267$ fF.

c) Now consider the dynamic energy consumed driving $C_{in}$, $C_{int}$, and $C_L$ over a complete input cycle (one logic transition in each direction). What inverter size minimizes the energy delay product of this circuit? How do the inverter capacitances compare to $C_L$ in this case?

Soln: For constant supply voltage, energy is simply proportional to the total capacitance that is being charged, in this case $E \propto (C_{in,u} + C_{int,u}) \cdot k + C_L$, where $k$ is the inverter size relative to the unit inverter. The delay is given by $t_p = 0.69 (R_{eq,u}/k)(C_{int,u} \cdot k + C_L) = t_{pmin} + 0.69 R_{eq,u} C_L / k$. Thus, we minimize the energy delay product by minimizing $[(C_{in,u} + C_{int,u}) \cdot k + C_L] \times [t_{pmin} + 0.69 R_{eq,u} C_L / k]$ over $k$. This is done by setting the derivative with respect to $k$ equal to zero (to be complete, we should also show that this finds a minimum, not a maximum, which can be seen by considering extreme values of $k$). First we simplify the above expression by expanding the products, combining terms, and dropping terms that are constant with respect to $k$ (i.e., terms that don’t affect the derivative). This results in the simplified expression:

$U(k) = (C_{in,u} + C_{int,u}) \cdot k \cdot t_{pmin} + 0.69 R_{eq,u} C_L^2 / k$ (the $k$ that minimizes $U(k)$ also minimizes the energy delay product)

$dU/dk = (C_{in,u} + C_{int,u}) \cdot t_{pmin} - 0.69 R_{eq,u} C_L^2 / k^2 = 0$

$k^2 = 0.69 R_{eq,u} C_L^2 / [t_{pmin}(C_{in,u} + C_{int,u})] = 0.69 R_{eq,u} C_L^2 / [0.69 R_{eq,u} C_{int,u}(C_{in,u} + C_{int,u})]$

$k = C_L / \sqrt{C_{int,u}[C_{in,u} + C_{int,u}]} = 17.5$ times larger than the unit inverter

An inverter this size has $C_{in} = 70$ fF, $C_{int} = 52.5$ fF. These capacitances are similar to the load capacitance itself when optimizing the energy delay product. If we had only considered $C_{int}$ and neglected $C_{in}$, the solution would have been to set $C_{int} = C_L$, i.e., to make self loading and external loading equal.
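The three answers of this problem can be reproduced with a few lines of arithmetic. The following is a minimal sketch with hypothetical names; it assumes the first-order model $t_p = 0.69 (R_{eq,u}/k)(C_{int,u} k + C_L)$ used above and takes energy as proportional to the total switched capacitance.

```python
import math

# Values from the problem statement (SI units).
R_EQ_U = 20e3     # unit inverter equivalent resistance, ohms
C_INT_U = 3e-15   # unit inverter intrinsic output capacitance, farads
C_IN_U = 4e-15    # unit inverter input capacitance, farads
C_L = 80e-15      # external load, farads

# Part (a): limit of t_p(k) as k grows without bound.
T_PMIN = 0.69 * R_EQ_U * C_INT_U


def t_p(k):
    """First-order propagation delay of an inverter k times the unit size."""
    return 0.69 * (R_EQ_U / k) * (C_INT_U * k + C_L)


def size_for_delay(ratio):
    """Part (b): k such that t_p(k) = ratio * T_PMIN.

    Follows from t_p = T_PMIN * (1 + C_L / (k * C_INT_U)).
    """
    return C_L / (C_INT_U * (ratio - 1.0))


def edp(k):
    """Part (c): energy-delay product up to a constant factor."""
    return ((C_IN_U + C_INT_U) * k + C_L) * t_p(k)


# Closed-form optimum derived in the solution.
k_edp = C_L / math.sqrt(C_INT_U * (C_IN_U + C_INT_U))

print(f"t_pmin = {T_PMIN * 1e12:.1f} ps")                  # t_pmin = 41.4 ps
print(f"k for 1.3 t_pmin = {size_for_delay(1.3):.1f}")     # 88.9
print(f"k minimizing EDP = {k_edp:.1f}")                   # 17.5
print(edp(k_edp) < edp(0.9 * k_edp), edp(k_edp) < edp(1.1 * k_edp))  # True True
```

The last line is a quick numerical check that the stationary point is a minimum: the energy delay product is larger on both sides of $k \approx 17.5$.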

Problem 2 – CMOS Scaling

A microprocessor consumes 0.3 mW/MHz when fabricated using a 0.13 µm process. The area of the processor is 0.7 mm². Assume a 200 MHz clock frequency, and 1.2 V power supply. Its leakage power is 0.1 mW. Assume short channel devices, but ignore second order effects like mobility degradation, series resistance, etc.

(a) If the supply voltage of the microprocessor scaled to 90 nm is reduced to 1.0 V, what will the area, frequency, power consumption, and power density be?

(b) If the threshold voltage in the 0.13 µm process is 0.35 V, what should be the threshold voltage in 90 nm? Assuming 80 mV/dec subthreshold slope, what would be the leakage power of the new processor?

1. Analysis Using the Unified Model

Below is another I-V transfer curve for a different NMOS transistor operating under slightly different conditions (see next page):

In this problem, the objective is to use a transfer curve like the one above to obtain the transistor parameters. The transistor has (W/L) = (20/1). You may also assume that velocity saturation does not play a role in this example. Also assume $-2\Phi_F = -0.6$ V.

From the figure on the next page, determine the following parameters: the threshold voltage $V_{T0}$, body effect parameter $\gamma$, channel length modulation parameter $\lambda$.

Hint: Depending on your choice of curves, you might get unreasonable values for $V_{T0}$. Therefore, use the curves with the two lowest Vgs values (1V, 1.5V) for the determination of $V_{T0}$, and explain why using curves with higher Vgs doesn’t give you sensible answers.

[Figure: NMOS I-V curves for Vgs = 1 V, 1.5 V, 2 V and 2.5 V, each at Vbs = 0 V and Vbs = -1 V; horizontal axis Vds (V)]

Solution

$V_{T0}$

This one should immediately signal you to look at curves that don’t have body effect. That means $V_{BS} = 0$ V. Pick two points, each from a different curve that satisfies the no-body-effect condition. Make sure they’re in the same operating region too!

| Point | $V_{GS}$ | $V_{DS}$ | $I_D$ | Operating region |
|---|---|---|---|---|
| A | 1.5 V | 2 V | 1 mA | saturation |
| B | 1.0 V | 2 V | 0.36 mA | saturation |

$$\frac{I_{D,A}}{I_{D,B}} = \frac{\frac{1}{2} k' \frac{W}{L} (V_{GS,A} - V_{T0})^2 (1 + \lambda V_{DS,A})}{\frac{1}{2} k' \frac{W}{L} (V_{GS,B} - V_{T0})^2 (1 + \lambda V_{DS,B})}$$

$$\frac{1}{0.36} = \frac{(1.5 - V_{T0})^2}{(1.0 - V_{T0})^2}$$

$V_{T0} = 0.25$ V

As you can see, working on two points with the same $V_{DS}$ cancels as many variables as possible, so that the equation can be solved. We choose the curves with lower Vgs values because the velocity saturation effect is less prominent at these values. Therefore, the quadratic equations can still provide a reasonable fit to the curves.
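The ratio step can be solved in closed form: taking the square root of both sides gives $\sqrt{I_{D,A}/I_{D,B}} = (V_{GS,A} - V_{T0})/(V_{GS,B} - V_{T0})$, which is linear in $V_{T0}$. A small sketch follows (the function name is hypothetical, and it assumes both points are in saturation with the same $V_{DS}$ and $V_{BS} = 0$):

```python
import math


def vt0_from_two_points(vgs_a, id_a, vgs_b, id_b):
    """Threshold voltage from two saturation points sharing V_DS and V_BS = 0.

    Uses sqrt(id_a / id_b) = (vgs_a - vt0) / (vgs_b - vt0), which holds
    when both gate voltages are above the threshold.
    """
    r = math.sqrt(id_a / id_b)
    return (vgs_a - r * vgs_b) / (1.0 - r)


# Points A and B read from the curves: (1.5 V, 1 mA) and (1.0 V, 0.36 mA).
print(round(vt0_from_two_points(1.5, 1.0e-3, 1.0, 0.36e-3), 3))  # 0.25
```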

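$\lambda$

We can use the same methodology as above. This time, we want to keep $V_{GS}$ constant.

| Point | $V_{GS}$ | $V_{DS}$ | $I_D$ | Operating region |
|---|---|---|---|---|
| A | 1 V | 2.0 V | 0.36 mA | saturation |
| B | 1 V | 1.0 V | 0.32 mA | saturation |

$$\frac{I_{D,A}}{I_{D,B}} = \frac{\frac{1}{2} k' \frac{W}{L} (V_{GS,A} - V_T)^2 (1 + \lambda V_{DS,A})}{\frac{1}{2} k' \frac{W}{L} (V_{GS,B} - V_T)^2 (1 + \lambda V_{DS,B})}$$

$$\frac{0.36}{0.32} = \frac{1 + \lambda \cdot 2.0}{1 + \lambda \cdot 1.0}$$

$\lambda = 0.143$ V$^{-1}$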
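With $V_{GS}$ and $V_T$ identical at both points, only the $(1 + \lambda V_{DS})$ factors remain, and the ratio can be rearranged to $\lambda = (r - 1)/(V_{DS,A} - r V_{DS,B})$ with $r = I_{D,A}/I_{D,B}$. A short sketch of that arithmetic (hypothetical function name):

```python
def lambda_from_two_points(vds_a, id_a, vds_b, id_b):
    """Channel length modulation from two saturation points on one V_GS curve.

    Solves id_a / id_b = (1 + lam * vds_a) / (1 + lam * vds_b) for lam.
    """
    r = id_a / id_b
    return (r - 1.0) / (vds_a - r * vds_b)


# Points A and B: (2.0 V, 0.36 mA) and (1.0 V, 0.32 mA) at V_GS = 1 V.
print(round(lambda_from_two_points(2.0, 0.36e-3, 1.0, 0.32e-3), 3))  # 0.143
```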

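$\gamma$

It shouldn’t be a surprise, but that leaves us to keep almost everything constant except for $V_{SB}$.

| Point | $V_{SB}$ | $V_{GS}$ | $V_{DS}$ | $I_D$ | Operating region |
|---|---|---|---|---|---|
| A | 1.0 V | 1.0 V | 2 V | 0.16 mA | saturation |
| B | 0.0 V | 1.0 V | 2 V | 0.36 mA | saturation |

$$\frac{I_{D,A}}{I_{D,B}} = \frac{\frac{1}{2} k' \frac{W}{L} (V_{GS,A} - V_T)^2 (1 + \lambda V_{DS,A})}{\frac{1}{2} k' \frac{W}{L} (V_{GS,B} - V_{T0})^2 (1 + \lambda V_{DS,B})}$$

$$\frac{0.16}{0.36} = \frac{(1.0 - V_T)^2}{(1.0 - 0.25)^2}$$

$V_T = 0.5$ V

Now solve for $\gamma$ using the following equation:

$$V_T = V_{T0} + \gamma\left(\sqrt{|{-2\Phi_F}| + V_{SB}} - \sqrt{|{-2\Phi_F}|}\right)$$

$$0.5 = 0.25 + \gamma\left(\sqrt{0.6 + 1} - \sqrt{0.6}\right)$$

$\gamma = 0.51$ V$^{1/2}$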
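Both steps of the $\gamma$ extraction can be checked numerically. This sketch uses hypothetical function names and assumes $|2\Phi_F| = 0.6$ V and $V_{T0} = 0.25$ V from the earlier part.

```python
import math

VT0 = 0.25        # volts, from the V_T0 extraction
TWO_PHI_F = 0.6   # volts, magnitude of 2 * Phi_F


def vt_with_body_bias(vgs, id_biased, id_unbiased, vt0=VT0):
    """Threshold under body bias from two points sharing V_GS and V_DS.

    Uses sqrt(id_biased / id_unbiased) = (vgs - vt) / (vgs - vt0).
    """
    return vgs - math.sqrt(id_biased / id_unbiased) * (vgs - vt0)


def body_effect_gamma(vt, vsb, vt0=VT0, two_phi_f=TWO_PHI_F):
    """Solve vt = vt0 + gamma * (sqrt(two_phi_f + vsb) - sqrt(two_phi_f))."""
    return (vt - vt0) / (math.sqrt(two_phi_f + vsb) - math.sqrt(two_phi_f))


# Point A (V_SB = 1 V, 0.16 mA) and point B (V_SB = 0 V, 0.36 mA).
vt = vt_with_body_bias(1.0, 0.16e-3, 0.36e-3)
print(round(vt, 3))                          # 0.5
print(round(body_effect_gamma(vt, 1.0), 2))  # 0.51
```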

2. Inverter delay analysis and power.

First order analysis again. Like problem 3 in HW3.

Warm up exercises for buffer sizing.

Difference between $E_{VDD}$ and $E_C$

3. Capacitance


What is the Miller effect? Try the KCL method at the output node and find $C_{eq} = 2(C_{gdN} + C_{gdP})$.
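One way to see where the factor of 2 comes from is to count the charge the output node must deliver to the gate-drain capacitors when the input and output swing by the same amount in opposite directions. The sketch below is illustrative only: the function name and the capacitance and supply values are hypothetical, not taken from the course material.

```python
import math


def miller_equivalent(c_gd_n, c_gd_p, dv_in, dv_out):
    """Grounded output capacitance drawing the same charge as the two C_gd.

    Each gate-drain capacitor sees its voltage change by (dv_out - dv_in),
    so the charge taken from the output node is (C_gdN + C_gdP) times that.
    Dividing by dv_out gives the equivalent capacitance to ground.
    """
    dq = (c_gd_n + c_gd_p) * (dv_out - dv_in)
    return dq / dv_out


# Hypothetical values: C_gdN = 0.3 fF, C_gdP = 0.6 fF, full 2.5 V swing.
c_eq = miller_equivalent(0.3e-15, 0.6e-15, dv_in=2.5, dv_out=-2.5)
print(math.isclose(c_eq, 2 * (0.3e-15 + 0.6e-15)))  # True
```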

Problem 3. CMOS Gate Design and Implementation

a) Design F = AB + AC + BC in static CMOS using the least number of devices. Draw the logic graphs corresponding to the circuit and identify the Euler paths.

b) Using the Euler paths you found, draw the stick diagram for the implementation. Try to use the appropriate colors to make your diagram clear.
