#### CMOS SEQUENTIAL STATE

DELAY

- SETUP
- PROPAGATION
- HOLD

ENERGY

- DATA VS. CLOCK
- CLOCK GATING

AREA -> TOTAL GATE CAP

BASIC TRANSMISSION GATE MASTER/SLAVE FLIP-FLOP



WE WANT TO CHARACTERIZE THE SETUP TIME, CLOCK-TO-Q PROPAGATION DELAY, AND THE HOLD TIME

LOGICAL EFFORT WILL NOT DIRECTLY BE OF MUCH USE SINCE THE TRANSMISSION GATES COUPLE THE DELAY OF VARIOUS GATES TOGETHER

SO WE WILL USE ELMORE DELAY, BUT FIRST WE NEED AN RC MODEL OF A TRANSMISSION GATE

NOTE THE CONSERVATIVE USE OF EXTRA INVERTERS AT THE INPUT AND OUTPUT. THIS IS COMMON IN STANDARD CELLS TO HELP IMPROVE THE CELL'S "ELECTRICAL MODULARITY"

### RC MODEL FOR TRANSMISSION GATE



ASSUME SIZE OF NMOS AND PMOS ARE THE SAME MINIMUM WIDTH

ASSUME A TRANSISTOR DRIVING A WEAK VALUE (I.E. PMOS PASSING ZERO, NMOS PASSING ONE) HAS 2x WORSE EFFECTIVE RESISTANCE





NOTE THAT THE INPUT NODE HAS PARASITIC CAP OF 2C AND THE OUTPUT NODE ALSO HAS PARASITIC CAP OF 2C

ROUGHLY ASSUME EFFECTIVE RESISTANCE IS R INDEPENDENT OF WHICH VALUE WE ARE PASSING

NOW THAT WE HAVE AN RC MODEL FOR A TRANSMISSION GATE WE CAN GO BACK AND CREATE RC MODELS FOR THE SETUP TIME, CLOCK-TO-Q PROPAGATION DELAY, AND HOLD TIME.

## SETUP TIME

THE INPUT MUST BE STABLE FOR SOME TIME BEFORE THE SAMPLING EDGE, ALLOWING US TO CAPTURE THE VALUE.

TO CALCULATE SETUP TIME WE ASK OURSELVES, "HOW FAR DOES THE INPUT SIGNAL NEED TO PROPAGATE SO THAT WE CAN RELIABLY "FLIP" THE MASTER LATCH BEFORE THE CLOCK EDGE?"

THE INPUT SIGNAL MUST GO THROUGH GATES A, B, C, AND D TO REALLY FLIP THE CROSS-COUPLED INVERTERS.

ASSUME INPUT IS ONE, AND SIMPLIFY

SETUP TIME IS 29RC / 3RC = 9.7 τ

## CLOCK-TO-Q PROPAGATION DELAY

AFTER THE RISING CLOCK EDGE TGATE B AND I OPEN AND TGATE E AND F CLOSE. THE PROPAGATION DELAY IS FROM GATE C THROUGH GATES F, G, AND J



TECHNICALLY WE DON'T NEED TO INCLUDE THIS CAP SINCE IT WAS ALREADY DISCHARGED DURING THE SETUP TIME

CLOCK-TO-Q PROPAGATION DELAY IS

$$2R \cdot 7C + 9RC + 3RC = 26RC \quad\Rightarrow\quad \frac{26RC}{3RC} = 8.67\,\tau$$

## HOLD TIME

THE INPUT MUST REMAIN STABLE FOR SOME TIME AFTER THE SAMPLING EDGE (E.G. RISING CLOCK EDGE IN THIS CASE) IN ORDER TO PREVENT CORRUPTING THE STATE

IF WE ASSUME $\phi$ AND $\overline{\phi}$ CHANGE INSTANTANEOUSLY AT THE RISING CLOCK EDGE THEN WE ACTUALLY HAVE A NEGATIVE HOLD TIME. IF THE INPUT CHANGES RIGHT AFTER THE EDGE THEN BY THE TIME IT GETS TO THE FIRST TRANSMISSION GATE (GATE B) THAT GATE IS ALREADY OPEN AND THE INPUT SIGNAL CANNOT CORRUPT THE STATE.

IN FACT, THE INPUT CAN CHANGE A LITTLE BEFORE THE EDGE SINCE IT TAKES SOME TIME TO PROPAGATE THROUGH THE FIRST INVERTER.

MORE SPECIFICALLY, THE DELAY THROUGH THE FIRST INVERTER IS JUST

$$R \cdot 5C = 5RC = \frac{5RC}{3RC}\,\tau = 1.67\,\tau$$

OF COURSE WE GET THE SAME ANSWER USING LOGICAL EFFORT: $d = gh + p = (1)\left(\frac{2C}{3C}\right) + 1 = 1.67\,\tau$

SO THE HOLD TIME IF WE ASSUME $\phi$ AND $\overline{\phi}$ CHANGE INSTANTANEOUSLY ON THE RISING CLOCK EDGE IS -1.67 τ
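The agreement between the Elmore and logical effort results can be checked with a minimal sketch. The helper names `elmore_delay` and `logical_effort_delay` are illustrative, not from the notes, and the split of the 5C into a 3C inverter parasitic plus the 2C tgate input node is an assumption inferred from the logical effort terms (p = 1, h = 2C/3C).

```python
from fractions import Fraction

TAU_IN_RC = 3  # tau = 3RC, the normalisation used throughout these notes

def elmore_delay(ladder):
    """Elmore delay of an RC ladder, in units of RC.

    `ladder` lists (resistance, capacitance) pairs from the driver outward,
    in units of R and C. Each capacitor is charged through all upstream R.
    """
    delay, upstream_r = Fraction(0), Fraction(0)
    for r, c in ladder:
        upstream_r += r
        delay += upstream_r * c
    return delay

def logical_effort_delay(g, h, p):
    """Single-stage logical effort delay d = g*h + p, in units of tau."""
    return g * h + p

# ASSUMPTION: the first inverter (resistance R) sees its own 3C parasitic
# plus the 2C parasitic on the tgate input node, i.e. 5C in total.
elmore_tau = elmore_delay([(1, 3 + 2)]) / TAU_IN_RC
le_tau = logical_effort_delay(g=1, h=Fraction(2, 3), p=1)

hold_time_tau = -elmore_tau  # negative hold time with ideal local clocks
print(float(elmore_tau), float(le_tau), float(hold_time_tau))
```

Using exact fractions keeps both results at 5/3 τ, so the comparison does not depend on rounding.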

## INTERNAL CLOCK DELAY

WHAT IF WE ASSUME THERE IS SOME DELAY BETWEEN THE CLOCK INPUT PIN OF OUR CELL AND THE ACTUAL $\phi$ AND $\overline{\phi}$ SIGNALS?



HOW DOES THIS IMPACT THE SETUP TIME, PROPAGATION DELAY, AND HOLD TIME? REMEMBER THAT ALL THREE OF THESE METRICS ARE DEFINED WITH RESPECT TO THE CLK PIN OF THE CELL, NOT THE LOCAL $\phi$ AND $\overline{\phi}$ SIGNALS

FIRST WE NEED TO CALCULATE THE DELAY THROUGH THE LOCAL CLOCK TREE AND THEN WE CAN FACTOR THIS INTO THE SETUP TIME, PROPAGATION DELAY, AND HOLD TIME


$\phi$ AND $\overline{\phi}$ EACH DRIVE ONE MINIMUM SIZED TRANSISTOR IN FOUR TRANSMISSION GATES, SO THE LOAD FOR EACH ONE IS 4C



WE CAN CONSERVATIVELY FOCUS ON JUST THE PATH TO $\phi$ SINCE IT WILL BE LONGER THAN THE PATH TO $\overline{\phi}$ AND DOMINATE WHEN THE TRANSMISSION GATES OPEN OR CLOSE

$$d = g_{0}h_{0} + p_{0} + g_{1}h_{1} + p_{1}$$

$$= (1)\left(\frac{7C}{3C}\right) + 1 + (1)\left(\frac{4C}{3C}\right) + 1$$

$$= \frac{7}{3} + 1 + \frac{4}{3} + 1$$

$$= \frac{17}{3} \approx 5.6\,\tau$$

NOTICE WE SIMPLY INCORPORATE THE EXTRA CAP IN THE ELECTRICAL EFFORT. WE ARE NOT OPTIMIZING SIZING, JUST CALCULATING DELAY
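The two-inverter local clock path above can be evaluated with a short sketch. The function name `path_delay` and the tuple layout are illustrative assumptions; the capacitances are the 7C, 4C and 3C values from the equations, with g = 1 and p = 1 for each inverter.

```python
from fractions import Fraction

def path_delay(stages):
    """Sum g*h + p over the stages of a path, in units of tau.

    Each stage is (g, c_load, c_in, p) with capacitances in units of C,
    so the electrical effort of a stage is h = c_load / c_in.
    """
    return sum(g * Fraction(c_load, c_in) + p for g, c_load, c_in, p in stages)

local_clock_delay = path_delay([
    (1, 7, 3, 1),  # first inverter: next inverter (3C) plus 4C of tgate load
    (1, 4, 3, 1),  # second inverter: 4C of tgate load
])
print(local_clock_delay, float(local_clock_delay))
```

The exact value is 17/3 τ, which the notes carry forward as 5.6 τ.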

THIS EXTRA DELAY REDUCES THE SETUP TIME, INCREASES THE PROPAGATION DELAY, AND INCREASES THE HOLD TIME

SETUP TIME 9.7 - 5.6 = 4.1 τ

CLK-TO-Q DELAY 8.6 + 5.6 = 14.2 τ

HOLD TIME -1.7 + 5.6 = 3.9 τ (NOW WE HAVE A POSITIVE HOLD TIME)

ESSENTIALLY WHAT WE HAVE DONE IS SHIFT THE SAMPLING WINDOW
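A minimal sketch of this shift, using the rounded values from the notes. The function name `shift_sampling_window` is illustrative; the point is that setup and hold move in opposite directions by the same amount, so the width of the sampling window does not change.

```python
def shift_sampling_window(setup, clk_to_q, hold, clk_delay):
    """Re-reference timing (in tau) from the local phi signals to the CLK pin."""
    return setup - clk_delay, clk_to_q + clk_delay, hold + clk_delay

ideal = (9.7, 8.6, -1.7)  # setup, clock-to-Q, hold with ideal local clocks
shifted = shift_sampling_window(*ideal, clk_delay=5.6)

for name, value in zip(("setup", "clk-to-q", "hold"), shifted):
    print(f"{name}: {value:.1f} tau")

# The window width (setup + hold) is unchanged; only its position moves.
window_before = ideal[0] + ideal[2]
window_after = shifted[0] + shifted[2]
```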

[TIMING DIAGRAMS: WITH AN IDEAL CLK THE SETUP WINDOW ON D ENDS BEFORE THE CLOCK EDGE (NEGATIVE HOLD TIME); WITH A DELAYED LOCAL CLK THE WINDOW SHIFTS PAST THE EDGE (POSITIVE HOLD TIME). SIGNALS SHOWN: CLK, LOCAL CLK, D, Q]

ENERGY

WE WILL LOOK AT THE ENERGY DUE TO TOGGLING THE DATA VERSUS THE ENERGY DUE TO TOGGLING THE CLOCK LINES

FIRST LET'S LABEL ALL OF THE GATE AND PARASITIC CAPS



WE ARE INTERESTED IN THE WORST-CASE AVG ENERGY PER WRITE/READ ACCESS, SO WE CAN ESTIMATE THIS BY SIMPLY CALCULATING THE MAXIMUM SWITCHED CAPACITANCE WHILE ASSUMING EVERY NODE TOGGLES

DATA SWITCHED CAP = $6 \times 6 + 4 \times 4 = 52C$

CLOCK SWITCHED CAP = $2 \times 6 + 4 \times 2 = 20C$

REMEMBER THAT FOR 90 nm, C = 0.5 fF, SO ENERGY CAN BE ESTIMATED AS FOLLOWS

$$E_{\text{DATA}} = \frac{1}{2} C V_{DD}^2 = \frac{1}{2} (52C) \left(\frac{0.5\,\text{fF}}{C}\right) (1.0\,\text{V})^2 = 13\,\text{fJ}$$

$$E_{\text{CLOCK}} = \frac{1}{2} C V_{DD}^2 = \frac{1}{2} (20C) \left(\frac{0.5\,\text{fF}}{C}\right) (1.0\,\text{V})^2 = 5\,\text{fJ}$$
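The same arithmetic as a small sketch. The constant and function names are illustrative; the 0.5 fF per unit C comes from the notes, and the 1.0 V supply is read off the squared term in the equations.

```python
C_UNIT_FF = 0.5  # fF per unit C at 90 nm (from the notes)
VDD = 1.0        # volts, as used in the energy equations

def switching_energy_fj(c_units, vdd=VDD, c_unit_ff=C_UNIT_FF):
    """Energy 1/2 * C * Vdd^2 in femtojoules for a cap given in units of C."""
    return 0.5 * c_units * c_unit_ff * vdd ** 2

data_cap = 6 * 6 + 4 * 4   # switched cap on the data path, in units of C
clock_cap = 2 * 6 + 4 * 2  # switched cap on the clock lines, in units of C

e_data_fj = switching_energy_fj(data_cap)
e_clock_fj = switching_energy_fj(clock_cap)
print(data_cap, clock_cap, e_data_fj, e_clock_fj)
```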

POWER DEPENDS ON ACTIVITY FACTOR AND CLOCK FREQUENCY


ASSUME A CLOCK FREQUENCY OF 500 MHz (E.G. AN ASIC PROCESSOR) AND AN ACTIVITY FACTOR OF 0.1 FOR DATA AND 2 FOR CLOCK

NOTE THAT THE ACTIVITY FACTOR FOR THE CLOCK IS >1 BECAUSE IT TOGGLES TWICE PER CYCLE!

$$E_{DATA} = \alpha \cdot \frac{1}{2} C V_{DD}^2 = (0.1)(13\,\text{fJ}) = 1.3\,\text{fJ}$$

$$P_{DATA} = E_{DATA} \cdot f = (1.3\,\text{fJ})(500\,\text{MHz}) = 0.65\,\mu\text{W}$$

$$E_{CLOCK} = \alpha \cdot \frac{1}{2} C V_{DD}^2 = (2)(5\,\text{fJ}) = 10\,\text{fJ}$$

$$P_{CLOCK} = E_{CLOCK} \cdot f = (10\,\text{fJ})(500\,\text{MHz}) = 5\,\mu\text{W}$$
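A minimal sketch of the per-flip-flop power estimate. The 500 MHz clock frequency is an assumption: it is the value that turns 1.3 fJ per cycle into 0.65 µW, and it is also consistent with the totals quoted later in the notes. The function and constant names are illustrative.

```python
ALPHA_DATA, ALPHA_CLOCK = 0.1, 2.0  # activity factors from the notes
F_CLK_MHZ = 500.0                   # ASSUMPTION: inferred clock frequency

def dynamic_power_uw(energy_per_toggle_fj, alpha, f_mhz=F_CLK_MHZ):
    """Average power in microwatts: P = alpha * E * f.

    fJ * MHz gives nW, so the result is scaled by 1e-3 to reach uW.
    """
    return alpha * energy_per_toggle_fj * f_mhz * 1e-3

p_data_uw = dynamic_power_uw(13.0, ALPHA_DATA)   # per flip-flop
p_clock_uw = dynamic_power_uw(5.0, ALPHA_CLOCK)  # per flip-flop
print(f"data {p_data_uw:.2f} uW, clock {p_clock_uw:.2f} uW")
```

Under these assumptions the clock lines of a single flip-flop burn several times the power of its data path, even though the clock switches less capacitance.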

CLOCK ENERGY/POWER IS SIGNIFICANTLY HIGHER THAN DATA ENERGY/POWER. LET'S ESTIMATE THE TOTAL DATA/CLOCK ENERGY FOR A SIMPLE PROCESSOR. HOW MANY FLIP-FLOPS ARE IN THE DATAPATH?



ROUGHLY 10 PIPELINE REGISTERS, BUT LET'S ROUND UP TO 16 TO ACCOUNT FOR OTHER STATE IN THE CONTROL UNIT, MEMORY INTERFACE, EXCEPTION PC, ETC. EACH REGISTER IS 32 BITS SO WE MIGHT ESTIMATE A TOTAL OF AROUND

16 × 32 = 512 BITS

NOTE THAT THIS IGNORES THE REGISTER FILE, WHICH IS DEFINITELY NOT NEGLIGIBLE

WITH 512 BITS THE TOTAL DATA AND CLOCK ENERGY/POWER IS 512× THE PER-BIT VALUES.

THIS IGNORES THE CLOCK TREE. LET'S TRY AND ESTIMATE THE POWER CONSUMED IN THE CLOCK TREE

16 PIPELINE REGISTERS, EACH IS 32 b

EACH ONE-BIT FLIP-FLOP ADDS 3C LOAD TO THE LAST INVERTER IN THE CLOCK TREE, FOR A TOTAL CLOCK LOAD OF 3×32=96C

HOW SHOULD WE SIZE THE INVERTERS IN THE CLOCK TREE TO REDUCE DELAY? NOTE THAT SKEW IS MUCH MORE IMPORTANT THAN ABSOLUTE DELAY, BUT WE STILL NEED A RELATIVELY FAST TREE TO AVOID VERY BAD SLEW RATES.

WE CAN USE LOGICAL EFFORT TO SIZE THE CLOCK TREE



$$\hat{f} = \sqrt[3]{F} = 5.4$$

$$\hat{D} = 3(5.4) + 3 = 19.2\,\tau$$

$$C_{IN,L3} = \frac{1}{5.4}\,96C = 17.8C$$

$$C_{IN,L2} = \frac{1}{5.4}\,(17.8C \times 4) = 13.2C$$

$$C_{IN,L1} = \frac{1}{5.4}\,(13.2C \times 4) = 9.8C$$
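The sizing procedure can be sketched as a small function that works backward from the 96C load. Two things here are assumptions, since the tree figure did not survive: the tree input is constrained to 10C (which is what a path effort of about 154 implies for a branching of 4 × 4), and the name `size_clock_tree` and its `fanouts` argument are illustrative.

```python
import math

def size_clock_tree(c_load, c_in, fanouts):
    """Size an inverter tree with logical effort (g = 1, p = 1 per stage).

    `fanouts[i]` is how many copies of the next element stage i drives;
    the last stage drives a single load of `c_load`. Caps are in units of C.
    Returns (path effort F, stage effort f_hat, delay in tau, input caps
    ordered from the root stage to the leaf stage).
    """
    n = len(fanouts)
    path_effort = math.prod(fanouts) * c_load / c_in  # F = G * B * H, G = 1
    f_hat = path_effort ** (1 / n)
    delay = n * f_hat + n
    caps, c_out = [], c_load
    for fanout in reversed(fanouts):
        c_stage = fanout * c_out / f_hat  # C_in = g * b * C_out / f_hat
        caps.append(c_stage)
        c_out = c_stage
    return path_effort, f_hat, delay, caps[::-1]

# ASSUMPTION: 10C at the tree input; L1 drives 4 L2s, each L2 drives 4 L3s.
F, f_hat, delay, caps = size_clock_tree(c_load=96, c_in=10, fanouts=[4, 4, 1])
print(f"F = {F:.1f}, f_hat = {f_hat:.2f}, D = {delay:.1f} tau")
print("input caps, root to leaf:", [round(c, 1) for c in caps])
```

Without intermediate rounding the stage effort comes out slightly below 5.4, so the capacitances differ a little from the hand-calculated 9.8C, 13.2C and 17.8C, and the root capacitance recovers the assumed 10C exactly.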

TOTAL SWITCHED CAP?

FOR SWITCHED CAPACITANCE WE CARE ABOUT THE GATE CAP AND THE PARASITIC CAP.

TOTAL $C_{SW}$ FOR 3 STAGE TREE IS 690C

$\log_4 F = 3.6$, SO THE OPTIMAL NUMBER OF STAGES FOR AN INVERTER CHAIN WITH PATH EFFORT ~154 IS 3-4 STAGES. LOOKS LIKE OUR 3 STAGE CLOCK TREE IS IN THE RIGHT BALLPARK
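The stage-count estimate can be checked with a minimal sketch. The path effort of 153.6 is an assumption (a branching of 16 times 96C/10C), chosen because it matches the value of about 154 in the notes; the function name is illustrative.

```python
import math

PATH_EFFORT = 153.6  # ASSUMPTION: B * H = 16 * (96C / 10C), i.e. about 154

def inverter_chain_delay(path_effort, n_stages):
    """Minimum delay (tau) of an n-stage inverter path: n * F**(1/n) + n."""
    return n_stages * path_effort ** (1 / n_stages) + n_stages

ideal_stages = math.log(PATH_EFFORT, 4)  # target stage effort of about 4
delays = {n: inverter_chain_delay(PATH_EFFORT, n) for n in range(2, 6)}
best_n = min(delays, key=delays.get)

print(f"log4(F) = {ideal_stages:.2f}")
for n, d in delays.items():
    print(f"{n} stages: {d:.1f} tau")
```

The delay curve is shallow around its minimum, which is why three and four stages end up within a few percent of each other.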

LET'S ROUND TO THE NEAREST WHOLE MULTIPLE OF THE MINIMUM INVERTER



WHAT ABOUT A FOUR-LEVEL CLOCK TREE?



SO OUR FOUR-LEVEL CLOCK TREE IS ~5% FASTER, BUT AT WHAT COST?

$$C_{IN,L3} = (1/3.52)\,96C = 27.3C \rightarrow 27C$$

$$C_{IN,L2} = (1/3.52)\,(27.3C \times 4) = 31C \rightarrow 30C$$

$$C_{IN,L1} = (1/3.52)\,(31C \times 4) = 35C \rightarrow 36C$$

$$C_{IN,L0} = (1/3.52)\,35C = 9.94C \rightarrow 9C$$

TOTAL SWITCHED CAP?

TOTAL $C_{SW}$ FOR 4 STAGE TREE IS 1194C

SO THE 4 STAGE TREE IS 5% FASTER BUT ~1.7× MORE ENERGY/POWER/AREA. LET'S STICK WITH THE THREE STAGE!
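A short sketch of the switched capacitance comparison. The four-stage sizes are the rounded values above; the three-stage sizes of 9C, 12C and 18C and the inverter counts per level are assumptions, chosen because they reproduce the 690C and 1194C totals. Each inverter is counted as its gate cap plus an equal parasitic cap.

```python
def tree_switched_cap(levels):
    """Total switched cap of an inverter tree, in units of C.

    `levels` is a list of (inverter_count, input_cap) pairs. Each inverter
    contributes its gate cap plus an equal parasitic cap (p_inv = 1).
    """
    return sum(count * 2 * c_in for count, c_in in levels)

# ASSUMPTION: rounded three-stage sizes, root to leaf.
three_stage = [(1, 9), (4, 12), (16, 18)]
# Rounded four-stage sizes from the notes, root to leaf.
four_stage = [(1, 9), (1, 36), (4, 30), (16, 27)]

c3 = tree_switched_cap(three_stage)
c4 = tree_switched_cap(four_stage)
print(c3, c4, round(c4 / c3, 2))
```

Most of the extra capacitance sits in the sixteen leaf inverters, which grow from 18C to 27C each when the stage effort drops.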

$P_{DATA} = 333\,\mu\text{W}$, $P_{CLOCK} = 2.6\,\text{mW}$

MAJORITY OF POWER IS SPENT IN THE LOCAL "INTRA-CELL" CLOCK LOGIC!

$P_{CLOCKTREE} = 173\,\mu\text{W}$

$P_{TOT} = 3.1\,\text{mW}$

CAN CLOCK GATING HELP?

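[FIGURE: CLOCK GATING CELL WITH GLOBAL CLOCK AND ENABLE INPUTS AND LOCAL CLOCK OUTPUT]

A GLITCH ON ENABLE IS PROPAGATED TO THE LOCAL CLOCK IF THERE IS NO LATCH; THE LATCH PREVENTS SUCH A SIGNAL FROM PROPAGATING TO THE LOCAL CLOCK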

THE FACT THAT THE LOCAL CLOCK IS THE COMPLEMENT OF THE GLOBAL CLOCK IS NOT A PROBLEM WITH OUR TRANSMISSION GATE FLIP-FLOPS.

CLOCK GATING REDUCES ACTIVITY ON L3 OF THE CLOCK TREE AND THE LOCAL INTRA-CELL CLOCKING LOGIC BUT ALSO ADDS EXTRA SWITCHED CAP WHEN THE REGISTER IS ENABLED.

CLOCK GATE SWITCHED CAP WHEN THE REGISTER IS ENABLED CONTINUOUSLY IS 24C

WHAT ABOUT WHEN DISABLED? ASSUME THE ENABLE SIGNAL IS CONTINUOUSLY SET TO ZERO. IT IS STILL 24C, BUT OF COURSE THE LOCAL CLOCK DOES NOT TOGGLE.

THE LOCAL CLOCK SWITCHED CAP IS 576C FOR THE FINAL INVERTERS IN THE CLOCK TREE PLUS AN ADDITIONAL 512 × 20C = 10,240C FOR ALL OF THE LOCAL INTRA-CELL CLOCK LOGIC.

|                          | DATA   | INTRA-CELL CLK | CLK TREE L1/L2 | CLK TREE L3 | GATES | TOTAL  |
|--------------------------|--------|----------------|----------------|-------------|-------|--------|
| NO CLK GATING            | 333 µW | 2.6 mW         | 28.5 µW        | 144 µW      | N/A   | 3.1 mW |
| W/ CLK GATING + ENABLED  | 333 µW | 2.6 mW         | 28.5 µW        | 144 µW      | 96 µW | 3.2 mW |
| W/ CLK GATING + DISABLED | 333 µW | 0 µW           | 28.5 µW        | 0 µW        | 96 µW | 458 µW |

IF WE NEVER ACTUALLY GATE ANY REGISTERS, WE ADD A POWER OVERHEAD OF ~3%. IN THE BEST CASE, WHEN EVERY REGISTER IS GATED, WE REDUCE TOTAL POWER FOR THE PIPELINE REGISTERS BY 85%
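The power totals and the two percentages can be reproduced with a minimal sketch. Several inputs are assumptions rather than values stated outright in the text: the 500 MHz clock, one 24C clock gate per 32-bit register (16 gates), and the split of the 690C tree into 576C for L3 and 114C for the upper levels. The function name `power_uw` is illustrative.

```python
C_UNIT_FF, VDD, F_MHZ = 0.5, 1.0, 500.0  # 500 MHz is an inferred ASSUMPTION

def power_uw(c_units, alpha):
    """alpha * 1/2 * C * Vdd^2 * f, returned in microwatts."""
    return alpha * 0.5 * c_units * C_UNIT_FF * VDD ** 2 * F_MHZ * 1e-3

data = power_uw(512 * 52, alpha=0.1)     # data path of all 512 flip-flops
intra_cell = power_uw(512 * 20, alpha=2) # local clock logic in the cells
tree_upper = power_uw(114, alpha=2)      # L1 + L2 inverters (690C - 576C)
tree_l3 = power_uw(576, alpha=2)         # final inverters in the tree
gates = power_uw(16 * 24, alpha=2)       # ASSUMPTION: one 24C gate per register

no_gating = data + intra_cell + tree_upper + tree_l3
enabled = no_gating + gates
disabled = data + tree_upper + gates     # L3 and intra-cell clocks stop

overhead = enabled / no_gating - 1
savings = 1 - disabled / no_gating
print(f"{no_gating:.0f} uW, {enabled:.0f} uW, {disabled:.0f} uW")
print(f"overhead {overhead:.1%}, savings {savings:.1%}")
```

The disabled total comes out a fraction of a microwatt below the 458 µW in the table because the table adds already rounded entries.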