

© Digital Integrated Circuits<sup>2nd</sup>

## Input Pattern Effects on Delay

- Delay depends on the pattern of inputs
- Low-to-high transition
  - both inputs go low
    - delay is $0.69\,(R_p/2)\,C_L$
  - one input goes low
    - delay is $0.69\,R_p C_L$
- High-to-low transition
  - both inputs go high
    - delay is $0.69\,(2R_n)\,C_L$
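The three cases above follow from a first-order RC model of a two-input NAND gate. A minimal sketch of that arithmetic is given here; the values of `R_P`, `R_N` and `C_L` are placeholder assumptions chosen only for illustration, not figures from the slides.

```python
# First-order delay of a 2-input NAND gate for each input pattern.
# R_P, R_N and C_L are illustrative placeholder values (assumptions).
R_P = 10e3    # ohms, on-resistance of one PMOS pull-up device
R_N = 5e3     # ohms, on-resistance of one NMOS pull-down device
C_L = 10e-15  # farads, load capacitance


def rc_delay(r_eq, c_load):
    """Propagation delay of a first-order RC network: 0.69 * R * C."""
    return 0.69 * r_eq * c_load


delays = {
    # Two PMOS devices conduct in parallel: R_p / 2.
    "low-to-high, both inputs go low": rc_delay(R_P / 2, C_L),
    # Only one PMOS device conducts: R_p.
    "low-to-high, one input goes low": rc_delay(R_P, C_L),
    # Two NMOS devices conduct in series: 2 * R_n.
    "high-to-low, both inputs go high": rc_delay(2 * R_N, C_L),
}

for pattern, t in delays.items():
    print(f"{pattern}: {t * 1e12:.1f} ps")
```

Whatever values are used, the single-input low-to-high case is twice as slow as the case where both inputs switch.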

#### **Delay Dependence on Input Patterns**



### **Transistor Sizing**



# Transistor Sizing a Complex CMOS Gate




#### **Fan-In Considerations**





Distributed RC model (Elmore delay)

$$t_{pHL} = 0.69\, R_{eqn}\,(C_1 + 2C_2 + 3C_3 + 4C_L)$$

Propagation delay deteriorates rapidly as a function of fan-in – quadratically in the worst case.
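The quadratic trend can be checked with a short sketch of the Elmore sum above, generalized from four series devices to N. The function name `elmore_tphl` and the unit values in the loop are assumptions made for illustration.

```python
def elmore_tphl(r_eqn, internal_caps, c_load):
    """Elmore estimate 0.69 * R_eqn * (C1 + 2*C2 + ... + N*C_L).

    internal_caps lists C1 .. C(N-1), ordered from the ground end of the
    series NMOS stack towards the output; every device has resistance r_eqn.
    """
    caps = list(internal_caps) + [c_load]
    # The k-th capacitance (1-based) discharges through k series devices.
    return 0.69 * r_eqn * sum((k + 1) * c for k, c in enumerate(caps))


# With equal unit capacitances the sum is N(N+1)/2, which grows quadratically.
for n in (2, 4, 8):
    t = elmore_tphl(1.0, [1.0] * (n - 1), 1.0)
    print(f"fan-in {n}: t_pHL = {t:.2f} (in units of R*C)")
```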

## t<sub>p</sub> as a Function of Fan-In




## t<sub>p</sub> as a Function of Fan-Out




## t<sub>p</sub> as a Function of Fan-In and Fan-Out

- Fan-in: quadratic due to increasing resistance and capacitance
- Fan-out: each additional fan-out gate adds two gate capacitances to C<sub>L</sub>

$$t_p = a_1 F I + a_2 F I^2 + a_3 F O$$




- Transistor sizing
  - as long as fan-out capacitance dominates
- Progressive sizing



**Distributed RC line** 

M1 > M2 > M3 > ... > MN (the FET closest to the output is the smallest)

Can reduce delay by more than 20%; decreasing gains as technology shrinks


- Alternative logic structures

F = ABCDEFGH








Isolating fan-in from fan-out using buffer insertion



Reducing the voltage swing

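With a full swing:

$$t_{pHL} = 0.69 \left( \frac{3}{4} \cdot \frac{C_L V_{DD}}{I_{DSATn}} \right)$$

With a reduced swing, $V_{DD}$ is replaced by $V_{swing}$:

$$t_{pHL} = 0.69 \left( \frac{3}{4} \cdot \frac{C_L V_{swing}}{I_{DSATn}} \right)$$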

- linear reduction in delay
- also reduces power consumption

- But the following gate is much slower!
- Or it requires "sense amplifiers" on the receiving end to restore the signal level (memory design)

## Sizing Logic Paths for Speed

- Frequently, input capacitance of a logic path is constrained
- Logic also has to drive some capacitance
- Example: ALU load in an Intel microprocessor is 0.5 pF
- How do we size the ALU datapath to achieve maximum speed?
- We have already solved this for the inverter chain – can we generalize it for any type of logic?





- For given *N*: $C_{i+1}/C_i = C_i/C_{i-1}$
- To find *N*: $C_{i+1}/C_i \sim 4$
- How to generalize this to any logic path?
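As a reminder of the inverter-chain result, the sketch below picks the number of stages so that the per-stage ratio is close to 4 and then sizes the stages geometrically. The function name `inverter_chain` and the example capacitances are assumptions, not part of the slides.

```python
import math


def inverter_chain(c_in, c_load, target_ratio=4.0):
    """Return (N, ratio, input capacitances) for a tapered inverter chain.

    N is chosen so that C_{i+1}/C_i is close to target_ratio; for that N
    every stage then uses the same ratio (geometric sizing).
    """
    overall = c_load / c_in
    n = max(1, round(math.log(overall, target_ratio)))
    ratio = overall ** (1.0 / n)
    sizes = [c_in * ratio ** i for i in range(n)]
    return n, ratio, sizes


# Hypothetical example: unit input capacitance driving a load of 64 units.
n, ratio, sizes = inverter_chain(1.0, 64.0)
print(n, round(ratio, 3), [round(s, 3) for s in sizes])
```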


## **Logical Effort**

$$Delay = k \cdot R_{unit} C_{unit} \left( 1 + \frac{C_L}{\gamma C_{in}} \right)$$
$$= \tau \left( p + g \cdot f \right)$$

- *p*: intrinsic delay ($3kR_{unit}C_{unit}$), a gate parameter $\neq f(W)$
- *g*: logical effort ($kR_{unit}C_{unit}$), a gate parameter $\neq f(W)$
- *f*: effective fanout

Normalize everything to an inverter:  $g_{inv} = 1, p_{inv} = 1$ 

Divide everything by $\tau_{inv}$ (everything is measured in unit delays $\tau_{inv}$).

Assume $\gamma = 1$.


### **Delay in a Logic Gate**



- Logical effort is a function of topology, independent of sizing
- Effective fanout (electrical effort) is a function of load/gate size




- Inverter has the smallest logical effort and intrinsic delay of all static CMOS gates
- Logical effort of a gate is the ratio of its input capacitance to the input capacitance of an inverter sized to deliver the same current
- Logical effort increases with the gate complexity



Logical effort is the ratio of input capacitance of a gate to the input capacitance of an inverter with the same output current




#### **Logical Effort of Gates**




## **Add Branching Effort**

Branching effort:

$$b = \frac{C_{on-path} + C_{off-path}}{C_{on-path}}$$






### Multistage Networks

$$Delay = \sum_{i=1}^{N} (p_i + g_i \cdot f_i)$$

Stage effort:  $h_i = g_i f_i$ 

Path electrical effort:  $F = C_{out}/C_{in}$ 

Path logical effort:  $G = g_1 g_2 \dots g_N$ 

Branching effort:  $B = b_1 b_2 \dots b_N$ 

Path effort: H = GFB

Path delay  $D = \Sigma d_i = \Sigma p_i + \Sigma h_i$ 
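The path quantities defined above can be collected in a few helper functions. This is a sketch in the notation of this slide (H = GFB, stage effort h = g·f); the function names and the two-stage example values are hypothetical.

```python
from math import prod


def branching_effort(c_on_path, c_off_path):
    """b = (C_on-path + C_off-path) / C_on-path."""
    return (c_on_path + c_off_path) / c_on_path


def path_effort(g, b, c_in, c_out):
    """Return (G, F, B, H) with H = G * F * B."""
    G = prod(g)        # path logical effort
    B = prod(b)        # path branching effort
    F = c_out / c_in   # path electrical effort
    return G, F, B, G * F * B


def path_delay(p, g, f):
    """Delay = sum over stages of (p_i + g_i * f_i), in units of tau_inv."""
    return sum(p_i + g_i * f_i for p_i, g_i, f_i in zip(p, g, f))


# Hypothetical two-stage path: inverter -> 2-input NAND, with the first
# stage also driving an equal off-path load (b = 2).
G, F, B, H = path_effort(g=[1, 4 / 3],
                         b=[branching_effort(1.0, 1.0), 1.0],
                         c_in=1.0, c_out=6.0)
print(G, F, B, H)
```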


## **Optimum Effort per Stage**

When each stage bears the same effort:

$$h^{N} = H$$
$$h = \sqrt[N]{H}$$

Stage efforts:  $g_1f_1 = g_2f_2 = \dots = g_Nf_N$ 

Effective fanout of each stage:  $f_i = h/g_i$ 

Minimum path delay

$$\hat{D} = \sum (g_i f_i + p_i) = NH^{1/N} + P$$
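A minimal sketch of these formulas: given the per-stage logical efforts, the intrinsic delays and the path effort H, it returns the common stage effort, the per-stage effective fanouts and the minimum delay. The function name and the three-stage numbers are assumptions for illustration only.

```python
def minimum_path_delay(g, p, path_effort_h):
    """Return (h, fanouts, D_hat) when every stage bears the same effort.

    h = H**(1/N), f_i = h / g_i, D_hat = N * h + sum(p_i).
    """
    n = len(g)
    h = path_effort_h ** (1.0 / n)
    fanouts = [h / g_i for g_i in g]
    d_hat = n * h + sum(p)
    return h, fanouts, d_hat


# Hypothetical three-stage path with H = 27 (values chosen for round numbers).
h, fanouts, d_hat = minimum_path_delay(g=[1.0, 1.5, 1.0],
                                       p=[1.0, 2.0, 1.0],
                                       path_effort_h=27.0)
print(round(h, 3), [round(f, 3) for f in fanouts], round(d_hat, 3))
```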



## **Optimal Number of Stages**

For a given load and a given input capacitance of the first gate, find the optimal number of stages and the optimal sizing.


$$D = NH^{1/N} + Np_{inv}$$
$$\frac{\partial D}{\partial N} = -H^{1/N} \ln(H^{1/N}) + H^{1/N} + p_{inv} = 0$$

Substitute 'best stage effort'  $h = H^{1/\hat{N}}$ 
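With $h = H^{1/\hat{N}}$ the condition above becomes $h(1 - \ln h) + p_{inv} = 0$, which has no closed-form solution but is easy to solve numerically. The bisection sketch below is one way to do it; the function name, the bracket `[e, 100]` and the tolerance are assumptions.

```python
import math


def best_stage_effort(p_inv, lo=math.e, hi=100.0, tol=1e-9):
    """Solve h * (1 - ln h) + p_inv = 0 for h by bisection.

    The residual equals p_inv >= 0 at h = e and decreases for h > 1,
    so the root lies in [e, hi] for any modest non-negative p_inv.
    """
    def residual(h):
        return h * (1.0 - math.log(h)) + p_inv

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for p in (0.0, 1.0):
    print(f"p_inv = {p}: best stage effort h = {best_stage_effort(p):.2f}")
```

For $p_{inv} = 0$ the root is $e$; for $p_{inv} = 1$ it is close to 3.6, which is why a stage effort of about 4 is a common rule of thumb.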




## **Logical Effort**

| Gate type   | 1 input | 2 inputs | 3 inputs | n inputs   |
|-------------|---------|----------|----------|------------|
| Inverter    | 1       |          |          |            |
| NAND        |         | 4/3      | 5/3      | (n + 2)/3  |
| NOR         |         | 5/3      | 7/3      | (2n + 1)/3 |
| Multiplexer |         | 2        | 2        | 2          |
| XOR         |         | 4        | 12       |            |

From Sutherland, Sproull
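The closed-form entries for NAND and NOR gates in the table can be written as two small functions; the function names are hypothetical, and exact fractions are used so the results match the table directly.

```python
from fractions import Fraction


def logical_effort_nand(n):
    """Logical effort of an n-input NAND gate: (n + 2) / 3."""
    return Fraction(n + 2, 3)


def logical_effort_nor(n):
    """Logical effort of an n-input NOR gate: (2n + 1) / 3."""
    return Fraction(2 * n + 1, 3)


for n in (2, 3, 4):
    print(n, logical_effort_nand(n), logical_effort_nor(n))
```

Both formulas give 1 for n = 1, which is the inverter.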



#### **Example: Optimize Path**



- Effective fanout: $F = 5$
- $G = 25/9$
- $H = GF = 125/9 = 13.9$
- $h = H^{1/4} = 1.93$
- $a = 1.93$
- $b = ha/g_2 = 2.23$
- $c = hb/g_3 = 5g_4/h = 2.59$
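The numbers in this example can be reproduced with a few lines of arithmetic. The per-stage logical efforts `g = [1, 5/3, 5/3, 1]` are an assumption inferred from G = 25/9 and the quoted values of a, b and c (the circuit figure is not available here); the first gate is taken to have unit input capacitance.

```python
F = 5.0                        # path electrical effort (load / input capacitance)
g = [1.0, 5 / 3, 5 / 3, 1.0]   # assumed per-stage logical efforts (see lead-in)

G = g[0] * g[1] * g[2] * g[3]  # path logical effort, 25/9
H = G * F                      # path effort, no branching
h = H ** (1 / len(g))          # optimum stage effort for four stages

c1 = 1.0                       # input capacitance of the first gate (unit)
a = h * c1 / g[0]              # input capacitance of stage 2
b = h * a / g[1]               # input capacitance of stage 3
c = h * b / g[2]               # input capacitance of stage 4

# Cross-check from the output side: the last stage drives a load of F units.
c_from_load = F * g[3] / h

print(f"H = {H:.1f}, h = {h:.2f}, a = {a:.2f}, b = {b:.2f}, c = {c:.2f}")
```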


#### **Example: Optimize Path**



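The same example in the notation of Sutherland, Sproull, and Harris:

- Effective fanout: $H = 5$
- $G = 25/9$
- $F = GH = 125/9 = 13.9$
- $f = F^{1/4} = 1.93$
- $a = 1.93$
- $b = fa/g_2 = 2.23$
- $c = fb/g_3 = 5g_4/f = 2.59$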


#### **Example – 8-input AND**



## Method of Logical Effort

- Compute the path effort: F = GBH
- Find the best number of stages: N ~ log<sub>4</sub>F
- Compute the stage effort: f = F<sup>1/N</sup>
- Sketch the path with this number of stages
- Working from either end, find the sizes: C<sub>in</sub> = C<sub>out</sub> · g/f
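The steps above can be sketched as a single function that works backward from the load. It uses the notation of this slide (F = GBH, stage effort f), assumes no branching (B = 1), and keeps the given gate sequence rather than inserting or removing stages; the function name and the example path are hypothetical.

```python
import math


def logical_effort_sizing(g, c_in, c_out):
    """Size a fixed sequence of gates for minimum delay (no branching).

    Returns (n_best, f, caps): the suggested stage count ~ log4(F), the
    stage effort used for the given len(g) stages, and the input
    capacitance of each stage found from C_in = C_out * g / f.
    """
    G = math.prod(g)               # path logical effort
    H = c_out / c_in               # path electrical effort
    F = G * H                      # path effort with B = 1 (assumption)
    n_best = max(1, round(math.log(F, 4)))
    f = F ** (1.0 / len(g))        # stage effort for the stages actually present

    caps = []
    c = c_out
    for g_i in reversed(g):        # walk from the load towards the input
        c = c * g_i / f
        caps.append(c)
    caps.reverse()
    return n_best, f, caps


# Hypothetical four-stage path driving a load five times its input capacitance.
n_best, f, caps = logical_effort_sizing([1.0, 5 / 3, 5 / 3, 1.0], 1.0, 5.0)
print(n_best, round(f, 2), [round(c, 2) for c in caps])
```

The first entry of `caps` reproduces the specified input capacitance, which is a convenient sanity check on the sizing.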

Reference: Sutherland, Sproull, Harris, "Logical Effort," Morgan Kaufmann, 1999.



#### Table 4: Key Definitions of Logical Effort

| Term              | Stage expression             | Path expression                             |
|-------------------|------------------------------|---------------------------------------------|
| Logical effort    | $g$ (see Table 1)            | $G = \prod g_i$                             |
| Electrical effort | $h = \frac{C_{out}}{C_{in}}$ | $H = \frac{C_{out (path)}}{C_{in (path)}}$  |
| Branching effort  | n/a                          | $B = \prod b_i$                             |
| Effort            | $f = gh$                     | $F = GBH$                                   |
| Effort delay      | $f$                          | $D_F = \sum f_i$                            |
| Number of stages  | 1                            | $N$                                         |
| Parasitic delay   | $p$ (see Table 2)            | $P = \sum p_i$                              |
| Delay             | $d = f + p$                  | $D = D_F + P$                               |

Sutherland, Sproull, Harris


