Page 1: VLSI Design

Module 1 : Introduction to VLSI Design Lecture 1 : Motivation of the Course Objectives In this lecture you will learn the following:

• Motivation of the course • Course Objectives

1.1 Motivation of the course

Why do some circuits work the first time, while others take over a year and multiple design iterations to work properly? Why can production quantities be ramped up easily for some circuits, while for others both circuit and process optimisation is needed? Why do some circuits run red-hot and require expensive cooling solutions, while others deliver similar performance running from small batteries in hand-held gadgets? Why do some companies make money with successful innovations, while others lose hundreds of millions of dollars of revenue simply because they did not get their product to market in time?

The answer to these questions is (a lack of) system engineering: analysis and design of a system's relevant electrical parameters. Deep submicron CMOS technologies have moved the bottleneck from device- and gate-level issues to interconnect and communication (metal wire) bottlenecks, for which we currently have no design automation. This course aims to provide a working knowledge of chip-level system electrical issues, so that these new bottlenecks can be removed or lived with, and so that design disasters can be avoided through proper structures and performance budgeting.

1.2 Course Objectives

The course provides final-year undergraduates with a solid, fundamental engineering view of digital system operation and of how to systematically design well-performing digital VLSI systems that consistently exceed customer expectations and competitor fears. The aim is to teach the critical methods and circuit structures needed to identify the key 1% of on-chip circuitry that dominates the performance, reliability, manufacturability, and cost of a VLSI circuit.
With current deep submicron CMOS technologies (0.25 micron and below design rules), the major design paradigm shift is that the interconnections (metal Al or Cu wires connecting gates) and chip communication in general, rather than the active transistors or logic gates, are the main design object. The main design issues defining the make-or-break point in each project are associated with power and signal distribution and with bit/symbol communication between functional blocks on-chip and off-chip. The course provides a solid framework for understanding:

- Scaling of technology and its impact on interconnects
- Interconnects as design objects
- Noise in digital systems and its impact on system operation
- Power distribution schemes for low noise
- Signal and signalling conventions for on-chip and off-chip communication
- Timing and synchronisation for fundamental operations and signalling

The course objective is to provide the student with a solid understanding of the underlying mechanisms and solution techniques for the above key design issues, so that the student, when working as an industrial designer, is capable of identifying the key problem points, focusing his creative attention and 90% of available resources on the right issues for the critical 1% of the circuitry, and leaving the remaining 99% of the circuitry to computer-automated tools or less specialised engineers.

Recap: In this lecture you have learnt the following

• Motivation of the course • Course Objectives

Module 3 : Fabrication Process and Layout Design Rules
Lecture 10 : General Aspects of CMOS Technology

Objectives: In this lecture you will learn the following

• Gate Material • Parasitic Capacitances • Self-aligned silicon gate technology • Channel Stopper • Polysilicon deposition • Oxide Growth • Active mask or Isolation mask (thin-ox)

10.1 Gate Material

Metals have several advantages when considered as gate electrodes. The use of metal gates would certainly eliminate the problems of dopant penetration through the dielectric and the subsequent gate depletion. The use of metals with appropriate work functions for NMOS and PMOS devices would lead to transistors with symmetrical and tailored threshold voltages. Most refractory metals are good choices for this application, primarily because their high melting points allow them to be used at the high temperatures necessary for source-drain implant activation. However, the thermodynamic stability of metal-dielectric interfaces at processing temperatures is a major concern, which needs to be addressed in addition to more subtle issues of electrical properties: flat-band voltage (ultimately threshold voltage) stability and charge trapping at the interface. The problem with using aluminium is that, once deposited, it cannot be subjected to high-temperature processes. Copper causes a lot of trap generation when used as a gate material.

10.2 Parasitic Capacitances

Figure 10.2: Parasitic capacitances in MOSFET

Though many parasitic capacitances exist in a MOSFET, as shown in figure 10.2, those of prime concern to us are the gate-to-drain capacitance (Cgd) and the gate-to-source capacitance (Cgs), because they couple the input and output nodes, and Cgd gets multiplied by the gain during circuit operation. They therefore increase the input capacitance drastically and decrease the charging rate.

10.3 Self-aligned Silicon Gate Technology

Figure 10.3: Cross sectional view of MOSFET under Selfalgining process

When metal is used as the gate material, the source and drain are formed before the gate; mask aligners must then be used to align the gate, and alignment errors occur. In the polysilicon gate process, the exposed gate oxide (not covered by polysilicon) is etched away and the wafer is subjected to a dopant source or ion implant, which forms the source and drain. Since these form only in the regions not covered by polysilicon, the source and drain do not extend under the gate. This is called a self-aligning process.

10.4 Channel Stopper

A channel stopper is used to prevent channel formation in the substrate below the field oxide. For example, for a p-substrate, the channel stopper implant would be p+, which increases the magnitude of the threshold voltage there. Irregular surfaces can cause "step coverage problems", in which a conductor thins and can even break as it crosses a thick-to-thin oxide boundary. One method used to remove these irregularities is to pre-etch the silicon, in the areas where the field oxide is to be grown, by around half the final required field oxide thickness. LOCOS oxidation (explained shortly) done after this gives a planar field oxide/gate oxide interface.

10.5 Polysilicon Deposition

The resistivity of undoped polysilicon is about 10^8 ohm-cm, and it can be reduced to around 30 ohm-cm by heavy doping. The advantage of using polysilicon as the gate material is its use as a further mask, allowing precise definition of the source and drain. The polysilicon resistance adds to the input resistance of the transistor and should therefore be small, to improve the RC time constant; for this, a high doping concentration is used.

10.6 Oxide Growth

Figure 10.6: Formation of bird's beak in MOSFET

Oxide grown on silicon may result in an uneven surface, due to the unequal thickness of oxide grown from the same thickness of silicon. Stress along the edge of an oxidized area (where the silicon has been trenched prior to oxidation to produce a planar surface) may produce severe damage in the silicon. To relieve this stress, the oxidation temperature must be high enough to allow the stress in the oxide to be relieved by viscous flow. In the LOCOS process, the transistor area is masked by an SiO2/SiN sandwich and the thick field oxide is then grown. The oxide grows both vertically and laterally under the sandwich, resulting in an encroachment into the gate region called the bird's beak. This reduces the active area of the transistor, especially its width. Some improvements to the LOCOS process produce a bird's crest, which reduces the encroachment but is non-uniform.

Figure 10.62: Comparison of the LOCOS process with and without some sacrificial polysilicon

The goal is to oxidize Si only locally, wherever a field oxide is needed. This is necessary for the following reasons:

- Local oxide penetrates into the Si, so the Si-SiO2 interface lies lower than the source-drain regions to be made later. This could not be achieved by oxidizing all of the Si and then etching off the unwanted oxide.
- For device performance reasons, this is highly beneficial, if not absolutely necessary.

10.7 Active Mask or Isolation Mask (thin-ox)

This mask describes the areas where thin oxide is needed to implement the transistor gates and to allow implantations to form p/n-type diffusions. A thin layer of SiO2 is grown and covered with SiN, and this is used as the mask. The bird's beak must be taken into account while designing the thin-ox.

Recap: In this lecture you have learnt the following

• Gate Material • Parasitic Capacitances • Self-aligned silicon gate technology • Channel Stopper • Polysilicon deposition • Oxide Growth • Active mask or Isolation mask (thin-ox)

Congratulations, you have finished Lecture 10.

Module 3 : Fabrication Process and Layout Design Rules
Lecture 11 : General Aspects of CMOS Technology (contd...)

Objectives: In this lecture you will learn the following

• Why polysilicon is preferred over aluminium as gate material • Channel stopper implant • Local Oxidation of Silicon (LOCOS)

11.1 Why is polysilicon preferred over aluminium as gate material?

Because:

Figure 11.11: Self-alignment is not possible in the case of an Al gate, due to Cgd and Cgs

Figure 11.12: Self-alignment is possible in the case of polysilicon

1. Penetration of the silicon substrate: If aluminium metal is deposited as the gate, we cannot raise the temperature beyond 500 degrees Celsius, because aluminium will then start penetrating the silicon substrate and act as a p-type impurity.

2. Problem of non-self-alignment: With an aluminium gate, we have to create the source and drain first and the gate afterwards. We cannot do the reverse, because diffusion is a high-temperature process. This ordering creates parasitic overlap input capacitances Cgd and Cgs (figure 11.11). Cgd is more harmful because it is a feedback capacitance and is therefore reflected at the input magnified by (k+1) times (recall Miller's theorem), where k is the gain. So if aluminium is used, the input capacitance increases unnecessarily, which further increases the charging time of the input capacitance.
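To put rough numbers on the Miller magnification and the resulting charging delay, here is a minimal sketch. All values (Cgs, Cgd, gain, driver resistance) are hypothetical, chosen only for illustration:

```python
import math

def miller_input_capacitance(cgs, cgd, gain):
    """Effective input capacitance when the feedback capacitance Cgd
    is reflected at the input magnified by (k + 1), k = gain."""
    return cgs + (gain + 1) * cgd

def time_to_reach(fraction, r, c):
    """Time for an RC-charged node to reach `fraction` of its final value:
    V(t) = V_final * (1 - exp(-t/RC))  =>  t = RC * ln(1 / (1 - fraction))."""
    return r * c * math.log(1.0 / (1.0 - fraction))

# Hypothetical values: Cgs = 2 fF, Cgd = 0.5 fF, gain k = 10, R = 300 ohm.
c_in = miller_input_capacitance(cgs=2e-15, cgd=0.5e-15, gain=10)
assert abs(c_in - 7.5e-15) < 1e-21   # 2 fF + 11 * 0.5 fF

# With self-alignment (Cgd ~ 0) the input charges noticeably faster:
t_overlap = time_to_reach(0.9, 300.0, c_in)
t_aligned = time_to_reach(0.9, 300.0, 2e-15)
assert t_aligned < t_overlap
```

The point of the sketch is only the scaling: a small Cgd, amplified by (k+1), can dominate the input capacitance and hence the charging time.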

Therefore the output does not appear immediately. If polysilicon is used instead, it is possible to create the gate first and the source and drain implants afterwards, which eliminates the problem of the overlap capacitances Cgd and Cgs. The resistivity of undoped polysilicon is about 10^8 ohm-cm, so we dope the polysilicon heavily so that it behaves more like a metal such as Al, reducing its resistance to around 100-300 ohm (although this is still greater than for Al). The voltage on a capacitance charged through a resistance R rises as 1 - e^(-t/RC), so the charging time scales with the time constant RC. Since resistance is directly proportional to length, polysilicon runs should be kept short so that the resistance does not become large; otherwise the whole purpose of decreasing C (and hence the time constant RC) is nullified.

11.2 Channel Stopper Implant

As we know, millions of transistors are fabricated on a single chip. To separate (insulate) them from each other, we grow thick oxides (called field oxides). However, at very high voltages, inversion may still set in below the field oxide, despite the large thickness of these oxides.

Figure 11.21: Channel stopper implant (yellow region) before the field oxide is grown

To avoid this problem, we do an implant in this region before growing the field oxide layer so that threshold voltage for this region is much greater than that for the desired active transistor channel region. This implant layer is called channel stopper implant. (as shown in figure 11.21) 11.3 Local Oxidation of Silicon (LOCOS)

Figure 11.31: Formation of LOCOS

Creation of LOCOS: During etching, anything irregular becomes more irregular. So the field oxide is grown such that roughly 50% of it lies above and 50% below the original wafer surface. This is called LOCal Oxidation of Silicon (LOCOS).

Figure 11.32: bird's beak

0.45 µm of silicon, when oxidized, becomes 1 µm of SiO2 because of the change in density. When the field oxides are grown, the oxide layer encroaches into the active transistor region below the gate oxide, because of the affinity of the SiO2 gate oxide for oxygen. The resulting structure resembles a bird's beak (as shown in figure 11.32). This affects the device performance.
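The thickness bookkeeping in the first statement can be captured in a one-line model. The 0.45 ratio is the figure quoted in the text (real processes quote roughly 0.44-0.46), and it also explains the "50% above, 50% below" LOCOS picture:

```python
SI_CONSUMED_PER_UM_OXIDE = 0.45  # from the text: 0.45 um of Si -> 1 um of SiO2

def silicon_consumed(oxide_um):
    """Silicon thickness consumed when growing `oxide_um` of thermal oxide."""
    return SI_CONSUMED_PER_UM_OXIDE * oxide_um

def oxide_above_surface(oxide_um):
    """Portion of the grown oxide that ends up above the original Si surface."""
    return oxide_um - silicon_consumed(oxide_um)

# Growing 1 um of field oxide consumes 0.45 um of silicon, so roughly half
# of the oxide sits below the original surface and half above it.
assert silicon_consumed(1.0) == 0.45
assert abs(oxide_above_surface(1.0) - 0.55) < 1e-12
```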

Figure 11.33: bird's crest

If we use Si3N4 as the gate dielectric, it will not let oxygen pass through. But the thermal expansion coefficients of Si and Si3N4 are mismatched, and the resulting stress produces a nonplanar structure called a bird's crest (as shown in figure 11.33). The thermal expansion coefficients of Si and SiO2 match. So when Si3N4 is used as the gate dielectric, we first grow a thin oxide layer underneath it; the stress that would otherwise be generated on account of the coefficient mismatch is thereby reduced. However, since SiO2 is now present, a bird's beak will again be formed.

Recap: In this lecture you have learnt the following

• Why polysilicon is preferred over aluminium as gate material • Channel stopper implant • Local Oxidation of Silicon (LOCOS)

Congratulations, you have finished Lecture 11.

Module 3 : Fabrication Process and Layout Design Rules
Lecture 12 : CMOS Fabrication Technologies

Objectives: In this lecture you will learn the following

• Introduction • Twin Well/Tub Technology • Silicon on Insulator (SOI) • N-well/P-well Technology

12.1 Introduction

CMOS fabrication can be accomplished using any one of three technologies:

• N-well/P-well technologies • Twin well technology • Silicon On Insulator (SOI)

In this discussion we will focus chiefly on N-well CMOS fabrication technology.

12.2 Twin Well Technology

Using twin well technology, we can optimise NMOS and PMOS transistors separately. This means that transistor parameters such as threshold voltage, body effect and channel transconductance can be tuned independently for both types of transistor. An n+ or p+ substrate, with a lightly doped epitaxial layer on top, forms the starting material for this technology. The n-well and p-well are formed in this epitaxial layer, which forms the actual substrate. The dopant concentrations can be carefully optimized to produce the desired device characteristics, because two independent doping steps are performed to create the well regions. The conventional n-well CMOS process suffers from, among other effects, the problem of unbalanced drain parasitics, since the doping density of the well region is typically about one order of magnitude higher than that of the substrate. This problem is absent in the twin-tub process.

12.3 Silicon on Insulator (SOI)

To improve process characteristics such as speed and latch-up susceptibility, technologists have sought to use an insulating substrate instead of silicon as the substrate material. Completely isolated NMOS and PMOS transistors can be created virtually side by side on an insulating substrate (e.g. sapphire) using SOI CMOS technology.

This technology offers advantages in the form of higher integration density (because of the absence of well regions), complete avoidance of the latch-up problem, and lower parasitic capacitances compared to conventional n-well or twin-tub CMOS processes. But it comes at a higher cost than the standard n-well CMOS process. Still, the improvements in device performance and the absence of latch-up problems can justify its use, especially in deep submicron devices.

12.4 N-well Technology

In this discussion we will concentrate on the well-established n-well CMOS fabrication technology, in which both n-channel and p-channel transistors are built on the same chip substrate. To accommodate this, special regions are created with a semiconductor type opposite to the substrate type. The regions thus formed are called wells or tubs. In an n-type substrate we can create a p-well, or alternatively an n-well is created in a p-type substrate. We present here a simple n-well CMOS fabrication technology, in which the NMOS transistor is created in the p-type substrate and the PMOS in the n-well, which is built into the p-type substrate. Historically, fabrication started with p-well technology, but it has now shifted almost entirely to n-well technology. The main reason for this is that n-well sheet resistance can be made lower than p-well sheet resistance (electrons are more mobile than holes). The simplified process sequence (shown in Figure 12.41) for the fabrication of CMOS integrated circuits on a p-type silicon substrate is as follows:

• N-well regions are created for PMOS transistors, by impurity implantation into the substrate.

• This is followed by the growth of a thick oxide in the regions surrounding the NMOS and PMOS active regions.

• The thin gate oxide is subsequently grown on the surface through thermal oxidation.

• After this n+ and p+ regions (source, drain and channel-stop implants) are created.

• The metallization step (creation of metal interconnects) forms the final step in this process.

Fig 12.41: Simplified Process Sequence For Fabrication Of CMOS ICs

The integrated circuit may be viewed as a set of patterned layers of doped silicon, polysilicon, metal and insulating silicon dioxide, since each processing step requires that certain areas are defined on chip by appropriate masks. A layer is patterned before the next layer of material is applied on the chip. A process, called lithography, is used to transfer a pattern to a layer. This must be repeated for every layer, using a different mask, since each layer has its own distinct requirements. We illustrate the fabrication steps involved in patterning silicon dioxide through optical lithography, using Figure 12.42 which shows the lithographic sequences.


Fig 12.42: Process steps required for patterning of silicon dioxide

First an oxide layer is created on the substrate by thermal oxidation of the silicon surface. This oxide surface is then covered with a layer of photoresist. Photoresist is a light-sensitive, acid-resistant organic polymer which is initially insoluble in the developing solution. Some areas of the surface are covered with a mask during exposure, to expose the photoresist selectively: on exposure to ultraviolet (UV) light, the masked areas remain shielded, whereas the unshielded areas become soluble and can be removed by the developing solution. There are two types of photoresist, positive and negative. Positive photoresist is initially insoluble but becomes soluble after exposure to UV light, whereas negative photoresist is initially soluble but becomes insoluble (hardened) after exposure to UV light. The process sequence described here uses positive photoresist.

Negative photoresists are more sensitive to light, but their photolithographic resolution is not as high as that of positive photoresists. Hence the use of negative photoresists is less common in the manufacture of high-density integrated circuits. The unexposed portions of the photoresist can be removed by a solvent after the UV exposure step. The silicon dioxide regions not covered by the hardened photoresist are etched away using a chemical solvent (HF acid) or a dry-etch (plasma etch) process. On completion of this step, we are left with an oxide window reaching down to the silicon surface. Another solvent is used to strip the remaining photoresist from the silicon dioxide surface. The patterned silicon dioxide feature is shown in Figure 12.43.

Fig 12.43: The result of a single photolithographic patterning sequence on silicon dioxide

The sequence of process steps illustrated in detail actually accomplishes a single pattern transfer onto the silicon dioxide surface. The fabrication of semiconductor devices requires several such pattern transfers to be performed on silicon dioxide, polysilicon, and metal. The basic patterning process used in all fabrication steps, however, is quite similar to the one described earlier. Also note that for accurate generation of high-density patterns required in submicron devices, electron beam (E-beam) lithography is used instead of optical lithography. In this section, we will examine the main processing steps involved in fabrication of an n-channel MOS transistor on a p-type silicon substrate.

The first step of the process is the oxidation of the silicon substrate (Fig 12.44(a)), which creates a relatively thick silicon dioxide layer on the surface. This oxide layer is called field oxide (Fig. 12.44(b)). The field oxide is then selectively etched to expose the silicon surface on which the transistor will be created (Fig. 12.44(c)). After this the surface is covered with a thin, high-quality oxide layer. This oxide layer will form the gate oxide of the MOS transistor (Fig. 12.44(d)). Then a polysilicon layer is deposited on the thin oxide (Fig 12.44(e)). Polysilicon is used as both a gate electrode material for MOS transistors as well as an interconnect medium in silicon integrated circuits. The resistivity of polysilicon, which is usually high, is reduced by doping it with impurity atoms. Deposition is followed by patterning and etching of polysilicon layer to form the interconnects and the MOS transistor gates (Fig. 12.44(f)). The thin gate oxide not masked by polysilicon is also etched away exposing the bare silicon surface. The drain and source junctions are to be formed (Fig 12.44(g)). Diffusion or ion implantation is used to dope the entire silicon surface with a high concentration of impurities (in this case donor atoms to produce n-type doping). Fig 12.44(h) shows two n-type regions (source and drain junctions) in the p-type substrate as doping penetrates the exposed areas of the silicon surface. The penetration of impurity doping into the polysilicon reduces its resistivity. The polysilicon gate is patterned before the doping and it precisely defines the location of the channel region and hence, the location of the source and drain regions. Hence this process is called a self-aligning process. The entire surface is again covered with an insulating layer of silicon dioxide after the source and drain regions are completed (Fig 12.44(i)). Next contact windows for the source and drain are patterned into the oxide layer (Fig. 12.44(j)). 
Interconnects are formed by evaporating aluminium on the surface (Fig 12.44(k)), which is followed by patterning and etching of the metal layer (Fig 12.44(l)). A second or third layer of metallic interconnect can also be added after adding another oxide layer, cutting (via) holes, depositing and patterning the metal.
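The flow just described can be summarized compactly. The step names below are a paraphrase of the text, not official process nomenclature; the list simply encodes the ordering, whose essential property is that the gate is patterned before the source/drain doping — which is exactly what makes the process self-aligning:

```python
# Compact summary of the self-aligned NMOS process flow described above.
nmos_process = [
    "grow field oxide",
    "etch field oxide to open active area",
    "grow thin gate oxide",
    "deposit polysilicon",
    "pattern polysilicon gate",
    "etch exposed gate oxide",
    "implant/diffuse source and drain",
    "deposit insulating oxide",
    "open contact windows",
    "deposit and pattern metal",
]

# The gate itself masks the channel during doping (self-alignment):
assert (nmos_process.index("pattern polysilicon gate")
        < nmos_process.index("implant/diffuse source and drain"))
```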


Fig 12.44: Process flow for the fabrication of an n-type MOSFET on p-type silicon

We now return to the generalized fabrication sequence of n-well CMOS integrated circuits. The following figures illustrate some of the important process steps in the fabrication of a CMOS inverter, with a top view of the lithographic masks and a cross-sectional view of the relevant areas. The n-well CMOS process starts with a moderately doped p-type silicon substrate (impurity concentration typically less than 10^15 cm^-3). Then an initial oxide layer is grown on the entire surface. The first lithographic mask defines the n-well region. Donor atoms, usually phosphorus, are implanted through this window in the oxide. Once the n-well is created, the active areas of the nMOS and pMOS transistors can be defined.

The creation of the n-well region is followed by the growth of a thick field oxide in the areas surrounding the transistor active regions, and of a thin gate oxide on top of the active regions. The two most critical fabrication parameters are the thickness and quality of the gate oxide: these strongly affect the operational characteristics of the MOS transistor, as well as its long-term stability. Chemical vapour deposition (CVD) is used to deposit the polysilicon layer, which is then patterned by dry (plasma) etching. The resulting polysilicon lines function as the gate electrodes of

the nMOS and the pMOS transistors and their interconnects. The polysilicon gates also act as self-aligned masks for source and drain implantations. The n+ and p+ regions are implanted into the substrate and into the n-well using a set of two masks. Ohmic contacts to the substrate and to the n-well are also implanted in this process step.

CVD is again used to deposit an insulating silicon dioxide layer over the entire wafer. After this, the contacts are defined and etched away, exposing the silicon or polysilicon contact windows. These contact windows are essential for completing the circuit interconnections using the metal layer, which is patterned in the next step.

Metal (aluminium) is deposited over the entire chip surface by metal evaporation, and the metal lines are patterned by etching. Since the wafer surface is non-planar, the quality and integrity of the metal lines created in this step are very critical and ultimately essential for circuit reliability. The composite layout and the resulting cross-sectional view of the chip show one nMOS and one pMOS transistor (built in the n-well), together with the polysilicon and metal interconnections. The final step is to deposit the passivation layer (for protection) over the chip, except over the wire-bonding pad areas. This completes the fabrication of the CMOS inverter using n-well technology.

Recap: In this lecture you have learnt the following

• Motivation • N-well / P-well Technologies • Silicon on Insulator (SOI) • Twin well Technology

Congratulations, you have finished Lecture 12.

Module 3 : Fabrication Process and Layout Design Rules
Lecture 13 : Layout Design Rules

Objectives: In this lecture you will learn the following

• Motivation • Types of Design Rules • Layer Representations • Stick Diagrams

13.1 Motivation

In VLSI design, as fabrication processes become more and more complex, it is troublesome for the designer to understand the intricacies of the process and to interpret the relations between the different photo masks. Therefore a set of layout rules, also called design rules, has been defined. They act as an interface or communication link between the circuit designer and the process engineer during the manufacturing phase. The objective of layout rules is to obtain a circuit with optimum yield (functional versus non-functional circuits) in as small an area as possible, without compromising the reliability of the circuit. Design rules can be conservative or aggressive, depending on whether yield or performance is desired; generally they are a compromise between the two. Manufacturing processes have inherent limitations in accuracy, so the need for design rules arises from manufacturing problems such as:

• Photoresist shrinkage and tearing. • Variations in material deposition, temperature and oxide thickness. • Impurities. • Variations across a wafer.

These lead to various problems:

• Transistor problems: variations in threshold voltage (due to variations in oxide thickness, ion implantation and the poly layer); changes in source/drain diffusion overlap; variations in the substrate.

• Wiring problems: variations in doping of diffusion, resulting in variations in resistance and capacitance; variations in the height and width of poly and metal lines, resulting in variations in resistance and capacitance; shorts and opens.

• Oxide problems: variations in height; lack of planarity.

• Via problems: a via may not be cut all the way through; an undersized via has too much resistance; a via that is too large may create a short.

To reduce these problems, the design rules specify geometric constraints on the layout artwork, so that the patterns on the processed wafers preserve the topology and geometry of the designs. These consist of minimum-width and minimum-spacing constraints and requirements between objects on the same or on different layers. Beyond forming a definite set of rules, design rules also embody accumulated manufacturing experience.

13.2 Types of Design Rules

The design rules primarily address two issues:

1. The geometrical reproduction of features that can be reproduced by the mask-making and lithographic process, and

2. The interaction between different layers.

There are primarily two approaches to describing design rules.

1. Scalable Design Rules (e.g. SCMOS, λ-based design rules): In this approach, all rules are defined in terms of a single parameter λ. The rules are chosen so that a design can easily be ported across a cross-section of industrial processes, making the layout portable; scaling can be done simply by changing the value of λ. The key disadvantages of this approach are:

- Linear scaling is possible only over a limited range of dimensions.
- Scalable design rules are conservative, resulting in over-dimensioned and less dense designs.
- For these reasons, such rules are not really used in real life.

2. Absolute Design Rules (e.g. μ-based design rules): In this approach, the design rules are expressed in absolute dimensions (e.g. 0.75 μm) and can therefore exploit the features of a given process to the maximum degree. Here, scaling and porting are more demanding and have to be performed either manually or using CAD tools; these rules also tend to be more complex, especially for deep submicron processes.

The fundamental unit in the definition of a set of design rules is the minimum line width. It stands for the minimum mask dimension that can be safely transferred to the semiconductor material. Even for the same minimum dimension, design rules tend to differ from company to company and from process to process. However, CAD tools now allow designs to migrate between compatible processes.
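The contrast between the two approaches can be sketched in a few lines. Everything below is illustrative: the rule names and λ multiples are loosely in the style of SCMOS, not an actual rule deck. A λ-based rule set becomes absolute through a single multiplication, after which a layout can be checked against it:

```python
# λ-based rules: every rule is a multiple of a single parameter λ.
LAMBDA_RULES = {
    "min_poly_width": 2,    # 2λ
    "min_metal_width": 3,   # 3λ
    "min_poly_spacing": 2,  # 2λ
}

def absolute_rules(lam_um):
    """Port the rule set to a process by fixing λ (in microns)."""
    return {name: mult * lam_um for name, mult in LAMBDA_RULES.items()}

def violates_min_width(rect, min_width):
    """rect = (x1, y1, x2, y2); True if either side is narrower than the rule."""
    x1, y1, x2, y2 = rect
    return (x2 - x1) < min_width or (y2 - y1) < min_width

# λ = 0.25 um (a hypothetical 0.5 um process):
rules = absolute_rules(0.25)
assert rules["min_poly_width"] == 0.5
assert not violates_min_width((0.0, 0.0, 0.5, 4.0), rules["min_poly_width"])
assert violates_min_width((0.0, 0.0, 0.4, 4.0), rules["min_poly_width"])
```

Porting to another scalable process is just a different `lam_um`; with absolute (μ-based) rules, each entry would instead be rewritten by hand or by a CAD migration tool.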

13.3 Layer Representations

As the complexity of CMOS processes increases, visualizing all the mask levels used in the actual fabrication process becomes difficult. The layer concept translates these masks into a set of conceptual layout levels that are easier for the circuit designer to visualize. From the designer's viewpoint, all CMOS designs have the following entities:

• Two different substrates and/or wells: which are p-type for NMOS and n-type for PMOS.

• Diffusion regions (p+ and n+): these define the areas where transistors can be formed, and are also called active areas. Diffusion of the inverse type is needed to implement contacts to the well or to the substrate; these are called select regions.

• Transistor gate electrodes : Polysilicon layer • Metal interconnect layers • Interlayer contacts and via layers.

The layers for typical CMOS processes are represented in various figures in terms of:

• A color scheme (Mead-Conway colors). • Other color schemes designed to differentiate CMOS structures. • Varying stipple patterns • Varying line styles

Figure 13.31: Mead-Conway color coding for layers


An example of the layer representation of a CMOS inverter using the above design rules is shown below.

Figure 13.32: CMOS Inverter Layout

13.4 Stick Diagrams

Another popular method of symbolic design is the "sticks" layout. Here the designer draws a freehand sketch of a layout, using colored lines to represent the various process layers such as diffusion, metal and polysilicon. Where polysilicon crosses diffusion, transistors are created, and where metal wires join diffusion or polysilicon, contacts are formed. This notation indicates only the relative positioning of the various design components. The absolute coordinates of these elements are determined automatically by the editor using a compactor. The compactor translates the design rules into a set of constraints on the component positions and solves a constrained optimization problem that attempts to minimize the area or another cost function. The advantage of this symbolic approach is that the designer does not have to worry about design rules, because the compactor ensures that the final layout is physically correct. The disadvantage is that the outcome of the compaction phase is often unpredictable: the resulting layout can be less dense than one obtained manually. In addition, a stick diagram does not show exact placement, transistor sizes, wire lengths, wire widths, or tub boundaries.
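The compaction step described above can be sketched in a few lines: each design rule becomes a minimum-spacing constraint x[b] ≥ x[a] + d, and the smallest legal coordinates are the longest paths through the resulting constraint graph. The element names and spacings below are purely illustrative, not from any real process.

```python
# Minimal 1-D compaction sketch: each constraint (a, b, d) demands
# x[b] >= x[a] + d (a minimum spacing d between elements a and b).
# Relaxing all constraints to a fixed point computes the longest
# path through the (acyclic) constraint graph, i.e. the smallest
# legal coordinate for every element.

def compact(elements, constraints):
    """Return minimal x-coordinates satisfying x[b] >= x[a] + d."""
    x = {e: 0 for e in elements}
    # Bellman-Ford style relaxation; |elements| passes suffice for a DAG.
    for _ in range(len(elements)):
        for a, b, d in constraints:
            if x[a] + d > x[b]:
                x[b] = x[a] + d
    return x

# Illustrative row: a poly line, a contact, a metal line (spacings in λ).
elements = ["poly", "contact", "metal"]
constraints = [
    ("poly", "contact", 2),   # contact at least 2λ right of poly
    ("contact", "metal", 3),  # metal at least 3λ right of contact
    ("poly", "metal", 4),     # poly-to-metal spacing of 4λ
]
positions = compact(elements, constraints)
width = max(positions.values())  # cell width after compaction
```

Minimizing total width this way is what makes the compacted area unpredictable to the designer: the tightest chain of constraints, not the drawing, decides the result.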


For example, the stick diagram for a CMOS inverter is shown below.

Figure 13.41: Stick Diagram of a CMOS Inverter

Recap
In this lecture you have learnt the following:

• Motivation
• Types of Design Rules
• Layer Representations
• Stick Diagrams

Congratulations, you have finished Lecture 13.


Module 3 : Fabrication Process and Layout Design Rules
Lecture 14 : λ-based Design Rules

Objectives
In this lecture you will learn the following:

• Background
• λ-based Design Rules

14.1 Background

As we studied in the last lecture, layout rules are used to prepare the photomasks used in the fabrication of integrated circuits. The rules provide the necessary communication link between the circuit designer and the process engineer. Design rules represent the best possible compromise between performance and yield. They primarily address two issues:

1. The geometrical reproductions of features that can be reproduced by mask making and lithographical processes.

2. Interactions between different layers

Design rules can be specified by two different approaches:

1. λ-based design rules
2. μ-based design rules

λ-based layout design rules were originally devised to simplify the industry-standard μ-based design rules and to allow scaling capability across various processes. It must be emphasized, however, that most submicron CMOS process design rules do not lend themselves to straightforward linear scaling, so λ-based design rules must be handled with caution in submicron geometries. In the remaining sections of this lecture, we present a detailed study of λ-based design rules.

14.2 λ-based Design Rules

λ-based design rules have the following features:

• λ is the size of a minimum feature.
• All dimensions are specified as integer multiples of λ.
• Specifying λ particularizes the scalable rules.
• Parasitics are generally not specified in λ units.
• These rules specify the geometry of masks that will provide reasonable yields.


Guidelines for using λ-based Design Rules:

The minimum line width of poly is 2λ, and the minimum line width of diffusion is 2λ.

The minimum distance between two diffusion regions is 3λ.

It is necessary for the poly to completely cross the active region; otherwise the transistor created at the crossing of diffusion and poly would be shorted by a diffused path between source and drain.

Contact cut on metal

The contact window is 2λ by 2λ, the minimum feature size, while the metal above it is 4λ by 4λ for a reliable contact.

Metal


Two metal wires must have 3λ spacing between them to overcome capacitive and high-frequency coupling. Metal wires can be made as wide as desired to decrease resistance.

Butting contact

A butting contact is used to make a direct poly-to-diffusion contact. The window's original width is 4λ, but with overlap the width is 2λ, so the actual contact area is 6λ by 4λ. The distance between two wells depends on the well potentials, as shown above. The reason for 8λ is that if both wells are at the same high potential, the depletion regions between them may touch each other, causing punch-through. The reason for 6λ is that if the wells are at different potentials, the depletion region of one well will be smaller, so the two depletion regions will not touch, and 6λ is sufficient.

The active region has length 10λ, which is distributed as follows:

• 2λ for source diffusion
• 2λ for drain diffusion
• 2λ for channel length
• 2λ for source-side encroachment
• 2λ for drain-side encroachment
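The λ-based spacings quoted in this lecture can be kept in a small table and particularized by choosing a value for λ, as the rules themselves suggest. A minimal sketch; the rule values come from the text above, but the chosen λ of 0.5 µm is only an assumed example.

```python
# λ-based rules from this lecture, stored in λ units; specifying a
# value for λ (here an assumed 0.5 µm) particularizes them to microns.
RULES_LAMBDA = {
    "min_poly_width": 2,
    "min_diff_width": 2,
    "min_diff_spacing": 3,
    "min_metal_spacing": 3,
    "contact_window": 2,      # 2λ x 2λ cut
    "metal_over_contact": 4,  # 4λ x 4λ metal above the cut
}

def rules_in_microns(lam_um):
    """Scale every rule from λ units to microns for a given λ."""
    return {name: n * lam_um for name, n in RULES_LAMBDA.items()}

def check(rule, drawn_lambda):
    """True if a drawn dimension (in λ) meets the minimum rule."""
    return drawn_lambda >= RULES_LAMBDA[rule]

um = rules_in_microns(0.5)          # assume λ = 0.5 µm
ok = check("min_metal_spacing", 3)  # a 3λ metal gap passes
bad = check("min_diff_spacing", 2)  # a 2λ diffusion gap fails
```

This table-driven form is exactly what makes λ rules portable: retargeting a layout to another process changes one number, not the rule set.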

Recap
In this lecture you have learnt the following:

• Background
• λ-based Design Rules

Congratulations, you have finished Lecture 14.


Module 4 : Propagation Delays in MOS
Lecture 15 : CMOS Inverter Characteristics

Objectives
In this lecture you will learn the following:

• CMOS Inverter Characteristics
• Noise Margins
• Regions of Operation
• βn/βp Ratio

15.1 CMOS Inverter Characteristics

The complementary CMOS inverter is realized by the series connection of a p- and an n-device, as shown in fig 15.11.

Fig 15.11: CMOS Inverter

Fig 15.12: I-V characteristics of PMOS & NMOS


Fig 15.13: Transfer Characteristics of CMOS

Inverter characteristics: In the graphical representation (fig 15.12), the I-V characteristics of the p-device are reflected about the x-axis. This step is followed by taking the absolute value of the p-device Vds and superimposing the two characteristics. Solving for Vinn = Vinp and Idsn = –Idsp gives the desired transfer characteristic of the CMOS inverter, as in fig 15.13.

15.2 Noise Margins

Noise margin is a parameter closely related to the input-output voltage characteristics. It determines the allowable noise voltage on the input of a gate such that the output is not affected. The specification most commonly used for noise margin (or noise immunity) is in terms of two parameters: the LOW noise margin, NML, and the HIGH noise margin, NMH. With reference to fig 15.2, NML is defined as the difference in magnitude between the maximum LOW output voltage of the driving gate and the maximum LOW input voltage recognized by the driven gate. Thus,

NML = VILmax – VOLmax

The value of NMH is the difference in magnitude between the minimum HIGH output voltage of the driving gate and the minimum HIGH input voltage recognized by the receiving gate. Thus,

NMH = VOHmin – VIHmin

Where, VIHmin = minimum HIGH input voltage. VILmax = maximum LOW input voltage. VOHmin= minimum HIGH output voltage. VOLmax= maximum LOW output voltage.


Fig 15.2: Noise Margin diagram
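The two margins reduce to simple differences of the four static levels defined above. A small helper; the voltage values in the example are illustrative, not from any datasheet.

```python
# Noise margins from the four static levels:
#   NML = VILmax - VOLmax,  NMH = VOHmin - VIHmin.

def noise_margins(v_ol_max, v_il_max, v_ih_min, v_oh_min):
    nml = v_il_max - v_ol_max  # LOW noise margin
    nmh = v_oh_min - v_ih_min  # HIGH noise margin
    return nml, nmh

# Example: a 5 V gate with VOLmax=0.4, VILmax=1.2, VIHmin=3.6, VOHmin=4.5
nml, nmh = noise_margins(0.4, 1.2, 3.6, 4.5)
```

A positive value on both margins is what guarantees that a legal output of one gate is always a recognizable input to the next.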

15.3 Regions of Operation

The operation of the CMOS inverter can be divided into five regions. The behavior of the n- and p-devices in each region may be found using the device current equations.

We now describe each region in detail.

Region A: This region is defined by 0 ≤ Vin < Vtn, in which the n-device is cut off (Idsn = 0) and the p-device is in the linear region. Since Idsn = –Idsp, the drain-to-source current Idsp of the p-device is also zero. With Vdsp = Vout – VDD and Vdsp = 0, the output voltage is Vout = VDD.

Region B: This region is characterized by Vtn ≤ Vin < VDD/2, in which the p-device is in its nonsaturated region (Vdsp ≠ 0) while the n-device is in saturation. The equivalent circuit for the inverter in this region can be represented by a resistor for the p-transistor and a current source for the n-transistor, as shown in fig 15.31. The saturation current Idsn for the n-device is obtained by setting Vgs = Vin. This results in

Idsn = (βn/2)·(Vin – Vtn)²

where Vtn is the threshold voltage of the n-device, μn the mobility of electrons, and Wn and Ln the channel width and length of the n-device (βn = μn·Cox·Wn/Ln, with Cox the gate-oxide capacitance per unit area). The current for the p-device, which is nonsaturated, is obtained by noting that Vgs = (Vin – VDD) and Vds = (Vout – VDD), and therefore

Idsp = –βp·[(Vin – VDD – Vtp)(Vout – VDD) – (Vout – VDD)²/2]

where Vtp is the threshold voltage of the p-device, μp the mobility of holes, and Wp and Lp the channel width and length of the p-device. The output voltage Vout follows by setting Idsn = –Idsp and solving for Vout.


Fig 15.31: Equivalent circuit of MOSFET in region B

Region C: In this region both the n- and p-devices are in saturation. This is represented by fig 15.32, which shows two current sources in series.

Fig 15.32: Equivalent circuit of MOSFET in region C

The saturation currents for the two devices are

Idsn = (βn/2)·(Vin – Vtn)²,   Idsp = –(βp/2)·(Vin – VDD – Vtp)²

Equating Idsn = –Idsp yields

Vin = (VDD + Vtp + Vtn·√(βn/βp)) / (1 + √(βn/βp)),


which implies that region C exists only for one value of Vin. We have assumed that a MOS device in saturation behaves like an ideal current source, with the drain-to-source current independent of Vds. In reality, as Vds increases, Ids also increases slightly, so region C has a finite slope. The significant point is that in region C we have two current sources in series, which is an "unstable" condition: a small change in the input voltage has a large effect at the output. This makes the output transition very steep, in contrast with the equivalent nMOS inverter characteristic. The above expression is particularly useful since it provides the basis for defining the gate threshold Vinv, which corresponds to the state where Vout = Vin. This region also defines the "gain" of the CMOS inverter when used as a small-signal amplifier.

Region D:

Fig 15.33: Equivalent circuit of MOSFET in region D

This region is described by VDD/2 < Vin ≤ VDD + Vtp. The p-device is in saturation, while the n-device operates in its nonsaturated region. This condition is represented by the equivalent circuit shown in fig 15.33. The two currents may be written as

Idsn = βn·[(Vin – Vtn)·Vout – Vout²/2],   Idsp = –(βp/2)·(Vin – VDD – Vtp)²

with Idsn = –Idsp. Equating the two currents and solving the resulting quadratic for Vout gives the output voltage.

Region E: This region is defined by the input condition Vin ≥ VDD + Vtp, in which the p-device is cut off (Idsp = 0) and the n-device is in its linear mode. Here Vgsp = Vin – VDD, which is more positive than Vtp. The output in this region is Vout = 0. From the transfer curve it may be seen that the transition between the two states is very steep. This characteristic is very desirable because the noise immunity is maximized.
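Equating the two saturation currents of region C gives the gate threshold Vinv in closed form. A numerical sketch, with assumed device values (VDD = 5 V, Vtn = 0.7 V, Vtp = −0.7 V):

```python
from math import sqrt

# Gate threshold Vinv (where Vout = Vin, both devices saturated),
# obtained from sqrt(beta_n)*(Vinv - Vtn) = sqrt(beta_p)*(VDD + Vtp - Vinv),
# i.e. Vinv = (VDD + Vtp + sqrt(beta_n/beta_p)*Vtn) / (1 + sqrt(beta_n/beta_p)).

def v_inv(vdd, vtn, vtp, beta_ratio):
    """Gate threshold for a given beta_n/beta_p ratio (vtp < 0)."""
    r = sqrt(beta_ratio)
    return (vdd + vtp + r * vtn) / (1 + r)

# Symmetric design: beta_n = beta_p puts the threshold at VDD/2.
v_sym = v_inv(5.0, 0.7, -0.7, 1.0)
# A larger beta_n/beta_p pulls the transition point to a lower Vin.
v_skew = v_inv(5.0, 0.7, -0.7, 4.0)
```

This matches the observation in the next section: decreasing βn/βp moves the transition region from left to right while leaving it sharp.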


15.4 βn/βp ratio:

Figure 15.4: βn/βp graph

The gate threshold voltage Vinv, where Vin = Vout, depends on the ratio βn/βp. Thus, for a given process, if we want to change βn/βp we need to change the channel dimensions, i.e. the channel length L and channel width W. It can be seen that as the ratio βn/βp is decreased, the transition region shifts from left to right; the output voltage transition, however, remains sharp.

Recap
In this lecture you have learnt the following:

• CMOS Inverter Characteristics
• Noise Margins
• Regions of Operation
• βn/βp Ratio

Congratulations, you have finished Lecture 15.


Module 4 : Propagation Delays in MOS
Lecture 16 : Propagation Delay Calculation of CMOS Inverter

Objectives
In this lecture you will learn the following:

• Few Definitions
• Quick Estimates
• Rise and Fall Times Calculation

16.1 Few Definitions Before calculating the propagation delay of CMOS Inverter, we will define some basic terms-

• Switching speed: limited by the time taken to charge and discharge the load capacitance CL.
• Rise time tr: time for a waveform to rise from 10% to 90% of its steady-state value.
• Fall time tf: time for a waveform to fall from 90% to 10% of its steady-state value.
• Delay time td: time difference between the 50% point of the input transition and the 50% point of the output transition.

Fig 16.1: Propagation delay graph

The propagation delay tp of a gate defines how quickly it responds to a change at its inputs; it expresses the delay experienced by a signal when passing through the gate. It is measured between the 50% transition points of the input and output waveforms, as shown in figure 16.1 for an inverting gate. The delay tpLH defines the response time of the gate for a low-to-high output transition, while tpHL refers to a high-to-low transition. The overall propagation delay is taken as the average of the two:

tp = (tpLH + tpHL)/2
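The 10%/90% and 50% definitions above can be applied directly to sampled waveforms. A small helper with linear interpolation between samples; the ramp waveform is synthetic, for illustration only.

```python
# Extract tr and tpLH from a sampled output waveform, using the
# 10%/90% and 50% definitions. The input is taken as an ideal step
# whose 50% point is at t = 0.

def cross_time(t, v, level, rising=True):
    """Time at which waveform v(t) crosses 'level' (linear interp)."""
    for i in range(1, len(t)):
        lo, hi = v[i - 1], v[i]
        if (rising and lo <= level <= hi) or (not rising and hi <= level <= lo):
            frac = (level - lo) / (hi - lo)
            return t[i - 1] + frac * (t[i] - t[i - 1])
    raise ValueError("level never crossed")

VDD = 5.0
t = [0.0, 1.0, 2.0, 3.0, 4.0]
vout = [0.0, 1.25, 2.5, 3.75, 5.0]   # output ramping 0 -> VDD

tr = cross_time(t, vout, 0.9 * VDD) - cross_time(t, vout, 0.1 * VDD)
tp_lh = cross_time(t, vout, 0.5 * VDD)  # input 50% point taken as t = 0
```

The same `cross_time` helper with `rising=False` measures tf and tpHL on a falling output.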

16.2 Quick Estimates:


We will give an example of how to calculate a quick estimate. From fig 16.22, we can write the following equations.

Fig 16.21: Example CMOS Inverter Circuit

Fig. 16.22 : Propagation Delay of above MOS circuit

From figure 16.21, when Vin = 0 the capacitor CL charges through the PMOS, and when Vin = 5 V the capacitor discharges through the NMOS. The capacitor current is

i = CL·(dVout/dt)

From this, the delay times can be derived by integrating the capacitor equation over the corresponding output-voltage interval.

The expressions for the propagation delays denoted in figure 16.22 then follow directly.


16.3 Rise and Fall Times

Figure 16.21 shows the familiar CMOS inverter with a capacitive load CL that represents the total load capacitance (inputs of the next gates, the output capacitance of this gate, and routing). Of interest is the voltage waveform Vout(t) when the input is driven by a step waveform Vin(t), as shown in figure 16.22.

Fig 16.31: Trajectory of the n-transistor operating point

Figure 16.31 shows the trajectory of the n-transistor operating point as the input voltage Vin(t) changes from 0 V to VDD. Initially, the n-device is cut off and the load capacitor is charged to VDD; this is illustrated by X1 on the characteristic curve. Application of a step voltage (VGS = VDD) at the input of the inverter moves the operating point to X2. From there onwards the trajectory moves along the VGS = VDD characteristic curve towards point X3 at the origin. Thus it is evident that the fall time consists of two intervals:

1. tf1=period during which the capacitor voltage, Vout, drops from 0.9VDD to (VDD–Vtn)

2. tf2=period during which the capacitor voltage, Vout, drops from (VDD–Vtn) to 0.1VDD.
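Before the analytic derivation, the two intervals can be estimated numerically by integrating CL·dVout/dt = −Idsn through both device regions. All device values below are assumed, illustrative numbers.

```python
# Numerical check of the two fall-time intervals: saturation from
# 0.9*VDD down to (VDD - Vtn), then the linear region down to 0.1*VDD.
# Forward-Euler integration of CL*dVout/dt = -Idsn for a step input.

VDD, VTN = 5.0, 1.0
BETA_N = 1e-3        # A/V^2, assumed device transconductance
CL = 100e-15         # 100 fF load, assumed
DT = 1e-13           # integration time step, seconds

def idsn(vout):
    """n-device current with Vgs = VDD (step input applied)."""
    if vout >= VDD - VTN:                                   # saturation
        return 0.5 * BETA_N * (VDD - VTN) ** 2
    return BETA_N * ((VDD - VTN) * vout - 0.5 * vout ** 2)  # linear

def fall_times():
    v, t = VDD, 0.0
    t_start = tf1 = None
    while v > 0.1 * VDD:
        if t_start is None and v <= 0.9 * VDD:
            t_start = t                      # 90% point reached
        if tf1 is None and v <= VDD - VTN:
            tf1 = t - t_start                # end of the saturation interval
        v -= DT * idsn(v) / CL
        t += DT
    return tf1, (t - t_start) - tf1          # (tf1, tf2)

tf1, tf2 = fall_times()
tf = tf1 + tf2
```

With these values the linear-region tail tf2 dominates tf1, which is why the closed-form fall-time expression derived next carries a logarithmic term from the linear region.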

The equivalent circuits that illustrate this behavior are shown in figures 16.32 and 16.33.


Figure 16.32: Equivalent circuit showing the behavior during tf1

Figure 16.33: Equivalent circuit showing the behavior during tf2

As we saw in the last section, the delay periods can be derived using the general equation CL·(dVout/dt) = –Idsn. From figure 16.32, while in saturation,

Idsn = (βn/2)·(VDD – Vtn)²

Integrating from t = t1, corresponding to Vout = 0.9VDD, to t = t2, corresponding to Vout = (VDD – Vtn), results in

tf1 = 2CL·(Vtn – 0.1VDD) / (βn·(VDD – Vtn)²)

Fig 16.34: Rise and Fall time graph


When the n-device begins to operate in the linear region, the discharge current is no longer constant. The time tf2 taken to discharge the capacitor voltage from (VDD – Vtn) to 0.1VDD can be obtained as before. In the linear region,

Idsn = βn·[(VDD – Vtn)·Vout – Vout²/2]

which integrates to

tf2 = (CL/(βn·(VDD – Vtn)))·ln((19VDD – 20Vtn)/VDD)

Thus the complete term for the fall time is

tf = tf1 + tf2 = (2CL/(βn·(VDD – Vtn)))·[(Vtn – 0.1VDD)/(VDD – Vtn) + ½·ln((19VDD – 20Vtn)/VDD)]

For Vtn ≈ 0.2VDD, the fall time can be approximated as

tf ≈ 4CL/(βn·VDD)

From this expression we can see that the delay is directly proportional to the load capacitance; thus, to achieve high-speed circuits, one has to minimize the load capacitance seen by a gate. Secondly, the delay is inversely proportional to the supply voltage, i.e. as the supply voltage is raised, the delay time is reduced. Finally, the delay is inversely proportional to βn of the driving transistor, so increasing the width of the transistor decreases the delay. Due to the symmetry of the CMOS circuit, the rise time can be obtained similarly as

tr ≈ 4CL/(βp·VDD)

For equally sized n- and p-transistors (where βn = 2βp), tf = tr/2: the fall time is faster than the rise time, primarily due to the different carrier mobilities associated with the p- and n-devices. Thus, if we want tf = tr, we need to make βn/βp = 1, which implies that the channel width of the p-device must be increased to approximately 2 to 3 times that of the n-device. The propagation delays, if calculated as indicated before by integrating only down to the 50% point, follow in the same manner.


Figure 16.35: Rise and Fall time graph of Output w.r.t Input

If we also consider the rise and fall times of the input signal, as shown in fig 16.35, we have

These are the rms values for the propagation delays.

Recap
In this lecture you have learnt the following:

• Few Definitions
• Quick Estimates
• Rise and Fall Times Calculation

Congratulations, you have finished Lecture 16.


Module 4 : Propagation Delays in MOS
Lecture 17 : Pseudo NMOS Inverter

Objectives
In this lecture you will learn the following:

• Introduction
• Different Configurations with NMOS Inverter
• Worries about Pseudo NMOS Inverter
• Calculation of Capacitive Load

17.1 Introduction

This inverter uses a p-device pull-up, or load, that has its gate permanently grounded, while an n-device pull-down, or driver, is driven by the input signal. This is roughly equivalent to the use of a depletion load in NMOS technology, and the style is thus called 'pseudo-NMOS'. The circuit is used in a variety of CMOS logic circuits. Here the PMOS is in its linear region most of the time, so its resistance is low and hence the RC time constant is low. When the driver is turned on, a constant DC current flows in the circuit.

Fig 17.1: Pseudo-NMOS Inverter Circuit

17.2 Different Configurations with NMOS Inverter


17.3 CMOS Summary

Complementary CMOS logic consumes no static power. However, signals have to be routed to the n pull-down network as well as to the p pull-up network, so the load presented to every driver is high. This is exacerbated by the fact that n- and p-channel transistors cannot be placed close together, as they sit in different wells which must be kept well separated in order to avoid latchup.

17.4 Pseudo-NMOS Design Style

The CMOS pull-up network is replaced by a single pMOS transistor with its gate grounded. Since the pMOS is not driven by signals, it is always 'on'. The effective gate voltage seen by the pMOS transistor is VDD, so the overdrive on the p-channel gate is always VDD – VTp. When the nMOS is turned 'on', a direct path between supply and ground exists and static power is drawn. However, the dynamic power is reduced due to the lower capacitive loading.

17.5 Static Characteristics

As we sweep the input voltage from ground to VDD, we encounter the following regimes of operation:

• nMOS 'off'
• nMOS saturated, pMOS linear
• nMOS linear, pMOS linear
• nMOS linear, pMOS saturated


17.6 Low Input

• When the input voltage is less than VTn, the output is 'high' and no current is drawn from the supply.
• As we raise the input just above VTn, the output starts falling.
• In this region the nMOS is saturated, while the pMOS is linear.

17.7 nMOS Saturated, pMOS Linear

The input voltage is assumed to be sufficiently low that the output voltage exceeds the saturation voltage Vi – VTn. Normally, this voltage will be higher than VTp, so the p-channel transistor is in its linear mode of operation. Equating the currents through the n- and p-channel transistors, we get


The solutions are:

substituting the values of V1 and V2 and choosing the sign which puts V0 in the correct range, we get

As the input voltage is increased, the output voltage will decrease. The output voltage will fall below Vi – VTn when

The nMOS is now in its linear mode of operation, and the equation derived above does not apply beyond this input voltage.
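The current-balance condition of this regime can also be solved numerically instead of symbolically: for a given Vi, bisect on Vo until the saturated nMOS current equals the linear pMOS current. All device parameters below are assumed, illustrative values; as a ratioed style requires, the driver is made much stronger than the load.

```python
# Numerical solution of the pseudo-NMOS transfer point in the regime
# above: nMOS saturated, pMOS (gate grounded) in its linear region.

VDD, VTN, VTP = 5.0, 1.0, -1.0
BETA_N, BETA_P = 4e-4, 0.5e-4     # strong driver, weak load (assumed)

def i_n_sat(vin):
    return 0.5 * BETA_N * (vin - VTN) ** 2

def i_p_lin(vout):
    vsd = VDD - vout               # pMOS source at VDD, drain at Vout
    return BETA_P * ((VDD - abs(VTP)) * vsd - 0.5 * vsd ** 2)

def vout_for(vin):
    """Bisect for the Vout where the two currents balance."""
    lo, hi = 0.0, VDD
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if i_p_lin(mid) > i_n_sat(vin):
            lo = mid               # load supplies more than driver sinks:
        else:                      # the true Vout is higher
            hi = mid
    return 0.5 * (lo + hi)

v1 = vout_for(1.5)   # just above VTn: output still near VDD
v2 = vout_for(2.0)   # further up the transition: output lower
```

Sweeping `vout_for` over Vin traces the whole upper portion of the pseudo-NMOS transfer curve without solving the quadratic by hand.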


Recap
In this lecture you have learnt the following:

• Introduction
• Different Configurations with NMOS Inverter
• Worries about Pseudo NMOS Inverter
• Calculation of Capacitive Load

Congratulations, you have finished Lecture 17.


Module 4 : Propagation Delays in MOS
Lecture 18 : Dependence of Propagation Delay on Fan-in and Fan-out

Objectives
In this lecture you will learn the following:

• Motivation
• Design Techniques for Large Fan-in

18.1 Motivation

First we will show how fan-in and fan-out affect the propagation delay, and then we will analyze how to handle large fan-in. The propagation delay of a CMOS gate deteriorates rapidly as a function of the fan-in. Firstly, the large number of transistors (2N) increases the overall capacitance of the gate. Secondly, a series connection of transistors in either the PUN or PDN slows the gate as well, because the effective (dis)charging resistance is increased. Fan-out has a larger impact on the gate delay in complementary CMOS than in some other logic styles: in the complementary circuit style, each input connects to both an NMOS and a PMOS device and presents a load to the driving gate equal to the sum of the gate capacitances.

Fig 18.1: Dependence of Propagation delay on Fan-in

Thus we can approximate the influence of fan-in (FI) and fan-out (FO) on the propagation delay of a complementary CMOS gate as

tp = a1·FI + a2·FI² + a3·FO

where a1, a2 and a3 are weighting factors that are a function of the technology.

18.2 Design Techniques for Large Fan-in


1. Transistor Sizing: Increasing the transistor sizes increases the available (dis)charging current. But widening the transistors results in larger parasitic capacitance, which not only affects the propagation delay of the gate itself but also presents a larger load to the preceding gate.

2. Progressive Transistor Sizing: Usually we assume that all the intrinsic capacitances in a series-connected array of transistors can be lumped into a single load capacitance CL, with no capacitance present at the internal nodes of the network.

Fig 18.21: Illustration of Progressive Transistor Sizing

Under these assumptions, making all transistors in a series chain equal in size makes sense. This model is an over-simplification, however, and becomes more and more inaccurate with increasing fan-in. Referring to the circuit in fig 18.21, the capacitance associated with the transistors increases as we go down the chain, so a transistor lower in the chain has to discharge a larger total capacitance. While transistor MN has to conduct the discharge current of only the load capacitance CL, M1 has to carry the discharge current of the total capacitance Ctot = C1 + C2 + ... + CL, which is substantially larger. Consequently, a progressive scaling of the transistors is beneficial: M1 > M2 > M3 > ... > MN. This technique has, for instance, proven advantageous in the decoders of memories, where gates with large fan-in are common.

SPICE simulation example: Take CL = 15 fF, N = 5, and C1 = C2 = C3 = C4 = 10 fF. When all transistors are of minimum size, SPICE predicts a propagation delay of 1.1 ns. The transistors M5 to M1 are then made progressively wider, such that the width of each transistor is proportional to the total capacitance it has to discharge: M5 is of minimum size, WM4 = WM5·(CL + C4)/CL, WM3 = WM5·(CL + C3 + C4)/CL, and so on. The resulting circuit has a tpHL of 0.81 ns, a reduction of 26.5%.
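The sizing rule in that example is mechanical enough to sketch directly: walk down the chain accumulating the capacitance each transistor must discharge and scale its width accordingly. The capacitance values are those of the lecture's SPICE example; the minimum width of 1.0 is an arbitrary unit.

```python
# Progressive sizing: each transistor's width is proportional to the
# total capacitance it must discharge, with the transistor nearest
# the output (M5 in the example) at minimum width.

def progressive_widths(w_min, cl, internal_caps):
    """internal_caps lists [C4, C3, C2, C1], going down the chain
    from the output; returns widths [WM5, WM4, ..., WM1]."""
    widths = [w_min]
    running = cl
    for c in internal_caps:
        running += c
        widths.append(w_min * running / cl)
    return widths

# Values from the SPICE example: CL = 15 fF, C1..C4 = 10 fF each.
w = progressive_widths(1.0, 15.0, [10.0, 10.0, 10.0, 10.0])
```

The resulting widths grow monotonically toward the ground end of the chain, i.e. M1 > M2 > ... > M5, as the text prescribes.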

3. Transistor Ordering: Some signals in complex combinational logic blocks may be more critical than others: not all inputs of a gate arrive at the same time (for instance, due to the propagation delays of the preceding blocks). An input signal to a gate is called critical if it is the last of all inputs to assume a stable value. The path through the logic which determines the ultimate speed of the structure is called the critical path. Placing the critical-path transistors closer to the output of the gate can result in a speed-up. Referring to the figure below, signal In1 is assumed to be the critical signal. Suppose In2 and In3 are high and In1 undergoes a 0-to-1 transition, and assume also that CL is initially charged high. In the first case, no path to ground exists until M1 is turned on, so the delay between the arrival of In1 and the output is determined by the time it takes to discharge CL + C1 + C2. In the second case, C1 and C2 are already discharged when In1 changes; only CL has to be discharged, resulting in a faster response time. Using SPICE, tpHL for a 4-input NAND gate was calculated: with the critical input connected to the bottommost transistor, tpd = 717 ns, and when connected to the uppermost transistor, tpd = 607 ns, an improvement of 15%.
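The ordering argument can be illustrated with a first-order Elmore-delay sketch: only the nodes from the critical transistor up to the output still hold charge, and each discharges through the series resistance of the chain below it. The unit R and C values are hypothetical, chosen only to show the trend.

```python
# Elmore-delay sketch of transistor ordering in a series nMOS chain.
# Position 0 is the transistor nearest ground, n-1 nearest the output.
# Internal nodes below an already-on critical transistor are assumed
# pre-discharged, so they contribute nothing to the delay.

def elmore_discharge_delay(critical_pos, n, r=1.0, c_int=1.0, c_load=4.0):
    delay = 0.0
    for node in range(critical_pos, n):      # nodes still charged
        cap = c_load if node == n - 1 else c_int
        resistance = r * (node + 1)          # series chain down to ground
        delay += resistance * cap
    return delay

slow = elmore_discharge_delay(0, 4)   # critical input at the bottom
fast = elmore_discharge_delay(3, 4)   # critical input near the output
```

Even this crude model reproduces the SPICE observation: moving the late-arriving input next to the output removes the internal-node terms from the delay sum.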

Fig 18.21: Two example circuits for the critical path

Recap
In this lecture you have learnt the following:

• Motivation
• Design Techniques for Large Fan-in

Congratulations, you have finished Lecture 18.


Module 4 : Propagation Delays in MOS
Lecture 19 : Analyzing Delay for various Logic Circuits

Objectives
In this lecture you will learn the following:

• Ratioed Logic
• Pass Transistor Logic
• Dynamic Logic Circuits

19.1 Ratioed Logic

Instead of a combination of active pull-down and pull-up networks, such a gate consists of an NMOS pull-down network that realizes the logic function and a simple load device. For an inverter, the PDN is a single NMOS transistor.

Fig 19.1: Ratioed Logic Circuit

The load can be a passive device, such as a resistor, or an active element, such as a transistor. Let us assume that both the PDN and the load can be represented as linearized resistors. The operation is as follows: for a low input signal, the pull-down network is off and the output is pulled high by the load. When the input goes high, the driver transistor turns on, and the resulting output voltage is determined by the resistive division between the impedances of the pull-down and load networks:

VOL = RD·VDD/(RD + RL), where RD is the pull-down network resistance and RL the load resistance.

To keep the LOW noise margin high, it is important to choose RL >> RD. This style of logic is therefore called ratioed, because a careful scaling of the PDN and load impedances (or transistor sizes) is required to obtain a workable gate. This is in contrast to a ratioless logic style such as complementary CMOS, where the low and high levels do not depend upon transistor sizes. As a satisfactory level we keep RL ≥ 4RD; to achieve this, (W/L)D/(W/L)L > 4.

19.2 Pass Transistor Logic

The fundamental building block of an nMOS dynamic logic circuit, consisting of an nMOS pass transistor, is shown in figure 19.21.

Fig 19.21: Pass Transistor Logic Circuit

The pass transistor MP is driven by the periodic clock signal and acts as an access switch that either charges up or discharges the parasitic capacitance Cx, depending on the input signal Vin. Thus there are two possible operations when the clock signal is active: the logic "1" transfer (charging Cx up to a logic-high level) and the logic "0" transfer (discharging Cx to a logic-low level). In either case, the output of the depletion-load nMOS inverter assumes a logic-low or logic-high level, depending on the voltage Vx. The pass transistor MP provides the only current path to the intermediate capacitive node X. When the clock signal becomes inactive (CLK = 0), the pass transistor ceases to conduct, and the charge stored on the parasitic capacitor Cx continues to determine the output level of the inverter.

Logic "1" Transfer: Assume that Vx = 0 initially. A logic "1" level is applied to the input terminal, corresponding to Vin = VOH = VDD. The clock signal at the gate of the pass transistor goes from 0 to VDD at t = 0. The pass transistor then starts to conduct, and it operates in saturation throughout this cycle, since VDS = VGS and consequently VDS > VGS – Vtn.

Analysis: The pass transistor operating in the saturation region starts to charge up the capacitor Cx; thus

Cx·(dVx/dt) = (k/2)·(VDD – Vx – Vtn)²

Solving this first-order equation for Vx(t) gives

Vx(t) = (VDD – Vtn)·(t/τ)/(1 + t/τ),   with τ = 2Cx/(k·(VDD – Vtn))

The variation of the node voltage Vx(t) is plotted as a function of time in fig 19.22. The voltage rises from its initial value of 0 and approaches Vmax = VDD – Vtn after a long time: the pass transistor turns off when Vx = Vmax, since then Vgs = Vtn. Therefore Vx can never attain VDD during a logic "1" transfer, and buffering can be used to overcome this problem.


Fig 19.22: Node Voltage Vx vs t
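The charge-up curve just plotted can be reproduced with a short numerical integration of the saturated-device equation; the transconductance k and capacitance Cx below are assumed, illustrative values.

```python
# Numerical check that the pass-transistor "1" transfer saturates at
# VDD - Vtn: forward-Euler integration of CX*dVx/dt = Id with the
# device in saturation (Vgs = VDD - Vx, and Vds = Vgs).

VDD, VTN = 5.0, 1.0
K = 1e-4          # device transconductance, assumed
CX = 50e-15       # node capacitance, assumed
DT = 1e-12        # integration step

def charge_up(t_total):
    vx = 0.0
    for _ in range(int(t_total / DT)):
        vgs = VDD - vx
        if vgs <= VTN:                       # device cut off: Vx is stuck
            break
        i = 0.5 * K * (vgs - VTN) ** 2       # saturation current
        vx += DT * i / CX
    return vx

vx_final = charge_up(100e-9)
# vx_final approaches VDD - Vtn = 4 V but can never reach VDD.
```

The slow 1/t-like approach to VDD − Vtn (rather than an exponential) is visible in how long the last few hundred millivolts take.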

Logic "0" Transfer: Assume that Vx = VDD – Vtn initially. A logic "0" level is applied to the input terminal, corresponding to Vin = 0. The clock signal at the gate of the pass transistor goes from 0 to VDD at t = 0. The pass transistor starts to conduct, operating in its linear mode throughout this cycle, and the drain current flows in the direction opposite to that of the charge-up case.

Analysis: We can write

Cx·(dVx/dt) = –k·[(VDD – Vtn)·Vx – Vx²/2]

Solving this equation gives Vx(t), which decays from its initial value towards 0.

Plot of Vx(t) is shown in figure 19.23.

Fig 19.23: Node Voltage Vx vs t

19.3 Dynamic Logic Circuits

In static CMOS, a fan-in of N requires 2N transistors. To reduce this count, other design styles such as pseudo-NMOS logic and pass-transistor logic were used; however, their static power consumption increased. An alternative to these styles is dynamic logic, which reduces the number of transistors while at the same time keeping the static power consumption in check.

Principle: A block diagram of a dynamic logic circuit is shown in fig 19.31. It uses an NMOS block to implement the logic. The operation of this circuit can be explained in two modes:

1. Precharge
2. Evaluation

Fig 19.31: Dynamic CMOS Block Diagram

In the precharge mode, the CLK input is at logic 0. This forces the output to logic 1, charging the load capacitance to VDD. Since the NMOS transistor M1 is off, the pull-down path is disabled. There is no static power consumption, as there is no direct path between supply and ground. In the evaluation mode, the CLK input is at logic 1. The output now depends on the PDN block: if there exists a path through the PDN to ground (i.e. the PDN network is ON), the capacitor CL discharges; otherwise the output remains at logic 1. As the only supply rail reachable from the output node during evaluation is ground, the load capacitor can discharge only once; if this happens, it cannot charge again until the next precharge operation. Hence the inputs to the gate can make at most one transition during evaluation.
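The precharge/evaluate discipline can be captured behaviorally in a few lines. A sketch for a hypothetical dynamic 2-input NAND (PDN = two series nMOS devices); the class and its names are illustrative, not from any library.

```python
# Behavioral model of a dynamic gate: the output node is a stored
# charge that precharge restores and evaluation can only destroy.

class DynamicNand2:
    def __init__(self):
        self.out = None          # node voltage unknown before precharge

    def precharge(self):
        self.out = 1             # CLK = 0: pMOS charges CL to VDD

    def evaluate(self, a, b):
        # CLK = 1: discharge only if a path through the PDN exists.
        if a and b:
            self.out = 0         # charge lost; cannot be restored
        return self.out

gate = DynamicNand2()
gate.precharge()
r1 = gate.evaluate(1, 1)              # path to ground: output falls
gate_keeps_low = gate.evaluate(0, 1)  # inputs change, charge already gone
gate.precharge()
r2 = gate.evaluate(0, 1)              # no path: output stays high
```

The second `evaluate` call shows the one-transition rule: once the node has discharged, later input changes within the same evaluation cannot bring it back high.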

Fig 19.32: DOMINO CMOS Block Diagram


Advantages of dynamic logic circuits:

1. As can be seen, the number of transistors required here is N + 2, as compared to 2N in static CMOS circuits.

2. The circuit is still a ratioless circuit, as in the static case. Progressive sizing and ordering of the transistors in the PDN block nevertheless remain important for speed.

3. As can be seen, the static power loss is negligible.

Disadvantages of dynamic logic circuits:

1. The penalty paid in such circuits is that the clock must be routed to every such block, as shown in the diagram.

2. The major problem in such circuits is that the output node stays at VDD until the end of the precharge mode. If the CLK in the next block arrives earlier than the CLK in this block, or the PDN network in this block takes longer to evaluate its output, then the next block will start to evaluate using this erroneous value.

This second disadvantage can be eliminated by using DOMINO CMOS circuits, shown in fig 19.32. The output at the end of precharge is inverted by the inverter to logic 0, so the next block will not evaluate until this output has been evaluated. As an ending point, it must be noted that this scheme has its own disadvantage: since the output of each stage is inverted, the logic must be changed to accommodate this.

Recap
In this lecture you have learnt the following:

• Ratioed Logic
• Pass Transistor Logic
• Dynamic Logic Circuits

Congratulations, you have finished Lecture 19.


Module 1 : Introduction to VLSI Design
Lecture 2 : System approach to VLSI Design

Objectives
In this lecture you will learn the following:

• What is System?
• Design Abstraction Levels
• Delay and Interconnect Issues in physical circuits

2.1 What is System?

2.1.1 Definition of System

A system is something which gives an output when it is provided with an input (see figure 1).

Figure 1. A simple System

2.1.2 System-On-Chip (SoC)

As the name suggests, this basically means shrinking the whole system onto a single chip. The most important feature of the chip is that its functionality should be comparable to that of the original system. Such integration improves quality, productivity and performance.

Figure 2. An SoC example


2.2 Design Abstraction levels Every system should be decomposed into three fundamental domains:

1. Behavioral Domain 2. Structural Domain 3. Physical Domain

In every domain there are different layers, or levels of hierarchy. The following onion diagram gives a better understanding of this:

Figure 3. Onion Diagram We can design the system at various layers, which are called design abstraction levels:

1. Architecture 2. Algorithm 3. Modules (or Functions) 4. Logic 5. Switch 6. Circuit 7. Device

In this course, we are only dealing with Logic, Switch and Circuit levels.


Representation examples Behavioral Representation Structural Representation

2.3 Delay and Interconnect Issues in physical circuits It must be noted that when the adder described in the structural representation above is realized physically, the output does not appear at the instant the input is applied: if the input is given at time t = 0, the output is obtained at time t = t1 > 0, where t1 may range from picoseconds to milliseconds, but is never zero.

Figure 4: Delay in system output

These delays occur in the devices used to realize the system. Today, however, the major concern of designers is the interconnecting wires that connect the various devices. They are the major bottleneck in the speed of systems today. The delays arise from parasitic resistances and capacitances present in the circuits. A detailed discussion of circuit interconnects will follow in later lectures. Recap In this lecture you have learnt the following

• What is System? • Design Abstraction Levels • Delay and Interconnect Issues in physical circuits

Congratulations, you have finished Lecture 2.


Module 4 : Propagation Delays in MOS Lecture 20 : Analyzing Delay in few Sequential Circuits Objectives In this lecture you will learn the delays in following circuits

• Motivation • Negative D-Latch • S-R Latch using NOR Gates • Simple Latch using two Inverters (Bistable Element) • Master Slave Flip-Flop

20.1 Motivation We know that digital circuits are formed from two types of components: (1) combinational circuits and (2) sequential circuits. Combinational components implement logic only and cannot store bits, i.e. cannot work as memory. Sequential components can store bits and are hence used as memory elements. How fast a circuit containing memory elements (i.e. sequential elements) can store or retrieve a value depends on the delays of each basic sequential element, e.g. flip-flops. In the coming sections we analyze the basic functionality and delay of such sequential elements. 20.2 Negative D-Latch Structure: This circuit consists of a multiplexer and an inverter. Data is fed to the i1 input of the mux, whereas the output is fed through the inverter back to the i2 input of the mux. The clock drives the select input. Working:

Fig 20.2 Negative D-Latch Circuits

When clock = '0' the data is passed to the output. When clock = '1' the data gets latched. This circuit can be converted into a positive latch by applying an inverted clock at the select input. Note: latch = level sensitive; flip-flop = edge triggered.
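The level-sensitive behavior above can be sketched behaviorally (a plain-Python model; names are illustrative, not from the lecture):

```python
def negative_d_latch(clk, d, q_prev):
    """clk = 0: transparent (q follows d); clk = 1: opaque (q holds)."""
    return d if clk == 0 else q_prev

q = 0
q = negative_d_latch(0, 1, q)   # clock low: transparent, q follows d -> 1
q = negative_d_latch(1, 0, q)   # clock high: latched, d is ignored
print(q)  # 1
```

Swapping the two branches of the conditional models the positive latch obtained by inverting the clock at the select input.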


20.3 S-R Latch using NOR Gates Structure: This circuit is built from two NOR gates. In the first NOR gate one input is R, the other input is the feedback signal Q̄, and the output is Q. In the second NOR gate the inputs are S and Q, and the output is Q̄. Thus we see that the two gates are connected in a feedback (cross-coupled) configuration.

Fig 20.3: S-R Latch circuit using NOR gates Note: Synchronous circuit: a circuit is said to operate synchronously if the output data rate equals the clock rate. Asynchronous circuit: a circuit is said to operate asynchronously if the output data rate is less than the clock rate. Setup time: the time for which valid data must be present before the clock edge arrives. Hold time: the time for which the data must be held after the arrival of the clock edge. Sufficient setup and hold time must be provided to prevent contention of data. 20.4 Simple Latch using two Inverters (Bistable Element) Structure: Here the output voltage of ig1 equals the input voltage of ig2, and vice-versa. Fig 20.41 shows the latch built from two inverters.

Fig 20.41: Latch using two inverters

Truth Table:


Working: Notice that the input and output voltages of ig2 correspond to the output and input voltages of ig1 respectively. It can be seen that the two voltage transfer characteristics intersect at three points: two of them are stable, while the middle point is unstable. The gain at the stable points is less than unity, so if the circuit sits at either of these points it remains there. The voltage gain at the third operating point is greater than unity: a small perturbation of the input is amplified and drives the circuit to one of the two stable states. Hence this middle state is called the metastable state. Since the circuit has two stable operating points, it is called bistable.

The potential energy is at a minimum at two of the three operating points, where the voltage gains of both inverters are zero, and at a maximum at the operating point where the voltage gains of both inverters are maximum. Thus the circuit has two stable states corresponding to the two energy minima, and one unstable state corresponding to the potential-energy maximum.

Consider the above circuit at vg1 = vg2 = Vinv, the unstable operating point. Assume that the input capacitance Cg of each inverter is much larger than its output capacitance Cd. The drain current of each inverter then equals the gate current of the other inverter.

The small-signal drain current of each inverter is

id1 = gm·vg1,  id2 = gm·vg2   --eq1

where gm represents the transconductance of the inverter. The gate charges q1 and q2 are

q1 = Cg·vg1,  q2 = Cg·vg2   --eq2

Fig 20.42: Stability Graph

The small-signal gate current of each inverter can be written as

ig1 = dq1/dt,  ig2 = dq2/dt   --eq3

Using eq1 and eq3, and the fact that each drain current charges the gate of the other inverter,

Cg·dvg1/dt = gm·vg2,  Cg·dvg2/dt = gm·vg1   --eq4

In terms of q1 and q2, these equations become

dq1/dt = (gm/Cg)·q2   --eq5

dq2/dt = (gm/Cg)·q1   --eq6


Combining equations eq5 and eq6, we will get

d²q1/dt² = (gm/Cg)²·q1   --eq7

This expression is simplified by using τ0 = Cg/gm, the transient time constant.

The time-domain solution is

q1(t) = A·e^(t/τ0) + B·e^(−t/τ0)   --eq8

The initial conditions are the small perturbations dVo1(0) and dVo2(0) of the two output voltages about Vinv. Solving with these, and keeping the dominant growing term, we get

dVo1(t) ≈ dVo1(0)·e^(t/τ0),  dVo2(t) ≈ dVo2(0)·e^(t/τ0)   --eq9

Note that the magnitudes of both output voltages increase exponentially with time. Depending on the polarity of the initial small perturbations dVo1(0) and dVo2(0), the output voltages will diverge from their initial value of Vinv to either Vol or Voh.
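The exponential divergence in eq9 can be checked numerically; the sketch below forward-Euler integrates the coupled small-signal equations (the values of gm, Cg and the initial perturbation are illustrative, not from the lecture):

```python
import math

gm, cg = 1e-3, 1e-15           # 1 mS, 1 fF -> tau0 = cg/gm = 1 ps (illustrative)
tau0 = cg / gm
dt = tau0 / 1000.0
v1 = v2 = 1e-6                 # small initial perturbation about Vinv (volts)

t = 0.0
for _ in range(5000):          # integrate out to 5 * tau0
    dv1 = (gm / cg) * v2 * dt  # Cg dv1/dt = gm * v2   (eq4)
    dv2 = (gm / cg) * v1 * dt  # Cg dv2/dt = gm * v1
    v1, v2 = v1 + dv1, v2 + dv2
    t += dt

# Closed form: dV(t) = dV(0) * exp(t / tau0); both values should be near e^5
print(v1 / 1e-6, math.exp(t / tau0))
```

The simulated growth tracks the closed-form e^(t/τ0) envelope, confirming that the perturbation, not the nominal operating point, determines where the latch settles.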

Fig 20.3: Voltage Stability graph Solution to the metastability problem:

1. The inverters should not be identical. 2. The lines connecting the two inverters should be of different lengths.

Note:

1. This same circuit can be used as a static RAM cell: once the data fed in has started circulating, the input can be removed, since the data keeps on circulating.

2. This circuit is also called a transparent latch or level-sensitive latch.


20.5 Master Slave Flip-Flop or 1-bit Shift Register (D negative-edge-triggered flip-flop)

1: When the clock is at logic low, pass gate no. 1 of the master is transparent and passes the input to its output.

2: When the clock is at logic high, pass gate no. 2 of the master becomes transparent and the input gets latched.

3: In the second part, when the clock is at logic high, the slave passes the master's output, which was initially inverted, to its own output by inverting it again. Thus the input data D reaches the output at the negative edge of the clock cycle.

4: When the clock is low, the slave latches the output.

Fig 20.41: Master Slave Flip-Flop circuit

Another circuit for a negative-edge-triggered D flip-flop: Advantage: this circuit needs just 8 transistors.

Fig 20.42: Alternative circuit of Master Slave Flip-Flop

Working: When the clock input is low, the output is in a high-impedance state, so no data appears at the output and the data previously stored on the output capacitance is latched. In the next clock phase, the output terminal is connected to the pull-down or pull-up network, so the data at the input of that network is stored at the output in inverted form. The same process takes place in the second part of the circuit. Thus the data is shifted by one bit per clock, and the circuit acts as a 1-bit shift register. Disadvantages: 1) There is a problem of charge sharing in this circuit.

2) It is suitable only for slower circuits. Note: clock frequency <= 1/(5 × propagation delay).


Recap In this lecture you have learnt the following

• Motivation • Negative D-Latch • S-R Latch using NOR Gates • Simple Latch using two Inverters (Bistable Element) • Master Slave Flip-Flop

Congratulations, you have finished Lecture 20.


Module 4 : Propagation Delays in MOS Lecture 21 : Logical Effort Objectives In this lecture you will learn the following

• Motivation for Logical Effort • Definition of Logical Effort • Delay in a Logical Gate

21.1 Motivation for Logical Effort Here we introduce the concepts on which logical effort is based; logical effort as introduced by Sutherland et al. is just a formalized representation of these concepts. The propagation delay of a MOS gate depends on the capacitance of its transistors: as the width W is increased, capacitance increases, and so does the propagation delay. Let tp0 denote the propagation delay of a gate (say an inverter) when its input capacitance Cin equals the load capacitance CL. If CL is not equal to Cin, then

tp = tp0·(CL/Cin)

Now let us see whether introducing a buffer can reduce the propagation delay of the gate (see figure 21.11).

Fig 21.11: A circuit for propagation delay calculation

Here, the input capacitance of the 1st gate is Cin and the load capacitance is CL. The input capacitance of the 2nd gate (the buffer), which is also the load of the 1st, is u·Cin. Consider Y = CL/Cin. The total delay of the two stages is then

tp = tp0·u + tp0·(Y/u) = tp0·(u + Y/u)

Now let us find out for which load introducing a buffer provides a performance improvement. Setting dtp/du = tp0·(1 − Y/u²) = 0 gives u = √Y, so the minimum two-stage delay is 2·√Y·tp0. Comparing this with the direct-drive delay Y·tp0, we observe that a buffer will yield better results for Y > 4.

For a chain of buffers, the input capacitance of the first gate is Cin, that of the 2nd is u·Cin, that of the next is u²·Cin, and so on; for the last one the input capacitance is u^(N−1)·Cin.
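The two-stage argument can be checked numerically; a minimal sketch with tp0 normalized to 1:

```python
import math

def direct_delay(Y):
    """Delay (in units of tp0) when the gate drives Y = CL/Cin directly."""
    return Y

def buffered_delay(Y, u):
    """Delay with one inserted buffer of relative size u: u + Y/u."""
    return u + Y / u

for Y in (2.0, 4.0, 16.0):
    u_opt = math.sqrt(Y)     # optimum from d/du (u + Y/u) = 0
    print(Y, direct_delay(Y), buffered_delay(Y, u_opt))
# At Y = 4 both delays equal 4; below that the buffer loses, above it wins.
```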


We have u^N·Cin = CL, i.e. u^N = Y, so the number of stages is N = ln Y / ln u and the total chain delay is tp = N·u·tp0 = (ln Y / ln u)·u·tp0.

Fig 21.12: A circuit of high order for propagation delay calculation

Minimizing the total delay with respect to u gives ln u = 1, i.e. u = e. Taking ln u = 1 and Y = 1000, N = ln 1000 ≈ 7. Therefore, with 7 stages we can drive a load capacitance 1000 times the input capacitance. Such cases arise when we have to drive, let's say, a motor; then Y > 1000 and large currents have to be delivered. In such cases, where the load current is to be provided outside the chip, the buffer should be placed very close to the output pad to avoid adding line capacitance. 21.2 Definition of Logical Effort The method of logical effort is an easy way to estimate the delay in a MOS circuit. The method can be used to decide the number of logic stages on a path and also what the sizes of the transistors should be. Using this method we can make simple estimates in the early stages of design, which can be a starting point for more detailed optimizations. The logical effort of a gate tells how much worse it is at producing output current than an inverter, given that each of its inputs may present only the same input capacitance as the inverter. Reduced output current means slower operation, and thus the logical effort number for a logic gate tells how much more slowly it will drive a load than an inverter would. Equivalently, logical effort is how much more input capacitance a gate must present to deliver the same output current as an inverter. As the table of logical efforts (fig. 22.71, presented in a later lecture) shows, the logical effort increases as the complexity of a gate increases. Also, for the same logic gate, as the number of inputs increases, the logical effort increases. Thus, larger or more complex logic gates exhibit more delay. We can therefore evaluate different choices of logical structure by considering their logical effort. For example, designs that minimize the number of stages will require more inputs for each logic gate and thus have larger logical effort. Similarly, designs with fewer inputs, and thus less logical effort per stage, may require more stages of logic. These tradeoffs should be evaluated for an optimum design. 21.3 Delay in a Logic Gate


Delays in a MOS gate are caused by capacitive loads and by the gate topology. We take the inverter as the unit gate and compare the performance of other gates against it. A complex logic gate, which may have transistors connected in series, will have more delay than an inverter with similar transistor sizes driving the same load, as series transistors are poorer at driving current. The method of logical effort quantifies these effects. We will use τ as the delay unit that characterizes a given MOS process; τ is about 50 ps for a typical 0.6 µm process.

The absolute delay of the gate is dabs = d·τ, where d is the unitless delay of the gate. The delay incurred by a logic gate can be expressed as d = f + p, where p is a fixed part called the parasitic delay and f, which is proportional to the load on the gate's output, is called the effort delay or stage effort; d is measured in units of τ. The effort delay f depends on the load and on the properties of the logic gate driving it, and comprises two components: f = g·h, where g, the logical effort, accounts for the properties of the gate, and h, the electrical effort, characterizes the load. Combining the above equations we get d = g·h + p. Thus we see that there are four components that contribute to delay, namely τ, g, h and p. The process parameter τ represents the speed of the basic transistor. The parasitic delay, p, represents the intrinsic delay of the gate due to its own internal capacitance. The electrical effort, h, is Cout/Cin, where Cout is the capacitance due to the load and Cin is the capacitance set by the sizes of the gate's transistors. The logical effort, g, expresses the effect of circuit topology and is independent of load and transistor sizing; thus logical effort depends only on circuit topology. Recap In this lecture you have learnt the following

• Motivation for Logical Effort • Definition of Logical Effort • Delay in a Logical Gate

Congratulations, you have finished Lecture 21.


Module 4 : Propagation Delays in MOS Lecture 22 : Logical Effort Calculation of few Basic Logic Circuits Objectives In this lecture you will learn the following

• Introduction • Logical Effort of an Inverter • Logical Effort of NAND Gate • Logical Effort of NOR Gate • Logical Effort of XOR Gate • Logic Effort Calculation of few Mixed Circuits • Delay Plot

22.1 Introduction The method of logical effort is an easy way to estimate delay in a CMOS circuit. We can select the fastest candidate by comparing delay estimates of different logic structures. The method also specifies the proper number of logic stages on a path and the best transistor sizes for the logic gates. Because the method is easy to use, it is ideal for evaluating alternatives in the early stages of a design and provides a good starting point for more intricate optimizations. It is founded on a simple model of the delay through a single MOS logic gate. The model describes delays caused by the capacitive load that the logic gate drives and by the topology of the logic gate. Clearly, as the load increases, the delay increases, but the delay also depends on the logic function of the gate. Inverters, the simplest logic gates, drive loads best and are often used as amplifiers to drive large capacitances. Logic gates that compute other functions require more transistors, some of which are connected in series, making them poorer than inverters at driving current. Thus a NAND gate has more delay than an inverter with similar transistor sizes that drives the same load. The method of logical effort quantifies these effects to simplify delay analysis for individual logic gates and multistage logic networks.


22.2 Logical Effort of an Inverter

Fig 22.21: Inverter Circuit The logical effort of an inverter is defined to be unity. 22.3 Logical Effort of a NAND Gate A NAND gate contains two NMOS (pull-down) transistors in series and two PMOS (pull-up) transistors in parallel, as shown in fig 22.3.

Fig 22.3:2-input NAND

We size the transistors such that the gate has the same drive characteristics as an inverter with a pull-down of width 1 and a pull-up of width 2. Because the two pull-down transistors are in series, each must have twice the conductance of the inverter pull-down transistor so that the series connection has a conductance equal to that of the inverter pull-down transistor. Hence these two transistors have twice the width of the inverter pull-down transistor. By contrast, each of the two pull-up transistors in parallel need only be as large as the inverter pull-up transistor to achieve the same drive as the reference inverter. So the logical effort per input is g = (2+2)/(1+2) = 4/3. For a 3-input NAND gate, g = (3+2)/(1+2) = 5/3. For an n-input NAND gate, g = (n+2)/3.


22.4 Logical Effort of a NOR Gate A NOR gate contains two pull-down transistors in parallel and two pull-up transistors in series, as shown in figure 22.4.

Fig 22.4: 2-input NOR

Because the two pull-up transistors are in series, each must have twice the conductance of the inverter pull-up transistor so that the series connection has a conductance equal to that of the inverter pull-up transistor. Hence these two transistors have twice the width of the inverter pull-up transistor, i.e. width 4. By contrast, each of the two pull-down transistors in parallel need only be as large as the inverter pull-down transistor to achieve the same drive as the reference inverter. So the logical effort per input of the NOR gate is g = (1+4)/(1+2) = 5/3. For an n-input NOR gate, g = (2n+1)/3. 22.5 Logical Effort of a XOR Gate A two-input XOR gate is shown in figure 22.5.

Fig 22.5: XOR Gate

Here we calculate the logical effort for a bundle (A* or B*) instead of a single input, since complementary inputs are applied. The logical effort for bundle A is g = (2+4+2+4)/(1+2) = 4, and likewise for bundle B, g = (2+4+2+4)/(1+2) = 4.
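The per-input formulas above reduce to two one-liners (assuming the 2:1 pull-up : pull-down reference inverter used throughout):

```python
def g_nand(n):
    """n-input NAND: n series NMOS of width n, parallel PMOS of width 2."""
    return (n + 2) / 3

def g_nor(n):
    """n-input NOR: n series PMOS of width 2n, parallel NMOS of width 1."""
    return (2 * n + 1) / 3

print(g_nand(2), g_nand(3))   # 4/3 and 5/3
print(g_nor(2))               # 5/3
```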


22.6 Examples Circuits Example Circuit 1:

Fig 22.61: Example Circuit 1 Example Circuit 2: 4 BIT MUX

Fig 22.62: Example Circuit 2 22.7 Tabular View of Logical Efforts Logical effort for different circuits is tabulated in the table below in fig. 22.71

Fig 22.71: Logical efforts of basic gates with different input configurations

Example 1

Example 2: Logical effort for input D is gD = (2+4)/(1+2) = 2. Logical effort for bundle S is gS = (2+4)/(1+2) = 2. For one arm, g = 12/3 = 4. For an N-way symmetrical MUX, g = 4N (this holds for the static CMOS MUX only).


The parasitic delays of the basic gates are tabulated in fig. 22.72.

Fig 22.72: parasitic delay of basic gates

Now delay d = g·h + p. For example, dINV = (1×1) + 1 = 2. If we assume τ = 25 ps, the absolute delay is dABS = d·τ = 50 ps. 22.8 Delay Plot The delay of a simple logic gate, as represented by the equation d = g·h + p, is a simple linear relationship.

Fig 22.8: Delay Plot

Fig 22.8 shows this relationship graphically: delay appears as a function of electrical effort for an inverter and for a two-input NAND gate. The slope of each line is the logical effort of the gate; its intercept is the parasitic delay. The graph shows that we can adjust the total delay by adjusting the electrical effort or by choosing a logic gate with a different logical effort.
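The two lines in the plot can be reproduced directly from d = g·h + p (taking pNAND = 2·pinv = 2, consistent with the parasitic-delay table, and τ = 25 ps as assumed earlier):

```python
def gate_delay(g, h, p):
    """Normalized delay d = g*h + p, in units of tau."""
    return g * h + p

tau = 25e-12                        # 25 ps, as assumed in the text
print(gate_delay(1, 1, 1))          # inverter driving an equal inverter: 2
print(gate_delay(1, 1, 1) * tau)    # absolute delay: 50 ps
for h in (1, 2, 4):
    print(h, gate_delay(1, h, 1), gate_delay(4 / 3, h, 2))
# The NAND line rises with slope 4/3 (its logical effort) from intercept 2
# (its parasitic delay); the inverter line has slope 1 and intercept 1.
```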


Example 3: Fanout-of-4 (FO4) inverter circuit

Fig22.82: FO4 circuit

Because each inverter is identical, Cout = 4Cin, so h = 4. The logical effort g = 1 for an inverter. Thus FO4 delay is, d = gh + p = 1*4 + pINV =4 + 1 = 5. Recap In this lecture you have learnt the following

• Introduction • Logical Effort of an Inverter • Logical Effort of NAND Gate • Logical Effort of NOR Gate • Logical Effort of XOR Gate • Logic Effort Calculation of few Mixed Circuits • Delay Plot

Congratulations, you have finished Lecture 22.


Module 4 : Propagation Delays in MOS Lecture 23 : Logical Effort of Multistage Logic Networks Objectives In this lecture you will learn the following

• Logical Effort of Multistage Logic Networks • Minimizing Delay along a Path • Few Examples

23.1 Logical Effort of Multistage Logic Networks The logical effort along a path compounds by multiplying the logical efforts of all the logic gates along the path. We denote it by the letter 'G'. Hence,

G = ∏ gi

The electrical effort along a path through the network is simply the ratio of the capacitance that loads the last logic gate in the path to the input capacitance of the first gate in the path. We denote it by the letter 'H':

H = Cout(path) / Cin(path)

When fanout occurs within a logic network, some of the available drive current is directed along the path we are analyzing, and some is directed off that path. The branching effort b at the output of a logic gate is defined as

b = (Con-path + Coff-path) / Con-path

where Con-path is the load capacitance along the path and Coff-path is the capacitance of connections that lead off the path. If there is no branching in the path, the branching effort is unity. The branching effort along the entire path, B, is the product of the branching efforts at each of the stages along the path:

B = ∏ bi

The path effort F is defined as

F = G·B·H

The path branching and electrical efforts are related to the electrical efforts of the stages by

∏ hi = B·H


The path delay D is the sum of the delays of each of the stages of logic in the path:

D = Σ di = DF + P

where DF is the path effort delay and P is the path parasitic delay, given as

DF = Σ gi·hi,  P = Σ pi

23.2 Minimizing Delay along a Path Consider two path stages as in figure 23.21.

Fig 23.21: An Example Circuit

The total delay of the above circuit is given by

D = g1·h1 + p1 + g2·h2 + p2

Writing h1 = Cmid/Cin and h2 = Cout/Cmid, where Cmid is the input capacitance of the second stage, and substituting into the equation for D, we get

D = g1·(Cmid/Cin) + g2·(Cout/Cmid) + p1 + p2

To minimize D, we take the partial derivative of D with respect to Cmid and set it to zero, which gives

g1·h1 = g2·h2

i.e. the stage effort (the product of logical effort and electrical effort) of each stage should be equal to obtain minimum delay. This is independent of the scale of the circuit and of the parasitic delay; the delays of the two stages will differ only if their parasitic delays are different. We can generalize this result for N stages as

gi·hi = f̂ = F^(1/N) for every stage i


So, the minimum path delay is

D̂ = N·F^(1/N) + P

Example of minimizing delay: Consider the path from A to B involving three two-input NAND gates as in fig 23.22. The input capacitance of the first gate is C and the load capacitance is also C. Find the least delay of this path and how the transistors should be sized to achieve it.

Fig 23.22: Example Circuit

Solution: The logical effort of a two-input NAND gate is g = 4/3, so G = (4/3)³ = 64/27 ≈ 2.37. B = 1 (as there is no branching) and H = Cout/Cin = 1, so the path effort is F = (64/27)×1×1 = 64/27 and the optimal stage effort is f̂ = F^(1/3) = 4/3. If each stage has the same parasitic delay, then P = p1 + p2 + p3 = 6·pinv (as all gates are two-input NANDs, each with p = 2·pinv), so

D̂ = 3·(64/27)^(1/3) + 6 = 4 + 6 = 10

Working backward from the output with Ci = gi·Cout,i / f̂: Cz = (4/3)·C/(4/3) = C and Cy = (4/3)·C/(4/3) = C. Now if Cout = 8C, then F = (64/27)×8 = 512/27, f̂ = (512/27)^(1/3) = 8/3, D̂ = 3·(8/3) + 6 = 14, and the sizes become Cz = (4/3)·8C/(8/3) = 4C and Cy = (4/3)·4C/(8/3) = 2C.
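The example can be verified with a short script implementing F = G·B·H, f̂ = F^(1/N), and the backward sizing rule (capacitances in units of C):

```python
def path_min_delay(g_list, p_list, B, H):
    """Minimum path delay: F = G*B*H, per-stage effort f_hat = F**(1/N),
    D_hat = N*f_hat + sum(p)."""
    N = len(g_list)
    G = 1.0
    for g in g_list:
        G *= g
    f_hat = (G * B * H) ** (1.0 / N)
    return f_hat, N * f_hat + sum(p_list)

g = [4 / 3] * 3                  # three 2-input NAND gates
p = [2, 2, 2]                    # p_nand = 2 * p_inv

f_hat, D = path_min_delay(g, p, B=1, H=1)   # Cout = Cin = C
print(f_hat, D)                             # 4/3 and 10

f_hat, D = path_min_delay(g, p, B=1, H=8)   # Cout = 8C
Cz = (4 / 3) * 8 / f_hat        # backward sizing Ci = gi * Cout_i / f_hat
Cy = (4 / 3) * Cz / f_hat
print(f_hat, D, Cz, Cy)                     # 8/3, 14, 4 (i.e. 4C), 2 (i.e. 2C)
```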

23.3 Reduction of Delay To achieve the minimum delay of the circuit we optimize the number of stages. Let the total number of stages be N = n1 + n2.


Fig 23.31: Example Circuit

But the number of stages for minimum delay may not be an integer, and a fractional number of stages is not feasible to implement. So we realize the circuit by taking either the greatest integer below the obtained value or the integer above it, whichever gives the smaller delay.

Fig 23.32:

We will study this in more detail in the next chapter. Recap In this lecture you have learnt the following

• Logical Effort of Multistage Logic Networks • Minimizing Delay along a Path • Few Examples

Congratulations, you have finished Lecture 23.


Module 4 : Propagation Delays in MOS Lecture 24 : Methods for Reduction of Delays in Multistage Logic Networks Objectives In this lecture you will learn the following

• Effect of Using Wrong Number of Stages • Dynamic Latch • Carry Propagation Gate • Dynamic Muller C-element • Fork

24.1 Using the Wrong Number of Stages Let us assume that the number of stages is wrong by a factor s, i.e. the number of stages used is s·N̂, where N̂ is the best number to use. The delay can be expressed as a function of the stage count N (assuming the parasitic delay of each stage is the same value p) as

D(N) = N·(F^(1/N) + p)

Let r be the ratio of the delay when using s·N̂ stages to the delay when using the best number of stages, N̂. So,

r = D(s·N̂) / D(N̂)

Since N̂ is the best number of stages, we know that F^(1/N̂) = ρ, the best stage effort. Solving for r we obtain

r = s·(ρ^(1/s) + p) / (ρ + p)

This relationship is plotted in the figure for p = 1, for which the best stage effort is ρ = 3.59.
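A minimal sketch evaluating this ratio for p = 1 and ρ = 3.59, the values used in the figure:

```python
def relative_delay(s, rho=3.59, p=1.0):
    """r = D(s*N_hat)/D(N_hat) = s*(rho**(1/s) + p)/(rho + p)."""
    return s * (rho ** (1.0 / s) + p) / (rho + p)

for s in (0.5, 0.7, 1.0, 1.4, 2.0):
    print(s, round(relative_delay(s), 3))
# r = 1 at s = 1; halving the stage count costs about 51%, doubling it
# about 26%: a fairly flat optimum, as the figure shows.
```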


Fig 24.11: The relative delay compared to the best possible, as a function of the relative error in the number of stages used

A designer often faces the problem of deciding whether it would be beneficial to change the number of stages in an existing circuit. This can easily be done by calculating the stage effort. If the effort is between 2 and 8, the design is within 35% of the best delay; if the effort is between 2.4 and 6, the design is within 15% of the best delay. Therefore, there is little benefit in modifying a circuit unless its stage effort is grossly high or low. 24.2 Dynamic Latch Fig 24.21 shows a dynamic latch: when the clock signal φ is HIGH, and its complement

φ̄ is LOW, the gate output q is set to the complement of the input d. The total logical effort of this gate is 4; the logical effort per input for d is 2, and the logical effort of the clock

bundle is also 2 (the φ and φ̄ inputs together contribute a logical effort of 2).


Fig 24.21: A dynamic latch with input d and output q.

The clock inputs φ and φ̄ together form the clock bundle.

24.3 Carry Propagation Gate Fig 24.31 shows one stage of a ripple-carry chain in an adder. The stage accepts a carry-in Cin and delivers the carry out in inverted form on C̄out. The inputs g and k̄ come from the two bits to be summed at this stage. The signal g is HIGH if this stage generates a new carry, forcing C̄out LOW. Similarly, k̄ is LOW if this stage kills incoming carries, forcing C̄out HIGH. The total logical effort of this gate is the sum of its per-input efforts: the logical effort per input for Cin is 2, and the efforts of the g and k̄ inputs follow in the same way from the widths of the transistors they drive in the figure.

Fig 24.31: A carry propagation gate


24.4 Dynamic Muller C-element Fig 24.41 shows an inverting dynamic Muller C-element with two inputs. Although this gate is rarely seen in designs for synchronous systems, it is a staple of asynchronous system design. The behavior of the gate is as follows: When both inputs are HIGH, the output goes LOW; when both inputs go LOW, the output goes HIGH. In other conditions, the output retains its previous value - the C-element thus retains state. The total logical effort of this gate is 4, divided between the two inputs.

Fig 24.41: A two input inverting dynamic Muller C-element
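The gate's behavior described above can be sketched as a small state-holding function (behavioral only; names are illustrative):

```python
def muller_c_inv(a, b, out_prev):
    """Inverting Muller C-element: 0 when both inputs are 1, 1 when both
    are 0; otherwise the dynamic node holds its previous value."""
    if a == 1 and b == 1:
        return 0
    if a == 0 and b == 0:
        return 1
    return out_prev

out = 1
out = muller_c_inv(1, 1, out)   # both HIGH -> 0
out = muller_c_inv(1, 0, out)   # inputs differ -> holds 0
out = muller_c_inv(0, 0, out)   # both LOW -> 1
print(out)  # 1
```

The hold branch is what makes the C-element useful for asynchronous handshakes: the output changes only when both inputs agree.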

24.5 Fork If we try to use a signal and an inverter for the complementary signal, we get unequal delays between the two signals. So we use N stages and adjust the sizing such that we get two complementary signals with equal delay. Fig 24.51 shows a 2-1 fork and a 3-2 fork, both of which produce the same logic signals. Fig 24.52 shows a general fork.

Fig 24.51: A 2-1 fork and 3-2 fork


Fig 24.52: A general fork

The design of a fork starts out with a known load on the output legs and a known total input capacitance. As shown in Fig 24.52, we shall call the two output capacitances Ca and Cb, and the combined total load driven Cout = Ca + Cb. The total input capacitance of the fork we shall call Cin, and we can thereby describe the electrical effort of the fork as a whole as H = Cout/Cin = (Ca + Cb)/Cin. This electrical effort of the fork may differ from the electrical efforts of the individual legs, Ha and Hb.

The input current to an optimized fork may divide unequally to drive its two legs. Even if the load capacitances on the two legs of the fork are equal, it is not in general true that the input capacitances of the two legs are equal. Because the legs have different numbers of amplifiers but must operate with the same delay, their electrical efforts may differ. The leg that can support the larger electrical effort, usually the leg with more amplifiers, will require less input current than the other leg, and can therefore have a smaller input capacitance. If we split the input capacitance as Cin = Cina + Cinb, then the electrical efforts of the two legs are Ha = Ca/Cina and Hb = Cb/Cinb. Even if Ca = Cb, Cina may not equal Cinb, and Ha and Hb may also differ.

The design of a fork is a balancing act. Either leg of the fork can be made faster by reducing its electrical effort, which is done by giving it wider transistors for its amplifier. Doing so, however, takes input current away from the other leg of the fork and will inevitably make it slower. A fixed value of Cin provides, in effect, only a certain total width of transistor material to distribute between the first stages of the two legs; putting wider transistors in one leg requires putting narrower transistors in the other leg. The task of designing a minimum-delay fork is really the task of allocating the available transistor width, set by Cin, to the input stages of the two legs.
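The allocation problem can be sketched numerically. The toy model below (illustrative numbers; inverter legs with g = 1 and p = 1, and per-leg minimum delay N·H_leg^(1/N) + N·p) bisects on the share of the input capacitance given to the two-inverter leg of a 2-1 fork until the leg delays match:

```python
def leg_delay(n_stages, c_in_leg, c_load, p=1.0):
    """Minimum delay of a leg of n_stages inverters (g = 1) driving c_load."""
    h = c_load / c_in_leg
    return n_stages * (h ** (1.0 / n_stages)) + n_stages * p

def balance_fork(c_in, c_a, c_b, iters=60):
    """Bisect on the 2-inverter leg's share of c_in until both legs of a
    2-1 fork have equal delay."""
    lo, hi = 1e-9, c_in - 1e-9
    mid = 0.5 * (lo + hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if leg_delay(2, mid, c_a) > leg_delay(1, c_in - mid, c_b):
            lo = mid   # two-inverter leg is slower: give it more input cap
        else:
            hi = mid
    return mid

share = balance_fork(c_in=10.0, c_a=100.0, c_b=100.0)
# Even with equal loads, the 2-inverter leg ends up with the smaller share,
# as the text predicts for the leg with more amplifiers.
print(share, leg_delay(2, share, 100.0), leg_delay(1, 10.0 - share, 100.0))
```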


Recap In this lecture you have learnt the following

• Effect of Using Wrong Number of Stages • Dynamic Latch • Carry Propagation Gate • Dynamic Muller C-element • Fork

Congratulations, you have finished Lecture 24.


Module 4 : Propagation Delays in MOS Lecture 25 : Designing Asymmetric Logic Gates Objectives In this lecture you will learn the following

• Brief Introduction to Asymmetric Logic Gates • Application of Asymmetric Logic Gates • Analyzing Delays • Pseudo NMOS Circuits

25.1 Brief Introduction to Asymmetric Logic Gates Logic gates sometimes have different logical efforts for different inputs. We call such gates asymmetric. Asymmetric gates can speed up critical paths in a network by reducing the logical effort along those paths. This attractive property has a price, however: the total logical effort of the logic gate increases. This lecture discusses design issues arising from biasing a gate to favor particular inputs.

Fig 25.11: An asymmetric NAND gate

Fig 25.11 shows a NAND gate designed so that the widths of the two pull-down transistors can differ; input a has width 1/(1−s), while input b has width 1/s. The parameter s, 0 < s < 1, called the symmetry factor, determines the amount by which the logic gate is asymmetric. If s = 1/2, the gate is symmetric, the pull-down transistors have equal sizes, and the logical effort is the same as calculated in the previous lectures. Values of s between 0 and 1/2 favor the a input by making its pull-down transistor smaller than the pull-down transistor for b; values of s between 1/2 and 1 favor the b input.


The logical efforts per input for inputs a and b, and the total logical effort, are:

ga = (1/(1−s) + 2) / 3   (Eq 25.1)

gb = (1/s + 2) / 3   (Eq 25.2)

gtot = ga + gb = (4 + 1/(s·(1−s))) / 3   (Eq 25.3)

Choosing the least value possible for s, such as 0.01, minimizes the logical effort of input a. This design results in a pull-down transistor of width 1.01 for input a and a transistor of width 100 for input b. The logical effort of input a is then (1.01 + 2)/3 ≈ 1.003, or almost exactly 1. The logical effort of input b becomes (100 + 2)/3 = 34 for s = 0.01, and the total logical effort is about 35. Extremely asymmetric designs such as this one with s = 0.01 are able to achieve a logical effort for one input that almost matches that of an inverter, namely 1. The price of this achievement is an enormous total logical effort, 35, as opposed to 8/3 for a symmetric design. Moreover, the huge size of the pull-down transistor for b will certainly cause layout problems, and the benefit of the reduced logical effort on input a may not be worth the enormous area of this transistor. Less extreme asymmetry is more practical. If s = 1/4, the pull-down transistors have

widths 4/3 and 4, and the logical effort of input a is (4/3 + 2)/3 = 10/9, which is about 1.1. The logical effort of input b is (4 + 2)/3 = 2, and the total logical effort is 3.1, which is very little more than 8/3, the total logical effort of the symmetric design. This design achieves a logical effort for the favored input, a, that is only 10% greater than that of an inverter, without a huge increase in total logical effort.

25.2 Applications of Asymmetric Logic Gates The principal application of asymmetric logic gates occurs when one path must be very fast. For example, in a ripple carry adder or counter, the carry path must be fast. The best design uses an asymmetric circuit that speeds the carry even though it retards the sum output.
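The sizing trade-off above can be checked numerically. The sketch below (plain Python, function name ours) evaluates the asymmetric-NAND efforts for the s = 1/2, 1/4 and 0.01 cases discussed in the text, assuming pull-ups of width 2 as in the reference inverter:

```python
# Logical effort of a 2-input asymmetric NAND gate as a function of the
# symmetry factor s (0 < s < 1). Pull-down widths are 1/(1-s) and 1/s;
# both pull-up widths are 2, as in a reference inverter with a 2:1 P/N ratio.

def nand_efforts(s):
    g_a = (1 / (1 - s) + 2) / 3   # effort of the favored input a
    g_b = (1 / s + 2) / 3         # effort of input b
    return g_a, g_b, g_a + g_b

for s in (0.5, 0.25, 0.01):
    g_a, g_b, g_tot = nand_efforts(s)
    print(f"s={s}: g_a={g_a:.2f}, g_b={g_b:.2f}, total={g_tot:.2f}")
```

For s = 1/2 this reproduces the symmetric total of 8/3; for s = 0.01 the favored input approaches an inverter's effort of 1 while the total balloons to about 35.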


Paradoxically, another important use of asymmetric logic gates occurs when a signal path may be unusually slow, as in a reset signal. Figure 25.21 shows a design for a buffer amplifier whose output is forced LOW when the active-low reset signal is LOW. The buffer consists of two stages: a NAND gate and an inverter. During normal operation, when reset is HIGH, the first stage has an output drive equivalent to that of an inverter with pull-down width 6 and pull-up width 12, but the capacitive load on the in input is slightly larger than that of the corresponding inverter:

C_in = 6/(1-s) + 12 > 6 + 12 = 18 (Eq 25.4)

Fig 25.21: A buffer amplifier with a reset input. When reset is LOW, the output will always be LOW.

This circuit takes advantage of the slow response allowed to changes on reset by using the smallest pull-up transistor possible. This choice reduces the area required to lay out the gate, partially compensating for the large-area pull-down transistor. Area can be further reduced by sharing the reset pull-down among multiple gates that switch at different times; this is known as the Virtual Ground technique.


25.3 Analyzing Delays We can model the delay of an individual stage of logic with one of the following two expressions:

d_u = g_u · h + p_u (Eq 25.5)

d_d = g_d · h + p_d (Eq 25.6)

where h is the electrical effort of the stage and the delays are measured in units of τ. Notice that the logical efforts, parasitic delays, and stage delays differ for rising transitions (u) and falling transitions (d). In a path containing N logic gates, we use one of two equations for the path delay, depending on whether the final output of the path rises or falls. In the equations, i is the distance from the last stage, ranging from 0 for the final gate to (N-1) for the first gate.

D_u = Σ_(i odd) (g_d,i · h_i + p_d,i) + Σ_(i even) (g_u,i · h_i + p_u,i) (Eq 25.7)

D_d = Σ_(i odd) (g_u,i · h_i + p_u,i) + Σ_(i even) (g_d,i · h_i + p_d,i) (Eq 25.8)

Equation 25.7 models the delay incurred when a network produces a rising output transition. In this equation, the first sum tallies the delay of falling transitions at the output of stages whose distance from the last stage is odd, and the second tallies the delay of rising transitions at the output of stages whose distance from the last stage is even. Similarly, Equation 25.8 models the falling output transition. A reasonable goal is to minimize the average delay:

D_avg = (D_u + D_d) / 2 (Eq 25.9)

Then we have for the average delay:

D_avg = Σ_i ( ((g_u,i + g_d,i)/2) · h_i + (p_u,i + p_d,i)/2 ) (Eq 25.10)

25.4 Pseudo-NMOS Circuits


Fig 25.41: Pseudo-NMOS inverter, NAND and NOR gates, assuming a 4:1 pull-down to pull-up strength ratio

The delay analysis presented above applies to pseudo-NMOS designs as well. The PMOS transistor produces 1/3 of the current of the reference inverter, and the NMOS transistor stacks produce 4/3 of the current of the reference inverter. For falling transitions, the output current is the pull-down current minus the pull-up current, i.e., 4/3 - 1/3 = 1. For rising transitions, the output current is just the pull-up current, 1/3. The inverter and NOR gate have an input capacitance of 4/3. The falling logical effort is the input capacitance divided by that of an inverter with the same output current, or g_d = (4/3)/3 = 4/9. The rising logical effort is 3 times greater, g_u = 4/3, because the current produced on a rising transition is only 1/3 that of a falling transition. The average logical effort is g = (g_u + g_d)/2 = 8/9. This is independent of the number of inputs, explaining why pseudo-NMOS is a way to build fast, wide NOR gates. Table 25.41 shows the rising, falling and average logical efforts of other pseudo-NMOS gates, assuming a 2:1 P/N strength ratio in the reference inverter and a 4:1 pull-down to pull-up strength ratio.
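The pseudo-NMOS effort arithmetic above can be verified with a few lines (an illustrative sketch; the variable names are ours):

```python
# Logical efforts of a pseudo-NMOS inverter/NOR input.
# Reference inverter: input capacitance 3, output current 1. Pseudo-NMOS gate:
# input capacitance 4/3; pull-down current 4/3, always-on pull-up current 1/3.

C_IN = 4 / 3          # pseudo-NMOS input capacitance
I_PULLDOWN = 4 / 3    # NMOS stack current (relative to reference inverter)
I_PULLUP = 1 / 3      # PMOS load current

g_fall = C_IN / (3 * (I_PULLDOWN - I_PULLUP))  # net falling current 1 -> 4/9
g_rise = C_IN / (3 * I_PULLUP)                 # rising current 1/3 -> 4/3
g_avg = (g_rise + g_fall) / 2                  # average effort -> 8/9

print(g_fall, g_rise, g_avg)
```

Because the input capacitance of 4/3 does not grow with fan-in for the NOR, the 8/9 average holds regardless of the number of inputs.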

Recap


In this lecture you have learnt the following

• Brief Introduction to Asymmetric Logic Gates • Application of Asymmetric Logic Gates • Analyzing Delays • Pseudo NMOS Circuits

Congratulations, you have finished Lecture 25.


Module 5 : Power Dissipation in CMOS Circuits Lecture 26 : Power Dissipation in CMOS Circuits Objectives In this lecture you will learn the following

• Motivation • Effects of Power Dissipation • How to Reduce Temperature • Components of Power Dissipation • Static Power Dissipation • Dynamic Power Dissipation • Methods to Reduce Power Dissipation • Short-Circuit Power Dissipation

26.1 Motivation Why is power dissipation so important? Power dissipation considerations have become important not only from a reliability point of view; they have also assumed greater importance with the advent of portable battery-driven devices like laptops, cell phones and PDAs. 26.2 Effects of Power Dissipation When power is dissipated, it invariably leads to a rise in the temperature of the chip. This rise in temperature affects the device both when the device is off and when it is on.

When the device is off, the rise in temperature increases the number of intrinsic carriers, according to the relation:

n_i = C · T^(3/2) · e^(-E_g / 2kT) (Eq 26.1)

From this relation it can be seen that as temperature increases, the number of intrinsic carriers in the semiconductor increases. The majority carriers, contributed by the impurity atoms, are less affected by the increase in temperature; hence the device becomes more intrinsic. As temperature increases, the leakage current, which directly depends on the minority carrier concentration, increases, which leads to a further increase in temperature. Ultimately, the device might break down if the increase in temperature is not taken care of by timely removal of the dissipated heat. An ON device is not much affected by the minority carrier increase, but it is affected by V_T and µ, which decrease with increasing temperature and lead to a change in I_D. Hence the device performance might not meet the required specifications. Also, power dissipation is more critical in battery-powered applications: the greater the power dissipated, the shorter the battery life will be.

26.3 How to Reduce Temperature The heat generated due to power dissipation can be taken away by the use of heat sinks. A heat sink has lower thermal resistance than the package and hence draws heat from it. For the heat to be effectively removed, the rate of heat transfer from the area of heat generation to the ambient should be greater than the rate of heat generation. This rate of heat transfer depends on the thermal resistance. The thermal resistance, θ, is given by the following relation:

θ = l / (σ_c · A) (Eq 26.2)

where l = length, A = cross-sectional area and σ_c = thermal conductivity of the heat sink. From the above relation it can be seen that a large σ_c implies a smaller θ. θ is also given by the relation,

θ = (T_j - T_a) / P_D (Eq 26.3)

Using this relation, we can see that for a given power dissipation, P_D,

T_j = T_a + θ · P_D (Eq 26.4)

where T_j = junction temperature and T_a = ambient temperature. Heat sink materials are generally coated black to radiate more energy.

26.4 Components of Power Dissipation Unlike bipolar technologies, where a majority of the power dissipation is static, the bulk of the power dissipation in properly designed CMOS circuits comes from dynamically charging and discharging capacitances. Thus, a majority of low-power design methodology is dedicated to reducing this predominant factor of power dissipation. There are three main sources of power dissipation:

• Static power dissipation (P_S) • Dynamic power dissipation (P_D) • Short circuit power dissipation (P_SC)

Thus the total power dissipation, P_total, is

P_total = P_S + P_D + P_SC (Eq 26.5)
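Combining Eqs 26.2 through 26.5, one can estimate a chip's junction temperature from its power components and heat-sink geometry. A small sketch (all numbers below are made-up example values, not from the lecture):

```python
# Illustrative sketch: junction temperature from the thermal resistance of
# a heat sink (Eq 26.2) and the total power dissipation (Eqs 26.4, 26.5).

def thermal_resistance(l, area, sigma_c):
    """Eq 26.2: theta = l / (sigma_c * A)."""
    return l / (sigma_c * area)

def junction_temp(t_ambient, theta, p_static, p_dynamic, p_short):
    """Eqs 26.4 and 26.5: Tj = Ta + theta * (Ps + Pd + Psc)."""
    p_total = p_static + p_dynamic + p_short
    return t_ambient + theta * p_total

theta = thermal_resistance(l=0.02, area=0.0025, sigma_c=200)  # aluminium-like sink
tj = junction_temp(t_ambient=25.0, theta=theta, p_static=0.1,
                   p_dynamic=4.0, p_short=0.4)
print(f"theta = {theta:.3f} K/W, Tj = {tj:.2f} C")
```

As Eq 26.2 predicts, a larger conductivity or cross-section lowers θ and hence the junction temperature rise above ambient.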

26.5 Static Power Dissipation


Consider the complementary CMOS gate, shown in Figure 26.51

Fig 26.51: CMOS inverter model for static power dissipation evaluation

When the input = '0', the associated n-device is off and the p-device is on. The output voltage is V_DD, or logic '1'. When the input = '1', the associated n-device is on and the p-device turns off. The output voltage is 0 volts, or V_SS. It can be seen that one of the transistors is always off when the gate is in either of these logic states. Since no current flows into the gate terminal, and there is no DC current path from V_DD to V_SS, the resultant quiescent (steady-state) current, and hence the static power P_S, is zero. However, there is some small static dissipation due to reverse-bias leakage between the diffusion regions and the substrate. In addition, subthreshold conduction can contribute to the static dissipation. A simple model that describes the parasitic diodes for a CMOS inverter should be looked at in order to understand the leakage involved in the device. The source-drain diffusions and the n-well diffusion form parasitic diodes. In the model, a parasitic diode exists between the n-well and the substrate. Since the parasitic diodes are reverse biased, only their leakage current contributes to static power dissipation. The leakage current is described by the diode equation:


i = i_s · (e^(qV/kT) - 1) (Eq 26.6)

where i_s = reverse saturation current, V = diode voltage, q = electronic charge, k = Boltzmann's constant and T = temperature. The static power dissipation is the product of the device leakage current and the supply voltage:

P_S = Σ (i_leakage · V_DD), summed over the n leaking devices (Eq 26.7)
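Eqs 26.6 and 26.7 can be combined into a small numerical sketch. The saturation current, bias and device count below are illustrative values, not from the lecture:

```python
# Sketch of Eqs 26.6-26.7: reverse-bias diode leakage and the resulting
# static power. All device values are made-up examples.
import math

Q = 1.602e-19   # electronic charge (C)
K = 1.381e-23   # Boltzmann's constant (J/K)

def diode_current(i_s, v, temp=300.0):
    """Eq 26.6: i = i_s * (exp(qV/kT) - 1)."""
    return i_s * (math.exp(Q * v / (K * temp)) - 1.0)

# Reverse bias (negative V): the exponential vanishes and |i| saturates at i_s.
i_leak = abs(diode_current(i_s=1e-15, v=-2.5))

# Eq 26.7: static power, summed over n leaking devices at VDD = 2.5 V.
n_devices = 1_000_000
p_static = n_devices * i_leak * 2.5
print(f"leakage per device = {i_leak:.3e} A, static power = {p_static:.3e} W")
```

The point of the sketch is how the reverse-biased diode current saturates at i_s, so the static power scales simply with the device count and the supply voltage.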

26.6 Dynamic Power Dissipation During switching, either from '0' to '1' or, alternatively, from '1' to '0', both n- and p-transistors are on for a short period of time. This results in a short current pulse from V_DD to V_SS. Current is also required to charge and discharge the output capacitive load. This latter term is usually the dominant term. The current pulse from V_DD to V_SS results in a 'short-circuit' dissipation that depends on the input rise/fall time, the load capacitance and the gate design.

Fig 26.61: Power dissipation due to charging/discharging of capacitor

The dynamic dissipation can be modeled by assuming that the rise and fall time of the step input is much less than the repetition period. The average dynamic power, P_d, dissipated during switching for a square-wave input, V_in, having a repetition frequency of f_p = 1/t_p, is given by

P_d = (1/t_p) · [ ∫(0 to t_p/2) i_n(t) · V_out dt + ∫(t_p/2 to t_p) i_p(t) · (V_DD - V_out) dt ] (Eq 26.8)

where

i_n = n-device transient current

i_p = p-device transient current

For a step input and with i_n = C_L · dV_out/dt,

P_d = (C_L / t_p) · [ ∫(0 to V_DD) V_out dV_out + ∫(V_DD to 0) (V_out - V_DD) dV_out ] (Eq 26.9)

P_d = C_L · V_DD^2 / t_p (Eq 26.10)

with f_p = 1/t_p,

resulting in P_d = C_L · V_DD^2 · f_p (Eq 26.11)

Thus for a repetitive step input the average power that is dissipated is proportional to the energy required to charge and discharge the circuit capacitance. The important factor to be noted here is that Eq 26.11 shows the power to be proportional to the switching frequency but independent of the device parameters. The power dissipation also depends on the switching activity, denoted by α. The equation can then be written as

P_d = α · C_L · V_DD^2 · f_p (Eq 26.12)

26.7 Methods to Reduce Dynamic Power Dissipation


As can be seen from Eq 26.12, the power dissipated can be reduced by reducing either the clock frequency, f_p, or the load capacitance, C_L, or the rail voltage, V_DD, or the switching activity parameter, α. Reducing the clock frequency is the easiest thing to do, but it seriously affects the performance of the chip. In applications where power is paramount, this approach can be used satisfactorily. Another method to reduce the dissipated power is to lower the load capacitance, C_L. But this method is more difficult than the previous approach because it involves conscientious system design, so that there are fewer wires, smaller pins, smaller fan-out, smaller devices, etc.
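The leverage of each knob in Eq 26.12 can be seen with a quick numerical sketch (the α, C, V and f values below are made-up examples):

```python
# Sketch of Eq 26.12: dynamic power P = alpha * C * V^2 * f, showing why
# lowering the rail voltage is so effective (quadratic dependence).

def dynamic_power(alpha, c_load, vdd, freq):
    return alpha * c_load * vdd**2 * freq

base = dynamic_power(alpha=0.2, c_load=10e-9, vdd=2.5, freq=200e6)
half_v = dynamic_power(alpha=0.2, c_load=10e-9, vdd=1.25, freq=200e6)

print(f"P at 2.5 V  : {base:.3f} W")
print(f"P at 1.25 V : {half_v:.3f} W")   # quadratic: 4x reduction
```

Halving f_p, C_L or α halves the power, but halving V_DD quarters it, which is why supply scaling dominates low-power design despite its interaction with threshold voltage and noise margins.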

Power dissipation can also be reduced by reducing the rail voltage, V_DD. But this can be done only through device technology. Also, the rail voltage is a standard agreed to in many cases by the semiconductor industry, hence we do not have much control over this parameter; it is also strongly tied to the threshold voltage and the noise margin. Some special techniques are also used to reduce power dissipation. The first one involves the use of pipelining to operate the internal logic at a lower clock than the I/O frequency. The other technique is to reduce the switching activity, α, by optimizing algorithms, architecture and logic topology, and by using special encoding techniques.

26.8 Short-Circuit Power Dissipation The short-circuit power dissipation is given by

P_SC = I_mean · V_DD (Eq 26.13)

For the input waveform shown in Fig 26.81a, which depicts the short-circuit current (Fig 26.81b) in an unloaded inverter,

I_mean = (2/t_p) · [ 2 · ∫(t1 to t2) I(t) dt ] (Eq 26.14)

assuming that V_Tn = -V_Tp and β_n = β_p = β, and that the behavior is symmetrical around t2.

With I(t) = (β/2) · (V_in(t) - V_T)^2 in the saturated region,


Thus for an inverter without load, assuming that t_r = t_f = t_rf,

P_SC = (β/12) · (V_DD - 2·V_T)^3 · (t_rf / t_p)

where t_p is the period of the waveform. This derivation is for an unloaded inverter. It shows that the short-circuit current depends on β and on the input waveform rise and fall times. Slow rise times on nodes can result in significant (around 20%) short-circuit power dissipation for loaded inverters. Thus it is good practice to keep all edges fast if power dissipation is a concern. As the load capacitance is increased, the significance of the short-circuit dissipation is reduced relative to the capacitive dissipation P_D. Recap In this lecture you have learnt the following

• Motivation • Effects of Power Dissipation • How to Reduce Temperature • Components of Power Dissipation • Static Power Dissipation • Dynamic Power Dissipation • Methods to Reduce Power Dissipation • Short-Circuit Power Dissipation

Congratulations, you have finished Lecture 26.


Module 6 : Semiconductor Memories Lecture 27 : Basics of Semiconductor Memories Objectives In this lecture you will learn the following

• Introduction • Memory Classification • Memory Architectures and Building Blocks • Introduction to Static and Dynamic RAMs

27.1 Introduction Semiconductor based electronics is the foundation of the information technology society we live in today. Ever since the first transistor was invented back in 1948, the semiconductor industry has been growing at a tremendous pace. Semiconductor memories and microprocessors are two major fields which have benefited from this growth in semiconductor technology.

Fig 27.11: Increasing memory capacity over the years


The technological advancement has improved both the performance and the packing density of these devices over the years. Gordon Moore made his famous observation in 1965, just four years after the first planar integrated circuit was produced. He observed an exponential growth in the number of transistors per integrated circuit, in which the number of transistors nearly doubled every couple of years. This observation, popularly known as Moore's Law, has been maintained and still holds true today. Keeping up with this law, semiconductor memory capacity has also been increasing by a factor of two roughly every year. 27.2 Memory Classification

• Size: Depending upon the level of abstraction, different units are used to express the size of a memory. A circuit designer usually expresses memory in terms of bits, which correspond to the number of individual cells needed to store the data. Going up one level in the hierarchy to the chip design level, it is common to express memory in terms of bytes, a group of 8 bits. And on a system level, it can be expressed in terms of words or pages, which are in turn collections of bytes.

• Function: Semiconductor memories are most often classified on the basis of access patterns, memory functionality and the nature of the storage mechanism. Based on access patterns, they can be classified into random access and serial access memories. A random access memory can be accessed for read/write in a random fashion. In serial access memories, on the other hand, the data can be accessed only in a serial fashion. FIFO (First In First Out) and LIFO (Last In First Out) buffers are examples of serial memories. Most memories fall under the random access type.

Based on their functionality, memories can be broadly classified into Read/Write memories and Read-only memories. As the name suggests, a Read/Write memory offers both read and write operations and hence is more flexible. SRAM (Static RAM) and DRAM (Dynamic RAM) come under this category. A Read-only memory, on the other hand, encodes the information into the circuit topology. Since the topology is hardwired, the data cannot be modified; it can only be read. ROM structures belong to the class of nonvolatile memories: removal of the supply voltage does not result in a loss of the stored data. Examples of such structures include PROMs, ROMs and PLDs. The most recent entries in the field are memory modules that can be classified as nonvolatile, yet offer both read and write functionality. Typically, their write operation takes substantially longer than the read operation. EPROM, EEPROM and Flash memories fall under this category.


Fig 27.21: Classification of memories

• Timing Parameters: The timing properties of a memory are illustrated in Fig 27.22. The time it takes to retrieve data from the memory is called the read-access time. This is equal to the delay between the read request and the moment the data is available at the output. Similarly, write-access time is the time elapsed between a write request and the final writing of the input data into the memory. Finally, there is another important parameter, which is the cycle time (read or write), which is the minimum time required between two successive read or write cycles. This time is normally greater than the access time.
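These definitions can be made concrete with a small sketch; the nanosecond figures below are made-up examples, not from the lecture:

```python
# Illustrative sketch of the timing parameters above: the access time sets
# the latency of a single read, while the (larger) cycle time limits how
# often reads can be issued, i.e. the sustained throughput.

READ_ACCESS_NS = 50   # read request -> data valid at the output
CYCLE_TIME_NS = 70    # minimum spacing between successive read cycles

latency_ns = READ_ACCESS_NS
reads_per_second = 1e9 / CYCLE_TIME_NS

print(f"latency: {latency_ns} ns, sustained rate: {reads_per_second:.2e} reads/s")
```

Note that throughput is governed by the cycle time, not the access time, which is why the cycle time being greater than the access time matters in practice.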


27.3 Memory Architecture and Building Blocks The straightforward way of implementing an N-word memory is to stack the words in a linear fashion and select one word at a time for the read or write operation by means of a select line. Only one such select signal can be high at a time. Though this approach, as shown in Fig 27.31, is quite simple, one runs into a number of problems when trying to use it for larger memories. The number of interface pins in the memory module varies linearly with the size of the memory and can easily become huge.

Fig 27.31: Basic Memory Organization

To overcome this problem, the address provided to the memory module is generally encoded as shown in Fig 27.32. A decoder is used internally to decode this address and drive the appropriate select line high. With k address pins, 2^k select lines can be driven, and hence the number of select-related interface pins is reduced from N to log2(N).

Fig 27.32: Memory with decoder logic
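The decoder idea above can be sketched in a few lines (an illustrative model, function name ours):

```python
# Sketch of the address-decoder idea: k address bits select one of 2^k
# word lines, so the interface needs k pins instead of N select pins.

def decode(address, k):
    """Return a one-hot list of 2^k select lines for the given address."""
    n_words = 2 ** k
    if not 0 <= address < n_words:
        raise ValueError("address out of range")
    return [1 if i == address else 0 for i in range(n_words)]

lines = decode(address=5, k=3)   # 8-word memory, 3 address pins
print(lines)                     # exactly one select line is high
```

For a 1M-word memory this is the difference between 20 address pins and a million select pins.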


Though this approach resolves the select problem, it does not address the issue of the memory aspect ratio. For an N-word memory with a word length of M, the aspect ratio will be nearly N:M, which is very difficult to implement for large values of N. Such a design also slows the circuit down considerably, because the vertical wires connecting the storage cells to the inputs/outputs become excessively long. To address this problem, memory arrays are organized so that the vertical and horizontal dimensions are of the same order of magnitude, making the aspect ratio close to unity. To route the correct word to the input/output terminals, an extra circuit called the column decoder is needed. The address word is partitioned into a column address (A0 to AK-1) and a row address (AK to AL-1). The row address enables one row of the memory for read/write, while the column address picks one particular word from the selected row.

Fig 27.33: Memory with row and column decoders
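The row/column partition can be sketched directly on the address bits (an illustrative model with made-up sizes; the low K bits are taken as the column field, matching the A0..AK-1 convention above):

```python
# Sketch of row/column address partitioning for a near-square array:
# the low K bits select the column (word within a row), the remaining
# bits select the row.

def split_address(addr, k_col, l_total):
    """Split an l_total-bit address into (row, column) fields."""
    assert 0 <= addr < (1 << l_total)
    col = addr & ((1 << k_col) - 1)     # bits A0 .. A(K-1)
    row = addr >> k_col                 # bits AK .. A(L-1)
    return row, col

# 1024-word memory (L = 10): 32 rows x 32 columns with K = 5.
row, col = split_address(addr=0b1011_00110, k_col=5, l_total=10)
print(row, col)
```

With K = L/2 the array is square, which keeps both bit lines and word lines short.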

27. 4 Static and Dynamic RAMs RAMs are of two types, static and dynamic. Circuits similar to basic D flip-flop are used to construct static RAMs (SRAMs) internally. A typical SRAM cell consists of six transistors which are connected in such a way as to form a regenerative feedback. In contrast to DRAM, the information stored is stable and does not require clocking or refresh cycles to sustain it. Compared to DRAMs, SRAMs are much faster having typical


access times on the order of a few nanoseconds. Hence SRAMs are used as level-2 cache memory. Dynamic RAMs do not use flip-flops; instead they are an array of cells, each containing a transistor and a tiny capacitor. '0's and '1's are stored by charging or discharging the capacitors. The electric charge tends to leak away, and hence each bit in a DRAM must be refreshed every few milliseconds to prevent loss of data. This requires external logic to take care of the refreshing, which makes interfacing DRAMs more complex than SRAMs. This disadvantage is compensated by their larger capacities: a high packing density is achieved since DRAMs require only one transistor and one capacitor per bit. This makes them ideal for building main memories. But DRAMs are slower, with delays on the order of tens of nanoseconds. Thus the combination of a static RAM cache and a dynamic RAM main memory attempts to combine the good properties of each. Recap

In this lecture you have learnt the following

• Introduction • Memory Classification • Memory Architectures and Building Blocks • Introduction to Static and Dynamic RAMs

Congratulations, you have finished Lecture 27.


Module 6 : Semiconductor Memories Lecture 28 : Static Random Access Memory (SRAM) Objectives In this lecture you will learn the following

• SRAM Basics • CMOS SRAM Cell • CMOS SRAM Cell Design • READ Operation • WRITE Operation

28.1 SRAM Basics A memory circuit is said to be static if the stored data can be retained indefinitely, as long as the power supply is on, without any need for a periodic refresh operation. The data storage cell, i.e., the one-bit memory cell in static RAM arrays, invariably consists of a simple latch circuit with two stable operating points. Depending on the preserved state of the two-inverter latch circuit, the data being held in the memory cell will be interpreted either as logic '0' or as logic '1'. To access the data contained in the memory cell via a bit line, we need at least one switch, which is controlled by the corresponding word line, as shown in Figure 28.11.

Fig 28.11: SRAM Cell

28.2 CMOS SRAM Cell A low power SRAM cell may be designed by using cross-coupled CMOS inverters. The most important advantage of this circuit topology is that the static power dissipation is very small; essentially, it is limited to the small leakage current. Other advantages of this design are high noise immunity due to larger noise margins, and the ability to operate at lower power supply voltages. The major disadvantage of this topology is the larger cell size. The circuit structure of the full CMOS static RAM cell is shown in Figure 28.21. The memory cell consists of two CMOS inverters connected back to back, and two access transistors. The access transistors are turned on whenever a word line is activated for a read or write operation, connecting the cell to the complementary bit-line columns.

Fig 28.21: Full CMOS SRAM cell

28.3 CMOS SRAM Cell Design To determine the W/L ratios of the transistors, a number of design criteria must be taken into consideration. The two basic requirements which dictate the W/L ratios are that the data read operation should not destroy the stored information in the cell, and that the cell should allow the stored information to be modified during the write operation. In order to analyze the operation of the SRAM, we have to take into account the relatively large parasitic capacitances of the two bit-line columns and the column pull-up transistors, as shown in Figure 28.31.

Fig 28.31: CMOS SRAM cell with precharge transistors


When none of the word lines is selected, the pass transistors M3 and M4 are turned off and the data is retained in all memory cells. The column capacitances are charged by the pull-up transistors P1 and P2. The voltages across the column capacitors reach VDD - VT. 28.4 READ Operation Consider a data read operation, shown in Figure 28.41, assuming that logic '0' is stored in the cell. The transistors M2 and M5 are turned off, while the transistors M1 and M6 operate in linear mode. Thus internal node voltages are V1 = 0 and V2 = VDD before the cell access transistors are turned on. The active transistors at the beginning of data read operation are shown in Figure 28.41.

Fig 28.41: Read Operation

After the pass transistors M3 and M4 are turned on by the row selection circuitry, the voltage of C_B' will not show any significant variation, since no current flows through M4. On the other hand, M1 and M3 will conduct a nonzero current, and the voltage level of C_B will begin to drop slightly. The node voltage V1 will increase from its initial value of 0 V. If V1 exceeds the threshold voltage of M2 during this process, it forces an unintended change of the stored state. Therefore V1 must not exceed the threshold voltage of M2, so that the transistor M2 remains turned off during the read phase, i.e.,

V1,max ≤ V_T,n (Eq 28.1)

The transistor M3 is in saturation whereas M1 is in the linear region; equating their current equations we get

(k_n,3 / 2) · (V_DD - V_1 - V_T,n)^2 = (k_n,1 / 2) · (2·(V_DD - V_T,n)·V_1 - V_1^2) (Eq 28.2)


substituting Eq 28.1 in Eq 28.2 we get

(W_3/L_3) / (W_1/L_1) ≤ [2·(V_DD - 1.5·V_T,n)·V_T,n] / (V_DD - 2·V_T,n)^2 (Eq 28.3)

28.5 WRITE Operation Consider the write '0' operation, assuming that logic '1' is stored in the SRAM cell initially. Figure 28.51 shows the voltage levels in the CMOS SRAM cell at the beginning of the data write operation. The transistors M1 and M6 are turned off, while M2 and M5 operate in the linear mode. Thus the internal node voltages are V1 = VDD and V2 = 0 before the access transistors are turned on. The column voltage Vb is forced to '0' by the write circuitry. Once M3 and M4 are turned on, we expect the node voltage V2 to remain below the threshold voltage of M1, since M2 and M4 are designed according to Eq 28.1.

Fig 28.51: SRAM start of write '0'

The voltage at node 2 is not sufficient to turn on M1. To change the stored information, i.e., to force V1 = 0 and V2 = VDD, the node voltage V1 must be reduced below the threshold voltage of M2, so that M2 turns off. When V1 = V_T,n, the transistor M3 operates in the linear region while M5 operates in saturation. Equating their current equations we get

(k_p,5 / 2) · (V_DD + V_T,p)^2 = (k_n,3 / 2) · (2·(V_DD - V_T,n)·V_1 - V_1^2) (Eq 28.4)


Rearranging, with the condition V1 ≤ V_T,n, we get

(W_5/L_5) / (W_3/L_3) ≤ (µ_n/µ_p) · [2·(V_DD - 1.5·V_T,n)·V_T,n] / (V_DD + V_T,p)^2 (Eq 28.5)

28.6 WRITE Circuit The principle of the write circuit is to pull the voltage of one of the columns down to a low level.

This can be achieved by connecting either the bit line or its complement to ground through the transistor M3 and either M2 or M1. The transistor M3 is driven by the column decoder, selecting the specified column. The transistor M1 is on only in the presence of the write enable signal and when the data bit to be written is '0'. The transistor M2 is on only in the presence of the write enable signal and when the data bit to be written is '1'. The circuit for the write operation is shown in Figure 28.61.

Fig 28.61: Circuit for write operation
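The read constraint of Eq 28.3 and the write constraint of Eq 28.5 can be evaluated numerically. The sketch below uses illustrative device values (VDD = 2.5 V, VTn = 0.5 V, VTp = -0.5 V, and an assumed mobility ratio of 2.5), not values from the lecture:

```python
# Sketch checking the SRAM transistor-ratio constraints of Eqs 28.3 and 28.5
# for illustrative device parameters.

VDD, VTN, VTP, MU_RATIO = 2.5, 0.5, -0.5, 2.5   # assumed example values

def max_read_ratio(vdd=VDD, vtn=VTN):
    """Eq 28.3: upper bound on (W3/L3)/(W1/L1) for a non-destructive read."""
    return 2 * (vdd - 1.5 * vtn) * vtn / (vdd - 2 * vtn) ** 2

def max_write_ratio(vdd=VDD, vtn=VTN, vtp=VTP, mu=MU_RATIO):
    """Eq 28.5: upper bound on (W5/L5)/(W3/L3) for a successful write."""
    return mu * 2 * (vdd - 1.5 * vtn) * vtn / (vdd + vtp) ** 2

print(f"(W3/L3)/(W1/L1) <= {max_read_ratio():.3f}")
print(f"(W5/L5)/(W3/L3) <= {max_write_ratio():.3f}")
```

Both bounds together fix the relative sizing of the access, driver and load transistors: the access transistor must be weak enough relative to the pull-down for a safe read, yet strong enough relative to the pull-up for a successful write.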


Recap In this lecture you have learnt the following

• SRAM Basics • CMOS SRAM Cell • CMOS SRAM Cell Design • READ Operation • WRITE Operation

Congratulations, you have finished Lecture 28.


Module 6 : Semiconductor Memories Lecture 29 : Basics Of DRAM Cell And Access Time Consideration Objectives In this lecture you will learn the following

• DRAM Basics • Differential Operation In Dynamic RAMs • DRAM Read Process With Dummy Cell • Operation Of The Read Circuit • Calculation Of Change In Bitline Voltage • Area Considerations • Metal Gate Diffusion Storage

29.1 DRAM Basics A typical 1-bit DRAM cell is shown in Figure 29.11

Fig 29.11: DRAM Cell

The C_S capacitor stores the charge for the cell. Transistor M1 gives read/write access to the cell. C_B is the capacitance of the bit line per unit length. Memory cells are etched onto a silicon wafer in an array of columns (bit lines) and rows (word lines). The intersection of a bit line and a word line constitutes the address of the memory cell. DRAM works by sending a charge through the appropriate column (CAS) to activate the transistor at each bit in the column. When writing, the row lines contain the state the capacitor should take on. When reading, the sense amplifier determines the level of charge in the capacitor: if it is more than 50%, it reads it as "1"; otherwise it reads it as "0". The counter tracks the refresh sequence based on which rows have been accessed in what order. The length of time necessary to do all this is so short that it is expressed


in nanoseconds (billionths of a second), e.g. a memory chip rating of 70 ns means that it takes 70 nanoseconds to completely read and recharge each cell. The capacitor in a dynamic RAM memory cell is like a leaky bucket: dynamic RAM has to be refreshed continually or it forgets what it is holding. This refreshing takes time and slows down the memory.

29.2 Differential Operation In Dynamic RAMs The sense amplifier responds to the difference in signals appearing between the bit lines. It is capable of rejecting interference signals that are common to both lines, such as those caused by capacitive coupling from the word lines. For this common-mode rejection to be effective, both sides of the amplifier must be matched, taking into account the circuits that feed each side. This is required in order to make the inherently single-ended output of the DRAM cell appear differential. Single To Differential Conversion: Large memories (>1 Mbit) that are exceedingly prone to noise disturbances resort to translating the single-ended sensing problem into a differential one. The basic concept behind the single-to-differential conversion is demonstrated in Figure 29.21

Fig 29.21: Single to differential conversion

A differential sense amplifier is connected to a single-ended bit line on one side and to a reference voltage positioned between the "0" and "1" levels on the other. Depending on the value of BL, the flip-flop toggles in one or the other direction. Voltage levels tend to vary from die to die, or even across a single die, so the reference source must track those variations. A popular way of doing so is illustrated in Figure 29.31 for the case of the 1T DRAM. The memory array is divided into two halves, with the differential amplifier placed in the middle. On each side a column of dummy cells is added; these are 1T memory cells that are similar to the others, but whose sole purpose is to serve as a reference. This approach is often called the open bit-line architecture.


29.3 DRAM Read Process With Dummy Cell Circuit Construction: The circuit is illustrated in Figure 29.31. Each bit line is split into two identical halves. Each half line is connected to half of the cells in the column and to an additional cell, known as the dummy cell, having a capacitor C_D. When a word line on the left side is selected for reading, the dummy cell on the right side (controlled by XR) is also selected, and vice versa: when a word line on the right side is selected, the dummy cell on the left (controlled by XL) is also selected. In effect, then, the dummy cell serves as the other half of a differential DRAM cell. When the left half bit line is in operation, the right half bit line acts as its complement, and vice versa. The cells shown here are the cells of a column, though they are drawn like a row. The distribution of the select lines is such that the even X's are in the right half and all the odd X's are in the left half.

Fig 29.31: Arrangement for obtaining differential operation

from the single ended DRAM cell

29.4 Operation Of The Read Circuit

The circuit is shown in Figure 29.31. The two halves of the bit line are precharged to VDD/2 and their voltages are equalized. At the same time, the capacitors of the two dummy cells are precharged to VDD/2. Then a word line is selected, and the dummy cell on the other side is enabled (by raising XL or XR to VDD). Thus the half line connected to the selected cell will develop a voltage increment (above VDD/2) of Δv(1) or Δv(0),


depending on whether a "1" or a "0" is stored in the cell. Meanwhile, the other half of the line will have its voltage held equal to that of the dummy capacitor Cd (i.e. VDD/2). The result is a differential signal that the sense amplifier detects and amplifies when it is enabled. As usual, by the end of the regenerative process the amplifier will cause the voltage on one half of the line to become VDD and that on the other half to become 0.

Fig 29.41: Timing diagram of DRAM operation

If X1 cell is accessed, then the dummy cell on the right side is also selected.

The actual voltage of the bit line after charge sharing will be

VBL = (CB · VDD/2 + CS · VS) / (CB + CS)

where CB is the BIT-line capacitance, CS is the storage cell capacitance, and VS is the voltage across the storage capacitance. The storage capacitance with dummy cell is taken on both sides in a column.

If VS = VDD/2, then VBL = VDD/2; this shows that there is no charge sharing. If VS = '0',

then VBL = (CB / (CB + CS)) · VDD/2, i.e. the bit line dips below its precharge value.


Since the dummy cell is precharged to VDD/2, the reference half line remains at VDD/2, a level that lies between the bit-line voltages produced by a stored "1" and a stored "0".

The tricky parts of a DRAM cell lie in the design of the circuitry to read out the stored value, and in the design of the capacitor to maximise the stored charge while minimising the storage capacitor size. Stored values in DRAM cells are read out using sense amplifiers, which are extremely sensitive comparators that compare the value stored in the DRAM cell with that of a reference cell. The reference cell used is a dummy cell which stores a voltage halfway between the two voltage levels used in the memory cell (experimental multilevel cells use slightly different technology). Improvements in sense amplifiers reduce sensitivity to noise and compensate for differences in threshold voltages among devices.

29.5 Calculation Of Change In Bitline Voltage

An array of DRAM cells is laid out as in Figure 29.21

Fig 29.21: Array of DRAM Cells


Let the capacitance per unit length of bitline be CB and the storage capacitance of the DRAM cell be CS. If there are n such sections (as shown), then the net capacitance of the bitline is n·CB. Let the bitline be precharged to VP. When the bitline is connected to the capacitance of the DRAM cell, the net voltage settles to an intermediate value due to charge sharing, given by:

V = (n·CB·VP + CS·VS) / (n·CB + CS)   (Eq 29.1)

A term, the Charge Transfer Ratio (CTR), is defined in this context as

CTR = CS / (CS + n·CB) = n' / (n' + n)   (Eq 29.2)

where n' is defined as n' = CS/CB. For a particular technology, CB is fixed, so only CS can be changed. When the bitline is connected to the storage capacitor, the change of voltage at the bitline is given by

ΔV = V − VP = (VS − VP) · CS / (CS + n·CB) = CTR · (VS − VP)   (Eq 29.3)

For a good design, the value of ΔV (the change in voltage at the bitline) should be as high as possible, so that the sense amplifier can sense the bit correctly and quickly.

Increasing ΔV requires CTR to increase, which in turn requires a larger n'. Since n' depends on CB and CS, it can be increased by increasing the storage capacitance CS, by decreasing the bitline capacitance CB, or both. However, increasing the storage capacitance requires a larger area.
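As a sanity check, the charge-sharing relations above can be evaluated numerically. This is an illustrative sketch only: the capacitance values, the number of bit-line sections and the supply voltage below are assumed example numbers, not values from the lecture.

```python
# Numerical sketch of Eqs 29.1-29.3 (all component values are assumed).
def bitline_swing(c_s, c_b, n, v_s, v_p):
    """Charge sharing when a storage capacitor C_S (at V_S) is connected
    to a bit line of total capacitance n*C_B precharged to V_P."""
    ctr = c_s / (c_s + n * c_b)                           # Eq 29.2
    v_bl = (n * c_b * v_p + c_s * v_s) / (n * c_b + c_s)  # Eq 29.1
    delta_v = ctr * (v_s - v_p)                           # Eq 29.3
    return v_bl, delta_v

# 256 bit-line sections at 1 fF each, C_S = 30 fF, cell stores a '1'
# (V_S = 3.3 V), bit line precharged to VDD/2 = 1.65 V.
v_bl, dv = bitline_swing(c_s=30e-15, c_b=1e-15, n=256, v_s=3.3, v_p=1.65)
print(f"bit-line voltage = {v_bl:.3f} V, swing = {dv * 1e3:.1f} mV")
```

Increasing c_s (or reducing n or c_b) in the call above raises the CTR and hence the sensed swing, matching the design argument in the text.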


29.6 Area Considerations

To reduce the area requirement while still allowing a larger storage capacitance, the retrofit technique is used. In this case, the capacitors are laid out as shown below in Figure 29.31

Fig 29.31: Retrofit Technique

A standard DRAM cell uses diffusion, poly or metal as the bit line, which will be discussed later. A typical area calculation for the storage capacitance is illustrated in Figure 29.32

Fig 29.32: Area Calculation

The area A consumed by the capacitor follows from the parallel-plate relation CS = εox·A/tox, i.e.

A = CS · tox / εox   (Eq 29.4)

A large area gives a large storage capacitance, which in turn gives a large change in bit-line voltage, and therefore a small access time. A DRAM cell with a small access time can be designed by improving both the cell itself and the sense amplifier.


29.7 Metal Gate Diffusion Storage

The cross-section of a metal gate diffusion storage cell is shown below in Figure 29.41.

Fig 29.41: Metal Gate Diffusion Storage

In this case the charge is stored in the depletion capacitance between the substrate and the diffusion region. The problems with this design are:

• The diffusion line has higher capacitance; thus CB increases and hence CTR decreases. • Parasitic capacitances are higher and the gate is not self-aligned. • There are routing problems associated with this kind of design.

A very similar configuration can be used for inversion storage. In this case, when the gate input goes HIGH, the inversion charge stored in the capacitor is drained out and the potential at the channel drops, indicating that a '1' was stored. The reverse occurs if a '0' is stored. Several DRAM cell variants are in use; a few of them are as follows:

• SPDB - Single Poly Diffused Bit • SPPB - Single Poly Poly Bit • SPMB - Single Poly Metal Bit

Similarly it's possible to have:

• DPDB - Dual Poly Diffused Bit • DPPB - Dual Poly Poly Bit • DPMB - Dual Poly Metal Bit

Layout of a SPDB is shown in Figure 29.42


Fig 29.42: Layout of SPDB Cell

Recap In this lecture you have learnt the following

• DRAM Basics • Calculation Of Change In Bitline Voltage • Area Considerations • Metal Gate Diffusion Storage

Congratulations, you have finished Lecture 29.


Module 2 : MOSFET Lecture 3 : Introduction to MOSFET Objectives In this lecture you will learn the following:

• Basic MOS Structure • Types of MOSFET • MOSFET I-V Modelling • C-V Characteristics of a MOS Capacitor

3.1 Basic MOSFET Structure

In the introduction to systems, we got an overview of the various levels of design, viz. architectural-level, program-level, functional-level and logic-level design. However, we cannot appreciate these levels of design unless we are exposed to the basic operation of the device currently used to realize logic circuits, viz. the MOSFET (Metal Oxide Semiconductor Field Effect Transistor). So in this section we will study the basic structure of the MOSFET. The cross-sectional and top/bottom views of the MOSFET are shown in Figures 3.11 and 3.12 below:

fig 3.11 Cross-sectional view of MOSFET fig 3.12 Top/Bottom View of MOSFET

An n-type MOSFET consists of a source and a drain, two highly conducting n-type semiconductor regions that are separated from the p-type substrate by reverse-biased p-n diodes. A metal or polycrystalline gate covers the region between the source and drain, but is isolated from the semiconductor by the gate oxide.

3.2 Types of MOSFET

MOSFETs are divided into two types, viz. p-MOSFET and n-MOSFET, depending upon the type of their source and drain regions.

Fig. 3.21: p-MOSFET Fig. 3.22: n-MOSFET Fig. 3.23: c-MOSFET


The combination of an n-MOSFET and a p-MOSFET (as shown in Figure 3.23) is called a c-MOSFET (CMOS), which is the most widely used MOSFET configuration. We will look at it in more detail later.

3.3 MOSFET I-V Modelling

We are interested in finding the output characteristics (ID versus VDS) and the transfer characteristics (ID versus VGS) of the MOSFET. In other words, we can find both if we can formulate a mathematical equation of the form ID = f(VGS, VDS).

Intuitively, we can say that the voltage-level specifications and the material parameters cannot be altered by designers. So the only tools in the designer's hands with which he/she can improve the performance of the device are its dimensions, W and L (shown in the top view of the MOSFET, fig 3.12). In fact, the most important parameter in device design is the W/L ratio. The equations governing the output and transfer characteristics of an n-MOSFET and p-MOSFET are:

n-MOSFET (square-law model):

Linear region (VDS < VGS − VT): ID = kn (W/L) [ (VGS − VT) VDS − VDS^2 / 2 ]
Saturation region (VDS ≥ VGS − VT): ID = (kn / 2) (W/L) (VGS − VT)^2

p-MOSFET: the same equations apply with all voltage polarities and the current direction reversed (VT < 0).

The output characteristics plotted for a few fixed values of VGS for the p-MOSFET and n-MOSFET are shown next:

fig 3.31 p-MOSFET output characteristics (linear and saturation regions) fig 3.32 n-MOSFET output characteristics (linear and saturation regions)
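The square-law behaviour can be captured in a small function. This is only the textbook first-order model; the transconductance parameter, W/L ratio and threshold voltage below are assumed example values, not parameters from the lecture.

```python
def nmos_id(vgs, vds, k_n=100e-6, w_over_l=10.0, vt=0.7):
    """First-order (square-law) drain current of an n-MOSFET.
    k_n, w_over_l and vt are illustrative assumptions."""
    if vgs <= vt:
        return 0.0                                   # cutoff: no channel
    vov = vgs - vt                                   # overdrive voltage
    if vds < vov:                                    # linear (triode) region
        return k_n * w_over_l * (vov * vds - vds**2 / 2)
    return 0.5 * k_n * w_over_l * vov**2             # saturation region

print(nmos_id(2.0, 3.0))   # saturation: 0.5 * k_n * (W/L) * (2.0 - 0.7)**2
print(nmos_id(3.0, 0.5))   # linear: k_n * (W/L) * (2.3 * 0.5 - 0.5**2 / 2)
```

Sweeping vds at fixed vgs reproduces the output characteristics of fig 3.32, and sweeping vgs at fixed vds reproduces the transfer characteristic.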


The transfer characteristics of both the p-MOSFET and n-MOSFET, plotted for a fixed value of VDS, are shown next:

fig 3.33 p-MOSFET fig 3.34 n-MOSFET

Note: From now on in these lectures, we will abbreviate MOSFET as MOS.

3.4 C-V Characteristics of a MOS Capacitor

As we have seen earlier, there is an oxide layer below the gate terminal. Since the oxide is a very good insulator, it contributes an oxide capacitance to the circuit. Normally, the capacitance of a capacitor does not change with the voltage applied across its terminals. However, this is not the case with the MOS capacitor: its capacitance changes with the gate voltage. This is because the applied gate voltage bends the energy bands in the silicon substrate and hence varies the charge concentration at the Si-SiO2 interface. We can also see (from fig 3.42) that, beyond a certain voltage, the curve splits into two branches (the reason will be explained later), depending upon the frequency (high or low) of the AC voltage applied at the gate. This voltage is called the threshold voltage (Vth) of the MOS capacitor.

fig 3.41 Cross-sectional view of MOS Capacitor Fig 3.42: C-V plot of MOS Capacitor

Recap In this lecture you have learnt the following

• Basic MOS Structure • Types of MOSFET • MOSFET I-V Modelling • C-V Characteristics of a MOS Capacitor

Congratulations, you have finished Lecture 3.


Module 6 : Semiconductor Memories Lecture 30 : SRAM and DRAM Peripherals Objectives In this lecture you will learn the following

• Introduction • SRAM and its Peripherals • DRAM and its Peripherals

30.1 Introduction

Even though a lot of the concepts here have been discussed earlier, they are repeated for convenience. Broadly, memories can be classified into:

• RAM (Random Access Memory) • Serial Memory

A RAM is one in which the time required for accessing and retrieving information is independent of the physical location of the information. In contrast, in a serial memory, data is available only in the same order in which it was previously stored. The following diagram shows the organization of a memory.

Fig 30.11: Organization of Memory

This memory consists of two address decoders viz. Row and Column decoders to select a particular bit in the memory. If there are M rows and N columns, then the number of


bits that can be accessed is M × N. Either a read operation or a write operation can be performed on any selected bit by the use of control signals. RAMs are further classified into two types:

• SRAM (Static RAM) • DRAM (Dynamic RAM)
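The row/column organization described above means that a flat bit address is simply split between the two decoders. A minimal sketch, with M and N chosen arbitrarily for illustration:

```python
# Row/column address split for an M x N memory array (values assumed).
M, N = 4, 8          # 4 rows x 8 columns -> 32 addressable bits

def decode(address):
    """Return the (row, column) pair selected by a flat bit address."""
    assert 0 <= address < M * N, "address out of range"
    return address // N, address % N   # row decoder input, column decoder input

print(decode(13))    # selects row 1, column 5
```

The row decoder activates one of the M word lines and the column decoder selects one of the N bit lines, together addressing exactly one cell.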

30.2 SRAM and Its Peripherals

Fig 30.21: SRAM Cell

Figure 30.21 shows a standard 6-transistor SRAM cell. The signal designated WL is the WORDLINE used to read from or write into the cell. The two bit lines, BL and its complement BL̄, carry the data to be written into the cell.


Fig 30.22: Circuit for reading and writing data into cell

The circuits shown in Figure 30.22 are used to write data to and read data from the cell. When a read operation is to be performed, the read-enable signal is made HIGH and at the same time the write-enable signal is made LOW. As a result, the data present on the BL and BL̄ lines are transferred to the input of the sense amplifier (sense amplifier operation will be discussed shortly). The sense amplifier then senses the data and produces the output.

During the write operation, the read-enable signal is made LOW and the write-enable signal is made HIGH. Thus the input data and its complement are written onto the BL and BL̄ lines respectively. However, the read and write operations on a particular cell take place only if the cell is enabled by the corresponding row (word) and column (digit) lines.

It is important to remember that before every read operation, BL and BL̄ are precharged to a voltage (usually VDD/2). During a read operation, one of the two BIT lines (BL or BL̄) discharges slightly, whereas the other charges to a voltage slightly greater than its precharged value. This difference is detected by the sense amplifier, which produces an output voltage corresponding to the value stored in the cell being read. Care should be taken in sizing the transistors to ensure that the data stored in the cell does not change during a read.

30.3 Sense Amplifier

The circuit shown in Figure 30.31 is the sense amplifier used to read data from the cell. As soon as the SE signal goes HIGH, the amplifier senses the difference between the BL and BL̄ voltages and produces an output voltage appropriately. The access time of the memory, defined as the time between the initiation of the read operation and the appearance of the output, depends mainly on the performance of the sense amplifier. So the design of the sense amplifier is the main criterion in the design of memories. The one shown here is a simple sense amplifier.


Fig 30.31: Differential Sense Amplifier

Figure 30.32 shows the block diagram of a memory cell with all its peripherals

Fig 30.32: Block Diagram Of A Memory Cell With All Its Peripherals

30.4 Another Type of Sensing


Figure 30.41 illustrates the SRAM sensing scheme.

Fig 30.41: SRAM Sensing Scheme

In the above figure, the precharge signal is used to precharge the BL and BL̄ lines before every read operation. The transistor labelled EQ is the equalization transistor, which ensures equal voltages on the BL and BL̄ lines after precharge. SE is the sense-enable signal used to sense the voltage difference between the BL and BL̄ lines.

As mentioned earlier, the access time of the memory depends mainly on the performance of the sense amplifier. In contrast with the simple sense amplifier shown


earlier, Figure 30.42 shows an amplifier that is somewhat more complicated, in order to improve performance.

Fig 30.42: Two Stage Differential Amplifier

30.5 DRAM and Its Peripherals

The circuit shown in Figure 30.51 is the basic 1-transistor DRAM cell. Charge sharing takes place between the two capacitors during read operations in the following manner. During the write cycle, CS is charged or discharged by asserting WL and BL. During the read cycle, charge redistribution takes place between the bit line and the storage capacitance:

ΔV = VBL − VPRE = (VBIT − VPRE) · CS / (CS + CBL)   (Eq 30.1)

The voltage swing is small, typically around 250 mV. Figure 30.52 shows a simple 3-transistor DRAM cell.


Fig 30.52: 3-Transistor DRAM Cell

Figure 30.53 shows a very simple address decoder. Address decoders are compulsory in main memories, but cache memories avoid their use. Many other architectures are possible for address decoding.

Fig 30.53: A Simple Address Decoder

Recap In this lecture you have learnt the following

• Introduction • SRAM and its Peripherals • DRAM and its Peripherals

Congratulations, you have finished Lecture 30.


Module 6 : Semiconductor Memories Lecture 31 : Semiconductor ROMs Objectives In this lecture you will learn the following

• Introduction to Semiconductor Read Only Memory (ROM) • NOR based ROM Array • NAND based ROM Array

31.1 Introduction Read only memories are used to store constants, control information and program instructions in digital systems. They may also be thought of as components that provide a fixed, specified binary output for every binary input. The read only memory can also be seen as a simple combinational Boolean network, which produces a specified output value for each input combination, i.e. for each address. Thus storing binary information at a particular address location can be achieved by the presence or absence of a data path from the selected row (word line) to the selected column (bit line), which is equivalent to the presence or absence of a device at that particular location. The two different types of implementations of ROM array are:

• NOR-based ROM array • NAND-based ROM array

31.2 NOR-based ROM Array

There are two different ways to implement MOS ROM arrays. Consider first the 4-bit x 4-bit memory array shown in Figure 31.21. Here, each column consists of a pseudo-nMOS NOR gate driven by some of the row signals, i.e., the word lines.


Fig 31.21: NOR-based ROM array

As we know, only one word line is activated at a time by raising its voltage to VDD, while all other rows are held at a low voltage level. If an active transistor exists at the cross point of a column and the selected row, the column voltage is pulled down to the logic LOW level by that transistor. If no active transistor exists at the cross point, the column voltage is pulled HIGH by the pMOS load device. Thus, a logic "1"-bit is stored as the absence of an active transistor, while a logic "0"-bit is stored as the presence of an active transistor at the cross point. The truth table is shown in Figure 31.22.

Fig 31.22: Truth Table for Figure 31.21
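The read rule just described (a present transistor pulls its column LOW, an absent one lets the pMOS load pull it HIGH) can be mimicked in a few lines. The 4x4 programming pattern below is an arbitrary example, not the contents of Figure 31.22:

```python
# Behavioural sketch of a NOR-based ROM read (programming pattern assumed).
# True = active nMOS transistor present at that row/column cross point.
PROGRAMMING = [
    [True,  False, True,  False],
    [False, True,  False, True ],
    [True,  True,  False, False],
    [False, False, True,  True ],
]

def nor_rom_read(row):
    """Raise one word line: a present transistor pulls its column to 0,
    otherwise the pMOS load pulls the column to 1."""
    return [0 if present else 1 for present in PROGRAMMING[row]]

print(nor_rom_read(0))   # -> [0, 1, 0, 1]
```

Note the inversion: the stored word is the complement of the transistor-presence pattern in the selected row.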

In an actual ROM layout, the array can be initially manufactured with nMOS transistors at every row-column intersection. The "1"-bits are then realized by omitting the drain or source connection, or the gate electrode, of the corresponding nMOS transistors in the final metallization step. Figure 31.23 shows nMOS transistors in a NOR ROM array, formed at the intersections of two metal bit lines and two polysilicon word lines. To save silicon area, the transistors in every two rows are arranged to share a common ground line, also routed in n-type diffusion. To store a "0"-bit at a particular address location, the drain diffusion of the corresponding transistor must be connected to the metal bit line via a metal-to-diffusion contact. Omission of this contact, on the other hand, results in a stored "1"-bit.

Fig 31.23: Metal column line to load devices

The layout of the ROM array is shown below in Figure 31.24.

Fig 31.24: Programming using the Active Layer Only


Figure 31.25 shows a different type of NOR ROM layout implementation, based on deactivating nMOS transistors by raising their threshold voltages through channel implants. In this case, all nMOS transistors are already connected to the column lines; therefore, storing a "1"-bit at a particular location by omitting the corresponding drain contact is not possible. Instead, the nMOS transistor corresponding to the stored "1"-bit can be deactivated, i.e. permanently turned off, by raising its threshold voltage above the VCH level through a selective channel implant during fabrication.

Fig 31.25: Programming using the Contact Layer Only

31.3 NAND-based ROM Array

In this type of ROM array, which is shown in Figure 31.31, each bit line consists of a depletion-load NAND gate, driven by some of the row signals, i.e. the word lines. In normal operation, all word lines are held at the logic HIGH voltage level except for the selected line, which is pulled down to the logic LOW level. If a transistor exists at the cross point of a column and the selected row, that transistor is turned off and the column voltage is pulled HIGH by the load device. On the other hand, if no transistor exists (the cross point is shorted or normally ON), the column voltage is pulled LOW through the other nMOS transistors in the multi-input NAND structure. Thus, a logic "1"-bit is stored by the presence of a transistor that can be deactivated, while a logic "0"-bit is stored by a shorted or normally-ON transistor at the cross point.


Fig 31.31: NAND-based ROM

As in the NOR ROM case, the NAND-based ROM array can be fabricated initially with a transistor connection present at every row-column intersection. A "0"-bit is then stored by lowering the threshold voltage of the corresponding nMOS transistor at the cross point through a channel implant, so that the transistor remains ON regardless of the gate voltage. The availability of this process step is also the reason why depletion-type nMOS load transistors are used instead of pMOS loads.

Fig 31.32: Truth table for Fig 31.31
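The complementary NAND read rule (selected word line pulled LOW; the series chain is broken only if the selected device can actually turn off) can be sketched the same way. The 2x4 programming pattern below is again an arbitrary example, not the contents of Figure 31.32:

```python
# Behavioural sketch of a NAND-based ROM read (programming pattern assumed).
# True  = normal enhancement transistor (turns OFF when its gate goes LOW);
# False = implanted depletion device that conducts regardless of gate voltage.
PROGRAMMING = [
    [True,  False, True,  False],
    [False, True,  False, True ],
]

def nand_rom_read(row):
    """Pull the selected word line LOW: the column is pulled HIGH (1) only
    if the selected device turns off and breaks the series chain."""
    return [1 if can_turn_off else 0 for can_turn_off in PROGRAMMING[row]]

print(nand_rom_read(1))   # -> [0, 1, 0, 1]
```

Compared with the NOR sketch, the polarity is reversed: here a deactivatable transistor encodes a "1", matching the text above.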

Figures 31.33 and 31.34 show two different layout implementations of the NAND ROM array. In the implant-mask NAND ROM array, vertical columns of n-type diffusion intersect at regular intervals with horizontal rows of polysilicon, resulting in an nMOS transistor at each intersection point. The transistors with threshold-voltage implants operate as normally-ON depletion devices, thereby providing a continuous current path regardless of the gate voltage level. Since this structure has no contacts embedded in the array, it is much more compact than the NOR ROM array. However, the access time is usually slower than that of the NOR ROM, due to the multiple series-connected nMOS transistors in each column.

Fig 31.33: Programming using the Metal-1 Layer Only

Fig 31.34: Programming using Implants Only


Recap In this lecture you have learnt the following

• Introduction to Semiconductor Read Only Memory (ROM) • NOR based ROM Array • NAND based ROM Array

Congratulations, you have finished Lecture 31.


Module 6 : Semiconductor Memories Lecture 32 : Few special Examples of Memories Objectives In this lecture you will learn the following

• Non-Volatile READ-WRITE Memory • The Floating Gate Transistor • Erasable Programmable Read Only Memory (EPROM) • Electrically Erasable Programmable Read Only Memory ( E2PROM)

32.1 Non-Volatile Read-Write Memory

The architecture of Non-Volatile Read-Write (NVRW) memory is virtually identical to the ROM structure. The memory core consists of an array of transistors placed on the wordline/bitline grid. The memory is programmed by selectively disabling or enabling some of the devices. In a ROM, this is accomplished by mask-level alterations; in an NVRW memory, a modified transistor is used whose threshold can be altered electrically. This modified threshold is retained indefinitely (or at least for the product lifetime, typically of the order of 10 years), even when the supply is turned off. To reprogram the memory, the programmed values must first be erased, after which a new programming round is started. The method of erasing is the main differentiating factor between the various classes of reprogrammable nonvolatile memories. Programming the memory is typically an order of magnitude slower than reading it.

32.2 Floating Gate Transistor

Over the years, various attempts have been made to create a device with electrically alterable characteristics and enough reliability to support a multitude of write cycles. For example, the MNOS (Metal Nitride Oxide Semiconductor) transistor held promise, but has so far been unsuccessful. In this device, threshold-modifying electrons are trapped in a Si3N4 layer deposited on top of the gate SiO2. A more widely accepted solution is offered by the floating gate transistor shown in Figure 32.21, which forms the core of virtually every NVRW memory built today.


Fig 32.21: FAMOS Structure

The structure is similar to a traditional MOS device, except that an extra polysilicon strip is inserted between the gate and the channel. This strip is not connected to anything and is called the floating gate. The most obvious impact of inserting this extra gate is to double the gate oxide thickness tox, which results in reduced device transconductance as well as increased threshold voltage; neither property is particularly desirable. The key property of this device is that its threshold voltage is programmable. Applying a high voltage (about 10 V) between the source and drain terminals creates a high electric field and causes avalanche injection: electrons acquire sufficient energy to become 'hot' and traverse the first oxide insulator, becoming trapped on the floating gate. This phenomenon can occur with an oxide as thick as 100 nm, which makes the device relatively easy to fabricate. In reference to this programming mechanism, the floating gate transistor is often called the Floating gate Avalanche injection MOS (FAMOS) transistor.

The trapping of electrons on the floating gate effectively lowers the voltage on that gate. The process is self-limiting: the negative charge accumulated on the floating gate reduces the electric field over the oxide, so that ultimately it becomes incapable of accelerating more hot electrons. Removing the applied voltage leaves the induced negative charge in place, resulting in a negative voltage on the intermediate gate. From the device point of view, this translates into an effective increase in threshold voltage: to turn on the device, a higher gate voltage is needed to overcome the effect of the induced negative charge. Typically, the resulting threshold voltage is around 7 V; thus a 5 V gate-to-source voltage is not sufficient to turn on the transistor, and the device is effectively disabled.
Since the floating gate is surrounded by SiO2, which is an excellent insulator, the trapped charge can be stored for many years, even when the supply voltage is removed, creating the nonvolatile storage mechanism. One of the major concerns of the floating gate approach is the need for high programming voltages. By tailoring the impurity profiles, technologists have been able to reduce the required voltage from the original 25V to approximately 12.5V in today's memories.


32.3 Erasable Programmable Read Only Memory (EPROM)

An EPROM is erased by shining ultraviolet light on the cells through a transparent window in the package. The UV radiation renders the oxide slightly conductive by direct generation of electron-hole pairs in the material. The erasure process is slow and can take from seconds to several minutes, depending on the intensity of the UV source. Programming takes several (5-10) microseconds per word. Another problem with the process is limited endurance: the number of erase/program cycles is generally limited to a maximum of about 1000, mainly as a result of the UV erase procedure. Reliability is also an issue, since the device threshold might drift with repeated programming cycles. Most EPROM memories therefore contain on-chip circuitry to control the thresholds to within a specified range during programming. Finally, the injection always entails a large channel current, as high as 0.5 mA at a control gate voltage of 12.5 V, which causes high power dissipation during programming. The EPROM cell is extremely simple and dense, making it possible to fabricate large memories at low cost. EPROMs were therefore attractive in applications that did not require regular programming. Due to the cost and reliability issues above, however, EPROMs have fallen out of favor and have largely been replaced by Flash memories.

32.4 Electrically Erasable Programmable Read Only Memory

(EEPROM)

The major disadvantage of the EPROM approach is that the erasure procedure has to occur "off-system": the memory must be removed from the board and placed in an EPROM programmer for programming. The EEPROM approach avoids this labor-intensive and annoying procedure by using another mechanism, tunneling, to inject or remove charge from the floating gate. A modified floating-gate device, the FLOTOX (Floating gate Tunnel Oxide) transistor, is used as a programmable device that supports an electrical erasure procedure. A cross-section of the FLOTOX structure is shown in Figure 32.41.

Fig 32.41: FLOTOX Structure

It resembles the FAMOS device, except that a portion of the dielectric separating the floating gate from the channel and drain is reduced in thickness to about 10nm or less.


When a voltage of approximately 10 V is applied across the thin insulator, electrons can move to and from the floating gate by tunneling. The main advantage of this programming approach is that it is reversible: erasing is achieved simply by reversing the voltage applied during the writing process. Injecting electrons onto the floating gate raises the threshold, while the reverse operation lowers it. This bidirectionality, however, introduces a threshold-control problem: removing too much charge from the floating gate results in a depletion device that cannot be turned off by the standard wordline signals. Notice that the resulting threshold voltage depends on the initial charge on the gate as well as on the applied programming voltages. It is also a strong function of the oxide thickness, which is subject to non-negligible variations over the die. To remedy this problem, an extra transistor connected in series with the floating gate transistor is added to the EEPROM cell. This transistor acts as the access device during the read operation, while the FLOTOX transistor performs the storage function. This is in contrast to the EPROM cell, where the FAMOS transistor acts as both the programming and the access device.

The EEPROM cell, with its two transistors, is larger than its EPROM counterpart. This area penalty is further aggravated by the fact that the FLOTOX device is intrinsically larger than the FAMOS transistor due to the extra area of the tunneling oxide. Additionally, fabrication of very thin oxides is a challenging and costly manufacturing step. Thus EEPROM components pack fewer bits at a higher cost than EPROMs. On the positive side, EEPROMs offer high versatility, and they tend to last longer, supporting up to 100,000 erase/write cycles. Repeated programming eventually causes a drift in the threshold voltage due to permanently trapped charges in the SiO2, which finally leads to malfunction or the inability to reprogram the device.
Recap In this lecture you have learnt the following

• Non-Volatile READ-WRITE Memory • The Floating Gate Transistor • Erasable Programmable Read Only Memory (EPROM) • Electrically Erasable Programmable Read Only Memory (E2PROM)

Congratulations, you have finished Lecture 32.


Module 7 : I/O PADs Lecture 33 : I/O PADs Objectives In this lecture you will learn the following

• Introduction • Electrostatic Discharge • Output Buffer • Tri-state Output Circuit • Latch-Up • Prevention of Latch-Up

33.1 Introduction

Pad cells surround the rectangular metal patches where external bonds are made. Pads must be sufficiently large and sufficiently spaced apart from each other. There are three main types of pad cells: input, output and power (plus tristate and analog variants). Typical pad cells should have

• sufficient connection area (e.g. 85 x 85 microns) in the pad • electrostatic discharge (ESD) protection structures • an interface to internal circuitry • circuitry specific to input and output pads

Pads are generally arranged around the chip perimeter in a "pad frame". In smaller designs the pad frame has a single ring of pads. The lower limit on pad size is the minimum size to which a bond wire can be attached, typically 100-150 micrometers; this also sets the minimum pitch at which bonding machines can operate. The gates of input buffer transistors connected to input pads are susceptible to high-voltage build-up, so ESD protection is needed for them. Output pads are expected to drive large capacitive loads, so the load characteristics must be met by proper sizing of the output buffer. Because of the large transistors involved, I/O currents are high and latch-up may occur; to prevent this, guard rings are used in the layout. For area efficiency, I/O transistors should be constructed from several small transistors in parallel. Long gates must be provided to reduce the tendency toward avalanche breakdown.

33.2 Electrostatic Discharge (ESD)

ESD damage is usually caused by poor handling procedures and is especially severe in low-humidity environments. Electrostatic discharge is a pervasive reliability concern in VLSI circuits. It is a short-duration (<200 ns), high-current (>1 A) event that causes irreparable damage. The most common manifestation is the human-body ESD event, where a charge of about 0.6 µC can be induced on a body capacitance of 100 pF, leading to electrostatic potentials of 4 kV or greater.

Page 152: VLSI Design

Whenever the body comes in contact with plastic or other insulating material, static charge is generated. It can be a very small charge, as low as a few nanocoulombs, but it can cause serious damage to MOS devices, because the resulting voltages are quite high. We know that Q = CV, so V = Q/C, and for a charging current I flowing for a time t, V = It/C. Let us consider a modest 1 pF capacitance on which this 1 nC charge is deposited (for example, through a 100 uA current flowing for about 10 microseconds). This results in V = Q/C = 1 nC / 1 pF = 1000 V.

The SiO2 breakdown field is about 10^9 volts/meter. If the gate oxide is about 0.1 um thick, say,

the maximum allowable voltage across it is 10^9 V/m x 0.1 um = 100 V. Voltages far above this can easily be generated by walking across a carpet!! A human touch can produce instantaneous voltages of 20,000 volts! A typical solution to the ESD protection problem is to use clamping diodes implemented with MOS transistors whose gates are tied to GND for nMOS transistors, or to VDD for pMOS transistors, as shown in Figure 33.21. For the normal range of input voltages these transistors are in the OFF state. If the input voltage builds up above (or below) a certain level, one of the transistors starts to conduct, clamping the input voltage at that level.
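The back-of-the-envelope arithmetic above can be checked with a short script; all values are the illustrative numbers quoted in the text, not measured data:

```python
# Illustrative ESD arithmetic with the example numbers from the text.
Q = 1e-9            # static charge, coulombs (1 nC)
C = 1e-12           # node capacitance, farads (1 pF)
V = Q / C           # V = Q/C
print(V)            # ~1000 V across the input node

E_bd = 1e9          # SiO2 breakdown field, V/m
t_ox = 0.1e-6       # gate oxide thickness, m (0.1 um)
V_max = E_bd * t_ox # maximum voltage the oxide can withstand
print(V_max)        # ~100 V, an order of magnitude below V
```

The tenfold gap between the roughly 1000 V induced and the roughly 100 V the oxide survives is exactly why clamping structures are mandatory on every pad.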

Fig 33.21: Clamping Transistors


These clamping transistors are very big structures consisting of a number of transistors connected in parallel, and are able to sustain significant current. The thick field-oxide NMOS design is not suitable for deep submicron processes, and the thin-oxide NMOS presents oxide breakdown problems while interfacing between blocks with high power supply voltages. Scaling of VLSI devices has reduced the dimensions of all structures used in ICs, and this has increased their susceptibility to ESD damage. Hence ESD protection issues are becoming increasingly important for deep submicron technologies. Gate oxide thicknesses are approaching the tunneling regime of around 35 Angstroms. From an ESD perspective, the important issue is whether oxide breakdown is reached before the protection devices are able to turn on! 33.3 Output Buffer Intra-chip buffer circuits are relatively well known. They are fast, and need only be as big as needed to drive their particular load capacitances. In the inter-chip buffer design case, however, there are some very important limitations. First, these buffers must be able to drive large capacitive loads, as they are driving off-chip signals, which means driving I/O pads, parasitic board capacitances, and capacitances on other chips. Adding a few picofarads of capacitance at the output node is then inconsequential, and shouldn't significantly degrade the propagation delay through this structure. The output load for worst-case design is therefore taken to be about 50 times a normal on-chip load, approximately 50 pF. The simplest driver for the output pad consists of a pair of inverters with large transistors, in addition to the standard ESD protection circuitry. The driver must be able to supply enough current (must have enough drive capability) to achieve satisfactory rise and fall times (tr, tf) for a given capacitive load.
In addition the driver must meet any required DC characteristics regarding the levels of output voltages for a given load type, viz. CMOS or TTL.

Fig 33.31: Output Buffer
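Such a driver is usually built as a tapered chain of inverters. As a sketch of the sizing (the 50 pF load is from the text; the 50 fF minimum-inverter input capacitance and the taper-of-e rule are assumed textbook values, not figures from this lecture):

```python
import math

C_load = 50e-12                 # off-chip load from the text, farads
C_in = 50e-15                   # assumed input cap of a minimum inverter
F = C_load / C_in               # overall electrical effort (~1000)
N = max(1, round(math.log(F)))  # stage count minimizing delay (taper ~ e)
taper = F ** (1.0 / N)          # fan-out applied at each stage
print(N, round(taper, 2))       # 7 stages, each roughly 2.7x larger
```

With these assumed numbers, seven stages with a per-stage fan-out of about 2.7 bridge the thousand-fold capacitance ratio.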


The design method is the same as we have already discussed in previous lectures: the optimum number of stages is found for a load capacitance assumed a priori, and the logical effort method can be used to decide the sizing. Second, the voltage across any oxide at any time should not be greater than the supply voltage, which ensures oxide reliability; most process design engineers will not guarantee oxide reliability for oxide voltages greater than the chip VDD. If a low voltage chip is tied to a bus which connects several chips, some with higher supply voltages, then the input buffer must be designed such that there is no chance of a problem with the oxide. 33.4 Tri-State Output Circuit The output circuits of VLSI chips are designed to be tri-statable, as shown in Figure 33.41: the pad is driven only when the output enable signal is asserted. The circuit implementation requires 12 transistors. However, in terms of silicon area, this implementation may require a relatively small area, since the last-stage transistors need not be sized large.

Fig 33.41 Tri-State Output Circuit

33.5 Latch-Up Large MOS transistors are susceptible to the latch-up effect. In the chip substrate, at the junctions of the p and n material, parasitic pnp and npn bipolar transistors are formed, as shown in the cross-sectional view in Figure 33.51.


Fig 33.51: Latch-Up

These bipolar transistors form a silicon-controlled rectifier (SCR) with positive feedback, as shown in the circuit model in Figure 33.52.

Fig 33.52: SCR With Positive Feedback


The final result of the latch-up is the formation of a short-circuit (a low impedance path) between VDD and GND which results in the destruction of the MOS transistor. 33.6 Prevention of Latch-Up

Fig 33.61: Latch-Up Prevention Techniques

The following techniques can be used to prevent latch-up:

• Use p+ guard rings connected to ground around nMOS transistors and n+ guard rings connected to VDD around pMOS transistors to reduce Rwell and Rsub and to capture injected minority carriers before they reach the base of the parasitic BJTs.

• Place substrate and well contacts as close as possible to the source connections.

• Use minimum area p-wells (in case of twin-tub technology or n-type substrate) so that the p-well photocurrent can be minimized during transient pulses.

• Source diffusion regions of pMOS transistors should be placed so that they lie along equipotential lines when currents flow between VDD and p-wells. In some n-well I/O circuits, wells are eliminated by using only nMOS transistors.

• Avoid the forward biasing of source/drain junctions so as not to inject high currents; the use of a lightly doped epitaxial layer on top of a heavily doped substrate has the effect of shunting lateral currents from the vertical transistor through the low-resistance substrate.

• Lay out n- and p-channel transistors such that all nMOS transistors are placed close to the GND rail and all pMOS transistors are placed close to the VDD rail. Also maintain sufficient spacing between pMOS and nMOS transistors.


Recap In this lecture you have learnt the following

• Introduction • Electrostatic Discharge • Output Buffer • Tri-state Output Circuit • Latch-Up • Prevention of Latch-Up

Congratulations, you have finished Lecture 33.


Module 2 : MOSFET Lecture 4 : MOS Capacitor Objectives In this lecture you will learn the following

• MOS as Capacitor • Modes of operation • Capacitance calculation of MOS capacitor

4.1 MOS as Capacitor Referring to Fig. 4.1, we can see that there is an oxide layer below the gate terminal. Since the oxide is a very good insulator, it contributes an oxide capacitance to the circuit.

Fig 4.1: Cross-section view of MOS Capacitor 4.2 Modes of operation Depending upon the value of gate voltage applied, the MOS capacitor works in three modes :

Fig 4.2a: Accumulation mode (grey layer - strong hole concentration)

Normally, the capacitance value of a capacitor doesn't change with values of voltage applied across its terminals. However, this is not the case with MOS capacitor. We find that the capacitance of MOS capacitor changes its value with the variation in Gate voltage. This is because application of gate voltage results in band bending in silicon substrate and hence variation in charge concentration at Si-SiO2 interface.


Fig 4.2b: Depletion Mode (light grey layer – depletion region)

1. Accumulation: In this mode, there is accumulation of holes (assuming an n-MOSFET, i.e. a p-type substrate) at the Si-SiO2 interface. All the field lines emanating from the gate terminate on this layer, giving an effective dielectric thickness equal to the oxide thickness (shown in Fig. 4.2a). In this mode, Vg < 0.

2. Depletion: As we move from negative to positive gate voltages, the holes at the interface are repelled and pushed back into the bulk, leaving a depleted layer. This layer counters the positive charge on the gate and keeps widening as long as the gate voltage is below the threshold voltage. As shown in Fig. 4.2b, we see a larger effective dielectric thickness and hence a lower capacitance.

3. Strong Inversion: When Vg crosses the threshold voltage, the increase in depletion region width stops, and the gate charge is countered by mobile electrons at the Si-SiO2 interface. This is called inversion because the mobile charges are opposite in type to the charges found in the substrate; in this case the inversion layer is formed by electrons. Field lines hence terminate on this layer, thereby reducing the effective dielectric thickness, as shown in Fig. 4.2c.

Fig 4.2c: Strong Inversion mode (grey layer - strong electron concentration, light grey - depletion region)
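The three bias modes above can be captured in a few lines of code. The oxide thickness and threshold voltage are assumed example values, and the flat-band voltage is approximated as 0 V, matching the text's Vg < 0 boundary for accumulation:

```python
def mos_mode(V_g, V_t):
    # Bias-mode classification for a p-substrate MOS capacitor,
    # using the simplified boundaries from the text.
    if V_g < 0:
        return "accumulation"
    if V_g < V_t:
        return "depletion"
    return "inversion"

eps_ox = 3.9 * 8.854e-12   # SiO2 permittivity, F/m
t_ox = 10e-9               # assumed oxide thickness, m
C_ox = eps_ox / t_ox       # oxide capacitance per unit area, F/m^2
print(C_ox)                # ~3.5e-3 F/m^2
print(mos_mode(-1.0, 0.7), mos_mode(0.3, 0.7), mos_mode(1.5, 0.7))
```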

4.3 Capacitance calculation of MOS Capacitor In the last section, we gave an introduction to the MOS structure as a capacitor. Here we will see how it works as a capacitor, deriving some of the related equations.


where p and n are the hole and electron concentrations of the substrate and ni is the carrier concentration of the corresponding intrinsic semiconductor. We see that if we keep making the gate voltage more and more negative, the charges Qs and Qm keep increasing. Thus, the structure is acting like a good parallel plate capacitor, whose capacitance per unit area is essentially the oxide capacitance, Cox = eps_ox/tox.

Fig 4.3: Gate and Depletion charge of MOS Capacitor

For a positive bias voltage on the gate, increasing the gate voltage will increase Qm and Qs.

Using the depletion approximation, we can write the depletion width as a function of the surface potential psi_s as

xd = √(2 eps_si psi_s / (q NA))

where NA is the substrate acceptor density, eps_si is the permittivity of the substrate and psi_s is the surface potential at the substrate. The depletion region grows with increased voltage across the capacitor until strong inversion is reached. After that, further increase in the voltage results in inversion rather than more depletion. Thus the maximum depletion width, reached when psi_s = 2 phi_F, is:

xd,max = √(4 eps_si phi_F / (q NA))
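As a numeric sketch of the maximum depletion width (the doping level is an assumed example; constants are for silicon at room temperature):

```python
import math

q = 1.602e-19               # electron charge, C
kT_q = 0.0259               # thermal voltage at 300 K, V
eps_si = 11.7 * 8.854e-12   # silicon permittivity, F/m
N_A = 1e17 * 1e6            # assumed acceptor density, m^-3 (1e17 cm^-3)
n_i = 1.5e10 * 1e6          # intrinsic carrier density of Si, m^-3

phi_F = kT_q * math.log(N_A / n_i)                 # Fermi potential
x_dmax = math.sqrt(4 * eps_si * phi_F / (q * N_A)) # max depletion width
print(round(phi_F, 3), x_dmax)                     # ~0.41 V, ~0.1 um
```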

Also,

Therefore at

But by Gauss's law, electrons must compensate for increasing Qs. So,

where charge Qi is due to electrons in the inversion layer.

By Gauss's Law: Also by thermal equilibrium:


Earlier, owing to the low electric field, the electron-hole pairs formed below the oxide interface recombine. However, once the electric field increases, the electron-hole pairs formed are separated before they are able to recombine, so the free electron concentration increases.

By Kirchhoff's voltage law, the gate voltage is given by the sum of the drop across the oxide and the surface potential: VG = Vox + psi_s.

Recap In this lecture you have learnt the following

• MOS as Capacitor • Modes of operation • Capacitance calculation of MOS capacitor

Congratulations, you have finished Lecture 4.


Module 2 : MOSFET Lecture 5 : MOS Capacitor (Contd...) Objectives In this lecture you will learn the following

• Threshold Voltage Calculation • C-V characteristics • Oxide Charge Correction

5.1 Threshold Voltage Calculation

The threshold voltage is that gate voltage at which the surface band bending psi_s is twice the Fermi potential phi_F,

where phi_F = (kT/q) ln(NA/ni).

We know that the depth of the depletion region, for psi_s between 0 and 2 phi_F, is given by xd = √(2 eps_si psi_s / (q NA)).

The charge in the depletion region at threshold (psi_s = 2 phi_F) is given by QD,max = q NA xd,max, where xd,max = √(4 eps_si phi_F / (q NA)).

Beyond threshold, the total charge QD + Qi in the semiconductor has to balance the charge on the gate electrode, Qs, where we define Qi, the charge in the inversion layer, as a quantity which still needs to be determined. This leads to the following expression for the gate voltage (VFB being the flat-band voltage):

VG = VFB + psi_s + (QD + Qi)/Cox.

In the case of depletion there is no inversion layer charge, so Qi = 0, i.e. the gate voltage becomes

VG = VFB + psi_s + QD/Cox,

but in the case of inversion, the gate voltage will be given by:

VG = VFB + 2 phi_F + (QD,max + Qi)/Cox.


The second term in the second equality of the last expression states our basic assumption, namely that any change in gate voltage beyond threshold requires a change in the inversion layer charge. From the same expression, setting Qi = 0, we obtain the threshold voltage as:

VT = VFB + 2 phi_F + QD,max/Cox.
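Putting the threshold-voltage pieces together numerically (doping, oxide thickness and flat-band voltage are assumed example values, not parameters given in the lecture):

```python
import math

q = 1.602e-19
eps_si = 11.7 * 8.854e-12   # silicon permittivity, F/m
eps_ox = 3.9 * 8.854e-12    # oxide permittivity, F/m
kT_q = 0.0259               # thermal voltage, V
N_A = 1e17 * 1e6            # assumed substrate doping, m^-3
n_i = 1.5e10 * 1e6          # intrinsic density of Si, m^-3
t_ox = 10e-9                # assumed oxide thickness, m
V_fb = -0.9                 # assumed flat-band voltage, V

C_ox = eps_ox / t_ox
phi_F = kT_q * math.log(N_A / n_i)
Q_dmax = math.sqrt(4 * q * eps_si * N_A * phi_F)  # depletion charge per area
V_T = V_fb + 2 * phi_F + Q_dmax / C_ox
print(round(V_T, 2))        # ~0.39 V with these assumed values
```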

5.2 C-V Characteristics The low frequency and high frequency C-V characteristic curves of a MOS capacitor are shown in Fig 5.2.

Fig 5.2 : Low & High Frequency C-V curves
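The split between the two curves in Fig 5.2 can be illustrated numerically: at high frequency the capacitance bottoms out at the series combination of the oxide capacitance and the capacitance of the maximum-width depletion layer (the numbers below are assumed example values):

```python
def series(c1, c2):
    # Series combination of two capacitances (per unit area)
    return c1 * c2 / (c1 + c2)

C_ox = 3.45e-3              # F/m^2, assumed 10 nm oxide
eps_si = 11.7 * 8.854e-12   # silicon permittivity, F/m
x_dmax = 1.0e-7             # assumed maximum depletion width, m
C_dep = eps_si / x_dmax     # depletion-layer capacitance at x_dmax
C_min = series(C_ox, C_dep) # high-frequency minimum capacitance
print(C_min / C_ox)         # well below 1: the HF curve stays low in
                            # inversion while the LF curve returns to C_ox
```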

The low frequency or quasi-static measurement maintains thermal equilibrium at all times. This capacitance is the ratio of the change in charge to the change in gate voltage, measured while the capacitor is in equilibrium. A typical measurement is performed with an electrometer, which measures the charge added per unit time as one slowly varies the applied gate voltage. The high frequency capacitance is obtained from a small-signal capacitance measurement at high frequency. The bias voltage on the gate is varied slowly to obtain the capacitance versus voltage. Under such conditions, one finds that the charge in the inversion layer cannot follow the signal and does not change from the equilibrium value corresponding to the applied DC voltage. The high frequency capacitance therefore reflects only the charge variation in the depletion layer and the (rather small) movement of the inversion layer charge. 5.3 Oxide Charge Correction To keep the value of VT within -1 volt and +1 volt, an n-channel device needs high doping (similarly, a p-channel device needs high doping). Recap In this lecture you have learnt the following

• Threshold Voltage Calculation • C-V characteristics • Oxide Charge Correction

Congratulations, you have finished Lecture 5.


Module 2 : MOSFET Lecture 6 : MOSFET I-V characteristics Objectives In this lecture you will learn the following

• Derivation of I-V relationship • Channel length modulation and body bias effect

6.1 Derivation of I-V relationship

In this section, the relation between the drain current IDS and the applied terminal voltages is discussed. We assume that the gate-body voltage drop is more than the threshold voltage VT, so that mobile electrons are created in the channel. This implies that the transistor is either in the linear or the saturation region. Here we will derive some simple I-V characteristics of the MOSFET, assuming that the device essentially acts as a variable resistor between source and drain, and that only the drift (ohmic) current needs to be calculated. Also note that the MOSFET is basically a two-dimensional device. The gate voltage produces a field in the vertical (x) direction, which induces charge in the silicon, including charge in the inversion layer. The drain-to-source voltage VDS produces a field in the lateral (y) direction, and current flows (predominantly) in the y-direction. Strictly speaking, we must solve the 2-D Poisson and continuity equations to evaluate the I-V characteristics of the device. These are analytically intractable. We therefore resort to the gradual channel approximation described below. To find the current flowing in the MOS transistor, we need to know the charge in the inversion layer. This charge, Qn(y) (per sq. cm), is a function of position along the channel, since the potential varies going from source to drain. We assume that Qn(y) can be found at any point y by solving the Poisson equation only in the x direction, that is, treating the gate-oxide-silicon system in the channel region very much like a MOS capacitor. This is equivalent to assuming that the vertical electric field Ex is much larger than the horizontal electric field Ey, so that the solution of the one-dimensional Poisson equation is adequate. This gradual channel approximation (the voltage varies only gradually along the channel) is quite valid for long channel MOSFETs, since Ey is small. For Qn(y), using the charge control relation at location y, we have:

Now we turn our attention to evaluating the resistance of an infinitesimal element of length dy along the channel (as shown in Fig 6.21). Assuming that only drift current is present and hence applying Ohm's law, we get:


Fig 6.21: Cross Sectional View of channel

Here the length of the element is dy and its cross-sectional area is A = W xi, where xi is the inversion layer thickness.

Now using equation (6.22), We have:

Since the conductivity is varying along the transverse direction, we define an average (effective) value as:

Now using in eqn (6.23) and rearranging the terms, we will get:

Neglecting recombination-generation implies IDS(y) = IDS, i.e. the current is constant throughout the channel. Integrating the RHS of eqn (6.26) from 0 to VDS and the LHS from 0 to L, we get

Now substituting Qn(y) from eqn (6.21) into eqn (6.27), we get:

Eqn (6.29) holds true for VDS ≤ VGS - VT.


The drain current first increases linearly with the applied drain-to-source voltage, but then reaches a maximum value. This occurs due to the formation of a depletion region between the pinch-off point and the drain. This behavior is known as drain current saturation, which is observed for VDS ≥ VGS - VT, as shown in the figure below.

Fig 6.22: IDS-VDS graph

The saturation current IDSsat is given by eqn (6.210).
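The long-channel model derived above can be written as a small function. The process transconductance k and the W/L ratio below are assumed example values, and subthreshold conduction is ignored:

```python
def i_ds(V_gs, V_ds, V_t=0.7, k=200e-6, W_over_L=10):
    # Long-channel square-law model; k = mu_n * C_ox in A/V^2 (assumed).
    if V_gs <= V_t:
        return 0.0                       # cutoff (subthreshold ignored)
    if V_ds < V_gs - V_t:                # linear (triode) region
        return k * W_over_L * ((V_gs - V_t) * V_ds - V_ds ** 2 / 2)
    return 0.5 * k * W_over_L * (V_gs - V_t) ** 2   # saturation
```

For example, i_ds(1.7, 2.0) with these default parameters gives 1 mA in saturation, and the two branches meet continuously at VDS = VGS - VT.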

6.2 Channel length modulation and body bias effect The observed current IDS does not truly saturate, but has a small finite slope, as shown in Fig 6.31. This is attributed to channel

Fig 6.31: Actual vs Ideal IDS-VDS graph


length modulation. Channel length modulation in the MOSFET is caused by the increase in depletion layer width at the drain as the drain voltage is increased. This leads to a shorter effective channel length (reduced by ΔL) and increased drain current. When the channel length of the MOSFET is decreased and the MOSFET is operated beyond channel pinch-off, the relative importance of the pinch-off length with respect to the physical length increases. This effect can be included in the saturation current as:

IDsat = IDsat0 (1 + λ VDS), where IDsat0 is the value from eqn (6.210).

Here λ is called the channel length modulation coefficient. Till now we assumed the body of the MOSFET to be grounded. We will now take the effect of body bias into account, i.e. the body being applied a negative voltage (relative to the source) in the case of an n-MOSFET. Application of VSB > 0 increases the potential built up across the semiconductor. The depletion region widens in order to supply the extra required field, which implies a higher VT. Viewed from the point of view of the energy band diagram, a higher potential needs to be applied to the gate in order to bend the bands by the same amount, i.e. to create the same electron concentration in the channel. The body bias thus modulates the threshold voltage, governed by the following equation:

VT = VT0 + γ (√(2 phi_F + VSB) - √(2 phi_F))
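Both second-order effects can be bolted onto the saturation current in a few lines (λ, γ, phi_F and the other parameter values below are assumed for illustration):

```python
import math

def i_dsat(V_gs, V_ds, V_sb=0.0, V_t0=0.7, k=2e-3,
           lam=0.05, gamma=0.4, phi_F=0.4):
    # Saturation current with channel-length modulation (lam) and the
    # body-effect shift of the threshold voltage (gamma); values assumed.
    V_t = V_t0 + gamma * (math.sqrt(2 * phi_F + V_sb) - math.sqrt(2 * phi_F))
    return 0.5 * k * (V_gs - V_t) ** 2 * (1 + lam * V_ds)

print(i_dsat(1.7, 2.0))             # finite slope: current grows with V_ds
print(i_dsat(1.7, 2.0, V_sb=1.0))   # body bias raises V_t, lowering current
```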

where γ is known as the body coefficient. Recap In this lecture you have learnt the following:

• Derivation of I-V relationship • Channel length modulation and body bias effect


Module 2 : MOSFET Lecture 7: Advanced Topics Objectives In this lecture you will learn the following

• Motivation for Scaling • Types of Scaling • Short channel effect • Velocity saturation

7.1 Motivation for Scaling The reduction of the dimensions of the MOSFET has been dramatic during the last three decades. Starting at a minimum feature length of 10 µm in 1970, the gate length was gradually reduced to a 0.15 µm minimum feature size in 2000, corresponding to a roughly 13% reduction per year. Proper scaling of the MOSFET, however, requires not only a size reduction of the gate length and width but also a reduction of all other dimensions, including the gate/source and gate/drain alignment, the oxide thickness and the depletion layer widths. Scaling of the depletion layer widths also implies scaling of the substrate doping density. In short, we will study simplified guidelines for shrinking device dimensions to increase transistor density and operating frequency and to reduce power dissipation and gate delays. 7.2 Types of Scaling Two types of scaling are common:

1) constant field scaling and 2) constant voltage scaling.

Constant field scaling yields the largest reduction in the power-delay product of a single transistor. However, it requires a reduction in the power supply voltage as one decreases the minimum feature size. Constant voltage scaling does not have this problem and is therefore often the preferred scaling method, since it provides voltage compatibility with older circuit technologies. The disadvantage of constant voltage scaling is that the electric field increases as the minimum feature length is reduced. This leads to velocity saturation, mobility degradation, increased leakage currents and lower breakdown voltages. After scaling, the different MOSFET parameters transform as given by the table below: Before Scaling | After Constant Field Scaling | After Constant Voltage Scaling
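The body of the scaling table did not survive reproduction here, but the first-order consequence of the two schemes can be checked with the square-law drive current, keeping proportionalities only (s is the scaling factor; all constants are dropped):

```python
def drive_current(W, L, t_ox, V):
    # I ~ (W/L) * (1/t_ox) * V^2, proportionalities only
    return (W / L) * (1.0 / t_ox) * V ** 2

s = 2.0
I0 = drive_current(1, 1, 1, 1)
I_cf = drive_current(1/s, 1/s, 1/s, 1/s)  # constant field: V scales too
I_cv = drive_current(1/s, 1/s, 1/s, 1.0)  # constant voltage: V held fixed
print(I_cf / I0, I_cv / I0)  # 0.5 and 2.0: current shrinks as 1/s under
                             # constant field, grows as s at constant voltage
```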


where s is the scaling parameter. 7.3 Short Channel Effect So far our discussion was based upon the assumptions that the channel was long and wide enough, so that "edge" effects along the four sides were negligible, the longitudinal field was negligible, and the electric field at every point was perpendicular to the surface. So we could perform a one-dimensional analysis using the gradual channel approximation. But in devices where the channel is short, the longitudinal field is not negligible compared to the perpendicular field. In that case a one-dimensional analysis gives wrong results and we would have to perform a two-dimensional analysis taking into account both longitudinal and vertical fields (which is out of the scope of this course). When is a channel called a short channel?

(i) When the junction (source/drain) length is of the order of the channel length. (ii) When L is not much larger than the sum of the drain and source depletion widths.

We have shown below the comparative graphs of I-V characteristics for both long channel and short channel MOSFETs. From the graph, it can be clearly concluded that when the channel becomes short, the current in the saturation region becomes linearly dependent on the applied drain voltage rather than showing a square-law dependence.


Figure 7.3: Comparison of ID vs VDS characteristics

for long and short channel MOSFET devices 7.4 Velocity Saturation As long as the longitudinal field in the channel is small, the magnitude of the carrier velocity |vd| is proportional to |Ex|. It has been observed, however, that for high values of |Ex| the carrier velocity tends to saturate; it is no longer proportional to |Ex|. This lack of proportionality at high |Ex| values is known as velocity saturation.
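A common empirical way to capture this behaviour is a two-region velocity-field model; the mobility and saturation velocity below are rough assumed values for electrons in silicon:

```python
def drift_velocity(E, mu=0.07, v_sat=1e5):
    # |vd| = mu*|E| / (1 + mu*|E|/v_sat): linear in the field when
    # mu*E << v_sat, flattening toward v_sat at high fields.
    # mu in m^2/(V*s), v_sat in m/s, E in V/m -- assumed values.
    return mu * E / (1 + mu * E / v_sat)

print(drift_velocity(1e3))   # low field: ~mu*E (ohmic regime)
print(drift_velocity(1e8))   # high field: pinned just below v_sat
```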

Recap In this lecture you have learnt the following

• Motivation for Scaling • Types of Scaling • Short channel effect • Velocity saturation


Module 2 : MOSFET Lecture 8 : Short Channel Effects Objectives In this lecture you will learn the following

• Motivation • Mobility degradation • Subthreshold current • Threshold voltage variation • Drain induced barrier lowering (DIBL) • Drain punch through • Hot carrier effect • Surface states and interface trapped charge

8.1 Motivation As seen in the last lecture, as the channel length is reduced, departures from long channel behaviour may occur. These departures, which are called short channel effects, arise as a result of a two-dimensional potential distribution and high electric fields in the channel region. For a given channel doping concentration, as the channel length is reduced, the depletion layer widths of the source and drain junctions become comparable to the channel length. The potential distribution in the channel now depends on both the transverse field Ex (controlled by the gate voltage and back-surface bias) and the longitudinal field Ey (controlled by the drain bias). In other words, the potential distribution becomes two-dimensional, and the gradual channel approximation (i.e. Ex >> Ey) is no longer valid. This two-dimensional potential results in degradation of the threshold behaviour, dependence of the threshold voltage on the channel length and biasing voltages, and failure of current saturation due to the punch-through effect. In the following sections, we will study the various effects of short channel length in MOSFETs. 8.2 Mobility Degradation Mobility is important because the current in a MOSFET depends upon the mobility of the charge carriers (holes and electrons).


We can describe this mobility degradation by two effects:

Figure 8.2: Mobility degradation graph

i. Lateral Field Effect: In the case of short channels, as the lateral field is increased, the channel mobility becomes field-dependent and eventually velocity saturation occurs (as referred to in the previous lecture). This results in current saturation.

ii. Vertical Field Effect: As the vertical electric field also increases on shrinking the channel length, it results in scattering of carriers near the surface. Hence the surface mobility reduces (also described by the mobility dependence equation given below).

Thus for short channels we can see (in Figure 8.2) the mobility degradation which occurs due to velocity saturation and scattering of carriers. 8.3 Subthreshold Current An effect that is exacerbated by short channel designs is the subthreshold current, which arises from the fact that some electrons are induced in the channel even before strong inversion is established. For the low electron concentrations typical of the subthreshold regime, we expect diffusion current (proportional to carrier gradients) to dominate over drift current (proportional to carrier concentrations). For very short channel lengths,


such carrier diffusion from source to drain can make it impossible to turn off the device below threshold. The subthreshold current is made worse by the DIBL effect (will be explained in later sections) which increases the injection of electrons from the source. 8.4 Threshold Voltage variation with Channel Length

Figure 8.41: Dependence of VT on L for MOSFET

Figure 8.42: IDS Vs VGS for short channel

In the case of long channel MOSFETs, the gate has control over the channel and supports most of the depletion charge. As we go to short channel lengths, as seen in the graph above, the threshold voltage begins to decrease, since the charge in the depletion region is now partly supported by the drain and the source as well. Thus the gate needs to support less charge in this region and, as a result, VT falls. This phenomenon is known as the charge sharing effect. Now since IDS is proportional to (VGS - VT), as VT begins to fall in short channel devices, IDS increases, resulting in larger drain currents. Also, when VGS is zero and the MOSFET is in cut-off mode, since VT is small, (VGS - VT) is only a small negative value, which results in leakage current; multiplied by the drain voltage, this gives leakage power. In long channel MOSFETs, VT is large enough that (VGS - VT) is a comparatively large negative value in cut-off mode, and the leakage power is very small.


Transit Time: As seen in the previous lecture, a short channel results in velocity saturation over part of the channel, so the argument used to derive the transit time for the long channel MOSFET is no longer valid for short channel MOSFETs. We note that the actual transit time is larger than the value it would have if the electrons were moving at the maximum (saturation) speed over the whole channel.

Figure 8.42 shows that such a device operates in the 'flat' part of the IDS-VGS characteristic curve, from which we conclude that the transit time cannot be decreased by further increasing VGS. Quantum Mechanical Increase Effect: Another effect of quantum mechanics, which also increases with scaling, is a shift in the surface potential required for strong inversion. This effect arises from the so-called "energy quantization" of confined particles, which precludes electrons and holes from existing at zero energy in the conduction or valence bands. It is a direct consequence of the coupled Poisson-Schrodinger equation solution. This surface potential shift manifests itself as an increase in |VT|, which for long devices is given by:

The above equation tells us that |VT| increases as devices are scaled down. 8.5 Drain Induced Barrier Lowering (DIBL)

Figure 8.5: Surface potential graph at constant gate voltage (VDS and L are varied)

The source and drain depletion regions can intrude into the channel even without bias, as these junctions are brought closer together in short channel devices. This effect is called charge sharing (as mentioned earlier) since the source and drain in effect take part of the channel charge, which would otherwise be controlled by the gate. As the


drain depletion region continues to widen with the bias, it can actually interact with the source-to-channel junction and lower the potential barrier there. This problem is known as Drain Induced Barrier Lowering (DIBL). When the source junction barrier is reduced, electrons are easily injected into the channel and the gate voltage no longer has any control over the drain current. In the DIBL case,

From Figure 8.5, we can observe that under extreme conditions of encroaching source and drain depletion regions, the two curves can meet. 8.6 Drain Punch Through When the drain is at a high enough voltage with respect to the source, the depletion region around the drain may extend all the way to the source, causing current to flow irrespective of the gate voltage (i.e. even if the gate voltage is zero). This is known as the Drain Punch Through condition, and the punch-through voltage VPT is given by:
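The expression can be sketched under the assumption of a one-sided step junction: punch-through occurs roughly when the drain depletion width Wd reaches L, giving VPT ≈ q NA L²/(2 eps_si) - Vbi. The doping and built-in potential below are assumed example values:

```python
q = 1.602e-19
eps_si = 11.7 * 8.854e-12   # silicon permittivity, F/m
N_A = 1e17 * 1e6            # assumed substrate doping, m^-3
V_bi = 0.7                  # assumed built-in potential, V

def v_pt(L):
    # Drain voltage at which the drain depletion width reaches the source,
    # one-sided step-junction sketch (an assumption, not the text's formula)
    return q * N_A * L ** 2 / (2 * eps_si) - V_bi

print(v_pt(1e-6), v_pt(0.25e-6))  # the quadratic depletion term shrinks
                                  # 16x going from L = 1 um to 0.25 um
```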

So when the channel length L decreases (the short channel case), the punch-through voltage rapidly decreases. 8.7 Hot Carrier Effect Electric fields tend to be increased at smaller geometries, since device voltages are difficult to scale to arbitrarily small values. As a result, various hot carrier effects appear in short channel devices. The field in the reverse-biased drain junction can lead to impact ionization and carrier multiplication. The resulting holes contribute to substrate current, and some may move to the source, where they lower the source barrier, resulting in electrons being injected from the source into the p-region. In fact, an n-p-n transistor action can result within the source-channel-drain configuration and prevent gate control of the current. Another hot electron effect is the transport of energetic electrons over (or tunneling through) the barrier into the oxide. Such electrons become trapped in the oxide, where they change the threshold voltage and I-V characteristics of the device. Hot electron effects can be reduced by reducing the doping in the source and drain regions, so that the junction fields are smaller. However, lightly doped source and drain regions are incompatible with small geometry devices because of contact resistance and other similar problems. A compromise MOSFET design, called the Lightly Doped Drain (LDD) structure, uses two doping levels, with heavy doping over most of the source and drain areas and light doping in a region adjacent to the channel. The LDD structure decreases the field between the drain and channel regions, and thereby reduces injection into the oxide, impact ionization and other hot electron effects.


8.8 Surface States and Interface Trapped Charge At the Si-SiO2 interface, the lattice of bulk silicon, and all the properties associated with its periodicity, terminate. As a result, localized states with energies in the forbidden energy gap of silicon are introduced at or very near the Si-SiO2 interface. Interface trapped charges are electrons or holes trapped in these states. The probability of occupation of a surface state by an electron or a hole is determined by the surface state energy relative to the Fermi level. An electron in the conduction band can contribute readily to electrical conduction current, while an interface trapped electron does not, except by hopping among the surface states. Thus, by trapping electrons and holes, surface states can reduce the conduction current in MOSFETs. Surface states can also act as localized generation-recombination centers and lead to leakage currents. 8.9 Conclusion Because short channel effects complicate device operation and degrade device performance, these effects should be eliminated or minimized, so that a physically short channel device can preserve the electrical long channel behaviour. Recap In this lecture you have learnt the following

• Motivation • Mobility degradation • Subthreshold current • Threshold voltage variation • Drain induced barrier lowering (DIBL) • Drain punch through • Hot carrier effect • Surface states and interface trapped charge

Congratulations, you have finished Lecture 8.

Module 3 : Fabrication Process and Layout Design Rules

Lecture 9 : Introduction to Fabrication Process

Objectives

In this lecture you will learn the following:

• Motivation
• Photolithography
• Fabrication Process

9.1 Motivation

In the previous module, we studied the MOSFET in detail. VLSI circuits are very complex: a single chip consists of millions to billions of transistors, so it cannot be built by interconnecting a few discrete MOSFETs. Instead, we use photolithography, a technology for creating the circuit patterns on a silicon wafer surface; the overall process is called fabrication. In this lecture, we will study photolithography in detail: how it is done and what materials are used for it.

9.2 Photolithography

Photolithography is the method that sets the surface (horizontal) dimensions of the various parts of devices and circuits. Its goal is twofold. The first goal is to create, in and on the wafer surface, a pattern whose dimensions are as close to the device requirements as possible. This is known as the resolution of images on the wafer, and the pattern dimensions are known as the feature or image sizes of the circuit. The second goal is the correct placement, called alignment or registration, of the circuit patterns on the wafer. The entire set of circuit patterns must be correctly placed on the wafer surface, because misaligned mask layers can cause the entire circuit to fail.

Figure 9.1: Clear Field mask

Figure 9.2: Dark Field mask

In order to create patterns on the wafer, the required pattern is first formed on reticles or photomasks. The pattern on the reticle or mask is then transferred into a layer of photoresist. Photoresist is a light-sensitive material similar to the coating on a regular photographic film. Exposure to light causes changes in its structure and properties. If exposure causes the photoresist to change from soluble to insoluble, it is known as negative acting, and the chemical change is called polymerization. Conversely, if exposure causes it to change from relatively insoluble to much more soluble, it is known as positive acting, and the change is called photosolubilization. The exposure radiation is generally UV light or an electron beam. Removing the soluble portions with chemical solvents called developers leaves a pattern in the photoresist that depends on the type of mask used. A mask whose pattern is coded in the opaque regions is called a clear field mask. The pattern can also be coded in reverse; such masks are known as dark field masks. The result of the photomasking process for the different combinations of mask and resist polarities can be summarized as follows: a positive resist reproduces the coded pattern directly when used with a clear field mask (resist remains under the opaque regions), while a negative resist produces the reverse image; with a dark field mask the roles are interchanged.
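The mask/resist polarity combinations described above can be sketched as a small truth table. This is an illustrative model of the logic, not process documentation: a positive resist keeps resist under opaque mask regions, a negative resist keeps it under clear (exposed) regions.

```python
def image_polarity(mask, resist):
    """Does the developed resist image match the coded pattern, or reverse it?

    mask:   'clear field' (pattern coded in opaque regions) or
            'dark field'  (pattern coded in clear regions)
    resist: 'positive' or 'negative'
    """
    # Positive resist: exposed areas dissolve, so resist remains under opaque regions.
    # Negative resist: exposed areas polymerize, so resist remains under clear regions.
    remains = "opaque" if resist == "positive" else "clear"
    pattern = "opaque" if mask == "clear field" else "clear"
    return "same as pattern" if remains == pattern else "reversed"

for mask in ("clear field", "dark field"):
    for resist in ("positive", "negative"):
        print(f"{mask:11s} + {resist:8s} resist -> {image_polarity(mask, resist)}")
```

Either polarity of final image can therefore be obtained from either mask type by choosing the resist accordingly.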

The second transfer takes place from the photoresist layer into the wafer's surface layer. The transfer occurs when etchants remove the portion of the wafer's top layer that is not covered by the photoresist. The chemistry of the photoresists is such that they do not dissolve in the chemical etching solutions; they are etch resistant, hence the name photoresist. The etchant generally used to remove silicon dioxide is hydrofluoric acid (HF).

The choice of mask and resist polarity is a function of the level of dimensional control and defect protection required to make the circuit work. For example, sharp lines are not obtainable with negative photoresists, while the etchants are difficult to handle with positive photoresists. After the pattern has been transferred onto the resist, the underlying thin layer must be etched. The etching process etches into a specific layer the circuit pattern that was defined during the photomasking process. For example, aluminium interconnections are obtained by etching the aluminium layer.

9.3 Fabrication Process

Why a polysilicon gate? The most significant aspect of using polysilicon as the gate electrode is that it can serve as a further mask, allowing precise definition of the source and drain regions. This is achieved with minimum gate-to-source/drain overlap, which leads to lower overlap capacitances and improved circuit performance.

Procedure:

1. A thick layer of oxide is grown on the wafer surface, known as the field oxide (FOX). It is much thicker than the gate oxide. It acts as a shield that protects the underlying substrate from impurities while other processes are carried out on the wafer. It also helps prevent conduction between unrelated transistor sources/drains. In fact, the thick FOX can act as the gate oxide of a parasitic MOS transistor; the threshold voltage of this parasitic transistor is much higher than that of a regular transistor because of the thick field oxide. A high threshold is further ensured by introducing a channel-stop diffusion underneath the field oxide, which raises the impurity concentration of the substrate in the areas where transistors are not required.

2. A window is opened in the field oxide over the area where the transistor is to be made. A thin, highly controlled layer of oxide is then grown where active transistors are desired; this is called the gate oxide, or thinox. The thick layer of silicon dioxide is retained elsewhere to isolate the individual transistors.

3. The thin gate oxide is etched to open windows for the source and drain diffusions. Ion implantation or diffusion is used for the doping. The former tends to produce shallower junctions, which are compatible with fine-dimension processes. Since diffusion proceeds in all directions, the deeper a junction is, the more it spreads laterally. This lateral spread determines the overlap between the gate and the source/drain regions.

4. Next, a gate delineation mask is used to define the gate area. There must be minimum overlap between the gate and the source/drain regions; this is referred to as a self-aligned process because the source and drain do not extend under the gate. Polysilicon is then deposited over the oxide.

5. The complete structure is then covered with silicon dioxide, and contact holes are etched, using the contact window mask, down to the surfaces to be contacted. These holes allow metal to contact the diffusion or polysilicon regions.

6. Metallization is then applied to the surface and, using the interconnect mask, selectively etched to produce the circuit interconnections.

7. As a final step, the wafer is passivated, and openings to the bond pads are etched to allow for wire bonding. Passivation protects the silicon surface against the ingress of contaminants that can modify circuit behavior.
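Step 1 above notes that the thick field oxide raises the parasitic field transistor's threshold voltage. One way to see why: the depletion-charge term of the threshold voltage scales as Q_dep / C_ox, and since C_ox = ε_ox / t_ox, the term grows linearly with oxide thickness. A minimal sketch with assumed charge and thickness values (illustration only, not process data):

```python
# Depletion-charge contribution to V_T: Q_dep / C_ox, with C_ox = eps_ox / t_ox.
# Thicker oxide -> smaller C_ox -> larger contribution -> higher threshold.

EPS_OX = 3.45e-13   # F/cm, permittivity of SiO2
Q_DEP = 5e-8        # C/cm^2, assumed depletion charge per unit area

def depletion_term_volts(t_ox_nm):
    """Q_dep / C_ox in volts for an oxide of thickness t_ox_nm (1 nm = 1e-7 cm)."""
    c_ox = EPS_OX / (t_ox_nm * 1e-7)
    return Q_DEP / c_ox

print(depletion_term_volts(10))    # thin gate oxide (~10 nm)
print(depletion_term_volts(500))   # thick field oxide (~500 nm): 50x larger term
```

Together with the channel-stop diffusion (which increases Q_dep itself), this keeps the parasitic field device safely off at normal operating voltages.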

Recap

In this lecture you have learnt the following:

• Motivation
• Photolithography
• Fabrication Process

Congratulations, you have finished Lecture 9.