Transistors - a primer

What is a transistor?
• Solid-state “triode”: a three-terminal device in which the voltage (or current) at the third terminal controls the current between the other two terminals.
• Two types: bipolar junction transistors and field effect devices.
• We concentrate on FETs, whose terminals are the source, drain, and gate.

The history: vacuum tubes
The basic idea of three-terminal devices for current control goes back to 1906, when Lee de Forest invented the vacuum triode.
[Figure: vacuum triode, with grid (base), anode (collector), cathode (emitter), and heater labeled.]
• Ground anode; bias cathode to large negative voltage w.r.t. anode.
• Heater “boils” electrons off cathode (thermionic emission); they are accelerated through the grid toward the anode.
• ac voltage on grid can modulate electron current!
Three-terminal device with gain: dawn of the information age.

Problems with vacuum tubes:
• Bulky
• Fragile
• Long warm-up times
• High power consumption, largely wasted
• High speeds difficult: require extra plates, grids to minimize capacitances (pentodes)
• Miniaturization very challenging.
There must be a better way….
Vacuum tubes are still best for:
• High powers (ex.: e-gun power supplies)
• Radiation-hardened electronics (ex.: B-52s)

The history: Lilienfeld
Basic idea proposed (1930): “Method and apparatus for controlling electric currents”
Use a third electrode to modulate current between ohmic contacts on a semiconductor. Solid-state triode.
(Figure from Pierret.)

Basic field effect transistor idea
• Normally off (Lilienfeld): gate acts like a capacitor plate; applied voltage creates a “channel” with free carriers, connecting source and drain.
• Normally on (Shockley, grid-like): gate acts like a grid; applied voltage restricts the flow of carriers through the (lightly doped) semiconductor from source to drain.
Why did these ideas first run into trouble? Surface states! Materials quality prevented practical FETs until 1955….

Types of field effect transistors: JFET and MESFET
JFET = Junction FET; MESFET = MEtal Semiconductor FET.
[Figure: JFET and MESFET structures; “different metals” label.]
• Channel, source, drain all same type of doped semiconductor.
• Normally “on”: requires gate voltage to turn off source-drain conduction; devices operate in depletion mode.
• Gate can be anywhere between source and drain.
• Current flow restricted by depletion zone (from pn junction in JFET; from Schottky barrier in MESFET).
• Fairly robust (no super-thin insulating layers, etc.)

Types of field effect transistors: MOSFET or MISFET
Metal Oxide Semiconductor or Metal Insulator Semiconductor FET.
[Figure: MOSFET structure with channel labeled.]
• Source, drain have a different doping type than the bulk of the semiconductor.
• Normally OFF: gate must be biased sufficiently to invert the channel region in order to see transistor action; devices operate in enhancement (inversion) mode.
• Requires gate to extend over source and drain.
• Thin insulating barrier (gate oxide) necessary.
• Bulk may be p or n: complementary metal-oxide-semiconductor processing (CMOS).

Types of field effect transistors: HEMT
High Electron Mobility Transistor.
[Figure: HEMT layer structure showing a layer of positive charge (+) above undoped GaAs.]
• Source, drain are ohmic contacts to GaAs 2DEG channel.
• Normally ON device (modulation doping); operates in depletion mode.
• Very high mobilities can lead to very high speed devices (cell phone electronics).
• Gate can be anywhere between source and drain.

Transistor operation and transconductance
Basic transistor operation:
[Figure: transistor circuit with gate voltage $V_G$, drain current $I_D$, and a load.]
One obvious figure of merit for a transistor is transconductance:
$$g_m \equiv \frac{\partial I_D}{\partial V_G}$$
Transistors with a higher $g_m$ are better switches than those with lower values.

The MOS system
At left are band diagrams for metal-insulator-semiconductor stacks (n-type and p-type). These are drawn assuming flat bands; that is, negligible charge transfer at interfaces to cause band bending, and negligible surface states. For these ideal cases, with no bias on the metal, the charge density of the semiconductor equals the full doped density right up to the interface.

Band picture + inversion
Ideal system when metal is biased:
• Accumulation mode: gate bias bends bands to enhance free charge density in the plane at the insulator-semiconductor interface.
• Depletion mode: gate bias bends bands to reduce free charge density in the plane at the insulator-semiconductor interface.
• Inversion mode: gate bias bends bands so much that the free charge density in the plane at the insulator-semiconductor interface has the opposite sign to that of the doping!

Threshold voltage
• The threshold voltage $V_T$ is defined as the gate voltage required to produce inversion in the channel.
• $V_T$ depends on the band structure, the doping level of the semiconductor, and the geometry of the device (oxide thickness, oxide + SC dielectric constants, etc.).
• Can be calculated in certain models (will do later).
• Often determined empirically.
• For an ideal intrinsic MOS stack, threshold voltage is zero. Assumes no surface states and bands flat when $V_G = 0$.
In the following, assume $V_T$ is just a device parameter.

Basic transistor operation and the linear regime
(Figure from Pierret.)
Assume first that $V_S = 0$ and $(V_G - V_T) \gg V_D$ (“gradual channel”).
$C_x$ = capacitance per unit area of gate oxide.
2d charge density in inversion layer:
$$e n_{2d} \approx C_x (V_G - V_T)$$
Total current from this layer (taking $I_D$ as the magnitude of the source-drain current):
$$I_D = W \, e n_{2d} \, \mu \, \frac{V_D}{L} \approx \frac{W}{L} \mu C_x (V_G - V_T) V_D$$
So, for small source-drain biases, $V_D \ll (V_G - V_T)$, the FET acts like a gate-controlled variable resistor:
$$g_m \equiv \frac{\partial I_D}{\partial V_G} = \frac{W}{L} \mu C_x V_D$$
[Figure: $I_D$ vs. $V_D$, straight lines whose slope increases with increasing $V_G$.]
• Higher gate capacitance, higher transconductance!
• Knowing device dimensions, can measure $I_D$ vs. $V_G$ and calculate the mobility from this linear regime.
• Mobility found in FETs tends to be lower than bulk: gate field enhances interface scattering.

Saturation regime - physical picture
(Figure from Pierret.)
What happens at higher source-drain voltages? That is, what about when $V_D \gtrsim (V_G - V_T)$?
Physically, the thickness and charge density of the inversion layer (channel) shrink along the length of the channel. When the inversion layer just vanishes at the drain, the device is at “pinch-off”. At higher values of $V_D$, for long channels, $I_D$ stops changing. The result is the “saturation regime”.

Saturation regime, quantitative: “square law”
Define the channel direction as y.
Local potential in channel = $\phi(y)$
Local charge density = $C_x (V_G - V_T - \phi(y))$
Local current:
$$I(y) = W \mu C_x \, (V_G - V_T - \phi(y)) \, \frac{d\phi}{dy}$$
The current is the same at every y, so integrating along the channel gives
$$\int_0^L I(y)\,dy = I_D L = W \mu C_x \int_0^{V_D} (V_G - V_T - \phi)\, d\phi$$
Result:
$$I_D = \frac{W \mu C_x}{L} \left[ (V_G - V_T) V_D - \frac{V_D^2}{2} \right], \qquad 0 \le V_D \le V_{D,\mathrm{sat}}, \quad V_G \ge V_T$$
Since $I_D$ only increases until pinch-off, can use the above formula to find both pinch-off voltage and saturation current:
$$V_{D,\mathrm{sat}} = V_G - V_T, \qquad I_{D,\mathrm{sat}} = \frac{W \mu C_x}{2L} (V_G - V_T)^2$$
So, assuming constant mobility and ignoring changes in depletion width down the length of the channel, we find that saturation current scales quadratically with $(V_G - V_T)$. This provides another way of inferring mobility…. (Tacit assumption: source, drain contact resistances are negligible.)

What sets equilibrium depletion width?
First, recall some definitions:
• $E_i$ = energy of the middle of the gap in the semiconductor.
• $\phi_S$ = potential at sc-oxide interface.
• $\phi_F = (1/e)\,[E_i(\mathrm{bulk}) - E_F]$
• $\phi(x) = (1/e)\,[E_i(\mathrm{bulk}) - E_i(x)]$
• $n_i$ = intrinsic carrier density $= \sqrt{N_C N_V}\, \exp(-E_g / 2 k_B T)$
For nondegenerate semiconductors,
$$\phi_F = \begin{cases} \dfrac{k_B T}{e} \ln(N_A / n_i) & \text{(p-type)} \\[2ex] -\dfrac{k_B T}{e} \ln(N_D / n_i) & \text{(n-type)} \end{cases}$$
Middle of depletion: $\phi_S = \phi_F$
Onset of inversion: $\phi_S = 2\phi_F$

Delta depletion
Exact self-consistent solution shows inversion charge confined to a very thin layer at the interface. Depletion width increases only slightly once inversion occurs. Approximation: further gating only affects inversion charge.
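As a rough numerical illustration of the square-law result and the $\phi_F$ definition above, here is a minimal Python sketch. The function names and the sample device parameters (threshold, mobility, oxide capacitance, dimensions, doping) are illustrative assumptions, not values from these notes. Saturation is put in by hand by clamping $V_D$ at $V_G - V_T$, and the channel is simply treated as off below threshold.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def fermi_potential(n_dopant, n_i, temp_k=300.0, p_type=True):
    """phi_F = +(k_B T/e) ln(N_A/n_i) for p-type, -(k_B T/e) ln(N_D/n_i) for n-type."""
    phi = (K_B * temp_k / Q_E) * math.log(n_dopant / n_i)
    return phi if p_type else -phi


def square_law_current(v_g, v_d, v_t, mu, c_x, width, length):
    """Drain current magnitude from the square law, held constant past pinch-off."""
    v_ov = v_g - v_t
    if v_ov <= 0.0 or v_d <= 0.0:
        return 0.0          # below threshold: treat the channel as off
    v_eff = min(v_d, v_ov)  # clamp at V_D,sat = V_G - V_T (inserted by hand)
    return (width * mu * c_x / length) * (v_ov * v_eff - 0.5 * v_eff ** 2)


# Hypothetical device in SI units; every number here is an assumption.
params = dict(v_t=0.5, mu=0.05, c_x=1.0e-2, width=1.0e-6, length=1.0e-7)

if __name__ == "__main__":
    # Doping and n_i in the same units (here cm^-3); only the ratio matters.
    print("phi_F (V):", fermi_potential(1e16, 1e10))
    for v_d in (0.1, 0.5, 1.0, 2.0):
        print("V_D =", v_d, "I_D =", square_law_current(1.5, v_d, **params))
```

Sweeping `v_d` at fixed `v_g` gives the familiar family of curves: linear at small bias, flat beyond $V_G - V_T$.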
Depletion width at some surface potential:
$$d = \left[ \frac{2 \varepsilon_s \varepsilon_0}{e N_A} \, \phi_S \right]^{1/2}$$
Depletion width at inversion:
$$d_T = \left[ \frac{2 \varepsilon_s \varepsilon_0}{e N_A} \, 2\phi_F \right]^{1/2}$$

“Bulk charge” picture
Takes into account variation in depletion width along channel. Suppose the depletion width near source and drain under no bias is $d_T$, and under bias it depends locally on position, $d(y)$. The induced charge density at position y is then
$$\underbrace{-C_x (V_G - V_T - \phi)}_{\text{inversion layer ``free'' charge}} \; + \; \underbrace{e N_A [\, d(y) - d_T \,]}_{\text{additional exposed acceptors}}$$
Defining
$$V_W \equiv \frac{e N_A d_T}{C_x}$$
and substituting our delta-depletion results for the d's gives
$$-C_x \left[ V_G - V_T - \phi - V_W \left( \sqrt{1 + \frac{\phi}{2\phi_F}} - 1 \right) \right]$$
With this more careful accounting, we find a more exact expression for the $I_D$-$V_D$ characteristics as a function of gate voltage:
$$I_D = \frac{W \mu C_x}{L} \left\{ (V_G - V_T) V_D - \frac{V_D^2}{2} - \frac{4}{3} V_W \phi_F \left[ \left( 1 + \frac{V_D}{2\phi_F} \right)^{3/2} - \left( 1 + \frac{3 V_D}{4 \phi_F} \right) \right] \right\}, \qquad 0 \le V_D \le V_{D,\mathrm{sat}}, \quad V_G \ge V_T$$
Neither the bulk charge picture nor the square law picture predicts saturation: it has to be inserted by hand into the model. Complete numerical solution of the whole system does, of course, give pretty nice results, including saturation.

What performance issues are important?
• Speed (10 GHz)
• Threshold voltage (< ~0.5 V)
• On-off ratio (> 10000)
• Off-current & sub-threshold behavior
• Durability (mean time to failure)

What limits speed?
Gate capacitance: Switching a FET requires moving charge on and off the gate. Assuming low-capacitance, high-conductance leads, the maximum frequency possible is set by when the gate admittance becomes comparable to the transconductance. With total gate capacitance $C_x W L$ and the linear-regime $g_m$ evaluated at $V_D = V_{D,\mathrm{sat}}$:
$$f_{\max} \approx \frac{g_m}{2\pi C_x W L} = \frac{\mu V_{D,\mathrm{sat}}}{2\pi L^2}$$
Time-of-flight: Clearly in some limit one is limited by the speed with which carriers can traverse the device.

Why are low thresholds important?
In some sense, threshold voltages show how efficient your switching is: until inversion, one pays the cost of charging up the gate without getting any of the transistor benefit.
Also, power dissipation varies like $V_G^2$, so being able to run at lower voltages would produce a big savings in heating!
Trend: c. 1980, TTL logic: $V_G$ ~ 5 V. Now, $V_G$ ~ 2.2 V on CPU.

On/off ratios and off-currents
• A transistor is only a good switch if, when it’s “off”, it’s really off.
• Typical on/off current ratios must be ~ $10^4$, or else these subthreshold source-drain currents end up dissipating an enormous amount of power.
• Transistor should also switch sharply: its subthreshold properties need to be good.

Durability
Commercially viable transistors need to last a long time! Remember, ~ $10^7$ transistors per chip, each operating $10^9$ times per second. Only a few failures ruin the chip.
When was the last time the CPU died in any computer you own?
• The mean time to failure is extremely long!
Most common transistor failure mode: gate oxide breakdown. Not surprising: ~ 3 V across 3 nm of oxide = $10^9$ V/m (!).

Summary
• Transistors are three-terminal devices, and MOSFETs are the most commonly used type in high technology.
• MOSFETs are normally off devices, with linear source-drain IV curves at low source-drain bias once gate voltage exceeds the threshold for inversion.
• IV curves saturate at high bias, with saturation currents depending strongly (roughly quadratically) on gate voltage.
• Performance criteria clearly depend both on device geometry and on materials choices.
• MOSFETs are only as good as they are because of decades of exacting materials development.

Next time:
• Demands of the electronics industry for high performance transistors.
• The semiconductor “roadmap”, and signs of trouble ahead.

Demands of electronics industry
Last time, we got a quick overview of the silicon MOSFET. Today, we will examine the state-of-the-art in MOSFET technology, with an eye toward what the expected requirements are for the future. Keep an eye out for nano-related issues that will crop up….
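The back-of-envelope numbers from the performance slides of the preceding lecture (the gate oxide field, the $V_G^2$ power scaling, and the capacitance-limited $f_{\max}$) can be checked with a short Python sketch. The function names are hypothetical, and the mobility, $V_{D,\mathrm{sat}}$, and channel length passed to `f_max` are assumptions chosen only for illustration.

```python
import math


def oxide_field(v_ox, t_ox):
    """Average field (V/m) for a voltage v_ox (V) across an oxide of thickness t_ox (m)."""
    return v_ox / t_ox


def power_ratio(v_new, v_old):
    """Relative power if dissipation scales like V_G squared."""
    return (v_new / v_old) ** 2


def f_max(mu, v_dsat, length):
    """Capacitance-limited frequency estimate: mu * V_D,sat / (2 pi L^2), SI units."""
    return mu * v_dsat / (2.0 * math.pi * length ** 2)


if __name__ == "__main__":
    # 3 V across 3 nm of oxide, as quoted on the durability slide.
    print("oxide field (V/m):", oxide_field(3.0, 3e-9))
    # Going from 5 V TTL-era logic to 2.2 V, as quoted on the threshold slide.
    print("power ratio:", power_ratio(2.2, 5.0))
    # Assumed values: mu = 0.05 m^2/Vs, V_D,sat = 1 V, L = 100 nm.
    print("f_max (Hz):", f_max(0.05, 1.0, 1e-7))
```

The quadratic voltage dependence is why the drop from 5 V to 2.2 V matters so much: at fixed switching rate the power falls to under a fifth.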

Ongoing trends: Moore’s (1st) Law
The number of components per IC doubles roughly once every 18 months.
[Figure: transistors per CPU (100 to 1G, log scale) vs. year, 1970 to 2000.]
Lateral feature sizes have also decreased exponentially with time.
[Figure: feature size in µm (0.01 to 10, log scale) vs. year, 1980 to 2010.]
Breaking the 100 nm barrier in production in 2003….
These trends cannot continue forever.
• What will replace traditional Si?
• Why will that replacement occur? ECONOMICS.

Ongoing trends: Moore’s (2nd) Law
[Figure: cost in $M (1 to 10000, log scale) vs. year, 1970 to 2010.]
• While cost per complexity plummets exponentially (35%/yr), cost of production plant rises exponentially.
• By 2025, projected trend says fab plant cost ~ $1 trillion.
• Clearly this trend cannot continue either….

International Technology Roadmap for Semiconductors
These trends have been continuing by design for the last ~ 10 years.
SEMATECH: international consortium of semiconductor manufacturers that sets goals and funds research of common interest to them all. Includes such players as: AMD, Agere Systems, Hewlett-Packard, Hynix, Infineon Technologies, IBM, Intel, Motorola, Philips, STMicroelectronics, …, TSMC, and Texas Instruments.
Identifies “technology nodes” and spec/cost/performance targets. These days, nodes are identified by DRAM pitch.
[Figure: technology nodes by DRAM pitch.]

ITRS production cycle
Technology nodes are labeled by production: research demonstration must come well ahead of any node goal.

Basic parts
[Figure: basic parts.]

Current production factoids:
• Typical Pentium: ~ $10^7$ transistors, total chip area of 310 mm²
• Active area of transistors is ~ 28 mm²
• Cost per transistor currently between 50 and 100 microcents (!).
• Total number of processing steps needed for one chip: hundreds
• Total number of masks needed for one chip: ~ 30-40
• Acceptable total yield ~ 50% (!)

State-of-the-art: Si material
Growth method: Czochralski
• A seed crystal is attached to a slowly rotating rod, and is dipped into Si at just over the melting point.
• The rod is slowly withdrawn from the melt.
• Rate is increased at end to avoid impurity contamination.
Diameter: 300 mm
Specs needed for 99% good wafers:
• Site flatness: < 130 nm
• Number of particles: < 120/wafer
• Surface metal contamination: < $10^{10}$ at/cm²
• Iron concentration: < $10^{10}$ at/cm³
• Stacking faults: < 1/cm²
http://www.techfak.uni-kiel.de/matwis/amat/elmat_en/kap_5/illustr/i5_1_1.html

State-of-the-art: Lithography
Light source: 193 nm
Phase compensated masks + chemically amplified resists allow smallest features (e.g. FET channel length) to be ~ 65 nm.
• Resist pattern edge roughness: < 3.6 nm (3 σ)*
• Particle contamination: < 1500/m² of size 100 nm or greater*
• Number of defects in patterned film: < 0.05/cm² of 50 nm*
• Overlay accuracy of mask: 28 nm

State-of-the-art: MOSFET
[Figure: MOSFET cross-section with silicide, spacer, poly-Si gate, gate oxide, n-type source and drain, and p-type body labeled.]
Intel 2Q 2005:
• Oxide thickness: ~ 1.8 nm
• Channel length: ~ 65 nm
• Gate position: ~ 6 nm (!)
• Characteristic time: ~ 0.86 ps
• Subthreshold leakage: 0.05 µA/µm
• Parasitic $R_{SD}$ contribution: < 180 Ω-µm
• Energy per switching: 1 fJ/µm
• Static power dissipation: 600 nW/µm

State-of-the-art: power
• Supply voltage in processor core: ~ 1.1 V
• High performance processor power dissipation (with heatsink): 130 W
• Battery-powered processor power dissipation: 3-5 W
Can crunch some numbers on a high-performance system. Say $10^7$ transistors running at 2.5 GHz gives that 130 W figure. Now consider $10^8$ transistors in the same area, operating at 10 GHz, for example. Such a processor made with present-day designs and approaches would dissipate ~ 5 kW/cm² (!!) This is comparable to the power density radiated by a rocket engine….

State-of-the-art: reliability
• Device early failures (in first 4000 hours): 50 ppm
• Long-term failures (in first $10^9$ hours): 10-100 ppm
• Electrostatic protection survival: 10 V/µm
Testing is done under “accelerated failure” conditions, typically running devices at higher-than-normal temperatures, for example.
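The power estimate on the “State-of-the-art: power” slide can be reproduced with a few lines of Python. This is a sketch under one stated assumption that the slide does not spell out: dissipated power scales linearly with transistor count times clock frequency, at fixed energy per switching event. The function names are hypothetical, and the die areas used for the density are also assumptions: about 1 cm² gives the ~ 5 kW/cm² scale quoted above, while the 310 mm² chip area quoted earlier gives a lower figure.

```python
def scaled_power(p_ref, n_ref, f_ref, n_new, f_new):
    """Scale a reference power (W) linearly with transistor count and clock frequency."""
    return p_ref * (n_new / n_ref) * (f_new / f_ref)


def power_density(power_w, area_cm2):
    """Power per unit die area, in W/cm^2."""
    return power_w / area_cm2


if __name__ == "__main__":
    # Reference point from the slide: 1e7 transistors at 2.5 GHz dissipate 130 W.
    total = scaled_power(130.0, 1e7, 2.5e9, 1e8, 10e9)
    print("total power (W):", total)
    # Assumed die areas (cm^2); the slide does not state which area it used.
    for area in (1.0, 3.1):
        print("area", area, "cm^2 ->", power_density(total, area), "W/cm^2")
```

Ten times the transistors at four times the clock multiplies the 130 W figure by 40, which is where the kilowatt scale comes from.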

Reading the roadmap
• White = manufacturable solutions known and being optimized.
• Yellow = manufacturable solutions known and demonstrated, but not yet in practice (often, too expensive / yields too low / too new to be optimized yet).
• Red = “brick wall” = no known manufacturable (!) solution to given problem / means of meeting criterion.
Remember the ramp-up cycle. If there’s a red item and it’s less than two years away, the issue is a very serious one.
Roadmap goes out ~ 10 years, but is constantly under revision.

Near-term demands (2007): Si material
• Site flatness: < 64 nm (critical, but hard to measure)
• Number of particles: < 123/wafer (below measurable threshold)
• Surface metal contamination: < $10^{10}$ at/cm² (more critical)
• Iron concentration: < $10^{10}$ at/cm³ (more critical)
• Stacking faults: < 0.3/cm² (factor of 3 over current)
General trends:
• Even when current tolerances don’t change by much, their importance increases.
• Running into metrology problems: don’t have adequate tools to efficiently assess whether criteria are being met.

Near-term demands (2007): Lithography
Light source: 193 nm? 157 nm?
FET channel length: 35 nm.
• Resist pattern edge roughness: < 2.2 nm (3 σ)
• Particle contamination: < 1500/m² of size 100 nm or greater
• Number of defects in patterned film: < 0.04/cm² of 40 nm
• Overlay accuracy of mask: 23 nm
This is particularly alarming: running into physical limitations of lithographic patterning (not just optical, but polymer resist based in general).

Near-term demands (2007): MOSFET
[Figure labels: 25 nm, 15 nm.]
• Equivalent oxide thickness: ~ 1 nm
• Channel length: ~ 25 nm
• Gate position: ~ 2 nm
• Characteristic time: < 0.68 ps
• Subthreshold leakage: 1 µA/µm
• Parasitic $R_{SD}$ contribution: < 20%
• Energy per switching: 0.032 fJ
• Static power dissipation: 53 nW
Biggest problems: oxide thickness, contact resistances, and leakage problems due to tunneling / thermal emission.

Long-term demands (2016): Si material
• Wafer size (!): 450 mm (How does one grow and polish these?)
• Site flatness: < 23 nm
• Number of particles: < 75/wafer (below measurable threshold)
• Surface metal contamination: < $10^{10}$ at/cm² (more critical)
• Iron concentration: < $10^{10}$ at/cm³ (more critical)
• Stacking faults: < 0.06/cm² (another factor of 5)
General trends:
• Most requirements continue increasing in criticality.
• Metrology even more of a problem.
• Larger wafer size desired, but may not happen….

Long-term demands (2016): Lithography
Light source: X-ray? E-beam? Imprint?
FET channel length: 9 nm.
• Resist pattern edge roughness: < 0.7 nm (3 σ)
• Particle contamination: < 500/m² of size 50 nm or greater
• Number of defects in patterned film: < 0.01/cm² of 10 nm
• Overlay accuracy of mask: 9 nm
No one knows how to do this. Biggest problems:
• Single-nm alignments across ~ 2 cm chip, and across 450 mm wafers.
• Metrology.

Long-term demands (2016): MOSFETs
• Equivalent oxide thickness: ~ 0.4 nm
• Channel length: ~ 9 nm
• Parasitic $R_{SD}$ contribution: < 35%
• Characteristic time: < 0.15 ps
• Energy per switching: 0.285 fJ/µm
• Subthreshold leakage: 0.5 µA/µm
• Static power dissipation: 4.4 µW/µm
Remarks:
• Intel can make THz, 10 nm channel transistors, but not in bulk.
• Several finite-size problems crop up (contact resistances again).
• Irreversibly changing “1” to “0” costs, minimally, $k_B T \ln 2 \approx 3 \times 10^{-21}$ J $\approx 3 \times 10^{-6}$ fJ at room temperature (!)

General observations
• We’re running out of time fast for standard CMOS processing if we want to continue Moore’s (1st) law.
• At the nm scale, lack of (fast) metrology is a real killer.
• Not all coming problems are “simple” engineering or process development issues: “We have entered the era of material limited device scaling”.
• We’re approaching the era of physics-limited device scaling in certain aspects as well.

Is industry considering alternatives?
The 2001 ITRS was the first roadmap to include a section on Emerging Research Devices.
Planners are well aware that they need to be looking at:
• “Nonclassical CMOS” (transport-enhanced / ultrathin body / source-drain engineered / double-gate / vertical MOSFETs)
• Alternative devices (single-electron transistors)
• Hybrid devices (nanotube FETs)
• Novel architectures (defect tolerance, cellular automata, biologically inspired)
• Really novel architectures (molecular computers, quantum computers)

Roles for “nano”
Pure research
• Fundamental physics and chemistry of these materials at nm scale.
• Understanding new phenomena as they arise / become relevant.
• Learning the science of possible new architectures.
Applied research
• Nanomaterials including resists.
• Metrology: how do you measure critical properties on these length scales?

Summary and conclusions
• Moore’s Laws are obeyed by design, not by accident.
• Electronics industry wants to continue aggressive scaling, but faces many challenges along the way.
• “Nano” can and must play a role in addressing these challenges / opportunities.
• Either we’ll make some significant paradigmatic shift within 10-15 years, or computer hardware performance will plateau (e.g. passenger airplane speeds).
• One of the major limiting problems is economic.

Next time: MOSFET scaling in detail: what’s the physics?