

The Engineering Management of Speed Robert C. Leachman

Dept. of Industrial Engineering and Operations Research University of California, Berkeley

Abstract

Analytical formulas are introduced for quantifying the revenue gains associated with local, incremental improvements in the speed of product development, supply chain development, or supply chain execution. The formulas provide practical means of imputing an overall economic value to engineering projects that impact lead time. Practical analytical queuing formulas for estimating changes in supply chain speed resulting from engineering changes to product or process also are discussed. An overall approach to engineering management in manufacturing companies characterized by rapid technological evolution is proposed, emphasizing disciplined, sophisticated management of speed.

1. Introduction and Overview

We have all heard expressions like "Time is money" or "We must make the market window" in reference to the importance of the speed of product and supply chain development or the speed of supply chain execution. In many technology-based companies, the urgency of product development, process development, supply chain development and supply chain execution is keenly felt. But the true economic value of incremental improvements in supply speed is rarely quantified. Worse, job descriptions and performance evaluations of engineers and managers rarely measure the total economic impact of improvements made to the speed of development or execution.

Going back to the Industrial Revolution, the traditional organization of management emphasizes a focus on product cost for managers and engineers developing and operating the supply chain. Supply chain managers and engineers typically are evaluated in terms of metrics of cost, throughput and quality. In many industries this works fine. But in businesses subject to Moore’s Law rates of technological evolution, it does not work so well. Prices for any given product decline relentlessly and quickly as its obsolescence steadily grows. It has become very difficult to make profits in electronics hardware businesses; only the swiftest can do so.

While many and perhaps most improvements in speed result in cost reduction, throughput enhancement and/or yield improvement, many have a more profound impact on sales revenues than they do on these factors. As a result, their true economic worth is undervalued. In this paper I propose changes to engineering management to more properly value improvements in speed, to measure the impact on speed from changes to process or product, and to make the engineering organization more proficient at managing speed.

“Cycle time” is semiconductor industry jargon for the elapsed time to pass manufacturing lots through the manufacturing process, from lot creation until lot completion.
The term is also applied to individual manufacturing steps, measuring the elapsed time from completion of the preceding step until completion of the step in question, or to a series of manufacturing steps (the sum of the cycle times of the subject steps). In produce-to-order environments, cycle time is part of the product/service apparent to the customer and is therefore an important competitive issue. Suppliers able to offer shorter cycle times will be preferred. In the case of goods experiencing a rapid pace of technological obsolescence such as semiconductors, cycle time has a very strong influence on realized average selling prices. Firms with shorter cycle times are able to make sales at earlier times when prevailing prices are higher. And by making early sales, such firms drive prices down and thereby diminish revenue available to competitors. Thus cycle time is very important even in a make-to-stock environment.

In this paper we will first impute economic value to cycle time reduction in terms of the revenue gain resulting from increased selling prices enabled by shorter time to market. Next, we review and adapt models from queuing theory as a practical means for computing entitlement cycle times. Finally, we propose organizational policies promoting disciplined and sophisticated cycle time management and improvement.

2. Empirical Evidence of Price Decline

Driven by relentless progress following Moore’s Law, products produced in a planar fabrication process experience a very rapid decline in selling prices over the product lifetime. These products include semiconductors, liquid crystal displays, solar panels, light emitting diodes, read-write heads for disk drives, nanotechnology products, etc., as well as higher level products utilizing these products as components, such as computers, smart phones, tablets and televisions. Examples of this are displayed in Figures 1, 2 and 3. Figure 1 depicts average selling prices for five generations of dynamic random-access memory chips.
Prices are normalized so that 100% represents the price at product introduction. The time scale shows the number of months since the product was first introduced into the market. A log scale is used to display prices so that the percentage rate of decline in prices may be more easily discerned. The various product generations were introduced during quite different conditions in the DRAM industry, e.g., the 4MB DRAM was produced in volume during a period of tight product supply, whereas the 16MB DRAM was produced during a period of generous product supply. Nonetheless, on a percentage basis, prices track remarkably closely in every generation. About 12 months after product introduction, prices have declined to 20% of the initial price. A year later, prices are down to about 10% of the introductory price, and end-of-life is less than three years after first introduction. Figure 2 displays a similar graph for the prices of four generations of Intel microprocessors, Pentium through Pentium 4. Here the slopes of the price decline curves are increasing every generation. By the Pentium 4 generation, price has declined to 10% of introductory price only 9 months after introduction. Product lives for all four generations are less than two years.


Figure 1. DRAM Average Selling Prices
[Plot "DRAM price decline history": % of introductory price (log scale, 1% to 100%) versus months after introduction (0 to 26), for the 256K, 1M, 4M, 16M and 64M generations.]

Figure 2. Intel Microprocessor Average Selling Prices
[Plot "Intel CPU price decline history": % of introductory price (log scale, 10% to 100%) versus months after introduction (0 to 19), for the P 166, PII 300, PIII 700 and P4 1.4G.]


Figure 3 displays quoted prices from semiconductor foundries for wafer fabrication in various CMOS digital process technologies. Prices for foundry service do not decline as fast as they do for DRAMs or microprocessors, but they nonetheless decline with time. Thus even a contract manufacturer providing generic manufacturing services faces declining prices for fabrication in any particular technology. This decline reflects the introduction and improvement of newer process technologies.

Figure 3. Semiconductor Foundry Wafer Prices
[Plot "Foundry price decline history": % of introductory price (log scale, 10% to 100%) versus months after introduction (1 to 64), for the 0.13u, 0.15u, 0.18u, 0.25u, 0.35u and 0.5u process technologies.]

These graphs suggest there must be large economic values associated with compressing the time until volume production capability is realized. That is, there must be large economic values stemming from compression of product development time, time to install and qualify production equipment, time to ramp up yield and volume, and the manufacturing cycle time itself. Quite apart from cost reduction, these values stem from the opportunity to realize higher selling prices by being earlier to market.

3. Imputing the Economic Value of Cycle Time

The nearly straight-line curves in Figures 1, 2 and 3 suggest price history is well-modeled as a negative exponential. Let \(P_0\) denote the market price at the time a product is conceived, and let t = 0 denote the epoch when the product is conceived. We assume product price is continuously declining at a constant rate, and let α denote the rate of price decline. The product price at time t is expressed as

\[ P(t) = P_0 e^{-\alpha t} . \]

We assume there are multiple suppliers in a well-established commodity market whereby the market can absorb any output from a given supplier at time t and the output can be immediately sold at price P(t).
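As a minimal sketch of the negative-exponential price model, the following Python fragment converts an observed fractional price decline into the continuous rate α and evaluates P(t). The function names and the 50%-per-12-months figure are illustrative assumptions, not values taken from the figures.

```python
import math

def decline_rate(fraction_remaining, elapsed):
    """Continuous rate alpha such that exp(-alpha * elapsed) equals fraction_remaining."""
    return -math.log(fraction_remaining) / elapsed

def price(p0, alpha, t):
    """Price at time t under the model P(t) = P0 * exp(-alpha * t)."""
    return p0 * math.exp(-alpha * t)

# Hypothetical calibration: price falls to 50% of its initial level after 12 months.
alpha_per_month = decline_rate(0.50, 12.0)

# Under a constant rate, another 12 months halves the price again.
price_after_24_months = price(100.0, alpha_per_month, 24.0)
print(f"alpha = {alpha_per_month:.5f} per month, P(24) = {price_after_24_months:.1f}% of P0")
```

On a log scale this model is a straight line with slope -α, which is the property the text reads off the price history plots.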

We assume there is a period of length VT termed the development time from the time the product is conceived until the manufacturing process is qualified and blank silicon wafers or other substrates started into the manufacturing process (hereafter called wafer starts) may be sold. During this period, the process technology to fabricate the product is developed and qualified, and the manufacturing equipment necessary to process the product is procured, installed and qualified. Let H denote the lifetime for the manufacturing process. Wafer starts that will be sold are made from time VT until time VT + H, at which time the product becomes obsolete and the wafer starts in the technology are terminated. Let W(t) denote the wafer starts made at time t. Let Y(t) denote the yield of wafers started at time t. Let CT(t) denote the manufacturing cycle time for wafers started at time t. We assume wafer starts made at time t will be sold at time t + CT(t).

Let β denote a suitable discount rate for computing a present value of lifetime revenues of the product. In that case, the present value of the lifetime revenue from the given product for that supplier may be expressed as

\[ \int_{VT}^{VT+H} P_0 \, e^{-(\alpha+\beta)[t + CT(t)]} \, W(t) \, Y(t) \, dt . \tag{1} \]
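When CT, W and Y vary with time, integral (1) generally has to be evaluated numerically. The sketch below uses a plain trapezoidal rule; all input values are hypothetical and chosen as constants only so that the result can be checked against the elementary closed form for that special case.

```python
import math

def lifetime_revenue(p0, alpha, beta, vt, h, ct, w, y, steps=20000):
    """Trapezoidal-rule estimate of integral (1) over [VT, VT + H].

    ct, w and y are functions of the wafer start time t.
    """
    rate = alpha + beta
    dt = h / steps
    total = 0.0
    for k in range(steps + 1):
        t = vt + k * dt
        integrand = p0 * math.exp(-rate * (t + ct(t))) * w(t) * y(t)
        weight = 0.5 if k in (0, steps) else 1.0  # end points get half weight
        total += weight * integrand
    return total * dt

# Hypothetical inputs (time in days); none of these come from the paper.
P0, ALPHA, BETA = 10.0, 0.0012, 0.0008
VT, H, CT, W, Y = 120.0, 730.0, 50.0, 1000.0, 400.0

numeric = lifetime_revenue(P0, ALPHA, BETA, VT, H,
                           ct=lambda t: CT, w=lambda t: W, y=lambda t: Y)

# Closed form of (1) when CT, W and Y are constant, used only as a check.
r = ALPHA + BETA
closed = P0 * W * Y * math.exp(-r * CT) * (math.exp(-r * VT) - math.exp(-r * (VT + H))) / r

print(f"numeric = {numeric:,.0f}, closed form = {closed:,.0f}")
```

Passing CT, W and Y as functions keeps the same routine usable for ramping volumes or changing cycle times.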

Note the rate of price decline α acts like an add-on discount factor, as if the overall discount factor were α + β. In principle, the lifetime revenue integral (1) may be calculated for status quo conditions and for changed histories of CT, Y, and W resulting from a proposed engineering project. This should be done for all products affected by the project. Taking the difference between the two integrals expresses the present value of estimated revenue gains from the project. If we let the subscript i denote product and if we let the superscripts B denote status quo values (B standing for before project) and A denote values if the project is pursued (A for after project), the present value of revenue gains attributable to the project may be expressed as

\[ \sum_i \int_{VT_i^A}^{VT_i^A + H_i^A} P_0 \, e^{-(\alpha+\beta)[t + CT_i^A(t)]} \, W_i^A(t) \, Y_i^A(t) \, dt \; - \; \sum_i \int_{VT_i^B}^{VT_i^B + H_i^B} P_0 \, e^{-(\alpha+\beta)[t + CT_i^B(t)]} \, W_i^B(t) \, Y_i^B(t) \, dt . \tag{2} \]

If we re-set time 0 to denote the epoch that the first wafer starts for revenue sales are made (i.e., when process development is completed), and redefine the functions CT(t), Y(t) and W(t) such that the argument of the functions denotes how long after this epoch that the wafer start is made, then (1) may be re-written as

\[ P_{-VT} \, e^{-(\alpha+\beta)VT} \int_{0}^{H} e^{-(\alpha+\beta)[t + CT(t)]} \, W(t) \, Y(t) \, dt , \tag{3} \]

where \(P_{-VT}\) denotes the market price at time -VT, the epoch when the product is conceived.

We note in the foregoing equations that the price decline rate α and the discount factor β for computing present values always appear together as a sum. Without loss of generality, we shall assume for the rest of this paper that the parameter α represents the sum of the price decline rate and the discount factor.

Yield Ramp

In industrial practice there typically is a yield learning curve. At time of process qualification, yield is relatively low; we shall call this value \(Y_0\). After a period of engineering detective work and problem solving, yield is gradually raised to a mature value we shall call \(Y_F\). The period during which yield is improved is called the ramp time, denoted by RT. After time RT, the product is produced at mature yield \(Y_F\) until wafer starts are terminated at time H. Figure 4 illustrates the typical yield history, albeit depicting a smoother profile than in a real case. The start of product and process development is set to be time -VT; the first time wafers are started that will be sold is set to be time 0.

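Figure 4. General Pattern of Yield vs. Product Life
[Plot: die yield versus time, with time axis marks at -VT, 0, RT and H. Yield rises from \(Y_0\) at time 0 to \(Y_F\) at time RT (ramp phase) and then stays at \(Y_F\) until time H (mature yield phase).]

The shape of the yield curve in Figure 4 suggests the yield ramp can be modeled by one minus a negative exponential. We posit the following model for the yield curve:

\[ Y(t) = Y_0 + (Y_F - Y_0) \, \frac{1 - e^{-bt}}{1 - e^{-bRT}} , \qquad 0 \le t \le RT . \tag{4} \]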
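The yield model (4) can be sketched as a small function, extended with the mature yield \(Y_F\) for times after RT as described in the text. The parameter values below are hypothetical and serve only to show that the curve starts at \(Y_0\), reaches \(Y_F\) at RT, and is concave in between.

```python
import math

def yield_at(t, y0, yf, b, rt):
    """Yield at time t >= 0: model (4) during the ramp, mature yield YF afterwards."""
    if t >= rt:
        return yf
    learned = (1.0 - math.exp(-b * t)) / (1.0 - math.exp(-b * rt))  # fraction of learning done
    return y0 + (yf - y0) * learned

# Hypothetical values: die per wafer, time in days, assumed shape factor b.
Y0, YF, B, RT = 40.0, 400.0, 0.01, 180.0

for day in (0, 45, 90, 135, 180, 365):
    print(day, round(yield_at(day, Y0, YF, B, RT), 1))
```

A larger b front-loads the learning, so more of the gap between \(Y_0\) and \(Y_F\) is closed early in the ramp.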


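Here, b is a shape factor for yield learning during the ramp phase. A practical approach to calibrate b is to specify the progress in yield learning completed halfway through the ramp time RT. Suppose halfway through the ramp the yield equals

\[ Y(t = 0.5RT) = Y_0 + (Y_F - Y_0)(YLF) . \]

That is, the parameter YLF (“yield learning factor”) equals the fraction of yield learning realized halfway through the ramp. For example, YLF might be 50%, 60%, 70%, etc. Given a value of YLF, b may be determined as follows. We have

\[ YLF = \frac{1 - e^{-0.5bRT}}{1 - e^{-bRT}} \]

or

\[ YLF \, (1 - e^{-bRT}) = 1 - e^{-0.5bRT} , \]

i.e.,

\[ 1 - e^{-bRT} = \frac{1}{YLF} - \frac{1}{YLF} \, e^{-0.5bRT} , \]

i.e.,

\[ 0 = e^{-bRT} - \frac{1}{YLF} \, e^{-0.5bRT} + \left( \frac{1}{YLF} - 1 \right) . \]

Let \(x = e^{-0.5bRT}\). Then \(x^2 = e^{-bRT}\). Substituting x into the equation above, we have

\[ 0 = x^2 - \frac{1}{YLF} \, x + \left( \frac{1}{YLF} - 1 \right) . \]

Applying the quadratic formula, we find

\[ x = \frac{ \frac{1}{YLF} \pm \sqrt{ \frac{1}{YLF^2} - 4 \left( \frac{1}{YLF} - 1 \right) } }{2} = \frac{ \frac{1}{YLF} \pm \left( 2 - \frac{1}{YLF} \right) }{2} . \]

The plus sign gives x = 1, which corresponds to b = 0 and is discarded. The minus sign gives the root of interest, which lies between 0 and 1 whenever 0.5 < YLF < 1:

\[ x = \frac{ \frac{2}{YLF} - 2 }{2} = \frac{1}{YLF} - 1 . \]

Therefore,

\[ e^{-0.5bRT} = \frac{1}{YLF} - 1 \]

and hence

\[ b = -\frac{1}{0.5RT} \, \ln\left( \frac{1}{YLF} - 1 \right) . \]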
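The calibration of b can be sketched in a few lines. The function name is an assumption made for illustration; the inputs (RT = 180 days, two thirds of the learning done at the halfway point) are the ones used in the worked example in the text.

```python
import math

def shape_factor(ylf, rt):
    """Shape factor b from b = -ln(1/YLF - 1) / (0.5 * RT); requires 0.5 < YLF < 1."""
    if not 0.5 < ylf < 1.0:
        raise ValueError("YLF must lie strictly between 0.5 and 1 for a positive b")
    return -math.log(1.0 / ylf - 1.0) / (0.5 * rt)

RT = 180.0
YLF = 2.0 / 3.0
b = shape_factor(YLF, RT)

# Substitute b back into the definition of YLF as a consistency check.
ylf_check = (1.0 - math.exp(-0.5 * b * RT)) / (1.0 - math.exp(-b * RT))

print(f"b = {b:.4f} per day, recovered YLF = {ylf_check:.4f}")
```

The guard on YLF reflects the derivation: a value of 0.5 or less would make the logarithm non-negative and b non-positive, which would not describe a concave learning curve.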

For example, suppose RT = 180 days and 2/3rds of the yield learning is completed halfway through the ramp phase. In that case, b should be set to a value of about 0.0077.

If CT and W are constant, the integral (3) is practical to calculate when the yield history Y is expressed as in (4). In that case, it can be shown that (3) reduces to

\[ P_{-VT} \, W \, e^{-\alpha VT} e^{-\alpha CT} \left[ Y_F \, \frac{1 - e^{-\alpha H}}{\alpha} - \frac{Y_F - Y_0}{1 - e^{-bRT}} \left( \frac{1 - e^{-(\alpha + b)RT}}{\alpha + b} - e^{-bRT} \, \frac{1 - e^{-\alpha RT}}{\alpha} \right) \right] . \tag{5} \]

In the case RT = VT = CT = 0, (5) reduces to

\[ P_{-VT} \, W \, Y_F \, \frac{1 - e^{-\alpha H}}{\alpha} . \tag{6} \]

Equation (6) expresses the present value of the ideal lifetime revenue if there were no delays for process and product development, yield and volume ramps, and zero manufacturing cycle time. The development time VT and the manufacturing cycle time CT result in discount factors \(e^{-\alpha VT}\) and \(e^{-\alpha CT}\) applied to the lifetime revenue. The long expression within square brackets involving the ramp time RT and the learning curve shape factor b is much more complicated, but numerically it also acts as a discount factor, just a weaker discount factor than caused by VT and CT. That is, while waiting for VT or CT there is zero output; on the other hand, during RT there is reduced output associated with inferior yield, but not zero output.

Value of Development Time

To illustrate the value of development time, consider the following example. Suppose the selling price at start of development is $10,000 for a perfect-yielding wafer. Assume the selling price is declining 50% per year, and the discount rate is 25% per year. Suppose the process life H is three years, the manufacturing cycle time is a constant 50 days over the process life, and the wafer starts volume is 10,000 wafers per week, also constant over the process life. Suppose the yield ramp time RT is 180 days, and the yield learning curve shape factor b is set such that 2/3rds of yield learning is completed halfway through the ramp. In Table 1 below, we examine the present value of lifetime revenue for four different durations for VT. As may be seen, compressing development time is worth about $5 million per day in this example. Note that there are slightly increasing returns to scale with respect to development time compression, i.e., the incremental value of compressing one more day is slightly larger than the value of the previous day of compression.

Table 1. Economic Return from Compressing Development Time

VT (days)   Present Value of Lifetime Revenue   PV Gain Compared to VT = 120 days
120         $1.279 billion                      $0
105         $1.356 billion                      $76 million
90          $1.436 billion                      $157 million
75          $1.522 billion                      $242 million
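The per-day value and the increasing-returns remark can be read directly off Table 1. The short sketch below does the arithmetic on the tabulated gains; it uses only the numbers in the table and introduces no new data.

```python
# PV gain in $ million relative to VT = 120 days, copied from Table 1.
gain_by_vt = {120: 0, 105: 76, 90: 157, 75: 242}

vts = sorted(gain_by_vt, reverse=True)  # 120, 105, 90, 75
increments = []       # extra gain from each successive 15-day compression
value_per_day = []    # the same increment expressed per day of compression
for longer, shorter in zip(vts, vts[1:]):
    step_gain = gain_by_vt[shorter] - gain_by_vt[longer]
    increments.append(step_gain)
    value_per_day.append(step_gain / (longer - shorter))

for (longer, shorter), gain, per_day in zip(zip(vts, vts[1:]), increments, value_per_day):
    print(f"{longer} -> {shorter} days: ${gain} million, about ${per_day:.1f} million per day")
```

Each successive 15-day step is worth more than the one before it, which is the increasing-returns pattern noted in the text.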

Value of Manufacturing Cycle Time

To illustrate the value of manufacturing cycle time, consider the following simplified situation. Suppose yield Y, wafer start volume W, and cycle time CT are constant over the entire process life H. At time t, the average selling price is

\[ P(t) = P_0 e^{-\alpha t} \]

where \(P_0\) denotes the average selling price when saleable wafer starts commence. Then the lifetime revenue integral (3) may be simplified in this case as

\[ \int_0^H P_0 W Y e^{-\alpha (t + CT)} \, dt = P_0 W Y e^{-\alpha CT} \int_0^H e^{-\alpha t} \, dt = P_0 W Y e^{-\alpha CT} \, \frac{1 - e^{-\alpha H}}{\alpha} . \]

Now suppose cycle time is permanently shortened by an amount ΔCT, i.e., CT → CT - ΔCT. Then the lifetime revenue becomes

\[ P_0 W Y e^{-\alpha (CT - \Delta CT)} \, \frac{1 - e^{-\alpha H}}{\alpha} . \]

The revenue gain from reducing cycle time by an amount ΔCT is therefore

\[ \Delta R = P_0 W Y e^{-\alpha (CT - \Delta CT)} \, \frac{1 - e^{-\alpha H}}{\alpha} - P_0 W Y e^{-\alpha CT} \, \frac{1 - e^{-\alpha H}}{\alpha} \]

or

\[ \Delta R = \left( e^{\alpha \Delta CT} - 1 \right) P_0 W Y e^{-\alpha CT} \, \frac{1 - e^{-\alpha H}}{\alpha} . \tag{7} \]
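Equation (7) is simple enough to evaluate in a few lines. The sketch below uses the inputs of the image sensor example that follows in the text, with the weekly wafer volume converted to a daily rate by dividing by 7 (an assumption about units) and with α rounded as quoted there.

```python
import math

def cycle_time_gain(p0, w, y, alpha, ct, h, delta_ct):
    """Revenue gain (7) from permanently shortening cycle time by delta_ct."""
    lifetime_revenue = p0 * w * y * math.exp(-alpha * ct) * (1.0 - math.exp(-alpha * h)) / alpha
    return (math.exp(alpha * delta_ct) - 1.0) * lifetime_revenue

ALPHA = 0.0019684            # per day, combined price decline and discount rate
gain_one_day = cycle_time_gain(
    p0=4.50,                 # $ per die at time 0
    w=10000 / 7,             # wafers per day (10,000 per week, assumed evenly spread)
    y=420,                   # good die per wafer
    alpha=ALPHA,
    ct=50,                   # days
    h=730,                   # two-year product life in days
    delta_ct=1,              # one day of cycle time reduction
)
print(f"Value of one day of cycle time: ${gain_one_day:,.0f}")
```

Because the gain grows with \(e^{\alpha \Delta CT} - 1\), calling the function with `delta_ct=2` returns slightly more than twice the one-day value.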

We illustrate (7) with 2006 data from a fabrication plant producing image sensors. The fabrication plant made 10,000 wafers per week yielding an average of 420 good die per wafer. The cycle time was 50 days. The selling price at time 0 was $4.50 per die, declining 35% per year. The product life was two years. Assume a discount rate of 25%. If time is expressed in days, then α (accounting for both the price decline rate and the discount rate) satisfies

\[ (0.65)(0.75) = e^{-365 \alpha} \]

or α = 0.0019684. Applying (7), the value of a one day reduction in cycle time is then

\[ \Delta R = \left( e^{0.0019684} - 1 \right) \left( \frac{10,000}{7} \right) (420)(4.50) \, e^{-(0.0019684)(50)} \, \frac{1 - e^{-(0.0019684)(730)}}{0.0019684} \]

or

\[ \Delta R = \$1,867,235 , \]

i.e., the present value of revenue gains from permanent cycle time reductions achieved at time 0 without diminution of yield or wafer volume in this fab was worth about $1.9 million per day of reduction.¹

¹ Similar to the case of development time, each succeeding day of cycle time reduction is worth slightly more than the previous day of reduction.

Not included in the above analysis is the reduction in product cost associated with reducing cycle time. The most important element of such cost reduction in semiconductor manufacturing concerns the positive impact on yield from cycle time reduction. The positive impact on yield not only reduces product cost, it also provides further revenue gain associated with shortened yield ramp time and/or higher mature yield. This occurs for two reasons:

(1) Some yield loss mechanisms involve equipment or process “excursions” in which the process or equipment shifts out of control, but the presence of this excursion is not detected until the first lot processed after the excursion commenced reaches the end of the production line and is tested. All lots that had passed the out-of-control point before the first lot is tested also will have poor yield. When cycle time is reduced, the work-in-process in the manufacturing line is reduced, and so the number of lots with exposure to excursion loss is reduced, and therefore total excursion losses are reduced.

(2) A process change that will improve yield must be justified on the basis of a successful in-line experiment. Typically, a portion of the wafers in a selected manufacturing lot are processed the old way, while the other wafers from the same lot are processed the new way. The lot then must travel through the rest of the fabrication process to the end of the line where all wafers are tested, and where it must be demonstrated statistically that the process change indeed improves yield. The shorter the cycle time, the less time is required to complete the experiment and implement the process change, and therefore the yield learning curve can be improved.

A quantification of such benefits is described in Leachman and Ding (2011).

Value of Yield Ramp Time

Let us return to the image sensor example immediately above, but now let us assume that we are at the completion of process development and the start of revenue manufacturing. As above, suppose the selling price per die at this moment is $4.50 but that price is declining 35% per year. Suppose the mature yield \(Y_F\) = 420 die per wafer will not be realized until after a yield ramp time RT = 180 days. Suppose the initial die yield \(Y_0\) is 46.67 die per wafer. Suppose the shape factor b is set such that 2/3rds of the yield learning is completed halfway through the ramp phase. As before, assume a discount rate of 25% and a remaining product life of 2 years. Applying (5), we find the total discounted lifetime revenue is about $811.0 million.
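The figure above can be approximately reproduced by integrating (3) with the yield curve (4) under constant W and CT. The sketch below evaluates the resulting closed form and cross-checks it against a direct trapezoidal integration. It assumes the weekly volume is spread evenly over 7 days and VT = 0 (development already complete); the function names are illustrative. Under these assumptions the result comes out near $810 million, close to the quoted value; the small difference presumably reflects rounding of the inputs.

```python
import math

def revenue_closed_form(p, w, y0, yf, alpha, b, vt, ct, rt, h):
    """Closed-form value of (3) with yield curve (4), constant W and CT."""
    mature = yf * (1.0 - math.exp(-alpha * h)) / alpha          # revenue if yield were YF throughout
    shortfall = (yf - y0) / (1.0 - math.exp(-b * rt)) * (       # revenue lost to low yield during ramp
        (1.0 - math.exp(-(alpha + b) * rt)) / (alpha + b)
        - math.exp(-b * rt) * (1.0 - math.exp(-alpha * rt)) / alpha
    )
    return p * w * math.exp(-alpha * (vt + ct)) * (mature - shortfall)

def revenue_numeric(p, w, y0, yf, alpha, b, vt, ct, rt, h, steps=73000):
    """Trapezoidal integration of (3) with yield curve (4), used as a cross-check."""
    def yield_at(t):
        if t >= rt:
            return yf
        return y0 + (yf - y0) * (1.0 - math.exp(-b * t)) / (1.0 - math.exp(-b * rt))
    dt = h / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * dt
        value = math.exp(-alpha * (t + ct)) * w * yield_at(t)
        total += value * (0.5 if k in (0, steps) else 1.0)
    return p * math.exp(-alpha * vt) * total * dt

ALPHA = -math.log(0.65 * 0.75) / 365.0             # 35% price decline, 25% discount, per day
RT = 180.0
B = -math.log(1.0 / (2.0 / 3.0) - 1.0) / (0.5 * RT)  # two thirds of learning at half ramp

params = dict(p=4.50, w=10000 / 7, y0=46.67, yf=420.0, alpha=ALPHA, b=B,
              vt=0.0, ct=50.0, rt=RT, h=730.0)
closed = revenue_closed_form(**params)
numeric = revenue_numeric(**params)
print(f"closed form: ${closed / 1e6:,.1f} million, numeric: ${numeric / 1e6:,.1f} million")
```

Re-running `revenue_closed_form` with a different `ct` or `rt` (and `b` recalibrated for the new ramp time) gives the revenue effect of a proposed change to cycle time or ramp time.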
Now suppose the company invests in sophisticated metrology equipment to monitor the process for defects, coupled with sophisticated analysis software to relate end-of-line yield loss to defects observed in-line. Suppose that passing production lots through the metrology equipment adds 10 days to manufacturing cycle time, but only 10% of production lots are subjected to this metrology. Thus average cycle time increases by 1 day to 51 days. Suppose further that, by virtue of the metrology and yield analysis, the ramp time RT is reduced from 180 to 160 days. At first one might expect lifetime revenue to decline because of the increased cycle time. However, the reduced ramp time RT brings more volume to market while the prices are still relatively high. Is this volume gain enough to offset the negative impact of increased cycle time? In this case, it is. Re-computing the discounted lifetime revenue (5) with the revised parameters reflecting implementation of the metrology equipment and yield analysis software as immediately above, we find the value of (5) rises to about $823.2 million, a gain of about $12.2 million.

4. Cycle Time Metrics and Cycle Time Analysis

Given the large economic values associated with cycle time reduction, it is meaningful to consider metrics for managing and engineering cycle time. Actual cycle time (ACT) is a statistic measuring the time from the creation or arrival of an unprocessed manufacturing lot until its completion. ACT may be measured for the entire manufacturing process to fabricate a given product as well as for an individual process step in the flow. A weighted-average ACT may be computed for all process steps on all products under the responsibility of a particular engineering section as well as across all products to obtain a factory-level metric.

Dynamic cycle time for a given process step is computed based on averaging actual cycle times for many lots passing through a particular process step within a given time frame such as a shift, a day or a week. Static cycle time for a given product is computed based on averaging the actual total elapsed time from lot entry into the process flow until lot completion at the last step of the flow over a series of lots completing the flow. An estimate of total-flow cycle time also may be derived by summing up step dynamic cycle times. This derived figure typically will be a different number than the static cycle time statistic because of the different time frames involved. For the purposes of engineering improvement of particular processes and equipment, the dynamic cycle time metric on actual lot cycle time to pass through individual steps or groups of steps performed by the same equipment is of interest.

We define entitlement cycle time (ECT) as the average manufacturing cycle time if manufacturing execution were perfect, taking into account wafer start volumes for all products, the process specifications for all steps, equipment qualified by process engineering to perform each step, statistics on lot inter-arrival times, and statistics on process and equipment trouble. Such trouble includes temporary disqualification of equipment from performing certain process steps, lots placed on hold because of out-of-spec or out-of-control process parameters, and equipment unavailable for production work because of maintenance or engineering reasons. ECT may be computed for individual steps on individual products, for the entire process flow to make a given product, and for collections of steps within one process section. ECT is not an observable statistic in the data collected at a factory.
Nevertheless, ECT can be analytically estimated, either by conducting discrete-event simulations or by exercising formulas from analytical queuing theory. Either analytical approach uses statistical data collected from the factory. The information includes data from equipment logs on processing times, data from equipment tracking on equipment non-available times, process specifications as to which process tools are qualified to perform which steps, production volumes, etc.

The importance of ECT is its usefulness for separating cycle time issues into engineering problems versus execution problems. ECT may be viewed as the cycle time report card for process engineering. If ECT is too long, no amount of execution improvement by the manufacturing department can overcome this weakness. Excessive ECT is an engineering problem: ECT can only be reduced by engineering changes to equipment maintenance, to process control, to equipment qualifications, to process specifications, or by modifications to the equipment itself.

The gap between actual cycle time and entitlement cycle time may be viewed as the report card for manufacturing management and supporting staff, including information systems, automation and industrial engineering. Where there is a large gap between actual manufacturing cycle time and entitlement cycle time (hereafter termed the execution gap), it is an indication that execution could be and should be improved. Closing the gap could entail improvements to scheduling, information, automation, training or administration.
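To make the metrics concrete, the sketch below computes step-level dynamic cycle times, a derived total-flow figure, a static cycle time and an execution gap from a handful of lot records. The record layout, the step names, the data and the ECT value are all hypothetical; a real factory would draw these from its lot tracking system and from an ECT model.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical step-level records: (lot, step, arrival hour, completion hour).
RECORDS = [
    ("L1", "litho", 0.0, 5.0), ("L1", "etch", 5.0, 12.0),
    ("L2", "litho", 2.0, 9.0), ("L2", "etch", 9.0, 15.0),
    ("L3", "litho", 4.0, 10.0), ("L3", "etch", 10.0, 19.0),
]

def dynamic_cycle_times(records, window_start, window_end):
    """Average cycle time per step over lots completing the step inside the window."""
    by_step = defaultdict(list)
    for _lot, step, arrive, complete in records:
        if window_start <= complete < window_end:
            by_step[step].append(complete - arrive)
    return {step: mean(times) for step, times in by_step.items()}

def static_cycle_time(records):
    """Average elapsed time per lot from entering the flow to leaving the last step."""
    first_arrival, last_completion = {}, {}
    for lot, _step, arrive, complete in records:
        first_arrival[lot] = min(arrive, first_arrival.get(lot, arrive))
        last_completion[lot] = max(complete, last_completion.get(lot, complete))
    return mean(last_completion[lot] - first_arrival[lot] for lot in first_arrival)

step_dynamic = dynamic_cycle_times(RECORDS, 0.0, 24.0)
derived_total = sum(step_dynamic.values())   # total-flow estimate from step dynamic cycle times
static_total = static_cycle_time(RECORDS)

ECT_HOURS = 11.0                             # hypothetical entitlement cycle time for this flow
execution_gap = static_total - ECT_HOURS

print(step_dynamic, derived_total, static_total, execution_gap)
```

In this toy data set every lot falls inside the same window, so the derived and static totals coincide; with windows shorter than the flow time the two figures would generally differ, as the text notes.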


In many companies, industrial engineers have developed simulation models of the manufacturing line. These models can be very helpful for evaluating cycle time gains associated with engineering or operational changes. But they tend to be large, complex models that can be exercised only by a handful of expert users. In a typical high-technology factory, every process engineering section faces technical issues affecting cycle time every day. Asking the expert simulation modelers to address all of these issues would amount to a hopeless organizational bottleneck; as a result, there is no time to analyze most issues. What is needed instead are distributed tools, tailored for the issues faced by each section, and deployed in parallel. These could be simulation models, but there need to be separate or duplicate models for each section and proficient modelers/users within each engineering section so that parallel analyses can be pursued.

Analytical queuing theory generally provides more approximate estimations of cycle time than can be achieved using simulation. But the virtues of queuing theory are (1) the formulas can be housed in Excel spreadsheets, making the tools accessible and practical for all engineers, (2) cycle time analysis is essentially instantaneous, and can be integrated with other spreadsheet analyses engineers do, and (3) while there is some bias in queuing-theoretic formulas, if one is simply computing cycle time differences associated with engineering changes, the amount of bias in the difference will be small, making the estimates reasonably accurate. For this reason, application of queuing-theoretic formulas is attractive, if reasonably accurate formulas can be developed. The next section discusses the application of queuing theory for entitlement cycle time analysis in high-technology manufacturing.

5. Estimating Entitlement Cycle Times Using Queuing Theory

ECT has several components: time manufacturing lots are placed on hold; time lots are actually engaged in material handling transport or processing; and time lots are available for processing but waiting for selection by operators or automation and/or waiting for a qualified process tool. Actual statistics on hold times may be collected. Actual statistics on transport times and process times also may be collected; in modern advanced technology factories, these statistics are collected automatically from machine and transport equipment logs. The analytical challenge for estimating ECT largely rests with estimation of lot waiting times.

In a manufacturing environment, if there is no lot hold time, the observed cycle time for a particular process step performed on a particular type of processing machine is

\[ CT = WT + SCT \tag{8} \]

where SCT denotes the standard cycle time and WT denotes the lot waiting time. Standard cycle time is an observable statistic: it is the average time required for a lot from when it is accepted by the machine for performance of a process step until the lot is completed and ready for transport to the next step. SCT is the irreducible portion of cycle time (irreducible without engineering changes to process specifications or equipment). Another parameter about the process step is the process time PT, the average time between starts of consecutive lots on the machine when the lots are tendered to the machine as quickly as possible and there are no interruptions of processing activity. For some equipment, PT and SCT are identical, but for many they are not. On many types of equipment, lots can be “pipelined” whereby the next lot is started before the previous lot is ready for departure. In such cases, SCT is larger than PT. On other types, there may be some necessary delay for cleaning or reconditioning of the machine chamber between consecutive lots. In such cases, SCT may be smaller than PT.

Queuing theory is a branch of operations research used to analyze waiting times in resource-constrained systems. In our context, cycle time consists of the following components: execution time (process time and material movement time), waiting time, and lot hold time (e.g., lot stopped from further processing because of an SPC out-of-control alarm). Waiting time is typically the largest component of cycle time.

The terminology of queuing theory reflects analysis of a human waiting line. Customers randomly arrive for service and await an available server. When all previous customers have been served and a server becomes free, the customer enters service. The time between customer arrival and entry into service is called the queue time. The customer departs service after a service time. In our manufacturing context, the customers are typically the production lots and the servers are the machines qualified to perform the process step. The queue time corresponds to the lot waiting time, and the service time corresponds to the process time (i.e., the time between consecutive lots).

The mathematics of queuing theory is involved. Most of the useful results are approximate formulas, and they typically stem from an assumption that the lot inter-arrival time follows an exponential distribution, i.e., lot arrivals are Poisson. (This is also called the Markovian assumption.) Considering the nature of fabrication process flows, this is not a bad assumption for semiconductor fabrication. The typical work station (photo, etch, diffusion, implant, etc.)
experiences arrivals of lots coming from a variety of work stations throughout the fab. If work stations operate independently, then the aggregate arrivals from many work stations should reflect a Poisson distribution. One of the simplest queuing systems to analyze is the M/M/1 queue. This is a model of a production workstation that assumes the lot interarrival time is Poisson, the time to “service” the lot (i.e., the process time) also has an exponential distribution, and there is only one “server” (i.e., one machine). Let ta denote the mean time between arrivals and let ts denote the mean service time. The arrival rate is then λ = 1/ta and the service rate is µ = 1/ts. (It is customary in queuing theory to use the notation λ for the arrival rate and µ for the service rate.) Because the interarrival and process time distributions are memoryless, the time since the last arrival and the elapsed service time are irrelevant to the future behavior of the queue. The only information we need to characterize the current state of the system is the number of lots currently at the workstation; call this number n. By computing for each possible value of n the long-run probability that the number of lots in the system is n, we can in turn derive the expected queue time and cycle time for the workstation. Let pn denote the probability of finding the system in state n (i.e., the total number of lots in queue and in process is n). Because jobs arrive one at a time and because the machine works on only one job at a time, the system state can only change by one unit at a time. Given the system


is in state n, the rate the system moves from state n to state n+1 is λ. And given the system is in state n+1, the rate the system moves from state n+1 to n is µ. The unconditional rate at which the system moves from state n to state n+1 is therefore λpn, i.e., the probability the system is in state n times the arrival rate. The unconditional rate at which the system moves from state n+1 to n is µpn+1, i.e., the probability the system is in state n+1 times the service rate. In long-run equilibrium, these rates must be equal, i.e.,

\[ \lambda p_n = \mu p_{n+1}, \quad \text{or,} \quad p_{n+1} = \frac{\lambda}{\mu}\, p_n = u\, p_n . \]

Here we have made the identification that the utilization of the server must equal the arrival rate divided by the service rate. That follows from the following observation: Utilization equals the fraction of time that the server is busy processing lots, which in turn equals the arrival rate times the mean service time, i.e.,

u = (λ)(ts) = λ/µ . Now the state probabilities must sum to unity, i.e.,

\[ \sum_{n=0}^{\infty} p_n = p_0 \sum_{n=0}^{\infty} u^n = p_0 \, \frac{1}{1-u} = 1 , \]

so therefore

p0 = 1- u . Note that we must assume u < 1 for the sum to converge, i.e., utilization of the server must be strictly less than 100% to have a stable queue in the presence of randomness. The expected number of lots in the system is

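\[ \sum_{n=0}^{\infty} n\,p_n = \sum_{n=0}^{\infty} n\,u^n (1-u) = (1-u)\,u \sum_{n=0}^{\infty} n\,u^{n-1} = (1-u)\,u\,\frac{d}{du} \sum_{n=0}^{\infty} u^n = (1-u)\,u\,\frac{d}{du}\!\left(\frac{1}{1-u}\right) = (1-u)\,u\,\frac{1}{(1-u)^2} = \frac{u}{1-u} . \]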

The cycle time, i.e., the expected total time a lot spends in the system is

\[ CT = t_s \sum_{n=0}^{\infty} n\,p_n + t_s = t_s \left( \frac{u}{1-u} + 1 \right) = \frac{t_s}{1-u} , \]

and the queue time, i.e., the expected time a lot spends in queue is just ts less than this, i.e.,

\[ QT = \frac{t_s}{1-u} - t_s = \frac{u}{1-u}\, t_s . \qquad (9) \]
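As a numerical check on the M/M/1 results above, the following minimal Python sketch evaluates the state probabilities, the expected number of lots in the system, the cycle time and the queue time from ta and ts. The function names (`mm1_metrics`, `state_probability`) and the dictionary keys are illustrative choices, not part of the original notes.

```python
def state_probability(u: float, n: int) -> float:
    """Long-run probability p_n = (1 - u) * u**n of finding n lots at the station."""
    return (1.0 - u) * u ** n


def mm1_metrics(ta: float, ts: float) -> dict:
    """Steady-state M/M/1 measures.

    ta: mean time between lot arrivals
    ts: mean service (process) time, in the same time unit as ta
    """
    lam = 1.0 / ta          # arrival rate
    mu = 1.0 / ts           # service rate
    u = lam / mu            # utilization, u = lambda * ts
    if u >= 1.0:
        raise ValueError("utilization must be below 1 for a stable queue")
    return {
        "u": u,
        "p0": 1.0 - u,                    # probability the station is empty
        "lots_in_system": u / (1.0 - u),  # expected number of lots
        "CT": ts / (1.0 - u),             # expected time in system
        "QT": u * ts / (1.0 - u),         # expected time in queue, equation (9)
    }


if __name__ == "__main__":
    # Hypothetical station: a lot arrives every 1.25 hours, service takes 1 hour.
    for key, value in mm1_metrics(ta=1.25, ts=1.0).items():
        print(f"{key}: {value:.3f}")
```

With these hypothetical inputs the utilization is 0.8, so on average four lots are at the station, the cycle time is five hours and the queue time is four hours, i.e., four times the service time.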

The queue-theoretic view of the world is that servers are engaged in providing processing service without interruption, and the queue-theoretic cycle time is the sum of queue time and service time:

CT = QT + ts

where QT is the queue time and ts is the average service time, i.e., the reciprocal of the processing rate. The queue-theoretic cycle time does not account for the discrepancy between standard cycle time and process time. Thus if we apply queuing theory to manufacturing, we must account for this difference, i.e.,

CT = QT + ts + (SCT - PT) . (10)

In any manufacturing environment, machines are sometimes down for maintenance. Let A denote the average fraction of time a machine is available for processing service, i.e., the average fraction of time the machine is not down for maintenance or engineering work. A is termed the availability of the machine. To adapt queuing theory for manufacturing, the nominal processing rate of the machine is de-rated by the availability. If PT denotes the average process time per lot, then

ts = PT/A . (11)

Substituting (11) into (10) we have

CT = QT + PT/A + (SCT - PT) = QT + (1/A – 1)PT + SCT . (12)

Comparing (12) to (8), we see that the observed waiting time in manufacturing is estimated using queuing theory as

WT = QT + (1/A – 1)PT , (13)

i.e., waiting time equals the queue-theoretic queue time plus a correction factor for non-availability of the machine.

Queuing theory provides exact analytical expressions for queue-theoretic waiting time in the case of exponential distributions for independent lot inter-arrival times and process times. For manufacturing applications, we are interested in generalizations modeling machine non-availability, multiple machines performing in parallel a certain kind of process step, and lot inter-arrival and process time distributions that are different from exponential. Such a general case defies an exact expression. But useful and reasonably accurate approximate formulas for this case have been progressively developed by Kingman (1961), Sakasegawa (1977) and Hopp and


Spearman (1999). Formula (14) below applies to the case where m machines are qualified to perform a given manufacturing step. Other notation is as follows:

u – the utilization of availability of the work station (i.e., of the bank of m parallel machines)
PT – mean lot process time
c_0^2 – the squared coefficient of variation in lot process time (i.e., the ratio of the variance of lot process time to the square of the mean process time)
A – the average availability of the machines in the work station
MTTR – mean length of a machine down time event
c_r^2 – the squared coefficient of variation in the length of a down time event
c_e^2 – short-hand notation for the squared coefficient of variation of the effective service time, a function of A, MTTR, c_r^2, PT, and c_0^2 defined below
c_a^2 – the squared coefficient of variation in the lot inter-arrival time; for Poisson arrivals, c_a^2 = 1

The expected (average) queue time is then

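\[ QT = \left( \frac{c_a^2 + c_e^2}{2} \right) \left( \frac{u^{\sqrt{2(m+1)}-1}}{m\,(1-u)} \right) \left( \frac{PT}{A} \right) \qquad (14) \]

where

\[ c_e^2 = c_0^2 + \left( 1 + c_r^2 \right) A\,(1-A)\, \frac{MTTR}{PT} . \qquad (15) \]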

Let us examine this general queue time formula (14) qualitatively. It is the product of three terms. The first term involves the variability of lot arrivals and the variability of service time. More variability increases queue time and hence cycle time. The second term in (14) concerns the utilization of the work station and the number of qualified machines. Note the (1-u) in the denominator; as utilization is pushed to 100% of the availability, queuing theory predicts wait time will explode. The behavior of this term is graphed in Figure 5 for various values of m and for u ranging from 0 up to 100% of availability. Note that wait time is reduced as m is increased, even for the same utilization. At 90% utilization of availability, average wait time when only one machine is qualified is more than nine times higher than when eight machines are qualified. The third term in (14) expresses a ratio of process time to availability. All else being equal, a longer process time also means a longer average wait time. All else being equal, a lower availability means a longer average wait time (even if utilization is reduced to the same utilization of availability).
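The three terms of formula (14) can be evaluated directly. The sketch below is one possible Python rendering of (13), (14) and (15); the function names are hypothetical, and the sample inputs are made-up values chosen only to exercise the formulas.

```python
import math


def effective_scv(c0_sq: float, cr_sq: float, avail: float, mttr: float, pt: float) -> float:
    """Equation (15): squared coefficient of variation of effective service time."""
    return c0_sq + (1.0 + cr_sq) * avail * (1.0 - avail) * mttr / pt


def utilization_term(u: float, m: int) -> float:
    """Second term of (14): depends only on utilization u and machine count m."""
    return u ** (math.sqrt(2.0 * (m + 1)) - 1.0) / (m * (1.0 - u))


def queue_time(ca_sq: float, ce_sq: float, u: float, m: int, pt: float, avail: float) -> float:
    """Equation (14): approximate expected queue time."""
    if not 0.0 <= u < 1.0:
        raise ValueError("utilization of availability must be in [0, 1)")
    variability_term = (ca_sq + ce_sq) / 2.0
    return variability_term * utilization_term(u, m) * (pt / avail)


def waiting_time(qt: float, pt: float, avail: float) -> float:
    """Equation (13): queue time plus the correction for machine non-availability."""
    return qt + (1.0 / avail - 1.0) * pt


if __name__ == "__main__":
    # Compare one qualified machine against eight at 90% utilization of availability.
    ratio = utilization_term(0.9, 1) / utilization_term(0.9, 8)
    print(f"utilization term, m=1 vs m=8 at u=0.9: {ratio:.1f}x")

    # Hypothetical station: PT = 1 h, A = 0.9, MTTR = 4 h, Poisson arrivals.
    ce_sq = effective_scv(c0_sq=0.25, cr_sq=1.0, avail=0.9, mttr=4.0, pt=1.0)
    qt = queue_time(ca_sq=1.0, ce_sq=ce_sq, u=0.85, m=3, pt=1.0, avail=0.9)
    print(f"c_e^2 = {ce_sq:.2f}, QT = {qt:.2f} h, WT = {waiting_time(qt, 1.0, 0.9):.2f} h")
```

Setting c_a^2 = c_e^2 = 1, m = 1 and A = 1 reduces `queue_time` to the M/M/1 result (9), which is a convenient sanity check. The m = 1 versus m = 8 comparison evaluates to a ratio of roughly ten, consistent with the "more than nine times" statement in the text.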


Figure 5. Wait Time as a Function of the Number of Qualified Machines and the Utilization of Availability

Queue time may be reduced if any of the three terms in (14) is reduced. This suggests the general avenues for cycle time reduction: (1) reducing variability (i.e., reducing ca or reducing ce), (2) increasing m or reducing u, and (3) reducing PT or increasing A. The basic queuing formula (14) needs to be modified for the cases of machines whose operation requires the lots to be grouped into batches that are sequenced through the machine. Batching may be performed because of a large load size for the equipment (e.g., diffusion furnaces), or because of significant setups applicable to classes of manufacturing steps (e.g., species setups on ion implanters). Batch Machines To model machines with load sizes larger than one lot, we proceed as follows. Lots experiencing a common process recipe must be batched together to make up a load. For example, in semiconductor fabrication, a number of different product/steps may share the same furnace recipe; lots arriving for performance of these product/steps may share one equipment cycle. We therefore view batches of lots for a given recipe as the customers of the queuing system rather

[Figure 5 plot: Wait Time vs. Utilization. Horizontal axis: Utilization (0.05 to 0.95); vertical axis: Wait Time (0 to 10); legend entries recovered: Wait Time (1 machine), Wait Time (2 machines).]

than lots of a given product/step. Even if a furnace is available when it arrives, a random lot may experience an additional delay waiting for other lots to arrive in order to make up a batch. Suppose the average load size is b lots and the overall arrival rate of lots sharing a common recipe is S lots per hour. Then the average lot experiences the following delay (in hours) for building up a batch (“batching time”):

\[ BT = \frac{b-1}{2S} . \qquad (16) \]

The overall cycle time is then estimated as CT = BT + QT + PT (1/A – 1) + SCT (17) where QT is estimated as in (14) and BT is estimated as in (16). The extra batching-time term presents another cycle time issue. The batch size can be set too small (resulting in too-high utilization of the furnaces) or too large (resulting in too much wait time to build up a batch). Figure 6 illustrates this trade-off for a particular furnace type (“TEL_OXWETG”).
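The batch-size trade-off can be explored numerically by combining (14), (16) and (17). The sketch below is illustrative only. It assumes, beyond what the text states, that PT is the process time of one full load and that utilization of availability is u = S·PT/(b·m·A), i.e., batches arrive at rate S/b. The squared coefficients of variation default to 1, and all names and sample numbers are hypothetical.

```python
import math


def batching_time(b: float, lot_rate: float) -> float:
    """Equation (16): average wait of a lot while its batch is built up."""
    return (b - 1.0) / (2.0 * lot_rate)


def queue_time(ca_sq: float, ce_sq: float, u: float, m: int, pt: float, avail: float) -> float:
    """Equation (14), with batches (not lots) as the queuing entities."""
    return ((ca_sq + ce_sq) / 2.0) * (
        u ** (math.sqrt(2.0 * (m + 1)) - 1.0) / (m * (1.0 - u))
    ) * (pt / avail)


def batch_cycle_time(b, lot_rate, pt, sct, avail, m, ca_sq=1.0, ce_sq=1.0):
    """Equation (17): CT = BT + QT + PT(1/A - 1) + SCT for average load size b."""
    # ASSUMPTION: batches arrive at rate lot_rate / b and each load occupies a
    # machine for pt hours, so utilization of availability is:
    u = lot_rate * pt / (b * m * avail)
    if u >= 1.0:
        return math.inf  # batch size too small: the work station cannot keep up
    bt = batching_time(b, lot_rate)
    qt = queue_time(ca_sq, ce_sq, u, m, pt, avail)
    return bt + qt + pt * (1.0 / avail - 1.0) + sct


def best_batch_size(sizes, **station):
    """Return the candidate batch size with the smallest estimated cycle time."""
    return min(sizes, key=lambda b: batch_cycle_time(b, **station))


if __name__ == "__main__":
    # Hypothetical furnace bank: 1 lot/hour of a recipe, 3-hour loads, 2 furnaces.
    station = dict(lot_rate=1.0, pt=3.0, sct=3.5, avail=0.9, m=2)
    for b in range(1, 7):
        print(b, round(batch_cycle_time(b, **station), 2))
    print("best batch size:", best_batch_size(range(1, 7), **station))
```

For these made-up inputs a batch size of one is infeasible, small batches suffer long queue times, and large batches suffer long batching times, so the estimated cycle time is minimized at an intermediate batch size.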

Figure 6. Cycle time vs. Batch Size for a Particular Diffusion Furnace Type

[Figure 6 plot: Cycle time vs. Batch size for TEL_OXWETG. Horizontal axis: Avg. batch size (Lots), 1.5 to 5.5; vertical axis: Avg. cycle time (hours), 3.5 to 5.5.]

Setup Machines

A similar issue arises when estimating cycle times at steps performed by machines experiencing significant setup times to recondition, readjust or reconnect the machine when changing from

performing one class of product steps to another. An example of this in semiconductor manufacturing is ion implantation. For high-current ion implant machines, there is significant lost time when switching the machine from using one species to another. Species that are implanted include boron, boron fluoride, phosphorus, arsenic and antimony. Considering the lost time when changing species, lots belonging to steps involving the same species are grouped together and processed sequentially. When all lots of the same species have been processed, the machine is changed to use a different species. A species group is defined as the set of product-steps utilizing a given species. To apply the queuing formula (14), we view the species groups as the queuing entities. The process time in the queuing model for such an entity is the time to process an entire species batch, i.e., the sum of lot process times for an average-sized species batch. Once it is decided to make a setup to process a batch of a certain species, the average lot must wait for the setup to be completed and then wait for its turn within the species batch. Suppose the average species batch size is b lots. Then the average lot experiences the following delay (in hours) waiting for its turn within the setup batch (“batch sequence time”) plus the time setting up the implanter:

\[ BT = \frac{b-1}{2}\, PT + st \qquad (18) \]

where st is the time required to set up the implanter to perform product-steps in the given species group and PT is the average process time per lot. Let Ul denote the fraction of time an implant machine is engaged in processing the species group l to which a particular product-step belongs. With probability Ul, an arriving lot will find the machine performing its species batch type and will not need to wait for the machine to process other batch types before its batch type is set up. The overall cycle time for a lot-step belonging to species group l is then estimated as

CTl = BTl + (1 – Ul)*(QTl + PTl*bl/A) + SCTl – PTl (19)

where bl is the average batch size for species batch type l, SCTl and PTl are the average standard cycle time and average takt time for lot-steps in species group l, QTl is estimated as in (14) using PTl*bl as the process time, and BTl is estimated as in (18). Care must be exercised in defining u as used in (14) and in defining Ul to account for time performing species setups, in addition to time processing lots. As before, the extra batching-time term presents another cycle time issue. The batch size can be set too small (resulting in too-high utilization of the implanters) or too large (resulting in too much wait time to build up a species batch).

Application Issues


Many different product/steps are performed by a typical work station. The process time input to (14) needs to be a weighted average of the process times for the various product/steps. However, the number of qualified machines m and the standard cycle time SCT can be specific to each product/step when estimating its cycle time.

6. Capacity Planning

Capacity planning is the process of determining appropriate equipment investments to support a targeted increase in production volume. The equipment investments might be proposed for an existing fabrication facility, or they might be for an entirely new fabrication facility. The capacity planning problem is usually framed in terms of determining the minimum equipment investments enabling a target production volume and mix. But the capacity planning problem ought to be viewed as more than that. It should be viewed as a cycle time management problem as well. Excel workbooks containing the entitlement cycle time formulas described above, if linked together, can provide the basis for conducting capacity planning so as to manage cycle time well. Given target volumes by process flow, the first step is to determine minimum quantities for each type of process tool such that utilization of availability is less than 100% for each type. For such a tool set, entitlement cycle time likely would be unacceptably long. Given the set of ECT analysis workbooks for all equipment areas calibrated with data for the process flows to be operated in the new or expanded facility, the least-cost tool set vs. entitlement cycle time may be determined using the following procedure:

Step 1. For a given type of process tool, determine the reduction in entitlement cycle time if one more tool of that type were installed in the facility and qualified for all process steps appropriate for that tool. Do this separately for all types of process tools.

Step 2. For each type, compute the ratio of the reduction in ECT to the capital cost to procure and install one such tool. Identify the tool type with the highest ECT ROI, i.e., the tool providing the largest cycle time reduction per capital dollar invested. Assume one of these tools is procured.

Step 3. If the shortest ECT of interest has been realized, then terminate. Otherwise, return to Step 1 to iterate.

This procedure can be completely automated and carried out in a matter of minutes once all required data is populated into the workbooks. We illustrate the results for the case of a new fabrication facility proposed back in 2001 to include a single process flow producing 70nm flash memory devices on 300mm wafers. Figure 7 illustrates the trade-off of ECT vs. capital cost for process tools for a factory sized to start 14,500 flash memory wafers per week. The minimum-feasible tool set costs about $1.45 billion and results in an entitlement cycle time of about 63 days. Each subsequent point on the curve represents the installation of the tool with the highest ECT ROI in that situation. Because the highest ECT ROI investments are made first, the curve flattens steadily as investment increases, resulting in a roughly hyperbolic curve of ECT vs. tool cost. Note the total tool investment for a fab with an entitlement cycle time of 31 days is about $1.52 billion, i.e., an


additional 5% investment in process tools, about $60 million, if invested optimally, results in a fab that is twice as fast. Figure 8 provides similar information for a facility sized to start 29,500 flash memory wafers per week, i.e., a facility about twice as large. The tool investment required to realize an ECT of 31 days is now about $2.89 billion. Figure 9 provides similar information for a facility sized to start 54,500 flash memory wafers per week; in this case, the tool investment required to realize an ECT of 31 days is now about $5.17 billion. Because this analysis was automated, it was easy to carry it out for many possible fabrication volumes. Figure 10 summarizes the results for facility sizes ranging from 12,000 to 59,500 wafer starts per week. The horizontal axis shows total tool investment per volume capability (measured in wafer starts per week), while the vertical axis shows the resulting entitlement cycle time for the flash memory process flow. Because equipment counts must be integral, the curves sometimes cross. Nonetheless, it is very apparent there are considerable economies of scale with respect to the capital investment required per unit capacity to achieve a given fab speed. Figure 11 presents these data a different way to emphasize the economies of scale. The vertical axis is now the tool capital cost expenditure per thousand wafer starts per week of capacity, and the horizontal axis is the fabrication facility size (i.e., capacity), measured in thousand wafer starts per week. Trade-off curves are displayed for various values of the entitlement cycle time, ranging from 41 days down to 23 days. There are three important insights to be gained from this graph. First, note that all curves decline steeply until about 40,000 wafer starts per week, where the curves begin to flatten out. This makes clear the economies of scale with respect to cycle time capability in fabrication facilities. 
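The tool-selection procedure given earlier in this section (Steps 1 to 3) is a greedy loop and can be sketched in a few lines of Python. The sketch below is not the workbook implementation referred to in the text. The `ToolType` fields, the `station_ect` stand-in (which reuses the utilization term of (14) with c_a^2 = c_e^2 = 1 and A = 1) and all sample numbers are assumptions made purely for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class ToolType:
    name: str
    count: int     # tools of this type currently in the plan
    cost: float    # capital cost to procure and install one tool (million $)
    load: float    # workload in tool-equivalents, so utilization u = load / count
    pt: float      # process time per visit (hours)
    visits: int    # number of visits to this tool type in the process flow


def station_ect(t: ToolType) -> float:
    """Hypothetical stand-in for an ECT workbook: hours spent at this tool type."""
    u = t.load / t.count
    if u >= 1.0:
        return math.inf
    qt = u ** (math.sqrt(2.0 * (t.count + 1)) - 1.0) / (t.count * (1.0 - u)) * t.pt
    return t.visits * (qt + t.pt)


def total_ect(tools) -> float:
    return sum(station_ect(t) for t in tools)


def greedy_tool_plan(tools, target_ect: float, max_purchases: int = 1000):
    """Add one tool at a time, always the type with the highest ECT ROI."""
    purchases = []
    while total_ect(tools) > target_ect and len(purchases) < max_purchases:
        base = total_ect(tools)
        best, best_roi = None, 0.0
        for t in tools:
            t.count += 1                       # Step 1: try one more tool
            roi = (base - total_ect(tools)) / t.cost   # Step 2: ECT cut per $
            t.count -= 1
            if roi > best_roi:
                best, best_roi = t, roi
        if best is None:
            break                              # no purchase reduces ECT further
        best.count += 1                        # procure the best tool
        purchases.append(best.name)            # Step 3: loop until target is met
    return purchases


if __name__ == "__main__":
    tools = [
        ToolType("litho", count=4, cost=30.0, load=3.6, pt=1.0, visits=20),
        ToolType("wet_bench", count=2, cost=2.0, load=1.8, pt=0.5, visits=30),
    ]
    print("start ECT (h):", round(total_ect(tools), 1))
    print("purchases:", greedy_tool_plan(tools, target_ect=60.0))
    print("final ECT (h):", round(total_ect(tools), 1))
```

In this toy example the inexpensive wet benches are purchased first even though both tool types start at the same utilization, since each dollar spent there removes more cycle time.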
For example, a 30,000 WSPW (wafer starts per week) fab achieving the same ECT as a 15,000 WSPW fab has tool costs amounting to about 5% less per unit volume. Second, note the close spacing of ECT curves from 41 days down to 29 days. Buying speed capability with additional tools to push ECT from 41 days down to about 29 days is relatively cheap, about $1.5 million per 1,000 wafer starts per week. Third, the spacing of the curves spreads out as one pursues shorter ECTs, i.e., speed becomes more expensive after that point. The cost of enabling a 26-day ECT capability is roughly another million dollars per 1,000 wafer starts per week, and the cost of enabling a 23-day ECT capability is roughly another $1 million to $1.5 million per 1,000 wafer starts per week on top of that. Not apparent in these graphs are the differences in the process tool sets selected by this optimized ECT analysis vs. the tool sets suggested by a traditional analysis pursuing uniform U/A levels for the various tool types. Certain tool types operated at relatively modest levels of U/A experience relatively long cycle times yet are relatively inexpensive. Wet benches and diffusion furnaces are good examples of tools in this category. The ECT-based optimization will

Figure 7. Entitlement Cycle Time vs. Optimized Tool Investment for 14.5K WSPW
[Plot: ECT vs. Tool Set Capital Cost, 14.5K WSPW of L52A. Horizontal axis: Capital Cost of Least-Cost Tool Set (millions), $1,440 to $1,600; vertical axis: ECT (days), 20 to 65.]

Figure 8. Entitlement Cycle Time vs. Optimized Tool Investment for 29.5K WSPW
[Plot: ECT vs. Capital Cost of Tool Set, 29.5K WSPW of L52A. Horizontal axis: Capital $ Spent on Tool Set (millions), 2,800 to 2,960; vertical axis: ECT (days), 20 to 90.]

Figure 9. Entitlement Cycle Time vs. Optimized Tool Investment for 54.5K WSPW
[Plot: ECT vs. Tool Set Capital Cost, 54.5K WSPW of L52A. Horizontal axis: $ Spent on Tool Set (millions), $5,080 to $5,300; vertical axis: ECT (days), 20 to 60.]

Figure 10. Entitlement Cycle Time vs. Tool Investment Per WSPW
[Plot: Entitlement Cycle Time vs. Tool Capital Cost for Various-Sized Fabs, L52A. Horizontal axis: Equipment Capital Cost per WSPW (million $), 0.093 to 0.112; vertical axis: Entitlement Cycle Time (days), 23 to 104; one curve per fab size: 12K, 14.5K, 17K, 19.5K, 24.5K, 29.5K, 34.5K, 39.5K, 44.5K, 49.5K, 54.5K and 59.5K WSPW.]

Figure 11. Optimized Tool Investment vs. Fabrication Facility Size
[Plot: Tool Capital Cost for 50 Series Flash vs. ECT and Fab Size. Horizontal axis: Fab Size (K WSPW), 14.5 to 59.5; vertical axis: Tool Capital Cost per K WSPW (million $), 94 to 110; one curve per ECT value: 23, 26, 29, 32, 35, 38 and 41 days.]

tend to recommend buying more of such tools and fewer tools with good reliability and high counts of qualified tools for their process steps, such as ashers and lithography tools. That is, optimal capacity planning does not strive to equalize U/A values across the different tool sets.

7. Advanced Scheduling Methodology

The most common methodology for factory floor scheduling involves the application of dispatching rules. The work-in-process awaiting processing in a given work station is prioritized using dispatching rules. When a machine becomes free and available, the highest-priority lot is assigned for processing. Typically, dispatch priority is a function of a pre-established due-date for completion of the lot. The intent is to give priority to lots behind schedule.

In semiconductor fabrication practice, there are a number of problems associated with application of dispatching rules. First, the lot due-dates become “stale” and may not reflect true priorities. Most fabrication facilities process many lots of the same product. Customers are indifferent as to which of these lots they receive. Because of inspection sampling, process holds and other issues, it is very common for lots to get out of their starting order. When one lot of a given product passes another lot of the same product, ideally they should exchange due dates, but this typically does not happen. Moreover, manufacturing yields are variable; when yields of downstream lots are worse than planned, the due dates of the upstream lots should be advanced (and when downstream yields are better than planned, the due dates of upstream lots should be delayed), but again this typically does not happen. Even worse, the target output schedule may be changed, but lot due dates not updated to reflect this change.

A second problem is that a dispatch list is not really a schedule; it is simply a priority list for a group of parallel machines comprising a workstation. It still requires humans to determine and define the arrangement of lots among the alternative machines. A third and related problem arises when the way lots are allocated among the machines in a work station can significantly impact yield or lot throughput; strictly following the dispatch order can then be undesirable. In such cases, the manufacturing organization may give little heed to the dispatch list, as the list provides no basis for assessing trade-offs between on-time delivery and workstation efficiency.

An alternative approach is described in Leachman, Kang and Lin (2002). The so-called SLIM (Short cycle time and Low Inventory in Manufacturing) system was developed and implemented at Samsung Electronics Corp., Ltd. to provide shift production targets and on-line scheduling of fabrication activity in Samsung’s fab lines. Certain aspects of SLIM will be highlighted in these notes, but the reader should refer to the full paper to fully comprehend the methodology.

The methodology of SLIM is summarized as follows. A target output schedule for the fabrication facility is established by the planning function. Target cycle time from fab-in to fab-out also is established for each product of the fab. The SLIM methodology breaks out the target cycle time by step in a manner that establishes cycle time buffers in each masking layer in proportion to the cycle time variability in that layer.


Once each shift, the target fab out schedule is translated into equivalent production targets for each step on each product. These targets are termed the ideal production quantities (IPQs). The use of the IPQs is as follows: Arrangement of lots on machines to facilitate better process control or throughput is acceptable as long as completion of the IPQs is not jeopardized. That is, we strive to be current with the cycle time plan and the fab out plan by the end of each production shift. Next, on-line scheduling logic within SLIM assigns all on-hand work-in-process to the machines in the workstation. This logic strives to complete IPQs by the end of the shift, prevent starvation of downstream bottlenecks, and make arrangements of lots on the machines that are efficient. The logic must be customized for each workstation reflecting the process and equipment peculiarities in that workstation.

Determining Target Cycle Times and Target WIP Levels by Layer (see footnote 2)

Given a target cycle time TCT for an entire fab process, we can determine target cycle times and target WIP levels for each photo layer according to the following procedure. Let SCT denote the sum of standard cycle times for all steps in the process flow. The difference between total target cycle time TCT and total standard cycle time SCT is the total buffer time TBT to be allocated across the process flow. If the target fab output rate for the process is constant, we may apply Little's Law to translate the same relationship in terms of WIP. The target total WIP level is TW = (TCT)(λ), where λ is the production rate. The total active WIP in the process is given by ActvW = (SCT)(λ). The difference TW – ActvW = (TCT - SCT)(λ) is the buffer inventory TBW to be allocated across the process flow. The primary purpose of the buffer time or buffer inventory is to protect bottleneck resources against utilization losses. In most wafer fabs, the photolithography steppers are the bottleneck. 
When the process is operating in control and all equipment is operating, the buffer WIP across all steps between visits to the steppers will be concentrated right before the stepper processing step. But because of random machine down times or process out-of-control events, WIP is sometimes more distributed over the steps in the layer, and there is a risk the WIP at the stepper operation may be drawn down below the active WIP level, i.e., there is a risk of undesired idle time. It is often the case that the steppers are somewhat inflexible, i.e., certain mask layers must be performed using certain steppers. It is also often the case that changing which layer a stepper is processing involves a setup time that consumes capacity. In such cases, it is valuable to have a WIP buffer in each layer. Logically, the size of the buffer should be proportional to the amount of uncertainty in the WIP supply to that layer. That is, the more uncertainty, the more risk of starvation of the photo machines performing that layer, and the greater the need for a WIP buffer.
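The Little's Law relationships above reduce to simple arithmetic. The following sketch computes TW, ActvW and TBW; the function name and the sample figures are hypothetical and serve only as a worked example.

```python
def wip_targets(tct: float, sct: float, rate: float):
    """Return (TW, ActvW, TBW) from Little's Law.

    tct:  target cycle time for the whole process flow (e.g., days)
    sct:  sum of standard cycle times for all steps (same unit as tct)
    rate: constant production rate (e.g., wafers per day)
    """
    if tct < sct:
        raise ValueError("target cycle time cannot be below total standard cycle time")
    tw = tct * rate       # target total WIP
    actv_w = sct * rate   # active WIP
    tbw = tw - actv_w     # buffer WIP, equal to (TCT - SCT) * rate
    return tw, actv_w, tbw


if __name__ == "__main__":
    # Hypothetical flow: 30-day target, 12 days of standard cycle time, 2,000 wafers/day.
    tw, actv_w, tbw = wip_targets(tct=30.0, sct=12.0, rate=2000.0)
    print(f"TW = {tw:.0f}, ActvW = {actv_w:.0f}, TBW = {tbw:.0f} wafers")
```

With these made-up inputs the target WIP is 60,000 wafers, of which 24,000 are active and 36,000 are buffer to be spread across the layers.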

Footnote 2: In these notes, the special case of constant target fab output rates for all products is assumed. Formulas for the more general case of time-varying target output rates are provided in the Factory Floor Scheduling notes.


We decompose the process flow into layers for the purposes of WIP control as follows. Layer 1 consists of all processing steps from fab start up through the first photolithography exposure. Layer j consists of all processing steps after layer j-1 up through the jth photolithography exposure. (NOTE: This definition of layers is different from the way a designer would describe the physical layers of the device.) As a proxy for the uncertainty in the WIP supply to each layer, we can consider the uncertainty in the elapsed cycle time from completion of the bottleneck processing step in layer j-1 to arrival in the queue for the bottleneck processing step in layer j. (If j=1, we consider the standard deviation in the cycle time from the start of the process flow until arrival in the queue for the first bottleneck step.) The idea here is that, if the WIP flow through some layer has many disruptions because of equipment or process trouble, there should be a high variance of cycle time in that layer. Let σj denote the standard deviation of the cycle time for the layer ending at the jth photo operation, j = 1, 2, ... , L. Data must be collected empirically to establish σj. To prevent starvation of the photo machines performing the layer j exposure operation, we would like to plan a buffer time BTj in layer j of size k * σj, where k is as large as possible. Equivalently, we would like to plan a buffer WIP level BWj = (BTj )( λ ). We can equalize the protection against starvation in all layers by choosing the same "k" factor for all layers. That is, we take

\[ k = \frac{TCT - SCT}{\sum_{j=1}^{L} \sigma_j} . \qquad (20) \]
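A short numerical sketch of (20) may help. It computes k, the per-layer buffer times BTj = k·σj, and the resulting layer targets SCTj + BTj. The function and argument names are illustrative, and the layer data are invented.

```python
def layer_cycle_time_targets(tct, sct_by_layer, sct_after_last_photo, sigma_by_layer):
    """Allocate the total buffer time across photo layers in proportion to sigma_j.

    tct:                   target cycle time for the whole flow
    sct_by_layer:          standard cycle time of each layer j = 1..L
    sct_after_last_photo:  standard cycle time of the steps after the last photo step
    sigma_by_layer:        standard deviation of cycle time for each layer j = 1..L
    """
    if len(sct_by_layer) != len(sigma_by_layer):
        raise ValueError("need one sigma per layer")
    total_sct = sum(sct_by_layer) + sct_after_last_photo   # SCT for the whole flow
    k = (tct - total_sct) / sum(sigma_by_layer)            # equation (20)
    buffers = [k * s for s in sigma_by_layer]              # BT_j = k * sigma_j
    targets = [sct + bt for sct, bt in zip(sct_by_layer, buffers)]
    return k, buffers, targets


if __name__ == "__main__":
    # Hypothetical three-layer flow, times in days.
    k, buffers, targets = layer_cycle_time_targets(
        tct=30.0,
        sct_by_layer=[3.0, 4.0, 2.0],
        sct_after_last_photo=1.0,
        sigma_by_layer=[1.0, 2.0, 1.0],
    )
    print("k =", k)
    print("buffer times:", buffers)
    print("layer targets:", targets)
```

Here the total buffer of 20 days is split 5, 10 and 5 days, so the most variable layer receives the largest buffer, and the layer targets plus the unbuffered tail sum back to the overall 30-day target.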

The target cycle time for layer j, TCTj, is then set to TCTj = SCTj + BTj, where BTj = k * σj. Equivalently, we set the target WIP level for layer j, TWj, to be TWj = ActvWj + BWj, where ActvWj is the active WIP in layer j and BWj = ( k * σj ) ( λ ) is the target buffer for layer j. (The active WIP in layer j is given by (SCTj )( λ ), where SCTj is the sum of standard cycle times for all process steps in layer j and λ is the production rate.) For the series of process steps after the last photo step in the process, we simply set the target cycle time equal to the standard cycle time, i.e., we set the target WIP equal to the active WIP. There is no need for a buffer after the last visit to the bottleneck. In some cases, data on σj are not available or are impractical to derive, but data on actual cycle times ACTj are available. In such a case, we may use ACTj – SCTj as an index of the uncertainty in the cycle time for layer j. (The idea is that layers in which the discrepancy between actual and standard cycle time is large must have more waiting time and therefore require a larger downstream buffer to maintain throughput.) We therefore re-define the “k” factor as

\[ k = \frac{TCT - SCT}{\sum_{j=1}^{L} \left( ACT_j - SCT_j \right)} . \qquad (21) \]

And we set the target cycle time for layer j, TCTj, to be TCTj = SCTj + BTj where BTj = k*(ACTj – SCTj).

Production Targets and Scheduling Priorities

The target WIP levels allow us to determine production targets. We define the ideal production quantity (IPQ) of a particular process step to be the quantity we should complete by the end of a production shift such that the downstream WIP is made exactly equal to the amount required to fulfill the production rate over the target cycle time to fab out. We illustrate the computation of the IPQ for photo bottleneck steps for a simplified case assuming line yields are 100% and a constant target output rate. Let AWj denote the actual WIP in layer j, and let TWj denote the target WIP in layer j. Let λ denote the target production rate per shift, and let ∆FO denote any shortage (surplus if negative) of fab outs to date compared to the target production rate. Then

\[ IPQ_j = \Delta FO + \sum_{l=j+1}^{L+1} \left( TW_l - AW_l \right) + \lambda \qquad (22) \]

where TWL+1 and AWL+1 denote the target and actual WIP levels, respectively, in the string of process steps after the last photo layer. We remark that IPQj may have any positive or negative value. (It is negative when production is more than one shift ahead of schedule.) Even if positive, it may be impossible to finish the IPQ in one shift if the supply of WIP is insufficient and/or if the available capacity is insufficient. Hence the adjective ideal. The production targets can be turned into schedule priorities simply by dividing by the production rate. We define the schedule score (SS) of a photo step to be SSj = -IPQj / λ (23) We can prioritize photo steps by least SS first in order to decide which step to set up first on an available machine. Note that we have prioritized steps rather than lots. Suppose we wish to economize on changeovers between different product/steps which utilize different masks and different settings of the photo machine. Then once a step is selected for processing, we should try to dispatch enough lots of that step to fulfill its IPQ. If we can continue to process lots of that product/step without jeopardizing fulfillment of the IPQs for the other product/steps we may do so, but otherwise we need to change to process a different product/step with a negative schedule score and on-hand WIP.
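Equations (22) and (23) can be illustrated with a small calculation. In the sketch below the lists `tw` and `aw` hold target and actual WIP for layers 1 through L followed by the post-photo tail (layer L+1). The names and the sample WIP figures are hypothetical.

```python
def ideal_production_quantities(delta_fo, tw, aw, rate):
    """Equation (22): IPQ_j for photo layers j = 1..L.

    delta_fo: shortage of fab outs to date (negative if ahead of plan)
    tw, aw:   target and actual WIP for layers 1..L+1 (last entry is the tail
              of steps after the final photo layer)
    rate:     target production rate per shift
    """
    if len(tw) != len(aw):
        raise ValueError("tw and aw must have the same length")
    num_layers = len(tw) - 1  # L
    ipq = []
    for j in range(1, num_layers + 1):
        # Layers l = j+1 .. L+1 are list indices j .. L (0-based).
        downstream_shortfall = sum(tw[l] - aw[l] for l in range(j, num_layers + 1))
        ipq.append(delta_fo + downstream_shortfall + rate)
    return ipq


def schedule_scores(ipq, rate):
    """Equation (23): SS_j = -IPQ_j / rate; the lowest score has highest priority."""
    return [-q / rate for q in ipq]


if __name__ == "__main__":
    # Hypothetical three-layer flow, WIP in wafers, 50 wafers per shift.
    tw = [100, 120, 80, 40]
    aw = [110, 100, 90, 30]
    ipq = ideal_production_quantities(delta_fo=5, tw=tw, aw=aw, rate=50)
    ss = schedule_scores(ipq, rate=50)
    order = sorted(range(1, len(ipq) + 1), key=lambda j: ss[j - 1])
    print("IPQ:", ipq)
    print("SS:", ss)
    print("layer priority (least SS first):", order)
```

For these made-up figures the IPQs are 75, 55 and 65 wafers and the schedule scores are -1.5, -1.1 and -1.3, so the photo step of layer 1 would be set up first, followed by layer 3 and then layer 2. Note that the WIP in layer j itself does not enter IPQj; only the downstream layers do.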


SLIM Scheduling Logic

An optimization problem may be defined and solved to schedule the available WIP so as to meet the SLIM priorities as much as possible, or a heuristic algorithm may be developed. We outline a heuristic algorithm as follows. We suppose changing the product/step performed by the machine is a productivity and/or process control issue and that such changeovers are to be economized.

1. Prioritize product/steps by schedule score.

2. For running machines, if there is available unassigned WIP and SS < 0 for the current product/step, assign lots to the machine until the IPQ is reached or the available WIP is exhausted or the available machine time in the shift is exhausted, whichever limit is reached first. Update IPQ scores as assignments are made (simply subtract the assignment).

3. Proceed down the priority list of product/steps. Choose the preferred machine to assign the highest-priority product/step. The preference ordering of machines might reflect setups common across a group of product/steps, the remaining available time on the machines, difference in process times among the machines, the total amount of candidate WIP for each machine, or other factors. Assign lots to that machine until the WIP is exhausted or the IPQ is reached or all available machine time is exhausted, whichever occurs first. Continue performing step 3 for all product/steps with negative schedule scores until as much progress towards fulfillment of the IPQs has been made as is feasible.

4. Go back through the list of product/steps and repeat. This time, assign up to all available WIP or available machine time.

5. In situations of low WIP, one or more machines may be idled as a result of applying the above assignment algorithm. In this case, one can adapt the algorithm by removing a lot from a busy machine, assigning it to the idle machine, and pretending this lot is already running (and therefore treated in step 2).

8. Engineering Organizational Policies to Better Engineer and Manage Speed

Given the large economic values associated with cycle time reduction and given the analytical means to estimate entitlement cycle time described in preceding sections, it is meaningful to consider the engineering organization and how it should deal with supply-chain speed in a high-technology company. Typically, a high-technology company has a product design organization and a process engineering organization. Process engineering may be further divided into an organization dealing with R&D of new process technology and another dealing with improvement of existing process technologies. The process engineering organizations are further subdivided into sections dealing with particular types of process technology. For example, in companies manufacturing using planar fabrication process technologies, there will be sections for photolithography, etch, ion implant, chemical vapor deposition, metal deposition, diffusion, chemical mechanical polish, etc. Each section is responsible for the process specifications

33

precisely defining how each manufacturing step is to be performed and the determination or verification of quality of output of each step. Typically, the sections are responsible for resource planning, staffing, and costs of the operations to perform the steps within their domain. In a typical high-technology company, on nearly every working day every process section faces issues and makes decisions that influence cycle time. Supporting the sections are staff departments such as information systems, industrial engineering, automation and human resources, as well as a manufacturing department executing the process and maintaining the equipment. They too face issues and make decisions affecting cycle time on a near-daily basis. For a high-technology manufacturing company to be proficient at managing cycle time, the following managerial and organizational policies are proposed: (1) Management should impute an economic value to cycle time for each product and declare this value to the engineering and operations organizations. Management should require that any proposals for changes to product development plans, to the manufacturing process or to operational policies that would change cycle times must be justified by quantifying the overall economic impact, including the gain or loss in future revenues associated with changes in cycle time. (2) Management should establish cycle time goals for each product. These goals may be dynamic, anticipating learning curve improvements. Where entitlement cycle time is larger than the cycle time goal, the engineering organization must devise changes to equipment or process enabling the goal to be met. (2) Entitlement cycle times should be calculated for every product and process. 
Each engineering section has tailored queuing formulas or simulation models embedded in spreadsheet tools and knowledgeable staff enabling it to routinely calculate its entitlement cycle times, and to routinely compute the predicted change in its ECTs as a function of proposed changes to product, process, equipment, operational policies, product mix or factory volume. (3) Every proposal for an engineering project or an operational change quantifies the change in ECTs as well as changes in product cost anticipated from the project. Every proposal for an engineering project or an operational change quantifies the true discounted cash flow to the company from the project, including changes in product lifetime revenues as well as changes in product cost. Management does not sign off on any project proposal unless the cycle time impact and consequent speed dollars are included in the evaluation. (4) Where actual cycle time is larger than entitlement cycle time, the manufacturing organization and supporting engineering staff needs to improve execution. Improved execution tools may be required, i.e., more advanced planning and scheduling systems. (5) Engineering sections and operational management are evaluated on the basis of their overall discounted cash flow contribution to the company, including revenue gains as well as cost reduction. Engineering or operational improvements that increase speed are appropriately recognized and rewarded.
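To illustrate how a proposal evaluation might combine speed dollars with cost effects, here is a minimal Python sketch. It is not the paper's revenue formula: it assumes management has declared a single imputed value per day of cycle time, treats the resulting speed dollars as an amount already expressed in present-value terms, and discounts yearly cost savings at a fixed rate. The function name, parameters and all figures are hypothetical.

```python
def project_value(upfront_cost, annual_cost_savings, ect_reduction_days,
                  value_per_day, discount_rate):
    """Discounted cash flow of a project, counting speed dollars.

    Assumptions (hypothetical, for illustration only):
      - value_per_day is the imputed value of one day of cycle time,
        already expressed as a present value;
      - annual_cost_savings[k] is realized at the end of year k + 1.
    """
    speed_dollars = ect_reduction_days * value_per_day
    discounted_savings = sum(
        saving / (1.0 + discount_rate) ** year
        for year, saving in enumerate(annual_cost_savings, start=1)
    )
    return speed_dollars + discounted_savings - upfront_cost


# Hypothetical proposal: a $500,000 equipment change that saves $100,000 per
# year for two years and removes half a day of entitlement cycle time.
with_speed = project_value(500_000, [100_000, 100_000], 0.5, 1_000_000, 0.10)
cost_only = project_value(500_000, [100_000, 100_000], 0.0, 1_000_000, 0.10)

print(round(with_speed, 2))  # value when speed dollars are counted
print(round(cost_only, 2))   # value on cost grounds alone
```

With these made-up numbers the project loses money on cost grounds alone but is worthwhile once the cycle time effect is counted, which is the kind of case the policies above are meant to surface.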

Table 2 displays the 2006 cycle time results at an image sensor fabrication plant embracing this approach. ECT analysis tools (Excel spreadsheets housing customized analytical queuing formulas) were developed and implemented in every process engineering section during the spring of 2006. The table has a row for each section (e.g., “Photo” for photolithography, “CFA” for color filter array, “CMP” for chemical mechanical polish, “Wet” for wet-bench etching and cleaning, etc.) and columns for entitlement cycle time in June and September 2006 and for actual cycle time in the same months. Viewing the totals, as of June 2006 the entitlement cycle time for fabricating image sensors was almost 39 days and the actual cycle time was almost 48 days. After three months of engineering effort, entitlement cycle time had dropped to a little less than 29 days, i.e., a 10-day reduction. ECT got worse in only one section (CFA); this portion of the process was being removed from the factory, and therefore there was no value in improving it.
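The totals quoted above can be checked against the per-section figures of Table 2. The short Python sketch below transcribes the table, sums each column, and derives the reductions and the gap between actual and entitlement cycle time; the variable names are illustrative only. The per-section September entitlement values sum to 28.91 days rather than the reported 28.92, presumably because the published entries are rounded.

```python
# Per-section cycle times in days, transcribed from Table 2:
# (June entitlement, Sept entitlement, June actual, Sept actual)
TABLE_2 = {
    "Photo":     (3.31, 2.43, 6.16, 3.25),
    "CFA":       (1.72, 1.83, 2.96, 2.35),
    "Metrology": (4.91, 1.96, 4.47, 3.03),
    "Strip":     (0.90, 0.74, 2.17, 1.84),
    "Scrub":     (1.37, 0.93, 2.19, 1.34),
    "Implant":   (4.14, 3.62, 4.28, 3.96),
    "CMP":       (1.23, 1.09, 1.28, 1.14),
    "CVD":       (1.33, 1.11, 3.31, 2.31),
    "Metals":    (1.03, 0.67, 0.85, 0.75),
    "Etch":      (2.10, 1.43, 3.14, 2.36),
    "Probe":     (0.06, 0.02, 0.11, 0.06),
    "Diffusion": (9.31, 6.65, 6.88, 4.75),
    "Wet":       (7.37, 6.43, 10.11, 8.46),
}


def column_total(col):
    """Sum one column of the table over all sections, rounded to 0.01 day."""
    return round(sum(row[col] for row in TABLE_2.values()), 2)


june_ect, sept_ect, june_act, sept_act = (column_total(c) for c in range(4))

ect_reduction = round(june_ect - sept_ect, 2)      # entitlement improvement
actual_reduction = round(june_act - sept_act, 2)   # realized improvement
june_gap = round(june_act - june_ect, 2)           # execution gap in June
sept_gap = round(sept_act - sept_ect, 2)           # execution gap in September

# Sections whose entitlement cycle time got worse between June and September.
worse = [name for name, row in TABLE_2.items() if row[1] > row[0]]

print(june_ect, sept_ect, june_act, sept_act)
print(ect_reduction, actual_reduction, june_gap, sept_gap, worse)
```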

Table 2. Entitlement and Actual Cycle Time Improvements in 2006 at an Image Sensor Fabrication Plant

Section      June Entitlement   Sept Entitlement   June Actual   Sept Actual
             (days)             (days)             (days)        (days)
Photo         3.31               2.43               6.16          3.25
CFA           1.72               1.83               2.96          2.35
Metrology     4.91               1.96               4.47          3.03
Strip         0.90               0.74               2.17          1.84
Scrub         1.37               0.93               2.19          1.34
Implant       4.14               3.62               4.28          3.96
CMP           1.23               1.09               1.28          1.14
CVD           1.33               1.11               3.31          2.31
Metals        1.03               0.67               0.85          0.75
Etch          2.10               1.43               3.14          2.36
Probe         0.06               0.02               0.11          0.06
Diffusion     9.31               6.65               6.88          4.75
Wet           7.37               6.43              10.11          8.46
Total        38.78              28.92              47.91         35.60

As may be seen, actual cycle time dropped to less than 36 days, i.e., a reduction of more than 12 days. During the summer months there was considerable improvement in ECT as well as execution improvements reducing the gap between actual and entitlement cycle times.³ The 12-day reduction in actual cycle time was worth about $24 million in increased lifetime revenues for the image sensors in production at that time. Moreover, the improved competitiveness enabled the company to capture major new accounts.

³ Reported actual cycle times in the Diffusion section were less than the entitlement cycle times because of an accounting convention: most of the waiting time for diffusion steps was credited to the Wet section’s actual cycle time. The sum of Diffusion + Wet actual and entitlement cycle times may be compared.

References

Hopp, Wallace and Mark Spearman, Factory Physics, McGraw-Hill, New York (2001).

Kingman, J. F. C., “The Single-Server Queue in Heavy Traffic,” Proceedings of the Cambridge Philosophical Society, 57, p. 902-904 (1961).

Leachman, Robert C., Jeenyoung Kang and Vincent Lin, “SLIM: Short Cycle Time and Low Inventory in Manufacturing at Samsung Electronics,” Interfaces, 32 (1), p. 61-77 (Jan.-Feb. 2002).

Leachman, Robert C. and Shengwei Ding, “Excursion Yield Loss and Cycle Time Reduction in Semiconductor Manufacturing,” IEEE Transactions on Automation Science and Engineering, 8 (1), p. 112-117 (January, 2011).

Sakasegawa, H., “An Approximation Formula $L_q \cong \alpha \cdot \rho^{\beta}/(1-\rho)$,” Annals of the Institute of Statistical Mathematics, 29 (1A), p. 67-75 (1977).
