Maintenance of Mining Machinery

8/22/2019 Maintenance of Mining Machinery


Importance of Maintenance

Mining used to be a people-intensive industry. Not any more. Productivity in mining has grown by a factor of twenty in the last 40 years. This growth has been made possible by large-scale mechanisation.

Obviously, when mining is done by machinery, the important consideration is to keep the machines operating. This is partly the aim of the maintenance function.

For a typical surface mining operation, maintenance-related costs make up about 50% of all operating costs. For an underground mine operation, the figure is about 30-40%. On average, mining machinery maintenance is 30-50% of the operating costs for the nation's mining industry, or $10-15b/year.

Another indicator of the importance of maintenance is given by the following table, which lists the improvement in profit caused by a 1% improvement in various areas:

Area of 1% improvement      Effect on profit
Productivity                3%
Availability                3%
Reduced Operating Costs     0.5-3.5%
Product Price Increase      0.5-0.9%
Reduced Interest Rate       0.7-1.2%

Source: World Mining Equipment, Dec '98

After years of seeing maintenance as a cost of doing business, the mining industry is starting to recognise the importance of equipment reliability and maintenance. Many companies have implemented computer-based centralised maintenance management systems and are reviewing new equipment purchases with a stronger focus on issues such as life-cycle costing, reliability and maintainability.

While maintenance costs are significant, the cost of lost production is even more significant. That is why availability has a high leverage on profits: a 1% improvement in availability increases company profits by up to 3%.

Another way of stating this is that modern mining requires large amounts of investment in capital infrastructure and equipment to produce a product sold at a low unit cost. To stay competitive, such a business has to focus on full realisation of its capital investment, i.e. sustained production at high levels. Maintenance plays an essential role in achieving this goal.

In simple words, you have to keep the machine running to stay ahead of the pack. This can be expressed as an optimisation problem:

Problem = Maximise the annual production

Total annual production is given by another simple formula:

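$$\text{Annual Production} = \text{Tonnes/h} \times \left[ TH - PM - BM \right]$$

where TH is the total hours in the year (365 x 24), PM is the planned maintenance time and BM is the breakdown maintenance time. We assume round-the-clock potential operation (365 days a year, 24 hours a day).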

The term between the brackets is an important parameter as it represents the total operating time. This is usually expressed as its ratio to the total time potentially available for production, and this ratio is called the Availability.

Availability:

$$A = \frac{TH - PM - BM}{TH}$$

It is the responsibility of the entire site to keep this figure at a high level. The availability goes down when the equipment is poorly operated. It also goes down when the equipment is poorly maintained.
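The production and availability formulas above can be checked with a few lines of Python. This is only a sketch: the function names and the maintenance hours and production rate used at the bottom are hypothetical figures chosen to illustrate the arithmetic, not data from a real mine.

```python
HOURS_PER_YEAR = 365 * 24  # TH: round-the-clock potential operation


def operating_hours(pm_hours, bm_hours, total_hours=HOURS_PER_YEAR):
    """Hours left for production after planned (PM) and breakdown (BM) maintenance."""
    return total_hours - pm_hours - bm_hours


def availability(pm_hours, bm_hours, total_hours=HOURS_PER_YEAR):
    """Operating time as a fraction of the total time potentially available."""
    return operating_hours(pm_hours, bm_hours, total_hours) / total_hours


def annual_production(tonnes_per_hour, pm_hours, bm_hours, total_hours=HOURS_PER_YEAR):
    """Annual Production = Tonnes/h x [TH - PM - BM]."""
    return tonnes_per_hour * operating_hours(pm_hours, bm_hours, total_hours)


# Hypothetical figures, for illustration only.
rate_tph = 2000.0   # tonnes per hour
pm = 700.0          # planned maintenance, hours per year
bm = 1052.0         # breakdown maintenance, hours per year

print(f"Availability: {availability(pm, bm):.1%}")
print(f"Annual production: {annual_production(rate_tph, pm, bm):,.0f} t")
```

With these assumed inputs, every hour moved from breakdown maintenance back into operation adds one hour of production at the full rate, which is why availability has such a strong leverage on output.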

The Concept of Failure

Failure

Failure is the loss of ability of an item to perform its required function. An example is the downdrive gear shown in the figure on the right. A number of teeth on this gear are broken and the gear cannot fulfil its function of transmitting power.

The cost associated with this failure has three main components:
- Lost production while the machine was down
- The cost of the replacement gear
- Maintenance labour costs

MTBF = Mean Time Between Failures
MTTR = Mean Time To Repair

For the example on the right, MTBF = 100 hours and MTTR = 20 hours.

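The availability is sometimes expressed in terms of MTBF and MTTR as

$$A = \frac{MTBF}{MTBF + MTTR}, \qquad MTBF = \frac{\text{total operating time}}{N}, \qquad MTTR = \frac{\text{total repair time}}{N}$$

where N is the number of failures in the data collection period.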
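As a quick numerical check, the sketch below computes availability from MTBF and MTTR, assuming the usual form A = MTBF / (MTBF + MTTR). The uptime, downtime and failure count in the log are hypothetical values chosen so that they reproduce the MTBF = 100 hours and MTTR = 20 hours quoted for the gear example.

```python
def mean_times(total_uptime_hours, total_downtime_hours, n_failures):
    """Return (MTBF, MTTR) from totals over a data collection period."""
    mtbf = total_uptime_hours / n_failures
    mttr = total_downtime_hours / n_failures
    return mtbf, mttr


def availability_from_mtbf(mtbf_hours, mttr_hours):
    """Availability as the fraction of time the item is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical log: 10 failures, 1000 h of operation, 200 h of repairs.
mtbf, mttr = mean_times(total_uptime_hours=1000.0,
                        total_downtime_hours=200.0,
                        n_failures=10)
print(f"MTBF = {mtbf:.0f} h, MTTR = {mttr:.0f} h")
print(f"Availability = {availability_from_mtbf(mtbf, mttr):.1%}")
```

Note that the number of failures cancels out of the ratio, so the same availability is obtained directly from total uptime divided by total uptime plus total downtime.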

Failure Probability Distributions

Failures usually occur randomly. The repair time is also hardly constant. If we treat failure as a random event, then we can use the well-established tools of probability and statistics to model the uptime, downtime and availability for our equipment.

The Poisson distribution is commonly used in forecasting to represent the number of occurrences of a specific event in a given continuous interval. Examples include:

Ships arriving at a dock on a given day

Traffic accidents on the SE freeway in a month

Mad cow disease breakouts in the world in one year

Typos per page in a long report typed by Hal Gurgenci

Cable shovel failures in one day of operation

The following is the probability distribution of the Poisson random variable X representing the number of outcomes occurring in a given time interval t. Lambda is the average number of outcomes per unit time:

$$P(X = x) = \frac{(\lambda t)^{x} \, e^{-\lambda t}}{x!}, \qquad x = 0, 1, 2, \ldots$$

Assume failure events follow a Poisson distribution. It is then easy to find the probability of having NO FAILURES in a given time interval t by substituting x=0 in the Poisson distribution function:

$$R(t) = P(X = 0) = e^{-\lambda t}$$

This is referred to as the survival probability or the reliability.
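The Poisson probabilities and the resulting reliability are easy to evaluate numerically. The sketch below assumes the standard Poisson form with mean lambda x t; the failure rate and time interval used in the example are hypothetical.

```python
import math


def poisson_pmf(x, lam, t):
    """Probability of exactly x events in an interval t, with rate lam per unit time."""
    mean = lam * t
    return mean ** x * math.exp(-mean) / math.factorial(x)


def reliability(lam, t):
    """Probability of no failures in time t, i.e. the Poisson probability for x = 0."""
    return poisson_pmf(0, lam, t)


# Hypothetical cable shovel: 0.5 failures per day on average, over a 2-day interval.
lam, t = 0.5, 2.0
for x in range(4):
    print(f"P({x} failures in {t:g} days) = {poisson_pmf(x, lam, t):.3f}")
print(f"Reliability over {t:g} days = {reliability(lam, t):.3f}")
```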

Component Reliability

Reliability is the probability that a product will operate throughout a specified period without failure
- when maintained in accordance with the manufacturer's instructions; and
- when not subjected to environmental or operational stresses beyond the limits stipulated by the manufacturer.

Failure Probability

If the reliability, R(t), is the probability of surviving through time t, then the probability of failing in that period is 1 - R(t), or

$$F(t) = 1 - R(t) = 1 - e^{-\lambda t}$$

This is called the cumulative failure distribution function or, in short, the failure c.d.f. It is called cumulative because it expresses the aggregate probability of failure for a time period t. The time-derivative of the c.d.f. gives the Probability Distribution Function or p.d.f.:

$$f(t) = \frac{dF(t)}{dt} = \lambda e^{-\lambda t}$$

The coefficient lambda is called the hazard rate or the failure rate (eg if a piece of equipment is expected to fail twice a day on average, lambda is 2 d^-1). If we have failure data for a large population, then the hazard rate can be estimated as follows:

Hazard Rate (Failure Rate):

$$\lambda = \frac{\text{number of failures}}{\text{total operating time}}$$

This assumes an exponential (uniform failure rate) distribution. The Mean Time Between Failures is the inverse of the failure rate.

Mean Time Between Failures (MTBF):

$$MTBF = \frac{1}{\lambda}$$

Hazard Rate

The probability of failure in a unit time interval (t, t+1) is roughly equal to f(t), as shown in the following figure:

This probability depends on two things:
- The probability of survival until time t. Obviously, if the piece fails before time t, it will not fail in (t, t+dt). The survival probability is of course the reliability, R(t).
- The probability of failure in one unit time interval. This is called the hazard rate, h.

The probability of both of these mutually independent events happening is equal to their product. Therefore,

$$f(t) = h(t) \, R(t) \qquad \text{or} \qquad h(t) = \frac{f(t)}{R(t)}$$

The hazard rate can also be defined as the conditional probability of failure in a small time interval (t, t+dt). It is conditional on there being no failure until t.

For an exponential failure distribution, the hazard rate is constant:

$$h(t) = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda$$

Weibull Reliability

The hazard rate is not always constant. For example, assume the failure rate increases as 0.1t, where t is measured in days. It starts from zero and, at the end of a 30-day month, the hazard rate (or failure rate) is 3 failures per day. How do we generate the reliability function for this component?

In a sample of N, dN will fail in a time interval dt:

$$dN = -m N t \, dt$$

If it were dN = -mN dt, this would correspond to a uniform rate of failure. When dN = -mNt dt, the rate of failure increases with time (assuming m>0). We can then group the terms, integrate and get the number of units that would survive over time t. The expression for N(t) can then be found as

$$N(t) = N_0 \, e^{-m t^{2}/2}$$

where N_0 is the value of N at t=0. The surviving fraction is the survival probability or reliability, so the reliability function is

$$R(t) = \frac{N(t)}{N_0} = e^{-m t^{2}/2}$$

For the example above, m = 0.1, which gives R(t) = e^{-0.05 t^2} with t in days.

This reliability expression is different from the simple exponential distribution because of the exponent on t. Curves of this form are called Weibull distribution curves.

The general Weibull distribution curve is

$$R(t) = \exp\left[ -\left( \frac{t}{\eta} \right)^{\beta} \right]$$

where R is the probability of surviving through time t, beta is the shape factor and eta is the scale factor.

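Weibull Distribution Curves:

Cumulative Distribution Function (cdf)

$$F(t) = 1 - \exp\left[ -\left( \frac{t}{\eta} \right)^{\beta} \right]$$

Probability Distribution Function (pdf)

$$f(t) = \frac{\beta}{\eta} \left( \frac{t}{\eta} \right)^{\beta - 1} \exp\left[ -\left( \frac{t}{\eta} \right)^{\beta} \right]$$

Hazard Rate

$$h(t) = \frac{f(t)}{R(t)} = \frac{\beta}{\eta} \left( \frac{t}{\eta} \right)^{\beta - 1}$$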

The value of beta determines the shape of the Weibull curve. For example, beta=1 corresponds to a constant hazard rate or an exponential distribution. Beta>1 means that the hazard rate increases with increasing age. This is summarised in the left-hand figure below:

By using different values for the beta and eta factors, the Weibull distribution can be made to fit a wide range of failure data. The figure on the right-hand side above is a typical Weibull curve. For this component, the probability of survival drops below 90% after the first 1250 hours. In other words, the probability of survival through 1250 hours is 90%.
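The sketch below evaluates the Weibull reliability, c.d.f., p.d.f. and hazard rate, assuming the standard two-parameter form R(t) = exp[-(t/eta)^beta]. It then revisits the earlier example of a hazard rate growing as 0.1t per day: comparing exp(-m t^2 / 2) with the general form gives beta = 2 and eta = sqrt(2/m), a mapping worked out here rather than quoted from the notes.

```python
import math


def weibull_reliability(t, beta, eta):
    """Probability of surviving through time t."""
    return math.exp(-((t / eta) ** beta))


def weibull_cdf(t, beta, eta):
    """Cumulative probability of failure by time t."""
    return 1.0 - weibull_reliability(t, beta, eta)


def weibull_pdf(t, beta, eta):
    """Failure probability density at time t."""
    return (beta / eta) * (t / eta) ** (beta - 1) * weibull_reliability(t, beta, eta)


def weibull_hazard(t, beta, eta):
    """Hazard rate h(t) = f(t) / R(t)."""
    return (beta / eta) * (t / eta) ** (beta - 1)


# Linearly increasing hazard h(t) = m * t with m = 0.1 (t in days).
m = 0.1
beta = 2.0
eta = math.sqrt(2.0 / m)

for day in (1, 10, 30):
    h = weibull_hazard(day, beta, eta)
    r = weibull_reliability(day, beta, eta)
    print(f"day {day:2d}: hazard = {h:.2f} per day, reliability = {r:.3g}")
```

Setting beta = 1 in the same functions gives a hazard rate of 1/eta at every age, which is the exponential case.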

Bathtub Curve


The so-called Bathtub Curve would be found in almost every maintenance management book. It is partly popular because of its anthropomorphic qualities, with a superficial resemblance to a human life span.

Babies are vulnerable to many hazards. Some of the sources of hazard are congenital and the others are caused by the environment. This is the infant mortality region, similar to the learning and commissioning phase after the installation of complex machinery.

As the human child grows to an adult, he or she learns to cope with their congenital weaknesses and at the same time builds defence mechanisms to deal with environmental or external hazards. The hazard rate is fairly uniform for adults over a relatively long period. This is the flat portion on the bathtub curve and corresponds to the operating life of a complex machine.

Old age brings frailty and increasing hazards. This is the third part of the bathtub curve, where the hazard probability rapidly increases with increasing age. If you know where that period starts, you may want to consider retiring the machine at that point and replacing it with a new unit.

The bathtub curve concept is plausible and easy to understand. Most complex systems show a bathtub-like behaviour. Unfortunately, when it comes down to the individual parts and components, the bathtub curve has only limited application.

The curves on the right are the failure patterns observed on aircraft electronic components in a study completed in 1978 by Nowlan and Heap. They show that only 4% of the components go through a bathtub curve. Another 2% have a sudden death region and another 5% have a slowly increasing hazard rate with time. These charts tell us that most things do not fail through an age-related mechanism. This is contrary to laboratory tests because in laboratory tests the hazard rate usually increases with time. Most mechanical engineering components fail through a fatigue-related failure mechanism, which implies a time dependence.

Even though the Nowlan & Heap study was conducted for electronic components in the aircraft industry, their conclusions are almost universally accepted for all industries. Our own studies of longwall face equipment failures indicate that a uniform-hazard assumption or an exponential failure distribution is more useful than a Weibull-type curve to represent longwall stoppages that are caused by both mechanical and electrical component failures.

The main reason for the discrepancy between the lab tests and the field experience is that most components in the field fail through a variety of failure modes and their interaction cannot be clearly understood. The lack of understanding of the actual failure modes causes us to dump them all together and this tends to favour an exponential distribution.

Reliability of Multi-Component Systems

Series Systems

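A series system is a chain of components. When one of these parts fails, the entire system fails. The system survives only if every component survives, so (for independent failures) the system reliability is the product of the individual reliabilities:

$$R_{series} = R_1 \times R_2 \times \cdots \times R_n$$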

Parallel Systems

A parallel system fails only when every individual component fails. The system failure probability is then the product of the individual failure probabilities (1 - R):

$$F_{parallel} = (1 - R_1)(1 - R_2) \cdots (1 - R_n), \qquad R_{parallel} = 1 - F_{parallel}$$

Most mining machinery systems are series systems. In other words, the failure of one component fails the entire system. Redundancy in mining can be provided by having multiple systems, eg spare trucks or shovels.
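A short sketch makes the difference between series and parallel arrangements concrete. It assumes that component failures are independent; the component reliabilities in the example are hypothetical.

```python
from math import prod


def series_reliability(reliabilities):
    """A series system survives only if every component survives."""
    return prod(reliabilities)


def parallel_reliability(reliabilities):
    """A parallel system fails only if every component fails."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical machine: three components in series, each 90% reliable.
machine = series_reliability([0.9, 0.9, 0.9])
print(f"One machine (series of 3): {machine:.3f}")

# Redundancy through a spare: two such machines in parallel.
fleet = parallel_reliability([machine, machine])
print(f"Two machines in parallel:  {fleet:.3f}")
```

With these assumed values the chain of three components is noticeably less reliable than any single component, while a spare machine recovers most of the loss.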

MANAGING RELIABILITY

Optimum utilisation of its capital investment in equipment is essential for company profits. Equipment reliability plays a major role in this. Therefore, managing reliability is a core business for a mining company. This is a task for both production and maintenance engineers. In the rest of this module, we will focus on the maintenance function.

Maintenance Function

The maintenance function can be broadly separated into two parts: Preventive and Corrective.

The aim of preventive maintenance is to increase the reliability of the system by preventing failures from occurring. This can be done in a number of ways:

Servicing such as cleaning or lubrication

Inspection to find and correct incipient failures

Planned replacement of parts at fixed intervals

Corrective maintenance is the repair action that is taken when the system fails. The amount of corrective maintenance is governed by the system reliability. Very little time is spent on corrective maintenance with a system that has a high reliability.

The aim of the maintenance function is to help maximise asset utilisation. This is done by maximising the equipment availability. In the mining industry, the opportunity cost of lost production due to machine downtime is higher than any other cost associated with the maintenance function. Therefore, increasing machine availability is the paramount aim. Which maintenance strategy better serves this aim: preventive or corrective? An optimal mix of preventive and corrective maintenance needs to be found for each application to maximise equipment availability.

Corrective Maintenance

Through corrective maintenance we assume that the system is brought to an "as new" state. The reliability is not affected. The impact on the system availability is through the time it takes to repair (on average, this is referred to as MTTR or Mean Time To Repair). The two main KPIs in assessing Corrective Maintenance are
- the validity of the assumption that the corrective maintenance brings the component back to its "as new" state
- the duration of the time it takes to do the repair

The repair duration has the following components:
- Fault identification: What caused the failure? What needs to be repaired?
- Set-up time: Find and bring the right person to the job
- Actual repair
- Logistic delays: Waiting for the spare part
- Restart time: Time spent to bring the system back to normal operation after the fault is repaired

The actions to reduce the overall Mean Time To Repair (MTTR) follow logically from the above breakdown:
- Identify the failed components quickly. This is achieved by experienced operators and on-line fault detection tools.
- For frequent failures, have a repair crew with the right skills on standby.
- Ditto for the frequently failing spare parts.
- Design the equipment and the operating procedure to minimise restart time.

Preventive Maintenance

Preventive Maintenance is done to reduce or to eliminate the risk of failure. It is always an interruption to production. It is important that the preventive maintenance is effective in avoiding failure and that the cost of the failure avoided as a result of the preventive maintenance exceeds the cost of the preventive maintenance itself.

In other words, the cost of the failure (the production lost over the MTTR plus the cost of the repair and the replacement) should be greater than the cash cost of the planned maintenance action (salaries, consumables, etc) plus its opportunity cost (lost production).
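The comparison above can be written as a simple check. This is a deliberately simplified sketch: it assumes the planned action fully avoids the failure, values lost production at a constant margin per hour, and uses hypothetical dollar figures and function names throughout.

```python
def cost_of_failure(mttr_hours, margin_per_hour, repair_and_parts_cost):
    """Lost production over the repair time plus the repair and replacement cost."""
    return mttr_hours * margin_per_hour + repair_and_parts_cost


def cost_of_pm(pm_hours, margin_per_hour, pm_cash_cost):
    """Cash cost of the planned action plus the production lost while doing it."""
    return pm_cash_cost + pm_hours * margin_per_hour


def pm_pays_off(mttr_hours, repair_and_parts_cost, pm_hours, pm_cash_cost, margin_per_hour):
    """True when the avoided failure costs more than the preventive action."""
    avoided = cost_of_failure(mttr_hours, margin_per_hour, repair_and_parts_cost)
    spent = cost_of_pm(pm_hours, margin_per_hour, pm_cash_cost)
    return avoided > spent


# Hypothetical case: a 20 h breakdown versus a 4 h planned stop.
print(pm_pays_off(mttr_hours=20.0, repair_and_parts_cost=30_000.0,
                  pm_hours=4.0, pm_cash_cost=8_000.0,
                  margin_per_hour=5_000.0))
```

A fuller treatment would weight the avoided failure cost by the probability that the failure would actually have occurred within the maintenance interval.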

Preventive Maintenance can be performed at different levels, and at each level a decision has to be made whether the action is necessary or not.
- Service (necessary only if the service action has a significant effect on system reliability)
- Inspection (necessary only if there is sufficient time between potential and actual system failure; this time is referred to as the P-F time in maintenance literature)
- Periodic Replacements (necessary only if the part is in or about to enter a period where the age-induced failure rate is steeply increasing, eg the end portion of the bathtub curve)

P-F Time

If we can identify the onset of failure by inspecting the part and if there is enough time between this identification and the expected failure to schedule and implement a corrective maintenance task, then this is a successful inspection. The crucial parameter is the interval between the occurrence of a potential failure and its decay to an actual failure. This is called the P-F interval.

For this strategy to be effective, the inspection intervals should not exceed the P-F interval. This is practical only if the P-F interval is long enough. Otherwise, we would spend our time inspecting the machine without producing anything.

In other words, scheduled inspections will only help when
- The potential failure condition is clearly defined
- The P-F interval is consistent
- It is practical to inspect at intervals less than the P-F interval
- The P-F interval is long enough to implement corrective maintenance action

Periodic Replacements

Periodic part replacements are always a costly component of preventive maintenance. Therefore, they should be done only when they contribute to the system reliability at a level compensating for the cash and time cost of performing the replacement action.

Periodic (or scheduled) replacements help when
- The component breakdown has costly consequences (eg chain of failures, distance from the workshop, etc)
- The dominant failure mode is age-related, with the hazard rate consistently increasing above an acceptable value at around the set replacement period

The effect of scheduled replacement depends on the hazard rate trend:
- Decreasing Hazard Rate: Scheduled replacement increases failure probability
- Constant Hazard Rate: Scheduled replacement has no effect on failure probability
- Increasing Hazard Rate: Scheduled replacement decreases failure probability
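The three cases can be illustrated with a Weibull model, assuming the standard two-parameter reliability R(t) = exp[-(t/eta)^beta]. The sketch compares the probability that an aged part fails in the next operating window with the same probability for a freshly installed part. The age, window and scale factor are hypothetical.

```python
import math


def weibull_reliability(t, beta, eta):
    """Probability of surviving through time t."""
    return math.exp(-((t / eta) ** beta))


def failure_prob_next(age, window, beta, eta):
    """Probability of failing within the next window, given survival to the current age."""
    survive_now = weibull_reliability(age, beta, eta)
    survive_later = weibull_reliability(age + window, beta, eta)
    return 1.0 - survive_later / survive_now


eta = 1000.0     # hypothetical scale factor, hours
age = 1000.0     # age of the part currently installed, hours
window = 100.0   # length of the next operating period, hours

for beta, label in ((0.5, "decreasing"), (1.0, "constant"), (2.0, "increasing")):
    keep = failure_prob_next(age, window, beta, eta)
    replace = failure_prob_next(0.0, window, beta, eta)
    print(f"{label:>10} hazard: keep old part {keep:.3f}, fit new part {replace:.3f}")
```

Only in the increasing-hazard case does the new part have a lower chance of failing in the next window than the part it replaces.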



Reliability Data Analysis

A typical mine site today would have archives of past data on maintenance histories, equipment availability, and production delays. The quality of the data in terms of identifying the root causes of equipment failure and developing accurate reliability statistics is usually questionable. Nevertheless, it is a good place to start for anybody wanting to achieve a significant improvement in equipment utilisation.

It is usually necessary to process 100,000+ lines from multiple sources of sometimes questionable accuracy.

The Pareto Principle says that in any list of items there are a "significant few" and the remainder are the "insignificant many". In the context of mining equipment maintenance, a large part of the failures are due to a small number of causes. A Pareto plot helps to identify the most significant causes. The biggest benefit is gained by maintenance action addressing only the significant issues.
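A Pareto ranking is straightforward to produce once downtime has been totalled by cause. The sketch below uses a made-up downtime log, and the 80% cut-off for the "significant few" is a common rule of thumb assumed here, not a figure from these notes.

```python
def pareto_table(downtime_by_cause):
    """Rank causes by downtime and attach the cumulative percentage of the total."""
    total = sum(downtime_by_cause.values())
    ranked = sorted(downtime_by_cause.items(), key=lambda item: item[1], reverse=True)
    rows = []
    cumulative = 0.0
    for cause, hours in ranked:
        cumulative += hours
        rows.append((cause, hours, 100.0 * cumulative / total))
    return rows


def significant_few(downtime_by_cause, threshold_pct=80.0):
    """Smallest set of top-ranked causes that reaches the threshold share of downtime."""
    selected = []
    for cause, _hours, cumulative_pct in pareto_table(downtime_by_cause):
        selected.append(cause)
        if cumulative_pct >= threshold_pct:
            break
    return selected


# Hypothetical downtime log, hours per cause.
downtime = {
    "hoist rope": 400,
    "electrical trip": 250,
    "hydraulic hose": 150,
    "lubrication": 80,
    "operator error": 70,
    "other": 50,
}

for cause, hours, cumulative_pct in pareto_table(downtime):
    print(f"{cause:<16} {hours:>5} h  {cumulative_pct:5.1f}%")
print("Significant few:", significant_few(downtime))
```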

Another way of representing past failure data is by way of Scatter Plots. A scatter plot is a logarithmic plot of MTTR against the number of failures N for each failure cause. Since the total downtime associated with each cause is N x MTTR, curves of constant downtime appear as straight lines on logarithmic axes (log MTTR = log Downtime - log N).
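The straight-line property of constant-downtime curves can be verified numerically. The sketch below generates points that all share the same total downtime and shows that their log-log coordinates always add up to the same constant, ie they lie on a line of slope -1. The downtime value and failure counts are hypothetical.

```python
import math


def mttr_for_downtime(total_downtime_hours, n_failures):
    """MTTR that gives the stated total downtime for a given number of failures."""
    return total_downtime_hours / n_failures


def log_log_point(n_failures, mttr_hours):
    """Coordinates of one failure cause on the logarithmic scatter plot."""
    return math.log10(n_failures), math.log10(mttr_hours)


total_downtime = 100.0  # hours, hypothetical
for n in (1, 10, 100):
    mttr = mttr_for_downtime(total_downtime, n)
    x, y = log_log_point(n, mttr)
    print(f"N = {n:3d}, MTTR = {mttr:6.1f} h -> log N = {x:.1f}, log MTTR = {y:.1f}, sum = {x + y:.1f}")
```

Causes plotted further from the origin along the direction perpendicular to these lines contribute more total downtime, whether through many short stoppages or a few long ones.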



Reliability Analysis

Pareto Analysis and Scatter Plots are good tools to identify the reliability sinks in the equipment. The next step is to calculate the failure probability distribution curves for all critical components. The MTTR statistics may also be required if MTTR is not reasonably constant for each item. This step requires high quality data.

One needs a large enough data set to have at least 4-5 failure events for each target failure mode. This data set should cover a sufficiently long time period to eliminate local and temporary effects. Uniform operating conditions need to apply over this period.

At the same time, the data should be free of collector's bias. For example, if the attributes of the collected data have an influence on the assessment of the data-collecting staff, it should not be surprising if there is a bias favourable to the assessment KPIs.

In this module, we will limit ourselves to the estimation of the exponential failure distribution. The reliability in an exponential distribution is expressed as

$$R(t) = e^{-\lambda t}$$

where lambda is the uniform failure rate (or the inverse of the Mean Time Between Failures or MTBF). Let us see how we can develop an estimate for lambda using a real example:

Example 1

Suppose that we have the failure log for a component as 180, 216, 930, 990, 1300 and 1850 hours. Estimate the MTBF assuming an exponential probability distribution.

Solution

Since we have a record of failures, it is best to use the cumulative failure distribution function. For an assumption of exponential distribution of failures, the cumulative distribution function is

$$F(t) = 1 - e^{-\lambda t}$$

We have records of six failures. The following plots them on a time axis

Assuming that after each failure, the system is brought back to "as new" condition, the time periods between failures can be treated as examples of different systems surviving for different periods of time. Let us calculate the Time-Between-Failure data from the above history:

TBF = 36, 714, 60, 310 and 550 hours.

Therefore, the MTBF is the average of the above (= 334 hours) and the reliability function describing the above failure history is

$$R(t) = e^{-t/334}$$

This is plotted in the following chart. The vertical lines represent the actual failure data.
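The arithmetic of Example 1 can be reproduced in a few lines. The sketch follows the same approach as the solution (differences between consecutive failure times, then their average); the variable and function names are ours.

```python
import math

failure_times_h = [180, 216, 930, 990, 1300, 1850]

# Time between consecutive failures, assuming "as new" condition after each repair.
tbf_h = [later - earlier for earlier, later in zip(failure_times_h, failure_times_h[1:])]

mtbf_h = sum(tbf_h) / len(tbf_h)   # mean time between failures
failure_rate = 1.0 / mtbf_h        # lambda, failures per hour


def reliability(t_hours):
    """Exponential reliability R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate * t_hours)


print("TBF (h):", tbf_h)
print(f"MTBF = {mtbf_h:.0f} h, lambda = {failure_rate:.5f} per hour")
print(f"R(100 h) = {reliability(100):.3f}")
```

Note that six recorded failures give only five time-between-failure values, because the period before the first failure at 180 hours is not used in the solution.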



Example 2

How can you decide if the exponential distribution was the correct assumption for the failure data analysed in Example 1?

Solution

Let us look at the TBF data again: TBF = 36, 714, 60, 310 and 550. Let us sort and tabulate the data:

TBF, h
36
60
310
550
714

We can roughly say that 20% of the failures occur before the first 36 hours, 40% occur before 60 hours, 60% occur before 310 hours, etc. This leads us to create the following table:

TBF, h    Rough estimate for the c.d.f.
36        20%
60        40%
310       60%
550       80%
714       100%

A better estimate is probably to say that 10% of the failures occur before 36 hours and another 10% occur after 36 but before 48 hours (the mid-point between 36 and 60). With similar reasoning for the other rows, a better estimate is formed as

TBF, h    Rough estimate for the c.d.f.    Mid-point estimate for the c.d.f.
36        20%                              10%
60        40%                              30%
310       60%                              50%
550       80%                              70%
714       100%                             90%

Now we can superimpose the estimated c.d.f. values from our actual data onto the exponential distribution curve.

The exponential c.d.f., $F(t) = 1 - e^{-t/334}$, is plotted against time (hours) in the chart on the left-hand side. The discrete data points correspond to the rough and mid-point estimates for the cumulative distribution function as given in the tables above. This does not look like a good fit.

People usually plot $\ln[1/(1 - F(t))]$ against time because for an exponential distribution this gives a straight line. It is left as an exercise to explain why this is so. The above figure plots $\ln[1/(1 - F(t))]$ against time. It is clearer here that the exponential fit is not a very bad assumption for this data set.
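The estimates used in Example 2 can be generated directly from the sorted TBF data. The sketch below builds the rough and mid-point c.d.f. estimates and then applies the transformation -ln(1 - F), which is linear in time for an exponential distribution. The least-squares line through the origin is our own addition, included only as one simple way to put a number on the fit.

```python
import math

tbf_sorted_h = sorted([36, 714, 60, 310, 550])
n = len(tbf_sorted_h)

# Rough estimate: i-th failure out of n. Mid-point estimate: shifted down by half a step.
rough_cdf = [(i + 1) / n for i in range(n)]
midpoint_cdf = [(i + 0.5) / n for i in range(n)]

# For an exponential distribution, -ln(1 - F(t)) = lambda * t is a straight line.
# Only the mid-point values are used, because the rough estimate reaches F = 1.
linearised = [-math.log(1.0 - f) for f in midpoint_cdf]

# Least-squares slope of a line through the origin (an assumed fitting choice).
slope = (sum(t * y for t, y in zip(tbf_sorted_h, linearised))
         / sum(t * t for t in tbf_sorted_h))
mtbf_from_fit_h = 1.0 / slope

for t, rough, mid, y in zip(tbf_sorted_h, rough_cdf, midpoint_cdf, linearised):
    print(f"{t:4d} h  rough {rough:4.0%}  mid-point {mid:4.0%}  -ln(1-F) = {y:.3f}")
print(f"MTBF implied by the fitted slope: {mtbf_from_fit_h:.0f} h")
```

The MTBF implied by the fitted slope can then be compared with the 334 hours obtained by simple averaging in Example 1.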
