-
Azim Houshyar, January 2011 Section 1 - page 1
SECTION 1 PRINCIPLES, CONCEPTS & DEFINITIONS OF
RELIABILITY
1.1 Introduction to Reliability and Maintainability
Reliability and Maintainability (R&M) are vital
characteristics of products & manufacturing machinery and
equipment that enable U.S. manufacturers to be world class
competitors.
Reliability considerations play an increasing role in virtually
all engineering disciplines. As the demand for systems that
perform better and cost less increases, there is a great need to
minimize the probability of failures, whether the failures simply
increase costs and inconvenience or gravely threaten public
safety.
In the broad sense, reliability is associated with
dependability, with successful operation, and with the absence of
break-downs or failures.
From the product point of view, the customer relies on a product that
performs its intended function without failure. From the
manufacturing point of view, efficient production planning depends
on a process that yields high quality parts at a specific rate
without interruption. Predictable reliability and maintainability
of the manufacturing machinery and equipment are key ingredients in
maintaining production efficiency and the effective deployment of
Just-in-Time principles. In both cases, improved reliability and
maintainability of a product/equipment lead to the lower total life
cycle costs that are necessary to maintain customer
satisfaction and a competitive edge.
This document provides the methodology to achieve these
objectives by providing R&M techniques and guidance on where to
apply them in the up-front design and development of a new
product/equipment. It also provides the methodology for continuous
improvement of a design and/or machinery after installation.
Implementation of the R&M concepts described in this
notebook will help increase product reliability and/or
equipment availability and reduce overall operational and
maintenance costs.
1.2 Basic Definitions of Reliability & Maintainability
Reliability is the probability that a product (equipment) can
perform continuously, without failure, for a specified interval of
time when operating under stated conditions. Increased reliability
implies fewer equipment failures and consequently less downtime
and loss of production.
Maintainability is a characteristic of design, installation and
operation, usually expressed as the probability that a machine can
be restored to specified operable condition (returned to a
serviceable state) within a specified interval of time when
maintenance action is performed in accordance with prescribed
procedures and resources.
Benefits of R&M
Highly reliable and maintainable production machinery offers the
means for producing consistently high quality products at lower
costs and at higher output levels. Successful application of
R&M techniques has a very positive effect on employee morale
and pride since the reduction in downtime also results in
significant reduction in employee stress and frustration.
Table 1-1. R&M User/Supplier Benefits
User Benefits                                          Supplier Benefits
Higher machinery & equipment availability              Reduced warranty costs
Unscheduled downtime reduced/eliminated                Reduced build costs
Reduced maintenance costs                              Reduced design costs
Stabilized work schedule                               Improved customer relations
Improved J-I-T performance capability                  Higher customer satisfaction
Improved profitability                                 Increased understanding of productions
Increased employee satisfaction                        Increased sales volume
Lower overall cost of production                       Increased employee satisfaction
Higher quality parts and product                       Improved status in the marketplace
Less need for in-process inventory to cover downtime   A competitive edge in the marketplace
Reduced Life Cycle Cost
Life Cycle Cost (LCC) refers to the total cost of a system during
its operational life. LCC is the sum of non-recurring costs plus
operation and support costs. Operation and support costs typically
consume about 50% of the total LCC.
Figure 1-1. Total Life Cycle Cost
[Figure: share of total LCC by stage. Conception stage: 3%;
Development: 12%; Machine build: 35%; Operation and support: 50%.]
Emphasizing R&M practices during the conception and
development stages can lower the total LCC. By using R&M to
minimize stress (electrical, mechanical, etc.), the equipment will
be less prone to failure during operation. This results in a
decrease of the operation and support costs that account for the
bulk of total LCC.
A slight increase in spending to incorporate R&M practices
during the conception and design stages can dramatically lower the
operation and support costs.
It is important to consider R&M at an early stage of a
program. Studies have shown that as much as 95% of LCC is
determined during the conceptual and development stages. Once a new
product (equipment) has reached the build stage, therefore, only
about 5% of the opportunity remains to effectively improve the
reliability or maintainability of the product (equipment).
Examples of LCC Improvement: Intel Corporation is engaged in the
design and manufacture of solid-state devices. Intel has developed
and is implementing a corporate strategy that addresses the subject
of reliability and maintainability in an aggressive, committed
manner.
In portions of its assembly operation, Intel has improved the
Mean Time Between Adjustments (MTBA) from 5 minutes to 16 minutes.
This improvement makes it possible for one operator to run eight
machines rather than four, a doubling of operator productivity. In
addition, process yields have been improved due to the elimination
of scrap that resulted from the more frequent shutdowns.
Intel's R&M program was also responsible for improving the
Mean Time Between Failures (MTBF) from 10 hours to 250 hours on its
solid-state component wire bonding machines. This improvement had
the same effect as adding 30% capacity to the existing machine
base. Another benefit of this improved reliability lies in the fact
that Intel was able to reassign the three line technicians who
previously served as "baby-sitters" to more productive work.
Your Example of LCC Improvement: Choose a product (equipment)
that you are familiar with. State your approach to improving the
life cycle cost for the product (equipment). What additional
resources (time, money, technology, labor,...) are needed, and what
are the foreseeable benefits?
Example of Life Cycle Cost
Equipment Name:
Estimated Initial Cost:
Estimated Life:
Estimated Annual Operational Cost:
Current Status:
Recommended Modifications:
What do we mean when we say we have a Reliable Product?
Well, we may think of a dependable, trustworthy product, but can
these descriptions be quantified?
Can you predict the exact time when a given product will fail?
Well, even though you probably can't say the exact time of the
failure of a product, you can estimate the percentage of products
that will fail by a given time.
Reliability can be stated in different forms. For instance:
1) The reliability that a product (equipment) will be performing its
   intended function after 1,000 hours of use is 0.80.
2) The reliability at 1,000 hours is 0.80, or the reliability is 80%.
3) If we place 100 units of this product (equipment) in use, then on
   average 80 of them will still be operating (with no failure) at
   1,000 hours.
4) The reliability at any later time (say 1,500 hours) is lower.
Remember that the reliability of a product (equipment) should
not be stated as simply 0.8, since no time is specified. It is
equally ambiguous for a product (equipment) to have a 1,000-hour
life without indicating a reliability for that time. Instead it
should be stated that the 1,000-hour reliability of the product
(equipment) is 0.8.
Question: Looking at the figure, state your findings regarding
the relationship between reliability and time. Which of the two
curves represents a more reliable system? Why?
Response:
[Figure: Reliability Functions. Two curves, labeled 1 and 2.
Vertical axis: Rel [%], from 0 to 100. Horizontal axis: time to
failure, from 0 to 10,000.]
In the definition of reliability, three phrases were used. Those
phrases were: 1) Perform intended functions satisfactorily; 2) For
the specified period of time; and 3) Under specified
conditions.
What do we mean by "Perform Intended Function Satisfactorily"?
To understand this phrase better, let's define Failure.
FAILURE: An event in which machinery/equipment is not available to
produce parts at specified conditions when scheduled, or is not
capable of producing parts or performing scheduled operations to
specification. For every failure, an action is required.
Unsatisfactory performance is subject to interpretation.
Therefore it must be clearly defined at the time of the contract.
There will be various levels of failure based on the customer's
level of severity for incidents on the manufacturing equipment.
What do we mean by "Specified Time Period"?
Products deteriorate with use and even with age when dormant.
This is especially true for wood products. Longer lengths of usage
imply higher chance of failure and hence lower reliability.
For design purposes, target usage periods must be identified.
Typically identified usage periods are:
- The warranty period;
- Durability life, a measure of useful life that defines the number
  of operating hours until overhaul is expected or required.
What do we mean by "Specified Conditions"?
Products react to the environment in which they are placed.
Different environments promote different failure modes and
different failure rates for a product. Therefore the environmental
factors that the product will encounter must be clearly
defined.
Environmental factors such as: Temperature, Humidity, Vibration,
Mechanical shock, Immersion/splash, Pressure/vacuum, Contamination,
Electrical noise, Electromagnetic fields, Corrosive materials,...,
must be addressed during the design stages of the equipment. These
environmental conditions must be thoroughly documented.
1.3 Association between Quality and Reliability
Lamberson lists quality characteristics as: Psychological
(taste, beauty, style, status); Technological (hardness, vibration,
noise, materials); Time oriented (reliability and maintainability);
Contractual (warranty); and Ethical (honesty of repairman,
experience of sales force).
Quality is referred to as fitness for use. This comprises all
phases of the life cycle of the product including engineering,
manufacturing, marketing and maintenance. This must be addressed
from the customers' standpoint. Company-wide quality control is a
philosophy that focuses on meeting customer needs and expectations
throughout the life cycle of the product while continuously
improving the production process.
Quality Defects are defined as those which can be located by
conventional inspection techniques.
Reliability Defects are defined as those which require some
stress applied over time to develop into detectable defects.
Performance and Reliability: Engineering is concerned with
designing and building products for improved performance. This
requires the designs to incorporate features that may tend to be
less reliable than the older systems with lower performances.
The trade-offs between performance and reliability are often
subtle. Thus any product with both improved performance and
improved reliability is a significant advance.
We usually improve performance through increased loading. For
example:
- Decreasing the weight of an aircraft increases the stress level
  in its structure.
- Increasing the operating temperature to gain thermodynamic
  efficiency causes more rapid corrosion of materials.
Approaching the physical limits of a system in this way increases
the number of failures.
Specifications for purer materials, tighter dimensional
tolerances, and so on are required to reduce uncertainty in the
performance limits, and thereby permit one to operate close to
these limits without increasing the probability of exceeding
them.
The performance of a system is often increased at the expense of
increased complexity. This again decreases reliability unless
compensating measures are taken.
Probably the greatest improvements in performance come from the
introduction of new materials or devices to achieve a particular
goal: replacement of wood by metal, replacement of piston engines
with jet aircraft engines, and replacement of vacuum tubes with
solid-state electronics.
Notes: Even with major advances in technology, reliability may
be a severe problem, particularly during the early stages of
introducing a new technological advance.
At any stage of technological development, trade-offs must be
made between: Reliability and performance, Reliability and
cost.
Ex: Race car: Performance is improving, but reliability remains
below 50%. Here performance is everything, and one must tolerate a
high probability of break-down if there is to be any chance of
winning the race.
Ex: Military aircraft: An intermediate example in which
reliability and performance are balanced.
Ex: Commercial airliner: In this case, reliability is the
overriding design consideration. Thus degraded speed, payload, and
fuel economy are accepted to maintain a very small probability of
catastrophic failure.
1.4 Definition of Reliability Measures
In this section, we will define: Repairable and non-repairable
units; Mean Time Between Failures (MTBF); Mean Time To Failure
(MTTF); Failure rate; Mean Time To Repair (MTTR); Reliability,
Maintainability, and Availability.
Items/components/subsystems/systems can be classified as
repairable or non-repairable. Whenever we use MTBF, we are
referring to repairable entities, whereas MTTF is used for
non-repairable entities.
What are some indicators used to Quantify Product
Reliability?
Mean Time Between Failures (MTBF): The average time between
failure occurrences. The sum of the operating time of a machine
divided by the total number of failures.
Mean Cycle Between Failure (MCBF): The average cycles between
failure occurrences. The sum of the operating cycles of a machine
divided by the total number of failures.
Failure Rate: Number of failures per unit of gross operating
period in terms of time, events, cycles, or number of parts.
Reliability: R(t) indicates reliability at time t, where t is
the duration of failure-free operation of the equipment.
MTBF = (Operating time)/(Total number of equipment failures)
Failure Rate = (Total number of equipment failures)/(Operating
time)
1. MTBF=1,000 hours means that, on the average, a failure will
occur with every 1,000 hours of usage.
2. A failure rate of 1 failure per 1000 hours (= 0.001/hr.)
means that, on the average, one failure will occur with every 1,000
hours of usage.
3. R(t=1000 hr.) = 0.8 means that the probability of 1000 hours
of failure-free performance is 80%.
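The two ratios defined above can be computed directly from an equipment log. The following is a minimal sketch; the function names and the sample figures (5 failures in 5,000 operating hours) are hypothetical and chosen only to reproduce the MTBF of 1,000 hours used in the interpretations above.

```python
def mtbf(operating_time_hr: float, failures: int) -> float:
    """MTBF = (operating time) / (total number of equipment failures)."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to estimate MTBF")
    return operating_time_hr / failures


def failure_rate(operating_time_hr: float, failures: int) -> float:
    """Failure rate = (total number of equipment failures) / (operating time)."""
    if operating_time_hr <= 0:
        raise ValueError("operating time must be positive")
    return failures / operating_time_hr


# Hypothetical log: 5 failures observed over 5,000 operating hours.
hours, n_failures = 5000.0, 5
print(mtbf(hours, n_failures))          # 1000.0 hours between failures
print(failure_rate(hours, n_failures))  # 0.001 failures per hour
```

Note that the two results are reciprocals of each other, since both are built from the same two quantities.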
What is the relationship between Reliability Numbers?
The relationship between the Failure Rate and MTBF is:
MTBF = 1/Failure Rate
Therefore a failure rate of 0.001/hr implies a MTBF of 1000
hours.
Assuming that the time to failure of the equipment is
exponentially distributed, we can use the following equation to
calculate the reliability of a product or machinery at a specified
time t:
  R(t) = e^{-t/MTBF},  t > 0
where t = time over which the machine is to be operated without
failure, and e = the base of the natural logarithm (approximately
2.718).
For example, the one-shift (8-hour) reliability of a machine with
an MTBF of 1,000 hours is:
  R(t = 8 hr) = e^{-8/1000} = 0.992  =>  R8 = 99.2%
There is a 99.2% chance of running the equipment for 8 hours
without encountering a failure. The same equipment has only a 90.5%
chance of running for 100 hours without encountering a failure.
I recommend selection of an agreeable time frame over which
reliability is to be sustained. An example might be the 8-hour
Reliability, denoted by R8, which represents the probability that
the machine will not fail during an 8-hr shift.
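The exponential reliability calculation can be sketched in a few lines. This assumes, as the text does, exponentially distributed times to failure (a constant failure rate); the function name `reliability` is an illustrative choice, not part of the notes.

```python
import math


def reliability(t_hr: float, mtbf_hr: float) -> float:
    """Probability of t_hr hours of failure-free operation.

    Assumes exponentially distributed times to failure, i.e.
    R(t) = exp(-t / MTBF).
    """
    if t_hr < 0 or mtbf_hr <= 0:
        raise ValueError("t must be non-negative and MTBF must be positive")
    return math.exp(-t_hr / mtbf_hr)


# Machine with an MTBF of 1,000 hours, evaluated for one shift and for
# an operating period equal to the MTBF itself.
for t in (8, 1000):
    print(f"R({t} hr) = {reliability(t, 1000.0):.3f}")
# R(8 hr) = 0.992
# R(1000 hr) = 0.368
```

The second line of output is worth noting: under this model, only about 37% of units survive an operating period equal to the MTBF.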
Example 1: The failure rate of a component is 0.001 hr^-1.
a) Find the MTBF.
b) Find the R8.
c) What is the probability that the component will not fail in
   one month of continuous operation?
Example 2: Given the reliability function R(t) = e^{-t/1000}, where
t is time to failure in hrs.
a) Find the 100-hour reliability.
b) Find the 1,000-hour reliability.
c) If 1,000 devices are placed in operation, how many will still
   be operating at 100 hrs?
Example 3: A machine has an MTBF of 50 hours.
a) Find the failure rate.
b) Find the one-shift reliability.
c) Find the three-shift reliability.
d) In 100 hours of operational time, how many failures would you
   expect?
What is the Relationship Between MTBF of a System and MTBFs of
its Components? Most systems consist of several subsystems.
Occasionally we need to combine MTBFs from different subsystems to
calculate the MTBF for the main system. An example is to analyze a
design in which we may have data on the MTBFs of the different
subsystems used in the new design.
Example 4: Consider a system in which one subsystem has an MTBF
of 25 hours. On the average, in 100 hours of uptime, there will be
4 failures. Using the relationship between MTBF, number of
failures, and operating time, it is seen that:
  MTBF = (Uptime)/(Total Number of Failures)
  MTBF_1 = 100/4 = 25 hrs.
Now consider adding a second subsystem with an MTBF of 20 hours
to the previous system. This subsystem is expected to have 5
failures in 100 hours of uptime:
  MTBF_2 = 100/5 = 20 hrs.
How do we combine the MTBFs to obtain the MTBF for the main
system?
We can expect 4 + 5 = 9 failures in 100 hours of operation,
therefore:
  MTBF_S = 100/9 ≈ 11.1 hrs.
That is, the system fails more often than either of its subsystems.
Can you figure out the rule? The rule is to add the failure rates:
  λ_S = λ_1 + λ_2
or equivalently:
  1/MTBF_S = 1/MTBF_1 + 1/MTBF_2
that is:
  1/MTBF_S = 1/25 + 1/20 = 0.09 hr^-1  =>  MTBF_S ≈ 11.1 hrs.
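The rule for combining subsystem MTBFs can be sketched as follows. The sketch assumes that the failure of any one subsystem counts as a system failure and that failure rates are constant, which is the situation Example 4 describes; the function name `system_mtbf` is illustrative.

```python
def system_mtbf(subsystem_mtbfs):
    """Combine subsystem MTBFs by adding their failure rates.

    Assumes every subsystem failure is a system failure and that each
    subsystem has a constant failure rate equal to 1/MTBF.
    """
    subsystem_mtbfs = list(subsystem_mtbfs)
    if not subsystem_mtbfs or any(m <= 0 for m in subsystem_mtbfs):
        raise ValueError("need at least one positive subsystem MTBF")
    system_rate = sum(1.0 / m for m in subsystem_mtbfs)  # failures per hour
    return 1.0 / system_rate


# Example 4: subsystems with MTBFs of 25 and 20 hours.
print(round(system_mtbf([25.0, 20.0]), 1))  # 11.1
```

Because failure rates only add, the system MTBF computed this way is always smaller than the smallest subsystem MTBF.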
Example 5: Consider a press which consists of the following five
subsystems:
Subsystem          MTBF (hrs)   Failure rate λ (hr^-1)
Crown Assemblies   50,000       2 × 10^-5
Slide Assemblies   20,000       5 × 10^-5
Gibing             200,000      5 × 10^-6 = 0.5 × 10^-5
Columns            10,000       1 × 10^-4 = 10 × 10^-5
Beds               10,000       1 × 10^-4 = 10 × 10^-5
a) Calculate the MTBF for the press.
b) What is the 8-hour reliability of the press?
Example 6: Consider a work-station for which the subsystem
failure rates are:
Subsystem               λ (1/hr.)   MTBF (hrs)
Load/unload mechanism   0.00003     33,333
Mechanical actuator     0.00001     100,000
Electronics             0.000005    200,000
Hydraulics              0.0004      2,500
a) Calculate the MTBF for the work-station.
b) What is the 4-hour reliability of the work-station?
MTBF and Failure Rate are two related measures of the
Reliability of the equipment or product. The next question is: How
do we measure the Maintainability?
Maintainability is a characteristic of design, installation and
operation, usually expressed as the probability that a machine can
be retained, or restored to, specified operable condition (returned
to a serviceable state) within a specified interval of time when
maintenance is performed in accordance with prescribed
procedures.
In what follows, some Maintainability Improvement Strategies are
discussed.
1.5 Maintainability Improvement Strategies
Safety: Safety engineering must be introduced at the design
stage, not after the equipment is built. Safety personnel must be
consulted up front to fully utilize the best technology available
in a safe and ergonomically efficient manner. A properly designed
operator's environment will not only reduce the risk of injury,
it will also avoid exposure to health risks or activities likely to
cause repetitive motion disorders. Pinch-point guarding, safety
labels, personnel guards, warning devices, lock-outs and other
appropriate safety measures must be integrated into the design.
Safety requirements must be included in the specifications.
Applicable safety standards must be adhered to.
Accessibility: Accessibility means having sufficient working
space around a component to diagnose, troubleshoot and complete
maintenance activities safely and effectively. Provision must be
made for movement of necessary tools and equipment with
consideration for human ergonomic limitations.
Operators, maintenance and service personnel have the best
knowledge of how the repair job will be done and of the problems
involved. Therefore, they should be involved in evaluating the
design for accessibility.
Common Building Practices: Due to the relevance of the
connections, piping runs, wiring and plumbing to the performance of
the equipment, there should be a more rigorous approach to the
practice. At a very minimum, machine suppliers should develop a
well-documented practices manual and advise their assemblers to
adhere to its content.
Diagnostics: Diagnostic devices indicating the status of
equipment should be built into manufacturing machinery to aid
maintainability support processes. Use of electronic light emitting
diodes to indicate fault status can be helpful. The diagnostics can
be as simple as a visual display indicating the equipment's status
as a go/no-go condition, or as sophisticated as a knowledge-based
expert system with the capability of analyzing a problem and
recommending the most likely solution.
Diagnostic systems should have the capability of storing
equipment performance data as permanent records for reliability
analysis and supplier feedback supporting the reliability growth
management process. Output from diagnostic systems should be in a
compatible format with commercially available data base management
software.
When component assemblies and subsystems are used to create a
manufacturing system, hardware and software "hooks" should be put
in place in the concept and design phase to facilitate integration
of the diagnostics system in the build phase. Diagnostic systems
should indicate the specific component to replace or repair.
Captive Hardware and Quick Attach/Detach: Captive and quick
attach/detach hardware provides for rapid and easy replacement of
components, panels, brackets and chassis. The environment in which
these devices are used may restrict the type of device used. Spare
parts and replaceable subassemblies should also be configured with
these devices preassembled. Examples are:
- plate, anchor and caged nuts
- push and snap-in fasteners
- clinch and self-clinching nuts
- quarter-turn fasteners
Modularity: Modularity requires that designs be divided into
physically and functionally distinct units to facilitate removal
and replacement. It allows design of components as removable and
replaceable units for an enhanced design with minimum downtime.
Modular design concepts typically are thought of in terms of
electrical black boxes, printed circuit boards and other quick
attach/detach electrical components. These concepts are also
applicable to the mechanical elements of production equipment.
Advantages of modularity are:
- New designs can be simplified and design time can be shortened
  by making use of standard, previously developed building blocks.
- The need for specialized technical skill will be reduced.
- Training of plant maintenance personnel is easier.
- Engineering changes can be made quickly with fewer side effects.
Maintenance Procedures: Maintenance procedures must describe in
detail the adjustment, replacement and repair of machine systems,
subsystems and component parts. The original equipment manufacturer
will provide recommended preventive maintenance procedures at
intervals based on time and/or machine cycle count. Maintenance
requirements should be prioritized to enable the equipment user to
prioritize maintenance scheduling related to the criticality of the
activity.
The maintenance procedures should be contained in service
manuals or a computerized data-base reflecting the specific content
and configuration of the equipment being supported. Exploded view
illustrations, photographs, simplified assembly drawings and/or
parts lists relating to the required maintenance activities and
procedures should be included wherever applicable. Technical
information such as pressure settings, operational sequences and
moving part clearances should be included as appropriate.
Visual Management Techniques: Visual Management techniques
differ for varying types of equipment. A team effort must exist
between supplier and user to deliver the best techniques to the
user. All the team members should review these on a continual basis
at concept/design, machine build and on the manufacturing floor.
Visual management techniques are used on machinery and equipment to
bring the workplace awareness to a level that allows problems and
abnormal conditions to be quickly recognized at a single glance.
Through visual management, a system is created that enhances the
equipment inspection process by allowing quick identification of
safety, quality, environmental, equipment and process
abnormalities. Typical visual management techniques include:
- Match marking of all fasteners (nuts, bolts, screws) fixed,
  adjustable or critical
- Match marking of all control adjustments (pressure, flow,
  temperature, speed, level, voltage, current, etc.)
- The identification of normal operating ranges and levels
- Direction of flow and product color coding on piping and hoses
- Direction of rotation on (drives, belts, chains, motors, etc.)
- Function labels on (switches, valves, buttons, lights, etc.)
- Identification labels on (cabinets, panels, boxes, etc.)
- Filters (lube, hydraulic and air) that indicate when dirty
- Filters labeled with replacement filter element number
- Belt and chain drives with guarding that permit quick visual
  inspection and access
- Replacement belt or chain number labels on guarding
- Each lube point labeled with product number and color code
- Temperature sensitive labels on all critical components (motors,
  drives, controls, hydraulic units, etc.)
- Equipment layout with all electrical control panel safety lockout
  points indicated (affixed to the main electrical control panel)
- Equipment layout with all lubrication fill points, frequencies
  and product codes indicated (affixed to the main electrical
  control panel)
- The identification of all control drawing numbers on the main
  electrical control panel
- Signals or alarms that indicate a major abnormality, safety
  interlock tripped, process out of control, etc.
- Equipment and process operator inspection list (affixed to the
  main electrical control panel)
Spare Parts Management: Maintenance of manufacturing equipment
and machinery requires a readily available supply of spare parts
and supporting materials to operate, maintain and service the
equipment. Spare parts management will identify and make available
the required quantities of spare parts at an optimum inventory cost
to the equipment user.
Plans for equipment support through spare parts management
should begin during the equipment design phase and continue through
the life cycle of the equipment. Consideration should be given to
the lead time required to requisition, manufacture and receive into
inventory the required parts and/or materials to avoid excessive
costs to procure replacement parts on an emergency basis.
The machinery and equipment manufacturer should make a
recommended spare parts list available to the equipment user. Parts
may be provided from previously purchased inventory (commercial
parts and supplies) or purchased specifically for the subject
equipment and maintained in inventory for its use. Sourcing of
spare and/or replacement parts, including consumable materials,
should be managed to ensure that the performance and capability of
the manufacturing machinery and equipment is maintained at or above
the original manufacturer's specifications.
Other strategies for maintainability improvement include
standardization on component parts that are commercial
standard, readily available, and common from machine to machine,
and color coding, which can help to speed up maintenance
procedures.
Now that we have reviewed a few of the procedures for
maintainability improvement, it seems natural to ask the following
question:
How can we measure the Maintainability Performance of a machine?
The response is by measuring its Mean Time To Repair (MTTR), which
can be used as an indication of the ease of maintaining the
equipment.
MTTR is the average time to restore machinery or equipment to
specified conditions.
What procedure can be used to measure MTTR? When very little data
exists from which to calculate MTTR directly:
1. Determine the different modes of failure of the equipment, using
   your judgement and maintenance experience with similar
   equipment.
2. For each mode of failure, estimate its frequency of occurrence
   (its failure rate).
3. Based on the way the equipment is to be designed, estimate the
   time to repair for each mode of failure.
4. Multiply the failure rate (item 2 above) by the time to repair
   (item 3 above) to calculate the Maintenance Load for each mode
   of failure.
5. Calculate MTTR as:
     MTTR = (Total Maintenance Load)/(System Failure Rate)
   in which the system failure rate is the sum of the failure rates
   for the different modes.
6. Compare the Maintenance Load for the different modes of failure,
   and initiate design action for failure modes that create a high
   load on the maintenance function.
Example 7: During the equipment design and development phase
(using the Failure Mode Analysis), the following three failure
modes were identified, and the corresponding failure rates and
times to repair were estimated. Use the information to estimate
MTTR and to rank those failure modes.
Failure Mode      Failure rate λ    Time to repair t   Maintenance Load
                  (per 1000 hrs)    (hrs)              (λ × t)
Hydraulic leak    10                1                  10
Torn part         2                 10                 20
Conveyor jammed   4                 0.5                2
MTTR = (Maintenance Load)/(System Failure Rate)
     = (10 + 20 + 2)/(10 + 2 + 4) = 32/16 = 2 hrs
The procedure can be summarized as:
  MTTR = Σ (component failure rate × time to repair component)
         / (Total system failure rate)
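The MTTR procedure can be sketched in code using the data from Example 7. This is an illustrative sketch only; the function name `mttr` and the dictionary layout are assumptions, and failure rates may be in any consistent unit (here, failures per 1,000 hours) because the unit cancels in the ratio.

```python
def mttr(modes):
    """Failure-rate-weighted mean time to repair.

    modes: iterable of (failure_rate, time_to_repair_hr) pairs.
    Returns (sum of rate * repair time) / (sum of rates).
    """
    modes = list(modes)
    total_rate = sum(rate for rate, _ in modes)
    if total_rate <= 0:
        raise ValueError("total failure rate must be positive")
    total_load = sum(rate * ttr for rate, ttr in modes)
    return total_load / total_rate


# Example 7: failure mode -> (failures per 1000 hrs, time to repair in hrs)
example7 = {
    "Hydraulic leak": (10, 1.0),
    "Torn part": (2, 10.0),
    "Conveyor jammed": (4, 0.5),
}
print(mttr(example7.values()))  # 2.0

# Rank the failure modes by maintenance load (rate * repair time), highest first.
ranked = sorted(example7, key=lambda m: example7[m][0] * example7[m][1], reverse=True)
print(ranked)  # ['Torn part', 'Hydraulic leak', 'Conveyor jammed']
```

The ranking shows why maintenance load matters: the torn part is the least frequent failure of the three, yet it places the highest load on the maintenance function.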
Example 8: Consider the following situation in which the MTBF
and TTR for five different modes of failure are listed:
Subsystem number   MTBF (hour)   Time to Repair (hour)
1                  1,000         1.5
2                  5,000         4.0
3                  10,000        1.0
4                  2,500         2.5
5                  500           0.5
Calculate MTBF and MTTR for the system and rank the significance
of different modes of failure from the maintenance load point of
view.
Subsystem   Failure Rate (λ)   Time to Repair   Maintenance Load
1
2
3
4
5
MTBF =
MTTR =
Example 9: Consider the following situation in which the MTBF
and TTR for six different modes of failure are listed:
Subsystem number   MTBF (hour)   Time to Repair (hour)
1                  120           4.0
2                  100           5.5
3                  600           3.0
4                  1,000         1.0
5                  1,500         0.5
6                  750           1.5
Calculate MTBF and MTTR for the system and rank the significance
of different modes of failure from the maintenance load point of
view.
Subsystem   Failure Rate (λ)   Time to Repair   Maintenance Load
1
2
3
4
5
6
Total
MTBF =
MTTR =
1.6 The Relationship Between R&M and Availability
What is Availability and how is it related to R&M?
Availability (A) is the probability that, at any time, the system
is either operating satisfactorily or is ready to be operated on
demand, when used under stated conditions.
The goal of availability engineering and management is to
determine and achieve the availability performance necessary to the
manufacturer's corporate, operating, company, and plant-level
business performance and leadership.
Remember that the plant does not have to be shut down to
experience reduced availability. When many plant items fail, they
do not shut down the plant. Nor do they always reduce its
production level. However, the plant's characteristic availability
has been reduced.
A simple example is a plant with two pieces of equipment. One is
a spare. When one fails, the other is placed in service. Thus, the
plant's real-time production level is not reduced. However, the
probability of maintaining that level over a period of time is
substantially less.
Availability can be looked at as the ability of a piece of
equipment (under the combined aspects of its reliability,
maintainability and maintenance support) to perform its required
function at a stated instant of time.
Availability includes the built-in equipment features (R&M)
as well as the in-plant maintenance support function (M).
RELIABILITY + MAINTAINABILITY => AVAILABILITY
How do we measure Availability? Depending on the stage of the
life cycle, we can use one of the following two models:
a) During the design and development phase, availability is
calculated from the design data using:
A = MTBF/(MTBF + MTTR)
b) During the later phases of the life of the product,
availability is calculated from the actual data on operating time
and downtime; that is:
A = Operating Time/Net Available Time
in which:
Net Available Time = Operating Time + Unplanned Downtime
The two equations give the same value for A when MTBF and MTTR
are estimated from the same data, because MTBF = Operating
Time/Number of Failures and MTTR = Unplanned Downtime/Number of
Failures, so the number of failures cancels.
Example 10: Calculate the availability for the following welder
machine:
Subsystem   MTBF (hr)   Failure rate   Ave. TTR (hr)   Maint. Load
1.          2,400       ____           1.25            ____
2.          4,000       ____           1.0             ____
3.          200         ____           2.25            ____
4.          1,500       ____           0.5             ____
5.          7,000       ____           3.0             ____
TOTAL                   ____                           ____
MTBF =
MTTR =
A =
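As a numerical check on Example 10, the following Python sketch fills in the failure rate and maintenance load columns and then applies the design-phase model A = MTBF/(MTBF + MTTR). It assumes that each failure rate is 1/MTBF and that the subsystem failure rates add to give the welder's failure rate. All variable names are illustrative.

```python
# Welder subsystems from Example 10: (MTBF in hours, average time to repair in hours).
subsystems = [
    (2_400, 1.25),
    (4_000, 1.0),
    (200, 2.25),
    (1_500, 0.5),
    (7_000, 3.0),
]

# Failure rate column: reciprocal of each MTBF (constant-rate assumption).
failure_rates = [1.0 / mtbf for mtbf, _ in subsystems]
# Maintenance load column: failure rate x average time to repair.
maint_loads = [rate * ttr for rate, (_, ttr) in zip(failure_rates, subsystems)]

total_rate = sum(failure_rates)
total_load = sum(maint_loads)

system_mtbf = 1.0 / total_rate
system_mttr = total_load / total_rate
availability = system_mtbf / (system_mtbf + system_mttr)

print(f"MTBF = {system_mtbf:.1f} hr")
print(f"MTTR = {system_mttr:.2f} hr")
print(f"A    = {availability:.4f}")
```

With these inputs the welder's MTBF is roughly 154 hr, its MTTR roughly 1.97 hr, and its availability roughly 0.987. Subsystem 3, with an MTBF of only 200 hr, dominates both the failure rate and the maintenance load.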
One useful tool for measuring the performance of a piece of
equipment is OEE.
Overall Equipment Effectiveness (OEE) is a comprehensive measure
of equipment effectiveness. The measurement encompasses:
1) What percentage of time the machinery is available
(availability).
2) How fast the machinery is running relative to its design cycle
time (speed ratio or performance efficiency).
3) What percentage of the resulting product is within quality
specifications (yield).
OEE = Availability x Performance Efficiency x Yield
The above formula applies not only to the overall system but
also to each individual machine that makes up the system.
R&M is an excellent means of increasing both individual and
system OEE percentages, because R&M practices have a profound
effect on all three factors in the OEE equation.
Availability. R&M increases uptime because the equipment is
more reliable and, when it does require maintenance, the service
can be completed in a shorter time.
Performance Efficiency. Due to proper R&M practices, the
equipment has fewer failures and less maintenance time. This means
the equipment can operate for longer periods at its designed cycle
time.
Yield. When components of the equipment are designed according
to R&M practices, the equipment is less susceptible to
variations that could result in unacceptable parts.
Improved Uptime. Uptime and its counterpart, downtime, as
functions of OEE are more clearly defined in Figure 1-5. The
importance of R&M becomes clear when one considers that the
goal of R&M practices is to reduce the time required for
preventive and corrective maintenance. Reducing the time
required for these major contributors to downtime increases
uptime correspondingly.
Example 11: A machine that was designed to produce 360 parts per
hour was put under a continuous production test over a 5-day
period. During that interval the machine broke down 6 times for a
total of 14 hours. In addition the machine was on scheduled repair
for 4 hours. The parts produced by the machine were inspected.
32,000 of the parts passed the inspection, but 1,400 of them were
rejected. Based on the given information, try to answer the
following questions:
a) Calculate the MTBF.
b) Calculate the MTTR.
c) Calculate Availability.
d) Calculate Performance Efficiency.
e) Calculate the Yield.
f) Calculate the machine OEE.
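One possible way to work through Example 11 is sketched below in Python. Two points are assumptions rather than facts stated in the example: "continuous" is read as 24 hours per day, so the 5-day test spans 120 hours, and the 4 hours of scheduled repair are excluded from Net Available Time (so that Net Available Time = Operating Time + Unplanned Downtime). Performance efficiency is taken as actual output divided by the output the design rate would give over the operating time. All variable names are illustrative.

```python
# Inputs taken from Example 11.
design_rate = 360           # parts per hour
breakdowns = 6
unplanned_downtime = 14.0   # hours
scheduled_downtime = 4.0    # hours
good_parts = 32_000
rejected_parts = 1_400

# ASSUMPTION (not stated in the example): continuous means 24 hours per day.
total_time = 5 * 24.0

# ASSUMPTION: scheduled repair is excluded from net available time.
net_available_time = total_time - scheduled_downtime
operating_time = net_available_time - unplanned_downtime

# a), b) MTBF and MTTR from the observed failures.
mtbf = operating_time / breakdowns
mttr = unplanned_downtime / breakdowns

# c) Availability from operating data, and from the design-phase model.
availability = operating_time / net_available_time
availability_design_form = mtbf / (mtbf + mttr)

# d) Performance efficiency: actual output vs. output at the design rate.
total_parts = good_parts + rejected_parts
performance_efficiency = total_parts / (design_rate * operating_time)

# e) Yield: fraction of parts that passed inspection.
quality_yield = good_parts / total_parts

# f) OEE = Availability x Performance Efficiency x Yield.
oee = availability * performance_efficiency * quality_yield

print(f"MTBF = {mtbf:.2f} hr, MTTR = {mttr:.2f} hr")
print(f"Availability = {availability:.4f} (design form: {availability_design_form:.4f})")
print(f"Performance efficiency = {performance_efficiency:.4f}")
print(f"Yield = {quality_yield:.4f}")
print(f"OEE = {oee:.4f}")
```

Under these assumptions both availability models give about 0.879, and the OEE is about 0.766, which equals good parts divided by the design output over the net available time (32,000 / (360 x 116)). A different reading of the test duration, for example 8-hour shifts, would change every result.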
Figure 1-5. Relationship of Typical Time Elements
A Few Notes:
1) Reliability was defined as the probability that a component,
device, or system will perform satisfactorily for at least a given
period of time when used under stated conditions.
2) To measure reliability we need to specify:
a) A precise definition of satisfactory performance;
b) The time base over which the performance must be maintained;
c) The environmental conditions that will be encountered.
3) The concept of reliability tells us that any given product
has a built-in reliability function that relates its reliability to
time and decreases as time progresses.
Note also that:
Reliability is a probability concept.
Reliability theory is a subset of quality control, but the Q.C.
function deals primarily with new products under inspection,
whereas reliability deals with products in service.
We shall consider the system as a set of interacting components
working together as an integrated whole.
A system is said to fail when it ceases to perform its intended
function.
4) When there is a total cessation of function, the system has
clearly failed, but often it is necessary to define failure
quantitatively in order to account for failure through
deterioration or instability of function. Examples:
A motor that is no longer capable of delivering a specified
torque.
A machine that no longer processes parts at its designed
capacity.
5) The way in which time is specified in the definition of
reliability varies considerably, depending on the nature of the
system under consideration. For example, for any intermittently
operated system (say, a switch), we must specify whether calendar
time or hours of operation are to be used.
6) Specifying the conditions under which a system is to operate
is important. The conditions may be divided into:
The principal design loads: the weight that a structure must
support, the electrical load on a generator, the rate of
information transfer on a telecommunication system.
The environmental effects: temperature extremes, dust, salt,
humidity, ...
7) Several quantities can be used to characterize the
reliability of a system including mean time to failure and failure
rate, and in the case of repairable systems mean time to repair and
availability.
8) Reliability is defined positively, in terms of a system
performing its intended function, and no distinction is made
between failures. In reality, there is concern not only with the
probability of failure, but also with the potential consequences of
different modes of failure. In particular, failures that present
severe safety problems are important. Home appliances need
reliability to avoid frequent failures that result in customer
dissatisfaction and/or create a safety hazard such as electric
shock.
1.7 System Life Cycle and Reliability
Successful implementation of R&M is dependent upon thorough
communication between marketing and design engineers, and/or the
user and supplier. This communication must begin at project
conception and continue through the entire life of the product
(equipment). This ensures that equipment problems will be
identified, root causes determined, and corrective action
implemented.
This section discusses a five-phase program management process
that describes how R&M applies to the various phases that occur
in the life cycle of a product (equipment). Techniques to assist
R&M implementation are also suggested. For more detail, refer to
the "Reliability and Maintainability Guideline" published by
SAE.
Five-Phase Program Management Process
Machinery and equipment development programs can be managed
using a five-phase program management process. The process starts
in Phase 1 with concept and proceeds through decommissioning and/or
conversion in Phase 5. This process is appropriate for any hardware
development program for machinery and equipment.
Figure 1-6. Five Phases of Manufacturing Machinery and Equipment
Life Cycle
The reliability activities taking place during each of these
phases of the product life may be quite different. For instance, in
the project definition, the objectives of the system are set
forth in the form of one or more functional requirements. For an
ergonomically designed chair, the exact requirements are specified;
for a computer desk, the exact dimensions and specifications. In
addition, the environment in which the system is to function must
be determined (e.g., the range of temperature and humidity, and the
concentrations of dust or other contaminants). Finally, the service
life to which the system is to be designed must be specified.
From such requirements, a conceptual design is formulated that
in broad form outlines how the system is to function, and provides
the general plan for its construction. From the functional
requirements comes the definition of failure, and thus of
reliability. Reliability requirements may then be set, and the
trade-offs between reliability, cost and functional requirements
may be examined as the design proceeds into the detailed phase.
The conceptual design must be converted into a detailed set of
drawings and specifications from which the system can be built.
During this phase, maintenance requirements and procedures are also
likely to be developed. As the design proceeds, experiments,
testing, and analysis are required to choose between alternatives,
to solve problems, and to predict the performance of subsystems or
components.
Reliability considerations should permeate this stage of design
in setting safety factors and design margins, eliminating
unnecessary complexities, translating system reliability criteria
into reliability requirements for subsystems, and setting time
intervals for inspection, maintenance and replacement of parts
subject to wear.
Note that in this stage, the detailed examination of potential
failure mechanisms and modes is most beneficial, for often they
may be eliminated or mitigated without too much expense. In the
later stages of the design process, prototypes are built and
the first reliability tests may be performed.
Historically, reliability considerations during the
manufacturing of a system are related to the practices of quality
control. Reliability in manufacture is monitored and controlled,
and the use of statistical Q.C. techniques for reliability testing
on manufactured items is exceedingly important. Verification of
end product reliability by testing to failure is not possible in a
large one-of-a-kind system. Thus, very stringent acceptance
criteria on components, careful supervision and control of the
construction process, and an elaborate set of proofs or acceptance
tests are necessary in such situations.
Reliability and Phase 1 - Concept Phase
The first phase is research and limited development or design,
usually resulting in a proposal. During this phase both the user
and the supplier must work together to establish system
requirements. It is recommended that the user team include machine
operators, maintenance personnel and product engineers. The
supplier team should include MM&E suppliers.
Machinery mission and environmental requirements are defined
during this phase. Also identified are safety issues, desired goals
for reliability and maintainability and life cycle cost.
Simultaneous (concurrent) engineering can be introduced at either
Phase 1 or Phase 2 depending on the particular situation and
MM&E.
Reliability and Phase 2 - Development/Design Phase
The development/design phase determines the majority of the life
cycle cost. The issues from the concept phase are incorporated.
Safety, ergonomics, accessibility and other maintainability issues
are designed into the system. R&M allocation requirements are
formalized.
Components and component suppliers should be selected based on
the predictive R&M statistics they provide. It is recommended
that MM&E suppliers utilize methods highlighted in the SAE
guideline to assure that R&M goals will be met.
The design review is a procedure for assuring that the planned
design is likely to, or does in fact, meet all requirements in the
most cost-effective way, considering all variables and
constraints.
1) Maintainability is a major consideration in the design
review.
2) A preliminary review is held prior to commitment to a final
design approach.
3) It is followed by an intermediate design review when more
details of the design become available.
4) The status of design actions resulting from the preliminary
design review may be reviewed at this time.
5) A design review is conducted to review overall readiness for
production prior to release of drawings to the manufacturing
function.
6) Regular design review sessions are recommended to ensure that
communication is clear between the user and MM&E suppliers.
7) It is also recommended to include operators, maintenance
personnel and product engineers in the design review. This will
give all concerned an understanding of the design intent.
8) At this phase, consideration must be given in the design to
demonstrating compliance with requirements through testing.
9) Suitable test plans must be developed.
Reliability and Phase 3 - Build & Install Phase
During the manufacturing and assembly of the machine, the
achievement of reliability requirements should be monitored. Issues
that could affect R&M must be communicated back to the design
engineers to assure any redesign includes reliability improvements.
Manufacturing process variables affecting R&M should be
identified and targeted for control. MM&E suppliers and the
user must negotiate meaningful R&M goals and requirements for
future monitoring and divide responsibility for collecting,
analyzing and reporting of data.
Several events occur during Phase 3:
Problems encountered when runoff tests are conducted should be
documented for elimination.
Maintenance procedures are developed. A customer representative
should be involved in this process.
Training starts here and continues to the next phases.
Machine acceptance testing should be agreed to and performed prior
to teardown and installation.
R&M data base collection begins during machine acceptance
testing. Problems encountered during this phase should be
documented for future reference/use.
The machine will be transferred from the builder's location to the
customer's plant. Critical assembly processes should be identified
during teardown.
Installation is a very critical step: The machine has to be
reassembled to the build requirements. Special attention should be
given to the critical assembly processes identified during
teardown.
Reliability and Phase 4 - Operation and Maintenance Phase
In this phase the equipment is at the customer location and fully
operational. Data collection and feedback are very important at
this phase. Data collection mechanisms should be in place and
agreed upon by both parties. Information collected during this
phase often leads to R&M growth and continuous improvement.
During this phase maintenance should be performed regularly. For
an R&M initiative to be successful, the MM&E and component
suppliers must have access to maintenance records and R&M data
bases.
Reliability and Phase 5 - Decommissioning and/or Conversion Phase
This phase is the end of the expected life of the machine.
During this phase the machine may require decommissioning due to an
increasing failure rate that has resulted in increasingly expensive
maintenance, or it may be rebuilt to a good-as-new state.
In another possible situation, the machine may still be in good
condition but the production needs have changed, requiring the
machine to go through a major conversion to be used for production
of other products.
When either the decommissioning or conversion action is taken,
the feedback from the user plant should be recorded and all the
information should be used for R&M growth and continuous
improvement in future generations of machinery.
________________________
Notes: