Section 1

Azim Houshyar, January 2011 Section 1 - page 1

SECTION 1 PRINCIPLES, CONCEPTS & DEFINITIONS OF RELIABILITY

1.1 Introduction to Reliability and Maintainability

Reliability and Maintainability (R&M) are vital characteristics of products & manufacturing machinery and equipment that enable U.S. manufacturers to be world class competitors.

Reliability considerations play an increasing role in virtually all engineering disciplines. As the demand for systems that perform better and cost less increases, there is a great need to minimize the probability of failures, whether the failures simply increase costs and inconvenience, or gravely threaten public safety.

In the broad sense, reliability is associated with dependability, with successful operation, and with the absence of break-downs or failures.

From the product point of view, the customer relies on a product that performs its intended function without failure. From the manufacturing point of view, efficient production planning depends on a process that yields high quality parts at a specific rate without interruption. Predictable reliability and maintainability of the manufacturing machinery and equipment is a key ingredient in maintaining production efficiency and the effective deployment of Just-in-Time principles. In both cases, improved reliability and maintainability of a product/equipment lead to the lower total life cycle costs that are necessary to maintain customer satisfaction and a competitive edge.

This document provides the methodology to achieve these objectives by providing R&M techniques and guidance on where to apply them in the up-front design and development of a new product/equipment. Furthermore, the methodology for continuous improvement of a design and/or machinery after installation is provided. Implementation of the R&M concepts described in this notebook will help increase product reliability and/or equipment availability, and reduce overall operational and maintenance costs.

1.2 Basic Definitions of Reliability & Maintainability

Reliability is the probability that a product (equipment) can perform continuously, without failure, for a specified interval of time when operating under stated conditions. Increased reliability implies fewer equipment failures and consequently less downtime and loss of production.

Maintainability is a characteristic of design, installation and operation, usually expressed as the probability that a machine can be restored to specified operable condition (returned to a serviceable state) within a specified interval of time when maintenance action is performed in accordance with prescribed procedures and resources.


Benefits of R&M

Highly reliable and maintainable production machinery offers the means for producing consistently high quality products at lower costs and at higher output levels. Successful application of R&M techniques has a very positive effect on employee morale and pride since the reduction in downtime also results in significant reduction in employee stress and frustration.

Table 1-1. R&M User/Supplier Benefits

User Benefits                                           Supplier Benefits
-----------------------------------------------------   ---------------------------------------
Higher machinery & equipment availability               Reduced warranty costs
Unscheduled downtime reduced/eliminated                 Reduced build costs
Reduced maintenance costs                               Reduced design costs
Stabilized work schedule                                Improved customer relations
Improved J-I-T performance capability                   Higher customer satisfaction
Improved profitability                                  Increased understanding of productions
Increased employee satisfaction                         Increased sales volume
Lower overall cost of production                        Increased employee satisfaction
Higher quality parts and product                        Improved status in the marketplace
Less need for in-process inventory to cover downtime    A competitive edge in the marketplace

Reduced Life Cycle Cost

Life Cycle Cost (LCC) refers to the total cost of a system during its operational life. LCC is the sum of non-recurring costs plus operation and support costs. Operation and support costs typically consume about 50% of the total LCC.

Figure 1-1. Total Life Cycle Cost. [Figure: share of total LCC by stage (Conception Stage, Development, Machine Build, Operation and Support); percentages shown: 50%, 35%, 12%, 3%.]


Emphasizing R&M practices during the conception and development stages can lower the total LCC. By using R&M to minimize stress (electrical, mechanical, etc.), the equipment will be less prone to failure during operation. This results in a decrease of the operation and support costs that account for the bulk of total LCC.

A slight increase in spending to incorporate R&M practices during the conception and design stages can dramatically lower the operation and support costs.

It is important to consider R&M at the early stages of a program. Studies have shown that as much as 95% of LCC is determined during the conceptual and development stages. Therefore, once a new product (equipment) has reached the build stage, only about 5% of the opportunity remains to effectively improve its reliability or maintainability.

Examples of LCC Improvement: Intel Corporation is engaged in the design and manufacture of solid-state devices. Intel has developed and is implementing a corporate strategy that addresses the subject of reliability and maintainability in an aggressive, committed manner.

In portions of its assembly operation, Intel has improved the Mean Time Between Adjustments (MTBA) from 5 minutes to 16 minutes. This improvement makes it possible for one operator to run eight machines rather than four, a doubling of operator productivity. In addition, process yields have been improved due to the elimination of scrap that resulted from the more frequent shutdowns.

Intel's R&M program was also responsible for improving the Mean Time Between Failures (MTBF) from 10 hours to 250 hours on its solid-state component wire bonding machines. This improvement had the same effect as adding 30% capacity to the existing machine base. Another benefit of this improved reliability lies in the fact that Intel was able to reassign the three line technicians who previously served as "baby-sitters" to more productive work.

Your Example of LCC Improvement: Choose a product (equipment) that you are familiar with. State your approach to improving the life cycle cost for the product (equipment). What additional resources (time, money, technology, labor,...) are needed, and what are the foreseeable benefits?

Example of Life Cycle Cost

Equipment Name:

Estimated Initial Cost:

Estimated Life:

Estimated Annual Operational Cost:

Current Status:

Recommended Modifications:


What do we mean when we say we have a Reliable Product?

Well, we may think of a dependable, trustworthy product, but can these descriptions be quantified?

Can you predict the exact time when a given product will fail? Well, even though you probably can't say the exact time of the failure of a product, you can estimate the percentage of products that will fail by a given time.

Reliability can be stated in different forms. For instance:

1) The reliability that a product (equipment) will be performing its intended function after 1,000 hours of use is 0.80; or
2) The reliability at 1,000 hours is 0.80, or the reliability is 80%.
3) Another way to look at it is that if we place 100 units of this product (equipment) in use, 80 of them will still be operating (with no failure) at 1,000 hours.
4) The reliability at any future time (say 1,500 hours) is less.

Remember that the reliability of a product (equipment) should not be stated as simply 0.8, since no time is specified. It is equally ambiguous for a product (equipment) to have a 1,000-hour life without indicating a reliability for that time. Instead it should be stated that the 1,000-hour reliability of the product (equipment) is 0.8.

Question: Looking at the figure, state your findings regarding the relationship between reliability and time. Which of the two curves represents a more reliable system? Why?

Response:

Figure: Reliability Functions. [Plot of reliability, Rel [%] (0 to 100), against time to failure (0 to 10,000) for two curves labeled 1 and 2.]


In the definition of reliability, three phrases were used. Those phrases were: 1) Perform intended functions satisfactorily; 2) For the specified period of time; and 3) Under specified conditions.

What do we mean by "Perform Intended Function Satisfactorily"? To understand this phrase better, let's define Failure.

FAILURE: An event in which machinery/equipment is not available to produce parts at specified conditions when scheduled, or is not capable of producing parts or performing scheduled operations to specification. For every failure, an action is required.

Unsatisfactory performance is subject to interpretation. Therefore it must be clearly defined at the time of the contract. There will be various levels of failure based on the customer's level of severity for incidents on the manufacturing equipment.

What do we mean by "Specified Time Period"?

Products deteriorate with use and even with age when dormant. This is especially true for wood products. Longer periods of usage imply a higher chance of failure and hence lower reliability.

For design purposes, target usage periods must be identified. Typically identified usage periods are:

- The warranty period;
- Durability life, a measure of useful life that defines the number of operating hours until overhaul is expected or required.

What do we mean by "Specified Conditions"?

Products react to the environment in which they are placed. Different environments promote different failure modes and different failure rates for a product. Therefore the environmental factors which the product will encounter must be clearly defined.

Environmental factors such as temperature, humidity, vibration, mechanical shock, immersion/splash, pressure/vacuum, contamination, electrical noise, electromagnetic fields, corrosive materials, and so on must be addressed during the design stages of the equipment. These environmental conditions must be thoroughly documented.

1.3 Association between Quality and Reliability

Lamberson lists quality characteristics as: Psychological (taste, beauty, style, status); Technological (hardness, vibration, noise, materials); Time oriented (reliability and maintainability); Contractual (warranty); and Ethical (honesty of repairman, experience of sales force).


Quality is referred to as fitness for use. This comprises all phases of the life cycle of the product including engineering, manufacturing, marketing and maintenance. This must be addressed from the customers' standpoint. Company-wide quality control is a philosophy that focuses on meeting customer needs and expectations throughout the life cycle of the product while continuously improving the production process.

Quality Defects are defined as those which can be located by conventional inspection techniques.

Reliability Defects are defined as those which require some stress applied over time to develop into detectable defects.

Performance and Reliability: Engineering is concerned with designing and building products for improved performance. This requires the designs to incorporate features that may tend to be less reliable than the older systems with lower performances.

The trade-offs between performance and reliability are often subtle. Thus any product with both improved performance and improved reliability is a significant advance.

We usually improve performance through increased loading:

- Decreasing the weight of an aircraft increases the stress level in its structure.
- Increasing the temperature to gain thermodynamic efficiency leads to rapid corrosion in materials.

Operating closer to the physical limits of a system in this way increases the number of failures.

Specifications for purer materials, tighter dimensional tolerances, and so on are required to reduce uncertainty in the performance limits, and thereby permit one to operate close to these limits without increasing the probability of exceeding them.

The performance of a system is often increased at the expense of increased complexity. This again decreases reliability, unless compensating measures are taken.

Probably the greatest improvements in performance come from the introduction of new materials or devices to achieve a particular goal:

- Replacement of wood by metal,
- Replacement of the piston aircraft engine with the jet engine,
- Replacement of vacuum tubes with solid-state electronics.

Notes: Even with major advances in technology, reliability may be a severe problem, particularly during the early stages of introducing a new technological advance.

At any stage of technological development, trade-offs must be made between: Reliability and performance, Reliability and cost.

Ex: Race car: Performance is improving, but reliability remains below 50%. Here performance is everything, and one must tolerate a high probability of break-down if there is to be any chance of winning the race.


Ex: Military aircraft: An intermediate example in which reliability and performance are balanced.

Ex: Commercial airliner: In this case, reliability is the overriding design consideration. Thus degraded speed, payload, and fuel economy are accepted to maintain a very small probability of catastrophic failure.

1.4 Definition of Reliability Measures

In this section, we will define: Repairable and non-repairable units; Mean Time Between Failures (MTBF); Mean Time To Failure (MTTF); Failure rate; Mean Time To Repair (MTTR); Reliability, Maintainability, and Availability.

Items/components/subsystems/systems can be classified as repairable or non-repairable. Whenever we use MTBF, we are referring to repairable entities, whereas MTTF is used for non-repairable entities.

What are some indicators used to Quantify Product Reliability?

Mean Time Between Failures (MTBF): The average time between failure occurrences. The sum of the operating time of a machine divided by the total number of failures.

Mean Cycle Between Failure (MCBF): The average cycles between failure occurrences. The sum of the operating cycles of a machine divided by the total number of failures.

Failure Rate: Number of failures per unit of gross operating period in terms of time, events, cycles, or number of parts.

Reliability: R(t) indicates reliability at time t, where t is the duration of failure-free operation of the equipment.

MTBF = (Operating time)/(Total number of equipment failures)

Failure Rate = (Total number of equipment failures)/(Operating time)

1. MTBF=1,000 hours means that, on the average, a failure will occur with every 1,000 hours of usage.

2. A failure rate of 1 failure per 1000 hours (= 0.001/hr.) means that, on the average, one failure will occur with every 1,000 hours of usage.

3. R(t=1000 hr.) = 0.8 means that the probability of 1000 hours of failure-free performance is 80%.
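
The two ratios above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample log (5 failures over 5,000 operating hours) are assumptions chosen to reproduce the MTBF of 1,000 hours and failure rate of 0.001/hr used in the statements above.

```python
def mtbf(operating_time_hr: float, failures: int) -> float:
    """Mean time between failures: operating time divided by failure count."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to estimate MTBF")
    return operating_time_hr / failures


def failure_rate(operating_time_hr: float, failures: int) -> float:
    """Failures per operating hour (the inverse ratio of MTBF)."""
    if operating_time_hr <= 0:
        raise ValueError("operating time must be positive")
    return failures / operating_time_hr


# Hypothetical log (assumed values): 5 failures over 5,000 operating hours.
print(mtbf(5000, 5))          # 1000.0
print(failure_rate(5000, 5))  # 0.001
```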


What is the relationship between Reliability Numbers?

The relationship between the Failure Rate and MTBF is:

MTBF = 1/Failure Rate

Therefore a failure rate of 0.001/hr implies a MTBF of 1000 hours.

Assuming that the reliability function for the equipment is Exponentially Distributed, we can use the following equation to calculate the reliability of a product or machinery at a specified time t.

$R(t) = e^{-t/\mathrm{MTBF}}, \quad t > 0$

where t = time over which the machine is to be operated without failure, and e = the base of the natural logarithm, approximately 2.718.

For example, the one-shift (8-hour) reliability of a machine with an MTBF of 1,000 hours is:

$R(t = 8\ \text{hr}) = e^{-8/1000} = 0.992$, so $R_8 = 99.2\%$

There is a 99.2% chance of running the equipment for 8 hours without encountering a failure. The same equipment has only a 90.5% chance of running for 100 hours without encountering a failure, since $e^{-100/1000} \approx 0.905$.

I recommend selection of an agreeable time frame over which reliability is to be sustained. An example might be the 8-hour Reliability, denoted by R8, which represents the probability that the machine will not fail during an 8-hr shift.
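
As a minimal sketch of the R8 calculation, the exponential reliability equation can be evaluated directly. The function name `reliability` is an assumption for illustration, and the result only holds if the exponential assumption stated above is appropriate for the equipment.

```python
import math


def reliability(t_hr: float, mtbf_hr: float) -> float:
    """Probability of failure-free operation for t_hr hours,
    assuming exponentially distributed times between failures."""
    if t_hr < 0 or mtbf_hr <= 0:
        raise ValueError("t_hr must be >= 0 and mtbf_hr must be > 0")
    return math.exp(-t_hr / mtbf_hr)


# One 8-hour shift on a machine with an MTBF of 1,000 hours.
r8 = reliability(8, 1000)
print(f"R8 = {r8:.3f}")  # R8 = 0.992
```

Fixing the time frame (here 8 hours) in the function call makes it explicit that a reliability value is only meaningful together with its time period.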

Example 1: The failure rate of a component is 0.001 hr^-1.

a) Find the MTBF.

b) Find the R8.

c) What is the probability that the component will not fail in one month of continuous operation?

Example 2: Given the reliability function $R(t) = e^{-t/1000}$, where t is the time in hrs.

a) Find the 100-hour reliability.

b) Find the 1,000-hour reliability.

c) If 1,000 devices are placed in operation, how many will still be operating at 100 hrs?

Example 3: A machine has an MTBF of 50 hours.

a) Find the failure rate.

b) Find the one-shift reliability.

c) Find the three-shift reliability.

d) In 100 hours of operational time, how many failures would you expect?

What is the Relationship Between MTBF of a System and MTBFs of its Components? Most systems consist of several subsystems. Occasionally we need to combine MTBFs from different subsystems to calculate the MTBF for the main system. An example is to analyze a design in which we may have data on the MTBFs of the different subsystems used in the new design.

Example 4: Consider a system in which one subsystem has an MTBF of 25 hours. On the average, in 100 hours of uptime, there will be 4 failures. Using the relationship between MTBF, number of failures, and operating time it is seen that:

MTBF = (Uptime)/(Total Number of Failures)

MTBF1 = 100/4 = 25 hrs.

Now consider adding a second subsystem with a MTBF of 20 hours to the previous system. This subsystem is expected to have 5 failures in 100 hours of uptime. MTBF2 = 100/5 = 20 hrs.

How do we combine the MTBFs to obtain the MTBF for the main system?

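Assuming that a failure of either subsystem counts as a failure of the system, we can expect 4 + 5 = 9 failures in 100 hours of operation. Therefore $\mathrm{MTBF}_S = 100/9 \approx 11.1$ hours; that is, the system fails more often than either of its subsystems.

Can you figure out the rule? The rule is to add the failure rates:

$\lambda_S = \lambda_1 + \lambda_2$

or equivalently: $1/\mathrm{MTBF}_S = 1/\mathrm{MTBF}_1 + 1/\mathrm{MTBF}_2$

that is: $1/\mathrm{MTBF}_S = 1/25 + 1/20 = 0.09$, so $\mathrm{MTBF}_S \approx 11.1$ hrs.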
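
The rule of adding failure rates can be sketched as a short function. This is an illustrative sketch, with the name `series_mtbf` assumed here, and it applies only when a failure of any one subsystem counts as a failure of the system.

```python
def series_mtbf(subsystem_mtbfs):
    """Combine subsystem MTBFs by summing their failure rates (1/MTBF)."""
    if not subsystem_mtbfs or any(m <= 0 for m in subsystem_mtbfs):
        raise ValueError("provide at least one positive MTBF")
    system_failure_rate = sum(1.0 / m for m in subsystem_mtbfs)
    return 1.0 / system_failure_rate


# Example 4: subsystems with MTBFs of 25 and 20 hours.
print(round(series_mtbf([25, 20]), 1))  # 11.1
```

Note that the system MTBF is always smaller than the smallest subsystem MTBF, which matches the observation that the system fails more often than each of its subsystems.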


Example 5: Consider a press which consists of the following five subsystems:

Subsystem           MTBF (hrs)   Failure rate λ (hr^-1)
-----------------   ----------   -------------------------
Crown Assemblies    50,000       2 x 10^-5
Slide Assemblies    20,000       5 x 10^-5
Gibing              200,000      5 x 10^-6 = 0.5 x 10^-5
Columns             10,000       1 x 10^-4 = 10 x 10^-5
Beds                10,000       1 x 10^-4 = 10 x 10^-5

a) Calculate the MTBF for the press.

b) What is the 8-hour reliability of the press?

Example 6: Consider a work-station for which the subsystems failure rates are:

Subsystem                Failure rate λ (1/hr.)   MTBF (hrs)
----------------------   ----------------------   ----------
Load/unload mechanism    0.00003                  33,333
Mechanical actuator      0.00001                  100,000
Electronics              0.000005                 200,000
Hydraulics               0.0004                   2,500

a) Calculate the MTBF for the work station.

b) What is the 4-hour reliability of the work-station?

MTBF and Failure Rate are two related measures of the Reliability of the equipment or product. The next question is: How do we measure the Maintainability?

Maintainability is a characteristic of design, installation and operation, usually expressed as the probability that a machine can be retained, or restored to, specified operable condition (returned to a serviceable state) within a specified interval of time when maintenance is performed in accordance with prescribed procedures.

In what follows, some Maintainability Improvement Strategies are discussed.


1.5 Maintainability Improvement Strategies

Safety: Safety engineering must be introduced at the design stage, not after the equipment is built. Safety personnel must be consulted up front to fully utilize the best technology available in a safe and ergonomically efficient manner. A properly designed operator's environment will not only reduce the risk of injury, it will also avoid exposure to health risks or activities likely to cause repetitive motion disorders. Pinch point guarding, safety labels, personnel guards, warning devices, lock-outs and other appropriate safety measures must be integrated into the design. Safety requirements must be included in the specifications. Applicable safety standards must be adhered to.

Accessibility: Accessibility means having sufficient working space around a component to diagnose, troubleshoot and complete maintenance activities safely and effectively. Provision must be made for movement of necessary tools and equipment with consideration for human ergonomic limitations.

Operators, maintenance and service personnel know best how the repair job will be done and can identify the problems; therefore, they should be involved in evaluating the design for accessibility.

Common Building Practices: Due to the relevance of the connections, piping runs, wiring and plumbing to the performance of the equipment, there should be a more rigorous approach to the practice. At a very minimum, machine suppliers should develop a well-documented practices manual and advise their assemblers to adhere to its content.

Diagnostics: Diagnostic devices indicating the status of equipment should be built into manufacturing machinery to aid maintainability support processes. Use of electronic light emitting diodes to indicate fault status can be helpful. The diagnostics can be as simple as a visual display indicating the equipment's status as a go/no-go condition, or as sophisticated as a knowledge-based expert system with the capability of analyzing a problem and recommending the most likely solution.

Diagnostic systems should have the capability of storing equipment performance data as permanent records for reliability analysis and supplier feedback supporting the reliability growth management process. Output from diagnostic systems should be in a compatible format with commercially available data base management software.

When component assemblies and subsystems are used to create a manufacturing system, hardware and software "hooks" should be put in place in the concept and design phase to facilitate integration of the diagnostics system in the build phase. Diagnostic systems should indicate the specific component to replace or repair.

Captive Hardware and Quick Attach/Detach: Captive and quick attach/detach hardware provides for rapid and easy replacement of components, panels, brackets and chassis. The environment in which these devices are used may restrict the type of device used. Spare parts and replaceable subassemblies should also be configured with these devices preassembled. Examples are:


- plate, anchor and caged nuts
- push and snap-in fasteners
- clinch and self-clinching nuts
- quarter-turn fasteners

Modularity: Modularity requires that designs be divided into physically and functionally distinct units to facilitate removal and replacement. It allows design of components as removable and replaceable units for an enhanced design with minimum downtime. Modular design concepts typically are thought of in terms of electrical black boxes, printed circuit boards and other quick attach/detach electrical components. These concepts are also applicable to the mechanical elements of production equipment.

Advantages of modularity are:

- New designs can be simplified and design time can be shortened by making use of standard, previously developed building blocks.
- Specialized technical skill will be reduced.
- Training of plant maintenance personnel is easier.
- Engineering changes can be made quickly with fewer side effects.

Maintenance Procedures: Maintenance procedures must describe in detail the adjustment, replacement and repair of machine systems, subsystems and component parts. The original equipment manufacturer will provide recommended preventive maintenance procedures at intervals based on time and/or machine cycle count. Maintenance requirements should be ranked so that the equipment user can schedule maintenance according to the criticality of each activity.

The maintenance procedures should be contained in service manuals or a computerized data-base reflecting the specific content and configuration of the equipment being supported. Exploded view illustrations, photographs, simplified assembly drawings and/or parts lists relating to the required maintenance activities and procedures should be included wherever applicable. Technical information such as pressure settings, operational sequences and moving part clearances should be included as appropriate.

Visual Management Techniques: Visual Management techniques differ for varying types of equipment. A team effort must exist between supplier and user to deliver the best techniques to the user. All the team members should review these on a continual basis at concept/design, machine build and on the manufacturing floor. Visual management techniques are used on machinery and equipment to bring the workplace awareness to a level that allows problems and abnormal conditions to be quickly recognized at a single glance. Through visual management, a system is created that enhances the equipment inspection process by allowing quick identification of safety, quality, environmental, equipment and process abnormalities. Typical visual management techniques include:

- Match marking of all fasteners (nuts, bolts, screws) fixed, adjustable or critical
- Match marking of all control adjustments (pressure, flow, temperature, speed, level, voltage, current, etc.)
- The identification of normal operating ranges and levels
- Direction of flow and product color coding on piping and hoses
- Direction of rotation on (drives, belts, chains, motors, etc.)
- Function labels on (switches, valves, buttons, lights, etc.)
- Identification labels on (cabinets, panels, boxes, etc.)
- Filters (lube, hydraulic and air) that indicate when dirty
- Filters labeled with replacement filter element number
- Belt and chain drives with guarding that permit quick visual inspection and access
- Replacement belt or chain number labels on guarding
- Each lube point labeled with product number and color code
- Temperature sensitive labels on all critical components (motors, drives, controls, hydraulic units, etc.)
- Equipment layout with all electrical control panel safety lockout points indicated (affixed to the main electrical control panel)
- Equipment layout with all lubrication fill points, frequencies and product codes indicated (affixed to the main electrical control panel)
- The identification of all control drawing numbers on the main electrical control panel
- Signals or alarms that indicate a major abnormality, safety interlock tripped, process out of control, etc.
- Equipment and process operator inspection list (affixed to the main electrical control panel)

Spare Parts Management: Maintenance of manufacturing equipment and machinery requires a readily available supply of spare parts and supporting materials to operate, maintain and service the equipment. Spare parts management will identify and make available the required quantities of spare parts at an optimum inventory cost to the equipment user.

Plans for equipment support through spare parts management should begin during the equipment design phase and continue through the life cycle of the equipment. Consideration should be given to the lead time required to requisition, manufacture and receive into inventory the required parts and/or materials to avoid excessive costs to procure replacement parts on an emergency basis.

The machinery and equipment manufacturer should make a recommended spare parts list available to the equipment user. Parts may be provided from previously purchased inventory (commercial parts and supplies) or purchased specifically for the subject equipment and maintained in inventory for its use. Sourcing of spare and/or replacement parts, including consumable materials, should be managed to ensure that the performance and capability of the manufacturing machinery and equipment is maintained at or above the original manufacturer's specifications.

Other strategies for maintainability improvement include Standardization on component parts that are commercial standard, readily available, and common from machine to machine; and Color Coding, which can help to speed up maintenance procedures.

Now that we have reviewed a few of the procedures for maintainability improvement, it seems natural to ask the following question:


How can we measure the Maintainability Performance of a machine? The response is by measuring its Mean Time To Repair (MTTR), which can be used as an indication of the ease of maintaining the equipment.

MTTR is the average time to restore machinery or equipment to specified conditions.

What procedure can be used to measure MTTR? When very little data exists to calculate MTTR:

1. Determine the different Modes of Failure of the equipment, using your judgement and maintenance experience with similar equipment;
2. For each Mode of Failure, estimate its Frequency of Occurrence (failure rate);
3. Based on the way the equipment is to be designed, estimate its Time to Repair;
4. By multiplying the Failure Rate (item 2 above) by the Time to Repair (item 3 above), calculate the Maintenance Load for each mode of failure;
5. Calculate MTTR as:

   MTTR = (Maintenance Load)/(System Failure Rate)

   in which the maintenance load is summed over all modes of failure and the system failure rate is the sum of the failure rates of the different modes;
6. Compare the Maintenance Loads of the different Modes of Failure, and initiate design action for failure modes that create a high load on the maintenance function.

Example 7: During the equipment design and development phase (using the Failure Mode Analysis), the following three failure modes were identified, and the corresponding failure rates and times to repair were estimated. Use the information to estimate MTTR and to rank those failure modes.

Failure Mode       Failure rate λ (per 1000 hrs)   Time to repair t (hrs)   Maintenance Load (λ x t)
----------------   -----------------------------   ----------------------   ------------------------
Hydraulic leak     10                              1                        10
Torn part          2                               10                       20
Conveyor jammed    4                               0.5                      2

MTTR = (Maintenance Load)/(System failure Rate) = (10+20+2)/(10+2+4) = 32/16 = 2 hrs

The procedure can be summarized as:

$\mathrm{MTTR} = \dfrac{\sum (\text{component failure rate} \times \text{time to repair component})}{\text{total system failure rate}}$
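
The MTTR procedure can be sketched in Python using the failure modes of Example 7. The function and variable names are assumptions for illustration; the data are the failure rates (per 1000 hrs) and times to repair listed in that example.

```python
# Failure modes from Example 7:
# (name, failure rate per 1000 hrs, time to repair in hrs)
failure_modes = [
    ("Hydraulic leak", 10, 1.0),
    ("Torn part", 2, 10.0),
    ("Conveyor jammed", 4, 0.5),
]


def maintenance_loads(modes):
    """Maintenance load of each mode: failure rate x time to repair."""
    return {name: rate * ttr for name, rate, ttr in modes}


def mttr(modes):
    """Total maintenance load divided by total system failure rate."""
    total_load = sum(rate * ttr for _, rate, ttr in modes)
    total_rate = sum(rate for _, rate, _ in modes)
    return total_load / total_rate


loads = maintenance_loads(failure_modes)
# Rank the modes from highest to lowest maintenance load.
ranking = sorted(loads, key=loads.get, reverse=True)

print(mttr(failure_modes))  # 2.0
print(ranking)  # ['Torn part', 'Hydraulic leak', 'Conveyor jammed']
```

The ranking shows that the torn part, although the least frequent failure, places the highest load on the maintenance function and is the first candidate for design action.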


Example 8: Consider the following situation in which the MTBF and TTR for five different modes of failure are listed:

Subsystem number   MTBF (hour)   Time to Repair (hour)
----------------   -----------   ---------------------
1                  1,000         1.5
2                  5,000         4.0
3                  10,000        1.0
4                  2,500         2.5
5                  500           0.5

Calculate MTBF and MTTR for the system and rank the significance of different modes of failure from the maintenance load point of view.

Subsystem   Failure Rate (λ)   Time to Repair   Maintenance Load
---------   ----------------   --------------   ----------------
1

2

3

4

5

MTBF =

MTTR =

Example 9: Consider the following situation in which the MTBF and TTR for six different modes of failure are listed:

Subsystem number   MTBF (hour)   Time to Repair (hour)
----------------   -----------   ---------------------
1                  120           4.0
2                  100           5.5
3                  600           3.0
4                  1,000         1.0
5                  1,500         0.5
6                  750           1.5


Calculate MTBF and MTTR for the system and rank the significance of different modes of failure from the maintenance load point of view.

Subsystem   Failure Rate (λ)   Time to Repair   Maintenance Load
---------   ----------------   --------------   ----------------
1

2

3

4

5

6

Total

MTBF =

MTTR =

1.6 The Relationship Between R&M and Availability

What is Availability and how is it related to R&M?

Availability (A) is the probability that, at any time, the system is either operating satisfactorily or is ready to be operated on demand, when used under stated conditions.

The goal of availability engineering and management is to determine and achieve the availability performance necessary to the manufacturer's corporate, operating, company, and plant-level business performance and leadership.

Remember that the plant does not have to be shut down to experience reduced availability. When many plant items fail, they do not shut down the plant. Nor do they always reduce its production level. However, the plant's characteristic availability has been reduced.

A simple example is a plant with two pieces of equipment. One is a spare. When one fails, the other is placed in service. Thus, the plant's real-time production level is not reduced. However, the probability of maintaining that level over a period of time is substantially less.

Availability can be looked at as the ability of equipment (under the combined aspects of its reliability, maintainability and maintenance support) to perform its required function at a stated instant of time.

Availability includes the built-in equipment features (R&M) as well as in-plant maintenance support function (M).


RELIABILITY + MAINTAINABILITY => AVAILABILITY

How do we measure Availability? Depending on the stage of the life cycle, we can use one of the following two models:

a) During the design and development phase, the availability is calculated from the design data using:

A = MTBF/(MTBF + MTTR)

b) During the later phases of the life of the product, the availability is calculated using the actual data on operating time and downtime; that is:

A = Operating Time/Net Available Time

in which:

Net Available Time = Operating Time + Unplanned Downtime

We will see that the two equations result in the same value for A.
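The agreement between the two equations can be checked numerically. The sketch below assumes that MTBF is estimated as operating time divided by the number of failures and MTTR as unplanned downtime divided by the number of failures. The field record and the function names are hypothetical and chosen only for illustration.

```python
import math


def availability_from_design(mtbf, mttr):
    """Model (a): A = MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)


def availability_from_field_data(operating_time, unplanned_downtime):
    """Model (b): A = Operating Time / Net Available Time."""
    net_available_time = operating_time + unplanned_downtime
    return operating_time / net_available_time


# Hypothetical field record (illustrative numbers, not from the text)
operating_time = 450.0       # hours
unplanned_downtime = 50.0    # hours
failures = 10

mtbf = operating_time / failures        # 45 hours between failures
mttr = unplanned_downtime / failures    # 5 hours per repair

a_design = availability_from_design(mtbf, mttr)
a_field = availability_from_field_data(operating_time, unplanned_downtime)
print(a_design, a_field, math.isclose(a_design, a_field))
```

The number of failures cancels out of MTBF/(MTBF + MTTR), which is why both models give the same availability when they are fed the same data.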

Example 10: Calculate the availability for the following welder machine:

Subsystem    MTBF (hr)    Failure rate    Ave. TTR (hr)    Maint. Load
1.           2,400                        1.25
2.           4,000                        1.0
3.           200                          2.25
4.           1,500                        0.5
5.           7,000                        3.0
TOTAL

MTBF =

MTTR =

A =
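A possible solution path for Example 10 is sketched below. It assumes the welder's subsystems are in series with constant failure rates, so the failure rates add and the maintenance load of each subsystem is its average TTR divided by its MTBF. Availability then follows from A = MTBF/(MTBF + MTTR). The variable names are illustrative.

```python
# (subsystem, MTBF in hours, average time to repair in hours) from Example 10
welder = [(1, 2400, 1.25), (2, 4000, 1.0), (3, 200, 2.25),
          (4, 1500, 0.5), (5, 7000, 3.0)]

failure_rates = {name: 1.0 / mtbf for name, mtbf, _ in welder}
maint_loads = {name: ttr / mtbf for name, mtbf, ttr in welder}

total_failure_rate = sum(failure_rates.values())
total_maint_load = sum(maint_loads.values())

system_mtbf = 1.0 / total_failure_rate
system_mttr = total_maint_load / total_failure_rate
availability = system_mtbf / (system_mtbf + system_mttr)

# Subsystem contributing the most repair time per operating hour
worst = max(maint_loads, key=maint_loads.get)

print(f"MTBF = {system_mtbf:.1f} hr")
print(f"MTTR = {system_mttr:.2f} hr")
print(f"A    = {availability:.4f}")
print(f"largest maintenance load: subsystem {worst}")
```

Under these assumptions the welder's availability is about 0.987, and subsystem 3 dominates the maintenance load because of its short MTBF.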

One useful tool for measuring the performance of a piece of equipment is OEE.

Overall Equipment Effectiveness (OEE) is a comprehensive measure of equipment effectiveness. The measurement encompasses:

1) What percentage of time the machinery is available (availability).

2) How fast the machinery is running relative to its design cycle time (speed ratio or performance efficiency).

3) What percentage of the resulting product is within quality specifications (yield).


OEE = Availability x Performance Efficiency x Yield

The above formula applies not only to the overall system but also to each individual machine that makes up the system.

R&M is an excellent means of increasing both individual and system OEE percentages. This is because an emphasis on R&M practices has a profound effect on all three factors in the OEE equation.

Availability. R&M increases uptime because the equipment is more reliable and when it does require maintenance the services can be accomplished in a shorter time.

Performance Efficiency. Due to proper R&M practices, the equipment has fewer failures and less maintenance time. This means the equipment can operate for longer periods at its designed cycle time.

Yield. When components of the equipment are designed according to R&M practices the equipment is less susceptible to variations that could result in unacceptable parts.

Improved Uptime. Uptime and its counterpart, downtime, as functions of OEE are more clearly defined in Figure 1-5. The importance of R&M is realized when it is considered that the goal of R&M practices is to reduce the time required for preventive and corrective maintenance. By reducing the time required for these major contributors to downtime, the uptime increases correspondingly.

Example 11: A machine that was designed to produce 360 parts per hour was put under a continuous production test over a 5-day period. During that interval the machine broke down 6 times for a total of 14 hours. In addition, the machine was on scheduled repair for 4 hours. The parts produced by the machine were inspected. 32,000 of the parts passed the inspection, but 1,400 of them were rejected. Based on the given information, try to answer the following questions:

a) Calculate the MTBF.

b) Calculate the MTTR.

c) Calculate Availability.

d) Calculate Performance Efficiency.

e) Calculate the Yield.

f) Calculate the machine OEE.
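One reading of Example 11 is worked through in the Python sketch below. It rests on three assumptions that the problem statement leaves open: the 5-day continuous test means 24 hours per day; the 4 hours of scheduled repair are excluded from net available time (consistent with Net Available Time = Operating Time + Unplanned Downtime); and the 1,400 rejected parts are in addition to the 32,000 that passed. The variable names are illustrative.

```python
HOURS_PER_DAY = 24                 # assumption: continuous, round-the-clock test
test_hours = 5 * HOURS_PER_DAY     # 120 hours on test
breakdowns = 6
unplanned_downtime = 14.0          # hours lost to breakdowns
scheduled_downtime = 4.0           # hours of scheduled repair
design_rate = 360                  # parts per hour
good_parts = 32_000
rejected_parts = 1_400             # assumption: counted in addition to good parts

net_available_time = test_hours - scheduled_downtime       # 116 hours
operating_time = net_available_time - unplanned_downtime   # 102 hours

mtbf = operating_time / breakdowns                          # a)
mttr = unplanned_downtime / breakdowns                      # b)
availability = operating_time / net_available_time          # c)

total_parts = good_parts + rejected_parts
performance_efficiency = total_parts / (design_rate * operating_time)  # d)
yield_fraction = good_parts / total_parts                   # e)
oee = availability * performance_efficiency * yield_fraction  # f)

print(f"MTBF = {mtbf:.1f} hr, MTTR = {mttr:.2f} hr")
print(f"Availability = {availability:.3f}")
print(f"Performance efficiency = {performance_efficiency:.3f}")
print(f"Yield = {yield_fraction:.3f}")
print(f"OEE = {oee:.3f}")
```

With these assumptions the OEE is about 0.77. As a cross-check, the same value follows from good parts divided by the parts the machine could have produced in the net available time, 32,000/(360 x 116).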


Figure 1-5. Relationship of Typical Time Elements


A Few Notes:

1) Reliability was defined as the probability that a component, device, or system will perform satisfactorily for at least a given period of time when used under stated conditions.

2) To measure reliability we need to specify:

a) A precise definition of satisfactory performance;

b) The time base over which the performance must be maintained;

c) The environmental conditions that will be encountered.

3) The concept of reliability tells us that any given product has a built-in reliability function that relates its reliability to time and decreases as time progresses.

Note also that:

Reliability is a probability concept.

Reliability theory is a subset of quality control, but the Q.C. function deals primarily with new products under inspection, whereas reliability deals with products in service.

We shall consider the system as a set of interacting components working together as an integrated whole.

A system is said to fail when it ceases to perform its intended function.

4) When there is a total cessation of function, the system has clearly failed, but often it is necessary to define failure quantitatively in order to account for failure through deterioration or instability of function. Examples:

A motor that is no longer capable of delivering a specified torque.

A machine that no longer processes parts at its designed capacity.

5) The way in which time is specified in the definition of reliability varies considerably, depending on the nature of the system under consideration. Therefore in any intermittently operated system (say a switch), we must specify whether calendar time or the hours of operation is to be used.

6) Specifying the conditions under which a system is to operate is important. The conditions may be divided into:

The principal design loads: the weight that a structure must support, the electrical load on a generator, the rate of information transfer on a telecommunication system.

The environmental effects: temperature extremes, dust, salt, humidity, ...

7) Several quantities can be used to characterize the reliability of a system including mean time to failure and failure rate, and in the case of repairable systems mean time to repair and availability.

8) Reliability is defined positively, in terms of a system performing its intended function, and no distinction is made between failures. In reality, there is concern not only with the probability of failure, but also with the potential consequences of different modes of failure. In particular, failures that present severe safety problems are important. Home appliances, for example, must be reliable to avoid frequent failures that result in customer dissatisfaction and/or create a safety hazard such as electric shock.


1.7 System Life Cycle and Reliability

Successful implementation of R&M is dependent upon thorough communication between marketing and design engineers, and/or the user and supplier. This communication must begin at project conception and continue through the entire life of the product (equipment). This ensures that equipment problems will be identified, root causes determined, and corrective action implemented.

This section discusses a five-phase program management process that describes how R&M applies to the various phases that occur in the life cycle of a product (equipment). Techniques to assist the R&M implementation are also suggested. For more detail refer to the "Reliability and Maintainability Guideline" published by SAE.

Five-Phase Program Management Process

Machinery and equipment development programs can be managed using a five-phase program management process. The process starts in Phase 1 with concept and proceeds through decommissioning and/or conversion in Phase 5. This process is appropriate for any hardware development program for machinery and equipment.

Figure 1-6. Five Phases of Manufacturing Machinery and Equipment Life Cycle

The reliability activities taking place during each of these phases of the product life may be quite different. For instance, in the project definition, the objectives of the system are set forth in the form of one or more functional requirements. For an ergonomically designed chair, the exact requirements are specified; for a computer desk, the exact dimensions and specifications. In addition, the environment in which the system is to function must be determined (e.g., the range of temperature and humidity, and the concentrations of dust or other contaminants). Finally, the service life to which the system is to be designed must be specified.

From such requirements, a conceptual design is formulated that in broad form outlines how the system is to function, and provides the general plan for its construction. From the functional requirements comes the definition of failure, and thus of reliability. Reliability requirements may then be set, and the trade-offs between reliability, cost and functional requirements may be examined as the design proceeds into the detailed phase.


The conceptual design must be converted into a detailed set of drawings and specifications from which the system can be built. During this phase, maintenance requirements and procedures are also likely to be developed. As the design proceeds, experiments, testing, and analysis are required to choose between alternatives, to solve problems, and to predict the performance of subsystems or components.

Reliability considerations should permeate this stage of design in setting safety factors and design margins, eliminating unnecessary complexities, translating system reliability criteria into reliability requirements for subsystems, and setting time intervals for inspection, maintenance and replacement of parts subject to wear.

Note that in this stage, the detailed examination of potential failure mechanisms and models is most beneficial, for often they may be eliminated or mitigated without too much expense. In the later stages of the design of the process, prototypes are built and the first reliability tests may be performed.

Historically, reliability considerations during the manufacturing of a system are related to the practices of quality control. Reliability in manufacture is monitored and controlled, and the use of statistical Q.C. techniques for reliability testing on manufactured items is exceedingly important. Verification of end product reliability by testing to failure is not possible in large one-of-a-kind systems. Thus very stringent acceptance criteria on components, careful supervision and control of the construction process, and an elaborate set of proofs or acceptance tests are necessary in such situations.

Reliability and Phase 1 - Concept Phase

The first phase is research and limited development or design usually resulting in a proposal. During this phase both the user and the supplier must work together to establish system requirements. It is recommended that the user team include machine operators, maintenance personnel and product engineers. The supplier team should include MM&E suppliers.

Machinery mission and environmental requirements are defined during this phase. Also identified are safety issues, desired goals for reliability and maintainability and life cycle cost. Simultaneous (concurrent) engineering can be introduced at either Phase 1 or Phase 2 depending on the particular situation and MM&E.

Reliability and Phase 2 - Development/Design Phase

The development/design phase determines the majority of the life cycle cost. The issues from the concept phase are incorporated. Safety, ergonomics, accessibility and other maintainability issues are designed into the system. R&M allocation requirements are formalized.

Components and component suppliers should be selected based on the predictive R&M statistics they provide. It is recommended that MM&E suppliers utilize methods highlighted in the SAE guideline to assure that R&M goals will be met.

The design review is a procedure for assuring that the planned design is likely to, or does in fact, meet all requirements in the most cost-effective way, considering all variables and constraints.


1) Maintainability is a major consideration in the design review.

2) A preliminary review is held prior to commitment to a final design approach.

3) It is followed by an intermediate design review when more details of the design become available.

4) The status of design actions resulting from the preliminary design review may be reviewed at this time.

5) A design review is conducted to review overall readiness for production prior to release of drawings to the manufacturing function.

6) Regular design review sessions are recommended to ensure that communication is clear between the user and MM&E suppliers.

7) It is also recommended to include operators, maintenance personnel and product engineers in the design review. This will give all concerned an understanding of the design intent.

8) At this phase, considerations must be given in the design for demonstration of compliance to requirements through testing.

9) Suitable test plans must be developed.

Reliability and Phase 3 - Build & Install Phase

During the manufacturing and assembly of the machine, the achievement of reliability requirements should be monitored. Issues that could affect R&M must be communicated back to the design engineers to assure any redesign includes reliability improvements. Manufacturing process variables affecting R&M should be identified and targeted for control. MM&E suppliers and the user must negotiate meaningful R&M goals and requirements for future monitoring and divide responsibility for collecting, analyzing and reporting of data.

Several events occur during Phase 3:

Problems encountered when runoff tests are conducted should be documented for elimination.

Maintenance procedures are developed. A customer representative should be involved in this process.

Training starts here and continues to the next phases.

Machine acceptance testing should be agreed to and performed prior to teardown and installation.

R&M data base collection begins during machine acceptance testing. Problems encountered during this phase should be documented for future reference/use.

The machine will be transferred from the builder's location to the customer's plant. Critical assembly processes should be identified during teardown.

Installation is a very critical step: The machine has to be reassembled to the build requirements. Special attention should be given to the critical assembly processes identified during teardown.

Reliability and Phase 4 - Operation and Maintenance Phase

In this phase the equipment is at the customer location and fully operational. Data collection and feedback are very important at this phase. Data collection mechanisms should be in place and agreed upon by both parties. Information collected during this phase often leads to R&M growth and continuous improvement.


During this phase maintenance should be performed regularly. For an R&M initiative to be successful, the MM&E and component suppliers must have access to maintenance records and R&M data bases.

Reliability and Phase 5 - Decommissioning and/or Conversion Phase

This phase is the end of the expected life of the machine. During this phase the machine may require decommissioning due to an increasing failure rate that has resulted in increasingly expensive maintenance, or it may be rebuilt to a good-as-new state.

In another possible situation, the machine may still be in good condition but the production needs have changed requiring the machine to go through major conversion to be used for production of other products.

When either the decommissioning or conversion action is taken, the feedback from the user plant should be recorded and all the information should be used for R&M growth and continuous improvement in future generations of machinery.
