Comparing Maintenance Strategies Based on Cost and Availability
Reliability Centered Maintenance (RCM)
analysis provides a structured framework for analyzing the functions and
potential failure modes for a physical asset (such as an airplane, a
manufacturing production line, etc.) in order to develop a scheduled
maintenance plan that will provide an acceptable level of operability, with
an acceptable level of risk, in an efficient and cost-effective manner.
RCM techniques often utilize a logic diagram
approach for evaluating the potential effects of failure and selecting the
appropriate maintenance strategy. As an example, Figure 1 shows a portion of
one of the decision-making flowcharts presented in the SAE JA1012 document,
A Guide to the Reliability-Centered Maintenance (RCM) Standard. Similar
diagrams are provided in other published RCM guidelines. (Some of the major
RCM publications are listed in the References section of this article.)
In addition to, or instead of, a logic
diagram approach, the RCM analyst may wish to use cost- and
availability-based comparisons of potential maintenance strategies when
selecting and assigning maintenance tasks. This article provides an overview
of these comparison techniques along with a couple of demonstration examples.
Types of Maintenance Strategies to Consider
Although there is variation among practitioners regarding the
terminology used to describe the available maintenance techniques, in
general, the RCM analyst may consider any of the following maintenance
strategies to address a potential failure mechanism:
- Run-to-Failure - fix the equipment when it fails but do not perform any
  scheduled maintenance.
- Scheduled Inspections
  - Failure Finding Inspections - inspect the equipment on a scheduled basis
    to discover hidden failures. If the equipment is found to be failed,
    initiate corrective maintenance.
  - On-Condition Inspections - inspect the equipment on a scheduled or
    ongoing basis to discover conditions indicating that a failure is about
    to occur. If the equipment is found to be about to fail, initiate
    preventive maintenance.
- Scheduled Preventive Maintenance
  - Service - perform lubrication or other servicing actions on a scheduled
    basis.
  - Repair - repair or overhaul the equipment on a scheduled basis.
  - Replace - replace the equipment on a scheduled basis.
- Design Change - redesign the equipment, select different equipment or make
  some other one-time change to improve the reliability/availability of the
  equipment.
Using Simulation to Compare Maintenance Strategies
Given certain information about how the equipment will be
operated, the probability of occurrence for the failure mode and the
maintenance characteristics, the analyst can use simulation to estimate the
cost and average availability that can be expected over the operational life
of the equipment when a particular maintenance strategy is employed. The
calculations can then be used to compare available maintenance strategies so
that the analyst can select the most cost-effective strategy that provides
an acceptable level of performance.
Run-to-Failure (Corrective Maintenance Only)
To estimate the cost and average availability that can be
expected for a run-to-failure (corrective maintenance only) maintenance
strategy, the analyst must provide the following information:
- The amount of time that the equipment will operate.
- The probability density function (pdf)
that describes the probability that the equipment will fail due to a
particular failure cause.
- An indication of whether the failure is
detectable during normal operation.
- The amount of time that the equipment is
expected to be down each time corrective maintenance is required. This can
include the time to perform the maintenance as well as any logistical
delays (i.e., waiting for labor and/or materials required).
- The cost each time corrective maintenance
is required, including the downtime, labor, materials and other costs.
- The degree to which the equipment will be
restored by corrective maintenance (e.g., "as good as new," "as bad
as old," etc.).
The analyst can then simulate the operation
of the equipment for the specified operating time, given the specified
reliability/maintainability characteristics, in order to estimate 1) the
expected number of corrective maintenance actions that will be performed and
2) the amount of time that the equipment is expected to be operating
(uptime) over the specified time. These estimates can then be used to
calculate the total operating cost, cost per uptime and average
availability, as follows:
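The relationships described in this paragraph can be written out as below. The symbols are notation introduced here for illustration: N_CM is the expected number of corrective maintenance actions, C_CM is the cost per corrective action, T_up is the expected uptime and T_op is the specified operating time.

```latex
\begin{align*}
\text{Total Operating Cost} &= N_{CM} \cdot C_{CM} \\[4pt]
\text{Cost per Uptime} &= \frac{\text{Total Operating Cost}}{T_{up}} \\[4pt]
\text{Average Availability} &= \frac{T_{up}}{T_{op}}
\end{align*}
```

These forms agree with the run-to-failure figures reported in Example 1: $6,626.25 / 115,114.20 miles is about $0.058 per mile, and 115,114.20 / 120,000 is about 95.93%.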
To calculate the cost and availability that can be expected from
a maintenance strategy that involves preventive repair/replacement of the
equipment, the following information is required (in addition to the inputs
described above):
- The time interval at which the preventive maintenance will be performed.
- The amount of time that the equipment is expected to be down each time
  preventive maintenance is performed.
- The cost each time preventive maintenance is performed.
- The degree to which the equipment will be restored by preventive
  maintenance.
With this additional
information, simulation can be used to estimate the expected number of
corrective maintenance (CM) and preventive maintenance (PM) actions, along
with the uptime. The total operating cost for this maintenance strategy
includes the cost of all CMs plus the cost of all PMs, as shown next. Note
that the Cost per Uptime and Average Availability calculations are the same,
regardless of task type.
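Written out, the total for a preventive repair/replacement strategy takes the following form. The symbols are notation introduced here: N_CM and N_PM are the expected numbers of corrective and preventive actions returned by the simulation, and C_CM and C_PM are the costs per action.

```latex
\[
\text{Total Operating Cost} = N_{CM} \cdot C_{CM} + N_{PM} \cdot C_{PM}
\]
```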
Calculations for Service and
Failure Finding tasks are performed in a similar manner except that the
assumptions of the simulation will vary to fit the conditions of the task.
For example, if the failure is undetectable during normal operation and the
equipment is found to be failed during a scheduled Service task, then the
simulation will assume that corrective maintenance will be initiated.
Likewise, a Failure Finding task can initiate corrective action if the
equipment is found to be failed but does not restore the equipment to any
degree if it is found to be running.
On-Condition Inspection tasks (which are designed to monitor the
equipment at scheduled intervals or on an ongoing basis and initiate
preventive maintenance only if a specific condition is detected) require
additional information and a more complex simulation/calculation method. In
addition to operating life, probability of failure and corrective
maintenance characteristics, the analyst must describe the characteristics
of the scheduled inspections that will be performed:
- The time interval at which the inspection will be performed.
- The amount of time that the equipment is expected to be down each time an
  inspection is performed.
- The cost each time an inspection is performed.
- An indication of when the approaching failure will become detectable
  during inspection (which could be described as a percentage of item life
  or as a fixed time interval).
For the cases in which the
inspection detects that a failure is approaching, the analysis also requires
the downtime, cost and restoration factor associated with the preventive
maintenance that will be initiated.
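To make the detection input concrete, the following is a minimal Python sketch of how one simulated item life might be classified when detectability is expressed as a percentage of item life. It assumes inspections occur at fixed multiples of the interval and ignores inspection downtime; the function name and parameters are hypothetical and do not come from any particular tool.

```python
import math


def first_detecting_inspection(life, interval, detect_fraction):
    """Return the time of the first inspection that can see the approaching
    failure, or None if the item fails before any inspection catches it.

    life            -- simulated time to failure of the item
    interval        -- fixed time between inspections
    detect_fraction -- fraction of item life after which the approaching
                       failure becomes detectable (e.g. 0.8 for 80%)
    """
    if interval <= 0 or not 0.0 <= detect_fraction <= 1.0:
        raise ValueError("interval must be > 0 and detect_fraction in [0, 1]")
    window_start = detect_fraction * life
    # Index of the first scheduled inspection at or after the window start.
    k = max(1, math.ceil(window_start / interval))
    inspection_time = k * interval
    # Detection only helps if the inspection happens before the failure.
    return inspection_time if inspection_time < life else None


# An item that would fail at 50,000 miles, detectable from 80% of its life:
caught = first_detecting_inspection(50_000, 15_000, 0.8)   # inspection at 45,000
missed = first_detecting_inspection(50_000, 30_000, 0.8)   # no inspection in window
print(caught, missed)
```

When the function returns a time, the simulated item would receive preventive maintenance at that inspection; when it returns None, the item runs to failure and corrective maintenance applies.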
Simulation of this scenario will
return 1) the expected number of corrective maintenance actions, 2) the
expected number of inspections, 3) the expected number of preventive
maintenance actions and 4) the amount of uptime. The total operating cost
then includes the cost of all CMs plus all inspections plus all PMs.
This total operating cost is
then used to calculate cost per uptime and average availability as described
above.
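In symbols, the total for an on-condition strategy can be sketched as follows. The notation is introduced here for illustration: N_I is the expected number of inspections and C_I is the cost per inspection, with N_CM, C_CM, N_PM and C_PM denoting the expected counts and per-action costs of corrective and preventive maintenance.

```latex
\[
\text{Total Operating Cost} = N_{CM} \cdot C_{CM} + N_{I} \cdot C_{I} + N_{PM} \cdot C_{PM}
\]
```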
Example 1 - Mechanical Component with Wearout
Consider an RCM analysis for a large truck that is intended to
operate for 120,000 miles per year. A critical failure mode has been
identified for a mechanical component and reliability analysis indicates
that the failure behavior follows a Weibull distribution with beta = 2.3 and
eta = 72,000 miles. Considering logistical factors, downtime penalties and
the actual repair resources, it takes 7 work days (3,500 miles of lost
“production”) and costs $4,650 each time the component must be replaced when
it fails. The component will be “as good as new” after the maintenance
action. The RCM analysis team is considering whether to incorporate a
scheduled preventive replacement task into the maintenance plan. Because
there are no additional logistical delays/costs for a planned replacement,
the PM task will take only 1 work day and cost $2,050.
Using the RCM++ software, the
team can first estimate the optimum preventive replacement time for the
component and then simulate the operation of the equipment for 120,000 miles
to estimate the cost and average availability that can be expected in a year
from the two maintenance strategies that are under consideration. By
entering the cost of corrective maintenance (CM), the cost of preventive
maintenance (PM) and the probability of failure into the following equation,
the optimum PM interval is determined to be 60,330.25 miles.
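The equation referred to above is not reproduced in this text. A common choice, and the one assumed in this sketch, is the classical age-replacement model, which picks the replacement age t that minimizes the long-run cost per unit of operating time, CPUT(t) = [C_PM * R(t) + C_CM * (1 - R(t))] / (integral of R(s) from 0 to t), where R is the Weibull reliability function. The function names, the 10-mile grid and the trapezoid integration are illustrative choices, not part of RCM++. Under these assumptions the minimum falls close to the 60,330-mile figure quoted above.

```python
import math


def weibull_reliability(t, beta, eta):
    """Two-parameter Weibull reliability R(t) = exp(-(t/eta)^beta)."""
    return math.exp(-((t / eta) ** beta))


def optimum_replacement_age(beta, eta, cost_pm, cost_cm, t_max, step=10.0):
    """Grid search for the replacement age that minimizes cost per unit time.

    Assumes the classical age-replacement model (an assumption of this
    sketch). Returns (best_age, best_cost_rate).
    """
    best_t, best_rate = None, float("inf")
    integral = 0.0   # running trapezoid estimate of the integral of R on [0, t]
    prev_r = 1.0     # R(0) = 1
    n_steps = int(t_max / step)
    for i in range(1, n_steps + 1):
        t = i * step
        r = weibull_reliability(t, beta, eta)
        integral += 0.5 * (prev_r + r) * step
        prev_r = r
        # Expected cost per cycle divided by expected operating time per cycle.
        rate = (cost_pm * r + cost_cm * (1.0 - r)) / integral
        if rate < best_rate:
            best_t, best_rate = t, rate
    return best_t, best_rate


# Inputs from Example 1: beta = 2.3, eta = 72,000 miles, PM $2,050, CM $4,650.
t_opt, rate = optimum_replacement_age(2.3, 72_000, 2_050, 4_650, t_max=200_000)
print(f"optimum replacement age: {t_opt:,.0f} miles, cost rate: ${rate:.4f}/mile")
```

For a failure mode with beta below 1, the same search simply returns the upper bound t_max, because the cost rate keeps falling as the replacement age grows.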
Rounding to 60,000 miles and
performing the simulation yields the following results per vehicle per year:
Run-to-Failure:
- 1.43 CMs and Uptime = 115,114.20 miles
- Total Operating Cost = $6,626.25
- Cost per Uptime = $0.058 per mile
- Average Availability = 95.93%
Preventive Replacement at 60,000 miles:
- 0.98 CMs, 0.79 PMs and Uptime =
- Total Operating Cost = $6,188.95
- Cost per Uptime = $0.053 per mile
- Average Availability = 96.87%
Figure 2 shows the results of
the preventive replacement analysis in RCM++. The analysis indicates that
the scheduled replacement strategy provides both lower cost and better
availability. Note that the differences between the two strategies will be
even greater when applied to the entire fleet of vehicles over multiple
years of operation.
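The comparison in this example can be approximated with a short Monte Carlo sketch in Python. This is not the RCM++ simulation engine. It assumes the component starts new, every CM or PM restores it to as good as new, the PM is triggered by component age, and downtime consumes part of the 120,000-mile year. The 500-mile PM downtime is derived from the article's 7 work days = 3,500 miles. Because the assumptions and random sampling differ, the estimates should not be expected to match the published figures exactly.

```python
import random


def simulate_strategy(horizon, beta, eta, cm_cost, cm_down,
                      pm_age=None, pm_cost=0.0, pm_down=0.0,
                      runs=20_000, seed=1):
    """Estimate expected CMs, PMs, uptime, cost and availability per item.

    pm_age=None means run-to-failure (corrective maintenance only).
    """
    rng = random.Random(seed)
    n_cm = n_pm = 0
    uptime = 0.0
    for _ in range(runs):
        clock = 0.0
        while clock < horizon:
            # random.weibullvariate takes (scale, shape), i.e. (eta, beta).
            life = rng.weibullvariate(eta, beta)
            fails_first = pm_age is None or life < pm_age
            run = life if fails_first else pm_age
            if clock + run >= horizon:
                uptime += horizon - clock   # still operating at the end
                break
            uptime += run
            clock += run
            if fails_first:
                n_cm += 1
                clock += cm_down
            else:
                n_pm += 1
                clock += pm_down
    cms, pms, up = n_cm / runs, n_pm / runs, uptime / runs
    total = cms * cm_cost + pms * pm_cost
    return {"cms": cms, "pms": pms, "uptime": up, "total_cost": total,
            "cost_per_uptime": total / up, "availability": up / horizon}


# Example 1 inputs (distances in miles, costs in dollars).
rtf = simulate_strategy(120_000, 2.3, 72_000, cm_cost=4_650, cm_down=3_500)
pm = simulate_strategy(120_000, 2.3, 72_000, cm_cost=4_650, cm_down=3_500,
                       pm_age=60_000, pm_cost=2_050, pm_down=500)
for name, res in (("run-to-failure", rtf), ("replace at 60,000", pm)):
    print(name, {k: round(v, 4) for k, v in res.items()})
```

Fixing the seed keeps the run repeatable; increasing `runs` narrows the sampling noise at the cost of run time.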
Example 2 - Electrical Component with Infant Mortality
Another critical failure mode has been identified for an
electrical component of the truck described in Example 1. This failure mode
follows a Weibull distribution with beta = 0.76 and eta = 100,000 miles. The RCM
analysis team is considering a planned replacement for this component at
60,000 miles to coincide with the PM for the mechanical component. For this
failure mode, the CM downtime is 4 work days; the CM cost is $2,800; the PM
cost would be $1,200 and there would be no additional PM downtime because
the equipment is already down for the other maintenance task. The analysis
yields the following results:
Run-to-Failure:
- 1.21 CMs and Uptime = 117,617.74 miles
- Total Operating Cost = $3,374.00
- Cost per Uptime = $0.029 per mile
- Average Availability = 98.02%
Preventive Replacement at 60,000 miles:
- 1.39 CMs, 0.88 PMs and Uptime =
- Total Operating Cost = $4,934.80
- Cost per Uptime = $0.042 per mile
- Average Availability = 97.71%
In this case, the analysis
indicates that a run-to-failure maintenance strategy will be more
cost-effective and provide better availability. In fact, since the beta
parameter of the failure distribution is less than 1, this indicates that
the equipment does not experience wearout and there is no optimum preventive
replacement time. The team could repeat the analysis for other maintenance
intervals and would always determine that run-to-failure is more
cost-effective.
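The role of beta can be seen from the failure rate of the two-parameter Weibull distribution, shown here in standard form with f the pdf and R the reliability function:

```latex
\[
h(t) = \frac{f(t)}{R(t)} = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta - 1}
\]
```

With beta = 0.76 the exponent is -0.24, so the failure rate falls as the component ages. Replacing a surviving component with a new one would put a higher-risk item into service, which is why a scheduled replacement cannot pay for itself here.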
As this article demonstrates, cost-based comparisons can help RCM analysts
select the most appropriate maintenance strategy for a particular piece of
equipment/failure mode. ReliaSoft's RCM++
software automatically performs the maintenance task cost calculations
described here. This functionality relies on the same powerful simulation
engine available in ReliaSoft's BlockSim software, which can also be used
for maintenance planning and other more complex system reliability,
maintainability and availability analyses. For more information, see
References
- ATA MSG-3 “Operator/Manufacturer Scheduled Maintenance Development,”
  updated in March 2003.
- NAVAIR 00-25-403 “Guidelines for the Naval Aviation Reliability-Centered
  Maintenance Process,” issued in
- Nowlan, F. Stanley and Howard F. Heap, Reliability-Centered Maintenance.
  Issued in December 1978.
- SAE JA1011 “Evaluation Criteria for Reliability-Centered Maintenance (RCM)
  Processes,” issued in August 1999.
- SAE JA1012 “A Guide to the Reliability-Centered Maintenance (RCM)
  Standard,” issued in January 2002.