A system is a collection of subsystems, assemblies and/or components arranged in a specific design to achieve the desired functionality. A system can be repairable or non-repairable and the appropriate analysis method will differ based on this distinction. This article describes a mistake that is often made in repairable systems analysis (i.e., distribution analysis of times between failure) and presents two methods that are more appropriate for this type of analysis (i.e., analyzing system level data with a stochastic process model or analyzing component level data with a reliability block diagram). An example using race car field data demonstrates why distribution analysis of times between failure is not appropriate. This example is also used to highlight the advantages and disadvantages of the stochastic process model and reliability block diagram approaches.
Common Mistake When Analyzing Repairable Systems
When fitting a distribution, we assume that the events are statistically independent and identically distributed (s.i.i.d.). However, in a repairable system, the events (failures) are not independent and in most cases are not identically distributed. When a failure occurs in a repairable system, the remaining components have a current age. The next failure event depends on this current age. Thus, the failure events at the system level are dependent.
Performing a distribution analysis on the times between failure is equivalent to saying that we have 9 different systems, where System 1 failed after t1 hours of operation, System 2 failed after t2 hours, and so on.
This is the same as assuming that the system is AS-GOOD-AS-NEW after the repair, which is not true in repairable systems in general. In most cases, the system is AS-BAD-AS-OLD after the repair. This is particularly true for large systems, where replacing a component does not have a great impact on the system reliability. For example, replacing the starter does not have a great impact on the reliability of a car since there are many other ways that it may fail.
Table 1: Field Data for 3 Race Cars
As shown in Figure 1, we could use Weibull++ to fit a distribution to the times between failure for each system. Note that the PM times are not considered and the time between the last failure and the current age of the system is treated as a suspension. This analysis assumes that we have a sample of 19 systems, and one system failed at 7.3 Km, another failed at 27.4 Km, and so on. The result is a 2-parameter Weibull distribution with beta = 1.1043 and eta = 336.7140. When you use this analysis to calculate the probability that the driver will finish the 200 Km race, the estimate is 56.97%. However, this result is not valid because the events (times between failure) are not s.i.i.d. When applied inappropriately, the analysis method yields incorrect results.
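The 56.97% figure can be checked directly from the reported parameters, because a 2-parameter Weibull distribution has reliability R(t) = exp(-(t/eta)^beta). The following is a minimal Python sketch of that arithmetic; the function name `weibull_reliability` is illustrative and is not part of Weibull++.

```python
import math


def weibull_reliability(t, beta, eta):
    """Reliability R(t) = exp(-(t/eta)**beta) of a 2-parameter Weibull."""
    return math.exp(-((t / eta) ** beta))


# Parameters reported above for the times-between-failure fit.
beta, eta = 1.1043, 336.7140

# Probability of completing a 200 Km race under this (invalid) model.
r_200 = weibull_reliability(200.0, beta, eta)
print(f"R(200 Km) = {r_200:.4f}")  # approximately 0.5697
```

Reproducing the number does not make it meaningful: the calculation is only as good as the s.i.i.d. assumption behind the fit, which does not hold for these data.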
Figure 1: Distribution Analysis on Times Between Failure (in Weibull++)
Instead of fitting a distribution to the times between failure for each system, we could fit a distribution to the first time-to-failure for each system. These are statistically independent and identically distributed events. Figure 2 shows this analysis performed in Weibull++.
Figure 2: Distribution Analysis on First Time-to-Failure per System (in Weibull++)
The results from this type of analysis are limited, however. We could use this analysis to estimate the probability that the car will not fail in the first 200 Km (84.17%). But the confidence interval for this estimate is very wide (one-sided lower 90% bound = 51.13%). When we go on to estimate the probability that no failures will occur in the first ten races (2,000 Km), the estimated reliability is effectively 0%, i.e., the car is almost certain to fail at least once during those ten races. However, we cannot use this analysis to estimate how many times the car will fail during the ten races. Nor can we determine whether or when to overhaul the system.
Clearly, a different analysis approach is required that will provide answers to these and other important questions. The remainder of this article presents two methods that are more appropriate for repairable systems analysis and considers the advantages and disadvantages of each method.
Using a Stochastic Process Model to Analyze Data at the System Level
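When only system-level failure data are available, a stochastic process model can be used to describe the sequence of failures over the life of the system. The Non-Homogeneous Poisson Process (NHPP) with a power law failure intensity is such a model. It assumes that the system is AS-BAD-AS-OLD after each repair, and its failure intensity is given by:
$$u(t) = \lambda \beta t^{\beta - 1}, \quad \lambda > 0, \ \beta > 0, \ t > 0$$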
NOTE: If we assume that the repair partially renews the system and it is not AS-BAD-AS-OLD after the repair, then the NHPP model may not be the most appropriate model for the analysis. The General Renewal Process (GRP) may be used instead. This model has been discussed in a previous Reliability Edge article (Volume 6, Issue 1, on the Web at http://www.ReliaSoft.com/newsletter/v6i1/restoration.htm) and is available in Weibull++ 7's Parametric RDA folio.
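To see what the power law NHPP implies numerically, here is a rough Python sketch of three standard quantities: the failure intensity u(t) = lambda * beta * t^(beta - 1), the expected cumulative number of failures E[N(t)] = lambda * t^beta, and the probability of completing a mission of a given length starting at system age T, exp(-lambda * ((T + d)^beta - T^beta)). The function names are illustrative, and the parameter values are hypothetical placeholders, not the values fitted to the race car data.

```python
import math


def intensity(t, lam, beta):
    """Power law failure intensity u(t) = lam * beta * t**(beta - 1), for t > 0."""
    return lam * beta * t ** (beta - 1.0)


def expected_failures(t, lam, beta):
    """Expected cumulative number of failures by age t: lam * t**beta."""
    return lam * t ** beta


def mission_reliability(age, duration, lam, beta):
    """Probability of no failure in (age, age + duration] for an NHPP."""
    delta = expected_failures(age + duration, lam, beta) - expected_failures(age, lam, beta)
    return math.exp(-delta)


# Hypothetical parameters, chosen only for illustration.
lam, beta = 2.0e-5, 1.6

print(expected_failures(2000.0, lam, beta))        # expected failures in 2,000 Km
print(mission_reliability(0.0, 200.0, lam, beta))   # first 200 Km race
print(mission_reliability(400.0, 200.0, lam, beta)) # third race, after 400 Km
```

With beta greater than 1 the intensity increases with age, so the third-race probability comes out lower than the first-race probability, which is the AS-BAD-AS-OLD behavior that a renewal-based distribution fit cannot represent.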
Using the NHPP Power Law Model for the Race Car Analysis
Figure 3: NHPP Power Law Analysis (in RGA 6)
Figure 4: Cumulative Number of Failures from the NHPP Analysis in RGA 6
Using the Quick Calculation Pad, we can also estimate the probability that the driver will finish the first race (87.31%) and the probability that the driver will finish the third race given that the car has already run the first two races (66.70%). We can also estimate the optimum overhaul time for the car by considering the average repair cost ($192,000) and the overhaul cost ($500,000). The result is about 1,560 Km, or approximately one overhaul every 8 races per vehicle. These results are shown in Figure 5.
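One common way to arrive at an overhaul interval under the power law NHPP is to minimize the long-run cost per Km, (C_repair * lambda * T^beta + C_overhaul) / T, which for beta > 1 gives T* = [C_overhaul / (lambda * (beta - 1) * C_repair)]^(1/beta). The Python sketch below uses the repair and overhaul costs quoted above, but the lambda and beta values are hypothetical stand-ins for the fitted parameters, so its output is not expected to match the 1,560 Km figure. That RGA uses exactly this formulation is an assumption here.

```python
def cost_rate(T, lam, beta, c_repair, c_overhaul):
    """Long-run cost per Km when the system is overhauled every T Km."""
    return (c_repair * lam * T ** beta + c_overhaul) / T


def optimum_overhaul_time(lam, beta, c_repair, c_overhaul):
    """Overhaul interval that minimizes cost_rate; requires beta > 1."""
    if beta <= 1.0:
        # Without an increasing failure intensity, overhauls never pay off.
        raise ValueError("an optimum overhaul time exists only for beta > 1")
    return (c_overhaul / (lam * (beta - 1.0) * c_repair)) ** (1.0 / beta)


# Costs from the article; lam and beta are hypothetical, for illustration only.
C_REPAIR, C_OVERHAUL = 192_000.0, 500_000.0
lam, beta = 2.0e-5, 1.6

t_star = optimum_overhaul_time(lam, beta, C_REPAIR, C_OVERHAUL)
print(f"Overhaul every {t_star:.0f} Km")
print(f"Cost per Km at optimum: {cost_rate(t_star, lam, beta, C_REPAIR, C_OVERHAUL):.2f}")
```

The closed form follows from setting the derivative of the cost rate to zero; the guard on beta reflects the fact that an overhaul is only worthwhile when the failure intensity increases with age.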
Figure 5: Probabilities of Finishing Race 1 and Race 3 and Optimum Overhaul Time
As you can see, the NHPP analysis allows us to answer many questions of interest for a repairable system. However, there are still some unanswered questions, including:
If we have data at the component level (Lowest Replaceable Unit, LRU), we can use a Reliability Block Diagram (RBD) approach to answer these and other questions.
Using an RBD for the Race Car Analysis
We can use Weibull++ to analyze the times-to-failure and suspensions for each component. The results are shown in Table 2.
Table 2: Component Distributions and Parameters
We can then use ReliaSoft's BlockSim software to create an RBD that represents the reliability-wise configuration of these components, as shown in Figure 6. We use the Weibull++ analyses to define the failure characteristics for each block in the diagram and also enter the repair durations and costs. For the brakes, we define a preventive maintenance policy, which specifies that all four brakes will be replaced every 200 Km.
Figure 6: Race Car RBDs
By simulating the operation of the system for 2,000 Km, we obtain the results displayed in Figures 7 and 8. Some of the results of interest include the expected number of system failures (5.104), the total costs ($910,194) and the number of spare parts required for each component.
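To give a feel for what such a simulation does, here is a heavily simplified Monte Carlo sketch in Python. It assumes a purely series configuration, instantaneous repairs that replace the failed component with a new one, and no preventive maintenance. The component names and Weibull parameters are hypothetical, not the values from Table 2, and BlockSim's own simulation handles far more detail (repair durations, costs, maintenance policies).

```python
import random

# Hypothetical components: name -> (beta, eta in Km). Not the Table 2 values.
COMPONENTS = {
    "engine": (1.5, 3000.0),
    "transmission": (1.2, 5000.0),
    "brakes": (2.0, 1500.0),
}


def simulate_failures(components, horizon, rng):
    """Simulate one system history; return failure counts per component."""
    counts = {}
    for name, (beta, eta) in components.items():
        clock, failures = 0.0, 0
        while True:
            # weibullvariate takes the scale (eta) first, then the shape (beta).
            clock += rng.weibullvariate(eta, beta)
            if clock > horizon:
                break
            failures += 1  # failed component is replaced with a new one
        counts[name] = failures
    return counts


def expected_system_failures(components, horizon, runs=2000, seed=1):
    """Average number of system failures over many simulated histories."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        # In a series system, every component failure is a system failure.
        total += sum(simulate_failures(components, horizon, rng).values())
    return total / runs


print(expected_system_failures(COMPONENTS, 2000.0))
```

The per-component counts returned by `simulate_failures` are also what a spare parts estimate would be built from: averaging them over many runs gives the expected number of replacements for each component.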
Figure 7: System-Level Results
Figure 8: Component Results
The advantages of this approach include the ability to:
The main disadvantage is that the analysis requires detailed information, including failure and repair data at the LRU level.
For more information on the software used to perform the analyses described in this article, visit ReliaSoft’s website at http://www.ReliaSoft.com/Weibull/, http://www.ReliaSoft.com/rga and http://www.ReliaSoft.com/BlockSim.