Graphical Analysis of Repair Data

 Guest Submission

Wayne Nelson

Purpose. This article presents a simple and informative plot for analyzing data on numbers or costs of repeated repairs of a sample of products. The plotting method provides a nonparametric graphical estimate of the mean cumulative number or cost of repairs per unit versus age. The mean cumulative function (MCF) presented below can be used to

(1) evaluate whether the population repair (or cost) rate increases or decreases with age (this is useful for product retirement and burn-in decisions),

(2) estimate the average number or cost of repairs per unit on warranty or over the design life of the product,

(3) compare two or more sets of data from different designs, production periods, maintenance policies, environments, operating conditions, etc.,

(4) predict future numbers and costs of repairs,

(5) reveal unexpected information and insight, an important advantage of plots.

This article presents the basic population model for such repeated events data and its mean cumulative function (MCF). Using a typical data set for repair data (transmission repair data from cars on a preproduction road test), this article shows how to calculate and plot a sample estimate of the MCF for data from products with a mix of ages. This is followed by an explanation of how to use and interpret such plots.

ReliaSoft's RDA Utility, a free software tool available from www.weibull.com, can be used to automate the analysis and plotting. The author presents such analyses and plots in more detail in his new book, Recurrent-Events Data Analysis for Product Repairs, Disease Episodes, and Other Applications, published in 2003 in the ASA-SIAM Series on Statistics and Applied Probability.

Repair Data
This section describes typical repair data from a sample of units.

Transmission data. Table 1 displays typical repair data for 34 cars in a preproduction road test. Information sought from the data includes (1) the mean cumulative number of repairs per car by 24,000 test miles (the "design life," equivalent to 132,000 customer miles, since 1 test mile equals 5.5 customer miles) and (2) whether the population repair rate increases or decreases as the population ages. For each car, the data set consists of the car mileage at each transmission repair and the latest mileage. For example, car 24 had a repair at 7,068 miles, and it was observed until 26,774 miles. In this table, a + indicates the latest mileage observed for a car, called its "censoring age."

Table 1: Transmission repair data

Censoring. A unit’s latest observed age is called its "censoring age," because its repair history beyond that age is unknown. Usually, unit censoring ages differ. The different censoring ages complicate the data analysis and require the methods here. A unit may have no failures; then the censoring age is the only data value. Other units may have one, two, three or more repairs.
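One way to hold such data in a program is sketched below. This is a minimal illustration, not part of the article's method: the class name UnitHistory and its fields are hypothetical. The record for car 24 uses the two mileages quoted in the text, and the second record uses made-up values to show a unit with no repairs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UnitHistory:
    """Repair history of one sample unit (hypothetical helper, not from the article)."""
    unit_id: str
    censoring_age: float  # latest observed age, e.g. miles
    repair_ages: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Keep repair ages in increasing order and check them against the censoring age.
        self.repair_ages = sorted(self.repair_ages)
        if self.censoring_age < 0:
            raise ValueError("censoring age must be non-negative")
        if self.repair_ages and self.repair_ages[-1] > self.censoring_age:
            raise ValueError("a repair cannot occur after the censoring age")


# Car 24 as described in the text: one repair at 7,068 miles, observed to 26,774 miles.
car_24 = UnitHistory("24", censoring_age=26774, repair_ages=[7068])

# A unit with no repairs contributes only its censoring age (made-up value).
no_repairs = UnitHistory("hypothetical", censoring_age=20000)
```

Storing the censoring age separately from the repair ages makes it hard to confuse the end of observation with a repair.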

Age. Here "age" (or "time") means any useful measure of product usage, such as mileage, days, cycles, months, etc.

The Population and Its Mean Cumulative Function
Model. The repair information needed on a population is given by its mean cumulative function (MCF). The MCF is a feature of the following population model, which has no censoring; censoring is a property of the data collection, not of the population. At a particular age t, each population unit has accumulated a total cost (or number) of repairs. Figure 1 depicts such cumulative cost histories as smooth curves for ease of viewing. In reality, the histories are staircase functions, where the rise of each step is the cost or number of repairs at that age, but staircase functions are hard to view in such a plot. Because the accumulated totals differ from unit to unit, there is a population distribution of the cumulative cost (or number) of repairs at each age t. It appears in Figure 1 as a continuous density drawn vertically at age t. This distribution has a population mean M(t), which is plotted versus t as a heavy line in Figure 1. M(t) is called the population "mean cumulative function" (MCF) for the cost (or number) of repairs. It provides most of the information sought from repair data. The model involves no distributions for times to or between repairs.
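In symbols, the definition above can be written compactly. The notation Y(t) for the cumulative number (or cost) of repairs of one population unit by age t is an assumption introduced here for illustration; the article itself names only M(t).

```latex
% Y(t): cumulative number (or cost) of repairs of a randomly chosen
% population unit by age t (notation assumed here, not from the article)
M(t) = \mathrm{E}\bigl[\, Y(t) \,\bigr]
```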

Figure 1: Population cumulative cost histories (uncensored), distribution

Repair rate. When M(t) is for the number of repairs, the derivative

m(t) = dM(t)/dt

is called the population "instantaneous repair rate." It is also called the "recurrence rate" or "intensity function" when some other repeating occurrence is observed. It is expressed in repairs per unit time per product, e.g., transmission repairs per 1000 miles per car. Some mistakenly call m(t) the "failure rate," which causes confusion with the quite different failure rate (hazard function) of a life distribution for non-repaired units (usually components). The failure rate for a life distribution has an entirely different definition, meaning and use, as noted by Ascher and Feingold (1984).

Estimate and Plot of the MCF
This section shows how to calculate and plot the sample MCF. The next section shows how to interpret the plot.

Steps. The following steps yield a nonparametric estimate M*(t) of the population MCF M(t) for the number of repairs from a sample of N units.

1. List all repair and censoring ages in order from smallest to largest as in column (1) of Table 2. Denote each censoring age with a +. If a repair age of a unit equals its censoring age, put the repair age first. If two or more units have a common age, list them in a suitable order, possibly random.

Table 2: MCF calculations

2. For each sample age, write the number I of units that passed through that age ("at risk") in column (2) as follows. If the earliest age is a censoring age, then write I = N - 1; otherwise, write I = N. Proceed down column (2) writing the same I value for each successive repair age. At each censoring age, reduce the I value by one. For the last age, I = 0.
3. For each repair, calculate its observed mean number of repairs at that age as 1/I. For example, for the repair at 28 miles, 1 / 34 = 0.03, which appears in column (3). For a censoring age, the observed mean number is zero, corresponding to a blank in column (3). However, the censoring ages determine the I values of the repairs and thus are properly taken into account.
4. In column (4), calculate the sample mean cumulative function M*(t) for each repair as follows. For the earliest repair age this is the corresponding mean number of repairs, namely 0.03 in Table 2. For each successive repair age this is the corresponding mean number of repairs (column (3)) plus the preceding mean cumulative number (column (4)). For example, at 19,250 miles this is 0.04 + 0.27 = 0.31. Censoring ages have no mean cumulative number.
5. For each repair, plot on graph paper its mean cumulative number (column (4)) against its age (column (1)) as in Figure 2. This plot displays the nonparametric estimate M*(t), also called the sample MCF, as a red staircase function. Censoring times are not plotted.
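The tabular steps above can be sketched in a few lines of Python. This is an illustrative sketch, not the RDA Utility's implementation: the function name sample_mcf and the three-unit data set are hypothetical, and ties between a repair and a censoring age are resolved by listing the repair first, which is one of the orderings the steps allow.

```python
from typing import List, Sequence, Tuple


def sample_mcf(histories: Sequence[Tuple[Sequence[float], float]]) -> List[Tuple[float, float]]:
    """Return (repair age, M*) pairs for the sample MCF of the number of repairs.

    Each history is (repair_ages, censoring_age) for one unit.
    """
    REPAIR, CENSOR = 0, 1  # REPAIR sorts before CENSOR at equal ages
    events = []
    for repair_ages, censoring_age in histories:
        for age in repair_ages:
            if age > censoring_age:
                raise ValueError("a repair cannot occur after the censoring age")
            events.append((age, REPAIR))
        events.append((censoring_age, CENSOR))
    events.sort()  # column (1): all ages from smallest to largest

    at_risk = len(histories)  # column (2): I starts at N
    mcf = 0.0
    points = []
    for age, kind in events:
        if kind == CENSOR:
            at_risk -= 1              # each censoring age reduces I by one
        else:
            mcf += 1.0 / at_risk      # column (3): 1/I, accumulated into column (4)
            points.append((age, mcf))
    return points


# Hypothetical data for three units (not the transmission data of Table 1).
example = [
    ([100.0], 500.0),          # one repair, observed to age 500
    ([], 300.0),               # no repairs, observed to age 300
    ([200.0, 400.0], 600.0),   # two repairs, observed to age 600
]
points = sample_mcf(example)
# Increments are 1/3, 1/3 and 1/2, so M* is about 0.33, 0.67 and 1.17
# at ages 100, 200 and 400.
```

The returned pairs correspond to columns (1) and (4) of the table for the repair rows only, which are the points that are plotted.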

Plot. Figure 2 was plotted by ReliaSoft's RDA Utility, which does the calculations above. The program also calculated and plotted Nelson's (2003) nonparametric approximate 95% confidence limits for M(t), shown above and below each data point with green lines.

Figure 2: Transmission data MCF and 95% confidence limits

How to Interpret and Use the Plot
MCF estimate. The plot displays a nonparametric estimate M*(t) of M(t). This estimate involves no assumptions about the form of M(t) or the process generating the product histories. The estimate is a staircase function that is flat between repair ages, but the flat portions need not be plotted. The MCF of a large population is usually regarded as a smooth curve, and one usually imagines a smooth curve through the plotted points. Interpretations of many such plots appear in Nelson (2003).

Mean cumulative number. An estimate of the population MCF by a specified age is read directly from the staircase or a curve through the plotted points. For example, from Figure 2, the estimate of the MCF at 24,000 test miles is 0.31 repairs per car during design life, an answer to a basic question.
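Reading the staircase at a specified age amounts to taking the value at the latest repair age not exceeding it. The sketch below assumes the estimate is available as a list of (age, M*) pairs sorted by age; the function name mcf_at is hypothetical. The example uses only the two Table 2 values quoted in the text and omits the intermediate steps, which does not change the value read at 24,000 miles.

```python
from bisect import bisect_right
from typing import Sequence, Tuple


def mcf_at(points: Sequence[Tuple[float, float]], age: float) -> float:
    """Value of the staircase estimate M* at the given age.

    points must be (age, M*) pairs sorted by age. Before the first repair
    the estimate is zero.
    """
    ages = [a for a, _ in points]
    i = bisect_right(ages, age)  # number of repair ages <= age
    return points[i - 1][1] if i > 0 else 0.0


# Only the two values quoted in the text; intermediate steps are omitted.
quoted_points = [(28, 0.03), (19250, 0.31)]
estimate = mcf_at(quoted_points, 24000)  # 0.31 repairs per car
```

The estimate should not be read beyond the largest censoring age in the sample, since no unit was observed there.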

Repair rate. The derivative of such a curve (imagined or fitted) estimates the repair rate m(t). If the derivative increases with age, the population repair rate increases as products age. If the derivative decreases, the population repair rate decreases with age. The behavior of the rate is used to determine burn-in, overhaul and retirement policies, as described by Nelson (2003). In Figure 2, the repair rate (derivative) decreases as the transmission population ages, the answer to another basic question.
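A rough numerical check of whether the rate is rising or falling is to compare the average slope of the sample MCF over an early and a late age interval. The sketch below is a crude stand-in for differentiating a fitted smooth curve, not a method from the article; the function name average_rate and the data points are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence, Tuple


def average_rate(points: Sequence[Tuple[float, float]], start: float, end: float) -> float:
    """Average repairs per unit of age between start and end.

    points must be (age, M*) pairs sorted by age, as for a sample MCF.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    ages = [a for a, _ in points]

    def value(age: float) -> float:
        # Staircase value: M* at the latest repair age not exceeding `age`.
        i = bisect_right(ages, age)
        return points[i - 1][1] if i > 0 else 0.0

    return (value(end) - value(start)) / (end - start)


# Hypothetical sample MCF points (not the transmission data).
mcf_points = [(100, 0.2), (200, 0.4), (300, 0.5), (1000, 0.6)]
early = average_rate(mcf_points, 0, 300)     # 0.5 repairs over 300 age units
late = average_rate(mcf_points, 300, 1000)   # about 0.1 repairs over 700 age units
rate_is_decreasing = late < early
```

With sparse data such slopes are noisy, so a comparison like this is best treated as a supplement to looking at the plot.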

Other information. Nelson (2003) gives other applications and information on

• determining a suitable length of time for factory burn-in of a product,
• predicting future numbers or costs of repairs for a fleet,
• analyzing repair cost data,
• analyzing availability data, including downtime for repairs,
• analyzing data with more complex censoring where product histories have gaps with missing repair data,
• analyzing data with a mix of types of repairs,
• the minimal assumptions on which the nonparametric estimate M*(t) and confidence limits depend,
• comparing two or more sample MCFs from different designs, production periods, environments, maintenance policies, etc.

Conclusion and Acknowledgments
Concluding remarks. The simple plot of the sample MCF is informative and widely useful. It requires minimal assumptions and is simple to make and present to others.

Acknowledgments. This revised version of Nelson (1998) appears here with the kind permission of Wiley, publisher of Quality and Reliability Engineering International. The author gratefully thanks Mr. Richard J. Rudy of DaimlerChrysler, who generously granted permission to use the transmission data here. The author is pleased to acknowledge Mr. Pantelis Vassiliou, Mr. Adamantios Mettas, and Ms. Lisa Hacker for their valuable contributions to this article.

References
Ascher, Harry and Feingold, Harry (1984), Repairable Systems Reliability, Marcel Dekker, New York.

Nelson, Wayne (1998), "An Application of Graphical Analysis of Repair Data," Quality and Reliability Engineering International 14, 49-52.

Nelson, Wayne (2003), Recurrent-Events Data Analysis for Product Repairs, Disease Episodes, and Other Applications, ASA-SIAM Series on Statistics and Applied Probability, SIAM,