Example 5 - Analyzing Software Reliability Growth
Background
When considering reliability growth, some sort of hardware is typically being analyzed. But the same theory and analysis procedures can also be applied to the analysis of software under development. The faults (bugs) that are found during each day's testing of the software can be recorded and then analyzed, just as would be done for hardware. This example will explore how software reliability growth can be analyzed using RGA.
Software for a particular application is under development. The reliability requirement is that no more than one fault may occur during every 8 hours of continuous operation.
Testing begins when the software reaches the "Beta" phase. Three employees are assigned to test continuously during business hours, each for 8 hours a day, which results in 3 x 8 = 24 hours of software testing per day. The software faults are reported and captured in a Failure Reporting, Analysis and Corrective Action System (FRACAS). Because a new compile of the software is available for testing every week, design engineers implement fixes within a week, except during the last two weeks of testing, when fixes are implemented at a faster rate.
The requirement of no more than one fault per 8 hours of operation corresponds to a failure intensity goal of 1/8 = 0.125 faults per hour. For one day of testing (3 x 8 = 24 hours), the failure intensity goal is therefore 0.125 x 24 = 3 faults per day.
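The unit conversion behind the goal can be written out as a short sketch. The variable names are illustrative and not part of RGA; the 8-hour business day is the figure implied by the 3 x 8 = 24 calculation.

```python
# Convert the reliability requirement into a per-day failure intensity goal.
HOURS_PER_ALLOWED_FAULT = 8     # requirement: at most one fault per 8 hours
TESTERS = 3                     # employees testing in parallel
HOURS_PER_TESTER_PER_DAY = 8    # length of a business day implied by 3 x 8 = 24

goal_per_hour = 1 / HOURS_PER_ALLOWED_FAULT
test_hours_per_day = TESTERS * HOURS_PER_TESTER_PER_DAY
goal_per_day = goal_per_hour * test_hours_per_day

print(f"{goal_per_hour} faults/hour over {test_hours_per_day} test hours "
      f"= {goal_per_day} faults/day")
```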
Assume that the following data set was extracted from the FRACAS system:
| Failures in Interval | Cumulative Days of Testing |
| 45 | 5 |
| 34 | 10 |
| 25 | 15 |
| 17 | 20 |
| 21 | 23 |
| 14 | 26 |
| 10 | 28 |
The data set is grouped into intervals, each ending on the day a new compile of the software becomes available; the second column gives the cumulative test time at the end of that interval.
To analyze the data set, the parameters of the Crow-AMSAA (NHPP) model are calculated and the Quick Calculation Pad (QCP) is used to estimate the demonstrated failure intensity. The results are then used to determine when the goal of no more than three faults per day will be achieved and how many more days of developmental testing are required.
Analysis and Results
A new project is created and a Standard Folio for grouped failure times is added by selecting the Developmental, Times-to-Failure Data and Grouped Failure Times options in the Expert view of the New Data Sheet Setup window, as shown next.

The data set is then entered and analyzed using the Crow-AMSAA (NHPP) model, as shown next.

The QCP is used to estimate the instantaneous failure intensity demonstrated after 28 days of testing, as shown next.
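For readers who want to reproduce the fit and the intensity estimate outside RGA, the sketch below solves the grouped-data maximum likelihood equations of the Crow-AMSAA model with plain bisection. It assumes that RGA uses this same estimator for grouped failure times; the function name `fit_crow_amsaa_grouped` and its arguments are hypothetical and not part of RGA.

```python
import math


def fit_crow_amsaa_grouped(end_times, failures, lo=0.01, hi=5.0, tol=1e-10):
    """Fit expected cumulative failures lam * t**beta to grouped data by MLE.

    end_times: cumulative test time at the end of each interval
    failures:  number of failures observed in each interval
    Returns (beta, lam).
    """
    total_failures = sum(failures)
    log_t_end = math.log(end_times[-1])

    def score(beta):
        # Derivative of the log-likelihood with respect to beta, with lam
        # already replaced by its own MLE, N / T_end**beta.
        total = 0.0
        prev = 0.0
        for t, n in zip(end_times, failures):
            upper = t**beta * math.log(t)
            lower = prev**beta * math.log(prev) if prev > 0 else 0.0
            total += n * ((upper - lower) / (t**beta - prev**beta) - log_t_end)
            prev = t
        return total

    f_lo = score(lo)
    if f_lo * score(hi) > 0:
        raise ValueError("no sign change in the bracket; widen lo/hi")

    # Plain bisection on the score equation.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = score(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid

    beta = 0.5 * (lo + hi)
    lam = total_failures / end_times[-1] ** beta
    return beta, lam


days = [5, 10, 15, 20, 23, 26, 28]
faults = [45, 34, 25, 17, 21, 14, 10]

beta, lam = fit_crow_amsaa_grouped(days, faults)
# Instantaneous failure intensity of the model: lam * beta * t**(beta - 1)
intensity_now = lam * beta * days[-1] ** (beta - 1)

print(f"beta = {beta:.4f}, lambda = {lam:.4f}")
print(f"intensity at day {days[-1]} = {intensity_now:.4f} faults/day")
```

If the assumption about the estimator holds, the printed intensity should land close to the value reported by the QCP. The same function can be applied to the shorter data sets analyzed later in this example.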

Analysis and Discussion
Currently, the demonstrated failure intensity is 4.4947 faults per day. Therefore, the question is: "If we continue testing with the same growth rate, when will we achieve the goal of no more than three faults per day?"
To calculate this, the Time/Stage given Instantaneous Failure Intensity calculation option is used, as shown next.

Therefore, 149 - 28 = 121 additional days of testing and development (test-analyze-and-fix) are required to achieve the failure intensity goal. This is much more time than the analysts anticipated, so they decide to take a closer look.
The Failure Intensity vs. Time plot is used to display the results. On the Plot Sheet Control Panel, the Use Logarithmic Axes option is cleared. With Auto Refresh selected, the plot updates automatically, as shown next.
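The 149-day figure can be cross-checked by hand. In the Crow-AMSAA model the failure intensity at time t is λβt^(β-1), and for a maximum likelihood fit that ends at time T the estimate of λ is N/T^β, so the intensity at T reduces to Nβ/T. The sketch below is an illustration under that assumption, not the QCP's actual procedure: it backs out β from the reported intensity and then solves for the time at which the intensity falls to the goal. All names are hypothetical.

```python
import math

N_FAILURES = 166         # total faults recorded over the 28 days
T_END = 28               # days of testing completed
INTENSITY_NOW = 4.4947   # faults per day, as reported by the QCP
INTENSITY_GOAL = 3.0     # faults per day

# Intensity at T_END equals N * beta / T_END, so beta follows directly.
beta = INTENSITY_NOW * T_END / N_FAILURES

# Intensity scales as t**(beta - 1); solve for the time where it hits the goal.
t_goal = T_END * (INTENSITY_GOAL / INTENSITY_NOW) ** (1.0 / (beta - 1.0))
additional_days = math.ceil(t_goal) - T_END

print(f"beta = {beta:.4f}")
print(f"goal reached after about {t_goal:.1f} days of testing")
print(f"additional days required: {additional_days}")
```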

From this plot, it can be seen that there is a jump in the failure intensity between 20 and 23 days. This jump is why the estimated development time is longer than expected. Therefore, the next step is to analyze the data set for the period up to 20 days of testing.
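The jump can also be seen without a fitted model by dividing each interval's fault count by the interval length. The following is a minimal sketch that uses only the data table from the Background section; the variable names are illustrative.

```python
days = [5, 10, 15, 20, 23, 26, 28]      # cumulative days at end of interval
faults = [45, 34, 25, 17, 21, 14, 10]   # faults found in each interval

# Average faults per day within each interval.
intervals = []
start = 0
for end, n in zip(days, faults):
    intervals.append((start, end, n / (end - start)))
    start = end

for begin, end, rate in intervals:
    print(f"days {begin:>2}-{end:<2}: {rate:.2f} faults/day")

# Change in average rate from one interval to the next; the largest rise
# marks the interval in which the failure intensity jumped.
rises = [
    (intervals[i][2] - intervals[i - 1][2], intervals[i][:2])
    for i in range(1, len(intervals))
]
largest_rise, jump_interval = max(rises)
print(f"largest rise: {largest_rise:.2f} faults/day in days "
      f"{jump_interval[0]}-{jump_interval[1]}")
```

The average rate falls steadily over the first four intervals and then rises sharply in the interval that ends on day 23, which matches what the plot shows.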
A new Data Sheet is created in the Standard Folio. Using the New Data Sheet Setup window, Grouped Failure Times is selected. The data set is entered into the new Data Sheet, Data 2, for the first 20 days of testing. The parameters are calculated using the Crow-AMSAA (NHPP) model, as shown next.

The Failure Intensity vs. Time plot for Data 2 is created with the Use Logarithmic Axes option cleared, as shown next.

This plot shows the decrease in the failure intensity over the first 20 days of testing.
The QCP is re-opened and the additional days of testing and development that are required to achieve the failure intensity goal, based on the first 20 days of test data, are calculated, as shown next.

The calculation indicates that a total of 55 days of testing are required. Since 28 days of testing have already been completed, only 55 - 28 = 27 more days would be required based on the analysis of the first 20 days of testing. This differs considerably from the result obtained from the analysis of the full data set.
So the question is: "What happened when the failure intensity jumped on the 23rd day of testing and development?" It turns out that new functionality was implemented at the request of a customer, which caused a major redesign of some general modules of the software. This type of jump is typical in both software and hardware development when new features are introduced.
Due to these significant changes, it is decided that the clock should be reset and the analysts should track the reliability growth from the 20th day forward. In other words, the origin of the test is set at 20 days and the data thereafter are considered as follows:
| Failures in Interval | Days of Testing (from Day 20) |
| 21 | 3 |
| 14 | 6 |
| 10 | 8 |
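Resetting the clock amounts to dropping the intervals that end on or before day 20 and subtracting 20 from the remaining cumulative test times. A minimal sketch, with `RESET_DAY` and the other names chosen here for illustration:

```python
RESET_DAY = 20   # new origin of the test

days = [5, 10, 15, 20, 23, 26, 28]      # cumulative days at end of interval
faults = [45, 34, 25, 17, 21, 14, 10]   # faults found in each interval

# Keep only the intervals after the reset and shift their end times.
reset_rows = [
    (n, day - RESET_DAY)
    for day, n in zip(days, faults)
    if day > RESET_DAY
]

for n, day in reset_rows:
    print(f"| {n} | {day} |")
```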
Another Data Sheet for grouped failure times is created in the current Standard Folio.
The data set is entered into the new Data Sheet, Data 3, and the parameters are calculated using the Crow-AMSAA (NHPP) model, as shown next.

The Failure Intensity vs. Time plot for Data 3 is created with the Use Logarithmic Axes option cleared, as shown next.

This plot shows the decrease in the failure intensity over the last 8 days of testing.
The QCP is re-opened and the additional days of testing and development that are required to achieve the failure intensity goal, based on the analysis from days 20 through 28 of the testing, are calculated, as shown next.

Therefore, when considering this data set, 51 - 8 = 43 more days of developmental testing are required.
While it is too early to make any predictions based on just 8 days of testing, this result can be used to get a general idea of the remaining development time required and to come up with a new testing plan. In this case, it is decided that three more employees need to be added to testing and, if possible, that a new compile needs to be created every two days. This yields a much more aggressive testing and development program with the objective of completing the project within one month.
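A rough back-of-the-envelope check of the one-month objective is sketched below. It rests on a strong simplifying assumption that is not stated in the example: that reliability growth depends only on accumulated test hours, so the 43 remaining days at 24 test hours per day can be worked off faster by a larger team. It ignores any effect of the shorter compile cycle, and all names are hypothetical.

```python
REMAINING_DAYS_AT_CURRENT_PACE = 43   # from the analysis of days 20 through 28
CURRENT_TESTERS = 3
ADDED_TESTERS = 3
HOURS_PER_TESTER_PER_DAY = 8

# Test exposure still needed, expressed in tester-hours (assumption: growth
# is driven by accumulated test hours only).
remaining_test_hours = (
    REMAINING_DAYS_AT_CURRENT_PACE * CURRENT_TESTERS * HOURS_PER_TESTER_PER_DAY
)

# Daily test hours once the extra testers join.
new_hours_per_day = (CURRENT_TESTERS + ADDED_TESTERS) * HOURS_PER_TESTER_PER_DAY

calendar_days = remaining_test_hours / new_hours_per_day
print(f"{remaining_test_hours} test hours at {new_hours_per_day} hours/day "
      f"= {calendar_days} days")
```

Under this assumption the remaining test exposure fits within roughly three weeks of testing, which is in line with the stated objective.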


