Volume 10, Issue 1


Reliability Growth Planning and Analysis Across Multiple Test Phases

[Editor's Note: This article has been updated since its original publication to reflect a more recent version of the software interface.]

Traditional reliability growth models consider the data from a single phase of developmental testing. However, a reliability growth program is often conducted across multiple phases. ReliaSoft's RGA software offers an array of new analysis and management tools based on the Crow Extended and Crow Extended - Continuous Evaluation models, which provide the appropriate calculations for reliability growth program planning and multi-phase data analysis.

In this article, we will present a brief overview of the models and work through a sample application for a multi-phase reliability growth program. The example will show how to create a growth plan based on the Crow Extended model and then how to use the Crow Extended - Continuous Evaluation model and Multi-Phase Plot to track progress against the established goals.

Crow Extended Model for Reliability Growth Planning

The Crow Extended model for reliability growth planning is a revised and improved version of the MIL-HDBK-189 growth curve. The growth curve in the military handbook is based on the Crow-AMSAA (NHPP) model. Therefore, using MIL-HDBK-189 for growth planning assumes that the corrective actions for the observed failure modes are incorporated during the test and at the specific time of failure. In practice, however, some fixes may be implemented during the test, others may be delayed until after the completion of the test, and some failure modes may not be fixed at all. Using the Crow Extended model for growth planning allows for additional inputs to account for a specific management strategy as well as delayed fixes with specified effectiveness factors.

Application

A manufacturer is developing a small electric vehicle for use in indoor environments, such as airports. The goal MTBF for the vehicle is 300 operating hours. Twenty developmental units are going to be tested for 3,000 hours each for a total of 60,000 operating hours. The first phase of testing is planned to end at 5,000 hours of cumulative test time; the second phase at 15,000 hours; the third phase at 25,000 hours; the fourth phase at 35,000 hours; the fifth phase at 45,000 hours and the sixth phase will end at 60,000 hours.

The first step in planning and executing an overall reliability growth program is to set an idealized growth curve and planned MTBF goal at each stage of testing. In this way, test data can be tracked against goals so that early warning signals can be identified in time to make any significant changes that may be necessary to meet the final MTBF goal. RGA's new growth planning folio can be used to create effective reliability growth program plans.

In this example, the average fix delay to incorporate corrective actions is considered to be 3,000, 5,000, 6,000, 7,000, 8,000 and 9,000 hours for the six phases, respectively. The average fix delay is the amount of test time from when the failure mode is discovered until the fix is implemented in the units under test. For example, in this plan, the fix for a failure mode discovered at 4,000 hours of testing (in Phase 1) is expected to be implemented 3,000 test hours later, at a cumulative test time of 7,000 hours (which falls within Phase 2). The reason behind the increasing fix delay is that, for this specific application, it is considered to be easier to incorporate design changes earlier during the soft tooling phases, but it gets increasingly difficult as the prototype reaches maturity and hard tooling has been established.
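The phase boundaries and fix delays in this plan can be captured in a few lines of Python. The sketch below is illustrative only: the names `PHASE_ENDS`, `FIX_DELAYS`, `phase_of` and `planned_fix` are hypothetical and are not part of RGA, and it assumes that a failure discovered exactly at a phase boundary belongs to the phase that is ending.

```python
from bisect import bisect_left

# Cumulative test time (hours) at the end of each planned phase, from the text.
PHASE_ENDS = [5_000, 15_000, 25_000, 35_000, 45_000, 60_000]
# Average fix delay (hours) planned for modes discovered in each phase.
FIX_DELAYS = [3_000, 5_000, 6_000, 7_000, 8_000, 9_000]

# Sanity check: 20 units tested for 3,000 hours each give the total test time.
assert 20 * 3_000 == PHASE_ENDS[-1]


def phase_of(t):
    """Return the 1-based phase containing cumulative test time t,
    or None if t lies beyond the end of the planned test."""
    if t < 0:
        raise ValueError("test time must be non-negative")
    if t > PHASE_ENDS[-1]:
        return None
    # Assumption: a time equal to a phase end belongs to that phase.
    return bisect_left(PHASE_ENDS, t) + 1


def planned_fix(discovery_time):
    """Return (planned fix time, phase in which the fix lands or None)."""
    phase = phase_of(discovery_time)
    if phase is None:
        raise ValueError("discovery time is beyond the planned test")
    fix_time = discovery_time + FIX_DELAYS[phase - 1]
    return fix_time, phase_of(fix_time)


print(planned_fix(4_000))  # mode found in Phase 1; fix lands in Phase 2
```

For the case described above, `planned_fix(4_000)` returns `(7000, 2)`. A mode discovered late in Phase 6 would have a planned fix time beyond 60,000 hours, which the sketch reports as a phase of `None`.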

After defining the phases and average fix delays, the reliability team provides the inputs required by the Crow Extended model for reliability growth planning. This allows them to create an idealized reliability growth curve for the overall program and MTBF values of planned growth at each phase. The inputs for this example are as follows:

The Goal MTBF is 300 hours, as mentioned previously.

The Growth Potential (GP) Design Margin is a "safety factor" that can be adjusted to make sure that the desired reliability growth will be reached. The higher the GP Design Margin, the smaller the risk in the program but the more rigorous the reliability growth program will be. In this case, the GP Design Margin was chosen to be 1.3. With an MTBF goal of 300 hours, this means that the program will be designed for a growth potential MTBF of 390 hours.

The Management Strategy determines the percentage of the unique failure modes discovered during the test that will be addressed (i.e., fixed). The failure modes for which management determines that it is not economically or otherwise justified to take a corrective action are classified as A modes. The failure modes that are corrected either during the test or at a later time are classified as B modes. In this case, the plan is for 5% of the discovered failure modes to remain unfixed (A modes) and 95% to be addressed (B modes).

This is an important variable in reliability growth program planning because the management strategy can be changed to address a larger percentage of the discovered failure modes if the MTBF goal cannot be reached with the current strategy.

The Average Effectiveness Factor (EF) for the corrective actions (Avg EF or d) is determined based on engineering expertise, specific product complexity, prior history, etc. Failure modes are rarely totally eliminated by a corrective action. After failure modes have been found and fixed, a certain percentage of the failure intensity will remain in the system. The EF is the fractional decrease in a mode's failure intensity after the corrective action has been implemented.

Typically, about 30% of the failure intensity for the failure modes that are addressed will remain in the system after all of the corrective actions have been implemented. Therefore, in this case, the reliability team selected an average effectiveness factor of 0.7 (or 70%).

The Discovery Beta describes the rate at which new, unique B modes are being discovered during testing. A value less than 1 indicates that the inter-arrival times between unique B modes are getting larger. We expect this value to be less than 1 because we assume that most failures will be identified early on, and their inter-arrival times will become larger as the test progresses. Based on prior experience with similar products, the discovery beta for this example was chosen to be 0.71.
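To see what a discovery beta below 1 implies, assume that the discovery of unique B modes follows the Crow-AMSAA (NHPP) power law, so that the expected number of unique B modes seen by time t is λt^β and the discovery rate is h(t) = λβt^(β-1). The sketch below compares the discovery rate at two test times. The function name is our own and is not part of RGA; note that λ cancels in the ratio, so only the discovery beta is needed.

```python
def discovery_rate_ratio(t_early, t_late, beta):
    """Ratio h(t_late) / h(t_early) for h(t) = lam * beta * t**(beta - 1).

    The scale parameter lam cancels, so it is not an argument.
    """
    if t_early <= 0 or t_late <= 0:
        raise ValueError("test times must be positive")
    return (t_late / t_early) ** (beta - 1)


BETA = 0.71  # discovery beta chosen in this example

# Compare the end of the last phase with the end of the first phase.
ratio = discovery_rate_ratio(5_000, 60_000, BETA)
print(f"Discovery rate at 60,000 h relative to 5,000 h: {ratio:.2f}")
```

Under this assumption, new B modes are expected to appear a little under half as often at the end of the program as at the end of Phase 1, which is the lengthening of inter-arrival times described above.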

The reliability team enters all of this information into RGA. Figure 1 shows the six test phases defined in the Growth Planning Folio as well as the inputs and results for the planning calculations. Based on these inputs, the Crow Extended model for reliability growth planning is used to estimate the overall idealized reliability growth curve and the planned growth at each phase. Figure 2 shows the growth planning results and plot.

Figure 1: Test phases and planning calculations

Figure 2: Growth planning results and plot

The plot in Figure 2 shows two reliability growth curves. The nominal idealized growth curve portrays an overall characteristic pattern, which is used to determine and evaluate intermediate levels of reliability and to construct the planned growth curve for the program. The nominal idealized growth curve failure intensity as a function of test time, t, is based on the Crow Extended model for reliability growth planning and is given by the following equations:

where:

  • λI is the system initial failure intensity.
  • λA is the A mode initial failure intensity.
  • λB is the B mode initial failure intensity.
  • d is the average effectiveness factor.
  • β is the discovery beta and λ is the associated parameter of the Crow-AMSAA (NHPP) model.
  • t is the cumulative test time.
  • t0 is the initialization time, which allows for growth to start after a B mode has occurred.
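One form of the nominal idealized failure intensity that is consistent with the symbols defined above is sketched here. It is a reconstruction, under the assumption that the curve stays at the initial failure intensity until the initialization time and follows the Crow Extended projection form afterwards, and it should be checked against reference [1]. The symbol r_NI is introduced here for illustration.

```latex
r_{NI}(t) =
\begin{cases}
\lambda_I = \lambda_A + \lambda_B, & t \le t_0 \\[4pt]
\lambda_A + (1-d)\,\lambda_B + d\,\lambda\beta\,t^{\beta-1}, & t > t_0
\end{cases}
\qquad \text{with} \qquad
\lambda_B = \lambda\beta\,t_0^{\beta-1}
```

Under this form the curve is continuous at t0, and as t grows the last term vanishes, leaving the growth potential failure intensity λA + (1 - d)λB.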

λI, λA and λB are all functions of the Growth Potential Design Margin, the Goal MTBF, the Average Effectiveness Factor and the Management Strategy. Therefore, they all can be solved by using the inputs provided in the Planning Calculations window shown in Figure 1.
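As a sketch of how these quantities can follow from the planning inputs, the code below assumes two relationships that are not spelled out in this article: the management strategy is the fraction of the initial failure intensity attributable to B modes (λB / λI), and the growth potential failure intensity is λA + (1 - d)λB. The function and key names are hypothetical and do not correspond to anything in RGA.

```python
def planning_intensities(goal_mtbf, gp_design_margin, mgmt_strategy, avg_ef):
    """Derive initial failure intensities from the planning inputs.

    Assumptions (not stated in the article):
      mgmt_strategy = lam_b / lam_i
      lam_gp        = lam_a + (1 - avg_ef) * lam_b
    """
    gp_mtbf = gp_design_margin * goal_mtbf   # growth potential MTBF
    lam_gp = 1.0 / gp_mtbf                   # growth potential failure intensity
    # Substituting lam_a = (1 - ms) * lam_i and lam_b = ms * lam_i into lam_gp
    # gives lam_gp = lam_i * (1 - ms * d).
    lam_i = lam_gp / (1.0 - mgmt_strategy * avg_ef)
    lam_b = mgmt_strategy * lam_i
    lam_a = lam_i - lam_b
    return {"gp_mtbf": gp_mtbf, "lam_i": lam_i, "lam_a": lam_a, "lam_b": lam_b}


# Inputs from this example: goal 300 h, margin 1.3, 95% B modes, EF 0.7.
result = planning_intensities(300.0, 1.3, 0.95, 0.7)
print(f"Growth potential MTBF: {result['gp_mtbf']:.1f} h")
print(f"Implied initial MTBF:  {1.0 / result['lam_i']:.2f} h")
```

With the inputs from this example, the growth potential MTBF comes out at 390 hours, matching the value given earlier, and the implied initial MTBF is about 130.65 hours under the stated assumptions.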

The second curve in the plot, the actual idealized growth curve, differs from the nominal idealized curve in that it takes into account the average fix delay that might occur in each test phase. The actual idealized growth curve goes through the target MTBF at each test phase (i.e., planned growth). The curve is continuous, but not necessarily smooth, since its shape depends on the average fix delay. In this case, the time to reach the goal based on the actual idealized growth curve is T Goal (Act) = 58,050 hours. Therefore, the goal will be met towards the end of the last phase. If the calculations indicate that the goal will not be met during the allocated test time, a revised testing plan should be considered.

We now have an overall reliability growth program plan for the new product. The next section describes how we will use the Crow Extended - Continuous Evaluation model to compare test data at each phase against the overall program goals.

Crow Extended - Continuous Evaluation Model

The Crow Extended - Continuous Evaluation model is designed for analyzing data across multiple test phases while considering the data for all phases as a single data set. In RGA, the model is applied when using one of the new Multi-Phase data sheets. An extension of the Crow Extended model, the 3-parameter Continuous Evaluation model provides the flexibility to handle the practical testing situation where the corrective actions may be applied immediately at the time of failure, at a later time during the same test phase, in between test phases or during a subsequent test phase. Unlike the Crow Extended model, the Crow Extended - Continuous Evaluation model is not constrained by the assumption that testing will be stopped when fixes are applied during a test phase or that all BD modes will be corrected at the end of the test. Based on this flexibility, the end time of testing is not predefined, and the model can be continuously updated with new test data. This is the reason behind the name "continuous evaluation." For the Crow Extended - Continuous Evaluation model, the failure modes are classified as follows:

  • A indicates that a corrective action will not be performed (management chooses not to address these modes for technical, financial or other reasons).
  • BC indicates that all of the units under test receive the corrective action at the time of failure and before the testing resumes. Typically, a BC failure mode does not require extensive root cause failure analysis and it can be fixed quickly and easily. BC modes usually are related to issues such as quality, manufacturing, operator, etc.
  • BD indicates that all of the units under test receive the corrective action at a test time after the first occurrence of the failure mode. In other words, a fix is considered to be "delayed" if it is not implemented immediately at the time of failure before testing resumes. A delayed fix can occur at a later time during the current test phase, between test phases or during a subsequent test phase. Type BD failure modes typically require failure analysis and time to fabricate the corrective action. At any given time during a test phase, there may be some BD modes with corrective actions already implemented and other BD modes that have been seen but not yet fixed.

The Multi-Phase data sheets in RGA use an "Event" column to classify the events that occurred during testing. The possible event codes that can be used in the analysis are:

  • F: indicates a failure time.
  • I: indicates that a fix has been implemented for a BD failure mode at the specified time.
  • Q: indicates that the failure was due to a quality issue. You have the option to include or exclude these records from the analysis.
  • P: indicates that the failure was due to a degradation in performance. You have the option to include or exclude these records from the analysis.
  • AP: indicates an analysis point. These can be shown in a Multi-Phase Plot to track overall project progress and can be compared to an idealized growth curve.
  • PH: indicates the end of a test phase. These can be shown in a Multi-Phase Plot to track overall project progress and can be compared to planned growth phases.
  • X: indicates that the data point will be excluded from the analysis. An "X" can be placed in front of any existing event code or entered by itself.
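For readers who keep their test logs outside RGA, the event codes above can be modeled with a small parser. This is a hypothetical sketch of our own, not RGA's file format or API: the `Event` class and `parse_event` function are invented for illustration, and the only rules encoded are the ones listed above, including the "X" prefix.

```python
from dataclasses import dataclass

# Event codes described in the text; the descriptions paraphrase the article.
EVENT_CODES = {
    "F": "failure time",
    "I": "fix implemented for a BD failure mode",
    "Q": "failure due to a quality issue",
    "P": "failure due to a degradation in performance",
    "AP": "analysis point",
    "PH": "end of a test phase",
}


@dataclass(frozen=True)
class Event:
    code: str       # base event code, or "" for a bare "X"
    excluded: bool  # True when the record carries the "X" marker


def parse_event(raw):
    """Split an event string into its base code and exclusion flag."""
    text = raw.strip()
    if not text:
        raise ValueError("empty event code")
    # No base code starts with "X", so a leading "X" is always the marker.
    excluded = text.startswith("X")
    base = text[1:] if excluded else text
    if base and base not in EVENT_CODES:
        raise ValueError(f"unknown event code: {raw!r}")
    return Event(code=base, excluded=excluded)


for raw in ("F", "XF", "X", "PH"):
    print(raw, "->", parse_event(raw))
```

Keeping the exclusion flag separate from the base code means an excluded record can be restored later without losing its original classification.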

Application

In our example, phases have been set to 5,000, 15,000, 25,000, 35,000, 45,000 and 60,000 hours in accordance with the reliability growth program plan, and analysis points have been set for every 1,000 hours. Figure 3 shows a portion of the test data that was collected by testing the units for a cumulative test time of 60,000 hours. This figure also shows the assigned effectiveness factors for each of the BD modes that were not fixed at a particular time during a test phase. For these failure modes, you can specify whether the delayed fix was implemented between specific test phases or not implemented at all. For example, BD mode "5007" had an assigned effectiveness factor of 0.78 (called the nominal effectiveness factor), but its fix was never actually implemented at the end of any phase in the current test program. Therefore, it is indicated as "Not Implemented" and the actual effectiveness factor for analysis purposes is zero (i.e., there is no reliability growth associated with that mode). Also shown in this figure, the fix for BD mode "5025" was implemented at the end of phase 5, so the actual effectiveness factor is equal to the assigned effectiveness factor of 0.58 (i.e., there is reliability growth associated with that mode). For calculation purposes, any delayed fixes that are incorporated during a test phase (i.e., those with an "I" event code in the data set) do not need to have an effectiveness factor specified since the fix is already incorporated in the system and the reliability growth will be reflected in subsequent test data.
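The distinction between the nominal and the actual effectiveness factor can be expressed as a simple rule. The sketch below uses the two BD modes mentioned above; the function names are hypothetical, and the "remaining fraction" follows from the earlier definition of the EF as the fractional decrease in a mode's failure intensity.

```python
def actual_ef(nominal_ef, implemented):
    """Effectiveness factor used in the analysis: the nominal value if the
    delayed fix was implemented, otherwise zero (no growth for that mode)."""
    if not 0.0 <= nominal_ef <= 1.0:
        raise ValueError("effectiveness factor must be between 0 and 1")
    return nominal_ef if implemented else 0.0


def remaining_intensity_fraction(nominal_ef, implemented):
    """Fraction of the mode's failure intensity left in the system."""
    return 1.0 - actual_ef(nominal_ef, implemented)


# (nominal EF, fix implemented?) for the two BD modes cited in the text.
modes = {"5007": (0.78, False), "5025": (0.58, True)}

for mode, (nominal, implemented) in modes.items():
    print(
        f"BD mode {mode}: actual EF = {actual_ef(nominal, implemented):.2f}, "
        f"remaining intensity = "
        f"{remaining_intensity_fraction(nominal, implemented):.2f}"
    )
```

Mode 5007 keeps all of its failure intensity because its fix was not implemented, while mode 5025 is left with 42% of its original failure intensity.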

The Folio in Figure 3 also shows the overall results from analyzing the data across the six phases with the Crow Extended - Continuous Evaluation model.

Figure 3: Portion of test data and effectiveness factors for BD modes that were not fixed during testing

Figure 4 shows the demonstrated, projected and growth potential MTBF at the end of the program. The demonstrated MTBF is the instantaneous MTBF at the end of the six-phase test program. The projected MTBF is the MTBF estimated to be reached after the delayed corrective actions are implemented with the specified effectiveness factors. In other words, this reflects the reliability growth due to the BD modes that did not have associated "I" events in the test data. Finally, the growth potential MTBF is the maximum MTBF that can be attained with the current management strategy. The maximum achievable MTBF will be attained when all unique BD modes have been observed and fixed.

Figure 4: Plot with demonstrated, projected and growth potential MTBF

Multi-Phase Plotting

The most powerful application of the Crow Extended - Continuous Evaluation model is in tracking reliability performance as the testing program progresses. The new Multi-Phase Plot in RGA allows for actual results to be plotted at specified analysis points against the goals established in the reliability growth plan. Having this capability allows for an overall and dynamic view of the data. Early warning signals are easy to identify via the plot, enabling management to make any necessary adjustments to the strategy (e.g., fix more failure modes, push for better effectiveness of corrective actions, etc.) in time to assure a successful reliability growth program.

Application

The Multi-Phase Plot in Figure 5 shows the nominal and actual idealized growth curves along with the planned growth at each phase and the MTBF goal line (as determined in the Growth Planning Folio). The plot also displays the demonstrated, projected and growth potential MTBF values at each analysis point and phase (as calculated using the Crow Extended - Continuous Evaluation model for the test data). From the figure, it can be seen that the demonstrated MTBF at the end of the program is ~318 hours, which is higher than the goal of 300 hours. The projected MTBF (with delayed fixes in place) reaches ~341 hours and the growth potential MTBF reaches ~357 hours. Since the goal has been met, the reliability growth planning and execution for this example are considered to be successful.

Figure 5: Multi-Phase plot integrating planned and actual reliability growth

Conclusion

As this article has demonstrated, RGA deploys a powerful new toolset for overall reliability growth program planning and multi-phase data analysis. The growth planning folio allows you to define the amount of time planned for each phase in the testing program, the average fix delay and other settings that describe the planned reliability growth management strategy. You can use this tool to make estimates about whether you will be able to achieve your MTBF goal with a given management strategy or to determine what strategy will be necessary to meet the established goal. Multi-Phase data sheets make it possible to enter data from multiple phases of testing, using "event codes" to specify the end time for each phase, the specific test times when fixes were implemented and other details that can be considered by the Crow Extended - Continuous Evaluation model. This provides an analysis that is more representative of the real-world behavior of the units under test. Finally, you can use the Multi-Phase Plot to link the reliability growth program plan with your test data in order to track the progress and determine whether you will need to make adjustments in the remaining test phases in order to meet your MTBF goal.

References

[1] ReliaSoft, Reliability Growth & Repairable Systems Data Analysis Reference, ReliaSoft Publishing, 2009.

[2] U.S. Department of Defense, Military Handbook 189, Reliability Growth Management, Naval Publications and Forms Center, 1981.

Copyright © HBM Prenscia Inc. All Rights Reserved.