Volume 2, Issue 3

Data, Where Art Thou?

Managing the Problem Resolution Process for Products While Acquiring Meaningful Reliability/Quality Data
As reliability analysis tools become more capable, the availability of accurate and timely data for analysis becomes the limiting factor in the ability to perform effective reliability analyses. As shown in Figure 1, the information required for reliability analysis is often stored in various locations within the organization, in multiple formats that may be difficult for the analyst to integrate.

Because accurate and timely product information is central to our work, it is no surprise that reliability engineers tend to champion integrated systems that provide efficient access to comprehensive and accurate product quality and reliability data. However, implementing such a system is likely to be complex and to require the cooperation of multiple disciplines and departments within the organization. To obtain support, it is important to demonstrate that the system provides tangible benefits to management, to the participating departments and to the organization as a whole, at least as much as it helps the reliability engineer. One approach to implementing and gaining acceptance for a system that meets reliability analysis and other requirements is to build the necessary data capture mechanisms into a unified incident reporting and problem resolution process for the organization’s products. This article examines some considerations for the design and development of a process/system that will maximize efficiency for the affected departments and simultaneously capture and manage valuable reliability/quality information for the product.

Figure 1: Disparate sources for reliability/quality data

Leveraging/Improving Existing Processes
Whether formalized or not, most organizations already have in place an issue reporting and problem resolution process for their products. In general, this process begins during the design stage and continues throughout the product’s life cycle. For example, issues found during prototype testing are usually brought to the attention of the design team, analyzed, corrected and re-tested. This process typically continues through field testing until the product is released onto the market. When the product is being used by customers, any issues that surface are addressed through phone support hotlines, in-house repair technicians and/or other customer support and warranty return programs. Issues may include suggestions for improvement, product failures, customer complaints and other events of interest. These issues may be reported and addressed within the organization through verbal communication among responsible individuals or through e-mails, memos and/or more formal reports.

Although these issue reporting and problem resolution activities occur in most organizations, the responsibility for problem identification and correction shifts during the process among various personnel and departments (e.g., from the engineering department to the in-house testing group to the customer service group). As a result, the valuable reliability/quality information that is identified during problem resolution activities may not be integrated and available for analysis. In most cases, the problem resolution process generates sufficient data at different stages of the product’s life cycle for effective reliability analyses. The challenge is to capture and use the information generated from these processes by determining how best to store, validate, correlate, organize, manage and employ this valuable data resource.

Designing the Problem Resolution Process to Maximize Efficiency and Capture/Manage Reliability Data
A well-constructed issue reporting and problem resolution process will streamline and improve the process for the organization’s personnel and customers while simultaneously facilitating the capture and management of timely, accurate product reliability data in a comprehensive and systematic way. These processes have been discussed in several references and are often referred to as failure reporting, analysis and corrective action (FRACA/FRACAS) processes/systems. The Department of Defense MIL-HDBK-2155 handbook is one reference that presents guidance for the requirements of such a system.

Figure 2 shows an example of an effective FRACAS process/system. This is a closed-loop process designed to allow multiple cross-disciplinary teams to report issues for a product and to analyze, manage and resolve problems. In this example, the process is supported by a centralized database that interacts with distributed user input screens and reporting engines. The process begins with an incident report from the source that identified the issue. Depending on the product and the stage in its life cycle, the sources for these reports may include in-house testing facilities, distributors, suppliers, customer support representatives or other personnel with information about the product’s performance and design. All relevant details that may be required to resolve the issue and to support future analysis and problem resolution are captured directly from the source at the time the issue is observed. These details may include occurrence date/time, affected parts, behavior, fault codes and other items of interest for the particular product and organization. If the incident must be resolved individually (for example, to repair the product for a customer), the details of that resolution are also tracked and recorded by the system.
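To make the incident report more concrete, the details listed above could be held in a single record like the one sketched below. This is only an illustration: the class name, the field names and the sample values are assumptions made for this example and are not taken from MIL-HDBK-2155 or from any particular FRACAS product.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class IncidentReport:
    """One incident, captured from the source that observed it.

    All names here are hypothetical and chosen for illustration only.
    """
    incident_id: str
    product_serial: str
    source: str                  # e.g. "in-house test", "customer support"
    occurred_at: datetime        # occurrence date/time
    behavior: str                # description of what was observed
    affected_parts: List[str] = field(default_factory=list)
    fault_codes: List[str] = field(default_factory=list)
    resolution: Optional[str] = None  # set only if the unit is resolved individually

    def is_resolved(self) -> bool:
        """True once an individual resolution has been recorded."""
        return self.resolution is not None


# Hypothetical incident reported by customer support.
report = IncidentReport(
    incident_id="INC-0001",
    product_serial="SN-1001",
    source="customer support",
    occurred_at=datetime(2001, 5, 14, 9, 30),
    behavior="Unit shuts down after about ten minutes of use",
    affected_parts=["power supply"],
    fault_codes=["E42"],
)

# The individual repair is recorded on the same incident.
report.resolution = "Replaced power supply under warranty"
```

Keeping the individual resolution on the incident record itself means the repair history and the original observation stay together for later analysis.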

Figure 2: Incident reporting, analysis and corrective action process to facilitate reliability data capture

Responsible personnel (for example, an engineering review board) review the reported issues/incidents on a regular basis. From within the collection of reported incidents, the reviewers attempt to identify the specific problems that need to be addressed. Often, multiple incidents are instances of the same basic problem. These incidents are then grouped together into problem reports that include information about the problem and information on each reported occurrence related to the problem. The problem reports are then assigned to specific personnel or teams to coordinate the analysis and problem resolution procedures. The problem resolution process often involves assigning personnel to perform a variety of activities intended to fully define, contain, correct and prevent the problem from occurring again in the future. Actions associated with a problem may include, for example:

  • Determine the failure mode and the effects of the failure at all appropriate levels (e.g., component, system, etc.).
  • Devise and implement an approach to contain the problem and prevent further immediate harm.
  • Devise and implement an approach to correct the problem and prevent it from occurring again in the future.

The responsible person or team assigns and oversees the actions necessary to address the particular problem until an acceptable resolution has been reached. In some cases, a “closure review board” may review and “sign off” on the acceptability of the resolution. The organization then implements the solution to completely resolve the problem.
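The review, grouping and closure steps described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: in practice reviewers group incidents by engineering judgment, whereas here incidents are grouped mechanically by a hypothetical part and fault code key, and the closure rule (all actions complete plus a sign-off) is one possible policy rather than a prescribed one. All names are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Action:
    description: str
    owner: str
    done: bool = False


@dataclass
class ProblemReport:
    problem_key: str
    incident_ids: List[str] = field(default_factory=list)
    actions: List[Action] = field(default_factory=list)
    signed_off: bool = False  # set by the closure review board, if one is used

    def can_close(self) -> bool:
        # Assumed closed-loop rule: at least one action exists, every
        # action is complete, and the resolution has been signed off.
        return (bool(self.actions)
                and all(a.done for a in self.actions)
                and self.signed_off)


def group_incidents(incidents: List[dict]) -> Dict[str, ProblemReport]:
    """Group incidents that share a part and fault code into one problem."""
    problems: Dict[str, ProblemReport] = {}
    for inc in incidents:
        key = f"{inc['part']}/{inc['fault_code']}"
        if key not in problems:
            problems[key] = ProblemReport(problem_key=key)
        problems[key].incident_ids.append(inc["id"])
    return problems


# Hypothetical incidents; two of them are instances of the same problem.
incidents = [
    {"id": "INC-0001", "part": "power supply", "fault_code": "E42"},
    {"id": "INC-0002", "part": "display", "fault_code": "E07"},
    {"id": "INC-0003", "part": "power supply", "fault_code": "E42"},
]
problems = group_incidents(incidents)

psu = problems["power supply/E42"]
psu.actions = [
    Action("Determine failure mode and effects", "design team"),
    Action("Contain the problem", "customer service"),
    Action("Correct the problem and prevent recurrence", "design team"),
]
open_before = psu.can_close()  # actions are still open at this point

for action in psu.actions:
    action.done = True
psu.signed_off = True
```

Because each problem report keeps the identifiers of its incidents, every reported occurrence remains traceable to the problem it was grouped under and to the actions taken.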

Benefits of a Well-Designed Problem Resolution Process with Integrated Reliability Data Capture
An issue reporting and problem resolution system like the one described provides significant benefits to the organization. First, the process facilitates the effective management and control of incidents and problems for products. This brings increased efficiency, cost savings and improved customer satisfaction. In addition, at each step in the process, valuable product quality and reliability information is captured in a targeted, comprehensive and systematic way. With proper analysis, the data can be used to improve product design. For example, issues found in the field can be easily correlated to data from in-house testing. The process also provides data for reliability growth analysis and times-to-failure for detailed life data analysis. The product/quality information can be employed to improve customer support activities by providing a knowledge base of known issues and solutions for customer support personnel. It can also be used to improve product warranty services by giving the organization an early warning of potential problems in the field before they escalate. Finally, the process can produce information about the efficiency of the organization’s efforts to address problems, which can be used to improve these processes and assign adequate resources.
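As a small illustration of how captured incident data can feed life data analysis, the sketch below derives times-to-failure from in-service and failure dates and assigns each failure a median rank using Benard's approximation, a common first step before fitting a life distribution. The records are invented for this example, and the sketch assumes every unit has failed; a real analysis would also need to account for units still operating (suspensions).

```python
from datetime import date

# Hypothetical records: (serial number, in-service date, failure date).
records = [
    ("SN-1001", date(2001, 1, 10), date(2001, 4, 20)),
    ("SN-1002", date(2001, 1, 15), date(2001, 3, 1)),
    ("SN-1003", date(2001, 2, 1), date(2001, 8, 30)),
]


def times_to_failure(rows):
    """Return times-to-failure in days, sorted in ascending order."""
    return sorted((failed - started).days for _, started, failed in rows)


def median_ranks(n):
    """Benard's approximation of the median rank for failures 1..n."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


ttf = times_to_failure(records)
ranks = median_ranks(len(ttf))

# Each pair is one plotting position: (days to failure, estimated unreliability).
for days, rank in zip(ttf, ranks):
    print(f"{days:4d} days  F = {rank:.3f}")
```

The resulting pairs are the points that would be plotted on probability paper or passed to a fitting routine. Calendar days are used as the age measure here only for simplicity; operating hours or cycles may be more appropriate for a given product.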

Implementation Issues
Computer technology can play a key role in efforts to establish an effective incident reporting and problem resolution process with simultaneous data capture and management. As with any computer implementation, the success of the system will depend directly on its ease of use, breadth of scope, flexibility and accessibility for all users. A Web-based approach, which can be easily distributed to users within and outside the organization, is well suited to this type of application. ReliaSoft’s Quality Tracking and Management System (QTMS) is a Web-based system that supports the incident reporting and problem resolution activities described in this article. More information is available on the Web at http://www.ReliaSoft.com/enterprise.

Copyright © HBM Prenscia Inc. All Rights Reserved.