Reliability and Maintainability Analysis for a Remote Telecommunications System
This article presents a fictional example designed to demonstrate some useful techniques for system reliability, maintainability and availability analysis. The purpose is to investigate the reliability and maintainability of a telecommunications system that will be constructed in an uninhabited stretch of jungle. The BlockSim 6 software is used to model the system and perform the analysis.
Figure 1 displays the reliability block diagram (RBD) to describe the reliability-wise configuration of the system. In addition, the transmitter and receiver are made up of three subassemblies each, while the relay stations have two subassemblies each (all in series). Specifically:
These subassemblies are defined in BlockSim as subdiagrams to the master diagram (i.e., separate diagrams linked to blocks in the main diagram). The subdiagrams are presented in Figure 2. In addition, Table 1 presents the failure distributions and parameters that have been estimated from data collected for each subassembly.
Figure 2: Subdiagrams for the three types of components
Table 1: Distribution and Parameters to Describe the Failure Properties of Each Subassembly (in hours)
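Because the subassemblies within each component are connected in series, the reliability of a component at time t is the product of its subassembly reliabilities. The following is a minimal sketch of that calculation, assuming 2-parameter Weibull failure distributions. The function names and the (beta, eta) values are hypothetical placeholders for illustration, not the values from Table 1.

```python
import math

def weibull_reliability(t, beta, eta):
    """R(t) = exp(-(t/eta)^beta) for a 2-parameter Weibull distribution."""
    return math.exp(-((t / eta) ** beta))

def series_reliability(t, subassemblies):
    """Reliability of a series block: product of the subassembly reliabilities."""
    r = 1.0
    for beta, eta in subassemblies:
        r *= weibull_reliability(t, beta, eta)
    return r

# Hypothetical (beta, eta in hours) pairs for a two-subassembly relay.
# These are placeholders, NOT the parameters estimated in Table 1.
example_relay = [(1.5, 50000.0), (1.0, 80000.0)]
r_one_year = series_reliability(8760.0, example_relay)
```

A series block can never be more reliable than its weakest subassembly, which is why each additional subassembly lowers the component reliability.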
In addition, the analysts can generate a Reliability Importance vs. Time plot and use it to determine whether different relays have different impacts on the reliability of the system when they fail, based on their position in the configuration. As shown in Figure 3, the position of the relays within the diagram does matter, even though they are reliability-wise identical. The failure of relay 2 or 5 has the greatest impact on the reliability of the system. Relays 3 and 4 have the second greatest impact and relays 1 and 6 have the smallest impact on system reliability.
Figure 3: Reliability Importance vs. Time plot to compare the impact of the relays on the system reliability
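One common definition of reliability importance is the Birnbaum measure: the change in system reliability when a single block goes from failed to perfectly reliable, with all other blocks held at their current reliability. The sketch below uses a small hypothetical layout (block "a" in series with the parallel pair "b" and "c"), not the relay configuration of this system, to show how blocks with identical reliability can still differ in importance because of their position. Whether this matches the exact computation behind the plot in BlockSim is an assumption.

```python
def system_reliability(r):
    """Hypothetical layout: block 'a' in series with the parallel pair 'b', 'c'."""
    parallel = 1.0 - (1.0 - r["b"]) * (1.0 - r["c"])
    return r["a"] * parallel

def birnbaum_importance(r, name):
    """System reliability with `name` perfect (1) minus with `name` failed (0)."""
    up = dict(r, **{name: 1.0})
    down = dict(r, **{name: 0.0})
    return system_reliability(up) - system_reliability(down)

# All three blocks share the same (hypothetical) reliability of 0.9.
identical = {"a": 0.9, "b": 0.9, "c": 0.9}
importance = {name: birnbaum_importance(identical, name) for name in identical}
```

In this toy layout the series block "a" is far more important than either redundant block, even though all three have the same reliability. Evaluating the same measure at several times, with each block reliability taken from its failure distribution, gives an importance vs. time curve.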
Maintainability and Availability Analysis
Table 2: Distributions and Parameters for Corrective Maintenance Durations for Each Subassembly (in hours)
When the maintenance characteristics for the system have been added to the model, the analysts can use BlockSim's simulation utility to obtain desired results regarding the maintainability and availability of the system. The results generated by completing 10,000 simulation runs for one year (8760 hours) of operation include:
In addition, the analysts can rank the components according to their Failure Criticality Index (RS FCI), which represents the percentage of the system failures that were due to the failure of the given component. As shown in Figure 4, 99.9% of all system failures were due to the transmitter or the receiver. Among those failures, 54% were due to the transmitter and 20.5% of the transmitter failures were due to the solar power supply (SPS1) component. Therefore, improvement to the availability of the SPS1 component will have the greatest impact on the availability of the system.
Figure 4: Summary of selected RS FCI results
Finally, the analysts can estimate the number of spare parts required to maintain the system by looking at the expected number of failures for each component, presented in Figure 5. Because all maintenance in this example involves the replacement of a failed component, a spare part will be required for each failure. For SPS1, 0.9579 failures are expected per year. In other words, maintenance personnel should expect to need roughly one spare part for SPS1 during the year. Of course, the choice as to whether to keep spare parts in stock is based on additional economic and logistic information (e.g., How quickly can the part be obtained? How much does it cost to keep the part on hand?).
Figure 5: Expected failures (spare parts required)
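An expected failure count can be turned into a stocking estimate if a distribution is assumed for the yearly number of failures. The sketch below treats that number as Poisson distributed with mean equal to the expected failures. This is an assumption introduced here for illustration; the simulated failure process in the article need not be Poisson, and the function names are hypothetical.

```python
import math

def prob_at_least_one(expected_failures):
    """P(N >= 1) when N is Poisson with the given mean (an assumption)."""
    return 1.0 - math.exp(-expected_failures)

def spares_for_confidence(expected_failures, confidence):
    """Smallest stock s with P(N <= s) >= confidence under the Poisson assumption."""
    term = math.exp(-expected_failures)  # P(N = 0)
    cumulative = term
    s = 0
    while cumulative < confidence:
        s += 1
        term *= expected_failures / s  # P(N = s) from P(N = s - 1)
        cumulative += term
    return s

# Expected SPS1 failures per year, as reported in Figure 5.
p_demand = prob_at_least_one(0.9579)
stock_95 = spares_for_confidence(0.9579, 0.95)
```

Under the Poisson assumption, an expected count of 0.9579 corresponds to a probability of roughly 0.62 that at least one SPS1 spare is needed during the year, and three spares would cover the yearly demand with at least 95% probability.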
Complex System Analysis
To demonstrate two such factors (limited repair personnel and limited spare parts), we will modify this example to suppose that a subcontractor has been engaged to repair the system when needed and that it takes an average of 36 hours (following a normal distribution with a standard deviation of 6 hours) for a technician to reach the site and begin the repair. Furthermore, we will assume that only two technicians are qualified to service the system and that the subcontractor keeps a single spare each for SPC1 and RLYC1 on hand. When one of these parts is used, another is ordered. On-hand spares are available immediately, but other parts must be ordered and shipped when needed. The arrival time for all parts that are ordered and shipped follows a normal distribution with a mean of 72 hours and a standard deviation of 12 hours.
Under this scenario, the analysts must expand the system model to include additional information on the resources that are required to perform repairs (i.e., maintenance personnel and spare parts). In BlockSim, this requires the assignment of a maintenance crew policy and spare parts policy to each component. The maintenance crew policy describes any limitations on the number of simultaneous repairs that can be performed on the system (two), any logistical delay time before the maintenance personnel can initiate the action (duration follows a normal distribution with Mean = 36 and Std = 6) and any costs associated with engaging the crew (none).
The spare parts policy describes the number of parts in stock (1 each for SPC1 and RLYC1, 0 for the rest), any logistical delay time before an available part can be used for a maintenance action (none) and the conditions for ordering and shipping parts when needed (order 1 when the stock drops to 0; the time for arrival follows a normal distribution with Mean = 72 and Std = 12, in hours).
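The effect of the two logistical delays on the wait before a repair can begin can be explored with a small Monte Carlo sketch. It assumes that the crew is dispatched and a missing part is ordered at the same moment, so the repair starts once both have arrived. That concurrency rule, the truncation of negative samples at zero and the function names are assumptions made for illustration; this is not BlockSim's simulation engine.

```python
import random

def sample_wait_hours(rng, part_in_stock):
    """Sample the delay (hours) between a failure and the start of the repair."""
    crew = max(0.0, rng.gauss(36.0, 6.0))  # crew travel time, Normal(36, 6)
    if part_in_stock:
        return crew  # on-hand spare: no parts delay
    part = max(0.0, rng.gauss(72.0, 12.0))  # order-and-ship time, Normal(72, 12)
    # Assumption: crew dispatch and part order start together,
    # so the repair waits for whichever arrives last.
    return max(crew, part)

def mean_wait_hours(part_in_stock, runs=20000, seed=1):
    """Average wait over `runs` samples, using a seeded generator."""
    rng = random.Random(seed)
    total = sum(sample_wait_hours(rng, part_in_stock) for _ in range(runs))
    return total / runs

mean_stocked = mean_wait_hours(True)
mean_ordered = mean_wait_hours(False)
```

Under these assumptions, a component with a spare on hand waits about 36 hours on average, while a component whose part must be ordered waits about 72 hours, because the shipping time almost always exceeds the crew travel time.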
When the simulation is repeated for one year of operation according to the modified maintenance plan, new maintainability and availability results are generated. For example, the average availability after one year of operation is 99.6% with the personnel and parts limitations established by the subcontractor. This is slightly less than the 99.93% estimated for unlimited spares and maintenance crews. The expected system downtime is 36.21 hours per year, which is greater than the downtime estimate that did not take maintenance resources into account.
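The availability and downtime figures can be cross-checked against each other, because mean availability over a period equals one minus the fraction of the period spent down. The sketch below applies that relationship to one year (8760 hours) of operation; the function names are hypothetical.

```python
HOURS_PER_YEAR = 8760.0

def availability_from_downtime(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Mean availability implied by a total downtime over the period."""
    return 1.0 - downtime_hours / period_hours

def downtime_from_availability(availability, period_hours=HOURS_PER_YEAR):
    """Total downtime (hours) implied by a mean availability over the period."""
    return (1.0 - availability) * period_hours

# Limited crews and spares: 36.21 hours of downtime per year.
limited = availability_from_downtime(36.21)
# Unlimited crews and spares: 99.93% mean availability.
unlimited_downtime = downtime_from_availability(0.9993)
```

A downtime of 36.21 hours corresponds to a mean availability of about 99.59%, consistent with the reported 99.6%, while 99.93% availability implies roughly 6.1 hours of downtime per year.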