Analysis of Air Separation Unit Shutdown Due to Distributed Control System Failure
2026-04-06 05:57:40
Zhuhai Yingde Gas Co., Ltd.'s 16,300 m³/h air separation unit was put into operation in June 2005 and is controlled by a CENTUM CS3000 distributed control system (DCS). The hardware and software of the control system are as follows:

(1) Two operation monitoring stations (HIS), running the Windows 2000 operating system and fitted with 701 network cards for access to the real-time control network (V net). They provide real-time monitoring, operation and configuration functions.
(2) One control station (FCS), which handles process I/O signal input, output and processing and performs real-time control functions such as analog regulation, sequential control and real-time calculation.

1. Fault process

On June 5, 2005 at 15:36:35, the DCS operation monitoring station in the central control room of the air separation unit suddenly emitted an abnormal beeping sound. Immediately afterwards all machines stopped, except for the large water pump of the circulating water system (which was not interlocked with the DCS), and the air separation unit was forced to shut down. The DCS historical message log recorded the following:

15:36:35 FCS0101 Too Heavy Load (control station 0101 is overloaded; this is the first fault message)
15:36:40 FCS0101 IOM Fail F101 NODE 01 SLOT 04
15:36:44 FCS0101 IOM Fail F101 NODE 01 SLOT 01
All NODE (control substation) IOMs (I/O modules) failed, and all tag numbers showed input circuit open (IOP) and output circuit open (OOP) alarms.
15:37:11 IOM module IOP and OOP alarms returned to normal.

After the fault occurred, the instrument engineer immediately contacted the DCS supplier's technical staff and carried out the following steps under their guidance:

(1) The control station (FCS0101) was powered off together with its CPU backup battery, which erased the project software held in memory.
(2) After 5 minutes, the control station was restarted. The project software was then downloaded again from the engineering station to the control station, and the engineering station and operator stations were restarted. After the restart, the FCS0101 Too Heavy Load system alarm reappeared, but all other data returned to normal.
(3) To resume production as soon as possible, the air compressor was started at 19:33:55 and the process was gradually restored. However, after the oxygen compressor was started, gaps reappeared in some of the data trend graphs, while everything else appeared normal.

2. Causes of the fault and handling measures

In April 2005, during commissioning of the air separation unit, the instrument engineer completed the configuration of the project software for the process, and it passed testing. However, once the whole air separation unit was started and running normally, the DCS trend records began to show gaps, and these became progressively worse. This did not receive enough attention at the time. In fact, the DCS was already overloaded by then, although not yet severely.

In this shutdown, the DCS raised the alarm "control station system overload". On inspection, the CPU IDLE TIME of control station FCS0101 was found to be 0, indicating that the system did not have enough time to process all events within the specified scan cycle. A longer CPU IDLE TIME indicates a lower system load. The CENTUM CS3000 R3.03.00 software documentation states only that CPU IDLE TIME should be "with a little time", that is, some idle time should remain. In practice, however, the manufacturer recommends a CPU IDLE TIME greater than 9 seconds, corresponding to a system load below 85%; the minimum safe value is 5 seconds, corresponding to a system load of 91.7%.
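The load figures quoted above are consistent with reading CPU IDLE TIME as idle seconds out of a 60-second window: 9 s idle gives 85% and 5 s gives 91.7%. The following is a minimal sketch of that conversion. The 60-second window is an assumption inferred from the figures in the text rather than stated in it, and `load_percent` is a hypothetical helper name, not part of any DCS software.

```python
# Assumption: CPU IDLE TIME is counted in seconds per 60-second window.
# This is inferred from the article's figures (9 s -> 85%, 5 s -> 91.7%).
WINDOW_S = 60.0


def load_percent(idle_seconds: float, window_s: float = WINDOW_S) -> float:
    """Return the CPU load in percent for a given idle time within the window."""
    if not 0 <= idle_seconds <= window_s:
        raise ValueError("idle time must lie between 0 and the window length")
    return (window_s - idle_seconds) / window_s * 100.0


if __name__ == "__main__":
    # Idle times mentioned in the article: fault state, minimum, recommended.
    for idle in (0, 5, 9):
        print(f"idle {idle:>2} s -> load {load_percent(idle):.1f}%")
```

Under this assumption, an idle time of 0 corresponds to a fully loaded CPU, which matches the "Too Heavy Load" message recorded at the time of the fault.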
After identifying the problem, the instrument engineer immediately optimized the project software, changing function blocks that did not need "high-speed scan" to "medium-speed scan" or "basic scan". The modification was completed at 11:57 on June 6, and CPU IDLE TIME rose from 0 to 6 seconds. Subsequent observation showed no gaps in the DCS trend charts and no other anomalies. Further optimization of system resources raised CPU IDLE TIME to 10 seconds, and the fault was resolved at its root. The DCS supplier's analysis report on the project software confirmed that the root cause was excessive load on the DCS, which left it momentarily unable to complete its computation and control functions.

In addition to these measures, the system memory setting of the Windows 2000 operating system on the PCs was modified, raising the minimum value from 384 MB to 768 MB. A strict inspection routine was also established, with designated personnel checking the operation of the HIS and FCS daily, including CPU IDLE TIME.

3. Conclusion

After the fault, instrument engineers at the other gas companies within Yingde Gases Corporation were promptly asked to check their own DCS systems (all of the same model). Their CPU IDLE TIME values were 34 s, 5 s and 10 s respectively, all within safe limits, although further optimization and control of system resources was still needed. Based on the specific circumstances of each plant, and with equipment safety ensured, the project software was revised and system resources were optimized to safeguard normal operation of the air separation units.
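The daily CPU IDLE TIME check can be sketched as a simple threshold test. This is an illustrative sketch only: the function name `classify_idle_time`, the status labels and the plant names are hypothetical, and the thresholds are the manufacturer's figures quoted in section 2 (more than 9 s recommended, 5 s as the minimum safe value).

```python
RECOMMENDED_IDLE_S = 9  # manufacturer recommends an idle time greater than this
MINIMUM_IDLE_S = 5      # minimum safe value quoted in the article


def classify_idle_time(idle_seconds: float) -> str:
    """Classify a CPU IDLE TIME reading against the two quoted thresholds."""
    if idle_seconds < 0:
        raise ValueError("idle time cannot be negative")
    if idle_seconds > RECOMMENDED_IDLE_S:
        return "ok"
    if idle_seconds >= MINIMUM_IDLE_S:
        return "marginal"  # within the safe limit, below the recommended margin
    return "overload risk"


if __name__ == "__main__":
    # Readings reported for the other plants; the plant names are placeholders.
    readings = {"plant A": 34, "plant B": 5, "plant C": 10}
    for name, idle in readings.items():
        print(f"{name}: idle {idle} s -> {classify_idle_time(idle)}")
```

With these thresholds the 5 s reading sits exactly on the minimum safe value, so it is inside the safe limit but below the recommended margin, in line with the remark that further optimization was still needed.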