PLC faults fall into two categories: software faults and hardware faults. This article shares PLC troubleshooting experience, illustrated with real-world examples.
The PLC itself rarely fails: the probability of damage to PLC hardware or a malfunction in its software is extremely low. When troubleshooting, focus on the electrical components around the PLC, because most faults attributed to the PLC are actually failures of external interface signals. During repair, as long as some of the actions controlled by the PLC are normal, there is no need to suspect the PLC program. If the program logic sets an output but the PLC output terminal shows no output, the interface circuit is faulty. Across the PLC system as a whole, hardware faults are more common than software faults, and most are caused by external signal conditions not being met or by actuator failures, rather than by problems with the PLC itself.
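The reasoning above can be written down as a small decision rule for a single output channel. This is only a sketch: the function name, its three boolean arguments, and the returned labels are illustrative assumptions, not part of any PLC vendor's tooling.

```python
def locate_output_fault(program_output_on: bool,
                        terminal_output_on: bool,
                        actuator_moves: bool) -> str:
    """Suggest where to look first for one output channel (hypothetical helper)."""
    if not program_output_on:
        # The logic never commanded the output, so an external
        # signal condition (input, interlock) is probably not met.
        return "external input signals or interlocks"
    if not terminal_output_on:
        # The program sets the output but the terminal stays off.
        return "PLC output interface circuit"
    if not actuator_moves:
        # The terminal is energized but nothing happens in the field.
        return "field wiring or actuator"
    return "no fault on this channel"


# Example: the program sets the output, but the terminal shows nothing.
print(locate_output_fault(True, False, False))
```

Checking in this order follows the article's advice: the PLC program is the last suspect, not the first.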
Faults can be diagnosed based on the PLC's input and output status. PLC input and output signals all pass through I/O channels, and some faults will be reflected on the I/O interface channels. Sometimes, by observing the I/O interface status, the cause of the fault can be found.
All PLCs have self-diagnostic functions. When checking for faults, the cause and location of a fault can often be identified from the alarm information, and this is the basic method for checking and troubleshooting PLC faults. First, determine whether the fault is global or local. If the host computer shows many control components malfunctioning and displays many alarm messages, check the components they share: the CPU module, memory module, communication module, power supply, and other common components.
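As a rough illustration of the global-versus-local decision, the sketch below counts how many distinct components are reporting alarms. The threshold of three components and all names are assumptions made for this example; a real system would use whatever criterion suits the plant.

```python
from typing import Iterable, List

# Shared components named in the text above.
COMMON_COMPONENTS = ["CPU module", "memory module",
                     "communication module", "power supply"]


def classify_fault_scope(alarmed: Iterable[str], global_threshold: int = 3) -> str:
    """Return "global" when many distinct components alarm at once (assumed rule)."""
    distinct = set(alarmed)
    return "global" if len(distinct) >= global_threshold else "local"


def components_to_check(alarmed: Iterable[str], global_threshold: int = 3) -> List[str]:
    """List what to inspect first for the given set of alarms."""
    alarmed = list(alarmed)
    if classify_fault_scope(alarmed, global_threshold) == "global":
        # Many alarms at once point to something the components share.
        return list(COMMON_COMPONENTS)
    # Otherwise start with the component(s) that raised the alarm.
    return sorted(set(alarmed))


print(components_to_check(["pump 1", "valve 3", "fan 2", "pump 1"]))
```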
Experience shows that most faults in PLC control systems are found by checking against the PLC program. A PLC control system operates in a specific sequence, so observing the system's actions and comparing the faulty behavior with normal behavior reveals most problems and leads to the cause of the fault.
Some faults show the cause of the alarm directly on the screen. Others produce alarm information that does not directly reflect the cause, and still others produce no alarm at all but simply fail to execute certain actions. In the latter two cases, tracing the execution of the PLC program is an effective way to find the fault.
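Comparing a faulty run with the normal sequence can be sketched as finding the first step where the two differ. The step names and the function below are hypothetical; in practice the observed sequence would come from watching the machine or from the program status display.

```python
from typing import Optional, Sequence, Tuple


def first_divergence(normal: Sequence[str],
                     observed: Sequence[str]
                     ) -> Optional[Tuple[int, str, Optional[str]]]:
    """Return (step index, expected action, observed action) at the first
    mismatch, or None if every expected step occurred in order."""
    for index, expected in enumerate(normal):
        # A missing step is reported as None.
        actual = observed[index] if index < len(observed) else None
        if actual != expected:
            return index, expected, actual
    return None


normal_cycle = ["clamp", "drill", "retract", "unclamp"]
faulty_cycle = ["clamp", "drill"]  # the machine stopped after drilling

print(first_divergence(normal_cycle, faulty_cycle))  # (2, 'retract', None)
```

The step at which the sequences diverge tells you which conditions in the program to trace: here, whatever enables the retract action.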
PLC System Fault Analysis
A PLC mainly consists of a central processing unit, input interfaces, output interfaces, and communication interfaces. The CPU is the core of the PLC, the I/O components are the interface circuits connecting field devices to the CPU, and the communication interface is used to connect to a programmer and a host computer. For an integrated PLC, all components are housed in the same casing; for a modular PLC, each functional component is independently packaged, called a module or template. These modules are connected via a bus and mounted on a rack or rail.
PLC control system faults can be divided into two parts: software faults and hardware faults. A PLC system includes a central processing unit, main chassis, expansion chassis, I/O modules, and related networks and external devices. Field production control equipment includes I/O ports and field control and detection devices, such as relays, contactors, valves, and motors.
1. PLC software malfunction
PLCs have self-diagnostic capabilities. When a module malfunctions, the PLC can often raise an alarm and react according to a pre-programmed procedure, and the fault can be identified from the fault indicator lights. If the power supply is normal, all indicator lights are normal, and in particular the input signals are normal, but the system still behaves abnormally (no output or erratic output), follow the principle of working from easy to difficult and from software to hardware: first check whether there is a problem with the user program.
The user program is stored in the PLC's RAM, which loses its contents when power is removed. If the backup battery has failed and the system power supply is then interrupted, the program is very likely to be lost or corrupted. Strong electromagnetic interference can also cause program errors.
2. PLC hardware failure
① PLC host system failure
A. Power system failure. The power supply operates continuously and must dissipate heat, and voltage and current fluctuations are unavoidable, so it is prone to failure.
B. Communication network system failure. Communication links and networks are highly susceptible to external interference, and the external environment is one of the biggest causes of failure in external communication equipment. Damage to the system bus arises mainly because most PLCs use a plug-in structure: repeated plugging and unplugging of modules over long-term use can damage the bus at the printed circuit boards, backplanes, and connector interfaces. Changes in air temperature and humidity add to this through aging of the plastic parts, aging of the printed circuits, and oxidation of the contact points, all of which wear down the system bus.
② PLC I/O port failure.
I/O module failures are caused mainly by external interference of various kinds. First, use the modules in accordance with their specified requirements, and do not arbitrarily reduce their external protection devices. Second, analyze the main interference factors and isolate or otherwise deal with the main interference sources.
③ Failure of on-site control equipment
A. Relays and contactors. In a harsh field environment, contactor contacts are prone to arcing or oxidation, which causes them to overheat, deform, and eventually become unusable. To reduce such failures, select high-performance relays whenever possible and improve the operating environment of the components so that they need replacing less often.
B. Valves, gates, and similar equipment. The main causes of failure are long-term use without maintenance and mechanical or electrical malfunctions. The key actuators in this type of equipment generally travel a large distance, require several steps such as electrical conversion to change the position of the valve or gate, or use electric actuators to push and pull the valve or gate into position. As a result, slight deficiencies in the mechanical, electrical, or hydraulic components can lead to errors or failures.
C. Switches, limit devices, safety protection devices, and on-site operating components. Failures in these components may be caused by long-term wear, or by corrosion after prolonged disuse. The main remedy is regular maintenance to keep the equipment in good working order. For limit switches, especially those on heavy equipment, multiple protective measures should be built into the design in addition to regular inspections.
D. Faults in sub-devices of the PLC system, such as junction boxes, terminals, bolts, and nuts. These faults are mainly caused by the manufacturing and installation processes of the equipment itself, as well as long-term arcing and corrosion. Based on engineering experience, these faults are generally difficult to detect and repair. Therefore, the installation and maintenance of the equipment must be carried out in strict accordance with the installation requirements to avoid leaving any potential hidden dangers.
E. Sensor and instrument malfunctions. These generally show up as abnormal signals in the control system. When installing such equipment, ground the shield of each signal cable reliably at one end, and route signal cables separately from power cables as far as possible, especially from the output cables of frequency converters, which are a strong source of interference. In addition, apply software filtering inside the PLC.
F. Noise (interference) faults in power supply, ground, and signal lines.
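The software filtering mentioned under item E can take many forms. The sketch below shows one common choice, a moving average over the most recent samples, written in Python purely for illustration; the class name and the window size of 4 are assumptions, and on a real PLC the same idea would be implemented in the controller's own programming language.

```python
from collections import deque


class MovingAverageFilter:
    """Average the last `window` raw samples to smooth out noise spikes."""

    def __init__(self, window: int = 4):
        if window < 1:
            raise ValueError("window must be at least 1")
        # deque with maxlen drops the oldest sample automatically.
        self._samples = deque(maxlen=window)

    def update(self, raw: float) -> float:
        """Add one raw reading and return the filtered value."""
        self._samples.append(raw)
        return sum(self._samples) / len(self._samples)


# A steady reading of 10.0 with a single noise spike of 50.0.
signal_filter = MovingAverageFilter(window=4)
for reading in [10.0, 10.0, 50.0, 10.0]:
    print(signal_filter.update(reading))
```

A larger window suppresses spikes more strongly but makes the filtered value respond more slowly to real changes, so the window size is a trade-off for each signal.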
Analysis of PLC System Fault Examples
1. Examples of PLC software faults
A PLC control system that has been shut down for a period of time fails to start after power is applied.
Fault inspection and handling: After an inspection, the maintenance personnel assumed the program was faulty. They inserted the EPROM card into the PLC, performed a total erase of the PLC memory, copied the program from the EPROM, and restarted, but the fault persisted. Since the program was small, they read it from the EPROM line by line and compared it with the instruction listing in the manual; the two were identical. After repeated copying proved ineffective, they concluded that the PLC hardware was at fault.
The backup program was then retrieved using a PG (programming device) and compared with the program on the EPROM. The instruction lists were identical, but the program storage addresses differed. After the backup program was transferred to the PLC, the equipment operated normally. The fault was therefore in the program stored on the EPROM, not in the PLC hardware; erasing and rewriting the EPROM resolved the problem.
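The lesson of this example is that comparing instructions alone is not enough; the storage addresses must match as well. The sketch below makes the two checks explicit. The program representation (a list of address and instruction pairs) and the sample data are invented for illustration and do not reflect any real PLC memory format.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (storage address, instruction text).
Program = List[Tuple[int, str]]


def compare_programs(backup: Program, candidate: Program) -> Dict[str, bool]:
    """Compare instructions and storage addresses separately."""
    instructions_match = ([ins for _, ins in backup]
                          == [ins for _, ins in candidate])
    addresses_match = ([addr for addr, _ in backup]
                       == [addr for addr, _ in candidate])
    return {"instructions_match": instructions_match,
            "addresses_match": addresses_match}


backup_copy = [(0, "A I 0.0"), (2, "= Q 0.0")]
eprom_copy = [(4, "A I 0.0"), (6, "= Q 0.0")]  # same code, shifted addresses

print(compare_programs(backup_copy, eprom_copy))
# {'instructions_match': True, 'addresses_match': False}
```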
2. Examples of PLC hardware failures
① A Siemens PLC (S7-300, CPU315-2DP) in a petrochemical plant suddenly stopped operating during use.
Fault Diagnosis and Analysis: The alarm lights, program, and power supply were checked. The alarm check found the BAT light on the CPU lit, and the program inspection showed that the program did not handle a battery failure. Fault Resolution: The CPU battery was replaced, and the program was modified to handle battery failure.
② One evening, communication between the compressor PLC and the main control PLC suddenly ceased. The main control DCS displayed a communication interruption alarm. Motor signals in the compressor control room all displayed red (stopped) on the main control DCS. Some flow, pressure, and temperature signals in the compressor control room also displayed high/low alarms on the main control DCS. Because of the communication interruption, some important interlocks in the compressor control room could not be transmitted to the main control, causing a plant-wide shutdown.
Fault inspection and analysis: Theoretically speaking, there are two main reasons for the interruption of communication between the compressor PLC and the main control PLC: one is software asynchrony; the other is hardware failure such as CP525 card or CPU card failure.
First, the issue was addressed from a software perspective. A synchronization operation was performed on the main control PLC, forcing bit 14 of the communication data word DW13, but communication still failed to establish. Therefore, the problem was not caused by desynchronization within the main control PLC. Next, a synchronization operation was performed on the compressor PLC, forcing bit 14 of the communication data word MW10, and communication was established. This confirmed that the communication interruption between the compressor PLC and the main control PLC was caused by desynchronization of the compressor program, and the cause of this desynchronization was external electromagnetic interference.
Troubleshooting: To prevent the recurrence of this type of fault, the shielding of the control room should be strengthened, and the use of mobile phones and other communication tools in the control room should be prohibited.
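The synchronization step in example ② amounts to forcing a single bit in a 16-bit data word. As a generic illustration of that bit operation (not of the Siemens forcing procedure itself), the sketch below sets and tests bit 14 of a word held as a Python integer. It assumes bit 0 is the least significant bit; the helper names are hypothetical.

```python
SYNC_BIT = 14  # bit number taken from the example above


def set_bit(word: int, bit: int) -> int:
    """Return the 16-bit word with the given bit forced to 1."""
    if not 0 <= bit <= 15:
        raise ValueError("bit must be in the range 0..15")
    return (word | (1 << bit)) & 0xFFFF


def bit_is_set(word: int, bit: int) -> bool:
    """Report whether the given bit of the word is 1."""
    return (word >> bit) & 1 == 1


word_value = 0x0000
word_value = set_bit(word_value, SYNC_BIT)
print(hex(word_value))                   # 0x4000
print(bit_is_set(word_value, SYNC_BIT))  # True
```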
③ SF light alarm on Siemens PLC (S7-300)
Fault Inspection and Analysis: The SF (system fault) light signals a group error; in this case it pointed to a fault at an input point. Fault Handling: The operating status of each input point was checked, and a temperature transmitter in the field was found to be delivering no input signal. After this was corrected, the fault disappeared.
④ An input point of the PLC has nothing connected to it externally (the result is the same even with the wire removed from the input terminal), yet the input registers as on and its indicator light stays lit. Fault analysis: Either the input is being bridged from an adjacent terminal that is energized, for example by iron filings lodged between the PLC input terminals, or the input point itself is damaged.
Troubleshooting: All wiring was disconnected from the PLC input terminals, and a large amount of iron filings was found on the input terminal blocks. The filings were blown off the terminals and the wiring was reconnected, which eliminated the fault.
⑤ The SF light on the PLC digital input card of the control system turns red.
Fault Inspection and Analysis: Re-energizing the card did not clear the fault, and restarting the PLC host did not help either; the fault indicator light remained red. A thorough inspection of the field signals received by the card revealed a faulty feedback switch. Measurements with a multimeter showed infinite circuit resistance, confirming that the fault reported by the digital input card came from this switch. Fault Resolution: Replacing the faulty switch with a spare extinguished the fault indicator light.
⑥ The field signals received by the analog input card of the granulator PLC control system are indicated as infinity on the DCS.
Fault Diagnosis and Analysis: The initial analysis suggested a fault in the communication cable connecting the pressure transmitter and the junction box, but replacing the cable did not resolve the issue. A thorough inspection of the entire circuit left three possible sources of the fault: the pressure transmitter itself, the communication cable, and the analog input card. The pressure transmitter and communication cable were ruled out. When the card was disassembled, a small integrated circuit inside was found to be burnt out. Fault Resolution: The faulty card was replaced.
⑦ In a system where two PLCs are hot-standby controllers, only one can operate while the other remains stopped.
Fault Diagnosis and Analysis: After the entire control cabinet was powered down and powered back on, starting both PLC main units simultaneously still resulted in only one of them running. Consulting the relevant documentation showed that organization blocks OB70 and OB72 handle redundancy errors. Without these two blocks, system redundancy is lost and only one CPU can run. Fault Resolution: After OB70 and OB72 were added to the program, the control system returned to normal.