
Discuss distributed systems

2026-04-06 03:12:11 · #1
Today, automation systems in industries such as metallurgy, petrochemicals, chemicals, power, cement, automotive, pharmaceuticals, and food and beverage almost invariably employ distributed control systems. With the development of electronic, communication, and software technologies, distributed systems have undergone significant changes. This article explores that evolution.

Centralized Systems

Early instrument and electrical control systems were centralized. Large-scale industrial production lines had not yet appeared and the physical area under control was small, so there was no need for distributed systems. Equipment was controlled independently, with control systems installed next to the equipment. Input/output cabling distances typically did not exceed twenty meters, and there were no communication requirements between machines or between machines and the workshop or plant level. Operators interacted with equipment through buttons and indicator lights. Functionality was simple, and the products manufactured were characterized by "large batches, few varieties." Automation was therefore at a rudimentary stage.

Remote Input/Output Systems

As assembly lines, automated lines, and production lines grew, the requirements placed on control systems increased. Input/output cabling distances that had once been tens of meters now needed to reach hundreds of meters, which created two problems. First, excessive input/output distances caused significant signal attenuation, leading to malfunctions or even complete failure (wiring distances today are typically limited to within 400 meters).
Second, with parallel input/output wiring, cable cost grew with distance, and calibration, debugging, operation, and maintenance all became harder, significantly increasing the project's total cost of ownership (TCO). To address these issues, automation manufacturers introduced a structure called the remote input/output system. Its key feature is the physical separation of the CPU rack from the input/output modules: modules serving remote input/output points are no longer installed on the local rack but on a remote rack. The local rack and remote rack are connected by a remote communication cable, and the input/output modules on the remote rack connect to the surrounding input/output points. This replaces the original parallel cabling with a serial cable (the remote cable), reducing wiring costs. (In theory, changing from a parallel connection to a serial one reduces system reliability; that is the price paid for the structural change.)

This structural change is technically complex. After receiving an input signal, the remote input module processes it: first converting it into a digital signal, then into a data frame with a defined format, commonly called a "protocol." The frame travels through the remote substation module and the remote communication cable to the remote master station module, and on to the CPU station for processing. The processed result travels back through the remote master station module and the remote cable to the remote substation module, where it is "translated" and delivered by the output module to the actuator. We benefit from the remote system, but we are also constrained by it: the remote master station module, the remote cable, and the remote substation module become the weak points of the entire system and require special attention.
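As a rough illustration of the frame-building step, the sketch below packs input states into a frame and verifies it on the receiving side. The layout, the function names, and the simple additive checksum are all invented for illustration; real remote protocols (Modbus RTU, Profibus DP, and others) define their own frame formats and stronger check sequences.

```python
import struct

def encode_remote_frame(station_addr: int, channel_values: list[int]) -> bytes:
    """Pack digital input states into an illustrative remote-I/O frame.

    Layout (invented for illustration, not any real protocol):
    [station address: 1 byte][payload length: 1 byte][payload][checksum: 1 byte]
    """
    payload = bytes(channel_values)
    frame = struct.pack("BB", station_addr, len(payload)) + payload
    checksum = sum(frame) & 0xFF  # simple additive check, standing in for a real FCS
    return frame + bytes([checksum])

def decode_remote_frame(frame: bytes) -> tuple[int, bytes]:
    """Verify the checksum and return (station address, payload)."""
    *body, checksum = frame
    if sum(body) & 0xFF != checksum:
        raise ValueError("checksum mismatch: frame corrupted in transit")
    station_addr, length = body[0], body[1]
    payload = bytes(body[2:2 + length])
    return station_addr, payload
```

A frame that arrives intact decodes back to the original station address and channel states; a corrupted frame is rejected at the decoding step rather than silently delivering wrong outputs.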
Our principle is to strengthen the parts that are weak and prone to failure. To guard against master station failures, we make the master station module redundant; to guard against cable failures, we make the cable redundant; to guard against substation failures, we make the substation module redundant. Such systems are achievable. Considering cost, many manufacturers have adopted a compromise: each remote master station and remote substation is a separate module, available in single-port and dual-port versions, connecting one or two remote cables so that cable redundancy can be achieved.

Redundant cables can operate in several modes:
■ both cables operate simultaneously, and the signals are compared before output;
■ both cables operate simultaneously, but only one signal is output;
■ one cable is active and one is on standby, switching only when the active cable fails;
■ one cable is active and one is on standby, switching at fixed time intervals.

Switching between active and standby is controlled entirely by the system and performed automatically, generally requiring no user intervention. Which mode a manufacturer chooses depends on both the technical difficulty and the commercial cost of implementation. Each approach has its rationale; it is a choice, and also a form of differentiated competition. In short, for users, dual-cable systems are always more reliable than single-cable systems, providing peace of mind.

The above covers the hardware; now let's talk about the software. Industrial environments are harsh: dust, vibration, impact, and voltage fluctuations. The biggest disturbances come from large equipment, such as the starting and stopping of large motors, which trigger a chain of events: contactors engaging and disengaging, soft starters starting and stopping, and inverters running and shutting down.
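The "one cable active, one on standby, switch on failure" mode can be sketched as follows. This is a hedged illustration, not any vendor's implementation: `RedundantLink` and its channel callables are invented names, and real systems perform the switchover in communication firmware, transparently to the application.

```python
class RedundantLink:
    """Hot-standby pair of communication channels (illustrative sketch).

    Models the mode where one cable is active and one is on standby:
    reads go to the active channel, and a failure triggers an automatic
    switchover to the standby channel, invisible to the caller.
    """
    def __init__(self, primary, backup):
        self.channels = [primary, backup]  # callables standing in for cables
        self.active = 0                    # index of the channel currently in use

    def read(self):
        for _ in range(len(self.channels)):
            try:
                return self.channels[self.active]()
            except IOError:
                # active cable failed: switch to the standby automatically
                self.active = (self.active + 1) % len(self.channels)
        raise IOError("both cables failed; raise alarm to operator")
```

The caller simply calls `read()`; only when both cables fail does the fault propagate upward as an alarm, which matches the principle that switchover requires no user intervention.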
The operation of high-voltage and medium-voltage switchgear also generates strong electrical interference and electromagnetic radiation. Such equipment is often installed alongside the control cabinets of automation systems and easily disturbs low-level communication signals. Interference immunity is therefore the primary challenge for industrial communication.

Beyond physical measures, such as shielded cables, single-ended grounding of the shield, terminating resistors at both ends of signal lines, separating power lines from signal lines (or maintaining a minimum distance), and separating the automation system's ground from other systems', it is also crucial to work on the communication protocol. As mentioned earlier, remote protocols have defined frame formats, part of which is the verification of transmitted data to ensure accuracy. This part is the FCS (Frame Check Sequence), a field carrying the result of a detection algorithm applied to the communication data. Two commonly used algorithms are the Cyclic Redundancy Check (CRC) and the Longitudinal Redundancy Check (LRC), available in 8-bit, 16-bit, and 32-bit versions; the more bits, the stronger the error-detection capability.

The basic principle of verification is as follows. Before sending, the transmitting station performs a calculation (such as a CRC) over the data in the frame and places the result in the FCS field. When the frame arrives, the receiving station performs the same calculation on the received data and compares its result with the FCS content in the frame. If they match, the transmission was error-free, the frame's transmission task is complete, and processing moves on. If they do not match, the transmission was faulty, and the sender is asked to retransmit.
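As a concrete example of the FCS mechanism, here is a sketch using the 16-bit CRC polynomial employed by Modbus RTU; other protocols use different polynomials and FCS widths, and `send_frame`/`receive_frame` are illustrative names rather than a real API.

```python
def crc16(data: bytes) -> int:
    """CRC-16 with the reflected 0x8005 polynomial (as used by Modbus RTU)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc

def send_frame(payload: bytes) -> bytes:
    """Sender side: append the FCS so the receiver can verify the frame."""
    fcs = crc16(payload)
    return payload + fcs.to_bytes(2, "little")

def receive_frame(frame: bytes) -> bytes:
    """Receiver side: recompute the CRC and compare it with the received FCS."""
    payload, fcs = frame[:-2], int.from_bytes(frame[-2:], "little")
    if crc16(payload) != fcs:
        raise ValueError("FCS mismatch: request retransmission")
    return payload
```

A single flipped bit anywhere in the payload or the FCS makes the comparison fail, which is what lets the receiver request a retransmission instead of acting on corrupted data.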
The sending end usually limits the number of retransmissions, for example to 3. Consecutive errors indicate a serious line fault, perhaps even a broken cable, so the system stops retrying and immediately raises an alarm to the CPU and the host computer operator.

Distributed Control System

The process control industry, also known as the instrumentation control industry, evolved from discrete instrument control to integrated control systems known as DCS (Distributed Control System). When this technology was introduced to China, the term was rendered as "collect-and-distribute control system" (集散控制系统) rather than literally as "distributed control system," to convey the idea of decentralized control with centralized management. The English name, however, emphasizes the concept of "distribution." What does distribution really mean? Conceptually, distribution means spreading risk: preventing the entire system from being paralyzed by a problem in a single component or part. Structurally, it means distributing the critical parts of the control system: spreading work across several CPUs instead of one, spreading memory across multiple modules, turning the bus into a redundant network, and turning the operating system into a real-time, network-based, multi-tasking operating system.

Let's compare a DCS to a familiar PC. A PC consists of a central processing unit (CPU), internal memory (RAM), external memory (a hard drive), an internal bus (ISA, VESA, PCI, etc.), input devices (keyboard, mouse, optical drive, camera, etc.), and output devices (monitor, printer, plotter, etc.). A DCS turns these components into node units on a network, each equivalent to an independently operating PC, to distribute risk.
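The load-sharing argument can be made concrete with a toy scheduler. Everything here is illustrative: the function name, the per-loop cost figures, and the ceiling taken from the 80%-load rule of thumb mentioned above.

```python
def distribute_loops(loop_costs, controllers, limit=0.8):
    """Assign control loops to the least-loaded controller (greedy sketch).

    loop_costs: the fraction of one CPU each control loop consumes
    (assumed numbers for illustration). Raises if any controller would
    exceed the load limit, mirroring the rule of thumb that sustained
    CPU load above ~80% risks overheating and crashes.
    """
    loads = [0.0] * controllers
    for cost in sorted(loop_costs, reverse=True):
        idx = loads.index(min(loads))  # pick the least-loaded controller
        loads[idx] += cost
        if loads[idx] > limit:
            raise RuntimeError("add another controller: load limit exceeded")
    return loads
```

With one controller, a workload totaling 140% of a CPU is impossible; split across two controllers, the same loops fit comfortably under the limit, which is the essence of distributing load as well as risk.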
For example, in a DCS, the process controller and logic controller are equivalent to two CPUs, handling instrument control and electrical control respectively; the application data manager is equivalent to internal memory; the historical data manager is equivalent to external memory; the operator station and engineer station are equivalent to input/output units; and all node units are connected through a real-time local area control network, forming a basic DCS. A DCS is, in essence, an enlarged PC, but with a qualitative change: the failure of a single node unit does not bring down the entire system, unlike our PCs, which crash frequently and demand driver reinstalls.

A DCS distributes not only the risk but also the CPU load. In a PC, a single CPU performs all tasks; in a DCS, multiple CPUs share the work, reducing the load on each. This matters because when CPU load exceeds a certain value, say 80%, overheating, crashes, and other problems can follow; in other words, failure becomes possible. This is why many gamers use games like "Need for Speed" to stress-test a CPU: it is essentially an overload test. From another angle, raising CPU performance purely through integration density and clock frequency has hit physical and material limits, which is why both Intel and AMD turned to dual-core and multi-core designs. This, too, shows that multiple CPUs can improve overall system capability.

[align=center]Figure 2: Typical DCS Structure[/align]

Fieldbus Systems

The basic concepts and structure of remote input/output systems have been introduced above.
A closer look reveals that while this structure solves the wiring problem from local to remote locations, it still uses traditional racks, traditional modules, and traditional parallel wiring at the remote rack. We might ask: is it possible to change the traditional wiring method fundamentally? Are there sensors and actuators that use serial connections instead of parallel ones? The answer is: use fieldbus!

The history of fieldbus dates back to the 1980s. Because industrial automation lacked dominant players of the stature of IBM, Microsoft, or Intel, fieldbus types proliferated, somewhat like the feudal states of the Spring and Autumn and Warring States periods, each operating independently and numbering in the dozens. Even the IEC approved eighteen types. Commonly used fieldbuses include AS-i, CAN, Modbus, InterBus, Profibus, FF, and HART.

[align=center]Figure 3: Typical AS-i Configuration[/align]

The biggest contribution of fieldbus to industrial control and automation is that it completely overturns traditional wiring and pushes intelligence out to sensors and actuators. First, as mentioned, traditional wiring is parallel, while fieldbuses uniformly use serial wiring; parallel wiring is allowed only at certain nodes, for compatibility with traditional methods. In topology, some fieldbuses support tree, star, ring, and hybrid layouts in addition to the bus layout. In distance, they range from 100 meters to 1000 meters, even several kilometers; in speed, from as low as 9600 baud to several megabits or tens of megabits per second. If Ethernet technology is incorporated into the fieldbus, distances can reach tens to hundreds of kilometers, and speeds can reach gigabit or even 10-gigabit rates. Second, sensors and actuators become intelligent.
Fieldbus can transmit not only the values of traditional physical quantities, such as discrete, analog, and pulse quantities, but also the status of the sensors and actuators themselves. This is a qualitative leap, and a task that was impossible before fieldbus. Imagine sensors installed in places people can hardly reach: chimneys, underground sites, nuclear facilities. If these sensors could proactively report their operating status to the control system, it would be fantastic news for field maintenance personnel and would significantly reduce maintenance costs. Reportedly, one manufacturer's transmitter offers as many as 37 diagnostic parameters, covering everything from broken wires and excessive temperatures to faulty solder joints; in short, virtually every problem you might worry about. Wouldn't users be thrilled by such a product?

Intelligent sensors have also shifted maintenance from a passive to a proactive approach. Consider an example. Every sensor has a lifespan, the most important parameters being electrical life and mechanical life. These two are different, and reaching the rated life does not necessarily mean the sensor is unusable; it may keep working for a while, but with a growing risk of failure. Traditional maintenance takes one of two forms. One is to investigate only after the sensor malfunctions, becomes unusable, or affects production, and then replace it; this inevitably impacts production, causing delays and quality losses. The other is to replace all sensors on a fixed schedule to prevent failures from disrupting production; this is more expensive, and some products are replaced after only brief use, which is wasteful.
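A proactive alternative, enabled by the sensor's own processor and memory, is to count operating cycles against the rated life and alert before the limit is reached. A minimal sketch, with an invented class name and example thresholds; real smart transmitters keep such counters in their own firmware:

```python
class CycleCounter:
    """Count a sensor's operations and warn as it nears end of life.

    rated_life and the 90% alert threshold are example figures; real
    devices take them from the manufacturer's datasheet.
    """
    def __init__(self, rated_life=200_000, alert_ratio=0.9):
        self.rated_life = rated_life
        self.alert_at = int(rated_life * alert_ratio)
        self.count = 0

    def operate(self) -> str:
        self.count += 1
        if self.count >= self.rated_life:
            return "REPLACE"   # rated life reached: failure is imminent
        if self.count >= self.alert_at:
            return "ALERT"     # schedule a replacement proactively
        return "OK"
```

Maintenance staff see the "ALERT" state on the operator screen well before the sensor reaches its rated life, so replacement can be scheduled instead of forced by a breakdown.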
Now consider proactive maintenance, which makes full use of the sensor's intelligence, its processor and memory, to monitor its own status. A simple example: a sensor has a rated life, say 200,000 cycles according to the manual, so count the number of operations. When the count approaches that figure, say 180,000 cycles, immediately alert maintenance personnel to replace the sensor; otherwise failure is imminent. Potential problems are thus addressed before they occur. And even if a fault occurs for some other reason, it is far easier to locate than with a traditional sensor, because the fault status is already displayed on the operator's screen.

Fully Distributed Systems

In the distributed input/output systems introduced earlier, for historical reasons, the master station node units were placed in well-maintained server rooms, usually air-conditioned, mainly out of concern that CPUs, being active components with low integration density at the time, were failure-prone. The input/output slave station node units, meanwhile, sat in harsher environments, on the grounds that most input/output modules consisted of passive resistors, capacitors, and similar components with strong immunity to interference. This became the traditional remote input/output structure.

[align=center]Figure 4: Typical Fully Distributed Scheme[/align]

Even today, this structure has significant problems. First, it relies excessively on the remote network: every input signal, regardless of whether it is used, how long it lasts, or whether it has changed, must be sent to the master station over the network in every scan cycle, and the processed results must then be sent back to the remote substations, leaving the lines busy and inefficient. Second, it relies excessively on the master station CPU, which performs all calculations and carries a heavy burden.
Furthermore, when problems occur (in the CPU or the network), the substations in this master-slave structure are mere recipients of the consequences, powerless to intervene even if they detect the problem. To overcome these weaknesses, fully distributed systems emphasize balance, harmony, and proactive management. The change is this: at the remote substations, the communication adapter is replaced with a small CPU, turning the one-way data flow from master station to substation into bidirectional communication, master-to-substation and substation-to-master alike. The substation CPU can also run some of the programs, relieving part of the master station CPU's load. The advantages of this structure are:

■ The remote substation CPU can share some of the master station CPU's workload, in a proportion the user can determine; for example, the master station CPU and the substation CPU can each execute 50% of the program. This reduces the master station CPU's load. In the worst case, such as a master CPU failure or a complete interruption of remote communication, the substation CPU takes over all the work and the system can still operate normally.

■ With a substation CPU, the substation's processing power is enhanced, allowing data transmission over the remote network to be reduced according to application requirements: not every piece of data, in every scan cycle, needs to go to the master station. Some digital quantities do not change during a scan cycle and have no effect on the program, so they need not be transmitted; some analog quantities change very slowly, so we can define a dead zone around the last transmitted value, treat anything inside it as unchanged, and skip the transmission. This reduces the network load.
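The dead-zone idea for analog values can be sketched as a transmit filter. The function name, the sample values, and the 0.5-degree deadband are all illustrative assumptions:

```python
def should_transmit(new_value: float, last_sent: float, deadband: float) -> bool:
    """Decide whether a substation needs to send an analog value upstream.

    Values inside the dead zone around the last transmitted value are
    treated as unchanged and suppressed, reducing remote-network load.
    """
    return abs(new_value - last_sent) > deadband

# Example: a temperature sampled once per scan cycle, deadband of 0.5 degrees
samples = [20.0, 20.1, 20.3, 20.7, 20.8, 21.4]
last_sent = samples[0]
sent = [last_sent]                 # the initial value is always transmitted
for value in samples[1:]:
    if should_transmit(value, last_sent, deadband=0.5):
        sent.append(value)
        last_sent = value
```

Of six samples, only three cross the dead zone and are transmitted; the master station still tracks the trend, but the remote network carries half the traffic.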
■ In the original structure, the substation, lacking a CPU, was completely passive: the master station sent and the substation received, one-way communication. In the new structure, in addition to the original flow, the substation can actively send information to, or request it from, the master station, forming reverse communication. This gives the substation initiative: it can decide what to do and how much, based on the state of the master station and the network, making the whole control system more balanced and its operation more efficient.

■ In a redundant master station system, when the master station CPU switches over and there is no substation CPU, two outcomes are possible: a disturbed switchover or a smooth one. The disturbance arises from asynchrony between the remote network's operating cycle and the master station CPU's scan cycle: synchronization prevents disturbances, while asynchrony can cause them. The new architecture, with its added substation CPU, can assess the master station's operational status and apply an appropriate strategy, eliminating unnecessary disturbances.

The analysis above shows one consistent line of development: from centralized I/O to distributed I/O (remote I/O), from centralized CPUs to locally distributed CPUs (DCS), from CPUs embedded in field devices (fieldbus) to remotely distributed CPUs (the new fully distributed systems). This development rests on advances in chip manufacturing, the improved interference immunity of substation CPUs, and falling processor prices. Project designers, implementers, consultants, and end users should therefore move with the times, set aside preconceived notions, and build more efficient, rational, balanced, and harmonious automation systems.