Research on Servo Drive Controller Based on Multi-CPU Architecture

2026-04-06 05:09:45

Abstract: This paper discusses the implementation method of a servo drive controller, and demonstrates its principles and advantages through a detailed analysis of a multi-CPU controller with an ARM+FPGA+DSP architecture. This architecture features clear division of labor in the controller system, improving the real-time performance of the algorithm; it also improves the real-time performance of the controller's external interface, EtherCAT, and enhances system stability. This architecture was experimentally applied in servo drive products from Tianjin Electrical Research Institute Co., Ltd., and the field application achieved good results, demonstrating the practical significance of this technology.

1. Introduction

A servo driver is a controller for a servo motor. It typically regulates the motor in three modes (position, speed, and torque) to achieve high-precision system positioning.

Currently, servo controllers use three main CPU architectures: (1) a single-CPU ARM (Acorn RISC Machine) controller; (2) a single-CPU DSP (Digital Signal Processor) controller; (3) a multi-CPU controller with an ARM+FPGA+DSP architecture. Architecture 3 is currently the most advanced, while the first two have limitations. In the single-CPU architectures 1 and 2, one CPU handles the algorithm, data acquisition and control, communication, and display. When algorithm complexity and real-time requirements are high, one of these functions usually has to be compromised. In architecture 3, by contrast, the division of labor among the CPUs is clear: the ARM handles communication and display, the DSP handles algorithm computation, and the FPGA handles data acquisition and control. The technical bottleneck of architecture 3 is that meeting the real-time requirements of the servo controller demands high-speed data communication between the CPUs. If a sufficiently fast communication method is available, the strengths of each CPU can be fully used and the three chips behave like a single multi-core CPU.

This study is based on an ARM+FPGA+DSP architecture and uses parallel communication for high-speed internal communication between the CPUs. In addition, the servo driver, as part of the servo system, must communicate at high speed with the main controller and the encoder to meet the real-time requirements of the entire servo system. This architecture uses EtherCAT and high-speed RS-485 interfaces for communication between the controller and other devices. The following sections describe the operating principle, hardware design, and software design of the system.

2. Principle Analysis of Multi-CPU Architecture Controller

In this controller architecture, the ARM processor handles communication and display, the DSP performs algorithm calculations, and the FPGA handles data acquisition and control, as shown in Figure 1. The FPGA acquires physical quantities such as phase voltage and phase current and transmits them to the DSP via a parallel port. The ARM receives commands from the main controller via EtherCAT and acquires encoder information such as speed and position via high-speed RS-485; it passes this data to the FPGA via a parallel port, and the FPGA relays it to the DSP. The DSP executes the servo control algorithm using the data acquired from the FPGA and ARM, and transmits the results back to the FPGA and ARM via a parallel port for the related control operations.

Figure 1. Multi-CPU architecture controller structure diagram

In this controller, the CPUs communicate through dual-port RAM, which transfers all bits of a data word in parallel. Data is typically transmitted in multiples of a byte (8 bits), and transfers work in both directions. The principle of dual-port RAM communication is shown in Figure 2. Because all data bits are transmitted simultaneously, dual-port RAM communication is fast and efficient, which makes it suitable for real-time, high-speed applications.

Figure 2. Schematic diagram of dual-port RAM communication
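The handshake described above can be modeled on a host PC. The following is a minimal sketch, not the authors' implementation: the window size `DPRAM_WORDS`, the `dpram_t` layout, and the `a_ready` flag (standing in for the interrupt line) are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DPRAM_WORDS 64u /* hypothetical window size per direction */

/* Host-side model of a dual-port RAM with one window per direction. */
typedef struct {
    uint16_t a_to_b[DPRAM_WORDS]; /* written by side A, read by side B */
    uint16_t b_to_a[DPRAM_WORDS]; /* written by side B, read by side A */
    uint16_t a_ready;             /* set by A when a_to_b holds fresh data */
} dpram_t;

/* Side A: copy n words into its window, then raise the ready flag. */
void dpram_publish(dpram_t *ram, const uint16_t *src, size_t n)
{
    assert(n <= DPRAM_WORDS);
    for (size_t i = 0; i < n; i++) {
        ram->a_to_b[i] = src[i];
    }
    ram->a_ready = 1u; /* models the "data ready" interrupt */
}

/* Side B: if data is ready, copy n words out and clear the flag.
 * Returns the number of words read (0 if nothing was ready). */
size_t dpram_consume(dpram_t *ram, uint16_t *dst, size_t n)
{
    assert(n <= DPRAM_WORDS);
    if (!ram->a_ready) {
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        dst[i] = ram->a_to_b[i];
    }
    ram->a_ready = 0u;
    return n;
}
```

The reverse direction (`b_to_a`) would work the same way with its own flag. Keeping the two directions in separate address windows means neither side ever writes a location the other side writes.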

This controller has two external interfaces: EtherCAT and high-speed RS-485. Because encoders typically use an RS-485 interface, this controller does too, with transfers handled by DMA. The link between a servo controller and the main controller can use protocols such as EtherCAT or PROFINET; this controller uses the more widely adopted EtherCAT. EtherCAT is a real-time industrial Ethernet technology developed by Beckhoff. It is increasingly used in servo communication because of its high speed, high effective data utilization, full compliance with Ethernet standards, short refresh cycle, and good synchronization performance. Its principle is shown in Figure 3.

Figure 3 EtherCAT message

3. Multi-CPU Architecture Controller Hardware Design

3.1 Design of the ARM Controller

The main controller, the ARM, is an STM32F407. The STM32F407 is a high-end 32-bit ARM microcontroller manufactured by STMicroelectronics (ST), built around a Cortex-M4 core. This design uses its on-chip resources to implement the parallel port communication with the FPGA, the EtherCAT communication, and the RS-485 communication described in the previous section.

Figure 4 FSMC block diagram

Parallel communication: The STM32F407 features a Flexible Static Memory Controller (FSMC), a built-in controller for external memory. Using the FSMC, the STM32 can communicate in parallel with an FPGA or with memory. The FSMC generates all the timing signals needed to drive these devices (the FPGA is treated as memory) over 16 data lines and 16 address lines, as shown in Figure 4.

Figure 5. Block diagram of XINTF data bus connection.

EtherCAT and RS-485 communication: EtherCAT communication is implemented with the ET1100, a dedicated EtherCAT slave controller (ESC) chip. The ET1100 interfaces with the ARM via SPI. The RS-485 link runs at 2.5 Mbps and is implemented using DMA. The DMA implementation is detailed in the software design section (Section 4.1).

Figure 6 Read operation timing

3.2 Parallel Communication Design of Controller FPGA and DSP

The FPGA used is Altera's Cyclone® IV series FPGA, characterized by low cost and low power consumption, up to 532 user I/Os, and support for DDR2 SDRAM interfaces up to 200MHz. The DSP used is a TMS320C28346, connected to the FPGA via XINTF to achieve bidirectional parallel communication. The DSP's input and output are controlled by interrupts. When the FPGA is ready with data, it sends an interrupt to the DSP. The DSP responds to the interrupt, reads data from the corresponding address, and writes data to another address. The FPGA waits 60μs before starting to read data. This enables parallel communication between the two chips. The TMS320C28346 DSP chip has a 16-bit XINTF data bus, serving as an external interface for seamless connection to various external memories or CPUs, as shown in Figure 5. In this system, it is connected to the FPGA's 16 user-definable I/O pins to achieve 16-bit parallel data communication. The TMS320C28346 chip's programmable general-purpose input/output pins can be selected and connected to any of the FPGA's user I/O pins as read/write interrupts for the DSP.

4. Multi-CPU Architecture Controller Software Design

4.1 Software Design of the ARM Controller

The parallel communication between the ARM and FPGA is implemented with the FSMC, configured as an asynchronous NOR flash interface with a non-multiplexed address/data bus. The read timing is shown in Figure 6, and the write timing is shown in Figure 7.

Figure 7 Write operation timing

The address selection for parallel port communication uses sub-module 2 of FSMC's BANK1. The specific program code is as follows:

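p.FSMC_AccessMode = FSMC_AccessMode_A; /* timing structure: access mode A */

FSMC_NORSRAMInitStructure.FSMC_Bank = FSMC_Bank1_NORSRAM2; /* Bank 1, sub-module 2 */

FSMC_NORSRAMCmd(FSMC_Bank1_NORSRAM2, ENABLE); /* enable the selected sub-module */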

Communication between the ARM and FPGA is triggered by an external interrupt. Once the FPGA has prepared the data, it sends an interrupt to the ARM. The interrupt cycle is 120μs. The first 60μs are used for the ARM to read data from the parallel port address and write the data to be transmitted to the FPGA to the corresponding address. The last 60μs are used for the FPGA to read data from the parallel port address. The program code is as follows:

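pBuf = (s16 *)EXT_SRAM_ADDR + 31; /* start of the read window */
for (i = 31; i < 91; i++) /* read 60 words from the FPGA */
{
    FPGA_TO_ARM[i] = *pBuf++;
}
pBuf = (s16 *)EXT_SRAM_ADDR + 41; /* start of the write window */
for (i = 41; i < 61; i++) /* write 20 words to the FPGA */
{
    *pBuf++ = (FPGA_TO_ARM[i] + 1);
}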
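For testing off-target, the same read-then-write exchange can be wrapped in a function that takes the window address as a parameter. This is a sketch under assumptions: the function name `fpga_exchange`, the separate `tx` buffer, and the macro names are hypothetical, and only the index ranges (31 to 90 for reads, 41 to 60 for writes) come from the listing above. On the target, `window` would be the FSMC-mapped address; the `volatile` qualifier keeps the compiler from caching or reordering the bus accesses.

```c
#include <assert.h>
#include <stdint.h>

#define RX_FIRST 31 /* first word read from the FPGA window */
#define RX_END   91 /* one past the last word read */
#define TX_FIRST 41 /* first word written to the FPGA window */
#define TX_END   61 /* one past the last word written */

/* Read the FPGA-to-ARM words into rx, then write the ARM-to-FPGA words
 * from tx. Buffers are indexed like the window itself, as in the listing. */
void fpga_exchange(volatile int16_t *window, int16_t *rx, const int16_t *tx)
{
    assert(window != 0 && rx != 0 && tx != 0);
    for (int i = RX_FIRST; i < RX_END; i++) {
        rx[i] = window[i];
    }
    for (int i = TX_FIRST; i < TX_END; i++) {
        window[i] = tx[i];
    }
}
```

On a host, `window` can simply point at an ordinary array, which makes the index ranges easy to check before running on hardware.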

The ARM processor communicates with the encoder via RS-485 at a speed of 2.5 Mbps. Due to this high speed, conventional interrupt methods are insufficient; therefore, this system employs DMA. The program code is as follows:

/* USART3 receive DMA */
DMA_InitStructure.DMA_PeripheralBaseAddr = (uint32_t)(&(USART3->DR));
DMA_InitStructure.DMA_Memory0BaseAddr = (uint32_t)UART3_DMA_RxBuffer;

/* USART3 transmit DMA */
DMA_InitStructure.DMA_PeripheralBaseAddr = (uint32_t)(&(USART3->DR));
DMA_InitStructure.DMA_Memory0BaseAddr = (uint32_t)UART3_DMA_TxBuffer;
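A short calculation shows why per-character interrupts become costly at this line rate. The sketch below assumes a frame of 10 bits per character (1 start, 8 data, 1 stop); the source does not state the frame format, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Time one UART character occupies on the wire, in nanoseconds. */
uint32_t uart_char_time_ns(uint32_t baud, uint32_t bits_per_char)
{
    assert(baud > 0);
    return (uint32_t)(((uint64_t)bits_per_char * 1000000000ull) / baud);
}

/* Characters per second at full line rate. With one interrupt per
 * character, this is also the interrupt rate the CPU would see. */
uint32_t uart_chars_per_second(uint32_t baud, uint32_t bits_per_char)
{
    assert(bits_per_char > 0);
    return baud / bits_per_char;
}
```

With `baud = 2500000` and `bits_per_char = 10`, a character takes 4000 ns, so a back-to-back stream would raise 250000 interrupts per second if each character were handled individually. DMA moves the characters to memory without CPU involvement and signals only when a block is complete.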

4.2 Software Design of the Controller FPGA

The FPGA-DSP parallel bus communication module is used for parallel bus data exchange between the FPGA and the DSP. It contains two independent address spaces: one for the DSP to read data from the FPGA's internal data channel, and the other for the DSP to write data to the FPGA. The program uses a dual-port RAM IP core provided by Altera. The program block diagram is shown in Figure 8.

Figure 8 FPGA program module diagram

The frequency of the FPGA-DSP parallel bus communication module's master clock CLK should be at least four times the DSP bus read/write frequency, typically 120MHz. This clock is generated by the FPGA's internal PLL (phase-locked loop). Since the DSP's address bus is generally 16 bits or more, ADDR_DSP is connected to the lower bits of the DSP address bus. RD_DSP is connected to the DSP read enable, and WR_DSP is connected to the DSP write enable. If a chip select signal is present, the DSP read/write enable signal needs to be ORed with the chip select signal before being connected to WR_DSP and RD_DSP.
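The two rules in this paragraph can be written down as small checks. This sketch assumes the read/write enables and the chip select are active-low, which is the only case in which OR-ing them gives the intended gating; the function names are hypothetical, and the real logic lives in the FPGA rather than in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Active-low gating: the gated strobe is asserted (0) only when both the
 * DSP strobe and the chip select are asserted (0). Inputs must be 0 or 1. */
uint8_t gate_strobe_n(uint8_t strobe_n, uint8_t cs_n)
{
    assert(strobe_n <= 1u && cs_n <= 1u);
    return (uint8_t)(strobe_n | cs_n);
}

/* Check the "at least four times the DSP bus frequency" rule for CLK. */
bool clk_fast_enough(uint32_t clk_hz, uint32_t bus_hz)
{
    return (uint64_t)clk_hz >= 4ull * (uint64_t)bus_hz;
}
```

For example, a 120 MHz module clock satisfies the rule for DSP bus access rates up to 30 MHz.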

4.3 Software Design of the Controller DSP

The TMS320C28346 DSP reads and writes external memory through its external interface (XINTF); in this system, that memory is the dual-port RAM space of the FPGA. The XINTF is a non-multiplexed asynchronous bus. When configuring the XINTF, check the ratio of the internal XINTF clock XTIMCLK to SYSCLKOUT: writing the XTIMCLK bit in the XINTCNF2 register sets XTIMCLK equal to either SYSCLKOUT or SYSCLKOUT/2. All XINTF accesses begin on the rising edge of XCLKOUT, and external logic is clocked by XCLKOUT. Writing the CLKMODE bit in the XINTCNF2 register sets the ratio of XCLKOUT to the internal XINTF clock XTIMCLK. The program code is as follows:

XintfRegs.XINTCNF2.bit.XTIMCLK=0;

XintfRegs.XINTCNF2.bit.WRBUFF=3;

XintfRegs.XINTCNF2.bit.CLKOFF=0;

XintfRegs.XINTCNF2.bit.CLKMODE=0;

XintfRegs.XINTCNF2.bit.BY4CLKMODE=1;

XintfRegs.XTIMING6.bit.XWRLEAD=3;

XintfRegs.XTIMING6.bit.XWRACTIVE=5;

XintfRegs.XTIMING6.bit.XWRTRAIL=2;

XintfRegs.XTIMING6.bit.XRDLEAD=3;

XintfRegs.XTIMING6.bit.XRDACTIVE=5;

XintfRegs.XTIMING6.bit.XRDTRAIL=2;

XintfRegs.XTIMING6.bit.X2TIMING=0;

XintfRegs.XTIMING6.bit.USEREADY=1;

XintfRegs.XTIMING6.bit.READYMODE=1;
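The lead, active, and trail values above determine how long one bus access takes. The sketch below estimates that length in XTIMCLK cycles. It assumes the commonly cited XINTF timing rule (lead = LEAD cycles, active = ACTIVE + 1 cycles plus any wait states, trail = TRAIL cycles, with LEAD, ACTIVE, and TRAIL doubled when X2TIMING is 1); this rule is an assumption here and should be checked against the TI XINTF reference guide for the device. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One zone's timing fields, as written to XTIMINGx (read or write side). */
typedef struct {
    uint32_t lead;
    uint32_t active;
    uint32_t trail;
    int x2timing; /* 0: values used as-is, 1: values doubled */
} xintf_timing_t;

/* XTIMCLK cycles for one access with zero external wait states,
 * under the assumed rule described in the lead-in. */
uint32_t xintf_access_cycles(xintf_timing_t t)
{
    uint32_t k = t.x2timing ? 2u : 1u;
    return t.lead * k + (t.active * k + 1u) + t.trail * k;
}

/* Access time in picoseconds for a given XTIMCLK frequency. */
uint64_t xintf_access_ps(xintf_timing_t t, uint32_t xtimclk_hz)
{
    assert(xtimclk_hz > 0);
    return ((uint64_t)xintf_access_cycles(t) * 1000000000000ull) / xtimclk_hz;
}
```

Under this assumption, the settings in the listing (lead 3, active 5, trail 2, X2TIMING 0) give 11 XTIMCLK cycles per access before any wait states. Because USEREADY is set to 1, the external device can lengthen the active phase, so this figure is a lower bound.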

5. Conclusion

The CPU architectures of servo drives on the market are mainly single-ARM and single-DSP controllers. Single-ARM controllers offer advantages in communication and control, but complex algorithms take enough computation time to reduce the real-time performance of the overall system. Single-DSP controllers have advantages in algorithm execution, but external communication and control are more complex to implement. The multi-CPU controller using an ARM+FPGA+DSP architecture addresses the drawbacks of both approaches and makes full use of the strengths of each CPU. The architecture described in this paper has been tested and applied in our company's servo drive products, demonstrating the feasibility and technical advantages of the "ARM+FPGA+DSP multi-CPU controller architecture".

Figure 9. Online simulation diagram of CPU algorithm

As shown in the red-marked box in Figure 9, the CPU utilization of the entire servo drive system is 75.22%. Here A is the count of the lowest-priority counter during an idle run, B is the count of the lowest-priority counter during a full algorithm run, and CPU utilization = (A - B)/A × 100%. This CPU architecture has achieved the goal of controlling CPU utilization and improving system efficiency.
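The utilization figure follows from the two counter readings, reading the formula as (A - B)/A × 100. A minimal sketch of that calculation is shown below; the function name is hypothetical, and the counts used in the example are illustrative values chosen to reproduce 75.22%, not measurements from the paper.

```c
#include <assert.h>
#include <stdint.h>

/* CPU utilization in percent from the lowest-priority counter:
 * idle_count   (A) = count reached when the CPU is otherwise idle,
 * loaded_count (B) = count reached while the full algorithm runs. */
double cpu_utilization_percent(uint32_t idle_count, uint32_t loaded_count)
{
    assert(idle_count > 0 && loaded_count <= idle_count);
    return 100.0 * (double)(idle_count - loaded_count) / (double)idle_count;
}
```

For instance, with illustrative counts A = 10000 and B = 2478, the function returns approximately 75.22. The fewer iterations the lowest-priority counter completes under load, the higher the utilization.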

Figure 10

Figure 10 shows offline inertia identification during a 2-second cycle (0.5 RPM) with constant PI parameters. Channel 1 is the RPM setpoint, channel 2 is the electromagnetic torque, channel 3 is the actual RPM, and channel 4 is the actual A-phase current. For speed acquisition, the ARM first reads the encoder data over high-speed RS-485 and then passes it to the FPGA and DSP via the parallel port. After algorithm processing, the data is sent back to the encoder. All data in the closed loop travels over the parallel port, which meets the real-time requirements of the servo driver algorithm. This CPU architecture achieves the goal of improving the real-time performance of the overall system.
