Optimization strategies for interrupt context saving in embedded operating systems
2026-04-06 07:38:44
Abstract: Most embedded operating systems protect all general-purpose registers when saving the interrupt context on interrupt entry. This common practice lengthens the microprocessor's memory access time and increases the likelihood of memory conflicts. This paper proposes saving the interrupt context according to the general-purpose registers that the interrupt service routine actually requires. This reduces the number of general-purpose registers that must be protected during interrupt context saving, shortens the interrupt response time, and improves the real-time performance of the system. Finally, the optimization strategy is summarized.

Keywords: embedded system, real-time performance, interrupt response, interrupt latency, context saving

I. Real-time Performance of Embedded Systems

Embedded systems are application-centric, computer technology-based, customizable hardware and software systems suited to applications with strict requirements on functionality, reliability, cost, size, and power consumption. High real-time performance is a fundamental requirement for embedded systems. The IEEE (Institute of Electrical and Electronics Engineers) defines a real-time system as "a system whose correctness depends not only on the logical result of the computation but also on the time taken to produce the result."

Real-time systems can generally be divided into two categories: hard real-time and soft real-time. A hard real-time system has a mandatory, fixed deadline that must never be exceeded; a missed deadline can cause damage or even system failure, or prevent the system from achieving its intended goal. The deadline of a soft real-time system is flexible, and occasional misses can be tolerated; the consequences are not serious and only slightly reduce system throughput.

II. Interrupt Response Time

The real-time performance of interrupts is an important aspect of real-time systems, and interrupt response time is the main factor affecting it. Interrupt response time is defined as the time from the occurrence of an interrupt to the start of execution of the user's interrupt service code that handles it [1]; it consists of the interrupt latency and the time to save the interrupt context. All real-time systems must disable interrupts before entering a critical section and re-enable them after leaving it. Interrupt latency is the time from the interrupt request being issued until the task re-enables interrupts [1]. Saving the interrupt context serves two purposes: first, it saves the context of the task that was running before the interrupt; second, if interrupts nest, it also saves the context of the outer interrupt. The entire interrupt response process is shown in Figure 1.

To ensure that an interrupt is serviced as quickly as possible, the interrupt response time must be reduced. However, as the figure shows, the interrupt latency is determined by the task that was running before the interrupt. Once an interrupt has been entered, the response time can only be reduced by shortening the context saving time as much as possible, thereby improving interrupt real-time performance.

[align=center]Figure 1. Interrupt Response Diagram[/align]

III. Improvement of Interrupt Context Saving

3.1 Traditional Interrupt Context Saving Methods

In most embedded operating systems today, the first step on entering an interrupt is to save the context that existed before the interrupt occurred, that is, to save the return address, program status word, stack pointer, and all general-purpose registers to the interrupt stack, so that the user interrupt service routine cannot corrupt the context that is restored on interrupt return.
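The decomposition given in Section II can be written compactly. The symbols below are introduced here for illustration and do not appear in the original text, and the linear cost model for the saving time is a simplifying assumption rather than a measured result.

```latex
\[
T_{\text{response}} = T_{\text{latency}} + T_{\text{save}},
\qquad
T_{\text{save}} \approx T_{\text{fixed}} + n_{\text{regs}} \cdot t_{\text{store}}
\]
```

Here T_fixed stands for the cost of saving the return address, program status word, and stack pointer, n_regs is the number of general-purpose registers saved, and t_store is an assumed average cost per register store. Under the traditional method n_regs is always the full register set, regardless of what the interrupt service routine uses.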
Taking the µC/OS-II microkernel as an example, the process of saving the context after entering an interrupt on ARM and x86 microprocessors is shown in Figure 2. As the code shows, on both architectures three memory access instructions are executed to save the context, one of which is a bulk memory access instruction (STMFD SP!, {R0-R12} on ARM and PUSHA on x86) that saves the general-purpose registers R0-R12 or AX, CX, DX, BX, SP, BP, SI, DI, respectively.

[align=center]Figure 2. Interrupt Context Saving on ARM and X86 µC/OS-II[/align]

According to the quantitative formula:

[Formula: CPU time including the cache-miss memory access term]

The formula uses CPU time to measure the performance of a microprocessor architecture. The first part is the instruction execution time, covering fetch, decode, and execute; the second part indicates that, for a memory access instruction, the memory access time incurred on a cache miss must be added to the CPU time. Because memory access is much slower than CPU execution, a memory conflict makes the wait even longer, especially for bulk memory access instructions. In microprocessor cores without a cache, such as ARM7TDMI and ARM9TDMI, the CPU time formula for bulk memory access instructions takes a completely different form:

[Formula: CPU time for bulk memory access instructions on cores without a cache]

Therefore, on these processor cores the system must wait while executing bulk memory access instructions, for example during task switching and interrupt context saving, which degrades real-time performance.

3.2 Optimization Strategy for Interrupt Context Saving

In interrupt context saving, the return address, program status word, and stack pointer must be protected; otherwise, execution cannot resume correctly after the interrupt ends. General-purpose registers are protected because the user interrupt service routine may use them, overwriting existing data and causing errors in task execution after the interrupt returns.
Therefore, which general-purpose registers must be protected within an interrupt depends entirely on how the interrupt service routine uses them. Only the general-purpose registers that the interrupt service routine actually uses need to be saved, not all of them. Taking the ARM architecture as an example, in user mode the available general-purpose registers are R0 to R12, while R13 serves as the stack pointer, R14 holds the return address, and R15 is the PC. If the interrupt service routine uses only a small subset of R0 to R12, then only that subset needs to be saved when the interrupt occurs, which reduces memory access time, shortens the interrupt response time, and improves interrupt real-time performance.

In practice, this strategy is feasible. First, the general-purpose registers required by each interrupt service routine are known: when the routine is written in assembly language, the programmer controls which registers are used; when it is written in C, the compiler determines them. Second, existing embedded operating systems usually require interrupt service routines to be as short as possible; for example, Linux splits interrupt handling into a top half and a bottom half. As a result, most interrupt service routines do not use all of the protected general-purpose registers, and protecting the remaining ones is redundant.

3.3 µC/OS-II Clock Interrupt Context Protection Optimization

The clock interrupt is an important part of the operating system and has high real-time requirements. In UNIX, the priority of the clock interrupt is defined as 6, second only to the highest priority. Taking µC/OS-II clock interrupt handling as an example, the interrupt handling process is shown in Figure 3.
In the µC/OS-II clock interrupt service, the interrupt nesting counter OSIntNesting is first incremented by 1 to prevent task scheduling inside nested interrupts; then OSTimeTick() is called to decrement OSTCBDly of each sleeping task and increment the system time OSTime; finally, OSIntExit() is called to perform task scheduling, and if no task switch is required, it returns to the interrupt service routine. Most of the work in clock interrupt handling is therefore concentrated in OSTimeTick() and OSIntExit(). Compiling these two functions with the -S option of the ARMCC compiler yields assembly code showing that the former uses R0, R1, and R4-R7 and the latter uses R0-R3; neither uses R8-R12. Furthermore, OSIntNesting++ can be performed entirely within R0-R7. Therefore, only R0-R7 need to be saved on entering interrupt handling. The context-saving code obtained by rewriting ① in Figure 3 accordingly is shown in Figure 4.

[align=center]Figure 3. µC/OS-II Clock Interrupt Handling[/align]
[align=center]Figure 4. µC/OS-II Clock Interrupt Context Saving[/align]

Other interrupt handling in µC/OS-II is similar to clock interrupt handling; only OSTimeTick() needs to be replaced with the corresponding processing. If that processing can be confined to registers R0-R3 without sacrificing code efficiency, then the interrupt handling uses only R0-R3 and only these registers need to be protected, which further shortens the interrupt response time and improves interrupt real-time performance.

IV. Summary

Traditional interrupt context saving saves the contents of all registers, which simplifies program design but introduces redundant register protection and increases interrupt response time.
The limited interrupt context saving strategy protects only the general-purpose registers needed by the specific interrupt service, which shortens the context saving time, lets the user interrupt service start as early as possible, and improves interrupt real-time performance. However, its effectiveness also depends on the complexity of the interrupt service and on the quality of the compiler. For interrupts with simple service routines but high real-time requirements, the benefit is more pronounced. Complex interrupt services need more general-purpose registers, so more registers must be saved. For a given interrupt service, an efficient compiler can complete the work with as few registers as possible without sacrificing code efficiency, thereby reducing the number of registers that must be saved and improving interrupt real-time performance.

References

[1] Jean J. Labrosse, µC/OS-II: The Real-Time Kernel, R&D Technical Books, 1998
[2] John L. Hennessy, David A. Patterson, Computer Architecture: A Quantitative Approach, 3rd Edition, China Machine Press, 2002
[3] Mao Decao, Hu Ximing, Embedded Systems: Using Open Source Code and StrongARM/XScale Processors, Zhejiang University Press, 2003