Abstract: To ensure the compatibility, reliability, and high availability of hardware and system software, problems in the hardware or software must be identified quickly and located accurately during the system development and testing phases. Offline testing and analysis of embedded systems is therefore essential. This paper discusses offline single-board hardware testing methods and system testing methods, using the broadband switching system developed by Shanghai Bell Alcatel Co., Ltd. as an example.

Keywords: Embedded system; Hardware testing; Broadband switch

Introduction

As embedded systems develop, offline testing and analysis during the development phase is urgently needed to ensure the compatibility, high reliability, and high availability of the system's software and hardware, and to identify and locate problems within the system quickly and accurately. This paper discusses offline single-board hardware testing methods and system testing methods, using the broadband switching system developed by Shanghai Bell Alcatel Co., Ltd. as an example.

Overview of Offline Single-Board Hardware Testing

In broadband switch systems, offline testing includes the self-test and general offline testing. The self-test runs after single-board initialization to confirm that the board operates correctly. It mainly includes watchdog testing, rapid hardware device testing, and download path testing. Rapid hardware testing covers register tests and tests of the individual hardware devices on the board, and comprises many test items. If any test item fails, the entire test stops and waits until the watchdog times out and restarts the system. Download path testing confirms that the software download function works correctly; it mainly tests data transmission and reception on the communication interface and the interrupt function.
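The stop-on-first-failure behavior of the self-test can be sketched in C as follows. This is a minimal illustration, not the system's actual code: the names self_test_item, run_self_test, and watchdog_kick are hypothetical, the watchdog hook only counts calls instead of reloading a hardware timer, and the two test items are stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One self-test item: a name for reporting and a function that returns
 * true on pass. Both the type and its fields are illustrative. */
typedef bool (*self_test_fn)(void);

typedef struct {
    const char  *name;
    self_test_fn run;
} self_test_item;

/* Hypothetical hook: on real hardware this would reload the watchdog
 * timer. In this sketch it only counts how often it was called. */
unsigned watchdog_kicks = 0;

void watchdog_kick(void)
{
    watchdog_kicks++;
}

/* Run the items in order and service the watchdog after each pass.
 * Returns `count` if every item passed, otherwise the index of the
 * first failing item. After a failure the watchdog is not serviced
 * again, so on real hardware the timer would expire and restart the
 * system. */
size_t run_self_test(const self_test_item *items, size_t count)
{
    assert(items != NULL);
    for (size_t i = 0; i < count; i++) {
        if (!items[i].run())
            return i;
        watchdog_kick();
    }
    return count;
}

/* Stand-in test items for demonstration only. */
bool test_registers(void)   { return true;  }
bool test_device_fail(void) { return false; }
```

On a target board, the caller would typically enter an idle loop when run_self_test returns an index smaller than the item count, leaving the watchdog to expire and restart the system.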
Typical offline testing applies more specific versions of the above tests during factory inspection, development testing, and maintenance diagnosis in order to pinpoint the location of a fault on the board.

Watchdog Testing

Watchdog testing must be completed before any other hardware test, because a hardware test failure requires a system restart and is usually signaled by a watchdog timeout. The watchdog therefore has to work correctly while the hardware tests run. To test it, set and activate a 1-second watchdog timer and wait 1 second without servicing it; the system should then restart.

Flash Testing

Flash memory can store both programs and data. When Flash is programmed, a pre-calculated checksum can be stored with it. To test Flash, the program recalculates the checksum and compares it with the stored value. There are two methods for testing data Flash: a non-destructive basic test, mainly the checksum test, and a destructive extended test, which includes read/write tests and address/data bus tests carried out in the same way as the corresponding memory tests. Basic tests can be used during the system self-test, while extended tests can be used during maintenance and diagnosis.

Memory Testing

Memory tests can be divided into three categories:

1. Data Bus Test: The value 0001 is cyclically shifted left and written to memory, then read back and compared.
2. Memory Area Test: Read/write tests are performed on all memory storage units using the patterns 5555H and AAAAH.
3. Address Bus Test: An address accumulation test is performed on all memory storage units. Starting from the RAM base address, a different (incrementing) value is written to each storage unit, with the address advancing according to the bus width, until all storage units hold different contents. The contents are then read back and verified.
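The three memory tests listed above can be sketched in C as follows. This is a minimal sketch under stated assumptions: a 16-bit data bus (matching the 5555H and AAAAH patterns), hypothetical function names (mem_test_data_bus, mem_test_area, mem_test_address), and a region that may be overwritten, since all three tests are destructive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Data bus test: walk a single 1 bit across one 16-bit location.
 * Returns 0 on success, or the first pattern that did not read back. */
uint16_t mem_test_data_bus(volatile uint16_t *cell)
{
    assert(cell != NULL);
    /* The loop ends when the 1 bit is shifted out of the 16-bit value. */
    for (uint16_t pattern = 1; pattern != 0; pattern <<= 1) {
        *cell = pattern;
        if (*cell != pattern)
            return pattern;
    }
    return 0;
}

/* Memory area test: fill the region with 0x5555, verify, then repeat
 * with 0xAAAA. Returns `words` on success, or the index of the first
 * word that failed. */
size_t mem_test_area(volatile uint16_t *base, size_t words)
{
    static const uint16_t patterns[2] = { 0x5555u, 0xAAAAu };

    assert(base != NULL);
    for (size_t p = 0; p < 2; p++) {
        for (size_t i = 0; i < words; i++)
            base[i] = patterns[p];
        for (size_t i = 0; i < words; i++)
            if (base[i] != patterns[p])
                return i;
    }
    return words;
}

/* Address bus test: write an incrementing value to every word, then
 * read all words back. With 16-bit values the contents are only all
 * different for regions of up to 65536 words. Returns `words` on
 * success, or the index of the first mismatch. */
size_t mem_test_address(volatile uint16_t *base, size_t words)
{
    assert(base != NULL);
    for (size_t i = 0; i < words; i++)
        base[i] = (uint16_t)(i + 1);
    for (size_t i = 0; i < words; i++)
        if (base[i] != (uint16_t)(i + 1))
            return i;
    return words;
}
```

Each function reports where the first mismatch occurred rather than a plain pass/fail flag, which is more useful for the fault-location purpose described in the text.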
Address bus testing can also use a fast method: write the address value 0x1 to the memory unit at address 0x1, cyclically shift the address value left, write each resulting address value to the memory unit at that address in turn, and finally verify all of them.

In this system, the self-test includes only the memory area test and, because of time constraints, performs the read/write test only on randomly selected memory pages. The other memory testing methods can be used for factory inspection, development testing, and repair diagnostics.

Main Control Chip Testing

Main control chip testing mainly covers timer testing, register testing, interrupt testing, and on-chip RAM testing. Register testing verifies the special registers to confirm that the CPU registers operate correctly. Interrupt testing artificially generates hardware interrupts and checks whether the main control chip responds and promptly sets the corresponding bits in the interrupt register. On-chip RAM testing follows the general memory testing rules.

Simple PLD/FPGA Testing

In broadband switch systems, larger FPGAs often implement complex functions, and these functions require detailed functional testing. Simpler, smaller PLDs and FPGAs can instead be given a self-test capability, with appropriate self-test logic built in during PLD or FPGA development. When the main control chip needs to test such a device, it sets and reads the device's test interface to obtain the test result.

PCI Bus Testing

The PCI bus is commonly used to connect processors and peripherals. It provides a low-latency path that lets the processor directly access any PCI device mapped into memory or I/O address space. It also provides a high-bandwidth path that lets PCI master devices directly access main memory.
The PCI test first checks whether the PCI configuration space registers can be read and written correctly, and then checks whether the memory mapping can be read and written correctly from both ends.

Embedded System Offline Testing Methods

Incremental Testing Model

Even after single-board testing is completed, the system may not work properly once it is integrated. The main reason is that the interfaces introduce many new problems when modules call each other: data may be lost across an interface, one module may adversely affect another, incorrect hardware connections between modules may cause communication problems, and individual errors may accumulate to an unacceptable level. Comprehensive testing is therefore needed to discover these errors.

Assembling all modules at once according to the design and then running the system software directly is called non-incremental integration. This method easily leads to confusion: correcting one error may introduce new ones, and the mixture of old and new errors makes it harder to determine the cause and location of a fault. Incremental integration instead locates and corrects errors gradually by extending the test software segment by segment and widening the test scope step by step. Depending on the system's characteristics, two incremental integration models can be adopted: bottom-up integration and top-down integration.

A broadband rack system consists of a main control board and other individual boards. The offline system test software uses the top-down integration method. The main control board downloads the system test program for each individual board to the target board via the network. The main control board then searches for individual boards using a depth-first strategy. First, the main control board sends messages to the directly connected individual boards.
If the connection between the main control board and an individual board is correct, and the board's hardware and software are functioning normally, the board returns its information to the main control board after receiving the message. Next, the main control board obtains information from the lower-level boards through the directly connected boards, and continues until it has the information (location, board type, and so on) for all individual boards. Comprehensive testing of the entire system can then begin.

Comprehensive Testing Methods

Most large-scale embedded systems today are distributed processing systems in which multiple modules, interconnected via networks, work together to complete complex functions. The entire system is generally divided into three layers: the device layer, the system layer, and the application layer. Offline comprehensive testing can address these three layers through interoperability testing, functional testing, and performance testing.

Interoperability Testing

Interoperability testing includes physical connectivity and consistency testing to ensure that the modules in the system can be interconnected without problems. Physical connectivity and consistency testing is the most basic part of network system testing; it mainly involves cable testing to verify that the cables and wiring under test meet the design requirements and international standards.

In broadband switch systems, interoperability testing consists of the main control board sending messages to each PBA board in turn, following the network connection layers, and waiting for their responses. If the main control board receives a response within a specified time, the network connection from the main control board to that board is correct.
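The probe-and-wait step can be sketched in C as follows. This is a minimal sketch, not the system's actual protocol: the board_info fields, the probe_fn transport hook, and poll_boards are hypothetical names, and fake_probe is a stand-in transport in which even-numbered slots answer and odd-numbered slots time out. The caller is assumed to pass the slots already ordered by network connection layer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Information a board reports back; the fields are illustrative. */
typedef struct {
    uint8_t slot;        /* rack position the board answered from */
    uint8_t board_type;  /* board type code, meaning is system-specific */
} board_info;

/* Hypothetical transport hook: send a probe message to `slot` and wait
 * up to `timeout_ms` for the reply. Returns true and fills `info` when
 * a reply arrives in time. */
typedef bool (*probe_fn)(uint8_t slot, uint32_t timeout_ms, board_info *info);

/* Probe each slot in the order given. reachable[i] records whether
 * slots[i] answered in time; the replies are stored compactly in
 * found[]. Returns the number of boards that replied. */
size_t poll_boards(const uint8_t *slots, size_t count, uint32_t timeout_ms,
                   probe_fn probe, board_info *found, bool *reachable)
{
    assert(slots != NULL && probe != NULL);
    assert(found != NULL && reachable != NULL);

    size_t replies = 0;
    for (size_t i = 0; i < count; i++) {
        board_info info = { 0, 0 };
        reachable[i] = probe(slots[i], timeout_ms, &info);
        if (reachable[i])
            found[replies++] = info;
    }
    return replies;
}

/* Stand-in transport for demonstration: even slots answer, odd slots
 * behave as if the reply timed out. */
bool fake_probe(uint8_t slot, uint32_t timeout_ms, board_info *info)
{
    (void)timeout_ms;      /* a real transport would block up to this long */
    if (slot % 2 != 0)
        return false;
    info->slot = slot;
    info->board_type = 1;  /* arbitrary type code for the sketch */
    return true;
}
```

Keeping a per-slot reachable flag alongside the compact reply list lets the test software report which connections failed as well as which boards are available for later tests.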
From the PBA responses, the main control board also obtains information about each board, which lays the foundation for the subsequent functional and performance testing.

Functional Testing

After interoperability testing of the entire system is completed, functional testing verifies whether the equipment can perform its intended functions. The functional tests required vary with the equipment. If the single-board hardware operates without abnormalities, the main control board starts the single board so that it executes its specific functions.

Performance Testing

After system equipment testing and network interoperability testing are completed, applications can be loaded onto the system. Performance testing is the highest level of comprehensive testing and mainly measures how well the system supports its applications. Performance tests can be classified in different ways. The broadband switch system uses a simulation method: in an actual rack environment, the first single board in the test set actively sends data packets to perform loopback testing. This mainly tests the data link layer and includes traffic analysis and error data statistics.

Conclusion

This paper has introduced methods for single-board hardware testing and models for offline system integration testing in broadband switch systems. During development, these tests expose design problems as early as possible in the design phase. During maintenance, they effectively locate problems found in the field. They play a very important role in the reliability of the broadband switch system and help ensure its safe and stable operation in the field.

References

1 PCI Local Bus Specification, Revision 2.2, December 18, 1998. Copyright 1992, 1993, 1995, 1998 PCI Special Interest Group.
2 VxWorks Programmer's Guide 5.4, Edition 1. Copyright 1984-1999 Wind River Systems, Inc.
3 MPC860 PowerQUICC User's Manual. Motorola Inc., 1998.