[Abstract]: This paper introduces the application of Linux in the embedded field and the host/target development model, and details the process of building a simplified kernel. It analyzes the structure of the glibc system library and the ELF file format, explains the principles of shared library trimming, and proposes and implements a library trimming scheme.

Keywords: Embedded; Linux; Miniaturization

I. Overview

Embedded Linux generally refers to a dedicated Linux operating system, suited to a specific embedded application, that is obtained by miniaturizing and trimming a standard Linux distribution. Embedded systems are typically resource-constrained, with limited processor computing power and limited RAM or other memory capacity. How to create a miniaturized Linux operating system is therefore a primary consideration.

Embedded Linux systems generally adopt a three-layer structure: the core layer mainly consists of the Linux kernel and its modules; the calling interface layer is the system library, primarily glibc; and the application layer consists of applications designed according to user requirements. To achieve high resource utilization, the latter two layers exist in the form of ELF files, which dynamically load external functional code at runtime.

Establishing a cross-platform development environment is generally the first step in embedded software development. The heterogeneity between the host and target hardware platforms (different processor architectures) is the fundamental reason for cross-platform development. Furthermore, because resources are limited, developing software directly on the embedded system's hardware platform is inconvenient or even impossible. A host/target development model is therefore typically adopted, as shown in Table 1.

Item | Host | Target
Hardware | PC or workstation, predominantly with x86 CPUs | Embedded system hardware with diverse processors (x86, ARM, PowerPC, MIPS, 68K, etc.)
Software | Desktop operating systems such as Windows and Linux, with rich integrated development environments (such as Wind River's Tornado) | Limited software resources, usually downloaded from the host during development
Table 1: Characteristics of Cross-Platform Development Environments

Cross-platform development environments include cross-compilers, cross-debuggers, and system emulators, such as the GNU toolchain frequently used in embedded Linux development. Developers need to select a suitable GNU cross-compiler for the target platform, and then recompile the kernel and other software on the host machine so that the resulting object code can run on the target machine. This process is cumbersome and error-prone. The host and target machines are typically connected via Ethernet or a serial port.

Currently, there are hundreds of embedded Linux development projects and distributions worldwide, such as ETLinux, LRP, μC-Linux, and ThinLinux, all with open-source code, as shown in Table 2.

Name | Characteristics
ETLinux | Designed to run Linux on small industrial computers, especially PC/104 modules
LRP (Linux Router Project) | Aimed at network equipment and embedded systems such as routers, access servers, and thin servers; can be installed on a floppy disk. Similar projects include Linux On A Floppy (LOAF)
μC-Linux | Linux running on systems without an MMU. Supports microprocessors such as Motorola DragonBall (M68EZ328), M68328, M68EN322, ColdFire, QUICC, ARM7TDMI, MC68EN302, Axis ETRAX, Intel i960, PRISMA, and Atari 68k
ThinLinux | A Linux distribution designed for embedded and dedicated applications. It runs on Intel and PC-compatible hardware
Table 2: Several Open-Source Embedded Linux Distributions

Additionally, there are commercial distributions such as Coventive XLinux, Lineo Embedix, LynuxWorks BlueCat, and MontaVista Linux. For real-time environments, there are real-time extensions such as RT-Linux and RTAI.
In recent years, an increasing number of target systems have adopted the increasingly cost-effective x86 processors and the mature PC architecture as their hardware platform. A survey conducted by LinuxDevices.com shows that 31% of embedded system developers chose x86 processors as their target platform in the past two years, and 35% plan to do so in the next two years, ranking first. For developers whose host and target machines are both PC-compatible platforms, there is a simpler alternative to the model above: starting from a regular Linux distribution, compile the kernel, copy the necessary files, and use the initial RAM disk (initrd) mechanism to create a root file system. A miniaturized Linux system can be built quickly in this way.

II. Miniaturization Techniques

Linux is increasingly widely used in embedded devices. However, general-purpose Linux distributions are very large and difficult to fit into embedded devices with limited storage space, so the Linux system must be trimmed. There are roughly four main trimming techniques, which can effectively reduce the system size without affecting system performance.

① Removing redundant files. General Linux distributions contain many help documents, auxiliary programs, configuration files, and data templates. In embedded systems these files are unnecessary and can be deleted entirely. Even the large number of comments in configuration files can be removed.
② Shared library trimming. Embedded systems run a limited number of applications, so shared libraries may contain redundant code that will never be used. This code can be deleted.
③ Using alternative software packages with similar functionality. Linux has many software packages with similar functions. Smaller packages can be selected and ported to embedded devices to replace larger ones.
④ Modifying source code.
This includes reconfiguring and recompiling software packages to remove unnecessary features; increasing software modularity to improve trimming efficiency; and reconfiguring the kernel to remove unnecessary drivers and modules.

1. Simplified Kernel

Unlike the microkernel architecture of traditional embedded operating systems, the Linux kernel uses a monolithic architecture: the entire kernel is a single, very large program. The advantage is that different parts of the system can communicate directly, which effectively shortens task switching time and improves system response speed. The disadvantage is equally clear: the kernel is relatively large, because it includes not only basic operating system functions such as task scheduling, memory management, and interrupt handling, but also file systems, network protocols, and device drivers.

The Linux kernel is highly modular and configurable, so functionality can be selected as needed and the kernel size reduced accordingly. For example, Linux supports many file systems, including ext2, ext3, FAT, ReiserFS, and JFS. Only the file systems actually required need to be selected, for instance by compiling only ext2 into the kernel. The main steps for compiling the kernel are as follows ("#" represents the command prompt):

# cd /usr/src/linux-2.4
# make menuconfig
# make dep; make clean; make bzImage

The successfully compiled kernel file is arch/i386/boot/bzImage. Refer to the README file in the kernel source package for specific instructions.

To further increase flexibility and reduce kernel size, Linux also provides a loadable kernel module mechanism. Many kernel functions can be compiled as modules and loaded dynamically at runtime, rather than being compiled directly into the kernel. In embedded Linux systems, however, it is more common to compile a standalone kernel tailored to the application and to use the module mechanism less frequently.
The resulting kernel is typically several hundred kB or even around 1 MB, which is relatively large compared with traditional embedded operating system kernels (for example, a VxWorks kernel that includes file system and network support is approximately 250 kB). When configuring the kernel, developers need a good understanding of the dependencies between the various functional modules; otherwise, compilation may fail. By contrast, the VxWorks kernel configuration process gives clear indications when dependencies are broken, which helps avoid such errors.

2. Shared Library Trimming

Among the miniaturization techniques, shared library trimming is the one most readily implemented in software as an automatic trimming tool, and it has the most significant effect. The following therefore focuses on shared library miniaturization.

The basic idea of shared library miniaturization is to extract and parse the dependencies between object files and symbols within the system library, construct a relational model from these dependencies, perform relational calculations, and, based on the symbol information in the applications, miniaturize the system library at the object file level. The implementation consists of four steps:

a) Determining the set of functions to be called. An ELF file contains a symbol table, an array of Elf32_Sym structures, that is used for internal symbol definitions and external symbol references. By analyzing this symbol table, the symbols to be called (system functions) can be extracted from an ELF application, establishing a many-to-many relationship between applications and the symbols they call.
b) Determining the correspondence between system library functions and object files. The system library is logically divided into three levels: libraries, object files, and symbols. Both libraries and object files are in ELF format. This correspondence is obtained by analyzing the library's image file *_pic.a and the symbol table in each object file.
From this analysis, three relationships are determined: which object files each library contains, which symbols each object file defines, and which symbols each object file calls.
c) Determining the dependencies between system library object files. The complete dependencies between object files are obtained through relational calculations on the relationships from step b).
d) Generating the miniaturized system library. The set of object files that the called functions depend on is obtained through relational calculations on the application-to-called-symbol table and the object-file-to-object-file dependency table. Relinking these object files yields a minimized library file.

2.1 Principles of Shared Library Trimming Technology

Shared libraries store pre-compiled object code, generally common code that is used repeatedly by applications. In Linux systems, applications can be linked with libraries statically or dynamically. During static linking, the linker selects the code needed by the application from the library and copies it into the generated executable file. As a result, when a static library is used by multiple programs, multiple redundant copies of its code exist on disk and in memory. During dynamic linking, the linker does not copy the library code into the executable file; only when the executable runs does the loader check whether the library has already been loaded into memory by another executable, and if it has not, the loader loads it from disk. This allows multiple applications to share a single copy of the library's code, saving storage space. This is the main reason why embedded Linux systems use shared libraries.

With static libraries, the linker automatically links only the library modules that the application uses into the executable. This approach is not used with shared libraries, mainly because the linker does not know in advance which parts of the library the applications will ultimately use. Therefore, to trim shared libraries, the principles of dynamic linking must be analyzed first.
Both shared libraries and executables contain symbol tables that define external symbols, which are divided into exported and imported symbols. Exported symbols are defined in the file and usable by other files; they are typically functions that other files can call. Imported symbols are used by the file but not defined in it; they are typically functions that the file calls, and the file generally also specifies the shared library in which each symbol is defined. Before loading an executable file or shared library, the loader iterates through its imported symbols and checks whether the code for each symbol is already in memory. If it is not, the loader first locates and loads the shared library that defines the symbol.

Since the applications and shared libraries in an embedded Linux system are generally fixed, shared libraries may contain exported symbols that will never be called by any other file. Removing the code for these symbols from the shared library does not affect the normal operation of the system. Existing trimming techniques are all based on this principle. The following sections analyze its implementation in detail.

2.2 ELF File Symbol Extraction

The ELF format was developed and released by UNIX System Laboratories as part of an application binary interface. ELF is now the file format most widely used in Linux systems.

2.2.1 ELF File Process Image Loading

An ELF file begins with an ELF header structure, which contains two offsets pointing to two arrays: the program header table and the section header table. The entries of the program header table locate the segments, such as executable code, inside the file; the entries of the section header table describe the sections that hold relocation and dynamic linking information. The loader uses these two arrays to load the process image.
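As an illustration of how the ELF header locates the two tables, the following is a minimal sketch rather than the implementation used in this paper. It assumes a little-endian 32-bit ELF file, reads the standard Elf32_Ehdr fields with Python's struct module, and returns the offset, entry size, and entry count of each table. The names parse_elf32_header and make_demo_header are hypothetical; the latter only builds a synthetic header so that the sketch runs without a real file.

```python
import struct

# Elf32_Ehdr layout (little-endian): e_ident[16], e_type, e_machine,
# e_version, e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize,
# e_phnum, e_shentsize, e_shnum, e_shstrndx (52 bytes in total).
ELF32_EHDR = struct.Struct("<16sHHIIIIIHHHHHH")


def parse_elf32_header(data):
    """Return the fields that locate the program and section header tables."""
    if len(data) < ELF32_EHDR.size:
        raise ValueError("data is shorter than an ELF32 header")
    (e_ident, _e_type, _e_machine, _e_version, _e_entry, e_phoff, e_shoff,
     _e_flags, _e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = ELF32_EHDR.unpack_from(data, 0)
    if e_ident[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic number")
    # e_ident[4] is EI_CLASS (1 = ELFCLASS32), e_ident[5] is EI_DATA (1 = LSB).
    if e_ident[4] != 1 or e_ident[5] != 1:
        raise ValueError("this sketch handles only little-endian ELF32")
    return {
        # (file offset, size of one entry, number of entries)
        "program_headers": (e_phoff, e_phentsize, e_phnum),
        "section_headers": (e_shoff, e_shentsize, e_shnum),
        # Index of the section that holds the section names (.shstrtab).
        "shstrndx": e_shstrndx,
    }


def make_demo_header():
    """Hypothetical helper: build a synthetic ELF32 header for the demo."""
    e_ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)  # 16 bytes in total
    # ET_DYN (3) on EM_386 (3); the offsets and counts are made-up values.
    return ELF32_EHDR.pack(e_ident, 3, 3, 1, 0, 52, 4096, 0,
                           52, 32, 6, 40, 24, 23)


if __name__ == "__main__":
    print(parse_elf32_header(make_demo_header()))
```

For a real file, the first 52 bytes read in binary mode could be passed to parse_elf32_header in place of the synthetic header.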
2.2.2 Symbol Table and Relocation Process in ELF Files

The section header table of an ELF file contains a section of type SHT_DYNSYM, which records all the symbols needed to create the process image.

a) Symbol value determination and symbol location. String table sections store the strings used in the ELF file. The e_shstrndx field of the ELF header stores the section index of the section header name string table (.shstrtab), which holds the section names. The name field of a symbol is an index into the string table associated with its symbol table: in the symbol structure, st_name is an index into that string table, where the symbol's name is stored. st_value holds one of two kinds of address: for symbols defined inside the file, it is the file-relative address of the symbol's content; for externally called symbols, it is the address of the symbol to be called (once resolved) or refers to an entry in the relocation table (while unresolved). st_info stores the symbol's type and binding attributes.
b) Relocation of the called symbol. In the symbol table, symbols of type STT_SECTION are associated with a section and exist primarily for relocation. Relocation entries are stored in the ELF file as an array, in which r_offset gives the location at which the relocation is applied, and r_addend is a constant addend used to calculate the value to be stored in the relocated field. r_info gives both the index of the symbol affected by the relocation and the type of relocation to apply. For example, when the type is R_386_JMP_SLOT, the symbol value corresponds to the position of a .plt (procedure linkage table) entry.
c) Loading external symbols.
External symbol code is bound into the process image lazily: on the first call to an external symbol, the resolver code reached through PLT[0], together with the parameters pushed in the symbol's own PLT entry (for example, PLT[1]), resolves the symbol and stores its address in the corresponding .got entry; subsequent calls to the symbol go directly through that .got entry.

2.2.3 Implementation of ELF File Symbol Extraction

For each shared object file participating in dynamic linking, the program header table has an entry of type PT_DYNAMIC. This entry points to the `.dynamic` section, which is an array of `Elf32_Dyn` structures. The `Elf32_Dyn` structure contains a tag `d_tag` and a union `d_un`, where `d_tag` controls how `d_un` is interpreted. The array entry whose `d_tag` is `DT_SYMTAB` points to the symbol table. By analyzing the control structures of the symbol table, relocation table, procedure linkage table, and global offset table, the symbols defined in the file and the symbols to be called are separated and extracted. The algorithm is as follows: symbols defined in the ELF file and symbols to be called are distinguished by the target that their `st_value` points to. Symbols that correspond to `.plt` entries are symbols to be called and need relocation; symbols that correspond to internal definitions are available for calls from other files if their binding is WEAK or GLOBAL. Extracting the symbols to be called from the ELF file establishes the dependency relationship between the application and those symbols.

2.3 Embedded System Miniaturization Results and Analysis

The tables are joined to obtain the set of object files that the applications depend on. The object files in the set are deduplicated and relinked to obtain a minimized library. A comparison of the library data before and after miniaturization is shown in Table 3.
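The table join described above can be sketched as a transitive closure over object files. This is a minimal illustration under stated assumptions, not the tool implemented in this paper: the per-object symbol tables are assumed to have been extracted already into two dictionaries, and the function names, object file names, and symbol lists below are hypothetical sample data.

```python
def build_symbol_index(defines):
    """Map each exported symbol to the object file that defines it."""
    index = {}
    for obj in sorted(defines):
        for sym in defines[obj]:
            # Assumption: if a symbol is defined more than once, the first
            # object file in sorted order is used.
            index.setdefault(sym, obj)
    return index


def required_objects(app_symbols, defines, references):
    """Return the set of object files reachable from the applications.

    app_symbols: symbols imported by all applications (step a)
    defines:     object file -> symbols it defines      (step b)
    references:  object file -> symbols it calls        (step b)
    """
    index = build_symbol_index(defines)
    needed = set()
    pending = list(app_symbols)
    while pending:
        sym = pending.pop()
        obj = index.get(sym)
        # Skip symbols defined outside this library and objects already kept.
        if obj is None or obj in needed:
            continue
        needed.add(obj)
        # The kept object's own imports must be satisfied too (step c).
        pending.extend(references.get(obj, ()))
    return needed


if __name__ == "__main__":
    # Hypothetical sample data standing in for a real *_pic.a archive.
    defines = {
        "printf.o": ["printf"],
        "vfprintf.o": ["vfprintf"],
        "write.o": ["write"],
        "qsort.o": ["qsort"],
        "malloc.o": ["malloc", "free"],
    }
    references = {
        "printf.o": ["vfprintf"],
        "vfprintf.o": ["write"],
        "qsort.o": ["malloc", "free"],
    }
    keep = required_objects({"printf"}, defines, references)
    print(sorted(keep))  # object files to relink into the trimmed library
```

Because the result is a set, each object file appears only once regardless of how many symbols lead to it, which corresponds to the deduplication before relinking (step d). In this sample, qsort.o and malloc.o are dropped because no application reaches them.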
After miniaturization, the number of components within the system library is significantly reduced, and the library shrinks by nearly 50%. Even as the applications in embedded systems grow larger, trimming based on the dependencies within the library files can reduce the system library size by 40% to 50% for typical applications.

Table 3: Comparison of System Library Before and After Miniaturization

III. Conclusion

In recent years, embedded Linux technology has developed rapidly, and various commercial and open-source Linux distributions offer a range of choices for different hardware platforms and application environments. The Linux file system is very large, and constructing an embedded Linux file system is a complex process. How to make the file system more compact and efficient while ensuring security is a topic that requires in-depth exploration. In particular, shared library trimming can remove most of the redundant code in a library, but it requires the library's source code to be written in a relatively standardized manner, and different architectures require different handling. Library trimming is still a relatively new and immature field. With extensive testing of this technology, we believe its shortcomings can be overcome and that it can be widely used in the embedded Linux field. Using the methods above, we have constructed a streamlined embedded Linux file system that allows the kernel to run while keeping the system as small as possible, and that meets the requirements of various products and systems.