Design of External Memory Module for PLD-based Embedded System
2026-04-06 08:00:37
Abstract: Taking the MCS-96 series microcontroller as an example, this paper introduces a design scheme for a memory module, comprising Flash memory and RAM, that is built around a programmable logic device (PLD). It proposes a convenient memory expansion method that effectively solves the problem of insufficient storage space in embedded systems, especially in data acquisition and storage systems. The scheme is versatile, its read/write control is simple, and it is highly practical.
In embedded systems, constraints such as design cost and size mean that CPUs (including DSPs, microcontrollers, etc.) often suffer from insufficient address space. Many documents describe memory expansion methods. The existing methods usually use the CPU's I/O interface to generate chip select or high-order address signals, and use these signals to page the memory. However, jumping between pages complicates program design. For CPUs that have no internal memory and use unified addressing, such as the 80C196KC20 [1], such a page switch leaves the CPU unable to continue executing the current program and produces an error (see Figure 1): after the page switch, the CPU should continue executing instructions from page 1, but it instead executes the instructions at the corresponding addresses in page 2, which is not the desired result. Finding an effective memory expansion method is therefore an urgent problem in practical applications.
[img=315,202]http://www.e-works.net.cn/ewk2004/fileupload/images/127423032386093750.gif[/img]
1 Memory Expansion Method Solutions
Experience with MCS-96 series microcontrollers shows that 64KB of storage space meets the needs of most users (user applications are usually smaller than 10KB), but if the same space must also hold stored data, storage space becomes seriously short.
Statistical analysis of practical applications shows that in many cases data access is limited to sequential, continuous operations. This characteristic allows the data storage space to be simplified: batch data access is performed by reading or writing the same address repeatedly, which saves address space. In a 16-bit CPU, any 64K words (2 to the power of 16) of memory space can be mapped to two addresses (one for reading and one for writing). This mapping method can expand the memory to a maximum of 2G words (2 to the power of 31), but it also makes the logic control much harder. With the rapid development of programmable logic devices (PLDs), including FPGAs, EPLDs [4], CPLDs, etc., the design of digital logic circuits has been greatly simplified, making this memory expansion idea feasible.
2 Specific Implementation of the Memory Expansion Method
The following uses the system designed by the author as an example to explain the implementation of this memory expansion method in detail. The system is a multi-functional data acquisition device capable of 12-bit A/D conversion at up to 40k samples/s, and it can save the acquired data to Flash ROM to prevent data loss on power failure. The technical requirements are as follows:
① It can store up to 32KB of sampled data;
② It can simultaneously store 4 system configuration programs of 4KB each, 16KB in total;
③ Because of the characteristics of Flash ROM, no read or write operations can be performed during the programming phase that follows a data write. To keep sampling and the microcontroller running normally, an additional 32KB of RAM is therefore required as a data cache;
④ The system program, interrupt service routines, etc. occupy a total of 56KB (28KB reserved in each of Flash ROM and RAM).
In total, 136KB of storage space is required, which exceeds the 64KB addressing range of the 96 series microcontroller.
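The 136KB figure follows directly from the four requirements. The short Python sketch below only re-adds the numbers quoted above and compares the total with the range of a 16-bit byte address; the dictionary keys are descriptive labels chosen here, not names from the original design.

```python
# Storage budget for the example system, in KB, taken from requirements 1-4.
requirements_kb = {
    "sampled data (Flash ROM)": 32,
    "4 configuration programs x 4KB (Flash ROM)": 4 * 4,
    "data cache (RAM)": 32,
    "system program, 28KB in Flash ROM + 28KB in RAM": 28 + 28,
}

total_kb = sum(requirements_kb.values())

# A 16-bit byte address reaches 2**16 bytes, i.e. 64KB.
directly_addressable_kb = 2 ** 16 // 1024

print(total_kb, directly_addressable_kb)
print("expansion needed:", total_kb > directly_addressable_kb)
```

The total is more than twice what the microcontroller can address directly, which is why the rest of the design routes bulk data through pointer-based windows.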
Therefore, a memory module was designed; its structure is shown in Figure 2.
[img=580,205]http://www.e-works.net.cn/ewk2004/fileupload/images/127423032543750000.gif[/img]
The Flash ROM is Atmel's AT29C1024, with a capacity of 128KB and a 16-bit data bus; the RAM consists of two CY7C199 chips, giving a 16-bit data bus and a capacity of 64KB. On the 80C196 microcontroller, ALE is the address latch signal, /WE is the write valid signal, /RD is the read valid signal, and READY is the ready signal. The MCS-96 series microcontrollers support both 8-bit and 16-bit operating modes; the 16-bit mode is selected to improve system performance. Because the 96 series microcontroller addresses memory in bytes, A0=0 carries no practical meaning in 16-bit operating mode. In normal read/write operations, the latched addresses AD1 to AD15 are used as A1 to A15, while A16 = 0.
The following uses reading the Flash ROM as an example to illustrate the address extension method. For directly addressable addresses, the EPLD acts as a latch, separating the time-division multiplexed lines AD0 to AD15 into independent address and data buses. Two special addresses are defined: Address_F_R, the read address for Flash ROM data blocks, and Address_F_RP, the address of the read position pointer. First, a 16-bit binary number giving the starting address of the data block to be read is written to Address_F_RP. A 16-bit value ranges from 0 to 65535, so the starting address can lie anywhere within 64K words, or 128K bytes. Read operations are then performed repeatedly on Address_F_R. After each read the position pointer is automatically incremented by 1, so it does not need to be set again. To read from a new position, simply write the new position to Address_F_RP. The implementation of this function within the EPLD is shown in Figure 3.
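The pointer-plus-window behavior described above can be modeled in a few lines of Python. This is a behavioral sketch only, not the EPLD logic: the class name `AutoIncrementReadPort`, its method names, and the memory contents are all made up for illustration, and it assumes the pointer wraps at 16 bits and steps after each read.

```python
class AutoIncrementReadPort:
    """Behavioral model of a position pointer plus a single data window."""

    def __init__(self, memory):
        self.memory = memory   # list of 16-bit words standing in for the Flash ROM
        self.pointer = 0       # models the 16-bit position pointer

    def write_pointer(self, start):
        """Models a write to Address_F_RP: load the start position."""
        self.pointer = start & 0xFFFF

    def read_data(self):
        """Models a read from Address_F_R: return a word, then step the pointer."""
        word = self.memory[self.pointer]
        self.pointer = (self.pointer + 1) & 0xFFFF  # 16-bit wrap-around
        return word


# Made-up contents: word i holds 3*i, truncated to 16 bits.
flash = [(3 * i) & 0xFFFF for i in range(65536)]

port = AutoIncrementReadPort(flash)
port.write_pointer(0x0100)                    # one write selects the block start
block = [port.read_data() for _ in range(4)]  # four reads of the same address
print(block, hex(port.pointer))
```

Four reads of one address return four consecutive words, so the CPU spends two addresses on a block of up to 64K words instead of one address per word.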
The counter supports synchronous loading of an initial value and synchronous counting, and is declared as an lpm_counter in AHDL [5]. CNT_EN is the count enable: while CNT_EN is high, the counter increments by one on every rising edge of CLOCK, which implements the automatic address increment. CLOCK is the synchronous clock input, active on the rising edge. SLOAD is the synchronous load signal: when it is high, a rising edge of CLOCK sets the counter output Q[15..0]=D[15..0], which initializes the read position. The counter is described in AHDL as follows:
    counter: lpm_counter WITH (LPM_WIDTH=16);
    counter.clock = /rd & (/we # (a[15..0] != Address_F_RP));
    counter.sload = (a[15..0] == Address_F_RP);
    counter.cnt_en = (a[15..0] == Address_F_R);
    counter.data[15..0] = D[15..0];
[img=549,329]http://www.e-works.net.cn/ewk2004/fileupload/images/127423032799531250.gif[/img]
The clock signal must produce a rising edge both when Address_F_RP is written to change the read position and when data is read from Address_F_R. Buses a0-a15 and D0-D15 are the address and data buses separated from AD0-AD15, respectively. The multiplexer selects the output address according to S0-S3, which are generated by address decoding, and the output address is connected directly to the address lines of the RAM and Flash ROM. If an address other than Address_F_RP is accessed, the output address bus is A[15..1]=a[15..1], A16=0, meaning the microcontroller accesses the memory directly; if Address_F_R is read, chip select /CS2 is valid and A[16..1]=Q[15..0] is used as the output address. This allows automatic switching between different memory areas, greatly increasing the memory expansion capability and simplifying program design.
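The address selection performed by the multiplexer can be sketched as a small Python function. This is an illustrative model, not the author's logic: the value chosen for `ADDRESS_F_R` is hypothetical (the source does not list it), and the function returns the integer formed by output lines A[16..1], i.e. a word index into the Flash ROM.

```python
ADDRESS_F_R = 0x1F00  # hypothetical value; the source does not give the real one


def flash_address_lines(cpu_addr, counter_q):
    """Return the integer formed by output address lines A[16..1].

    cpu_addr  : 16-bit byte address a[15..0] separated from AD0-AD15
    counter_q : 16-bit counter output Q[15..0]
    """
    if cpu_addr == ADDRESS_F_R:
        # Pointer access: A[16..1] = Q[15..0]
        return counter_q & 0xFFFF
    # Direct access: A[15..1] = a[15..1], A16 = 0
    return (cpu_addr >> 1) & 0x7FFF


direct = flash_address_lines(0x2080, 0x1234)         # counter value is ignored
indirect = flash_address_lines(ADDRESS_F_R, 0x8000)  # counter drives the address
print(hex(direct), hex(indirect), hex(indirect << 1))
```

A counter value of 8000H corresponds to byte address 10000H, which is beyond anything the CPU's own 16-bit address could select.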
The same method can be used to define the data block write address Address_F_W and the write position pointer address Address_F_WP for the Flash ROM. Likewise, Address_R_R (RAM data block read address), Address_R_RP (RAM data block read position pointer address), Address_R_W (RAM data block write address), and Address_R_WP (RAM data block write position pointer address) are defined for the RAM. This makes reading and writing the extended memory areas straightforward. The following example in MCS-96 assembly language illustrates how a program uses them. Suppose data must be acquired continuously from IOPORT0 and stored in a specified data block in RAM for processing; the following program can be written:
        LD   40H, address value   ; the destination address to be written, 16-bit word addressing
        ST   40H, Address_R_WP    ; set the write position pointer
    REPEAT:
        LDB  40H, IOPORT0
        LDB  41H, IOPORT0         ; 40H and 41H are internal registers; the port is read twice in a row because data is stored word-wise
        ST   40H, Address_R_W     ; write to the specified position
                                  ; (conditional check to exit the loop)
        JMP  REPEAT
[img=385,336]http://www.e-works.net.cn/ewk2004/fileupload/images/127423032990312500.gif[/img]
This simple example shows that this memory organization greatly simplifies programming, and that assigning an initial value to the position pointer gives read and write access to any location in the extended memory.
3 Address Allocation
With the memory expansion method described above, together with the system's technical parameters and the characteristics of the microcontroller, a reasonable memory address allocation scheme can be drawn up. The microcontroller's address space is partitioned as follows:
0000H~01FFH: System register area.
0200H~1EFFH: User area, directly mapped to 0200H~1EFFH in Flash ROM.
This area can be used to store data, programs, etc., and can be directly addressed by the microcontroller.
1F00H~1FFFH: User area. In actual use, addresses such as Address_F_R and Address_F_WP, as well as the access addresses of special devices such as A/D converters and LCD displays, are placed in this area.
2000H~207FH: Interrupt vector area, chip configuration byte area, reserved word area, etc., directly mapped to 2000H~207FH in Flash ROM.
2080H~8FFFH: User area. The microcontroller boots from 2080H, so this range is directly mapped to 2080H~8FFFH in the Flash ROM and holds the system boot, initialization, and other programs.
9000H~FFFFH: User area, mapped to 9000H~FFFFH in the RAM as the execution area for the system program.
[img=549,248]http://www.e-works.net.cn/ewk2004/fileupload/images/127423033369843750.gif[/img]
The above allocation scheme is implemented by decoding the address bus to generate the corresponding chip select signals /CS1 and /CS2. The resulting usage of Flash ROM and RAM is shown in Figure 4. In Figure 4, the white area is the region the microcontroller can address directly over the bus. The gray area is the extended memory region, which the microcontroller cannot access directly but can read and write through addresses generated by the EPLD, using the method described earlier. The practical use of each region is as follows. The 0000H~01FFH and 1F00H~1FFFH regions of the Flash ROM are left unused because of their small capacity. After system startup, program execution begins at address 2080H in the Flash ROM; the boot code copies the contents of 2000H~8FFFH to 9000H~FFFFH in RAM and then jumps to RAM to execute the system program.
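The partitioning above amounts to a range decode on the 16-bit CPU address. The Python sketch below is one way to express it, under two assumptions that are not stated outright in the source: the window for special addresses is taken to be 1F00H~1FFFH (as the remark about unused Flash ROM regions implies), and /CS1 is taken to be the RAM select since /CS2 is described as the Flash ROM select. The function name and region labels are chosen here for illustration.

```python
def decode(addr):
    """Map a 16-bit CPU byte address to the device that answers it."""
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("address outside the 16-bit range")
    if addr <= 0x01FF:
        return "registers"  # internal system register area
    if addr <= 0x1EFF:
        return "flash"      # /CS2, directly mapped user area
    if addr <= 0x1FFF:
        return "special"    # pointer/data windows, A/D converter, LCD (assumed window)
    if addr <= 0x8FFF:
        return "flash"      # /CS2: vectors, configuration bytes, boot code
    return "ram"            # /CS1 (assumed): execution area for the system program


# Size of the block copied at boot and of the RAM area that receives it.
boot_copy_bytes = 0x8FFF - 0x2000 + 1
ram_exec_bytes = 0xFFFF - 0x9000 + 1

print(decode(0x2080), decode(0x9000), boot_copy_bytes // 1024, ram_exec_bytes // 1024)
```

Both sizes come out at 28KB, so the block copied from Flash ROM fills the RAM execution area exactly.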
Because the Flash ROM is slow, a certain number of wait cycles must be inserted during read and write operations; copying the program to RAM for execution therefore improves system performance. In addition, after a write the Flash ROM cannot be read during the 10ms programming phase, so RAM also provides a place for the program to run during this time. With this allocation the program length is limited to 28KB, which is sufficient for the system in practice. In the Flash ROM, 28KB (9000H~FFFFH) stores the four system configuration programs, each up to 7KB long, and 64KB (10000H~1FFFFH) serves as the data storage area. In the RAM, 36KB (0000H~8FFFH) serves as the data buffer. This analysis shows that the final design exceeds the actual requirements and effectively solves the practical problem.
4 Reasonable Use of the READY Signal
Finally, this section describes the crucial role of the microcontroller's READY signal in this system. As seen in the design above, the system contains high-speed RAM and a slow Flash ROM. Initially, the Flash ROM used was the AT29C1024-70JC [3], the fastest of its type, with an effective data setup time of only 70ns. The microcontroller's read/write timing without wait cycles is shown in Figure 5. The time from the falling edge of ALE to the rising edge of /RD is 80ns and the response time of the Flash ROM is 70ns; with the delay of the EPLD added, the data the microcontroller reads from the Flash ROM is unstable. The symptoms are that the Flash ROM cannot be written online, execution results are frequently wrong, and the system crashes. A wait cycle must therefore be added to lengthen the read and write time enough for the Flash ROM. Only one wait cycle (100ns) is needed, so the chip configuration byte is set to CCR.5=0, CCR.4=0 [1]. With this setting, when the READY signal is low a wait cycle is inserted automatically, and only one.
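The timing argument can be checked with simple arithmetic. The sketch below is a simplification: it assumes the time available for a read is the 80ns quoted for the no-wait case plus 100ns per wait cycle, and it treats the EPLD delay as a parameter. The 15ns used for that delay is a hypothetical figure for illustration; the source does not give one.

```python
def read_margin_ns(access_ns, wait_cycles, epld_delay_ns,
                   window_ns=80, wait_cycle_ns=100):
    """Slack between the read window and the memory path delay, in ns.

    A negative result means the data is not yet stable when /RD rises.
    """
    available = window_ns + wait_cycles * wait_cycle_ns
    required = access_ns + epld_delay_ns
    return available - required


EPLD_DELAY_NS = 15  # hypothetical value, for illustration only

no_wait = read_margin_ns(70, 0, EPLD_DELAY_NS)   # 80 - (70 + 15)
one_wait = read_margin_ns(70, 1, EPLD_DELAY_NS)  # 180 - (70 + 15)
print(no_wait, one_wait)
```

Under these assumptions the margin is negative without a wait cycle and comfortably positive with one, which matches the observed instability and the decision to insert a single wait cycle.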
A simple approach is to tie READY to the Flash ROM chip select signal /CS2, so that whenever the Flash ROM is selected, READY goes low along with /CS2. Following this idea, the READY signal can be generated inside the EPLD as follows:
    ready = !((((a[15..0] >= H"0200") & (a[15..0] <= H"1EFF"))
            # ((a[15..0] >= H"2000") & (a[15..0] <= H"8FFF"))
            # (a[15..0] == Address_F_R)
            # (a[15..0] == Address_F_W)) & !ALE);
However, the fault persists in practice. The timing measured in testing is shown in Figure 6.
[img=513,326]http://www.e-works.net.cn/ewk2004/fileupload/images/127423033617031250.gif[/img]
READY is generated 5ns after the falling edge of ALE, which is too late for it to take effect. The only way to solve this problem is to generate READY earlier. In practice, the effective address is produced by latching it on the falling edge of ALE, which is where the last term of the READY expression comes from. However, since the address must already be present before the falling edge of ALE for the correct address to be latched, we propose that READY no longer be gated by ALE. As soon as an address appears on the bus it can be decoded, so READY is generated in advance. This departs from synchronous timing, and an asynchronously generated READY signal is in general prone to hazards; analysis shows, however, that in this design asynchronous generation of READY introduces no instability. The READY signal is therefore modified as follows:
    ready = !(((a[15..0] >= H"0200") & (a[15..0] <= H"1EFF"))
            # ((a[15..0] >= H"2000") & (a[15..0] <= H"8FFF"))
            # (a[15..0] == Address_F_R)
            # (a[15..0] == Address_F_W));
That is, the test on the address valid signal ALE is removed. After this modification, the system works stably and normally.
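The two READY expressions differ only in the ALE gating, which a small Python model makes easy to see. This is a sketch under assumptions: the values of `ADDRESS_F_R` and `ADDRESS_F_W` are hypothetical, the function names are chosen here, and the first version is read as gating all four address terms with !ALE, as the surrounding text describes.

```python
ADDRESS_F_R = 0x1F00  # hypothetical value
ADDRESS_F_W = 0x1F02  # hypothetical value


def flash_selected(addr):
    """True when a[15..0] falls in a range served by the Flash ROM."""
    return (
        0x0200 <= addr <= 0x1EFF
        or 0x2000 <= addr <= 0x8FFF
        or addr in (ADDRESS_F_R, ADDRESS_F_W)
    )


def ready_gated(addr, ale):
    """First version: READY can go low only after ALE has fallen."""
    return not (flash_selected(addr) and not ale)


def ready_async(addr):
    """Modified version: READY follows the address alone."""
    return not flash_selected(addr)


# While ALE is still high, only the asynchronous version already requests a wait.
print(ready_gated(0x2080, ale=True), ready_async(0x2080))
# RAM addresses never request a wait in either version.
print(ready_gated(0x9000, ale=False), ready_async(0x9000))
```

With the gating removed, READY is already low for a Flash ROM address before ALE falls, which is the head start the design needs.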
The read and write timing of the Flash ROM after the modification is shown in Figure 7, while the read and write timing of the RAM remains as shown in Figure 6, so the goal is achieved. Because inserting a wait cycle lengthens the read and write time considerably, the AT29C1024-70JC can be replaced with the less expensive AT29C1024-12JC (valid data setup time 120ns) [3], and the system still works stably. Use in practice has shown that this memory design scheme is feasible.
[img=462,278]http://www.e-works.net.cn/ewk2004/fileupload/images/127423033869531250.gif[/img]
The preceding sections described a practical memory expansion method based on PLD devices that effectively solves the memory expansion problem in embedded systems, especially data acquisition and storage systems. The method simplifies program design and requires no design changes for different CPU models, so it is highly portable. A more complex external memory organization for a microcontroller, consisting of Flash ROM and RAM, was also presented. Finally, changing the READY signal from synchronous to asynchronous generation resolved the problem caused by the CPU switching between high-speed RAM and low-speed Flash ROM. The result is a relatively complete external memory system for the CPU.