Abstract: In digital system interconnect design, high-speed serial I/O technology is steadily replacing traditional parallel I/O technology. Compared with traditional parallel interfaces, serial solutions offer greater bandwidth, longer transmission distances, lower cost, and greater capability. Ethernet, as a high-speed serial transmission method, is currently the most basic and popular local area network (LAN) technology, and its speed keeps increasing to support newly developed services such as streaming video. GMII is the standard Gigabit Ethernet interface located between the MAC layer and the physical layer. The GMII interface protocol can therefore be implemented on an FPGA platform to carry out data communication between the MAC and physical layers.
1. Introduction to GMII Interface Protocol
MII (Media Independent Interface) is an Ethernet industry standard defined by IEEE 802.3. It comprises a data interface and a management interface between the MAC and the PHY [1]. The data interface consists of two independent channels, one for sending data and one for receiving data, each with its own data, clock, and control signals. GMII is the MII interface of Gigabit Ethernet. The data interface requires a total of 16 signals, and the interface signals are shown in Figure 1.
GMII uses an 8-bit data interface with a 125 MHz operating clock, which gives a transmission rate of up to 1000 Mbps. It is also compatible with the 10/100 Mbps operating modes specified by MII. The interface is mainly divided into four parts: the transmit data interface from the MAC layer to the physical layer, the receive data interface from the physical layer to the MAC layer, the status indication interface from the physical layer to the MAC layer, and the control and status information interface between the MAC layer and the physical layer (MDIO). Specific signal descriptions are shown in Table 1.
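The rate figures above follow from multiplying the data-bus width by the clock frequency. The short sketch below works through that arithmetic. The function name is hypothetical, and the MII values (a 4-bit bus clocked at 25 MHz or 2.5 MHz) are taken from the IEEE 802.3 MII definition rather than from this paper.

```python
# Interface throughput = data-bus width (bits) x clock frequency (Hz).
def throughput_mbps(bus_width_bits: int, clock_hz: float) -> float:
    """Return the raw interface data rate in Mbps."""
    return bus_width_bits * clock_hz / 1e6

gmii_rate = throughput_mbps(8, 125e6)     # GMII: 8-bit bus at 125 MHz
mii_100_rate = throughput_mbps(4, 25e6)   # MII, 100 Mbps mode: 4-bit bus at 25 MHz
mii_10_rate = throughput_mbps(4, 2.5e6)   # MII, 10 Mbps mode: 4-bit bus at 2.5 MHz

print(gmii_rate, mii_100_rate, mii_10_rate)
```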
2. Design Scheme
Xilinx provides a Gigabit Ethernet development kit for Virtex-5 ML505/ML506 development boards, which support 10/100M and 1/10G Ethernet and are ideal platforms for learning and developing high-speed connectivity devices. Xilinx provides a LogiCORE solution with parameterizable 10/1Gbps Ethernet physical layer controller functionality [2]. This core is designed to work with the latest Virtex-5, Virtex-4 and Virtex-II Pro platform FPGAs and can be seamlessly integrated into the Xilinx design flow.
The two main modules of an Ethernet system are Media Access Control (MAC) and Physical Layer (PHY). The MAC consists of two modules: data frame encapsulation/decapsulation and media access management, which perform the functions of data frame encapsulation, decapsulation, transmission, and reception. The PHY encodes the data to be transmitted according to the physical layer's encoding rules, then performs digital-to-analog conversion to convert it into an analog signal before sending it out. Receiving data is the reverse process.
2.1 Circuit Architecture
The FPGA design of the Ethernet controller mainly covers the MAC sublayer, the interface between the MAC layer and the upper-layer protocol, and the GMII interface between the MAC layer and the PHY. The overall block diagram is shown in Figure 2. The system is divided into the data generation module, the sending module, the CRC encoding generation module, the physical layer encoding and decoding module, the receiving and verification module, and the GMII management module. The sending and receiving modules provide the MAC frame sending and receiving functions. Their main operations are MAC frame encapsulation, decapsulation, and error detection, and they directly provide a parallel data interface to the external physical layer chip [3]. In the implementation, physical layer processing is handled by a commercial gigabit PHY chip. In simulation, a physical layer IP_CORE is used instead. This paper therefore focuses on the development of the MAC controller.
2.2 Introduction to the MAC Protocol
The MAC control module consists of two modules: data encapsulation/decapsulation and media access management. It performs the functions of data frame encapsulation, decapsulation, transmission, and reception. The frame format is shown in Table 2.
The preamble synchronizes the physical layer signal with the timing of the received frame. The length/type field indicates either the length of the data that follows or the protocol type of that data; if the actual data is shorter than the minimum length, it must be padded with zeros. For example, type 0x0800 represents IP protocol data and type 0x809B represents AppleTalk protocol data. This design sends IP protocol data. The field at the end of the frame is a checksum calculated by a CRC circuit.
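As an illustration of the frame layout described above, the following sketch assembles a frame in software. Since Table 2 is not reproduced here, the field sizes (7-byte preamble, 1-byte start-of-frame delimiter, 6-byte addresses, 2-byte length/type field, 46-byte minimum data field, 4-byte FCS) are assumed from the standard IEEE 802.3 format. The function name and the example addresses are hypothetical, and `zlib.crc32` stands in for the CRC circuit.

```python
import struct
import zlib

PREAMBLE = bytes([0x55] * 7)   # seven preamble bytes
SFD = bytes([0xD5])            # start-of-frame delimiter
MIN_PAYLOAD = 46               # minimum data field length in bytes
ETHERTYPE_IP = 0x0800          # type value for IP protocol data

def build_frame(dst: bytes, src: bytes, ethertype: int, payload: bytes) -> bytes:
    """Assemble preamble, SFD, header, zero-padded data, and FCS."""
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("MAC addresses must be 6 bytes long")
    # Pad short payloads with zeros up to the minimum data length.
    if len(payload) < MIN_PAYLOAD:
        payload = payload + bytes(MIN_PAYLOAD - len(payload))
    body = dst + src + struct.pack("!H", ethertype) + payload
    # The checksum covers destination address through data field;
    # it is appended least significant byte first.
    fcs = struct.pack("<I", zlib.crc32(body))
    return PREAMBLE + SFD + body + fcs

# Hypothetical addresses, for illustration only.
frame = build_frame(bytes.fromhex("ffffffffffff"),
                    bytes.fromhex("020000000001"),
                    ETHERTYPE_IP, b"hello")
print(len(frame), frame.hex())
```

A 5-byte payload is padded to 46 bytes here, so the resulting frame is 72 bytes long including the preamble and delimiter.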
2.3 Ethernet FCS Processing
The frame check sequence (FCS) is a Cyclic Redundancy Code (CRC). The encoding process is as follows: based on the length and characteristics of the data stream M, a characteristic polynomial of degree n is selected, and n zeros are appended to the data stream M. The result is used as the dividend and divided (modulo 2) by the (n+1)-bit binary sequence P formed from the characteristic polynomial, giving the quotient Q and the remainder R. The remainder R is n bits long. R is appended to M as the redundancy code and sent out. The serial implementation circuit for CRC8 encoding is shown in Figure 3 [4]:
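The division procedure described above can be sketched in a few lines of software. This is only an illustration of modulo-2 long division on bit strings, not the serial circuit of Figure 3. The function names and the small 4-bit divisor are chosen for the example and do not come from the paper.

```python
def mod2_remainder(dividend_bits: str, poly_bits: str) -> str:
    """Divide a bit string by poly_bits using XOR (modulo-2) long division."""
    n = len(poly_bits) - 1                      # remainder length in bits
    work = [int(b) for b in dividend_bits]
    poly = [int(b) for b in poly_bits]
    for i in range(len(work) - n):
        if work[i]:                             # subtract (XOR) P wherever the leading bit is 1
            for j, p in enumerate(poly):
                work[i + j] ^= p
    return "".join(str(b) for b in work[-n:])

def crc_remainder(message_bits: str, poly_bits: str) -> str:
    """Append n zeros to M, divide by P, and return the n-bit remainder R."""
    n = len(poly_bits) - 1
    return mod2_remainder(message_bits + "0" * n, poly_bits)

M = "11010011101100"     # example data stream
P = "1011"               # (n+1)-bit sequence for x^3 + x + 1, so n = 3
R = crc_remainder(M, P)
print(R)                                  # 100
print(mod2_remainder(M + R, P))           # 000: the codeword M+R divides evenly
```

The second print shows why the receiver can detect errors: an undamaged codeword leaves a zero remainder when divided by P again.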
Before encoding, all registers are initialized. The information sequence to be sent is then shifted into the encoder input one bit at a time. After the entire information sequence has been input, the value held in the registers is the required remainder, i.e., the CRC checksum. This article uses the CRC32 polynomial specified for Ethernet: G(x) = x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1. The data segment to be encoded starts at the destination address field and ends at the data field. Using a circuit similar to the CRC8 circuit, the redundancy code encoder can be implemented in Verilog.
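A bit-serial software model of the CRC32 calculation may help when checking a Verilog implementation against known values. The sketch below assumes the usual IEEE 802.3 conventions, which the paper does not spell out: the register is initialized to all ones, each byte is processed least significant bit first, and the final value is complemented. The function name is hypothetical. The result is compared with Python's `zlib.crc32`, which follows the same conventions.

```python
import zlib

CRC32_POLY_REFLECTED = 0xEDB88320   # bit-reversed form of 0x04C11DB7

def crc32_ethernet(data: bytes) -> int:
    """Bit-serial CRC32 with all-ones initial value and final complement."""
    crc = 0xFFFFFFFF                        # initialize the register to all ones
    for byte in data:
        crc ^= byte
        for _ in range(8):                  # one shift per input bit, LSB first
            if crc & 1:
                crc = (crc >> 1) ^ CRC32_POLY_REFLECTED
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF                 # complement the final register value

check = crc32_ethernet(b"123456789")
print(hex(check))                           # 0xcbf43926
print(check == zlib.crc32(b"123456789"))    # True
```

The value 0xCBF43926 is the standard check value for this CRC32 variant, so it can serve as a first test vector for a hardware encoder.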
3. Circuit Implementation and Simulation
3.1 MAC Sender - Data Framing
Ethernet transmits data frame by frame. After receiving a frame, network devices and components need a short period of time to recover and prepare for the next frame. The interframe gap is the time margin required between frames; the minimum interframe gap for Ethernet is 96 bit times (12 bytes). Therefore, before transmission starts, the sender must determine whether the interframe gap requirement has been met. Based on the GMII transmit timing shown in Figure 4, the state machine shown in Figure 5 is designed. State transitions are driven by counting bytes in each state.
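The byte-counting behaviour of such a state machine can be modelled in software as follows. Because Figure 5 is not reproduced here, the state names and the split into gap, preamble, delimiter, and data states are assumptions made for illustration. The model emits one (state, tx_en, txd) tuple per clock cycle for a single frame.

```python
from enum import Enum, auto

class TxState(Enum):      # hypothetical state names
    IFG = auto()          # interframe gap, transmitter idle
    PREAMBLE = auto()
    SFD = auto()
    DATA = auto()         # header, data, and FCS bytes

IFG_BYTES = 12            # minimum interframe gap: 96 bit times
PREAMBLE_BYTES = 7

def gmii_tx_cycles(frame_body: bytes):
    """Yield (state, tx_en, txd) once per clock cycle for one frame."""
    if not frame_body:
        raise ValueError("frame_body must not be empty")
    state, count = TxState.IFG, 0
    while True:
        if state is TxState.IFG:
            yield state, 0, 0x00
            count += 1
            if count == IFG_BYTES:            # gap satisfied, start the frame
                state, count = TxState.PREAMBLE, 0
        elif state is TxState.PREAMBLE:
            yield state, 1, 0x55
            count += 1
            if count == PREAMBLE_BYTES:
                state, count = TxState.SFD, 0
        elif state is TxState.SFD:
            yield state, 1, 0xD5
            state, count = TxState.DATA, 0
        else:
            yield state, 1, frame_body[count]
            count += 1
            if count == len(frame_body):      # last byte sent, frame complete
                return

cycles = list(gmii_tx_cycles(bytes(range(64))))
print(len(cycles))        # 12 gap + 7 preamble + 1 SFD + 64 body cycles
```

At one byte per 125 MHz clock cycle, the 12 idle cycles correspond to the 96 ns gap of Gigabit Ethernet.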
During data transmission, the MAC sending module encapsulates the data that the upper-layer protocol needs to send according to the Ethernet protocol and passes it to the PHY layer. The sending module takes the frame header and frame tail flag signals received from the host, together with the data to be sent, which is read from the external storage unit through the host interface, encapsulates them according to the standard protocol, and sends the result to the PHY layer in 8-bit-wide format when the channel is idle. The PHY chip then performs digital-to-analog conversion on the data and sends it onto the network.
In the physical layer, the Ethernet 1000BASE-X PCS/PMA IP core is generated on the ISE platform to receive data from the MAC layer. This core supports an internal or external GMII and can be connected to a MAC or to custom logic. The main components of the IP core are the PMA and the PCS. The PMA is the physical medium attachment sublayer of the physical layer, and the PCS is the physical coding sublayer. The core can perform 8B/10B encoding and decoding, 64B/66B encoding and decoding, comma character detection, alignment of received data to the appropriate word boundary, pseudo-random sequence generation and detection, clock correction, channel bonding, and so on [5].
3.2 MAC Receiver - Data Extraction
After receiving the returned data, the MAC side needs to check it, which first requires extracting the payload data and the redundancy check code crc_cmp. When the length of the sent data packet is not known in advance, the data and the check code cannot be separated simply by counting bytes. A feasible approach is shown in the timing diagram of Figure 6.
At the receiving end, a counter starts counting when the first byte of the preamble (0x55) is detected. When the count reaches 14, the Rx_dv_i signal is asserted on the next clock cycle to mark the start of the actual transmitted data. This signal remains high until the four checksum bytes have been received, at which point it is pulled low. Rx_dv_i is delayed by four clock cycles to obtain the Rx_dv_a4 signal, and Rx_data is likewise delayed by four clock cycles to obtain Rx_data_d4. When both Rx_dv and Rx_dv_a4 are high, the data on Rx_data_d4 is payload data. When Rx_dv is low and Rx_dv_a4 is high, Rx_data_d4 carries the checksum. With this scheme, the payload data and the FCS checksum can be extracted separately even without knowing the exact number of data bytes transmitted.
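The delay-based separation described above can be checked with a small cycle-by-cycle model. In this sketch the valid signal is assumed to be high for the data bytes plus the four checksum bytes, and both the valid signal and the data are passed through a four-stage delay line. The function name, the Python variable names, and the test pattern are hypothetical.

```python
from collections import deque

DELAY = 4   # checksum length in bytes, and the delay in clock cycles

def extract_payload_and_fcs(rx_dv, rx_data):
    """Split a received byte stream into payload and checksum without a length count."""
    dv_pipe = deque([0] * DELAY, maxlen=DELAY)     # four-stage delay of the valid signal
    data_pipe = deque([0] * DELAY, maxlen=DELAY)   # four-stage delay of the data
    payload, fcs = bytearray(), bytearray()
    for dv, data in zip(rx_dv, rx_data):
        dv_d4, data_d4 = dv_pipe[0], data_pipe[0]  # values from four cycles earlier
        if dv and dv_d4:
            payload.append(data_d4)                # both high: delayed byte is payload
        elif not dv and dv_d4:
            fcs.append(data_d4)                    # only the delayed valid is high: checksum
        dv_pipe.append(dv)
        data_pipe.append(data)
    return bytes(payload), bytes(fcs)

# Two idle cycles, 8 payload bytes, 4 checksum bytes, then 6 idle cycles.
stream = b"ABCDEFGH" + b"\x01\x02\x03\x04"
rx_dv = [0, 0] + [1] * len(stream) + [0] * 6
rx_data = [0, 0] + list(stream) + [0] * 6
payload, fcs = extract_payload_and_fcs(rx_dv, rx_data)
print(payload, fcs.hex())
```

The model needs at least four idle cycles after the frame so that the delayed checksum bytes can drain out of the delay line, which the interframe gap provides in practice.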
3.3 Circuit Function Simulation
After the relevant data has been extracted, the data from the sending end is compared with the received data. If they differ, the data_error signal is driven low to indicate an error. At the same time, the received data is fed into the CRC encoding circuit to generate the checksum rx_crc. The sender's checksum crc_cmp is compared with rx_crc; if they differ, the crc_error signal is driven low. Finally, the overall error indication signal 'error' is generated by ANDing data_error and crc_error, so it goes low if either check fails.
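The comparison logic above amounts to two active-low flags combined with an AND. The sketch below mirrors it in software under the assumption that `zlib.crc32` stands in for the CRC encoding circuit. The function name and test data are hypothetical, while the flag names follow the text.

```python
import zlib

def check_frame(sent_data: bytes, received_data: bytes, crc_cmp: int):
    """Return (data_error, crc_error, error); every flag is active low."""
    rx_crc = zlib.crc32(received_data)                  # checksum recomputed at the receiver
    data_error = 0 if received_data != sent_data else 1
    crc_error = 0 if rx_crc != crc_cmp else 1
    error = data_error & crc_error                      # low if either check fails
    return data_error, crc_error, error

sent = b"payload bytes"
good = check_frame(sent, sent, zlib.crc32(sent))
corrupted = check_frame(sent, b"pAyload bytes", zlib.crc32(sent))
print(good, corrupted)
```

With correct data all three flags stay high, which matches the condition the simulation waveform is expected to show.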
After the design was completed, the circuit was simulated using ModelSim. The simulation waveform is shown in Figure 7. The three error flag signals (data_error, crc_error, and error) all remain high after the circuit starts working, indicating that the circuit has successfully completed the data transmission between the MAC and the PHY.
4. Conclusion
High-speed serial transmission technology is one of the three major application areas for FPGAs in the future. This paper starts from the overall structure and basic protocol of Ethernet transmission and designs a gigabit Ethernet transmission system. Using MAC+PHY as the core, it completes the basic functions of the physical layer and data link layer in the network architecture. Simulation verifies that the data is transmitted accurately and reliably between the data link layer and the physical layer, demonstrating good stability and high flexibility. This system can also be used to transmit images and large amounts of data.