The H.264/AVC standard introduces a range of new features that make it superior to MPEG-4 and H.263: at the same reconstructed image quality, H.264 saves about 50% of the bit rate compared with H.263. The price of this bit-rate saving is higher algorithm complexity, and a pure software implementation cannot yet decode in real time. Hardware acceleration is therefore required, which is the original motivation for this decoder design.
Although H.264 needs only half the bit rate of H.263 at the same image quality, the goal of this design is to decode H.264 high-definition (1080i) video, and the decoder must also work with general-purpose external video equipment. The chosen interface must therefore both carry a high-speed stream of source data and be easy to plug and unplug. The USB interface meets both conditions. High-quality source files are large, which places high demands on the transmission interface. In the FPGA emulation environment, the USB interface is also responsible for returning the decoding results to the host PC, so the transmission speed must at least exceed the decoding speed. Compared with USB 1.1, the transfer rate of USB 2.0 is far better suited to this design.
It is understood that the transmission interface needs a transfer rate of at least 30 Mb/s to decode 1080i images.
Device selection
Emulation and verification on an FPGA has become an essential step in IC design, especially for large-scale designs. This decoder IC uses a Virtex-II FPGA as its emulation environment. The XC2V6000 in the FF1517 BGA package fully meets the requirements of this design. Taking design cost into account, this FPGA is a relatively cost-effective option.
Cypress's EZ-USB FX2 is a microcontroller with integrated USB 2.0: it combines a USB 2.0 transceiver, an SIE (serial interface engine), an enhanced 8051 microcontroller, and a programmable peripheral interface. The optimized design of FX2 can reach a data transfer rate of 56 MB/s, while the maximum bandwidth allowed by USB 2.0 is 480 Mb/s, i.e., 60 MB/s. The chip adds many integrated control functions while preserving this transmission bandwidth. Its GPIF and Slave FIFO modes provide simple, seamless connection interfaces for external FPGAs, DSPs, and ATA devices.
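The bandwidth figures above can be checked with a short calculation. This is only a minimal sketch of the unit conversion; the function and constant names are illustrative and do not come from the design itself.

```python
# Unit check for the USB 2.0 figures quoted above.
# Names here are illustrative, not taken from the original design.

def mbit_to_mbyte(rate_mbit_s: float) -> float:
    """Convert a rate in megabits per second to megabytes per second."""
    return rate_mbit_s / 8.0

USB2_SIGNALING_MBIT_S = 480.0   # USB 2.0 high-speed signaling rate
FX2_PEAK_MBYTE_S = 56.0         # FX2 peak rate quoted in the text

usb2_ceiling_mbyte_s = mbit_to_mbyte(USB2_SIGNALING_MBIT_S)
fx2_fraction = FX2_PEAK_MBYTE_S / usb2_ceiling_mbyte_s

print(f"USB 2.0 ceiling: {usb2_ceiling_mbyte_s:.0f} MB/s")
print(f"FX2 peak uses {fx2_fraction:.1%} of that ceiling")
```

The point of the sketch is simply that 480 is a bit rate and 60 is the corresponding byte rate, so the quoted FX2 peak sits just below the theoretical bus limit.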
System architecture
The main structure of this design is shown in Figure 1. On the FPGA emulation platform, the Virtex-II contains the decoder core and the FPGA-side interface module. The USB 2.0 chip (CY7C68013A) is a separate component responsible for USB data transfer between the FPGA and the PC. The SRAM and DRAM outside the FPGA serve as its extended storage, holding the source stream files required by the decoder, the decoded files, and the software program files used by the decoder.
Figure 1 Schematic diagram of FPGA simulation transmission
In this design the decoder side is powerful and has an embedded CPU, so it can actively recognize commands. The PC side and the decoder are therefore peers. The work of the PC includes sending command headers, sending commands, transmitting the stream, and receiving the returned decoding results. The FPGA side receives and identifies the command headers and commands from the PC, receives the stream and stores it in SRAM and DRAM, and reads the decoding results from SRAM and DRAM and passes them back to the PC.
USB 2.0 chip working mode and firmware
1 Determination of Working Mode of Chip
Two processes in this design involve transferring large amounts of file data: the PC sends the source file down to the FPGA, and the FPGA sends the decoding result file up to the host PC. These place the highest demands on USB transfer. If delivery of the source file cannot keep up with the decoding speed, the decoder stalls; if the return of results lags behind, decoding results that have not yet been transmitted are overwritten. Either case directly causes the decoder to fail.
For such demanding transfers, the Slave FIFO mode (with bulk transfer) provided by EZ-USB FX2 meets the requirements. In this mode the memory of the USB chip is divided into six endpoints, referred to below as EP. EP0 and EP1 are reserved as chip configuration FIFOs. EP2, EP4, EP6, and EP8 are available for user data, and each of these four is built as a double FIFO.
For example, as shown in Figure 2, for a USB OUT transfer the EP2 endpoint is configured as a 512-byte double FIFO. As seen from the external device, the USB side can continue to send data as long as one 512-byte FIFO is "half full". When that FIFO becomes "full", FX2 automatically switches it to the external interface, where it queues to be read, and switches the next FIFO in the USB interface queue to the USB interface so that writing can continue. The external interface works similarly: as long as one FIFO is "half full", data can continue to be read. When the FIFO currently being read becomes "empty", FX2 automatically switches it to the USB interface, where it queues to be written, and hands the next "full" FIFO in the external interface queue to the external device.
Figure 2 EndPoint Schematic
Figure 3 shows how the double FIFO works. When one 512-byte FIFO is full, the FPGA can take out its data while the PC writes data to the other FIFO (the solid arrows). When one 512-byte FIFO is empty, the PC can write data to it while the FPGA reads the other FIFO (the dashed arrows).
Figure 3 Dual EP operation mode
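The double-FIFO hand-off described above can be modeled in a few lines of software. The following is a behavioral sketch only, not FX2 firmware or FPGA logic: the class `DoubleFifo` and its method names are hypothetical, and it assumes a buffer is handed to the external side only when it is completely full.

```python
from collections import deque

BUF_SIZE = 512  # bytes per FIFO buffer, as in the EP2 example


class DoubleFifo:
    """Behavioral model of a double-buffered OUT endpoint (hypothetical names)."""

    def __init__(self, n_buffers: int = 2, size: int = BUF_SIZE):
        self.size = size
        # Buffers currently owned by the USB side (being filled by the PC).
        self.usb_queue = deque(bytearray() for _ in range(n_buffers))
        # Full buffers waiting to be read by the external device (FPGA).
        self.ext_queue = deque()

    def usb_write(self, data: bytes) -> int:
        """PC side: write as much of `data` as fits; return bytes accepted."""
        accepted = 0
        while accepted < len(data) and self.usb_queue:
            buf = self.usb_queue[0]
            room = self.size - len(buf)
            chunk = data[accepted:accepted + room]
            buf.extend(chunk)
            accepted += len(chunk)
            if len(buf) == self.size:
                # Buffer is full: hand it over to the external interface.
                self.ext_queue.append(self.usb_queue.popleft())
        return accepted

    def ext_read_buffer(self):
        """FPGA side: drain one full buffer, or return None if none is ready."""
        if not self.ext_queue:
            return None
        buf = self.ext_queue.popleft()
        data = bytes(buf)
        buf.clear()
        # Buffer is empty again: return it to the USB side for refilling.
        self.usb_queue.append(buf)
        return data


if __name__ == "__main__":
    fifo = DoubleFifo()
    print(fifo.usb_write(bytes(1024)))    # both buffers fill
    print(fifo.usb_write(b"x"))           # nothing accepted: PC side stalls
    print(len(fifo.ext_read_buffer()))    # FPGA drains one buffer
    print(fifo.usb_write(b"x"))           # PC side can write again
```

In this model the PC side stalls only when both buffers are full and the FPGA side waits only when neither is full, which is the overlap between writing and reading that Figure 3 illustrates.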
2 Firmware Program Design
When the firmware initializes the USB device, the following important configuration registers must be set.
IFCONFIG: selects an externally supplied interface clock and Slave FIFO mode.
EPxCFG (x = 2, 4, 6, 8): configures the four EPs (endpoint FIFOs).
EPxFIFOCFG (x = 2, 4, 6, 8): configures the automatic transfer mode and data bus width of the four EPs.
Other registers can be configured as actual needs require. In this design, EP2 transfers command headers, EP4 transfers source stream files, EP6 transfers commands, and EP8 transfers decoding result files.
Once the firmware is complete, the Control Panel tool supplied with FX2 can download it to the CY7C68013A chip, or the firmware can be stored in an external I2C EEPROM so that the chip loads it by itself on reset.
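To make the register settings above more concrete, the sketch below composes example values for IFCONFIG, EPxCFG, and EPxFIFOCFG from their bit fields. The article does not give the values actually used, so everything here is an assumption: the bit positions are as recalled from the EZ-USB FX2 Technical Reference Manual and should be verified against it, the helper names are hypothetical, and the endpoint directions are inferred from the roles described in the text.

```python
# Hypothetical helpers that compose FX2 configuration bytes from bit fields.
# Bit positions are assumptions to be checked against the FX2 TRM.

def ifconfig(internal_clock: bool, slave_fifo: bool, asynchronous: bool = False) -> int:
    value = 0
    if internal_clock:
        value |= 0x80          # IFCLKSRC: 1 = internal clock, 0 = external
    if asynchronous:
        value |= 0x08          # ASYNC: 1 = asynchronous FIFO interface
    if slave_fifo:
        value |= 0x03          # IFCFG[1:0] = 11 selects Slave FIFO mode
    return value


def epxcfg(dir_in: bool, buf: int = 0b10, size_1024: bool = False) -> int:
    value = 0x80               # VALID
    if dir_in:
        value |= 0x40          # DIR: 1 = IN (to PC), 0 = OUT (from PC)
    value |= 0b10 << 4         # TYPE = 10: bulk
    if size_1024:
        value |= 0x08          # SIZE: 1 = 1024 bytes, 0 = 512 bytes
    value |= buf & 0x03        # BUF: 10 = double buffering (EP2/EP6 only)
    return value


def epxfifocfg(auto_out: bool = False, auto_in: bool = False, wordwide: bool = True) -> int:
    value = 0
    if auto_out:
        value |= 0x10          # AUTOOUT: OUT packets commit automatically
    if auto_in:
        value |= 0x08          # AUTOIN: IN packets commit automatically
    if wordwide:
        value |= 0x01          # WORDWIDE: 1 = 16-bit FD bus, 0 = 8-bit
    return value


# Example values for an external clock, 512-byte double-buffered bulk endpoints.
IFCONFIG = ifconfig(internal_clock=False, slave_fifo=True)
EP2CFG = epxcfg(dir_in=False)           # command headers, PC to FPGA
EP8CFG = epxcfg(dir_in=True, buf=0)     # results, FPGA to PC (BUF bits unused on EP8)
EP2FIFOCFG = epxfifocfg(auto_out=True)
EP8FIFOCFG = epxfifocfg(auto_in=True)

for name in ("IFCONFIG", "EP2CFG", "EP8CFG", "EP2FIFOCFG", "EP8FIFOCFG"):
    print(f"{name} = 0x{globals()[name]:02X}")
```

In real firmware these bytes would be written to the corresponding registers during initialization; the Python form is used here only to show how the fields combine.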
3 Circuit Design Schematic
Figure 4 is the circuit schematic of this design. The components, from top to bottom, are: the CY7C68013A chip, the power-supply decoupling capacitor bank, the USB 2.0 standard connector, a standard RS232 serial port, the external crystal oscillator, and the HIN232 serial transceiver chip. The board was built according to this schematic and implements the USB 2.0 functions.
Figure 4 Circuit Design Schematic
Design of the interface module between the FPGA decoder and USB
In Slave FIFO mode, the FPGA can actively decide whether to read the data in the USB chip's internal FIFOs, rather than only passively accepting data sent by the PC. The control scheme is shown in Figure 5. SLOE, SLRD, and SLWR serve as the output-enable, read, and write control signals of the EPs. FIFOADR[1:0] selects which of the four EPs is the current operation target. PKTEND is the control pin through which the FPGA actively commands the USB chip to send its data to the host PC. FLAGx (x = A, B, C, D) reports the empty/full status of the currently selected FIFO. FD (8 or 16 bits) is the bidirectional data bus. The FPGA interface module controls these pins to operate the USB chip.
Figure 5 Decoder and USB interface
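As an illustration of the endpoint selection described above, the sketch below maps each endpoint to a FIFOADR[1:0] code and to its role in this design. The address encoding (00 = EP2, 01 = EP4, 10 = EP6, 11 = EP8) is the standard FX2 Slave FIFO assignment as recalled from the chip documentation and should be confirmed there; the roles come from the text, and the function name is hypothetical.

```python
# FIFOADR[1:0] encoding for the four user endpoints (verify against the FX2 TRM).
FIFOADR_TO_ENDPOINT = {
    0b00: "EP2",
    0b01: "EP4",
    0b10: "EP6",
    0b11: "EP8",
}

# Endpoint roles as assigned in this design.
ENDPOINT_ROLE = {
    "EP2": "command headers",
    "EP4": "source stream files",
    "EP6": "commands",
    "EP8": "decoding result files",
}


def fifoadr_for(endpoint: str) -> int:
    """Return the FIFOADR[1:0] value that selects the given endpoint."""
    for code, name in FIFOADR_TO_ENDPOINT.items():
        if name == endpoint:
            return code
    raise ValueError(f"unknown endpoint: {endpoint}")


for code, name in FIFOADR_TO_ENDPOINT.items():
    print(f"FIFOADR={code:02b} selects {name} ({ENDPOINT_ROLE[name]})")
```

The FPGA would drive this two-bit code before asserting SLOE/SLRD or SLWR, so that the strobes act on the intended endpoint FIFO.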
Within the FPGA interface, this design also defines an internal FIFO that is 256 words deep and 32 bits wide. The reason is that the SRAM and DRAM are accessed constantly in this design, so the memory is often busy and the USB cannot always access it when needed. The FPGA interface therefore first assembles several FD transfers (8 or 16 bits each) into 32-bit words, stores them in the internal FIFO, and moves them to memory when the SRAM and DRAM are idle. This makes USB transfer independent of the operating state of the memory and further raises the USB transfer speed to meet the transmission requirements.
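The assembly of FD transfers into 32-bit words can be sketched as follows. This is a software model, not the FPGA logic itself; the function name is hypothetical, and the ordering (first transfer placed in the least significant bits) is an assumption, since the article does not state the byte order used.

```python
def pack_words(samples, sample_bits=16):
    """Assemble 8- or 16-bit FD transfers into 32-bit words.

    Assumption: the first transfer lands in the least significant bits.
    Returns (complete_words, leftover_samples).
    """
    if sample_bits not in (8, 16):
        raise ValueError("FD bus is 8 or 16 bits wide")
    per_word = 32 // sample_bits
    mask = (1 << sample_bits) - 1
    words = []
    # Only complete groups of `per_word` transfers form a 32-bit word.
    for i in range(0, len(samples) - per_word + 1, per_word):
        word = 0
        for j in range(per_word):
            word |= (samples[i + j] & mask) << (j * sample_bits)
        words.append(word)
    # Transfers that do not yet fill a word wait for more data.
    leftover = list(samples[len(words) * per_word:])
    return words, leftover


if __name__ == "__main__":
    words, rest = pack_words([0x3412, 0x7856], sample_bits=16)
    print([hex(w) for w in words], rest)
```

With a 16-bit FD bus, two transfers make one internal FIFO word; with an 8-bit bus, four are needed. The completed words would then sit in the 256-deep internal FIFO until the SRAM and DRAM are free.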
Design verification and results analysis
With a USB driver developed for the Windows operating system, this design successfully uses the EZ-USB chip to transfer video data to and from the Virtex-II FPGA. At an FPGA clock frequency of 66 MHz, real-time transmission and decoding of H.264 video is achieved. In transfer-rate tests, the USB link reaches 33 MB/s or more, which fully meets the requirements of the decoder.
Design analysis: this design uses two levels of FIFO, which takes full advantage of the speed of USB 2.0. It removes the bottleneck between transmission and decoding, so that the two connect seamlessly. The drawback is that, because of the restrictions of the USB chip's Slave FIFO mode, the PC and the decoder must communicate through command exchanges, which consume a certain amount of bandwidth. When commands are very frequent, efficiency is low, but the effect on large-volume data transfers is small.
Conclusion
The verification platform works successfully. Tests with source stream files of various compression ratios achieved an average of 33 MB/s and a peak of 40 MB/s, meeting and exceeding the design requirements.
Editor in charge: GT