1 FPGA floating-point operations break new ground
In the past, floating-point operations on FPGAs had to normalize and denormalize the data at every operation in order to comply with the IEEE 754 standard, which created a major performance bottleneck. These normalization and denormalization steps are generally implemented in FPGAs as large barrel shifters, which consume a large amount of logic and routing resources. Typically a single floating-point adder requires about 500 lookup tables (LUTs), single-precision floating point can occupy more than 30% of the LUTs, and more complex mathematical functions such as exponentiation and the natural logarithm require approximately 1,000 LUTs. As a result, FPGA performance deteriorates significantly as DSP algorithms grow: an FPGA with 80% to 90% of its logic resources in use suffers severe routing congestion, which hinders fast interconnect and ultimately compromises timing closure.
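To make the cost of these steps concrete, the following is a minimal Python sketch of a floating-point addition, not Altera's hardware design. It assumes a simplified (sign, mantissa, exponent) representation with a 24-bit mantissa, truncation instead of IEEE 754 rounding, and no special values; the names `normalize` and `fp_add` are illustrative. Each addition needs two variable-distance shifts, one to align the operands and one to normalize the result, and in hardware each of them is a barrel shifter.

```python
MANT_BITS = 24  # significand width of IEEE 754 single precision, hidden bit included


def normalize(mant, exp):
    """Shift a non-negative mantissa so its top set bit is bit MANT_BITS - 1.

    Returns (mant, exp, shift): shift > 0 is a right shift, shift < 0 a left shift.
    """
    if mant == 0:
        return 0, 0, 0
    shift = mant.bit_length() - MANT_BITS
    # Simplification: truncate instead of applying IEEE 754 rounding.
    mant = mant >> shift if shift > 0 else mant << -shift
    return mant, exp + shift, shift


def fp_add(a, b):
    """Add two numbers given as (sign, mant, exp), value = sign * mant * 2**exp."""
    (sa, ma, ea), (sb, mb, eb) = a, b
    if ea < eb:  # keep the operand with the larger exponent in a
        (sa, ma, ea), (sb, mb, eb) = (sb, mb, eb), (sa, ma, ea)
    align_shift = ea - eb
    mb >>= align_shift  # first variable-distance shift: alignment
    total = sa * ma + sb * mb
    sign = -1 if total < 0 else 1
    # Second variable-distance shift: normalization of the result.
    mant, exp, norm_shift = normalize(abs(total), ea)
    return (sign, mant, exp), align_shift, norm_shift
```

Subtracting two nearly equal values, for example `fp_add((1, 0x800001, 0), (-1, 0x800000, 0))`, forces a 23-bit left shift during normalization. The shifter therefore has to cover the whole mantissa width.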
To solve these problems, Altera introduced the fused datapath design flow in the DSP Builder module library in 2010. It combines the basic operators of a function or datapath into a single unit, analyzes the bit growth along the datapath to choose where inputs are normalized, allocates sufficient precision to the datapath, and eliminates as many normalization and denormalization steps as possible. This optimization combines the fixed-point DSP blocks with programmable soft logic and so avoids most of the barrel shifters. Compared with an equivalent datapath built from individual IEEE 754 operators, it reduces logic by 50% and latency by 50%. Moreover, the accuracy of this method is generally higher than that of a library of basic IEEE 754 floating-point operators.
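The idea of removing intermediate normalization can be sketched in Python with a dot product. This is an illustration under simplifying assumptions, not the DSP Builder implementation: values are (signed mantissa, exponent) pairs, rounding is plain truncation, and Python's unbounded integers stand in for a datapath whose width would be fixed by bit-growth analysis. All function names are hypothetical.

```python
MANT_BITS = 24  # working significand width, as in IEEE 754 single precision


def normalize(mant, exp):
    """Truncate or extend a signed mantissa to MANT_BITS significant bits."""
    if mant == 0:
        return 0, 0
    shift = abs(mant).bit_length() - MANT_BITS
    if shift > 0:
        mant = (abs(mant) >> shift) * (1 if mant > 0 else -1)
    else:
        mant <<= -shift
    return mant, exp + shift


def add_exact(a, b):
    """Align two (mant, exp) values to the smaller exponent and add without rounding."""
    (ma, ea), (mb, eb) = a, b
    exp = min(ea, eb)
    return (ma << (ea - exp)) + (mb << (eb - exp)), exp


def dot_per_operator(xs, ys):
    """Chain separate operators: normalize after every multiply and every add."""
    acc, steps = (0, 0), 0
    for (mx, ex), (my, ey) in zip(xs, ys):
        prod = normalize(mx * my, ex + ey)
        acc = normalize(*add_exact(acc, prod))
        steps += 2
    return acc, steps


def dot_fused(xs, ys):
    """Keep full-width intermediates and normalize once at the output."""
    acc = (0, 0)
    for (mx, ex), (my, ey) in zip(xs, ys):
        acc = add_exact(acc, (mx * my, ex + ey))
    return normalize(*acc), 1


def value(x):
    """Convert a (mant, exp) pair to a Python float for inspection."""
    mant, exp = x
    return mant * 2.0 ** exp
```

For a dot product of length n, the chained version performs 2n normalizations and the fused version performs one. Because the fused version rounds only once, its result is at least as accurate, which matches the accuracy claim above.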
Before the hard floating-point DSP blocks of the Altera Arria 10 and Stratix 10 devices arrived, the fused datapath method led the industry in floating-point performance and efficiency. Table 1 shows the results of a Cholesky solver for systems of the form AX = B running on the Stratix V edition of the DSP development kit, designed with the fused datapath flow in the DSP Builder module library. The input matrices of a Cholesky solver are generally large and the latency is long, which makes the solver hard to implement in FPGA hardware. With the fused datapath flow, however, the logic for the floating-point computation is only 3 to 4 times that of a basic floating-point multiplier, and one result is produced every clock cycle, so Cholesky solvers of the corresponding sizes become feasible.
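For readers unfamiliar with the algorithm, here is a minimal software sketch of a Cholesky solve for a single right-hand side, written in plain Python. It is the textbook method, not the pipelined FPGA design measured in Table 1, and it assumes that A is symmetric positive definite. The inner sums are dot products, which is the kind of datapath the fused flow targets.

```python
import math


def cholesky(A):
    """Return lower-triangular L with A = L * L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))  # dot product of two rows
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def cholesky_solve(A, b):
    """Solve A x = b via forward substitution (L y = b), then back substitution (L^T x = y)."""
    L = cholesky(A)
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x
```

For example, `cholesky_solve([[4.0, 2.0], [2.0, 3.0]], [8.0, 8.0])` returns a vector close to `[1.0, 2.0]`.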
With advances in the Altera DSP block architecture and continuous optimization of the software tools, Altera has enabled high-performance floating-point operations in FPGAs, and the hard floating-point DSP blocks in Altera's Arria 10 and Stratix 10 devices are representative of the industry's floating-point solutions.
2 Hard floating-point DSP improves design performance and accelerates time to market
The hard floating-point DSP blocks in the Arria 10 and Stratix 10 devices not only improve performance but also shorten time to market.
The performance improvement shows in three respects:
The first is saving logic resources. With the hard floating-point DSP blocks in Arria 10 and Stratix 10 devices, FPGA systems overcome the limitations described above. Previously, floating-point functions had to be built from fixed-point multipliers and FPGA logic. Altera's hard floating-point DSP blocks need almost none of the logic resources that existing FPGA floating-point implementations consume: the barrel shifters are implemented inside the hard DSP block, so normalization and denormalization no longer use up valuable FPGA resources. This architecture not only saves a large amount of logic; timing closure and FMAX are also no longer limited by routing, so an FPGA with 80% to 90% of its logic resources in use can still sustain a high FMAX.
The second is improving numerical accuracy. The hard floating-point DSP block supports a range of floating-point operations, including multiplication and addition, and its outputs comply with the IEEE 754 standard, which ensures consistency in applications with high precision requirements. Earlier FPGA floating-point implementations used a two's complement representation on their internal datapaths and converted to and from IEEE 754 format only at the inputs and outputs of an algorithm. This was key to saving barrel-shifter resources, but it meant that the actual output values deviated from those of the MATLAB/Simulink model. With the hard floating-point blocks in Arria 10 and Stratix 10 devices, the actual output values are highly consistent with the Simulink model.
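A small Python sketch shows why an internal format that differs from IEEE 754 can disagree with a single-precision reference model even when it is more accurate. As an assumption for illustration, double precision stands in for the wider internal representation, and `to_f32` emulates single-precision rounding through the standard `struct` module; the function names are hypothetical.

```python
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_per_operation(values):
    """Round to single precision after every addition, as a chain of IEEE 754 adders would."""
    acc = 0.0
    for v in values:
        acc = to_f32(acc + to_f32(v))
    return acc


def sum_wide_then_round(values):
    """Accumulate in a wider format and convert to single precision only at the output."""
    return to_f32(sum(values))
```

With the inputs `[16777216.0, 1.0, 1.0]`, per-operation rounding yields 16777216.0, because each added 1.0 is lost to rounding at 2^24. The wide accumulation yields 16777218.0. Both results are defensible, but only the first matches a model that rounds after every operation.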
The third is improving energy efficiency. Arria 10 and Stratix 10 devices also bring energy-efficient floating point to the FPGA industry, at 50 GFLOPS and 100 GFLOPS per watt respectively. The hard blocks greatly reduce the logic and routing resources that floating-point operations used to require, which in turn greatly reduces core power consumption.
As for accelerating time to market, the hard floating-point DSP blocks integrated in the FPGA support many common DSP modeling and simulation environments and map floating-point operations seamlessly. In applications ranging from military radar to communication systems, Arria 10 and Stratix 10 devices give designers a more efficient design flow that shortens development by 6 to 12 months on average.

One reason is that no additional conversion step is needed. With previous generations of FPGAs, a design that needed high-performance floating-point operations first had to be converted from floating point to fixed point, implemented in the FPGA, and then analyzed and verified as a fixed-point implementation. This conversion is generally tedious, and once it is complete its accuracy still has to be verified. Any modification to the design means repeating the whole conversion and verification process.

The other reason is that Altera provides easy-to-use design tools. Altera's DSP design tools include DSP Builder for hardware designers and model-based designers, and a software development kit (SDK) for OpenCL for software programmers. With these tools, designers no longer need the floating-point to fixed-point conversion and do not have to debug it during implementation; the path from system definition and simulation to system implementation can be completed in a few minutes. When designing an algorithm with DSP Builder or OpenCL, designers can concentrate on defining and iterating the algorithm rather than on designing hardware, which shortens development and verification time.
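The floating-point to fixed-point conversion described above can be illustrated with a small Python sketch. It assumes a signed 16-bit format with 15 fractional bits (Q1.15) and a three-tap filter; both are chosen for illustration and do not come from the article. The sketch quantizes the data, computes in integers, and compares the result against the floating-point reference, which is the verification step that has to be repeated after every design change.

```python
FRAC_BITS = 15  # assumed Q1.15 format: 1 sign bit, 15 fractional bits


def to_fixed(x):
    """Quantize a real value to a signed 16-bit Q1.15 integer with saturation."""
    scaled = round(x * (1 << FRAC_BITS))
    return max(-(1 << 15), min((1 << 15) - 1, scaled))


def fir_float(coeffs, samples):
    """Floating-point reference: one output of a FIR filter (a dot product)."""
    return sum(c * s for c, s in zip(coeffs, samples))


def fir_fixed(coeffs, samples):
    """Fixed-point version: integer multiply-accumulate, scaled back at the end."""
    acc = sum(to_fixed(c) * to_fixed(s) for c, s in zip(coeffs, samples))
    return acc / (1 << (2 * FRAC_BITS))  # each product carries 2 * FRAC_BITS fractional bits


def conversion_error(coeffs, samples):
    """Verification step: absolute difference between fixed-point and reference results."""
    return abs(fir_fixed(coeffs, samples) - fir_float(coeffs, samples))
```

For coefficients `[0.25, 0.5, 0.25]` and samples `[0.1, -0.2, 0.3]`, the error stays below 2^-15 but is not zero, and a value of 1.0 saturates to 32767. Effects like these are what the manual conversion flow has to analyze and re-verify.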
Source: Wiku Electronic Market Network