0 Introduction
Compared with H.264/AVC, the newer H.265/HEVC video compression standard achieves a lower bit rate, that is, a higher compression ratio, at the same image quality. Because of the visual characteristics of the human eye, bit allocation across different regions is a key problem in dynamic rate coding. If the video can be divided into a region of interest (ROI) and an ordinary region during coding, and the bit rate allocated to each can be adjusted dynamically, better subjective video quality can be obtained at the same or even a lower bit rate, improving the user experience. The speed and quality of ROI extraction strongly affect the coding algorithm. It is therefore particularly important to achieve low-complexity, high-quality ROI extraction and rate allocation that match the characteristics of H.265/HEVC video coding.
In earlier work, ROI extraction and rate allocation were applied to JPEG 2000 still image compression, improving the image quality of the ROI and saving bit rate. A VLSI hardware design of ROI extraction on FPGA achieved satisfactory results without significantly increasing the image coding time, but the system can only be used for still image coding. Two further works proposed ROI-based rate control for H.265/HEVC, that is, a compression performance optimization method, and achieved some results. They show that, although H.265/HEVC already reduces the bit rate compared with H.264/AVC, ROI rate control is still effective for the HEVC standard; however, the impact of the complexity of the ROI extraction algorithm on the coding rate is not considered. Another work uses a Gaussian background model to build a virtual background frame, which reduces the bit rate of H.265/HEVC coding, but it considers neither ROI variable quality coding adapted to human visual characteristics nor the influence of the efficiency of background frame construction on the encoder rate.
Based on the block-based nature of video coding algorithms and the fine-grained parallelism of FPGAs, this paper proposes a Gaussian background modeling ROI mapping method based on block matching, and implements and verifies the algorithm in hardware on an FPGA platform using an HLS tool. The FPGA processing speed reaches 22 fps at 1080p, variable quality coding of the ROI-mapped CTU areas saves about 10% of the bit rate on average, and the overall video quality remains stable.
1 Gaussian background modeling and its improvement for video coding
1.1 Basic principle of pixel-based Gaussian background modeling
Gaussian background modeling is a background modeling method based on a probability model. The traditional Gaussian background modeling algorithm is pixel based. A frame in a digital video can be regarded as a two-dimensional discrete function f(x, y, t) of the spatiotemporal position (x, y, t). For a given channel in a given color space, f has a unique value at a given (x0, y0, t0). For a given time t0, f can be regarded as a two-dimensional random field, which is generally assumed to be stationary.
From a statistical point of view, the appearance and movement of foreground objects are temporary and sudden, while the background is long-term and stable. For a given position (x0, y0), f(x0, y0, t) follows a certain probability distribution over time t, which is usually taken to be Gaussian.
The expression of the Gaussian background model is:
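A commonly used single-Gaussian form, consistent with the parameters μ, σ, λ and α named later in the text, is sketched here. The roles assigned to λ (matching threshold in units of σ) and α (learning rate) are assumptions, and the paper's own expressions may differ in detail.

```latex
\begin{align}
  % Probability density of the pixel value at a fixed position (x_0, y_0)
  P\bigl(f(x_0,y_0,t)\bigr) &= \frac{1}{\sqrt{2\pi}\,\sigma}
      \exp\!\left(-\frac{\bigl(f(x_0,y_0,t)-\mu\bigr)^{2}}{2\sigma^{2}}\right) \\
  % Background test: the pixel matches the model if it lies within \lambda standard deviations
  \bigl|f(x_0,y_0,t)-\mu_{t-1}\bigr| &\le \lambda\,\sigma_{t-1} \\
  % Running update of mean and variance with learning rate \alpha
  \mu_t &= (1-\alpha)\,\mu_{t-1} + \alpha\,f(x_0,y_0,t) \\
  \sigma_t^{2} &= (1-\alpha)\,\sigma_{t-1}^{2} + \alpha\,\bigl(f(x_0,y_0,t)-\mu_t\bigr)^{2}
\end{align}
```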
1.2 Gaussian background modeling ROI mapping algorithm based on block matching
As the expression of the original Gaussian background model shows, the pixel-based Gaussian background modeling algorithm requires a large number of complex floating-point calculations. It generally takes hundreds of frames to complete the model, which is time-consuming and poorly suited to hardware implementation.
The Gaussian background modeling method considers only the temporal correlation of pixels at the same position and treats all pixels as isolated points. On the one hand, this requires a lot of repetitive calculation; on the other hand, it produces "false alarms" when the background changes.
Video sequences contain spatial redundancy, temporal redundancy and knowledge redundancy. To exploit the spatial redundancy within a frame, video coding algorithms perform intra prediction block by block, and transform and quantize the residual between the predicted value and the original value to compress the video.
In this paper, block matching replaces the pixel matching and updating of the original Gaussian background modeling, and a Gaussian background modeling ROI extraction algorithm based on block matching is proposed. On the one hand, block-based background modeling avoids a large number of the operations required by the pixel-based algorithm; on the other hand, it unifies the establishment of the background with the block partitioning of video coding.
After the background is established by Gaussian modeling, each new video frame is divided into n × n blocks, and foreground and background blocks are determined according to the sum of absolute differences (SAD) criterion. The SAD discrimination expression is given in equation (5), where B represents the established background block and C represents the pixel block at the corresponding position in the current video frame. In this paper, n = 8.
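A minimal sketch of the SAD test for one block may help here. SAD is the sum of |B(i, j) - C(i, j)| over the n × n block. The decision threshold is not given in the text, so the `THRESHOLD` value and the function names below are hypothetical.

```python
N = 8  # block size used in the text

def sad(block_b, block_c):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(b - c)
               for row_b, row_c in zip(block_b, block_c)
               for b, c in zip(row_b, row_c))

def is_foreground(block_b, block_c, threshold):
    """A block is treated as foreground when its SAD exceeds the threshold."""
    return sad(block_b, block_c) > threshold

# Hypothetical threshold: an average absolute difference of 10 per pixel.
THRESHOLD = 10 * N * N

background = [[100] * N for _ in range(N)]    # established background block B
static_block = [[101] * N for _ in range(N)]  # current block C, nearly unchanged
moving_block = [[160] * N for _ in range(N)]  # current block C, clearly changed

print(is_foreground(background, static_block, THRESHOLD))
print(is_foreground(background, moving_block, THRESHOLD))
```

Because SAD needs only integer subtraction, absolute value and accumulation, it avoids the floating-point work of the per-pixel model, which is the motivation given above.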
The basic steps are described as follows:
Step 1: video block division. Divide the original video into disjoint sub-regions of size n × n.
Step 2: model initialization. For each block region, initialize the basic parameters of the Gaussian model: μ, σ, λ and α.
Step 3: frame count determination. Read in the video. If the number of video frames meets the update cycle P, go to step 4; otherwise go to step 5.
Step 4: model update. Update the block background model.
Step 5: foreground/background determination. Separate foreground from background according to the SAD criterion.
Step 6: ROI area mapping. Map the CTUs in the video according to the distribution of foreground blocks. In this paper, the HEVC CTU size is set to 32 × 32, and the mapping result is sent to the H.265/HEVC encoder.
The algorithm flow is shown in Figure 1.
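The steps above can be sketched in a few lines of plain Python. This is a simplified illustration, not the paper's implementation: the background keeps only a running mean (the variance and the λ test are omitted), a CTU is marked as ROI if it contains at least one foreground block, and the values of `P`, `ALPHA` and `SAD_THRESHOLD` are hypothetical.

```python
N = 8                # block size (from the text)
CTU = 32             # CTU size (from the text)
P = 4                # hypothetical update cycle, in frames
ALPHA = 0.05         # hypothetical learning rate
SAD_THRESHOLD = 640  # hypothetical SAD threshold per 8 x 8 block

def update_background(bg, frame, alpha=ALPHA):
    """Step 4: running-average update of the background mean."""
    return [[(1 - alpha) * b + alpha * p for b, p in zip(row_b, row_f)]
            for row_b, row_f in zip(bg, frame)]

def foreground_blocks(bg, frame, n=N, threshold=SAD_THRESHOLD):
    """Steps 1 and 5: split into n x n blocks and apply the SAD criterion."""
    height, width = len(frame), len(frame[0])
    fg = set()
    for y in range(0, height, n):
        for x in range(0, width, n):
            s = sum(abs(bg[j][i] - frame[j][i])
                    for j in range(y, min(y + n, height))
                    for i in range(x, min(x + n, width)))
            if s > threshold:
                fg.add((y // n, x // n))
    return fg

def map_to_ctus(fg_blocks, n=N, ctu=CTU):
    """Step 6: a CTU is ROI if it contains at least one foreground block."""
    ratio = ctu // n
    return {(by // ratio, bx // ratio) for by, bx in fg_blocks}

def process(frames, period=P):
    """Return the set of ROI CTU coordinates (row, col) for each frame after the first."""
    bg = [[float(p) for p in row] for row in frames[0]]  # Step 2: initialize
    results = []
    for idx, frame in enumerate(frames[1:], start=1):
        if idx % period == 0:                            # Step 3: frame count check
            bg = update_background(bg, frame)            # Step 4
        results.append(map_to_ctus(foreground_blocks(bg, frame)))  # Steps 5 and 6
    return results

# Demo: a 64 x 64 static scene with one bright 8 x 8 object in the second frame.
f0 = [[50] * 64 for _ in range(64)]
f1 = [row[:] for row in f0]
for y in range(40, 48):
    for x in range(8, 16):
        f1[y][x] = 200
print(process([f0, f1]))
```

In the demo the object occupies block (5, 1), which falls inside the CTU at row 1, column 0, so only that CTU is reported as ROI.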
2 ROI region adaptive coding based on rate distortion optimization
2.1 Rate distortion optimization for ROI region
To reduce the bit rate while achieving better image quality, rate distortion optimization (RDO) can be defined as the following optimization problem: subject to the bit rate R ≤ Rmax, adjust the coding algorithm to minimize the distortion D, that is:
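The standard textbook form of this problem is sketched below; it is not a reproduction of the paper's numbered equations. The symbols λ_RD (a Lagrange multiplier, unrelated to the λ of the Gaussian model) and K (the number of coding blocks) are introduced here for illustration.

```latex
\begin{align*}
  % Constrained form: minimize distortion under a rate budget
  &\min D \quad \text{subject to} \quad R \le R_{\max} \\
  % Equivalent unconstrained Lagrangian form
  &\min J, \qquad J = D + \lambda_{\mathrm{RD}}\, R \\
  % If the K coding blocks are treated as independent, the cost separates per block
  &\min \sum_{i=1}^{K} J_i, \qquad J_i = D_i + \lambda_{\mathrm{RD}}\, R_i
\end{align*}
```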
Equation (8) is usually used as the basis for RDO, but in practice the coding blocks are often not independent of each other, so the value obtained is only a locally optimal solution.
In this paper, by dividing the frame into ROI regions and assuming that ROI regions and non-ROI regions are independent and identically distributed within a frame, the rate distortion optimization function can be described as:
Since equation (9) takes the correlation of coding blocks into account, it can avoid falling into a local optimum to a certain extent. According to this analysis, equation (9) yields a better solution than equation (8).
Furthermore, in terms of subjective video quality, the human eye expects better quality in the ROI area. This paper therefore adds a constraint in the implementation:
2.2 HEVC coding integrating ROI extraction
In this paper, the ROI region is sent to the HEVC encoder for variable quality coding. To prevent visible blocking artifacts caused by a large difference in coding parameters between the ROI region and the surrounding non-ROI region, the quantization parameters are adjusted by nonlinear compensation. The specific method is as follows.
Denote the quantization parameter of coding block A, where the ROI area is located, as Q1, and the quantization parameter of coding block B in the nearby non-ROI area as Q2. The center point coordinates of A are denoted (xA, yA), and those of B are denoted (xB, yB). Then Q1, Q2 and the Hamming distance D between the center positions of A and B should satisfy the following relationship:
3 Hardware design and implementation
To demonstrate the effectiveness of this method, the Gaussian background modeling ROI algorithm based on block matching is implemented in hardware and embedded into the HEVC coding process.
In this paper, the hardware design of ROI region mapping and adaptive coding based on background modeling is carried out with a high-level synthesis (HLS) tool on the Xilinx MPSoC platform ZCU102. The HLS tool maps a high-level description in C/C++ to a hardware description language (VHDL or Verilog), which improves development efficiency.
The hardware includes three modules: background establishment, background update, and ROI determination and mapping. The mapping results are finally sent to the video encoder. The basic structure is shown in Figure 2.
The original video data is cached in DDR, and access is accelerated through line buffers in the FPGA. Under the control of a frame counter, the video data multiplexer sends the video to the different processing units, which map the ROI area to the coding tree units (CTUs) of the H.265 standard and send the mapping results to the H.265 encoder. In the encoder, ROI-adaptive QP adjustment is carried out according to the type of each region, and the encoded bitstream is finally written back to DDR.
4 Experimental results and analysis
4.1 Experimental environment
The experiments in this paper are carried out on the Xilinx ZCU102 embedded development platform. The ZCU102 is equipped with a Zynq UltraScale+ XCZU9EG-2FFVB1156 chip, whose internal architecture mainly comprises the processing system (PS) and the programmable logic (PL).
The consumption of PL hardware resources is shown in Table 1. To allow for scalability, the image resolution in the hardware design is configurable, with a maximum resolution of 1920 × 1080.
4.2 Background modeling effect and ROI mapping results
Figure 3 shows the FPGA-based background modeling and ROI mapping results. The sequence used is the HEVC standard test sequence BasketballDrill_832x480_50.yuv. Fig. 3(a) is the 201st frame of the video sequence, Fig. 3(b) is the background frame modeled from the first 200 frames, and Fig. 3(c) is the mapping result for the HEVC CTUs, in which the white area is the mapped ROI. The moving players in the video are accurately mapped to areas bounded by the CTU size. The background in the original sequence changes over time (for example, the basket shakes under the impact of the basketball), but these changes do not affect the mapping of the ROI area (i.e. there is no "false alarm"), which shows that the algorithm has a certain robustness.
Table 2 compares the processing speed at different resolutions. The clock frequency of the PL part is 120 MHz. The table shows that the design in this paper still achieves high real-time performance at a resolution of 1920 × 1080.
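As a rough check on this claim, the sketch below estimates the PL clock budget per pixel and per 8 × 8 block. It assumes that the 22 fps figure quoted in the introduction for 1080p is obtained at the 120 MHz clock stated here; the variable names are illustrative.

```python
CLOCK_HZ = 120_000_000      # PL clock frequency (from the text)
WIDTH, HEIGHT = 1920, 1080  # maximum supported resolution (from the text)
FPS = 22                    # reported 1080p processing speed (from the introduction)
BLOCK = 8                   # block size n used by the algorithm

# Pixel throughput the design must sustain.
pixels_per_second = WIDTH * HEIGHT * FPS

# Clock cycles available per pixel and per n x n block at that throughput.
cycles_per_pixel = CLOCK_HZ / pixels_per_second
cycles_per_block = cycles_per_pixel * BLOCK * BLOCK

print(f"{pixels_per_second / 1e6:.1f} Mpixel/s")
print(f"{cycles_per_pixel:.2f} cycles per pixel")
print(f"{cycles_per_block:.1f} cycles per {BLOCK}x{BLOCK} block")
```

Under these assumptions the budget is roughly 2.6 clock cycles per pixel, or about 168 cycles per 8 × 8 block.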
4.3 Performance evaluation of HEVC video coding embedded with ROI rate control
To further demonstrate the effectiveness of HEVC coding with embedded ROI regions, the coding results of the HEVC encoder are verified experimentally. Test sequences with different resolutions and scenes are selected, and the changes in overall bit rate and PSNR are calculated. The results are shown in Table 3.
As Table 3 shows, when the proposed background modeling ROI mapping algorithm is used for rate control, the overall PSNR of the encoded image does not change greatly, while the bit rate is reduced by about 10% on average, which verifies the effectiveness of the algorithm for rate control.
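The two metrics compared in Table 3 can be computed as in the sketch below, which uses the standard PSNR definition for 8-bit samples and a relative bit rate saving against an anchor encoding. The function names and the numbers in the usage lines are illustrative only and are not the paper's measurements.

```python
import math

def psnr(ref, test, peak=255):
    """PSNR in dB between two equally sized frames (lists of pixel rows)."""
    count = len(ref) * len(ref[0])
    mse = sum((a - b) ** 2
              for row_a, row_b in zip(ref, test)
              for a, b in zip(row_a, row_b)) / count
    # Identical frames have zero error, so PSNR is unbounded.
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

def bitrate_saving(anchor_kbps, test_kbps):
    """Bit rate saved relative to the anchor encoding, in percent."""
    return 100.0 * (anchor_kbps - test_kbps) / anchor_kbps

# Illustrative values only.
original = [[100, 100], [100, 100]]
decoded = [[101, 99], [100, 102]]
print(f"PSNR: {psnr(original, decoded):.2f} dB")
print(f"Saving: {bitrate_saving(1000.0, 900.0):.1f} %")
```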
5 Conclusion
Based on the characteristics of block-based video coding algorithms, this paper proposes a block-based Gaussian background modeling ROI mapping method, which is implemented on FPGA using HLS and applied to H.265/HEVC video coding. The experimental results show that the algorithm runs fast on the FPGA platform and can be effectively integrated into an H.265/HEVC hardware encoder. In H.265/HEVC, variable quality coding of the extracted ROI region saves about 10% of the bit rate on average, while the overall video quality remains stable.