"When you see this article, I saw this article written on the author's author. It can be very clear, and the image is introduced in the H.264 / AVC, H.265 / HEVC, H.266 / VVC video coding standard. The evolution process of division technology is analyzed, analyzes different coding standard image division technology.
The current video coding standard uses a block-based hybrid encoding method, which is predicted in intra or inter-frame prediction with a block as a basic unit, and then transforms the predicted residual, and finally block the block mode, prediction. Information and post-quantization residuals are entropy encoded to obtain coded streams. With the continuous evolution of the coding standard, the blocking technology is constantly developing, and the blocking method presents from small pieces to large blocks, from simple to complex division, to achieve the purpose of continuous improvement of the editing code.
1. H.264/AVC image partitioning
In the H.264/AVC coding standard, the input image is divided into fixed-size blocks that serve as the basic coding unit. Each such block is called a macroblock (MB) and consists of one luma block and two chroma blocks. The luma block is 16 × 16. For simplicity, block sizes in this article refer to the luma block; with 4:2:0 sampling, each chroma block is half the luma block's width and height.
During prediction, the macroblock is further divided into smaller blocks according to the prediction mode. For intra prediction, a macroblock can be divided into blocks of 16 × 16, 8 × 8 or 4 × 4, each of which is intra predicted separately. Inter prediction uses two levels of partitioning. First, the macroblock can be divided into partitions of 16 × 16, 16 × 8, 8 × 16 or 8 × 8. If 8 × 8 is chosen, each of the four 8 × 8 blocks is called a sub-macroblock and can be divided again into blocks of 8 × 8, 8 × 4, 4 × 8 or 4 × 4. Each block is motion compensated independently, but all blocks within one sub-macroblock must refer to the same reference frame.
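The two-level inter partitioning just described can be sketched in a few lines of Python. This is only an illustration of the block sizes listed above, not code from any H.264 encoder or decoder; the names `MB_PARTITIONS`, `SUB_MB_PARTITIONS`, `blocks_in` and `inter_blocks` are hypothetical.

```python
# Illustrative sketch of H.264/AVC two-level inter partitioning.
# All names are hypothetical; sizes are luma (width, height).

MB_PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8)]   # macroblock level
SUB_MB_PARTITIONS = [(8, 8), (8, 4), (4, 8), (4, 4)]   # sub-macroblock level


def blocks_in(parent_size, part):
    """Number of partitions of size `part` that tile a square parent."""
    w, h = part
    return (parent_size // w) * (parent_size // h)


def inter_blocks(mb_part, sub_parts=None):
    """Return the luma block sizes that make up one macroblock.

    mb_part   -- macroblock-level partition, one of MB_PARTITIONS
    sub_parts -- only for the 8x8 case: one entry from SUB_MB_PARTITIONS
                 for each of the four sub-macroblocks
    """
    if mb_part not in MB_PARTITIONS:
        raise ValueError("unsupported macroblock partition")
    if mb_part != (8, 8):
        return [mb_part] * blocks_in(16, mb_part)
    if sub_parts is None:
        sub_parts = [(8, 8)] * 4
    if len(sub_parts) != 4 or any(p not in SUB_MB_PARTITIONS for p in sub_parts):
        raise ValueError("need four valid sub-macroblock partitions")
    blocks = []
    for part in sub_parts:
        # Blocks inside one sub-macroblock share a reference frame.
        blocks += [part] * blocks_in(8, part)
    return blocks
```

For example, `inter_blocks((8, 8), [(4, 4)] * 4)` yields sixteen 4 × 4 blocks, the finest inter partitioning of a macroblock.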
During transform and quantization, the macroblock is divided into blocks of 4 × 4 or 8 × 8; the prediction residual in each block is transformed and quantized separately to obtain the quantized transform coefficients.
In addition, to limit error propagation, H.264/AVC introduces the concept of the slice: a picture is divided into one or more slices, and the macroblocks in each slice are coded in raster-scan order. Each slice can be coded and transmitted independently of the other slices, which also helps parallel processing.
2. H.265/HEVC image partitioning
The coding framework of HEVC is essentially the same as that of H.264/AVC, namely block-based hybrid coding with prediction plus transform. Its modules are also the same as before: intra prediction, inter prediction, motion estimation and motion compensation, DCT-based transform, quantization, loop filtering, entropy coding and reconstruction. Compared with H.264/AVC, however, HEVC makes important improvements in every coding stage.
2.1 CTU partitioning
An image is divided into coding tree units (CTU, Coding Tree Unit). The CTU is the basic coding unit and corresponds conceptually to the macroblock in H.264/AVC. A CTU consists of one luma coding tree block (CTB, Coding Tree Block) and two chroma coding tree blocks; the maximum CTU size in the H.265/HEVC standard is 64 × 64.
2.2 Slice and tile partitioning
Each slice in the H.265/HEVC standard can be further divided into slice segments, of which there are two kinds: independent slice segments and dependent slice segments. A slice contains one or more slice segments, starting with one independent slice segment followed by any number of dependent slice segments. The syntax of an independent slice segment does not depend on preceding slice segments, whereas some syntax element values of a dependent slice segment must be derived from the slice segment header of the preceding independent slice segment in decoding order.
An example of slice partitioning is given in the figure. The image contains 11 × 9 CTUs and is divided into two slices: the first slice consists of one independent slice segment and two dependent slice segments, and the second slice contains a single independent slice segment.
Building on H.264/AVC, the H.265/HEVC coding standard also adds the concept of the tile. Unlike the strip-shaped partitioning of slices, tiles divide the image horizontally and vertically into multiple rectangular regions. Each tile contains one or more CTUs, which are coded in raster-scan order within the tile; this further improves parallel processing capability. In one example, the image is divided vertically into two tiles and contains a single slice made up of one independent slice segment and four dependent slice segments; in another, the first tile contains two slices and the second tile contains one slice.
2.3 CU partitioning and coding tree structure
To adapt to varied video content, the CTU is split by a quadtree (QT, Quad Tree) into coding units (CU, Coding Unit); the CU is the basic unit of intra/inter coding. A CU contains one luma coding block (CB, Coding Block), two chroma coding blocks and the associated syntax structures. The maximum CU size is the CTU size, and the minimum CU size is 8 × 8.
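The recursive quadtree split from CTU down to CUs can be sketched as below. This is a minimal illustration, assuming a hypothetical `should_split` callback that stands in for the encoder's mode decision (which the article does not describe); the function name `quadtree_cus` is also hypothetical.

```python
def quadtree_cus(x, y, size, should_split, min_cu=8):
    """Recursively split a square block into leaf CUs.

    x, y         -- top-left luma position of the block
    size         -- block width/height (starts at the CTU size, e.g. 64)
    should_split -- hypothetical callback (x, y, size) -> bool standing in
                    for the encoder's decision
    min_cu       -- minimum CU size (8 in the text above)

    Returns a list of (x, y, size) tuples, visited top-left, top-right,
    bottom-left, bottom-right at every level.
    """
    if size > min_cu and should_split(x, y, size):
        half = size // 2
        cus = []
        for dy in (0, half):
            for dx in (0, half):
                cus += quadtree_cus(x + dx, y + dy, half, should_split, min_cu)
        return cus
    return [(x, y, size)]


# Example: keep splitting only the block that touches the CTU origin.
leaves = quadtree_cus(0, 0, 64, lambda x, y, size: x == 0 and y == 0)
```

In this example the top-left corner is refined down to 8 × 8 while the rest of the CTU stays in larger CUs, which is the kind of content-adaptive partition the coding tree is meant to allow.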
2.4 PU partitioning
The leaf-node CUs produced by the coding tree of Section 2.3 fall into three types according to the prediction method: intra-predicted (intra) CUs, inter-predicted (inter) CUs and skipped CUs. A skipped CU can be seen as a special case of an inter CU for which no motion vector difference and no residual information are coded. A leaf-node CU contains one or more prediction units (PU, Prediction Unit). H.265/HEVC supports PU sizes from 4 × 4 to 64 × 64, with eight partition modes.
For intra coding there are two possible partition modes: PART_2Nx2N and PART_NxN. Intra prediction may use PART_NxN, which divides a CU into four equal PUs, if and only if the CU is the smallest CU (SCU, Smallest CU).
For inter coding there are eight possible modes: PART_2Nx2N, PART_2NxN, PART_Nx2N, PART_2NxnU, PART_2NxnD, PART_nLx2N, PART_nRx2N and PART_NxN. Of these, PART_2NxnU, PART_2NxnD, PART_nLx2N and PART_nRx2N are asymmetric motion partitions, which are available only when the CU is larger than the SCU. PART_NxN is available if and only if the current CU is the SCU and the SCU is larger than 8 × 8. PUs of 8 × 4 and 4 × 8 may only use uni-directional prediction.
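The PU shapes behind the eight mode names, and the availability rules for inter CUs stated above, can be written out as a short sketch. It assumes the usual reading of the mode names (an asymmetric mode cuts the CU at one quarter of its side); `pu_sizes` and `allowed_inter_modes` are hypothetical helper names, and the sketch ignores any other conditions a real codec applies (for example, whether asymmetric partitions are enabled at all).

```python
def pu_sizes(part_mode, cu_size):
    """Return the (width, height) of each PU of a 2Nx2N CU of side cu_size."""
    n = cu_size // 2   # N
    q = cu_size // 4   # short side of an asymmetric split (assumed N/2)
    table = {
        "PART_2Nx2N": [(cu_size, cu_size)],
        "PART_2NxN": [(cu_size, n)] * 2,
        "PART_Nx2N": [(n, cu_size)] * 2,
        "PART_NxN": [(n, n)] * 4,
        "PART_2NxnU": [(cu_size, q), (cu_size, cu_size - q)],
        "PART_2NxnD": [(cu_size, cu_size - q), (cu_size, q)],
        "PART_nLx2N": [(q, cu_size), (cu_size - q, cu_size)],
        "PART_nRx2N": [(cu_size - q, cu_size), (q, cu_size)],
    }
    return table[part_mode]


def allowed_inter_modes(cu_size, scu_size):
    """Inter partition modes permitted by the rules quoted in the text."""
    modes = ["PART_2Nx2N", "PART_2NxN", "PART_Nx2N"]
    if cu_size > scu_size:
        # Asymmetric motion partitions only for CUs larger than the SCU.
        modes += ["PART_2NxnU", "PART_2NxnD", "PART_nLx2N", "PART_nRx2N"]
    elif scu_size > 8:
        # NxN only for the SCU itself, and only if the SCU exceeds 8x8.
        modes.append("PART_NxN")
    return modes
```

For a 32 × 32 CU, `pu_sizes("PART_2NxnU", 32)` gives a 32 × 8 PU on top of a 32 × 24 PU.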
2.5 TU partitioning and transform tree structure
For the residual signal, the CU is divided into transform units (TU, Transform Unit) using a residual quadtree. The structure is shown in Figure 6.
A TU contains one luma transform block (TB, Transform Block) and two chroma transform blocks. Only square partitioning is allowed: a CB is either kept as one TB or split into four TBs. All samples in a TU share the same transform and quantization process, and the supported TB sizes range from 4 × 4 to 32 × 32. Unlike previous coding standards, in inter prediction a TB may span PB boundaries, which further improves inter coding efficiency.
3. H.266/VVC image partitioning
3.1 CTU partitioning and tree structure
As in HEVC, H.266/VVC first divides the image into coding tree units (CTUs), but the maximum CTU size increases from 64 × 64 in H.265/HEVC to 128 × 128, which suits larger image sizes better.
H.266/VVC adopts a quadtree with a nested multi-type tree (MTT, Multi-Type Tree), where the multi-type tree comprises binary tree (BT, Binary Tree) and ternary tree (TT, Ternary Tree) splits. This unifies the separate CU, PU and TU concepts of H.265/HEVC and supports more flexible CU shapes. The CTU is first split by the quadtree structure, and the quadtree leaf nodes are then split further by the MTT, which offers four split types: vertical binary split (SPLIT_BT_VER), horizontal binary split (SPLIT_BT_HOR), vertical 1:2:1 ternary split (SPLIT_TT_VER) and horizontal 1:2:1 ternary split (SPLIT_TT_HOR), as shown in Figure 6. The leaves of the multi-type tree become coding units (CUs); as long as a CU is not larger than the maximum transform size (64 × 64), it is not divided any further. In most cases, therefore, the CU, PU and TU have the same size.
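The geometry of the four MTT split types can be sketched as follows. This is only an illustration of the binary (1:1) and ternary (1:2:1) splits named above; the function name `mtt_split` and the `(x, y, width, height)` tuple layout are assumptions made for the example, not part of the standard or of VTM.

```python
def mtt_split(x, y, w, h, mode):
    """Return the child blocks (x, y, width, height) of one MTT split."""
    if mode == "SPLIT_BT_VER":       # two halves side by side
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "SPLIT_BT_HOR":       # two halves stacked
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "SPLIT_TT_VER":       # 1:2:1 split of the width
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    if mode == "SPLIT_TT_HOR":       # 1:2:1 split of the height
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    raise ValueError("unknown split mode: " + mode)
```

A vertical ternary split of a 32 × 32 block, for instance, produces blocks that are 8, 16 and 8 samples wide, all 32 samples high.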
During encoding, the partitioning information must be signalled so that the decoder can decode correctly; Figure 7 shows the signalling scheme. The CTU is the root node of the quadtree, and qt_split_cu_flag indicates whether a node is split by the quadtree. The quadtree leaf nodes are split further by the MTT structure: mtt_split_cu_flag indicates whether an MTT split is used; if so, mtt_split_cu_vertical_flag indicates the split direction ("1" for vertical, "0" for horizontal), and mtt_split_cu_binary_flag then indicates whether the split is binary or ternary. Together these flags determine the partitioning of the CTU.
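The way these flags combine into a split decision can be sketched as below. The flag names follow the text; the assumption that a binary flag of 1 means a binary split and 0 a ternary split, and the labels `SPLIT_QT` and `NO_SPLIT`, are choices made for this illustration. The function `split_mode` is hypothetical and does not model how the flags are parsed from a real bitstream.

```python
def split_mode(qt_split_cu_flag, mtt_split_cu_flag=0,
               mtt_split_cu_vertical_flag=0, mtt_split_cu_binary_flag=0):
    """Map the partition flags of one node to a split type name."""
    if qt_split_cu_flag:
        return "SPLIT_QT"            # hypothetical label for a quadtree split
    if not mtt_split_cu_flag:
        return "NO_SPLIT"            # hypothetical label: node is a leaf CU
    direction = "VER" if mtt_split_cu_vertical_flag else "HOR"
    # Assumption: 1 selects the binary tree, 0 the ternary tree.
    kind = "BT" if mtt_split_cu_binary_flag else "TT"
    return "SPLIT_" + kind + "_" + direction
```

For example, `split_mode(0, 1, 1, 0)` describes a node that is not quadtree split, uses an MTT split, is split vertically and is not binary, giving `SPLIT_TT_VER`.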
Figure 8 shows an example partition using the quadtree with nested multi-type tree, which adapts to the image content and so makes partitioning more flexible. The maximum CU size is the CTU size, the minimum CU size is 4 × 4, and the maximum transform block size is 64 × 64.
3.2 Virtual pipeline data unit
Hardware implementations generally use pipelined data processing for efficiency. Non-overlapping units in the image are defined as virtual pipeline data units (VPDU, Virtual Pipeline Data Unit); different VPDUs can be processed by different pipeline stages at the same time, which makes pipelined processing possible. The VPDU size is closely tied to the buffer size of most pipeline stages, so with implementation limits in mind the VPDU is set to the maximum transform unit size, 64 × 64. However, the TT and BT splits used in H.266/VVC can cause the VPDU size to grow, so the latest H.266/VVC reference software platform, VTM6.0, imposes the following restrictions:
TT is disabled when the CU width or height is 128; horizontal BT is disabled when the CU width is 128 and the height is less than 128; vertical BT is disabled when the CU height is 128 and the width is less than 128.
Figure 9 gives examples of the partitioning restrictions in VTM6: a block with width and height of 128 is divided into four 64 × 64 VPDUs, and the splits drawn with red lines are the ones disabled by the restrictions.
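The VPDU-related restrictions can be expressed as a small check. The sketch assumes the restrictions read as follows: ternary splits are disabled when either CU dimension is 128, horizontal binary split is disabled for a CU that is 128 wide and less than 128 high, and vertical binary split is disabled for a CU that is 128 high and less than 128 wide. The function name `allowed_mtt_splits` is hypothetical, and the sketch covers only these VPDU rules, not the other constraints (minimum sizes, tree depth) a real encoder also applies.

```python
def allowed_mtt_splits(width, height):
    """Return the MTT split types left after the assumed VPDU restrictions."""
    splits = {"SPLIT_BT_VER", "SPLIT_BT_HOR", "SPLIT_TT_VER", "SPLIT_TT_HOR"}
    if width == 128 or height == 128:
        # A ternary split of a 128-sample side would straddle 64x64 VPDUs.
        splits -= {"SPLIT_TT_VER", "SPLIT_TT_HOR"}
    if width == 128 and height < 128:
        splits.discard("SPLIT_BT_HOR")
    if height == 128 and width < 128:
        splits.discard("SPLIT_BT_VER")
    return splits
```

Under these assumptions a 128 × 64 CU may only be split by a vertical binary split, which yields two blocks that each fit one 64 × 64 VPDU.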
3.3 Chroma separate tree
Given the different characteristics of luma and chroma and the needs of practical implementation, chroma may use its own partition tree that need not match the luma partition tree. In H.266/VVC, chroma in I frames is partitioned with a chroma separate tree, while chroma partitioning in P and B frames follows the luma partitioning. With the I-frame chroma separate tree, chroma blocks could become as small as 2 × 2, 4 × 2 or 2 × 4, which makes the hardware coding pipeline too long and is impractical in real applications. Chroma blocks of 2 × 2, 4 × 2 and 2 × 4 are therefore disabled under chroma separate tree partitioning.
3.4 Slice, tile and brick structure
Building on H.265/HEVC, H.266/VVC further divides tiles into bricks. A tile can be divided into one or more bricks, and each brick consists of a number of CTU rows within the tile.
Slices support two modes: raster-scan slice mode and rectangular slice mode. Raster-scan slice mode is consistent with H.265/HEVC: a slice contains one or more tiles taken in the raster-scan order of the picture. In rectangular slice mode, a slice contains a number of bricks that together form a rectangular region of the image.
The left image of Figure 10 is divided horizontally and vertically into 12 tiles and contains three raster-scan slices. The right image contains four tiles: the upper-left tile contains 1 brick, the upper-right tile contains 5 bricks, the lower-left tile contains 2 bricks and the lower-right tile contains 3 bricks, for a total of 4 rectangular slices.
4. Comparative analysis of image partitioning across coding standards
As image partitioning has evolved, the partitioning structure has gone from the simple macroblock structure to complex quadtree, binary-tree and ternary-tree structures; luma and chroma have gone from a shared partition to separable partitions; the block organization has expanded from the simple slice to new structures such as tiles and bricks; and the sizes and shapes of coding, prediction and transform blocks have become richer and more flexible. Table 1 compares the image partitioning techniques of the different coding standards.
Table 1 Comparison of image partitioning techniques

| Image partitioning | H.264/AVC | H.265/HEVC | H.266/VVC |
|---|---|---|---|
| Partitioning structure | Macroblock and sub-macroblock | Recursive quadtree (CU/PU/TU structure) | Quadtree with nested multi-type tree |
| Luma/chroma partitioning | Consistent | Consistent | Luma and chroma separable in I frames |
| Block organization structure | Slice | Slice, slice segment, tile | Raster-scan slice, slice segment, rectangular slice, tile, brick |
| Maximum luma coding block size | 16 × 16 | 64 × 64 | 128 × 128 |
| Minimum luma coding block size | 4 × 4 | 8 × 8 | 4 × 4 |
| Maximum luma prediction block size | 16 × 16 | 64 × 64 | 128 × 128 |
| Minimum luma prediction block size | 4 × 4 | 4 × 4 | 4 × 4 |
| Maximum luma transform block size | 8 × 8 | 32 × 32 | 64 × 64 |
| Minimum luma transform block size | 4 × 4 | 4 × 4 | 4 × 4 |
| Intra prediction block shape | Square | Square | Square + non-square |
| Inter prediction block shape | Symmetric | Symmetric + single-level asymmetric | Symmetric + multi-level asymmetric |
5. Summary
With the emergence of video applications such as 4K and 8K ultra-high definition, the volume of video data has exploded, placing higher demands on coding technology. As the foundation of the hybrid coding framework, image partitioning has developed from a single, fixed partition towards diverse and flexible partitioning structures that handle the encoding and decoding of high-resolution images more efficiently. The newer standards also use a richer picture organization, which helps error resilience and parallel processing. However, complex partitioning generally brings a significant increase in complexity and makes codec implementation more challenging, so image partitioning needs certain restrictions and optimizations in order to strike a balance between coding performance and coding complexity.
"