"JVT (Joint Video Team, Video Joint Working Group) was established in Pattaya, Thailand in December 2001. It consists of two expert in two international standardized organizations of ITU-T and ISO. JVT work goal is to develop one New video coding standards to achieve high compression ratio, high image quality, good network adaptability, etc. At present, the work of JVT has been accepted by ITU-T, and new video compression coding standards are called H.264 standards. This standard is also accepted by ISO, called the AVC (Advanced VideoCoding) standard, is a Part 10 of MPEG-4. H.264 standard can be divided into three gears: basic grade (simple version, wide application face); main grade ( A number of technical measures to improve image quality and increase compression ratio can be used for SDTV, HDTV and DVD, etc.); extended grade (available for video streaming of various networks). H.264 is not only H.263 and MPEG- 4 Save 50% of the code rate and have a better support function for network transmission. It introduces the encoding mechanism for the IP package, which is conducive to packet transmission in the network, support streaming stream of video in the network. H.264 Have strong anti-challenge characteristics, adaptable to the packet loss rate, video transmission in the radio channel in which the interference. H.264 supports the hierarchical transmission under different network resources, thereby obtaining a smooth image quality. H.264 can Adapt to video transfer in different networks, network affinity is good.
The main objective of the H.264 standard is to provide better image quality at the same bandwidth than other existing video coding standards. Compared with earlier international standards such as H.263 and MPEG-4, its main technical advantages lie in the following four aspects:
1. Each video frame is divided into blocks of pixels, so that the encoding of a video frame can be carried out at the block level.
2. Spatial redundancy is exploited: spatial prediction, transform, quantization and entropy coding (variable-length coding) are applied to some of the original blocks of a video frame.
3. Temporal redundancy between the blocks of consecutive frames is exploited, so that only the parts that change between consecutive frames need to be encoded. This is accomplished by motion estimation and motion compensation. For a given block, a search is carried out in one or more previously encoded frames to determine the block's motion vector, which is then used to predict the block during subsequent encoding and decoding.
4. The remaining spatial redundancy is exploited to encode the residual blocks in a video frame. For example, the difference between a source block and its prediction block is again put through transform, quantization and entropy coding.
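Steps 3 and 4 above can be illustrated with a minimal sketch, assuming frames are plain lists of rows of integer samples. The helper names (`get_block`, `sad`, `motion_search`), the 4×4 block size and the ±2 search radius are illustrative assumptions, and the exhaustive search shown is generic block matching, not the search strategy of any particular H.264 encoder.

```python
def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame (list of rows)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def motion_search(cur, ref, top, left, size=4, radius=2):
    """Exhaustive block matching: return (motion_vector, residual_block)."""
    target = get_block(cur, top, left, size)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if 0 <= y <= height - size and 0 <= x <= width - size:
                cost = sad(target, get_block(ref, y, x, size))
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    _, dy, dx = best
    prediction = get_block(ref, top + dy, left + dx, size)
    # The residual is what remains to be transformed, quantized and entropy coded.
    residual = [[c - p for c, p in zip(rc, rp)]
                for rc, rp in zip(target, prediction)]
    return (dy, dx), residual


# Hypothetical test frames: the current frame is the reference shifted right by one sample.
ref_frame = [[y * 17 + x * 5 for x in range(8)] for y in range(8)]
cur_frame = [[0] + row[:-1] for row in ref_frame]
mv, res = motion_search(cur_frame, ref_frame, top=2, left=3)
```

In this example the search finds the one-sample shift, so the residual is all zeros and only the motion vector carries information.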
H.264 retains the strengths and the essence of earlier compression technologies while adding advantages that other compression techniques cannot match. Its main features are:
1. Low bit rate: compared with compression techniques such as MPEG-2 and MPEG-4 ASP, at comparable image quality the amount of data after H.264 compression is only 1/8 that of MPEG-2 and 1/3 that of MPEG-4. Clearly, using H.264 compression greatly reduces users' download time and data traffic charges.
2. High-quality images: H.264 provides continuous, smooth, high-quality images (DVD quality).
3. Strong fault tolerance: H.264 provides the tools needed to handle errors such as the packet loss that occurs in unstable network environments.
4. Network adaptability: H.264 provides a Network Abstraction Layer (NAL), which makes it easy to transport H.264 streams over different networks (e.g. the Internet, CDMA, GPRS, WCDMA, CDMA2000).
First, H.264 video compression system
The H.264 standard compression system consists of two parts: the Video Coding Layer (VCL) and the Network Abstraction Layer (NAL). The VCL comprises the VCL encoder and the VCL decoder. Its main function is the compression coding and decoding of video data, and it includes motion compensation, transform coding, entropy coding and other compression units. The NAL provides the VCL with a uniform interface that is independent of the network. It is responsible for encapsulating the video data for transmission over the network, using a unified data format that consists of a single-byte header followed by multiple bytes of video data, together with framing, logical channel signalling, timing information, end-of-sequence signals and so on. The header contains a storage flag and a type flag. The storage flag indicates whether the current data belongs to a frame that is used for reference. The type flag indicates the type of the image data.
The VCL can adjust its encoding parameters according to current network conditions before transmission.
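The single-byte header described above can be unpacked with a few bit operations. The sketch below assumes the field layout of the final H.264 standard: a 1-bit `forbidden_zero_bit`, a 2-bit `nal_ref_idc` (the "storage flag", where 0 means the data is not used for reference) and a 5-bit `nal_unit_type` (the "type flag"). The function name `parse_nal_header` is hypothetical.

```python
def parse_nal_header(byte):
    """Split the one-byte NAL unit header into its three fields."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("NAL header must be a single byte")
    return {
        "forbidden_zero_bit": byte >> 7,      # must be 0 in a valid stream
        "nal_ref_idc": (byte >> 5) & 0x03,    # 0 = not used for reference
        "nal_unit_type": byte & 0x1F,         # e.g. 5 = IDR slice, 7 = SPS
    }


header = parse_nal_header(0x65)
# header["nal_ref_idc"] is 3 and header["nal_unit_type"] is 5 (IDR slice)
```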
Second, H.264 features
Like H.261 and H.263, H.264 uses DCT transform coding combined with DPCM differential coding, i.e. a hybrid coding structure. At the same time, H.264 introduces new coding methods within the hybrid coding framework, which improve coding efficiency and bring it closer to practical applications.
H.264 has no cumbersome options; instead it strives for a simple "back to basics" design. It offers better compression performance than H.263++ and can adapt to a variety of channels.
H.264 has a wide range of applications. It can serve video applications at many different bit rates and in many different settings, and it has good error resilience and packet-loss handling.
The baseline system of H.264 is intended to be usable without royalties and is open in nature. It adapts well to IP and wireless networks, which is of great significance for multimedia transmission over today's Internet and for broadband transmission over mobile networks.
At the system level, H.264 introduces a new concept: a conceptual separation between the Video Coding Layer (VCL) and the Network Abstraction Layer (NAL). The former is the core compressed representation of the video content; the latter is the representation used for delivery over a particular type of network. Such a structure facilitates packetization and better priority control of information. The H.264 system encoding block diagram is shown below:
Although the basic coding structure of H.264 is similar to that of H.261 and H.263, it has been improved in many respects, which are listed below.
1. Multiple improved motion estimation methods
High-precision motion estimation
H.263 uses half-pixel estimation, whereas H.264 goes further and uses 1/4-pixel or even 1/8-pixel motion estimation. That is, the displacement of a motion vector may be expressed in units of 1/4 or even 1/8 of a pixel. Clearly, the higher the precision of the motion vector displacement, the smaller the inter-frame residual, the lower the transmitted bit rate and the higher the compression ratio.
In H.264, a 6-tap FIR filter is used to interpolate the values at 1/2-pixel positions. Once the 1/2-pixel values have been obtained, the 1/4-pixel values can be obtained by linear interpolation.
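The interpolation just described can be sketched for a single row of luma samples. The tap values (1, -5, 20, 20, -5, 1)/32 are those of the final H.264 standard; the function names, the restriction to one dimension and the edge handling by sample replication are simplifying assumptions for illustration.

```python
HALF_PEL_TAPS = (1, -5, 20, 20, -5, 1)  # 6-tap FIR filter, sum of taps = 32


def clip8(value):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, value))


def half_pel(row, i):
    """Interpolated sample halfway between row[i] and row[i + 1]."""
    last = len(row) - 1
    acc = 0
    for k, tap in enumerate(HALF_PEL_TAPS):
        j = min(max(i - 2 + k, 0), last)  # replicate samples at the row edges
        acc += tap * row[j]
    return clip8((acc + 16) >> 5)         # divide by 32 with rounding


def quarter_pel(row, i):
    """Sample a quarter of the way from row[i] to row[i + 1]:
    the rounded average of the integer sample and the half-pel sample."""
    return (row[i] + half_pel(row, i) + 1) >> 1


ramp = [0, 10, 20, 30, 40, 50, 60, 70]
h = half_pel(ramp, 3)      # 35, midway between 30 and 40
q = quarter_pel(ramp, 3)   # 33
```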
For the 4:2:0 video format, the 1/4-pixel precision of the luma signal corresponds to 1/8-pixel motion vectors in the chroma, because the chroma planes have half the resolution of the luma plane. The chroma signal therefore has to be interpolated to 1/8-pixel precision.
In theory, each doubling of the motion compensation precision (for example, from integer-pixel to 1/2-pixel precision) yields a coding gain of 0.5 bit/sample. Practical verification, however, found that once the motion vector precision goes beyond 1/8 pixel the system shows essentially no significant gain. H.264 therefore adopts only the 1/4-pixel motion vector mode and does not adopt 1/8-pixel precision.
Multiple macroblock partition modes
In the prediction modes of H.264, a macroblock (MB) can be partitioned in several different modes. This flexible, fine-grained multi-mode partitioning matches the shapes of the actual moving objects in the image more closely. As a result, each macroblock may contain 1, 2, 4, 8 or 16 motion vectors.
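The motion vector counts listed above follow directly from the partition sizes. The sketch below assumes the seven luma partition sizes of the final H.264 standard and, for simplicity, that all four 8×8 regions of a macroblock use the same sub-partition; the names are illustrative.

```python
MACROBLOCK_SIZE = 16

# (width, height) of the seven partition shapes, largest first.
PARTITION_SIZES = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]


def motion_vector_count(width, height):
    """Motion vectors needed when a 16x16 macroblock is tiled
    uniformly with width x height partitions (one vector each)."""
    return (MACROBLOCK_SIZE // width) * (MACROBLOCK_SIZE // height)


counts = {size: motion_vector_count(*size) for size in PARTITION_SIZES}
# 16x16 -> 1, 16x8 and 8x16 -> 2, 8x8 -> 4, 8x4 and 4x8 -> 8, 4x4 -> 16
```

Smaller partitions follow object boundaries more closely but cost more bits for motion vectors, so an encoder has to weigh the two against each other.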
Multiple reference frame estimation
In H.264, multiple reference frames can be used. Several recently encoded reference frames are kept in the encoder's buffer; the encoder selects the one that gives the best coding result as the reference frame and signals which frame was used for prediction. This yields a better coding result than using only the most recently encoded frame as the prediction frame.
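A minimal sketch of the selection step, assuming the encoder already has one candidate prediction block per buffered reference frame and uses the sum of absolute differences as its cost. Both the cost measure and the name `choose_reference` are assumptions for illustration; real encoders usually also account for the bits needed to signal the choice.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def choose_reference(block, candidates):
    """Return (index, cost) of the candidate block that best predicts `block`.
    `candidates[i]` is the prediction taken from buffered reference frame i."""
    costs = [sad(block, candidate) for candidate in candidates]
    index = min(range(len(costs)), key=costs.__getitem__)
    return index, costs[index]


# Hypothetical periodic scene: the current block resembles the older frame,
# not the most recent one.
current = [[10, 10], [200, 200]]
most_recent = [[200, 200], [10, 10]]
older = [[12, 9], [198, 201]]
ref_index, cost = choose_reference(current, [most_recent, older])
```

Here the older frame wins, which is the situation the text describes for periodic motion or a camera alternating between two scenes.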
2. Small-size 4×4 integer transform
The unit commonly used in conventional video compression coding is the 8×8 block. H.264 uses a small 4×4 block instead. Because the transform block is smaller, the partitioning of moving objects is more accurate. The amount of computation in the transform is also smaller, and the joining error at the edges of moving objects is greatly reduced.
When the image contains a large smooth area, a small transform size could produce gray-level differences between blocks. To avoid this, H.264 applies a second 4×4 block transform to the DC coefficients of the 16 4×4 blocks of luma data in an intra macroblock (one per small block, 16 in total), and a 2×2 block transform to the DC coefficients of the 4 4×4 blocks of chroma data (one per small block, 4 DC coefficients in total).
H.264 not only makes the image transform block smaller; the transform is also an integer operation rather than a real-number operation. The transform and inverse transform therefore have exactly the same precision in the encoder and the decoder, and there is no "inverse transform error".
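The integer nature of the transform can be seen in a short sketch. It assumes the 4×4 forward core matrix of the final H.264 standard and omits the scaling factors, which the standard folds into the quantization step; the helper names are illustrative.

```python
# Forward core transform matrix: an integer approximation of the 4x4 DCT.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Plain integer matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_core_transform(block):
    """Y = CF * X * CF^T, computed with integer arithmetic only."""
    return matmul(matmul(CF, block), transpose(CF))


flat = [[10] * 4 for _ in range(4)]
coeffs = forward_core_transform(flat)
# A flat block produces a single non-zero (DC) coefficient: coeffs[0][0] == 160
```

Because every intermediate value is an integer, an encoder and a decoder that follow the same steps obtain bit-identical results.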
3. More accurate intra prediction
In H.264, each pixel in a 4×4 block can be intra-predicted as a differently weighted sum of the nearest previously encoded pixels.
Intra coding is used to reduce the spatial redundancy of an image. To improve the efficiency of H.264 intra coding, the spatial correlation between adjacent macroblocks within a frame is exploited, since adjacent macroblocks usually have similar properties. When a given macroblock is encoded, it is therefore first predicted from the surrounding macroblocks (typically those above and to the left, because they have already been encoded), and then the difference between the predicted value and the actual value is encoded. Compared with encoding the frame directly, this greatly reduces the bit rate.
H.264 provides 6 modes for 4×4 pixel block prediction, comprising 1 DC prediction and 5 directional predictions, as shown below. In the figure, pixels A to I of the adjacent blocks have already been encoded and can be used for prediction. If mode 4 is selected, the 4 pixels a, b, c and d are predicted to have the same value as E, and the 4 pixels e, f, g and h are predicted to have the same value as F. For flat regions of the image that contain very little spatial information, H.264 also supports 16×16 intra coding.
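A minimal sketch of 4×4 intra prediction, assuming three simple modes (vertical, horizontal and DC) as they are defined in the final H.264 standard. It does not reproduce the six-mode numbering or the pixel labelling of the figure referred to above, and the mode decision by sum of absolute differences is an illustrative assumption.

```python
def predict_4x4(above, left, mode):
    """Build a 4x4 prediction from the 4 samples above and the 4 samples
    to the left of the block, all of which are already encoded."""
    if mode == "vertical":      # each column copies the sample above it
        return [list(above) for _ in range(4)]
    if mode == "horizontal":    # each row copies the sample to its left
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":            # every sample is the rounded mean of the neighbours
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("unknown mode: " + mode)


def best_mode(block, above, left):
    """Pick the mode whose prediction is closest to the block (smallest SAD)."""
    def cost(mode):
        pred = predict_4x4(above, left, mode)
        return sum(abs(b - p) for rb, rp in zip(block, pred)
                   for b, p in zip(rb, rp))
    return min(("vertical", "horizontal", "dc"), key=cost)


above = [10, 20, 30, 40]
left = [10, 10, 10, 10]
block = [[10, 20, 30, 40] for _ in range(4)]   # vertical stripes
mode = best_mode(block, above, left)           # "vertical"
```

Only the chosen mode and the (here all-zero) difference between block and prediction would then be encoded.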
4. Inter-frame prediction coding
Inter-frame predictive coding exploits the temporal redundancy between consecutive frames through motion estimation and compensation. H.264 motion compensation supports most of the key features of previous video coding standards and flexibly adds more. In addition to P frames and B frames, H.264 supports a new frame type for switching between streams, the SP frame. Once SP frames are included in a bitstream, a decoder can switch quickly between streams that have similar content but different bit rates, and random access and fast playback modes are supported as well.
Motion estimation in H.264 has the following four characteristics.
(1) Macroblock partitions of different sizes and shapes
Motion compensation for each 16×16-pixel macroblock can use partitions of different sizes and shapes; H.264 supports seven modes, as shown in Figure 4. Motion compensation in the small-block modes handles motion detail better, reduces blocking artifacts and improves image quality.
(2) High-precision sub-pixel motion compensation
H.263 uses motion estimation with half-pixel precision, whereas H.264 can use motion estimation with 1/4- or 1/8-pixel precision. For the same required accuracy, the residual left after motion estimation with 1/4- or 1/8-pixel precision in H.264 is smaller than the residual left after half-pixel motion estimation in H.263. At the same accuracy, H.264 therefore needs a lower bit rate for inter-frame coding.
(3) Multi-frame prediction
H.264 provides an optional multi-frame prediction feature: for inter coding, 5 different reference frames can be selected, which provides better error resilience and thus improves video image quality. This feature is mainly useful in the following situations: periodic motion, translational motion, and a camera switching back and forth between two different scenes.
(4) Deblocking filter
H.264 defines an adaptive deblocking filter that processes the horizontal and vertical block edges inside the prediction loop and greatly reduces blocking artifacts.
5. Quantization
H.264 offers 32 different quantization step sizes, which is similar to the 31 quantization step sizes in H.263. In H.264, however, the step size grows at a compound rate of 12.5% rather than by a fixed constant. H.264 has two ways of reading out the transform coefficients: zigzag scan and double scan. In most cases the simple zigzag scan is used; the double scan is used only in blocks with smaller quantization levels, where it helps to improve coding efficiency.
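The compound growth of the step size and the zigzag read-out can be sketched as follows. The 32 levels and the 12.5% growth rate come from the text above; the base step of 1.0, the rounding rule and the function names are assumptions, and the scan table is the 4×4 zigzag order of the final H.264 standard. The double scan is not shown.

```python
# Raster indices of a 4x4 block in zigzag scan order.
ZIGZAG_4X4 = [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]


def step_sizes(base=1.0, count=32, growth=0.125):
    """Step sizes where each level is 12.5% larger than the one before."""
    return [base * (1 + growth) ** level for level in range(count)]


def quantize(coefficient, step):
    """Divide by the step size and round half away from zero."""
    sign = -1 if coefficient < 0 else 1
    return sign * int(abs(coefficient) / step + 0.5)


def zigzag_scan(block):
    """Read a 4x4 block from low to high frequency."""
    flat = [value for row in block for value in row]
    return [flat[index] for index in ZIGZAG_4X4]


steps = step_sizes()
ratio = steps[6] / steps[0]   # about 2.03: the step roughly doubles every 6 levels
levels = zigzag_scan([[quantize(c, steps[10]) for c in row]
                      for row in [[160, 40, 0, 0], [-20, 4, 0, 0],
                                  [0, 0, 0, 0], [0, 0, 0, 0]]])
```

After the scan, the non-zero levels sit at the front of the list and the trailing zeros can be coded compactly.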
6. Unified VLC
The final step of video coding is entropy coding, and H.264 has two entropy coding methods.
(1) Unified VLC (UVLC: Universal VLC). UVLC uses a single code table for encoding, so the decoder can easily identify a codeword from its prefix, and UVLC can resynchronize quickly when a bit error occurs.
In H.263 and other standards, different VLC code tables are used depending on the type of data to be encoded, such as transform coefficients or motion vectors. The UVLC code table in H.264 provides a simple method: whatever type of data a symbol represents, the same unified variable-length code table is used. Its advantage is simplicity. Its disadvantage is that the single code table is derived from one statistical probability distribution model and does not take the correlation between coded symbols into account, so it does not perform very well at medium and high bit rates.
(2) Context-based adaptive binary arithmetic coding (CABAC: Context-based Adaptive Binary Arithmetic Coding). Its coding performance is somewhat better than that of UVLC, but its complexity is higher.
Arithmetic coding allows probability models for all syntax elements (transform coefficients, motion vectors) to be used at both the encoder and the decoder. To improve the efficiency of arithmetic coding, the underlying probability model is adapted, through context modelling, to statistics that change from one video frame to the next. Context modelling provides conditional probability estimates for the coded symbols. With a suitable context model, the correlation between symbols can be removed by selecting the probability model according to the already coded symbols in the neighbourhood of the symbol currently being coded. Different syntax elements usually maintain different models.
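The prefix property of the unified VLC described in the first item above can be illustrated with the unsigned Exp-Golomb code, which is the form this code takes in the final H.264 standard. Representing bits as a Python string and the names `ue_encode` and `ue_decode` are assumptions made for readability.

```python
def ue_encode(number):
    """Unsigned Exp-Golomb codeword for a non-negative integer:
    N leading zeros followed by the (N + 1)-bit binary form of number + 1."""
    if number < 0:
        raise ValueError("number must be non-negative")
    info = bin(number + 1)[2:]
    return "0" * (len(info) - 1) + info


def ue_decode(bits, pos=0):
    """Decode one codeword starting at `pos`; return (number, next_pos).
    The count of leading zeros alone tells the decoder the codeword length."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    start = pos + zeros
    end = start + zeros + 1
    return int(bits[start:end], 2) - 1, end


stream = "".join(ue_encode(n) for n in (3, 0, 7))   # "001001" + "0001000"
first, pos = ue_decode(stream)                      # 3, next codeword starts at 5
```

Small values get short codewords regardless of which syntax element they belong to, which is both the simplicity and the limitation the text points out.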
Third, performance advantages
The coding performance of H.264 was compared with that of MPEG-4 and H.263++ at 6 test points: 32 kbit/s, 10 f/s, QCIF; 64 kbit/s, 15 f/s, QCIF; 128 kbit/s, 15 f/s, CIF; 256 kbit/s, 15 f/s, QCIF; 512 kbit/s, 30 f/s, CIF; 1024 kbit/s, 30 f/s, CIF. The test results show that H.264 achieves better PSNR performance than MPEG-4 and H.263++.
The PSNR of H.264 is on average 2 dB higher than that of MPEG-4 and on average 3 dB higher than that of H.263++.
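For reference, PSNR is computed from the mean squared error between the original and the decoded samples as PSNR = 10·log10(peak² / MSE). The sketch below assumes 8-bit samples (peak value 255) held in flat lists; the function name `psnr` is illustrative.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally long sample lists."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf            # identical signals
    return 10 * math.log10(peak * peak / mse)


value = psnr([10, 20, 30, 40], [11, 19, 31, 39])   # MSE = 1, about 48.13 dB
```

Since the scale is logarithmic, a gain of 3 dB corresponds to roughly halving the mean squared error.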
Fourth, a new fast motion estimation algorithm
The new fast motion estimation algorithm UMHexagonS (a Chinese patent) is a new algorithm that can save more than 90% of the computation relative to the fast full search algorithm in H.264. Its full name is the "asymmetric cross multi-level hexagonal grid point search algorithm" (Unsymmetrical-cross Multi-Hexagon-grid Search). It is an integer-pixel motion estimation algorithm. Because its computational cost is very low while good rate-distortion performance is maintained when coding large-motion image sequences at high bit rates, it has been formally adopted by the H.264 standard.
H.264 (MPEG-4 Part 10), jointly developed by ITU and ISO, may be accepted as a unified standard by broadcasting, communications and storage media (CD, DVD), and it is the most likely candidate to become the standard for broadband interactive media. China's source coding standard has not yet been established; the development of H.264 is being followed closely, and work on formulating China's source coding standard is being stepped up.
The H.264 standard takes moving-image compression technology to a higher level; providing high-quality image transmission at lower bandwidth is the highlight of H.264. Wider adoption of H.264 places higher demands on the systems of video terminals, gatekeepers, gateways and MCUs, and will push video-conferencing software and hardware to improve in every respect.
Http://kb.cnblogs.com/page/168157/ "